DOD - Data Oriented Design & why you should try it

kalabgs · September 24, 2023, 8:08am

What is DOD?

Data-Oriented Design (DOD) is an approach to programming that focuses on organizing and structuring data for optimal performance. In Roblox, DOD can help you achieve better performance, especially in scenarios where you need to manage large numbers of game objects, like enemies, bullets, particles, etc.

Code example

-- Define a table for storing the enemy data and for the module itself
local module = {}
local Enemies = {}
-- Counter for handing out IDs; #Enemies is unreliable once entries are removed
local NextID = 0

-- A format for concatenating the entity data when displayed in the console
local Format = "EnemyID: %s\nEnemyPosition: (%d, %d)\nHealth: %d"

-- Function for creating entity
function module.CreateEnemy(x, y)
    local Enemy = {
        Position = Vector2.new(x, y),
        Health = 100
    }

    NextID += 1
    Enemies[NextID] = Enemy

    -- Returning the ID of the entity
    return NextID
end

-- An example function for dealing with one entity
function module.DealDamage(EnemyID, Damage)
    local Enemy = Enemies[EnemyID]

    if Enemy then
        Enemy.Health -= Damage
    end
end

-- An example function for updating all entities
function module.UpdateEnemies()
    for k,Enemy in Enemies do
        if Enemy.Health <= 0 then
            Enemies[k] = nil
        end
    end
end

function module.Print()
    for k,Enemy in Enemies do
        print(string.format(Format, k, Enemy.Position.X, Enemy.Position.Y, Enemy.Health))
    end
end

return module

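-- Example usage (in a separate script)
-- script.Parent.EnemyModule is an assumed location; point it at your ModuleScript
local module = require(script.Parent.EnemyModule)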

local Enemy1 = module.CreateEnemy(10, 20)
local Enemy2 = module.CreateEnemy(30, 40)
local Enemy3 = module.CreateEnemy(100, 50)

module.DealDamage(Enemy1,30)
module.DealDamage(Enemy3,100)

module.UpdateEnemies()

module.Print()

Pros and cons of using DOD

Pros of DOD

Improved Performance: DOD optimizes memory layout and data access patterns, leading to better performance, especially in scenarios with many game objects.
Cache Efficiency: DOD strives to maximize cache locality, reducing memory access times and improving overall game performance.
Parallelization: DOD often naturally lends itself to parallel processing, allowing you to take better advantage of multi-core processors.
Reduced Memory Overhead: DOD can minimize memory overhead compared to traditional OOP, where each object may carry unnecessary data and methods.
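
The memory-layout point can be sketched with a "struct of arrays" layout, where each field lives in its own array indexed by enemy ID. This is only an illustration: the names `Health`, `PosX`, `PosY`, and `ApplyPoison` are hypothetical and not part of the module above, and whether it is actually faster in Luau depends on your workload, so measure before committing to it.

```lua
-- Hypothetical struct-of-arrays layout: one array per field, indexed by enemy ID.
local Health = {}
local PosX = {}
local PosY = {}
local Count = 0

local function CreateEnemy(x, y)
    Count = Count + 1
    Health[Count] = 100
    PosX[Count] = x
    PosY[Count] = y
    return Count
end

-- A pass that only needs health touches only the Health array.
local function ApplyPoison(damage)
    for id = 1, Count do
        Health[id] = Health[id] - damage
    end
end

local a = CreateEnemy(10, 20)
local b = CreateEnemy(30, 40)
ApplyPoison(5)
print(Health[a], Health[b])
```

A loop that only cares about health never has to load position data, which is the access-pattern idea behind the first two points.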

Cons of DOD

Complexity: Implementing DOD can be more complex and may require a mindset shift, especially if you are used to OOP.
Less Intuitive: DOD code may be less intuitive for some developers, as it often separates data from behavior.
Potential Over-Optimization: Over-optimizing prematurely in DOD can lead to complex and hard-to-maintain code.

Conclusion

Data-Oriented Design can be a powerful approach to optimize performance in your games. However, it may not always be the best choice for every situation. When used sensibly, DOD can lead to more efficient code and better-performing games, especially when dealing with a large number of game objects. Make sure to evaluate your project’s specific needs and performance bottlenecks before deciding whether to use DOD or OOP.

That’s from me, keep on learning everyone! `:)`

omrezkeypie · September 24, 2023, 9:20am

Wow! Great explanation. Very down to earth and understandable. I might test this out for future projects.

schaap5347 · September 24, 2023, 10:24am

How is this code supposed to automatically run in parallel? I don’t see anything that has to do with parallel programming.

kalabgs · September 24, 2023, 2:03pm

It can easily be initialized in Parallel Luau using SharedTables.

Deathbandss · September 26, 2023, 5:35am

Bravo omrezkeypie, I expect to see more amazing work from you! GTD:A On top!

xChris_vC · September 26, 2023, 6:29am

Separating data from behavior is a very good practice! I’ve been using a similar approach for my game for the last year too, and while it’s like learning a new language, it has great benefits.

If you’re interested, you can look into Matter, an ECS library. It’s built on the same principle, but goes deeper into the entity-component relation.

sethsethisthebest · September 29, 2023, 1:34pm

This specific piece of code isn’t parallelized - that’s not the point. The point is that Data Oriented Design in general can be parallelized.

For example, an Entity Component System architecture is an application of DOD. Systems can be parallelized - i.e. OP’s Enemy Damage System would work on one core under one Actor, while they have an Enemy Pathfinding System working on another core under another Actor. The systems aren’t directly related and can work on the same entities at the same time without conflicting with each other because entity data is nicely separated from entity behaviour in an efficient way. Imagine trying to accomplish the same in an OOP architecture where an Entity’s data and behaviour are directly intertwined and systems aren’t segregated.

Because of the way DOD (or in this case specifically, ECS) works, it’s naturally easier to parallelize. You can adopt DOD without parallelization and still benefit from performance improvements. But having the option of easy parallelization makes DOD very attractive for projects where performance gains matter.
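
As a rough illustration of why separated systems are easier to split up, here is a minimal sketch with two systems that each touch only their own component store. The stores and system names are hypothetical, and this sketch runs serially; it does not use Actors or Parallel Luau itself.

```lua
-- Hypothetical component stores, keyed by entity ID.
local Health = { [1] = 100, [2] = 40 }
local Position = { [1] = { X = 0, Y = 0 }, [2] = { X = 5, Y = 5 } }

-- Damage system: reads and writes Health only.
local function DamageSystem(damage)
    for id, hp in pairs(Health) do
        Health[id] = hp - damage
    end
end

-- Movement system: reads and writes Position only.
local function MovementSystem(dx, dy)
    for _, pos in pairs(Position) do
        pos.X = pos.X + dx
        pos.Y = pos.Y + dy
    end
end

DamageSystem(10)
MovementSystem(1, 0)
print(Health[1], Position[1].X)
```

Because neither system reads or writes the other's data, the order in which they run does not change the result, which is the property you would rely on when moving them under separate Actors.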

bluebxrrybot · September 29, 2023, 2:34pm

That’s just object oriented programming (OOP)?

kalabgs · September 29, 2023, 4:22pm

I’ve stated above that DOD is different from OOP. The main difference is that methods are not “carried” by every object, which reduces memory usage and can improve execution time (due to less indexing).

bluebxrrybot · September 29, 2023, 6:05pm

I see it now. So DOD just returns a number instead of an object. Pretty sure OOP just uses a few extra bits to remember methods, though.

-- DOD
return id

-- OOP
return setmetatable(self, Object)

Using a metatable with all the methods will avoid the need to redefine functions. Redefining the functions every time will cause that memory usage you’re talking about.
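
To make the metatable point concrete, here is a small sketch (the `Object` class and its `DealDamage` method are made up for illustration) showing that instances created with `setmetatable` share one copy of each method through `__index` rather than storing their own.

```lua
local Object = {}
Object.__index = Object

function Object.new(health)
    return setmetatable({ Health = health }, Object)
end

function Object:DealDamage(damage)
    self.Health = self.Health - damage
end

local a = Object.new(100)
local b = Object.new(100)
a:DealDamage(30)

-- Both instances resolve DealDamage to the same function stored on Object.
print(a.DealDamage == b.DealDamage)
-- The instance table itself holds no DealDamage field.
print(rawget(a, "DealDamage"))
```

The per-object cost is the metatable reference plus an extra `__index` lookup on each method call, not a duplicate set of functions.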

Clueless_Dev · October 1, 2023, 5:06pm

No, not necessarily. DOD is an architectural paradigm; it’s not just about what you return. You can return whatever you want.

The meat of it is how you go about handling and organizing your objects. Aside from that, you don’t have to “redefine functions”; I’m not sure in which part of the post feddy mentioned that. If anything, writing objects like this is more flexible, because in pure OOP, if you wanted both a client and a server version of an object, you would need dedicated client and server constructors (and probably dedicated client and server methods), since you can’t send an object over the network: it loses its metatable.

But we are talking about code design here, not really DOD versus OO, which is beside the point because you can do OOP with DOD. It just kinda defeats the purpose of some optimizations.

But the main takeaways:

Writing pure OO in Roblox is just an unnecessary pain; it’s more productive to just write libraries.
OO and DO can be used together, in which case an update function would basically just transform self’s properties. But again, that kinda defeats an optimization you can get there, because self does not need to have any methods; it can just be passed to a function.

Codewise, that would look like this:

local module = require(module)
local Enemy1 = module.CreateEnemy(10, 20)
local Enemy2 = module.CreateEnemy(30, 40)
local Enemy3 = module.CreateEnemy(100, 50)

Enemy1:DealDamage(30)
Enemy3:DealDamage(100)

while task.wait() do
    -- you would put your enemies in a table and loop over it instead
    Enemy1:Update()
    Enemy2:Update()
    Enemy3:Update()
end

which again kinda defeats the purpose of the optimization OF NOT having a metatable, but you can do that just fine.
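
For comparison, a library-style version of the same idea might look like the sketch below, where enemies are plain tables and the functions live on a module. `EnemyLib` and its functions are hypothetical names used only for this illustration.

```lua
-- Hypothetical library-style module: functions take the data table as an argument.
local EnemyLib = {}

function EnemyLib.Create(x, y)
    return { X = x, Y = y, Health = 100 }
end

function EnemyLib.DealDamage(enemy, damage)
    enemy.Health = enemy.Health - damage
end

function EnemyLib.IsDead(enemy)
    return enemy.Health <= 0
end

local enemies = {
    EnemyLib.Create(10, 20),
    EnemyLib.Create(30, 40),
    EnemyLib.Create(100, 50),
}

EnemyLib.DealDamage(enemies[1], 30)
EnemyLib.DealDamage(enemies[3], 100)

-- One loop over plain tables; no metatable lookup is involved.
local alive = 0
for _, enemy in ipairs(enemies) do
    if not EnemyLib.IsDead(enemy) then
        alive = alive + 1
    end
end
print(alive)
```

Since each enemy is a plain table with no metatable, there is nothing to lose if it is copied or sent elsewhere; the receiving side only needs its own copy of the library functions.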

Clueless_Dev · October 1, 2023, 5:09pm

Amazing post, man! This is among the most approachable explanations of Data Oriented Programming I’ve read in a fair bit!