One of the most common questions asked, upon first reading about Data Oriented Design, is “I don’t get it – how is this different from just Structures Of Arrays?”
I think it may help to draw a parallel with garbage collection, another fiercely debated topic. Sometimes programmers used to more abstract languages will complain about “having” to manually manage memory in non-garbage collected languages, implying the goal is writing code and managing memory is a chore which gets in the way of that.
They’re not wrong to think that, but you have to recognize that it’s not the only valid viewpoint.
You see, people working in games often see it the other way around. In some programs, managing memory is the goal, and the code is the chore which gets in the way.
Underneath all the high-level concepts we apply to programming, all computers really do is move memory around. It’s not hard to see, though, how one could forget that. If you have a good debugger, you can see the memory underneath; a good memory window or disassembly view can show you things the way the computer sees them. Many languages and platforms, though, don’t have that luxury. The user, presented only with the user-facing inputs/outputs (the source code and an expression-based debugger), will be pushed towards the code-first viewpoint without realizing it, because they can’t see things the way the computer does.
Code and memory are like two sides of the same coin – the Yin and the Yang, perhaps. You need both.
Code-driven design is about trying to express in source-code form what you wish the computer to do – it’s a forward-based planning method. The motivation behind data-oriented design is to consider the result you ultimately want, and then work backwards to discover the best way to get the right bytes into the right place.
In OOP the code is king – everything is designed around your core ideas in a way that showcases them neatly. In DOD the code is simply an unfortunate artifact which we have to suffer while we tend to the real business of managing memory.
- The OOP code will be cleaner, simpler. It’ll make more sense to read, and there might be less of it. It exists to encapsulate an idea in a way that both the human and computer can understand. The goal in OOP is to write good code.
- The DOD code will be longer, messier, with odd little edge cases and weird intrinsics everywhere, because it wasn’t written to be code – it exists secondarily to the real goal. In DOD, the goal of writing code is not itself writing code – the code exists only as a byproduct of an underlying process.
Structures Of Arrays are simply a means to an end, but someone trained only to work with code sees nothing beyond the means: to them it is just another form of code, and the ‘end’ it serves, the layout of the memory itself, goes unnoticed. They’re comparing one house to another house, when what we’re really concerned with is bricks, not houses at all.
- Given a particle system to write, the code-first designer thinks “Right, I need the idea of a particle, let’s make a class for that. Now what do I need to do to each particle? I need to first update its physics, then update its color, then render it. And I guess I need to then repeat all that for each particle, so I’ll store my particles in a list.”
- The data-first designer thinks “Right, I need to fill this vertex buffer with the updated particles. Where should I pull that information from? I guess I need an array to store all the particles in, I can read everything from there. They’ll need position and color. Hmm, maybe those would be better split into two arrays.”
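To make those two trains of thought concrete, here is a minimal C++ sketch of where each designer might end up. Everything in it is an assumption made up for illustration: the `Vertex` layout, the field names, and the helper functions are hypothetical, and a real renderer would dictate its own vertex format. Treat it as one possible shape for each approach rather than the way either should be written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex format, standing in for whatever the renderer expects.
struct Vertex {
    float x, y;
    std::uint32_t rgba;
};

// Code-first: "I need the idea of a particle, let's make a class for that."
struct Particle {
    float x, y;          // position
    float vx, vy;        // velocity
    std::uint32_t rgba;  // packed color

    void update(float dt) {
        x += vx * dt;
        y += vy * dt;
    }
};

// "...then repeat all that for each particle, so I'll store them in a list."
inline void update_all(std::vector<Particle>& particles, float dt) {
    for (Particle& p : particles) {
        p.update(dt);
    }
}

// Data-first: start from the buffer to fill, then decide where the bytes
// should come from. Here each field ends up in its own array.
struct Particles {
    std::vector<float> x, y;    // positions
    std::vector<float> vx, vy;  // velocities
    std::vector<std::uint32_t> rgba;

    void add(float px, float py, float pvx, float pvy, std::uint32_t color) {
        x.push_back(px);
        y.push_back(py);
        vx.push_back(pvx);
        vy.push_back(pvy);
        rgba.push_back(color);
    }

    std::size_t size() const { return x.size(); }
};

// The physics step reads and writes only the position and velocity arrays;
// the color array is never touched here.
inline void update_positions(Particles& p, float dt) {
    assert(p.x.size() == p.vx.size() && p.y.size() == p.vy.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}

// The end result the data-first design works backwards from.
inline void fill_vertices(const Particles& p, std::vector<Vertex>& out) {
    assert(p.x.size() == p.rgba.size());
    out.resize(p.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        out[i] = Vertex{p.x[i], p.y[i], p.rgba[i]};
    }
}
```

Both versions compute the same positions. The difference is what the design was organized around: the first around the idea of a particle, the second around the vertex buffer that has to be filled and the arrays it is filled from.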
It’s the same process, just starting from different ends. One expresses the idea better, one gets it done better. Which of those is more important to you at any given time depends very much on what you’re working on.
Of course, like any large construction project, the best way to do it is often to work neither forwards nor backwards, but instead to work from both ends and hope to meet in the middle.