
The Man Behind The Curtain


I love peeking behind the scenes. I like knowing how things were made. I think most people do, really; it’s something built into humans.

Historically, video games don’t give out the source code. And why should they – after all, a game is made for players. You’re supposed to be having fun playing it, not trying to peek behind the curtain.

But for us programmers, there’s a few gems out there. A small number of commercial games do have source code releases. I’ve been meaning for a while to do a round-up of them, so here it is.

Somewhat disappointingly, almost all of these are PC games. Finding source code to console or arcade games is incredibly rare, which is a shame, as I think many programmers don’t realize the different approaches to programming that exist on non-PC platforms.

Many of these games have improved community versions based on these original source drops, which I’ve deliberately not linked to as I want to show the games as they originally were, not ported to SDL or whatever. So please bear in mind, if you’re after a more modern upgrade, there may well be one out there somewhere.

Wherever possible, I’ve linked to the original source drops as released by the original developer. Where that is not possible, I’ve re-hosted the original files locally. (sometimes re-zipping for convenience, to avoid the numerous self-extracting EXEs that people like to use)

Many of these games are included in Fabien Sanglard’s excellent series of informative and very well-written code reviews. I’ve added links to the code reviews where appropriate. If you want to know more about how a game works (and isn’t that why you’re here, after all?), go check it out.

You’ll also notice that many of the games are from the id Software/Apogee lineage. This is not a coincidence. id should be congratulated for their open attitude to releasing their older source. The source to older commercial games has little commercial value any more, and often will be lost entirely. Surely it’s better that someone can learn something from it, rather than it being lost to the mists of time? It is, however, a shame that so many of the releases are FPSs.

This is not an exhaustive list, and furthermore I don’t really want it to be. I’ve deliberately only included games that I like and feel are important enough to be mentioned here. So if there’s some obscure commercial game you like that’s not listed, tough. Hey it’s my blog, I’ll put what I want on it. Get your own blog and put it there :)

So without further ado, starting chronologically:

 Colossal Cave Adventure (1976)

Developer: William Crowther and Don Woods
Publisher: Various.
Platform: PDP-10 and friends.

“You are in a maze of twisty little passages, all alike.”

A slight cheat perhaps – it’s debatable if this is technically a commercial game. It certainly has been sold commercially though, and it’s of such historic significance that I’m listing it anyway.

In case you didn’t know, this game is why adventure games are called “adventure” games – they’re all named after this.

The original version was in Fortran, so it’s going to be hard work for modern coders to pick through. Later versions were in C, so perhaps try those too.

Source code:

Catacomb (1989)

Developer: John Carmack
Publisher: Softdisk
Platform: Apple II / DOS

Not to be confused with the later “Catacomb 3D” (see below), this game is an early 2D incarnation. Developed by John Carmack before id Software formed, it is written entirely in Turbo Pascal, a language that was commonplace at the time but fell out of favor once C started to dominate.

Source code (DOS):

Prince Of Persia (1989)

Developer: Jordan Mechner
Publisher: Brøderbund
Platform: Apple II / DOS / many more
Code review:

Prince Of Persia wowed gamers with its (for the time) revolutionary fluid animation, Hollywood-style storytelling, and exciting exploration.

Like many games of the time, this is written entirely in assembly language, making it hard for modern readers to figure out what’s going on.

If you haven’t already, please watch Jordan Mechner’s excellent GDC talk where he details the making of this fantastic game. Go do it now.

Source code (Apple II):

SimCity (1989)

Developer: Maxis
Publisher: Maxis / Brøderbund
Platform: All

SimCity basically invented a whole genre when it came out. The heart of its city simulation revolves around cellular-automata-inspired algorithms operating on the various municipal properties. This makes it a good example of having to pop open the source code to figure out how the gameplay mechanics actually work.

The source to the 1990 Unix port was released in 2008 as part of the OLPC project.

Source code (Unix version):

Hovertank 3D / Catacomb 3D (1991)

Developer: id Software
Publisher: Softdisk
Platform: DOS

This marks the first of the id Software 3D series. Both games use the raycasting technique later refined in Wolfenstein 3D, but Hovertank had yet to include texture mapping.

Source code:

Star Control II (1992)

Developer: Toys for Bob
Publisher: Accolade
Platform: DOS / 3DO

Star Control 2 is just… well… it’s just different. It dates from an earlier age, when games weren’t expected to fit neatly into preconceived genres. Like so many contemporary games, it has an unmistakable “90s VGA” feel to it, with colors picked not because they looked nice but because they were in the default DPaint palette.

I recommend reading The Escapist’s review for a modern look back at it.

The source presented below is derived from the 3DO port, because the original PC source was lost. Unfortunately this is an all too common occurrence, with many old games simply disappearing once everyone leaves the company.

Source code (3DO):

Wolfenstein 3D / Blake Stone (1992/3)

Developer: id Software
Publisher: Apogee Software
Platform: DOS

While based on the earlier Catacomb engine, Wolf3D marked a notable upgrade, featuring VGA graphics. But at its heart, it was just plain more fun.

Like many of id’s releases, the source code is (relatively) readable for modern eyes, although the core parts are written in 16-bit assembly language (something its sequel Doom managed to stay away from).

Of particular note is the method they use for drawing vertical lines, where they actually dynamically generate different drawing functions for each possible wall height.
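The real code goes as far as emitting unrolled machine code for every height; as a tamer illustration of the same precompute-per-height idea (the constants and names below are invented, not taken from the Wolf3D source), one can build a lookup of which texture row each screen pixel samples:

```cpp
#include <vector>

// For each possible on-screen wall height, precompute which texture row
// every screen pixel samples, so the per-column inner loop needs no
// divides. (Wolf3D goes further and generates actual code per height.)
const int TEX_H = 64;   // texture height in texels
const int MAX_H = 200;  // tallest wall column we'd ever draw

std::vector<std::vector<int>> BuildScalers()
{
    std::vector<std::vector<int>> scaler(MAX_H + 1);
    for (int h = 1; h <= MAX_H; h++) {
        scaler[h].resize(h);
        for (int y = 0; y < h; y++)
            scaler[h][y] = y * TEX_H / h;   // fixed mapping for this height
    }
    return scaler;
}
```

The column-drawing loop then becomes a straight table-driven copy, which is the whole point of doing the work up front.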

Fabien has a guide on getting the original source to compile using the original development tools.

Blake Stone, an Apogee spin-off title based on the same engine, was released in 1993, one week before Doom. As you can imagine, it struggled to gain attention and faded off into obscurity afterwards.

Wolf3D source code:
Blake Stone source code:

Doom (1993)

Developer: id Software
Publisher: GT Interactive
Platform: DOS
Code review:

In many ways, Doom is the ultimate engine to study. At the time of release, it broke new ground in every way possible: a fully-immersive first-person world, not limited to a flat plane like its predecessor Wolfenstein. It had lighting and texture mapping, and it invented DeathMatch.

But perhaps its most significant contribution is popularizing the idea of the “engine”. Before Doom, games tended to be tied very tightly to their data. Doom encouraged the idea of the data-driven game – an engine detached from its source assets. This allowed the newly-developing mod community to take it in entirely new directions (see for example, Aliens TC or Fistful Of Doom).

Source code:

Descent (1994)

Developer: Parallax Software
Publisher: Interplay Productions
Platform: DOS

In the wake of 1993’s Doom, many companies were rushing about trying to play catch-up, resulting in a wave of “Doom clones”. One company, Parallax, managed to create something completely different.

Descent allowed the player to fly a ship through a maze of underground passages. Its innovative portal technology allowed it to feature full 3D, with rooms not limited to Doom’s 2.5D (something that id Software’s Quake would not feature until a year later).

Source code:

Gravity Force 2 (1994)

Developer: Jens Andersson and Jan Kronqvist
Publisher: Shareware
Platform: Amiga

Anyone of a certain age may well remember this. Amiga Power once rated it the 2nd-best game of all time.

Perhaps this isn’t really a commercial game. It’s debatable. It was released as paid shareware, and then later (due to its popularity) licensed to be given away free on an Amiga Power coverdisk. I’m including it because, to be honest, there are very few games of the era that give away any source code at all. So if you want to see how a 16-bit game was made, here’s where you might look.

GF2 still has a homepage (see the link below). Be sure to check out this interview with the authors!

Source code:

Heretic / Hexen (1994/5)

Developer: Raven Software
Publisher: id Software / GT Interactive
Platform: DOS

Many Doom-clones littered the market in the wake of Doom’s fallout, but Raven’s offerings are unique for two reasons: 1) for actually licensing the Doom engine, and 2) for actually being any good.

Notable improvements over Doom include (famously) the ability to look up and down, but only a little. Hexen also used a custom scripting language for its game events, something that was a relatively new idea at the time, but has since gone on to be commonplace. It also popularized the “hub-world” system of level progression.

Source code:

Rise Of The Triad: Dark War (1995)

Developer: Apogee Software
Publisher: Apogee Software / FormGen
Platform: DOS

ROTT was an odd beast. It derives from the Wolfenstein 3D engine, and it’s interesting to see how they managed to work around the limitations of that flat-world engine to create height-ish effects.

Although Apogee tried to stick as many bells and whistles on it as possible, it simply couldn’t compete once 1993’s Doom was released. Still, it was a lot of fun. PC Gamer summed it up well:

It didn’t ask you to aim up or down, quick save every few minutes, or worry about fiddling with graphics settings. It did, however, beg you to explode, shoot, and instagib everything.

Source code:

Marathon 2: Durandal (1995)

Developer: Bungie Software
Publisher: Bungie Software
Platform: Apple Macintosh / Windows 95

The Marathon series was famous at the time for being about the only games available on the Apple Macintosh. And to be honest, well… I think it’s fair to say they fall into the category of “Doom clone”. Three months after its release, id Software released the famous “qtest” build of Quake, which gave players a teaser of what was coming.

Macs at the time were often thought to be only used by writers and artists, and despite Marathon’s best efforts, that image largely stuck. The small company that made it did however go on to have moderate success on other platforms.

Source code:

Duke Nukem 3D / Shadow Warrior (1996)

Developer: 3D Realms
Publisher: GT Interactive Software
Platform: DOS
Code review:

Out of the many Doom clones, the 3D Realms games are some of the few to try and take the technology in new directions.

Powered by Ken Silverman’s famous Build Engine, they added many new features such as sloped floors, rooms-above-rooms, and mirrors.

Unfortunately, while it pulls off some very impressive technical feats, the source code itself is… well… let’s try to be polite here: awful. I’ve looked at it several times and still have literally no idea what any of it does. Luckily Fabien’s code review manages to shed some light on it.

For more information on the Build Engine, see the author’s homepage.

Duke Nukem 3D source code:
Shadow Warrior source code:
Build engine source code:



Quake 1/2/3 (1996-1999)

Developer: id Software
Publisher: GT Interactive / Activision
Platform: DOS / Windows / others
Code review: (Quake 1)
Code review: (Quake 3)

I’m not going to write much about Quake. You probably already know most of it. Its landmark software renderer allowed a full 3D texture mapped world, with no 2.5D hacks or cheating.

Perhaps there are some lesser-known things about its source though. As far as I know, this is the first commercial game to be compiled with an open-source compiler (DJGPP for DOS, an early gcc port).

It came with its own scripting language, “Quake C” (later lcc for Quake 3). This was created not to ease development, but specifically so that players could make their own custom modifications. This, along with Doom’s PWAD system, helped to launch the mod community.

Quake 1 had an innovative surface caching mechanism to cache the shading results. Once 3D accelerators became common, however, this fell out of fashion. I wonder if it’s time for a revisit? id’s later game “Rage” used a similar idea.

One other notable thing about Quake is that it was absolutely rock-solid. Both the rasterizing and collision detection never glitched or slipped, unlike many contemporary games for years after.

Quake source code:
Quake 2 source code:
Quake 3 source code:

Abuse (1996)

Developer: Crack dot Com
Publisher: Electronic Arts / Origin Systems
Platform: DOS / Linux / Mac

I like Abuse. (wait, that sounds wrong).

It caused a mild stir when it came out due to several innovations. It had a really cool mouse+keyboard joint control scheme, which was kinda new at the time. It had dynamic lighting, which for a platformer was unheard of.

But that’s not what I liked most. From a programmer’s point of view, the most interesting thing in the game is the “visual Lisp” system that it contains.

The entire game is scripted in a custom Lisp-like language. So things like enemy behaviors are fully configurable at runtime, rather than being compiled in.

Even more interesting perhaps is the way events can be hooked up inside the (built-in!) map editor: you can visually drag lines from a switch to a door, or from a tripwire to an enemy-spawner. You can even add AND/OR gates, all placed visually as hidden level objects. It’s certainly something I’ve not seen in other editors.

Unfortunately, I don’t think it was a great commercial success, and it was released as open source only two years later. The second Crack dot Com game, Golgotha, was also released as open source, including all the art assets.

Abuse source code:
Golgotha source code:

Aliens versus Predator (1999)

Developer: Rebellion
Publisher: Fox Interactive / Electronic Arts / Sierra On-Line
Platform: Windows / Mac

Honestly, I’m just hugely relieved to see a FPS listed here that isn’t based off id Software’s engines.

While it didn’t have much in the way of technical innovation, the single-player campaign was a blast.

This remains a good example of a non-idTech engine.

Source code:

Freespace 2 (1999)

Developer: Volition, Inc.
Publisher: Interplay Entertainment
Platform: Windows

Technically a descendant of the Descent franchise but not really, FreeSpace 2 is an entirely space-based single- and multi-player flight-combat game.

FreeSpace 2 is an excellent example of how opening up the source code has allowed the game’s lifespan to continue on for years after it otherwise would have died out, with various content packs and engine upgrades being released.

Source code:

The Operative: No One Lives Forever (2000)

Developer: Monolith Productions
Publisher: Fox Interactive / Sierra Entertainment / MacPlay
Platform: Windows / Mac / PlayStation 2

The LithTech engine has a long history, although it has often taken a backseat to the more famous Quake and Unreal engines.

I haven’t poked around in the NOLF source too much, but I suspect it may only be the source code to the game part, and not the actual LithTech engine itself. Certainly, despite this game being released for the PlayStation 2, there’s no chance to see the workings of a PlayStation 2 game in there.

I feel this is a shame – developing for the PS2 would be considered completely alien for many programmers nowadays, benefiting from a much more data-oriented approach than today’s APIs allow.

Source code:

MechCommander 2 (2001)

Developer: FASA Interactive
Publisher: Microsoft
Platform: Windows

Microsoft and open source historically haven’t gone together. But they’re starting to relax a little in their old age.

Still, it’s nice to see that big companies are starting to adopt a more open attitude to things that otherwise have zero commercial value for them but immense historical value.

Last year even saw the release of the source code to early versions of MS-DOS and Word, something unthinkable 30 years ago.

Source code:

Doom 3 (2004)

Developer: id Software
Publisher: Activision
Platform: Windows / Mac / Linux / Xbox / PS3
Code review:

Doom 3 remains one of the best examples to look at if you want to study a modern AAA game engine.

At the time of its release, it innovated in many areas. Its method of baking high-res source models onto low-res game assets is now standard in all commercial games.

The source has many other interesting finds – its physics system alone is worth a look to see how it attains continuous collision detection.

This is the first of the id engines to be written in C++ instead of C. One thing about all the older id engines is that they had a very clear simplicity to them because of it. Doom 3 manages to keep that, but it does mark a notable change in direction. Comparing id’s C++ style to the other C++ engines in this list certainly reveals different approaches.

Doom 3 was also (in)famous for using stencil shadows for all its lighting. It’s debatable as to whether this was an “interesting experiment” or something that should have been pursued further, but the fill-rate costs associated with it mean that almost all games nowadays tend to prefer shadow maps instead. Still, perhaps that will come around again one day.

Fabien Sanglard’s excellent code review is well worth the read, as usual.

Source code:
BFG edition source code:

Gish (2004)

Developer: Cryptic Sea
Publisher: Chronic Logic / Stardock
Platform: Windows / Linux

Gish was notable for its unique blobby-physics gameplay (as well as its unusual time signatures…), so it’s nice to be able to pick apart the source and find out how it was done. No third-party physics engines here.

Many people commented on Casey Muratori’s Handmade Hero project recently to say how “no-one would write a real game like that”. Well, here’s an example of someone doing just that, with good success.

It’s interesting that Gish is written entirely in C – something that doesn’t happen much nowadays. While it’s not the neatest code I’ve ever seen, it’s a very good example of how a game doesn’t have to be a giant sprawling codebase with hundreds of external dependencies.

Source code:

Canabalt (2009)

Developer: Adam Saltsman
Publisher: Semi-Secret / Beatshapers / Kittehface
Platform: Flash / iOS / PSP / Android / Ouya

It’s not the most complex game, but does it need to be? If you want to learn how to write a game, start with something simpler. Here it is.

Prototyped in only 5 days, and ported to iOS in 10, Canabalt shows how you build up a simple idea into something worthwhile. In many ways, it’s a return to the 8-bit days when new genres could be invented every week. It’s a shame therefore that there have been so many clones of it, with people preferring to stick with someone else’s idea rather than grow their own.

Canabalt showed how simple things could be if we let them. Learn from it, and use it to launch your own dreams.

Source code:
Release notes:

Ones I missed

UPDATE: Thanks to the many eagle-eyed readers who, after reading this, flocked to mention some other games I missed. I may do a proper update for them, but here’s a brief list:

  • Jagged Alliance 2
  • Homeworld
  • Aquaria
  • Star Wars Jedi Knight: Jedi Academy / Jedi Outcast
  • Arx Fatalis
  • Penumbra Overture
  • Meridian 59
  • Commander Keen: Keen Dreams

I’ll try and track down proper links for them.

Runners up

There are a few other games I should mention. These have not had a source-code release; instead they have been reverse engineered by fans. While this doesn’t provide the same information as the actual source code, it can still be worth a look:


And lastly, the following games have had their source-code leaked illegally, so I won’t give them any further attention:

  • Half Life 2
  • Falcon 4.0
  • ReVolt
  • Turrican III
  • Mr. Nutz: Hoppin’ Mad
  • Trespasser (oh all-right, go read the code review)

Very Sleepy 0.9


Attentive readers may have noticed there hasn’t been much activity on the Very Sleepy front in recent years. This has led to several people forking Sleepy in order to provide new fixes. Of course, due to the GPL license the project exists under, this is not only allowed, but even encouraged!

However, having multiple forks around tends to confuse users as to which version they should be using. It also makes it harder to know where to go to get the “latest” version.

A lot of this was a failure on my part, due to not having an online DVCS repository for people to contribute to. Maintenance gets a lot easier when pull requests can drop straight in.

One of the more popular forks is Vladimir Panteleev’s GitHub-hosted Very Sleepy CS. Vladimir has put an incredible amount of effort into new improvements and bugfixes over the past couple of years. Thanks to persistent pestering on his part, the decision has been made to merge the two projects back together.

Very Sleepy CS will now be dropping the “CS” and merging into the official distribution. The homepage will remain active here at the same address, and news updates will continue to be posted here as new versions are released.

Vladimir’s GitHub page will become the official repository for latest development versions, and for bug tracking.

The latest CS version has been posted to the homepage here for download. If you’re still using 0.82, what are you waiting for? :-)

It’s a magical world, Hobbes, ol’ buddy… …let’s go exploring!

Changes brought forward from the CS branch:

  • Redesign parts of the file format and internal database representation, to allow more exact late symbol loading, as well as a disassembler view in the future
  • Add an “Address” column to all function lists
    • For the call stack and callers view, the address shown is the one just past the call instruction
  • Several fixes to the crash reporter
  • Use wxWidgets 2.9.5
  • Fix problems caused by dbghelp.dll hijacking
  • Fix handling of symbols containing whitespace characters
  • More user interface improvements
  • Contributed by Michael Vance:
    • Add CSV export for the callstack view
    • UI fixes and code cleanup
  • Numerous user interface performance, responsiveness and usability improvements
  • Allow specifying additional symbol search paths
  • Add Back and Forward menu items and hotkeys for function list navigation
  • Improve overall performance
  • Add late symbol loading by saving a minidump during profiling
  • Install 32-bit version alongside 64-bit version
  • Contributed by Richard Munn:
    • Added a time limit option to the interface
    • Added function highlighting and filtering

The joy of INCBIN


-or- why do we have to load?


Why do games have to load data?

At first glance that seems like a stupid question. Of course they have to load data: textures, models, etc. How else would they draw anything?

But games didn’t always load data files. That’s right: in the early ’80s, calling functions to load your data would have been considered unthinkable.

Many people reading this will be too young to remember the INCBIN command, or one of its many variations. Video games in the early ’80s were generally written entirely in assembly language. Most assemblers had a special directive, usually called INCBIN or similar, which would allow you to include any binary file and embed it into your program. I’ll provide a brief example:
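Something along these lines, in a generic assembler syntax (the label and filename here are invented for illustration):

```asm
; Embed the raw bytes of player.spr directly into the program image,
; with a label so the rest of the code can refer to the data by name.
player_gfx:
    INCBIN "player.spr"
```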

The C++ equivalent for this would be something like:
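Along the lines of the following, with the array contents typically spat out by a small bin2c-style converter (the name and byte values here are invented):

```cpp
// The file's bytes baked straight into the executable as a global
// array, addressable like any other symbol -- no loading required.
const unsigned char player_spr[] = {
    0x3C, 0x42, 0x99, 0xA1, 0xA1, 0x99, 0x42, 0x3C,
};
const unsigned int player_spr_len = sizeof(player_spr);
```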

Basically you’re telling the assembler to read in your data, and spit it out directly as an array in your program.

It’s pretty rare to see data loaded in that way in a modern C++ program. Instead perhaps you might do something like this:
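For instance (a sketch; the helper name is made up):

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Modern style: the asset lives outside the executable as a separate
// file, and is read into freshly allocated memory on request.
std::vector<unsigned char> LoadFile(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(
        std::istreambuf_iterator<char>(in),
        std::istreambuf_iterator<char>());
}
```

Used as, say, `std::vector<unsigned char> sprite = LoadFile("player.spr");` whenever the data is wanted.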

So what’s the difference here? In the C++ version, the data files are assumed to lie *outside* the program, and each will be loaded separately when asked.

Why was this considered unusable at the time?

Let me give you a brief rundown of loading games on a ZX Spectrum.

ZX Spectrum memory map.

The Spectrum has 48KB of memory for use exclusively by the game.

From the point of view of a user wanting to run a game, you’d typically issue the “LOAD” command, wait 4 minutes for the tape to load, and then the game would run. And generally speaking (ignoring multiload games for this discussion), you’d then stop the tape and play the game.

Tapes aren’t random access. The data feeds into the computer in the order it is on the tape. You can’t just ask for a specific file to be read, you have to accept whatever data is next up in the queue.

OK, so maybe you could figure out what order you wanted the data in, arrange for it to be on the tape in that order, and then figure out some way of annotating each chunk so that the loader knew what it was and where it needed to be. You’d need to write a special little tool to do that.

Oh wait, no you don’t. You have one already, it’s called the assembler. Using INCBIN, the assembler automatically places everything where it needs to be, keeps track of a symbol name for each piece of data, and everything can just get loaded as one giant binary blob.

More than tapes

Of course tape-based loading went out of fashion fairly quickly, making way for ROM cartridges (the Genesis and SNES era). And yet the INCBIN approach still works well here.

How do you load a file on the SNES? It’s stored on a ROM cartridge. So maybe you could allocate some RAM for it, find it on the cartridge, and then copy it from there into RAM.

But wait – you don’t need to do that – it’s already right there in ROM. You don’t even need to load it. You can just use it directly in-place from the ROM. So all you really need is some kind of file-system; a table that contains a mapping from each filename string to the address in ROM of the file.

And again here we realize there’s already a tool to do that for us; the assembler. We don’t need to invent our own string->data mapping code, there’s one already in the assembler. It’s called the symbol table. We just INCBIN the files, and let the assembler take care of tracking the name for each one.
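If you did hand-roll that filename-to-address table instead, it might be little more than this (a C++ sketch with invented names; on a real cartridge the assembler’s symbol table makes even this unnecessary):

```cpp
#include <cstddef>
#include <cstring>

// Data "files" baked into the ROM image as plain arrays.
static const unsigned char title_gfx[]  = { 0x01, 0x02, 0x03 };
static const unsigned char level1_map[] = { 0x10, 0x20 };

// The whole "file system": a name -> (address, length) table.
struct RomFile { const char *name; const unsigned char *data; size_t len; };

static const RomFile rom_fs[] = {
    { "title.gfx",  title_gfx,  sizeof(title_gfx)  },
    { "level1.map", level1_map, sizeof(level1_map) },
};

// Look a file up by name; the data is used in place, never copied.
const unsigned char *rom_find(const char *name, size_t *len)
{
    for (size_t i = 0; i < sizeof(rom_fs) / sizeof(rom_fs[0]); i++) {
        if (std::strcmp(rom_fs[i].name, name) == 0) {
            *len = rom_fs[i].len;
            return rom_fs[i].data;
        }
    }
    return nullptr;
}
```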


So this sounds great! We don’t need any external files for our game, we can just INCBIN everything and produce one giant executable with everything right there at our fingertips. We don’t need to load anything as we have a loader already in the OS!

And yet you’ll never see a modern commercial game using this technique. Why?

Compiler-writers and OS developers broke it.


Firstly you’ll notice most high-level languages didn’t think to add an INCBIN directive. C/C++ has an #include command, but it can only be used to bring in more C++ source code, not binary files. You can get around that though by writing a small utility to convert your binary files to a char[] array, so it’s only a mild annoyance.

That’s not the real problem though. A modern game today might use, let’s say, 1GB of data. Can you imagine what would happen if we tried to make a 1GB executable? It’d be a mess.

Firstly you’re talking about pumping 1GB of data through the poor linker. Linkers should absolutely be able to handle that. But I wouldn’t like to bet any money on it. You’d get ‘weird’ errors. Segments too big, relocation offsets too big, who knows.

Secondly, even if you did manage to do it, the OS EXE/DLL loader wouldn’t be expecting it. Even on 64-bit Windows, you can’t make an EXE bigger than 4GB. I can imagine some virus checker kicking in every time you tried to run the game, waiting for minutes while it scanned this giant EXE. Even though there’s no difference between data coming from a file and data in an EXE, the virus checker would still want to have its way first.

Where does this leave us?

You can’t ship a large modern game using this method. Not because it wouldn’t work, it’d work fine. But because control over loading got taken away from us. It used to be that your EXE was king, you were given memory space with guaranteed properties, and within that space you controlled the entire system.

Nowadays that memory space isn’t really yours any more. Some systems (iOS, Xbox 360, etc) don’t allow you to even allocate executable memory areas. Everything has to go through the approval of the OS writers. If you want to do anything differently, you can’t. End of story.

We have a compiler. We have a symbol table. This table maps names to addresses! It’s everything we need, but we can’t use it, purely because loading is outside of our control. We can’t even write our own dynamic loader, as they’re taking that away too.

The modern method of having to maintain a separate resource manager sucks. It violates the DRY principle, with load management completely separated from the code trying to use it.

I know of few commercial games that manage their data like this. Jak & Daxter on the PS2, due to having its own programming language, used its own dynamic code/data loader (somewhat similar to Unix shared libraries). The PS2 was one of the last consoles where you could get away with this, due to having full control over the machine.

So now we’re stuck in a world where we have this system for loading things, but I can’t use it because it got too specialized for the mainframe use case, and now I’m not allowed to write my own version either.


On Building Living Worlds


I always found it fascinating how humans seem to have this in-built desire to create new worlds.

For the past 4 years, Miguel Cepero has been running his Procedural World blog, chronicling his exploration of world-building. From a simple voxel landscape, through to L-system grammars for making voxel buildings, it’s certainly an impressive effort, and worth going back over his post history to see it evolve.

No Man's Sky, scheduled for 2015.


The upcoming game “No Man’s Sky” has been lauded recently for its use of procedurally-generated content. It certainly has some damn fine visuals. In this video they talk about how the game uses random seeds to build each planet, and how they use a set of algorithms to place content in the world.
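The seed trick they describe can be sketched in miniature: derive every property of a world cell from a deterministic hash of the seed and the coordinates, so the same seed always rebuilds the same world and nothing needs storing. (The mixing constants below are just typical hash values, nothing from the actual game.)

```cpp
#include <cstdint>

// A tiny integer hash: seed + coordinates in, pseudo-random value out.
// Deterministic, so the "world" is recomputed rather than stored.
uint32_t WorldCell(uint32_t seed, int32_t x, int32_t y)
{
    uint32_t h = seed;
    h ^= (uint32_t)x * 0x9E3779B9u;
    h = (h << 13) | (h >> 19);       // rotate to spread the bits
    h ^= (uint32_t)y * 0x85EBCA6Bu;
    h *= 0xC2B2AE35u;
    return h ^ (h >> 16);
}
```

A terrain generator would then map that value to a height, a biome, and so on, per coordinate, on demand.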

Lords Of Midnight on the ZX Spectrum, 1984


This isn’t by itself a new technique by any stretch. The ZX Spectrum classic “Lords Of Midnight” (recently remade for iOS and Android with an amazing amount of respect and devotion to the original) is one of the earlier examples of using random numbers and simple algorithms to create a world bigger than the computer could otherwise store.

Mike Singleton crammed an entire world, with enemies, forests, mountains, towers, and more, into only 41KB. For reference, NOTEPAD.EXE is 189KB.

What sets Lords Of Midnight apart from countless other games of the time is its complex mythology. To the player, this isn’t just some randomly generated maze or fractal heightfield. The world is inhabited by characters, allies, foes. The back story dictates how these relationships came about, sets the reason for the quest, and lays out which of the multiple paths you’ll take to complete the game.

It’s not just enough to create a landscape. You need something to go in it too. You need a story. Lords Of Midnight, like many others, drew heavily on the works of Tolkien for its inspiration.

Many game creators think that a procedural world means generating a random landscape, then sticking random things on it at random. But a world is more than just a collection of things.

You’ll often see fantasy authors get this wrong. Have you ever read a book where the author creates some character called perhaps “Rag’na K’ptolth”, with apostrophes flung in at random like some kind of Photoshop Apostrophe Lens Flare?

Or perhaps you played a game that created “Tolkienesque” names by automatically throwing random letter-pairs together? This is the complete opposite of how Tolkien crafted his world. There was nothing random about a single word or language he created.
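For the avoidance of doubt, the lazy approach being criticized looks something like this (a deliberately bad sketch; the syllable list and apostrophe rule are invented here, and this is emphatically not anything Tolkien did):

```python
import random

# An invented syllable list -- the kind of thing these generators use.
SYLLABLES = ["rag", "na", "tol", "mor", "gul", "eth", "dor", "kep"]

def naive_name(rng: random.Random, n_syllables: int = 3) -> str:
    """Glue random syllables together, optionally flinging in apostrophes --
    the lazy approach criticized above."""
    parts = [rng.choice(SYLLABLES) for _ in range(n_syllables)]
    sep = "'" if rng.random() < 0.5 else ""
    return sep.join(parts).capitalize()

rng = random.Random(0)
for _ in range(3):
    print(naive_name(rng))
```

The output looks superficially exotic, but there is no language behind it: no phonology, no history of contact with neighboring tongues, no meaning.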

Tolkien was foremost a professor of languages, and only secondarily an author. Nothing he invented was random. He considered languages inseparable from the mythology associated with them.

He wrote many letters over the years explaining to people parts of his world, and how he went about building it. In his 1955 letter to the Houghton Mifflin Co. (his publishers), he writes:

J. R. R. Tolkien, 1955: (emphasis mine)

All the names in the book, and the languages, are of
course constructed, and not at random…

… what is I think a primary ‘fact’ about my work, that it is all of a piece, and fundamentally linguistic in inspiration. The authorities of the university might well consider it an aberration of an elderly professor of philology to write and publish fairy stories and romances, and call it a ‘hobby’, pardonable because it has been (surprisingly to me as much as to anyone) successful. But it is not a ‘hobby’, in the sense of something quite different from one’s work, taken up as a relief-outlet.

The invention of languages is the foundation. The ‘stories’ were made rather to provide a world for the languages than the reverse. To me a name comes first and the story follows. I should have preferred to write in ‘Elvish’. But, of course, such a work as The Lord of the Rings has been edited and only as much ‘language’ has been left in as I thought would be stomached by readers. (I now find that many would have liked more.) But there is a great deal of linguistic matter (other than actually ‘elvish’ names and words) included or mythologically expressed in the book. It is to me, anyway, largely an essay in ‘linguistic aesthetic’, as I sometimes say to people who ask me ‘what is it all about?’

The way a language evolves over hundreds of years depends on which other cultures its speakers come into contact with. Nothing is named in isolation. For example, the town of York in England underwent several changes of name during its lifetime:

Originally Eborakon, from when the Britons inhabited the mainland. Then, when the Anglo-Saxons invaded from Northern Europe, it became corrupted into Eoforwic (the Briton word Ebor being confused with the Anglo-Saxon word Eofor).

In the 9th century, the Danes captured control of the north of England, and as the city fell under Viking rule, the spelling shortened to Jorvik. Over the years since, this gradually modernized to York, still in use today.

Tolkien made over 20 different languages during his lifetime, each tied to specific races and geographical areas. He created his world specifically to give his languages somewhere to evolve.

One of the few games that seems to use this kind of history-driven approach to world-building is Dwarf Fortress.

Hundreds of years of history being played out.

Dwarf Fortress starts by doing the usual things. A fractal landscape is used as a base. They then add a temperature map, rainfall projection, drainage, vegetation. Erosion is applied and then the world is populated with animals and people.
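Those layered steps can be sketched roughly like this (a toy version: plain random noise stands in for the fractal heightmap, and the climate rules are invented for illustration, not Dwarf Fortress’s actual formulas):

```python
import random

def generate_world(seed: int, size: int = 16):
    """Layer maps on top of each other: height first, then temperature,
    rainfall, and finally a biome derived from all of them."""
    rng = random.Random(seed)
    height = [[rng.random() for _ in range(size)] for _ in range(size)]
    world = []
    for y in range(size):
        row = []
        for x in range(size):
            h = height[y][x]
            temp = 1.0 - h                       # higher ground is colder
            rain = rng.random() * (1.0 - h / 2)  # crude: peaks get less rain
            if h < 0.3:
                biome = "ocean"
            elif temp < 0.3:
                biome = "tundra"
            elif rain > 0.35:
                biome = "forest"
            else:
                biome = "plains"
            row.append({"height": h, "temp": temp, "rain": rain, "biome": biome})
        world.append(row)
    return world
```

The point is the ordering: each layer is derived from the ones beneath it, so the final biomes are plausible consequences of geography rather than independent dice rolls.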

But what’s somewhat unique is that the game then goes on to simulate an entire cultural history of the world. There’s an absolutely fascinating write-up of it, but basically it simulates the creation of towns, conflicts with neighbors, and the rise and fall of civilizations, week by week, for 250 years. All before you even think about starting to play the game.

For instance, when you travel to certain cities in the game and speak to a merchant they might tell you that their leather caps are made in an elvish city half a world away. And it will be true. They really were made there, during world creation, and traveled to this market for you to buy before you even started playing.
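A toy sketch of how that kind of provenance can fall out of the simulation for free (the city names, item, and trade probabilities are all invented here):

```python
import random

def simulate_history(seed: int, years: int = 250):
    """Pre-game history simulation, in miniature: each year a city crafts
    an item, and caravans occasionally carry goods to other markets."""
    rng = random.Random(seed)
    markets = {"Elhadron": [], "Dwarfhold": [], "Midport": []}
    for year in range(years):
        maker = rng.choice(sorted(markets))
        markets[maker].append({"item": "leather cap", "made_in": maker, "year": year})
        if rng.random() < 0.3:  # a caravan sets out
            src, dst = rng.sample(sorted(markets), 2)
            if markets[src]:
                markets[dst].append(markets[src].pop())
    return markets

markets = simulate_history(7)
# Any cap on sale in a given market carries its true origin and crafting year,
# because it genuinely was made and transported during the simulation.
```

Because the merchant’s claim is just a readout of simulation state, it can never be inconsistent with the world’s history.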

It seems to me like many of the more successful attempts at world-building follow this basic theme: Invent the culture and situation first, and then discover what kind of world would be born out of that.

I don’t know of any other procedural games that use the same kind of internal depth to world-building. Let us know of any you come across!


The Winter Of Our Dis-content


I’m going to have to be a little mean here. I try not to be. It’s so easy to be mean and so hard to be constructive, and generally the world is a better place for everyone when people can be constructive.

But I’m afraid I need an example for this post, and so I’m going to have to pick on SourceForge.

SourceForge was basically the first company in the open-source hosting movement, and back in 1999 the novelty of simply having free hosting for a project was a good enough reason to use it.

But let’s have a look at their website now:


Wow. Just wow. Well there’s a lot of adverts, but let’s ignore that. What’s the real problem here?

Where’s the content?

Presumably you arrived at a SourceForge website because perhaps you wanted to:

  1. Read about the project and what it’s supposed to do.
  2. Poke around in the code and see how they do things.
  3. Contribute by reporting bugs or such.

This is the content. The project, the thing the creators have created. This needs to be the foremost thing on the website. And yet the lonely “Description” text gets crammed into a tiny text-box buried underneath a glacier of metadata.

Where’s the code? I mean it’s open-source, right? The clue is in the name, there should be some source here somewhere. Hosting source code is the ONE THING this website exists to do.

Oh wait, there it is. In case you haven’t spotted it yet, there’s a tiny tiny little “Code” button hidden up on the top-right of the toolbar. OK, so SourceForge has terrible web design. But why does this anger me so much? There are other source-hosting websites out there people could be using, sure, but there’s a fundamental problem here that is worth studying.

Our glorious benefactor Gabe

I want to divert briefly to talk about Valve. Over the past 10 years Valve have been one of the companies most famous for trying to innovate in the field of online economics. The release of Steam as a distribution platform, launching Team Fortress 2 as a free-to-play game, the Hat Based Economy, and more.

Their founder Gabe Newell has talked at length on many occasions about what Valve tries to do as a company, why they exist, and how they interact with their customers.

Gabe gave a fascinating talk last year at the Lyndon B. Johnson School, where he describes the process Valve use to empower their customers in creating new value.

That by itself is an unusual idea to some businesses. Often you’ll hear a business talk about increasing “sales”. What Valve have figured out is that sales are simply a subset of value, and that increasing “value” is what you need to be doing.

The first thing that comes across is that Valve really care about what’s best for the customers:

(at 15:45)
Valve is not a publicly traded company. Being a publicly traded company adds a bunch of headaches, and it didn’t really solve any problems for us. It means that control and decision making now involves third-parties.

So for a developer or somebody building something at Valve, it’s like: there’s the customer, and they’re the person that you’re trying to make happy. […]

You don’t go to board meetings where the board argues about what the third series of venture capitalists are worried about, dilution and hitting certain targets. There’s no notion of a distribution channel who says “we’re really looking for something that fits in this particular slot at this particular time”.

The whole point of being a privately held company is to eliminate another source of noise in the signal between the consumers and producers of a good.

That last line is the key part. The easy interchange of content between the consumer and producer is the single most important thing any company should be worrying about.

(at 33:25)
We’re also seeing this huge uptake in user-generated content.

To be really concrete, 10X as much content comes from the userbase of TF2 as comes from us. So we think that we’re super productive and kinda bad-ass at making TF2 content, but even at this early stage, we cannot compete with our own customers in production of content for this environment.

So the only company we’ve ever met that kicks our ass is our customers. Right we’ll go up against Bungie, or Blizzard, or or… anybody, but we won’t try to compete with our own userbase, because we already know that we’re going to lose.

Once we start building the interfaces for users to start selling their content to each other, we start to see some surprising things.

An inquiry into value

Now let’s apply some of this advice to SourceForge. In 2008 GitHub launched as another of the many open-source project hosting sites.

Let’s look at a typical GitHub page:


Ignoring all the adverts, toolbars, whatever, there is exactly one critical difference between the two. GitHub has the user’s content as the most important thing on the page. Right there, the first thing you see, is the source code. Directly underneath, the user places their documentation, laid out as if it were a website.

The link between the content creator and the content consumer is established directly. The trouble with SourceForge is that it’s under the impression that the content’s metadata is more important than the content.

Gabe mentioned how Steam actually acts as a bottleneck between content creators and consumers, something he’s actively trying to change. This is what SourceForge is – a bottleneck. A bottleneck designed to slow you down and make it harder to get what you want. By removing the barrier between content creators and users, by placing the content centered boldly on the project’s page, GitHub allows both their customers and users to directly create value. GitHub’s customers will do a better job of marketing the website than GitHub itself could ever manage.

You are not your customer. Your advertisers and affiliates are not your customer. Your advertisers are there to fund the customer’s needs. For every decision you make you have to stop and ask yourself: Is This Best For The Customer?

I don’t know who SourceForge’s intended customer is, but I’m fairly sure it isn’t supposed to be the people trying to work with projects.

To paraphrase the classic Zen and the Art of Motorcycle Maintenance, neither the producer nor the consumer has value alone. Value is what defines the producer and the consumer:

Value can’t be independently related with either the creator or the consumer but could be found only in the relationship of the two with each other. It is the point at which the creator and consumer meet.

If your customers can’t get at your content then your content can’t get at your customers, and you have no customers.