I’m enjoying myself with some optimizing today (yeah I’m weird like that). So I thought I’d jot down some of my tips for making your game faster. These are general, not language-specific tips.

Never run code you don’t have to run

Seems obvious, but few people actually do this. For example, in Democracy 3, the simulation calculates the popularity of each policy by asking every voter if they benefit from it. That question is complex, and there are a few hundred policies and 2,000 voters. This takes time. Solution: I only ask them about a policy if I need the answer right now. Some policies can go a dozen turns without the player ever checking their popularity, so why keep calculating it?
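To make that concrete, here is a minimal sketch of the "only ask when needed" idea. It is not the actual Democracy 3 code: the `Voter`, `Policy` and `Simulation` types, and the income rule that decides who benefits, are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real game's types are not shown in the post.
struct Voter { double income; };

struct Policy {
    double incomeThreshold;  // made-up rule: voters earning less than this benefit
    double popularity;       // last computed answer, only meaningful if upToDate
    bool upToDate;           // false = nobody has asked since the last turn
};

class Simulation {
public:
    Simulation(std::vector<Voter> voters, std::vector<Policy> policies)
        : voters_(std::move(voters)), policies_(std::move(policies)) {}

    // Runs every turn: throws the old answers away but calculates nothing.
    void nextTurn() {
        for (Policy& p : policies_) p.upToDate = false;
    }

    // The expensive voter loop only runs when somebody actually asks.
    double popularity(std::size_t index) {
        assert(index < policies_.size());
        Policy& p = policies_[index];
        if (!p.upToDate) {
            std::size_t inFavour = 0;
            for (const Voter& v : voters_) {
                if (v.income < p.incomeThreshold) ++inFavour;
            }
            p.popularity = voters_.empty()
                ? 0.0
                : static_cast<double>(inFavour) / static_cast<double>(voters_.size());
            p.upToDate = true;
            ++evaluations_;
        }
        return p.popularity;
    }

    // How many times the expensive loop really ran (handy for checking the saving).
    int evaluations() const { return evaluations_; }

private:
    std::vector<Voter> voters_;
    std::vector<Policy> policies_;
    int evaluations_ = 0;
};
```

If the player never opens a policy's screen, `popularity()` is never called for it and the voter loop for that policy never runs, no matter how many turns pass.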

Batch Stuff

If you have a dozen icons that are always drawn one after another on the same screen, stick them in a texture atlas. If you are 100% sure they will never overlap, then draw them in a single draw call. The fewer texture swaps and draw calls, the faster your code. This is trivial to do in 3D, an absolute nightmare to do properly in 2D, but it’s worth it.
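As a rough sketch of what batching looks like on the CPU side: build one vertex list for all the icons that share an atlas, then hand that list to the graphics API in one go. The `Icon` and `Vertex` structs and `buildBatch()` are hypothetical names, and the actual draw call is left out because it depends on whichever API you use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: screen position plus texture coordinates into the atlas.
struct Vertex { float x, y, u, v; };

// One icon: where it goes on screen and which rectangle of the atlas it uses.
struct Icon {
    float x, y, w, h;      // screen rectangle
    float u0, v0, u1, v1;  // atlas rectangle
};

// Turns N icons into one vertex list (two triangles per icon), so the whole
// lot can be submitted with a single draw call and a single texture bind.
std::vector<Vertex> buildBatch(const std::vector<Icon>& icons) {
    std::vector<Vertex> verts;
    verts.reserve(icons.size() * 6);
    for (const Icon& i : icons) {
        const Vertex topLeft     = {i.x,       i.y,       i.u0, i.v0};
        const Vertex topRight    = {i.x + i.w, i.y,       i.u1, i.v0};
        const Vertex bottomLeft  = {i.x,       i.y + i.h, i.u0, i.v1};
        const Vertex bottomRight = {i.x + i.w, i.y + i.h, i.u1, i.v1};
        // First triangle.
        verts.push_back(topLeft);
        verts.push_back(topRight);
        verts.push_back(bottomLeft);
        // Second triangle.
        verts.push_back(topRight);
        verts.push_back(bottomRight);
        verts.push_back(bottomLeft);
    }
    return verts;
}
```

The vertices come out in icon order, so this only gives the right picture if draw order between the icons doesn't matter, which is the "never overlap" condition above.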

Cache Stuff

If you have a variable that is complex to evaluate, evaluate it once, then cache the result until something it depends on changes (at that point we tend to mark the cached value as ‘dirty’ and recalculate it the next time it is needed). If there is some data that is going to be accessed a LOT, then make a local copy of it. And if you have a lot of stuff to write to disk, to the same file, buffer it. Writing to or reading from a file is slow, especially if you are going to do it a lot. Reading in a single file is much quicker than opening 200 of them one after another.
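Here is one way the ‘dirty’ flag pattern might look. The `Budget` class is purely an illustration (summing a list is not really expensive); swap in whatever calculation actually hurts in your profiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example class: a total that is cached until an item changes.
class Budget {
public:
    std::size_t addItem(double amount) {
        items_.push_back(amount);
        dirty_ = true;  // cached total is now stale
        return items_.size() - 1;
    }

    void setItem(std::size_t index, double amount) {
        assert(index < items_.size());
        items_[index] = amount;
        dirty_ = true;  // cached total is now stale
    }

    // Recalculates only if something changed since the last call.
    double total() const {
        if (dirty_) {
            cachedTotal_ = 0.0;
            for (double amount : items_) cachedTotal_ += amount;
            dirty_ = false;
            ++recomputeCount_;
        }
        return cachedTotal_;
    }

    // How often the real calculation ran, to check the cache is doing its job.
    int recomputeCount() const { return recomputeCount_; }

private:
    std::vector<double> items_;
    mutable double cachedTotal_ = 0.0;
    mutable bool dirty_ = true;
    mutable int recomputeCount_ = 0;
};
```

The one thing to be careful about is that every code path that changes the inputs sets the flag. Miss one and you get stale values, which is a much nastier bug than slow code.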

Don’t use sqrt()

Do you ever use sqrt()? Ever realised how scarily slow it is? Most of the time you can keep the squared result and use some clever tricks to not actually need the sqrt() result. If you were going to get the sqrt() and compare it against a value, just multiply the comparison value by itself and check it that way instead. It’s amazingly faster.
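The classic case is a distance check. A small sketch, with `Vec2`, `distanceSquared()` and `withinRange()` as made-up names; how much you gain depends on your hardware and compiler, so measure it.

```cpp
#include <cassert>

// Hypothetical 2D point type for the example.
struct Vec2 { float x; float y; };

// Squared distance: no sqrt() needed.
inline float distanceSquared(Vec2 a, Vec2 b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Same answer as "sqrt(distanceSquared(a, b)) <= range", minus the sqrt().
inline bool withinRange(Vec2 a, Vec2 b, float range) {
    // Squaring both sides only preserves the comparison for non-negative values.
    assert(range >= 0.0f);
    return distanceSquared(a, b) <= range * range;
}
```

This works because both sides of the comparison are non-negative, so squaring them doesn't change which one is bigger.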

Use the right container class

Sometimes you will use a list where a vector will do. The vector is MUCH faster. And you know what is faster still, I mean REALLY fast? An array. If you really need speed, and the array size won’t change that often, allocating an array of items is much faster and is worth the overhead.
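If you want to check this for your own data rather than take my word for it, something like the sketch below lets you run the same loop over a list, a vector and an array and time each one. `sumAll()` and `timeMicros()` are hypothetical helpers, and the numbers you get will depend heavily on your compiler settings and what your elements look like.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <list>
#include <numeric>
#include <vector>

// Works on std::list, std::vector and std::array alike, so the only thing
// that changes between measurements is the container itself.
template <typename Container>
long long sumAll(const Container& c) {
    return std::accumulate(c.begin(), c.end(), 0LL);
}

// Runs fn once and returns the elapsed wall-clock time in microseconds.
template <typename Fn>
long long timeMicros(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```

Call `timeMicros([&] { result = sumAll(myContainer); })` for each container type in an optimized build, and make sure you actually use `result` afterwards so the compiler can't throw the loop away.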

Re-use objects

If you have a collection of objects that keep getting created and destroyed, you want to wrap that up. Stick a factory object around them to handle their construction and destruction. That way, you can just set them inactive on destruction, and save yourself the hassle of the creation and destruction when it comes time to re-use them. Setting a single flag to say an object is ‘dead’ is way faster than calling a destructor, and resetting is way faster than a constructor.
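A bare-bones version of that idea might look like this. `Particle` and `ParticlePool` are invented for the example, and finding a free slot is a plain linear scan to keep it short (a free list would be quicker for big pools).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pooled object. 'active' is the single flag that marks it alive or dead.
struct Particle {
    float x;
    float y;
    float life;
    bool active;
};

class ParticlePool {
public:
    // All particles are allocated once, up front, zeroed and inactive.
    explicit ParticlePool(std::size_t capacity) : particles_(capacity) {}

    // "Creates" a particle by resetting a dead one. Returns nullptr if the pool is full.
    Particle* spawn(float x, float y, float life) {
        for (Particle& p : particles_) {
            if (!p.active) {
                p.x = x;
                p.y = y;
                p.life = life;
                p.active = true;
                return &p;
            }
        }
        return nullptr;
    }

    // "Destroys" a particle by flipping one flag. No destructor, no free().
    void kill(Particle* p) {
        assert(p != nullptr);
        p->active = false;
    }

    std::size_t activeCount() const {
        std::size_t count = 0;
        for (const Particle& p : particles_) {
            if (p.active) ++count;
        }
        return count;
    }

private:
    std::vector<Particle> particles_;
};
```

The pool never grows in this sketch, so `spawn()` can fail. Whether you grow the pool, recycle the oldest object or just skip the effect is a design choice for your game.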

Obviously there are lots more tips, and you should get a decent commercial profiler. That PC you develop on is a super-computing beast. If your game takes more than a few seconds to load, you are being sloppy.

 

9 Responses to “Some optimization tips for game programmers”

  1. It’s important to note that these optimizations are mainly useful for CPU code.

    On the GPU, some common instructions that would be expensive on the CPU might be handled by dedicated hardware, making them ‘free’. And with many GPU architectures your code will actually run slower if you have branching code, so you don’t want to use if-statements to fulfill the ‘never run code you don’t have to’ part.

    It might also actually sometimes be better to recalculate a value, rather than reading it from GPU memory, if it means you don’t have to wait for that memory to be accessed.

    So I’d say that your last advice is the most important: test if your code is actually faster (for example, by using a profiler).

  2. cliffski says:

    Yeah absolutely. I’m in sim-game mode, so my thinking is primarily CPU based for now.

  3. Sik says:

    About file accesses, also don’t forget that opening a file in itself is a slow operation. Accessing many tiny files is a lot slower than accessing a single large file, even if the accesses are all split in the same amount of bytes. If one can’t avoid having that many files around the best idea is to put them into an archive (even if uncompressed), because then one only needs to open one file.

    I’m not sure how true that still is with solid state media (e.g. Flash), but even in those cases there’s still the filesystem overhead, which is not small.

  4. cliffski says:

    With operating systems like Windows XP/Vista/7 there is a lot of security nonsense associated with opening a file, even forgetting for a moment the actual physical access overhead (which many old hard drives cache anyway). Plus if you have any security/antivirus software, you end up firing up the code to scan a new file, plus Windows XP/Vista/7’s pre-caching code and other per-file overhead waffle every time a new file is accessed.

  5. Cygon says:

    I haven’t noticed that much of a difference between arrays and vectors, at least in Visual Studio 2012 (benchmark code: http://pastebin.com/7gkuiNtM)

    100,000,000 array accesses: 229 ms (x64: 225 ms)
    100,000,000 vector accesses: 325 ms (x64: 292 ms)

    Though if one uses an instrumenting profiler, vectors tend to light up because this will prevent the compiler from doing the inlining that’s vital to a vector’s performance.

  6. cliffski says:

    That’s a pretty big difference.

  7. Ryan says:

    @Cygon: Well, you did just show us that arrays are approximately 40% faster :P. For many lightweight programs this isn’t a problem, but with something like a demographics simulator, where you might be looping over a 2,000 element container before drawing a menu, I think it would be worth considering.

    Another optimization to look at is multithreading. You can almost double the speed of a program on a dual-core CPU, and quadruple the speed on a quad-core. Unfortunately it can also cause bugs and crashes which are very difficult to isolate. Apparently some compilers actually multithread some code automatically, which I noticed when one of my non-threaded programs ran at 100% on 2 cores.

  8. cliffski says:

    I have experimented a bit with multithreading for D3, just some loading screen fun, but it scares the hell out of me, the amount of synchronization required for a really interlinked simulation is just not funny :D Next-turn and load times are pretty amazingly short now anyway.

  9. gta says:

    Excellent video on CPU caching and performance.

    Native Code Performance and Memory: The Elephant in the CPU

    http://channel9.msdn.com/Events/Build/2013/4-329