Category Archives: programming

If you read my blog often you will know I can be very irritated by poor performance in my code, or for that matter, anybodies. Firefox is possibly the most memory-wasting application in the known universe. Quite why it needs >3% of my entire CPU right now for me to just type these characters is beyond me. Despite this, such performance is not ‘noticeable’ in the same way that a sluggish GUI can be. When you click a button in a game, the resulting action needs to happen IMMEDIATELY for you to feel like you are using the interface, not fighting it. Thus when launching a dialog box in a game, the aim is always to have its initial loading time to be as minimal as possible. often thats easy, sometimes…not.

When you click the ‘load’ button on the main menu of Production Line (my latest game), it loads a dialog with a list (scrollable) of windows for each save game on the disk. There are thumbnails for each one showing the screen grab from when they were saved, plus some data about each save game. Example:

 

This probably sounds like it should be pretty fast to create, but actually its annoyingly, painfully slow. Before you ask, yes I do the initialisation ‘lazily’ in that I am not loading in textures for the save games until I draw them, so the ones that are currently not visible due to the scroll position have not slowed me down. Actually the slowdown is much simpler than that.

There are currently 25 savegames in my list, in a folder with 50 files (a thumbnail for each one is in the folder too). The files range from 600k to 176MB for the actual save games (XML format) and the thumbnails are tiny 50k jpgs. Why so slow?

At the very least I need to query data about 25 files here. The dialog box puts them in order of creation, and to ensure its really accurate, I dont use windows file attributes but actually crack open the XML to take a look at the header data inside. At this point, I extract the date, and time, and do a version check to reject super-old unusable saves. I strongly suspect that the delay I sometimes experience (only when I’ve been doing other stuff, and the files are not in the cache of the hard drive, or in windows RAM already) is actually not even the reading of the files, or the enumerating of them (50 is not many) but the accessing of them.

When you access a file in windows quite a lot of behind the scenes crap happens. Drives may have to be spun up (or not, depending on tech), maybe even network shares may have to be connected to (not in this case), maybe wireless network drivers need kicking out of sleep. Windows needs to check that you have permission to access that file, to compare the desired access against permitted access. It needs to navigate a chain of block links if the file is fragmented on disk, and as it does all of this, the users anti virus program will kick into gear, scanning the file (maybe even the entire thing, like my big 176MB xml?) for malware.

All of this takes TIME.

The worst thing is, this stuff all happens for each individual file, which is why game engines tend to use pak files. (I have support for them in my engine, just not using it yet). The problem is, users save games are one area where you likely really cannot use them. These are files created by the user, and its often helpful (especially during beta) for them to be simple files the users can access, delete if necessary, copy if necessary, email to the dev if necessary. So pak-filing them is not an option. There are many hacks I can think of, including maintaining a summary of the games in a single file I can update lazily at another time, but nothing that doesn’t generate more complexity and potential for bugs.

One solution, if I was really bored and desperate for speed, would be to embed the jpg into the xml, so that the umber of files instantly halved. Certainly a future option. I could also swap to compressed save games that were likely 1/10th (or less) the size, which would make debugging them a tad harder, but would mean much less raw data for windows and file-scanners to deal with.

I’m definitely not happy with this tiny, tiny (under half second) delay when you click that button :D

An impossible bug. ARGGGH

May 15, 2018 | Filed under: programming

Check out this code from production line for displaying pie charts of expenses.  I declare arrays of float totals for each of 3 pie charts.

 float totals[NUMPIES];
 totals[PIE1] = 0;
 totals[PIE24] = 0;
 totals[PIEALL] = 0;

 

Then I declare a 2-dimensional array of floats which I fill with some data. As I build up those amounts I also update the totals:

float amounts[NUM_FINANCE_CATEGORIES][NUMPIES];

for (int r = 0; r < NUM_FINANCE_CATEGORIES; r++)
 {
 if (r != FC_CAR_SALES)
 {
 amounts[r][PIE1] = SIM_GetFinanceRecords()->GetAmount(1, (FINANCE_CATEGORY)r);
 amounts[r][PIE24] = SIM_GetFinanceRecords()->GetAmount(24, (FINANCE_CATEGORY)r);
 amounts[r][PIEALL] = SIM_GetFinanceRecords()->GetAmount(-1, (FINANCE_CATEGORY)r);

totals[PIE1] += amounts[r][PIE1];
 totals[PIE24] += amounts[r][PIE24];
 totals[PIEALL] += amounts[r][PIEALL];
 }
 }

 

Then some simple resetting of data, which is irrelevant for this bug, and a check that I am not about to divide by zero:

Pies[PIE1]->Clear();
 Pies[PIE24]->Clear();
 Pies[PIEALL]->Clear();

if (totals[PIE1] <= 0 || totals[PIE24] <= 0 || totals[PIEALL] <= 0)
 {
 return;
 }

 

Then the final code:

 //now create
 for (int r = 0; r < NUM_FINANCE_CATEGORIES; r++)
 {
 if (r != FC_CAR_SALES)
 {
 for (int p = 0; p < NUMPIES; p++)
 {
 float perc = amounts[r][p] / totals[p];
 assert (perc >= 0 && perc <= 1.0f)

Hold ON STOP!. How can that assert ever trigger? EVER? (it does for some people). Its driving me mad :D. It can ONLY trigger if one of the amounts in the array is less than 0% or more than 100% of the total. I KNOW that the total is greater than zero, so the amount must be greater than zero. The total is only comprised of the sum of the amounts. There is no way these numbers can be out of synch, PLUS, they are all floating point vars so there is no rounding going on… or is it maybe a tiny tiny quantizing thing? (I’ve update the game with extra debug data for this error so I’ll find out soon). Surely thats the only explanation, and perc is something like 1.000000001%?

 

 

I have a system in the early access version of production Line that when an error happens that I assert() on, it logs that error out to a debug file, along with a timestamp, and the file name and line number of code where the assert failed. This is easy to do (google it). I recently changed this code so it also posts the error message, line number and filename data to my server (anonymously, I have no idea which player triggered it). That code basically just uses the WININET API to silently post some URL variables to a specific php page. That page then validates the data to make it safe, and runs an SQL query to stick the data into a database on my server.

IU can run SQL queries on the server to see what bugs have happened in the last 24 hours, which ones happened the most, what the version number of the game was, and how often they are triggering etc.

This has proven to be awesome.

The default solution these days to this kind of stuff is to buy into some middleware to do this, but thats not how I roll. I also have some balance-tracking stats analysis stuff that does the same thing. I can tell if the game is too hard, or too easy, or if players dont get to build electric cars until hour 300 in the game…and all kinds of cool stuff. This is the sort of thing that middleware companies with 20 staff, a sales team and a slick marketing campaign try to flog to you at GDC in the expo. Its really not that hard, it really is not rocket science. I get the impression that the majority of coders think that writing this sort of stuff is an order of magnitude harder than it actually is.

Oh also, production Line checks once per day on start-up to see if there is a newer version, then pops up a notice and a changelist if there is (for non steam versions). Thats home-grown code too, and its easy, EASY to do,m once you actually sit down and work out whats involved.

I know the standard arguments in favor middleware ‘it comes with tech support, its faster than coding it yourself, it allows you to leverage the experience and knowledge of others’, but I reject all of them. The only reason I know how to use WinInet, php and SQL and all that sort of stuff is because I taught myself. I taught myself ONCE in the last 36 years of programming, and I know how to do it now., I have my old code, and my own experience and skills. I dont care if XYZ Middleware INC is going to go bust, stop supporting my platform, double their prices, or stop answering emails, I have all this experience in house.

When you look at most middleware, its vastly bigger, more feature packed and complicated than you need it to be. Middleware has to be, because it has to be all things to all people. My stats tracking and error-reporting doesn’t have to work on IOS or on OSX or on mobile, or XBox or Playstation. it doesn’t have to be flexible enough to talk to four different database types, or support Amazons AWS. I don’t NEED any of that stuff, so guess what… the code to do what I *actually* need to do, is incredibly, incredibly simple. Don’t think about middleware contracts and APIs and excuses not to write the code, ask yourself ‘what exactly does the code need to do here anyway?’

I think when coders do that you will find, its actually way easier than you think. I did my error reporting code, both the client side and the server side, and tested and debugged in in about an hour. This stuff is not complex. Stop buying ‘fully-featured’ bloatware you don’t need.

I’m an Elon Musk fanboy, drive a tesla and own tesla stock, I’m a true believer. One of the things I like about the man is the way he does everything in reverse, when it comes to efficiency and optimization. The attitude of most people is

‘This thing runs at 10m/s. How can we make it run at 12 m/s?

Whereas Elon takes the opposite view:

‘Are there any laws of physics that prevent this running at 100,000m/s? if not, how do we get it to do this?’

This is why he makes crazy big predictions and sets insane targets, most of which don’t get met on time, but when they do, its pretty phenomenal. If the next falcon heavy launch recovers the center core too, thats even more game changing, and right now, the estimate is that the falcon heavy launch cost is $90 million verses $400 million of its nearest competitor (which only launches half the weight anyway). That not just beating the competition, but thats bludgeoning them, tying them up, putting them in a boat, pushing the boat out into the middle of the lake, and laughing from the shore during a barbecue as the boat sinks.

When it comes to my favorite topic (car factory efficiency, due to me making the game Production Line), he comes out with even crazier targets.

“I think we are … in terms of the extra velocity of vehicles on the line, it’s probably about, including both X and S, it’s maybe five centimeters per second. This is very slow,” he said. Musk then added he was “confident” Tesla can get a twentyfold increase of that speed.”

Now we can debate all day long whether the guy is nuts, and over promising and whether or not we could ever, ever get a production line that fast, but you have to admire the ambition. You dont get to create privately-made reusable rockets without ambition. I just wish we had the same sort of drive in software as he has for hardware. The efficiency of modern software is so bad its frankly beyond embarrassing, its shameful, totally and utterly shameful. let me dredge up a few examples for you.

I’m running windows 10, and just launched the calculator app. Its a calculator, this is not rocket science. A glance at task manager shows me that its using up 17.8MB of RAM. I am not kidding, try it for yourself. I’m pretty sure that there was a calculator app for the sinclair ZX-81 with its 1k (yes 1k) of RAM. Sure, the windows 10 app has…err nicer fonts? and the window is very slightly translucent… but 17MB? We need 17MB to do basic maths now? As I type this, firefox has got 1,924MB of RAM assigned to it, and is regularly hitting 2% of my CPU. I’m just typing a blog post, just typing… and thats 2% of a CPU that can do 49,670 MIPS or roughly 50 BILLION instructions per second. Oh…we have slightly nicer fonts too. wahey?

I’d wager the percentage of people coding games who have any real idea how the underlying engine works is tiny, maybe 5%, and of those maybe 1% understand what happens at a lower level. Unity doesn’t talk to your graphics card, it does it through OpenGL or DirectX, and how many of us really understand the entire code path of those DLLS? (I don’t) and of those, how many understand how the video card driver translates those directx calls into actual processor instructions for the card hardware? By the time you filter your code through unity, directx and drivers, the efficiency of what actually happens to the hardware is laughable, LAUGHABLE.

We should aspire to do better, MUCH better. Perhaps the biggest obstacle is that most of us do not even know what our code is DOING. Unless you have a really good profiler, you can easily lose track of what goes on when your game runs, and we likely have zero idea what happens after our instructions leave our code and disappear into the bowels of the O/S or the graphics API. Decent profilers can open your eyes to this stuff, one that can handle displays of each thread and show situations where threads are stuck waiting is even better. Both AMD and nvidia provide us with tools that let us step through the rendering of individual frames to see how each pixel is rendered, then re-rendered and re-rendered many times per frame.

If you want to consider yourself not just a hacker but an ENGINEER, then you owe it to yourself, as a programmer to get familiar with profilers and code analysis tools. Some are free, most are not, but they are a worthy investment. Personally I use AQTime, and occasionally intel XE Amplifier, plus the built-in visual C++ tools (which are a bit ‘meh’ apart from the concurrency visualizer). I also use nvidias nsight tools to keep an eye on graphics performance. None of these tools are perfect, and I am by no means an especially good programmer, but I am at the very least, fully aware that the code I write, despite my efforts, is nowhere REMOTELY close to as efficient as it could be, and that there is plenty of room to do better.  If Production Line currently runs at 60FPS for you (average speed across all players is 58.14) then eventually I should be able to get it so you can play with a factory at least 10x that size for the same frame rate. I just need to keep at it.

I’ll never be the Elon Musk of software, but I’m trying.

 

Battering the RAM

February 03, 2018 | Filed under: production line | programming

I had a bug in Production Line recently that made me think. Large factories under certain circumstances (and by large I mean HUGE) would occasionally crash, seemingly randomly. I suspected (as you usually do if you have a large player-base) that this must be the players machines. If code works fine of 99.5% of PCs, and breaks on the remainder…that seems suspicious. Once I managed to get the same save games to crash on mine, again in seemingly weird places, but always near some memory allocation…the cause became obvious.

I had run out of memory.

This might seem crazy to most programmers because memory, as in RAM is effectively unlimited right? 16GB is common, 8GB practically ubiquitous, and in any case windows supports virtual memory, so really we are talking hundreds of gigs potentially. Sure, paging to disk is a performance nightmare…but it shouldn’t just…crash?

Actually it WILL, if you are running a 32 bit program (as opposed to 64 bit) and you exceed 2 GB of RAM for your process.  This has always been the case, I’ve just never coded a game that used anything LIKE that. Cue angry rant from unhinged ‘customer’ who thinks it is something akin to being a Neanderthal that my game is 32 bit. Take a look at your program files folders people…the (x86) is all the 32 bit programs. I bet yours is not empty… 64bit is all well and good, but the majority of code you run on a day-to-day basis is still 32 bit. For the vast majority of programs it really wont matter. For some BIG games, it really does. Clearly a lot of FPS games easily need that RAM.

So the ‘obvious’ solution is just to port my engine and game to 64 bit right?

No.

Not for any backwards compatibility reasons, or any porting problems reasons (although it WOULD be a pain tbh…) but because asking how to raise that 2 GB RAM limit is, to me, completely the wrong question. The correct question is “Why the fuck are we needing over 2GB  of RAM for an indie isometric biz sim anyway?” And it turns out that if you DO ask that question, you solve the problem very quickly, very easily, and with a great deal of pride and satisfaction.

So what was the cause? and how was it fixed? Its actually quite surprising. Every ‘vehicle’ in the game code has a GUI representation. Because the cars are built from a number of layers, they are not simple sprites, but actually an array of sprites. The current limit is 34 layers (some layers have 2 sub-layers, for colored and non-colored), from axles to drive shafts to front and rear doors, windows, wing mirrors, headlights, exhausts and so on. Every car in the game may be drawn at any time, so they all need a GUI representation. The game supports 4 directions (isometric), so it turns out we need 34 layers X 2 sub-layers X 4 directions = 272 sprites per car. Each sprite needs 4 verts and a texture pointer (and some other management fluff). Call it 184 bytes per sprite, that means the memory requirements for any car amount to 50k per car. If the player has been over-producing and has 6,000 cars in the showroom then this suddenly amounts to 300MB just of car layer data.

So that explains where a lot of the memory comes from…but what can I do about it? I’m not stupid enough to DRAW all these all the time BTW, I cache them into a sprite sheet and when zoomed out I use that for single-draw-call splats of a whole car, but for when zoomed in, or when the sprite sheet is full, I still need them, and I need them to be there to fill the sprite sheet mid-game anyway. How did I reduce the memory requirements so much?

Basically I realized that although SOME vehicle components (car doors etc) had 2 layers, the vast majority did not. I was allocating memory for secondary layers that would never be rendered. I simply reorganized the code so that the second layer was just a NULL pointer which was allocated only if needed, saving myself the majority of that RAM. With my optimizing hat on, I also found a fair few other places in the code where I have been using ints instead of shorts (like the hour member of a sales record) and wasting a bunch more RAM. Eventually I ended up knocking something like 700MB of RAM in the largest, worst cases.

Now my point is… if I had just taken the attitude of many *modern* coders and thought ‘ha! what a dick! lets allow for > 2GB memory’ rather than thinking about exactly how the amount had crept up so much, I would never have discovered my own silliness, or made the clearly sensible optimization. Lower memory use is always good, as long as there isn’t a major CPU impact. More vehicle components can now fit into the cache, and be processed quicker. I’ve sped up the games performance as well as reducing its RAM hunger.

Constraints are good. If you have never, ever given any thought to how much RAM your game is using, you really should. It can be eye-opening.