Its always worth occasionally grabbing a full run-time-profile of your game to see whats taking up all that processor power. Today I dug into it a bit with aqtime and wanted to see what was chewing up the PC the most on a (fairly) big map.
If I ignore child threads (that take a lot of the load-off in terms of route-finding and other things) and just concentrate on the main thread, I can see that 73% of a frame is in the drawing code, and 22% is processing. I suspect on crazy big maps that processing chunk goes higher, as the drawing code scales to some extent with the screen size, and a big map on a small laptop is still my target for improvement.
If I break open that 22% I get this:
SlotManager::Process is obviously the biggest problem here. Its also something that scales quite badly because on really jam-packed maps, there is a lot of slots, and for my sins, EVERY item placed down is a slot, even if its just a conveyor with one entrance and one exit… so whats going on in there…? 95% of the time happens in ProductionSlots (the rest is facilities or supply stockpiles…
…so quite a mixture here, which is a good sign really, as it shows no real obvious glaring screw ups! Because I’m really worried about maps with just a lot of conveyors on them I guess the thing I really need to look at is Processvehicle(), as the other top offenders don’t seem to be related to those theoretically simple slots…
Hmmm. so CheckExitSlotFree() is taking up a lot of time, and some digging around shows that in my short sample run I called that on average of 1,248 times per frame. Ouch. Thats a lot, but to be expected with so many slots, and them all wanting to know their immediate status. The sad thing is, this is likely to be 99.5% of the time, a lot of slots going *is the slot in front of me empty* and the answer coming back *no, just like it was the last 100 times you asked*.
There is a fundamental algorithmic screw-up here, in that polling a value, rather than being notified when it changes, is hugely inefficient. Its also incredibly reliable and fault tolerant, which is why its such a popular method… What exactly takes so long anyway in that function?
Aha… so it looks like a big chunk of the time is actually spent just notifying the popup GUI above a vehicle or a slot, what message to display which is almost *all* cases is none. This seems crazy, and clearly a target for optimization. On its own, this function is taking up a tiny part of total run-time, but a larger part of the main (blocking) thread, and certainly seems to be inefficient. What takes up most of the code? actually its the code that works out the width and height of any of the strings based on the UI font, and thus resizes that notice. Yikes, am I *really* calling that every frame?
Out of 1,992,097 calls to SetType() only 55,280 update it, so only 2.7%, but its still a lot. Inside that function we set the size and the text without even checking if its onscreen, or if we are zoomed in far enough to see it anyway. I can simply set that to be something thats done ‘lazily’ once I know we are onscreen and zoomed in…
Ok done that and…. Hahaha. That function now averages 0.0001ms vs 0.0007ms before the change. Thats a decent hours work :D
That reduces the overall time for a slot processing a vehicle from 0.0012ms to 0.0009ms. For big factories, thats a welcome change.