Game Design, Programming and running a one-man games business…

Fixing a single GSB bug

Step 1:
Where is the game crashing? It crashes here:
void GUI_TitleBar::SetTitle(std::string str)
on an assert
GASSERT(Width > 0);
Width is clearly not set right when we set the title. Cue lots of hunting through every window that uses that code to check that they never set their title when they are yet to set their width. No luck.

Step 2:
Sudden realisation that the window width is a red herring; it’s the titlebar width itself that isn’t set. Sudden discovery that a function in the dialog class is virtual, but the function it supposedly implements is not. Possible confusion? Certainly sloppy and needs fixing, but I can’t fix it *now* because I need to know *exactly* what causes this, rather than changing stuff and hoping I found it.
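To show why a virtual/non-virtual mismatch is worth worrying about, here is a minimal sketch. The class and function names are hypothetical (the real GSB classes are not shown in this post); the only point is that marking the derived function virtual does not make the base one virtual.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a window base class and a dialog class.
struct WindowBase {
    // Not virtual: a call made through a WindowBase reference always lands here.
    std::string Describe() const { return "base"; }
    virtual ~WindowBase() = default;
};

struct Dialog : WindowBase {
    // Declared virtual here, but it only hides WindowBase::Describe.
    // It does not override it, because the base version is not virtual.
    virtual std::string Describe() const { return "dialog"; }
};

// Calls Describe through the base type, as generic GUI code typically would.
std::string DescribeThroughBase(const WindowBase& w) {
    return w.Describe();
}
```

Passing a Dialog to DescribeThroughBase returns "base", not "dialog". Writing `override` on the derived function would turn this silent mismatch into a compile error.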

Step 3:
Realisation that the negative width may be an infinitely high width wrapping round, which implies width has never been set. Maybe this is why I never see the crash in debug? A check shows that window widths are always set to zero, but that would still trigger it…
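A rough illustration of the wrap-around idea, assuming the width is stored as a 32-bit signed int (the post does not say what type it really is). Any bit pattern with the top bit set reads back as a negative number:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of an unsigned 32-bit value as a signed one.
// memcpy is used so the result does not depend on conversion rules
// that were implementation-defined before C++20.
std::int32_t AsSigned(std::uint32_t bits) {
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, AsSigned(0xFFFFFFFFu) is -1, so a "huge" garbage value and a negative width can be the same bits. A width of zero would fail `Width > 0` as well, which is why the always-zero finding does not settle anything.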

Step 4:
Reading up on related bugs suggests something to do with the race changing, combined with the ship editor. Decide to test in release mode: deploying, then changing race, then going to design mode. W00T! It crashes, although release mode debugging is far from trivial. It looks like a null pointer though, meaning it should reproduce in debug.

Step 5:
Reproduced it in debug, which normally means home and dry. A module type pointer is set to 0xcdcdcdcd. This means uninitialised, and this seems to be the case. I’m assuming it is null without setting it to NULL. If it had been set to NULL, then a module would have been selected.
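The failure mode can be sketched like this. ModuleType and EnsureSelected are hypothetical names, not GSB code. The 0xCDCDCDCD value is the pattern the Visual C++ debug heap writes into freshly allocated memory; here it is only compared against, never dereferenced.

```cpp
#include <cassert>
#include <cstdint>

struct ModuleType { int id; };  // hypothetical stand-in for the real module type

// Mimics "if it is null, select a module": only a null pointer triggers
// picking the fallback. Returns true if the fallback had to be picked.
bool EnsureSelected(ModuleType*& selected, ModuleType* fallback) {
    if (selected == nullptr) {
        selected = fallback;
        return true;
    }
    return false;  // assumed valid, even if it is actually garbage
}

// Builds a pointer holding the debug-heap fill pattern, to simulate what an
// uninitialised pointer member looks like in a debug build.
ModuleType* DebugFillPointer() {
    return reinterpret_cast<ModuleType*>(static_cast<std::uintptr_t>(0xCDCDCDCDu));
}
```

With a null pointer, EnsureSelected picks the fallback. With the fill pattern it does nothing, and the garbage pointer is left in place for later code to trip over.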

Step 6:
Temptation to fix an obvious uninitialised pointer error here, but that would be BAD. The task is to really understand the bug before fixing it. For example, why does this not *always* crash when launching ship design? Let’s test… It seems to always crash for me in ship design. That cannot be right… Aha, it’s handily crashing in some code inside the STL library, but only in debug build. A quick check of the shipped release build shows I can’t now replicate the bug there. Even stranger. Could I possibly have shipped a debug executable in a patch, thus explaining why not everyone has the issue? No, the file size difference is 3MB.

Step 7:
Tests show that the same exe does not crash taking the same steps if outside the IDE. The IDE is clearly setting those debug values and trapping an error. Is this actually related to the titlebar error crash and change race crash everyone has? If the module pointer is invalid, all hell could theoretically break loose. Getting tempted now to just fix the obvious errors, at least to see what happens next.

Step 8:
There is just no way that this pointer is correctly set to null. Unless it’s null by happy coincidence, the game should crash whenever the ship builder is viewed. (If null, the game assigns it a valid value.) Adding an assert for this, one that still fires in my release builds. It doesn’t trigger. How the hell is that pointer being set to NULL?
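For anyone wondering how an assert can stay live in a release build: the standard assert() compiles to nothing when NDEBUG is defined, so you need your own macro. This is a minimal sketch with made-up names (ALWAYS_CHECK, ReportFailure), not the GASSERT used in GSB.

```cpp
#include <cassert>
#include <cstdio>

// Count of failed checks, so callers (or tests) can see that one fired.
static int g_failedChecks = 0;

// Logs the failed expression and where it happened, then returns false.
static bool ReportFailure(const char* expr, const char* file, int line) {
    std::fprintf(stderr, "check failed: %s (%s:%d)\n", expr, file, line);
    ++g_failedChecks;
    return false;
}

// Unlike assert(), this does not depend on NDEBUG, so it is evaluated in
// release builds too. Yields true if the expression held.
#define ALWAYS_CHECK(expr) \
    ((expr) ? true : ReportFailure(#expr, __FILE__, __LINE__))
```

A call such as ALWAYS_CHECK(ptr == nullptr) placed just before the pointer is first used would log in any build where the pointer holds garbage.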

Step 9:
I suspect I am being lucky in release build with this. The pointer is uninitialised and being set to NULL most of the time as a side effect of something else, but this is not guaranteed, hence the randomness of the crash. It’s clear this pointer should be initialised, so I’ll just do that, and move on to new stuff.
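The shape of the fix, as a sketch with hypothetical names (ShipDesignScreen and ModuleType are illustrative, not the actual GSB classes): give the pointer a defined starting value so the "if null, pick a module" logic always runs the first time.

```cpp
#include <cassert>

struct ModuleType { int id; };  // hypothetical stand-in for the real module type

class ShipDesignScreen {
public:
    explicit ShipDesignScreen(ModuleType* fallback) : fallback_(fallback) {}

    // Returns the selected module, picking the fallback on first use.
    ModuleType* Selected() {
        if (selected_ == nullptr) {
            selected_ = fallback_;
        }
        return selected_;
    }

private:
    ModuleType* fallback_;
    // Default member initialiser: the pointer is null in every build,
    // rather than holding whatever happened to be in memory.
    ModuleType* selected_ = nullptr;
};
```

With the initialiser in place, the first call to Selected() returns the fallback whether the build is debug or release, and whether or not it runs under the IDE.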


7 thoughts on Fixing a single GSB bug

  1. So your conclusion is that an uninitialized pointer is ‘sometimes’ equal to null instead of nearly always being some random data?

    Hrm, interesting. Sounds like a plausible theory.

    Depending on what mood I am in, I might have just initialized it to null as soon as I located the problem rather than spending time tracking down why. Depends :)

  2. Enter the beauty of unit and integration tests.
    I have a bug.
    Sorry, there should be a test for that situation already. OK, let me write a test that catches the bug. OK, the new test fails and all other tests pass.
    Initialise the pointer and rerun the tests.
    OK, all tests pass. Release the latest build :)
    See all those steps we avoided.

  3. And that is why I use a language that initializes my pointers (or actually in my case references) for me. Just saying… :)

    unitTestComment++ (though I know it’s hard to test every single thing using unit tests, it’s awesome when you can do exactly what Liam said)

  4. 2 comments:

    You may be able to better debug the problem by setting a ‘memory changed’ breakpoint on the pointer address variable whenever it is changed.

    The behavior you’re seeing is what I would expect. In debug mode in the IDE, memory and pointers are usually set the same on every run, because the old memory values they occupy are usually the same. However, if run outside the IDE, the base memory address is different and all variables of dynamically allocated objects will be pointing to random locations and values in memory. Since a lot of memory is zeroed by default by many apps, there is a good chance it’s zero, but sometimes it might not be.
