I think I’ve fixed a bug in the Gratuitous Space Battles campaign game. I’ll know ‘officially’ soon, but it fixed it on my machine :D Here’s what was involved.

A player complained of a random crash bug at the end of some campaign battles. I could never reproduce it, and back-and-forth emails began. Eventually, this awesome customer provided me with exact steps to reproduce it, plus all their save game data so I could replicate it. First run-through and… nothing. It was fine. Roughly every third run-through, in release mode, it crashed…

Step 1! Hurrah! It actually crashes for me. This is 50% of the battle, because then I *KNOW* that it is the game’s fault, and not the gamer’s system or software. This is good, although frustrating, news.

Step 2! It crashes in debug mode. This is another 25% of the battle, because I can actually see what data is corrupt. As it turned out, the ‘firstfleet’ pointer in the code that assigns captured ships to the player’s winning fleet is clearly trashed. How did this happen?

Step 3… debugging. It transpires that the firstfleet pointer is accessed multiple times after being initialised and before this point, which confirms that it *must* have been valid at first, and that it becomes invalid somewhere between initialisation and the access that adds captured ships to the fleet. This means I can break on initialisation, step through, and watch what happens.

Step 4: discovery! Stepping through the code confirms my suspicions. Once the battle ends, the code updates all the player’s fleets and removes ships that died in battle. Then, other code innocently picks the first of the player’s fleets in the battle, and initialises a dialog box listing the captured enemy ships that will be assigned to that fleet. Later… *drum roll* it deletes any fleets that are now empty. Can you see the bug yet?

Step 5: fix! Changing the code that naively picked the ‘first fleet’ so that it picks the first player fleet that still has some intact ships ensures that the later deletion of an empty fleet (and the invalidation of any pointer to it) is harmless, because the captured ships are now getting added to a surviving fleet. Bug probably fixed, pending the player confirming that a new exe fixes it.
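To make the difference concrete, here is a minimal C++ sketch of the idea. It is not the game's real code: `Fleet`, `FleetList`, `pickFirstFleet`, `pickFirstSurvivingFleet`, `deleteEmptyFleets` and `assignCapturedShips` are all made-up names for illustration, and it assumes fleets are heap-allocated objects held in a list and referred to by raw pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a campaign fleet; the real class is not shown in the post.
struct Fleet {
    std::string name;
    int intactShips;
};

using FleetList = std::vector<std::unique_ptr<Fleet>>;

// Helper for building test data.
void addFleet(FleetList& fleets, std::string name, int intactShips) {
    fleets.push_back(std::unique_ptr<Fleet>(new Fleet{std::move(name), intactShips}));
}

// The naive choice: whichever fleet is first in the list, even if it is now empty.
Fleet* pickFirstFleet(const FleetList& fleets) {
    return fleets.empty() ? nullptr : fleets.front().get();
}

// The fixed choice: the first fleet that still has at least one intact ship.
// Returns nullptr if no fleet survived.
Fleet* pickFirstSurvivingFleet(const FleetList& fleets) {
    auto it = std::find_if(fleets.begin(), fleets.end(),
        [](const std::unique_ptr<Fleet>& f) { return f->intactShips > 0; });
    return it == fleets.end() ? nullptr : it->get();
}

// The later cleanup pass: destroys every fleet with no intact ships.
// Any raw pointer to a destroyed fleet dangles after this call.
void deleteEmptyFleets(FleetList& fleets) {
    fleets.erase(
        std::remove_if(fleets.begin(), fleets.end(),
            [](const std::unique_ptr<Fleet>& f) { return f->intactShips == 0; }),
        fleets.end());
}

// Adds captured ships to the chosen fleet; the pointer must still be valid.
void assignCapturedShips(Fleet* target, int captured) {
    assert(target != nullptr);
    target->intactShips += captured;
}
```

With a list such as `{"First", 0 ships}, {"Second", 3 ships}`, `pickFirstFleet` returns the fleet that `deleteEmptyFleets` is about to destroy, so calling `assignCapturedShips` with that pointer afterwards touches freed memory. `pickFirstSurvivingFleet` returns a fleet the cleanup pass leaves alone, so the pointer stays valid.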

Why did I not spot this bug six months ago? Well, here is what has to happen.

  1. The player has to fight a battle with multiple fleets at once (common).
  2. He has to win (fairly common).
  3. The AI has to lose by the right margin to leave some captured ships (fairly rare).
  4. The winning player has to have enough ships removed from the *first* fleet in the list to have that fleet entirely deleted, despite winning overall (pretty darned rare).

Simply put, this didn’t happen to me once in testing. It hasn’t happened to many players either, as I understand it. And if it has happened to you… I may have good news :D

2 Responses to “Anatomy of a gratuitous bug”

  1. Wouter Lievens says:

    You missed a pretty important step that tends to be overlooked in fixing bugs: adding an autonomous unit test. It should be somewhere between 2 and 3 :)

  2. George says:

    It happens to me 40% of the time. Fortunately you can save your progress almost in real time, so every time it crashes, I load the game and play the battle again :p It’s crude, but that’s the only option I had. I wondered if playing with modded content could be the problem, but after reading Cliff’s info, I’m glad an update will be launched.