If you read my blog often you will know I can be very irritated by poor performance in my code, or for that matter, anybody’s. Firefox is possibly the most memory-wasting application in the known universe. Quite why it needs >3% of my entire CPU right now for me to just type these characters is beyond me. Despite this, such performance is not ‘noticeable’ in the same way that a sluggish GUI can be. When you click a button in a game, the resulting action needs to happen IMMEDIATELY for you to feel like you are using the interface, not fighting it. Thus when launching a dialog box in a game, the aim is always to keep its initial loading time as short as possible. Often that’s easy, sometimes…not.

When you click the ‘load’ button on the main menu of Production Line (my latest game), it loads a dialog with a list (scrollable) of windows for each save game on the disk. There are thumbnails for each one showing the screen grab from when they were saved, plus some data about each save game. Example:

 

This probably sounds like it should be pretty fast to create, but actually it’s annoyingly, painfully slow. Before you ask, yes I do the initialisation ‘lazily’, in that I am not loading in textures for the save games until I draw them, so the ones that are currently not visible due to the scroll position have not slowed me down. Actually the slowdown is much simpler than that.

There are currently 25 savegames in my list, in a folder with 50 files (a thumbnail for each one is in the folder too). The files range from 600k to 176MB for the actual save games (XML format) and the thumbnails are tiny 50k jpgs. Why so slow?

At the very least I need to query data about 25 files here. The dialog box puts them in order of creation, and to ensure it’s really accurate, I don’t use Windows file attributes but actually crack open the XML to take a look at the header data inside. At this point, I extract the date and time, and do a version check to reject super-old unusable saves. I strongly suspect that the delay I sometimes experience (only when I’ve been doing other stuff, and the files are not in the hard drive’s cache, or already in RAM courtesy of Windows) is actually not even the reading of the files, or the enumerating of them (50 is not many), but the accessing of them in the first place.
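As a rough illustration of the "crack open the XML and read only the header" step, here is a minimal Python sketch using a streaming parser that stops as soon as the header element has been seen, so a 176MB file is not parsed in full. The actual Production Line save format is not shown in this post, so the `header` element, its `date`, `time` and `version` attributes, and the `MIN_SUPPORTED_VERSION` cutoff are all assumptions made up for the example.

```python
import io
import xml.etree.ElementTree as ET

MIN_SUPPORTED_VERSION = 3  # hypothetical cutoff for "super-old unusable saves"


def read_save_header(source):
    """Return (date, time, version) from the first <header> element, then stop.

    `source` is an open binary file object. iterparse reads it in chunks,
    so returning early avoids parsing the rest of a large save.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "header":  # hypothetical element name
            return elem.get("date"), elem.get("time"), int(elem.get("version"))
    return None


def is_usable(header):
    """Version check: reject saves with no header or an old version."""
    return header is not None and header[2] >= MIN_SUPPORTED_VERSION


# Stand-in for a save file on disk; a real caller would use open(path, "rb").
sample = io.BytesIO(
    b'<save><header date="2018-07-05" time="14:32" version="4"/>'
    b'<factory><slot id="1"/></factory></save>'
)
header = read_save_header(sample)
usable = is_usable(header)
```

Note that this only reduces parsing work. If the suspicion above is right and the cost is in opening each file at all, a sketch like this would not remove the delay.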

When you access a file in Windows, quite a lot of behind-the-scenes crap happens. Drives may have to be spun up (or not, depending on tech), maybe even network shares may have to be connected to (not in this case), maybe wireless network drivers need kicking out of sleep. Windows needs to check that you have permission to access that file, comparing the desired access against the permitted access. It needs to navigate a chain of block links if the file is fragmented on disk, and as it does all of this, the user’s antivirus program will kick into gear, scanning the file (maybe even the entire thing, like my big 176MB XML?) for malware.

All of this takes TIME.

The worst thing is, this stuff all happens for each individual file, which is why game engines tend to use pak files. (I have support for them in my engine, I’m just not using it yet.) The problem is, users’ save games are one area where you likely really cannot use them. These are files created by the user, and it’s often helpful (especially during beta) for them to be simple files the users can access, delete if necessary, copy if necessary, email to the dev if necessary. So pak-filing them is not an option. There are many hacks I can think of, including maintaining a summary of the games in a single file I can update lazily at another time, but nothing that doesn’t generate more complexity and potential for bugs.
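To make the "summary in a single file" hack concrete, here is a minimal Python sketch of one way it could work. It still enumerates the folder, but only opens a save when its size or modification time differs from what the summary recorded, and it drops entries for files the user has deleted. The `index.json` name, the entry layout and the `read_header` callback are all hypothetical, and the sketch also shows the extra moving parts that make this approach more bug-prone.

```python
import json
import os
import tempfile


def load_index(folder, index_name="index.json"):
    """Load the summary file, or return an empty index if missing or corrupt."""
    try:
        with open(os.path.join(folder, index_name), "r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):
        return {}


def refresh_index(folder, read_header, index_name="index.json"):
    """Reuse cached entries whose size and mtime still match; re-read the rest.

    Returns (index, number_of_files_actually_opened).
    """
    old = load_index(folder, index_name)
    new, reread = {}, 0
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.name.endswith(".xml"):
                continue
            st = entry.stat()
            cached = old.get(entry.name)
            if cached and cached["size"] == st.st_size and cached["mtime_ns"] == st.st_mtime_ns:
                new[entry.name] = cached  # unchanged: no need to open the file
            else:
                new[entry.name] = {
                    "size": st.st_size,
                    "mtime_ns": st.st_mtime_ns,
                    "header": read_header(entry.path),
                }
                reread += 1
    with open(os.path.join(folder, index_name), "w", encoding="utf-8") as f:
        json.dump(new, f)
    return new, reread


# Small demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.xml", "b.xml"):
        with open(os.path.join(folder, name), "w", encoding="utf-8") as f:
            f.write("<save/>")
    fake_header = lambda path: {"version": 1}  # stand-in for real header parsing
    _, first_pass = refresh_index(folder, fake_header)
    _, second_pass = refresh_index(folder, fake_header)
    os.remove(os.path.join(folder, "a.xml"))  # user deletes a save by hand
    index_after_delete, third_pass = refresh_index(folder, fake_header)
```

In the demonstration, the first pass opens both saves, the second opens none, and the manual deletion is picked up by the enumeration without any file being opened.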

One solution, if I was really bored and desperate for speed, would be to embed the jpg into the XML, so that the number of files is instantly halved. Certainly a future option. I could also swap to compressed save games that would likely be 1/10th (or less) of the size, which would make debugging them a tad harder, but would mean much less raw data for Windows and file-scanners to deal with.
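For the compressed-saves idea, a minimal Python sketch using gzip looks like this. The XML here is synthetic and highly repetitive, so the ratio it produces says nothing about real save files; the element and attribute names are invented for the example.

```python
import gzip

# Synthetic, repetitive XML standing in for a real save game.
xml_bytes = b"<save>" + b'<slot type="assembly" x="10" y="20"/>' * 5000 + b"</save>"

packed = gzip.compress(xml_bytes)    # what would be written to disk
restored = gzip.decompress(packed)   # what the loader would parse

ratio = len(packed) / len(xml_bytes)
print(f"{len(xml_bytes)} bytes -> {len(packed)} bytes ({ratio:.1%} of original)")
```

A gzip file can still be unpacked with common archive tools, which softens the debugging cost a little.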

I’m definitely not happy with this tiny, tiny (under half a second) delay when you click that button :D

15 Responses to “Slow file access”

  1. Dan Hulton says:

    Have you considered just having a single master file that describes all the other savegames enough to display in the list? Then you’re just opening the one XML file, plus lazy-loading the jpegs.

    • cliffski says:

      Yes I thought about it, but then if a player monkeys around in that folder and deletes a file from Windows Explorer, the game won’t know about it, so it would have to periodically check that was not the case and do the same enumeration anyway…adds complexity and potential for synchronisation screw ups.
      It’s definitely the case that the actual save format is too big right now.

  2. Tim says:

    If you’ve loaded all the data you need on a background thread before the dialog box even gets launched, then the dialog should always pop up quickly.

    You could set a file watcher to update whenever a change is made in your savegame directory.

    • Cliffski says:

      Indeed, I was thinking I should do it when loading the main menu itself to prevent the feeling of a delay, although technically I’m just moving the slowdown then, not fixing it :D

      • Tim says:

        To be honest I feel like file access should never be done on your main (or UI) thread anyway – especially as it may be run on a spinning rust drive that might also be trying to do some other random file access at the same time.

        Assume file access is slow, and code accordingly.
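A minimal Python sketch of the background-loading idea in this exchange: start the scan on a worker thread when the main menu loads, and only wait on it when the load button is clicked. `scan_saves` is a hypothetical stand-in that simulates slow file access with a short sleep, and the file names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scan_saves(names):
    """Stand-in for the slow part: opening each save and reading its header."""
    results = []
    for name in names:
        time.sleep(0.01)  # simulated file access latency
        results.append({"name": name, "version": 1})
    return results


executor = ThreadPoolExecutor(max_workers=1)

# Kick the scan off while the main menu is loading.
pending = executor.submit(scan_saves, ["save1.xml", "save2.xml", "save3.xml"])

# ... the UI thread carries on drawing the menu here ...

# When the player clicks 'load': usually finished by now, otherwise this waits.
saves = pending.result()
executor.shutdown()
```

As noted above, this moves the cost rather than removing it; the difference is that the player is no longer waiting on it directly.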

  3. Sam says:

    A couple of options occur:
    1) Background load the save file information as soon as the game starts. 95% of the time that’s what the player’s gonna go and do anyway.
    2) Render the dialogue, grey thumbnails, and whatever information you can get on the save file from just the filename and file attributes. Patch in the other details as they load. When you can see stuff happening it feels a lot faster.

  4. Gert-Jan says:

    We just put all initial data in the filename (title, version, date). If the user changes the filename, just display some ‘unknown’ fields and do a version check later on load (it’s an edge case anyway).
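A rough Python sketch of the filename approach, assuming a made-up naming scheme of `title__vVERSION__YYYYMMDDTHHMMSS.xml`. A renamed file falls back to 'unknown' fields, as described. The sketch does not deal with titles containing characters that are illegal in Windows filenames.

```python
import re
from datetime import datetime

# Hypothetical naming scheme: <title>__v<version>__<timestamp>.xml
PATTERN = re.compile(r"^(?P<title>.+)__v(?P<version>\d+)__(?P<stamp>\d{8}T\d{6})\.xml$")
STAMP_FORMAT = "%Y%m%dT%H%M%S"


def make_name(title, version, when):
    """Build a save filename that carries title, version and date."""
    return f"{title}__v{version}__{when.strftime(STAMP_FORMAT)}.xml"


def parse_name(filename):
    """Recover the metadata, or 'unknown' (None) fields if the user renamed it."""
    match = PATTERN.match(filename)
    if match is None:
        return {"title": filename, "version": None, "saved": None}
    return {
        "title": match["title"],
        "version": int(match["version"]),
        "saved": datetime.strptime(match["stamp"], STAMP_FORMAT),
    }


name = make_name("My Factory", 4, datetime(2018, 7, 5, 14, 32, 0))
info = parse_name(name)
renamed = parse_name("backup of my factory.xml")
```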

  5. Mnemonic says:

    You say “I strongly suspect”. Did you profile the dialog? Somehow, I find it strange that accessing 50 files on a local drive should cause any significant delay (on a decent PC configuration).

    But, yeah, background thread separated from the UI is definitely the way to go. Windows uses this “trick” all the time.

  6. DaveS says:

    Instead of loading the whole file convert the first section of information into a binary blob header of data followed by the xml – this way you just need to read in the header instead of the whole file.
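A minimal Python sketch of the binary-header idea, assuming a made-up 16-byte layout (a 4-byte magic tag, a version number and a Unix timestamp) written in front of the XML. The `PLSV` tag and the field choice are hypothetical. The trade-off is that the file is no longer pure XML, so opening it in an XML tool needs the prefix stripped first.

```python
import io
import struct

MAGIC = b"PLSV"                  # hypothetical file tag
HEADER = struct.Struct("<4sIq")  # magic, version (uint32), unix timestamp (int64)


def write_save(stream, version, timestamp, xml_bytes):
    """Write the fixed-size binary header followed by the XML payload."""
    stream.write(HEADER.pack(MAGIC, version, timestamp))
    stream.write(xml_bytes)


def read_header(stream):
    """Read only the first HEADER.size bytes; the XML is never touched."""
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short to be a save game")
    magic, version, timestamp = HEADER.unpack(raw)
    if magic != MAGIC:
        raise ValueError("not a save game")
    return version, timestamp


buffer = io.BytesIO()  # stand-in for a file opened in binary mode
write_save(buffer, 4, 1530801120, b"<save><factory/></save>")
buffer.seek(0)
version, timestamp = read_header(buffer)
xml_part = buffer.read()  # the remainder is the untouched XML
```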

  7. Sam Atkins says:

    Not sure how feasible this is, but could you split it into something like this?

    1) During game startup, do a quick count of how many save files are in the directory, and check their names and last edit times.
    2) When the user clicks on the “Load” button, you can immediately display a savegames list, with a placeholder image and the filename instead of the save name. The player can still click “load” on these, because the game still knows what file it is.
    3) At the same time, you start up a background process of scanning the files from top of the list to the bottom, and updating the UI for it as each one is completed.
    4) When the user loads a game, if it’s not been scanned already, you’d need to check if it’s incompatible and display a message if it is.

    If you wanted it to be better disguised, you could wait to display the list until the first page of savegames is scanned, and do the rest in the background.

    I just checked the comments above and another person named Sam already suggested roughly the same thing. Hmmmm.

  8. Arowx says:

    Have you checked your buffer size? Most modern PCs have a bandwidth sweet spot around the 64-128k buffer size.

    Check out ATTO, a disk benchmark tool (ATTO benchmark image search) -> https://www.google.co.uk/search?q=atto+disk+benchmarks&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjajen5-4fcAhWrB8AKHX1VCYUQ_AUIDCgD&biw=1509&bih=1080

  9. CdrJameson says:

    Need to do a detailed profile to find out which bit is taking the time.
    Many are the examples I’ve seen where something silly and unexpected is taking all the time.
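A minimal Python sketch of the kind of per-phase timing being suggested, so that enumeration, opening and header parsing are measured separately rather than guessed at. The phase names are made up and the workload is simulated with a short sleep; in a real test each `with` block would wrap the actual call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # phase name -> accumulated seconds


@contextmanager
def timed(phase):
    """Accumulate wall-clock time spent inside the block under `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] += time.perf_counter() - start


with timed("enumerate"):
    names = [f"save{i}.xml" for i in range(25)]  # stand-in for a directory listing

for name in names:
    with timed("open"):
        time.sleep(0.001)  # simulated cost of opening the file
    with timed("parse header"):
        pass               # stand-in for reading the header

slowest = max(timings, key=timings.get)
for phase, seconds in sorted(timings.items(), key=lambda item: -item[1]):
    print(f"{phase:>12}: {seconds * 1000:.2f} ms")
```

Since the delay reportedly only shows up when the files are not already cached, the measurement would need to be taken in that cold state to be meaningful.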

  10. Bubo says:

    I was thinking about something technical to write; some dumb opinion about programming and performance and shit, but then I realized that 1) this is not a problem that I can really help with, and 2) it’s not something Cliffski really needs help with anyway. It’s one of his occasional “I’m procrastinating” unfiltered social media posts. I’m sure it says more things, but I think I’d better stop there.

    I was thinking about this yesterday, and my current theory is that “be yourself” is the invention of John Lennon. I’m sure it was done earlier, but maybe he was the first mega-celebrity to really plaster his whole life all over mega-media? I dunno. I’m pretty much basing that theory on one single thing that I saw once on YouTube, which wasn’t even attempting to address that question, but research is more effort than making a post on social media, and I guess I don’t really care that much about the answer anyway. It’s much more because there is some kind of thrill associated with just dumping what pops into my head where lots of people might see it. Something like streaking. Look at my unfiltered thoughts, man. Straight out of the oven. Look at ’em. Almost all of them are shit, many many nines of shit, but filtering is effort, man. It’s not thrilling. Look at ’em.

    I don’t know if the fact that today there are so many people doing it and watching other people do it increases or decreases the probability of getting shot compared to forty or fifty years ago, but no doubt it’s sensible to be a little less open about things like your home address. There’s a reason why slightly older and/or more sensible people value a certain amount of privacy and anonymity on the Internet.