Game Design, Programming and running a one-man games business…

I know file access is slow but….

As a programmer, you learn fairly early on that file access is very slow. The slowest place to get data is the hard disk; we all know that. OK, maybe tape storage. But it always amazes me just *how slow* it can be. I don’t mean stuff like reading a texture file from disk, which you expect to be tiresome, but plain file-system operations. I guess I’ve always assumed that because modern hard drives have cache chips on them, and we also have a LOT of RAM available to page stuff, querying the NTFS (or whatever file system you have) file table should be fairly quick.

In other words, seeing if a file exists, or enumerating the files in a folder, shouldn’t take an age if I’m not actually going to load the contents. Doesn’t Windows know enough about optimizing to page chunks of the file table into RAM?

Maybe it can’t, or it doesn’t work that way, but I have discovered, to my mild surprise, that if you don’t want to use compressed or locked pak files (I like an open file structure to encourage investigative modders), it actually makes enormous sense from a performance POV to scan your whole game folder’s file structure into memory and manage your own file-cache for stuff like enumerating filenames later on.

Why would I not know what files to access already? Well, GSB2 writes a lot of files at runtime, and it also has to cope with some objects having companion lightmap files and others not. The simplest and most flexible system is for it to check for companion files at runtime, on startup, and then field any FileExists() calls internally. It’s faster. And I mean it’s 10x as fast; we aren’t talking 1% speedups here. Things are now getting to the point where I just assume all O/S code is sluggish bloatware and write my own versions whenever possible. I might eventually have to do some trickery involving multi-threaded file-loading, or loading in only specific mip-maps from files at specific points.
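
For anyone curious what such a cache might look like, here is a minimal sketch. It is written in Python purely for brevity and is not GSB2's actual code: the class name `FileIndex`, its method names and the example file names are all made up for illustration. It walks the folder tree once, then answers existence and enumeration queries from memory. The main assumption is that the game tells the index about every file it writes at runtime, otherwise the index goes stale.

```python
import os


class FileIndex:
    """In-memory index of the files under one root folder (hypothetical helper)."""

    def __init__(self, root):
        self.root = os.path.abspath(root)
        # Maps a normalised folder path to {normalised file name: original file name}.
        self._folders = {}
        # One walk over the tree, e.g. at startup; later queries never touch the disk.
        for folder, _subfolders, filenames in os.walk(self.root):
            self._folders[self._key(folder)] = {
                os.path.normcase(name): name for name in filenames
            }

    @staticmethod
    def _key(path):
        # normcase lower-cases paths on Windows, where lookups are
        # normally case-insensitive; elsewhere it leaves the case alone.
        return os.path.normcase(os.path.normpath(path))

    def file_exists(self, relative_path):
        """Answer an existence query from memory."""
        folder, name = os.path.split(self._key(os.path.join(self.root, relative_path)))
        return name in self._folders.get(folder, {})

    def list_files(self, relative_folder=""):
        """Return the sorted file names directly inside one folder."""
        folder = self._key(os.path.join(self.root, relative_folder))
        return sorted(self._folders.get(folder, {}).values())

    def note_file_written(self, relative_path):
        """Record a file the game has just written, so the index stays current."""
        full = os.path.normpath(os.path.join(self.root, relative_path))
        folder, name = os.path.split(full)
        self._folders.setdefault(self._key(folder), {})[os.path.normcase(name)] = name
```

Usage would be along the lines of `index = FileIndex("data")` at startup, then `index.file_exists("ships/cruiser_lightmap.dds")` wherever the code previously asked the O/S. Files added or deleted by something other than the game (a modder copying files in mid-session, say) are not noticed until the next scan.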

I do actually REALLY enjoy this kind of thing, which is a pain because I should be fixing bugs, implementing features and generally trying to keep GSB2 on schedule. Not to mention some new D3 DLC and the TOP SECRET THING. Oh and SMTG.

It’s a good thing I already have a holiday booked this year, as I’d never book one now if I’d waited this far :D


4 thoughts on I know file access is slow but….

  1. It is astonishing, but it could be worse. Try having to go over a LAN, or even worse in ‘the cloud’.

    Online bug tracking is a nice idea, but if even a human being can notice the pauses then it’s toooo sloooow.

  2. Try optimising for load speeds from a CD-ROM. It’s a nightmare; you really don’t want to be seeking all over a CD. You can optimise by placing data files physically on the CD in sequential order, the same order in which your levels / game load them. You read from the inner (or is it outer?) edge of the disk to the opposite edge to make sure the read head has the smallest possible distance to move.

    You can also optimise read times by packaging your files into an archive with its own directory. This avoids any file-exists checks: you simply check your archive directory (which is always loaded in memory), and you don’t have to open files all the time if the OS has a slow call to open files. You simply read a different part of the already open archive, and you know exactly where your subfile is because the directory tells you which part of the archive it is packaged into.

    1. Just read that you don’t like having packed files because of modders. Perhaps the answer is to pack the files but provide the modders with an open source tool for reading/enumerating/writing pack files?

  3. A nice compromise between using compressed files and allowing modding is http://icculus.org/physfs/, an open source, zlib-licensed library that allows some nifty modding ability. It uses simple compression formats.

    (I am not affiliated with this project. It might save someone some headaches, however. At least it could give you another idea on how to solve this problem.)

    Modders can simply put a new zipfile in the directory, or in a developer-provided directory, and the files inside it can override the original files, at the developer’s discretion. It can simply choose the newest file.

    From the site.

    “PhysicsFS is a library to provide abstract access to various archives. It is intended for use in video games, and the design was somewhat inspired by Quake 3’s file subsystem. The programmer defines a “write directory” on the physical filesystem. No file writing done through the PhysicsFS API can leave that write directory, for security. For example, an embedded scripting language cannot write outside of this path if it uses PhysFS for all of its I/O, which means that untrusted scripts can run more safely. Symbolic links can be disabled as well, for added safety. For file reading, the programmer lists directories and archives that form a “search path”. Once the search path is defined, it becomes a single, transparent hierarchical filesystem. This makes for easy access to ZIP files in the same way as you access a file directly on the disk, and it makes it easy to ship a new archive that will override a previous archive on a per-file basis. Finally, PhysicsFS gives you platform-abstracted means to determine if CD-ROMs are available, the user’s home directory, where in the real filesystem your program is running, etc.

    To explain better, you have two zipfiles, one has these files:

    music/intro.mid
    graphics/splashscreen.bmp
    mainconfig.cfg

    …the other’s got these:
    music/hero.mid
    maps/desert.map

    …and, finally, in your game’s real directory:
    maps/city.map
    graphics/gun.bmp

    When you create the search path in PhysicsFS with those three components, and ask for what’s in the “music” directory, you are told:
    intro.mid
    hero.mid

    …in the maps directory:
    desert.map
    city.map

    …in the graphics directory:
    splashscreen.bmp
    gun.bmp

    …and, finally, in the root (toplevel) directory:
    maps
    music
    graphics
    mainconfig.cfg

    The programmer does not know and does not care where each of these files came from, and what sort of archive (if any) is storing them. But if he needs to know, he can find out through the PhysicsFS API. Furthermore, he can take comfort in knowing that those untrusted scripts we mentioned earlier can’t access any other files than these. The file entries “.” and “..” are explicitly forbidden in PhysicsFS.”
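
The merged-listing behaviour described in that quote can be illustrated with a toy sketch. This is not the PhysicsFS API and does not use the library at all; the class `SearchPath` and its methods are invented here, built only on Python's standard `zipfile` and `os` modules. It also only merges names: which source wins when two of them contain the same file is not modelled.

```python
import os
import zipfile


class SearchPath:
    """Toy union of real folders and zip archives (hypothetical, not PhysicsFS)."""

    def __init__(self):
        # Virtual file paths from every mounted source, always with forward slashes.
        self._files = set()

    def mount(self, source):
        """Add a real folder or a zip archive to the search path."""
        if os.path.isdir(source):
            for folder, _subfolders, filenames in os.walk(source):
                for name in filenames:
                    relative = os.path.relpath(os.path.join(folder, name), source)
                    self._files.add(relative.replace(os.sep, "/"))
        else:
            with zipfile.ZipFile(source) as archive:
                # Archive entries ending in "/" are folders, not files.
                self._files.update(
                    n for n in archive.namelist() if not n.endswith("/")
                )

    def list_folder(self, folder=""):
        """Return the sorted names directly inside a virtual folder."""
        folder = folder.strip("/")
        prefix = folder + "/" if folder else ""
        children = set()
        for path in self._files:
            if path.startswith(prefix):
                # Keep only the first path component below the requested folder.
                children.add(path[len(prefix):].split("/", 1)[0])
        return sorted(children)
```

Mounting two zip files and a game folder laid out as in the quoted example, then calling `list_folder("music")`, would give `intro.mid` and `hero.mid` in one listing, with the caller never needing to know which source held which file.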
