Optimizing dilemma of the day

September 15, 2014 cliffski

Below is an image from Gratuitous Space Battles 2’s ship design screen. On the left is my problem. That’s a load of ship components you can add to a hull, rotate, change size, color etc. All very cool. The problem is that you might choose to have BIG versions of some of them, as main structure bits, so the source graphics have to be big, normally 256 pixels square for sub-components.

OK, so that’s cool, but when I load in those icons, each one is a DDS file that is 256 square, which means about 170k in the format I’m using. If I have 300 of them (rough guess), that’s 51MB of file access, which is bad but not catastrophic. Worse, it means 300 distinct file accesses, which is slow, even after I’ve rewritten the DDS loader to be massively faster. As a result, when you click on ‘edit appearance’, there *might* be a slight delay, which is intolerably awful for someone like me with zero patience. And I have a FAST PC; I want this to be fast and smooth on low-spec machines too.

So as I see it the options are:

1) Only load visible ones, then load the others as you scroll (could be irritating for scrolling)

2) Load in placeholders, and spin off the file accessing portion of the texture load into a separate thread, then when they are all there, interleave the texture creation with the display frames of the main thread (DX9 so only main thread may do DX stuff). This seems ultra complex and hacky.

3) Save out small preview images for each item, and load those instead. Less memory, but a bunch of useless duplicate files AND still 300 file accesses.

4) Stick em all in a single big pak file and see if that’s quicker. This is easy, but I find it messy during development as I’m always adding/removing/editing files in those folders, so I need a hybrid debug/release system.

I think I might have to go with 4…
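For what it’s worth, option 4 can be sketched in well under a hundred lines. This is only an illustration, not the format the game uses: the layout (a file count, then name/offset/size index entries, then the raw bytes) and the names `writePak` and `PakReader` are made up for the example, and the numbers are written in native byte order for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Hypothetical layout, invented for this sketch (not a real pak format):
//   uint32 count, then per file: uint32 nameLen, name, uint64 offset, uint64 size,
//   then every file's bytes back to back. Offsets count from the start of the pak.
using Blob = std::vector<char>;

template <typename T>
void put(std::ofstream& out, const T& v) {
    out.write(reinterpret_cast<const char*>(&v), sizeof v);
}

bool writePak(const std::string& path, const std::map<std::string, Blob>& files) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    std::uint64_t offset = sizeof(std::uint32_t);   // grows to the total index size
    for (const auto& f : files)
        offset += sizeof(std::uint32_t) + f.first.size() + 2 * sizeof(std::uint64_t);
    put(out, static_cast<std::uint32_t>(files.size()));
    for (const auto& f : files) {                   // index: where each file's bytes live
        put(out, static_cast<std::uint32_t>(f.first.size()));
        out.write(f.first.data(), static_cast<std::streamsize>(f.first.size()));
        put(out, offset);
        put(out, static_cast<std::uint64_t>(f.second.size()));
        offset += f.second.size();
    }
    for (const auto& f : files)                     // payload, in the same order
        out.write(f.second.data(), static_cast<std::streamsize>(f.second.size()));
    return static_cast<bool>(out);
}

class PakReader {
public:
    // One open and one read for the whole pak, instead of one per icon.
    bool open(const std::string& path) {
        std::ifstream in(path, std::ios::binary | std::ios::ate);
        if (!in) return false;
        const std::streamoff total = in.tellg();
        if (total < 0) return false;
        data_.resize(static_cast<std::size_t>(total));
        in.seekg(0);
        if (total > 0 && !in.read(data_.data(), total)) return false;
        std::size_t pos = 0;
        std::uint32_t count = 0;
        if (!get(pos, count)) return false;
        for (std::uint32_t i = 0; i < count; ++i) {
            std::uint32_t nameLen = 0;
            Entry e{};
            if (!get(pos, nameLen) || nameLen > data_.size() - pos) return false;
            std::string name(data_.data() + pos, nameLen);
            pos += nameLen;
            if (!get(pos, e.offset) || !get(pos, e.size)) return false;
            if (e.offset > data_.size() || e.size > data_.size() - e.offset) return false;
            index_[name] = e;
        }
        return true;
    }
    // Returns a pointer into the in-memory pak, or nullptr if the name is unknown.
    const char* find(const std::string& name, std::size_t& size) const {
        const auto it = index_.find(name);
        if (it == index_.end()) return nullptr;
        size = static_cast<std::size_t>(it->second.size);
        return data_.data() + it->second.offset;
    }
private:
    struct Entry { std::uint64_t offset; std::uint64_t size; };
    template <typename T>
    bool get(std::size_t& pos, T& v) const {
        assert(pos <= data_.size());                // invariant kept by open()
        if (sizeof(T) > data_.size() - pos) return false;
        std::memcpy(&v, data_.data() + pos, sizeof(T));
        pos += sizeof(T);
        return true;
    }
    Blob data_;
    std::map<std::string, Entry> index_;
};
```

The point of the sketch is just that the 300 file accesses collapse into one; each icon is then a `find()` on memory that is already loaded, and the DDS parsing can work straight from that pointer.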

14 thoughts on Optimizing dilemma of the day

Carl says:

September 15, 2014 at 12:36 pm

I’m no expert about this kinda thing but couldn’t you just combine all the thumbnails into one single image and then effectively “image map” the individual pieces so you get what you expect when you click on them?

I’ll grant if you have any sorting capabilities you’d need a different combined image for each category or you’d have to cleverly organize another way to minimize the wasted space and just have each category jump to a different point in the image.

Sam Swain says:

September 15, 2014 at 12:44 pm

I’d say: “whatever it takes to make the UX slick” (you are one, users are many, yadda yadda). Probably a combination of 2, 3, and 4. A caching/atlasing layer to put small versions for UI together into UI textures (kinda like font caching), load those for UI (as close to the order they’re needed), then spread loading of larger ones in background. The DX threading thing is something you should probably support anyway to smooth things generally, I’d have thought. Could probably load the un-needed (not visible) small version textures in the background too for faster UI setup time.

Tom H. says:

September 15, 2014 at 2:55 pm

Sprite sheet / atlas is probably the performant way to get lots of little textures on screen. If different races’ ships are going to have different graphics, you can have one sprite sheet per race.

(Gets a little harder to work when you have mods + expansions, but you can still be O(races + mods + expansions) rather than O(components).)
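
To make the atlas idea concrete, here is a minimal sketch of the lookup side. It assumes the thumbnails are all the same square size and packed row by row into one texture; the names `atlasCell` and `toUV` and the 64-pixel thumbnail size used below are assumptions for illustration, not anything from the game.

```cpp
#include <cassert>

struct PixelRect { int x, y, w, h; };        // rectangle inside the atlas, in pixels
struct UVRect    { float u0, v0, u1, v1; };  // same rectangle as 0..1 texture coordinates

// Cell for the Nth thumbnail, assuming equal-sized square cells packed row-major.
PixelRect atlasCell(int index, int atlasWidth, int cellSize) {
    assert(index >= 0 && cellSize > 0 && atlasWidth >= cellSize);
    const int columns = atlasWidth / cellSize;
    return PixelRect{ (index % columns) * cellSize,
                      (index / columns) * cellSize,
                      cellSize, cellSize };
}

// Convert a pixel rectangle to texture coordinates for drawing the quad.
UVRect toUV(const PixelRect& r, int atlasWidth, int atlasHeight) {
    const float w = static_cast<float>(atlasWidth);
    const float h = static_cast<float>(atlasHeight);
    return UVRect{ r.x / w, r.y / h, (r.x + r.w) / w, (r.y + r.h) / h };
}
```

With 64-pixel thumbnails, a 2048x1024 atlas has 32 x 16 = 512 cells, so roughly 300 components would fit in a single texture and a single file read. The same function doubles as the “image map” hit test in reverse: divide the mouse position by the cell size to get the index.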

e-dog says:

September 15, 2014 at 3:09 pm

3+4, but load mipmaps instead of preview images?

cliffski says:

September 15, 2014 at 3:09 pm

I’m kinda guessing DirectX is clever enough to load the mipmaps only anyway, but it’s still 300 file reads.

Jovoc says:

September 15, 2014 at 6:37 pm

This is why having a good asset build system is so important. (I have no idea what your asset build setup is like). An asset build step could make thumbnails of these and paste them up to a few big atlases, and you could just load a handful of those atlases and you’re good. So much of optimization boils down to precalculating things and getting them in good shape for the runtime.

Alex says:

September 15, 2014 at 9:06 pm

Could you pick option 4, but write yourself something which will quickly create a new .pak for you (say, < 20 seconds) after you've made a change?

Isaac says:

September 16, 2014 at 3:40 am

How about taking the sprite sheet a bit further, and have the whole part grid be a single image map (like old-school html), and do animation/variation loads when there’s a mouseover event, or if you are tricky, a mouse event nearby.

CdrJameson says:

September 16, 2014 at 1:26 pm

For combining debug/release/modding you could generate your pak file, but have a folder for overrides. You then load everything from the pak, then replace what you’ve loaded with anything you’ve put in the override directory.

You only need un-pak’d versions of the stuff you’re working on at that minute, and when you’re done you can roll the final version into the main pak file.

Modders can use the same system without mucking up their original assets, which are safe in the pak.

There’s a bit of redundant effort, loading stuff and then ditching it, but it might be worthwhile.
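
A rough sketch of the override idea, with one variation stated plainly: instead of loading everything from the pak and then replacing it, this version checks the override folder first and only falls back to the pak when no loose file exists, which avoids the redundant load. The names `readLooseFile` and `loadAsset` are hypothetical, and the `std::map` simply stands in for whatever the pak has already loaded into memory.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <map>
#include <string>
#include <vector>

using Blob = std::vector<char>;

// Reads a whole loose file; returns false if it cannot be opened.
bool readLooseFile(const std::string& path, Blob& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    return true;
}

// A loose file in overrideDir wins; otherwise fall back to the pak contents.
// `pak` is a stand-in (assumption) for data already loaded from the pak file.
bool loadAsset(const std::string& name, const std::string& overrideDir,
               const std::map<std::string, Blob>& pak, Blob& out) {
    assert(!name.empty());
    if (readLooseFile(overrideDir + "/" + name, out)) return true;
    const auto it = pak.find(name);
    if (it == pak.end()) return false;
    out = it->second;
    return true;
}
```

During development the override folder holds whatever is being edited that minute; in a release build it would normally be empty, so the only extra cost is one failed open per asset. Modders get the same mechanism for free.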

Bram says:

September 16, 2014 at 5:22 pm

Save them in a single 3D TIFF.
You only have to open a single file for all imgs.
And can probably use off the shelf image software that has tiff support to author the file.

John says:

September 17, 2014 at 11:13 am

Interleaving loading in number 2) is only a hack if you make it so.

Why not make it a thread with an input “work queue” and have it pick up requests from that queue (probably little more than function pointers or, in better C++, std::function objects with bound parameters)?

Then “render a frame” and “create texture object from this memory” are just commands you push onto its input queue for it to work on. It’s a fairly standard way to organize things where you need work to be done on a particular thread (usually to avoid having to lock, but also for reasons like this).
(Obviously you’d want to have a system for deciding which command to work on next as rendering might take priority).

I’m probably telling you stuff you already know so I’ll shut up now :)
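
The work queue described here might look something like the sketch below. It is a minimal illustration, not a full job system: `MainThreadQueue`, `post` and `runPending` are made-up names, there is no prioritization, and the jobs are plain `std::function<void()>` objects. Any thread can post a job, but only the thread that owns the device (the main thread, under the DX9 constraint from option 2) ever runs them.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

class MainThreadQueue {
public:
    using Job = std::function<void()>;

    // Safe to call from any thread, e.g. a loader thread that has just
    // finished reading a file into memory.
    void post(Job job) {
        assert(job && "posting an empty job");
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push_back(std::move(job));
    }

    // Called by the main thread once per frame. Runs at most maxJobs jobs so
    // texture creation is spread across frames; returns how many actually ran.
    std::size_t runPending(std::size_t maxJobs) {
        std::size_t ran = 0;
        while (ran < maxJobs) {
            Job job;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (jobs_.empty()) break;
                job = std::move(jobs_.front());
                jobs_.pop_front();
            }
            job();   // run outside the lock so a job may post() more work
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::deque<Job> jobs_;
};
```

In use, the loader thread would read each file and then post a job that captures the bytes and creates the texture, while the main loop calls something like `runPending(4)` before rendering each frame (the 4 is an arbitrary budget). Placeholders stay on screen until the matching job has run.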

ac says:

September 20, 2014 at 2:19 pm

Or 1) only load visible ones, then, once they have loaded, start generating in-memory cached images of the ones still to be loaded. Generate at least 2 pages forward if there’s a potentially unlimited number of pages… If you save the cache to disk then save it in one file.

Even the crappiest PCs tend to read from disk at 50 MB/s or more these days if stuff is in one file and not fragmented. Still, if a low-end laptop is a target, I would also pre-cache the most commonly seen items/the first page items during e.g. the intro screen, so they appear instantly when you actually need them.

The only tricky bit that comes to mind is invalidating the cache when full screen resolution changes.
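
The “visible first, then a couple of pages ahead” scheme boils down to two index ranges. A minimal sketch, assuming the components are shown in fixed-size pages; `visibleRange`, `prefetchRange` and the page size in the note below are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct IndexRange { std::size_t first, last; };   // half-open: [first, last)

// Items that must be loaded right now for the given page.
IndexRange visibleRange(std::size_t page, std::size_t perPage, std::size_t total) {
    assert(perPage > 0);
    const std::size_t first = std::min(page * perPage, total);
    return IndexRange{ first, std::min(first + perPage, total) };
}

// Items worth loading in the background once the visible ones are done.
IndexRange prefetchRange(std::size_t page, std::size_t perPage, std::size_t total,
                         std::size_t pagesAhead = 2) {
    const IndexRange visible = visibleRange(page, perPage, total);
    return IndexRange{ visible.last,
                       std::min(visible.last + pagesAhead * perPage, total) };
}
```

With 300 components and, say, 24 per page, opening the screen loads items 0 to 23 immediately and queues 24 to 71 for the background; the ranges clamp at the end of the list, so the last page simply has nothing left to prefetch.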

cliffski says:

September 20, 2014 at 3:14 pm

fortunately invalidating isn’t a problem, as I decided ages ago that if you can optimize load times (and I do!) the best solution to resolution changes is to actually ditch the whole app and re-launch it instantly :D

Bill D. Strong says:

September 22, 2014 at 2:11 am

What about a more hybrid approach?

From approach number 3, save out the preview, but make it an atlas of all of the images. If you are not doing any 3d things with the files, choose a smaller compression format without mipmapping, to save load times. (They are really overkill for a 2d interface. Especially if they are only being used in this specific suggestion as a 2d scrolling picture.)

Now, load in your original files as necessary when the user clicks on one to move it to the ship. You can then optimize for best performance/UX and not have to load all 300 images at one time. This assumes the user can’t select multiple objects at once, of course.
