Systematic Gaming

September 11, 2008

Load Times: Compression

Filed under: game programming, memory management — systematicgaming @ 11:49 am

We’ve gone over what happens when we read files and come up with some ways to reduce stalls between our game, the OS, and the hardware. We designed a file manager that allows us to read files asynchronously and remove all the wait time. What next?

Well, the simplest way to reduce load times is to load less data. If we simply compress our data we’ll reduce load times a lot. General compression algorithms can often get a 50% or more reduction in file size, which translates directly into reduced load time.

Compression

Compression is pretty straightforward in concept – compress your file and decompress it after loading. However there’s a little more to it than that. First of all, we need to decide how to compress our data. There are two main categories of compression: general compression and type specific compression.

General compression refers to lossless algorithms like those used in zip files: Lempel-Ziv, Huffman, and so on. You can compress any data with a general compression system. Type specific compression knows something about the data being compressed and uses knowledge of that format to reduce the data size. Type specific formats are often lossy: MP3s are one example, as are S3TC/DXT texture formats. Type specific compression is usually much better at reducing data size than general compression.

Type Specific Compression

Type specific compression is very useful and important – you should use it whenever appropriate. For example, keep your textures in a GPU supported compressed format like S3TC (or DXT or BC, or whatever people call it now). If you really need smaller data, try using JPEG or PNG for textures. You can’t render with them directly but they can reduce load times a lot, which may be important if you’re creating a downloadable game. Compressed audio formats are almost mandatory nowadays, and uncompressed video would just be silly.
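
To give a feel for why block-compressed texture formats are so compact, here’s a rough sketch of the layout of a single DXT1/BC1 block (the struct and field names are just mine for illustration, not taken from any particular SDK):

#include <stdint.h>

// One DXT1/BC1 block encodes a 4x4 tile of pixels in 8 bytes:
// two RGB565 endpoint colors plus a 2-bit index per pixel selecting
// one of four colors interpolated between the endpoints.
struct Dxt1Block
{
   uint16_t color0;    // endpoint color 0, RGB565
   uint16_t color1;    // endpoint color 1, RGB565
   uint32_t indices;   // 16 pixels x 2-bit palette index
};

// 8 bytes per 16 pixels is 0.5 bytes per pixel, versus 4 bytes per pixel
// for uncompressed RGBA8: an 8:1 reduction that the GPU can read directly.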

It’s also possible to create your own compressed formats. It requires extra design work, but a data format tuned for size will often beat any general compression. There are a variety of techniques to reduce size; some are pretty simple:

  • Bit packing is simply using non-native sized values. For example, using single bits for boolean flags. If you have a value with a range [0,15] you only need 4 bits, not a whole byte or 4-byte integer.
  • Run length encoding is a simple technique for reducing repeated data. Instead of storing an array of ten 1’s, store a 10 and a 1.  This works very well for sparse data.
  • Quantization is reducing the range of data, usually used with floating point data. If you don’t need the full range of a 32-bit float you can simply store it as a 16-bit float (or any arbitrary bit length). Quantization is very common on 3D models where many attributes (normals, UVs, etc.) don’t need full precision.
  • Delta encoding is a trick for storing large numbers within a known range. Instead of storing the actual values, you store a base number and offsets. So if you have the values 10000, 10004, 9999, 10001, you can save space by writing 10000, +4, -1, +1, where the offsets can be stored using a smaller number of bits.
  • Curve fitting can reduce the number of samples required to represent a high-density curve. Animation data can benefit from this: the idea is to choose a simpler curve (or curve type) that needs fewer control points to approximate the original curve.

Many of these techniques complement each other very well; delta encoding and quantization in particular work well together (see the sketch below). Quantization and curve fitting are both lossy methods, so you have to be careful that you don’t break stuff.
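
As a sketch of how those two combine, here is a hypothetical encoder for the 10000, +4, -1, +1 style of data mentioned above (the 8-bit offset width is just an assumption for illustration; pick whatever width your real data needs):

#include <stdint.h>

// Store one full-precision base value followed by small quantized offsets
// from that base. Assumes every value stays within +/-127 of the base.
void DeltaEncode( const int32_t *values, int count,
                  int32_t *baseOut, int8_t *offsetsOut )
{
   *baseOut = values[0];
   for ( int i = 1; i < count; i++ )
   {
      offsetsOut[i - 1] = (int8_t)( values[i] - values[0] );
   }
}

// Decoding is just base + offset, and every value after the first now
// costs one byte instead of four.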

Ideally you can use these smaller data formats directly in your game code, rather than having to decompress them after loading. Designing a custom compressed data format can be time consuming, but done properly a smaller data size will also speed up your game, since smaller data is more cache friendly.

General Compression

We’ll focus on how to integrate general compression into our file manager by adding zlib support. zlib implements the same compression as zip files, so it’s pretty efficient as a general compressor. There are many compression libraries around, but we’ll use zlib because it’s readily available and simple to use.

What we need to do is:

  1. Read the compressed file
  2. Allocate enough memory for the decompressed data
  3. Decompress it

We will do this inside our FileManager class, so the user of our file system doesn’t need to know whether a file is compressed or not. All we have to do is decompress the data when the read operation is completed.

But first things first – we need to create compressed data files. Let’s use zlib to create a simple command-line compression tool.

#include <stdio.h>
#include <stdint.h>
#include <zlib.h>

int main(int argc, char *argv[])
{
   if ( argc < 3 )
   {
      fprintf(stderr, "usage: compress <input file> <output file>\n");
      return -1;
   }
   const char *fileNameIn = argv[1];
   const char *fileNameOut = argv[2];

   // read input file
   FILE *fileIn = fopen( fileNameIn, "rb" );
   fseek( fileIn, 0, SEEK_END );
   uint32_t sourceSize = (uint32_t)ftell( fileIn );
   char *sourceData = new char[ sourceSize ];
   fseek( fileIn, 0, SEEK_SET );
   fread( sourceData, 1, sourceSize, fileIn );
   fclose( fileIn );

   // compress with an extra large buffer
   // we don't know how large the output is until we compress it,
   // so be safe and assume it won't be larger than 2x the source
   // (zlib's compressBound() gives a proper upper bound)
   uLongf compressedSize = 2 * sourceSize;
   char *compressedBuffer = new char[ compressedSize ];

   int result = compress( (Bytef *)compressedBuffer, &compressedSize,
                          (const Bytef *)sourceData, sourceSize );
   if ( Z_OK != result )
   {
      fprintf(stderr, "Unable to compress %s: error %d\n", fileNameIn, result );
      return -1;
   }

   // create our output file
   // zlib's compressed data doesn't store the original size, so we'll
   // prepend a little header - a magic number followed by the original size
   FILE *fileOut = fopen( fileNameOut, "wb" );
   const uint32_t MagicNumber = 0xbeef9999;
   fwrite( &MagicNumber, 1, sizeof(uint32_t), fileOut );
   fwrite( &sourceSize, 1, sizeof(uint32_t), fileOut );
   fwrite( compressedBuffer, 1, compressedSize, fileOut );
   fclose( fileOut );

   delete [] sourceData;
   delete [] compressedBuffer;

   return 0;
}

We simply run our executable like so

compress myfile.dat myfile.zzz

Now we have our compressed data, with a small header we can use to tell compressed data apart from normal data. Let’s modify the file manager we designed in the last article to handle the decompression.

void FileManager::Update()
{
   for ( int i = 0; i < MAX_OPERATIONS; i++ )
   {
      switch ( mOperations[i].GetOperation() )
      {
         case OP_NONE:
            break;

         case OP_OPEN:
         {
            // we're opened, so start reading
            DWORD fileSize = GetFileSize( mOperations[i].mHandle, NULL );

            uint8_t *fileBuffer = new uint8_t[ fileSize ];
            *mParams[i].mBufferPtr = fileBuffer;
            *mParams[i].mSizePtr = fileSize;

            mOperations[i].Read( fileBuffer, fileSize );
            break;
         }

         case OP_READ:
         {
            // overlapped reads are asynchronous, so poll, if done close
            DWORD bytesRead = 0;
            bool complete = GetOverlappedResult(
                  mOperations[i].mHandle,
                  &mOperations[i].mOverlapped, &bytesRead, FALSE ) != FALSE;

            if ( complete )
            {
               mOperations[i].mState = STATE_END;
               mOperations[i].Close();

               // check for our magic number (the header is 8 bytes)
               if ( mOperations[i].mSize > 8 )
               {
                  if ( *((uint32_t *)mOperations[i].mBuffer) == 0xbeef9999 )
                  {
                     DecompressData( i );
                  }
               }
            }
            break;
         }

         case OP_CLOSE:
            // nothing to do here
            break;
      }
   }
}

void FileManager::DecompressData(FileHandle file)
{
   // decompress the buffer

   // our header is two 32-bit values: the magic number, then the original size
   const uint32_t headerSize = 2 * sizeof(uint32_t);

   // get the decompressed size and allocate a new data buffer
   uLongf decompressedSize = ((uint32_t *)mOperations[file].mBuffer)[1];
   uint8_t *decompressedData = new uint8_t[ decompressedSize ];

   // call zlib to decompress the data, skipping our header
   int result = uncompress(
      decompressedData, &decompressedSize,
      (const Bytef *)mOperations[file].mBuffer + headerSize,
      mOperations[file].mSize - headerSize );

   if ( Z_OK != result )
   {
      // error decompressing, many possible reasons
      // for now, just assume the file data is bad
      // possibly this isn't compressed data
      mOperations[file].mState = STATE_ERROR;
      delete [] decompressedData;
      return;
   }

   // redirect our return parameters to the decompressed data
   delete [] (uint8_t *)(*mParams[file].mBufferPtr);

   *mParams[file].mSizePtr = decompressedSize;
   *mParams[file].mBufferPtr = decompressedData;
}

I hope that looks pretty straightforward. When a read operation completes we check the data for some magic number to see if it’s compressed or not. Then we simply decompress the file into a new buffer and return that, after deleting the raw file data.

This decompression implementation has some problems. The first is that we’re decompressing the data all at once, which can take quite a bit of CPU time for larger files. One solution is to put the file system on a low-priority thread, which is pretty easy to do: just add a mutex around the FileManager methods. This assumes you have a second processor to run on, which is not always the case.
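
A minimal sketch of that idea, assuming a C++11-style threading API rather than whatever your platform actually provides (mMutex and mShutdown are hypothetical members, not part of the FileManager from the earlier articles):

#include <mutex>
#include <thread>

// Run the file manager's update loop on a worker thread. Every public
// FileManager method would take the same mutex before touching
// mOperations or mParams.
void FileManager::WorkerLoop()
{
   while ( !mShutdown )
   {
      {
         std::lock_guard<std::mutex> lock( mMutex );
         Update();   // reads and decompression now happen off the main thread
      }
      std::this_thread::yield();   // a real version would also lower the
                                   // thread's priority
   }
}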

Another problem is the memory overhead required by this system. To load and decompress a file we need (Compressed Size + Decompressed Size) bytes. That can increase the memory high watermark quite a lot. We can solve that by not decompressing the whole file in one go: if we read in fixed-size blocks (4kB, 64kB, whatever) we can decompress block by block for much lower memory usage.

Compressing data in smaller blocks generally gives lower compression ratios than larger blocks. It also means you’ll have to have multiple block-sized read operations in flight at the same time to keep the disc busy while you decompress. It can take a fair bit of tuning to find the right block size.
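
Block-by-block decompression maps onto zlib’s streaming inflate API. Here’s a rough sketch of the per-block step, with error handling trimmed down (the function itself is hypothetical; the z_stream would be set up once with inflateInit() before the first block and torn down with inflateEnd() after the last):

#include <stdint.h>
#include <zlib.h>

// Feed one block of compressed input through an already-initialized
// z_stream, producing output into a small fixed-size buffer until the
// input block is consumed.
bool InflateBlock( z_stream *strm, const uint8_t *inData, unsigned int inSize,
                   uint8_t *outBuffer, unsigned int outSize )
{
   strm->next_in = (Bytef *)inData;
   strm->avail_in = (uInt)inSize;

   do
   {
      strm->next_out = outBuffer;
      strm->avail_out = (uInt)outSize;

      int result = inflate( strm, Z_NO_FLUSH );
      if ( result != Z_OK && result != Z_STREAM_END )
         return false;

      unsigned int produced = outSize - strm->avail_out;
      // copy 'produced' bytes to their final destination here
      (void)produced;
   } while ( strm->avail_out == 0 );

   return true;
}

With this approach only one compressed block and one output buffer need to be live at a time, instead of the whole compressed file plus the whole decompressed file.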

A more subtle problem is the timing of the file system. With the compressed data format we created, we don’t know the decompressed data size until the file is loaded. And since asynchronous file IO is not really deterministic, we don’t know exactly when the memory for the decompressed data will be allocated. 99.9% of the time it won’t make any difference. However, if timing changes, because of a scratched DVD or something, a file will load at a different time than expected, causing a higher memory high watermark than normal, and possibly a crash.

This class of bug is a really ugly one – it’s not at all obvious and it’s timing and hardware related, meaning you’ll almost never catch it in a debugger. Many programs ship with this type of bug because it’s difficult to catch in testing.

We could avoid this particular problem if we knew the size of the file beforehand. Also, if we knew whether a file was compressed before loading it, we could remove that ugly magic number.

Next post we’ll look at packfiles, which let us do that and more.
