Systematic Gaming

August 7, 2008

Memory Management: Overview of Memory

Filed under: game programming, memory management — systematicgaming @ 2:19 pm

Before we can really delve into the design of a memory management system we need a solid understanding of memory. In this post we’ll go over the different types of memory and look at how they are used in your average program. We’ll also look at the main issues each memory type faces, such as lifetime, scope and fragmentation.

Disclaimer: You’ll see some code samples in the following article; they’re in pseudo-C, not necessarily compilable or even compliant. They’re included for illustration and discussion purposes.

Heaps of Stacks

In a C/C++ program memory can generally be categorized as coming from the heap, the stack or global memory. The heap is where dynamic allocation comes from, anything created with ‘new’ or ‘malloc’, and is what most people think of when considering memory management. The stack is where local variables and function parameters are allocated from, and is generally handled behind the scenes by your trusty compiler. Lastly, global memory (sometimes called static memory) is where all your global and static variables get stored. Those are the main types of memory. How are they different? How are they organized?

Global Memory

Global memory is one or more fixed chunks of memory used to store all the compiler-generated data. It’s usually laid out by the linker when you compile your program, and organized by the type of data stored. Take the following hello world program:

#include <stdio.h>

int globalInt = 3;
float globalArray[1000];
int main(int argc, char *argv[])
{
   static int someStatic = 1;
   const char *text = "hello world!\n";
   for ( int i = 0 ; i < globalInt; i++ )
   {
      printf(text);
   }
   return 0;
}

The above program, when compiled and linked, will contain both code and data. What does it look like in memory? Well, it will be laid out something like this:

[Figure: Global memory layout]

The code is stored in a .text section, the .data section contains initialized data like our globalInt and “hello world” variables, while the .bss section (historically short for “Block Started by Symbol”) contains uninitialized data like our globalArray. Sometimes linkers will put read-only data, like our “hello world” string, into a separate section from writable data, like our globalInt variable. (In reality it’s a bit more complicated, since often different sections are loaded at different memory addresses, but we’ll be looking at memory addresses in more detail later.)

So, that’s how global data is managed in memory: fixed-size blocks of memory that are allocated when we start our application. Usually there’s no need to alter it much, but it helps to understand how this data is managed and the effect global and static data has on memory.

Stack Memory

Stack memory is used to store local variables and arguments to functions. As the name implies, the stack acts as a stack of allocations using a LIFO (Last In First Out) ordering. You can push (allocate) onto the top or pop (deallocate) from the top of the stack; that’s it. The size of the stack is usually determined by the linker or when you create a thread (each thread has its own stack). Let’s look at the following sample code and see how the stack behaves.

int main()
{
   int x = 3;
   const char *str = "hello";
   {
      float data[4];
   }
   char str2[100];
}

Running this program, we’ll get the following sequence of stack operations (assuming a standard 32-bit CPU):

push x        + sizeof(int) = 4 bytes
push str      + sizeof(char *) = 4 bytes
push data     + 4 * sizeof(float) = 16 bytes
pop data      - 4 * sizeof(float) = 16 bytes
push str2     + 100 * sizeof(char) = 100 bytes
pop str2      - 100 * sizeof(char) = 100 bytes
pop str       - sizeof(char *) = 4 bytes
pop x         - sizeof(int) = 4 bytes

Okay, so not very exciting. Keep in mind, though, how little effort it takes to manage this memory: by simply tracking the top of the stack we can allocate memory very quickly. This works because we know how the program will execute; by entering and exiting functions and nested blocks we control how memory is allocated and freed from the stack. Let’s look at another piece of code.

int main()
{
   int buffer1[1024];
   int buffer2[1024];
   int buffer3[1024];

   initialize_bufferA( buffer1 );
   initialize_bufferB( buffer2 );
   initialize_bufferC( buffer3 );

   do_something( buffer1, buffer2 );
   do_something_else( buffer1, buffer3 );
}

We push three 4Kb arrays onto the stack, initialize some stuff, then do some processing. No big deal, right?

Well, what if we only have 10Kb of stack space? 3 * 4Kb is 12Kb, so our stack isn’t large enough. We’ll get what’s usually called a stack exception, or in layman’s terms, a crash. Stack exceptions are triggered when an application goes past the end of the stack. They are often seen when you have an infinitely recursive function call, or a very deep call stack.
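
For example, here’s a quick sketch (the function name is made up for illustration) of the kind of unbounded recursion that will run off the end of any fixed-size stack:

// each call pushes another 4Kb of locals onto the stack
// with a 10Kb stack this blows through the limit after a couple of calls
void recurse_forever(int depth)
{
   int scratch[1024];            // 4Kb of locals per call
   scratch[0] = depth;
   recurse_forever(depth + 1);   // never returns, the stack just keeps growing
}

Our three-buffer example above isn’t recursive, but it hits the same wall: it simply needs more stack space than exists.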

So, how can we fix this problem? One way is to make more efficient use of the stack using our understanding of how it allocates:

int main()
{
   int buffer1[1024];
   initialize_bufferA( buffer1 );

   {
      int buffer2[1024];
      initialize_bufferB( buffer2 );
      do_something( buffer1, buffer2 );
   }

   {
      int buffer3[1024];
      initialize_bufferC( buffer3 );
      do_something_else( buffer1, buffer3 );
   }
}

Not the prettiest code, but it gets the job done. Now buffer2 and buffer3 are allocated within different scopes, so we only have two buffers in memory at once and only use 8Kb for them. Assuming the other functions don’t use much stack space, we can now run without causing an exception. We’ve taken advantage of how the call stack works, and of the fact that buffer2 and buffer3 have shorter lifetimes and more limited scope, to reduce the maximum memory required. Understanding lifetime and scope is very important when you want to maximize your usage of memory!
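
In fact, this kind of LIFO allocation is so cheap that games often implement it by hand for their own temporary memory. As a rough illustration, here’s a minimal sketch of a stack-style allocator (the class and names are invented for this example, not part of any real API): allocating is just a pointer bump, and freeing back to a saved marker releases everything allocated after it, much like leaving a scope.

#include <stdint.h>
#include <stddef.h>

// Minimal sketch of a stack-style allocator (illustrative only).
// It manages a block of memory the same way the call stack manages locals.
class StackAllocator
{
public:
   typedef uint8_t *Marker;   // a marker is just a saved "top of stack"

   StackAllocator(void *start, size_t size)
      : mTop((uint8_t *)start), mEnd((uint8_t *)start + size) {}

   // allocation is a pointer bump: no searching, no per-block bookkeeping
   void *Alloc(size_t size)
   {
      if ( mTop + size > mEnd )
         return NULL;            // out of space
      void *ptr = mTop;
      mTop += size;
      return ptr;
   }

   // save the current top...
   Marker GetMarker() const { return mTop; }

   // ...and freeing back to it releases everything allocated after it (LIFO)
   void FreeToMarker(Marker marker) { mTop = marker; }

private:
   uint8_t *mTop;
   uint8_t *mEnd;
};

A proper heap allocator, which we look at next, has to work much harder because it can’t rely on this strict LIFO ordering.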

Heap Memory

Heap memory is what most people think about when discussing memory. The heap is where every object you new or malloc is allocated from, and is also the largest memory resource available to us. The heap will be the focus of most of this series.

What is the heap? Basically, all the memory not used by the stack or global memory.

Where does the heap come from? From the OS.

So if the OS manages our heap, then memory management problem solved, right? Well… not quite. The OS manages memory at the physical (and virtual) page level, usually in large chunks of 4Kb to 16MB or more (it depends on the CPU and OS). When you call malloc it’s the C runtime library that requests pages of memory from the OS and then divvies up those pages to fit your request. Now malloc is nice and all, but since this series is about memory management we’ll be focusing on managing our own memory.

But you know what? Managing individual pages of RAM is a pain in the ass. We’re not running a multi-process OS, and we don’t need to page memory in and out to disk; our goal is to manage the memory of a single game. So the simplest approach is to just grab all the available pages of memory when our program starts and divvy them up as needed. That’s the approach this series on memory management is going to take.
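
As a rough sketch of what that looks like (the size, the function name and the use of plain malloc are placeholders; a console title would use whatever call its platform provides to claim all available memory up front):

#include <stdlib.h>
#include <stddef.h>

// grab one big block at startup and never touch the system allocator again
// (HEAP_SIZE is just a placeholder; on a console you'd claim everything available)
static const size_t HEAP_SIZE = 256 * 1024 * 1024;

static void *gHeapMemory = NULL;

void InitMemory()
{
   gHeapMemory = malloc(HEAP_SIZE);   // everything else is carved out of this block
}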

With the heap included our program’s memory layout looks something like this:

[Figure: Initial memory layout]

The diagram isn’t to any exact scale; in reality the heap would likely be 10x or more the size of the stack and global data combined.

Now that we have a heap, we have to figure out how to manage it. Unlike global data and the stack, the heap doesn’t really have a set allocation scheme, so we’ll need to roll our own. Let’s look at a simple heap management scheme.

Allocation

We’ll manage our heap at first with a very simple scheme: a list of free memory blocks and a list of allocated blocks. Initially we’ll have a single free block which represents the entire heap. We can’t get much simpler than that.

One important fact to remember is that this code is managing the heap, so we can’t really allocate memory. Where would we allocate it from? Instead we’ll be doing some “ugly” pointer tricks to manage our heap from within the heap. Really, it’s nothing too complicated, but if you’re accustomed to non-systems-level programming it may seem a bit wonky. I assure you it’s nothing to be scared of, and this sort of pointer manipulation is fairly common in lower-level systems code (for better or for worse).

#include <stdint.h>   // uint32_t, uint8_t
#include <stddef.h>   // size_t
#include <assert.h>   // assert

struct MemBlock
{
   uint32_t size;    // size of block in bytes
   void *data;       // pointer to actual block of data
   MemBlock *next;   // next memory block in list (or NULL)
};

class Heap
{
public:
   Heap(void *start, size_t size);

   void *Alloc(size_t size);
   void Free(void *ptr);

private:
   MemBlock *mFreeBlocks;
   MemBlock *mUsedBlocks;
};

Heap::Heap(void *start, size_t size)
{
   // setup our free block list using memory from the heap
   // the actual block of data starts just after this MemBlock
   mFreeBlocks = (MemBlock *)start;
   mFreeBlocks->size = size - sizeof(MemBlock);
   mFreeBlocks->data = (uint8_t *)start + sizeof(MemBlock);
   mFreeBlocks->next = NULL;

   mUsedBlocks = NULL;
}

// Allocate 'size' bytes from our heap, NULL if we cannot
void *Heap::Alloc(size_t size)
{
   // find the first free block with room for the data plus a new block header
   MemBlock *block = mFreeBlocks;
   while ( block && block->size < size + sizeof(MemBlock) )
   {
      block = block->next;
   }

   if ( block )
   {
      // we've found a large enough block, split it into a free and used block
      // we'll use the end of this block to allocate or memory from
      uint32_t sizeNeeded = size + sizeof(MemBlock);
      uint32_t offset = block->size - sizeNeeded;

      MemBlock *usedBlock = (MemBlock *)((uint8_t *)block->data + offset);
      usedBlock->size = size;
      usedBlock->data = ((uint8_t *)block->data + offset) + sizeof(MemBlock);

      // adjust the size of the free block
      block->size -= sizeNeeded;

      // add to our list of used blocks
      usedBlock->next = mUsedBlocks;
      mUsedBlocks = usedBlock;

      return usedBlock->data;
   }

   // we reached the end of the list, but no block was large enough
   // this means we're out of memory!
   return NULL;
}

// Free the memory at the given pointer location
void Heap::Free(void *ptr)
{
   MemBlock *prevBlock = NULL;
   MemBlock *block = mUsedBlocks;
   while (block && block->data != ptr )
   {
      prevBlock = block;
      block = block->next;
   }

   if ( block )
   {
      // remove from the used list
      if ( prevBlock )
      {
         prevBlock->next = block->next;
      }
      else
      {
         mUsedBlocks = block->next;
      }

      // insert back into the free list, merging blocks if needed
      MemBlock *prevFreeBlock = NULL;
      MemBlock *freeBlock = mFreeBlocks;
      while ( freeBlock && freeBlock->data < block->data )
      {
         prevFreeBlock = freeBlock;
         freeBlock = freeBlock->next;
      }

      // either merge with the previous free block (if it ends exactly where
      // this block starts) or insert the block into the free list in address order
      bool canMerge = prevFreeBlock &&
                      (MemBlock *)((uint8_t *)prevFreeBlock->data + prevFreeBlock->size) == block;

      if ( canMerge )
      {
         // merge with the previous free block, then keep merging with any
         // newly adjacent free blocks that follow
         prevFreeBlock->size += block->size + sizeof(MemBlock);
         while ( (MemBlock *)((uint8_t *)prevFreeBlock->data + prevFreeBlock->size) == prevFreeBlock->next )
         {
            prevFreeBlock->size += prevFreeBlock->next->size + sizeof(MemBlock);
            prevFreeBlock->next = prevFreeBlock->next->next;
         }
      }
      else
      {
         // insert in address order, just before the first free block that
         // follows it (a fuller version would also merge forward here)
         block->next = freeBlock;
         if ( prevFreeBlock )
         {
            prevFreeBlock->next = block;
         }
         else
         {
            mFreeBlocks = block;
         }
      }
   }
   else
   {
      // if block was NULL, it means we tried to free a pointer we didn't allocate
      // this usually indicates a bug in the program (or possibly our allocator)
      assert(false);
   }
}

Our allocator is a little long, but very simple. When we allocate memory we search our list of free blocks until we find one large enough. Then we simply cut off the amount of memory we need from the end, adding the new block to our list of used blocks.

When we free memory we look through our list of previously allocated blocks. When we find the matching block it gets removed from the used list and added back to the list of free blocks. However, since we want large contiguous blocks of memory, when we release a block we need to merge it with any adjacent free blocks. Let’s see how our allocator behaves if we have a 10Kb heap running some simple code:

void *ptr1 = gHeap.Alloc(1024);
void *ptr2 = gHeap.Alloc(2048);
gHeap.Free(ptr1);

[Figure: Sample heap allocation]

We can see that our heap gets divided into multiple blocks as we allocate and free memory. One thing to understand is that there is overhead for each allocation: 12 bytes per allocation with our allocator. This may not seem like much, but consider that a small allocation of, say, 4 bytes will actually take 16 bytes of memory, which is 300% overhead! Many small allocations will quickly waste a lot of memory. Additionally, we have to traverse a linked list each time we allocate or free memory. If we have thousands of allocations in our heap this can get pretty expensive.

Keep these issues in mind, we’ll return to them later on in this memory management series.
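
For reference, the 10Kb gHeap used in these snippets could be set up along these lines (the global and its backing array are assumptions for illustration; in a real game the Heap would be handed the big block we grabbed at startup):

// 10Kb of backing storage for the example heap
static uint8_t gHeapStorage[10 * 1024];

// the Heap carves its free/used block lists out of that storage
static Heap gHeap(gHeapStorage, sizeof(gHeapStorage));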

Fragmentation

Look back at our example of heap allocation: our 10Kb heap now has a little less than 8Kb free. So what happens if we allocate 7Kb? We’ve only allocated 2048 bytes from our 10240 byte heap, so we should be able to do it, right? Well, no. Looking at the state of our heap, our largest block of free memory is 7132 bytes, and we need 7180 bytes (7Kb + 12 bytes). This means our heap’s Alloc will return NULL, meaning we’re out of memory! So even though we have over 7Kb of unallocated memory, we can’t actually allocate it in one block. This is called memory fragmentation and is one of the major problems with heap management.

How can we fix this? One way is to simply rearrange the order we allocate memory.

void *ptr2 = gHeap.Alloc(2048);
void *ptr1 = gHeap.Alloc(1024);
gHeap.Free(ptr1);

[Figure: Reordered heap allocation]

Now we have enough contiguous free memory to allocate our 7Kb. However, it isn’t always possible to rearrange allocations this easily. We could choose a different allocation scheme, or even allow the user to decide how to allocate from the heap. One simple scheme is to add a parameter to our allocator that specifies whether we allocate from the front of the heap or from the end.

void *ptr1 = gHeap.Alloc(1024, MEM_HEAP_START);
void *ptr2 = gHeap.Alloc(2048, MEM_HEAP_END);
gHeap.Free(ptr1);
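
A sketch of what that extra parameter might look like (the enum and the default value are assumptions for illustration, not a fixed interface):

// where in the heap to carve an allocation from
enum HeapLocation
{
   MEM_HEAP_START,   // allocate from the front of the heap
   MEM_HEAP_END      // allocate from the back of the heap
};

// Heap::Alloc would grow a location parameter, for example:
//    void *Alloc(size_t size, HeapLocation location = MEM_HEAP_START);
// allocating MEM_HEAP_END requests from the last suitable free block instead of the first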

Instead of specifying the location to allocate from, we could allow the user to specify the expected lifetime of the memory allocation. This can be used as a hint to our allocator so it can attempt to reduce fragmentation by preventing temporary allocations from fragmenting the heap used by long-term or permanent allocations. We can simply allocate permanent memory from the start of the heap and temporary allocations from the end. This would fix our previous problem, and actually works quite well in practice.

void *ptr1 = gHeap.Alloc(1024, MEM_HEAP_TEMP);
void *ptr2 = gHeap.Alloc(2048, MEM_HEAP_PERM);
gHeap.Free(ptr1);

Both changes would have fixed the fragmentation issue without reordering the code. However both solutions, whether using location hints or lifetime hints, suffer from a very serious problem: whoever allocates memory has to give the proper hint. This assumes they know the lifetime of the memory or how your memory allocator works internally. That is a very dangerous assumption! We were only able to fix it because we had a nice clear picture of our entire sequence of memory allocations.

Okay, so what if we have a smarter allocator? Maybe one that keeps small allocations separate from large ones? Or some really fancy heuristics to manage the heap? Sorry, no matter how sophisticated your heap allocation scheme is, it can be fragmented.

Can’t we just defragment our heap? All we’d have to do is rearrange the allocated memory and we’d remove the fragmentation. Unfortunately you can’t really do that in C/C++: pointers are direct memory addresses, so you can’t move allocations around without breaking everything. Simply put: you can’t defragment a heap in C/C++.

Review

Ok, we’ve looked at the three main memory resources available to a program, so let’s do a quick review.

Global Memory

  • Contains code and data defined directly by the source code (global and static variables)
  • Is fixed in size and allocated when the application is launched
  • It cannot be increased or decreased

Stack Memory

  • Contains local variables and function parameters
  • Is fixed in size and allocated when the application is launched
  • Stack memory is allocated in a linear fashion, only allocating or freeing memory from the top of the stack

Heap Memory

  • Largest available memory
  • Freedom in allocation mechanisms, not system defined
  • Most complicated to deal with
  • The focus of most of this series

That’s the overview of memory. We’ll be using this article as the basis for our discussions on memory management.

Next we’ll look at console-specific issues faced in memory management.

10 Comments

  1. Nice, very interesting. Not perfect, but informative enough. Hope to see more in the following articles.

    Comment by John Fragkoulis — August 17, 2008 @ 3:39 pm

  2. The series is meant as a fairly lightweight introduction so some of the details are glossed over. As long as the articles are interesting and informative I suppose my goals are being met.

    Comment by systematicgaming — August 25, 2008 @ 7:39 am

  3. I think there’s a bug in Heap::Free, the second line reads “MemBlock *block = mFreeBlocks;” but should be “MemBlock *block = mUsedBlocks;”, since you’re looking for a ptr which is currently used and needs to be freed. Or I might be very confused 😉

    Comment by raigan — September 5, 2008 @ 7:43 pm

  4. Well I did have a disclaimer about the code 😉

    Good catch, you’re not confused. I’ve corrected it, thanks!

    Comment by systematicgaming — September 6, 2008 @ 12:43 am

  5. Good read. Thanks

    Comment by dekz — January 5, 2009 @ 5:23 am

  6. Very nice, I will have a better idea of memory for my interview tomorrow 😉

    Comment by Andrea — March 19, 2010 @ 1:13 am

    • Yes. Very clear explanation about memory management.

      Comment by Amaravathi — August 25, 2011 @ 3:01 pm

  7. Nice article! Congrats!

    I was wondering about this code in the Alloc function, though:
    // add to our list of free blocks
    usedBlock->next = mFreeBlocks;
    mFreeBlocks = usedBlock;

    I guess it’s an error, since the free block has already been updated (it’s size) and what you need here is to update the mUsedBlocks list. Moreover, usedBlock->next should point to NULL, shouldn’t it?

    Thank you!

    Comment by Marc Costa — August 9, 2012 @ 9:33 am

    • Yes, I think you’re right, the usedBlock is already in the free list, so I don’t think it needs to be updated.

      Instead of adding to the free list (which it’s already part of):

      // add to our list of free blocks
      usedBlock->next = mFreeBlocks;
      mFreeBlocks = usedBlock;

      The block should be added to list of allocated blocks:
      // add to our list of used blocks
      usedBlock->next = mUsedBlocks;
      mUsedBlocks = usedBlock;

      So that’s what I get for posting untested pseudo-code…

      Comment by systematicgaming — August 9, 2012 @ 11:32 am

  8. It’s the first clear post I find for newbies like me about memory management in computer games :). I am going to read the second one right now.

    Comment by Carlos — September 1, 2012 @ 11:29 am

