We all do it!
Yet this is a topic that rarely receives attention: memory allocation and performance. Once our game has been created and its features have been implemented, we would typically use a profiler to identify issues that may slow down our code. That being said, many of the bottlenecks detected by the profiler can be prevented by coding defensively and avoiding some common pitfalls, which are often linked to garbage collection, memory allocation, and speed of access. In this post, we will start by talking about the heap and the stack: what these terms mean, how and why data is stored in them, and how to write our code to use them wisely and efficiently.
When a variable is created, its data is usually stored in the RAM (Random Access Memory) of your computer (sometimes it can also be held in the CPU cache, but that’s for another day :-)). However, the exact location where it is stored depends on the variable type: depending on the type, the variable will be saved in what is called the heap or the stack. So what are these?
The Stack
The stack is an area of memory that is used whenever a function (or block of instructions) is executed. Temporary variables, for example, are stored in the stack until the function (or block of code) is exited. Let’s look at the next piece of code:
public void TestMemory() { int i = 0; int j = 2; }
In the previous code, the variables i and j are declared inside the method TestMemory; they are local (hence temporary) variables that will be stored in the stack when the function is called. Once we exit the function, these variables will be removed (or discarded) from the stack, so that the stack can be used for other methods and their temporary variables.
The stack usually has a fixed size (i.e., static memory allocation), and it partly gets its name from the fact that items are stored in it in a way that resembles a stack of objects.
Let’s imagine that you have several books that you want to stack together. As you add books, the very first book that you added is at the bottom, and the very last book added is at the top; when you want to remove books from the stack, the last book added to the pile (or stack) will be the first to be removed (i.e., from the top). In computer terms, this is called LIFO (Last-In-First-Out). So, in our case, as the function TestMemory is called, the variable i is created and stored in the stack, and so is the variable j; as we exit the method TestMemory, the last variable created (j) is destroyed (i.e., removed from the stack), and then the variable i is destroyed. So, as you can see, using this principle, it is very easy to keep track of the items added to the stack and to remove them when they are no longer needed. Using a stack for variables is very efficient in terms of memory allocation, especially because no garbage collection is required: memory allocation is managed automatically for the stack.
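The book analogy above can be sketched in a few lines of C#. This is only an illustration of the LIFO ordering, using the standard Stack<T> collection; it is not how the runtime implements its own call stack, and the class name BookStackDemo and the book names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class: illustrates LIFO ordering with a pile of books.
public static class BookStackDemo
{
    // Pushes three books, then pops them all and returns the removal order.
    public static List<string> PushThenPopAll()
    {
        var books = new Stack<string>();
        books.Push("first book");   // ends up at the bottom of the pile
        books.Push("second book");
        books.Push("third book");   // ends up at the top of the pile

        var removalOrder = new List<string>();
        while (books.Count > 0)
        {
            // Pop always removes the most recently added item.
            removalOrder.Add(books.Pop());
        }
        return removalOrder;
    }
}
```

Calling BookStackDemo.PushThenPopAll() returns the books in the reverse of the order in which they were added, which mirrors the way j is removed before i when TestMemory exits.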
So, using the stack makes a lot of sense. Now, some variables will, by default, be added to the stack: these are value-type variables, which include most primitive data types (e.g., int, double, float, char, or bool) and structs (e.g., Color or Vector3), but not arrays. Primitive data types are very basic in the sense that they contain information related only to themselves, and they usually occupy a small amount of memory (e.g., a few bytes). This is why creating a variable of a primitive type is referred to as static memory allocation. Because these items are relatively small, copying them on the fly to the stack has a small cost. However, this may not be the case if we were to deal with bigger data. You could compare it to leaving your apartment every day versus moving out every day: the former can be done frequently and quickly because it does not demand many resources, whereas the latter would become quite exhausting if it had to be done every day. So we need a solution for bigger data, and that solution, in terms of memory management, is called the heap.
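To make the idea of "copying on the fly" more concrete, here is a minimal sketch of how a value type behaves when it is assigned to another variable. The struct Point and the class ValueCopyDemo are hypothetical names invented for this example (Point simply stands in for a small struct such as Vector3).

```csharp
using System;

// Hypothetical value type, standing in for a small struct such as Vector3.
public struct Point
{
    public int X;
    public int Y;
}

public static class ValueCopyDemo
{
    // Copies a struct, modifies the copy, and returns the original's X.
    public static int CopyThenModify()
    {
        Point a;
        a.X = 1;
        a.Y = 2;

        Point b = a;   // the whole value is copied; a and b are independent
        b.X = 10;      // only the copy is changed

        return a.X;    // the original is untouched
    }
}
```

Because b holds its own copy of the data, changing b.X leaves a.X at its initial value. This is cheap for a few bytes, which is the reason small value types are a good fit for this kind of storage.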
The Heap
The heap is used for what is called dynamic memory allocation, for example, when you instantiate classes (e.g., by using the keyword new) or create a string. The heap can expand and change its size to accommodate the data stored within it. Classes and objects are, by definition, more complex than primitive types because they refer to other variables. For example, we could have a class called Car that includes information about a car: its color, size, name, etc. In this particular case, instantiating a Car object also means keeping track of the variables within it (i.e., member variables or classes), and not just one variable (as we would for primitive data types). So, because of this complexity and the size of the data, we save it in the heap. Now, the only issue with the heap is that memory is not managed as easily as for the stack (LIFO); in this case, the garbage collector is needed to tidy up and make sure that memory slots are freed when they are no longer used, and using the garbage collector has a cost in terms of performance.
Garbage collection is very important, as it manages your memory. It may be called several times, based on specific conditions, to optimize the memory and to de-allocate or reallocate memory slots. However, like any activity performed by your computer, it may become resource-intensive. So, part of the purpose of optimizing your performance is to ensure that the Garbage Collector (GC) is called as rarely as possible, hence freeing up some CPU time for other processes. Garbage collection can be triggered by specific code; in other words, depending on how the code is written, we can in some cases limit or decrease the number of times the garbage collector is called. The ability to decrease the number of calls to the GC is usually linked to good programming practices.
Variables saved in the heap are usually referred to as reference types; this is because, when these variables are created, a reference (a pointer) is created in the stack, and the data related to the variable is stored in the heap. The references themselves are managed in exactly the same way as value-type variables. This is illustrated in the next example:
class TestPerf { public void test() { MyClass mc1 = new MyClass(); MyClass mc2 = mc1; /* a reference to the same object */ } }
- In the previous code, a new object mc1 is declared and created; because it is an instance of a class, it is stored in the heap (let’s call it obj1).
- A new variable mc2 is declared; however, because we don’t use the keyword new, only a reference to the object obj1 (previously created in the heap) is created in the stack. In this case, we save some memory allocation (and garbage collection) because no second object is created in the heap; instead, a reference is created to the existing object.
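The two points above can be checked with a short sketch. The body of MyClass is not given in this post, so the version below (with a single Value field) is an assumption made purely for illustration, as is the name ReferenceDemo.

```csharp
using System;

// Hypothetical definition: the post does not show what MyClass contains.
public class MyClass
{
    public int Value;
}

public static class ReferenceDemo
{
    // Returns true if mc1 and mc2 refer to one and the same heap object.
    public static bool SharesOneObject()
    {
        MyClass mc1 = new MyClass(); // one object is allocated in the heap
        MyClass mc2 = mc1;           // a second reference, no new allocation

        mc2.Value = 42;              // written through mc2...

        // ...and visible through mc1, because both point to the same object.
        return mc1.Value == 42 && object.ReferenceEquals(mc1, mc2);
    }
}
```

Note the contrast with value types: assigning a class instance copies only the reference, so a change made through one variable is seen through the other.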
While value types are not usually garbage-collected, they will be if they are member variables of a class (i.e., a reference type), because they then belong to the instance stored in the heap.
For example:
class TestPerf { public int test; }
- In the previous code, test is a member variable of type int, which is a basic type; although this is a basic type, it is stored in the heap as part of the TestPerf instance and will be garbage-collected along with it, because it is a member variable.
class TestPerf { public void MyMethod() { int test; } }
- In the previous code, test is a temporary variable created only when the method MyMethod is called; in this case, it will not be garbage-collected because it is a temporary variable; instead, it will be added to the stack (and consequently deleted from it when the method is exited).
With this in mind, a few practices can help to limit the work of the garbage collector:
- Recycle objects created in the heap (instances), so that new objects don’t need to be allocated with new and the garbage collector has less to clean up.
- Use local value-type variables when possible (these will be stored in the stack, not the heap), so that memory management is faster and easier.
- Perform heap allocations (e.g., creating a new object using the new keyword) mainly at the start of the application, so that they happen once rather than repeatedly, as repeated allocations will impact performance.
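One common way to apply these tips is an object pool. The sketch below is only a minimal example under stated assumptions: Bullet, BulletPool, and the TotalCreated counter are hypothetical names invented here, and a real game would likely need more (for example, resetting the state of recycled objects).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class standing in for any frequently used heap object.
public class Bullet
{
    public bool Active;
}

// Minimal object pool: instances are allocated once, then recycled.
public class BulletPool
{
    private readonly Stack<Bullet> _free = new Stack<Bullet>();

    // Counts how many times "new Bullet()" has actually been called.
    public int TotalCreated { get; private set; }

    // Allocate everything up front, e.g., at the start of the application.
    public BulletPool(int size)
    {
        for (int i = 0; i < size; i++)
        {
            _free.Push(new Bullet());
            TotalCreated++;
        }
    }

    // Hand out a recycled instance; allocate only if the pool is empty.
    public Bullet Get()
    {
        Bullet b;
        if (_free.Count > 0)
        {
            b = _free.Pop();
        }
        else
        {
            b = new Bullet();
            TotalCreated++;
        }
        b.Active = true;
        return b;
    }

    // Return an instance so that it can be reused instead of collected.
    public void Release(Bullet b)
    {
        b.Active = false;
        _free.Push(b);
    }
}
```

With a pool of two bullets, calling Get and then Release a hundred times in a row never allocates a third Bullet, so TotalCreated stays at 2 and the garbage collector has nothing new to clean up.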