Memory Management Part #1
Hi all,
Today we’ll begin talking about memory management in Java, C# and C++. This article focuses on C++’s memory management: knowing how much memory you use and which type of memory is in use at each stage.
Since Java and C# differ in several ways in how they handle memory, this article only covers specifics for C++, and stays at a high level.
Memory management is something one has to know to a certain extent regardless of the programming language used. Why?
Read on…
In my previous post, I mentioned that there are at least 3 types of memory used during development. In C++, there are at least 4. They are, namely:
1) Stack
2) Heap
3) Static
4) Code
This is how I name them for clarity. Some may refer to static memory as a private heap. This is OK as long as you can tell the difference.
In any case, this article describes when each of these memory types is used.
Stack
We use the stack all the time without knowing it. Sometimes we even get errors like stack overflow, stack corrupted and so on.
So what are the unique qualities of stack memory? Surprise, surprise… it’s actually auto-managed by the system! For example, it cleans up all the memory taken up by a function’s temporary variables when the function call is done.
Seems too good to be true? Well, yes. Stack sizes are very small, and the stack is a poor place for memory you want to share (via pointers, for example; you won’t want to risk corrupting the stack just to share memory).
So how small is the stack? My rough estimate, after intentionally crashing a test program with a recursive function, is about 300,000 bytes. Do note that this is just a rough estimate; the real limit depends on the platform and build settings, and Visual Studio users can change the stack size as well.
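To make the auto-managed part concrete, here is a minimal sketch. Tracer and useStack are hypothetical names I made up for illustration; the only point is that nothing is deleted by hand, yet both objects are cleaned up when the function returns.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records when an object is created and destroyed.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, const std::string& n) : log(l), name(n) {
        log.push_back("create " + name);
    }
    ~Tracer() { log.push_back("destroy " + name); }
};

// Both tracers are automatic (stack) variables. No delete is written anywhere;
// they are destroyed in reverse order of creation when the function returns.
void useStack(std::vector<std::string>& log) {
    Tracer first(log, "first");
    Tracer second(log, "second");
    log.push_back("leaving useStack");
}
```

Calling useStack with an empty log should leave five entries in it: the two creations, the "leaving useStack" marker, then the two destructions with second going first.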
Heap
The heap memory is also known as dynamic memory to some people. Why? Because memory here is allocated or deallocated only when needed at runtime. The compiler has no way of determining when memory in the heap will be used, since it cannot predict the exact runtime behavior of the application.
Advantages? The amount of heap available to you is roughly bounded by the memory the system can give your process. This is of course subject to how the allocator manages its allocations. Generally speaking, you can allocate as much memory as the system can afford. Of course, the heap is much larger than the stack.
Disadvantages? If you allocate a piece of memory in the heap, you’re responsible for releasing it as well. In pure C this is done with free(), and in C++ with delete or delete[]. In Java and C#, however, this is handled by their respective VMs.
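As a small sketch of that responsibility in C++ (the function names sumOnHeap and squareOnHeap are just made up for this example): whatever you get from new[] goes back with delete[], and whatever you get from new goes back with delete.

```cpp
#include <cstddef>

// Allocates `count` ints on the heap, fills them, and returns their sum.
// `count` is only known at runtime, which is why the heap is used here.
long sumOnHeap(std::size_t count) {
    int* values = new int[count];        // array form of new
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);
        total += values[i];
    }
    delete[] values;                     // array form must be paired with delete[]
    return total;
}

// Single object: new is paired with plain delete.
int squareOnHeap(int x) {
    int* boxed = new int(x);
    int result = (*boxed) * (*boxed);
    delete boxed;
    return result;
}
```

Neither function really needs the heap for such small jobs; they are kept tiny on purpose so the new/delete pairing is easy to see.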
Static
When is this type of memory used? Typically when you explicitly declare a variable as static. What does this mean? It means ONE AND ONLY ONE instance. And yes, instance, not function. I’ll explain why later.
This rule may not apply 100% to Java, as I understand that inner classes do not share the one-and-only characteristic.
Static memory is useful when you don’t want multiple copies of the same object. For example, a game engine object should always be a singleton; it doesn’t make sense to have multiple copies of the game engine in a single application.
For an object to be one and only, static alone is not enough: one also has to put the class’s constructors in the private or protected section to “completely” seal off the class.
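Here is one possible sketch of such a sealed-off class. The GameEngine name, tick() and frameCount() are all assumptions for illustration, and this is only one of several ways to write a singleton (it uses C++11’s = delete to block copies).

```cpp
// Hypothetical class for illustration; GameEngine and its members are assumptions.
class GameEngine {
public:
    // The one and only instance lives in static memory. It is created the
    // first time instance() is called and lasts until the program ends.
    static GameEngine& instance() {
        static GameEngine engine;
        return engine;
    }

    // Copying is disabled so nobody can make a second engine by accident.
    GameEngine(const GameEngine&) = delete;
    GameEngine& operator=(const GameEngine&) = delete;

    int frameCount() const { return frames_; }
    void tick() { ++frames_; }

private:
    GameEngine() : frames_(0) {}   // private: no construction from outside
    int frames_;
};
```

Every call to GameEngine::instance() hands back a reference to the same object, so a tick() made through one reference is visible through any other.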
Code
Code memory is the easiest form of memory to understand. It’s the fixed amount of memory taken up by an application when it starts. Why? When you write code and compile it, it eventually ends up as machine code (or VM bytecode) for its target platform. That sequence of machine code is stored in code memory as a “template” or “palette”.
Its size is known to the compiler during compilation, and it also determines how big your final executable will be.
In short, code memory stores all your code procedures. Just a note: since a C++ template’s final form is constructed only during compilation, each template variation has its own entire code set generated. This inherently increases your executable size but doesn’t necessarily affect your application’s performance. It also means that more code memory is used when your application is running.
So, now let’s talk about some of the myths that are circulating out there. Here goes:
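One way to see that each template variation really is its own piece of code is to give the template a static local. The sketch below uses made-up names (callCount, twice); because callCount<int> and callCount<double> are separate functions, each ends up with its own counter.

```cpp
// Each instantiation of this template is a separate function,
// so each one also gets its own static counter.
template <typename T>
int& callCount() {
    static int count = 0;
    return count;
}

// Doubles a value and records the call against the counter for type T.
template <typename T>
T twice(T value) {
    ++callCount<T>();
    return value + value;
}
```

Calling twice with ints only bumps callCount<int>(), and calling it with doubles only bumps callCount<double>(); the two counters never mix.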
Myth #1: Stack and Heap walk towards each other
That’s untrue, at least in the context of C/C++. Even in Java, the stack is separate from the heap. From what I know, each stack gets its own fixed-size region of memory, separate from the region the heap uses. Therefore, the two cannot walk towards each other.
And logically speaking, if they can walk towards each other, then why is the stack size so limited?
Myth #2: Memory leaked in the heap continues to be leaked after the application exits
Again, this is untrue. The operating system reclaims the process’s memory once the application exits. The exact details depend on the OS (to what extent I don’t really know). In short, even if you’ve leaked 1GB of RAM during execution, it’ll be cleaned up the moment you quit. But note, this only applies to memory leaked in your own heap and not to OS handles.
Myth #3: I use a pointer to point at the address of a variable, so it must be a heap variable
I’ve addressed this in my previous post. Again, this is untrue. There isn’t really such a thing as a heap or stack variable, and taking the address of a variable doesn’t change where it lives. All local variables, with the exception of static variables, are by default stored on the stack (class member attributes live wherever their containing object lives). Yes, this includes pointers as well.
Myth #4: Since pointers are in the stack, I don’t have to delete them as long as I use them in a function
This is an obvious confusion between the contents of a pointer and the pointer itself.
What’s the content of a pointer variable? A heap location!
What’s the memory that holds the pointer variable itself? The Stack!
Key point 1: You have to delete the allocated heap when you’re done with it
Key point 2: You don’t have to delete the pointer variable itself, it’s done by the stack
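The two key points can be sketched with a type that counts its live instances. Tracked, leaky and tidy are hypothetical names used only for this illustration.

```cpp
// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// The pointer variable `p` is on the stack and disappears on return,
// but the Tracked object it points to stays on the heap: a leak.
void leaky() {
    Tracked* p = new Tracked();
    (void)p;  // silence unused-variable warnings
}

// Same pointer, but the heap object is deleted before `p` goes away.
void tidy() {
    Tracked* p = new Tracked();
    delete p;
}
```

After tidy() the live count is back to zero; after leaky() one Tracked object is still alive with nothing left pointing at it.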
Myth #5: When I dereference a pointer, ain’t I pointing to the stack?
Now this is serious. First you have to disassociate pointer variables from stack and heap. Why? Look at the example below:
struct tSmall {
    int a;
    int b;
};
tSmall s1; //This is allocated on the STACK
tSmall* s2 = new tSmall(); //s2 points to a location in which a tSmall structure is located in the HEAP
Let’s say we have:
void foo(tSmall* ptr) {
    int myA = ptr->a;
    int myB = ptr->b;
}
foo(&s1); //Pass the address of s1, which is allocated on the stack
foo(s2); //Pass s2 in, since s2 itself IS the address of the heap-allocated tSmall
Does it matter whether dereferencing is done to the stack or heap? No!
However, there’s a big caveat to observe. The following code will kill everyone!
void die(tSmall* ptr) {
    delete ptr;
    ptr = NULL; //Only clears die's local copy of the pointer, not the caller's
}
die(s2); //This is OK, since s2 is allocated on the heap and is supposed to be managed by the programmer
die(&s1); //Undefined behavior: this tries to free stack memory that was never allocated with new. The stack can be corrupted and fail to unwind properly. (Visual Studio sometimes reports this as Exit Code 3.)
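A side note on the ptr = NULL line in die(): since the pointer is passed by value, that assignment only touches die()’s own copy, and the caller’s s2 is left dangling. A hedged sketch of one alternative is to take the pointer by reference. The names dieByValue and dieByReference are made up here, and nullptr assumes a C++11 compiler.

```cpp
struct tSmall {
    int a;
    int b;
};

// Takes the pointer by value: setting it to null only changes the local copy,
// so the caller is left holding a dangling pointer.
void dieByValue(tSmall* ptr) {
    delete ptr;
    ptr = nullptr;
}

// Takes the pointer by reference: the caller's pointer is nulled too.
void dieByReference(tSmall*& ptr) {
    delete ptr;
    ptr = nullptr;
}
```

With dieByReference the caller can at least test the pointer against null afterwards, and calling it a second time is harmless because deleting a null pointer does nothing. It still does not protect you from passing in the address of a stack object.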
Conclusion
I’ll cut this article short since it’s getting really long. In conclusion, memory management isn’t just about technical knowledge. It’s also about design and contract. How so?
Design - This determines when pointers, references or plain variables should be used. Good techniques such as pass by reference, pass by pointer and scope brackets can help reduce unnecessary use of the limited stack memory.
Here’s a small tip for C++ programmers: (might apply to C# and Java, I’m not sure… heh)
void foo() {
    int a = 0;
    int b = 100;
    //If you temporarily need array[100], it needs to see the surrounding variables, and it cannot be moved into a separate function, scope brackets will help.
    {
        int array[100];
        for(a = 0; a < 100; a++)
            array[a] = 0;
        //do something
    } //array goes out of scope here, so its stack space is no longer needed after this bracket
    func1();
    func2();
}
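There is something to note about this tip. With scope brackets, the array’s lifetime ends before func1() and func2() are called, so the application need not carry the burden of those 100 integers on the stack during those calls. Whether the space is actually reclaimed or reused at that point is up to the compiler.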
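The lifetime part of this tip is easy to observe if the temporary has a destructor. In the sketch below, Scratch is a hypothetical stand-in for the 100-integer array, and scoped/unscoped are made-up names; func1 here just records that it was called.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a large temporary such as int array[100].
struct Scratch {
    std::vector<std::string>& log;
    int data[100];
    explicit Scratch(std::vector<std::string>& l) : log(l) {
        for (int i = 0; i < 100; ++i) data[i] = 0;
        log.push_back("scratch created");
    }
    ~Scratch() { log.push_back("scratch destroyed"); }
};

void func1(std::vector<std::string>& log) { log.push_back("func1"); }

void scoped(std::vector<std::string>& log) {
    {
        Scratch scratch(log);   // lifetime is limited to this block
    }                           // destructor runs here, before func1
    func1(log);
}

void unscoped(std::vector<std::string>& log) {
    Scratch scratch(log);       // lives until the function returns
    func1(log);
}
```

In scoped() the scratch object is gone before func1 runs; in unscoped() it is still alive during the call and is only destroyed afterwards.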
Contract - It determines WHO is supposed to do WHAT, and WHEN. It’s important to settle the contract for pointer-allocated objects as early in the project design as possible. This helps reduce occurrences of unsolicited deletion, reads through dangling pointers (pointers pointing to deleted memory addresses), and memory leaks due to missing deletion.
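What a contract looks like in practice is up to the team. As one possible sketch, each function can state in a comment whether it owns, borrows or takes over the pointer. The function names createSmall, sumSmall and destroySmall are hypothetical.

```cpp
struct tSmall {
    int a;
    int b;
};

// Contract: the caller owns the returned object and must pass it to
// destroySmall exactly once.
tSmall* createSmall(int a, int b) {
    tSmall* s = new tSmall();
    s->a = a;
    s->b = b;
    return s;
}

// Contract: borrows `s` for the duration of the call. Never deletes it.
int sumSmall(const tSmall* s) {
    return s->a + s->b;
}

// Contract: takes ownership of `s`. The caller must not use it afterwards.
void destroySmall(tSmall* s) {
    delete s;
}
```

The compiler does not enforce any of these comments; they only work if everyone on the project follows them.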
That’s all for this week.
Next week, we’ll talk about Smart Pointers and managing the heap in more detail.
Signing off,
Jeremy
- Admin
- 4 Aug 2009 1:30 PM
- Comments (3)
August 5th, 2009 at 11:58 am
when an object is created with ‘new’ during runtime, it is allocated in the heap right?
then according to Myth#2, any ‘new’ that haven’t been ‘delete’ed will get freed when the program exits right?
i know memory leak is avoidable, but with a large enough memory, a small memory leak will not have a large effect right?
please correct me if i’m too naïve
August 5th, 2009 at 12:12 pm
Hi anteater, don’t worry about it.
Memory leaks are lethal only when they replicate.
A single one-time memory leak will not cause any damage. (In most cases…)
Key thing is to keep a look out for repetitive allocations.
Small memory leaks can easily accumulate in real-time applications like games and crash within minutes due to out of memory or thrash your system to a halt.
On the other hand, if it’s just a single procedural run through application, memory leaks may not necessarily be a bad thing - this I’ll explain if possible in my next article.
Lastly, if you’re leaking external resources, eg, COM objects, it’ll probably leak even after the OS has cleaned up your heap.
Hope that answers your questions.
August 6th, 2009 at 1:40 pm
yes. thank you very much.