Do you know who you’re pointing at?
Filed Under: Weekly Tuesday Dose of goodness
Dear all,
This article is in response to some of the people I’ve met during interviews. I’m just kinda surprised at their level of understanding of C or C++.
Pointers are one of the basic concepts of C and C++. Though C++ also has stricter references (int& for example) that help to clarify things in terms of usage, the importance of a basic understanding of pointers cannot be overstated.
Why is that so? Read on…
Introduction
First of all, there’s a need to understand what exactly a pointer is. In very simple terms:
A pointer is nothing but a number representing a memory address.
The size of the memory address of course depends on the architecture: typically 4 bytes on a 32-bit target and 8 bytes on a 64-bit one. Nothing complicated.
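To see the "just a number" idea for yourself, here is a minimal sketch. The helper names pointer_size and address_of are my own, made up for illustration, and the sketch assumes a platform that provides std::uintptr_t (every mainstream one does).

```cpp
#include <cstddef>
#include <cstdint>

// Size of an object pointer on the platform this is compiled for
// (typically 4 on 32-bit targets and 8 on 64-bit targets).
constexpr std::size_t pointer_size() { return sizeof(void*); }

// View the address held by a pointer as a plain unsigned integer.
// std::uintptr_t is an integer type wide enough to hold an object pointer.
std::uintptr_t address_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
```

On the usual flat-memory platforms, the addresses of two neighbouring int elements in an array differ by exactly sizeof(int) when viewed this way.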
However, a pointer also has a type: a class, a struct or a primitive. void* is the exception, and that is why it is valid: it is simply a memory address with no type attached. That’s why it is the most generic form of pointer.
Once you start to classify pointers with types, for example (C style):
void* ptr = (void *)0x12345ABC;
cMyClass* classPtr = (cMyClass *)ptr;
By doing this, you are, in my terms, casting a shadow or a net on top of the address.
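The compiler takes your word for it - it performs no checks on the validity of the cast (unless it’s dynamic_cast<T>, which comes with terms and conditions: it only works on polymorphic classes, that is, classes with at least one virtual function). So why does it even work then?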
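To make the dynamic_cast exception concrete, here is a small sketch. Animal, Dog, Cat and as_dog_checked are hypothetical names of mine, and the sketch assumes RTTI is enabled, which is the default on most compilers.

```cpp
// A polymorphic base: dynamic_cast needs at least one virtual function.
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal { int barks = 3; };
struct Cat : Animal { int lives = 9; };

// Checked downcast: returns nullptr when `a` does not really point at a Dog.
// A C-style cast, (Dog *)a, would compile for a Cat too and hand back a
// Dog* that must never be used.
Dog* as_dog_checked(Animal* a) {
    return dynamic_cast<Dog*>(a);
}
```

Passing the address of a Dog gives the same address back, while passing the address of a Cat gives nullptr, which is the only case where the runtime actually tells you that the shadow does not fit.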
Classes and Structs
Now we just take a short moment to pull things back a little. In C, structs are plain data. In C++, classes and structs are almost the same thing: both can have methods, access modifiers and inheritance. The only difference is the default access, which is public for a struct and private for a class. Mixing the two styles carelessly can lead to disastrous results of course - as mentioned in my previous post here.
On top of that, a class or structure has this property, at least in Visual Studio, known as struct member alignment (the /Zp compiler option, or #pragma pack). Under the default setting for 32-bit builds, it’s 8 bytes.
What does it mean?
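It means that each attribute in the class is placed at an offset that is a multiple of its own alignment requirement (usually its size, for primitives) or of the struct alignment setting, whichever is smaller. The setting is an upper limit, not a fixed spacing. For example, if you have 3 booleans inside a struct, how many bytes will they take up? 24 bytes, because each one sits in its own 8-byte block? No! A bool is typically 1 byte with 1-byte alignment, so the struct takes 3 bytes. Put a bool in front of a double, however, and the compiler typically inserts 7 bytes of padding so that the double starts on an 8-byte boundary, giving 16 bytes in total.
As to why it’s set at 8 bytes - that I can only speculate. My guess is that the biggest primitive data types in Visual C++ are double and unsigned long long, which are both 64 bits wide and thus 8 bytes, so an 8-byte limit lets every primitive sit on its natural boundary and the CPU never has to deal with unaligned memory.
I know this is especially true in assembly language though (in terms of aligned versus unaligned memory).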
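If you want to check what your own compiler does, here is a minimal sketch using sizeof and offsetof. The struct and function names are made up for illustration, and the numbers in the comments are what mainstream 64-bit compilers typically produce under default settings, not a guarantee.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct ThreeBools     { bool a; bool b; bool c; };
struct BoolThenDouble { bool flag; double value; };

// Typically 3: bools need no padding between them.
constexpr std::size_t size_of_three_bools() { return sizeof(ThreeBools); }

// Typically 16: 1 byte of bool, 7 bytes of padding, 8 bytes of double.
constexpr std::size_t size_of_bool_then_double() { return sizeof(BoolThenDouble); }

// Bytes the compiler inserted between `flag` and `value`.
constexpr std::size_t padding_before_value() {
    return offsetof(BoolThenDouble, value) - sizeof(bool);
}
```

Printing these three values with different /Zp or #pragma pack settings is a quick way to see the alignment rule in action.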
Shadow-casting
Don’t get me wrong - this is not another type of casting mechanism in C or C++. It also doesn’t refer to the old shadow ROM or memory that your ancient motherboard might have as an option to turn on or off.
A shadow-cast is basically like this: imagine yourself standing on a pavement with the sun shining on you. You cast a shadow naturally. If the shadow is cast correctly, then you’ll see a complete shadow of yourself. In this sense you can take it as a valid cast.
In another scenario, imagine yourself standing a few steps before a cliff. The sun shines on you and you cast a shadow. Notice how the shadow disappears as it reaches the cliff? In that sense, you can take it as a wild/invalid/dangling cast. Basically, it means that the memory is shadowed incorrectly and will lead to disastrous results if it’s read from or written to.
Now, knowing the shadow-cast size is important, since memory buffers have the same problem. Knowing your struct alignment will give you an accurate picture of whether your shadow casting is done correctly or not. In short, if you’re shadowing a piece of memory that is 24 bytes long and your shadow specification is 32 bytes long, you’ll end up with 8 bytes shadowed illegally. The cast alone doesn’t cause a crash, but reading or writing those 8 bytes is undefined behaviour: it may crash, or it may silently corrupt whatever happens to live there.
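One way to avoid shadowing past the end of a buffer is to compare sizes before touching the memory. The sketch below is only one possible approach, not the only one: Header and read_as are hypothetical names, and it copies the bytes out with memcpy instead of casting the buffer pointer, which also sidesteps alignment problems.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct Header {              // hypothetical record layout
    std::uint32_t id;
    std::uint32_t length;
};

// Copies a T out of a raw buffer only when the buffer is large enough.
// Returns false instead of reading past the end of the buffer.
template <typename T>
bool read_as(const void* buffer, std::size_t buffer_size, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "read_as only supports plain-data types");
    if (buffer == nullptr || buffer_size < sizeof(T)) {
        return false;        // the shadow would fall off the cliff
    }
    std::memcpy(&out, buffer, sizeof(T));
    return true;
}
```

With a 4-byte buffer, read_as refuses to fill a Header; with a buffer of at least sizeof(Header) bytes, it succeeds.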
Known Memory
So, the next thing we need to refresh ourselves on is the 4 basic types of memory used by C/C++ programs. That is:
- Stack
- Heap
- Code
- Private Heap aka Static
Though they’re used at different times, it doesn’t mean that we cannot point at them anytime. Have you heard of things like function pointers? These pointers are basically pointing to the code memory address of the function itself.
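Here is a small sketch with one pointer into each of the four kinds of memory listed above. All names (g_base, twice, sum_via_pointers) are made up for illustration.

```cpp
int g_base = 3;                      // static storage (the "private heap")

int twice(int x) { return 2 * x; }   // the function body lives in code memory

using IntFn = int (*)(int);          // pointer-to-function type

int sum_via_pointers() {
    int local = 1;                   // lives on the stack of this call

    int*  on_stack  = &local;        // stack pointer
    int*  on_heap   = new int(2);    // heap pointer
    int*  in_static = &g_base;       // pointer into static storage
    IntFn in_code   = &twice;        // function pointer

    int result = in_code(*on_stack + *on_heap + *in_static);

    delete on_heap;                  // only the heap pointer is ever deleted
    return result;                   // (1 + 2 + 3) * 2 == 12
}
```

All four pointers are used the same way, yet only one of them may be passed to delete.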
Myths
With the stage set, I’m going to move into the myths. But just quickly allow me to define some phrases to reduce or eliminate any confusion.
- Stack pointer means a pointer pointing to an address on the stack
- Heap pointer means a pointer pointing to an address on the heap
- Function pointer means a pointer pointing to a code memory address in the code memory
- Singleton pointer means a pointer pointing to an address in the private heap
- Static pointer means a pointer that is declared as static, so the pointer itself (the memory address it holds) is stored in the private heap.
Myth #1 - I should always delete a pointer when I’m done with it.
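In truth, this statement is technically correct from a very literal perspective. However, it falls apart when you’re given pointers that point to any type of memory other than the heap. Only memory obtained from new may be passed to delete; calling delete on a stack, static or code address is undefined behaviour.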
Unless you have a memory map which maps a range of memory for stack, heap, code, etc, there’s no direct way to tell, by simply reading from the pointer, whether the address is pointing towards a heap or not.
In this sense, this statement should only apply to your own attributes. If you’re receiving pointers from another party, always check out its contract and origins. If your design requires the pointer to be deleted nevertheless, then the caller has invoked your function wrongly by passing in the wrong type of pointers.
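In modern C++, one way to write such a contract into the function signature itself is std::unique_ptr. This is my own suggestion rather than anything required by the points above, and Widget, read_value and consume are hypothetical names.

```cpp
#include <memory>

struct Widget { int value; };        // hypothetical type

// Borrowing: the caller keeps ownership, so this function never deletes.
// It works for stack, static and heap objects alike.
int read_value(const Widget* borrowed) {
    return borrowed ? borrowed->value : -1;
}

// Taking ownership: the signature only accepts a std::unique_ptr,
// so a stack or static address cannot be passed in by accident.
int consume(std::unique_ptr<Widget> owned) {
    return owned ? owned->value : -1;
}   // `owned` goes out of scope here and deletes the Widget, if any
```

A caller can pass the address of a stack Widget to read_value safely, while consume has to be given something like std::unique_ptr<Widget>(new Widget{9}), which makes the transfer of ownership visible at the call site.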
Myth #2 - If I have a stack pointer, I don’t have to delete them.
Basically, this is only true if you’re absolutely sure that it’s not a heap pointer. Again, this is only possible if you know the contract and possible pointer inputs. You can also state your own contract of how things work in your class in order to deter callers from passing in the wrong type of pointers.
If you are sure, then it’s true.
Not only do you not have to delete them, you must not: the stack cleans up after itself when the function returns. It’s best to set them to NULL once you’re done with them. Don’t worry about the actual stack though; you’re just setting the address held by the pointer to NULL, that’s all.
The reason is that should the life of the pointer exceed the life of the function call, it’s almost certain that this innocent pointer will turn into a wild or dangling pointer. Reading or writing through it means death to the application.
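A minimal sketch of the "set it to NULL when you’re done" habit is shown here, assuming a hypothetical Reader object that outlives the function whose local variable it points at.

```cpp
struct Reader {                          // hypothetical long-lived object
    const int* source = nullptr;         // borrowed pointer, never deleted

    void attach(const int* p) { source = p; }
    void detach()             { source = nullptr; }
    bool attached() const     { return source != nullptr; }

    // Reads through the pointer only while it is attached.
    int read_or(int fallback) const { return source ? *source : fallback; }
};

int use_local(Reader& reader) {
    int local = 42;                      // lives on the stack of this call
    reader.attach(&local);
    int seen = reader.read_or(-1);       // safe: `local` is still alive
    reader.detach();                     // `local` is about to die, so let go
    return seen;
}
```

Without the detach() call, reader.source would still hold the address of a dead stack variable after use_local returns, which is exactly the dangling pointer described above.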
Myth #3 - I don’t have to delete static pointers because the compiler does it for me
Again, this statement demonstrates an insufficient level of understanding of pointers. If a pointer is declared as a static variable, then the pointer itself is stored in the private heap. But if you instantiate an object and assign the new address to this static variable, where’s the object?
Very simple. Just ask yourself: how do you instantiate your object? malloc? new? Fine. Where do these allocators draw their memory from? THE HEAP!
So, even though your pointer is declared as a static pointer, you’re responsible for the heap address assigned to your static pointer during instantiation!
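The sketch below makes this visible by counting live objects. Tracker, g_tracker and the two helper functions are hypothetical names used only for this illustration.

```cpp
struct Tracker {
    static int alive;                  // number of Tracker objects currently alive
    Tracker()  { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

// The pointer itself has static storage; the object it points at does not.
static Tracker* g_tracker = nullptr;

void create_tracker() {
    if (!g_tracker) {
        g_tracker = new Tracker;       // the object lives on the heap
    }
}

void destroy_tracker() {
    delete g_tracker;                  // deleting a null pointer is a no-op
    g_tracker = nullptr;
}
```

If destroy_tracker() is never called, the Tracker destructor never runs: at program exit only the static pointer goes away, not the heap object behind it.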
Myth #4 - I don’t have to delete my heap pointers all the time
Surprisingly, this myth is actually true! However, it comes with several conditions. You see, memory leaks are not lethal by nature. It’s the accumulation of leaks that eventually leads to crashes.
Therefore, if you can ensure that all your pointers will only ever be instantiated once, then this statement holds true. That’s because the OS will reclaim your heap for you anyway when the process exits.
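However, do note - other resources are not always released just like that. The destructors of leaked objects never run, so anything they were supposed to do (flushing buffered file data, for example) simply doesn’t happen. A modern OS closes plain file handles when the process exits, but on some systems resources such as temporary files or named shared memory can stay behind until you release them manually or restart your machine!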
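One way to make sure a file handle is always released is to tie it to an object that closes it automatically. This is a sketch of that idea using std::unique_ptr with a custom deleter; FileCloser, FilePtr and open_file are names I made up.

```cpp
#include <cstdio>
#include <memory>

// Deleter that closes a C file handle.
// std::unique_ptr only invokes it when the handle is non-null.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file; the returned handle closes itself when it goes out of scope.
// The result is empty (tests as false) if the file could not be opened.
FilePtr open_file(const char* path, const char* mode) {
    return FilePtr(std::fopen(path, mode));
}
```

A caller checks the result with a plain if (file) and never has to remember fclose, even when the function returns early.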
Myth #5 - I can control my private heap
Sure you can control the contents in the private heap. But this control doesn’t extend as far as allowing you to delete the actual private heap itself. See the next myth to understand more.
Myth #6 - Singleton pointers must be the last pointers ever to be deleted
Singletons are stored in the private heap as we know. If there’s a pointer to the private heap, then so be it.
But know this - the CRT will eventually clean up the private heap. Destruction happens in the reverse order of construction, and the order of construction across different source files is not specified, so you cannot control it the way you control heap deletes.
Therefore, this statement is a myth. Try it and you’ll probably get some weird errors as your program exits. Also, if you create a cross-dependency across multiple singletons, then it’s going to be a long long nightmare.
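One common way to get a predictable order between two dependent singletons is to use function-local statics, which are constructed on first use and destroyed in the reverse order in which their construction completed. This is a sketch of that technique under my own assumptions; Logger and Config are hypothetical classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Logger {
public:
    static Logger& instance() {
        static Logger the_logger;        // constructed on the first call
        return the_logger;
    }
    void log(const std::string& line) { lines_.push_back(line); }
    std::size_t count() const { return lines_.size(); }
private:
    Logger() = default;
    std::vector<std::string> lines_;
};

class Config {
public:
    static Config& instance() {
        static Config the_config;        // constructed on the first call
        return the_config;
    }
private:
    // Touching Logger here completes its construction before Config's,
    // so Logger is destroyed after Config and is still usable below.
    Config()  { Logger::instance().log("config created"); }
    ~Config() { Logger::instance().log("config destroyed"); }
};
```

This only helps when the dependency goes one way. If Logger also needed Config during its own construction or destruction, you would be back in the cross-dependency nightmare.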
Conclusion
Those myths above are just some examples of insufficient understanding of C and C++ memory handling. There’re plenty more but I don’t have the luxury to post them all up here.
Besides, even I myself am still learning new things every day.
Therefore, I hope that this article can be understood by everyone and doesn’t contain technical errors. I’m sharing from experience yes? But it doesn’t mean that my experience is flawless.
Thus, don’t try to kill yourself by reading this post too hard. If you find it hard to swallow, take a step back. Read other posts first to gain more understanding of C/C++ memory.
Otherwise, if I’m wrong, please feel free to criticize by leaving a comment. I’ll be glad to receive any guidance and corrections anytime!
Have a great week ahead!
Signing off,
Jeremy
- 13 Jul 2010 9:33 AM
- Comments (3)
July 15th, 2010 at 1:39 pm
answer from interviewee..
pointer is the thing that moves on the screen when you move the mouse.
July 16th, 2010 at 12:09 am
omg…
i think i’ll get shot for even arranging this interview…
July 20th, 2010 at 1:11 am
[...] C++, it’s very clear cut. In my previous article, if you should ever use the new operator (Yeah, new is an operator, not a special keyword) the [...]