Memory leaks beyond memory leaksFiled Under: Weekly Tuesday Dose of goodness
Hi all,
Finally! I have some time to post something new here. As time goes by, the challenges I face gets tougher by the day. Remember some time ago I talked about memory leaks that occurs due to design flaws and abuse despite having a good high-level memory management pointer container?
Well, my talk has just caught up with me once more. It seems that the simple case of allocator race leak is no longer that simple. Usually, we’re always on the illusion that the memory we allocate is the same memory that is supposedly deallocated immediately (alike the allocator when sourcing out for RAM).
But no. This is indeed not the case.
Read on to find out why…
Introduction
First of all, this post should be regarded more like a speculation rather than a fact sheet of how things work. The reason is because some of the contents requires proper understanding of the host OS which I admit that I only have limited knowledge.
Secondly, different OS have different means of handling memory. The allocator we’re talking about in this article is also based on the de-facto compiler. What I’m saying is, for Windows, we’ll be referring to a Visual Studio compiler and CRT. For Linux, one can assume GCC and its CRT to be the one.
High-level memory management vs Low-level memory management
When we talk about high level and low level, it’s usually based on the level of control you have on your memory. One can argue that he/she have absolute control over his/her memory. Granted. Only if the number of allocations are minimal and each allocation is large.
A low-level memory management is like managing which block of memory is to be allocated for the requested size. Managing memory to such details are usually for platform-level languages such as Java and her JVM. Have you ever wondered why the JVM takes up a contiguous amount of memory and this amount is usually in large amounts, such as 256MB onwards?
A high-level memory management is like cSmartPtr<T> where its management is dictated by technical design of the object. It doesn’t care how the memory is allocated or when the memory is deallocated from the OS. It only knows when to delete an object when the time comes for it to be destroyed.
Deleting an object doesn’t necessarily guarantee a direct refund of memory. This is especially so for debug CRTs as compared to release CRTs.
This article’s topic isn’t on the difference between debug and release CRTs. But one notable difference is that, debug CRTs usually take up more memory like a glutton compared to release CRTs.
Memory Paging
In Windows, from my limited knowledge, memory paging is used. The exact size of each page is unknown to me, but my guess is around 4k bytes per page.
My second guess in this article is that - the allocator cannot get a memory refund unless all the allocations on the page is deleted. Thus, if there’re 100x 4k pages and that these pages only have 1x 64 bytes of allocation, then for 64k worth of memory objects, 400K of RAM is used up instead. This is indeed memory fragmentation - fragmentation beyond what high level management can do (unless the user creates an array and manages it manually…)
Thus, until there’s a customized solution for game-based applications, who has high volume (rate of construction/destruction per second) of object construction and destruction, the amount of memory taken up will increase gradually no matter how leak-free your application might be.
Possible Solutions
There’re several solutions which are of a similar nature that can help reduce or eliminate the problem completely. I’ll list them in order of complexity (I hope I’m right)…
1. Use limited managed arrays
Although arrays are known to be evil, sometimes if properly employed, can be a powerful ally. I mentioned the use of managed - that’d mean arrays represented by a proper data structure such as std :: vector<T> or cArray<T>.
The crux of this solution is to design the application in such a way that certain objects have a limitation in numbers. Such as projectiles limits, particle (dust, sparks, fire, etc) limits, and other object limits. Different games require different limits, some prohibit limitations due to the way the game is meant to be played. Either way, someone has to make the decision.
The disadvantage of employing this solution is limitation and increases processing load. It’s difficult to calculate which of the array slots will be available during runtime and thus the algorithm will usually place the game object in the next available slot and goes back to index 0 to start searching for one when the index has come to an end.
The search for an available slot is timed at O ( n ) though in my personal opinion, it should be somewhat lesser.
On top of that, the execution update loop will need to iterate all slots to see if they’re NULL or not and then process those that are not NULL.
2. Use a large memory buffer and manage manually
This is a significantly more complex solution which requires a high-level allocator to search, allocate, release and maintain a heap. Thus one requires a heap data structure to properly manage the user-defined memory.
The crux of this approach is to reduce the number of default allocation calls. This in turn will gradually reduce the amount of memory page fragmentation and thus the memory leak itself.
The disadvantages of this approach is that the user have to manually manage the memory buffer. The amount of effort for coding this management system; which may not be reusable or transferable easily without massive recoding, makes it highly undesirable.
3. Implement your own garbage collector / memory allocator
The 4 major C++ operators new, new [], delete and delete [] must be overridden. This applies to all objects, thus templates shouldn’t be in the design considerations. That’s the first step.
Next, the internal allocation mechanism must allocate an page aligned memory buffer that is large enough by estimation to accommodate the user’s requests. On top of that, the system must also expand as necessary.
The way it sounds, it certainly sounds like a garbage collector isn’t it? Well, it certainly is - with the exception that your low level mechanism doesn’t include memory housekeeping. This should be done by a smart pointer container such as cSmartPtr<T>.
The crux is to ensure that memory management is done from bottom-up and top-down as well. Since the memory is pre-allocated, the number of default allocations are minimized. In fact, a well-designed application can even eliminate such memory leaks altogether.
This is certainly the best solution since it doesn’t require any major recoding in order for it to be reused.
Disadvantages? Well, the system is notoriously hard to implement and may take a long time to validate itself. Kudos to those who wrote the Java language. You guys really know your stuffs. Too bad, many of your subscribers are too stuck up to see what you’ve done for them.
Conclusion
This episode that I’ve encountered taught me an important lesson. That is - never to take memory allocation for granted. Understanding the OS is just but a first step to understanding how its memory allocations are done.
A good game developer will at least understand this and know the limitations and thus will not design something that the system cannot handle.
At this stage, I believe this is the end of the spectrum for memory leaks. This topic has been quite a hot topic on stridesdev.org. I hope that this article can raise the awareness of memory allocations among the community. Don’t take it for granted.
Do feel free to correct me should you feel that my guesses are wrong. I’ll be more than willing to acknowledge my mistakes and post them here.
Signing off,
Jeremy
- Permalink
- Admin
- 2 Aug 2011 8:20 AM
- Comments (2)
January 25th, 2012 at 10:34 pm
Sweet blog! I found it while searching on Yahoo News. Do you have any suggestions on how to get listed in Yahoo News? I’ve been trying for a while but I never seem to get there! Many thanks
January 25th, 2012 at 11:19 pm
Hi Nichelle, I think I did something to register stridesdev with yahoo a long long time ago… but it takes time for the bots to crawl…