Debugging Disasters
Filed Under: Weekly Tuesday Dose of goodness
Hi all,
Here’s another article regarding software development disasters. However, do note that this is a fictitious story that contains fragments of events based on my own experience. I’m just using a single story to string them together. So, here it goes:
A team of developers had already been working on a piece of software for 12 months. So far, there hadn’t been any issues in terms of functionality. Everything was proceeding smoothly.
One day, a new user joined the team to help test for exceptional conditions. That’s where the flight to heaven fell from the sky, hit a shit mold and started to roll into the hot depths of hell.
What happened? Read on…
Introduction
Basically, there’s nothing much to introduce here. Most of the introduction has already been done above. But allow me to set the stage instead for this whole mess.
First of all, the developers are not very experienced. They have approximately 1-3 years of experience in C++ development, and their project manager is purely a task and requirements management person.
There’s a technical leader who has around 4 years of C++ experience. Their code base is approximately 400,000 lines of code so far.
So what exactly happened?
13th Month - Unlucky or just a plain number?
In the 13th month of the project, a young but experienced user-cum-tester joined the team. As expected, his tasks were to test the project against the existing requirements and to come up with test plans that exercise exceptional behavior in the system.
He’d then test the system and submit a report to the manager, who would then assign new tasks based on the report.
The first few days were a real honeymoon. The tester heaped tons of praise on the work done so far to encourage the team. Everyone was energized and motivated.
But problems started to appear right after the 2nd week.
In certain exceptional cases, the system could not be recovered. In fact, in extreme scenarios, the team literally had to shut down the server and restart the machine.
This came as a big blow to the team, since progress had been such smooth sailing over the past year.
The developers began to wage war against the new tester by being very difficult to deal with. They soon questioned the tester’s intentions in the company and even accused him of sabotaging the project for competitors.
It was ugly.
In response to that, the project manager ordered an emergency meeting and launched an investigation into the series of crashes caused by the exceptional tests.
Debugger Faulty?
As usual, the developers stepped into the code, and they could even identify the point where the crash occurred. But they couldn’t make any sense of it. Basically, the code looked something like this:
void Bank::UpdateStatus(int newStatus)
{
    m_iCurrentStatus = newStatus; // CRASH here.
}
It looked as though the debugger was faulty here. But it couldn’t be. They were using Visual Studio 2005 to begin with, and we know that it’s stable and dependable.
They did have a clue, though.
The memory address of m_iCurrentStatus looked dubious. The address read something like 0xCCCCCCDF. But this alone didn’t say much about why the address came out this way.
Naming disaster
Another thing noted by the tester: whenever he hit a compilation error in a sea of compiler warnings and template warnings, it was almost impossible to find THE compilation errors.
Usually, in Visual Studio, it’s easy to find compiler errors by typing “error” into the search pop-up.
However, for this team, they apparently named classes with the word error as part of the name. For example:
cBankingErrorException
cAccountErrorException
cErrorReport
cErrorStatus
What the hell?!! Words like warning or error should never be used as part of a class name. This certainly leads to more work when debugging or tracking down syntax errors.
I mean, take a look at this. You have like 20,000 lines of warnings and stuff like that during a build, and there are only like 5 to 6 compiler errors. The compiler says: 10222 warnings, 5 errors.
So what’s next? Look for the syntax errors right? NO.
You can’t, because the freaking CLASSes are NAMED as errors. So now you’ll have to filter through like 2000 instances of “error” in order to find those few compilation errors.
It’s worse if they’re spread evenly apart. Good luck finding them.
Note: Don’t assume that everybody’s using Visual Studio 2003 and above ok - errors don’t just pop up like that.
Mysterious System Slowdown
The system was actually ok when the standard test plans were performed. There were no glitches, bugs or crashes.
However, when the exceptional test set came into play, the whole system began to slow down after 2-3 days of continuous operation.
The so-called exceptional tests that caused the slowdown did nothing but create bank accounts.
Cause of Problems
Now I’m sure you guys want to know what caused all these problems right?
Let’s go through them 1 by 1.
1) Debugger Faulty?
The debugger isn’t faulty. Neither is the address faulty.
The code is behaving correctly. So what’s wrong anyway? First of all, we need to understand that there are 4 types of memory in C++. See the “memory management in totality” article for more information.
So, a few things are known after compilation. One of them is the code memory. The code memory stores a single, unique copy of your compiled code. This applies only to the application itself - additional DLLs have their own respective code memory.
We’ve now established that the code itself is perfectly ok.
Next, let’s establish the validity of m_iCurrentStatus . Based on the coding standards, it means that it’s a member attribute, type: native integer. The thing you must know about classes is, a member attribute is actually an address offset in the class.
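And since every object has an address of its own, let’s assume that the address of the object is 0x1234cccc. With the default struct member alignment of 8 bytes in Visual C++, each attribute is placed at an offset that is a multiple of its own size or of 8 bytes, whichever is smaller.
Therefore 10 shorts take up only 20 bytes, not 80, while an int that follows a single short starts at offset 4, after 2 bytes of padding.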
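As a rough check of how members are really spaced, here’s a small sketch using sizeof and offsetof. The struct names are made up for illustration, and the numbers assume a typical compiler where short is 2 bytes and int is 4 bytes.

```cpp
#include <cstddef>

// Hypothetical structs, used only to inspect layout.
struct TenShorts {
    short a, b, c, d, e, f, g, h, i, j;
};

struct ShortThenInt {
    short m_sFlag;           // offset 0
    int   m_iCurrentStatus;  // aligned to its own size, so padding comes first
};

// Each short is aligned to its own size, so no padding sits between them.
static_assert(sizeof(TenShorts) == 10 * sizeof(short),
              "shorts are laid out back to back");

// The int member lands on the next multiple of its own size.
static_assert(offsetof(ShortThenInt, m_iCurrentStatus) == sizeof(int),
              "int starts after the padding that follows the short");
```

If either assumption is wrong for your compiler or packing setting, the build fails right at the static_assert, which is a cheap way to find out.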
So, when a member attribute is accessed, the formula is basically:
base address + offset
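Therefore, if the base address is faulty in the first place, i.e. wild or dangling, nothing in the program checks it. A null pointer at least stands out because the base address is zero, but any other bad address looks like a perfectly normal one. So the program just blindly executes the line with the code, base address and offset.
That’s what causes this weird error. The 0xCCCCCCDF address is the giveaway: debug builds in Visual C++ fill uninitialized stack memory with 0xCC bytes, so a pointer that was never initialized reads as 0xCCCCCCCC, and 0xCCCCCCCC plus a member offset of 0x13 gives 0xCCCCCCDF. In other words, UpdateStatus was most likely called through an uninitialized Bank pointer.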
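To make the base address + offset formula concrete, here’s a minimal sketch that does the arithmetic on plain integers, so no bad pointer is ever dereferenced. BankLayout, MemberAddress and kUninitializedPointer are hypothetical names, the 0x13 offset is only inferred from the address in the story, and the code assumes C++11 or later.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Bank; only the position of the member matters.
struct BankLayout {
    char m_szName[16];
    int  m_iCurrentStatus;
};

// Address that a write to a member would target: base address + offset.
constexpr std::uint64_t MemberAddress(std::uint64_t base, std::uint64_t offset)
{
    return base + offset;
}

// What a never-initialized pointer looks like in a 32-bit Visual C++
// debug build, where uninitialized stack memory is filled with 0xCC bytes.
constexpr std::uint64_t kUninitializedPointer = 0xCCCCCCCCu;

// An uninitialized base plus an assumed offset of 0x13 gives the dubious address.
static_assert(MemberAddress(kUninitializedPointer, 0x13) == 0xCCCCCCDFu,
              "matches the address seen in the debugger");

// With a null base, the computed address is simply the member's offset.
static_assert(MemberAddress(0, offsetof(BankLayout, m_iCurrentStatus)) == 16,
              "member sits 16 bytes into the object");
```

Seen this way, the crashing line is innocent: the write lands wherever the base address says, and the thing to hunt for is whoever handed over that base address.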
2) Naming Disaster
Naming classes with error as part of the class name suggests a possible defect in the design itself. How could a class be created simply to represent an error?
While naming a class with error as part of it may have its merits, it’ll certainly cause a lot of problems should the build break. People who’ve used VC6 before should understand. In later versions of Visual Studio, errors are presented in a separate window apart from the build output, allowing developers to quickly zoom in on them.
However, for people who’re using makefiles, build utilities and the old VC6, good luck finding the compiler errors. Even if you can, it won’t be easy, and it slows you down for nothing.
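If you’re stuck with such names, one way around it is to search for the shape of a real diagnostic instead of the bare word. Here’s a rough sketch, assuming the classic Visual C++ format where the word error is followed by a tool code such as C2065 or LNK2019 and a colon. IsCompilerErrorLine and the sample log lines are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when a build-log line looks like a real diagnostic, and false
// when "error" only shows up inside an identifier such as cErrorReport.
bool IsCompilerErrorLine(const std::string& line)
{
    // "error" as a whole word, then a tool code like C2065, then a colon.
    static const std::regex pattern(R"(\berror\s+[A-Z]+[0-9]+\s*:)");
    return std::regex_search(line, pattern);
}

// Usage sketch on two invented log lines.
void CheckFilterOnSampleLines()
{
    assert(IsCompilerErrorLine(
        "bank.cpp(42) : error C2065: 'x' : undeclared identifier"));
    assert(!IsCompilerErrorLine(
        "bank.cpp(10) : warning C4101: 'cErrorReport' : unreferenced local variable"));
}
```

The same pattern can be typed into any search tool that supports regular expressions, so it works on a log from a makefile build too.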
3) Mysterious System Slowdown
Your system may not crash within the same day even though there’s an obvious design-based leak going on there.
And even if your system doesn’t crash, if it slows down dramatically, then either a design-based leak or memory hogging is responsible.
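Here’s a small sketch of what such a leak could look like, assuming that a design-based leak means memory that is still owned and reachable but never released by design. AccountAuditTrail and SimulateAccountCreation are hypothetical, not taken from the project in the story.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical audit trail: every account creation appends a record,
// and nothing in the design ever removes one.
class AccountAuditTrail {
public:
    void OnAccountCreated(const std::string& accountId)
    {
        m_records.push_back("created:" + accountId);
    }

    std::size_t RecordCount() const { return m_records.size(); }

private:
    std::vector<std::string> m_records;  // grows for as long as the server runs
};

// Simulates a test run that does nothing but create bank accounts.
std::size_t SimulateAccountCreation(AccountAuditTrail& trail, std::size_t count)
{
    const std::size_t before = trail.RecordCount();
    for (std::size_t i = 0; i < count; ++i) {
        trail.OnAccountCreated("ACC" + std::to_string(i));
    }
    // Every record is still held, so memory use only ever goes up.
    assert(trail.RecordCount() == before + count);
    return trail.RecordCount();
}
```

Nothing here is a leak in the classic sense, because the vector frees everything when the object is destroyed. On a server that runs for days, though, that moment never comes.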
For more information, read the article on memory hogging, and the two articles below on design-based leaks.
Design-based leaks (continued)
Legal Leaks
Conclusion
Finally, we’ve arrived at the conclusion - it has been a long, long post, I believe. I hope that these lessons will help people realise potential problems early during development. Even though the story is fictitious, it’s based on true experiences put together to form a scenario.
Do read up on the references I’ve linked here. They’ll help as well.
Have a great week ahead!
Signing off,
Jeremy
- 22 Jun 2010 11:09 AM
- Comments (2)
July 25th, 2010 at 6:44 pm
Excuse my french but, This post makes my mind spin at the speed of dark.
July 26th, 2010 at 11:34 am
sorry if my post made u feel that way… but if u hav any questions, feel free to ask…