A convenient first line of defense against memory misuse.
Introduction
Finding memory leaks in C++ can be hard, so checking for leaks throughout development is important. Many developers are willing to tolerate small or infrequent leaks during the development phase, just as drivers will sometimes drive on a slowly leaking tire. However, it's important to quantify how badly an application leaks so you can make an intelligent decision about how long you can wait before fixing it.
Detecting heap corruption in C++ is even harder than finding memory leaks. It is therefore even more important to check for heap corruption errors during development. Win32 does a good job of checking for heap overruns in the debug configuration, but a release build does not do this kind of checking.
This article introduces an Application Watchman class that encapsulates some of the Win32 CRTL (C Run-Time Library) extensions for heap state reporting. The class is an application of the Façade pattern, which is intended to provide a unified interface to a set of interfaces in a subsystem [1].
This class, AppWatchMan, takes snapshots of the memory state at any point in the program. AppWatchMan also computes the differences between these memory states in order to determine potential leaks. AppWatchMan works as a first line of defense to prove or disprove that an application is leaking memory. You can also use AppWatchMan to generate a static profile of how much memory is being used at any one time, before or after significant events. AppWatchMan is intended to be lightweight and easily integrated into an application. The class is probably not a substitute for fully featured commercial products like NuMega's BoundsChecker or Rational's Purify, or for readily available freeware [2]. However, you can use AppWatchMan to reveal a deeper need for a more comprehensive tool.
Win32 C Run-Time Extensions
The Win32 environment extends the CRTL by adding many useful functions to help debug C and C++ programs. The sidebar, "Heap State Reporting Functions," summarizes the heap state reporting functions, as taken from the MSDN (Microsoft Developer Network) Library [3]. The last two functions allow clients to provide hook functions for enhancing the output of the allocation statistics. The hook functions, explained in the MSDN Library, are beyond the scope of this article.
The functions shown in the sidebar are only available when _DEBUG is defined, and therefore are compiled out for release configurations. Clients access the functions by including the file <crtdbg.h>. No special import library is needed: the functions are part of the debug version of the C run-time library, which a debug configuration links by default. The _CrtMemState structure (described in the sidebar) is defined in this header file.
The debug heap allocates blocks by type. These types include normal program allocations, allocations made by the CRTL, special client-allocated blocks, and two other lesser-used types [4]. Special allocation functions are required for applications to force their blocks to be allocated as _CLIENT_BLOCKs. Most application allocations will occur as _NORMAL_BLOCKs or as _CRT_BLOCKs. Using these special functions would require the client either to litter the application with calls to them or to use another class or set of functions that override allocations; it's not trivial, and it is also beyond the scope of this article.
AppWatchMan Class Interface
The CRTL heap state reporting functions are very handy. However, the limitation of a simple C API is that the client usually maintains the state. For example, the _CrtMemCheckpoint function takes a pointer to a _CrtMemState structure that the client must allocate and manage. Another disadvantage is that the client needs to know how to make the different calls; in this case, there are five functions. Placing these calls throughout a client application could become messy, especially when two or three might be needed at any one point.
In C++, it's pretty easy to resolve some of these issues by encapsulating and abstracting the CRTL heap reporting functions, which reduces the client's implementation effort at the cost of some flexibility. Figure 1 shows a UML class diagram for the AppWatchMan class that provides these capabilities. The class declaration is shown in Listing 1.
The class encapsulates the first, previous, and last check-pointed memory states using the _CrtMemState structure. An alternative might have been to store readings in a circular buffer or list; for this demonstration, it did not seem necessary. A function DoRounds allows clients to request AppWatchMan to perform certain activities defined by the FLAGS enumeration. There are two levels of operations that can be performed. One level is of finer granularity, like asking AppWatchMan to check the heap for corruption. The other level is based on scenarios and is an ORing of the lower-level flags to make it convenient for the client. The scenarios will be discussed in later sections. A form of ostream is held privately and used for logging program output. Finally, a static class variable m_bOnDuty, which mirrors whether the _DEBUG preprocessor symbol is defined, is held privately for tests at run time.
AppWatchMan Class Implementation
The source for this class appears in Listing 2. The constructor and DoRounds function each have an if statement block in which the memory checking functions are called if m_bOnDuty is true. The memory checking functions are compiled out in a release build. Therefore, m_bOnDuty is set to false so that the context in which they are called is also skipped. An alternative to the if statement would have been to use the preprocessor to conditionally compile the code using the _DEBUG symbol. I felt that preprocessor conditionals made the code more difficult to read, and the run-time check really doesn't add much overhead. Other than that, I don't have a compelling argument for using either the if statement or the _DEBUG conditional.
The constructor initializes an instance variable m_pStrLog, of type ostream_withassign, with the address of cout. The CRTL heap reporting functions are limited to using stdio. I did not want to bother the client with all the issues of file I/O and sharing file handles, so I did not allow the client to change the log destination. Allowing that would have been more flexible, but also more work. The constructor executes ios::sync_with_stdio to synchronize the use of iostreams with stdio. The report mode and report file are set to send all warnings, errors, and assertions to stdout. However, I can still use stream operations on m_pStrLog to write special trace comments, which is more convenient than using the CRTL's _RPT report macros. I should also mention that the report mode can be set to display a pop-up window when an error occurs. This could be handy because you can start debugging, and you'll have an active call stack. To keep things simple, I did not allow the client to specify this option.
The constructor calls _CrtCheckMemory to look for heap corruption. By doing this before AppWatchMan ever does the rounds, the current state of the heap will be well known and serve as a reference point for further observations. It's like the night watchman seeing the prior shift log and knowing not to phone in an abandoned car that was already reported. The constructor then calls _CrtMemCheckpoint, saving the data as the first snapshot, and dumps the data by calling _CrtMemDumpStatistics.
The function DoRounds is where most of the client-directed work is done. The client calls this function passing a comment and a FLAGS value. There are seven fine-grained values that are used for checking heap corruption, taking heap snapshots, dumping statistics, comparing snapshots, and performing leak detection. The FLAGS enumeration defines the values with an implied hierarchy where CHECK_HEAP is done first and LEAKS_FROM_START is done last. I have also made three higher-level FLAGS values for some typical scenarios that I think will be very useful.
AppWatchMan Scenarios and Examples
The first scenario is a Confidence Check. The calls to DoRounds are intended to be placed close to the beginning of the program, near the start of main, and close to the end of the program, near main's return. The client calls DoRounds with a FLAGS value of BEGIN_CONF_CHECK. DoRounds first checks the heap for corruption, then takes a snapshot of the heap statistics, and dumps them. The client then executes the rest of the program. At the end, the client calls DoRounds with a FLAGS value of END_CONF_CHECK. DoRounds again checks the heap for any corruption and takes and dumps a snapshot of the heap statistics. Finally it checks for any leaks since the start of the program and dumps all the leaking objects. This scenario is illustrated in Figure 2.
A sample program for the Confidence Check scenario is shown in Listing 3 and the output is shown in Figure 3. An intentional leak was introduced by omitting delete. DoRounds calls _CrtDumpMemoryLeaks to dump all leaks since the start of the program. The heap memory allocated in function IntroduceLeak was the 32nd block allocated, its address was 0x00790DA0, and it was 12 bytes long. The data string "Hello World" is dumped in ASCII. String data may be obvious, but other data will be harder to identify. I would suggest integrating with a memory manager that provides new and delete functions that keep track of allocations, which functions made them, and where they were made. A search on the Internet will produce quite a few to choose from.
The second scenario is a Leak Test. The calls to DoRounds should be placed immediately before and after the code to be tested, or where a leak is suspected. The client calls DoRounds with a FLAGS value of BEGIN_LEAK_TEST that just takes a snapshot of the heap. After the client executes the test code, it calls DoRounds with a FLAGS value of END_LEAK_TEST that results in another snapshot being taken. A difference is computed between the two memory states; if there are any differences, any leaks since the previous snapshot are dumped.
A sample function for the Leak Test scenario is shown in Listing 4 and the output is shown in Figure 4. The leak introduced is the same as the one in Listing 3. This time the error was found by taking before and after snapshots and executing _CrtMemDifference. If a difference is found, function _CrtMemDumpAllObjectsSince is called with the previous _CrtMemState. The results in Figure 4 agree with those in Figure 3.
The last scenario is a Profile Test. The calls to DoRounds are placed around a specific section of code in which the heap statistics are to be profiled. The client calls DoRounds with a FLAGS value of BEGIN_PROFILE_TEST that takes a snapshot and dumps the data. After the client executes the code to be profiled, it calls DoRounds with a FLAGS value of END_PROFILE_TEST. A snapshot is taken and dumped, and a difference with the last snapshot is made and dumped. There is quite a bit of valuable data in these statistics.
A sample function for the Profile Test scenario is shown in Listing 5 and the output is shown in Figure 5. Snapshots are taken before and after the profiled code. A difference is computed and dumped. The memory allocated is our old friend, the string "Hello World", which, including the null terminator, is 12 bytes long. The statistics for the number of blocks are all zero; this means there were no leaks between the two snapshots. You can also see that the largest number of bytes used, which is the high water mark, and the total allocation bytes both increased by 12 bytes. Now you can have solid numbers for making decisions and assessments rather than guessing.
A variation on the Confidence Check is useful when you suspect that the heap is being corrupted. A sample function for corrupting the heap is shown in Listing 6 and the output is shown in Figure 6. One of the more common errors is writing past the end of a character array. In debug mode, the debug heap places guard blocks with values of 0xFD at the end of the allocated memory. In the example, the client writes an exclamation point in the space reserved for the null terminator, so the null terminator is written into the guard block. When the client deletes the memory, the debug heap checks the guard block for corruption and generates an error, which is the first one listed in Figure 6. The second error in the example code is also an overrun, this time when writing the string "Goodbye!". In this case, the block is intentionally not returned, but the damage is already done. When the client calls DoRounds at the end of the Confidence Check scenario, a call to _CrtCheckMemory is made, revealing the error and dumping relevant statistics. The last error in Figure 6 comes from the end of the Confidence Check scenario, where function _CrtDumpMemoryLeaks is called to show all blocks that have not been returned.
Discussion
Executing the Leak Test scenario answers the question "Are there any leaks?" You should be able to answer the question confidently with a simple yes or no. For more information on finding leaks, consult [5]. To learn more about troubleshooting, read Debugging Applications by John Robbins [6]. This excellent book covers the debugging details for the Win32 environment.
The after-market heap corruption checkers do a better job of catching heap corruption. Some products even check for overruns on the stack. The dilemma is deciding if you need those products. You can think of the AppWatchMan Confidence Check scenario as a first line of defense for determining if further heap corruption checking is necessary.
There are several areas worth exploring. One would be creating a static AppWatchMan as an attempt to get it called even before main is executed. It would also be interesting to see when or if its destructor would get called. I'd also like to know if Linux or Unix has extensions to the CRTL like Win32. I know there are products that are made for Linux to detect leaks; one example is Insure++ by Parasoft. There are also lots of freeware and public license implementations; I'm just not sure if the OS supports a set of C functions like Win32 does.
Several enhancements would be beneficial to AppWatchMan. One improvement would allow the client to specify the name and location of a log file that is totally managed and encapsulated by AppWatchMan. Another enhancement would be to allow the client to specify that the report mode generates a Win32 MessageBox on all errors. The pop-up gives you the option of just-in-time debugging only when necessary on a debug version.
Conclusion
AppWatchMan is a lightweight class that can help catch memory allocation errors and heap corruption in your programs early in the development cycle. AppWatchMan will provide peace of mind and concrete information on heap usage. One tradeoff is the buy versus build argument. AppWatchMan is very lean and can't be compared to a more full-featured product like NuMega's BoundsChecker. However, if you can't afford the commercial products, you can afford AppWatchMan. One thing is for sure: you can't afford to be uninformed about your application's use or misuse of the heap. AppWatchMan should be your first line of defense!
Notes
[1] Erich Gamma, et al. Design Patterns (Addison-Wesley, 1995), page 185.
[2] While editing this article, I came across a controversial opinion written by Randy Charles Morin on one of the more popular commercial heap checking tools. His paper appears at <www.kbcafe.com/articles/memory.leaks.html>. Randy also developed a class similar to mine; however, it does all the work in the constructor and destructor. I was not comfortable doing that because the intent of the constructor and destructor is initialization and cleanup. My class aggregates functionality and is scenario based.
[3] Heap State Reporting Functions, MSDN Library, <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsdebug/html/_core_heap_state_reporting_functions.asp>.
[4] Types of Blocks on the Debug Heap, MSDN Library, <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsdebug/html/_core_types_of_blocks_on_the_debug_heap.asp>.
[5] Edward Wright. Detecting and Isolating Memory Leaks Using Microsoft Visual C++, MSDN Library, <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvc60/html/memleaks.asp>.
[6] John Robbins. Debugging Applications (Microsoft Press, 2000).
Bill Trudell is a senior developer with Capital One where he uses C++ to implement middleware solutions for their Call Center applications. His articles have appeared in the Journal of Object Oriented Programming, Dr. Dobb's Journal, and Embedded Systems Programming. His interests include his wife and children, racquetball, and woodworking. He can be reached at billtrudell@yahoo.com.