Features


malloc-Related Errors

Steve Halladay


Steve Halladay currently works for StorageTek in Louisville, CO. He has over 10 years of experience developing products like CSIM and Multi-C in C++. Steve received his MS in computer science in 1982 from Brigham Young University. You can reach Steve at StorageTek (303) 673-6683.

Though malloc and free can make a program more flexible and extensible, using these functions improperly can create subtle errors. This article enumerates several types of malloc and free misuse so that C and C++ programmers can identify and rectify these errors more quickly.

Using Freed Memory

One of the most serious and hardest-to-track errors occurs when you deallocate a properly allocated data item and then continue to use the item normally. This error can produce three types of symptoms, depending on how the program uses and accesses the freed data item on the heap.

If the program frees the data item (see Listing 1), the contents of the data item may change. The heap manager (i.e., malloc and free) may use the item's memory to link it onto a free list as a variant record. If the program immediately attempts to read the item, it will appear to the programmer, who is not aware the data item has been freed, that the item's contents have mysteriously changed. Fortunately, a debugger can help identify and solve this bug by tracking the data item through the program.

If the heap manager does not write over a portion of the data item, freeing the item and reading its contents will not immediately reveal this error. Porting the code from one compiler to another or even adding additional unrelated code will often cause these masked errors to surface. The inexperienced programmer might then blame the compiler or the individual who introduced the unrelated code.

Another symptom of using freed memory occurs when the program attempts to modify the freed memory (see Listing 2). If the memory is still on the heap, and the modification conflicts with the way the heap manager is using the memory, the error will go unnoticed until a future call to malloc or free. At this point you can expect an access violation or related error, or what appears to be a call to malloc that never returns. This is a painful death because the symptom does not appear in the vicinity of the code that caused it.

The worst scenario for this type of error occurs when the program reallocates the freed memory before the previous owner accesses it (see Listing 3). Multiple program sections then believe they each own the same memory location. Each part of the program sees the contents of its data structure change mysteriously. This behavior is probably the most common symptom for this kind of error.

Probably the most discouraging side effect from this type of bug occurs in the debugger. Some debuggers use the heap to keep track of breakpoints, etc. The symptoms of the bug can therefore change (or even disappear) while in the debugger. These bugs render the debugger nearly useless since it cannot identify the bug locality.
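As a rough illustration of the three patterns discussed in this section (this is not one of the article's listings), the sketch below shows each misuse as a comment only, since actually executing them is undefined behavior. The Item type and the item_new and item_free helpers are hypothetical names introduced here. Clearing the caller's pointer on release is one common defensive idiom, assumed for this sketch, that turns a later use into an obvious NULL dereference rather than a silent one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Item {          /* hypothetical record type */
    int  id;
    char name[16];
} Item;

/* Allocate and fill an item; returns NULL if malloc fails. */
Item *item_new(int id, const char *name)
{
    Item *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->id = id;
    strncpy(p->name, name, sizeof p->name - 1);
    p->name[sizeof p->name - 1] = '\0';
    return p;
}

/* Free the item and clear the caller's pointer so that a later
 * use shows up as a NULL dereference instead of a stale access. */
void item_free(Item **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}

/*
 * The three misuse patterns, shown as comments only because each
 * one is undefined behavior:
 *
 *   free(p);  n = p->id;         read after free: contents may have changed
 *   free(p);  p->id = 7;         write after free: may corrupt the free list
 *   free(p);  q = malloc(size);  q may reuse p's memory, so p and q alias
 */
```

A caller would write item_free(&p) instead of free(p). This only protects the one pointer that is passed in; any other copies of the address remain dangling.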

Magic Numbers

A simple yet effective approach for diagnosing heap-related errors is the use of a magic number. A magic number is a unique, arbitrary value the program stores in the allocated data structure. Each data-structure type has a field that contains a specific, predetermined value. The program sets the data structure's field when it allocates the data structure, and resets the field when it deallocates the data structure. Object-oriented or abstract data type approaches lend themselves easily to this mechanism. Each of the object's methods checks the magic number to verify its contents. Since you can use any number, a byte value is sufficient. I personally prefer to use a value that is the address of a static string that identifies the data structure type. This scheme can help debug these heap-related errors. For magic numbers to be useful, they must be used religiously.

Listing 4 (and the associated files, Listing 5 and Listing 6) implements a stack abstract data type that uses magic numbers. The static pointer named magic is the magic value for the stack abstraction. The three macros MAGIC_ON, MAGIC_OFF and MAGIC_CHECK set and reset the magic number as well as verify its value. StkConstruct is the stack constructor, which allocates and initializes an instance of the stack. Notice that this routine sets and checks the magic number just before returning the stack handle (the handle is the pointer that references the stack). StkConstruct employs MAGIC_CHECK as a minor sanity check to ensure MAGIC_CHECK and MAGIC_ON are consistent.

All other member functions first validate the stack handle by using MAGIC_CHECK. This check gives strong assurance that the data structure in question was created by StkConstruct and has not yet been destroyed. StkDestroy also uses MAGIC_OFF just before it frees the stack data structure to prevent member functions from inadvertently manipulating the data structure after it has been released. While abstractions like this stack object require a few more lines of code, they make the code far more robust.
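A minimal sketch of the scheme described above might look like the following. The names magic, MAGIC_ON, MAGIC_OFF, MAGIC_CHECK, StkConstruct and StkDestroy come from the text; everything else (the fixed capacity STK_MAX, the int element type, StkPush, StkPop, and the -1 error return) is an assumption made for illustration and may differ from the article's listings.

```c
#include <assert.h>
#include <stdlib.h>

/* The magic value is the address of a static string naming the type. */
static const char *const magic = "Stack";

#define STK_MAX 32                 /* assumed fixed capacity, for brevity */

typedef struct Stack {
    const char *magic;             /* set on construct, cleared on destroy */
    int         top;
    int         data[STK_MAX];
} Stack;

#define MAGIC_ON(s)    ((s)->magic = magic)
#define MAGIC_OFF(s)   ((s)->magic = NULL)
#define MAGIC_CHECK(s) ((s) != NULL && (s)->magic == magic)

/* Allocate and initialize a stack; returns NULL if malloc fails. */
Stack *StkConstruct(void)
{
    Stack *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->top = 0;
    MAGIC_ON(s);
    assert(MAGIC_CHECK(s));        /* sanity check: ON and CHECK agree */
    return s;
}

/* Push a value; returns 0 on success, -1 on a bad handle or full stack. */
int StkPush(Stack *s, int value)
{
    if (!MAGIC_CHECK(s) || s->top == STK_MAX)
        return -1;
    s->data[s->top++] = value;
    return 0;
}

/* Pop into *value; returns 0 on success, -1 on a bad handle or empty stack. */
int StkPop(Stack *s, int *value)
{
    if (!MAGIC_CHECK(s) || s->top == 0)
        return -1;
    *value = s->data[--s->top];
    return 0;
}

/* Clear the magic number, then release the stack. */
int StkDestroy(Stack *s)
{
    if (!MAGIC_CHECK(s))
        return -1;
    MAGIC_OFF(s);
    free(s);
    return 0;
}
```

Returning an error code rather than aborting is a design choice of this sketch; a debugging build might prefer to stop immediately at the failed check. Note also that checking the magic number of an already freed stack still reads freed memory, so the check is a diagnostic aid rather than a guarantee.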

Forgetting to Declare malloc

When C programs reference a function before declaring it, the compiler assumes the function returns an int. Errors result when a function appears to return an int, but the actual item returned is of a different size. Since malloc returns a pointer value, a potential error exists when a program uses malloc before it declares malloc. Some compilers will flag the error only in environments where the size of an integer differs from the size of a pointer.

This error is common for PC programmers who experiment with various memory models (I personally cause this error annually whether I need to or not). Symptoms of this error include a call to malloc that hangs the program, or the program runs to completion but DOS cannot be reloaded because the program trashed some system memory, such as the interrupt vector.

This error can be especially annoying because the same program may run without a hitch using other compilers or memory models. Again the inexperienced programmer may blame the compiler, the runtime or some other library, and sometimes even the hardware. The fallacious proof for these indictments is that the program runs fine elsewhere.

The cure is to include a system header file that contains external declarations for the heap routines. Unfortunately the name of that header may vary from compiler to compiler. In ANSI C, stdlib.h contains the declarations. Other environments might declare the routines in alloc.h, malloc.h or even stdio.h.

Including the appropriate file with the declaration is preferable to simply externally declaring the functions in the code. Usually, including the system definitions will not cause type conflicts with the actual function in the library. Some compilers define malloc as returning a character pointer (i.e., char *), while others define it as returning a void pointer (i.e., void *). Strict type checking notifies the programmer when he or she forgets to declare malloc. However, a programmer can experience hours of aggravation if he or she declares the function differently from the runtime library definition and the compiler enforces strict type checking.
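As a small illustration, assuming an ANSI C environment where stdlib.h declares the heap routines, the sketch below includes the header and calls malloc without a cast. The make_counters function is a hypothetical example, not part of the article.

```c
#include <assert.h>
#include <stdlib.h>   /* ANSI C: declares malloc and free using void * */

/* Allocate an array of n zeroed ints; returns NULL on failure.
 * The caller releases the array with free. */
int *make_counters(size_t n)
{
    int *p;
    size_t i;

    assert(n > 0);                    /* caller's contract in this sketch */
    if (n > (size_t)-1 / sizeof *p)   /* n * sizeof *p would overflow */
        return NULL;

    p = malloc(n * sizeof *p);        /* no cast: void * converts implicitly */
    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = 0;
    return p;
}
```

Leaving out the cast is deliberate here. If the header were missing, the compiler would see an int being assigned to a pointer and would typically complain, whereas a cast could hide the missing declaration.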

Memory Leakage

A less catastrophic but potentially serious error results from failing to free unused allocated heap memory. For smaller programs with little memory utilization, this type of error may go unnoticed for the life of the program. However, these errors become visible as the program consumes greater amounts of heap memory. Even so, such programs often run for a while before running out of memory.

When a program on a PC runs out of memory, malloc will generally return a NULL pointer. If the programmer did not bother to check the malloc return value for NULL, the program will probably step on the interrupt vector in low memory and eventually cause the machine to hang. Experienced programmers recognize that identifying bug locality is worth the additional effort required to check return codes from library routines such as malloc.

On a machine with virtual memory, this type of memory leakage has a different symptom. As the program continues to run, it chews up virtual memory until the swap space is full. One can discern this situation by an inordinate amount of I/O activity. Increased I/O slows the program to a glacial grind. If the program continues, malloc will eventually return NULL here as well. On most virtual memory machines, if the program does not check the return value from malloc, the program will dereference the NULL pointer and cause an access violation.
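One way to make the return-value check hard to forget is to route allocations through a helper that reports where the failure occurred. The sketch below is an assumption of how that could look; xmalloc_at and XMALLOC are hypothetical names, and exiting on failure is only one possible policy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n bytes, or stop with a message that
 * names the call site so the failure is reported where it happened. */
void *xmalloc_at(size_t n, const char *file, int line)
{
    void *p;

    assert(file != NULL);
    p = malloc(n ? n : 1);    /* malloc(0) may legitimately return NULL */
    if (p == NULL) {
        fprintf(stderr, "%s:%d: malloc(%lu) failed\n",
                file, line, (unsigned long)n);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Record the caller's file and line automatically. */
#define XMALLOC(n) xmalloc_at((n), __FILE__, __LINE__)
```

A call such as buf = XMALLOC(512) then either succeeds or terminates with the file and line of the request, which preserves the bug locality the text recommends.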

Sound program design will prevent memory leakage errors. Generally, you should allocate and deallocate memory at the same level in a program.

You can employ many diagnostic tricks to identify this error. One approach builds a housekeeping layer between the application and the heap manager. The housekeeping layer keeps track of the allocated and deallocated memory. Listing 7 and Listing 8 show how such a layer might be built for malloc and free. The concept is easily extensible to additional heap routines such as calloc, realloc, strdup, etc. Notice the use of the preprocessor definitions of malloc and free. Using the preprocessor to redefine malloc and free allows the layer to be retrofitted to existing code, as well as disabled for final release code. Being able to turn off the housekeeping layer is vital because of its impact on program performance. Robert Ward covered "malloc wrappers" in detail in the October 1991 issue of The C Users Journal.
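The following is a rough sketch of such a housekeeping layer, not a reproduction of Listing 7 or Listing 8. The names dbg_malloc, dbg_free, dbg_outstanding and dbg_report, the fixed table size DBG_MAX, and the HEAP_DEBUG switch are all assumptions made for illustration. Besides counting blocks that were never released, the layer also refuses to free an address it did not hand out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DBG_MAX 256                 /* assumed table size for this sketch */

static void  *dbg_ptr[DBG_MAX];     /* addresses currently allocated */
static size_t dbg_size[DBG_MAX];    /* requested size of each block */
static int    dbg_live = 0;         /* number of outstanding blocks */

/* Allocate through the layer and record the block in the table. */
void *dbg_malloc(size_t n)
{
    void *p;
    int i;

    if (dbg_live == DBG_MAX)        /* table full: refuse the request */
        return NULL;
    p = malloc(n);
    if (p == NULL)
        return NULL;
    for (i = 0; i < DBG_MAX; i++) {
        if (dbg_ptr[i] == NULL) {
            dbg_ptr[i] = p;
            dbg_size[i] = n;
            dbg_live++;
            break;
        }
    }
    return p;
}

/* Free through the layer. Returns 0 on success, or -1 if p is not
 * a block the layer currently knows about. */
int dbg_free(void *p)
{
    int i;

    if (p == NULL)                  /* free(NULL) is a no-op */
        return 0;
    for (i = 0; i < DBG_MAX; i++) {
        if (dbg_ptr[i] == p) {
            dbg_ptr[i] = NULL;
            dbg_live--;
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "dbg_free: %p is not an allocated block\n", p);
    return -1;
}

/* Number of blocks allocated but not yet freed. */
int dbg_outstanding(void)
{
    return dbg_live;
}

/* List every outstanding block, e.g. just before the program exits. */
void dbg_report(FILE *out)
{
    int i;

    assert(out != NULL);
    for (i = 0; i < DBG_MAX; i++)
        if (dbg_ptr[i] != NULL)
            fprintf(out, "leak: %lu bytes at %p\n",
                    (unsigned long)dbg_size[i], dbg_ptr[i]);
}

/* Retrofit existing code: placed after the layer itself so that the
 * layer still calls the real routines. HEAP_DEBUG is hypothetical. */
#ifdef HEAP_DEBUG
#define malloc(n) dbg_malloc(n)
#define free(p)   ((void)dbg_free(p))
#endif
```

Compiling with HEAP_DEBUG defined redirects existing calls to the layer; leaving it undefined removes the overhead for release builds. The linear table search keeps the sketch short but makes every call slower as the table fills, which is one reason such a layer should be easy to switch off. Standard C reserves the names malloc and free once stdlib.h is included, so treat the macro redirection as a debugging aid rather than portable practice.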

Using Unallocated Memory

A serious but less common error is dereferencing a pointer to a structure that the program never allocated. This amounts to trying to use an uninitialized pointer. Many current compilers will flag these errors with warning messages; more traditional compilers will not. Since C does not specify the value of uninitialized auto variables, this error can produce symptoms that vary significantly.

Using an uninitialized pointer in a PC program can have drastic effects. Since there is generally no memory protection, wild pointers let a programmer step on program and system memory. Wild pointers can corrupt areas of memory containing program code, the system configuration, and even the current date and time.

On machines with memory protection, this error commonly results in an access violation. But when, by chance, the uninitialized pointer points to a valid memory address, the symptoms are similar to those observed when dereferencing a deallocated memory section (i.e., data item values mysteriously change). The previously discussed magic numbers are an effective means of debugging such programs.

Freeing Unallocated Memory

A bug that occurs less frequently results from attempting to free a pointer that the program never allocated. Since free usually does not return a value, the programmer is not gracefully notified when this error occurs. Fortunately this error is the rarest of any mentioned in this article.

Symptoms are similar to those seen when inadvertently freeing the same location more than once. The next call to malloc may result in an access violation, a NULL pointer assignment or even in malloc hanging in an infinite loop.

Some more intelligent implementations of free use a hidden magic number to verify that the pointer originated from malloc. If the pointer was never malloced, free prints a message and returns, or even exits. This seems the exception (no pun intended) rather than the rule since additional checking adds unavoidable overhead.

Using a housekeeping layer is also an effective way to shoot these bugs. The malloc/free housekeeping layer helps the program recognize that it has not allocated a particular address. The housekeeping layer then raises an exception.

Conclusion

As programs become more dynamic, it is critical that programmers understand how to defend themselves against the pitfalls of heap abuse. Knowing the symptoms and reasons for heap-related errors can save a programmer significant time and aggravation.