Features


Automated Unit Testing

Roger Meadows


Roger has a Master's degree in Computer Science. He is interested in object-oriented programming and automated testing. He can be reached via CompuServe at 73040,436.

This article presents some simple but very powerful techniques for automating unit testing. Although automated unit testing does not solve all testing problems, it does provide a mechanism that can dramatically increase software reliability without a corresponding increase in effort.

Problems with Test Code

Usually, the process of unit testing involves writing additional code that exercises the unit being tested. (I will refer to code being tested as application code and code doing the testing as test code.) Often, test code, viewed as a one-time effort, is written and used during initial development, but forgotten once a module is working. Even when you try to reuse test code, it may be lost, or not updated to reflect application code changes.

Another, more subtle problem is that interpreting the output of the test code sometimes requires a detailed understanding of how the test and application code work. The output of test code written during or shortly after development of the application code is probably easy to understand. But later, the same output may have little meaning to you. So to determine if the test code is working, you may have to examine test cases within the test code to identify the expected output. Even worse, you may have to revisit the application code to determine what to expect.

The Value of Automated Unit Testing

For maximum value, you must track test code as you would an application, so it will be up to date and easily located. You must document your test code: how to build it, how to run it, and especially, what the output should look like. Automated unit testing helps you obtain this maximum value from your test code by increasing program reliability, permitting shorter regression testing times, enabling you to hand off minor changes to more junior programmers with no loss of confidence, and documenting expected and actual output.

The germ of this concept can be found in C Programming Guidelines by Thomas Plum, Prentice-Hall, 1984. Automated unit test code requires more thought up front and probably takes longer to write than conventional test code. But this effort is more than repaid by the improved reliability and time saved in retesting during the life of a module.

Implementation

Automated unit testing procedures are based on a modular programming model. In this model, complex programs consist of functional modules with clearly defined interfaces to other modules. The implementation of one module is hidden from other modules, so portions of the system can be modified, or even rewritten, without changing the entire system.

To implement automated unit testing you add a main program to every source file that contains a module. The main program tests the functions in the module. You put the main program and related declarations inside a conditional compile block that the compiler ignores unless the symbol TESTMAIN has been defined. See Listing 1.

The test code calls application code functions to perform operations and compares the results obtained with the results expected. This automates the process of checking whether the application code is working correctly. It also documents the expected output for each test case. This method is called black box testing, because the test code views the application code as a "black box," focusing on the inputs and expected output for each set of inputs.
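Listing 1 itself is not reproduced here, but the shape of such a module can be sketched. In the sketch below, strws is a hypothetical whitespace-collapsing function standing in for the listing's actual code; the test main and its helpers are compiled only when TESTMAIN is defined:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical application function: collapse runs of blanks and
   tabs in s to a single space, in place, dropping leading runs.
   (A stand-in for the article's actual Listing 1, which may differ.) */
char *strws(char *s)
{
    char *src = s, *dst = s;
    int in_ws = 0;

    while (*src != '\0') {
        if (*src == ' ' || *src == '\t') {
            in_ws = 1;
        } else {
            if (in_ws && dst != s)
                *dst++ = ' ';
            in_ws = 0;
            *dst++ = *src;
        }
        src++;
    }
    *dst = '\0';
    return s;
}

#ifdef TESTMAIN
/* Test code: compiled only when TESTMAIN is defined. */
static int check(const char *input, const char *expected)
{
    char buf[128];

    strcpy(buf, input);
    strws(buf);
    if (strcmp(buf, expected) != 0) {
        printf("FAIL: got \"%s\", expected \"%s\"\n", buf, expected);
        return 1;
    }
    return 0;
}

int main(void)
{
    int errors = 0;

    printf("Testing strws module\n");
    errors += check("a  b\tc", "a b c");
    errors += check("  leading", "leading");
    errors += check("", "");
    if (errors == 0)
        printf("strws: all tests passed\n");
    return errors;
}
#endif /* TESTMAIN */
```

Compiled without TESTMAIN, only strws remains; compiled with -DTESTMAIN, the file becomes a self-checking test program.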

In C, programmers commonly use file scope as a means of hiding data internal to a module. That is, data shared between functions within a module but not accessed directly by code outside the module will be declared as static. Using this technique, sometimes referred to as information hiding, the static data can only be accessed through the functions representing the external interface to the module. Automated unit test code can access this hidden data to perform some internal consistency tests in addition to exercising the normal interface to the application code. Verifying the consistency of this hidden data, or white box testing, tests not only what the application is doing, but how it is done.
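As a sketch of this idea (the stack module below is my own illustration, not from the article's listings), a test main in the same file can exercise the external interface and also check the hidden static data directly:

```c
#include <stdio.h>

/* Hypothetical module: a fixed-size stack whose storage is hidden
   behind file-scope (static) data. */
#define STACK_MAX 16

static int stack[STACK_MAX];   /* hidden from other modules */
static int top = 0;            /* number of elements on the stack */

int stack_push(int value)
{
    if (top >= STACK_MAX)
        return -1;             /* stack full */
    stack[top++] = value;
    return 0;
}

int stack_pop(int *value)
{
    if (top <= 0)
        return -1;             /* stack empty */
    *value = stack[--top];
    return 0;
}

#ifdef TESTMAIN
int main(void)
{
    int v, errors = 0;

    printf("Testing stack module\n");

    /* Black box: use only the external interface. */
    stack_push(10);
    stack_push(20);
    stack_pop(&v);
    if (v != 20)
        errors++, printf("pop: got %d, expected 20\n", v);

    /* White box: because the test main lives in the same file,
       it can inspect the static data directly. */
    if (top != 1)
        errors++, printf("internal top is %d, expected 1\n", top);

    if (errors == 0)
        printf("stack: all tests passed\n");
    return errors;
}
#endif /* TESTMAIN */
```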

During program development, a C program containing automated unit testing code would be compiled with the TESTMAIN symbol defined. This produces an executable test file. For example, the test program for the program in Listing 1 could be created using the Turbo C command-line compiler with the following command:

tcc -DTESTMAIN strws.c
This generates a file named strws.exe that contains the function to be tested and a main program that tests the function.

After the test program executable has run without errors, it can be compiled without the TESTMAIN symbol defined to produce an object file that can be linked with other modules. For example, an object file for the program in Listing 1 could be created using Turbo C with the following command:

tcc -c strws.c
This generates a file named strws.obj that can be linked directly with other modules that need the strws function. Or strws.obj can be added to a library, making it available when needed. There is no conflict in having multiple object files that use automated unit testing. When the symbol TESTMAIN is not defined, the main program section of the file is ignored by the compiler. No extra space is needed in the object file, no matter how long the test program is, if TESTMAIN is not defined.

By defining the TESTMAIN symbol on the command line, you can compile two different versions of a source file without having to edit the source file. The process can be simplified even further by creating a MAKEFILE for Turbo C's make utility that automatically defines TESTMAIN when an executable file is the target. The following are the make rules that can be put in your MAKEFILE:

.c.exe :
     tcc -DTESTMAIN $<

.c.obj :
     tcc -c $<
Given the above rules, building an executable test program becomes:

make strws.exe
Building an object file that can be linked with other files becomes:

make strws.obj

Guidelines for Development

Building an automated test executable can be simple. To help in developing an automated testing process, I offer you these guidelines:

Keep the test code in the same source file as the application code it tests. This makes it just about impossible to lose track of your test code. It is always there, and you are more apt to keep it up to date.

If no errors are encountered, the only output from the program should be one line at the beginning identifying the module and one line at the end indicating the test succeeded. Why bother looking at extra output if you don't need to? All you really want to know is "does it pass the tests I created?"

If errors are encountered, enough information should be printed to identify the symptoms and the exact location of the code performing the test. Too little output after encountering an error requires you to go back and modify the test code to give more information. At a minimum, the result obtained from the test, and the result expected, should be printed.
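One way to meet both of these output guidelines is a small reporting macro that is silent on success and, on failure, prints the location along with the result obtained and the result expected. The macro and the square function below are illustrations of mine, not from the article's listings:

```c
#include <stdio.h>

/* Silent on success; on failure, print file, line, the result
   obtained, and the result expected, and count the error. */
#define CHECK_INT(got, expected)                                  \
    do {                                                          \
        long g = (got), e = (expected);                           \
        if (g != e) {                                             \
            printf("%s(%d): got %ld, expected %ld\n",             \
                   __FILE__, __LINE__, g, e);                     \
            errors++;                                             \
        }                                                         \
    } while (0)

/* Hypothetical function under test. */
int square(int n) { return n * n; }

#ifdef TESTMAIN
int errors = 0;

int main(void)
{
    printf("Testing square module\n");
    CHECK_INT(square(3), 9);
    CHECK_INT(square(-4), 16);
    if (errors == 0)
        printf("square: all tests passed\n");
    return errors;
}
#endif /* TESTMAIN */
```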

You should not use other files in the process of testing, if possible. External files have a tendency to get lost (especially during directory cleanup campaigns). If the module being tested reads data from a file, have the main program create the file. A series of fprintf statements can be used to create simple ASCII files. If more complex files are needed, there are probably functions in the module or in another module for creating them.
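A sketch of this guideline (sum_file is a hypothetical function of mine, not from the article): the test main writes its own input file with fprintf, runs the test against it, and removes the file afterward:

```c
#include <stdio.h>

/* Hypothetical function under test: sum the integers in a text
   file, one per line.  Returns 0 on success, -1 if the file
   cannot be opened. */
int sum_file(const char *name, long *total)
{
    FILE *fp = fopen(name, "r");
    long n;

    if (fp == NULL)
        return -1;
    *total = 0;
    while (fscanf(fp, "%ld", &n) == 1)
        *total += n;
    fclose(fp);
    return 0;
}

#ifdef TESTMAIN
int main(void)
{
    FILE *fp;
    long total;
    int errors = 0;

    printf("Testing sum_file module\n");

    /* Create the input file here instead of relying on an
       external file that may get lost. */
    fp = fopen("sumtest.dat", "w");
    if (fp == NULL) {
        printf("cannot create test file\n");
        return 1;
    }
    fprintf(fp, "10\n20\n30\n");
    fclose(fp);

    if (sum_file("sumtest.dat", &total) != 0 || total != 60) {
        printf("sum_file: got %ld, expected 60\n", total);
        errors++;
    }
    remove("sumtest.dat");     /* clean up after the test */

    if (errors == 0)
        printf("sum_file: all tests passed\n");
    return errors;
}
#endif /* TESTMAIN */
```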

If you encounter a bug in a module containing automated unit testing, you should add a new test case to the test portion of the module before fixing the bug. Then you should run this revised test program to verify that the new test case demonstrates the bug. After fixing the bug, run the test program again to ensure that the bug is fixed, and that the fix did not break anything else in the module.

If the module allocates dynamic memory, and you expect the memory to be freed by the end of the program, special checks can be added to verify that all memory has been released. This is not as straightforward as it may seem. I added this rule as a result of my last use of automated unit testing. However, I have not written the support functions that would be needed. The Turbo C and Microsoft runtime libraries provide functions that will let you walk through the heap. You will need additional code to walk the heap and add up free space.

There are a variety of public domain support functions for finding errors caused by a program writing beyond the memory that has been assigned. The test code can replace Standard C library dynamic memory functions (malloc, calloc, realloc, and free) with a set of C functions (my_malloc, my_calloc, my_realloc, and my_free) that take the steps needed to detect common dynamic memory errors. They should be linked with automated unit test code during testing but can be mapped directly to the Standard C library functions after testing is complete. This swapping of calls can be handled automatically by the C preprocessor using:

#ifdef TESTMAIN
#define malloc my_malloc
#define calloc my_calloc
#define realloc my_realloc
#define free my_free
#endif
When TESTMAIN is defined, all references to the standard library functions will be replaced by the debugging (my_...) version. The statements above are ignored when TESTMAIN is not defined, and the normal library functions will be called.
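One possible shape for the debugging versions (the counting scheme and the my_alloc_check reporting function are my own sketch, not the article's code; a fuller implementation would also add guard bytes around each block to catch overruns):

```c
#include <stdio.h>
#include <stdlib.h>

/* Count of blocks allocated but not yet freed. */
static long blocks_in_use = 0;

void *my_malloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        blocks_in_use++;
    return p;
}

void *my_calloc(size_t nmemb, size_t size)
{
    void *p = calloc(nmemb, size);

    if (p != NULL)
        blocks_in_use++;
    return p;
}

void *my_realloc(void *old, size_t size)
{
    void *p = realloc(old, size);

    if (p != NULL && old == NULL)
        blocks_in_use++;       /* realloc(NULL, n) acts like malloc */
    return p;
}

void my_free(void *p)
{
    if (p != NULL) {
        blocks_in_use--;
        free(p);
    }
}

/* Call at the end of the test main: report any leaked blocks.
   Returns the number of problems found (0 means all freed). */
int my_alloc_check(void)
{
    if (blocks_in_use != 0) {
        printf("%ld block(s) still allocated\n", blocks_in_use);
        return 1;
    }
    return 0;
}
```

The test main would call my_alloc_check as its final step, so a leak fails the test just like a wrong result would.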

If the module opens and closes files, you should check that all files are closed by the end of the program. This is another rule that I have added recently and for which I have not yet written support. A simple method is to supply functions such as my_fopen and my_fclose that count how often they have been called. Switching between the debug versions and the regular versions can then be handled automatically by the preprocessor (just as we did with memory allocation calls):

#ifdef TESTMAIN
#define fopen my_fopen
#define fclose my_fclose
#endif
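A counting sketch along those lines (my_file_check is my own addition for reporting, not from the article):

```c
#include <stdio.h>

/* Count of files opened but not yet closed. */
static int files_open = 0;

FILE *my_fopen(const char *name, const char *mode)
{
    FILE *fp = fopen(name, mode);

    if (fp != NULL)
        files_open++;
    return fp;
}

int my_fclose(FILE *fp)
{
    int status = fclose(fp);

    if (status == 0)
        files_open--;
    return status;
}

/* Call at the end of the test main: report any files left open.
   Returns 0 when every opened file has been closed. */
int my_file_check(void)
{
    if (files_open != 0) {
        printf("%d file(s) still open\n", files_open);
        return 1;
    }
    return 0;
}
```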

Conclusion

Automated unit testing requires more effort to produce than conventional testing. This extra effort at the beginning more than pays for itself during the life of the application software being developed. I advocate an attitude that says a program is not complete until it has an automated unit testing section. When I move code that has automated unit testing from one operating system to another, testing it on the new system is easy and quick. Even verifying the reliability of a new release of a C compiler is as simple as running a batch file that builds and runs automated unit tests. I never lose the testing code, and I don't have to remember exactly what files are needed for testing, or how to run the test. I don't have to go back and look at the code unless the test program fails. The hardest part of this approach is making a habit of it in the beginning.

Try automated unit testing. After seeing the results, you may agree that everyone should write code this way.