February 1991/Automated Software Testing

Automated Software Testing

Robert McLaughlin

Robert McLaughlin is Principal Engineer for Check*mate. He has been a C programmer for 12 years. Bob has written several articles and is an author of "Fix Your Own PC," MIS Press 1990. His interests include collecting mathematical puzzles and ocean watching. Mr. McLaughlin can be reached at PRA, 1953 Gallows Rd. Suite 350, Vienna VA 22182. (703) 883-2522.
Software development is generally characterized by a cycle as shown in Figure 1. The first step is developing a specification. The second step is validating this specification. The third step has two parts that occur at the same time: coding based on the specification, and developing a test plan. These two parts are generally done by independent groups. The fourth step is validating the coding through debugging. The fifth step is validating that the code meets the specification. The sixth step is customer acceptance, verifying that the delivered product works and meets the customer's interpretation of the specification.
In his paper, "Applying Automation to the Test Process" [1], David Godfrey discusses how changes increase in cost exponentially as they are made later in the development cycle. The more work put into the specification, the less the software will cost, since fewer omissions will have to be corrected. The sooner bugs are caught, the less it will cost to fix them. Also, the fewer bugs in the delivered software, the greater the customers' confidence that the code works, speeding up customer acceptance.
The process of specification development is human-to-human interaction. No program can ever make this process easier. All you can hope for is that the specification can be made clearer. Prototyping helps, for example, as does using an abstract language such as ASN.1 [2]. Until users understand the specification process enough to ensure that the specification is correct, specification development will be the area in which the greatest number of bugs are introduced.
The benefits of automating the testing process are two-fold. It helps ensure that the code delivered meets the specification and is free of most bugs. Automated testing is not a panacea. It cannot ensure that your code is 100 percent bug-free, nor can it solve problems caused by a poor specification. Automated testing is a program testing a program. If there is a bug in your testing program, it will cause good code to be declared bad.

Automated Testing
One way to develop and automate a test plan is to use a product called Check*mate. Check*mate is a set of Microsoft C library routines that enable automated testing of serial telecommunications through two serial ports. The product is one of many automated testing packages on the market. Check*mate's niche is telecommunications.
Other testers exist for different applications. I suggest that you examine them all before purchasing one. None of these packages are actually all-purpose. Because each product has its own market niche, one particular product may meet your needs far better than the others.
Two other automated testing packages I will mention here are Autotester and the Atron Evaluator. Autotester, from Software Recording Corp., is a playback-capture tool for telecommunications. It captures keystrokes and responses to the program under test to allow playback for regression testing. The Atron Evaluator is a playback-capture tool used on PCs running OS/2 or MS-DOS. The Evaluator works with one PC controlling another PC using special hardware. The Evaluator can play back the details of a session, including mouse movements. Both packages allow you to edit captured keystroke scripts. Autotester also allows a certain amount of branching.
Regression testing is the process of retesting. Say, for example, you change the code that controls the way your program handles queues. Ideally, each subroutine that calls the queuing features should be retested. This retesting process is called regression testing.
This can quickly become tiresome. In fact, it is in regression testing that programmers take the greatest liberty with "Oh yeah, it works." So automating this process is very useful, since it ensures that changes have not introduced new bugs.
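As a minimal sketch of what an automated regression check can look like, assume a small fixed-size queue. The names Queue, q_put, q_get, and regress_queue are hypothetical and do not come from any of the products discussed here. The suite reruns a fixed set of checks and returns the number of failures, so it can be rerun unchanged after every change to the queuing code.

```c
#include <assert.h>
#include <stddef.h>

#define QSIZE 4

/* Hypothetical fixed-size FIFO queue, used only for illustration. */
typedef struct {
    int data[QSIZE];
    size_t head;    /* index of the oldest element */
    size_t count;   /* number of stored elements */
} Queue;

void q_init(Queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Returns 1 on success, 0 if the queue is full. */
int q_put(Queue *q, int v)
{
    assert(q != NULL);
    if (q->count == QSIZE)
        return 0;
    q->data[(q->head + q->count) % QSIZE] = v;
    q->count++;
    return 1;
}

/* Returns 1 and stores the oldest value in *v, or 0 if the queue is empty. */
int q_get(Queue *q, int *v)
{
    assert(q != NULL && v != NULL);
    if (q->count == 0)
        return 0;
    *v = q->data[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    return 1;
}

/* Regression suite: returns the number of failed checks. */
int regress_queue(void)
{
    Queue q;
    int v = 0, failures = 0, i;

    q_init(&q);
    if (q_get(&q, &v) != 0) failures++;         /* empty queue refuses a get */
    for (i = 0; i < QSIZE; i++)
        if (q_put(&q, i) != 1) failures++;      /* fill to capacity */
    if (q_put(&q, 99) != 0) failures++;         /* full queue refuses a put */
    if (q_get(&q, &v) != 1 || v != 0) failures++;
    if (q_put(&q, QSIZE) != 1) failures++;      /* this put wraps to slot 0 */
    for (i = 1; i <= QSIZE; i++)
        if (q_get(&q, &v) != 1 || v != i) failures++;   /* FIFO order kept */
    if (q_get(&q, &v) != 0) failures++;         /* drained queue is empty */
    return failures;
}
```

A nonzero return from regress_queue after a change means the change broke behavior that previously worked, which is exactly the question regression testing asks.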

Review Of Testing Methods
A great deal of money is spent on testing (currently over half the budget of most large-scale projects), so a lot of thought has gone into how to improve testing.
In spite of the efforts to improve testing, there is no cure-all. Just as PL/I was going to save us all from ourselves 14 years ago, each new idea in testing is promoted to save us from ourselves. Computers do only what we tell them to do. If our methodology is bad, then no tool will save us from ourselves. If the specification process is sloppy, no test procedure will help us deliver the code the customer needs.
Testing should always be done from the standpoint of the customer. In a database application your concern may be in file structures. But in the customer's mind, bugs are screens that don't ask the right questions or reports that are hard to read. In a telecommunications application, you must demonstrate that the product conforms to some standard, such as X.25 [3]. You can do this by performing tests in accordance with some other standard, such as ISO-8882 [4].

Testing All The Code
The difficulty that many programmers have with testing is developing a test plan. They do not like the idea of someone looking over their shoulder. However, a good test plan should be based entirely on the specification and should be done by a group independent of the programmers. A test plan will not work if the test code is based on the code to be tested or written by those who have knowledge of the code that is to be tested. This means that a second group parallel to the programmers is required. This second group helps ensure that the code meets specification.
A test plan must exercise both all the features of the program and all the segments of the code. This task is not as simple as it sounds. A typical feature-based test plan tests only 80 percent of the code. One reason is that some of the code is borrowed from other programs and never used in the current program. Also, a feature test does not test error conditions and other paranoid conditions that programmers write code for.
The difficulty even with well-tested code is that conditions occur in the field that were not tested for. The only way to ensure that the code does not break down in an unexpected manner is to use a profiler, such as the UNIX utility prof, to ensure that every line is being exercised and that all calls to all routines are used. If this is not done, then a program that passes a test plan may break down in the field because a subroutine was not tested under all conditions. This is especially true for routines that handle queuing or linked lists.
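To illustrate why a feature test can leave parts of a linked-list routine unexercised, here is a small sketch with hand-placed counters standing in for what a profiler would report. Everything in it (Node, list_remove, the hits table, branches_missed) is hypothetical and written for this illustration; a real profiler collects such counts without changes to the source.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* One counter per path through list_remove. */
enum { BR_EMPTY, BR_HEAD, BR_INTERIOR, BR_NOT_FOUND, BR_COUNT };
static unsigned long hits[BR_COUNT];

/* Unlinks the first node holding value; returns it, or NULL if absent. */
Node *list_remove(Node **head, int value)
{
    Node *prev, *cur;

    assert(head != NULL);
    if (*head == NULL) {                    /* empty list */
        hits[BR_EMPTY]++;
        return NULL;
    }
    if ((*head)->value == value) {          /* match at the head */
        cur = *head;
        *head = cur->next;
        cur->next = NULL;
        hits[BR_HEAD]++;
        return cur;
    }
    for (prev = *head, cur = prev->next; cur != NULL;
         prev = cur, cur = cur->next) {
        if (cur->value == value) {          /* match after the head */
            prev->next = cur->next;
            cur->next = NULL;
            hits[BR_INTERIOR]++;
            return cur;
        }
    }
    hits[BR_NOT_FOUND]++;                   /* value not in the list */
    return NULL;
}

/* Clears all counters before a test run. */
void coverage_reset(void)
{
    int i;
    for (i = 0; i < BR_COUNT; i++)
        hits[i] = 0;
}

/* Returns how many of the four paths were never executed. */
int branches_missed(void)
{
    int i, missed = 0;
    for (i = 0; i < BR_COUNT; i++)
        if (hits[i] == 0)
            missed++;
    return missed;
}
```

A test that only removes the first element of a list exercises one path and leaves three untested; the empty-list and not-found paths are the kind of error conditions a feature-based plan tends to skip.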
The development of a test plan is not an easy matter. Several decisions must be made. To what extent is the software to be tested? That is, how much testing needs to be done to get a feeling that the program works? How important is it that features meet the specification exactly? How much regression testing must be done to ensure that a bug fix does not introduce more bugs?
With these questions answered, it is possible to devise a test plan. The test is generally based on the specification and is designed to allow no more than one bug in 1,000 lines of code. Some applications require that bugs occur no more frequently than one bug per 1,000,000 lines of code. The allowed bug rate and how fuzzily features can match the specification determine the type and extent of testing. In telecommunications, the code must have the exact features of the specification and generally not more than one bug in 10,000 lines of code.
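The arithmetic behind these rates is simple but worth making concrete. The sketch below uses a hypothetical helper, allowed_bugs, and a hypothetical project size of 50,000 lines; neither comes from a real project.

```c
#include <assert.h>

/* Largest number of bugs a test plan tolerates in `lines` lines of code
   when it permits at most one bug per `lines_per_bug` lines. */
long allowed_bugs(long lines, long lines_per_bug)
{
    assert(lines >= 0 && lines_per_bug > 0);
    return lines / lines_per_bug;
}
```

For 50,000 lines, allowed_bugs(50000L, 1000L) gives 50 under the general rate, allowed_bugs(50000L, 10000L) gives 5 under the telecommunications rate, and allowed_bugs(50000L, 1000000L) gives 0, meaning the strictest rate tolerates no known bugs at all in a program of that size.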

Some Words On Check*mate
Check*mate is a program within a program. It is composed of two parts. One part is an environment that handles I/O and does a certain amount of multitasking under MS-DOS. The second part is the user's script.
The script is a Microsoft C program that is compiled and linked into an .EXE file using the standard Microsoft tools. For it to mesh with the environment part, it requires an include file called cm.h. Also, at link time, the Check*mate libraries must be linked in.
The use of C gives the tester great power, but this great power can drive people mad. For simple regression testing, where keys need to be played back and exact responses compared, Autotester is far better. In a situation where different responses can be returned depending on the state of the process under test, as is the case in X.25, using C makes sense.
The example of automated testing (Figure 1) is devised more for clarity than for practical utility.
I assume the program under test is running on a UNIX machine that can be dialed into, and that the program can be run over a modem. It does not matter for my purposes where the program under test lives. It does matter that I ensure my testing program sets up the serial port correctly, logs into the machine, starts the program to be tested running, tests the program, logs off, and resets the serial port as required. Since setting up and resetting the serial port are common to many tests, I have placed them in an include file. For the same reason, I have done the same with logging on and off the target machine.
The customer specification is seen in the box Customer Request. It is short and to the point. It does not, however, discuss how the program is to end.
In reply to the customer request, a Specification is written. It is shown in the Specification box. It says what the Customer Request says but in technical phrasing. Note that it, too, omits how the program is to end.
Listing 1 is the program as coded according to the specification. It reads in a character. If the character passes islower, it is turned into an uppercase letter, using toupper. No matter what, the character is echoed. Since nothing ever specified how the program is to end, the program does not end.
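For readers without the listing at hand, the behavior described above can be sketched as follows. This is not the author's Listing 1; the function names convert_char and echo_upper are hypothetical. One assumption is added so that the sketch can actually be run: it stops at end of input, a point the specification in the article never settles.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Returns the character to echo for one input character:
   lowercase letters become uppercase, everything else is unchanged. */
int convert_char(int c)
{
    if (islower(c))
        return toupper(c);
    return c;
}

/* Copies `in` to `out`, converting each character on the way, and
   returns the number of characters echoed.
   ASSUMPTION: the loop ends at end of input; the program described
   in the article never ends because nothing specified how it should. */
long echo_upper(FILE *in, FILE *out)
{
    long echoed = 0;
    int c;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        putc(convert_char(c), out);
        echoed++;
    }
    return echoed;
}
```

Calling echo_upper(stdin, stdout) from main gives a filter that can be exercised by hand or driven by a test program.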
Based on the specification, the quality control department writes a test plan, which is shown in the Test Plan box. This may be written in English, as I have done here, or in TTCN or some other abstract language. In telecommunications, TTCN is the language of choice for test plans. Since the specification says nothing about how the program is to end, the test plan does not mention testing whether the program ends correctly.
Listing 2 is the Check*mate-based testing program. The testing program is bigger and more complex than the program under test, which is often the case. This is tolerable because in many testing situations the code is too complex for any other means to be reliable. The test code assumes that the include files setups.h and log.h contain the routines to set up the serial port and to log on and off the target machine.
The test code uses three Check*mate routines: TX_chr, RX_chr, and CM_log. TX_chr places the character in a transmit queue; Check*mate sends that character when it is next up in the queue. RX_chr gets the next character from the receive queue. CM_log puts a formatted message in a log file. We will examine the log file to see how well the test did. Check*mate includes routines to sort through the receive queue for a string, to time response, etc.
Listing 2 goes slightly beyond the test plan. It tests not only to see if a lowercase letter is turned into uppercase, but also if other letters are echoed back as they were sent — something implied by the test plan. The testing program uses a BREAK to cause the program under test to end. The testing program logs a failure by noting that a character returned was not the one expected. A count of correctly returned characters is kept to obtain a sense of how badly the program failed.
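The core of such a testing program can be sketched in portable C. This is not Listing 2 and does not use the Check*mate routines, whose signatures are not given here. Instead, a function pointer of the hypothetical type Echo stands in for "send a character and read the reply," and the log text is invented for the sketch. The two sample implementations, good_upper and buggy_upper, are also hypothetical; buggy_upper assumes an ASCII character set.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for transmitting one character to the
   program under test and receiving its reply. */
typedef int (*Echo)(int);

/* What the specification says should come back. */
static int expected_reply(int sent)
{
    return islower(sent) ? toupper(sent) : sent;
}

/* Returns 1 if the reply to one character is correct, 0 otherwise. */
int check_char(Echo under_test, int sent)
{
    assert(under_test != NULL);
    return under_test(sent) == expected_reply(sent);
}

/* Sends `trials` random printable characters and returns the number
   answered correctly. Failures are written to `log` when it is not NULL. */
long run_test(Echo under_test, long trials, unsigned seed, FILE *log)
{
    long correct = 0, i;

    assert(under_test != NULL && trials >= 0);
    srand(seed);
    for (i = 0; i < trials; i++) {
        int sent = ' ' + rand() % 95;       /* codes 32 through 126 */
        int got = under_test(sent);
        if (got == expected_reply(sent))
            correct++;
        else if (log != NULL)
            fprintf(log, "FAIL sent '%c' received '%c'\n", sent, got);
    }
    if (log != NULL)
        fprintf(log, "%ld of %ld characters correct\n", correct, trials);
    return correct;
}

/* A correct implementation of the specification. */
int good_upper(int c)
{
    return islower(c) ? toupper(c) : c;
}

/* A deliberately wrong one: the range test stops one letter short. */
int buggy_upper(int c)
{
    return (c >= 'a' && c < 'z') ? c - 'a' + 'A' : c;
}
```

A call such as run_test(buggy_upper, 1000, 1u, stderr) will report a failure only if the random sequence happens to include the one mishandled letter, which is why a targeted check like check_char(buggy_upper, 'z') is a useful complement to random input.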
In the Sample Session (Figure 2), the Log box shows the log that the test program would create. The first five lines note that Check*mate was loaded, under which version of MS-DOS, with how much free memory, and with how many serial ports. The last four lines are the log messages of the test. The first MSG line notes that the test has started. The second and third MSG lines note failures. The fourth MSG line notes the number of correct characters sent and received. The fifth MSG line notes the end of the test.
The failures are of interest. The first failure is possibly line noise. The second failure is clearly a bug. See if you can find it. In general, the test should be run several times to ensure that no gremlin has caused the random number generator to produce only non-lowercase letters.

Summary
I have attempted to demonstrate the wonders and pitfalls of automated testing. Automated testing does not lessen the difficult job of program specification — it will not help you when you are sloppy. Since automated testing brings in another body of people, quality control, the specification needs to be tighter.
In the example, the failure to find out how the program was to end resulted in the writing of a program that could not be cleanly ended, and a testing program that would pass it. Clearly the customer did not intend the program to run forever. When the code is delivered, the customer will, to the surprise of all, reject it. All that automated testing did in this case is reduce the cost of testing the code according to the specification.
Automated testing cannot test your specification process. However, the mistakes made in this phase will cost the most to fix if they are not caught until the customer acceptance phase.
In this simple testing program, more than 1,000 tests of the program were done. Imagine sitting at a keyboard and typing in 1,000 random letters in less than five minutes. Imagine doing this several times in a row. Imagine finding a bug, and having to do it all over again.
Automated testing will allow greater assurance at a reasonable cost that your code meets specification with minimum bugs. It does not help you assure that the specification meets the customer needs. This area is the next frontier in software engineering.

References
[1] David Godfrey, "Applying Automation to the Test Process," Proceedings of the 6th International Conference on Testing Computer Software, US Professional Development Institute, Silver Spring MD, 1989.
[2] CCITT Blue Book. X.208 and X.209.
[3] CCITT Blue Book. X.25
[4] ISO Draft Standard 8882.
CHE01 Check*mate Users Guide.