October 1994/Powerful Assertions for C++

Debugging

Powerful Assertions for C++

Harald M. Mueller

Harald M. Mueller has been involved in software development for about ten years, starting at the Technical University of Vienna, where he received a Ph.D., then at Siemens Austria, where he is working on projects developing hardware support tools. His current assignment (and interest) is mainly software testing and quality assurance. He may be contacted at mueller@garwein.hai.siemens.co.at, or c/o Siemens AG Austria, EZE 41 Erdberger Laende 26, A-1030 Vienna, Austria.

Introduction
Many programming languages provide means for specifying assertions, i.e. conditions that should hold in every case. Writing assertions helps to catch specification, design, and programming errors by giving necessary conditions that must hold to ensure that continuation is possible [3, 4, 5].
Among the many languages providing assertions, some of them are object oriented, for example, C++ and Eiffel. Eiffel has demonstrated a notable effort to integrate assertions into the C language, having pre- and post-conditions for methods, invariants for classes and loops, and automatic checking during execution [2]. C++, on the other hand, is quite weak in this respect; in most installations, it merely allows use of the assert macro (a holdover from C) to check Boolean conditions at strategic points in a program. Programmers can formulate simple conditions with this macro, as shown by the following fragment:
template <class T>
class Stack {
   public:
      T pop();
      ...
   private:
      unsigned height;
      ...
};
template <class T>
T Stack<T>::pop()
{
   assert(height > 0); // check that there is at least
                          // one element in the stack
   ...
}
By defining and calling special functions, a programmer can in principle check arbitrarily complex conditions. However, besides its ugliness (in my eyes) in defining a new function for each assertion, this approach poses a new problem in C++: to access a class object's private members, this function would have to be a member function or a friend of the class. As an example, observe a general array class with a sorting method:
template <class T>
class Array {
   public:
      void sort();
      ...
   private:
      T elems[size];
};
When creating the sort function, the programmer may want to include an assertion at the end of it to be sure that his function really sorts the elements:
template <class T>
Array<T>::sort()
{
   ... do the sorting ...
   assert(there is no index i=1..size-1 such that elems[i-1] > elems[i]);
}
As this assertion needs a loop to check all elements, it can no longer be formulated as a parameter to the standard assert macro. The programmer could instead reformulate the condition and write
for (int i = 1; i < size-1; i++) {
   assert(!(elems[i-1] > elems[i]));
}
but that would make it impossible to hide the check in a production version by simply defining the assert macro as empty. Moreover, reformulating conditions can introduce subtle bugs.
In C, it is possible to write a separate function
check_ascending(T array[], int size);
which returns 0 if it finds a pair of elements in wrong order and which could then be called in an assert macro:
assert(check_ascending(array,size))
In C++, however, the function must have access to the private members of an Array object, and therefore must either be a member function or a friend to the Array class. Both declarations make it necessary to change the class declaration of Array solely because an assertion is introduced in the implementation of a member function.
Certainly, it would be much better to formulate assertions as shown above by simply writing
assert(!exists(i = 1; i < size-1; i++ : elems[i-1] > elems[i]));
right at the end of the sort method, where the assertion actually belongs.
This article describes a set of powerful macros that support forall and exists operators (or "quantifiers," as they are called in logic texts). I have implemented these extensions to the assert facility solely by means of the standard C++ macro processor and a little bit of overloading. Thus, these extensions require no separate preprocessor or other complication in their use. Moreover, since all the necessary functions are very small, they will all fit in a single header file so that no additional object file or library is necessary. The rest of the article consists of three parts: "What the Extensions Do" explains how to use the new forall and exists operators in a call to the redefined assert macro. The second part, "How the Extensions Work," is quite long, as it explains in detail how these operators are implemented. The third part shows an application that uses the forall operator in a small function (see [5] for more information about assertions).

What the Extensions Do
I have written the extensions so as to blend in with the normal behavior of assert. Thus, you can mostly use assert in the usual manner, without having the extensions get in your way. But when you need the extended behaviors, they're available. To show what the extensions do, and how they interact with C++, I define the syntax of assertions as follows. After including the header file

#include <assert.h>
shown in Listing 1, the (modified) assert macro and a companion, called assertp, are available. Inside these macros (and only there), the assert conditions can be written as follows:
assertion ::=
       'assert' '(' assert-condition ')'
       'assertp' '(' assert-condition ',' stream-output ')'
assert has the same syntax and meaning as defined in the traditional assert.h header file. assertp has the same effect as assert, but additionally it outputs all values given in the second parameter to a special output stream. I introduced this macro because I hate the secrecy of assert about the values that caused the assertion to fail. assertp will output any value desired; it is especially useful with the forall and exists conditions, as knowing which cycle of the internal for-loop failed is important in tracking down an error.
Going now one level deeper into the syntax, an assert condition is a list of subconditions connected by && or //:
assert-condition ::=
       assert-condition '&&' assert-subcondition
       assert-condition '||' assert-subcondition
       assert-subcondition
&& and // obey the usual C++ precedence laws and are shortcut operators — i.e., if the left side of an && (//) yields 0 (1), the right side is not evaluated. This feature may be important when pointers are involved, as in
assert( p == 0 || forall ((q = list; q; q = q->next), *q != *p) );
An assert subcondition may be any of the following:
assert-subcondition ::=
      a usual C++ expression (or more precisely:
                          a logical-or-expression
                          as defined in the ARM in
                          section 5.15 [1])
      not-prefix 'forall' '(' for-specification ','
                          assert-condition ' )'
      not-prefix 'exists' '(' for-specification ','
                          assert-condition ')'
Note that the condition inside a forall or exists can be an arbitrary assert condition, again using foralls or exists. I will demonstrate the power of this syntax via examples later in this article.
A not-prefix is simply a series of zero or more '!' -operators ( one is enough, but this is more general):
not-prefix ::=
// empty
not-prefix '!'
Finally, a for-specification is the parenthesized part of a for statement:
for-specification ::=
'(' for-init-statement ';' expression_opt ';' expression_opt ')'
(The above is copied from the ARM [1], section 6.5.)

How the Extensions Work
The extensions rely heavily upon the macros forall and exists. This discussion focuses mostly on forall — exists is almost exactly the same (just a few zeroes changed to ones) so I don't elaborate on it here. A few notational items: where code is embedded in article text a sequence of four dots represents code not yet explained. For enhance readability, I have also omitted trailing backslashes (for multi-line macros) in the text. Listing 1 contains the complete, unadulterated code.

The assert macro
All assert conditions occur in a call to the assert macro. This wrapper macro is very simple:

#define assert(assert-condition) asstResult = .... assert-condition, asstResult || (action to take if assertion fails)
asstResult is a static variable that holds the value of the assert condition after its evaluation. If assert condition is a C++ expression, the expansion of assert will also be an expression (a comma expression, to be precise), which can be used anywhere where expressions are allowed. If assert condition is not a C++ expression, however, assert itself will become a sequence of statements. I explain how this happens in the following sections.

forall — the Core of the Code
The core of the forall macro will be a standard loop that is exited when the subcondition does not hold; thus,

forall (for-specification, subcondition)
expands to

.... asstResult = 1; for for-specification { asstResult = .... subcondition; if (!asstResult) break; } ....
After the loop finishes, asstResult indicates the overall result of forall: If asstResult is 1, all loop iterations were successful; if it is 0, a loop iteration occurred for which the subcondition was not true. (The initialization of asstResult is necessary in case the loop is not entered at all.) Note that the innocent-looking assignment to asstResult may again be a lot of statements if the subcondition itself includes foralls and exists.

The ! Operator
Up to now, things were simple. From here on, the macro gets a little bit weird. All the following complications result from the wish to insert one of three logical operators, namely !, &&, and // between calls to forall and exists. To this end, the beginning and the end of the forall macro must be parts of expressions that can be connected by the logical operators. By a little trial and error, the following form evolves for forall:

anAssertCt(); .... asstResult = 1; for for-specification { asstResult = .... subcondition; if (!asstResult) break; } .... asstResult
anAssertCt() terminates the expression starting outside the macro call. asstResult at the end starts the next expression. Look at what happens:

assert(cond1 || ! forall(....))
expands to

asstResult = .... cond1 || ! anAssertCt(); .... asstResult = 1; for for-specification { asstResult = .... subcondition; if (!asstResult) break; } .... asstResult , asstResult || (action to be done if assertion fails)
where the outer parts come from the assert macro and part of its parameter, and the inner (indented) lines result from expansion of the forall macro. The role of anAssertCt() as an expression terminator should be clear.
anAssertCt() also serves another purpose: it counts the number of !'s in front of it. This operation works as follows: anAssertCt() returns a result of type AssertCt, a class with an overloaded !-operator:

int asstInvert; // flag; 1 if number of !'s is odd, // i.e. result of forall must be // inverted. struct AssertCt { // empty } asstCt; // the only AssertCt object // ever created AssertCt& anAssertCt() { asstInvert = 0; // initialize !-flag .... return asstCt; } const AssertCt& operator!(const AssertCt& a) { asstInvert = !asstInvert; // a ! has occurred! return a; }
Now observe what happens. The statement

asstResult = .... cond1 || ! anAssertCt();
(the first part of the example shown above) will call anAssertCt() and thereby initialize asstInvert to 0. Since anAssertCt() returns an AssertCt, the operator ! is the overloaded ! for AssertCt, which will toggle the inversion flag. If there is another ! — e.g. in !!forall(...) — this will set asstInvert back to 0, then to 1, and so on, until all !'s are executed. Thus asstInvert indicates whether asstResult must be inverted after the loop.
(As the situation stands, the code would have to cast the result of the operator! to int, and then (in this example) // it with the previous condition cond1. Later I'll show that the // and the && operators must also be overloaded for class AssertCt, and that the cast is not necessary.)
Of course, the inversion flag must be evaluated after the for loop, and since there might be other calls to operator! inside the subcondition, it is necessary to save its value in a local variable, resulting in the following code for forall:

anAssertCt(); { .... int asstInvert = ::asstInvert; // save global asstInvert asstResult = 1; for for-specification { asstResult = .... subcondition; if (!asstResult) break; } if (asstInvert) asstResult = !asstResult; .... } .... asstResult

Conditional Evaluation
In C++, it is necessary to take some shortcuts when evaluating conditional expressions. Consider the following condition:

a || b && forall(....) && c || d
where a, b, c, and d are simple expressions (without any foralls and exists). This condition should be true if either

a is 1 -- and in this case no other part of the condition should be evaluated, or b and forall and c yield 1 -- and then d is not evaluated, or d is 1
Moreover, in the && chain, forall need only be evaluated if b is 1; and c need only be evaluated if forall yields 1 as well. It follows that there has to be a way to suppress the evaluation of the forall condition and subsequent conditions; I do this by inserting a simple if statement that calls a function asstDoEval(int &), which will tell when to run the for loop and when not:

anAssertCt(); { int asstShortCut; // flag to suppress evaluation of following // parts of condition if (asstDoEval(asstShortCut)) { // suppress execution of for loop int asstInvert = ::asstInvert; asstResult = 1; for for-specification { asstResult = .... subcondition; if (!asstResult) break; } if (asstInvert) asstResult = !asstResult; } // end if (asstDoEval()) ::asstShortCut = asstShortCut; // load global asstShortCut } asstResult = ::asstShortCut ? 1 : asstResult
I show later how asstDoEval determines whether the loop should be executed. Function asstDoEval also stores a value in variable asstShortCut which controls whether forall evaluates trailing parts of the assertion. This decision occurs in the last assignment of the forall macro: If asstShortCut is equal to 1, the condition attached to the right is not touched, and asstResult is assigned 1 (I explain why I use 1 in the next section).
asstShortCut's value follows a somewhat complicated route: It is assigned a value in function asstDoEval, and is copied to a global variable before it is finally used. This detour is necessary because there may be more foralls around and in the subcondition, and each of them must maintain its own flag. On the other hand, the evaluation can only happen in the outermost scope, since the end of the macro must be connectable to arbitrary following conditions. Therefore forall copies the local flag to the global flag immediately before its value is needed.

Breaking Conditions Apart
The forall macro shown (it is almost the complete code) brings up two questions:
1. hen is forall itself to be evaluated? The answer of this is determined by the result of asstDoEval(); and
2. hen are the conditions after the forall macro to be evaluated? This will influence the asstShortCut flag.
The first question is easily answered. Consider an arbitrary condition

.... op a_i op .... op a_k op foralL(....) ....
where each op is either // or && and which expands to

.... op a_i op .... op a_k op anAssertCt(); .... forall code ....
Of course, the forall code must execute if and only if anAssertCt() is called. anAssertCt() makes this information public by setting a global flag asstEval:

AssertCt& anAssertCt() { asstInvert = 0; asstEval = 1; // remember // evaluation return asstCt; }
whose value will be returned by asstDoEval. After querying its value, asstDoEval will reset this flag as this value must always be 0 before evaluating a condition:

int asstDoEval(int& asstShortCut) { int eval = asstEval; . . . . asstEval = 0; return eval; }
The second question concerned the asstShortCut flag. For the condition

.... op a_ij op .... op a_nk op forall(....) op rest
must rest be evaluated? That depends. First, is forall itself evaluated? If this is the case (call this case A), rest must be evaluated "together with the result of forall," i.e.,

asstResult = asstResult op rest
But if forall is not evaluated (case B), a second detail must be considered, namely, whether the evaluation so far (i.e. up to and including a_nk) yielded 1 or 0. If it is 1 and forall is not evaluated, a clause before a preceding // must have succeeded, and no further evaluations are necessary. Therefore, in this case (case B1), the code sets

asstResult = 1
(this is why I used the value 1 in the previous section). But if the result is 0 (case B2), the evaluation must continue after the forall; again, I can set

asstResult = asstResult op rest
although this time it is apparent that asstResult is 0. Combining these cases, the last line of the forall macro is

asstResult = ::asstShortCut ? 1 : asstResult
which can be concatenated to the op rest part, and the value of the asstShortCut flag follows from the previous discussion:

int asstDoEval(int& asstShortCut) { int eval = asstIval; asstShortCut = ! asstEval && asstResult; // set to 1 only if previous result - up to ank - was // 1 and we do not execute for loop, i.e. in case B1. asstEval = 0 return eval; }
One complication remains to be solved: In the second line of the code above, asstDoEval() requires the result of the evaluation up to ank! The current implementation already has an

asstResult =
in the next macro call "to the left" (either an assert, a forall, or an exists). The problem is that the condition on the right extends one element too far, namely up to the anAssertCt() call:

asstResult = ...op a_ij op ... a_nk op anAssertCt();
To get the correct result, anAssertCt() should therefore return an int value that makes it "invisible" to the rest of the condition — In other words, we need && and // operators that ignore their right parameter. I can easily do this by overloading the && and // operators for a right operand of type AssertCt and have them return their left operand all the time:

int operator&&(int left, const AssertCt&) { .... return left; } int operator ||(int left, const AssertCt&) { .... return left; }
But now there is a new problem: These overloaded operators are no longer "shortcut operators." Unlike the standard && and // operators, they evaluate both operands on each call, regardless of the value of the left operand. This is harmful because in the evaluation of the right operand, which always contains anAssertCt(), the code sets the asstEval flag — which is now set too often! To correct this problem, I reset this flag if an anAssertCt() was called "inadvertently":

int operator&&(int left, const AssertCt&) { asstEval = left; // reset asstEval if left == 0 return left; } int operator (int left, const AssertCt&) { asstEval = !left; // reset asstEval if left == 1 return left; }
The cast-to-int-operator of class AssertCt anticipated at the beginning of this section is now no longer necessary — almost. If a condition starts with a forall, the resulting code is

asstResult = anAssertCt(); ...
which is the only place requiring an assignment of an AssertCt to an int. However, this assignment can be avoided by starting each condition with

0 || ...
which does not change its value. This is necessary in the assert macro, which now becomes

#define assert (assert-condition) asstResult = 0 || assert-condition, asstResult || (action to be done if assertion fails)
and in the forall macro, where the for loop becomes

for for-specification { asstResult = 0 || subcondition; if (!asstResult) break; }
avoiding a dangerous implicit cast to int and providing a sound implementation of forall.
I've just presented the complete implementation of forall. To get an overview of this implementation, you can study Listing 1, where you will find all of the parts again. In the actual implementation, I replace the forall macro by a more general macro asstLoopTest, which implements both forall and exists. All the functions are declared inline, and all the flags are static. Together, this yields an implementation confined to a single header file, without the need to link a separate object file or library to the executable.

An Application
In this section I demonstrate the use of the macro extensions with an example. Consider a sort function which can sort any array of different elements (sort functions such as qsort in the C library usually also deal with equal elements in an array, but this complicates the assertions a little bit):

template <class T> class Array { public: T operator[](unsigned index) const; unsigned size() const; ... }; template <class T> void sort(Array<T>& array) { ... sorting algorithm ... }
Which assertions are useful for this function? Consider the possible post-conditions first. Obviously, one post-condition of this function is that all the elements should be sorted; in other words, elements should increase in size as the array is traversed. This condition is easy to implement with a forall quantifier:

Post-condition 1: assert( forall((int i = 1; i < array.size();i++), array[i-1]<array[i]) );
What would the output of a program look like if Post-condition 1 was violated? As follows:

Assertion in line 117, file sort.cxx failed for condition 'forall( (int i = 1; i < array.size(); i++), array [i-1] < array[i])'
Of course, this is an important message, but tells nothing about the index where the catastrophe happened.
It is possible to formulate more post-conditions for sorting. Imagine a sorting algorithm that simply replaces the elements of the array by 1, 2, ..., size. These (new) elements would be in ascending order and would thus fulfill the first post-condition, but this could hardly be considered a valid sorting algorithm for an array. Another post-condition therefore specifies that each element in the resulting array must have been present in the original one; at the same time, the size of the array must not have changed during sorting. (Did you ever realize that a sorting routine could do so much evil?).

Post-condition 2: assert( array.size() == old(array).size() && forall ((int i = 0; i < array.size(); i++), exists ((int j = 0; j < array.size(); j++), array[i] == old(array)[j] ) ) );
The old(array) construct introduced in this assertion refers to the value of the variable array before the changes made by the sorting algorithm. C++ does not automatically store an "old" value at the beginning of each function. (Eiffel has a facility for this.) Therefore I provide a macro

saveold(T,v)
to the assert macros that assigns the variable v of type T to a variable with the same name, but prefixed with old_. The old macro then accesses this variable. The call to saveold must be inserted "by hand" into the function:

template <class T> void sort(Array<T> array) { saveold(Array<T>,array); ... sorting algorithm ... ... Postcondition 1... ... Postcondition 2... }
As a last example, I show a pre-condition to the sort function, namely that all elements in the array must be different:

Pre-condition: assert( forall((int i = 0; i<array.size();i++), ! exists((int j = 0; j<array.size(); j++), i != j && array[i]==array[j] ) ) );
or in words: For all indices i, there is no index j different from i (i != j), where array[i] equals array[j].

Loop variables
In each of the foralls and exists in the examples given above, I defined a local loop variable i or j. Inside each forall and exists macro, there is a (hidden) block which contains a corresponding for loop; therefore, these variables can only be accessed inside the immediately following assert condition. Suppose I want to output information about the loop variables when an assertion fails? In this case, I must declare the loop variables at global scope, as in:

int i; assertp(forall((i = 1; i <array.size();i++), array[i -1]<array[i])); ,"index" << i);
Now the output will be more informative than the one shown previously:

Assertion in line 117, file sort.cxx failed for index 7
where the 'index 7' part of the output comes from the second parameter of assertp.
After the message prints to the stream cerr, both assert and assertp will call exit(125) by default. However, both the output stream and the exit action can be changed by simply redefining macros ASSERTSTREAM and ASSERTACTION, respectively. (For compilers supporting exception handling, you can redefine ASSERTACTION to throw an exception instead of exiting the program. But look out — due to the definition of assert as an expression, ASSERTACTION must be an expression, too.)

Restrictions
A close look at the syntax will reveal a restriction of assert conditions when compared to usual C++ expressions. An assert condition cannot be put inside parentheses. Thus, the following is not a valid use of an assert condition:

assert(array.size() > 0 && && (array.elements() == 0 || forall((i = 1; i < array.size(); i++), // error! array[i] > array[0]) ) );
Here, the forall-condition erroneously occurs in a parenthesized sub-expression for precedence purposes, which will result in a compile-time error. The reason is, of course, that forall and exists are implemented using for loops, which are statements; however, there is no way in C++ to put a statement inside parentheses. (Unfortunately, the error messages given by compilers in this case are not particularly clear, as the error is detected some way inside the forall or exists macro.) The work-around for parenthesized assert conditions is to rearrange the whole condition until it is no longer necessary to put parentheses around a quantified subcondition. The example above is equivalent to

assert(array.size() > 0); assert(array.elements() == 0 || forall((i = 1; i < array.size();i++), array[i] > array[0] ) );
which will work.
Another restriction of this implementation is that assert conditions can only be used inside assert (but not as conditions in if statements, for example). Actually though, this should not be much of a restriction, as they were only designed for this use.

Conclusion
This article showed that is is possible to implement powerful logical operators in standard C++ without any language extension and with only minor compromises.
Besides some useful code, I hope I have presented some C++ programming tricks that are not obvious in the first place. Of course, I wouldn't advise using such intricate macros under all circumstances; they make syntax errors more difficult to find, and they prohibit looking at the actual source code when debugging. Nevertheless, these macros can sometimes help you to produce more understandable (and therefore reliable) code and to maintain this code through a longer period of time.

References
1. M. Ellis and B. Stroustrup. The Annotated C++ Reference Manual (Addison-Wesley, 1990).
2. B. Meyer. Object-oriented Software Construction (Prentice-Hall, 1988).
3. C.A.R. Hoare. "An Axiomatic Basis for Computer Programming," Communications of the ACM, October, 1979, pp. 576-583.
4. E. Dijkstra. A Discipline of Programming (Prentice-Hall, 1976).
5. D. Gries. The Science of Programming (Springer, 1981).