A Generative Programming Debugging Tool

C/C++ Users Journal May, 2005

Dealing with the compiler

By Jeremy Dickman

Jeremy Dickman works in a proprietary trading house in London, England. He can be contacted at jeremy_dickman@hotmail.com.

Generative programming in C++ is implemented using metaprogramming with templates to write programs that generate code at compile time. The problem with this form of programming is that there are no tools available for debugging what the compiler is doing. In Modern C++ Design [1], Andrei Alexandrescu gave us the compile-time assert. What I propose with this article is the compile-time debug.

Consider Listing 1, a metaprogram that implements a compile-time floating-point number. If you were to use it like this:

CFloat<999, 2> f;
cout << f;       // Will print 99900.0,
// the same as writing cout << 99900.0f;

the compiler would happily generate a single float wherever f is instantiated with a positive exponent. However, you would soon run into problems if you wrote:

CFloat<999, -2> f;
cout << f;   // Want it to print 9.99,
  // the same as writing cout << 9.99f;
  // but it fails to compile!

Why? For a negative exponent, I am only interested in the first branch of the ternary "?" operator. Unfortunately, the compiler tries to instantiate both branches of the ? operator because of the reference to result in CPower in both branches of the operator; for example:

CPower<10, iExponent>::result

This is undesirable for two reasons:

First, the type in the first branch may not be appropriate for use in the context of the type in the second branch, thus causing a compiler error. (In this example, raising to the power of a negative is illegal, but in more contrived metaprograms, you might be trying to access enumerated types, member functions, or nested classes that simply don't exist for the type in question.)
Second, it's inefficient to unnecessarily instantiate the second branch, especially when considering the limited (and arbitrary) resources of the compiler, in terms of the number of pending instantiations [2] and the additional compile time required to instantiate it.

Listing 2, an improved version of the compile-time float, uses the CIfThenElse "construct" [3] to solve this particular problem. But you are still presented with a problem—what is it that tells you that the implementation in Listing 2 is somehow better than that in Listing 1? How do you know what the compiler is instantiating at compile time? Without any feedback, you are left to your own intuition. Have you ever written a program right the first time, without the use of debug?

In this example, what goes on at compile-time isn't obvious, and indeed, the first example appears to work some of the time (with a positive exponent). For more contrived metaprograms, however, things become more difficult and you are at the mercy of the compiler, feeling around in the dark, without any feedback as to what's going on and why. Thus, the need for a debugging tool to provide you with the visibility you desperately need.

Debugging Tool

When writing template metaprograms, I often see errors from the compiler telling me which parameters were provided in the template causing the problem. Consequently, I decided I needed a way of forcing these error messages out of the compiler, yet without causing an actual error and without changing the compiler's error levels into warning levels. After some tinkering, I came up with the simple—and meaningless!—code in Listing 3. Bear in mind I was using the Microsoft Visual C 7.1 compiler.

When an instance of this class is statically contained inside your template under scrutiny—the debuggee—the compiler produces a warning (Figure 1) for each instantiation it tries to compile. The interesting part of the warning message is:

T=CIsPrime<19,18>::This

Thus, you have coaxed the compiler into giving you a (somewhat messy, but better than nothing!) display of each instantiation it is trying to make. For recursively "unrolled" instantiations, the warnings actually become a little bit neater as only the instantiation is displayed. Et voilà! You now have the visibility you need to debug template metaprograms.

Notice that the template parameter T in Listing 3 is not actually explicitly used. You may think this parameter to be unnecessary, but indeed, it's required to make each instantiation of the debuggee template contain the static member variable CMetaProgramDebug as a unique type (and thus force the warning for every instantiation).

Prime Number Algorithm

The debugging tool really comes into its own when used with a more contrived example. I decided to use a prime-number test (hereafter referred to as a "prime-number generator," as it reads better!) because I'd had trouble understanding it in the past. Consider the implementation in Listing 4, which is presented in [4] and [5]. Not only does it initially look daunting, its use is not very intuitive either. To find out whether the number 57 is prime or not, you'd need to type:

CIsPrime<57, 56>::result

where result is either true or false (in this case, false).

P is the number to test to see if it's prime or not. I is used as a counter that counts all the way down from one less than P, being decremented in each instantiation of the recursive template. If any value of I divides cleanly into P (for example, P % I is false), then the result of the right side of the logical OR statement will be false. However, it's important to note that this doesn't stop the recursive template instantiations. In effect, they continue forever, although the specializations of CIsPrime end the recursion.

If you're finding this difficult to follow, now would be a good time to place the debugging tool into the code. Listing 5 shows the prime-number generator with the debugging code in place. When you compile this code, the compiler displays a warning for each instantiation it is making of CIsPrime. You should see a warning message for each instantiation of CIsPrime from CIsPrime<57,56> all the way down to CIsPrime<57,2>, at which point CIsPrime<0,1> is instantiated and the recursion ends. No warning is displayed for the specialization because you didn't put the debugging code in it.

As well as providing information about the instantiations being made, the debugging code lets you see two major things wrong with this algorithm.

First, templates are instantiated all the way down to I=2; that is, the recursion could stop earlier, when the first occurrence of P % I == 0, or in English, when I divides perfectly into P with no remainder. By definition, P cannot then be prime if a number smaller than it is a factor of P.
Second, if you really want to take advantage of this first point, it'd be wise to start I at 2 and count up, rather than start at P-1 and count down.

Both these features are hugely inefficient and knowing that instantiations are at a premium, you need to find a better solution. Indeed, the largest number I could use for P was 1072, and I'm sure you agree, instantiating all 1072 templates seems somewhat unnecessary—especially when you consider that the number 4 is a factor of it! Using the ideas from our two aforementioned findings, you would have only needed to have instantiated templates when I = 2, 3, & 4!

A New Prime-Number Algorithm

Armed with this super new debugging tool that tells you exactly what the compiler is doing at compile time, along with the two ideas for improving the algorithm, you can form a new design. Listing 6 starts counting from I=2, and I also defaults to 2 in the parameter list so that you can write:

CIsPrime<57>::result

which is much more intuitive. It uses two nested classes, NotPrime and FindNextPrime, to provide an enumeration result that either returns false in the case of NotPrime (obviously), or it recurses to the next template instantiation with I+1 in the case of FindNextPrime (also obvious!). The design is already more understandable. Further, you use the CIfThenElse metaprogram instead of the ternary ? operator so that you ensure only one side of the IF statement is instantiated. The final feature is the specialization:

template<int P> struct CIsPrime<P, P>

Remember, you are counting from I=2, all the way up until I=P, at which point no other number has divided into P along the way (if it had, the nested NotPrime class would have been instantiated, ending the recursion). Therefore, you know that P is prime and thus result = true for the specialization.

With the debug code also in place, you can test the code using various numbers of your choice. Using the example just mentioned with P=57, you see that the compiler stops instantiating at I=3. The original algorithm instantiated all 57 templates. In fact, I couldn't readily find an upper limit for the value of P. If the total number of instantiations the compiler could make was 1072 (taken from the original algorithm), then you'd imagine the upper limit would be the 1072^nd prime number, as only prime numbers require the maximum number of template instantiations.

You might agree that the code is cleaner and more readable, and the algorithm is much more efficient, given the limited compiler resources.

Further Improvements

More work could be done on the debugging tool itself to discover perhaps more sources of compiler warnings. For example, using the this pointer in the memberwise initialization list of a constructor of CMetaProgramDebug would produce warning number "C4355." Certainly, discovering more sources of warning messages might give you more choices in making the somewhat messy compiler output more readable?

Unfortunately, I didn't have the resources to try this out on compilers other than Visual C 7.1. (Maybe you would like to try this on other compilers?)

Also, I arrived at the CMetaProgramDebug implementation and didn't bother to search further for neater implementations. Maybe you know of an even cleaner solution (although I don't find three lines of code too obtrusive)?

Finally, you may be wondering, "What is the use in knowing whether a number is prime at compile time?" Maybe some of you have your own ideas you'd like to share? I use it along with another template metaprogram that computes the nearest prime number greater than or equal to a given, arbitrary number. This can then be used to provide compile-time map sizes, with the idea in mind that access to maps is deemed to be optimal if the map size is prime. For example, if I know I have a fixed-size map of say, 10 items, then at compile time, I'd like to know the next prime number greater than 10 to set my map size for optimal performance. In this case, 11 would be the map size. I leave it as an exercise for you to come up with an efficient template metaprogram to do the job!

Conclusion

In this article, I've highlighted the inherent difficulties with template metaprogramming and the need for some kind of feedback from the compiler. In the process, I presented a simple technique that can be used to provide "debug messages" in the form of compiler warnings that help show which instantiations the compiler is making. Finally, I presented a much-improved prime-number generator algorithm as a result of using the information gleaned from debugging a well-known implementation.

References

[1] Alexandrescu, Andrei. Modern C++ Design, Addison-Wesley, 2001, p. 23.
[2] The compiler keeps track of the number of instantiations at a given time. These pending instantiations are all the instantiations started but not yet complete, in unrolling and instantiating a template. This number is arbitrarily chosen by compiler writers, although only 17 is recommended as a maximum by the ISO/ANSI C++ Standard. Thus, it's not only prudent, but often necessary, to minimize the number of instantiations wherever possible.
[3] Implementations of CIfThenElse, CPower, and CAbs can be found in the zipped code archive at http://www.cuj.com/code/, or in [4] & [5].
[4] Czarnecki, Krzysztof and Ulrich Eisenecker. Generative Programming, Addison-Wesley, 2000, p. 500.
[5] Vandevoorde, David and Nicolai Josuttis. C++ Templates, Addison-Wesley, 2002, p. 318.