October 2000/STL & Generic Programming

C/C++ Contributing Editors

STL & Generic Programming: The Template Compilation Model

Thomas Becker

Templates change the way compilers and linkers interact, often in ways that are hard to fathom.

In my last column, I gave you a first, hands-on definition of generic programming. In fact, I lifted this definition from Bjarne Stroustrup's book [1]: generic programming means using types as parameters. We saw that C++ allows us to do this in two ways: we can use types as run-time parameters, and we can use them as compile-time parameters. I also said that for the purpose of understanding and using the C++ Standard Template Library, it is the second kind, the use of compile-time type parameters, that matters most. Finally, I said that in C++, using compile-time type parameters means using templates. Therefore, before we can tackle the STL, we must make sure to have a solid and practical understanding of C++ templates and their usage. Please recall also what I said about the prerequisites that this column assumes on your part. I am not assuming that you have used templates extensively in your practical work. However, I will assume that you have read about them in a C++ book or learned about them in a C++ class.

Every time a fellow programmer walks into my office and says, "Look, I have this code that won't compile, I'm banging my head against the wall here, do you mind taking a look," the odds are nine to one that the problem has to do with templates. Does that mean that templates are inherently hard, accessible only to very smart people, sort of like word problems in mathematics? I think not.

To see why templates have that aura of mystery about them, you first have to understand that templates pose a major headache to the people who write our compilers. C++ templates just don't fit traditional compiler design principles very well. Therefore, even good C++ compilers will often give you cryptic or outright wrong messages on syntax errors involving templates. Moreover, as of this writing, several major compilers, which are state of the art in most respects, do not fully support all template features. Unfortunately, when you try to use an unsupported feature, you will not usually get a message like, "Sorry, partial template specialization not supported." Instead, you will be showered with cryptic error messages. Finally, it is possible for even a respectable compiler to just give up, in a more or less graceful manner, on a template construction that is correct but very complex. These are nuisances that we will have to live with for some time to come. There is not much I can do for you in this respect.

There is one more aspect of templates that makes them tricky to use in practice, and that I can help you with, namely, the compilation model. Most C++ books address this issue either not at all or only very superficially. That's for a good reason: the way template instantiations are compiled and linked is currently highly compiler dependent. Eventually, we should have standardization in this respect, but for now, and probably for years to come, there is hardly anything that can be said that would apply to every major C++ compiler and linker. On the other hand, templates will continue to mystify you until you understand the issues involved in compiling and linking template instantiations. Therefore, I will now take it upon myself to explain these issues. Given the fact that real-life C++ compilers differ in their handling of templates, I will talk about a hypothetical, idealized compiler and how it could possibly deal with templates. As I go along, I will identify a minimal intersection of features that all reasonable C++ compilers should have in common and that you should therefore be able to rely on.

As a starting point, let us look at the one problem that almost every first-time template user stumbles into. Let's declare a simple class template and put it in a header file named TheThingClass.h:
// File TheThingClass.h:

...

template<typename T>
class TheThingClass
{
public:
   T GetTheThing()
   { return m_theThing; }

   void SetTheThing(const T& aThing)
   { m_theThing = aThing; }

private:
   T m_theThing;
};
Next, we create a simple console application with one source file called main.cpp, which looks like this:
// File main.cpp:

#include"TheThingClass.h"
#include<iostream>

int main()
{
   TheThingClass<int> anIntThing;
   anIntThing.SetTheThing(42);
   std::cout <<
      anIntThing.GetTheThing();
   return 0;
}
What's happening here is of course that we are defining an object named anIntThing of type TheThingClass<int>. TheThingClass<int> is the instantiation of class template TheThingClass<T> with the type argument int substituted for template parameter T. The get and set methods just store an int into the TheThingClass<int> object and pull it out again. If this code does not compile, link, and run, then it is fair to say that your compiler does not handle templates at all.

Now suppose we add a third public member function to our class template TheThingClass:
void CrunchTheThing();
Suppose further that the body of this function is lengthy, performing some major arithmetic on the data member m_theThing. Therefore, we do not want to put the code for CrunchTheThing inside the class declaration. Instead, we do what we always do with member function definitions: we create a source file named TheThingClass.cpp, and in this file, we put the following lines:
// File TheThingClass.cpp:

#include"TheThingClass.h"

template<typename T>
void
TheThingClass<T>::CrunchTheThing()
{
   // Lengthy code crunching
   // m_theThing
}
Then we add a call to CrunchTheThing to main:
// File main.cpp:

#include"TheThingClass.h"
#include<iostream>

int main()
{
   TheThingClass<int> anIntThing;
   anIntThing.SetTheThing(42);
   anIntThing.CrunchTheThing();
   std::cout <<
      anIntThing.GetTheThing();
   return 0;
}
Finally, we instruct our compiler to compile both main.cpp and the new source file TheThingClass.cpp, and we tell the linker to link all the resulting object code together. If you are working with an integrated development environment, you just throw both source files and the header file into your project. If there were no template involved, everything would be fine. The object code for CrunchTheThing would be in TheThingClass.obj. The linker would find it there, throw it into the main executable, and resolve the call
anIntThing.CrunchTheThing();
in main accordingly. However, as it is, with TheThingClass being a class template, you are practically guaranteed to get a linker error message like this one:
unresolved external symbol:
"void TheThingClass<int>::CrunchTheThing()"
First off, if this has ever happened to you, and you did not know why, do not let anybody tell you that you made a dumb mistake. I will show you in just a moment that this is in fact a very intelligent mistake. Let us now clarify all this by going through the code and analyzing step by step how our idealized, hypothetical compiler might have dealt with the whole template situation, and why it did not handle the call to CrunchTheThing.

Let's say the compiler compiles file main.cpp first, then file TheThingClass.cpp. When the compiler gets to see main.cpp, the preprocessor has already processed the include statements. This means that the include statements have been replaced with the contents of the respective files. Therefore, our definition of the class template TheThingClass is now at the top of main.cpp. When the compiler encounters this definition, pretty much all it can do is to check for syntax errors. If, for example, you drop a semicolon or a parenthesis somewhere in there, your compiler will most likely tell you about it. However, the compiler can take no further steps towards code generation. It cannot figure out the size or binary layout of TheThingClass objects, and it cannot generate functions such as constructors, a destructor, and the like. All of these actions would require the compiler to know what actual type will be used for T, or at least what the size of that type is. Since this is not known, the compiler does pretty much nothing with our class template declaration except to store it in its bowels.

Next, our compiler eats its way through the header file iostream. Things get interesting again when the compiler gets to see our local variable definition
TheThingClass<int> anIntThing;
To process this definition, the compiler must, at the very least, know the size of an object of type TheThingClass<int>, and it must call to generate a default constructor on this object. Recall that we have not explicitly defined any constructors, so the compiler will have to synthesize a default constructor. What happens now is that the compiler goes back and grabs a copy of our definition for template class TheThingClass, which it had put aside earlier. Then, much like text substitution, the compiler replaces all occurences of the template parameter T with int. The result is an instantiation — a good old plain vanilla class definition. It is as if you had defined:
class TheThingClass
{
public:
   int GetTheThing()
   { return m_theThing; }

   void
   SetTheThing(const int& aThing)
   { m_theThing = aThing; }

private:
   int m_theThing;
};
In fact, when dealing with template-related error messages, it would often be helpful to be able to see the result of the template substitution as shown above. However, I have not heard of a compiler that offers this convenience. Perhaps it is asking a little too much, considering how hard it is to implement templates in the first place.

The compiler may now proceed as if TheThingClass had never even been a class template, and we had written "int" instead of "T" all along. Processing the function body of main is now a matter of regular, non-template C++ compilation. We should, however, take a closer look at the line
anIntThing.CrunchTheThing();
since that's what will cause the error message later. From the compiler's point of view, this line calls a function that has been declared, namely as a member of the class TheThingClass<int>, but whose actual function body is not in this compilation unit. No problem. The compiler does what C and C++ compilers have always done: it marks this as an unresolved external for now. It is left to the linker to find the function body in a different compilation unit and to fill in the function address when everything has been thrown into one executable.

Next, the compiler processes our second source file TheThingClass.cpp. After preprocessing, this file contains the definition of the class template TheThingClass and the function definition of the member function CrunchTheThing. Recalling our discussion of what the compiler did with the definition of TheThingClass before, you can guess what it will do with this entire file: essentially nothing. There will be some syntax checking, but no real code can be generated, because that would require some knowledge about the unspecified type T.

At this point, we have two object files, main.obj and TheThingClass.obj, and the linker must produce a single executable from these. For once, the linker's job is easy. The file TheThingClass.obj is essentially empty. Therefore, all the linker has to do is turn main.obj into an executable. But alas, main.obj contains an unresolved external, namely, the call
anIntThing.CrunchTheThing();
and no function body for CrunchTheThing has ever been compiled. Therefore, the linker fails in its task and produces the error message instead.

Before we remedy the situation, let us get to the deeper reason of the problem. When the compiler was processing the source file TheThingClass.cpp, it should of course have compiled a version of TheThingClass<T>::CrunchTheThing with the template parameter T replaced by int. That is what was needed in main.cpp, and it would have resolved the unresolved external. But to do this, the compiler would have had to carry information from the compilation of main.cpp to the compilation of TheThingClass.cpp. Had the order of the compilation of the files been the other way round, it would have been even worse: after compiling main.cpp, the compiler would have had to go back to TheThingClass.cpp and process it again, this time with the knowledge that an instantiation with T equal to int is needed.

Whether we will ever have this kind of compilation is a question that I will come back to shortly. For now, let us note that the traditional compiler-linker model is based on strictly separate compilation of source files and is therefore incapable of such processing. The compiler creates object code for each source file separately. It is the linker that is responsible for combining the object files and resolving references.

So how do we fix our problem? There is a short answer and a long answer. The short answer is: place all code for the class template in the header file. In our case, this means to just scrap our source file TheThingClass.cpp and copy and paste the definition of CrunchTheThing to TheThingClass.h, following the class definition. You could actually place the function definition inside the class definition, as with SetTheThing and GetTheThing, but that may cause the compiler to inline the function, which we do not want because it is lengthy.

It should be easy for you to see how that solves our problem. When the compiler, during compilation of main.cpp, encounters the call to CrunchTheThing, it now finds the function definition in the same compilation unit. It goes through the same substitution process that we discussed before, replacing T with int, compiles the resulting function, and voila, there is no more unresolved external. This solution is good for almost all practical purposes. In fact, all implementations of the STL that I have seen do just that. Take a look at the STL source code, and you will see that there are hardly any source files at all. Everything is in the header files.

Now you have understood the problem, and you have a solution. But remember, this was only the short answer. Read on for the long answer.

Separate Compilation of Templates

Placing all template code in header files seemed like an easy enough solution to our problem. On second thought, though, you should be rather unhappy with this solution. There are two major problems with it. First of all, think about what happens if you include our header file in a hundred different source files, and each of these uses the class template TheThingClass in the same way as our main function, with the type int substituted for T. What happens is that the function TheThingClass<T>::CrunchTheThing gets compiled within each of these one hundred source files. Without taking any further provisions, the executable would contain one hundred copies of the function body's object code. This can be a major cause of code bloat in C++ programs.

A standard-conforming compiler must eliminate these duplicates. Many of today's C++ compilers, notably Microsoft's Visual C++, now conform to the Standard in this respect. Some compilers that I have worked with in the past required tweaking the compiler options to avoid this type of code bloat. Find out where your compiler stands on this issue. Note, however, that you may still have multiple template instantiations if your program is partitioned into dynamically linked libraries (a.k.a. dlls, dynamic link libraries, or shared libraries). That is because the C++ Standard cannot and will not address issues that arise from the use of such libraries. Also, even if the linker automatically removes duplicates, your compile time still takes a hit. You compile the same code a hundred times, only to throw away ninety-nine copies of the object code. This can be a significant problem in large projects.

The second problem with placing all template code in the header file is that it greatly increases compile-time dependencies. Change one comment in a function definition, and everything that includes that header file recompiles. This may sound trivial if you have always worked on small to medium size projects. For large projects, having too many compile-time dependencies often ranks among the top project risks.

Recall that I said earlier that it was maybe a mistake, but certainly not a dumb mistake to place the function definition of CrunchTheThing in a separate source file. As we just saw, that was a very sensible thing to try. Moreover, believe it or not, the C++ standard requires that it must be possible to do that using the new keyword export. This is referred to as "separate compilation of templates." I do not know of any compiler that currently fully implements this requirement. There are some partial implementations using pragma directives, naming conventions, and the like. If your code needs to be portable, these solutions are no good. Otherwise, find out about them and use them.

Explicit Instantiation

If your compiler does not support separate compilation of templates, but having all template code in header files causes you serious grief because of compile-time dependencies or increased compile times, there is an alternate solution. It guarantees you completely separate compilation of templates, at the cost of a certain maintenance effort. Referring to our example, here's what you do: place the function definition of CrunchTheThing in the separate source file TheThingClass.cpp, exactly as we had it before when the linker error occurred. At the end of that file, place the line
template class TheThingClass<int>;
This will cause what is called an explicit template instantiation. The compiler will now generate all code for our class TheThingClass, with the type int substituted for T. In other words, you are explicitly instructing the compiler to create an instantiation, rather than letting this happen on demand. The obvious disadvantage is that you must manually keep track of all the different types for which your program needs instantiations. Another, rather minor disadvantage lies in the fact that the compiler will be forced to compile all member functions of the class template. When you let template compilation happen on demand, your compiler need instantiate only those functions that are actually being called. A unique advantage of explicit template instantiation is that on many platforms, you can use it to create a single instantiation in one dynamically linked library that is used across several other dynamically linked libraries and the executable.

Reference

[1] Bjarne Stroustrup. The C++ Programming Language, 3rd ed. (Addison-Wesley, 1997).

Thomas Becker works as a senior software engineer for Zephyr Associates, Inc. in Zephyr Cove, NV. He can be reached at thomas@styleadvisor.com.