Features


Simulating C++ Templates in C and C++

John W. Small


Since receiving his BSCS from George Mason University in 1985, John W. Small has run PSW/ Power Software, a software tools and consulting company. He has authored the FlexList C/C ++, Container Lite C++, and COP (C Object Programming) tools as well as EB, the Electronic Book, a hypertext application. He has been programming in C/C ++ for 10/5 years, respectively, as well as in Smalltalk. He can be reached via email at john.small@wdn.com or (703) 759-3838 evenings.

This article outlines a systematic approach to simulating templates in C and C++. You may need to do so because not all C++ translators implement templates, and templates are certainly not a part of conventional C. I begin by demonstrating why you might want to use function and class templates. Then I show how to synthesize function and class templates, with examples in both C and C++. If you are familiar only with C, you should learn a little C++ in the process.

Function Templates

Macros in C and C++ allow you to define generic functions such as:

#define max(x, y) ((x > y) ? x : y)
However, using the #define preprocessor directive circumvents any strong type checking that might be provided by the translator. And it can also introduce undesired side effects for the unsuspecting programmer who assumes max(x, y) is a function call instead of a macro call.

Consider:

t = max(++r, s);
which expands to

t = ((++r > s) ? ++r: s)
If r is greater than or equal to s upon calling max(++r, s), r will be incremented twice. The arguments ++r and s are treated simply as token sequences by the max macro. The preprocessor substitutes them for the formal parameters x and y wherever they appear in the definition of max. Macros are thus said to be called by name. The names of the actual arguments passed to the macro replace the formal parameters in the expansion process.

Such problems can be avoided by coding max as a function:

int max(int x, int y)
   { return (x > y) ? x : y; }
With this definition, the expression

t = max(++r, s);
always leaves r incremented by one instead of sometimes being incremented by two. The max function is said to be called by value. The values of ++r and s are stored in the auto (local) variables x and y respectively. The value of r is incremented before it is stored in x and it can never be erroneously incremented again inside max.

With this solution however, we have lost the ability to define max generically. If we need to compare two float values, we would have to code:

float max(float x, float y)
   { return (x > y) ? x : y; }
Both the int and float versions of max can be simultaneously defined in C++. Unlike C, the newer language allows function names to be overloaded. The types of the arguments in the function call dictate which version the C++ translator calls.

C++ also provides a construct known as a function template that allows us to define a generic family of functions without sacrificing strong type checking. The C++ template approach to defining max would be;

template <class TYPE>
   inline TYPE max(TYPE x, TYPE y)
      { return (x > y) ? x : y; }
A function template is introduced by the keyword template followed immediately by one or more formal parameters, enclosed in angle brackets, that specify types. The function body is the same as a regular function definition except that the template's type parameters can appear anywhere in the definition that a data type name might normally appear. The following code causes the translator to automatically generate three different versions of the max function from its template definition. One definition is for int arguments, another is for long arguments, and yet another is for float arguments.

int r, s, t;
long u, v, w;
float x, y, z;
...
t = max(++r, s);
u = max(++v, w);
z = max(x, y);
The keyword inline signals to the translator to generate inline code instead of invoking a function call each time. Thus you can see that function templates require no more overhead than macros but provide strong type checking and preclude erroneous side effects. Furthermore you can manually override template functions by supplying your own specialization for some combination of arguments:

inline char *max(char* x, char *y)
   { return (strcmp(x,y) >= 0) ? x : y; }
With this definition, the code sequence:

char a[] = "one", b[] = "two", *c;

c = max(a, b);
calls your version of max for strings, preventing the translator from generating a version from the max template.

Class Templates

C++ also provides a construct known as a class template, for which there is no direct counterpart in C. Suppose you need to implement a stack of strings in C (see Listing 1) . Our stack of strings includes functions for all the stack primitive operations, e.g. full, push, top, and pop. The init function is used to initialize a string stack. The main function demonstrates stack usage by building a "To Do" list, which is then streamed to the standard output. To implement a stack for some other data type you would have to clone this code, replacing char * throughout with the desired data pointer type. Of course all occurrences of the name StrStack would also have to be replaced with the appropriate new stack name.

The C++ version of Listing 1 is shown in Listing 2. The first thing you should notice is that C++ automatically makes a structure tag a type definition as well, so that StrStack is now a type. Thus:

class StrStack { ... };
is equivalent to:

typedef class StrStack { ... } StrStack;
Also, in C++ the keyword class is equivalent to the keyword struct except the members of the structure are private in scope by default and thus cannot be accessed from outside the class.

The second thing you should notice is that the functions full, push, top, and pop are now public methods of the StrStack class. They don't make the stack structure any larger, but the C++ translator can now keep track of which functions go with which data structures. The public functions provide a secure interface to the hidden object members, providing an important form of encapsulation. Users of the StrStack class cannot accidently corrupt the member objects in a StrStack class instance (object), for example.

Furthermore, all member functions have an implicit this pointer to a class instance as a hidden formal parameter. To the translator, the definition

char *pop()
   {
   return (items ? itemPtrs[--items]
          : 0);
   }
really behaves like:

char *pop(StrStack *this)
   {
   return (this->items ?
      this->itemPtrs [--this->items]
         : 0);
   }
Thus, pop implicitly knows which StrStack instance it is going to pop. Sometimes the implicit this pointer may need to be explicitly used in an expression to distinguish between a data member and a formal parameter or some other object, when their names conflict.

Member functions defined within the class declaration are considered inline functions. The function StrStack::push(char *) is not inlined since its definition appears outside its class declaration.

The function StrStack() is a constructor for the class, equivalent to the C version's init function. Constructors are used to initialize class instances at their point of definition. Thus

StrStack ToDo;
appearing in the function main not only allocates memory for the ToDo stack but causes the translator to automatically generate a call to StrStack() for you.

Lastly, notice how methods are invoked:

while (ToDo.top())
The implicit this pointer of top is automatically set to point to ToDo. The C++ version of top() doesn't have to check to see if this is a null pointer. By contrast, the C version primitives must always validate stackPtr.

Listing 3 shows how a class template can be used to parameterize the stack class so that it can be reused for any data type without requiring the programmer to clone code. The class template is also introduced by the keyword template followed immediately by one or more formal parameters, enclosed in angle brackets, specifying types or constants. We now use the expression:

stack<char> ToDo;
to define our ToDo stack of strings (or character pointers). Since the template has no second actual parameter, a default value of 5 is used to limit the stack to 5 strings. Defaults can currently be specified only for constants, never types. Thus constant parameters with default arguments must always appear last in a template's formal parameter list. When generating a template class, the translator substitutes the actual parameters supplied or defaulted for the template's formal parameters throughout the template model.

A second stack:

stack<int, 10> CountDown;
defines a stack of integer pointers limited to a maximum of ten items. Upon encountering a new class specification, the translator will only generate the parameterized class if it hasn't already been generated. Please note that a class template's out-of-line member function definition can appear in a header file, since it is only a model with which the translator can generate the actual function definition. Hence a template function definition, in itself, generates no code. Template generation is actually a two-part process, consisting of generating declarations and definitions.

Synthesizing Templates in C++

When you invoke a template, it automatically generates both declarations and definitions. For example, a function declaration names the function and describes its required parameters, while its definition is an algorithmic description for which the translator must generate code. Likewise, a data structure declaration names its type and specifies its layout but doesn't yet reserve any storage. Only when an object is associated for that data type might memory be allocated for it.

The storage class extern turns what would otherwise be an object definition into an object declaration — the object has external linkage with memory allocated for it elsewhere. A function declaration without a defining body generates no code regardless of the linkage you specify. An inline function has a defining body, but generates no code until it is called in a context that is not itself an inline function. For our purpose here, I will call all these declarations, because they only describe. By contrast, definitions generate code or reserve storage for objects.

In our synthesis of templates, we need to emulate separately the process of generating declarations and definitions. Since declarations typically appear in header files while definitions appear in source files, our approach will place the declarations in a header file and the definitions in a source file. The logic for this will be apparent shortly. Listing 4, Listing 5, and Listing 6 show how both function and class templates are synthesized for the function max and the class stack. Listing 4 is the file that generates declarations, Listing 5 the file that generates definitions, and Listing 6 is the source file where these emulated templates are used. We'll call our approach to emulating templates "form templates."

Layout Rules

1) Since more than one form template may appear in a generating file, each form template must be encased within its own conditional preprocessor directive, as in:

#ifdef NAME  /* open NAME envelope */
 ...
#endif       /* close NAME envelope */
which I call a form-template envelope. Here, NAME is the name of the function or class form template. For C++, function form template envelopes are named the same as their first template parameter. See if you can pick out the form template envelopes in
Listing 4 and Listing 5.

2) Form template parameter names must be unique within a generating file. For example, the following C++ templates:

template <class max_TYPE>
   inline max_TYPE max(max_TYPE x, max_TYPE y)
      { return (x > y) ? x : y; }

template <class stack_ITEM,
   unsigned stack_MAX_ITEMS = 5>
      class stack { ... }
have no naming conflicts among their various parameter names — max_TYPE, stack_lTEM, and stack_MAX_ITEMS. Though not a requirement for C++ templates, it is for form templates.

3) A default value for a constant parameter must be conditionally defined as its default value before its class declaration. Both the default value and class declaration must be contained within the form template envelope:

/* open form template envelope */
#ifdef stack

/* define default value */
 #ifndef stack_MAX_ITEMS
#define stack_MAX_ITEMS 5

 #endif

class stack {
   stack_ITEM *itemPtrs[stack_MAX_ITEMS];
...
#endif /* close form template envelope */
4) A form template doesn't utilize the keyword template, nor is there a formal parameter list. Otherwise the form template model is exactly the same as the standard C++ template model. There is one additional exception: scope resolution names must drop their parameterizing arguments. For example:

int stack<stack_ITEM, stack_MAX_ITEMS>::
   push(stack_ITEM *itemPtr)
is changed to read

int stack::push(stack_ITEM *itemPtr)
5) A form template header file, such as Listing 4, always concludes with the conditional parameter wrapup section. This section undefines all form template parameters and names unless the header is being included by its corresponding form template source file, the file that generates definitions. For example:

/* decl_gen.hpp */
....
/* parameter wrapup section */
#ifndef def_gen_cpp  /* defined in def_gen.cpp */
 #undef max_TYPE
 #undef stack_ITEM
 #undef stack_MAX_ITEMS
 #undef stack
#endif
6) A form template source file, such as listing 5, must first include its corresponding header file. For example:

/* def_gen.cpp */
#define def_gen_cpp
#include "decl_gen.hpp"
#undef def_gen_cpp
Notice how the include directive is sandwiched between the definition and undefinition of the source-file macro id. This is the macro id used to exclude the parameter wrapup section described previously in rule 5.

7) Much like the form template header file, the source file terminates with a parameter wrapup section. In this case, however, the section is unconditional.

...
/* parameter wrapup section */
#undef max_TYPE
#undef stack_ITEM
#under stack_MAX_ITEMS
#under stack

Rules for Using Templates

1) In order to use a form template to generate either declarations or definitions, you must define its name. For a C++ function form template you define its first parameter instead. For example:

#define max_ITEM int
...
#define stack stack_strings
The above code snippet indicates we are about to generate the function max for integers and a parameterized class stack named stack_strings.

2) All of the various form template parameters must also be defined as required. For example:

#define max_TYPE int
#define stack_ITEM char
3) Immediately following the form template envelope names and parameters, include the declarations generating file. To generate both declarations and definitions, include the definitions generating file instead, as in:

#define max_TYPE int
#define stack_ITEM char
#define stack stack_strings
#include "def_gen.cpp"
At this point we have generated declarations and definitions for:

   int max(int x, int y);
and

   stack_strings; // i.e. stack<char, 5U>
with all form template envelope names and parameters left undefined!

4) To generate an additional type from the same form template, simply repeat the process.

define stack_ITEM int
#define stack_MAX_ITEMS 10U
#define stack stack_int_10U
#include "def_gen.cpp"
At this point we have also generated a declaration and definition for:

stack_int_10U; // i.e. stack<int, 1OU>

The naming conventions used for these stacks has no significance as far as the form template generating mechanism is concerned.

Synthesizing Templates in C

Listing 7, Listing 8, and Listing 9 repeat our example, this time in C. Please note the following differences from the C++ approach:

1) Function form template envelopes have the same name as the function itself:

/* max function envelope (see Listing 7) */
#ifdef max/* C uses function name test */
extern max_TYPE max(max_TYPE x, max_TYPE y);
#endif
Since C doesn't allow for inlined functions, max is declared as external in the header file and defined in the source file as:

/* max function envelope (see Listing 8) */
#ifdef max  /* C uses function name test */
max_TYPE max(max_TYPE x, max_TYPE y)
   { return (x > y) ? x : y; }
#endif
2) Neither does C allow for function name overloading, so in order to generate a max function we must redefine its name as well as its parameter.

#define max max_iii    /* see Listing 9 */
#define max_TYPE int
...
#include "def_gen.c"
The convention I use here, max_iii, indicates a function generically named max returning an integer and taking two integer parameters. In order for C++ to allow for function-name overloading, the C++ translator generates unique mangled names based on return and parameter types. You can think of max_iii as a hand-mangled name. You might name a function for floats max_fff. However, you must call the proper function, max_iii or max_fff, unlike in C++.

3) Since C doesn't support member functions, you must define a scoping macro to support pseudo membership, as in:

#define stack_ITEM char    /* see
Listing 9 */
#define stack stack_char
#define stack_scope(memFnc)
stack_char_ ## memFnc
#include "def_gen.c"
The macro stack_scope is used in both the form template header and source files, as in:

extern int stack_scope(push) /* see
Listing 7 */
   (struct stack * stackPtr,
stack_ITEM * itemPtr);

int stack_scope(push)  /* see Listing
8 */
   (struct stack * stackPtr, stack_ITEM * itemPtr)
      {...}
to generate stack-scoped function names. You can think of this as a different sort of name mangling. The function name is mangled to reflect its pseudo structure membership instead of its variant parameter types. In order to call a pseudo member function you must specify its scope mangled name, as in:

/* see Listing 9 */
stack_char_push(&ToDo,"wash car");
The struct scoping macro must be undefined in the parameter wrapup sections. See the tail of Listing 7 and Listing 8.

Conclusion

The "form template" approach shown here can be used in a nested fashion to implement any structure that can be described with conventional C++ templates. No longer must you limit your designs to non-parameterized types simply because your C++ translator doesn't yet support templates. Likewise C application design can now also benefit from the C++ template concept.