Columns


Questions & Answers

Pete Becker

Resolving Type Inside Templates

Pete tackles how to distinguish basic types from derived types in templates, and which end is up in the world of integers.


To ask Pete a question about C or C++, send e-mail to pbecker@wpo.borland.com, use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

Q

Is there any way to infer whether a parameterized type is primitive or user-defined? For example, in this parameterized class I want to have each array element initialized to some default state. For a primitive type, this would mean zero, and for a user-defined type it would mean whatever the default constructor chooses to do. Thus, I would write it like this:

template <class T, int dim>
class Array
{
    T array[dim];
public:
    Array()
    {
        for(int i = 0; i < dim; ++i)
            array[i] = T();
    }
};

The problem is that when T is a user-defined type, the code assumes that a suitable assignment operator exists for class T, and even if such an operator exists, the code is redundant because without the for loop, each element would still be initialized to whatever the default constructor dictates. Nevertheless, the loop must still be written in order to accommodate the primitive types. Otherwise, each element would contain unknown data.

So my goal is to execute the loop only if T is a primitive type. Is there a way to code this?

Eric Nagler

A

There are several ways to code this. It's not as simple as having a keyword that gives you this information, but you can do it without too much work. First, though, for the benefit of those who haven't kept up with the latest changes in the C++ Working Paper, there's an important bit of information that you need in order for the question to make sense.

Everyone's familiar with the use of parentheses to tell the compiler to use the default constructor:

class C
{
public:
    C();
};

void f( const C& );

f( C() );

The last line in this snippet tells the compiler to construct a temporary object of type C using the default constructor, and to pass a reference to that temporary object to the function f. The recent change in the Working Paper generalizes this construct to types that do not have a constructor: they are to be initialized as if they had been defined at file scope. For example:

class D
{
    int i;
};  // no constructor

// file scope: initialized to all zeros
D d;    // i.e. d.i == 0 after construction

int i;  // also initialized to zero

The above snippet just applies the usual rules for constructors of static objects. Under the new rule, the same initialization occurs when an object of either of these types is constructed with empty parentheses:

void f( const D& );
void f( int );

f( D() );   // initialize temporary
            // object
            // to all zeros
f( int() ); // initialize the int
            // to 0

Most of us, when we first write the code for an Array, would do it like this:

template <class T, int dim>
class Array
{
    T array[dim];
};

That works fine, provided you aren't concerned with whether the contents of the array member get initialized. Let's instantiate Array for each of the three types in my earlier illustrations:

int main()
{
    Array<C,10> arrC10;
    Array<D,10> arrD10;
    Array<int,10> arrI10;
    return 0;
}

These are all perfectly valid definitions of variables, but they have slightly different characteristics. Since C has a non-trivial default constructor, the elements of arrC10 all get initialized with that default constructor. D does not have a default constructor, so the elements of arrD10 are not initialized. Similarly, int does not have a default constructor, so the integer values in arrI10 are not initialized.

There's nothing wrong with writing an array template that does this, but sometimes you need a little more assurance that you're dealing with truly meaningful objects. That's the reason for the additional code in your example:

template <class T, int dim>
class Array
{
    T array[dim];
public:
    Array()
    {
        for(int i = 0; i < dim; ++i)
            array[i] = T();
    }
};

The constructor for Array walks through the internal array, assigning the value T() to each element. For arrD10 and arrI10 this sets each of the elements to a known value, rather than the essentially random value each element had before execution of the loop.

The problem you point out shows up in arrC10, where the elements of array have already been initialized by the default constructor for C. Assigning the value C() to each of those elements usually doesn't do any harm (provided you haven't done something perverse in C's assignment operator), but it is redundant. For a large array this could involve a significant amount of time.

The way to avoid this extra assignment is to write a template function to do the actual assignment, and explicitly specialize it appropriately. Like this:

// general template
template <class T>
void assign( T *target, int size )
{
}

// specialization for int
template <>
void assign( int *target, int size )
{
    for( int i = 0; i < size; i++ )
        target[i] = 0;
}

// specializations for other
// built-in types omitted

The first version of assign defines a template function. It takes one template parameter, T, which the compiler binds when you call it:

int main()
{
    C i_arr[10];
    assign( i_arr, 10 );
    return 0;
}

The call to assign will use the first version of assign, which does nothing.

Now, it may not seem very useful to define a template function that does nothing, but that's actually what you want to happen in the constructor for Array when it is instantiated on type C: the default constructor for C already does the right thing, so we don't want to do any assignment. In the case of an int, however, there's real work to be done:

int main()
{
    int i_arr[10];
    assign( i_arr, 10 );
    return 0;
}

In this case, the compiler sees that there is an explicit specialization of assign that can be called with an int * as its first parameter. The call to assign here will use that version of assign, and each of the int values in i_arr will be initialized to 0. Using this function in the Array template is straightforward:

template <class T, int dim>
class Array
{
    T array[dim];
public:
    Array()
    {
        assign( array, dim );
    }
};

Now the compiler will choose the appropriate version of assign for arrC10 and for arrI10. It won't handle D correctly, though. In our initial definition of assign we took care of the cases where we have a type with a default constructor, and by explicitly specializing assign we took care of all of the built-in types. However, for user-defined types that do not have default constructors, we don't do any initialization: the compiler uses the general version of assign, because there are no explicit specializations for these other types.

It's simple to add an explicit specialization for D:

// specialization for D
template <>
void assign( D *target, int size )
{
    for( int i = 0; i < size; i++ )
        target[i] = D();
}

In fact, it's simple to provide an explicit specialization for each of the user-defined types that we use in our program. Unfortunately, although each of these explicit specializations is simple to write, remembering to write all of the appropriate ones is not simple. That's where this technique breaks down: you must remember to provide a suitable explicit specialization for every type that requires initialization. The only way I can see to avoid requiring explicit specializations for such types is to simply ban types that do not have default constructors.

It's not completely unreasonable to tell users of your Array template that they must provide a suitable default constructor for types that they use in your template. If you are willing to tell users of your template that they must provide an appropriate explicit specialization or that they must provide a suitable default constructor, this technique will make sure that the array gets properly initialized.

It may be easier to tell your users that they need to provide an explicit specialization of a traits-like template, similar to the ones I described in this column in November 1996. The reason for this is not that such a template is easier to write than a template function, but that traits are used in the standard library. As users become more familiar with this technique they may begin to think of it as a natural way to describe properties of objects. Then it will seem less artificial. Rewriting your Array template using a template named array_traits, we end up with this:

template <class T>
struct array_traits
{
    static void assign( T *, int )
    {
    }
};

// specialization for int
template <>
struct array_traits<int>
{
    static void assign( int *target, int size )
    {
        for( int i = 0; i < size; i++ )
            target[i] = 0;
    }
};

// remaining specializations omitted
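For a user-defined type without a suitable default constructor, such as D above, the user writes the analogous specialization. Here's a minimal sketch, mirroring the earlier explicit specialization of assign:

// hedged sketch: a user-supplied specialization for D,
// parallel to the assign specialization shown earlier
template <>
struct array_traits<D>
{
    static void assign( D *target, int size )
    {
        for( int i = 0; i < size; i++ )
            target[i] = D();    // D() is all zeros under the new rule
    }
};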

Here's the new version of the Array template:

template <class T, int dim>
class Array
{
    T array[dim];
public:
    Array()
    {
        array_traits<T>::assign( array, dim );
    }
};
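To tie the pieces together, here's a sketch of the same three instantiations from earlier (it assumes the definitions of C and D from above and the array_traits specializations just shown):

int main()
{
    Array<C,10>   arrC10;   // general array_traits: C's default
                            // constructor already did the work
    Array<D,10>   arrD10;   // uses the array_traits<D> specialization
    Array<int,10> arrI10;   // uses the array_traits<int> specialization
    return 0;
}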

It's a bit more work for users of your template to write out their own array_traits specializations, but you may find that there are more things you need to handle this way. That's what happened with the string class: its char_traits class grew considerably from what was originally contemplated. Going to a traits template lets you add to the information it provides without adding more global functions. Those functions are all encapsulated in the traits template, so when you expand it you aren't adding to global namespace pollution.

Q

You may not remember me, but I sat through a coupla classes w/you at BDC96, and I read your columns regularly in C/C++ Users Journal. I'm a small-time, part-time developer/consultant w/o any formal training in Computer Science, and there are quite a few gaps in my "knowledge base."

One of these gaps has to do with big- vs. little-endian storage. (I am aware, in general, of the meanings of these terms, but less certain of their implementations.) For example, on a 16-bit system (which may be entirely hypothetical since I am not aware of any 16-bit big-endian machines): Is there any difference in bit order within bytes? (I hope not). In a multi-byte value, such as a long, a double, a float, or even an int or wchar_t — are all bytes reversed, or just bytes within words? For example,

int examint = 5;

In little-endian, is storage 0x05 0x00? In big-endian, is storage 0x00 0x05? Further, for a 32-bit value, would

long examlong = 5;

be 0x05 0x00 0x00 0x00 or 0x00 0x05 0x00 0x00 or 0x00 0x00 0x00 0x05 or 0x00 0x00 0x05 0x00?

In a 32-bit system, examint would have the size of examlong, but I assume the mechanics would be similar.

The reason I ask this is that I have run into a situation where it would be useful to know just what big- or little-endian storage implies: writing portable binary files that I would want to be readable on a PC and/or a Macintosh. Do you have any quick answer?

— Karl Button

A

Yes, I remember you. We sat at the same table for lunch the first or second day of the conference, didn't we? Anyway, on to endianness. Let's get one issue out of the way right at the start: endianness does not tell you anything about the bit order that a particular system uses when it writes bytes to a binary file. I'll come back to this as part of the broader question of portability at the end. Endianness is about the order in which a computer stores the bytes that make up a multi-byte value. Knowing which way the bytes are stored doesn't, in itself, get you portability, though. For example, different processors can use different representations for floating-point values. Being able to reconstruct the pattern of bits that one processor used won't do you any good if that pattern doesn't make sense to the new processor. So I'm not going to talk about floating-point types any further, except as they relate to the broader question of portability.

When it comes to the way computer architectures store integral values, there are two common arrangements: big-endian and little-endian. In a big-endian architecture, the big end (the most significant byte) of a multi-byte integral value is stored first, that is, at the lowest address that the stored value occupies. In a little-endian architecture, the little end (the least significant byte) is stored first. You can see this with a simple C program:

#include <stdio.h>

union show_it
{
    long value;
    unsigned char bytes[sizeof(long)];
};

int main()
{
    union show_it show;
    int i;
    show.value = 1;
    for( i = 0; i < sizeof(long); i++ )
        printf( "%02d ", show.bytes[i] );
    printf( "\n" );
    return 0;
}

On a little-endian system this will print out

01 00 00 00

and on a big-endian system it will print out

00 00 00 01

Whichever architecture is being used, the bytes are ordered consistently, from least significant to most significant or from most significant to least significant, regardless of the actual size of the integral type.

Are you familiar with the Unicode character encoding system? It uses 16-bit values to encode characters, which enables it to represent all the characters used in writing in the world today. To make files written in Unicode portable, the writers of the Unicode standard had to deal with endianness. They live in a much simpler world than you do, though: they only have to deal with 16-bit integral values.

The authors of the Unicode standard came up with a simple mechanism. The first character written out to a Unicode file is a byte order mark with the value 0xFEFF. If you dump the bytes in the file you'll see 0xFE 0xFF as the first two bytes if the file was written on a big-endian system, and 0xFF 0xFE if it was written on a little-endian system. When you read that file back into a program that understands Unicode, though, the program doesn't need to know whether either machine is little-endian or big-endian. Rather, what matters is whether the file is being read on a system with the same endianness as the system on which it was written.

If the two systems use the same endianness, the first Unicode character will be read back in with the value 0xFEFF, and all the remaining characters can be read without modification. If the two systems use different endianness, the first character comes back as 0xFFFE. That tells the reader that it needs to swap the bytes in every subsequent character as it reads it in, which resolves the difference in endianness; the program using the data never notices the difference in the hardware.
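A reader for such a file might look something like this sketch (the file name and the swap_bytes helper are illustrative, and a 16-bit unsigned short is assumed):

#include <stdio.h>

// hedged sketch: read 16-bit Unicode values written on a machine
// whose endianness may differ from ours; names are illustrative
unsigned short swap_bytes( unsigned short v )
{
    return (unsigned short)( ( v >> 8 ) | ( v << 8 ) );
}

int main()
{
    FILE *f = fopen( "text.uni", "rb" );    // hypothetical file name
    unsigned short ch;
    int must_swap = 0;

    if( f == NULL )
        return 1;
    if( fread( &ch, sizeof ch, 1, f ) == 1 )
        must_swap = ( ch == 0xFFFE );   // byte order mark came back reversed
    while( fread( &ch, sizeof ch, 1, f ) == 1 )
    {
        if( must_swap )
            ch = swap_bytes( ch );
        // ...use ch as a Unicode character...
    }
    fclose( f );
    return 0;
}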

Extending this sort of portability beyond 16-bit integral values is much harder. First, different systems have many different ways of representing floating-point values. The representation scheme depends entirely on the system itself, and knowing whether the hardware is big-endian or little-endian does not help you determine that representation. Further, negative integral values can cause a problem, because there are different ways of representing negative numbers in binary. The most common way is probably two's complement, but a system can also use one's complement, or any of several other schemes. Producing a truly portable binary file format would have to allow for these differences as well.
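You can see what your own machine does by reusing the union trick from above; this sketch prints the bytes of -1 in hex:

#include <stdio.h>

// hedged sketch: show how this machine represents a negative value;
// on a two's-complement system every byte of -1 prints as ff, while
// a one's-complement system would show one byte as fe
union show_it
{
    long value;
    unsigned char bytes[sizeof(long)];
};

int main()
{
    union show_it show;
    int i;
    show.value = -1;
    for( i = 0; i < sizeof(long); i++ )
        printf( "%02x ", show.bytes[i] );
    printf( "\n" );
    return 0;
}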

The writers of the ANSI C standard (and the C++ Working Paper, as well) decided that this was too complicated an issue for them to try to resolve. For this reason, the rules pertaining to binary files are quite restrictive: it is only meaningful for a program to read a binary file that was written with a program built with the same compiler. This policy eliminates the issues surrounding endianness and floating-point representations.

If you want to be able to move numeric data between applications there are a several possibilities. First, you can write the data as formatted text, using fprintf. That's guaranteed to work, provided you make sure you end your text file with a newline and you allow for the possibility that a binary file will be padded at the end with nul characters. Second, you can dig into the ASN1 standard. ASN1 provides a uniform and portable mechanism for representing binary data of arbitrary complexity. It does this by defining a few basic data types and some control structures so that you can write to the file the definitions of any additional data types to be used. Third, you can use the data marshalling techniques provided by various tools that support development of multi-tier applications. These tools typically require you to write function prototypes and class definitions in some sort of declarative language (some form of Interface Description Language (IDL) for functions, some form of Object Description Language (ODL) for classes), and let the tool take care of generating the code to actually write the data to a network connection and read it back. The writer of the tool has done the work of figuring out the details of internal representations for you. You should take advantage of this, and not redo their work yourself.
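As a minimal sketch of the first option (the file name and values are illustrative), writing and reading a value as formatted text sidesteps endianness entirely:

#include <stdio.h>

// hedged sketch of the formatted-text approach: the data crosses
// systems as ordinary characters, so byte order never enters into it
int main()
{
    FILE *f = fopen( "values.txt", "w" );   // hypothetical file name
    long out = 5, in = 0;

    if( f == NULL )
        return 1;
    fprintf( f, "%ld\n", out );     // write as text
    fclose( f );

    f = fopen( "values.txt", "r" );
    if( f == NULL )
        return 1;
    fscanf( f, "%ld", &in );        // read it back
    fclose( f );

    printf( "read back %ld\n", in );
    return 0;
}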

Endianness is only one of the issues you face when you need to transfer binary data between computer systems. Understanding endianness will let you transfer integral data if you're reasonably skillful. It does not tell you enough to enable you to transfer arbitrary types of data, however. That's a much more difficult task, one that should be left to the people who make their living at it.

Pete Becker is a senior development manager at Borland International, working in Borland's Middleware and Application Management Development group in Boston, Mass. He has been actively involved in C++ development work at Borland for seven years, both as a developer and a manager. He is Borland's representative to the ANSI/ISO C++ standardization committee. He can be reached by e-mail at pbecker@oec.com.