C/C++ Users Journal May 2004
In a recent project, I was designing a data structure that required the storage of additional data depending on information that was available at compile time, but unknown at design and implementation. As an example of the problem, consider an unrelated system to store people's personal details. A subset of the information could be a list of forenames, surnames, maiden names, and children's names. Forenames and a surname are applicable to all people, but a maiden name and children's names are each only applicable to a subset of people. With this in mind, there were three design goals I aimed to achieve:
Here's a structure that could store all the information:
template<typename Char>
struct personal_details
{
typedef
enum { male, female }
gender_t;
gender_t gender_;
std::vector<
std::basic_string<Char> > forenames_;
std::basic_string<Char> surname_;
std::basic_string<Char> maiden_name_;
std::vector<
std::basic_string<Char> > children_;
};
This structure fails to meet any of the three design goals, of course. Perhaps the most significant point of interest here is the memory allocation required in storing the maiden_name_ and children_ variables that may never be used. The actual size depends on the STL implementation in use. Table 1 shows the size with three compiler/STL combinations.
An obvious solution to this problem is to allocate the optional variables on the heap at runtime and store pointers in the personal_details structure. However, this only meets the third design goal, failing on the first two. So perhaps using a class hierarchy could be an answer? I could build a hierarchy to use a base class for forenames and surnames and derived classes for parents and married people. Listing 1 is an example hierarchy, and shows how this solution can quickly become unwieldy and difficult to maintain. With respect to the design goals, the first and second are met, but this solution is very inefficient in terms of space allocation. Table 2 shows the sizes of these classes. The results are considerably variable for different compiler/STL configurations, with size for the structure to represent the details of a married female parent ranging from 44 bytes with GCC 3.2 to 100 bytes with Microsoft Visual C 7 (VC7).
Consequently, I set out to design a structure that can under specific compile-time conditions occupy minimal storage, and under other conditions can store a data-type object and has the same semantics as the data type itself. There must, however, be no storage overhead in using the optional storage structure itself.
template<typename Type, bool store> struct optional_storage;
The structure optional_storage has two template parameters: The first is the data type that could be stored, and the second is a Boolean value to determine whether storage should be allocated or not. For the case of no storage, you simply define a structure with an empty type:
template<typename Type>
struct optional_storage<Type, false>
{
typedef struct { } type;
};
Now to the trickier case of allocating storage. The easiest way to achieve the goal of data-type semantics is to derive the optional_storage class (available at http://www.cuj.com/code/) from the data type itself. A type is implicitly convertible to its base, so deriving will solve all the problems. Well, it would do, but for a small problem in that a structure cannot derive from a fundamental type (integrals and floating point), only from a class or another structure. To overcome this, you can create a structure that wraps the basic data type and provides operators to convert to and from the basic data type. I call it mandatory, as it stores data when the optional_storage structure becomes mandatory; see Listing 2. You can now derive from a mandatory structure that wraps the data type; see Listing 3.
And there you have it a structure that "auto-magically" reduces to the minimum size allocation allowed by the language for the given data type, or no data, dependent on a compile-time expression. The minimum size allowed by the language depends on its use. As a member of a class or structure, it must be at least 1 byte in size, and its actual size will depend on the compiler's alignment settings. As a base class, there are no restrictions, and the compiler is allowed (but not required) to optimize away this space. This is a technique known as "Empty Base Optimization," and means that a class that derives from an empty base class can be the same size as if it were deriving from nothing.
Returning to the personal_details example, you can now define the structure, as in Listing 4. Here, the only compulsory members are gender_, forenames_, and surnames_. All the others are optional based on one or a combination of template parameter values. Table 3 shows the size of the structure with each combination of template parameter values.
Take a closer look at the definition of maiden_name_. For married women, you want to store their premarital name in a string type, and for all other people, a maiden name is inappropriate. You already have a type defined to store names, name_t, so you can declare maiden_name_t like this:
optional_storage< name_t, G == female && M == married> maiden_name_;
The expression G == female && M == married is evaluated to a Boolean constant at compile time, based upon the values of G and M, which are themselves template parameters to personal_details. It is this use of a Boolean constant as a template parameter that is key to optimal_storage working. This static constant is used in the selection of a type, and the type will contain, or not contain, storage for the data. Because this type selection is performed during compilation, the generated types are subject to optimization and normal error conditions. For example, consider:
personal_details<male, single, true> male_single_parent;
where male_single_parent is instantiated as a personal_details structure that can optimally represent an unmarried male with children, according to the rules we've defined. This is now a concrete type, so attempting to access the maiden name of this person yields a compilation error:
male_single_parent.maiden_name_ = "Smith";
Although the mandatory structure presented here works, I wasn't happy that it always wrapped the real data type that I wanted to store. The conversion operators are implemented inline and have no overhead in production code with modern compilers, but they are fussy and get in the way in development (debug) code.
So I set about to improve the situation. The structure would remain the same for a data type from which derivation is illegal. However, in all other cases, it could be optimized in code terms, even if not in performance.
First, I needed a way of determining whether a type is fundamental; that is, is the type an integral type or a floating-point type? I defined a simple structure, which I can specialize for types that are fundamental. The default (unknown) case is False.
template<typename T> struct is_fundamental
{ enum { value = false }; };
Now I could easily specialize this structure for all types I know to be fundamental, and I can use the value member of the structure at compile time to determine if any given type is fundamental. Now, every type, in every combination of signed/unsigned and cv-qualifiers, must be accounted for, so I created some macros one for types that can be signed/unsigned (Listing 5) and one for those that can't (Listing 6). I could then use these macros to define a complete set of fundamental types (Listing 7).
Okay, so now that I know at compile time if a type is fundamental or not, I can reinvent the mandatory structure to encapsulate fundamental types and derive from everything else. I define the structure with two template parameters. The first is the data type, as it always was. The second is a Boolean value that determines whether the data type is encapsulated (True) or derived (False). This second template parameter has a default value using the aforementioned is_fundamental structure, and will never need to be supplied by users of the structure.
template<typename Type, bool IsFundamental=detail:: is_fundamental<Type>::value==true> struct mandatory;
I now partially specialize the structure using the second parameter. If the parameter is True, then the structure is unchanged:
template<typename Type>
struct mandatory<Type, true>
{
// ... as above
}
However, if the parameter is False, I know I can derive from the data type and provide three simple constructors.
template<typename Type>
struct mandatory<Type, false> : public Type
{
typedef Type type;
mandatory();
mandatory(const type &other);
mandatory(const mandatory &other);
};
Microsoft Visual C++, up to and including VC++.NET 2002 (VC7), does not support partial template specialization. For this compiler, the optimal_storage structure always wraps the data and never inherits from nonfundamental types.
The optional_storage structure provides an easy-to-use interface for optimizing the storage requirements of any CopyConstructible data type. Wrapping a data type within an optional_storage structure is mostly transparent, although there is a small caveat. If the contained data type has a conversion operator to another type, it will not be invoked because one implicit conversion has already been performed. In such a situation, an explicit conversion, or intermediate assignment to const Type&,will be necessary.
The three original design goals have all been met, and there is no storage overhead in using the optional_storage structure itself. This can be seen easily by comparing the bottom line in Table 1 the starting point and Table 3 the optional_storage implementation. For each compiler/STL configuration, the fully populated structure using optional_storage is exactly the same size as the original fully populated structure. The real benefit comes when the optional parts of the structure are eliminated; for example, a structure to represent a single male without children now occupies 52 bytes, 32 bytes, and 24 bytes on VC7.0, VC7.0/STLPort, and GCC3.2, respectively. This is down from 92 bytes, 52 bytes, and 36 bytes on the same compilers, saving 40 bytes, 20 bytes, and 12 bytes. This can add up to a massive savings in a large system storing many of these structures simultaneously.