September 2002/Towards Improved Static Safety: Expressing Meaning by Type

Features

Towards Improved Static Safety: Expressing Meaning by Type

Karsten Weihe

With a little extra type discipline on your part, you can greatly increase the reliability of mixed-type computations.

Many logical errors can be caught by a good compiler because they provoke compiler errors or at least warnings. For example, many logical errors appear as type errors in the code. If a wrong object appears in an expression (logical error!), then it may or may not have a compatible type. If not, this results in a type error, and the compiler issues an error message.

However, many other logical errors silently pass all compiler checks and may make debugging a nightmare. In this article, I discuss a simple and convenient technique to increase the degree to which the compiler can check your logical errors. In fact, this technique made my own code much more reliable. A similar idea can be found in Section 16.5 of Barton and Nackman’s seminal book, Scientific and Engineering C++ [1]. However, I will go far beyond them and show that there is much more inside their fundamental idea.

In the following case study, I will stick to a simple, ubiquitous example: integral numbers. This case study stands for a variety of possible applications.

Case Study

Consider this innocent-looking example, which works on an object A of some array class and on an object M of some matrix class:
int h = 0;
for ( int i = 0;
  i < M.number_of_columns(); ++i )
  for ( int j = 0;
    j < M.number_of_rows(); ++j )

    {
    M[i][j] = A[h];
    ++h;
    }
It is very easy to make a typo here: M[i][i], M[j][j], M[j][i], M[h][j], or M[i][h] instead of M[i][j]; A[i] or A[j] instead of A[h], etc. The compiler has no chance to find such an error. Hence, the error must be found by an extensive code review or by extensive run-time tests. Clearly, even the most extensive code review or run-time test is by far less reliable than a compiler check.

The problem is even more virulent during maintenance. For the sake of exposition, suppose you decided to exchange the order of columns and rows: from now on, M[i][j] shall not refer to the entry in column i and row j anymore, but to the entry in row i and column j. Each and every expression of the form M[i][j] must be modified. Moreover, you must not forget the headers of the for loops: it is quite easy to overlook something here. Then, for example, the column index may erroneously run up to M.number_of_rows() or the row index up to M.number_of_columns(), or the wrong variable may be incremented.

However, in the following variant, the compiler can indeed catch every error of this kind:
#define array_loop(Type,index_name,size) \
  for ( Type index_name = 0;
  index_name < size; ++index_name )

...

ArrayIndex h = 0;
array_loop ( ColumnIndex, i,
  M.number_of_columns() )
  array_loop ( RowIndex, j,
    M.number_of_rows() )
    {
    M[i][j] = A[h]; // Fine
    M[i][i] = A[h]; // Error:
    // Wrong type of second matrix index
    M[j][j] = A[h]; // Error:
    // Wrong type of first matrix index
    M[i][i] = A[i]; // Error:
    // Wrong type of array index 
    }
I will look at the types ArrayIndex, ColumnIndex, and RowIndex in a minute. The crucial point is that these three types are identical but mutually incompatible, thus every mismatch in the body of the loop is caught because it provokes a type error.

Thanks to the macro array_loop, it is also nearly impossible to make an error in the headers of the loops. If you prefer to avoid macros like array_loop and write down the headers of the for loops explicitly, most imaginable errors can still be caught in the new version. For example, if M.number_of_columns() is of type ColumnIndex and M.number_of_rows() of type RowIndex, any confusion of these two methods will be caught:
ArrayIndex h = 0;
for ( ColumnIndex i = 0;
  // Error
  i < M.number_of_rows(); ++i )
  for ( RowIndex j = 0;
    // Error
    j < M.number_of_columns(); ++j ) 
   ...
In fact, only one serious source of errors remains then, namely confusions of the two increment statements, ++i instead of ++j or vice versa.

However, this kind of error is eliminated by the macro array_loop.

Now I’ll return to the above maintenance scenario: I want to exchange the roles of the row and column index in expressions of the form M[i][j]. Of course, I will overlook many points where a revision is necessary and will do the revision erroneously at many other points in the program. However, the compiler will mercilessly face me with each error. In other words, the program will be correct again once I have managed to bring it through the compiler (unless the revision was done incredibly sloppily).

Implementation

It is not too hard to implement a class ColumnIndex and a class RowIndex that may replace int, but are mutually incompatible, for example:
class ColumnIndex
  {
  int _i;
  public:
  ColumnIndex ( int n ) : _i(n) {}
  ...

  bool operator<
    ( ColumnIndex const& n ) const
    { return _i < n._i; }
  ...

  ColumnIndex const& operator=
    ( ColumnIndex const& n )
    { _i = n._i; return *this; }

  ColumnIndex const& operator+=
    ( ColumnIndex const& n )
    { _i += n._i; return *this; }

  ColumnIndex const& operator++
    ( ColumnIndex const& n )
    { ++_i; return *this; }
  ...

  int int_value () const
    { return _i; }
  };
Method int_value() allows access to the real value of an ColumnIndex object. It is tempting to define a conversion operator instead of int_value:
operator int () const { return _i; }
However, this would silently break the static safety that has been achieved. For example, the following line would not raise a compiler error anymore:
for ( ColumnIndex i = 0;
  i < M.number_of_rows(); ++i )
In fact, both i and M.number_of_rows() would silently convert to int, which means that operator<(int,int) would be called. However, the absence of an implicit conversion to int has an undesirable consequence: it is not possible to use a ColumnIndex object wherever an int is expected. Here is a simple example:
ColumnIndex  i (1000);
double*  p1 = new [i];         // Error
double*  p2 = new [i.int_value()];   // Fine
This is particularly annoying in template code, because template code should be applicable to both built-in integral types and integer classes:
template < typename IntegerType >
double*  allocate ( IntegerType i )
  {
  return new [ /* 'i' or 'i.int_value()' ??? */ ];
  }
The following standard trick solves this problem: simply introduce a common syntax for int and ColumnIndex.
int int_value ( int i )
  {
  return i;
  }

int int_value ( ColumnIndex i )
  {
  return i.int_value();
  }

template < typename IntegerType >
double*  allocate ( IntegerType i )
  {
  return new [ int_value(i) ];
  }
Template Solution

In a real program, you potentially need many types like ColumnIndex and RowIndex, and you do not want to write as many complete integer classes from scratch. There is a feature in C++ that allows you to write the necessary code once and for all: templates. However, I will “misuse” templates in a sense; the template argument T does not appear at all in the code of the template. At first glance, this may look a bit strange. However, such an “anonymous” template argument allows you to generate an infinite set of types that are identical but not compatible. Here is a sketch of the class template:
template < typename T >
class Integer
  {
  int _i;
  public:
  Integer ( int n ) : _i(n) {}
  ...
  bool operator< ( Integer const& n ) const
    { return _i < n._i; }
  ...
  Integer const& operator= ( Integer const& n )
    { _i = n._i; return *this; }
  ...

  int int_value () const
    { return _i; }
  };
For each type that you want to create, you have to define a template argument to replace the placeholder T. An empty struct suffices. However, you need one separate empty struct for every type:
struct ColumnIndexDiscriminator {};
struct  RowIndexDiscriminator {};
Now you can define the types ColumnIndex and RowIndex:
typedef Integer < ColumnIndexDiscriminator >  ColumnIndex;
typedef Integer <  RowIndexDiscriminator >   RowIndex;
Although the structs ColumnIndexDiscriminator and RowIndexDiscriminator are identical, they are not compatible. Hence, their template instantiations, ColumnIndex and RowIndex, are not compatible either. In summary, you have to write two lines for every type such as ColumnIndex and RowIndex. This is certainly a worthwhile effort for such an increase of static safety. (If you are a minimalist, write a macro for it!)

To conclude, a template such as Integer makes it easy to express different meanings by different types. A column index and a row index have different meanings, which means that every replacement of a column index by a row index or vice versa is a logical error. In the above approach, column and row indexes indeed have different types, so the compiler can catch these logical errors.

Collaborating Types

However, this is not yet the whole story. Although types like ColumnIndex and RowIndex shall not replace each other, they often should be able to collaborate. To understand this point better, I now turn your attention to a completely different context: the management of bank accounts. For ease of exposition, I will assume for a moment that each amount of money is integral (only dollars, no cents). Then you could define an instantiation of class template Integer for money:
struct MoneyDiscriminator {};
typedef Integer < MoneyDiscriminator >  Money;
However, a bank account may have credits and debits, and any confusion of credits and debits would be a serious logical error (presumably with serious consequences for the bank or its customers). Hence, separate types for credits and debits would even increase the static safety of the program:
struct CreditDiscriminator {};
struct  DebitDiscriminator {};

typedef Integer < CreditDiscriminator >  Credit;
typedef Integer <  DebitDiscriminator >   Debit;
However, you want to subtract debits from credits to obtain the balance of the account. For this, I introduce yet another type, Balance, and operators to add credits and subtract debits:
struct BalanceDiscriminator {};
typedef Integer < BalanceDiscriminator >  Balance;

Balance operator- ( Credit c, Debit d )
  {
  return Balance ( c.int_value() - d.int_value() );
  }

Balance operator+ ( Balance b, Credit c )
  {
  return Balance ( b.int_value() + c.int_value() );
  }

Balance operator+ ( Credit c, Balance b )
  {
  return Balance ( b.int_value() + c.int_value() );
  }

Balance operator- ( Balance c, Debit d )
  {
  return Balance ( c.int_value() - d.int_value() );
  }
The profit from these operators’ definitions is a significant increase in static safety.

If an expression made of credits, debits, and balances does not explicitly call method int_value(), then the collaboration of these types perfectly conforms to their meanings.

In other words: as long as int_value() is not used explicitly outside these few lines, you may blindly assume that credits are always added and debits are always subtracted. Of course, that does not mean that credits and debits can never be added. A call to int_value circumvents all precautions:
Credit c;
Debit  d;

Balance b = c.int_value() + d.int_value();
In some rare cases, this might indeed be reasonable, so you should not try to close this security gap (to close this gap, you could make int_value private and the above operators friends). The crucial point is that all points where you circumvented your precautions — be it for good or bad reasons — are easily found by searching the source files once for the pattern int_value.

Analogy in Science

This concept has an interesting analogy in science, notably in the realm of physics. Every physical value has a specific dimension, for example, speed, weight, power, performance, whatever. You learned in school that a physical formula is only correct if the dimensions of all involved variables are correct. To give a concrete example: speed is distance divided by time, and if a speed value is on the left-hand side of an equation, the right-hand side must not multiply distance and time. Hence, if speed, distance, and time are realized as different types and the operators:
Speed   operator/ (Distance, Time);
Distance  operator* (Speed, Time);
Distance  operator* (Time, Speed);
Time    operator/ (Distance, Speed);
are the only operators defined for this triple of types, then no expression on these three types can yield physical nonsense.

This analogy is not new. In the introduction, I mentioned Section 16.5 of Barton and Nackman’s book [1]. There they introduce physical dimensions as C++ types to ensure that the dimensions in scientific formulae are correct. However, the banking account example demonstrates that there is much more inside this general idea. For example, in physics, you cannot subtract two values of different dimensions from each other, whereas it makes perfect sense to give credits and debits different types just to ensure that they may be subtracted from each other and nothing else. In the following, I will completely leave the analogy to science behind.

Semi-Compatible Types

From now on, I will distinguish between dollars and cents in the banking program. Of course, every confusion of amounts in dollar and amounts in cent would be a serious error. All errors of this kind can be avoided by introducing incompatible types for dollars and cents:
struct CreditInDollarsDiscriminator {};
struct CreditInCentsDiscriminator   {};

typedef Integer < CreditInDollarsDiscriminator >  CreditInDollars;
typedef Integer < CreditInCentsDiscriminator   >  CreditInCents;
However, I want to add amounts in dollars and amounts in cents. Operators such as:
CreditInCents operator+ ( CreditInDollars, CreditInCents )
are certainly not appropriate, because the correct semantics is counter-intuitive:
CreditInCents  credit_in_cents   (1000);   // $10
CreditInDollars  credit_in_dollars (1);    // $1

// Should write '11' or '1100', not the "arithmetically
// correct" value '1001':
cout << credit_in_cents + credit_in_dollars;
The code becomes more intuitive when the programmer is forced to insert the conversion factor explicitly:
cout << credit_in_cents + credit_in_dollars * cents_per_dollar;
Object cents_per_dollar is of a specific “conversion-factor type,” which is compatible with the dollar type and the cent type exactly in the way in which a collaboration makes sense:
struct CentsPerDollarDiscriminator {};
typedef Integer < CentsPerDollarDiscriminator >  CentsPerDollar;

CreditInCents operator* ( CreditInDollars i, CentsPerDollar j )
  {
  return CreditInCents ( i.int_value() * j.int_value() );
  }

CreditInCents operator* ( CentsPerDollar i, CreditInDollars j )
  {
  return CreditInCents ( i.int_value() * j.int_value() );
  }

CentsPerDollar const  cents_per_dollar (100);
This conversion-factor type can also be used for conversions “the other way round,” from cents to dollars:
CreditInDollars operator/ ( CreditInCents i, CentsPerDollar j )
  {
  return CreditInDollars ( i.int_value() / j.int_value() );
  }

CreditInCents operator% ( CreditInCents i, CentsPerDollar j )
  {
  return CreditInCents ( i.int_value() % j.int_value() );
  }

CreditInCents  i (1234);   // $12 + 34c

// Writes '12.34' as desired:
cout << i / cents_per_dollar << '.' << i % cents_per_dollar;
Constants

Very often, the conversion factor between two units of measure is not fixed (e.g., the rate of exchange between two currencies). In this case, a constant or variable object like the first version of cents_per_dollar is appropriate. However, in case of a fixed conversion factor such as factor 100 for the conversion between cents and dollars, such an object is unnecessarily unsafe: any object of type CentsPerDollar can replace cents_per_dollar, whatever its value will be.

To close this final gap in the static safety of the code, you could define CentsPerDollar slightly differently, namely such that the constructor has no arguments and initializes the private member _i to be $100. Of course, all non-const methods should then be omitted, because it does not make sense to change the value of a CentsPerDollar object.

Many constant conversion factors like cents_per_dollar may occur in a program. Hence, I will again realize this class as a template. The conversion factor is introduced as an additional template argument value:
template < typename T, int value >
class ConstantInteger
  {
  int const _i;
  public:
  ConstantInteger () : _i(value) {}
  ...
  };
Now you can blindly rely on the correct value of cents_per_dollar:
struct CentsPerDollarDiscriminator {};
typedef ConstantInteger < CentsPerDollarDiscriminator, 100 > CentsPerDollar;

CentsPerDollar  cents_per_dollar;   // 100
Which Operations to Predefine?

Some of the standard operators for integral types should be predefined as templates in the same namespace as template class Integer so that they are available for all instantiations of Integer. Clearly, the question of which operations to predefine cannot be answered ultimately. Here is my favorite list:
bool operator== ( Integer < T >, Integer < T > );
bool operator!= ( Integer < T >, Integer < T > );
bool operator<  ( Integer < T >, Integer < T > );
bool operator<= ( Integer < T >, Integer < T > );
bool operator>  ( Integer < T >, Integer < T > );
bool operator>= ( Integer < T >, Integer < T > );

Integer < T > operator+ ( Integer < T >, Integer < T > );

Integer < T > operator* ( Integer < T >, int );
Integer < T > operator* ( int, Integer < T > );

Integer < T > operator/ ( Integer < T >, int);
int       operator/ ( Integer < T >, Integer < T > );
At first glance, multiplication and division look a bit strange. Again the physical analogy helps clarify things: when you multiply or divide a speed value by a dimensionless value, you will get another speed value. In turn, when you divide two speed values, you will get their ratio, which is dimensionless. This analogy is reflected by the above definitions of operator* and operator/.

On the other hand, if you multiply two speed values, you will get a value of dimension “speed squared,” and if you divide a dimensionless value by a speed value, you will get a value of dimension “time by distance.” Dimensions of this kind do not make sense in most potential applications of my technique. (What is “dollar squared”?). Hence, these versions should not be defined as templates but “by hand,” only for instantiations of template Integer for which they make sense. For example, for a Distance type, the square type does make sense:
struct  DistanceDiscriminator {};
struct    AreaDiscriminator {};

typedef Integer < DistanceDiscriminator >  Distance;
typedef Integer <   AreaDiscriminator >  Area;

Area operator* ( Distance, Distance );
For analogous reasons, operator% does not always make sense and is thus omitted from the above list.

Conclusion

Programs can be made more reliable by turning (different) meanings into (incompatible) types. This is particularly useful during maintenance: there is a high chance that errors in a revision process result in type errors and are thus caught by the compiler.

The idea of dimensions borrowed from physics makes even more sense if you apply it in a more fine-grained way than in physics.

C++ templates allow a short, convenient, flexible realization. Typically, a function or class is made a template to create a class of similar types. I have shown an example where it makes sense to create a class of identical (yet not compatible) types using the template mechanism of C++.

Note

[1] John J. Barton and Lee R. Nackman. Scientific and Engineering C++ (Addison-Wesley, 1995).

Karsten Weihe is a professor of computer science at the Darmstadt University of Technology, Germany. His main research topic is mathematical optimization and its application in business and industry. This interest has guided his focus to problems of algorithmic software development, especially in C++.