May 1992/Stepping Up to C++

Stepping Up to C++

Operator Overloading, Part 3

Dan Saks

Dan Saks is the owner of Saks & Associates, which offers consulting and training in C and C++. He is secretary of the ANSI C++ standards committee, and also contributing editor for Windows/DOS Developer's Journal. Dan recently finished his first book, C++ Programming Guidelines, written with Thomas Plum. You can write to him at 393 Leander Dr., Springfield, OH 45504, or dsaks@wittenberg.edu (Internet), or call (513)324-6301.
In the first part of this series ("Operator Overloading, Part 1," January 1992) I presented the basic features of operator overloading by building a simple class for rational numbers (fractions). That class contained only constructors, binary operators, and a crude output function. In the next part ("Operator Overloading, Part 2," March 1992), I extended the rational class to include an additional constructor with one argument to provide a conversion from long to rational. I also defined several unary operators for rationals, including prefix and postfix ++ and --. This month, I'll continue my explanation of operator overloading by adding additional functionality to rational numbers.

Default Arguments
Listing 1 and Listing 2 show the rational class as it was at the end of Part 2 of this series. rational.h (Listing 1) defines the class itself, along with four inline nonmember functions that implement the four binary operators +, -, *, and /. rational.cpp (Listing 2) defines the non-inline class member functions.
Listing 2 defines three constructors
rational() { }
rational(long n) : num(n), denom(1) { }
rational(long n, long d) : num(n), denom(d) { }
The second and third constructors are very similar — the third constructor produces the same result as the second whenever the denominator is one. That is, the declarations
rational r (x);
and
rational r (x, 1);
are equivalent. Using default function arguments makes this equivalence explicit, and reduces the source code needed to define the class. You simply rewrite the third constructor as
rational(long n, long d = 1) :
   num(n), denom(d) { }
and discard the second constructor.
The = 1 in the function declaration specifies the default value for d. You invoke this constructor with either one or two arguments. If you declare a rational object using one argument, as in
rational r (3);
the compiler applies the two-argument constructor and supplies the default value for the second argument. That is, it behaves just as if you had written
rational r (3, 1);
Of course, you can still provide an explicit second argument, as in
rational r(3, 4);
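As a rough illustration, the sketch below shows a cut-down rational with only the default-argument constructor. The const accessors numerator() and denominator() and the function check_default_argument() are assumptions added so the results can be inspected; they are not taken from Listing 1.

```cpp
#include <cassert>

// Cut-down stand-in for the rational class; only the constructor
// under discussion is shown.
class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

// Hypothetical check: one argument behaves like two with d == 1.
void check_default_argument()
{
    rational a(3);      // compiler supplies d = 1
    rational b(3, 1);   // d given explicitly
    assert(a.numerator() == b.numerator());
    assert(a.denominator() == b.denominator());

    rational c(3, 4);   // explicit second argument still works
    assert(c.denominator() == 4);
}
```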
Default argument values can be used in functions other than constructors. For example, the istream (input stream) class defined in the header iostream.h has a member function
istream &get(char *p, int n, char delim = '\n');
that reads at most n-1 characters into the character array addressed by p. get stops reading early when it encounters a character equal to delim. A call to get with only two arguments, such as
cin.get(s, MAXSTRING);
stops reading at the next newline character (the end of a line). A call such as
cin.get(w, MAXWORD, ' ');
stops reading at the next space.
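The two calls can be tried on an in-memory stream. This is only a sketch: it uses the standard headers sstream and cstring rather than iostream.h, and the function names read_word_then_rest() and check_get() are made up for the example. Note that get leaves the delimiter unread in the stream, so the sketch skips it before reading on.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>

// Reads one space-delimited word, then the remainder of the line.
// Both buffers must hold at least n characters.
void read_word_then_rest(std::istream &in, char *word, char *rest, int n)
{
    in.get(word, n, ' ');   // explicit delimiter: stop at the first space
    in.get();               // get leaves the delimiter unread, so skip it
    in.get(rest, n);        // delim defaults to '\n'
}

// Hypothetical check using an in-memory stream instead of cin.
void check_get()
{
    std::istringstream in("alpha beta gamma\nsecond line");
    char word[32];
    char rest[32];
    read_word_then_rest(in, word, rest, 32);
    assert(std::strcmp(word, "alpha") == 0);
    assert(std::strcmp(rest, "beta gamma") == 0);
}
```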
An overloaded operator cannot have default argument values. That is, you cannot overload both unary and binary plus using one function with default arguments. For example,
rational operator+(rational r1, rational r2 = 0);
is an error. You must declare the unary and binary operators as distinct functions.
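A minimal sketch of the two distinct functions follows. The class is a cut-down stand-in, and the const accessors and the arithmetic in the binary operator (no reduction to lowest terms) are assumptions of the example rather than the code in Listing 1.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

// Unary plus: exactly one operand.
rational operator+(rational r)
{
    return r;
}

// Binary plus: exactly two operands, neither with a default value.
rational operator+(rational r1, rational r2)
{
    return rational(
        r1.numerator() * r2.denominator() + r2.numerator() * r1.denominator(),
        r1.denominator() * r2.denominator());
}
```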
A function can have more than one argument with a default value. For example,
void f(int a, int b = 1, int c = 2);
can be called with one, two, or three arguments
f(x);        // same as f(x, 1, 2);
f(x, y);     // same as f(x, y, 2);
f(x, y, z);  // no defaults used
All arguments to the right of the first default argument must have default values. Thus,
void f(int a = 0, int b, int c = 2);
is illegal.
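To see which defaults are supplied, here is a small sketch in which f returns a value that encodes all three arguments. The body of f is invented for the example; only the declaration comes from the text above.

```cpp
#include <cassert>

// Hypothetical body: packs the arguments into one number so that each
// argument's value is visible in the result.
int f(int a, int b = 1, int c = 2)
{
    return 100 * a + 10 * b + c;
}

void check_defaults()
{
    assert(f(5) == f(5, 1, 2));        // both defaults used
    assert(f(5, 6) == f(5, 6, 2));     // only c defaulted
    assert(f(5, 6, 7) == 567);         // no defaults used
}
```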

Default Constructors
The rational class now has two constructors

rational() { }
and

rational(long n, long d = 1) : num(n), denom(d) { }
You can eliminate the need for the first constructor by providing a default value for the first argument of the second constructor

rational(long n = 0, long d = 1) : num(n), denom(d) { }
According to the ARM[1], a default constructor is any constructor that can be called without an argument. A declaration such as
rational r1;
invokes the default constructor. (Note that the declaration
rational r1();
does not invoke a default constructor — it declares a function that accepts no arguments and returns a rational.)
You typically declare a default constructor as one with an explicitly empty signature, such as rational::rational(); however, the ARM states that a constructor with default values for all of its arguments is also a default constructor. Thus, if you provide a default value for each argument of rational::rational(long, long), you must discard rational::rational(). Otherwise, the compiler will complain that the declaration
rational r1;
is an ambiguous call to either of the two different constructors.
Replacing
rational() { }
with
rational(long n = 0, long d = 1) :
   num(n), denom(d) { }
actually changes the behavior of the default rational constructor. When the default constructor is
rational() { }
the declaration
rational r1;
constructs a rational object with unspecified values for its numerator and denominator. When the default constructor is
rational(long n = 0, long d = 1) :
   num(n), denom(d) { }
the same declaration for r1 constructs a rational object with the value 0/1 (rational zero).
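The following sketch, written for a compiler that treats a constructor with all-default arguments as a default constructor, shows the resulting initial value. The const accessors and the check function are assumptions added for the example.

```cpp
#include <cassert>

class rational
{
public:
    // Callable with zero, one, or two arguments.
    rational(long n = 0, long d = 1) : num(n), denom(d) { }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

// Hypothetical check of the values each declaration produces.
void check_default_constructor()
{
    rational r1;        // no arguments: rational zero, 0/1
    assert(r1.numerator() == 0);
    assert(r1.denominator() == 1);

    rational r2(5);     // one argument: 5/1
    assert(r2.numerator() == 5);
    assert(r2.denominator() == 1);
}
```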
This change has interesting ramifications for the class design. The constructor
rational() { }
generates no code, so it adds no runtime overhead to rational object declarations. In contrast, the constructor with default argument values always generates code to initialize rational objects. Thus, even though the constructor is an inline function, a declaration such as
rational r1;
always generates at least two instructions
r1.num = 0;
r1.denom = 1;
On the other hand, the constructor with default argument values is safer because it guarantees that all rational objects have a well-defined initial value. Although I have not shown such details in Listing 1 and Listing 2, a robust implementation of rationals should ensure that no rational object has a denominator of zero. If a rational object can be constructed with an unspecified initial value, then most, if not all, rational operators must check the validity of their operand(s) before proceeding with the operation. Guaranteed initialization significantly reduces the number of places where the validity of operands must be checked.
In practice, avoiding use of the default constructor is usually easy. In the first part of this series, I showed that you can easily rewrite the member function bodies to avoid using default constructors. For example, you can always rewrite
rational result;  // uses default constructor
result.num = n;   // n may be an expression
result.denom = d; // d may be an expression
return result;
as simply
return rational(n, d);
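As a sketch of this style, the class below has no default constructor at all, and the multiplication operator builds its result directly from the two-argument constructor. The accessors and the nonmember operator* shown here are assumptions of the example; the listings may implement multiplication differently.

```cpp
#include <cassert>

// No default constructor is declared, so "rational r;" would not compile.
class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

// Builds the result in the return statement; no named temporary and
// no default construction are needed.
rational operator*(rational r1, rational r2)
{
    return rational(r1.numerator() * r2.numerator(),
                    r1.denominator() * r2.denominator());
}
```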
In fact, the implementation of rationals in Listing 1 and Listing 2 does not use the default rational constructor at all.
Allowing a constructor with default values for all of its arguments to be a default constructor is a recent extension to C++ which is not supported by all current compilers. If you declare class rational with only the constructor
rational(long n = 0, long d = 1);
some compilers will reject declarations such as
rational r1;
With other compilers, you can declare both
rational();
rational(long n = 0, long d = 1);
and the compiler invokes rational() when it encounters declarations like
rational r1;
Tom Plum and I recommend that, for widest portability across today's compilers, you should not assume that a constructor which has default values for all of its arguments is a default constructor[2]. Rather, write the default constructor as a special case, as in
rational() : num(0), denom(1) { }
rational(long n, long d = 1) :
   num(n), denom(d) { }

Conversion Operators
As I explained in the second part of this series, a constructor that accepts one argument of type T provides a user-defined conversion from type T to the class type. For example, the constructor

rational(long n, long d = 1);
can be called with one argument of type long. In fact, it can also be called with an argument of any type, such as char, short int or int, that promotes to long by a standard (predefined) conversion. Thus, you can use any type that promotes to a long in a context that expects a rational, and the compiler will apply the constructor to create a temporary rational object initialized with that long. For example, the assignment

rational r1; ... r1 += 3;
invokes

rational(long n, long d = 1);
to convert 3 to a rational object whose value is 3/1, and then passes that object as the right-hand operand of

rational &operator+=(rational r);
You may also desire to have conversions from a class type like rational to a predefined type like double or float. However, predefined types do not have constructors. Therefore, C++ offers another notation for user-defined conversions — conversion operators.
A conversion operator is a member function of the form operator T() specifying a conversion from the class type to T, where T is a type name for a predefined or user-defined type. For example, adding a rational member function

rational::operator double() { return (double)num / denom; }
provides a conversion from rational to double. Then you can pass a rational object as an argument to a function expecting a double. For example, the standard header math.h declares sqrt as

double sqrt(double);
Given

rational r; double x;
the call

x = sqrt(r);
uses the conversion operator rational::operator double() to convert r to double, and passes the result to sqrt.
Conversion operators do not have an explicit return type — you cannot declare the conversion from rational to double as

double rational::operator double(); // error
The return type is implied by the operator name.
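Here is a minimal sketch of the conversion operator in use. Because the current standard library overloads sqrt for several floating types, the example routes the call through a hypothetical helper, square_root(), that has a single double parameter like the C declaration shown earlier. Declaring the conversion operator const is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    // No return type is written; it is implied by the operator name.
    operator double() const { return (double)num / denom; }
private:
    long num, denom;
};

// Hypothetical helper with one double parameter, standing in for the
// C library's double sqrt(double).
double square_root(double x)
{
    return std::sqrt(x);
}

void check_conversion_operator()
{
    rational r(9, 4);
    double d = r;                 // implicit conversion: 9/4 as a double
    assert(d == 2.25);
    double x = square_root(r);    // r is converted before the call
    assert(x == 1.5);
}
```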

Ambiguous Conversions
Unfortunately, adding operator double() to the rational class in Listing 1 and Listing 2 leads to an ambiguity in arithmetic expressions such as r + 1, where r is a rational. If the rational class has a user-defined conversion from long to rational (via a constructor) and from rational to double (via the conversion operator), then the expression r + 1 has two interpretations:
1) The compiler promotes 1 from int to long (by a standard conversion) and then to rational (by a constructor), and then applies
rational operator+(rational, rational)
to produce a rational result, or
2) the compiler converts r to double (by the conversion operator), promotes 1 to double, and then applies the predefined + operator to produce a double result.
C++ considers these two interpretations to be equally valid, so the expression is ambiguous and produces a diagnostic. (As always, not all compilers behave this way. However, this is the behavior described by the ARM, and most newer compilers diagnose the error.)
Applying an explicit conversion to one of the operands in the expression eliminates the ambiguity. You can apply explicit user-defined conversions (both constructors and conversion operators) by using either C-style casts or function-like casts. For example,
r + (rational)1
or
r + rational(1)
ensures that both operands are rational and forces the compiler to use operator+(rational, rational). On the other hand, either
(double)r + 1
or
double(r) + 1
coerces the first operand to double, and forces the compiler to apply the predefined + operator.
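The sketch below puts both conversions in one cut-down class and uses the explicit forms. The accessors, the const qualifiers, and the unreduced addition are assumptions of the example. The ambiguous expression r + 1 is deliberately left out, since it should not compile.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }       // long to rational
    operator double() const { return (double)num / denom; }   // rational to double
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

rational operator+(rational r1, rational r2)
{
    return rational(
        r1.numerator() * r2.denominator() + r2.numerator() * r1.denominator(),
        r1.denominator() * r2.denominator());
}

void check_explicit_conversions()
{
    rational r(1, 2);

    rational sum = r + rational(1);   // uses operator+(rational, rational)
    assert(sum.numerator() == 3 && sum.denominator() == 2);

    double d = double(r) + 1;         // uses the predefined + on double
    assert(d == 1.5);
}
```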
Alternatively, you can eliminate the ambiguities by renaming the conversion operator
rational::operator double();
as an explicitly-named conversion function, say
double rational::to_double();
Then, expressions such as r + 1 unambiguously invoke operator+(rational, rational), but you must always write conversions from rational to double explicitly, as in
x = sqrt(r.to_double());
When designing classes with overloaded operators and user-defined conversions, you must decide which style produces the most useful and error-free notation. My impression is that most experienced C++ programmers avoid conversion operators and use explicitly-named conversion functions.
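For comparison, here is a sketch of the explicitly-named style. With no conversion operator in the class, r + 1 has only one interpretation. As before, the accessors, the const qualifiers, and the unreduced addition are assumptions of the example.

```cpp
#include <cassert>
#include <cmath>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    // Named conversion: never applied implicitly.
    double to_double() const { return (double)num / denom; }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

rational operator+(rational r1, rational r2)
{
    return rational(
        r1.numerator() * r2.denominator() + r2.numerator() * r1.denominator(),
        r1.denominator() * r2.denominator());
}

void check_named_conversion()
{
    rational r(1, 2);
    rational sum = r + 1;                 // 1 converts to rational; no ambiguity
    assert(sum.numerator() == 3 && sum.denominator() == 2);

    double x = std::sqrt(rational(9, 4).to_double());   // conversion written out
    assert(x == 1.5);
}
```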
Note that, even if you retain the conversion operator from rational to double, there's still no ambiguity in expressions using assignment operators, such as
r += 1
The assignment operators are implemented as member functions. A member function's left operand must be a class object, and the compiler will not apply user-defined conversions to it. The ambiguity only arises in operators implemented as nonmember functions.
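A short sketch of this case follows, with the conversion operator retained and += written as a member. The body of operator+= is an assumption of the example (it does not reduce the result) and may differ from Listing 2.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    operator double() const { return (double)num / denom; }
    // Member function: the left operand is always a rational object.
    rational &operator+=(rational r)
    {
        num = num * r.denom + r.num * denom;
        denom = denom * r.denom;
        return *this;
    }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

void check_compound_assignment()
{
    rational r(1, 2);
    r += 1;             // 1 converts to rational; the member += is selected
    assert(r.numerator() == 3);
    assert(r.denominator() == 2);
}
```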

User-Defined Output
As defined in Listing 1 and Listing 2, rational numbers have an output function
void rational::put(FILE *f)
that puts a rational object to a FILE (as defined in stdio.h). Listing 3 shows a simple test program for rationals that uses rational::put, along with printf and putchar, to display rational numbers to stdout.
Intermixing calls to rational::put with the other stdio output functions is not very convenient. It takes three separate function calls
printf("r1 = ");
r1.put(stdout);
putchar('\n');
to print the line
r1 = <value of r1>
If rationals were a predefined type, such as double, you could replace the three calls with one call to printf
printf("r1 = %f\n", r1);
For rational output to be as convenient as output for predefined types, you must be able to add an additional printf format specifier, like %r or %R. Unfortunately, you cannot extend printf very easily, and even if you could, you would probably run out of mnemonic format specifiers as you or members of your programming team create more and more user-defined types.
C++ provides an alternative input/output library that avoids this problem. You can extend the library so that I/O operations on user-defined types are as convenient as for predefined types. Early implementations of C++ defined the library interface in a header called stream.h. Newer implementations provide an enhanced stream library defined in iostream.h. I shall refer to the library as the iostream library, but the following discussion uses the basic facilities found in both versions.
The library provides output facilities as operations on objects of class ostream. The predefined object cout is the ostream object attached to standard output. Another ostream object, cerr, is attached to standard error.
You write to an ostream by using the overloaded operator <<. << is the predefined left-shift operator, but when used with an ostream, it's called the put to operator. The library defines an operator<< for each of the predefined types. For example,
cout << 100;
uses ostream::operator<<(int) to write the integer 100 to cout, and
cerr << "Error!\n";
uses ostream::operator<<(const char *) to write the string literal "Error!\n" to cerr.
You can string along many output operators in a single statement, such as
cout << "n = " << n << '\n';
Here's how it works. The << operators group from left to right, so the above expression is equivalent to
((cout << "n = ") << n) << '\n';
which is equivalent to
((cout.operator<<("n = ")).operator<<(n)).operator<<('\n');
Every ostream::operator<< has a return value of type ostream & and returns its left operand. Thus
cout.operator<<("n = ")
writes the string literal to cout and returns a reference to cout. Since a reference acts like an object when used in an expression, this reference becomes the left operand of the next operator<<, which in turn, returns that same reference to cout. And so it goes, to the end of the statement.
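The chaining can be checked with a string stream. This sketch uses the current standard headers, where the inserters for characters and strings are nonmember functions, so the explicit member-call notation is shown only with int operands. All three function names are invented for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// The chained form, as normally written.
std::string chained(int n)
{
    std::ostringstream os;
    os << "n = " << n << '\n';
    return os.str();
}

// The same output, one operator at a time. Each call returns a
// reference to the stream, which becomes the next left operand.
std::string step_by_step(int n)
{
    std::ostringstream os;
    std::ostream &s1 = (os << "n = ");
    std::ostream &s2 = (s1 << n);
    s2 << '\n';
    assert(&s1 == &os && &s2 == &os);   // all three name the same stream
    return os.str();
}

// Explicit member-call notation, using int operands only.
std::string member_calls()
{
    std::ostringstream os;
    os.operator<<(4).operator<<(2);
    return os.str();
}
```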
The overloaded << operators defined by the iostream library are members of class ostream. However, when defining an operator<< for a user-defined type like rationals, you are not expected to add new members to the ostream class because you may not have access to the source code for the library. C++ provides facilities so that you can extend the library without rewriting it.
At first, you might try implementing output for rationals as a member function
ostream &rational::operator<<(ostream &os);
Unfortunately, this member function forces you to place the rational object to the left of the <<, as in
r << cout
which is the reverse ordering of the argument types in expressions that output the predefined types. To put the stream argument on the left, you must implement the operator as a nonmember function, as in Listing 4.
Now you hit another snag — the num and denom members of rational objects are private, but nonmember functions typically cannot access private members. One solution is to add functions to the rational class that provide public access to the values inside an object
long rational::numerator() { return num; }
long rational::denominator() { return denom; }
and rewrite the rational output operator in terms of these functions, as in Listing 5. These accessor functions are better than making the num and denom fields public, because they protect the integrity of rational objects by limiting clients to read-only access, but they add functions to the class interface that may only be needed for implementation. As an alternative, you can define the rational output operator as in Listing 4, but declare it as a friend of the rational class.
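An accessor-based output operator might look like the sketch below. The output format num/denom, the const accessors, and the use of the current standard headers are assumptions; Listing 5 may differ in detail.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

class rational
{
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
    long numerator() const { return num; }
    long denominator() const { return denom; }
private:
    long num, denom;
};

// Nonmember, so the stream goes on the left. It reads the value only
// through the public accessors.
std::ostream &operator<<(std::ostream &os, rational r)
{
    return os << r.numerator() << '/' << r.denominator();
}

void check_accessor_output()
{
    std::ostringstream os;
    os << "r1 = " << rational(3, 4) << '\n';
    assert(os.str() == "r1 = 3/4\n");
}
```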
A function that is a friend of class X is a nonmember function that has access to the private members of X. The class must grant the function friendship by a friend declaration inside the class definition. Listing 6 shows the rational class with operator<< declared as a friend function. The friend declaration consists of a complete function declaration preceded by the keyword friend. The function definition appears later outside the class definition, but without the keyword friend.
Since a friend function is a global, nonmember function, its declaration is not affected by the access specifiers inside the class definition. It doesn't matter whether you place the friend declaration among the public or the private class members. However, friend functions are part of the public interface to a class, so most programmers place friend declarations just above or among the public members.
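A friend version can be sketched as follows. Again the num/denom output format and the standard headers are assumptions, and the class is cut down to the parts that matter here. The friend declaration sits above the public members, and the definition outside the class omits the keyword friend.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

class rational
{
    // Grants the nonmember operator access to num and denom.
    friend std::ostream &operator<<(std::ostream &os, rational r);
public:
    rational(long n, long d = 1) : num(n), denom(d) { }
private:
    long num, denom;
};

// Definition appears outside the class, without the keyword friend.
std::ostream &operator<<(std::ostream &os, rational r)
{
    return os << r.num << '/' << r.denom;
}

void check_friend_output()
{
    std::ostringstream os;
    os << "r1 = " << rational(3, 4) << '\n';
    assert(os.str() == "r1 = 3/4\n");
}
```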
Listing 7 shows the test program for rationals rewritten using the iostream library and an overloaded operator<< for rationals. The output expressions freely mix operands of predefined and user-defined types. Using the ostream operators is as convenient as, and much safer than, using printf. printf uses ellipsis (...) arguments, and the compiler checks neither the type nor the number of arguments passed to printf. In contrast, each ostream output statement is actually a sequence of type-safe calls to binary operators.

References
[1] Ellis, Margaret A. and Bjarne Stroustrup, The Annotated C++ Reference Manual. Addison-Wesley, Reading, MA, 1990.
[2] Plum, Thomas and Dan Saks, C++ Programming Guidelines. Plum Hall, Somers Point, NJ, 1991.