Stepping Up To C++

Operator Overloading

Part 2

Dan Saks


Dan Saks is the owner of Saks & Associates, which offers consulting and training in C and C++. He is secretary of the ANSI C++ standards committee, and also contributing editor for Windows/DOS Developer's Journal. Dan recently finished his first book, C++ Programming Guidelines, written with Thomas Plum. You can write to him at 393 Leander Dr., Springfield, OH 45504, or dsaks@wittenberg.edu (Internet), or call (513)324-3601.

In my last column, I introduced operator overloading in C++ (see "Operator Overloading, Part 1," January 1992). Using operator overloading, you can define new meanings for operator symbols applied to objects of user-defined class types. Operator overloading enables the implementation of user-defined types that are nearly indistinguishable from built-in types.

Last time, I used a simple class for rational numbers (fractions) to demonstrate overloading of binary operators. In this article, I'll enhance the rational class to make it more like a built-in type by adding mixed-mode operations (operations combining rationals with other built-in types) and unary operators.

Reviewing The Basics

Listing 1 shows the declaration for class rational as it was when I left off in the first part of this series. Each rational number stores its numerator and denominator in a pair of signed long integers. The class declares eight overloaded binary operators: +, -, *, and /, and their corresponding assignment operators +=, -=, *=, and /=. Using this class you can declare objects of type rational, and perform rational arithmetic using the overloaded operators. For example, if r1, r2, and r3 are rationals, then

r1 += r2;
adds r2 to r1 using rational::operator+=. The statement compiles as if you had written

r1.operator+=(r2);
Similarly,

r1 = r2 + r3;
adds r2 and r3 using rational::operator+ and then copies the result to r1 using rational::operator=. This statement compiles as if you had written

r1.operator=(r2.operator+(r3));
Note that Listing 1 doesn't provide a declaration for rational::operator=. The compiler generates a default version of operator= if you don't write one explicitly. The default version uses memberwise assignment. That is, it copies each member of the right-hand operand to the corresponding member of the left-hand operand using that member's operator=.

Listing 2 presents definitions for the rational member functions. I implemented each arithmetic operator (e.g. +) in terms of its corresponding assignment operator (e.g. +=). Note that the first line in the body of each arithmetic operator is the declaration

rational result(*this);
which uses the rational copy constructor

rational::rational(const rational &);
to initialize the local variable result with a copy of *this. (Recall that this is the address of the object for which the member function was called.) I did not declare the copy constructor explicitly; I let the compiler generate it.
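
As a minimal sketch of the pattern just described (it is not the code from Listing 2), the following class implements operator+ in terms of operator+= by copying *this into a local result. The file-local gcd helper and the accessors numerator() and denominator() are assumptions added so the fragment stands alone; the copy constructor and operator= are left for the compiler to generate.

```cpp
#include <cassert>

// Assumed helper (not Listing 3): Euclid's algorithm on absolute values.
static long gcd(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

class rational
{
public:
    rational(long n, long d) : num(n), denom(d)
    {
        assert(d != 0);   // a zero denominator is not a rational number
        simplify();
    }

    // The assignment operator does the real work and returns the left operand.
    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        simplify();
        return *this;
    }

    // The arithmetic operator copies *this, then reuses operator+=.
    rational operator+(const rational &r) const
    {
        rational result(*this);   // compiler-generated copy constructor
        result += r;
        return result;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    void simplify()
    {
        long g = gcd(num, denom);   // never zero here, because denom != 0
        num /= g;
        denom /= g;
        if (denom < 0) { num = -num; denom = -denom; }
    }

    long num, denom;
};
```

With this sketch, r1 = r2 + r3 and r1.operator=(r2.operator+(r3)) are two spellings of the same calls; for r2 equal to 1/2 and r3 equal to 1/3, both leave 5/6 in r1.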

The member function rational::put(FILE *) prints a rational number in the format (num/denom). rational::simplify is a private member function that reduces a rational number to its simplest form. (My method for reducing fractions is admittedly too simple for an industrial-strength class, but the fine points of fractional arithmetic are secondary to my presentation of overloaded operators. Zeidler[1] presents a better algorithm for reducing fractions.)

simplify uses a general-purpose library function gcd that computes the greatest common divisor of its arguments. I presented a version of gcd in the first part of this series, but since then I've discovered problems with the implementation. Listing 3 contains a new version of gcd based on an algorithm also provided by Zeidler[1]. I compiled gcd separately and placed the object code in a linkable library called mylib. simplify accesses gcd's declaration by including mylib.h.
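
For readers without the listings at hand, here is one possible shape for gcd and for the reduction step. It uses Euclid's algorithm, which is an assumption on my part and not necessarily the algorithm of Listing 3 or of Zeidler's article. The free function reduce is a hypothetical stand-in for rational::simplify that works on a numerator and denominator passed by reference.

```cpp
#include <cassert>

// Greatest common divisor by Euclid's algorithm; the result is never negative.
long gcd(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long remainder = a % b;
        a = b;
        b = remainder;
    }
    return a;
}

// Hypothetical stand-in for rational::simplify: divides out the common
// factor and keeps the sign in the numerator.
void reduce(long &num, long &denom)
{
    assert(denom != 0);
    long g = gcd(num, denom);   // at least 1, because denom != 0
    num /= g;
    denom /= g;
    if (denom < 0) {
        num = -num;
        denom = -denom;
    }
}
```

Under these assumptions, reducing 6/-8 yields -3/4, and a zero numerator reduces to 0/1.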

Constructors As Conversions

C++, like C, provides standard conversions among the arithmetic types. These rules simplify writing arithmetic expressions. For example, when you mix ints and longs in the same expression, the compiler automatically promotes the ints to longs, and does all the computations in long arithmetic. Similarly, you can initialize a long with an int expression, such as

long n = 0;
The compiler automatically converts the integer literal 0 to a long value. You need not write

long n = 0L;
The compiler also performs this conversion when you pass an int to a function that expects a long argument.
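
The three conversions just mentioned can be seen in a few lines. This is only an illustrative sketch; the function names twice, sum_is_long, and convert_demo are made up for the example.

```cpp
#include <cassert>

// A function that expects a long argument.
long twice(long n)
{
    return 2 * n;
}

// Mixing int and long: the int operand is converted, so the sum has type
// long and therefore the size of a long.
bool sum_is_long()
{
    int i = 1;
    long n = 2;
    return sizeof(i + n) == sizeof(long);
}

// Initializing a long from an int literal and passing an int where a long
// is expected both rely on the same standard conversion.
long convert_demo()
{
    int i = 21;
    long n = 0;      // int literal converted to long
    n = twice(i);    // int argument converted to long
    assert(n == 42L);
    return n;
}
```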

Unfortunately, the rational class defined in Listing 1 does not provide automatic conversions to or from other arithmetic types. For example, you cannot write an expression like

rational r1, r2;
...
r1 = r2 + 1;
You must write it as

r1 = r2 + rational(1, 1);
The problem is that rational::operator+ requires two rational operands, and the literal 1 is an int, not a rational.

You could solve this problem by extending the class to include a member function rational::operator+(long n), as shown in Listing 4. The return statement in that function uses the constructor call rational(n, 1) to convert long n to its equivalent rational value, n/1. Adding this member function to the class lets you write expressions like

r1 = r2 + 1L;
It even lets you write

r1 = r1 + 1;
because the compiler promotes 1 to long before calling rational::operator+(long).

The problem with this approach is that you must also add

rational &operator+=(long);
rational operator-(long);
rational &operator-=(long);
and so on, for every operator defined by the class. A class written this way gets very big very fast, and contains lots of duplicated code.
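
To make the cost of this approach concrete, here is a rough sketch of what a single extra overload looks like. It is not Listing 4 itself: the accessors are hypothetical, and reduction to lowest terms is left out for brevity. Every other operator would need a similar twin.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        return *this;
    }

    rational operator+(const rational &r) const
    {
        rational result(*this);
        result += r;
        return result;
    }

    // The extra overload: convert n to n/1, then reuse the rational version.
    rational operator+(long n) const
    {
        return *this + rational(n, 1);
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;   // not reduced to lowest terms in this sketch
};
```

With r2 equal to 1/2, both r2 + 1L and r2 + 1 select operator+(long) and yield 3/2.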

If you also add operator=(long) to the class, you can write

rational r;
...
r = 3;
which assigns 3/1 to r. Unfortunately, even with this additional operator, you still can't write declarations like

rational r = 3;
Remember, this is not an assignment; it's an initialization that uses a constructor called with one argument, as if you had written either

rational r = rational(3);
or

rational r (3);
To overcome this problem, you must add a rational constructor that accepts a single argument of type long, defined as

rational (long n): num(n), denom(1) {}
It turns out that this one constructor eliminates the need for all those extra operator functions that accept a single long argument. A class constructor with one argument acts as a rule for converting the argument type to the class type. For example, adding the above constructor to the class definition in Listing 1 lets you write

rational r;
...
r = 3;
In translating the assignment, the compiler finds only one assignment operator that it could possibly use, namely

rational &rational::operator=(const rational &);
as generated by the compiler. The compiler promotes 3 to long and passes the result to the rational(long) constructor to create a rational object for use as the right-hand operand in the assignment. If the constructor is written as an inline function, a good optimizing compiler will "compile away" any temporary objects, and produce code equivalent to

r.num = 3;
r.denom = 1;
The compiler applies the rational(long) constructor whenever it needs to convert the right-hand operand of a rational operator. An expression like

r /= 2;
compiles as if you had written

r.operator/=(rational(2));
and converts the 2 to the rational value 2/1 before passing it to operator/=. Similarly,

r1 = r2 * 2;
compiles as if you had written

t1 = r2.operator*(rational(2));
r1.operator=(t1);
where t1 is a temporary rational object added for readability.
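
The following sketch gathers these cases in one place. It assumes a cut-down rational with only the multiplicative operators, a file-local gcd based on Euclid's algorithm, and hypothetical accessors; none of it is taken from the listings. The one-argument constructor is the only thing that lets the integer operands through.

```cpp
#include <cassert>

// Assumed helper (not Listing 3): Euclid's algorithm on absolute values.
static long gcd(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

class rational
{
public:
    // One-argument constructor: doubles as the long-to-rational conversion.
    rational(long n) : num(n), denom(1) {}

    rational(long n, long d) : num(n), denom(d)
    {
        assert(d != 0);
        simplify();
    }

    rational &operator*=(const rational &r)
    {
        num *= r.num;
        denom *= r.denom;
        simplify();
        return *this;
    }

    rational &operator/=(const rational &r)
    {
        assert(r.num != 0);            // division by zero
        long n = r.num, d = r.denom;   // copies, in case r is *this
        num *= d;
        denom *= n;
        simplify();
        return *this;
    }

    rational operator*(const rational &r) const
    {
        rational result(*this);
        result *= r;
        return result;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    void simplify()
    {
        long g = gcd(num, denom);   // never zero here, because denom != 0
        num /= g;
        denom /= g;
        if (denom < 0) { num = -num; denom = -denom; }
    }

    long num, denom;
};
```

Given this class, r = 3, r /= 2, r * 2, and the initialization rational q = 3 all compile, each by way of a temporary built with rational(long).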

The implicit conversion from integral types to rational lets you write code that looks even more as if rationals were built in. However, it also lets you inadvertently write less efficient code. For example, you can rewrite

r1 = r2 * rational(2,3) + r3;
as

r1 = r2 * 2/3 + r3;
Despite the spacing, C++'s precedence rules interpret the latter expression as

r1 = ((r2 * 2) / 3) + r3;
which compiles as

t1 = r2.operator*(rational(2));
t2 = t1.operator/(rational(3));
r1 = t2.operator+(r3);
(Again, t1 and t2 are temporary rational objects added only for readability.) Whereas the expression rational(2,3) explicitly creates a single object whose value is 2/3, the expression 2/3 gets regrouped into two separate rational objects and introduces a separate call to rational::operator/.

Note that you cannot create a single rational object by simply adding parentheses

r1 = r2 * (2/3) + r3;
In this case, the compiler sees that 2 and 3 are integer literals, so it performs 2/3 using integer division. The result is an integer-valued zero, which is converted to rational when used as the right-hand operand of rational::operator*.

The compiler doesn't know that you intended 2/3 to be a rational constant. It doesn't promote the operand types if they are already the same type. This situation is no different than if r1, r2, and r3 had been float or double variables. 2/3 grouped as such always yields zero.
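
The three spellings can be compared directly. The sketch below is an illustration under assumptions, not the article's class: it adds a hypothetical static counter, conversions, that records each use of the rational(long) constructor, along with a file-local gcd and accessors for inspection.

```cpp
#include <cassert>

// Assumed helper (not Listing 3): Euclid's algorithm on absolute values.
static long gcd(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

class rational
{
public:
    static int conversions;   // hypothetical counter, for illustration only

    rational(long n) : num(n), denom(1) { ++conversions; }
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); simplify(); }

    rational &operator*=(const rational &r)
    {
        num *= r.num;
        denom *= r.denom;
        simplify();
        return *this;
    }

    rational &operator/=(const rational &r)
    {
        assert(r.num != 0);            // division by zero
        long n = r.num, d = r.denom;   // copies, in case r is *this
        num *= d;
        denom *= n;
        simplify();
        return *this;
    }

    rational operator*(const rational &r) const
    {
        rational result(*this);
        result *= r;
        return result;
    }

    rational operator/(const rational &r) const
    {
        rational result(*this);
        result /= r;
        return result;
    }

    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    void simplify()
    {
        long g = gcd(num, denom);   // never zero here, because denom != 0
        num /= g;
        denom /= g;
        if (denom < 0) { num = -num; denom = -denom; }
    }

    long num, denom;
};

int rational::conversions = 0;
```

For r2 equal to 3/4, r2 * rational(2, 3) and r2 * 2/3 both produce 1/2, but the second form performs two conversions and an extra division. r2 * (2/3) performs one conversion and produces 0/1, because 2/3 is evaluated as integer division first.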

Non-Member Operator Functions

When you overload operators using member functions, the compiler can apply user-defined conversions to the right-hand argument, but not to the left. It follows that the left operand must be a class object. Thus you can write expressions like

r1 = r2 * 2;
but not like

r1 = 2 * r2;
because 2 is not a class object. The compiler will not take the liberty of commuting the operands to produce

r1 = r2 * 2;
because some operations, like - and /, are not commutative.

If you want the compiler to apply user-defined conversion to the left operand as well as the right operand of an overloaded operator, then implement that operator as a non-member function. For example, you move the member function

rational rational::operator+(rational r);
outside the class declaration, and rewrite its declaration as the non-member function

rational operator+(rational r1, rational r2);
Notice that although the member function appears to have only one formal argument, the non-member function has two. But a member function really has an extra (hidden) argument: the address of the object for which it was called, available inside the function as this. When you rewrite the operator as a non-member function, you must add a second (or is it a first?) explicit argument.

Changing this operator from a member to a non-member function doesn't affect any of the calls to operator+ using infix notation. The fact is, when the compiler sees r1 + r2, where either r1 or r2 is class type, it tries compiling the expression as either r1.operator+(r2) or operator+(r1, r2), and uses whichever one it finds.

If you define both forms of operator+ accepting identical argument types, then when you write r1 + r2, the compiler produces a diagnostic that says the expression is an ambiguous reference. If you define both forms of operator+ accepting sufficiently different argument types, the compiler selects the best match according to an elaborate set of argument matching rules. Ellis and Stroustrup[2] and Stroustrup[3] describe these rules in detail. I will explain them in a future column.

Listing 5 shows a new version of the header rational.h that declares class rational with the arithmetic operators +, -, *, and / implemented as non-member functions. The body of each function is a one-liner that uses the corresponding assignment operator to do the computation. For example, the body of operator+ is simply

return r1 += r2;
The operator function doesn't need a local rational object to hold the result; it uses its local copy of the left-hand operand for temporary storage. Since the left operand of operator+ is passed by value, changing the value of the formal parameter has no effect on the actual argument passed to operator+. The arithmetic operators are so short, it's appropriate to declare them as inline functions in the header.

When the operators were member functions, they had access to the private members of rational objects. When you rewrite them as non-member functions they lose their access rights, but the loss poses no problem because these functions never exercised those rights. For example, all the real work of operator+ is done by calling rational::operator+=, which still has all its private access rights. operator+= manipulates the private num and denom members of the rational objects, and returns a reference to a complete rational object.
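
A condensed sketch of this arrangement follows. It is not Listing 5: only + and * are shown, reduction to lowest terms is omitted, and the accessors are hypothetical. The point is that the non-member operators touch nothing private, yet accept a converted operand on either side.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n) : num(n), denom(1) {}   // conversion from long
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        return *this;
    }

    rational &operator*=(const rational &r)
    {
        num *= r.num;
        denom *= r.denom;
        return *this;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;   // not reduced to lowest terms in this sketch
};

// Non-member operators: r1 is a by-value copy, so modifying it is harmless.
inline rational operator+(rational r1, rational r2)
{
    return r1 += r2;
}

inline rational operator*(rational r1, rational r2)
{
    return r1 *= r2;
}
```

With r2 equal to 1/3, both 2 * r2 and r2 * 2 now compile and give 2/3, and 1 + r2 gives 4/3, while r2 itself is left unchanged.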

Using const Reference Arguments

The size of a rational object is the sum of the sizes of its data members, namely, 2 * sizeof(long). This is rarely smaller than a pointer, and is often two or four times as large. Therefore, you may want to pass the arguments to these operators by const reference rather than by value. (The underlying implementation of a reference is a pointer, so passing by reference is often cheaper than passing a large object by value.) You can certainly do this with the member functions, e.g.

rational &operator+=(const rational &);
without changing anything else in the function definition or in any function call. You can also do it with the second argument of the non-member functions, e.g.,

inline rational operator+(rational r1, const rational &r2);
However, as implemented in Listing 5, you can't change the first argument, r1, to a const reference because the function alters r1's value. If you insist on changing r1 to a const reference, then you must use a local rational variable to hold the result, as shown in Listing 6, but I doubt you gain anything by writing the function this way.
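
For comparison, a version with both operands passed by const reference might look like the sketch below. It is written in the spirit of Listing 6 but is not that listing; the class is cut down to what operator+ needs, omits reduction to lowest terms, and adds hypothetical accessors.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n) : num(n), denom(1) {}   // conversion from long
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        return *this;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;   // not reduced to lowest terms in this sketch
};

// Neither operand may be modified, so the sum is built in a local copy.
inline rational operator+(const rational &r1, const rational &r2)
{
    rational result(r1);
    result += r2;
    return result;
}
```

The copy that pass-by-value made implicitly is now made explicitly inside the function, which is why little is gained for the first operand.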

Unary Operators

Built-in arithmetic types support a number of unary operators. Rationals should too. Listing 7 shows the class declaration in rational.h extended to include six unary operators: +, -, prefix ++ and --, and postfix ++ and --.

In general, you overload unary operators using either a member function with no arguments or a non-member function with one argument. (The exceptions are postfix ++ and --, explained later.) For example, as a member function, you declare rational unary + as

rational rational::operator+();
As a non-member function, declare it as either

rational operator+(rational);
or

rational operator+(const rational &);
In Listing 7, I used the member function notation.

The implementation of unary + is trivial, so I simply defined it inside the class definition. Hence, the function is inline by default. Unary + simply returns the value of its operand.

I try to make overloaded operators for user-defined types act as much as possible as they do for predefined types. Unary + applied to predefined types returns an rvalue, not an lvalue. The return value of a function is an rvalue unless it returns a reference. Therefore, I made the return value of rational's unary + a rational, rather than a rational &.

The implementation of unary - is a little more difficult. The function definition appears in Listing 8 along with the other non-inline unary operator functions. The return value is simply the value of the operand with its numerator negated. However, you must be careful to leave the operand unchanged. If the body of the function were simply

num = -num;
return *this;
it would negate the operand itself, and you would get surprising behavior. For example, if negation changed its operand, then the second statement in

r1 = rational (1, 1); // r1 = 1/1
r2 = -r1;             // r2 = -1/1
would also change r1 to -1/1. To avoid this odd behavior, rational unary - computes the negation in a local variable.
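
A minimal sketch of the two sign operators, assuming a bare-bones rational with hypothetical accessors, shows the local copy at work. It is not the code from Listing 8, and declaring the operators const is my own choice here, made so that the compiler would reject an implementation that modified the operand.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    // Unary +: returns the operand's value (an rvalue, not a reference).
    rational operator+() const { return *this; }

    // Unary -: negates a local copy, leaving the operand untouched.
    rational operator-() const
    {
        rational result(*this);
        result.num = -result.num;
        return result;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;
};
```

After r2 = -r1 with r1 equal to 1/1, r2 holds -1/1 and r1 still holds 1/1.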

Early versions of C++ (those compatible with AT&T cfront up to release 2.0) did not distinguish between prefix and postfix applications of overloaded ++ and --. That is, for a class like rational you could write a single function operator++ (either as a member or non-member) and invoke that function as either ++r or r++. There was no way to get ++r and r++ to behave differently, as they do for primitive types.

Newer C++ compilers let you overload the prefix and postfix operators separately. You declare the prefix operator++ much as you do any other unary operator using a member function declared as

rational rational::operator++();
or a non-member function declared as

rational operator++(rational &);
You declare the postfix operator++ with an additional argument of type int, as a member function

rational rational::operator++(int);
or as a non-member function

rational operator++(rational &, int);
For a rational r, the expression ++r compiles as the call

r.operator++()
and the expression r++ compiles as

r.operator++(0)
You can use an explicit function call to the postfix operator to pass a non-zero value as the int argument, but I haven't yet seen a reason to do this. In practice, the int argument is a dummy argument that distinguishes the postfix from the prefix operator.

The implementations of the prefix operators are very simple. For predefined types, ++r is defined to be r+=1, and that's exactly what the overloaded prefix rational ++ does. The postfix rational ++ must do a little more work to set aside a copy of the prior value of the operand while it increments the operand.
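
Written as member functions, the two increment operators might be sketched as follows. This is not the code from Listing 8; the class is reduced to what ++ needs, omits reduction to lowest terms, and adds hypothetical accessors.

```cpp
#include <cassert>

class rational
{
public:
    rational(long n) : num(n), denom(1) {}   // conversion from long
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        return *this;
    }

    // Prefix ++: increment, then return the new value by value.
    rational operator++()
    {
        return *this += 1;
    }

    // Postfix ++: the unnamed int parameter only marks the postfix form.
    rational operator++(int)
    {
        rational old(*this);   // set aside the prior value
        *this += 1;
        return old;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;   // not reduced to lowest terms in this sketch
};
```

Starting from r equal to 1/2, ++r returns 3/2; a following r++ also returns 3/2 but leaves r at 5/2.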

For predefined types, the operand of prefix and postfix ++ and -- must be a modifiable lvalue, but the result is not an lvalue. This means, for example, that ++2 and K++ (where K is a const variable) are invalid. The overloaded rational operators should try to preserve this behavior. Unfortunately, as member functions, operator++ and operator-- will accept operands other than modifiable lvalues. For example, the member function accepts a call like ++rational(1, 2). However, you should be able to prevent such expressions by defining operator++ as the non-member function

inline
rational operator++(rational &r)
   {
   return r += 1;
   }
That is, the rational argument is passed as a non-const reference. Since a rational & argument can only be bound to a modifiable rational object (an lvalue), this function definition should not accept a call like ++rational(1, 2). Therefore, I recommend overloading ++ and -- this way.

I say that this should be able to prevent non-modifiable lvalue operands because that is my understanding of the rules for references defined by the Annotated C++ Reference Manual (the ARM)[2], and by the C++ draft standard. However, as I explained in an earlier column on references (see "Reference Types", CUJ, September 1991) some current compilers still bind non-const references to a constant expression by binding the reference to a temporary initialized by the value of the expression. Since I haven't had the opportunity to use a compiler that agrees with my interpretation, there's room for some debate about my recommendation.
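
Standard C++ as later published does forbid binding a non-const reference to a temporary, so a conforming compiler rejects ++rational(1, 2) for the non-member form. The sketch below checks this at compile time. It assumes a C++11 or later compiler, and the can_preincrement probe is a hypothetical helper written for this illustration; the cut-down class and its accessors are likewise assumptions.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class rational
{
public:
    rational(long n) : num(n), denom(1) {}   // conversion from long
    rational(long n, long d) : num(n), denom(d) { assert(d != 0); }

    rational &operator+=(const rational &r)
    {
        num = num * r.denom + r.num * denom;
        denom *= r.denom;
        return *this;
    }

    // Hypothetical accessors, added only so the values can be inspected.
    long numerator() const { return num; }
    long denominator() const { return denom; }

private:
    long num, denom;   // not reduced to lowest terms in this sketch
};

// Non-member prefix ++: the operand must bind to a non-const reference.
inline rational operator++(rational &r)
{
    return r += 1;
}

// Hypothetical probe: true only if ++ applies to an expression of type T.
template <class T, class = void>
struct can_preincrement : std::false_type {};

template <class T>
struct can_preincrement<T, decltype(void(++std::declval<T>()))>
    : std::true_type {};
```

Here can_preincrement<rational &>::value is true, because an lvalue operand binds to the reference, while can_preincrement<rational>::value is false, because a temporary operand such as rational(1, 2) does not.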

References

[1] Zeidler, Steve, "Doing Fractions in C++", The C Users Journal, Vol. 9, No. 11, Nov. 1991.

[2] Ellis, Margaret A. and Bjarne Stroustrup, The Annotated C++ Reference Manual. Addison-Wesley, Reading, MA, 1990.

[3] Stroustrup, Bjarne, The C++ Programming Language, 2nd. ed. Addison-Wesley, Reading, MA, 1991.