

Stepping Up To C++

Operator Overloading, Part 4

Dan Saks


Dan Saks is the owner of Saks & Associates, which offers consulting and training in C and C++. He is secretary of the ANSI C++ standards committee, and also contributing editor for Windows/DOS Developer's Journal. Dan recently finished his first book, C++ Programming Guidelines, written with Thomas Plum. You can write to him at 393 Leander Dr., Springfield, OH 45504, or dsaks@wittenberg.edu (Internet), or call (513)324-3601.

This is my fourth and final column on operator overloading in C++. Throughout the earlier parts of this series (January '92, March '92 and May '92), I used a class for rational numbers (fractions) to demonstrate the techniques and design considerations for overloading operators.

My rational class represents the numerator and denominator of each rational number by a pair of signed long integers. By the end of the third part in this series, the class had the following operations:

Listing 1 shows a version of the header rational.h that declares these functions. (This version of the header omits implementation details, like the bodies of functions that I previously defined as inlines.) In part 2 of this series, I went to some pains to explain why operators ++ and -- should be non-member functions. However, in part 3, I accidentally wrote them as members. Listing 1 presents them as non-members as they should be.

Last time, I introduced the header iostream.h that defines the interface to the stream input/output library. I described the most basic stream output operators, and then defined a new stream output operator for rationals. In this article, I'll describe the basic stream input facilities, and then extend them to support rationals also. After that, I will consider what remains to be done to make rationals an "industrial strength" class.

Basic Stream Input Facilities

The iostream library provides input facilities as operations on objects of type istream (an input stream). The predefined object cin (pronounced "see-IN") is the istream object attached to standard input. (I neglected to mention last time that the output stream cout is pronounced "see-OUT". I believe cerr is pronounced "see-UR".)

Just as the library overloads << (the left-shift operator) for ostream output, it uses >> (the right-shift operator) for istream input. Most C++ iostream libraries define istream::operator>> for most predefined types. Table 1 lists a typical set of overloaded operator>> for istream. I have to hedge my statements with words like "typical" because this library, like most of the language, is still in the early stages of standardization, so most libraries tend to leave out one or two of the functions in my "typical" list.

The compiler selects the appropriate operator>> based on the right-hand argument in the >> expression. For example,

int n;
...
cin >> n;
uses istream::operator>>(int &) to read from cin into int n. The int argument is a non-const reference so that the operator can modify it. As another example,

char name[100];
...
cin >> name;
uses istream::operator>>(char *) to read from cin into the character array name.

Every istream::operator>> skips leading whitespace characters before reading the input value. (Whitespace is any character c for which the standard C function isspace(c) is true.) When reading a numeric type, like int or double, operator>> reads the characters representing the number using an appropriate format and converts them to the numeric representation. The operator stops reading when it encounters a character that doesn't match the input format. operator>> doesn't read that non-matching character; it leaves it in the input to be read by the next input operation.

For example, if the input is

   35
 -1.2 78
then

int i;
...
cin >> i;
skips spaces, reads and converts the 35 to an int using a format equivalent to the %d format used with scanf, and stores the result in i. It leaves the whitespace immediately after the 5 as the next character to be read. Subsequently,

double d;
...
cin >> d;
skips whitespace up to the '-', and then reads -1.2 into d using the equivalent of scanf's %lf format. The next input operation will start scanning at the whitespace immediately after the 2.
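
The following sketch is one way to watch this behavior, assuming a present-day compiler where the library lives in namespace std and a std::istringstream (from <sstream>) stands in for cin. The names scan_result and scan_int_then_double are hypothetical, invented for this illustration. The function peeks at the stream after each read to see which character was left behind.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of what two successive reads saw.
struct scan_result
   {
   int i;
   double d;
   char after_int;      // next unread character after reading i
   char after_double;   // next unread character after reading d
   };

// Reads an int and then a double from text, peeking at the
// character left in the stream after each read.
scan_result scan_int_then_double(const std::string &text)
   {
   std::istringstream in(text);   // stands in for cin
   scan_result r = { 0, 0.0, '\0', '\0' };
   in >> r.i;
   r.after_int = static_cast<char>(in.peek());
   in >> r.d;
   r.after_double = static_cast<char>(in.peek());
   return r;
   }
```

Called with the two input lines shown above (joined by a newline), the function should report the newline as the character waiting after the 35 and a space as the character waiting after the -1.2.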

istream::operator>> even skips leading whitespace when reading characters. For example,

char c;
...
cin >> c;
uses operator>> (char &), which skips leading whitespace and reads a single non-whitespace character into c.

istream::operator>> (char *) skips leading whitespace and then reads non-whitespace characters up to, but not including, the next whitespace. The function copies the non-whitespace characters and a terminating null character into the string addressed by its char * operand. For example,

char s[100];
...
cin >> s;
reads the input

hello, world
up to, but not including, the space after the comma and copies hello, (along with the terminating null) into s.
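
As a small sketch of the character and string reads just described, again assuming a present-day compiler and a std::istringstream in place of cin: first_nonspace and first_word are hypothetical helper names. The std::setw call is an addition of mine, not something the discussion above requires; it merely limits how many characters the string read may store.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Returns the first non-whitespace character in text,
// or '\0' if there is none.
char first_nonspace(const std::string &text)
   {
   std::istringstream in(text);
   char c = '\0';
   in >> c;                     // skips leading whitespace first
   return c;
   }

// Returns the first whitespace-delimited word in text.
std::string first_word(const std::string &text)
   {
   std::istringstream in(text);
   char s[100] = "";
   // setw guards the buffer: at most sizeof s - 1 characters
   // plus the terminating null are stored.
   in >> std::setw(static_cast<int>(sizeof s)) >> s;
   return s;
   }
```

With the input hello, world, first_word should return hello, (comma included), because the read stops at the space.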

As with the output operators, you can put more than one input operator in a single input expression. For example,

int i;
double d;
...
cin >> i >> d;
reads from cin into i, and then from cin into d. Just like <<, >> groups from left to right, so the above expression is equivalent to

(cin >> i) >> d;
which, in turn, means

(cin.operator>>(i)).operator>>(d);
Every istream::operator>> returns its left operand (of type istream &) so that

cin.operator>>(i)
reads from cin into i and returns a reference to cin. This reference becomes the left operand of the next call to operator>>. No matter how many >>s appear in the line, the left operand of each >> is always the istream at the beginning of the line.
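
One way to convince yourself of this is to compare addresses. The sketch below assumes a present-day standard library, where the int and double input operators are members of std::istream; chain_returns_same_stream is a hypothetical name used only here.

```cpp
#include <istream>
#include <sstream>

// Reads an int and a double using explicit operator-function calls,
// and reports whether each call returned the very same stream.
bool chain_returns_same_stream(std::istream &in, int &i, double &d)
   {
   std::istream &after_first = in.operator>>(i);            // in >> i
   std::istream &after_second = after_first.operator>>(d);  // ... >> d
   return &after_first == &in && &after_second == &in;
   }
```

Passing cin (or any other istream) as the first argument should yield true, since every call hands back a reference to the stream it was given.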

Detecting End-of-File and Errors

When used inside the controlling expression of a conditional statement, like an if- or while-statement, an istream input expression like cin >> n evaluates to a non-zero value if it succeeds and zero if it fails. A failure can be caused by encountering either EOF (end-of-file) or an error. For example,

double n;
while (cin >> n)
   {
   ...
   }
repeatedly reads numbers (as doubles) from cin into n, and stops when it encounters EOF or an error.

Listing 2 shows a small, but complete, program that uses this technique. The program computes the arithmetic mean (the average) of a sequence of numbers read from standard input. It stops reading when it encounters EOF or an error.
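
The core of such a program might look something like the following. This is only a sketch of the technique, not a copy of Listing 2. It assumes a present-day compiler, and the function name mean and the choice to return zero for empty input are my own.

```cpp
#include <istream>
#include <sstream>

// Averages the numbers read from in, stopping at EOF or at the
// first input that is not a number. Returns 0 if nothing was read
// (an arbitrary choice made for this sketch).
double mean(std::istream &in)
   {
   double sum = 0.0;
   long count = 0;
   double x;
   while (in >> x)
      {
      sum += x;
      ++count;
      }
   return count > 0 ? sum / count : 0.0;
   }
```

A complete program would need little more than a main that writes the result of mean(cin) to cout.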

Pray tell, how does this work? Earlier, I said that an expression like cin >> n returns a reference, yet here I've said the value is either zero or non-zero. Which is it? Notice I said that an istream::operator>> expression returns a reference, but it evaluates to zero or non-zero when used in a conditional. This subtle distinction hinges on the semantics of selection statements in C++.

In C, the controlling expression in a conditional statement must have scalar type. That is, it must have an arithmetic or pointer type. However, in C++, a controlling expression must either have scalar type, or it must be a class type for which an unambiguous conversion to a scalar type exists. In other words, you can use an expression that yields a class object (or a reference to a class object) as the controlling expression in a conditional statement if the class has exactly one conversion operator that converts the class type to some scalar type. (I introduced conversion operators in "Operator Overloading, Part 3", May 1992.)

Class istream has a conversion operator that converts an istream to a void *. The function definition typically looks something like

istream::operator void *()
   {
   if (previous operation succeeded)
      return this;
   else
      return 0;
   }
That is, the function examines the internal state (some private data) of the istream to determine if the previous input operation succeeded. If so, the function returns a non-null pointer; otherwise, it returns 0 (the null pointer). this — the pointer to the istream object — is a convenient choice for a non-null pointer. It will not cause addressing problems that might arise from choosing an arbitrary non-zero value.

Each istream (and, in fact, each ostream) stores its success and failure states as a set of bits tucked away in a private data member. Even though you can't touch the bits directly, you can inspect, set, and reset them via member functions.

For example, you can test for EOF on cin without trying to read anything by calling cin.eof(). istream::eof returns a non-zero value if EOF has occurred, and zero otherwise. Other istream state query functions are:

If either good or EOF is true, the previous operation succeeded. However, EOF means the next operation won't. Attempting to read from a stream when EOF is true sets the fail indicator. Aside from that, attempting to read from a stream whose state is not good generally has no effect.
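
To see these indicators change, you can capture them after each read. The sketch below assumes a present-day standard library, in which the query functions good, eof, fail, and bad are members of std::ios; state_snapshot and snapshot are hypothetical names.

```cpp
#include <ios>
#include <sstream>

// Hypothetical record of a stream's four state queries.
struct state_snapshot
   {
   bool good, eof, fail, bad;
   };

// Captures the current results of the state query functions.
state_snapshot snapshot(const std::ios &s)
   {
   state_snapshot r = { s.good(), s.eof(), s.fail(), s.bad() };
   return r;
   }
```

For instance, reading an int from a stream that holds only 42 should succeed but leave eof true, because the read ran into the end of the input while looking for more digits. A second read attempt should then set fail as well.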

In addition to indicating an attempt to read after EOF, fail may indicate a "soft" error, like a formatting error. You can resume reading by clearing the istream's fail indicator. For example, when

float f;
...
cin >> f;
sees input like

$1.2
it simply stops at the $ and sets cin's fail indicator. The $ remains the next character that will be read. To clear the fail indicator so you can continue reading, call

cin.clear();
To simply discard the offending character without reading it, call

cin.ignore();
The next call to cin >> f should read the value 1.2.
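
Put together, the recovery steps might be packaged as below. This is a minimal sketch assuming a present-day compiler; the function name is hypothetical, and it makes only one recovery attempt, which is enough for a single stray character such as the $.

```cpp
#include <istream>
#include <sstream>

// Tries to read a float. If the first attempt fails, clears the
// fail indicator, discards one character, and tries once more.
float read_float_skipping_one_bad_char(std::istream &in)
   {
   float f = 0.0f;
   if (!(in >> f))
      {
      in.clear();    // reset the stream to a good state
      in.ignore();   // discard the offending character
      in >> f;       // second and final attempt
      }
   return f;
   }
```

If the failure was caused by EOF rather than by a bad character, the second attempt fails too, and the caller can detect that by testing the stream afterward.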

The bad indicator usually means the stream is corrupt, so further reading is probably impossible. bad might indicate a hardware failure, or that the stream object has been damaged by a stray pointer somewhere else in the program. (Yup, C++ is still C.)

Thus, the implementation of the conversion operator istream::operator void * is typically something like

istream::operator void *()
   {
   // if (previous operation succeeded)
   if (!fail())
      return this;
   else
      return 0;
   }
or, more simply

istream::operator void *()
   {
   return fail() ? 0 : this;
   }
This lets you write tests like

while (cin >> i) { ... }
or just

if (cin) { ... }
to determine if the stream is still readable. But how about testing if the stream is not readable? As you might expect, you can write

if (!cin) { ... }
because istream also overloads the ! operator as

int istream::operator!()
   {
   return fail();
   }
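
The same mechanism works for any class, not just istream. Here is a toy class, entirely hypothetical, that uses the two operators in the way just described so you can try them in isolation. (Present-day standard libraries declare an explicit conversion to bool in place of the conversion to void *, but the tests you write look the same.)

```cpp
// A toy class (not part of any library) with a single fail flag.
class toy_stream
   {
public:
   toy_stream() : failed(false) { }
   void set_fail() { failed = true; }
   int fail() const { return failed; }
   operator void *() { return fail() ? 0 : this; }
   int operator!() { return fail(); }
private:
   bool failed;
   };

// True if s can appear as a "successful" controlling expression.
bool usable(toy_stream &s)
   {
   if (s)      // applies toy_stream::operator void *
      return true;
   return false;
   }

// True if s reports failure through operator!.
bool unusable(toy_stream &s)
   {
   if (!s)     // applies toy_stream::operator!
      return true;
   return false;
   }
```

A freshly constructed toy_stream should be usable; after a call to set_fail, it should be unusable.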

Input for Rationals

Like the rational output operator, the rational input operator should be a non-member function, declared as

istream &operator>>(istream &is, rational &r);
Listing 3 shows the simplest of all possible implementations. It merely reads two longs from istream is. If it reads both numbers successfully, it uses the values to construct a rational object that it copies into r (more specifically, into the rational object referenced by r). Notice that if it fails to read two numbers, it does not change r. That's the way all of the iostream library's input operators work. It's usually good practice to make user-defined I/O operations behave as much as possible like the ones in the library.
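
A function along these lines might look like the following sketch. It is not Listing 3 itself. The struct rational shown here is a bare-bones stand-in with public members, assumed only so that the example compiles on its own with a present-day compiler; the real class hides its data and does much more.

```cpp
#include <istream>
#include <sstream>

// Minimal stand-in for the rational class (an assumption of this
// sketch; it does not reduce fractions or check the denominator).
struct rational
   {
   long num, denom;
   rational(long n = 0, long d = 1) : num(n), denom(d) { }
   };

// Reads two longs; changes r only if both reads succeed.
std::istream &operator>>(std::istream &is, rational &r)
   {
   long n, d;
   if (is >> n >> d)
      r = rational(n, d);
   return is;
   }
```

Given the input 3 4, the operator should store 3/4 in r. Given 5 x, it should leave r alone and leave the stream in a failed state.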

The problem with the implementation in Listing 3 is that it reads rationals in a format that's different from the output format. The ostream output operator writes rationals in the form (num/denom), but this input function doesn't look for the parentheses and slash. It only reads the numbers.

Listing 4 is my first attempt at writing a rational input operator that reads rationals in the form (num/denom). The input expression

is >> lp >> n >> slash >> d >> rp;
reads a character, then a number, then a character, then a number, and then a character. Because all of the >> operators supplied by the iostream library skip leading whitespace at the beginning of each read operation, the rational input operator is very tolerant of whitespace between any parts of a rational number.

The following if statement

if (is && lp == '(' && slash == '/' && rp == ')')
tests that the stream is still in good condition, and that all of the characters read have their expected values. The && and || operators can be overloaded in C++, but the istream class doesn't overload them. Therefore, all the &&s in this expression invoke the built-in &&. The operands of built-in && must be scalar, but the is to the left of the first && is an istream object, not a scalar. So, the compiler applies istream::operator void * to is, just as it does when the stream object appears alone as the controlling expression of an if, as in

if (is) { ... }
Anyway, if the stream is intact and all the characters are as expected, then the function proceeds to manufacture the rational number. Otherwise, it must somehow set the failure indicator for the stream. This you do using istream::clear.

Earlier, I mentioned that

is.clear();
resets a stream to a good state. istream::clear actually has an argument, but the argument has a default value of zero:

void istream::clear(int s = 0);
The default zero corresponds to the good state. However, you can also pass explicit non-zero values to put the stream into a non-good state, like fail or bad, or even a combination thereof. The values you pass are enumeration constants called eofbit, failbit, and badbit, and, just for completeness even though it's equal to zero, goodbit.

These enumeration constants are defined inside a class called ios, which is defined in iostream.h:

class ios
   {
   ...
public:
   enum io_state
      {
      goodbit = 0,
      eofbit = 1,
      failbit = 2,
      badbit = 4
      };
   ...
   };
Thus, the enumeration constants are members of class ios. When you refer to one of them, you must always use ios:: as a prefix. To actually set the failure indicator in Listing 4, replace the comment with

is.clear(ios::failbit);
The version of operator>> in Listing 4 is an improvement over Listing 3, because it reads the rational output format. Unfortunately, it doesn't behave as well as it should on errors. Every operator>> in the iostream library stops reading at the character that triggered the error, leaving that character in the input stream. But operator>> in Listing 4 tries to gobble up the whole rational number, and then goes back to check if what it gobbled was garbled.
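
For illustration, here is a sketch in the spirit of that read-everything-then-check approach. It is not Listing 4 itself, and the struct rational is again a bare-bones stand-in assumed only so the example is self-contained on a present-day compiler.

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Minimal stand-in for the rational class (an assumption of this
// sketch; it does not reduce fractions or check the denominator).
struct rational
   {
   long num, denom;
   rational(long n = 0, long d = 1) : num(n), denom(d) { }
   };

// Reads all five pieces first, and only then checks the punctuation.
std::istream &operator>>(std::istream &is, rational &r)
   {
   char lp = 0, slash = 0, rp = 0;
   long n = 0, d = 0;
   is >> lp >> n >> slash >> d >> rp;
   if (is && lp == '(' && slash == '/' && rp == ')')
      r = rational(n, d);
   else
      is.clear(std::ios::failbit);   // set the failure indicator
   return is;
   }
```

Feeding this function [5/6] followed by more text shows the weakness. It reports the failure correctly, but only after it has consumed everything through the ], so the character that caused the trouble is long gone.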

To get error reporting behavior that's consistent with the rest of the library, the rational operator>> must check for errors earlier, as shown in Listing 5. Also, when the function reads an unexpected character, it should call

is.putback(c);
to put that character back into the input stream.

Listing 5 is based on a similar function that reads complex numbers presented by Stroustrup [1] (page 336). This function actually permits three different input formats for rational numbers:

num
(num)
(num/denom)
This is quite user friendly, and more like the behavior of floating point input functions. For example, when you enter a float that has no fractional part, you may omit the trailing decimal point and zero. That is, you need not write 13.0 when 13 will suffice. Similarly, the rational operator>> in Listing 5 lets you enter (13), or even 13, instead of (13/1). Come to think of it, maybe the rational output operator should put (13/1) as 13.

Stroustrup's technique for reading the parenthesis-free form is pretty slick. If the first non-whitespace character read is not a '(', the function puts the character back into the input stream, and tries reading it as just a number (specifically, a long int). If that read succeeds, then the character put back must have been a '+' or a '-' or a digit. If the read fails, then that character remains in the input stream.

As Stroustrup notes in his discussion, Listing 5 doesn't appear to check for errors as often as it should. But remember that any input operations on a stream that's not in good condition have no effect, so the function need not check the stream's state before every input operation. However, it must check each punctuation character immediately.

Listing 5 differs from Stroustrup's implementation in two notable ways. First, his function uses ios::badbit instead of ios::failbit to indicate a formatting error. I think you should reserve ios::badbit for more catastrophic errors. Secondly, his function fails to call is.putback(c) just before it calls is.clear, so it handles errors less consistently.
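
The following sketch shows roughly how such a function might be organized. It is my reconstruction from the description above, not a copy of Listing 5, and it uses the same bare-bones stand-in for rational as an assumption so that it compiles by itself on a present-day compiler.

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Minimal stand-in for the rational class (an assumption of this
// sketch; it does not reduce fractions or check the denominator).
struct rational
   {
   long num, denom;
   rational(long n = 0, long d = 1) : num(n), denom(d) { }
   };

// Accepts num, (num), or (num/denom), checking each punctuation
// character as soon as it has been read.
std::istream &operator>>(std::istream &is, rational &r)
   {
   long n = 0, d = 1;
   char c = 0;
   is >> c;
   if (c == '(')
      {
      is >> n >> c;
      if (c == '/')
         is >> d >> c;
      if (c != ')')
         {
         is.putback(c);                 // leave the culprit unread
         is.clear(std::ios::failbit);   // report a formatting error
         }
      }
   else
      {
      is.putback(c);                    // not '(': try a plain number
      is >> n;
      }
   if (is)
      r = rational(n, d);
   return is;
   }
```

With this arrangement, input such as (3;4) should fail with the ; still waiting in the stream, while (3/4), (13), and 13 should all be accepted.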

The iostream library is fairly large and, at times, complex. What I have shown you is just enough to start using the library and write your own primitive extensions to it. I will cease using stdio.h in my C++ examples, and explain more about iostream as it arises in future examples. For additional explanation of the iostream library see Stroustrup [1] or Lippman [2].

So What's Left?

Listing 6 shows the header rational.h complete with all the details I've added as I built up the class. Listing 7 shows the source file rational.cpp containing the definitions for the out-of-line functions.

My rational class has a lot of functionality, but it needs more work before it's ready for prime time. Here's my wish list:

But more important than the missing operators are implementation details that make the class more reliable, such as:

Error recovery in C++ programs is a complex issue that I will deal with in the future.

I will move on to other topics now, but I encourage you to try adding some of these enhancements to this rational number class. I welcome suggestions for improvement and test cases that expose flaws in what I've done. When I get enough good ideas, I'll present them in a future article.

Ammeraal [3] presents an alternative implementation for rational numbers. His class appears to reduce fractions properly and test for errors at critical places. I haven't had a chance to test any of the code so I can't attest to its quality, but it's a good source of ideas and worth reading.

References

[1] Stroustrup, Bjarne. 1991. The C++ Programming Language, 2nd ed. Reading, MA: Addison-Wesley.

[2] Lippman, Stanley B. 1991. C++ Primer, 2nd ed. Reading, MA: Addison-Wesley.

[3] Ammeraal, Leendert. 1991. C++ for Programmers. New York: Wiley.
