Columns


Standard C/C++

The Header <fstream>

P.J. Plauger


P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library, and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.

Introduction

I've been working my way through the iostreams portion of the draft Standard C++ library, lo these many months. Interestingly enough, I've yet to get to the part that programmers probably use the most — reading and writing streams of text connected to external files and devices. This installment still doesn't get as far as cin and cout, those ubiquitous standard streams we all know and love. But it does lay some important groundwork for them. At last we will see how to specialize a stream buffer for controlling input and output.

The header <fstream> defines half a dozen classes. Three of these cooperate to help you read and write files that you open by name: filebuf, ifstream, and ofstream.

The other three classes defined in <fstream> cooperate to help you read and write files controlled by an object of type FILE, declared in <stdio.h>: stdiobuf, istdiostream, and ostdiostream.

Listing 1 shows the way I chose to implement the header <fstream>. I present it here, without explanation, for those of you who feel better looking at concrete code. Next month, I discuss in more detail how to implement this header.

Reading and writing files is an important part of iostreams. For many C++ programs, it is the sole means of communication between a program and the outside world. The standard stream objects, such as cin and cout, are conventionally associated with external streams. Even in this era of window-oriented interfaces, reading and writing files remains important. Communication between two programs, between a program and a window, or between a program and a special device are all often made to look like conventional file reads and writes. The classes defined in <fstream> are the preferred agents for controlling such file operations.

You do not, in principle, need two sets of classes for accessing files with iostreams. The Standard C library function fopen, declared in <stdio.h>, associates a named file with a FILE object. The same header declares fclose, for removing the association. You can associate a FILE object with a stdiostream object, as described above, and you're in business. Why the extra machinery?

A Bit of History

The answer is largely historical. The iostreams package evolved alongside the Standard C library more than atop it. Both spent their earliest days hosted on the UNIX operating system, a particularly friendly environment for reading and writing files. As they have spread to other systems, both have also profited from the influence of UNIX on more modern operating systems. Thus, iostreams and the Standard C library have a common heritage, and many common architectural features, but they nevertheless grew up separately.

As a consequence, mixing iostreams and C stream operations has often led to an uneasy alliance. A filebuf object often performs reads and writes directly, using low-level operating system calls and managing an in-memory buffer directly. A stdiobuf object, by contrast, typically calls on the higher-level functions declared in <stdio.h>. The overhead per character can be much higher and buffering can occur in two different places within the program.

C++ programmers habitually favor the former over the latter, if only for better performance. They introduce stdiobuf objects only when obliged to mix iostreams and C stream reads or writes to the same file. And then they fret over the uncertainties that inevitably arise with double buffering. Input can be consumed in greater chunks than you expect, by the two agents. Or output from the two agents can get pasted together in bigger chunks than you intend. Often, the only safe fix is to make stdiobuf operations completely unbuffered. And that often penalizes performance even more.

The draft C++ Standard addresses these historical problems:

The basic idea is to better define the relationship between iostreams and C stream operations, and to provide as a default less surprising semantics for mixed operations. Nevertheless, the draft C++ Standard does not require that existing implementations of iostreams be rewritten purely in terms of calls to Standard C library functions.

One example of shared semantics is the way you open a file by name. The filebuf member function open(const char *, ios::openmode) is the agent that does the job. It effectively calls fopen, declared in <stdio.h>. To do so, it must map its second argument to the kind of mode string acceptable to fopen. Thus, for example, ios::in becomes "r". The member function is obliged to accept only those combinations of ios::openmode elements for which a corresponding mode string is defined.
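The mapping from open mode to mode string can be sketched as a small function. This is only an illustration, not the way any particular library does the job. It assumes a current compiler, where ios::openmode is spelled std::ios_base::openmode, and it follows the combinations listed in the C++ standard as later published, which may differ in detail from the draft discussed here. The name fopen_mode is hypothetical.

```cpp
#include <ios>
#include <string>

// Hypothetical helper, not part of any library: returns the fopen-style
// mode string for an open mode, or an empty string if none is defined.
std::string fopen_mode(std::ios_base::openmode mode)
{
    typedef std::ios_base io;
    // ate affects only the initial position, so it is ignored here.
    // binary merely appends 'b' to the mode string.
    const bool binary = (mode & io::binary) != 0;
    const io::openmode m = mode & ~(io::ate | io::binary);

    std::string s;
    if (m == io::in)                                      s = "r";
    else if (m == io::out || m == (io::out | io::trunc))  s = "w";
    else if (m == (io::out | io::app))                    s = "a";
    else if (m == (io::in | io::out))                     s = "r+";
    else if (m == (io::in | io::out | io::trunc))         s = "w+";
    else if (m == (io::in | io::out | io::app))           s = "a+";
    else return s;              // no corresponding mode string
    if (binary)
        s += 'b';
    return s;
}
```

A combination such as ios::in | ios::trunc yields the empty string here, which corresponds to an open request that the member function is not obliged to accept.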

Another example is the semantics of stream positioning. The Standard C library permits a FILE object to mediate both reads and writes, but only in a rather stylized way. At any given time, the input stream may be readable or the output stream may be writable, but not both. (Of course, it is also possible that neither stream is available.) Typically, you have to perform a stream-positioning operation to switch between reading and writing. Equally, only one stream position is maintained for both reading and writing. And an arbitrary stream position requires the unspecified information stored in an fpos_t object, not just a streamoff value.

Iostreams implicitly acknowledges these limitations in several ways. By inheritance, it has the same semantic limitations on switching between extracting and inserting. (Opinions differ on this point within the Committee, however.) If you attempt to position either stream in a filebuf object, you position both at once. And the semantics of class streampos reflect the realities of the restrictions on fpos_t objects. (See "Standard C: The Header <streambuf>," CUJ, June 1994.)

I don't want to paint too rosy a picture, however. Several Committee members begrudge almost any concession to the Standard C library in the area of file operations. They may accept the narrow need for defining the semantics of the two libraries when they work together. But some feel that the Standard C++ library is better off being defined de novo in this area. I personally don't see how to introduce new semantics for iostreams without massively complicating the description of, say, stdiobuf. Others, however, may find a way over time.

Recent Changes

The description of <fstream> depends heavily on references to functions and types declared in <stdio.h>, for reasons I indicated above. Please note once more, however, that such language makes no promises about how classes in this header are actually implemented. The draft C++ Standard often says, for example, that function A calls function B. Generally, this means only that A behaves as if it calls B. If you can write a portable program that can detect whether the call occurs — such as to a virtual member function that you can override — the call may be obligatory. Otherwise, don't be surprised if an interactive debugger fails to detect an actual call.

As I mentioned in conjunction with the header <streambuf>, class streambuf has an added virtual member function. The public access function for it is called showmany. It endeavors to tell you how many characters you can safely extract with no fear of blocking while waiting for additional input. The derived classes filebuf and stdiobuf should have nontrivial overrides for this virtual member function, on systems that can supply the needed information.

Once again, I note that a major change to all of iostreams is the addition of wide-character streams. The classes filebuf and stdiobuf become template classes parameterized by the type of the stream element. One instantiation, for type char, has essentially the same functionality as described here. Another, for type wchar_t, supports streams of elements from some large character set. Classes ifstream, ofstream, istdiostream, and ostdiostream change along similar lines.

As before, I continue to present an implementation written just to handle char streams. It illustrates all the basic issues, without the additional complexity of templates.

Finally, I must report that the classes stdiobuf, istdiostream, and ostdiostream have been voted out of the draft C++ Standard. Since they occur widely in existing practice, however, I suspect that many vendors will still choose to supply them. I, for one, am not quick to burn this useful bridge between the worlds of C and C++.

Using <fstream>

You include the header <fstream> to make use of any of the classes ifstream, ofstream, filebuf, istdiostream, ostdiostream, or stdiobuf. Objects of these classes let you read and write conventional files. You can open files by name and control them, or control files already opened under control of objects of type FILE. For each approach, you can choose among three patterns of access: read only, write only, and read/write.

I deal with each of these options in turn, first for files you open by name.

Read Only, By Name

If all you want to do is open and read an existing text file whose name you know, construct an object of class ifstream. If you know at construction time what null-terminated file name s you wish to use, you can write:

ifstream fin(s);
if (fin.is_open())
<file opened successfully>
If the file is not opened successfully, subsequent extractions will fail. Note, however, that the conventional tests for failure, such as !fin, will not report the failure until after you essay such an extraction. That's why I encourage you to make the explicit test fin.is_open() immediately after the object is constructed.
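A complete function built around this pattern might look like the following sketch. It assumes a current library, where the classes live in namespace std. The name count_lines is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: count the lines in a named text file.
// Returns -1 if the file cannot be opened.
long count_lines(const char* name)
{
    std::ifstream fin(name);            // mode defaults to ios_base::in
    if (!fin.is_open())
        return -1;                      // report the failed open at once

    long n = 0;
    std::string line;
    while (std::getline(fin, line))     // extraction fails at end-of-file
        ++n;
    return n;
}   // destroying fin closes the file
```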

The resultant stream buffer (pointed at by fin.rdbuf()) does not support insertions. You can, however, close any currently open file, then open the file s2 for reading with the two calls:

fin.close(), fin.open(s2);
Naturally, you should once again test whether the open succeeded, as above. The stream position is reset to the beginning of the newly opened stream. (And the resultant stream buffer still does not support insertions.)
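Reusing one ifstream object for two files can be sketched as follows, again assuming a current library with names in namespace std. The helper name first_lines is hypothetical. The call to clear is a precaution: some older libraries leave error flags such as end-of-file set across a call to open, while newer ones reset them when the open succeeds.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: read the first line of file s, then reuse the
// same ifstream object to read the first line of file s2.
std::string first_lines(const char* s, const char* s2)
{
    std::string a, b;
    std::ifstream fin(s);
    if (fin.is_open())
        std::getline(fin, a);

    fin.close();
    fin.clear();        // precaution: drop any flags left by the first file
    fin.open(s2);
    if (fin.is_open())
        std::getline(fin, b);   // reading starts at the beginning of s2
    return a + "|" + b;
}
```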

You can also construct an ifstream object with no open file, using the default constructor. Presumably, you would later open an existing text file for reading, as in:

ifstream fin;
fin.open(s);
if (fin.is_open())
<file opened successfully>
Destroying an ifstream object closes any open file associated with it.

The code I have shown so far always opens a text file, for reading only. A text file can be subject to a certain amount of interpretation, such as mapping the sequence carriage return/line feed to just line feed (newline). A binary file, on the other hand, delivers each byte from the file unchanged as a char value. To read a binary file, to make a file writable as well, or to invoke various other options when you open a file, you have to specify an explicit open-mode argument. (Naturally enough, it has type ios::openmode.) For all member functions that take a file-name argument s, the open-mode mode immediately follows. The first example, above, is actually equivalent to:

ifstream fin(s, ios::in);
if (fin.is_open())
<file opened successfully>
You have a number of options for the value of mode:

If you also set ios::ate in mode, the file is positioned at end-of-file immediately after it is opened.
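One common use of ios::ate together with ios::binary is to learn the size of a file. The sketch below assumes a current library, a file small enough for its size to fit in a long, and a system where the position in a binary file is a byte count. The name file_size is hypothetical.

```cpp
#include <fstream>

// Hypothetical helper: size in bytes of a named file, or -1 on failure.
long file_size(const char* name)
{
    std::ifstream fin(name, std::ios_base::in | std::ios_base::binary
                                | std::ios_base::ate);
    if (!fin.is_open())
        return -1;
    // Because of ate, the stream starts out positioned at end-of-file.
    const std::streamoff off = fin.tellg();
    return static_cast<long>(off);
}
```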

Write Only, By Name

If all you want to do is create a new text file — or truncate an existing text file — then open it for writing, construct an object of class ofstream to control insertions into it. You can write:

ofstream fout(s);
if (fout.is_open())
<file opened successfully>
then insert into fout just like any other output stream. As with class ifstream, you can follow the file name s with a mode argument. If you omit the mode argument, as above, it defaults to ios::out.

You can also construct an ofstream object with the default constructor and later create it for writing, as in:

ofstream fout;
fout.open(s);
if (fout.is_open())
<file opened successfully>
In either case, the resultant stream buffer (pointed at by fout.rdbuf()) does not support extractions. And, of course, destroying an ofstream object closes any open file associated with it.
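As a minimal sketch of the write-only pattern, assuming a current library with names in namespace std, the function below replaces the contents of a text file. The name write_text is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: replace the contents of a named text file.
// Returns false if the file cannot be opened or the write fails.
bool write_text(const char* name, const std::string& text)
{
    std::ofstream fout(name);   // default mode creates or truncates the file
    if (!fout.is_open())
        return false;
    fout << text;
    fout.flush();               // push buffered output so errors show up now
    return fout.good();
}   // destroying fout closes the file
```

Calling the function twice on the same name leaves only the second text in the file, because the default mode truncates an existing file.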

Read/Write, By Name

If you want to open a file that you can read as well as write, you need two objects to control the input and output streams. The classes ifstream and ofstream are highly symmetric, at least in this regard. Thus, you have three equally valid ways to do the job. If you don't want to open a file initially, you can write:

ifstream ifile;
ostream ofile(ifile.rdbuf());
or:

ofstream ofile;
istream ifile(ofile.rdbuf());
or:

filebuf fb;
istream ifile(&fb);
ostream ofile(&fb);
All approaches cause ifile to control the input stream and ofile to control the output stream.

You can also open a file s in each of these three cases. Since the default values for the mode argument rarely make sense here, I show the argument explicitly in each case:

ifstream ifile(s, mode);
ostream ofile(ifile.rdbuf());
if (ifile.is_open())
<file opened successfully>
or:

ofstream ofile(s, mode);
istream ifile(ofile.rdbuf());
if (ofile.is_open())
<file opened successfully>
or:

filebuf fb;
istream ifile(&fb);
ostream ofile(&fb);
if (fb.open(s, mode))
<file opened successfully>
Note that the last test for a successful open differs from the earlier ones. As usual, when the filebuf object is destroyed, any open file associated with it is closed.
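The third approach can be fleshed out as the sketch below, assuming a current library with names in namespace std. It writes a line to a new file and reads it back through the same filebuf object, with a positioning request between the write and the read. The name write_then_read is hypothetical.

```cpp
#include <fstream>
#include <istream>
#include <ostream>
#include <string>

// Hypothetical helper: write a line to a new file, then read it back
// through the same stream buffer. Returns an empty string on failure.
std::string write_then_read(const char* name, const std::string& text)
{
    std::filebuf fb;
    std::istream ifile(&fb);
    std::ostream ofile(&fb);
    const std::ios_base::openmode mode =
        std::ios_base::in | std::ios_base::out | std::ios_base::trunc;
    if (!fb.open(name, mode))   // open returns a null pointer on failure
        return std::string();

    ofile << text << '\n';
    // Reposition before switching from writing to reading. The filebuf
    // keeps a single position, shared by the input and output streams.
    ifile.seekg(0, std::ios_base::beg);

    std::string line;
    std::getline(ifile, line);
    return line;
}   // destroying fb closes the file
```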

Controlling C Streams

The classes istdiostream, ostdiostream, and stdiobuf provide additional capability within the header <fstream>. They let you control files already opened under control of an object of type FILE. For example, the function fopen, declared in <stdio.h>, returns a non-null pointer to FILE when it successfully opens a file. Numerous other functions, declared in the same header, support C stream reads and writes to the opened file.

The same header also declares three well-known objects of type pointer to FILE that control the three standard streams: stdin, stdout, and stderr.

The header <iostream> declares several istream and ostream objects that work in concert with these objects to support iostreams operations on the standard streams. You can nevertheless use the facilities in <fstream> to control, say, stdout with an additional object you construct.

As usual, there are three patterns of access to discuss: read only, write only, and read/write. I cover them in order.

Read Only

If all you want to do is read a stream controlled by a FILE object, construct an object of class istdiostream. You must know at construction time the argument value pf, of type pointer to FILE. You can write:

istdiostream fin(pf);
If pf is a null pointer, or if the stream it controls cannot be read, all subsequent extraction operations will fail. You cannot, however, test whether fin is associated with an open file.

When fin is destroyed, the stream *pf is not closed. Nor should you close the stream, by calling fclose(pf), before fin is destroyed. The call discredits pf, so even a subsequent attempt to access the pointer itself can cause a program to terminate abnormally. Worse, subsequent attempts to control the file may do all sorts of insane things that are not diagnosed.

You can control the degree of buffering within fin. Initially, fin.buffered() returns zero, indicating that no buffering occurs. Put simply, you can alternate the calls fin.get() and fgetc(pf) and read alternate characters from the file.

Once you call fin.buffered(1), however, fin.buffered() returns a nonzero value. Thereafter, buffering may occur. Put simply, the call fin.get() may encourage the stream buffer associated with fin to gobble an arbitrary number of characters, not just the one you requested. A subsequent call to fgetc(pf) will not necessarily deliver the next character you would expect.

If you resist the temptation to access *pf directly, buffering causes no problems. On the contrary, it offers the controlling stream buffer the opportunity to improve performance, sometimes considerably. A wise rule of thumb, therefore, is never to enable buffering for a file accessed both via a stream buffer and via C stream function calls. If the stream buffer is the sole agent accessing the file, always enable buffering.
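Libraries that follow the published C++ standard do not supply istdiostream or stdiobuf, so the unbuffered behavior is illustrated here with a hypothetical stand-in. The class unbuffered_stdio_inbuf below is an assumption made for this sketch, not the implementation of stdiobuf. It keeps no buffer of its own, so every character comes straight from the FILE object, and the function alternate_reads (also hypothetical) can interleave fin.get() with fgetc(pf) without losing characters.

```cpp
#include <cstdio>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for the read side of an unbuffered stdiobuf.
class unbuffered_stdio_inbuf : public std::streambuf {
public:
    explicit unbuffered_stdio_inbuf(std::FILE* pf) : pf_(pf) {}
protected:
    // Peek: read one character, then push it back onto the C stream.
    int_type underflow() override {
        if (!pf_) return traits_type::eof();
        const int c = std::fgetc(pf_);
        if (c == EOF) return traits_type::eof();
        std::ungetc(c, pf_);
        return traits_type::to_int_type(static_cast<char>(c));
    }
    // Consume: read one character and keep it.
    int_type uflow() override {
        if (!pf_) return traits_type::eof();
        const int c = std::fgetc(pf_);
        if (c == EOF) return traits_type::eof();
        return traits_type::to_int_type(static_cast<char>(c));
    }
private:
    std::FILE* pf_;     // not owned; the caller closes it later
};

// Hypothetical helper: read alternate characters through the stream
// buffer and directly through the C stream.
std::string alternate_reads(std::FILE* pf)
{
    unbuffered_stdio_inbuf sb(pf);
    std::istream fin(&sb);
    std::string out;
    for (;;) {
        const int c1 = fin.get();       // through the stream buffer
        if (!fin) break;
        out += static_cast<char>(c1);
        const int c2 = std::fgetc(pf);  // directly through the C stream
        if (c2 == EOF) break;
        out += static_cast<char>(c2);
    }
    return out;
}
```

A buffered version would read a block at a time in underflow, which is exactly what makes the interleaved calls to fgetc unpredictable.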

Write Only

If all you want to do is write a stream controlled by a FILE object, construct an object of class ostdiostream. You must know at construction time the argument value pf, of type pointer to FILE. You can write:

ostdiostream fout(pf);
If pf is a null pointer, or if the stream it controls cannot be written, all subsequent insertion operations will fail. As with an istdiostream object, you cannot test whether fout is associated with an open file. The same remarks also apply about not closing the file until fout is destroyed. Equally, the same considerations apply about when to buffer, or not to buffer, a stream associated with an ostdiostream object.
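The write side can be sketched the same way. Since libraries that follow the published C++ standard do not supply ostdiostream, the class unbuffered_stdio_outbuf below is a hypothetical stand-in, not the implementation of stdiobuf. It forwards each inserted character directly to the FILE object, so insertions and C stream writes land in the file in program order. The function interleave_writes is likewise hypothetical.

```cpp
#include <cstdio>
#include <ostream>
#include <streambuf>

// Hypothetical stand-in for the write side of an unbuffered stdiobuf.
class unbuffered_stdio_outbuf : public std::streambuf {
public:
    explicit unbuffered_stdio_outbuf(std::FILE* pf) : pf_(pf) {}
protected:
    // With no put area, every inserted character arrives here.
    int_type overflow(int_type c) override {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        if (!pf_ || std::fputc(traits_type::to_char_type(c), pf_) == EOF)
            return traits_type::eof();
        return c;
    }
    // Flushing the iostream flushes the C stream.
    int sync() override {
        return (pf_ && std::fflush(pf_) == 0) ? 0 : -1;
    }
private:
    std::FILE* pf_;     // not owned; the caller closes it later
};

// Hypothetical helper: interleave insertions with C stream writes.
void interleave_writes(std::FILE* pf)
{
    unbuffered_stdio_outbuf sb(pf);
    std::ostream fout(&sb);
    fout << 'a';                // through the stream buffer
    std::fputc('1', pf);        // directly through the C stream
    fout << "bc";
    std::fputs("23", pf);
}
```

Were the stream buffer to collect characters in its own buffer, the order of the two kinds of output in the file would depend on when that buffer happened to be flushed.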

Read/Write

Finally, if you want to both read and write a stream controlled by a FILE object, you need two objects to control the input and output streams. The classes istdiostream and ostdiostream are highly symmetric. Thus, you have three equally valid ways to do the job:

istdiostream ifile(pf);
ostream ofile(ifile.rdbuf());
or:

ostdiostream ofile(pf);
istream ifile(ofile.rdbuf());
or:

stdiobuf sb(pf);
istream ifile(&sb);
ostream ofile(&sb);
In the third case, you enable buffering by calling sb.buffered(1).

File Positioning

Earlier, I discussed the limitations on positioning within files. Your safest bet, as always, is to memorize a file position you want to return to, as an object of type streampos. Later on in the program, while the file is still open, you can use the value stored in this object to return to the memorized file position.

For a binary file that is not too large, you can represent a stream position as an object of type streamoff. You can thus perform arithmetic, on byte displacements from the beginning of a file, to determine new stream positions. The UNIX operating system represents text files the same as binary. Hence, it extends the same latitude in stream positioning to all files, not just binary. But few other systems share this simplicity. Don't write portable code that counts on it.
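Memorizing a position and returning to it can be sketched as follows, assuming a current library with names in namespace std. The position is stored as a streampos and handed back unchanged, with no arithmetic performed on it, so the approach does not depend on how the system represents text files. The name reread_second_line is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: read the second line of a text file twice,
// returning to it by way of a memorized stream position.
// Returns an empty string if anything goes wrong.
std::string reread_second_line(const char* name)
{
    std::ifstream fin(name);
    std::string first, second, again;
    if (!fin.is_open() || !std::getline(fin, first))
        return std::string();

    const std::streampos mark = fin.tellg();    // memorize the position
    std::getline(fin, second);

    fin.clear();            // in case the read reached end-of-file
    fin.seekg(mark);        // return to the memorized position
    std::getline(fin, again);
    return second == again ? again : std::string();
}
```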

A file opened both for reading and writing requires intervening stream-positioning requests when switching from reading to writing, or back. Again, some systems may relax this requirement, but don't count on it in a portable program.

Finally, the streambuf virtual member functions setbuf and sync are given non-trivial semantics in the derived class filebuf. The former, however, is defined in terms of the function setvbuf, declared in <stdio.h>, which does not itself promise much. And the latter is generally called as often as necessary in the normal course of business. I recommend, therefore, that you not call either pubsetbuf or pubsync, the public member functions that call these virtual member functions on your behalf.

This article is excerpted in part from P.J. Plauger, The Draft Standard C++ Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1995).