Standard C/C++

Implementing <fstream>

P.J. Plauger


P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library, and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.

Introduction

Last month, I introduced the header <fstream> from the draft Standard C++ library. (See "Standard C: The Header <fstream>," CUJ, April 1995.) It provides the classes you need to read and write external files and devices, in the guise of extracting from an istream object or inserting to an ostream object. I continue this month with a description of one way to implement these classes.

This implementation should be largely familiar to those of you with some experience with existing iostreams packages. While the draft C++ Standard has introduced various small changes from traditional technology, char-based file I/O is heavily rooted in existing practice. The really new stuff is all the templates introduced into the library. I plan to avoid talking about them for a bit longer, at least.

Listing 1 shows the file fstream, which implements the standard header <fstream>. It defines the classes filebuf, ifstream, ofstream, stdiobuf, istdiostream, and ostdiostream. I begin with several notes on class filebuf, which is the workhorse class for this header.

The incomplete type declaration struct _Filet is an alias for the type FILE, declared in <stdio.h>. The header <fstream> is not permitted to include <stdio.h>, but it needs to declare parameters and member function return values compatible with type FILE. A secret synonym solves the problem. My implementation of the Standard C library [1] introduces _Filet for a similar reason. For another implementation, you may have to alter the name, or introduce extra machinery, to achieve the same effect.
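The following is a minimal sketch of the "secret synonym" idea, not the code from Listing 1. The names OpaqueFile, MiniFilebuf, to_opaque, and to_file are hypothetical. Because an arbitrary C library gives no portable way to name the structure behind FILE, the sketch converts pointers with casts in the implementation part, whereas the approach described above makes the incomplete type a true alias for FILE.

```cpp
#include <cassert>
#include <cstdio>

// ---- What the header could contain: <stdio.h> is not needed here. ----
struct OpaqueFile;                  // hypothetical incomplete stand-in for FILE

class MiniFilebuf {                 // hypothetical, not the article's filebuf
public:
    explicit MiniFilebuf(OpaqueFile *file = 0) : file_(file) {}
    bool is_open() const { return file_ != 0; }
    int put(char ch);               // defined where FILE is visible
private:
    OpaqueFile *file_;              // null pointer if no file is associated
};

// ---- What the implementation file could contain. ----
// Assumption: a FILE pointer survives a round trip through OpaqueFile *.
inline OpaqueFile *to_opaque(std::FILE *fp) {
    return reinterpret_cast<OpaqueFile *>(fp);
}
inline std::FILE *to_file(OpaqueFile *op) {
    return reinterpret_cast<std::FILE *>(op);
}

// Precondition: is_open().
int MiniFilebuf::put(char ch) {
    assert(file_ != 0);
    return std::fputc(ch, to_file(file_));
}
```

Only the implementation part ever sees the real FILE type, so code that includes just the header part picks up no names from <stdio.h>.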

(A recent addition to C++ is the ability to gather multiple declarations into separate namespaces. The entire Standard C++ library then inhabits the namespace std. With such protection, an implementation can more safely include C headers in C++ headers — though a few problems with macro names have yet to be widely discussed. Until namespace support becomes widespread, however, the sort of implementation presented here is prudent.)

I added a constructor with the signature filebuf(_Filet *). (I did so by adding a default argument to the default constructor.) This implementation performs all file operations mediated by class filebuf through an associated FILE object. The member object _File points at the associated object, or stores a null pointer if none exists. The secret member function _Init, described below, initializes a filebuf object. Its arguments specify the initial values stored in the _File member objects.

The type _Uninitialized, and the value _Noinit, mark the constructor used to perform some "double constructor" magic. The standard streams make use of this magic so they can be usable even within static constructors and destructors for other objects. (See "Standard C: The Header <ios>," CUJ, May 1994.)

Note that class stdiobuf is based on class filebuf. The draft C++ Standard says that stdiobuf is derived directly from streambuf. I pulled a similar trick in implementing the header <sstream>, deriving stringbuf from strstreambuf. (See "Standard C: The Header <sstream>," CUJ, March 1995.) As before, such indirect derivation is permitted by the library "front matter."

A fundamental difference exists between the classes filebuf and stdiobuf, however. Destroying a filebuf object closes any associated file. Destroying a stdiobuf object does not. I added the member object _Closef to filebuf to tell its destructor what to do. The stored value is nonzero only if the file is to be closed when the object is destroyed. That only happens after filebuf::open successfully opens a file.
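The close-on-destroy rule can be sketched roughly as follows. FileHandle and its members are hypothetical names chosen for illustration. Only the role of the flag mirrors the _Closef member described above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical miniature: a handle that closes only what it opened itself.
class FileHandle {
public:
    // Adopting an existing FILE never transfers the duty to close it.
    explicit FileHandle(std::FILE *file = 0) : file_(file), closef_(false) {}

    ~FileHandle() {
        if (closef_) {
            assert(file_ != 0);     // the flag is set only after a successful open
            std::fclose(file_);
        }
    }

    // Open a file; on success the destructor becomes responsible for closing it.
    bool open(const char *name, const char *mode) {
        if (file_ != 0)
            return false;           // already associated with a file
        file_ = std::fopen(name, mode);
        closef_ = (file_ != 0);
        return closef_;
    }

    bool is_open() const { return file_ != 0; }
    bool closes_on_destroy() const { return closef_; }

private:
    FileHandle(const FileHandle &);             // not copyable: avoids double close
    FileHandle &operator=(const FileHandle &);
    std::FILE *file_;
    bool closef_;
};
```

A handle constructed around an existing FILE, which is how a stdiobuf-style class would use it, leaves the file open when it goes out of scope.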

Improving Performance

Listing 2 shows the file filebuf.c, which defines a number of functions required for practically any use of class filebuf. On the face of it, this implementation errs strongly on the side of portability, at the cost of performance. All of the member functions defined here will work atop any Standard C library. Moreover, none of the functions buffer reads or writes, beyond whatever buffering may occur in the associated FILE object.

Before you dismiss this as a purely tutorial implementation of class filebuf, take a closer look at the last function definition in the file. As I mentioned above, the member function _Init initializes all filebuf objects when they are constructed. It also reinitializes an object after a successful call to filebuf::open. And it can be made to do one very important additional thing.

As I indicated way back in June 1994, I defined the base class streambuf from the outset with a bit of extra flexibility. The six pointers that control in-memory buffers are all indirect pointers. You can point them at pointers within the streambuf object itself, or at pointers in another object. For the derived class filebuf, you can sometimes choose the latter course to advantage. That's why the macro _HAS_PJP_CLIB chooses between two different calls to streambuf::_Init, which initializes all those direct and indirect pointers in the base subobject.

An arbitrary Standard C library should contain no definition for this macro. The code works correctly, if not as fast as many would like. But for operation atop my implementation of the Standard C library [1], you can do much better. Define the macro _HAS_PJP_CLIB and the indirect pointers are set differently. They point at the pointers stored in the FILE object controlling access to the file. The pointer discipline is similar enough for things to work properly.

Here's why. I designed the FILE structure in C so that the macros getchar and putchar could expand to inline code that is reasonably small and generally very fast. The input and output streams are each controlled by a triple of pointers to characters. The C++ equivalent of the macro getchar, for example, is the inline function definition:

inline int getchar() {
   return (_Files[0]->_Next < _Files[0]->_Rend
           ? *_Files[0]->_Next++
           : fgetc(_Files[0]));
}
Compare this code with the definition of the similar streambuf public member function sgetc:

int sgetc() {
   return (gptr() != 0 &&
          gptr() < egptr()
          ? *_Gn() : underflow());
}
The only real difference in protocol is a small one. getchar can assume that its "next" pointer _Files[0]->_Next is never a null pointer. It certainly doesn't hurt for sgetc to make the extra test.

What typically happens when extracting from an input stream is pretty much what you'd hope for. If the buffer is empty, sgetc calls underflow. The overriding definition in class filebuf rediscovers that the buffer is empty and calls fgetc to supply a single character. Fortunately, fgetc often delivers up a whole buffer full of additional characters in the bargain.

The next several hundred calls to sgetc simply exercise inline code that accesses the stored character value directly from the buffer and updates the "next" pointer to note its consumption. This is the same pointer as is used by getchar, either as a macro or an inline function definition. It is also, of course, the same pointer as is used by fgetc. Thus, tight synchronization is maintained across all flavors of input. Equally important, many character extractions within istream extractors have no function-call overhead whatsoever.
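The effect is easy to observe with any conforming streambuf. The sketch below is not taken from the listings. CountingBuf and drain are hypothetical names, and the 256-character buffer is an arbitrary choice. It counts how often the virtual member underflow runs while a stream of characters is consumed one at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <streambuf>

// Hypothetical input buffer that hands out 'x' characters, 256 at a time,
// and counts how often underflow has to refill the get area.
class CountingBuf : public std::streambuf {
public:
    explicit CountingBuf(std::size_t total) : remaining_(total), refills_(0) {}
    int refills() const { return refills_; }

protected:
    int_type underflow() override {
        assert(gptr() == egptr());  // reached only when the get area is exhausted
        ++refills_;
        if (remaining_ == 0)
            return traits_type::eof();
        std::size_t n = remaining_ < sizeof buf_ ? remaining_ : sizeof buf_;
        for (std::size_t i = 0; i < n; ++i)
            buf_[i] = 'x';
        remaining_ -= n;
        setg(buf_, buf_, buf_ + n); // publish the refilled get area
        return traits_type::to_int_type(buf_[0]);
    }

private:
    char buf_[256];
    std::size_t remaining_;
    int refills_;
};

// Consume every character through the public interface and count them.
std::size_t drain(std::streambuf &sb) {
    std::size_t count = 0;
    while (sb.sbumpc() != std::streambuf::traits_type::eof())
        ++count;
    return count;
}
```

Draining a CountingBuf constructed with 1000 characters delivers all 1000 but enters underflow only five times: four refills plus the final call that reports end of file. Every other extraction is satisfied from the get area without a virtual call.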

Of course, the same rules apply to output streams. Most characters inserted by streambuf::sputc get stored directly into the output buffer, with the "next" pointer suitably updated. Thus, many character insertions within ostream inserters also avoid function-call overhead. Quod erat demonstrandum.

Any implementation of the Standard C library that follows this discipline for FILE pointers can benefit from the same performance improvement. You probably have to change the pointer member names in the definition of filebuf::_Init. Nothing else need change, however.
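The indirect-pointer arrangement discussed in this section can be sketched in miniature. ExternalBuffer and IndirectBuf are hypothetical names, only the input side is modeled, and the real streambuf manages six pointers rather than two. The point is simply that one initialization call decides whose pointers the buffer class reads and updates.

```cpp
#include <cassert>

// Hypothetical stand-in for the buffer pointers kept inside a FILE object.
struct ExternalBuffer {
    char *next;   // next character to deliver
    char *end;    // one past the last buffered character
};

// Hypothetical, input-only miniature of a buffer class with indirect pointers.
class IndirectBuf {
public:
    IndirectBuf()
        : own_next_(0), own_end_(0), pnext_(&own_next_), pend_(&own_end_) {}

    // Mode 1: the indirect pointers designate this object's own pointers.
    void init(char *first, char *last) {
        assert(first <= last);
        own_next_ = first;
        own_end_ = last;
        pnext_ = &own_next_;
        pend_ = &own_end_;
    }

    // Mode 2: the indirect pointers designate pointers stored elsewhere.
    void init(ExternalBuffer &ext) {
        pnext_ = &ext.next;
        pend_ = &ext.end;
    }

    // Deliver the next character, or -1 if the buffer is exhausted.
    int get() {
        return *pnext_ < *pend_
            ? static_cast<unsigned char>(*(*pnext_)++)
            : -1;
    }

private:
    IndirectBuf(const IndirectBuf &);            // not copyable: pnext_ may point into *this
    IndirectBuf &operator=(const IndirectBuf &);
    char *own_next_, *own_end_;
    char **pnext_, **pend_;
};
```

In the second mode, every character consumed through get advances the external "next" pointer, so any other code that reads through the same ExternalBuffer stays in step automatically.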

The Remaining Code

Listing 3 shows the file fiopen.c, which defines the member function filebuf::open. It maps the mode argument, of type openmode, to the equivalent mode string expected by the function fopen. If that function succeeds in opening the file, it reinitializes the filebuf object to control the associated FILE object.
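A mapping of this kind might look like the sketch below. It is not the code from Listing 3. The function name fopen_mode_for is hypothetical, and the table follows the one later published in the ISO C++ Standard for basic_filebuf::open rather than anything stated in this article. The ate flag is ignored here, since it calls for a seek after the file is opened and does not affect the mode string.

```cpp
#include <ios>
#include <string>

// Hypothetical helper: translate an openmode into an fopen mode string.
// Returns an empty string for combinations that have no equivalent.
std::string fopen_mode_for(std::ios_base::openmode mode) {
    typedef std::ios_base ios;

    const bool binary = (mode & ios::binary) == ios::binary;
    mode &= ~(ios::binary | ios::ate);  // neither flag selects the base string

    std::string result;
    if (mode == ios::in)
        result = "r";
    else if (mode == ios::out || mode == (ios::out | ios::trunc))
        result = "w";
    else if (mode == (ios::out | ios::app))
        result = "a";
    else if (mode == (ios::in | ios::out))
        result = "r+";
    else if (mode == (ios::in | ios::out | ios::trunc))
        result = "w+";
    else
        return std::string();           // no fopen equivalent

    if (binary)
        result += 'b';                  // for example "rb" or "r+b"
    return result;
}
```

A caller would pass the result to fopen only when it is nonempty, and report failure otherwise.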

All the remaining source files needed to implement the header <fstream> are trivial. Listing 4 shows the file ifstream.c, which defines the destructor for class ifstream. Listing 5 shows the file ofstream.c, which defines the destructor for class ofstream. Listing 6 shows the file stdiobuf.c, which defines the destructor for class stdiobuf. Listing 7 shows the file istdiost.c, which defines the destructor for class istdiostream. And Listing 8 shows the file ostdiost.c, which defines the destructor for class ostdiostream.

Testing <fstream>

Listing 9 shows the file tfstream.c. It tests the basic properties of the classes defined in <fstream>. It does so by manipulating a temporary file whose name is obtained by calling tmpnam, declared in <stdio.h>. First it tries to write the file, then read from it, then intermix reads and writes. It also performs some modest file-positioning operations along the way. Finally, it repeats a few of these operations using the classes istdiostream and ostdiostream.

If all goes well, the program prints:

SUCCESS testing <fstream>
and takes a normal exit. It also removes the temporary file it created.
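For readers without the listing at hand, here is a much smaller sketch in the same spirit, written against the standard classes ofstream, ifstream, and fstream. It is not Listing 9. The function name fstream_round_trip and the file contents are made up for illustration, and some compilers warn about tmpnam.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical miniature test: write a temporary file, read it back with a
// reposition, then intermix a write and a read on a single stream.
bool fstream_round_trip() {
    char name[L_tmpnam];
    if (std::tmpnam(name) == 0)
        return false;

    bool ok = true;
    {   // Write the file; the destructor closes it.
        std::ofstream out(name);
        out << "hello\nworld\n";
        ok = ok && out.good();
    }
    {   // Read it back, repositioning to the start along the way.
        std::ifstream in(name);
        std::string word;
        ok = ok && (in >> word) && word == "hello";
        in.seekg(0);
        ok = ok && (in >> word) && word == "hello";
        ok = ok && (in >> word) && word == "world";
    }
    {   // Intermix: overwrite the first character, then read the word again.
        std::fstream io(name, std::ios::in | std::ios::out);
        std::string word;
        io.seekp(0);
        io << 'J';
        io.seekg(0);    // a seek must separate the write from the read
        ok = ok && (io >> word) && word == "Jello";
    }
    // Always remove the temporary file, whatever happened above.
    return std::remove(name) == 0 && ok;
}
```

The function returns true only if every step succeeds and the temporary file is removed afterward.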

References

[1] P.J. Plauger, The Standard C Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1992).

This article is excerpted in part from P.J. Plauger, The Draft Standard C++ Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1995).