P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.
Introduction
I introduced the header <strstream> last month and showed a variety of ways to use the classes it defines. (See "Standard C/C++: The Header <strstream>," CUJ, January 1995.) Among other things, it defines the class istrstream, which is derived from istream to help you extract from a character sequence stored in memory. Thus, you can write code like:
    istrstream strin("1 4 7 2 5 8 0 3 6 9");
    int i;
    while (strin >> i)
        try_button(i);

to "read" a constant string just as if it were a file.

Similarly, class ostrstream is derived from ostream to help you insert into a character sequence stored in memory. You can construct a string as if writing to a file, for example, then read it later with an istrstream object controlling the same stream buffer, as above. For the special magic involved, both make use of the stream buffer class strstreambuf. As with all stream buffers, it is derived from streambuf, in this case to manage such in-memory character sequences. (See "Standard C/C++: The Header <streambuf>," CUJ, June 1994.)
The net effect is that the classes defined in <strstream> let you read and write these character sequences with exactly the same machinery you use to read and write external files in C++. You can, of course, perform much the same operations with the Standard C library functions sprintf and sscanf, declared in <stdio.h>. But those older functions require you to use different notation for reading and writing in-memory character sequences rather than external files. And they don't work with the inserter and extractor machinery of C++. The classes defined in <strstream> offer an obvious notational advantage, as well as better information hiding.
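To make the notational difference concrete, here is a minimal sketch that extracts the same integers two ways. The function names are hypothetical, chosen only for this illustration, and the sketch assumes a current compiler that still ships <strstream> (it is deprecated in later standards, so expect a warning).

```cpp
#include <cassert>
#include <cstdio>      // std::sscanf
#include <strstream>   // std::istrstream (deprecated, but still widely shipped)
#include <vector>

// Hypothetical helper: extract integers with the Standard C library.
// The format string and the manual cursor (%n) are what the text calls
// "different notation" for in-memory sequences.
std::vector<int> read_with_sscanf(const char* text) {
    assert(text != 0);
    std::vector<int> out;
    int value = 0;
    int used = 0;
    while (std::sscanf(text, "%d%n", &value, &used) == 1) {
        out.push_back(value);
        text += used;            // advance past the characters just consumed
    }
    return out;
}

// Hypothetical helper: the same job with the extractor machinery.
// The loop is identical to one that reads from a file stream.
std::vector<int> read_with_istrstream(const char* text) {
    assert(text != 0);
    std::istrstream strin(text);
    std::vector<int> out;
    int value;
    while (strin >> value)
        out.push_back(value);
    return out;
}
```

Calling either helper with "1 4 7 2 5 8 0 3 6 9" yields the same ten values. Only the second version would work unchanged if the source were an ifstream instead.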
My goal this month is to show you one way to implement these classes. It is part of the implementation I have been presenting for the past year, which is based on the draft C++ Standard going into the March 1994 meeting of WG21/X3J16. That in turn was based heavily on the existing header <strstream.h> which is still widely used as part of the iostreams package of library classes. If you know current C++ practice, much of what you see here will be familiar territory.
In more recent drafts of the C++ Standard, many headers have been "templatized." The Standard C++ library now defines general templates that describe iostreams operations for an arbitrary "character" type T. To reproduce the existing functionality, the library instantiates these general templates for T defined as type char. The net result is much the same, but the machinery and the notation are now far more elaborate.
The header <strstream> has most recently been exempted from this treatment. The committee favors its newly minted header <sstream>, which does much the same thing as <strstream> but works with templatized strings of arbitrary character types. (I discuss the char version of the newer header next month.) The older header is retained, in the interest of preserving existing code, but as a sort of second-class citizen. For all its idiosyncrasies, I personally find it a handy way to fiddle with small sequences of type char. So what you see here is still quite useful. And it is still destined to be part of the final Standard C++ library.
The Header File
Listing 1 shows the header file that implements <strstream>. I've discussed some of the peculiar notation in past columns, but I've also made a few changes as compilers evolve. Briefly:
- The macro _BITMASK defines a "bitmask" type. (See "Standard C/C++: The Header <ios>," CUJ, May 1994.) It expands differently depending on whether the translator supports overloading on enumerated types, a fairly recent addition to the C++ language. I later discovered a need to defer the definitions of the overloaded functions, for bitmask types nested inside classes as is the case here. Thus, the macro _BITMASK_OPS supplies these deferred definitions.
- The type bool has even more recently been added to the C++ language. It represents the true/false value of a test expression, such as a comparison operator. Until recently, I provided the typedef _Bool as a placeholder. But since some translators now supply this type, I've brought my code more up to date. (For older translators, bool is a defined type, not a keyword.)
- The macro _HAS_SIGNED_CHAR expands to a nonzero value for translators that treat char and signed char as distinct types. All translators are supposed to, but many still do not.
I have added to the header two macros to specify in one place two values that are judgement calls. Both deal with the number of characters to allocate when creating or extending a character sequence:
- _ALSIZE, the initial number of bytes to allocate, absent any hints to the contrary (currently 512)
- _MINSIZE, the minimum number of additional bytes to allocate when extending a sequence (currently 32)
You may well have reasons to alter either or both of these values, based on what you know about storage size and granularity on a given implementation.
The bitmask type _Strstate, defined within the class strstreambuf, describes the internal state of such an object. Much of the state information is spelled out in detail by the draft C++ Standard. I have added, however, the element _Noread, which is not used by the classes defined in <strstream>. Adding it here greatly simplifies the implementation of the classes defined in <sstream> (next month). The meaning of each of the _Strstate elements is:
- _Allocated, set when the character sequence has been allocated
- _Constant, set when the character sequence is not to permit insertions
- _Dynamic, set when the character sequence can grow on demand
- _Frozen, set when the character sequence has been frozen (should not be deleted by the destructor)
- _Noread, set when the character sequence is not to permit extractions (is write only)
I developed the protected secret member function strstreambuf::_Init as a way to handle all possible constructors, including those for class stringbuf, defined in the header <sstream>. Similarly, the protected secret member function strstreambuf::_Tidy does all the work of the destructor. It is also used to advantage in class stringbuf. (See the discussion of Listing 2, below.)
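The _Strstate elements described above behave like an enumeration with overloaded bitwise operators. The following is only a sketch of that idea, not the expansion of _BITMASK from Listing 1. The names drop the leading underscore, the numeric values are assumptions, and the two query functions are hypothetical helpers that restate the meanings given in the list.

```cpp
#include <cassert>

// Hypothetical stand-in for the nested bitmask type. Values are assumed.
// The fixed underlying type keeps the result of operator~ a valid value.
enum Strstate : unsigned {
    Allocated = 1, Constant = 2, Dynamic = 4, Frozen = 8, Noread = 16
};

// Overloads on the enumerated type, so combined flags keep their type.
inline Strstate operator|(Strstate a, Strstate b) {
    return static_cast<Strstate>(static_cast<unsigned>(a) | static_cast<unsigned>(b));
}
inline Strstate operator&(Strstate a, Strstate b) {
    return static_cast<Strstate>(static_cast<unsigned>(a) & static_cast<unsigned>(b));
}
inline Strstate operator~(Strstate a) {
    return static_cast<Strstate>(~static_cast<unsigned>(a));
}
inline Strstate& operator|=(Strstate& a, Strstate b) { a = a | b; return a; }
inline Strstate& operator&=(Strstate& a, Strstate b) { a = a & b; return a; }

// Insertions are refused for a constant or a frozen sequence.
inline bool may_insert(Strstate s) {
    return (s & (Constant | Frozen)) == 0;
}

// The destructor frees only storage it allocated and that is not frozen.
inline bool frees_on_destroy(Strstate s) {
    return (s & Allocated) != 0 && (s & Frozen) == 0;
}

// Usage walk-through: each assert documents the expected flag behavior.
inline void strstate_walkthrough() {
    Strstate s = Allocated | Dynamic;
    assert(may_insert(s) && frees_on_destroy(s));
    s |= Frozen;                       // freeze: no insertions, no delete
    assert(!may_insert(s) && !frees_on_destroy(s));
    s &= ~Frozen;                      // unfreeze restores both
    assert(may_insert(s) && frees_on_destroy(s));
}
```

Overloading on the enumeration is what lets an expression such as Allocated | Dynamic remain a Strstate rather than decaying to an integer.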
Most of the objects stored within a strstreambuf object are what you might expect. _Strmode holds the state information. _Alsize holds the current allocated sequence length. _Palloc and _Pfree point at the functions that allocate and free storage, if they are specified when the object is constructed.
But there are also two additional private member objects. Each solves a different problem in managing accesses to the controlled character sequence. _Pendsave stores the end pointer for the output sequence while the stream buffer is frozen. (See the discussion of Listing 4, below.) _Seekhigh stores the highest defined offset encountered so far within the character sequence. The code updates its stored value in several places when that value must be made exact.
Workhorse Functions
Listing 2 shows the file strstrea.c. It defines three of the functions you are likely to need any time you declare an object of class strstreambuf: its destructor, _Init, and _Tidy. Two of the three functions are straightforward, but _Init warrants a bit of study. It selects among multiple forms of initialization by an intricate analysis of its arguments:
- If gp (the "get" pointer) is a null pointer, then n (the size argument) is a suggested initial allocation size.
- Otherwise, if mode has the bit _Dynamic set, then the initial character sequence is copied from one controlled by a string object. The function copies n characters beginning at gp. The calling string constructor can independently inhibit insertions (_Constant) and/or extractions (_Noread).
- Otherwise, the character sequence resides in an existing character array beginning at gp. If n is less than zero, the sequence is assumed to be arbitrarily large (INT_MAX characters). If n is zero, the array is assumed to contain a null-terminated string, which defines the character sequence. If n is greater than zero, it is taken as the length of the character sequence. The function defines an output stream only if pp (the "put" pointer) is not a null pointer and lies within the character sequence.
Be warned that this code is extremely fragile. Partly, it reflects the complexities of the numerous strstreambuf constructors (which I described last month). Partly, it is made larger by the inclusion of support for stringbuf constructors. But the code also enforces delicate streambuf semantics that are hard to spell out in detail. Tinker cautiously.
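The three-way analysis of the _Init arguments can be sketched as a small classifying function. This is not the code from Listing 2. The types InitKind and InitPlan, the function plan_init, and the flag values are all hypothetical, and the sketch only decides what kind of initialization is wanted; it sets no streambuf pointers.

```cpp
#include <cassert>
#include <climits>   // INT_MAX
#include <cstring>   // std::strlen

// Assumed flag values, for illustration only.
const unsigned kConstant = 2;
const unsigned kDynamic = 4;

enum InitKind { FreshDynamic, CopiedFromString, ExistingArray };

struct InitPlan {
    InitKind kind;
    long length;     // suggested allocation size, or sequence length
    bool writable;   // whether an output sequence would be defined
};

// Hypothetical classifier that mirrors the three cases in the text.
InitPlan plan_init(int n, const char* gp, const char* pp, unsigned mode) {
    InitPlan plan;
    if (gp == 0) {
        // Case 1: no array supplied, n is only an allocation hint.
        plan.kind = FreshDynamic;
        plan.length = n;
        plan.writable = true;
    } else if ((mode & kDynamic) != 0) {
        // Case 2: copy n characters from a string-controlled sequence.
        plan.kind = CopiedFromString;
        plan.length = n;
        plan.writable = (mode & kConstant) == 0;
    } else {
        // Case 3: use the caller's array in place.
        plan.kind = ExistingArray;
        if (n < 0)
            plan.length = INT_MAX;                          // arbitrarily large
        else if (n == 0)
            plan.length = static_cast<long>(std::strlen(gp)); // null-terminated
        else
            plan.length = n;
        // pp is assumed to point into the same array as gp when non-null.
        plan.writable = pp != 0 && pp >= gp && (pp - gp) <= plan.length;
    }
    assert(plan.kind != FreshDynamic || gp == 0);
    return plan;
}
```

For a 16-byte array buf holding "hello", plan_init(0, buf, 0, 0) reports a read-only sequence of length 5, while plan_init(16, buf, buf + 5, 0) reports a writable sequence of length 16.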
Listing 3 shows the file strstpro.c. It defines three functions that override streambuf virtual member functions to insert and extract characters: overflow, pbackfail, and underflow. The inherited definition of uflow is adequate, so no override occurs here. (See "Standard C/C++: The Header <streambuf>," CUJ, June 1994.) Once again, two of the three functions are straightforward. Only overflow demands closer study.
It is the business of overflow to "make a write position available," then insert the argument character into it. If the write position is already available, or if none can be made available, the function has an easy job of it. The hard part comes when the function must extend, or initially create, storage for the character sequence. It must then determine the size of any existing sequence (osize) and the desired new size (nsize). Then it can try to allocate the new storage, copy over any existing sequence, and free an existing sequence that was also allocated. Finally, it must determine new settings for the streambuf pointers, using some very finicky arithmetic.
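The steps just described can be sketched with a small write-only buffer derived from std::streambuf. This is not the code from Listing 3. The class GrowBuf and its members are hypothetical, the constants borrow the 512 and 32 quoted earlier for _ALSIZE and _MINSIZE, and growing by half the old size is an assumption made for the illustration. The get area, frozen buffers, and user-supplied allocators are all left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <streambuf>
#include <string>

const std::size_t kAlsize = 512;   // initial allocation (value from the text)
const std::size_t kMinsize = 32;   // minimum growth (value from the text)

// Hypothetical growable, write-only stream buffer.
class GrowBuf : public std::streambuf {
public:
    GrowBuf() : data_(0), size_(0) {}
    ~GrowBuf() { delete[] data_; }
    GrowBuf(const GrowBuf&) = delete;
    GrowBuf& operator=(const GrowBuf&) = delete;

    std::size_t capacity() const { return size_; }
    std::string str() const {
        return data_ ? std::string(pbase(), pptr()) : std::string();
    }

protected:
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);        // nothing to insert
        if (pptr() != 0 && pptr() < epptr()) {      // write position available
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
            return ch;
        }
        // Determine the old size (osize) and the desired new size (nsize).
        std::size_t osize = size_;
        std::size_t used = data_ ? static_cast<std::size_t>(pptr() - pbase()) : 0;
        std::size_t inc = osize / 2 < kMinsize ? kMinsize : osize / 2;
        std::size_t nsize = osize == 0 ? kAlsize : osize + inc;
        char* p = new (std::nothrow) char[nsize];
        if (p == 0)
            return traits_type::eof();              // no write position possible
        if (used > 0)
            std::memcpy(p, data_, used);            // copy the existing sequence
        delete[] data_;                             // free the old storage
        data_ = p;
        size_ = nsize;
        // Re-establish the put pointers: setp resets pptr to the start,
        // so it must be bumped back to where the old sequence ended.
        setp(data_, data_ + size_);
        pbump(static_cast<int>(used));
        assert(pptr() < epptr());
        *pptr() = traits_type::to_char_type(ch);
        pbump(1);
        return ch;
    }

private:
    char* data_;
    std::size_t size_;
};
```

Inserting 1000 characters with sputc triggers three allocations under these assumptions, of 512, 768, and 1152 bytes. The pointer fix-up after setp is the "finicky arithmetic" in miniature; a buffer that also supports extraction has to adjust the get pointers the same way.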
Other Functions
Listing 4 shows the file strstfre.c, which defines the member function strstreambuf::freeze. Here is where the addition of the member object strstreambuf::_Pendsave saves the day. A frozen buffer must not permit insertions, but that is not an easy thing to prevent. The streambuf public member functions won't look past the pointers themselves if they indicate that a write position is available. So the trick is to make the output stream appear empty for a frozen stream buffer by jiggering the end pointer. _Pendsave stores the proper value for later restoration, should the stream buffer be unfrozen.
Listing 5 shows the file strstpos.c. It defines the two functions that override streambuf virtual member functions to alter the stream position: seekoff and seekpos. The often critical value in both functions is the member object strstreambuf::_Seekhigh. It is updated as needed to reflect the current "end," or high-water mark, of the character sequence. That value determines offsets relative to the end (way equals ios::end), as well as an upper bound for valid stream offsets. The logic of both functions is otherwise simple but tedious.
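The end-pointer trick behind freeze can be sketched in isolation. This is not the code from Listing 4. The class FreezableBuf and its members are hypothetical, and the sketch assumes a fixed caller-supplied array with no get area and no growth. The inherited overflow simply reports failure, which is all a frozen buffer needs.

```cpp
#include <cassert>
#include <streambuf>

// Hypothetical fixed-size output buffer that can be frozen and unfrozen.
class FreezableBuf : public std::streambuf {
public:
    FreezableBuf(char* first, char* last) : pendsave_(0), frozen_(false) {
        assert(first != 0 && first <= last);
        setp(first, last);
    }

    void freeze(bool on = true) {
        int used = static_cast<int>(pptr() - pbase());
        if (on && !frozen_) {
            frozen_ = true;
            pendsave_ = epptr();       // remember the real end pointer
            setp(pbase(), pptr());     // end == next: no write position left
            pbump(used);               // setp reset pptr, so restore it
        } else if (!on && frozen_) {
            frozen_ = false;
            setp(pbase(), pendsave_);  // put the saved end pointer back
            pbump(used);
        }
    }

    std::streamsize count() const { return pptr() - pbase(); }

private:
    char* pendsave_;   // plays the role the text gives to _Pendsave
    bool frozen_;
};
```

While the buffer is frozen, sputc finds no room, falls through to the default overflow, and returns end-of-file. After freeze(false), insertions resume exactly where they left off.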
And that concludes the source code for class strstreambuf. The two remaining classes defined in <strstream> are derived from the classes istream and ostream to assist in controlling in-memory character streams. I described how to use both istrstream and ostrstream last month. As you can see from Listing 1, most of the member functions are small and hence defined as inline.
Listing 6 shows the file istrstre.c. It defines the destructor for class istrstream, which is the only member function not defined inline within the class. And Listing 7 shows the file ostrstre.c. It defines the destructor, and a moderately messy constructor, for class ostrstream. I put the constructor here mostly to hide the call to the function strlen, declared in <string.h>. It is not permissible to include the C header that declares it in <strstream>, and I didn't want to make up a version of the function with a secret name. Again, this is the only source code for member functions of class ostrstream not defined inline within the class.
While there are a few tricky spots in the implementation of the stream buffer, most of the code that implements the header <strstream> is small and straightforward. You will find much the same story when we visit other specialized stream buffers derived from class streambuf and its brethren. It is a tribute to the basic design of iostreams that this is so.
This article is excerpted in part from P.J. Plauger, The Draft Standard C++ Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1995).