January 1996/Standard C/C++

Columns

Standard C/C++: STL Headers--An Overview

by P.J. Plauger

Introduction

Last month, I gave a summary of the Standard Template Library (or STL for short), which is now a significant chunk of the draft Standard C++ library. (See Standard C/C++: The Standard Template Library, CUJ, December 1995.) The original STL was developed at Hewlett-Packard Labs, by Alex Stepanov, Meng Lee, and David R. Musser. You can get the H-P code for free, or reasonably cheap, from several sources. And you can get various descriptions of how to use the H-P code as well. But that is not exactly what is now in the draft C++ Standard.

STL was changed some on its way into the draft a year and a half ago. It has changed even more through several intervening meetings. (Its presence in the library has also caused other parts of the library to change to meet it part way, I hasten to point out.) Good as it is, STL has exploded onto the scene primarily because of its acceptance as part of the emerging C++ Standard. Hence, it is important to understand the thing that will be part of all conforming implementations of C++ in the near future. It is less important to know how it has evolved over time.

My primary goal in the coming months is to describe STL, as standardized, in considerable detail. I hope to convey the elegance of its design, and to justify some of the more interesting design decisions. As a side effect, I hope to encourage its use among readers of this magazine. Theres simply too much good stuff there for C++ programmers to overlook for long.

This is a major undertaking, for both me and all you readers. Thirteen of the 33 headers unique to Standard C++ are devoted almost exclusively to STL machinery. By that simple metric, were talking about over a third of a rather large library. Counting lines of source code gives a slightly smaller fraction, but you then should give added weight to STL because it is so dense. Theres a lot more bang for the buck in, say, <algorithms> than in <limits>. However you slice it, STL is a big piece of a big pie.

So this month, I plan to take another pass over the entire STL component. This time around, Ill look at those thirteen headers whats in each header, what you might do with those contents, and how the headers group together. It helps to have a bit of perspective before descending into details.

Grouping the Headers

I tend to group the STL headers in terms of the three major organizing concepts I discussed last month iterators, algorithms, and containers. Containers impose various amounts of structure on a sequence of elements of some object type T. Algorithms perform various useful operations, typically on sequences of elements. And iterators are the glue that holds everything together. Algorithms access the elements in a sequence through iterators.

Table 1 shows one way to group the headers, with these definitions in mind. Part I is called iterators, but it is probably more honestly called glue. The headers <utility>, <iterators>, and <memory> do in fact define a number of classes that can serve as iterators, but that is only part of the story. Their focus is on supplying various support services for algorithms and containers.

Part II is more tightly focused on the business of defining algorithms. Even so, the header <functional> contains nary an algorithm. It is the home of numerous template function objects, as well as more conventional template functions, that work nicely with a number of the algorithms.

Part III is the easiest grouping to defend. Each of its headers does indeed define one or more containers. Some of these containers are defined in terms of others, and hence are mere container adapters. But each supplies a unique, and demonstrably useful, way of organizing a sequence and limiting accesses to it.

I will repeat this table, from time to time, in coming installments to remind you where a given topic fits in the overall scheme of things. I also expect to discuss the various headers in the order presented in Table 1.

Part I: Iterators

The three headers that constitute iterators probably have the least in common, other than serving as the glue I described above.

The Header <utility>

As you might surmise from the vague name, the header <utility> is a catchall. Its not even a very large one at that. But its contents are used throughout STL. The only class it defines is the template class pair, which holds a pair of objects of arbitrary types. Certain algorithms return a pair of values they do so by returning an instance of pair. And the container map holds pairs of keys and their associated values.

The header <utility> also defines the template function make_pair, which does just what you think. It returns a pair object constructed from its two arguments. Sadly, it is not as useful as you might think when working with map containers. But now is not the time to discuss why.

Finally, the header provides four template operator definitions. One defines operator!=, for operands of the same type, in terms of operator==. The other three similarly define operator>, operator<=, and operator>= in terms of operator<. These definitions impose a total ordering on their operand types the kind of relationship you take for granted among integers and floating-point values.

The latter group of three has a far reaching effect on C++ programs, one that not everyone approves of. Since they are not directly used by the rest of STL, I cant help but question whether theyll long endure.

The Header <iterator>

The header <iterator> does indeed define a number of iterator classes, but that is only part of the story. It also includes considerable machinery that helps classify and manipulate all sorts of iterators. This common machinery helps you extract, in a standard way, information often needed in conjunction with iterators. It also helps you write sets of templates that expand different ways for different categories of iterators, so you can trade off time complexity for flexibility in implementing an algorithm.

For example, all input iterators should have as a base an object of class input_iterator<T, Dist>. Here, T is the type of object designated by the iterator and Dist is the type that represents the algebraic difference of two iterators. (Typically, Dist is ptrdiff_t.) The template function value_type has a return value of type T * for a parameter whose type is based on input_iterator<T, Dist>. Similarly, the template function distance_type has a return value of type Dist *. You use these template functions as arguments to still other template instantiations, to smuggle in the types T or Dist in ways that you can use them in declarations.

The template function distance lets you determine the distance, of type Dist naturally, between two iterators. And the template function advance lets you add a distance of type Dist to an iterator. You use these templates, for any category of iterators except output iterators, when youre not sure the iterators support the usual addition and subtraction operators.

You can play even more clever tricks with the template function iterator_category. For a parameter whose type is based on input_iterator<T, Dist>, it defines the unique return type input_iterator_tag. Thats the key to instantiating a template designed especially for input iterators. As you might guess, parallel facilities are defined for all the categories of iterators, even conventional pointer objects.

If you havent glazed over after the last few paragraphs, you probably havent been reading too closely. I just described a lot of technology with a few coarse brush strokes. Not to worry. We will revisit this stuff in more detail in future installments.

By comparison, the iterators that are defined in <iterator> are reasonably easy to understand. Template class reverse_iterator helps you construct an iterator that runs backwards, from an existing random-access iterator. Template class reverse_bidirectional_iterator does the same for an existing bidirectional iterator. Most of the containers use these templates to supply reverse iterators for the sequences they control.

You can append elements to a controlled sequence, in the guise of writing a new sequence, by instantiating a back_insert_iterator for the container that controls the sequence. To prepend elements instead, instantiate a front_insert_iterator. To insert them at a designated point within the controlled sequence, use an insert_iterator.

You can play similar games with iostreams objects. Say you want to extract integers from the standard input and process them as a sequence of elements of type long. Instantiate an istream_iterator<long> for cin and youre in business. You generate such a sequence to the standard output by instantiating an ostream_iterator<long> for cout. To deal in individual characters instead, use istreambuf_iterator and ostreambuf_iterator.

The Header <memory>

STL often manages storage directly, usually to improve performance or flexibility. The header <memory> contains an assortment of mechanisms to aid in this tricky process. Perhaps the most fundamental of these mechanisms is class allocator. An object of this class supplies all the magic needed by a container object for allocating and freeing storage, and for addressing that storage if it inhabits some exotic locale, such as a far heap. Theres even an overload for operator new that makes use of such an allocator object.

More mundane are the pair of template functions get_temporary_buffer and return_temporary_buffer. These buy and sell scratch storage for use by several of the more sophisticated algorithms. Such algorithms must be careful to construct objects in newly acquired raw storage, to replace stored values by assigning to objects once theyve been constructed, and to destroy objects before scratch storage is freed. Thats the purpose of template class raw_storage_iterator, and several algorithms that construct as they copy.

An interloper in this header is template class auto_ptr. Proposed by Greg Colvin, an occasional contributor to CUJ, this class provides a smart pointer for keeping track of allocated storage and ensuring its proper disposal. (For more on smart pointers, see Colvins "Smart Pointers for C++ Garbage Collection," CUJ, December 1995.) Class auto_ptr is the only inhabitant of the thirteen STL headers that is arguably not directly a part of STL.

Part II: Algorithms

The three headers that constitute algorithms supply lots and lots of machinery for performing common operations on sequences of objects.

The Header <functional>

As I indicated earlier, the header <functional> does not itself contain any algorithms. What it does contain is a whole slew of classes that define function objects classes that provide one or more definitions for operator(). For example, here are the definitions of the base class binary_function and the template class plus:
template<class A1, class A2, class R>
  struct binary_function {
  typedef A1 first_argument_type;
  typedef A2 second_argument_type;
  typedef R result_type;
  };
                    
template<class T>
struct plus :
    binary_function<T, T, T> {
T operator()(const T& X,
             const T& Y) const
    {return (X + Y); }
};
You can construct an object, such as plus(), then call it as if it were a function, as in plus()(3.0, 2.0). In this case, the result has type double and the value 5.0.

Numerous algorithm templates accept function objects as arguments. They can thus perform the same operation over all the elements of a sequence, for example. The net result is much the same as passing pointers to callback functions, but with considerably greater notational convenience.

As you might guess, the header <functional> defines template classes for all the binary operators defined in C. It also defines the base class unary_function and template classes for all the unary operators as well. These definitions make it easy for you to perform simple operations on a sequence, without having to create your own function objects.

The header goes even further, however. It also supplies some plumbing for combining function objects to form more elaborate expressions. With a bit of patience and practice, you can build up some pretty impressive operations from the basic building blocks. I wont even try to illustrate such antics here, however.

The Header <algorithm>

The header <algorithm> is huge. It contains over 60 different template functions, almost all of which operate on one or more sequences of elements specified by ranges of iterators. They range from the dirt simple, such as for_each, to the amazingly complex, such as stable_sort. Just to shorten compile times, it might be nice if this header were further subdivided. But I have yet to hear anyone propose a sensible scheme for doing so.

The good news is that the algorithms are largely independent. You can study them largely in isolation, or at least in small groups, once you master the machinery described in Part I: Iterators. That leads to an interesting irony. This overview has the least to say about what is doubtless the most extensive part of STL.

The Header <numeric>

One group of algorithms that did get separated out, for whatever reason, is a small group that performs simple arithmetic across one or more sequences. The template functions are accumulate, inner_product, partial_sum, and adjacent_difference. Each of these can accept function objects that perform various binary operations across sequences. Each also has a definition that supplies the commonest form of binary operator. For example, left to its own devices, accumulate behaves as if it were instantiated with the function object plus described above.

Part III: Containers

The seven headers that constitute containers define a number of different template container classes. Each is characterized by its internal structure, which imposes differing tradeoffs between ease of access and ease of modification of the controlled sequences.

The Header <vector>

The header <vector> defines template class vector, which controls a contiguous array of elements. That makes for fast random access, but requires tedious rewriting if you want to insert or delete elements. An explicit specialization endeavors to provide an efficient implementation for vector<bool>.

The Header <list>

The header <list> defines template class list, which controls a bidirectional linked list of elements. That makes for easy insertion and deletion at arbitrary places in the sequence, but rules out fast random access. The template class supports only bidirectional iterators.

The Header <deque>

The header <deque> defines template class deque, which is a compromise between a vector and a list. It is typically implemented as a vector of pointers to blocks of elements. Hence, it supports random access, and reasonably fast insertions and deletions at either end of the sequence. (In the middle, it is no better than a vector.)

The Header <stack>

The header <stack> defines template class stack, which permits insertions and deletions only at the end of the controlled sequence. Moreover, you can access only the last element inserted in the stack. (This is a last-in/first-out, or LIFO, discipline.) You construct a stack from another container type, such as a deque or list.

The Header <queue>

The header <queue> defines template class queue, which permits insertions only at the beginning and deletions only at the end of the controlled sequence. Moreover, you can access only the first or last element in the sequence. (Thus, you can confine accesses to a first-in/first-out, or FIFO, discipline.) You construct a queue from another container type, such as a deque or list.

The header also defines template class priority_queue, which is a queue kept in sort by some predicate you supply. In the default case, the top of the queue the only element you can access or remove is always the largest element in the queue.

The Header <set>

The header <set> defines template class set, which controls a sequence of unique elements kept in sort by some predicate you supply. (The uniqueness criterion means that no two elements can compare equal by the ordering predicate.) Typically, the sequence is maintained as a red-black tree, which balances the cost of insertions, deletions, and searches for an ordered sequence. All such operations take time proportional to the logarithm of the number of elements in the sequence. The template class supports only bidirectional iterators.

The header also defines the template class multiset, which is a set that also permits multiple elements that compare equal by the ordering predicate.

The Header <map>

The header <map> defines template class map, which is essentially a set instantiated for pairs of objects. A pair consists of a key and a value component. The ordering predicate applies only to the key component. No two elements of a map can have keys that compare equal by the ordering predicate.

The header also defines the template class multimap, which is a map that also permits multiple elements whose keys compare equal by the ordering predicate.

Summary

As you can see, the Standard Template Library contains an enormous amount of functionality. I gave this more detailed overview to provide a better feel for whats there, but I also hoped to convey the richness of STL. It is easy to be overwhelmed by all the detail I know that I was myself when I first tackled this subject. Only by seeing both the forest and the individual trees can you explore this thicket and not get repeatedly lost in the process.

P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library, and Programming on Purpose (three volumes), all published by Prentice-Hall.You can reach him at pjp@plauger.com.