December 1995/Standard C/C++

Columns

Standard C/C++

The Standard Template Library

P. J. Plauger

P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library, and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.

Introduction
The Standard Template Library (or STL for short) is an impressive collection of software from Hewlett-Packard Labs. It was developed there by Alex Stepanov (now at Silicon Graphics), Meng Lee (still at Hewlett-Packard), and David R. Musser (at Rensselaer Polytechnic Institute). The current version is in C++, but the technology long predates the addition of templates to that language. In fact, Musser and Stepanov described an early implementation using Ada "generics," a form of templates, over seven years ago. (See References [1], [2], and [3] at the end of this column.) And that work was based on still earlier work by these two.
The result today is upwards of 9,000 lines of highly-refined code. Almost all of it takes the form of template classes and template functions. You can instantiate these templates for a wide variety of types, either builtin or classes you define yourself. These instantiations often come remarkably close to the best tailored code you could write by hand. And they supply algorithms and data structures that are devilishly hard to get right if you code them yourself. So you get the benefit of numerous well crafted solutions to common problems in a form that is efficient in code space and execution time as well.
You can categorize all the STL code as falling into three broad areas: algorithms, containers, and iterators. I consider iterators to be the most inventive and interesting of the three categories, but it is also the hardest to explain. Thus, I'll deal with algorithms and containers first.

Algorithms
Strictly speaking, an algorithm is a mathematical procedure for determining a result. It differs from a mathematical function in just a few ways. Probably the most important to us computer types is that an algorithm involves only a finite number of operations to complete. By contrast, a function can conceptually take an infinite number of operations, yet still yield a well-defined result.
In programming, we use the term algorithm a bit more loosely. We take it to mean any function (in the C/C++ sense) that computes a useful result. A good algorithm does the job with an economy of operations, and with predictable time complexity. It is chary of special cases, like zero repetition counts, and it avoids gratuitous intermediate overflows and other computing anomalies. It is cohesive, presenting a narrow and sensible interface to the outside world.
A well coded algorithm is thus an important asset to the working programmer. We use them all the time, in the guise of the Standard C library. All those math functions, string manipulators, and I/O facilities capture any number of algorithms in code that have proved to be remarkably reusable over the years.
In the case of a function library, the choice of data types is critical to reusability. A square root function is much more widely usable if it takes a double argument than if it takes an int. A sequence of char makes a more widely usable string or file than, say, a string of long elements. Imagine how much more useful all these functions could be, however, if you had some say in what data type should be used.
That's where templates come in. Within some limits, you can specialize (or instantiate) a template function or class for one or more specific types, and the translator fills in the blanks for you. Those limits are dictated by any assumptions made, within the template, about what you can do with the parameter type. If the template wants to add one to an object of the parameter type, then the type you choose had better define what it means to add one to an object of that type.
A significant achievement of STL is that it provides quite a number of useful algorithms. It does so in a helpful framework — the types you can substitute for its parameters fall into a handful of categories. Each of these categories is pretty clearly spelled out, and the template definitions are robustly written. Thus, there's a high probability you will guess right about what types you can use with a given algorithm, and there's an equally high probability that the code will do the right thing.
STL offers around 100 template functions that implement algorithms. On the simple end are templates like for_each, which calls the function you specify for each element of a sequence. On the complex end you will find things like stable_sort, which sorts in place a sequence using an ordering rule you specify, preserving the order of elemnts that compare equal.
As you become familiar with STL, you find that more and more of the "interesting" code you used to write can now be written in just a few lines. Invoking an algorithm template or two does all the hard stuff. And it does the job better than you are likely to do.

Containers
Organizing data is as important as choosing the algorithms that manipulate it. How you structure a sequence of elements determines how messy or time consuming it is to add elements, remove them, visit them in various orders, or rearrange them. Indeed, the choice of data structures can make or break a program when it comes to implementing its most time-critical operations.
I long ago observed an interesting parallel in physics. Some computations are notoriously complex, particularly those that require matching up two solutions at a boundary. One approach to solving such problems is to change coordinate systems. Move to a system where the boundary looks simple and the equations are easy to solve. Of course, changing back and forth between coordinate systems may be messy in its own right. The advantage is that someone has probably solved that problem for you, so you can just recycle a known technique.
We in computing have repeatedly rediscovered a handful of useful ways to organize collections of data. Think of them as the moral equivalent of coordinate systems, for the analogy above. An array is good for quick random access, but it doesn't grow easily, and insertions are a real nuisance. A linked list makes light work of additions and insertions, at the cost of linear access instead of random. And so on.
The sad thing is that we've all reimplemented vectors, lists, and other common data structures repeatedly over the years. The code in each case is highly similar, but it has to change in small ways to accommodate variations in the data being managed. Wouldn't it be nice if someone could implement a linked list once and for all in such a way that we'd all be happy to recycle that implementation?
That's where STL's containers come in. These are template classes that support the half dozen or so commonest ways to organize data. The template parameter lets you specify the type of data in each element. All that code for growing, shrinking, inserting, and visiting all the elements is provided once and for all by the containers themselves.
Once again, you slowly learn that much of the code you tend to write for a new program is just bookkeeping for a common data organization. An STL container or two can save some tedious effort.

Iterators
Iterators are the glue that pastes together algorithms and containers. They generalize the concept of a pointer in C by using the ability to overload operators for newly defined classes in C++. Consider, for example, the common pattern for summing the elements of a sequence bounded by the pointers first and last:

for (sum = 0; first != last; ++first) sum += *first;
Here, first sure looks like a pointer, but in C++ it need not be. It can, in fact, be any type that supports:

comparison for (in)equality (first != last)

dereferencing to access a value (*first)

(pre)incrementing (++first)
Of course, you can also use an object pointer in this code.
Add a few lines to the example above and you can make a template function that sums the values in a sequence. Add a few more requirements to those listed immediately above and you have the specifications for a typical iterator.
Nearly all of the algorithms that STL provides work on sequences accessed via iterators. A handful of categories describes the various iterators around which the algorithm functions are defined. This is the organizing principle that gives STL much of its strength and flexibility.
Each STL container defines the iterators needed to access its elements. You can, of course, also supply your own iterators for your own data structures as well. Most important of all, ordinary pointers almost always work just fine as iterators. You don't have to define a bunch of fancy classes if you don't want to. And yet you can still easily extend the set of algorithms and containers that work within the STL framework.

The Code
Hewlett-Packard Labs has kindly released their implementation of STL into the public domain. You can use the code and the accompanying documentation freely, even redistribute it as part of commercial products. You pay no royalties. All they ask is that you reproduce their copyright and disclaimer notice, which is reasonably innocuous:

"Copyright (C) 1994, Hewlett-Packard Company. Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. Hewlett-Packard Company makes no representations about the suitability of this software for any purpose. It is provided 'as is' without express or implied warranty."

Hewlett-Packard
has also applied for several patents on some of the adaptive algorithms developed as part of STL. They have announced their intention to "let" these patents — allow them to be used without prior permission and with no royalties. You have to give the company top marks for good citizenship in this area.
For information on how to obtain a copy of the Hewlett-Packard STL code, see the sidebar, "Getting Your Hands on STL." If you're interested in a commercial version, at least two companies have been selling repackaged STL for some time now. Both Modena Software (1-800-MODENA-1) and ObjectSpace (1-800-OBJECT-1) have run ads here in CUJ at various times over the past year.
Note, however, that the public-domain version of STL, or a commercial repackaging thereof, is not what has been accepted as an ANSI/ISO standard. As with any proposal accepted by the standards committees, the original STL specification has been reworked several times over the past year or so. Approved changes, large and small, affect nearly every corner of the STL code. As a result, programs you write using the existing packages may well have to change when you move to an implementation that conforms to the draft C++ Standard.
The very organization has changed, for example. As shipped by Hewlett-Packard, STL is organized into 48 headers, as shown in Table 1.
These are rearranged into thirteen headers in the draft C++ Standard:
algorithm   deque
functional  iterator
list        map
memory      numeric
queue       set
stack       utility
vector
The impact of this particular change is far reaching, but easily addressed. Just change your list of include files. Keep adding headers until the compiler stops complaining. (Some people make a header <stl.h> that includes the whole package, which neatly encapsulates the change of headers, but that can really slow certain compiles.)
More fundamental changes occur in how containers allocate storage. They primarily affect the more adventurous users of STL, such as those who endeavor to make use of a far heap with the current package. Just a few of the changes affect simple uses of STL, as I recall.
As with all things C++, you suffer more from many small changes persisting over the years than you do from a few major ones. Those who love excitement are continually rewarded. Those who crave stability feel like they're being pecked to death by sparrows. The only solace I can offer in this area is that the draft C++ Standard is indeed getting progressively more stable with each passing trimester. Once the big players get caught up with (roughly) the current draft, you can expect even greater stability.

Coming Soon
Commercial C++ compilers with properly updated versions of STL will begin appearing on the market in early 1996. Commercial validation suites for the Standard C++ library are appearing in the same time frame. I speak knowledgably on this topic because I am now a vendor of both. To be precise, Plum Hall Inc. (1-800-PLUMHALL) is the vendor and I'm the implementor in the back room. I can attest that the Standard C++ library business is now both competitive and fast moving.
What the validation suites do is keep the library vendors honest. Maybe your code that uses STL won't have to change much with the emerging standard, but those implementations sure have to. Once the flurry of debugging settles down, you can expect more uniformity among commercial offerings that include the new Standard C++ library.
The down side of this increased commercial interest is that conforming versions of STL are not likely to be as widely available as the existing crop based on the Hewlett-Packard release. With serious money at stake, some library vendors are rather protective of their code. With the possible exception of the Project GNU folks (see the sidebar), you probably won't find too many inexpensive offerings that also conform to the draft C++ Standard.
Protecting STL is a challenge, by the way. All the code lives in header files, near as no matter, which are notoriously easy to read and to copy. I still hear talk about possible C++ implementations that let you precompile templates, much like conventional library object modules, but that technology hasn't made it into the real world yet, as far as I know. As a consequence, some library vendors are insisting that compiler vendors "shroud" their headers. Only encrypted versions of the headers are visible as disk files — the compiler decrypts these files on the fly as your code includes them. That may be convenient for the vendors, but it sure makes life tough for users who like to understand the tools they depend on.
For my part, I continue to believe in copyright protections. I don't mind disclosing great chunks of the stuff I sell, so long as my audience is adequately educated about what they can and cannot do with it. My policy has remained the same since I started publishing C library code over four years ago [4]. You can build executable programs that use my code, royalty free, provided you preserve my copyright notice. You can find a more precise statement in the preface of my more recent book on the C++ library [5]:

"All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing of the author. You may use and redistribute the code fragments in this book without royalty or fee only as part of executable images, and only provided that the following notice is included prominently in the associated documentation and as part of the executable image: 'Portions of this work are derived from The Standard C++ Library, copyright (c) 1995 by P.J. Plauger, published by Prentice-Hall, and are used with permission.'"

If you want to include a linkable version of the library, or its all important header files, in anything that you distribute, talk to the nice folks at Plum Hall Inc. about licensing.
In the months to come, I'll be exploring STL in greater detail. Along the way, you'll get to see my version of much of that code. (No, I don't claim to have reauthored it. It's mostly reformatted and altered to conform to the draft C++ Standard.) I think you'll find it an interesting journey.

References
[1] D.R. Musser and A.A. Stepanov, "A Library of Generic Algorithms in Ada," Proc. 1987 ACM SIGAda International Conference, Boston, December 1987.
[2] D.R. Musser and A.A. Stepanov, Ada Generic Library, Springer-Verlag, 1989.
[3] A.A. Stepanov and Meng Lee, "The Standard Template Library," Technical Report HPL-94-34, Hewlett-Packard Laboratories, April 1994.
[4] P.J. Plauger, The Standard C Library, Prentice-Hall, 1992.
[5] P.J. Plauger, The Draft Standard C++ Library, Prentice-Hall, 1995.