Columns


Standard C/C++

P. J. Plauger

The Header <list>

The advantage of a linked list is its flexibility. STL's <list> adds genericity, and convenient access in the bargain.


Introduction

The header <list> defines the template class list, naturally enough. Template class list is another of the template container classes defined by the Standard Template Library (STL). (See "Standard C/C++: Containers," CUJ, December 1996.) It is only slightly more complicated than template class vector, which I described last month. (See "Standard C/C++: The Header <vector>," CUJ, January 1997.)

To repeat my usual litany, a container is a class that manages a sequence. How it does so determines which operations on the container are easy and fast to perform, and which are hard and slow. STL offers an assortment of containers with different properties so you can choose the one with the most desirable attributes for a given application.

Template class list stores a controlled sequence of length N as a bidirectional linked list of N nodes, each of which stores a single element. The advantage of a linked list is its flexibility. You can insert and remove elements freely and easily within the list, just by rewriting the forward and backward links. You can even splice in whole sublists. The list nodes themselves don't move about in memory. As a consequence, any iterators you maintain to designate individual nodes remain valid for the life of the node. Similarly, any pointers you maintain to the individual list element itself also remain valid for the life of the node in which the element resides.

The price you pay is sequential access to arbitrary elements in the sequence. To locate element N, for example, you have to chain from one element to another N times, beginning with a pointer stored in the container object. You can chain in either direction, but chain you must. Using STL terminology, template class list supports bidirectional iterators.

Table 1 shows how template class list stacks up against the other STL containers. It is the clear winner for all operations that rearrange list elements (insertions, deletions, and replacements). It is the clear loser for all operations that locate arbitrary elements (searches and random access). It also requires a moderately hefty overhead of two pointers per element, the forward and backward links stored in each node.

So in summary, you use template class list when you need flexibility in rearranging sequences of elements, and in keeping track of individual elements by storing iterators that remain valid across rearrangements. Locating arbitrary elements within a list object is relatively expensive, since you have to perform a linear search each time.

Implementing List

Listing 1 shows template class list from the header <list>. I remind you that the /// comments indicate compromises needed for current compiler technology. Features required by the draft C++ Standard but not supported by many current compilers are commented out with a leading ///. Replacement code is "commented in," and flagged for later removal, with a trailing ///. Most of the rewriting of template class list is to avoid using member templates, which are not yet widely supported.

As with template class vector, which I described last month, I won't describe here how list meets its obvious obligations as an STL container. Instead, I focus on peculiarities of its implementation. The heart of a list object, as usual, is the data it stores to represent the controlled sequence:

That's the easy part, familiar to anyone who has ever managed a linked list. The object also stores between one and three allocator objects, depending on whether the compiler supports member templates. Here's where the real trickery comes in.

In STL containers, all controlled storage is nominally managed through the allocator object specified when you construct the container object. (See "Standard C/C++: Allocators," CUJ, June 1996.) As I explained last month, in conjunction with vector objects, allocators were originally invented to allocate and free arrays of objects of some element type T. They have since been made far more ambitious, and complex. Template class vector can get away with the simplest usages, but not so template class list, for several subtle reasons.

To begin at the beginning, let's examine how you normally manage a bidirectional linked list. You need to define a class Node that stores all the data required of a list node. To store an object of type Ty, you can write:

class Node {
    Node *Next, *Prev;
    Ty Value;
    };

The one small trick you must make use of dates back to the earliest days of the C language, from which C++ evolved. The forward link Next and the backward link Prev are both self-referential pointers — they point at other objects of the same type as the object in which they reside. No sweat, you can declare a pointer to an incomplete type inside a structured type, even if that type is the one you're busy completing.

But allocators cause problems. The first problem is that an object of type list<Ty, allocator<Ty> > is constructed with a useless allocator object. We're not interested in allocating objects of type Ty, which is all that an allocator<Ty> object knows how to do. Instead, we want to allocate objects of type Node. That means we really want an allocator object of type Allocator<Node>. And we want to associate it, in some obvious way, with the allocator<Ty> object supplied to the list object when it is constructed. The allocator might be allocating objects from a private storage pool, for example, which we certainly want to use as intended.

If an implementation supports member templates, you can use two bits of trickery supplied by all allocator types. The first is a member template class that lets you name the actual allocator type you need:

template<class U>
    struct rebind {
        typedef allocator<U> other;

Thus, the bizarre formula A::rebind<Node>::other is a way of naming the type Alloc<Node> when all you have is the synonym A for the type Alloc<Ty>. (This is very twisty stuff. Study it for a while and, if you still don't get it, come back to it later.)

Once you can name the kind of allocator object you want, you still have to construct one from the original allocator object. So all allocator types supply a template constructor:

template<class U>
    allocator(const allocator<U>&);

You can thus construct an Alloc<Node> object from an Alloc<Ty> object. Presumably the constructor is smart enough to copy over any pointers to private storage, or what have you. (Same caveat as above — don't think too hard about it.)

Listing 1 reveals that template class list stores the object Myal (for "my allocator"), at least when member templates are available. The container object uses this allocator object to allocate and free all list nodes. When member templates are not available, the code instead calls the nonstandard member function Charalloc, which I described in the June 1996 installment. Much like the Standard C library function malloc, it lets you allocate an object of arbitrary size.

Generalizing Pointers

Allocators cause yet another problem. They reserve the right to store the objects they allocate in funny places. More precisely, an allocator type A defines for you the type A::pointer, which you are obliged to use to describe any "pointer" to an allocated object. I use quotes here because the type need not be a pointer in the old-fashioned sense inherited from C. If p has type A::pointer, the only promise is that *p is an lvalue that designates the allocated object. (You can access the value or assign to it via *p.)

But this causes a real problem with the declaration of class Node. You want to write:

class Node {
    A::rebind<Node>::pointer Next, Prev;
    Ty Value;
    };

but you can't. An allocator template can be specialized only for a complete type. Type Node is not complete until the closing brace of its definition. You need to describe the pointers it stores before you can complete its definition. What can you do?

When such dependency loops occur, the usual copout in C is to introduce generic, or void, pointers, as in:

class Node {
    void *Next, *Prev;
    Ty Value;
    };

A void pointer is obliged to store any kind of object pointer you can declare in C. You lose a bit of type checking this way, and you have to write occasional type casts when you use the pointers, but it does solve the problem.

When it comes to pointers supplied by allocators, however, the draft C++ Standard is less than clear. In one place, it says that any A::pointer must be interconvertible with a void pointer. In other places, it suggests that an A::pointer can be an arbitrary class type, subject to the restrictions I described above. So I chose a more convoluted, but probably more robust path.

I assume only that any A::pointer is interconvertible with any A::rebind<void>::pointer. Put another way, the type Alloc<void>::pointer supplies the generic pointer type for the family of types Alloc<Ty>::pointer. Whoever writes the allocator template must supply an explicit specialization for type void anyway. It shouldn't be all that hard to ensure that the explicit specialization supplies a sufficiently flexible pointer type in the bargain. (Once again, if you can't track all this stuff, don't worry too much. Most of the rest of template class list is easy enough to understand.)

Perhaps now you can understand the reason for the type Genptr in Listing 1. It is the notorious generic pointer I just described. One small added complexity is the allocator object Myalp (for "my allocator for pointers"). You don't need to allocate and free such pointers, but you do need to know how to construct and destroy such creatures. An allocator object gives you access to the member functions needed to perform these operations.

Still more complexity is encapsulated in the member struct Acc. It supplies handy functions for accessing the objects stored in a list node. Thus, the expression Acc::Next(Ptr) lets you access the forward pointer Next in the node designated by Ptr. To make the expression an lvalue, the function Acc::Next must return a reference to the stored object. Opinions differ considerably on how much latitude an allocator has in defining reference types. Some feel that Alloc<Ty>::reference must always be a synonym for Ty&. But just in case someone supplies an allocator that succeeds in being more clever, I make uniform use of the reference types defined by the allocators.

The Easy Stuff

Once you get past the sophisticated machinery for allocating and freeing nodes, you'll find the rest of template class list rewards even light reading. The template class defines nontrivial member classes iterator and const_iterator, for much the same reasons that vector<bool> did so last month. Chaining from node to node is slightly more work than incrementing a pointer. But the actual implementation of these iterator types is far more standard machinery than trickery.

Template class list differs from the other STL containers in its support for rearranging elements and subsequences. Take a close look at the member functions splice, unique, merge, sort, and reverse. You will find some truly interesting algorithms encapsulated here. But that's a topic for another essay. o

P.J. Plauger is senior editor of C/C++ Users Journal. He is the author of the Standard C++ Library shipped with Microsoft's Visual C++, v4.2. For eight years, he served as convener of the ISO C standards committee, WG14. He remains active on the C++ committee, WG21. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at pjp@plauger.com.