April 2000/Views, A New Form of Container Adaptors

Features

Views, A New Form of Container Adaptors

Gary Powell and Martin Weiser

There's more than one way to slice a container. Views provide a particularly sharp knife.

The C++ standard template library has many useful containers for data. It also includes two container adaptors, queue and stack. We have extended this model along the lines of relational database semantics. An adaptor allows the standard algorithms to be used on a subset or modification of the data without having to copy the data elements into a new container. We provide many useful adaptors that can be used together to produce interesting views of data in a container.

When manipulating large containers of data, it would be handy to be able to use the algorithms in the standard template library. Unfortunately, the standard algorithms often require the data to be in a specific representation or to consist of only a special subset of the data. By using an adaptor, a container can be made to appear to the algorithm as if it contained just the data of interest, producing the required representation or subset on the fly, without any copying.

We call these container adaptors "views," as they are just another way of looking at the data. By our definition, the creation of a view does not modify the data in the original container. The view is as close to the original container as it can be in access functionality. By this we mean that a random-access data container, if possible, will have a random-access view. Some views are at most only bidirectional.

Views are designed to work with the standard algorithms and therefore provide iterators. Views are designed to appear as if they were a STL compliant container. The view is not the container, it is instead the transformation of a container into one with different attributes desirable for writing clean, clear, efficent code.

The current alternative to using a view is to use "smart iterators." (See [1].) Smart iterators are adaptors for iterators, while views are adaptors for containers. Although all the standard algorithms work on ranges instead of containers, other useful algorithms sometimes must call on the standard container member functions begin and end, or perhaps even rbegin and rend, as well as size and swap. They may also need to refer to standard container member types, such as iterator and const_iterator. In these cases, smart iterators are not powerful enough.

By contrast, a view is simply an adaptor that, applied to a container, presents on the fly a controlled sequence with an interface satisfying the requirements of the standard STL containers. Note, however, that several of the views presented here fail to satisfy some of the standard time-complexity requirements, because of the additional functionality they provide. Check the documentation [2] for a full description of each view type.

Views are a natural way to write smart-iterator factories. In fact, most of the algorithmic intelligence of views is encapsulated in their iterators.

Views can simplify writing class interfaces. Assume you have a class that contains nodes, edges, and patches. Perhaps it is a graph package or a finite element mesh — the latter is something we use views for. If you want to provide some restricted access to the graph data, how do you do this? You could make the node, edge, and patch containers public, but this is generally considered poor encapsulation. Maybe the node container is a polymorphic container of interior nodes and boundary nodes. Users of the mesh would then have to deal with a container of pointers.

Alternatively you could provide member functions such as beginNodes, endNodes, beginEdges, etc. And don't forget numNodes, numEdges, and perhaps node and edge, all as random-access methods. Still like this approach? We didn't. Our solution instead is to present appropriately adapted views that reference the internal containers. Writing calls such as nodes.begin() or edges.size() is better and easier to remember than numEdges — or was it num_edges, or size_edges, or something else? Views present the standard interface people are used to when dealing with STL containers.

Our views concept is based on the ideas of Jon Seymour [3] from 1996. He constructed a view that contained a transformation as well as a filter and thus resembled the SQL statements even closer than our current approach. But since transformation and filtering are useful in their own right, we decided to decouple these tasks and to create a lean, orthogonal interface. Of course, the functionality offered by both approaches is the same.

Examples

In the examples that follow, we will be using standard algorithms from STL. All the STL names are declared in namespace std, but rather than clutter up our examples with a lot of std:: qualifiers, we have omitted specifing the namespace. In actual code you will have to either open the entire namespace std by writing:
using namespace std;
or list the names you actually use with individual using declarations, as in:
using std::sort;
Or you will have to insert all those std:: qualifiers we omitted. Also we will be creating classes whose only method is operator(). These classes are known as "functors," because you can apparently call them just like functions.

As a first example, say we have a container of employees and we want to find the oldest one. We can use a transform_view from our library. A transform_view stores a transform function, which by definition takes an element of the container and returns a result. (See Figure 1.) The transform function stored by transform_view is represented by the template parameter Operation. This template parameter can be either a pointer to a function, or a functor class. For this example, we provide a pointer to a function that returns the age of the employee. Using the transform_view (one of the most useful views) with the transformation just described, we then create a view of the container that presents only the ages. A call to max_element yields an iterator referencing the employee record we desire:
// A really minimal Employee Record
struct Employee {
   Date       dateOfBirth;
   Department department;
   Salary     salary;
   Name       name;
};

typedef container<Employee> EmployeeContainer;

EmployeeContainer AllEmployees;

Time getAge(Employee const& e)
{ return today() - e.dateOfBirth; }

typedef transform_view<
           EmployeeContainer,
           Time (*)(Employee const&)
        >
        DateOfBirthView;

// Search global Employee Container.
Employee& getOldestEmployee()
{
DateOfBirthView dobv(AllEmployees, getAge):

DateOfBirthView::iterator OldestAge
  = max_element(dobv.begin(), dobv.end());

Employee& OldestEmployee = *(OldestAge.get());

return OldestEmployee;
}
Here we use the type container to represent any STL-compliant container. We have also provided a get function in the transform_iterator class to return the underlying container iterator.

The DateOfBirthView resembles a projection in relational algebra followed by a simple transformation, as realized by the SQL statement:
SELECT age(DateOfBirth) FROM 
   AllEmployees.
Now let's say we want to enable only a somewhat limited access to the employee data. Perhaps a specific user should see only the employees of a special department. We can use a filter_view from the views library. A filter_view contains a predicate as well as a reference to a basic container. It presents just the elements for which the predicate is true. Here is the solution using a view:
class Visible {
   Department visible;
public:
   Visible(Department const& v)
      : visible(v) {}

   bool operator()(Employee const& e)
   const
   { return e.department == visible; }
};

// Create the filter_view type specific to our need.
typedef filter_view<
           EmployeeContainer,
           Visible
        >
        VisibleEmployeeViewType;

// Create the actual view.
VisibleEmployeeViewType 
vev(AllEmployees,
   Visible(PublicDepartment));
The vev view can now be made available to the user, who will have up-to-date access to just the employee records that the user is permitted to see. Note that as an alternative to using views, we could have created a copy of the original data and then filtered it with the STL's remove_copy_if algorithm. However, it would not have been easy to keep this filtered data synchronized with the original dataset.

The filter_view implements the analogy of the relational algebra "select" operation, as realized by the SQL WHERE clause:
SELECT * FROM AllEmployees
   WHERE department == PublicDepartment
If we want to find the maximum salaried employee from the PublicDepartment we can combine the filter view with the transform view. We create a filter view of just those employees we are interested in and use tranform_view to make max_element see just their salary:
// Create a function to retrieve the Salary data.
Salary& getSalary()(Employee const& e)
{ return e.salary; }

typedef transform_view<
           VisibleEmployeeViewType,
           Salary (*)(Employee const&)
        >
        VisibleSalaryViewType;

VisibleSalaryViewType
vsvt(oev, getSalary);

VisibleSalaryViewType::iterator iter;
iter =
   max_element(vsvt.begin(), vsvt.end() );
*iter == the maximum salary data.

Employee& DesiredEmployee = *(*(iter.get() ).get();
We can build increasingly complex views of our data by layering these adaptors together.

The SQL statement corresponding to the vsvt view would be:
SELECT salary FROM AllEmployees WHERE department  
   == PublicDepartment
If you would like to view this data in reverse order you could apply a reverse_view. The reverse_view does what you would expect by having its member function begin return rbegin from the underlying container. Thus, it requires at least a reversible container. For example:
typedef reverse_view<EmployeeRange>
    ReverseEmployeeView;

ReverseEmployeeView rrve(vev);

copy(rrve.begin(), rrve.end(),
   ostream_iterator<Employee>(cout,"\n"));
A Survey of Views

Here is a short survey of some of the views contained in the library. First is downcast_view. This view applies a dynamic_cast on a container of pointers. It then filters out the elements that return a null pointer from the cast and then converts the pointers to references. With this view you can take a container of heterogenous objects, all inheriting from some base class, and view them as a collection of derived objects.

This sounds way more complicated than it is. We are using two kinds of transform_view and one filter_view. The first transform does the dynamic_cast, then we use the filter_view to skip the elements that return a null pointer. Then we use a transform_view to promote the pointers to references. In fact we realized that for containers of pointers to heterogeneous objects, promotion of a pointer to a reference would be useful on its own. We therefore created the polymorphic_view and used it as part of the the downcast_view.

The friction between generic programming and object-oriented programming shows up in the problems you have to face when representing polymorphic collections with STL containers. The only possibility is to use a container of pointers to the objects. But then it is unsatisfactory to present a pointer-to-base-class interface when conceptually we have a collection of derived objects. The polymorphic view just puts a dereferentiating layer on top of the pointer-to-base-class interface and presents an interface that matches the concept.

We have a number of other useful views working on sorted containers, which we can discuss only briefly here:

a union_view, which concatenates two containers, head to tail

a set_intersection_view, which, given two sorted containers, will return the elements that are in both containers

a set_difference_view, which returns the elements that are in one set and not the other

a set_symmetrical_difference_view, which returns the elements that are in one or the other set but not in both

a set_union_view, which returns the elements in two containers less the elements that overlap

a unique_view, which, returns the elements which are not duplicates.

We also have intersection_view, difference_view, and symmetric_difference_view, which do not require the elements to be sorted. These views will use the find member function of either container if it exists. While this is not efficient if you had to do it anyway, your alternative is to copy the data into a sorted container, or do this comparison manually. The view at least provides an efficent implementation of an inefficent process.

How were we able to make all these views without writing code for the last few years? We applied the age-old technique of divide and conquer. For the sorted views we created a group of utility views that could be applied in layers. (See Figure 2.) The lowest layer is the equal_range_view applied to each of the two containers. equal_range_view returns a range_view for each group of elements in a container, marking the beginning and end of runs of equal elements. Next we use a pair_merge_view, which combines corresponding runs into pairs. If there is no corresponding run on one side, an empty run is substituted.

To create the set_*_views, we then applied a transform to select the wanted elements from the pair of runs. For instance, to make a set_intersection_view, we return only the elements from the runs which appear in both pair.first and pair.second. We then apply a concatenation_view, which returns the elements one by one from the transformed view.

We extracted a common base, which is really just a template of typedefs for the sorted views, and then wrote five transform functors. Since each layer could be tested independently, it was remarkably easy to do. Well, it did take us a while to decipher the error messages, but once we were done we could be confident that the other views would work as well. In fact, that was the case.

We didn't start out with a layered approach to the sorted container views. We first tried the "each container does all the work" approach and ended up with a very complex set of code that was difficult to debug. It may still not have all the bugs removed. We have abandoned this code. The curious can find it on our web site under the "abandoned" directory.

Another interesting view that demonstrates the power of views over copying the elements is the crossproduct_view. For two random-access containers, we apply an operation and return the result for each [i, j] pair of elements. The default operation is a pair of references to the elements, but any functor which takes two elements is allowed. This allows for delayed evaluation and sparse storage of a matrix made by (a op b). We did create a proxy template so that crossproduct_view[i][j] returns the correct (i op j) result. Thus, the crossproduct_view is the analog of the "cross join" operation in relational algebra, as realized by the SQL statement:
SELECT * FROM Table1, Table2
In general, this is interesting only in combination with a filter_view eliminating nonmatching rows. And, of course, the views performance will not match that of sophisticated database engines in this case.

Additionally, there are several convenience views which are not much more than "templated typedefs" of more basic views — for example, to present only the keys or only the values of an associative container. We have included them because they show how easy it is to build layers of views which are then useful on their own.

Last, there is a range_view class. It is not really a view, in that it does not reference a container but an STL range given by two iterators. All it does is define the needed typedefs and methods to present the standard container interface. Using the range_view, all the views can reference ranges as well as containers.

Implementation

The advanced template mechanisms of C++, such as template template parameters and default template parameters, facilitate the construction of highly configurable yet easy-to-use code. The view will adapt itself to take full advantage of its underlying container, should the type of the container change during development. In this section we will explain some of the aspects that make the views code that flexible.

Ownership

Views can own the underlying container or merely reference it. We did this by specifing an argument to the view's template. The tag view_ref references the container, and view_own has ownership properties. This allows you to nest views together. The innermost view can be specified by the user of the view, the next layers require ownership.

Mutability

Views can be constant or mutable. We provide two tag classes, const_view_tag and mutable_view_tag to allow this specification. A mutable view allows you to modify the data in the underlying container. Users of mutable views should be careful to maintain any data integrity necessary for the container. We default to const_view_tag because, in our practice, we have found that to be the most useful. A const view uses only the const_iterator from the underlying container.

Sensible defaults

Some views limit their iterator categories. A filter_view, for example, is at most bidirectional. Views combining two containers, such as union_view, are limited by the less powerful of the containers involved. The common denominator in both cases is the less powerful iterator category. To be easy to use while keeping flexibility, the views compute the appropriate iterator category as a default template parameter at compile time, thus adapting themselves to the given container(s). See the sidebar for details about the technique we used.

Comparisons

All of the views provide the two basic comparison functions, operator== and operator<. We then provide the generic relational comparison functions operator!=, operator>, operator<=, and operator>=, which can be expressed by combining the first two comparisons. We also provide the conversion operators so that assignment to a const_view from a mutable_view is possible, as is comparison. The swap specializations were also provided so that swapping of two containers would be the most efficient. These comparison operators are also provided for the view iterators.

Views with these interfaces appear to act just like any other STL containers and can be used with STL algorithms. Views are a thin layer on top of their containers, therefore the overhead of using them is minimal.

Conclusion

We use the ISO C++ template features excessively. In general this can lead to code bloat. For every set of template parameters, a corresponding variant of the code is instantiated. This can become a major headache, especially for many template paramters. In the views library, this is a non-issue because it is an extremely thin software layer. Very few methods are more than six lines long. Thus, nearly all the methods will be inlined by a good optimizing compiler.

In our implementation we have chosen to use the most advanced template features that make the highly flexible implementation possible. This code will not compile on many compilers. We anticipate that more complier vendors will be able to handle our code soon. The compiler we used was GCC 2.95, which is available at the GNU web page [4] for many systems.

The current code base is still in development and available free of charge at [2]. The design architecture appears to have become stable in late 1999. New views seem to occur occasionally as our work projects expand. Contributions and suggestions for the library are welcome.

The STL containers can be easily adapted to act as subcontainers of their data using these adaptors. The adaptors have been used to help simplify the coding of several projects by the authors. We consider them very useful tools. With the adaptors provided, programmers can build interesting views of their own data, and can easily extend and/or combine these adaptors to view any STL-compliant container in other more interesting ways. These views are not meant to replace the standard algorithms that generate copies of the containers. Rather, they serve as an alternative which may be appropriate for the problem at hand.

The code is available at the VTL homepage [2].

Acknowledgements

The authors would like to thank Jon Seymour for providing them with an initial set of code and the idea of creating views which would work with the STL containers. We would also like to thank Jon for putting us in touch with each other. This is truly an example of the power of the Internet. We would be glad to hear from any users of this code and suggestions for future improvements.

References

[1] Thomas Becker. "Smart iterators," C/C++ User's Journal, September 1998.

[2] This documentation is available with the CUJ online sources, available at ftp.mfi.com or www.cuj.com. Also see: Gary Powell and Martin Weiser. "View Template Library Container Adaptors," http://www.zib.de/weiser/vtl, 1999.

[3] Jon Seymour. "Views — a C++ Standard Template Library Extension," http://www.zeta.org.au/~jon/STL/views/doc/views.html, 1996.

[4] GCC Team. GCC Home Page, http://gcc.gnu.org/, 2000.

[5] Nathan Myers. "Traits: A New and Useful Template Technique," C++ Report, June 1995.

[6] Todd Veldhuizen. "Techniques for Scientific C++," http://extreme.indiana.edu/~tveldhui/papers/techniques/techniques.ps, 1999.

Gary Powell is a senior software engineer at Sierra On-Line. He has a BA in Phyics from Middlebury College VT. While at Sierra, he has worked on numerous games including: "Birthright," "Hellfire," and "Professional Bullrider." His research interests include game AI, and generic programming using C++ templates. He can be reached at gary.powell@sierra.com.

Martin Weiser is a member of the Numerical Analysis and Modelling Division at Konrad-Zuse-Zentrum für Informationstechnik Berlin, Germany. His research interests are numerical methods for solving nonlinear partial differential equations and optimal control problems. The implementation of such algorithms led him to C++ and object-oriented and generic numerics. He can be reached at weiser@zib.de.