September 1999/STL-Style Iterators for MFC

Algorithms

STL-Style Iterators for MFC

Kevin Kostrzewa

STL prides itself on being extensible. You can even extend it to subsume MFC with all its varied containers.

Introduction

The MFC (Microsoft Foundation Classes) library is the most commonly used C++ application framework and class library for writing Windows GUI programs. This library contains both an object-oriented hierarchy of windowing elements and a number of helper classes. A part of these helper classes consists of a set of type-safe (template-based) and type-unsafe (class-based) collection classes. These collection classes fit the requirements and behavior of growable arrays, associative arrays, and linked lists, otherwise known as arrays, maps, and lists.

The Standard C++ library, by adopting Alexander Stepenov and Meng Lee's STL (Standard Template Library), offers its own set of collection classes [1]. These collection classes support a concept called iteration, which enables the collections to be used with a rich set of already written and tested freestanding library functions.

Unfortunately, MFC does not support the STL concept of iteration. In this article, I describe the use and implementation of a component I've created that fills in this gap by allowing MFC collection classes to be iterated. Providing iterators to MFC offers three advantages. First, code written with the MFC iterators walks through contained elements in a standard manner. Second, using iterators with MFC serves as an introduction to the powerful STL portion of the Standard C++ library. Finally, and perhaps most importantly, the rich set of STL algorithms can be seamlessly incorporated into code based on MFC collections.

All of the code for this article was compiled and tested using version 6.0 of Microsoft's Visual C++. Due to the complexities of supporting templates in a C++ compiler, I highly doubt that any of this code is executable on versions of Visual C++ prior to v6.0.

About Iterators

Iterators are objects that "provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation" [2]. According to the iterator concept, when you use an iterator (any iterator, not just ones provided in STL), you are blissfully unaware of how the data is internally stored by the aggregate object. All you need to know is which iterator operations allow you to move from element to element, and which iterator operations allow you to access a particular element.

In STL, iteration is viewed as a service provided by a separate member type that a collection doles out when asked. That iterator type can be manipulated without knowing the kind of collection being iterated. The iterator can be as simple as a pointer or as complex as a C++ class, as long as it follows some predefined rules of behavior. This behavior varies according to the classification of iterator. For example, an iterator can be classified as forward, meaning that it can move only one element at a time forward through a sequence. Another iterator could be classified as random access, meaning that it can move many elements at a time forward or backward through a sequence. For a full description of the classes of iterators and their required behavior, please see [1].

To obtain an iterator to an STL collection, you call a member function on the collection. The STL collections provide a set of member functions that return iterators. These member functions share common names from one collection to the next (see Figure 1). In MFC, on the other hand, the names of the member functions to iterate the collection change with the collection type (see Figure 2). One of the advantages of STL over MFC is that it has iterators that share a common interface and are defined separately from the collection classes over which they operate. You can take advantage of STL's common naming convention and behavior by writing template functions that accept iterators to the beginning and end of a sequence. If the functions are written in such a way as to not care about what collection is being iterated, they will automatically work for all existing STL collections, as well as any future collections that support STL iterators. This is how the algorithms section of STL is implemented. STL implements a set of common algorithms (searching, sorting, etc.) once in terms of STL iterators over a sequence. Any new collection written to support STL iterators automatically acquires this rich set of algorithms as supported functionality.

Goals for the MFC Iterators

Other than the obvious goal of making this component work, I want usage of the MFC iterators to mimic the usage of the STL iterators as much as possible. The other common goals (such as good performance) follow naturally, since the iterators can be implemented inline with very little code. In fact, a cursory analysis shows that about 10% of the code does the "real" work of iteration, and 90% of the code is dedicated to making the component look and feel more like STL.

An STL iterator is typically a nested type within a template class. An example of an iterator typename is:
std::list<T>::iterator  
Here, the iterator type is a nested type of the template class list<T>. The left-hand side of the iterator's typename is the name of the collection class template (std::list in this example — std is the name of the namespace in which list is defined), followed by a template parameter enclosed in angle brackets (<T>). The right-hand side of the declaration is either iterator, const_iterator, reverse_iterator, or const_reverse_iterator. In between the left-hand side and the right-hand side of the typename is the scope resolution operator (::). Thus, std::list<int>::reverse_iterator is the typename of an STL iterator defined to operate over a list of ints.

The nested type iterator is used when an increment operation (any variant of operator++) implies forward movement through the sequence. Also, iterator provides access to elements of the sequence via non-const reference, which means that any element accessed via the iterator can be modified. The const_iterator member type works just like iterator, except that it provides access via a const reference. You can't modify an element directly through the use of a const_iterator.

To fulfill the goal of usability similar to the STL, I have to support all these types of iterators on the appropriate collections. If you want to have a CArray and be able to protect it from modification, I have to support const_iterator and const_reverse_iterator. If you also want to have a modifiable CList, I have to support iterator, const_iterator, reverse_iterator, and const_reverse_iterator. Complicating matters, MFC has 23 versions of their template and non-template collections, compared to three in STL. Complicating matters even further, I can't add any new member functions to the existing MFC collections. I discuss how I deal with these problems in the Implementation section of this article.

Usage

All the MFC iterator template classes and supporting freestanding functions are hidden within the namespace mfciter to avoid global namespace pollution. Every time you call an MFC iterator function or instantiate an MFC iterator class, you must either use the mfciter:: prefix or add a using directive to your code. For the majority of collections, access to the beginning of the collection is obtained via a call to the freestanding mfciter::begin function, with the collection object passed as the lone parameter. In corresponding fashion you can get an iterator to the end of a collection via a call to mfciter::end (also passing the collection as a parameter). This interface enables access to the beginning and end of a sequence in the style of STL.

The iterator type returned is a member typedef of a struct (possibly a template struct, depending on the collection). The name of the struct identifies what kind of collection is being iterated. It also serves as a way to simulate the iterator declaration style of STL without adding member types to the MFC collections classes (more on this in the Implementation section). The typedef member of the struct that is returned is either iterator or const_iterator, whichever is more appropriate (according to the desired const-ness of the collection object).

There are some simple rules for figuring out the name of the struct to use for the collection being iterated. For example, if you're iterating a CStringList, the struct returned is of type mfciter::slst (slst="String LiST"). If you're iterating a CArray<int, int>, the struct returned is of type mfciter::carr<int> (carr="CARRay"). If you're iterating a CTypedPtrMap<CMapPtrToPtr>, void *, foo *>, the struct returned is of type mfciter::tppmap<void *, foo *> (tppmap="Typed Ptr to Ptr MAP"). The full declarations of the above examples are shown in Figure 3. A table of the various collections and the structs that they use are shown in Table 1.

For reasons specified in the Implementation section, the templates CTypedPtrArray, CTypedPtrList, and CTypedPtrMap require a different freestanding accessor function call. It can't be just begin or end. The accessor function for a given CTypedPtrX instance requires a prefix, which is formed by the first few letters of the name of the returned struct container. For example, for a CTypedPtrList<CPtrList, int *> the returned struct container would be (from Table 1) a tplst<int *>. Therefore, the accessor function would be mfciter::tp_begin. Table 2 shows the accessor functions for the various CTypedPtrX template collections.

After you have obtained the appropriate iterator type, you can manipulate it using some very familiar operators. Iterators are a generalization of pointers, so you can use the same operations on them that you would use on a pointer, with restrictions depending upon the iterator's classification. The map iterators are classified as forward iterators, which means that you can advance through their sequence using the ++ operator, copy their value (but not the element they point to) using copy construction or the = operator, compare them (but not the elements they point to) using == or !=, and access the value that they represent using * (the indirection operator, not the multiplication operator) or ->. When accessing the element represented by a map iterator, you must specify whether you want the key field or the value field of the element. The key field is accessed by specifying the first field of the element accessed via the iterator * or -> operator. The value field is accessed by specifying the second field of the element. Note also that although the map iterator provides non-const access to the map's data, you can modify only the value field (second) of any element, not the key field. For examples of valid operations using the map forward iterator, see Figure 4.

The list iterators are classified as bidirectional iterators, which means that in addition to the previously mentioned forward iterator operations, you can use some new operators to move backwards as well. You can move backwards only one position at a time with a bidirectional iterator, using the -- operator. Since a list element has no key-value fields, you can access an element simply through either the * or -> operator, just as you would with a pointer. For examples of valid operations using the list bidirectional iterator, see Figure 5.

The array iterator is the most powerful of the three iterators. It is classified as a random-access iterator, which means that it acts just like a memory pointer. With a random access iterator, you can jump through an array as many positions at a time as you wish, using operator+, operator-, operator+=, operator-=, operator--, or operator++. In addition to this greater freedom of movement, more functions are available for comparing positions of iterators. Iterators can be compared by their relative position on a sequence using operator<, operator>, operator<=, and operator>=, as well as the basic == and != operators. You can even calculate the integer distance between two random-access iterators by subtracting them. For an example of operations using the array random-access iterator, see Figure 6.

In addition to the forward and bidirectional iterators, there are iterators that explicitly support reverse iteration through a sequence. These are named either reverse_iterator or const_reverse_iterator, and are accessed by calling the freestanding functions mfciter::rbegin and mfciter::rend in the same manner as mfciter::begin and mfciter::end are called to obtain forward iterators. There are no reverse iterators defined for a map, but there are for arrays and lists. The reverse iterators fit the same classification scheme as the regular iterators through a sequence, so a list's reverse iterator is bidirectional, and an array's reverse iterator is random access.

Once you are familiar with declaring your own iterators for walking through a sequence, the true power of the component is available to you. The algorithms section of STL, as part of the Standard C++ library, is now applicable to any MFC collection. Sorting, searching, and copying elements is as simple as calling a freestanding function from the STL <algorithm> header file. Before, if you wanted to sort an MFC array, even a simple array of atomic elements, you'd have to use a sorting function that supported hand-coded callback functions to pluck out individual elements from a sequence and compare them. Now, all that is necessary is to ensure that operator< is a callable operator on the contained elements, and then call std::sort, passing the beginning and end of the sequence to be sorted. Figure 7 shows a comparison of sorting methods on an MFC array of unsigned integers, using the old method and the new iterator-based method.

Another advantage of using iterators when calling these algorithms is the ability to perform the algorithm on a subsequence. Suppose you want to use the function std::find to find a match to an element in your container, but you want to consider just the first five elements. Instead of passing mfciter::begin() and mfciter::end() to std::find, which you would do to search the entire sequence, you pass mfciter::begin() and mfciter::begin()+5.

I try to prevent invalid operations on iterators by making those operations result in either compile-time errors or run-time errors. I'd rather generate a compile-time error than a run-time error, but that is not always possible. Some invalid usages will result in assertion failures inside the template code. The compile-time errors that result from invalid operations may produce some very long and cryptic error messages, which can be troublesome if you are not used to using template classes. My only advice for dealing with iterator compilation errors is to read through the entire text of the error message and try to block out unnecessarily long template type names. I've tried to add enough comments around assertion failures to make the cause of the error easy to find and fix.

Implementation

There are two major forces to contend with in implementing the MFC collection iterators. First, as I mentioned in the Goals section, I want the declarations of these iterators to mimic the declaration style of the STL iterators as much as possible. Second, each iterator needs to know the collection class it is operating on and the value type that the collection class contains. An STL iterator knows the collection class it operates on because it is a member type of that class. It knows the value type because it can "see" the class's template parameters.

Communicating the type of the MFC collection class to the iterator proved to be the biggest hurdle. Following are a few approaches I rejected:

Make the iterator class a member of the MFC collection class, in the manner of STL. I rejected this because it would have meant modifying the MFC code.

Derive new classes from the MFC collection classes and add iterator members to the derived classes. This would have meant using inheritance to model a questionable "Is-A" relationship between the derived and base classes.

Create three iterator templates, one for arrays, one for lists, one for maps, parameterized only by MFC collection type. Unfortunately, the iterator would have no way to infer the value type.

Create an iterator template that takes both the MFC collection class and value type as template parameters. This might have worked, but would have required an excessive amount of typing to declare some kinds of iterators (e.g., for a CTypedPtrMap of CString objects to CWindow pointers).

The approach I settled on is to create three template structs, one for each of the MFC collection class types. These template structs are not designed to be instantiated. Rather, they serve as a sort of a parameterized namespace that defines all of the generic iterators used for MFC arrays (struct base_array), lists (struct base_list) and maps (struct base_map). This approach takes advantage of the common member function names found in MFC collection types. The template parameters passed to these structs are both the collection type and the value type, fulfilling one of my initial requirements. As I said, these template structs are not designed to be instantiated. What's more, their design assumes they will never be named in user code.

To declare an iterator, users do not name the base template struct directly as the left-hand side of the scope resolution operator. Instead, new non-instantiatable structs are derived from the base template struct for the 23 implementations of the MFC collection classes. The name of each of the derived structs is the mnemonic created from the collection class found in Table 1. Each derivation specializes the template parameters passed to the base template struct. If a derivation represents a non-template collection class (like CStringList), the derivation is a non-template struct. Otherwise, the derivation must be a template struct, to allow the user to plug in the proper type parameters. Along with the derivation of the new structs, the pairs of freestanding begin/end functions (and rbegin/rend, where applicable) are defined. I feel this is the best way of balancing the forces mentioned at the beginning of this section. I fulfill the requirement of passing down collection type and value type while producing a terse iterator declaration style like that of STL. (Collection type is implied based on the mnemonic name of struct used, and value type is either implied or a template parameter.)

The helper templates CTypedPtrArray, CTypedPtrList, and CTypedPtrMap present a special problem. Each template class accepts a single base collection class from a finite set of classes as a template parameter. So, if you have a CTypedPtrArray of int *, you would declare an instance of that class as:
CTypedPtrArray<CPtrArray, int*>
If I create a single template struct derived from base_array that contains iterators on all CTypedPtrArray objects, I will need to pass template parameters that specify the base collection class type as well as the element type, kind of like this:
mfciter::tpp<CPtrArray, int*>::
    iterator
Because up to now I've been able to exclude the collection class in the template parameters of the iterator declaration, this style of declaration is ugly and unacceptable.

To get around this problem, I provide n versions of the template struct for each CTypedPtrX template, where n is the number of acceptable base collection classes that can be the first template parameter of the CTypedPtr. The name of each of the n versions of the struct starts with the letter t, serving as the reminder that it is a typed pointer. For example, if you have a CTypedPtrArray on a CObArray that contains CView pointers, the name of the struct is toarr<CView*>. For a complete listing of all structs for CTypedPtrX, see Table 1.

Complicating matters further, my scheme of overloading the freestanding accessor functions mfciter::begin, mfciter::end, mfciter::rbegin, and mfciter::rend for CTypedPtrX templates causes internal compiler errors whenever those functions are called. I believe it should be possible to create overloaded versions of those functions in C++, but that's not a valid argument when you've got a compiler crashing on you. My solution is to create special versions of the freestanding accessor functions that have unique names (which can be seen in Table 2). Avoiding multiple function overloads with partial template arguments seems to make the compiler happy. I still feel that this represents a bug in Microsoft's compiler, and will explore using common accessor function names when future compilier versions are released.

I've tried to keep the implementations between the three styles of iterators as common as possible, which does lead to some implementation concessions. For example, Visual C++ 6.0 does not yet support static const typed member constants. Member constants must be simulated with enumerations. I would like to use a special enumeration value to represent "end of iteration" with all of the iterators (called end_of_sequence). The obvious choice for array (-1) works fine as an enumeration value, but the obvious choice for lists and maps (NULL) will not work without typecasting, because enumeration values are represented as numeric, not pointer, constants. Eventually, when Visual C++ supports static const member constants, I'll be able to make "end_of_sequence" be a const member of the appropriate type and avoid all this typecasting.

By definition, the iterator and reverse_iterator member types provide non-const access to a sequence. They must allow modification of the values in the sequence that they represent. With the array iterators, implementing this is trivial because MFC arrays return members by reference. Lists and maps pose a bit more of a problem because they require separate functions to be called to access members vs. modify members. To allow modification, I've created a helper template for both lists and maps called _modifiable_value. The _modifiable_value template looks like a value unless you try to assign to it. It forwards that assignment down to the appropriate assignment function of the associated collection class.

Other than these workarounds, the code for these classes is very straightforward. Map iterators go forward through their maps using the member function GetNextAssoc. List iterators move through their list using the member functions GetNext and GetPrev. As I said before, about 10% of this code actually works on getting iteration done. That 10% is not very complex, especially if you are even slightly familiar with MFC collection classes.

Conclusion

This set of iterators on MFC collection classes should provide both a rich set of new functionality to MFC collections and a good bridge to the advanced collection concepts of STL. Many common operations that were difficult to implement directly on MFC collections, such as sorting, should now have trivial implementations and be much less error prone.

Acknowledgements

Thanks to John Harding and Tonya Kostrzewa for their assistance and suggestions.

References

[1] Alexander Stepanov, et al. The Standard Template Library. Made available by Hewlett Packard. http://www.hpl.hp.com/techreports/95/HPL-95-11.html

[2] Erich Gamma, et al. Design Patterns (Addison-Wesley, 1995).

Kevin Kostrzewa works as a computer programmer in Ann Arbor, Michigan. He has a Bachelor's degree in Computer Science from Michigan State University. His programming interests include design patterns and application frameworks written in C++. He can be reached at tkkost@newsguy.com.