Stepping Up to C++

Nested Classes

Dan Saks


Dan Saks is the founder and principal of Saks & Associates, which offers consulting and training in C++ and C. He is secretary of the ANSI and ISO C++ committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield OH, 45504-4906, by phone at (513)324-3601, or electronically at dsaks@wittenberg.edu.

Last month, I summarized most of the extensions to C++ introduced by the joint C++ standards committee WG21 + X3J16 (see "Recent Extensions to C++," CUJ, June, 1993). I neglected to mention one other extension — forward declaration of nested classes.

Nested classes were a relatively late addition to the C++ language. They were introduced in 1989, shortly before the committee embarked on drafting a standard. Nested classes added considerable complexity to the language's lookup rules, that is, the rules for matching a reference to an identifier in a translation unit with a declaration for that identifier. The rules stated in the ARM (Ellis and Stroustrup, 1990) seem to work fine for most common simple cases, but they are incomplete and inconsistent in handling complex cases. The standards committee's Core Language working group has spent much of its time these last three years ironing out name lookup problems.

This month's column explains nested classes and types, and explains the extension to allow forward declaration of nested classes. Next month, I'll describe the problems posed by nested classes and the new lookup rules designed to solve those problems.

Nested Classes

A nested class is a class declared inside another class, as in:

class outer
   {
public:
   class inner { /* ... */ };
   void foo(int);
   // ...
private:
   inner i;
   // ...
   };
Here, class inner is nested within class outer. Hence, the name inner is in the scope of class outer. Within class outer (and the bodies of its member functions), you can refer to inner simply as inner. For example,

void outer::foo(int i)
   {
   inner *p;
   // ok, inner in scope of outer
   // ...
   }
But outside class outer, you must refer to inner by its fully-qualified name, outer::inner, as in

int main()
   {
   outer::inner *q =
         new outer::inner;
   // ...
   }
Nested class names are subject to access control. That is, you cannot reference outer::inner outside a member or friend of class outer unless inner is a public (or protected) member of outer.
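As a minimal sketch of how access control applies to nested class names, consider the following variation on class outer. The nested class hidden and the member function total are hypothetical names I've added for illustration.

```cpp
class outer
   {
public:
   class inner                 // public nested class: usable as outer::inner
      {
   public:
      inner() : value(0) { }
      int value;
      };
   int total() const;          // a member may name any nested class
private:
   class hidden                // hypothetical private nested class: only
      {                        // members and friends of outer may name it
   public:
      hidden() : weight(2) { }
      int weight;
      };
   inner i;
   hidden h;
   };

int outer::total() const
   {
   hidden extra;               // ok: we are inside a member of outer
   return i.value + h.weight + extra.weight;
   }

// outer::hidden *p;           // error if uncommented: outer::hidden is private
```

A non-member function can still declare an outer::inner object, because inner is a public member of outer.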

Nested classes reduce the need for global names in C++ programs, making it easier to avoid name conflicts between different parts of a large program. The following example shows how.

Applying Nested Classes

Threaded data structures, like lists and trees, typically are implemented using at least two distinct types of objects: nodes, which hold the values and the links to other nodes, and a root, which provides access to the nodes.

For instance, Figure 1 illustrates the structure of a singly-linked list of integers. Each node object contains a single integer value and a pointer to the next node in the list. The root contains a pointer to the first node in the list, and another pointer to the last node in the list.

When you build an application using a variety of threaded structures, you will probably define several different kinds of node objects — one for each different threaded structure. If you don't use nested classes, you must give each node type a different name, like listnode or treenode, so they don't conflict.

For example, Listing 1 and Listing 2 contain header files that define different threaded structures. Listing 1 defines the node class (called listnode) and the root class (called list) for a singly-linked list of (unsigned) integer values. Listing 2 defines the node and root classes (treenode and tree, respectively) for a binary tree of dynamically-allocated character arrays. (These classes resemble the list and tree classes I used in a cross-reference generator many moons ago. See "Your First Class," CUJ, May, 1991, or "Rewriting Modules as Classes," CUJ, July, 1991.)

This ad hoc approach to avoiding name conflicts isn't so bad in small projects with only one programmer (or maybe even a few programmers), but such an approach quickly becomes unmanageable in a large project with many programmers. Avoiding global name conflicts gets even harder when you start using libraries from various sources in your applications.

For example, lists can have many different implementations. They can be singly- or doubly-linked, and linear or circular. If your application uses, or your library supports, both singly- and doubly-linked lists, then you must use longer names to distinguish the different root and node classes, such as singly_linked_list_node, SinglyLinkedListNode, or SLlistnode. Juggling long names like these takes some of the fun out of programming.

Long names for the node types don't have to be pretty. Users of the list class are not supposed to use the node type name in their own code; they are only supposed to use the name of the root class. The node class is simply part of the (supposedly hidden) implementation of the list. Thus, you could argue that the more ugly and obscure a node type name is, the less chance there is that anyone else will use it.

Making a name ugly and obscure doesn't really hide it. Users must include the header list.h to gain access to the list class. But the header also contains the declaration for the list node class, so the user can access it too. When you implement a list class, you should really put the node class out of the user's reach. Nesting the node class inside the list class does just that.

Listing 3 shows the list class definition with the list node class as a nested class. Since the list node class is no longer global, I shortened its name to just node. As a member of class list, its fully-qualified name, list::node, is sufficiently distinct. Keeping the name as list::listnode is overkill.

Listing 4 shows an implementation for the list class. It implements four simple operations:

Notice that the implementation uses the name node without qualification. It never has to refer to the node class as list::node, because all the function bodies are in the scope of the list class, and node is a member of the class.
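The following is a minimal sketch of a list class with a nested node class, in the spirit of Listings 3 and 4 but not a reproduction of them. The operations add, length, and sum are my own choices for illustration, and I've assumed the node class is a private member of list.

```cpp
#include <cstddef>

class list
   {
public:
   list();
   ~list();
   void add(unsigned n);          // hypothetical: append n to the list
   std::size_t length() const;    // hypothetical: number of nodes
   unsigned sum() const;          // hypothetical: sum of all values
private:
   struct node                    // assumed private: out of the user's reach
      {
      node(unsigned n, node *p) : number(n), next(p) { }
      unsigned number;
      node *next;
      };
   node *first, *last;
   list(const list &);            // copying is not supported in this sketch
   list &operator=(const list &);
   };

list::list() : first(0), last(0) { }

list::~list()
   {
   while (first != 0)
      {
      node *p = first;            // plain node, not list::node
      first = first->next;
      delete p;
      }
   }

void list::add(unsigned n)
   {
   node *p = new node(n, 0);
   if (first == 0)
      first = p;
   else
      last->next = p;
   last = p;
   }

std::size_t list::length() const
   {
   std::size_t n = 0;
   for (const node *p = first; p != 0; p = p->next)
      ++n;
   return n;
   }

unsigned list::sum() const
   {
   unsigned total = 0;
   for (const node *p = first; p != 0; p = p->next)
      total += p->number;
   return total;
   }
```

Every function body here is in the scope of class list, so each one refers to the node type simply as node.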

Making Do without Nested Classes

C does not support nested structs. If you write a struct within a struct, C treats the inner struct as if it has been written outside the outer struct. That is, C treats

struct outer
   {
   //...
   struct inner { /* ... */ };
   };
more or less like

struct outer
   {
   // ...
   };
struct inner { /* ... */ };
Before it supported nested classes, C++ treated nested classes and structs the same as does C. Thus, even if you had defined class node inside class list as in Listing 3, an early C++ translator would treat node as a non-nested (global) name.

Listing 5 illustrates a technique that C++ programmers used to prevent users of a list class from accessing the list node class, even when that list node class is global. Listing 5 is essentially the same as Listing 1, except that it defines listnode as a class, rather than as a struct. When listnode is a struct, its members are public. When listnode is a class, its members are private.

Remember, in C++, a struct is a class. A struct, like a class, can contain function as well as data members. A struct can also contain the access specifiers public, protected, and private. The only real difference between a struct and a class is that struct members are public by default, and class members are private by default.

In Listing 5, not only are listnode's data members private, but so is its constructor. When the constructor is private, users cannot access it. Whenever a user tries to create a listnode object (such as by a declaration or a new-expression), the translator complains that the user cannot access a private constructor.

How, then, does the list class use the listnode class? The listnode class declares list as a friend. This means that every member of class list is a friend of class listnode. Thus, members of class list can invoke the listnode constructor when creating listnode objects.
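Here is a rough sketch of the friend class technique, assuming a much simpler list than the one in Listing 5. The member functions push, front, and empty are hypothetical; only the private constructor and the friend declaration come from the description above.

```cpp
class list;                       // forward declaration for the friend

class listnode
   {
   friend class list;             // every member of list is a friend
   // Everything below is private, including the constructor.
   listnode(unsigned n, listnode *p) : number(n), next(p) { }
   unsigned number;
   listnode *next;
   };

class list
   {
public:
   list() : first(0) { }
   ~list();
   void push(unsigned n);         // hypothetical: insert n at the front
   unsigned front() const         // hypothetical: list must not be empty
      {
      return first->number;
      }
   bool empty() const { return first == 0; }
private:
   listnode *first;
   };

list::~list()
   {
   while (first != 0)
      {
      listnode *p = first;
      first = first->next;
      delete p;
      }
   }

void list::push(unsigned n)
   {
   first = new listnode(n, first);   // ok: list is a friend of listnode
   }

// listnode n(1, 0);              // error if uncommented: private constructor
```

Users can still see the name listnode, but they cannot create a listnode or touch its members.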

Although this friend class technique prevents users from accessing the listnode objects directly, it doesn't prevent the name listnode from conflicting with other global declarations for the name listnode. Thus, I recommend using nested classes (as in Listing 3).

Nested classes are now widely, but still not universally, available. Thus many of you probably won't need to use this friend class technique, but others may find it in older code that you have to use or maintain. You may also encounter it in older C++ books and articles.

Declaring Out-of-Line Members

The nested node class in Listing 3 defines only a single member function — a constructor. That constructor has only a short initializer list and an empty body, so it's convenient to just define the function inline inside the class. In general, however, I discourage defining member functions inside class definitions. In all but the most trivial classes, it leads to cluttered, hard-to-read class definitions.

You can define member functions for nested classes outside the class definition. You write the definitions at file scope and refer to each function using its fully-qualified name. For example, Listing 6 shows list.h rewritten with the list::node constructor defined outside the class. The function definition identifies the constructor as list::node::node. If list::node had a destructor, it would be defined as

list::node::~node()
   {
   // ...
   }
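The sketch below shows one way the out-of-line definitions might look for a stripped-down list class. It is not Listing 6; the members add, last_value, and tail are hypothetical. It also illustrates a detail worth noting: a return type appears before the qualified function name, so a nested type used there must be written with its full name.

```cpp
class list
   {
public:
   list();
   ~list();
   void add(unsigned n);          // hypothetical: insert n at the front
   unsigned last_value() const;   // hypothetical: list must not be empty
private:
   class node
      {
   public:
      node(unsigned n, node *p);  // declared here, defined below
      node *tail();               // hypothetical: last node reachable from here
      unsigned number;
      node *next;
      };
   node *first;                   // copying a list is omitted for brevity
   };

// Constructor of the nested class, defined at file scope.
list::node::node(unsigned n, node *p) : number(n), next(p) { }

// The return type precedes list::node::tail, so it is not yet in
// the scope of list and must be spelled list::node.
list::node *list::node::tail()
   {
   node *p = this;
   while (p->next != 0)
      p = p->next;
   return p;
   }

list::list() : first(0) { }

list::~list()
   {
   while (first != 0)
      {
      node *p = first;
      first = first->next;
      delete p;
      }
   }

void list::add(unsigned n)
   {
   first = new node(n, first);
   }

unsigned list::last_value() const
   {
   return first->tail()->number;
   }
```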

Nested Types and Constants

Not only can you nest classes inside classes, you can also nest typedefs and enumerations. Like nested classes, nested types and enumerations are subject to access control.

For example,

class shape
   {
public:
   enum palette { RED, GREEN, BLUE };
   shape(palette c) : _color(c) { }
   palette color() { return _color; }
   // ...
private:
   palette _color;
   // ...
   };
defines type palette as a nested type (a type member) of class shape. It also defines the constants RED, GREEN, and BLUE as member constants. Since the enum declaration is public, you can access the type name or any of the constants outside class shape using their fully-qualified names. For example,

shape s(shape::GREEN);
shape::palette c = shape::RED;
or even

if (s.color() != shape::BLUE)
There's a subtle difference between a member constant and a const data member. An enumeration constant like shape::BLUE is a member constant, but MAX declared in

class X
   {
public:
   const size_t MAX;
   // ...
   };
is a const data member. You can use shape::BLUE in a constant expression (like the dimension of an array, or a case label), but you can't use MAX that way. Furthermore, you can't specify a value for MAX by writing

class X
   {
public:
   const size_t MAX = 100; // error
   X();
   // ...
   };
You must initialize MAX using an initializer in a constructor, as in

X::X() : MAX(100) { }
If you need to use MAX in constant expressions, simply define MAX as an enumeration constant, as in

class X
   {
public:
   enum { MAX = 100 }; // clever
   // ...
   };
so that X::MAX is a member constant rather than a const data member.
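To make the distinction concrete, here is a small sketch that uses both kinds of member in one class. The class buffer and its members are hypothetical names chosen for illustration.

```cpp
#include <cstddef>

class buffer
   {
public:
   enum { MAX = 100 };            // member constant
   buffer(std::size_t lim)
      : limit(lim < std::size_t(MAX) ? lim : std::size_t(MAX)), used(0)
      {
      }
   bool put(char c);              // returns false when the buffer is full
   std::size_t size() const { return used; }
   const std::size_t limit;       // const data member, set per object
private:
   char data[MAX];                // ok: MAX is a constant expression
   // char more[limit];           // error if uncommented: limit is not
   std::size_t used;
   };

bool buffer::put(char c)
   {
   if (used >= limit)
      return false;
   data[used++] = c;
   return true;
   }
```

Each buffer object carries its own limit, fixed by the constructor, while buffer::MAX is a single compile-time value that needs no object at all.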

Static Members

C++ defines rules by which a nested class may refer to identifiers declared in enclosing classes. These rules involve static data and function members, which I'll explain by the following example.

Suppose you want to track the number of list objects in existence at any given time in the execution of your program. Simply define a counter called list_count, initialized to zero, that counts the number of objects. Then, add a statement to the list constructor to increment the counter, and add another statement to the destructor to decrement the counter, as shown in Listing 7.

Notice that list_count is a global variable that exists outside class list. As always, you should avoid using global variables, but you can't make the counter an ordinary data member of the list class. If you did, each list object would get its own separate counter. That just won't work. The variable must be statically allocated and separate from every list object so that there's one, and only one, counter for all list objects. But it should be in the scope of the class so it doesn't conflict with other global names.

Static data members solve this dilemma. A static data member is not part of each class object; it's a separate data member shared by all objects of its class. A static data member is in the scope of its class and is subject to access control.

For class list, you write

class list
   {
public:
   list(unsigned n);
   ~list();
   // ...
   static unsigned count;
private:
   // ...
   };
which declares count as a static member of list. The list constructor and destructor can refer to that static member as just count.

count is public, so non-member functions can access it using its fully-qualified name, list::count, as in

printf("# of list objects = %u\n", list::count);
Unfortunately, since it is public, non-member functions can also modify list::count and invalidate the count. Thus, you should declare count private, and write a public member function, howmany, that returns the current count:

unsigned list::howmany()
   {
   return count;
   }
Hence, a non-member function can only inspect the count by calling L.howmany, where L is some list object. This protects list::count from unauthorized access, but forces users to keep an extra list object lying around just so they can call howmany.

The problem is that howmany is an ordinary member function. An ordinary member function always has a hidden extra argument — the object addressed by its this pointer. But howmany doesn't need a this pointer to locate list::count because list::count is not in a list object.

Instead, declare howmany as a static member function, as shown in Listing 8. A static member function does not have a this pointer, so it cannot access ordinary data members, but it can access static data members. Thus, you don't need a list object to call howmany. You simply call it by its full name, as in

printf("# of list objects = %u\n", list::howmany());
If you wish, you can still call L.howmany (where L is a list object). The translator only uses L to determine howmany's class type; it does not bind a this pointer to L's address.

The declaration of a static data member inside a class is only a declaration. The definition (and initialization) of the static member appears elsewhere, typically in a source file along with other members of the class. For list::count, the definition looks like

unsigned list::count = 0;
as shown in Listing 9. For completeness, Listing 9 also contains the member function definitions for the list class definition in Listing 8.
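Pulling the pieces together, the following is a minimal sketch of the counting technique. It is not Listings 8 and 9: I've left out the nodes entirely, replaced list(unsigned) with a default constructor, and added a copy constructor (my assumption) so that copies get counted too.

```cpp
class list
   {
public:
   list();
   list(const list &);            // assumed: copies must be counted as well
   ~list();
   static unsigned howmany();     // static: has no this pointer
private:
   static unsigned count;         // a declaration only, not a definition
   };

unsigned list::count = 0;         // the one definition, at file scope

list::list()
   {
   ++count;
   }

list::list(const list &)
   {
   ++count;
   }

list::~list()
   {
   --count;
   }

unsigned list::howmany()
   {
   return count;                  // ok: count is a static data member
   }
```

With these definitions, list::howmany() can be called before any list object exists, and L.howmany() yields the same value for any list object L.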

Accessing Members

A nested class is in the scope of its enclosing class. This means that the name of the nested class is local to the enclosing class, and that the nested class may refer to identifiers declared in the enclosing class.

If an unadorned identifier referenced in a nested class member function refers to a name in an enclosing class, that name must be a type name (a class, typedef, or enumeration type), a member constant (an enumerator), or a static data member. By an unadorned identifier, I mean an identifier not preceded by x. (where x is an object or reference), or by p-> (where p is a pointer). An unadorned name in a nested class member function cannot refer to an ordinary data or function member.

For example, function X::Y::f in Listing 10 cannot access X::i because it doesn't have any objects of type X handy. The translator tries to interpret an unadorned reference to a non-static class member as a reference via a this pointer. But f's this pointer points to an object of type X::Y, not X.

A nested class is not necessarily part of its enclosing class. For example,

X::Y xy;
declares xy as an X::Y object, separate from any X object. The call xy.f only provides f with access to an X::Y object. It has no X object in which to find non-static member i.

On the other hand, X::s is static. It exists apart from all X objects, so X::Y::f needs no objects of type X to access X::s.
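A sketch along the lines of this discussion might look like the following. It is not Listing 10 itself; the member function g is a hypothetical addition, and I've made i and s public so that the example turns only on static versus non-static members.

```cpp
class X
   {
public:
   class Y
      {
   public:
      int f();
      int g(const X &x);          // hypothetical: is handed an X object
      };
   int i;                         // ordinary (non-static) data member
   static int s;                  // static data member
   };

int X::s = 10;

int X::Y::f()
   {
   // return i;                   // error if uncommented: f's this pointer
                                  // points to an X::Y, not an X
   return s;                      // ok: X::s exists apart from any X object
   }

int X::Y::g(const X &x)
   {
   return x.i + s;                // ok: i is reached through an explicit X
   }
```

Here f uses the unadorned name s, while g must write x.i to reach the non-static member through an object.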

A nested class has no special access rights to members of its enclosing class, and an enclosing class has no special access rights to members of a nested class. In other words, unless granted friendship by its enclosing class, a nested class cannot access private or protected members in the enclosing class. Similarly, unless granted friendship by its nested class, an enclosing class cannot access private or protected members in the nested class.

Forward Declarations

With all this as background, I now describe the extension to allow forward declaration of nested classes. In and of itself, the extension is quite simple.

As described in the ARM, a nested class must be completely defined inside its enclosing class. But a single enclosing class with many nested classes is usually difficult to read and maintain. This extension lets you simply declare the nested classes inside the enclosing class, and then define them later.

For example, when this feature becomes available, you should be able to unravel

class X
   {
   class Y
     {
     // definition of class X::Y ...
     };
   class Z
     {
     // definition of class X::Z ...
     };
   // ...
   };
into the somewhat cleaner form

class X
   {
   class Y; // declaration of X::Y
   class Z; // declaration of X::Z
   // ...
   };

class X::Y
   {
   // definition of class X::Y...
   };

class X::Z
   {
   // definition of class X::Z ...
   };
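For a translator that supports the extension, a complete example might look like this sketch. The members value and total are hypothetical, and I've made the nested classes public for simplicity. Until X::Y and X::Z are defined, class X can hold only pointers (or references) to them, because they are incomplete types.

```cpp
class X
   {
public:
   class Y;                       // declaration of X::Y
   class Z;                       // declaration of X::Z
   X();
   ~X();
   int total() const;             // hypothetical
private:
   Y *y;                          // pointers to incomplete types are fine
   Z *z;                          // copying an X is omitted for brevity
   };

class X::Y                        // definition of class X::Y
   {
public:
   Y() : value(1) { }
   int value;
   };

class X::Z                        // definition of class X::Z
   {
public:
   Z() : value(2) { }
   int value;
   };

// These members need the complete types, so they follow the definitions.
X::X() : y(new Y), z(new Z) { }

X::~X()
   {
   delete z;
   delete y;
   }

int X::total() const
   {
   return y->value + z->value;
   }
```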

References

Ellis, Margaret A. and Stroustrup, Bjarne. 1990. The Annotated C++ Reference Manual. Reading, MA: Addison-Wesley.