August 1995/Stepping Up To C++

Stepping Up To C++

Other Assorted Changes, Part 2

Dan Saks

Dan Saks is the president of Saks & Associates, which offers consulting and training in C++ and C. He is secretary of the ANSI and ISO C++ committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906, by phone at (513)324-3601, or electronically at dsaks@wittenberg.edu.
Two months ago, I started describing changes in the C++ language, that is, those features of C++ that have different behaviors under the current draft standard than they do in the Annotated C++ Reference Manual (ARM) [1]. Some changes simply ban constructs that the ARM allowed (intentionally or otherwise). Other changes actually redefine some constructs to have a different well-defined behavior.
I began by explaining the changes in the scope rules:

Conditional statements now introduce a new block scope.

The scope of a declaration in a for-init-statement is now restricted to the for-statement.

The "rewriting" rule and the rule limiting the context sensitivity of class member declarations have been replaced by comprehensive scope rules for class member declarations.

The point of declaration for an enumerator is now immediately after its enumerator-definition.
(See "Stepping Up to C++: Changes in Scope Rules," CUJ, June, 1995.)
The remaining changes affect diverse parts of the C++ language. Last month, I detailed the following new rules:

The left-hand side of a member access expression is always evaluated, even if the right-hand side designates a static member or enumeration constant.

Enumerations are not integral, so that built-in arithmetic operators, such as ++ and --, no longer apply to enumerations; however, enumerations can be promoted to int, unsigned int, long, or unsigned long.

In most cases, temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception.
(See "Stepping Up to C++: Other Assorted Changes, Part 1," CUJ, July, 1995.)
This month, I have more. Remember, these new rules do not include extensions to the C++ language. For descriptions of the extensions, refer to my earlier columns ("Stepping Up to C++," January through May, 1995).

Friend Functions in Local Classes
Classes in C++ are an extension of structs in C. The class syntax is built upon the struct syntax, and classes observe essentially the same rules as structs. Since a struct in C++ is merely a special case of a class, the C++ draft standard simply uses the term class to refer to struct as well as class types.
A class definition can appear in a C++ program anywhere that a struct definition can appear, even local to a function or a block. C++, like C, allows local class definitions as a direct result of its general block structure. (A block or compound statement is a sequence of declarations and statements enclosed in { }. A function body is a block. A block may contain other blocks.)
Block-local struct definitions come in handy once in a great while. I've used them to define tabular data that I want known only to one function in a program, as in the xlate function sketched in Listing 1. xlate uses local struct entry to define translation table xtab for translating character strings into integer values. As always, defining entry locally avoids potential name conflicts with other entities named entry elsewhere in the program.
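The idea can be sketched roughly as follows. This is not Listing 1 itself: the table contents, the member names str and val, and the -1 "not found" result are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical sketch: translate a name into an integer value,
// returning -1 (an assumed convention) when the name is not in the table.
int xlate(const char *s)
   {
   struct entry               // local type, known only inside xlate
      {
      const char *str;
      int val;
      };
   static const entry xtab[] =
      {
      { "zero", 0 },
      { "one", 1 },
      { "two", 2 }
      };
   const int n = sizeof(xtab) / sizeof(xtab[0]);
   for (int i = 0; i < n; ++i)
      if (std::strcmp(s, xtab[i].str) == 0)
         return xtab[i].val;
   return -1;
   }

// Example use: only xlate is visible here; entry and xtab are not.
void xlate_demo()
   {
   assert(xlate("two") == 2);
   assert(xlate("nine") == -1);
   }
```

Because entry and xtab are local to xlate, another part of the program can define its own entry without any conflict.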
As part of generalizing structs into classes, C++ allows local classes with member functions. However, C++ imposes various restrictions on local classes that it does not impose on global classes. Most of these restrictions exist to avoid introducing nested function definitions into C++.
One such restriction is that member functions of a local class, if any, must be defined in situ. That is, member functions must be defined within their class definition. For example, you cannot write
void f()
   {
   class X
      {
   public:
      int g();
      };
   int X::g() { ... } // Not!
   ...
   }
because this requires support for nested function definitions. Nor can you write:
void f()
   {
   class X
      {
   public:
      int g();
      };
   ...
   }
int X::g() { ... }     // Not!
because X is not in scope outside f. Rather, you must define X::g in situ, as in
void f()
   {
   class X
      {
   public:
      int g() { ... } // OK!
      };
   ...
   }
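For a version of that last form with the ellipses filled in, here is a small sketch. The function sum_to and the class accumulator are hypothetical names chosen for the example.

```cpp
#include <cassert>

// Hypothetical example: every member function of the local class
// is defined inside the class body, the only place it can be defined.
int sum_to(int n)
   {
   class accumulator
      {
   public:
      accumulator() : total(0) { }
      void add(int v) { total += v; }     // defined in situ
      int get() const { return total; }   // defined in situ
   private:
      int total;
      };
   accumulator acc;
   for (int i = 1; i <= n; ++i)
      acc.add(i);
   return acc.get();
   }

void sum_to_demo()
   {
   assert(sum_to(4) == 10);   // 1 + 2 + 3 + 4
   }
```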
Another restriction is that declarations in a local class can use only types, static variables, extern variables, extern functions, and enumerators from the enclosing scope. Thus, a member function of a local class cannot access auto variables (including parameters) of the enclosing function. Both the ARM and the current draft illustrate this rule using the example shown as Listing 2.
Member function g of local class local in Listing 2 is in error because it refers to auto variable x defined in enclosing function f. C++ does not allow this access because it requires essentially the same run-time support as do nested functions. On the other hand, function local::h() is okay because it accesses only a static object from the enclosing function. This access requires no special run-time support.
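A sketch along the lines of that example follows. It is a reconstruction rather than the listing itself: the initial values are assumptions, and the offending member g is commented out so that the rest compiles.

```cpp
#include <cassert>

int x = 10;                      // global x (value assumed)

int f()
   {
   static int s = 7;             // static object local to f (value assumed)
   int x = 1;                    // auto x hides ::x inside f
   (void)x;                      // only here to show what g cannot use
   struct local
      {
      // int g() { return x; }   // error: x is an auto variable of f
      int h() { return s; }      // OK: s is static
      int k() { return ::x; }    // OK: ::x is the global x
      };
   local l;
   return l.h() + l.k();
   }

void local_demo()
   {
   assert(f() == 17);            // 7 from s plus 10 from ::x
   }
```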
As with a global class, a local class can declare functions as friends. A function declared as a friend of a class is not a member of the class, but it can access private and protected members of that class. Friends of a class are typically global (non-member) functions, but they can also be members of another class.
A friend declaration can grant friendship to a previously declared function. A friend declaration can also name a function that has not been declared, in which case the declaration introduces the function name into the innermost enclosing non-class scope. For example, assuming that f and g have not yet been declared, then
class X
   {
   friend void f(X);
   class Y
      {
      friend void g();
      };
   };
declares f and g as functions at namespace scope. (Namespace scope is the new way to say file scope.) The definitions for f and g must appear at namespace scope either later in this translation unit or in some other unit.
Friend declarations can also appear in local classes, and the governing rules are essentially the same as for friends in non-local classes. If the friend declaration names a previously declared function, the declaration grants friendship to that function. For example,
int g();

int f()
   {
   class X
      {
      friend int g();
      };
   ...
   }
declares global function g as a friend of local class X. Of course, you must still define g somewhere to avoid a link error. We'll get to the definition of g in a moment.
The previous example declares g before granting it friendship. As in a global class, a friend declaration in a local class can also name a function that has not been declared. Again, in that case, the friend declaration introduces the function declaration into the innermost enclosing non-class scope. For example,
int f()
   {
   class X
      {
      friend int g();
      };
   ...
   }
declares g inside f as if you had written
int f()
   {
   int g();    // introduced by friend g
   class X
      {
      friend int g();
      };
   ...
   }
Thus g has external linkage and refers to a function at namespace (file) scope defined elsewhere in the program. Note that if you defined
int h()
   {
   return g();
   }
immediately after the definition for f above, the compiler would complain that g had not been declared. The declaration for g inside f makes g known in the body of f, but not outside f.
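The visibility of a block-scope declaration can be seen in a small sketch. It writes the declaration of g inside f explicitly, in the "as if" form above, so it does not depend on what the friend declaration alone introduces. The return value 42 is arbitrary.

```cpp
#include <cassert>

int f()
   {
   int g();               // block-scope declaration, written out explicitly
   class X
      {
      friend int g();     // names the g declared just above
      };
   return g();            // OK: g is declared in this block
   }

// int h() { return g(); }   // error if placed here: g is not declared
                             // at namespace scope yet

int g()                   // the definition, at namespace scope
   {
   return 42;
   }

int h()
   {
   return g();            // OK now: the definition above also declares g
   }

void visibility_demo()
   {
   assert(f() == 42);
   assert(h() == 42);
   }
```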
Now, given that you've declared g as a friend of local class X, as in
int f()
   {
   class X
      {
      friend int g();
      };
   }
where do you define g? Since g is a function with external linkage, clearly you can write the definition at namespace scope anywhere before or after f:
int g()
   {
   ...
   }
Unfortunately, this definition can't take any advantage of its friendship with class X, because X is not in scope outside f. As such, the friend declaration is pretty useless.
If you want g to take advantage of its friendship with X, you must define g where X is still in scope. But, as was the case with member functions of local classes, you can't define g inside function f but outside class X, as in
int f()
   {
   class X
      {
      friend int g();
      };
   int g() { ... }    // Not!
   }
because this requires support for nested function definitions. However, you can refer to members of X if you define g at the same time you declare it, as in:
int f()
   {
   class X
      {
      friend int g()
         {
         // can access X's members here
         ...
         }
      };
   ...
   }
The ARM suggests, by the absence of any statement to the contrary, that you can define a function with namespace scope in a friend declaration of a local class, as above. However, the committees discovered that such definitions might be hazardous, as shown by the example in Listing 3.
As with members of local classes, a friend function defined in a local class can use only types, static variables, extern variables, extern functions, and enumerators from the enclosing scope. The definition for g in Listing 3 meets this restriction, because it refers only to static variable n from the enclosing scope. Nonetheless, that reference to n is still a problem because the program might call g before initializing n. Here's why.
A C++ program may defer initialization for a local static object (a static object at block scope) until run time. If the initializing expression is a constant expression, as in
void f()
   {
   static int n = 3;
   ...
   }
the program may do the initialization at load time, but the draft standard only guarantees that the initialization occurs prior to entering f. If the initializing expression is not constant, initialization occurs the first time control passes completely through the declaration. Thus, a program initializes n in Listing 3 the first time it calls f.
Since g is a global function, the program might call g before the first call to f. Thus, g might access n before n has been initialized. The resulting behavior is undefined (Bad Things will likely happen).
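The deferred initialization is easy to observe. In this sketch, init_count and make_value are hypothetical helpers added only to record when the initializer for n actually runs.

```cpp
#include <cassert>

int init_count = 0;               // hypothetical: counts initializer runs

int make_value()
   {
   ++init_count;
   return 3;
   }

int f()
   {
   static int n = make_value();   // non-constant initializer
   return n;
   }

void init_demo()
   {
   assert(init_count == 0);       // n not yet initialized: f never called
   assert(f() == 3);              // first call initializes n
   assert(f() == 3);              // later calls skip the initializer
   assert(init_count == 1);
   }
```

Any function that could reach n before the first call to f would find it uninitialized by make_value, which is exactly the hazard described above.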
The C++ standards committees considered outlawing friend declarations in local classes. After all, granting friendship from a local class to a function defined at namespace scope is apparently pretty useless because the function can't take advantage of the friendship. And, as shown just above, defining a global function in a friend declaration of a local class might lead to problems.
Nevertheless, the committees decided, as they have in the past, not to ban a feature just because it's apparently useless. They opted instead to ban only the potentially dangerous forms. In March 1993, the committees added the following rule to the draft in Section 11.4 [Friends]:
Although a local class may declare a global function as a friend, it may not define it.
Thus, the definition for g in Listing 3 is now clearly an error.
Sometime in the intervening years, that sentence evolved into:
A function of namespace scope can be defined in a friend declaration of a non-local class.
This no longer states the prohibition as clearly, but I'm sure the intent is the same.

Pointer Conversions
In C++, as in C, void * is the generic data pointer type. That is, a void * has enough bits to hold a pointer to any object type. Thus, for any complete or incomplete object type T, a program can convert a T * to void * and back. In C++, the conversion from void * back to T * requires a cast; in C it does not. In any event, the result of the round trip compares equal to the original pointer value.
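The round trip for object pointers can be sketched as follows. The helper name round_trips is hypothetical; the comments mark which direction C++ converts implicitly and which direction needs a cast.

```cpp
#include <cassert>

// Hypothetical helper: send an int * through void * and back.
bool round_trips(int *p)
   {
   void *v = p;                       // T * to void *: implicit
   int *q = static_cast<int *>(v);    // void * to T *: cast required in C++
   return q == p;                     // same value as the original
   }

void round_trip_demo()
   {
   int n = 5;
   assert(round_trips(&n));
   }
```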
C does not provide a generic function pointer. void * doesn't fill the bill. The C standard makes no statement about the relative sizes of data pointers and function pointers. If function pointers are bigger (require more bits) than void *, converting a function pointer to void * and back won't preserve the original function pointer value.
Even though C does not provide a generic function pointer type, it does guarantee that a pointer to a function of one type can be converted to a pointer to a function of another type and back without changing its value. Thus you can define a generic function pointer type, fp_t, as something like

typedef void (*fp_t)(void);
Then you can convert any function pointer to fp_t and back, and the result will be the same as the original. (However, the conversions both to and from fp_t require casts.)
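In C++ the same round trip might look like the sketch below. The function add, the typedef binop_t, and the two helper functions are hypothetical. The sketch uses reinterpret_cast; an old-style cast would do the same job.

```cpp
#include <cassert>

typedef void (*fp_t)(void);               // generic function pointer type

int add(int a, int b) { return a + b; }   // hypothetical function to store

typedef int (*binop_t)(int, int);         // the function's real pointer type

fp_t to_generic(binop_t f)
   {
   return reinterpret_cast<fp_t>(f);      // cast required
   }

binop_t from_generic(fp_t g)
   {
   return reinterpret_cast<binop_t>(g);   // cast required
   }

void fp_demo()
   {
   fp_t g = to_generic(add);
   assert(from_generic(g) == &add);       // round trip preserves the value
   assert(from_generic(g)(2, 3) == 5);    // and the function is still callable
   }
```

Calling through g itself, without converting back to binop_t first, would be an error; fp_t serves only as a holding type.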
C++ doesn't have a generic function pointer type either. However, the ARM suggests that a program can use void * as a generic function pointer when it states, in Section 4.6 [Pointer Conversions]:
A pointer to function may be converted to void * provided void * has sufficient bits to hold it.
Unfortunately, this is no help for programmers trying to write portable C++. Since there's no guarantee that a void * has enough bits, programmers must assume otherwise. Thus, the committees removed that statement and replaced it with the following footnote:
It is ill-formed to convert a pointer to a function to or from a pointer to an object type.
Using similar reasoning, they also removed the following from Section 5.4 [Explicit Type Conversion]:
A pointer to function may be explicitly converted to a pointer to an object type provided the object pointer type has enough bits to hold the function pointer. A pointer to an object type may be explicitly converted to a pointer to function provided the function pointer has enough bits to hold the object pointer.
Removing this text does not prevent implementations from offering these conversions as extensions. However, a translator must diagnose these conversions when operating in standard conforming mode.

Operator Function Names
The ARM defines an operator function name as a construct of the form
operator-function-name:
   operator op
where op is an operator symbol such as +, -, =, or one of many others. For example,

operator+ operator= operator new
are valid operator function names. (The draft standard uses the term operator-function-id instead of operator-function-name.)
Operator function names can appear in many of the places where ordinary function names can occur, such as function declarations:

complex operator+(complex, complex);
function calls:

B::operator=(d);
and so on.
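To make those uses concrete, here is a small sketch. The struct complex below is a minimal stand-in invented for the example, not a full complex number class.

```cpp
#include <cassert>

struct complex                  // minimal stand-in for illustration
   {
   double re, im;
   complex &operator+=(complex c)          // member operator function
      {
      re += c.re;
      im += c.im;
      return *this;
      }
   };

complex operator+(complex a, complex b)    // non-member operator function
   {
   return a += b;
   }

void operator_demo()
   {
   complex a = { 1, 2 };
   complex b = { 3, 4 };
   complex s = operator+(a, b);   // explicit call by operator function name
   complex t = a + b;             // the usual infix form, same function
   assert(s.re == 4 && s.im == 6);
   assert(t.re == s.re && t.im == s.im);
   a.operator+=(b);               // member form called by name
   assert(a.re == 4 && a.im == 6);
   }
```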
Unfortunately, the ARM inadvertently allows them in some pretty strange places. For instance, the ARM allows operator op as the name of an ordinary variable, as in

int operator+ = 1;
The committees readily agreed this was never meant to be. According to the draft standard, this declaration is no longer valid C++.
I have a few more changes left to cover, which I will do next month.

Errata
One of my readers, Vince Urso, caught a couple of errors in the complex number class that I used to explain mutable class members ("Stepping Up To C++: Mutable Class Members," CUJ, April 1995). Both problems occurred in the definition of complex::operator= in Listing 4 of that article. The corrections appear in Listing 4, marked by comments that say "this is new."
The first problem was that complex::operator= did not behave benignly when assigning a complex object to itself. Assigning an object to itself should have no apparent effect (other than eating processor cycles). I added the if statement to complex::operator= to ensure that it does.
The second problem was that operator= caused memory leaks by blindly assigning 0 (a null pointer) to p without deleting p first. I added the delete expression to cure that. (If p is null going into the delete, the delete has no effect.)
For good measure, I added a static data member called count to each of the complex and complex::polar classes. The constructors for these classes increment their respective counters, and the destructors decrement them. Thus, these counters track the number of objects created but not yet destroyed to expose any remaining leaks. The leaks appear to be gone.
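The shape of those fixes can be sketched with a pared-down class. This is not Listing 4: apart from operator=, p, count, and polar, every name here (re, im, rho, the constructors) is an assumption, and the cached polar form is reduced to a single value.

```cpp
#include <cassert>
#include <cmath>

class complex
   {
public:
   struct polar                       // cached polar form, reduced to rho
      {
      polar(double r) : rho(r) { ++count; }
      ~polar() { --count; }
      double rho;
      static int count;               // polar objects not yet destroyed
      };
   complex(double r, double i) : re(r), im(i), p(0) { ++count; }
   complex(const complex &c) : re(c.re), im(c.im), p(0) { ++count; }
   ~complex() { delete p; --count; }
   complex &operator=(const complex &c)
      {
      if (this != &c)                 // self-assignment changes nothing
         {
         re = c.re;
         im = c.im;
         delete p;                    // release the stale cache first
         p = 0;
         }
      return *this;
      }
   double rho() const                 // fills the cache on first use
      {
      if (p == 0)
         p = new polar(std::sqrt(re * re + im * im));
      return p->rho;
      }
   static int count;                  // complex objects not yet destroyed
private:
   double re, im;
   mutable polar *p;
   };

int complex::count = 0;
int complex::polar::count = 0;

void leak_demo()
   {
      {
      complex a(3, 4);
      complex b(1, 0);
      assert(a.rho() == 5);           // creates a's cache
      a = b;                          // deletes the cache instead of leaking it
      assert(a.rho() == 1);
      }
   assert(complex::count == 0);       // everything created was destroyed
   assert(complex::polar::count == 0);
   }
```

If the delete expression were removed from operator=, the polar counter would still be 1 at the end of leak_demo, which is how counters of this kind expose a leak.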
Thanks, Vince.

References
[1] Margaret A. Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual (Addison-Wesley, 1990).