March 1994/Questions & Answers

Columns

Questions & Answers

Run-Time Type Checking in C++

Kenneth Pugh

Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C and C++ language courses for corporations. He is the author of All On C, C for COBOL Programmers, and UNIX for MS-DOS Users, and was a member of the ANSI C committee. He also does custom C/C++ programming and provides SystemArchitectonics services. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 489-5239. Ken also receives email at kpugh@allen.com (Internet) and on Compuserve 70125,1142.

Run-Time Typing
Q
I have taught myself C++ and am comfortable with most of its concepts. There is one aspect that I am unsure about. Bjarne Stroustrup, in his book The C++ Programming Language states: "using run-time type inquiries ... destroys all modularity in a program and negates the aims of object-oriented programming." I attempt to adhere to this suggestion, but there is a situation in which I don't know how to apply the rule: reading and writing objects to data files. As an example, let's suppose we are writing a simple checkbook balancing program. The base object is an Entry, which corresponds to something you would enter in your checkbook. Derived from base class Entry are three classes which can be instantiated: Check (a written bank draft), Deposit (a bank window deposit), and Withdrawal (a window withdrawal). Keeping track of the type of objects in memory is easy, since when one of the three derived classes is created, it is done through a constructor which builds in type information. There are many ways to write the checkbook entries to a data file, so let's assume we just write a character-based representation to the file. The question is how can we write the data so that we can read the information back into memory? The only way I can think of is to include type information, such as an integer coded for each class type. This method violates Mr. Stroustrup's rule, though. Any thoughts you have on this matter would be greatly appreciated.
Paul Waldo
Forest, VA
A
One reason Bjarne opposed run-time type checking is that it lets programmers avoid derivation. For example, suppose you have a type_of function that can identify the Entry type of a pointer. To post an entry, you might code something that looks like Listing 1. If you need to add another type of Entry, then you have to add another case to the switch statement.
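A minimal sketch of such a type switch, written in present-day standard C++ and not taken from Listing 1, might look like the following. The EntryKind enumeration, the kind member, and the signature of post_entry are assumptions made for illustration; kind simply stands in for whatever a type_of query would report.

```cpp
// Hypothetical type code standing in for the result of a type_of query.
enum EntryKind { KindCheck, KindDeposit, KindWithdrawal };

struct Entry {
    EntryKind kind;
    double amount;
};

// Returns the balance after posting one entry.
double post_entry(const Entry& entry, double balance) {
    switch (entry.kind) {
    case KindCheck:      return balance - entry.amount;
    case KindDeposit:    return balance + entry.amount;
    case KindWithdrawal: return balance - entry.amount;
    // Every new kind of Entry needs another case here.
    }
    return balance;  // unknown kind: balance unchanged
}
```

The weakness is visible in the switch itself: the function has to know every kind of Entry that exists.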
However, there's a cleaner way to handle this problem. Suppose you give Entry a pure virtual post function. Each class derived from Entry now supplies its own post function. The post_entry function could look like Listing 2. When adding a new derived class, you do not need to change the post_entry function. You just provide a post function for the new class.
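As a rough illustration of the virtual-function approach (again in present-day standard C++, and not a copy of Listing 2), the classes might be arranged as follows. The Ledger type and the amount_ member are hypothetical and exist only to give post something to do.

```cpp
// Hypothetical ledger that entries post themselves to.
struct Ledger {
    double balance = 0.0;
};

// Base class: every entry knows how to post itself.
class Entry {
public:
    explicit Entry(double amount) : amount_(amount) {}
    virtual ~Entry() {}
    virtual void post(Ledger& ledger) const = 0;  // pure virtual
protected:
    double amount_;
};

class Check : public Entry {
public:
    explicit Check(double amount) : Entry(amount) {}
    void post(Ledger& ledger) const override { ledger.balance -= amount_; }
};

class Deposit : public Entry {
public:
    explicit Deposit(double amount) : Entry(amount) {}
    void post(Ledger& ledger) const override { ledger.balance += amount_; }
};

class Withdrawal : public Entry {
public:
    explicit Withdrawal(double amount) : Entry(amount) {}
    void post(Ledger& ledger) const override { ledger.balance -= amount_; }
};

// Stays the same no matter how many classes are derived from Entry.
void post_entry(const Entry& entry, Ledger& ledger) {
    entry.post(ledger);
}
```

Adding, say, an interest-payment class would mean writing one new class with its own post function; post_entry would not be touched.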
As you have suggested, virtual functions only work while objects are in memory. Each object of a class with virtual functions contains a pointer to a table of function pointers (sometimes called the vtable). All objects of a particular class contain a pointer to the same table. When the program calls a virtual member function, it calls through the corresponding function pointer in the vtable. In a sense, this vtable pointer uniquely identifies the class type of an object. In fact, some compilers have non-standard extensions that can use this pointer to provide a form of run-time type identification.
Other vendors provide alternative methods for run-time identification. Microsoft has a CRuntimeClass object associated with each class that is used with the IsKindOf function. With this function, you can determine if an object belongs to a particular class or if it is derived from a class. (The class must be derived from the Microsoft CObject class to work with IsKindOf.) To use the IsKindOf function, you include a DECLARE_DYNAMIC macro in the class definition and an IMPLEMENT_DYNAMIC macro in the class implementation. CRuntimeClass is the data type used to store class information. You can obtain a pointer to an object of this type with:
CRuntimeClass * pclass = RUNTIME_CLASS(Your_class);
Typically you do not use the information in CRuntimeClass directly, but instead pass it to IsKindOf. For example, as shown in the following code fragment, you might want to cast a base class pointer to a pointer to the object's actual class. To be logically correct, you need to be sure the object pointed to belongs to that class.
CObject * pyour_object = new Your_class;
...
if ( pyour_object->IsKindOf( RUNTIME_CLASS(Your_class) ) )
    {
    // Cast it
    Your_class * p_this_object = (Your_class *) pyour_object;
    }
You can also use this type information to create objects of a given class. The CRuntimeClass class provides a member function, CreateObject, for dynamically creating objects. The use of CreateObject is demonstrated as follows:
CRuntimeClass * p_your_class = RUNTIME_CLASS(Your_class);
// Create it
CObject * pyour_object = p_your_class->CreateObject();
// Cast it to the class
Your_class * p_this_object = (Your_class *) pyour_object;
In your question, you have run across one area of programming that requires some form of type identification: permanent or persistent storage. You cannot use the address of a vtable as the identification means for an object. When the contents of an object are read back into memory, the chances are that the vtable will be in a different memory location. You will need to store some form of type identification with the object. This identification could either be an integer value or the string name of the class. If your program stores objects in a known sequence and retrieves them by the same known sequence, then you do not need any type identification stored with the object. For example, if you always store Checks starting at the first position in the file, followed by Deposits and then Withdrawals, you can determine the type of an object by its position in the file. In your example, you cannot assume this kind of ordering exists, so you must store a class identifier. There are lots of ways to store this identifier, depending on how your classes are organized. Let's assume you have some unique identifier for each account entry type, such as:
enum Entry_type {Entry_check, Entry_deposit, Entry_withdrawal};
One of these values would be stored away with each object. For example, if your compiler provides run-time type identification with type_of, you might code:
void Entry::save()
    {
    Entry_type type;
    if ( this->type_of() == Check )
        type = Entry_check;
    else if ( this->type_of() == Deposit )
        type = Entry_deposit;
    else if ( this->type_of() == Withdrawal )
        type = Entry_withdrawal;
    ...
    // Store type, then store remaining contents in an account
    }
Before I get too much mail regarding this "hidden switch statement," let me explain that this function is complementary to the retrieval function, which I will show shortly. You could use a virtual function for save in each of the derived classes. The function would store the appropriate Entry_type value, call a save function in the Entry class to store the data members in that class, and then save its own data members. If you were using the Microsoft compiler, this type checking code might look like the following fragment. However, Microsoft provides other functions that make this code unnecessary, as we shall see shortly.
void Entry::save()
    {
    CRuntimeClass * pcheck_class
        = RUNTIME_CLASS(Check);
    CRuntimeClass * pdeposit_class
        = RUNTIME_CLASS(Deposit);
    CRuntimeClass * pwithdrawal_class
        = RUNTIME_CLASS(Withdrawal);
    Entry_type type;
    if ( this->IsKindOf(pcheck_class) )
        type = Entry_check;
    else if ( this->IsKindOf(pdeposit_class) )
        type = Entry_deposit;
    else if ( this->IsKindOf(pwithdrawal_class) )
        type = Entry_withdrawal;
    ...
    // Store type, then store remaining contents in an account
    }
The basic dilemma comes in retrieving the objects. You cannot use a retrieve function for each derived class, as you do not know the type of entry for the next object stored in the file. Thus you can only retrieve an entry as a base class object. For example, the caller might use something that looks like:
Entry *pentry = NULL;
Account account;
...
account.retrieve_next(&pentry);
The retrieve_next function could look like the following:
int Account::retrieve_next(Entry **pentry_in)
    {
    // Get rid of the old object
    Entry *pentry = *pentry_in;
    if (pentry != NULL)
        delete pentry;
    pentry = NULL;
    // Read the record off disk
    // Then check the type of the record read
    if (type == Entry_check)
        pentry = new Check;
    else if (type == Entry_deposit)
        pentry = new Deposit;
    else if (type == Entry_withdrawal)
        pentry = new Withdrawal;
    // Move associated data into the object
    ...
    // Then return the pointer
    *pentry_in = pentry;
    // Assumed convention: nonzero if an entry was created
    return pentry != NULL;
    }
The caller passes this function a pointer to a pointer to Entry. The function deletes the Entry that *pentry_in pointed to, so the caller's pointer must be either NULL or the address of a previously retrieved object. The Entry class should provide a virtual destructor, so that any additional data members in the derived classes are deallocated. Based on the Entry_type value stored in the file, the function allocates an object of the appropriate class. The function moves the necessary data from the file into the new object and returns the value of the pointer. If necessary, each derived class can include a function that reads any additional data members from the file.
The Microsoft archive retrieval function (CArchive::operator>>) does not require this if-else structure. The storage function (CArchive::operator<<) stores a run-time class identifier (the name of the class) in the file. When the retrieval function reads the file, it dynamically creates an object of the stored class and loads the information from the file into that object. Microsoft's multi-pronged approach to run-time typing eliminates worries about the details of persistent object storage and retrieval. All you have to do is to include the necessary macros in your object header file and your object implementation file.

C++ Problems
Q
I am an agricultural economist at the U.S. Department of Agriculture and am trying to learn C and C++. I have bought over ten books on C, and am reading them and keying in the exercises to practice the concepts. One of the books I have bought is Learning C++, by Neil Graham. I began keying in one of the exercises (in C++) and the program gave me a bunch of iostream declaration errors when I attempted to compile the source code using Borland Turbo C++, 1991 (Listing 3). I saw your column in The C Users Journal and thought that you might be willing to suggest what is wrong, and how I might fix it. Also, do you have any recommendations for textbooks and/or diskette tutorials that may be useful in learning C and C++?
Kenneth W. Erickson
Washington, D.C.
A
Reading several books and trying their examples is an excellent way to learn a new language. You can get a good feel for alternative ways of approaching problems. However, when you use only books for learning, you often run into problems for which a book has no answer. Many times I've been in the same quandary when using vendor-supplied manuals. In some cases the solution to my problem was simple but the manual just didn't deal with the problem.
In your case, what appears to be an inexplicable error is caused by a simple glitch. You named the program with a .c extension. The compiler took this to mean that your program was to be compiled as a C program. Inside the iostream.h file are several C++ syntactic constructs (such as class). The compiler reported error messages for these constructs since they do not exist in C. If you had named the program with a .cpp extension, the compiler would have compiled the program cleanly. You might have been confused if you had compiled other C++ programs without errors, and therefore thought there was something wrong with this program. Many compilers (such as Borland's) have a switch that compiles .c files as C++ programs. If this option was set in your configuration file for previous programs, then other C++ programs with .c extensions would have compiled properly. If you switched around directories or reinstalled the compiler, the switch may have been reset.
There are numerous books out on C. Many are based on your knowledge of other languages. My book C Language for Programmers (QED) gives comparisons of C constructs with COBOL, PASCAL, PL/1, BASIC, and FORTRAN. My new book C Language for COBOL Programmers (QED) presents very detailed comparisons between COBOL language statements and C. All on C (Harper Collins) explains C without comparisons to other languages, but with numerous examples. In the C++ arena, I suggest C++ Programming and Fundamental Concepts by Anderson and Heinze (Prentice-Hall). I use that book as a supplementary book for my C++ courses.

User interface
Q
I am new to the C language and am having a problem that you might help me with. I am writing a football card data base, and need help with the user interface. What I am trying to do is let the user of my program enter required information without pressing the enter key. I also want the user to be able to move from one data entry field to another using the arrow and home keys of the key pad. If the user presses the arrow keys the highlighted field would move up or down as required, but if a letter key or a number key was pressed, the letter or number would be concatenated to the appropriate char string variable. The concatenating part I can handle. It's getting the input that's giving me the problem. I've tried many different approaches to make this work, but still no luck. If you could help me with this problem I would be very grateful. I'm a subscriber to the C Users Journal, and will be checking the Questions & Answers section to see if you can solve my problem. I am using Borland C++ version 3.0 (DOS).
Randy Jones
Ruther Glen, VA
A
As you have discovered, data-field entry functions are not included in the Standard C libraries. Numerous shareware and commercially available libraries will perform the operations you list. For a listing of available shareware, consult The C User's Group Public Domain Catalog [available for free upon request from R&D Publications, 1601 W. 23rd St., Lawrence, KS 66046. Ph: (913)-841-1631. Fax: (913)-841-2624. e-mail: cujsub@rdpub.com]. Perusing The C Users Journal itself will reveal a number of the major vendors of display packages. Since for the past year I have been programming almost exclusively in Microsoft Windows using Visual C++, I haven't kept up with all the different features of the commercial packages. Many of these packages have integrated screen designers/code generators. With such a system, you can lay out your screens with a mouse or cursor keys, rather than by writing individual function calls for each field. In case you can't find what you want in either the catalog or the ads, my book All on C shows a sample implementation of a package of display functions providing most of the features you requested.
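If you want to experiment before choosing a package, the core of the problem is small. Under DOS, the getch function declared in Borland's conio.h returns a keystroke without waiting for the Enter key; for the arrow and Home keys it returns 0, and a second call returns a scan code (72 for up, 80 for down, 71 for Home). The sketch below, in present-day standard C++, keeps that decoding separate from the field logic so the logic can be tried anywhere. Key, decode_key, Form, and handle_key are hypothetical names, and the two arguments of decode_key stand for the values of the first and second getch calls.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum KeyKind { KeyChar, KeyUp, KeyDown, KeyHome, KeyOther };

struct Key {
    KeyKind kind;
    char ch;  // meaningful only when kind == KeyChar
};

// Translates DOS-style getch() results into a Key.
// 'second' is used only when 'first' is 0 (an extended key).
Key decode_key(int first, int second) {
    if (first == 0) {
        switch (second) {
        case 72: return Key{KeyUp, '\0'};
        case 80: return Key{KeyDown, '\0'};
        case 71: return Key{KeyHome, '\0'};
        default: return Key{KeyOther, '\0'};
        }
    }
    if (first >= 32 && first < 127)  // printable ASCII
        return Key{KeyChar, static_cast<char>(first)};
    return Key{KeyOther, '\0'};
}

struct Form {
    std::vector<std::string> fields;  // one string per data entry field
    std::size_t current = 0;          // index of the highlighted field
};

// Moves the highlight or appends a character to the current field.
void handle_key(Form& form, Key key) {
    if (form.fields.empty()) return;
    switch (key.kind) {
    case KeyUp:
        if (form.current > 0) --form.current;
        break;
    case KeyDown:
        if (form.current + 1 < form.fields.size()) ++form.current;
        break;
    case KeyHome:
        form.current = 0;
        break;
    case KeyChar:
        form.fields[form.current] += key.ch;
        break;
    case KeyOther:
        break;  // ignored in this sketch
    }
}
```

In a DOS program the main loop would call getch, call it again if the result was 0, pass both values to decode_key, hand the result to handle_key, and then redraw the fields with the current one highlighted.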