C PROGRAMMING

OPPs to the Left, FOPs to the Right, and a View from the Center

Al Stevens

It turns out that I have been misusing a C language feature, the power of which I have only just come to understand. The feature is the typedef statement. Let's examine how my minor transgression came about by looking at how typedef can be used and how it should be used.

Suppose you have a structure like this one:

struct empl_rcd {
   int emplno;
   char emplname[25];
   unsigned date_hired;
   int empl_category;
   long salary;
   };

You can define a brand new data type for this structure with the typedef statement. Of such is the extensible nature of C. It works like this:

  typedef struct empl_rcd EMPLOYEE;

All declarations of the structure can now use the new EMPLOYEE data type rather than the struct empl_rcd type. These are examples of how you can declare instances of this data type.

   EMPLOYEE newhire;
   EMPLOYEE *retired;
   EMPLOYEE chiefs[5];

Why do this? Subsequent references to the structures might gain nothing from the use of the typedef. In practice, you could not tell by looking at the code that accessed one of these structures that the EMPLOYEE data type had been defined at all. Witness these expressions.

   printf(newhire.emplname);
   total_payroll += chiefs[i].salary;
   return retired->empl_category;

In none of these cases is the code insulated from the format of the data structure. In all cases, the code is exactly the same as if the typedef had not been used. The strongest advantage of the typedef, its ability to hide information, is not realized by these uses of it. If no other software in your system needs to know that EMPLOYEEs exist, the typedef is wasted. What, then, do you gain from the EMPLOYEE typedef provided in these examples? Nothing, unfortunately, unless you consider the reduced keystrokes involved in coding the name.

Now consider the C standard input/ output library FILE type. This is a typedef defined in stdio.h that identifies a stream file definition. The structure itself, if such there be, is defined in stdio.h, and its name and format are implementation-dependent. A designer of a standard C input/output library can define a FILE to be anything from a simple integer to the file-defining structure itself. This seems to be a more correct use of typedef. There is something called a FILE, and its implementors know its internals. You know only what you need to know to use a FILE, and you do not care whether it is a structure, union, integer, pointer, or what. All you know is that you can use one.

In an application program you typically do three things with a FILE data type. You declare pointers to the type, assign values to the pointers by calling library functions that return the address of a FILE, and pass the FILE pointers to other functions. The library functions know what to do with the pointers because they know the format of the FILE data type. The calling program does not (or does not need to) know that. The information is thus hidden. You can peek and learn it if you want, and you could code specific references to the data structures that underlie it, but you would surely sacrifice portability, not only among compilers but possibly between versions of the same compiler.

How then could we use the EMPLOYEE typedef in ways that would benefit us? One way would be to define it in a header file and tuck all the employee-related functions away in their own function library, much like the stdio.h file and standard library are used. Then, applications programs that need to do things to employee records could call these functions with EMPLOYEE pointers in the same way we use the FILE pointer for stream input/ output. The functions in the library would have to be aware of the formats of the data structures, but the using functions would not. That way payroll, personnel, and project management systems could all use the same employee function library without needing to know the internal formats and methods of how employee records are stored. Almost object-oriented, wouldn't you say?

How have I abused the typedef? Last autumn, in the window function library that supports the SMALLCOM project, I defined MENU and FIELD data types and then required the applications programs, programs that you might write, to initialize the structures under the typedef definitions. Nothing is hidden. In order to use the menu and data-entry function library, you must know the format of those structures. Change the format and you must change your programs. A better approach would have been to provide initializing functions or macros with initializing data values as parameters. Then, when the underlying structures and functions change to accommodate new requirements, only the applications programs that deal with the changes need to be looked at.

Why do I need to hide information from myself? I know the formats of those structures, I understand the probability that they will change and the implications of such change, and, besides, initializations are more efficient at compile time than at run-time, anyway. (Insert here your favorite argument for continuing to do things the way you have always done them.)

A compiler must hide the details of the FILE data type to preserve the standard method for implementing it. But how well hidden is it? Look into stdio.h and you will see the whole enchilada. The FILE structure format is there. Some of the standard library functions expand into macros that directly address members in the FILE structure, and the macros are there. How well hidden is information that I can find in well-commented source code by using my text editor?

The answer, of course, is that the standard input/output information hidey-hole is a benign safe-store, benign in that a curious seeker can find it and peruse its contents without difficulty, benign in that only casual measures protect its secrets from the voyeur. You would not care to regard, on a daily basis, the details of the garbage disposal in your kitchen or the destination of its fare, but you know where the information can be found if you really need it. The same is true of the information hidden in readily accessible stashes such as stdio.h.

These are not new ideas. Information hiding has been talked about and used for years. But it has come to be viewed in a new light in recent years, one called "object-oriented programming," and I began to revisit my own practices with such things as typedef when I began studying OOP. The new light is still dim and my vision myopic, but we'll break out of the clouds soon, I think. One thing is clear: OOP encourages you to do a better job of hiding the hideable information.

OOPs

I think it was Robert Benchley who observed that the world is divided into two kinds of people -- those who divide the world into two kinds of people and those who do not. The world of programmers is dividing. There are those who embrace object-oriented programming and those who do not. Let's wonder why.

We are told that to learn OOP we must abandon what we know about conventional programming, that the biggest obstacle to learning is one of unlearning. The prospect of returning to elementary school will discourage a lot of conventional programmers who feel that they know how to write programs just fine, thank you.

My pal at DDJ, Kent Porter, suggested a two-syllable word for the notion that OOP is totally new and different, a word that means what you get when you run your favorite straw hat through a bull. Kent had been toying with the object-oriented extensions coming to Turbo Pascal and had crossed the fence to the greener pastures of OOP. Kent was too old to start learning all over again. And he was too young to leave when he did. I will miss him.

For me, I am studying C++ now, and it has been suggested that the Turbo Pascal extensions are a smooth bridge from C to C++. I intend to explore that possibility. My prophesy is that the prevailing OOP language will be C++, and that its full acceptance will occur when and if Borland and Microsoft introduce C++ compilers. If anything has ever screamed for OOP support, the applications program interfaces to DOS Windows and OS/2 Presentation Manager have. Their apparently object-oriented architectures sure do need a C++ with related object libraries behind them.

I tend toward C++ because it is a superset of C, and one can always retreat to C when the OOP view does not seem to fit. Some OOP purists suggest that C++ is not really OOP. Maybe, maybe not, and maybe the paradigm has not settled in, but I'll still bet C++ is the wave of tomorrow if there is going to be a new wave.

Whether or not C++ replaces C as a primary development language depends on how much it is hyped by the moving forces in language development and what it does for the programmer. Certainly if Borland and Microsoft get behind it, lots of programmers will give it a try. But programmers will not stick with C++ unless there is a reason to.

We measure the value of a language by its ability to express a complex algorithm with few lines of code in a way that can be read and plainly understood. Most programmers took to C because it could do those things without getting too far away from the hardware and without rigidly enforcing anything. To win C programmers over, C++ has to get things running with fewer and easier lines of code, make for programs that are easier to modify, or both. As long as the code is readable later, why code two when one will do? C++ has to improve our ability to write programs, or it is not necessary.

So what is different about OOP? Perhaps it is that the OOP view of a programming problem has a different perspective than the one we enjoy now. But is that really so?

The OOP view looks toward the data structures and the functions specific to those structures. You describe classes of data, objects that are instances of those classes, and methods to process the objects. Then, to make things happen, you send a message to an object, and things happen. Consider that old chestnut, hello.c, in conventional C.

   /* ------ hello.c in C ------ */
   #include <stdio.h>
   main( )
   {
        printf("hello, world\n") ;
   }

This program is oriented to the function it performs. It is a function-oriented program (FOP). The FOP programmer thinks about solutions in terms of the procedure, the functions, the steps one takes to solve something.

Now, consider the same program in C++

   \\ ------ hello.c in C++
   #include <stream.h >>
   main( )
   {
   cout << "hello, world \ n";
   }

This program, we are told, is oriented to its objects, the data structures it supports, and the methods it uses to support them.

The << operator is not itself an integral part of C++ any more than the printf function is an integral part of the C language. The << operator is associated with the ostream class within stream.h. You can associate custom operators with classes in C++. The greater-than operator, for example, can mean different things to different classes. Moreover, it can mean different things in the same class when taken in the context of the types of what it operates on.

Now let's write the "hello, world" message to a file. First, we'll use conventional C:

   #include <stdio.h>
   main()
   {
          FILE *fp;
          fp = fopen("hello.dat", "w");
          fputs("hello, world\n", fp);
          fclose(fp);
   }

In standard FOP C, you call three functions. The first one opens the file and returns a pointer to a FILE. The second function accepts a data address and a FILE pointer and sends the data string to the file. The third function closes the file.

Now let's look at the same operation with C++:

   #include <stream.h>
   main()
   {
          filebuf bf;
          bf.open("hello.dat", output);
          ostream op(&bf);
          op.put("hello, world\n");
   }

In the C++ OOP view, you declare an object of class filebuf named bf. This object is the data buffer itself. Next you connect the buffer to a stream file with the filebuf::open function. The other way of saying that is that you send the string address and the output flag as messages to the bf object, telling it to use its open method. Then you declare an object of type ostream named op with the filebuf as an initializing argument. Finally you send the string message to the op object telling it to use its put method.

At this level, the two views are still similar. A C programmer could make the transition to C++ with little problem if this was as far as it went. In fact, you might wonder, why bother? It appears to be a syntax change, nothing more. The big difference, however, is in what is hidden. The definitions of the filebuf and ostream classes are significantly different from their FILE counterparts in standard C.

"So what?" you might ask. "The differences are hidden. Who cares how different they are?"

Think back to the EMPLOYEES typedef. That one is a part of the application, is your responsibility, and, if it is to be an object, needs a class, hidden member data structures, and public member method functions. To design these things you have to be as object-oriented as the language. If OOP has anything to offer, the apparent different statement syntax is not it. There must be merit in the taking of that different perspective.

One of the problems we will confront in writing our first C++ programs is deciding what should be classes and objects and what should not. When do we send messages and when do we call functions? What is the most appropriate way to send messages? Suppose your program posts error conditions by opening a window and displaying the message. Should the error processor be an old-fashioned C function like this?

  error("Too many Chiefs");

Or should the error processor be an object of a class of output like this?

     class error: public window {
      . . .
     };
     error errs;
     errs.post("Too many Chiefs");

Should the error processor use a member function as just shown, or should it use its own operator to post errors like this?

  errs << "Too many Chiefs";

You might post errors with the error class's constructor and destructor functions. The constructor function is called when the object is declared, and the destructor function is called when the object goes out of scope.

     if (chiefs > MAXCHIEFS) {
      error errs("Too many chiefs");
   }

This OOP stuff is beginning to make sense. The code in this column may have a few mistakes, but I'm just learning C++ myself. In the next few months I"ll explore it in more depth, and we'll learn together.

Are People Object-Oriented?

For a programming paradigm to take hold, it must work like people think. The solution to a problem should resemble the problem. Are we object-oriented creatures? I think we might be. People know how to do many complex things but usually only within the context of the objects involved. A plumber's methods have meaning only when considered in the context of sinks, bathtubs, and the like. Take away an understanding of the objects and the methods would be nonsense.

So what is more important and deserves more emphasis, the object or the methods, the bathtub or the plumber's tools and procedures? Where should the perspective be? And what should be saved?

With traditional FOP, you preserve general-purpose, reusable functions in libraries, and you keep the applications data structures in header files. The two are not associated except within the context of programs that tie them together. The applications-specific functions are generally not retained for reuse. In OOP you can catalog the reusable stuff just as with FOP, but you also closely associate data-oriented functions with the data structures as objects, and you can save these objects for reuse if appropriate.

So what should be in the library, the functions or the objects? What should be designed to be reusable? Obviously, only those things that are likely to be reused should be so designed, and sometimes it goes one way and sometimes the other. Without knowing any better, I am drawn to C++ because of its subset ANSI C. You can go either way even within the same program. Only time will tell if that is a good idea.

My pal Bill Chaney tells me to think again; a FOP is a dandy, and OOPs means a mistake has been made.

Book Report: The C++ Programming Language

The C++ Programming Language by Bjarne Stroustrup tries a lot to be the K&R book of C++. It resembles the "white book." It has the same size, same style and presentation, and (almost) the same title. But K&R is a well-crafted description of C aimed at the programmer who does not yet know the language. As you read K&R, you are led carefully through the features of the language with well-organized examples and a well-designed sequence for the presentation of information. The C++ book does not measure up to this high standard.

If anything is going to spell the demise of C++ or OOP in general, it is that the language and the paradigm are or seem to be difficult to describe. Programmers do not readily embrace something that the experts cannot explain. Maybe it isn't so difficult; maybe no one has described them correctly yet. You will hear that many programmers have difficulty making the transition from C to C++. Perhaps they are trying to do it by reading this book. Believe me, Stroustrup's book is not the place to begin.

Once it gets past a brief description of the C subset, the book dives into details way too deep for the newcomer. For example, it introduces classes with a discussion of how ostream is implemented without explaining what ostream is. Then it sinks into an arcane description of how operator overloading is used to implement expressions like A << B << C well before it fully explains how you can extend the language with operators for objects. This book suffers from the malady that most treatments of OOP have -- abstruse examples. Most programmers will not relate to the implementation of stream input/output classes and objects because they do not do that kind of programming. Most programmers will not associate with examples that use complex numbers, because relatively few real-world programming problems use them. The book renames C's arrays, calling them vectors, and never explains that. Then, to further confuse you, the discussion of vectors makes you think that there is a vector class by building one. Not until later do you learn that you are being told how to use the features of C++ to do your own vector bounds checking and not using C++'s unbounded vectors at all. The entire operator topic uses the implementation of improved vectors as an example. Most programmers would not cozy up to these examples even if the examples were well-organized and -presented because most programmers do not write systems programs where such techniques are applied. The approach taken by this book resembles the typical university computer sciences curriculum, where emphasis is on language design and compiler development rather than how and why they are used.

That's about as far as I've gotten with The C++ Programming Language. I'm looking for another book.

My first impression of C was taken from K&R. The organization and clarity of that work engendered confidence in the language. It was easy to believe that the language would live up to its promise; Dennis Ritchie, one of the coauthors of the book, designed the language.

If the same confusion and lack of organization found in the C++ "definitive reference and guide" is in the language (C++ was designed by Stroustrup), we are in deep sushi. First impressions die hard, but I hope to be wrong. I still believe that C++ will be the next major programming language.

As the work of the ANSI X3J11 committee winds down, they are considering the standardization of C++. If this undertaking gets going, it is a sure sign that C++ will establish and maintain a strong presence for a long time.