July 1990/An Object-Oriented Approach To Command Line Options

Object-Oriented Programming

An Object-Oriented Approach To Command Line Options

Don Colner

Donald Colner is a computer software consultant with 19 years' experience in software engineering and in applications and systems design and development. He can be reached at Basic Data Systems, Inc., 2202 Sherbrooke Way, Rockville, Maryland 20850 (301) 279-2791.
Object-oriented programming focuses on an object, dividing a program into two parts: a high-level part that needs information about the object, and a low-level part that provides that information. Object-oriented programming languages enforce this division by hiding the data which describe the object from the high-level part of the program.
Object-oriented programming is both an art and a craft. Identifying the object and the information which is needed about the object is mostly art. The resulting collection of data and functions is called a class. On the other hand, implementing the data structures and functions which make up the class is mostly craft; this low-level part of the program is usually addressed with traditional programming techniques.
Scott Maley, in his article "The World of Command Line Options," (The C Users Journal, Vol. 8, No. 3, March 1990) provides an excellent exposition of a traditional solution to the frequently encountered task of accessing command line arguments. To contrast traditional with object-oriented techniques, in this article I will develop an object-oriented solution to the same problem using top-down functional analysis to drive the design.

The Problem
Many C programs begin by parsing the command line arguments accessible through argc and argv. One would expect that consistent use of a function like getopt() from UNIX System V would free the programmer from re-inventing this error-prone code with every new application. In practice, the code is seldom reusable, both because usage isn't completely consistent between programs and because the available functions (like getopt()) rarely meet all of one's needs.

Top-Down Functional Analysis
Top-down functional analysis begins by identifying the data abstraction (e.g. the command line arguments) and defining what information the high-level functions may require about the data abstraction. The high-level functions which need the information should be considered prior to any code which implements the class. Listing 1, TestOpt.c, provides an example.
To treat the command line arguments as an object, first decide what you want to know about the arguments. Assume that command lines are structured so that they contain the program name in argv[0], switches (a letter preceded by -), and options (arguments after argv[0] that don't begin with -). These options are frequently file names. Switches are sometimes followed by a parameter. For example,
MyProgram -x -f 10 MyFile YourFile
In this example the program name is MyProgram, there are two switches, x and f, and the f switch has the parameter 10. There are two options, MyFile, and YourFile.
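The structure just described can be checked mechanically. The following is a minimal sketch, not taken from the article's listings: the ArgSummary type, the SummarizeArguments() function, and the takes_param string (the switch letters assumed to consume the following argument) are all hypothetical names introduced here for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of a command line; not part of the Options class. */
typedef struct {
    const char *program;      /* argv[0], or NULL if argc < 1 */
    int         switch_count; /* arguments of the form -x */
    int         option_count; /* remaining arguments, e.g. file names */
} ArgSummary;

/*
 * Count switches and options. takes_param is an assumption of this sketch:
 * it lists the switch letters that are followed by a parameter.
 */
ArgSummary SummarizeArguments(int argc, char *argv[], const char *takes_param)
{
    ArgSummary s = { NULL, 0, 0 };
    int i;

    if (argc < 1)
        return s;
    s.program = argv[0];

    for (i = 1; i < argc; i++) {
        if (argv[i][0] == '-' && argv[i][1] != '\0') {
            s.switch_count++;
            /* Skip the parameter so it is not counted as an option. */
            if (strchr(takes_param, argv[i][1]) != NULL && i + 1 < argc)
                i++;
        } else {
            /* Anything else, including a lone "-", is treated as an option. */
            s.option_count++;
        }
    }
    return s;
}
```

Called with the example command line and takes_param set to "f", this sketch reports the program name MyProgram, two switches, and two options; the parameter 10 is consumed by the f switch rather than counted as an option.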
The questions we might want to ask about a command line with this structure include:

Is a certain switch set?

What is the parameter given for a certain switch?

What is the next option?

Do I need to look for more switches?

What name was used to invoke this program?
These questions can be answered with the following set of functions:
IsSwitch( opt, 'f');
GetParameter( opt, 'f');
GetNextOption( opt );
IsMoreSwitches( opt );
GetProgramName( opt );
Listing 1, TestOpt.c, shows how these functions could be used in a C program.
Notice that this top level design analysis does not focus on the structure of the data referenced by opt. In fact, we have not even mentioned it. Up to this point in the analysis, opt is a data abstraction.

Public Class Definition
The public class definition for Options is contained in the header file, Options.h (Listing 2) and should be the second focus of attention. This header file contains the prototypes of the functions which operate on the abstract data type. Note that the definition of Options:
typedef void * Options;
provides no information at all about the structure of the data which is manipulated by the functions such as GetParameter() in the Options class.

Private Class Definition
The private class definition of Options is contained in Options.c, Listing 3. The details of the data structure pointed to by the Options opt are restricted to this file. As a result, if a high-level function such as main() wants any information about opt, it must invoke one of the Options class functions, such as IsMoreSwitches(). This makes the class data private to the class functions.
Making the definition of the data structure private to the class assures us that none of the high-level functions can manipulate the class data incorrectly. Any mistake in handling that data must therefore lie inside the low-level functions which implement the class. This "information hiding" makes debugging easier.
This approach restricts the details of the class implementation to a single file, the private class definition Options.c. As a result, if it becomes necessary or desirable to change the class, the impact of that change is restricted to a single file. This makes bugs easier to find and enhancements simpler to implement. If you later decide to alter the behavior of a class function, you know that only the low-level functions will need to be changed. If, for example, you decide that switch letters should be case insensitive, you will need to change only IsSwitch().
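To make the case-insensitivity example concrete, here is a hedged sketch of what such a change might look like. The OPTIONS layout shown (a string of the switch letters seen on the command line) and the int return type of IsSwitch() are assumptions made for this illustration; the structure in Listing 3 may well differ.

```c
#include <ctype.h>

typedef void *Options;   /* public view: an opaque handle */

/* Hypothetical private layout, assumed for this sketch only. */
typedef struct {
    const char *switches;   /* switch letters seen, e.g. "xf" */
} OPTIONS;

/* Return 1 if the switch was given, ignoring case; otherwise 0. */
int IsSwitch(Options opt, char letter)
{
    const OPTIONS *self = (const OPTIONS *)opt;
    const char *p;

    for (p = self->switches; *p != '\0'; p++) {
        /* Casting to unsigned char keeps tolower() well defined. */
        if (tolower((unsigned char)*p) == tolower((unsigned char)letter))
            return 1;
    }
    return 0;
}
```

Only the comparison inside IsSwitch() carries the policy, so callers written against the public prototype would not need to change.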

Constructor And Destructor
So far I have ignored two very important functions in Listing 1:
Options CreateOptions( void );
void  DestroyOptions( Options );
This function pair, called the constructor and destructor functions, plays a very important role. The constructor function instantiates the object by dynamically allocating data for one copy of the structure which will hold the private data of the Options class.
Dynamic data allocation permits the reusability of the class. In fact, the combined requirements of data privacy (meaning that only functions within the class have knowledge of the structure of the object) and reusability make dynamic memory allocation by a class function the only way to instantiate an abstract data type.
The destructor function complements the constructor function. The constructor allocates the memory to instantiate the abstract data type, and the destructor deallocates it.
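A constructor and destructor pair along these lines might look like the sketch below. The prototypes of CreateOptions() and DestroyOptions() follow the article; the OPTIONS fields, the SetArguments() helper, and the return type of GetProgramName() are assumptions introduced here, since the article does not show how the object obtains argc and argv.

```c
#include <stdio.h>
#include <stdlib.h>

typedef void *Options;   /* public view: an opaque handle */

/* Hypothetical private layout, assumed for this sketch only. */
typedef struct {
    int    argc;
    char **argv;
} OPTIONS;

/* Constructor: allocate one zero-filled copy of the private structure. */
Options CreateOptions(void)
{
    OPTIONS *self = (OPTIONS *)calloc(1, sizeof(OPTIONS));

    if (self == NULL) {
        /* Low-level error handling: report and exit. */
        fprintf(stderr, "CreateOptions: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return self;
}

/* Hypothetical helper: hand the command line to the object. */
void SetArguments(Options opt, int argc, char *argv[])
{
    OPTIONS *self = (OPTIONS *)opt;

    self->argc = argc;
    self->argv = argv;
}

/* Return the name used to invoke the program, or "" if none is known. */
const char *GetProgramName(Options opt)
{
    const OPTIONS *self = (const OPTIONS *)opt;

    return (self->argc > 0) ? self->argv[0] : "";
}

/* Destructor: release the memory the constructor allocated. */
void DestroyOptions(Options opt)
{
    free(opt);
}
```

Because sizeof(OPTIONS) appears only inside the constructor, code outside the class never needs to know how large the object is or what it contains.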

Implementation And Coding Style
Now that the abstract data type is defined, I'll illustrate its use and implementation by developing a program (lp.c) to drive a printer. If I implement this program as a trivial call to a function PrintFiles():

/* lp.c */
void PrintFiles( int argc, char *argv []);
void main( int argc, char *argv [])
{
    PrintFiles( argc, argv);
}
then I can easily incorporate the PrintFiles() functionality in other programs. Because I've handled the command line options as an abstract data type, any program which uses the function PrintFiles() can use the class Options to manipulate its arguments. If all your programs use the class Options to parse the command line options, you can achieve a much more consistent user interface and extend this user interface from the command line to functions.
Two conventions have been followed in defining the public class functions. Each of these functions is named by a predicate followed by an object with the first letter of each word capitalized. In addition, the first parameter for every function except the CreateOptions() function is of type Options.
For any class named Object, the following should be included in the Public Class Definition of Object:

typedef void * Object;
Object CreateObject( void );
void DestroyObject( Object );
The use of the macro named this is consistent with the usage in C++. In Listing 3, the structure OPTIONS is only used in two places in the file. It is used in CreateOptions() to determine the amount of memory required to instantiate an Options, and is used in the macro definition of this. The this macro gets around the awkward implementation detail of dealing with the object as a (void *) in the high-level functions and as type (OPTIONS *) inside the class definition.
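One plausible form of such a macro is sketched below. It is an assumption, not the definition from Listing 3: it relies on every class function naming its first parameter opt, and the OPTIONS fields and the behavior of GetNextOption() (returning NULL when no options remain) are hypothetical. Note that this is an ordinary identifier in C but a keyword in C++, so the sketch is C only.

```c
#include <stddef.h>

typedef void *Options;   /* public view: an opaque handle */

/* Hypothetical private layout, assumed for this sketch only. */
typedef struct {
    char **options;   /* non-switch arguments, already separated out */
    int    count;     /* number of entries in options */
    int    next;      /* index of the next option to hand out */
} OPTIONS;

/* Assumes every class function names its first parameter opt. */
#define this ((OPTIONS *)opt)

/* Return the next option, or NULL when none remain. */
char *GetNextOption(Options opt)
{
    if (this->next >= this->count)
        return NULL;
    return this->options[this->next++];
}

#undef this   /* keep the macro from leaking past the class functions */
```

The cast appears once, in the macro, so the bodies of the class functions read as if opt had been declared with its real type.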
I suggest consistent error handling be placed at the lowest possible level. The ErrorExit() function in Listing 3 illustrates a relatively simple example of a general-purpose error handling function. Low-level error handling implies exiting from the program when the error is encountered. Error recovery schemes ordinarily require handling the errors within the high-level functions, producing much more complicated programs. One of your objectives should be to force complexity to the program's lower-level functions.
Note that none of the functions in the Options class return error values or set error flags. This approach is usually satisfactory so long as you can change the error handling to deal with different types of user interfaces, say a command line or windowing interface. If two styles of user interface exist in the same program, it may be necessary to maintain a global pointer to an error handling routine which is changed when the user interface changes.
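The global pointer mentioned above could be arranged roughly as follows. This is a sketch under stated assumptions: ErrorExit() is named in the article, but its signature here is a guess, and ErrorHandler, SetErrorHandler(), CommandLineError(), WindowError(), and LastWindowMessage are hypothetical names. The windowing handler only records the message, standing in for whatever a real windowing interface would display before terminating.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler type: receives the message to report. */
typedef void (*ErrorHandler)(const char *message);

/* Command line style: print the message and exit. */
void CommandLineError(const char *message)
{
    fprintf(stderr, "Error: %s\n", message);
    exit(EXIT_FAILURE);
}

/* Stand-in for a windowing interface: just remember the message. */
char LastWindowMessage[128] = "";

void WindowError(const char *message)
{
    strncpy(LastWindowMessage, message, sizeof LastWindowMessage - 1);
    LastWindowMessage[sizeof LastWindowMessage - 1] = '\0';
}

/* The global pointer, private to this file. */
static ErrorHandler CurrentHandler = CommandLineError;

/* Install a new handler (NULL restores the default); return the old one. */
ErrorHandler SetErrorHandler(ErrorHandler handler)
{
    ErrorHandler previous = CurrentHandler;

    CurrentHandler = (handler != NULL) ? handler : CommandLineError;
    return previous;
}

/* Class functions call this; the installed handler decides what happens. */
void ErrorExit(const char *message)
{
    CurrentHandler(message);
}
```

With this arrangement the class functions keep calling ErrorExit() unchanged, and the program switches handlers with SetErrorHandler() whenever the user interface changes.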

Summary
The major elements of this design approach are:

Top-down, functional design

Private Class Definition in one .c file

Public Class Definition in header file

typedef void * Object; abstract data type

this macro to simplify class functions

Predicate-Object function names including:

Object CreateObject( void ) and

void DestroyObject( Object )

Object as the first function parameter

Low-level error handling
The first four items are essential to the object-oriented design process. The remaining items are more a matter of style than substance.
Even though C++ offers much more sophistication in implementing object-oriented programs than C, it's clear that an abstract data type can be created and manipulated in C. By applying the basic principles of object-oriented design to their applications, C programmers can realize the advantages of the object-oriented approach, including: better support for top-down design by using abstract data types; more reusable function libraries (classes); and more maintainable functions (private class definitions).