C++ Interfaces for C-Language Libraries

Dr. Dobb's Journal August 1997

Living with the past

By Larry E. Baker, Jr.

Larry is a project manager with SABRE Decision Technologies in Dallas, Texas. He can be reached at leb@netcom.com.

C++ developers often have to use legacy C-language support libraries even though the styles don't mix, the paradigms conflict, and the resulting program is as unpleasant to look at as it is to code.

There are usually two alternatives: Grit your teeth and use the old library, or rewrite it in C++. It's a difficult decision, especially for management; the lure of continuing to use a known quantity (and especially one that works) is difficult to overcome.

Another approach is to develop a C++ "wrapper" for the underlying C-language API. In this article, I'll present a simple C-language hash-table library and the C++ template wrapper that adapted it to a C++ world.

Nearly New Ideas: Great Prices!

The problem of gracefully migrating C-language applications to C++ can slow the acceptance of C++ and object-oriented design techniques. The typical approach is to start using C++, but keep the old C code around as the foundation. C++ then degenerates to merely a better C. This is a frustrating -- and potentially destructive -- situation. The language can't be used for its intended purpose, and the compromises ultimately result in difficult-to-modify, difficult-to-maintain code.

Providing some kind of transitional interface for C++ allows older code to remain in use. The obvious benefit is that well-tested and stable code continues to be useful; a less obvious benefit is the fact that a commitment can be made to an object-oriented design without having to forfeit the investment in the existing application.

You won't have to look far to find examples of large-scale applications that use this approach of C and C++ API détente. Microsoft and Borland have developed C++ encapsulations of the Windows API -- MFC and OWL, respectively. These encapsulations share several common traits:

The underlying C-language library API is inviolate. It cannot be changed in any compatibility-breaking way. This allows existing C-language applications to continue to use the library without changes.
C-language API state variables are either carefully hidden behind a private interface, or projected up through the C++ interface.
The C++ API is largely consistent with the same basic abstractions developed by the lower-level API (a window is still a window).

While the scope of OWL and MFC is different from most application support libraries, the traits they share form guidelines for developing an application-specific interface for internal use.

Layering C++ Over C-Language Semantics

The process of building a C++ interface for a C-language library involves two subprocesses: grouping functions around common data objects and classifying common functions.

Start the process by looking for characteristics that help classify the functions in the library into objects and a class hierarchy. For example, a window that is described by a window handle (HWND in Windows) will group together functions that require a window handle as an argument.

Some functions will require an instance of an object in order to create an instance of another object. This kind of dependency will dictate the different kinds of constructors that the object will have in C++. The Windows Device Context (DC), for instance, cannot be created without a reference to either a printer, a bitmap, or a window.

In general, different kinds of handles correspond to classes. Dependencies between different kinds of handles will determine their creation protocols and constructors.

This is a good point at which to begin building a formal object model of the interface. Note that the model is of the interface, not an application. In this case, the interface is the application.

One of the most interesting side effects of this process is the generation of an explicit description of the protocol of calls that must be made to a C-language API. The C++ constructor/destructor protocol, coupled with an explicit object model, define an elegant specification for the syntax of the API. This is an area that's often sorely lacking in the documentation.

Encapsulating C-Language Interfaces

The most difficult problem in designing a C++ interface for a C library is overcoming non-object-oriented behavior. Although making a C++ layer brings to light all of the library's misbehavior, it also ends up hiding it. The result will be a happy face on poorly designed code. And so the question must be asked: Is it worth working with a piece of code that really should be discarded?

Modal Behavior

Modal behavior isn't exactly a sin, but it does complicate things. Modal behavior must be either preserved or hidden by the C++ API.

If a library hides a static variable somewhere, that variable can cause functions in the library to vary their behavior through subsequent calls. Designing a consistent object-oriented interface using the library requires using this interface, as we are not allowing incompatible changes to the C API.

The first option is to provide the thinnest possible encapsulation of the C API, grouping common functions around data they refer to. This eliminates the need to pass handles each time an API call is made. The modality of the interface is preserved by projecting it through the object-oriented interface. This is probably best described as the "it's not a bug, it's a feature" technique. But it may not be much of an improvement.

Another option is to dynamically switch between modes behind the scenes. While this presents a simpler interface to the outside world, it incurs hidden overhead that may be undesirable and difficult to optimize out of critical sections of code.

Mixed Memory-Allocation Models

C and C++ have distinct approaches to memory management that can coexist in the same program. For this coexistence to work, each must keep to its own domain: C++ objects should always be created and destroyed using new and delete (or automatic instantiation); C data should always be managed using malloc() and free().

There are two good reasons for this. First, new and delete are not necessarily implemented using malloc() and free(). Mixing them can produce undefined behavior.

Second, you must preserve the C++ destructor sequence. When you delete a C++ object, its destructor member function is automatically called. Destructors for aggregated member data objects are also called automatically. It is thus extremely important to respect the call to a C++ object's destructor, so this sequence can be carried out correctly.

If a C library contains references to memory allocated outside of the API (in the form of a pointer), take care that the library doesn't apply free() to it. For C++ objects, you need to respect destructor protocol. This may require traversing the data structure(s) contained in the library with a function that explicitly calls the object's destructor. (This is the approach taken with the UTHASH library, which will be described later.)

Extensions (not modifications) can be made to the C-language API supporting C++ abstractions. In the UTHASH example, we could have built iterator functions into the C-language API. Since we were pressed for time, we took a simpler, but slightly less efficient way out. We used the existing functions in the library to apply an appropriate C++ function to call each object's destructor.

The UTHASH C Module

UTHASH started out as the smallest, simplest replacement for the UNIX hash routines that we could come up with in a short period of time. The code lives on in several large C-based applications, so the C-language API must continue to be supported. (The complete source files for UTHASH.H and UTHASH.CPP are available electronically; see "Availability," page 3.)

The UTHASH API is simple, clean, and reasonably object oriented. It is hardly a robust collection of all object-oriented design techniques, but it respects the concepts of encapsulation and object identity.

UTHASH implements a simple hash table -- in this case, it implements an aggregation of a hash function, a comparison function, a collection of buckets, and a linked list of objects associated with each bucket. Listing One resents an overview of the UTHASH C-language API, while Listing Two presents its data structures.

The hash function, supplied by the user at the time the hash table is created, returns an integer computed from state information in the object being hashed. The only requirement is that the integer always be the same for a given instance of an object, and that the integer not change as long as the object is contained within the hash table.

It's worth noting that the UTHASH utilities were designed specifically to support a C-language application, without regard to a future C++ encapsulation. A little effort toward good design practices in the beginning paid off handsomely.

The UTHASH C++ Template Interface

The Htable<t> class provides a template-based interface to the C-language routines; see Listing Three. The interface will ensure strong typing and type-safe linking.

In the search() and deleterec() member functions and in the destructor, it was necessary to emulate C++ destruction semantics. Extra steps were taken to explicitly remove the object instance from the hash table and destroy it with a call to its destructor. In the case of the destructor for the Htable<t> itself, this involved using a special member function that could traverse the table, applying explicit calls to each object instance as it went along.

The Htable<t> template is probably a better example of projecting the underlying C-language API forward in an object-oriented way, than a pure hash table, as might be provided by a commercial class library.

Listing One

typedef struct _HT_LIST {       // list of pointers  struct _HT_LIST *next;
  void            *item;
} HT_LIST;
typedef struct _HASH {      // array of lists, hash/compare functions
  int   size;
  HT_LIST         **buckets;
  int             (*hash)(void *);
  int             (*compare)(void *, void *);
} HASHTABLE;

Back to Article

Listing Two

HASHTABLE *htcreate(                    /* create a new hash table */
  int size,                            /* # of buckets in table */
  int (*hash_function)(void*),         /* user hash function */
  int (*search_function)(void*,void*)  /* user search function */
);
int
htdestroy(                      /* destroy hash table */
  HASHTABLE * hash_table,              /* existing hash table */
  int   flags                          /* HT_FREE_REC|0 */
);
int
htinsert(                   /* insert a new entry */
  HASHTABLE   * hash_table,            /* existing hash table */
  void    * record                     /* record ptr to insert */
);
int
htdeleterec(                     /* delete a table entry */
  HASHTABLE * hash_table,              /* existing hash table */
  void    * record,                    /* record or template */
  int   flags                     /* HT_FREE_REC|0 */
);
int
htsearch(                        /* search and/or remove */
  HASHTABLE * hash_table,              /* existing hash table */
  void    * templ,                     /* search template */
  void    ** record,                   /* &ptr for record found */
  int   flags                          /* HT_FREE_REC|HT_REPLACE|HT_DELETE|0 */
);
void
htstats(                        /* print out hash tbl stats */
  HASHTABLE * hash_table               /* existing hash table */
);
/* for htwalk, note that the function takes two arguments.  The    */
/* first is a pointer to the object found in the hash table; the   */
/* second is an argument passed in the void * arg pointer.         */
/* this is done so an argument can be passed "through" the ht walk */
/* function to the user-supplied "func" function.                  */

void
htwalk(                         /* walk through the table */
  HASHTABLE * hash_table,              /* existing hash table */
  void  (*func)(void *, void*),        /* function to apply */
  void  * arg                          /* user argument to function */
);

Back to Article

Listing Three

template <class T> class HTable{
  public:
     HTable(
        int size,
        int destroyFlags,
        int (*hashFunction)(T*),
        int (*compareFunction)(T*, T*)
     );
     HTable(HASHTABLE * ht);
     ~HTable();

     int insert(T& item);
     int deleterec(T& itemTemplate, int flags);
     int search(T& itemTemplate, T ** foundItem, int flags);
     void stats();
     void walk(void (*walkFunction)(T&, void*), void * walkArg);
     int rc;
  protected:
     HASHTABLE * ht;
     int dFlags;
     static void HTable::walkDestructor( T * pT, void * arg);
};

Back to Article

DDJ