January 1993/Designing an Extensible API in C

Portability

Designing an Extensible API in C

Charles Mirho

Charles Mirho is a consultant in New Jersey, specializing in Multimedia. He holds a Masters Degree in Computer Engineering from Rutgers University. He can be reached on CompuServe at: 70563,2671.

Definition of an Extensible API
An application program interface is the set of function calls that a software library exports to applications. These functions, along with their parameters, are usually prototyped in a header, or include file. In addition to prototypes, the header file also contains definitions for structures used by the functions, and #defines for flags and return values. All these components of the header file make up the complete definition of the API.
An extensible API can accommodate growth in the library without requiring changes to existing applications (beyond a possible recompile). Listing 1 contains a simple, extensible API.
The function in Listing 1 is useful in GUI APIs. It defines a region of the display where a mouse click can be trapped. A typical call to the function might look something like
#define ID_HELP_BUTTON 10
int vertices[] = {10,20, 50,20, 50,40, 10,40};
REGION HelpButton = {sizeof(vertices)/sizeof(int)/2,
                  &vertices[0]};
.
.
.
DefineHotSpot (ID_HELP_BUTTON, &HelpButton);
The value of ID_HELP_BUTTON varies depending on the application. ID_HELP_BUTTON also provides a unique ID for the region. The second parameter, &HelpButton, defines the boundary of the region as a set of vertices. Notice that the REGION structure is defined in API.H, the header file for the API. REGION contains a field for the number of vertices in the region, and a pointer to a list of vertices (coordinate pairs). Early versions of the library might only support rectangular hot spots (four vertices), but the API is extensible because more complex shapes can be used in the future without altering the prototype for DefineHotSpot. Compare Listing 1 to a non-extensible version of the same API in Listing 2.
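Listing 1 is not reproduced in this text. As a rough sketch only, a header and library stub consistent with the description above might look like the following. The field names, the const qualifier, the three-vertex minimum, and the return convention (0 for success, -1 for a bad argument) are all assumptions, not the author's code.

```c
#include <stddef.h>

/* api.h (sketch): field names are assumed from the description in the text */
typedef struct tagREGION {
    int  vertex_count;  /* number of vertices in the region */
    int *vertices;      /* x,y pairs: 2 * vertex_count ints */
} REGION;

/* Hypothetical stub: returns 0 on success, -1 on a bad argument. */
int DefineHotSpot(int id, const REGION *pRegion)
{
    if (pRegion == NULL || pRegion->vertices == NULL)
        return -1;
    if (pRegion->vertex_count < 3)   /* fewer than 3 points cannot enclose an area */
        return -1;
    (void)id;  /* a real library would record the region under this ID */
    return 0;
}
```

Because the vertex count travels inside the structure, a later library version could accept more complex polygons without any change to this prototype.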
Since the region is always rectangular, only two vertices are required, specifying the upper-left and lower-right corners of the rectangle. While this function may seem cleaner and more intuitive at first glance, it is extremely confining. If future versions of the library must support regions with more than four vertices, then you must choose one of two undesirable alternatives:

You can create an extensible version of DefineHotSpot. Now developers must learn two functions, since the old version of DefineHotSpot must be retained for compatibility. The result is a cluttered API.

You must modify existing applications that use DefineHotSpot to support the new function API.

Structured Parameters
The extensible version of DefineHotSpot (Listing 1) uses a structured parameter, while the non-extensible version (Listing 2) passes all the parameters through the function prototype. Using structured parameters is one of the best ways to design an API that is more extensible. As the capabilities of the library expand, fields can be added to the structures without changing the function prototype. A good application of this technique is in functions that return information, such as
int GetSystemInfo (SYSTEM_INFO * sinfo);
This function is used to get information about devices in the computer system.
The SYSTEM_INFO structure:
typedef struct tagSYSTEM_INFO
{
 int num_displays;   /* number of attached displays */
 int num_printers;   /* number of attached printers */
 int num_drives;  /* number of attached disk drives */
} SYSTEM_INFO;
has fields to hold the important properties of the system. (I kept it short for clarity.)
Later versions of the API can expand the structure to accommodate new additions to the system, such as tape drives, as in:
typedef struct tagSYSTEM_INFO
{
 int num_displays;   /* number of attached displays */
 int num_printers;   /* number of attached printers */
 int num_drives;  /* number of attached disk drives */
 int num_tapes;   /* number of attached tape drives */
} SYSTEM_INFO;
Because the features of the system are passed through the API in the form of a structure, rather than as separate parameters, it is easy to add a field for tape drives.

The Size Field
You can add even more flexibility to structured parameters with the size field. The size field holds the size, in bytes, of the structure containing it. When using a size field, you must make it the first field in the structure, as in

typedef struct tagSYSTEM_INFO
{
 int size;           /* size of this structure */
 int num_displays;   /* number of attached displays */
 int num_printers;   /* number of attached printers */
 int num_drives;  /* number of attached disk drives */
 int num_tapes;   /* number of attached tape drives */
} SYSTEM_INFO;
The size field makes it possible for existing applications to use newer versions of the library without performing a recompile. This is especially useful on platforms that use dynamic linking, because dynamic link libraries are often packaged separately and sold directly to customers. Application developers often have no control over which version of the library customers are using.
To see how the size field can save a recompile, look at the SYSTEM_INFO structure again. When the num_tapes field is added, the size of the structure changes. It would normally be necessary to recompile applications that use the structure so that static and dynamic allocations reserve the correct amount of storage. Otherwise, the newer library would write too much data into the structure parameter, corrupting memory. However, if the first field of the structure contains the structure's size, and you are careful to add fields only to the end of the structure, then the structure can be extended without the need to recompile existing applications. The library simply examines the size field to determine which version of the structure the application is passing. If the application is passing the older structure, the size will be smaller, and the library knows not to fill the extended fields. Listing 3 contains an example.
In Listing 3, the library keeps the declaration of the old SYSTEM_INFO structure as oSYSTEM_INFO. The oSYSTEM_INFO structure does not appear in the header file that applications use.
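Listing 3 is not reproduced in this text. A minimal sketch of the size check it describes might look like the following. The return convention and the placeholder device counts are assumptions made for illustration.

```c
#include <stddef.h>

/* Old layout, kept private to the library (not in the public header). */
typedef struct tagoSYSTEM_INFO {
    int size;
    int num_displays;
    int num_printers;
    int num_drives;
} oSYSTEM_INFO;

/* Current public layout: new fields are appended at the end only. */
typedef struct tagSYSTEM_INFO {
    int size;
    int num_displays;
    int num_printers;
    int num_drives;
    int num_tapes;
} SYSTEM_INFO;

/* Assumed convention: returns 0 on success, -1 if the size is not recognized. */
int GetSystemInfo(SYSTEM_INFO *sinfo)
{
    if (sinfo == NULL || sinfo->size < (int)sizeof(oSYSTEM_INFO))
        return -1;

    /* Fields common to every version; the values are placeholders. */
    sinfo->num_displays = 1;
    sinfo->num_printers = 1;
    sinfo->num_drives   = 2;

    /* Write the extended field only if the caller's structure has room for it. */
    if (sinfo->size >= (int)sizeof(SYSTEM_INFO))
        sinfo->num_tapes = 1;

    return 0;
}
```

An application built against the old header passes the smaller size, so the library never writes past the end of the storage that application reserved.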

Interpretation Flag
Suppose the GetSystemInfo function is extended in the future to report details about particular devices in the system. You can use the same function to get the number of displays in the system, and details about the displays the system is using, as in:

typedef struct tagDISPLAY_INFO
{
 int size;            /* size of this structure */
 int displayno;       /* display to get info on */
 int xpixels;         /* display width in pixels */
 int ypixels;         /* display height in pixels */
 int bits_per_pixel;  /* bits per pixel */
 int planes;          /* video planes */
} DISPLAY_INFO;
You can ensure that the GetSystemInfo function will support this and any other device-specific structures that come along by changing the original prototype to

int GetSystemInfo(int flag, unsigned char *info);
GetSystemInfo now accepts a byte-aligned pointer instead of a pointer to a specific structure. The function interprets the pointer differently, depending on the value of the flag parameter. You call the function for general system information with

/* API.H */
#define GET_SYSTEM_INFO 1
#define GET_DISPLAY_INFO 2
.
.
/* application */
SYSTEM_INFO sinfo = { sizeof(SYSTEM_INFO), 0, 0, 0 };
.
.
GetSystemInfo (GET_SYSTEM_INFO, (unsigned char *) &sinfo);
For details on display devices, you call the function with
/* application */
DISPLAY_INFO dinfo = { sizeof (DISPLAY_INFO), 1, 0, 0, 0, 0};
.
.
GetSystemInfo (GET_DISPLAY_INFO, (unsigned char *) &dinfo);
Inside, the GetSystemInfo function would look something like Listing 4.
Different structures describing entirely different things evolve differently, and so it is entirely possible for them to be the same size by coincidence. When different structures (as opposed to different versions of the same structure) are passed through the API as in Listing 4, the size field alone is not sufficient. The interpretation flag resolves any ambiguity.
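Listing 4 is not reproduced in this text. The dispatch it describes could be sketched roughly as follows. The return convention and the reported values (one display, 640 by 480 and so on) are placeholders assumed for illustration, and handling of older structure versions is left out to keep the sketch short.

```c
#include <stddef.h>

#define GET_SYSTEM_INFO  1
#define GET_DISPLAY_INFO 2

typedef struct tagSYSTEM_INFO {
    int size, num_displays, num_printers, num_drives, num_tapes;
} SYSTEM_INFO;

typedef struct tagDISPLAY_INFO {
    int size, displayno, xpixels, ypixels, bits_per_pixel, planes;
} DISPLAY_INFO;

/* Assumed convention: returns 0 on success, -1 on an unknown flag or bad size. */
int GetSystemInfo(int flag, unsigned char *info)
{
    if (info == NULL)
        return -1;

    switch (flag) {            /* the flag selects how to interpret the pointer */
    case GET_SYSTEM_INFO: {
        SYSTEM_INFO *s = (SYSTEM_INFO *)info;
        if (s->size < (int)sizeof(SYSTEM_INFO))
            return -1;
        s->num_displays = 1;   /* placeholder values */
        s->num_printers = 1;
        s->num_drives   = 2;
        s->num_tapes    = 0;
        return 0;
    }
    case GET_DISPLAY_INFO: {
        DISPLAY_INFO *d = (DISPLAY_INFO *)info;
        if (d->size < (int)sizeof(DISPLAY_INFO))
            return -1;
        d->xpixels        = 640;   /* placeholder values */
        d->ypixels        = 480;
        d->bits_per_pixel = 8;
        d->planes         = 1;
        return 0;
    }
    default:
        return -1;
    }
}
```

The size field still guards each branch, but only after the flag has established which structure the pointer refers to.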

Variable-Sized Structures
Variable-sized structures typically have a fixed-sized header portion and a variable-sized data portion. The header usually defines or limits the data in some way. The header and data are stored contiguously in memory, so that the data can be referenced as elements of the structure. This often makes the C code that manipulates the data easier to write and read. A variable-sized structure is also extensible because the data portion can be any size.
The REGION structure in Listing 1 can be made variable-length by changing its definition to
/* api.h */
typedef struct tagREGION {
 int vertex_count;
 int vertices[1];
} REGION;
int DefineHotSpot (int id, REGION *pRegion);
Suppose you want the user to decide the shape of the region in which to trap mouse events. The number and values of the vertices in the region are not known at compile time. You first prompt the user for the number of vertices, then allocate a region of the proper size, as in
/* application */
REGION *pRegion;
int cnt, i;

printf ("Enter the number of vertices:\n");
scanf ("%d", &cnt);
pRegion = (REGION *) malloc(sizeof (REGION) + (2*cnt-1)*sizeof(int));
The pointer pRegion points to a region large enough to hold cnt vertices. In the malloc statement, sizeof(REGION) allocates enough space for the header (the field vertex_count) and one vertex, since the typedef for REGION contains one vertex. Since each vertex is a pair of integers, cnt vertices requires 2*cnt integers. You thus allocate space for 2*cnt-1 integers in addition to the space already allocated for the base structure. You then cast the return value of malloc to a pointer to a REGION structure. From then on, you can refer to the vertices as members of the structure. A loop is used to read the vertex pairs, as in
pRegion->vertex_count = cnt;
for (i=0;i<cnt;i++)
{
 /* %d skips leading whitespace, so no flush of stdin is needed here */
 printf ("Enter X,Y of vertex %d\n", i+1);
 scanf ("%d , %d", &pRegion->vertices [2*i],
        &pRegion->vertices [2*i+1]);
}
The region with all its vertices is now passed cleanly to the DefineHotSpot function
DefineHotSpot (ID_HELP_BUTTON, pRegion);
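The allocation arithmetic above is easy to get wrong, so it may be worth wrapping in a helper. The following is a sketch under assumptions: AllocRegion is a hypothetical name, not part of the article's API, and the range check is an addition. It keeps the article's vertices[1] declaration. C99 later standardized the same idea as a flexible array member, declared as int vertices[];, in which case the "-1" adjustment disappears.

```c
#include <stdlib.h>
#include <stdint.h>

typedef struct tagREGION {
    int vertex_count;
    int vertices[1];   /* first of 2 * vertex_count ints */
} REGION;

/* Hypothetical helper: allocate a REGION with room for cnt vertices.
   Returns NULL on a bad count or allocation failure; release with free(). */
REGION *AllocRegion(int cnt)
{
    REGION *p;

    /* Reject counts that are non-positive or would overflow the size math. */
    if (cnt < 1 ||
        (size_t)cnt > (SIZE_MAX - sizeof(REGION)) / (2 * sizeof(int)))
        return NULL;

    /* The base structure already holds one int, so add 2*cnt - 1 more. */
    p = (REGION *)malloc(sizeof(REGION) +
                         (2 * (size_t)cnt - 1) * sizeof(int));
    if (p != NULL)
        p->vertex_count = cnt;
    return p;
}
```

The caller fills pRegion->vertices exactly as in the loop above and releases the block with free when the region is no longer needed.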

Templates
Sometimes the arguments to a function are so unpredictable that even structured parameters are limiting. The classic example of this is the Standard C library function printf. The prototype for printf declares a single parameter, a string that acts as a template for optional arguments. The printf function scans the template for clues to the number and size of the optional arguments. This is an extremely powerful technique, since it allows the function to accept any number of arguments of any size, in any order.
Consider, for example, a function that draws an arbitrary set of line segments. Each segment has its own attribute for width and color. Segments may or may not be connected. We create a simple language to tell the function how to draw one or more segments. The language describes the motion and attributes of an imaginary pen which moves across the drawing surface. Table 1 describes the Simple Drawing Language. Figure 1 shows sample output from the Simple Drawing Language. Figure 2 shows pen styles. Listing 5 contains an example of the Draw function using the Simple Drawing Language.
Extending the API is as simple as extending the drawing language. For example, to support different line styles, the language is extended to include an s (for style) followed by a number 1-5.
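Table 1 and Listing 5 are not reproduced in this text, so the following is only an illustration of the template technique, not the author's language. Apart from s for style, the command letters (m, l, w, c), the x,y syntax, the PEN structure, and the extra pen parameter are all invented here. Instead of rendering, the sketch records pen state and counts the segments it would draw.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical pen state; a real library would also hold a drawing surface. */
typedef struct {
    int x, y;        /* current pen position */
    int width;       /* current line width */
    int color;       /* current color index */
    int style;       /* current line style, 1-5 */
    int segments;    /* number of segments "drawn" so far */
} PEN;

/* Assumed commands: m x,y (move)  l x,y (line to)
                     w n (width)   c n (color)   s n (style 1-5)
   Returns 0 on success, -1 on a malformed template. */
int Draw(const char *tmpl, PEN *pen)
{
    const char *p = tmpl;

    while (*p != '\0') {
        char cmd;
        int a, b, used = 0;

        if (isspace((unsigned char)*p)) { p++; continue; }
        cmd = *p++;

        switch (cmd) {
        case 'm': case 'l':
            if (sscanf(p, "%d ,%d%n", &a, &b, &used) != 2)
                return -1;
            if (cmd == 'l')
                pen->segments++;      /* a real library would render here */
            pen->x = a;
            pen->y = b;
            break;
        case 'w': case 'c': case 's':
            if (sscanf(p, "%d%n", &a, &used) != 1)
                return -1;
            if (cmd == 'w')      pen->width = a;
            else if (cmd == 'c') pen->color = a;
            else if (a >= 1 && a <= 5) pen->style = a;
            else return -1;
            break;
        default:
            return -1;                /* unknown command letter */
        }
        p += used;                    /* skip the arguments just consumed */
    }
    return 0;
}
```

Under these assumptions, adding the style command costs one extra case label, and the Draw prototype does not change.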

Callback Functions
Callback functions provide a useful way for developers to enter the API. They provide developers with a means of extending the API without altering it. Suppose, for example, that an API included a function for copying one file to another, with optional compression, such as

int zcopy (char *szSourceFile, char *szDestFile, int (*fnCompress)());
This function takes two files as arguments. The first file is the source to copy from and the second is the destination to copy to. The third argument specifies an optional callback function to perform compression, so that the destination file takes up less space than the source. The callback function, if used, is provided by the developer who uses the library. The zcopy function calls the callback function repeatedly during the file copy. Developers are free to use any compression algorithm they desire. This makes the API extensible, since better compression algorithms can be developed and inserted without altering the API. Not only that, developers who use the library have a means of differentiating their products by offering better compression. The callback function resembles

int fnCompress (unsigned char *pData, int *iSize)
{
 /* code to perform compression here */
}
The zcopy function passes to the function a buffer, pData, which contains the raw data from the source file and the size of the buffer. Function fnCompress is expected to compress the data in pData (possibly using intermediate buffers) and return the buffer and new size to the zcopy function.
This example is slightly oversimplified. A commercial version of zcopy would require additional (possibly structured) parameters to specify things like the compression block size. This example is meant only to illustrate the utility of callback functions in extending the API.
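To make the calling pattern concrete, here is a rough sketch of how zcopy might drive the callback. Several details are assumptions rather than the author's design: the fixed 512-byte block size, the COMPRESS_FN typedef, the rule that the callback works in place and never grows the data, and the return convention (0 for success, -1 for failure). SqueezeSpaces is a toy, lossy stand-in for a real compressor.

```c
#include <stdio.h>

#define ZCOPY_BLOCK 512   /* assumed block size */

typedef int (*COMPRESS_FN)(unsigned char *pData, int *iSize);

/* Copies source to destination, passing each block through the optional
   callback. Assumed convention: returns 0 on success, -1 on any failure. */
int zcopy(char *szSourceFile, char *szDestFile, COMPRESS_FN fnCompress)
{
    unsigned char buf[ZCOPY_BLOCK];
    FILE *in, *out;
    size_t got;
    int rc = 0;

    in = fopen(szSourceFile, "rb");
    if (in == NULL)
        return -1;
    out = fopen(szDestFile, "wb");
    if (out == NULL) { fclose(in); return -1; }

    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        int size = (int)got;
        /* A NULL callback means a plain copy. */
        if (fnCompress != NULL && fnCompress(buf, &size) != 0) { rc = -1; break; }
        if (size < 0 || size > ZCOPY_BLOCK) { rc = -1; break; }
        if (fwrite(buf, 1, (size_t)size, out) != (size_t)size) { rc = -1; break; }
    }
    if (ferror(in))
        rc = -1;
    fclose(in);
    if (fclose(out) != 0)
        rc = -1;
    return rc;
}

/* Toy callback: collapses runs of spaces in place. Lossy, illustration only. */
int SqueezeSpaces(unsigned char *pData, int *iSize)
{
    int in, out = 0;
    for (in = 0; in < *iSize; in++) {
        if (pData[in] == ' ' && out > 0 && pData[out - 1] == ' ')
            continue;
        pData[out++] = pData[in];
    }
    *iSize = out;
    return 0;
}
```

An application would call zcopy with SqueezeSpaces, or with any other function matching COMPRESS_FN, as the third argument. Swapping in a different algorithm requires no change to zcopy itself.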

Conclusion
Following simple guidelines when designing an API can save headaches down the road as the API expands to accommodate new features. Structured parameters, variable-sized structures, size fields, interpretation flags, templates, and callback functions are some of the ways to prepare an API for future growth.