May 1999/A Wrapper Class for Dynamically Linked Plug-Ins

Cross-Platform Development

A Wrapper Class for Dynamically Linked Plug-Ins

Eric Roe

Creating plug-ins for multiple platforms can be tricky. Here are some techniques that make writing plug-ins almost as easy as using them.

Introduction

Many applications today are written to be modifiable or extensible by running code that has been compiled by third parties. This code, often called a "plug-in" or "add-in," typically takes the form of a DLL (dynamically linked library) that can be mapped at run time into the application's address space. Most modern operating systems, including the variants of Windows, Solaris, and Linux, provide some capability to do such dynamic linking.

This article presents a simple class that wraps the dynamic linking facilities under Win32, Solaris, and Linux. Using the presented class as a base, developers can create derived classes that fully encapsulate the functions contained in a DLL. In addition, by writing a set of abstract classes and using them as interfaces, it's possible to mimic some of the core functionality of COM (Microsoft's Component Object Model).

The sources accompanying this article were tested on Windows NT using Microsoft Visual C++ v5.0. On Linux, kernel version 2.0.34, I used the egcs compiler version 1.1.1; and on Solaris I used the GNU C++ compiler version 2.8.2. Note that older versions of the GNU compiler (in particular v2.7.x) will fail to build the included source files because they lack support for namespaces and ship incomplete STL implementations. I suspect that the presented code can easily be ported to other versions of Unix.

Overview of Dynamic Linking

There are two ways a program can call functions in a DLL (or "shared object," as Unix programmers might call it). Probably the most common use of dynamic libraries is through build-time linking. In this case, libraries on which the program depends are specified at build time. The dependency information is stored within the executable and when the program is run, the system loader loads all the required libraries and resolves function addresses between the executable and the libraries. Win32 developers will recognize this as "implicit" linking.

For the most part, this is an acceptable way to connect your program to the services provided by a specific library without incorporating the library code directly in your program (which is what occurs when you link with a static library). There are, however, a few drawbacks with implicit linking:

Your program will fail to run if it depends on any missing libraries.

Implicit linking doesn't provide a way for users to choose which libraries they wish to use.

The application cannot switch between libraries depending upon different situations.

This leads to the second method for using functions contained in a dynamic library: run-time linking. In this case, the DLL is loaded programmatically by the application. Once the library is loaded, the application can query the library to obtain the addresses to functions contained therein. Finally, when the application is finished using the services provided by the library, the library is programmatically unloaded. Win32 developers will recognize this as "explicit" linking. Run-time linking requires more work on the part of the developer, but the reward can be a significantly more flexible program.
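The three steps of run-time linking (load, query, unload) can be sketched with the raw Solaris/Linux calls. This is a minimal illustration, not code from the accompanying listings: the helper name call_binary_op and the binary_op typedef are hypothetical, and the sketch assumes a POSIX system (older systems may need to link with -ldl).

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <cstdio>

// Assumed signature of the function we want to call (hypothetical example).
typedef int (*binary_op)(int, int);

// Loads libPath, looks up procName, calls it with (x, y), then unloads.
// Returns true and stores the result in *result only if every step worked.
bool call_binary_op(const char* libPath, const char* procName,
                    int x, int y, int* result)
{
    // Step 1: load the library programmatically.
    void* handle = dlopen(libPath, RTLD_LAZY);
    if (!handle) {
        const char* msg = dlerror();
        std::fprintf(stderr, "load failed: %s\n", msg ? msg : "unknown error");
        return false;
    }

    // Step 2: query the library for the address of a function.
    void* sym = dlsym(handle, procName);
    bool found = (sym != nullptr);
    if (found) {
        binary_op f = reinterpret_cast<binary_op>(sym);
        *result = f(x, y);
    }

    // Step 3: unload the library when finished with it.
    dlclose(handle);
    return found;
}
```

Under Win32 the same three steps would map to LoadLibrary, GetProcAddress, and FreeLibrary.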

The DynamicLibrary class presented below encapsulates the loading, address resolution, and unloading required to perform run-time linking. It is the starting point for making plug-ins appear like objects to the calling application.

The DynamicLibrary Class

Although DynamicLibrary can be instantiated and used directly, it was really meant to serve as a base class. DynamicLibrary contains all the code needed to load, query, and free DLLs, while classes derived from DynamicLibrary wrap the functions exported by the plug-in. Listings 1 (DynamicLibrary.h) and 2 (DynamicLibrary.cpp) present the DynamicLibrary class.

Loading the Library

There are two ways to load a DLL with DynamicLibrary. The first method simply loads the library as part of the construction of the DynamicLibrary object. In this case the programmer uses the constructor
DynamicLibrary(const char* libName)
passing the library name in the libName parameter. You should specify a full or relative path to the dynamic library in libName. Without path information, both Windows and Solaris/Linux will attempt to locate the library using the standard library search path. If your plug-in is not located in the library search path — and it's quite probable that it won't be — the library won't get loaded.

Because the constructor does not throw an exception if it fails to load the dynamic library, you should call DynamicLibrary::IsLoaded to see whether the load was successful. If you want to separate construction from loading, you can use the default constructor and then call the Load(const char* libName) method, which returns a bool value indicating whether the library was loaded successfully.

Under Solaris and Linux, Load calls dlopen, which returns a non-null void pointer if the library was successfully loaded. The dlopen function takes two parameters. The first is the library file name (and optional path); the second is a mode flag. I use the RTLD_LAZY flag, which defers resolving each function reference until that function is first called. This results in faster loading of the dynamic library.

Under Win32, Load uses the Win32 LoadLibrary call, which returns a module handle if the library was successfully loaded or NULL otherwise. Because DynamicLibrary.h uses the Win32 HANDLE data type, be sure to place the #includes that define Windows-specific data types before #include "DynamicLibrary.h", or you'll end up with an undefined type error.

Getting the Function

The DynamicLibrary class makes available two (four under Win32) methods for obtaining a function pointer from the DLL. GetProcAddr(const char* procName) returns a pointer to the function named procName in the library, or NULL if the function wasn't found. As a concession to Win32 developers, I've included a version of GetProcAddr that takes an ordinal value. Using ordinals instead of strings makes lookup slightly faster; there is no similar functionality under Solaris or Linux.

Caching

If you are continually calling the same function, there is a cached lookup available: GetProcAddrCached(const char *procName, int procId). GetProcAddrCached uses the DynamicLibrary's cache array, which is an STL vector of cache_info objects.
class cache_info
{
public:
    DLPROC procAddr;
    bool testFlag;
    // constructor declaration
};
The cache_info class is a private, nested class of DynamicLibrary. It contains two data members: procAddr, which contains the address of the function obtained when querying a dynamic library; and testFlag, an indicator of whether an attempt has already been made to obtain a function pointer from the library.

GetProcAddrCached begins by ensuring that the cache vector contains at least procId + 1 cache_info objects. Adding one to procId accounts for the fact that the vector is zero-based and procId will be used as an index into the vector.

Next, a reference to the cache_info object at index procId is placed in ci. If ci.testFlag is true, then a function lookup was previously performed and the result of that lookup is in ci.procAddr. If ci.testFlag is false, then this is the first time GetProcAddrCached has been called with this particular value for procId. In this case, no function address is cached in ci.procAddr, so a call to GetProcAddr is made to look up the function by procName. The resulting function pointer, or NULL if the GetProcAddr call was unsuccessful, is stored in ci.procAddr and the test flag is set to true. Finally, GetProcAddrCached returns the value stored in ci.procAddr.
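The lookup logic just described can be sketched in isolation. The following is an illustration under stated assumptions rather than the code from Listing 2: the ProcCache class, the lookup_fn typedef, the definition of DLPROC as a generic function pointer, and the counting_lookup stand-in (which replaces a real GetProcAddr so the sketch runs without a DLL) are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

typedef void (*DLPROC)();                  // generic function pointer (assumed definition)
typedef DLPROC (*lookup_fn)(const char*);  // stand-in for the real GetProcAddr

class ProcCache
{
    struct cache_info
    {
        DLPROC procAddr;   // address found by the lookup, or null
        bool   testFlag;   // true once a lookup has been attempted
        cache_info() : procAddr(nullptr), testFlag(false) {}
    };

public:
    explicit ProcCache(lookup_fn lookup) : lookup_(lookup) {}

    DLPROC GetCached(const char* procName, int procId)
    {
        if (procId < 0)
            return nullptr;                // defensive check, not described in the article
        std::size_t idx = static_cast<std::size_t>(procId);

        // Grow the vector so that index procId is valid.
        if (cache_.size() < idx + 1)
            cache_.resize(idx + 1);

        cache_info& ci = cache_[idx];
        if (!ci.testFlag) {
            // First request for this id: do the slow lookup once,
            // and remember the answer even if it is null.
            ci.procAddr = lookup_(procName);
            ci.testFlag = true;
        }
        return ci.procAddr;
    }

private:
    lookup_fn lookup_;
    std::vector<cache_info> cache_;
};

// Demonstration-only lookup that counts how often it is called.
static void dummy_proc() {}
int g_lookups = 0;
DLPROC counting_lookup(const char*) { ++g_lookups; return &dummy_proc; }
```

Because a failed lookup is cached as well, a function that is missing from the library costs only one slow lookup, not one per call.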

The key to successfully using the GetProcAddrCached method is the selection of the procId value. Each function retrieved from the DLL must have its own unique procId, and you must always use that same procId when referring to that function. Because procId is an index into the cache vector, small consecutive values starting at zero keep the vector compact.

Admittedly, the caching mechanism is simple; but it's also effective. Timing test code in the sample driver program (PITest.cpp, available online — see p. 3 for downloading instructions) confirms this. On an Intel Pentium Pro 200MHz running Windows NT, ten million calls were made using both cached and non-cached lookups. The cached calls ran in about 4.7 seconds whereas the non-cached calls took about 44.7 seconds. Linux on the same machine yielded times of 7.6 seconds cached and 24.8 seconds non-cached. Solaris on a 200MHz UltraSPARC I produced times of 10.1 seconds cached versus 32.9 non-cached.

As with GetProcAddr, there is also a Win32 version of GetProcAddrCached that uses an ordinal value; however, in this version, the ordinal value is used not only for function lookup but also as an index into the cache vector.

Clean-up

When you are finished using the functions in the dynamic library, call the Unload method to free the library and to delete any cache information that the DynamicLibrary object maintains. In the spirit of good object-oriented programming, the DynamicLibrary class destructor ensures that Unload is called when a DynamicLibrary object is deleted.
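The load, test, and clean-up life cycle can be illustrated with a small stand-alone class. This is a sketch only, not the DynamicLibrary class from Listings 1 and 2: the name MiniLibrary is hypothetical, the handle is stored as a plain void pointer on both platforms, the choice to unload any previous library inside Load is an assumption, and caching is omitted.

```cpp
#ifdef _WIN32
#  include <windows.h>   // LoadLibraryA, FreeLibrary
#else
#  include <dlfcn.h>     // dlopen, dlclose
#endif

class MiniLibrary
{
public:
    MiniLibrary() : handle_(nullptr) {}
    explicit MiniLibrary(const char* libName) : handle_(nullptr) { Load(libName); }

    // The destructor guarantees the library is released.
    ~MiniLibrary() { Unload(); }

    // Copying would lead to a double unload, so it is disabled.
    MiniLibrary(const MiniLibrary&) = delete;
    MiniLibrary& operator=(const MiniLibrary&) = delete;

    bool Load(const char* libName)
    {
        Unload();   // assumption: loading replaces any library already held
#ifdef _WIN32
        handle_ = static_cast<void*>(LoadLibraryA(libName));
#else
        handle_ = dlopen(libName, RTLD_LAZY);
#endif
        return IsLoaded();
    }

    bool IsLoaded() const { return handle_ != nullptr; }

    // Safe to call more than once.
    void Unload()
    {
        if (!handle_)
            return;
#ifdef _WIN32
        FreeLibrary(static_cast<HMODULE>(handle_));
#else
        dlclose(handle_);
#endif
        handle_ = nullptr;
    }

private:
    void* handle_;
};
```

Any function pointers obtained from the library become invalid once Unload runs, which is one reason a wrapper of this kind should also discard its cached addresses at that point.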

Plug-ins in a Nutshell

Put simply, a plug-in implements a set of well defined behaviors on behalf of a calling application. Programmatically, behavior is implemented through functions; and if the application and plug-in agree on a set of functions to use, you've got an API. The DynamicLibrary class provides the means to call functions in a dynamic library, but it does not prescribe the expected behavior of a plug-in. To that end, Listings 3 (SimpleMath.h) and 4 (SimpleMath.cpp) show the SimpleMath class. SimpleMath is derived from DynamicLibrary and it defines the API between the calling application, which instantiates the SimpleMath class, and the plug-ins, which implement the requisite functions.

As its name suggests, SimpleMath wraps plug-ins that implement some basic mathematical functionality. In this case the SimpleMath API consists of four functions capable of performing ordinary arithmetic. The four functions that every plug-in must implement are prototyped as:
extern "C" int
simple_math_add(int x, int y);
extern "C" int
simple_math_sub(int x, int y);
extern "C" int
simple_math_mul(int x, int y);
extern "C" int
simple_math_div(int x, int y);
I've also included another function each plug-in must implement:
extern "C" void
simple_math_who(char *str, int nChars);
This function fills in the character array pointed to by str with the name of the plug-in. The parameter nChars specifies the size of the buffer pointed to by str. This function illustrates a different function signature than the arithmetic functions, which all have the same signature.
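On the plug-in side, an implementation of simple_math_who has to respect the buffer size it is given. The following is one possible sketch, not the code from Listing 5: the plug-in name "ExampleMath" is hypothetical, and truncating a name that does not fit (rather than reporting an error) is an assumption.

```cpp
#include <cstring>

// Hypothetical plug-in name used only for this illustration.
static const char kPluginName[] = "ExampleMath";

extern "C" void simple_math_who(char* str, int nChars)
{
    // Nothing can be written into a missing or zero-length buffer.
    if (!str || nChars <= 0)
        return;

    // Copy at most nChars - 1 characters, then terminate explicitly,
    // because strncpy does not add a terminator when it truncates.
    std::strncpy(str, kPluginName, static_cast<std::size_t>(nChars) - 1);
    str[nChars - 1] = '\0';
}
```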

The key to deriving a class from DynamicLibrary is adding the plug-in API-specific wrapper methods. For each function in the SimpleMath API, the SimpleMath class defines a wrapper method. Each wrapper method is responsible for obtaining the corresponding plug-in function address, casting the returned pointer to the appropriate signature, and then calling the function. Examining SimpleMath::Add shows how all the above steps are accomplished in just a few lines of code.
int(*f)(int,int) = 
reinterpret_cast<int(*)(int,int)>
       (GetProcAddrCached("simple_math_add",
       SIMPLEMATH_ADD));
The above statement performs the first two steps: obtaining the function address and casting the returned pointer to the correct signature. The code in the next two lines
if (f)
    return f(x,y);
checks that the function pointer is non-NULL, and then invokes the function.
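Putting the pieces together, a complete wrapper method might look like the sketch below. Several things here are assumptions made so that the example runs without a real DLL: FakeLibrary stands in for DynamicLibrary and resolves only simple_math_add, DLPROC is taken to be a generic function pointer, the enumerator values are invented, and returning 0 when the plug-in lacks a function is a placeholder policy the article does not specify.

```cpp
#include <cstring>

typedef void (*DLPROC)();   // generic function pointer (assumed definition)

// One unique, stable id per plug-in function (values are illustrative).
enum { SIMPLEMATH_ADD = 0, SIMPLEMATH_SUB = 1 };

// Plays the role of a function exported by a plug-in.
static int fake_add(int x, int y) { return x + y; }

// Stand-in for DynamicLibrary: knows only one function.
class FakeLibrary
{
protected:
    DLPROC GetProcAddrCached(const char* procName, int /*procId*/)
    {
        if (std::strcmp(procName, "simple_math_add") == 0)
            return reinterpret_cast<DLPROC>(&fake_add);
        return nullptr;   // function not exported
    }
};

class MathWrapper : public FakeLibrary
{
public:
    int Add(int x, int y)
    {
        // Steps 1 and 2: obtain the address and cast it to the real signature.
        int (*f)(int, int) = reinterpret_cast<int (*)(int, int)>(
            GetProcAddrCached("simple_math_add", SIMPLEMATH_ADD));
        // Step 3: call only if the lookup succeeded.
        if (f)
            return f(x, y);
        return 0;   // assumed fallback when the function is missing
    }

    int Sub(int x, int y)
    {
        int (*f)(int, int) = reinterpret_cast<int (*)(int, int)>(
            GetProcAddrCached("simple_math_sub", SIMPLEMATH_SUB));
        if (f)
            return f(x, y);
        return 0;   // reached here, because FakeLibrary has no simple_math_sub
    }
};
```

A fallback value of 0 cannot be told apart from a genuine result of 0, so a production wrapper would probably report a missing function some other way, for example through a bool return value or an exception.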

An excerpt from the test driver program in Figure 1 shows how to use the SimpleMath class. The line
SimpleMath spi( argv[1] );
constructs the SimpleMath object spi and loads the DLL specified as the first command-line argument. Before attempting to call the plug-in functions, IsLoaded checks if the DLL was successfully loaded. A call to spi.Who obtains the name of the plug-in which is then output to the screen. Then the test program calls the spi.Add and spi.Mul methods and outputs the results on the screen. To the test program, the plug-in appears just like an ordinary object, which is exactly the desired behavior.

Two Sample Plug-ins

Of course, SimpleMath is not of much use without corresponding plug-ins. Listing 5 (GoodMath.cpp) shows one of two plug-ins that can be used by a SimpleMath object. The GoodMath plug-in implements the four arithmetic routines so that they produce correct results. The LINKAGE preprocessor definition ensures that the functions are properly exported from the Win32 DLL without requiring a module definition (.def) file. (LINKAGE is of no use when the plug-in is compiled under Solaris or Linux.) The second plug-in, BadMath.cpp, is available online. As its name suggests, the BadMath plug-in is a little confused, swapping subtraction for addition and multiplication for division.

Use the test driver program to call the plug-in functions. The name of the plug-in is specified on the command line. On Linux, the command
PITest ./GoodMath.so
uses the GoodMath plug-in when testing the SimpleMath class.

COM-like Capability

Developers who are familiar with Java or COM will recognize the next concept: the interface. In COM, an interface is a group of function prototypes with no defined implementation. Written in C++, an interface is nothing more than a class definition that follows a few simple rules.

Only methods are declared — there are no data members.

All methods are declared as pure virtual.

All ancestor classes follow the previous two rules.

I have exploited the interface concept in its barest form. Unlike real COM, I'm not worried about language neutrality or creating a general-purpose object model; my interest is C/C++ plug-ins. I don't use any non-portable COM system calls, nor do I write any IDL (Interface Definition Language). I don't even deal with GUIDs (Globally Unique Identifiers), a.k.a. UUIDs (Universally Unique Identifiers). Even while ignoring these key COM features, it is still possible to mimic some of the basic functionality of COM, namely the creation and use of objects that implement interfaces. The main advantage of all this is that by specifying your plug-in API as one or more interfaces, you can avoid writing plug-in wrapper classes like SimpleMath. Instead, the compiler does the work by generating virtual function tables (vtables) through which all the methods implemented by the plug-in get called. The main disadvantage is that it's harder to code the corresponding plug-ins in C.

The Interface Specifications

Just like COM's IUnknown interface, the interface I have created is used as a common ancestor. This interface is called IBase (see Listing 6 — BaseInterface.h). IBase declares the two methods QueryInterface and Destroy. All implementations of QueryInterface should check to see if the interface named iid is supported. For my purposes, iid is a null-terminated character array containing the name of the interface. If the implementation supports the requested interface, it should place a pointer to that interface in *iface and QueryInterface should return true. If the implementation does not support the requested interface, it should set *iface to NULL and QueryInterface should return false. The Destroy method is used to destroy the interface returned by a previous call to QueryInterface (possibly deleting the interface's underlying object).

COM programmers will note that I haven't included an AddRef method in IBase. I've chosen a slight variation on COM reference counting. My rule is simple: for each successful call to QueryInterface, there must be a corresponding call to Destroy. This rule implies that QueryInterface increments the underlying object's reference count and Destroy decrements it. When the reference count reaches zero, the underlying object deletes itself. Of course the exact implementation is up to the developer of the plug-in.

Using IBase as a starting point, I've created two other interfaces: IDoMath and INamed, shown in Listings 7 (IDoMath.h) and 8 (INamed.h). IDoMath declares the same arithmetic methods as found in the SimpleMath class. INamed contains the Who method as found in the SimpleMath class.

Class COMLikeMath

COMLikeMath (Listing 9, COMLikeMath.cpp) implements all of the interfaces described above. COMLikeMath is derived from both IDoMath and INamed. COMLikeMath is also required to implement IBase by virtue of the fact that IBase is an abstract ancestor of both IDoMath and INamed. By implementing IDoMath and INamed, COMLikeMath can perform the exact same functions as plug-ins implementing the SimpleMath functions.

A look at the QueryInterface and Destroy methods shows that I've implemented them exactly as described above. Each time QueryInterface is called and successfully returns an interface pointer, COMLikeMath's refCount member is incremented. Each time Destroy is called, refCount is decremented. Finally, when refCount reaches zero, the COMLikeMath object deletes itself. (Note that this implementation is by no means thread-safe.)
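The reference-counting rule can be made concrete with a compact sketch. It is not the code from Listings 6 through 9: the interface declarations are reconstructed from the descriptions above, only two of the arithmetic methods are shown, the class name ExampleMath, the name string, and the liveObjects counter are hypothetical, and the interface-name strings used as iid values are assumptions. Like the original, it is not thread-safe.

```cpp
#include <cstring>

// Interfaces: pure virtual methods only, no data members.
struct IBase
{
    virtual bool QueryInterface(const char* iid, IBase** iface) = 0;
    virtual void Destroy() = 0;
};

struct IDoMath : IBase
{
    virtual int Add(int x, int y) = 0;
    virtual int Mul(int x, int y) = 0;
};

struct INamed : IBase
{
    virtual void Who(char* str, int nChars) = 0;
};

class ExampleMath final : public IDoMath, public INamed
{
public:
    static int liveObjects;   // demonstration aid: counts existing objects

    ExampleMath() : refCount_(0) { ++liveObjects; }

    bool QueryInterface(const char* iid, IBase** iface) override
    {
        // The cast selects which base sub-object the caller receives.
        if (std::strcmp(iid, "IDoMath") == 0)
            *iface = static_cast<IDoMath*>(this);
        else if (std::strcmp(iid, "INamed") == 0)
            *iface = static_cast<INamed*>(this);
        else {
            *iface = nullptr;
            return false;
        }
        ++refCount_;          // one reference per successful query
        return true;
    }

    void Destroy() override
    {
        if (--refCount_ <= 0)
            delete this;      // last reference released
    }

    int Add(int x, int y) override { return x + y; }
    int Mul(int x, int y) override { return x * y; }

    void Who(char* str, int nChars) override
    {
        if (!str || nChars <= 0)
            return;
        std::strncpy(str, "ExampleMath", static_cast<std::size_t>(nChars) - 1);
        str[nChars - 1] = '\0';
    }

private:
    ~ExampleMath() { --liveObjects; }   // private: destruction only via Destroy
    int refCount_;
};

int ExampleMath::liveObjects = 0;
```

Because QueryInterface hands back an IBase pointer, the caller casts it to the interface it asked for, for example static_cast<IDoMath*>(iface), before calling Add.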

The Class Factory

There is only one thing missing that prevents use of COMLikeMath: the class is completely unknown outside of the COMLikeMath.cpp source file. Therefore there is no way for the application using the plug-in to create the initial instance of the COMLikeMath class. To solve this problem, just like in COM, I use a class factory. Listings 10 (ClassFactory.h) and 11 (ClassFactory.cpp) show the declaration and implementation for the ClassFactory class.

ClassFactory is a class used to create the underlying objects that serve to implement interfaces. Like class SimpleMath, ClassFactory wraps functions exported by the plug-in. In this case, the method ClassFactory::CreateInterface wraps the function
extern "C" bool 
cf_create_interface(const char *iid, IBase **iface)
where iid and iface have the same meaning as the corresponding parameters of IBase::QueryInterface.

Each plug-in should contain an implementation of the cf_create_interface function that is tailored to creating the classes defined in the plug-in. For the COMLikeMath plug-in, cf_create_interface creates a new COMLikeMath object on the heap. Then, using the newly created COMLikeMath object, it calls the QueryInterface method to fill in *iface with a pointer to the interface requested by iid. Just as with interfaces obtained by QueryInterface, you must call Destroy on any interface obtained by a call to CreateInterface.
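A plug-in's factory function might be sketched as follows. This is an illustration, not the code from Listing 9: the ExampleNamed class and its name string are hypothetical, only the INamed interface is implemented to keep the sketch short, and deleting the new object when the requested interface is unsupported is an assumption (the article does not say how that case is handled).

```cpp
#include <cstring>

struct IBase
{
    virtual bool QueryInterface(const char* iid, IBase** iface) = 0;
    virtual void Destroy() = 0;
};

struct INamed : IBase
{
    virtual void Who(char* str, int nChars) = 0;
};

namespace {   // the implementing class stays invisible outside this file

class ExampleNamed final : public INamed
{
public:
    ExampleNamed() : refCount_(0) {}

    bool QueryInterface(const char* iid, IBase** iface) override
    {
        if (std::strcmp(iid, "INamed") != 0) {
            *iface = nullptr;
            return false;
        }
        *iface = this;
        ++refCount_;
        return true;
    }

    void Destroy() override
    {
        if (--refCount_ <= 0)
            delete this;
    }

    void Who(char* str, int nChars) override
    {
        if (!str || nChars <= 0)
            return;
        std::strncpy(str, "ExamplePlugin", static_cast<std::size_t>(nChars) - 1);
        str[nChars - 1] = '\0';
    }

private:
    int refCount_;
};

} // namespace

// The only symbol the application needs to look up in the plug-in.
extern "C" bool cf_create_interface(const char* iid, IBase** iface)
{
    if (!iid || !iface)
        return false;

    ExampleNamed* obj = new ExampleNamed;
    if (obj->QueryInterface(iid, iface))
        return true;          // the caller now owns one reference

    delete obj;               // unsupported interface: avoid leaking the object
    return false;
}
```

In the application, the address of cf_create_interface would be obtained through the dynamic-library wrapper rather than called directly, and every interface it returns is later released with Destroy.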

A test driver program for the COMLikeMath plug-in is included with this month's online sources (see p. 3 for downloading instructions).

Concluding Remarks

Plug-ins are a powerful technique for extending the capabilities and usefulness of your application. Using class DynamicLibrary as a starting point, it is possible to create a wide variety of plug-in architectures, ranging from direct use of the DynamicLibrary class, through wrapper classes derived from DynamicLibrary, to a COM-like interface-based architecture. There are certainly other approaches available that were not covered in this article.

The code presented here does have its limitations (thread safety jumps out at me in particular), but I hope it gives you a starting point for experimenting with plug-ins in your applications.

Eric Roe began tinkering with computers back in the early 1980's on the "good ol' Apple II+" — and he's been tinkering ever since, working mainly on Windows, MS-DOS/DR-DOS, Solaris, and Linux. He's currently a software engineer in the Systems and Information Technology Group at TRW Inc. He can be reached at Eric.Roe@trw.com.