You can do the darndest things with DCOM, including load balancing with a relatively simple daemon.
Introduction
A request distribution server is a server that distributes client requests across clustered machines, so that user access time will be shorter while the cost of providing large-scale service remains low. In this article, I present a simple request distribution server, LoadBal. It balances the creation of DCOM (Distributed Component Object Model) objects on a cluster of up to eight Windows NT machines. LoadBal is installed on a master NT machine, and when it receives a request to create a DCOM object, it creates the object on a slave machine based on a round-robin algorithm. LoadBal is implemented as a DCOM server; it is developed using ATL (Active Template Library); and it runs as an NT service.
Some Background on DCOM and ATL
DCOM is the Microsoft model of distributing objects over a network. A DCOM server can create object instances of multiple object classes. A DCOM object can support multiple interfaces, each representing a different view or behavior of the object. An interface consists of a set of functionally related methods. A DCOM client uses DCOM objects by requesting that an object be activated on the server; the client is passed back an interface to that object. Once the client obtains the interface, it can access the interface as if the server object resides within the client's address space. This is done through a piece of code called a proxy; it exists in the client address space. A similar piece of code called a stub resides in the server address space. When the client invokes a method of the interface, it invokes the method on the proxy. The proxy passes the invocation to the stub code on the server via RPC (Remote Procedure Calls). The server itself may or may not be on the same machine as the client. The stub on the server unmarshals the request, and invokes the method on the server object.
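The proxy and stub roles can be illustrated with a small sketch in portable C++. This is not DCOM code: the ICalculator interface, the packet layout, and the direct call that stands in for the RPC hop are all assumptions made for illustration. The point is only that the client talks to a proxy with the same interface as the real object, while the stub unpacks the request and calls the real object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interface, standing in for a DCOM interface.
struct ICalculator {
    virtual ~ICalculator() = default;
    virtual std::int32_t Add(std::int32_t a, std::int32_t b) = 0;
};

// The real object, living in the "server".
struct Calculator : ICalculator {
    std::int32_t Add(std::int32_t a, std::int32_t b) override { return a + b; }
};

using Packet = std::vector<std::uint8_t>;
enum : std::uint32_t { kMethodAdd = 1 };  // hypothetical method number

// Append a 32-bit value to a packet, least significant byte first.
inline void Put32(Packet& p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 32-bit value from a packet at the given byte offset.
inline std::uint32_t Get32(const Packet& p, std::size_t off) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(p[off + i]) << (8 * i);
    return v;
}

// Server side: unmarshals the request and invokes the real object.
class Stub {
public:
    explicit Stub(ICalculator& target) : target_(target) {}
    Packet Dispatch(const Packet& request) {
        Packet reply;  // stays empty if the request is not understood
        if (request.size() == 12 && Get32(request, 0) == kMethodAdd) {
            const auto a = static_cast<std::int32_t>(Get32(request, 4));
            const auto b = static_cast<std::int32_t>(Get32(request, 8));
            Put32(reply, static_cast<std::uint32_t>(target_.Add(a, b)));
        }
        return reply;
    }
private:
    ICalculator& target_;
};

// Client side: offers the same interface, but only marshals and forwards.
class Proxy : public ICalculator {
public:
    explicit Proxy(Stub& stub) : stub_(stub) {}
    std::int32_t Add(std::int32_t a, std::int32_t b) override {
        Packet request;
        Put32(request, kMethodAdd);
        Put32(request, static_cast<std::uint32_t>(a));
        Put32(request, static_cast<std::uint32_t>(b));
        const Packet reply = stub_.Dispatch(request);  // stands in for the RPC hop
        assert(reply.size() == 4);
        return static_cast<std::int32_t>(Get32(reply, 0));
    }
private:
    Stub& stub_;
};
```

A caller that holds an ICalculator reference cannot tell whether it is bound to a Calculator or to a Proxy, which is the property the paragraph above describes.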
A DCOM server can be implemented as an EXE, a DLL, or an NT service. An NT service is conceptually similar to a Unix daemon. The Service Control Manager starts, stops, and controls services [1]. LoadBal is implemented as an NT service because it must retain some common state information even when no instance of the server object exists. An NT service fulfills this need because its process keeps running whether or not any server object has been instantiated.
A DCOM server can be easily implemented as an NT service using ATL. ATL is a template library that is designed for writing DCOM servers. Using ATL significantly simplifies the daunting task of developing DCOM servers, without greatly increasing code size. ATL 2.1 comes as part of Visual C++ 5.0, and an identical version called ATL 2.0 is available on Microsoft's web site for users of previous versions of Visual C++ [2].
The Objmill Class
The main class responsible for creating DCOM objects in round-robin fashion is called Objmill. The definition appears in Figure 1. This source code was generated by the ATL COM AppWizard. This source code is provided in its entirety on the CUJ ftp site, but I also provide documentation on the ftp site that explains how to use the ATL wizards to create the project and source files. (See p. 3 for instructions on accessing the ftp site.)
ObjMill contains one primary method: CreateObject. This function takes two parameters. The first is a string describing the type of the requested object; the second receives a pointer to an interface of the created object. CreateObject creates the requested DCOM object on a machine chosen by a round-robin algorithm, and returns a pointer to the interface of the requested object.
In DCOM, there are two ways to identify the type of an object class, CLSID and ProgID. A CLSID is a GUID (Globally Unique Identifier), which is a unique 128-bit number. Here's an example of a CLSID:
{0xF503FAD1,0x7FFC,0x11D2, {0xAD,0xD5,0x00,0xC0, 0x4F,0xD0,0xCD,0x29} }
A ProgID is a string that uniquely identifies the corresponding class. Here's an example of a ProgID:
ObjMill.ObjMill.1
A ProgID can be converted to a CLSID by calling CLSIDFromProgID, and a CLSID can be converted back to a ProgID by calling ProgIDFromCLSID.
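To make the CLSID form concrete, here is a minimal sketch in portable C++ that formats the 128-bit value shown above as the brace-and-hyphen string commonly seen in the registry. The Guid struct mirrors the field layout of the Win32 GUID structure, but the type and the ToRegistryString helper are hypothetical names used only for this illustration. On Windows, the StringFromCLSID API does this job.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Portable stand-in for the Win32 GUID structure (same field layout).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Format a Guid as {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}.
inline std::string ToRegistryString(const Guid& g) {
    char buf[39];  // 38 characters plus the terminating null
    const int n = std::snprintf(
        buf, sizeof buf,
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned long>(g.data1),
        static_cast<unsigned>(g.data2), static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    assert(n == 38);
    (void)n;  // silence unused-variable warnings when asserts are disabled
    return buf;
}

// The CLSID quoted in the text, written as an initializer.
const Guid kExampleClsid = {0xF503FAD1, 0x7FFC, 0x11D2,
                            {0xAD, 0xD5, 0x00, 0xC0, 0x4F, 0xD0, 0xCD, 0x29}};
```

Passing kExampleClsid to ToRegistryString yields {F503FAD1-7FFC-11D2-ADD5-00C04FD0CD29}, which is the same 128 bits in a different notation.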
The first parameter to CreateObject, szProgId, is of type BSTR. BSTR is a string type provided by DCOM. It is implemented as an array of wide chars, preceded by a header that records the length of the string. Because the header is hidden from the user and is used by DCOM, a BSTR can be viewed as an array of wide chars. However, a BSTR must be allocated by calling SysAllocString (or one of its sibling functions), and it should be released by calling SysFreeString after it has been used. ATL provides macros such as USES_CONVERSION, T2OLE, and OLE2T to ease conversion between ordinary C strings and the wide-character strings from which BSTRs are built.
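The "hidden header" idea can be sketched in portable C++. The code below is not the BSTR implementation; it is a toy model that assumes a 4-byte byte count placed immediately before the characters, with the caller receiving a pointer to the first character. The names ToyAllocString, ToyStringLen, and ToyFreeString are hypothetical stand-ins chosen to echo SysAllocString, SysStringLen, and SysFreeString.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Allocate a length-prefixed wide string: a 4-byte byte count, then the
// characters, then a terminating null. The returned pointer addresses the
// first character, so the prefix sits just before it.
inline char16_t* ToyAllocString(const std::u16string& text) {
    const std::uint32_t byteLen =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof byteLen + byteLen + sizeof(char16_t)));
    if (!block) return nullptr;
    std::memcpy(block, &byteLen, sizeof byteLen);                       // header
    std::memcpy(block + sizeof byteLen, text.data(), byteLen);          // characters
    std::memset(block + sizeof byteLen + byteLen, 0, sizeof(char16_t)); // terminator
    return reinterpret_cast<char16_t*>(block + sizeof byteLen);
}

// Character count, read from the hidden prefix rather than by scanning.
inline std::size_t ToyStringLen(const char16_t* s) {
    assert(s != nullptr);
    std::uint32_t byteLen = 0;
    std::memcpy(&byteLen,
                reinterpret_cast<const unsigned char*>(s) - sizeof byteLen,
                sizeof byteLen);
    return byteLen / sizeof(char16_t);
}

// Release the whole block, including the prefix the caller never sees.
inline void ToyFreeString(char16_t* s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

Because the pointer handed to the caller skips the header, it can be passed to code that expects a plain null-terminated wide string, but it must go back to the matching free function. That is why a pointer to an arbitrary wide-character array is not a valid substitute for a BSTR.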
The last parameter to CreateObject, ppRetObj, is the address of a pointer to an IUnknown interface. An IUnknown is the basis of all other DCOM interfaces. The IUnknown interface contains the following functions: QueryInterface, AddRef, and Release. Because CreateObject doesn't know which interface of the requested object a user wants, it just hands back the generic IUnknown interface. The user can call the QueryInterface method on IUnknown to request a specific interface later on.
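The pattern of handing back a generic interface and letting the caller ask for a specific one later can be sketched in portable C++. This is a simplified model, not COM: interface IDs are plain strings instead of GUIDs, results are bool instead of HRESULT, and the IUnknownLike, ICompNameLike, and CreateCompNameLike names (and the "SLAVE1" machine name) are hypothetical.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for IUnknown.
struct IUnknownLike {
    virtual ~IUnknownLike() = default;
    virtual bool QueryInterface(const std::string& iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

// A specific interface the object also supports.
struct ICompNameLike : IUnknownLike {
    virtual std::string Name() = 0;
};

class CompNameLike : public ICompNameLike {
public:
    bool QueryInterface(const std::string& iid, void** out) override {
        assert(out != nullptr);
        if (iid == "IUnknownLike") {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == "ICompNameLike") {
            *out = static_cast<ICompNameLike*>(this);
        } else {
            *out = nullptr;   // unsupported interface
            return false;
        }
        AddRef();             // every pointer handed out holds a reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone
        return left;
    }
    std::string Name() override { return "SLAVE1"; }
private:
    unsigned long refs_ = 1;  // the creator owns the first reference
};

// Plays the role of CreateObject: it returns only the generic interface.
inline IUnknownLike* CreateCompNameLike() { return new CompNameLike; }
```

A caller receives the generic pointer, calls QueryInterface with the ID of the interface it wants, uses the result, and releases both pointers when done. The object deletes itself when the last reference is released.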
There are a couple of things in the signature of CreateObject that bear some explanation, because they're not pure C++. STDMETHOD is a COM macro that wraps the method declaration. The comment /*[in]*/ in the parameter list indicates that szProgId is an [in] parameter, a parameter that passes data from the client to the server. ([in] and [out] are constructs defined in DCOM's Interface Definition Language, a.k.a. IDL. IDL can be used to define interfaces independently of programming languages.) The /*[out]*/ comment in the parameter list indicates that ppRetObj is an [out] parameter, which means it is used to pass data from the server to the client. [in] parameters are allocated and freed by the client; [out] parameters are allocated by the server and freed by the client. The [retval] prefix indicates that ppRetObj is also the return value of the method.
Other Members of Objmill
Objmill also has a few private data members. m_MachinesNames is a simple array of strings that holds the names of the machines in the cluster. The integer m_nCount keeps track of the number of machines in the cluster. m_nCurrentMachine keeps track of whose turn it is to create the next object. Because these three variables are shared among all instances of ObjMill, they need to be static. Note that since there may be many instances of ObjMill updating m_nCurrentMachine at the same time, access to this variable must be serialized. For this purpose, Objmill declares a static variable, m_hMutex, that is used as a handle to a mutex.
Mutexes are one kind of kernel object that the Win32 API provides for thread synchronization. To use a mutex, a program must first create it by calling CreateMutex. Like other kernel synchronization objects, a mutex is in one of two states at any time: signaled or nonsignaled. A thread that waits on a mutex sleeps while the mutex is nonsignaled. As soon as the mutex becomes signaled, the sleeping thread wakes up and resumes execution [3].
Objmill also defines three auxiliary member functions: Startup, Cleanup, and CreateObjectAt. Startup reads in the list of machine names in the cluster from an .INI file, whose name is specified by a registry entry, and it sets the mutex handle. Cleanup closes the mutex handle. CreateObjectAt takes the ProgId of an object, and the name of the machine on which the object will be created. It first checks whether the machine name is empty. If so, it just uses the name of the machine on which it resides. Then CreateObjectAt converts the ProgId to a CLSID by calling CLSIDFromProgID, and obtains the interface ID of IUnknown. Finally, CreateObjectAt creates the object on the machine specified, by calling CoCreateInstanceEx, and hands the reference to this object back to the caller.
The method CreateObject picks the machine name according to m_nCurrentMachine, calls CreateObjectAt to create the object, and advances m_nCurrentMachine. The call to advance m_nCurrentMachine is bracketed by calls to WaitForSingleObject and ReleaseMutex. This pair of API calls ensures that no two threads can update m_nCurrentMachine simultaneously. WaitForSingleObject puts the thread to sleep until m_hMutex is signaled. Once m_hMutex is signaled, the thread wakes up, and changes the mutex to the nonsignaled state. After m_nCurrentMachine is updated, ReleaseMutex puts m_hMutex back in the signaled state.
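The round-robin selection can be sketched in portable C++, with std::mutex standing in for the Win32 mutex handle and std::lock_guard playing the role of the WaitForSingleObject and ReleaseMutex pair. This is not the Objmill source; the RoundRobinPicker class and its member names are hypothetical. As a design choice of this sketch, the lock covers both reading and advancing the index, so two concurrent callers never receive the same turn.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class RoundRobinPicker {
public:
    explicit RoundRobinPicker(std::vector<std::string> machines)
        : machines_(std::move(machines)) {
        assert(!machines_.empty());
    }

    // Return the machine whose turn it is, then advance the shared index.
    std::string Next() {
        // Locks here (like WaitForSingleObject) and unlocks automatically
        // when 'lock' goes out of scope (like ReleaseMutex).
        std::lock_guard<std::mutex> lock(mutex_);
        std::string chosen = machines_[current_];
        current_ = (current_ + 1) % machines_.size();  // wrap around
        return chosen;
    }

private:
    std::vector<std::string> machines_;  // cluster member names
    std::size_t current_ = 0;            // whose turn is next
    std::mutex mutex_;                   // serializes access to current_
};
```

With three machines, four successive calls to Next return the first, second, and third names and then the first again. In the service, the returned name would be the machine argument for the object-creation call.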
Because ObjMill is running as an NT service, it must use the Event Log to track any errors encountered. Unlike Unix, NT does not have the concept of a master console. Services cannot count on a console being available to receive output or error messages. However, NT does provide event logging capability. All applications can access the event log to record significant information, and the event log can be easily viewed from any machine on the network. Using the event log, however, requires that you develop a series of messages in advance and compile them using the message compiler, which can be a complex process [1]. Luckily, ATL provides a simple implementation of logging to the event log as a method of the standard CServiceModule class.
Running the Request Distribution Server
In the online documentation, I show how to use the ATL wizard to create a project named LoadBal. LoadBal is a project that creates the load balance server (Objmill) as an NT service. The _tWinMain function for the project is in the file LoadBal.cpp, Figure 2. This function calls the Objmill's Startup method to start the NT service, and it calls the Cleanup method after the service has ended.
Creating a Sample DCOM Server
Testing LoadBal will require creation of a sample DCOM server, svrCompName. This server has only one method, Name, which returns the name of the computer on which the server resides. This will be a good test to see how evenly the load balance server distributes the requests.
Like the LoadBal project, creating the sample server involves creation of a new project space (called svrCompName), using the ATL COM AppWizard. Again, I provide detailed instructions for using the ATL wizard on the CUJ ftp site. There I show how to create a DCOM server of type "Executable (EXE)," and add a method called Name, with parameter [out, retval] BSTR* szComputerName. Figure 3 shows the implementation.
Creating a Sample DCOM Client
I've used MFC to create a simple test client. This client instantiates an ObjMill object, uses the CreateObject method on ObjMill to create a CompName object, and invokes the Name method on CompName. The client displays the name of the computer on which CompName resides in a message box.
The client must initialize the DCOM libraries by adding the initialization code shown in the CSampleClient::InitInstance function in the SampleClient.cpp file (Figure 4).
The last thing to implement is code to instantiate an ObjMill object and use it to create a CompName object. You'll need to include LoadBal.h and svrCompName.h in both SampleClient.h and SampleDialog.h. You'll also need to define the interface ID and CLSID of ObjMill and CompName in SampleClientDlg.cpp. These IDs can be copied from the LoadBal_i.c file and svrCompName_i.c file. Note: you need to change the server name to the name of the machine on which LoadBal is installed.
Setting up Tests
Setting up the tests involves setting up registry entries on the slave machines and the master machine, as well as on the client machine. I provide detailed instructions on the CUJ ftp site.
Start the request distribution service under a domain account other than the local system account, because the local system account cannot create objects on remote machines. Make sure that account belongs to the Administrators group on all slave machines.
When you run SampleClient.exe several times, the machine name shown in the message box should change from one run to the next, cycling through the slave machines in round-robin order.
Limitations and Further Enhancements
Limitation 1: maximum number of machines. LoadBal can distribute requests among a maximum of eight slave machines. This is because Objmill uses a simple array to contain the names of the machines in the cluster, and the array has a maximum size of eight. A programmer could remove this limitation by using a more sophisticated data structure, such as a vector as defined in the Standard Template Library.
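As a rough sketch of that enhancement, the portable C++ below reads machine names into a std::vector, which grows as needed instead of stopping at eight entries. The one-name-per-line input format and the ReadMachineNames function are assumptions made for this illustration; they are not the layout of LoadBal's .INI file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read one machine name per line, trimming surrounding whitespace and
// skipping blank lines. The vector grows to hold however many names appear.
inline std::vector<std::string> ReadMachineNames(std::istream& in) {
    std::vector<std::string> names;
    std::string line;
    const char* const whitespace = " \t\r";
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(whitespace);
        if (first == std::string::npos) continue;  // blank line
        const std::size_t last = line.find_last_not_of(whitespace);
        assert(last >= first);
        names.push_back(line.substr(first, last - first + 1));
    }
    return names;
}
```

The round-robin index then wraps at names.size() rather than at a compile-time constant, so adding a ninth or tenth machine requires no code change.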
Limitation 2: unequal resource loads. LoadBal does a good job of balancing the creation of objects, but not all objects are created equal! Some objects take a lot more resources than others; some are more active than others; some objects have longer lives than others. The round-robin algorithm does not take into account any of these aspects. There are a couple of ways to take these aspects into consideration. For example, a separate thread can be constructed to periodically collect the loads of machines in the cluster, and distribute the creation of objects accordingly.
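One way the load-aware alternative might look is sketched below in portable C++. It assumes that some collector thread has already produced a snapshot of per-machine loads; how the load is measured is left open, and the MachineLoad struct and PickLeastLoaded function are hypothetical names. The selection step simply replaces "next in turn" with "currently least loaded".

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct MachineLoad {
    std::string name;
    double load;  // hypothetical metric, e.g. CPU utilization from 0.0 to 1.0
};

// Return the name of the machine reporting the lowest load.
// When several machines tie, the one listed first wins.
inline std::string PickLeastLoaded(const std::vector<MachineLoad>& snapshot) {
    assert(!snapshot.empty());
    const auto it = std::min_element(
        snapshot.begin(), snapshot.end(),
        [](const MachineLoad& a, const MachineLoad& b) { return a.load < b.load; });
    return it->name;
}
```

A policy like this can send a burst of requests to one machine if the snapshot is stale, so a real implementation would probably combine it with round-robin or refresh the snapshot after each placement.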
Limitation 3: overhead in the master machine. Because every DCOM object created requires the existence of an ObjMill object, there must be at least one instance of ObjMill for each client. This may impose a huge overhead for the master machine when the number of clients becomes large. For example, suppose there are 1,000 clients, each of which creates four objects, distributed over a cluster of eight machines. Then each machine will host approximately (1,000 x 4)/8 = 500 objects. However, the master machine will host 1,000 ObjMill objects. If the master machine is also a slave machine, then it will host 1,500 objects. One solution is to have a separate powerful machine that hosts only ObjMill; another solution is to install ObjMill on every machine in the cluster, and designate groups of clients to create ObjMill on certain machines so that the ObjMill instances are distributed across all machines evenly. But then, "distributing across all machines evenly" is what LoadBal is for! This means you can use ObjMill itself to create ObjMill, so that the ObjMill instances are distributed evenly across the cluster, and the overhead for the master machine can be minimized. To be more specific, create an ObjMill from one machine, and use it to create a second ObjMill. Destroy the first instance of ObjMill, and keep the second instance to create other objects.
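The arithmetic in this example can be written out as a small portable C++ helper, which makes it easy to try other cluster sizes. The HostedObjects struct and EstimateHostedObjects function are hypothetical names, and the model assumes one ObjMill instance per client and a perfectly even round-robin spread.

```cpp
#include <cassert>

struct HostedObjects {
    long perSlave;         // created objects hosted on each slave machine
    long onMaster;         // ObjMill instances on a dedicated master
    long onMasterAsSlave;  // total on the master if it also acts as a slave
};

// Estimate object counts, assuming one ObjMill per client and an even spread.
inline HostedObjects EstimateHostedObjects(long clients,
                                           long objectsPerClient,
                                           long slaves) {
    assert(slaves > 0);
    HostedObjects h;
    h.perSlave = clients * objectsPerClient / slaves;
    h.onMaster = clients;
    h.onMasterAsSlave = h.onMaster + h.perSlave;
    return h;
}
```

For 1,000 clients, four objects per client, and eight machines, the helper reproduces the figures in the text: 500 per slave, 1,000 on a dedicated master, and 1,500 on a master that doubles as a slave.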
Limitation 4: security compromise. I've used default DCOM security here for the sake of simplicity. This means that when LoadBal creates another object, the resulting object will be running under the same user account as LoadBal itself. This will create a problem if the client is running under a different user than the service. For example, if the service is running under John's account, then the object it creates will be running under John's account as well; but the client may be running under Jane's account. This problem comes from making the objects to be balanced accessible to everybody. The problem can be fixed by coding impersonation into LoadBal.
Summary
In this article I've presented a simple request distribution DCOM server running as an NT service. I provide a set of step-by-step instructions on how to create the request distribution server, using ATL, on the CUJ ftp site. While I've emphasized the simplicity of the server, it should be quite easy to modify and thus improve.
References
[1] John T. Bell. "A Wrapper Class for NT Services," C/C++ Users Journal, August 1998, p. 35.
[2] Richard Grimes. Professional DCOM Programming (Wrox Press, Inc), Chapter 3.
[3] Jeff Richter. Advanced Windows (Microsoft Press), Chapter 10.
James Fan got his B. S. in Computer Sciences from the University of Texas at Austin. Currently he is working as a programmer analyst, developing DCOM components. He can be reached at jjfan@email.com.