The Component Object Model

The foundation for OLE services

Sara Williams and Charlie Kindel

Sara is a technical evangelist in the developer relations group at Microsoft. She can be reached at saraw@microsoft.com. Charlie is a program manager, software-design engineer, and technical evangelist in the developer relations group at Microsoft. He can be reached at ckindel@microsoft.com.


The Component Object Model (COM) is a component-software architecture designed by Microsoft that allows applications and systems to be built from components supplied by different software vendors. COM is the underlying architecture that forms the foundation for higher-level software services, like those provided by OLE; see Figure 1. OLE services span various aspects of component software, including compound documents, controls, interapplication programmability, data transfer, storage, naming, and other software interactions.

These services provide distinctly different functionality to the user. However, all OLE services share a fundamental requirement for a mechanism that allows binary software components (supplied by different software vendors) to connect to, and communicate with, each other in a well-defined manner. This mechanism is supplied by COM, a component-software architecture that:

In addition, COM provides mechanisms for:

It is important to note that COM is a general architecture for component software. While Microsoft is applying COM to address specific areas like those shown in Figure 1, any developer can take advantage of the structure and foundation that COM provides.

How does COM enable interoperability? What makes it such a useful and unifying model? To address these questions, it is helpful to examine the basic COM design principles and architectural concepts. In doing so, you will see the specific problems that COM was designed to solve, and how COM provides solutions for these problems. After this, turn to the article, "Application Integration with OLE," by Kraig Brockschmidt in this issue, to see how OLE provides higher-level services on top of the COM foundation. For an example implementation using COM and OLE, see the article, "Implementing Interoperable Objects," by Ray Valdés.

The Component-Software Problem

The fundamental question COM addresses is: How can a system be designed such that binary software components from different vendors, written in different parts of the world, at different times, are guaranteed to interoperate? To design such a system, four specific problems must be solved: basic component interoperability, versioning, language independence, and transparent cross-process interoperability.

These problems need to be solved without sacrificing performance. Achieving cross-process and cross-network transparency must be accomplished without adding undue system overhead to components interacting within the same address space. In-process components must be scalable down to small, lightweight pieces of software, equivalent in scope to C++ classes or GUI controls.

COM Fundamentals

The design of COM rests on fundamental concepts that:

Binary Standard

To implement a binary standard for component invocations, COM defines a standard way to lay out (for each of several platforms) virtual function tables (known as "vtables") in memory, and a standard way to call a function in a vtable. Thus, any language that can call functions through double-pointer indirection (C, C++, Smalltalk, Ada, Basic, and many others) can be used to write components that can interoperate with other components written in any language that conforms to COM's binary standard.
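The double-pointer indirection described above can be sketched in portable C++ with an explicit table of function pointers. This is only an illustration of the layout idea, not COM's actual headers: Counter, CounterVtbl, MakeCounter, and BumpTwice are hypothetical names invented for the sketch.

```cpp
#include <cassert>

struct Counter;  // forward declaration of the "component" instance

// The vtable: a table of function pointers. Each function receives the
// instance as its first argument, as a C caller would pass it.
struct CounterVtbl {
    int (*Increment)(Counter* self);
    int (*Get)(Counter* self);
};

// An instance begins with a pointer to its vtable, so a client holding a
// Counter* reaches a method through two pointer hops.
struct Counter {
    const CounterVtbl* vtbl;
    int value;
};

static int CounterIncrement(Counter* self) { return ++self->value; }
static int CounterGet(Counter* self) { return self->value; }

// One shared table per implementation (hypothetical implementation).
static const CounterVtbl kCounterVtbl = { CounterIncrement, CounterGet };

Counter MakeCounter() { return Counter{ &kCounterVtbl, 0 }; }

// Client code: instance pointer -> vtable pointer -> function pointer.
int BumpTwice(Counter* c) {
    assert(c != nullptr && c->vtbl != nullptr);
    c->vtbl->Increment(c);
    c->vtbl->Increment(c);
    return c->vtbl->Get(c);
}
```

Because the client touches only the table of function pointers, any language that can follow two pointers and make an indirect call could play either role.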

An important distinction is made between objects and components. The word "object" indicates something different to everyone. In COM, an object is some piece of compiled code that provides some service to the rest of the system. To avoid confusion, a COM object here is referred to as a "Component Object," or simply a "component." This avoids confusing COM objects with source-code OOP objects, such as those used in C++ programs.

Interfaces

In COM, applications interact with each other and with the system through collections of functions (or methods) called "interfaces." Note that all OLE services are simply COM interfaces. A COM interface is a strongly typed contract between software components to provide a small, but useful, set of semantically related operations. An interface is the definition of an expected behavior and expected responsibilities. OLE's drag-and-drop support is a good example of COM interface usage. All the functionality that a component must implement to be a drop target is collected into the IDropTarget interface. All the drag source functionality is in the IDropSource interface. Interface names begin with "I." OLE defines a number of interfaces for compound document interactions--these usually start with "IOle." Any developer can design custom interfaces to take advantage of COM to implement specific types of component integration and communication. Incidentally, a pointer to a Component Object is really a pointer to one of the interfaces that the Component Object implements. This means that you can only use a Component-Object pointer to call a method and not to modify data. Example 1 shows an interface definition for a simple phone-directory service, ILookup, which has two methods, LookupByName and LookupByNumber.

All Component Objects support a base interface called "IUnknown," along with any combination of other interfaces, depending on what functionality a Component Object chooses to expose. Unlike C++ objects, Component Objects always access other component objects through interface pointers. A Component Object can never access another component object's data. Only an object's interfaces are exposed to other objects; see Figure 2. This is a primary architectural feature of the Component Object Model. It allows COM to completely preserve encapsulation of data and processing, a fundamental requirement of a true component software standard. It also allows for transparent remoting (cross-process or cross-network calling), since all component access is through well-defined interface methods that can exist in a proxy object that forwards the request and vectors back the response.

Interface Attributes

An interface is a contractual way for a Component Object to expose its services. The key aspects of this design are:

Figure 3(a) shows a diagram of a Component Object that supports three interfaces--A, B, and C. By convention, a standard pictorial representation is used for objects and their interfaces in which an interface is represented as a "plug-in jack." Figures 3(b) and 3(c) show how interfaces allow for both client/server and peer-to-peer relationships between components.

Interface Benefits

The unique use of interfaces in COM provides a number of benefits:

The IUnknown Interface

COM defines one special interface, IUnknown, to implement some essential functionality. All Component Objects are required to implement the IUnknown interface, and conveniently, all other COM and OLE interfaces derive from IUnknown. IUnknown has three methods: QueryInterface, AddRef, and Release; see Example 2. Since all interfaces derive from IUnknown, QueryInterface, AddRef, and Release can be called using any interface pointer.

AddRef and Release are simple reference-counting methods. An interface's AddRef is called when another Component Object makes a copy of a pointer to that interface. An interface's Release method is called when the other component no longer requires use of that interface. While the Component Object's reference count is nonzero, it must remain in memory. When the reference count becomes zero, the Component Object can safely unload itself, because no other components hold references to it.
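The reference-counting rule can be sketched as follows. This is a minimal, single-threaded illustration in portable C++: RefCounted and LifetimeDemo are hypothetical names, and the destroyed flag exists only so the sketch can observe when the object goes away.

```cpp
#include <cassert>

// Stand-in for a Component Object's lifetime bookkeeping.
class RefCounted {
public:
    // The creator receives the first reference, so the count starts at 1.
    explicit RefCounted(bool* destroyedFlag)
        : refs_(1), destroyed_(destroyedFlag) {}

    // Called whenever a copy of the pointer is made.
    unsigned long AddRef() { return ++refs_; }

    // Called when a holder is finished with its copy of the pointer.
    unsigned long Release() {
        const unsigned long remaining = --refs_;
        if (remaining == 0) {
            delete this;  // no holders remain, so the object removes itself
        }
        return remaining;
    }

private:
    // Private destructor: only Release() may destroy the object.
    ~RefCounted() { *destroyed_ = true; }

    unsigned long refs_;
    bool* destroyed_;
};

// Two holders share one object; it survives until both have released it.
bool LifetimeDemo() {
    bool destroyed = false;
    RefCounted* a = new RefCounted(&destroyed);  // count = 1
    RefCounted* b = a;                           // second holder copies the pointer
    b->AddRef();                                 // count = 2
    a->Release();                                // count = 1
    assert(!destroyed);                          // still alive for holder b
    b->Release();                                // count = 0, object deleted
    return destroyed;
}
```

A production implementation would need an atomic counter if the object can be shared between threads; that is left out here to keep the rule itself visible.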

QueryInterface is the mechanism that allows clients to dynamically discover (at run time) whether an interface is supported by a Component Object. At the same time, it is the mechanism that a client uses to get an interface pointer from a Component Object. When an application wants to use some function of a Component Object, it calls that object's QueryInterface, requesting a pointer to the interface that implements the desired function. If the Component Object supports that interface, it will return the appropriate interface pointer and a success code. If the Component Object doesn't support the requested interface, then it will return an error value. The application will then examine the return code. If successful, it will use the interface pointer to access the desired method. If the QueryInterface fails, the application will take some other action, letting the user know that the desired functionality is not available.
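One way a component might implement this negotiation is sketched below in portable C++. The types are stand-ins rather than the real COM declarations: Result replaces HRESULT, the Iid enumeration replaces 128-bit interface IDs, and IUnknownLike, ILookupLike, FindNumber, and the phone number are all invented for the illustration.

```cpp
#include <cassert>
#include <cstring>

using Result = long;                // stand-in for HRESULT
const Result kOk = 0;               // stand-in for a success code
const Result kNoInterface = -1;     // stand-in for "interface not supported"

enum class Iid { Unknown, Lookup, Printer };  // stand-ins for real IIDs

struct IUnknownLike {
    virtual Result QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;      // lifetime is managed through Release()
};

struct ILookupLike : IUnknownLike {
    virtual const char* LookupByName(const char* name) = 0;
};

class PhoneBook final : public ILookupLike {
public:
    Result QueryInterface(Iid iid, void** out) override {
        assert(out != nullptr);
        if (iid == Iid::Unknown) {
            *out = static_cast<IUnknownLike*>(this);
        } else if (iid == Iid::Lookup) {
            *out = static_cast<ILookupLike*>(this);
        } else {
            *out = nullptr;         // unsupported: report an error value
            return kNoInterface;
        }
        AddRef();                   // the returned pointer is a new reference
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long n = --refs_;
        if (n == 0) delete this;
        return n;
    }
    const char* LookupByName(const char* name) override {
        // Made-up directory entry, for illustration only.
        return std::strcmp(name, "Daffy Duck") == 0 ? "555-0100" : "";
    }
private:
    unsigned long refs_ = 1;
};

// Client side: ask for the capability, use it, then release the reference.
const char* FindNumber(IUnknownLike* component) {
    assert(component != nullptr);
    void* raw = nullptr;
    if (component->QueryInterface(Iid::Lookup, &raw) != kOk) {
        return nullptr;             // the functionality is not available
    }
    ILookupLike* lookup = static_cast<ILookupLike*>(raw);
    const char* number = lookup->LookupByName("Daffy Duck");
    lookup->Release();              // balances the AddRef in QueryInterface
    return number;
}
```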

Example 3 shows a call to QueryInterface on the component PhoneBook. The code is asking this component, "Do you support the ILookup interface?" If the call returns successfully, then the component supports the desired interface and a pointer can be used to call methods contained in that interface (in this case, either LookupByName or LookupByNumber). Note that AddRef() is not explicitly called in this case because QueryInterface() increments the reference count before returning the interface pointer.

Identifying Interfaces

COM uses Globally Unique Identifiers (GUIDs) to identify every interface and every Component Object class. GUIDs are equivalent functionally to Universally Unique Identifiers (UUIDs), as defined in the Open Software Foundation's Distributed Computing Environment (OSF DCE). GUIDs are 128-bit integers that are guaranteed to be unique in the world across space and time. Human-readable names are assigned only for convenience and are locally scoped. This helps ensure that COM components do not accidentally connect to the "wrong" component, server, or try to use the "wrong" interface, even in networks with millions of Component Objects. GUIDs are embedded in the component binary itself, and are used by COM dynamically at bind time to ensure no false connections are made between components.

CLSIDs are GUIDs that refer to Component Object classes, and IIDs are GUIDs that refer to interfaces. Microsoft supplies a tool (uuidgen) that automatically generates GUIDs. Additionally, the CoCreateGuid function is part of the COM API. Thus, you can create your own GUIDs when you develop Component Object classes and custom interfaces. COM header files provide macros that allow you to assign a more readable name to your GUIDs. Example 4 shows two GUIDs. CLSID_PHONEBOOK identifies a Component Object class that gives users lookup access to a phone book. IID_ILOOKUP identifies a custom interface implemented by the PhoneBook class that accesses the phone book's database.
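To make the 128-bit structure concrete, the sketch below models a GUID as a plain struct in portable C++ and formats it in the hyphenated text form used in IDL files. Guid, ToString, and the constant names are hypothetical stand-ins, not the COM header definitions; the numeric values are the ones from Example 4.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Stand-in with the same four-part shape that DEFINE_GUID takes.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};
static_assert(sizeof(Guid) == 16, "a GUID occupies 128 bits");

// Two GUIDs identify the same thing only if every bit matches.
inline bool operator==(const Guid& a, const Guid& b) {
    return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
           std::memcmp(a.data4, b.data4, sizeof a.data4) == 0;
}

// Values copied from Example 4.
const Guid kClsidPhoneBook = {0xc4910d70, 0xba7d, 0x11cd,
                              {0x94, 0xe8, 0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3}};
const Guid kIidLookup = {0xc4910d71, 0xba7d, 0x11cd,
                         {0x94, 0xe8, 0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3}};

// Formats a GUID as 8-4-4-4-12 lowercase hexadecimal digits.
std::string ToString(const Guid& g) {
    char buf[37];
    const int written = std::snprintf(
        buf, sizeof buf,
        "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
        static_cast<unsigned>(g.data1), static_cast<unsigned>(g.data2),
        static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    assert(written == 36);
    (void)written;
    return std::string(buf);
}
```

Formatting the interface ID this way yields the same string that appears in the uuid attribute of Example 5, which is how the IDL file and the C++ constant refer to one identity.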

Component Object Library

The Component Object Library is a system component that provides the mechanics of COM. This library provides the ability to make IUnknown calls across processes. It also encapsulates all the "legwork" associated with launching components and establishing connections between components, so that both clients and servers are insulated from location differences.

When an application wants to instantiate a Component Object, it passes the CLSID of that Component Object class to the Component Object Library. The library uses that CLSID to look up the associated server code in the registration database. If the server is an executable, COM launches the EXE and waits for it to register its class factory through a call to CoRegisterClassObject (a class factory is the mechanism in COM used to instantiate new Component Objects). If the associated server code happens to be a DLL, COM loads the DLL and calls the DLL's exported function DllGetClassObject. COM uses the class factory's IClassFactory interface to ask it to create an instance of the Component Object, and returns a pointer to the requested interface back to the calling application. The calling application neither knows nor cares where the server application is run. It just uses the returned interface pointer to communicate with the newly created Component Object. The Component Object Library is implemented in COMPOBJ.DLL on Windows and OLE32.DLL on Windows NT and Windows 95.
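The lookup-then-create sequence can be pictured with a toy registry that maps class IDs to factory functions. This is a deliberately simplified sketch in portable C++ and not how the Component Object Library is built: ClassRegistry, Component, and MakeDemoRegistry are hypothetical, class IDs are plain strings, and std::unique_ptr stands in for COM's reference counting.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Component {
    virtual ~Component() = default;
    virtual std::string Name() const = 0;
};

struct PhoneBook : Component {
    std::string Name() const override { return "PhoneBook"; }
};

using ClassId = std::string;  // stand-in for a CLSID
using Factory = std::function<std::unique_ptr<Component>()>;  // stand-in for a class factory

// Toy "registration database": maps a class ID to the code that creates it.
class ClassRegistry {
public:
    // A server announces that it can create instances of clsid.
    void Register(const ClassId& clsid, Factory factory) {
        assert(factory != nullptr);
        factories_[clsid] = std::move(factory);
    }

    // A client asks for an instance by class ID alone; it never learns
    // which server code produced the object.
    std::unique_ptr<Component> CreateInstance(const ClassId& clsid) const {
        const auto it = factories_.find(clsid);
        if (it == factories_.end()) {
            return nullptr;  // no server is registered for this class
        }
        return it->second();
    }

private:
    std::map<ClassId, Factory> factories_;
};

// Text form of the class ID from Example 4.
const ClassId kClsidPhoneBook = "c4910d70-ba7d-11cd-94e8-08001701a8a3";

ClassRegistry MakeDemoRegistry() {
    ClassRegistry registry;
    registry.Register(kClsidPhoneBook, [] {
        return std::unique_ptr<Component>(new PhoneBook);
    });
    return registry;
}
```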

COM is designed to allow clients to transparently communicate with components, regardless of where those components are running. There is a single programming model for all types of Component Objects--for not only clients of those Component Objects, but also for the servers of those Component Objects. From a client's point of view, all Component Objects are accessed through interface pointers. A pointer must be in-process, and, in fact, any call to an interface function always reaches some piece of in-process code first. If the Component Object is in-process, the call reaches it directly. If the Component Object is out-of-process, then the call first reaches a "proxy" object provided by COM. This proxy generates the appropriate remote procedure call to the other process or the other machine. It can then transparently connect to objects that are in-process, cross-process, or remote.

From a server's point of view, all calls to a Component Object's interface functions are made through a pointer to that interface. Again, a pointer only has context in a single process, and so the caller must always be some piece of in-process code. If the Component Object is in-process, the caller is the client itself. Otherwise, the caller is a "stub" object provided by COM that picks up the remote procedure call from the proxy in the client process and turns it into an interface call to the server Component Object. As far as both clients and servers know, they always communicate directly with some other in-process code; see Figure 4.
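The proxy and stub roles can be sketched in a single process by letting a string stand in for the marshaled message. Everything here is an assumption made for illustration (ILookupLike, LookupStub, LookupProxy, the "method:argument" wire format, and the phone number); real COM proxies and stubs are generated code that marshals over RPC.

```cpp
#include <cassert>
#include <string>

struct ILookupLike {
    virtual std::string LookupByName(const std::string& name) = 0;
protected:
    ~ILookupLike() = default;
};

// The real object, living in the "server".
class PhoneBook final : public ILookupLike {
public:
    std::string LookupByName(const std::string& name) override {
        return name == "Daffy Duck" ? "555-0100" : "";  // made-up entry
    }
};

// Stub (server side): unpacks a request and turns it into an ordinary
// interface call on the real object.
class LookupStub {
public:
    explicit LookupStub(ILookupLike* target) : target_(target) {
        assert(target != nullptr);
    }
    std::string Dispatch(const std::string& request) {
        // Toy wire format: method name, ':', argument.
        const std::string::size_type sep = request.find(':');
        if (sep == std::string::npos) return "";
        if (request.substr(0, sep) == "LookupByName") {
            return target_->LookupByName(request.substr(sep + 1));
        }
        return "";
    }
private:
    ILookupLike* target_;
};

// Proxy (client side): implements the same interface, but packs each call
// into a message instead of doing the work itself.
class LookupProxy final : public ILookupLike {
public:
    explicit LookupProxy(LookupStub* channel) : channel_(channel) {
        assert(channel != nullptr);
    }
    std::string LookupByName(const std::string& name) override {
        // A direct call stands in for the remote procedure call.
        return channel_->Dispatch("LookupByName:" + name);
    }
private:
    LookupStub* channel_;
};

// Client code is identical whether it is handed the object or the proxy.
std::string ClientLookup(ILookupLike& lookup) {
    return lookup.LookupByName("Daffy Duck");
}
```

ClientLookup cannot tell which of the two it was given, which is the point of the design: location is decided when the pointer is obtained, not when it is used.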

The benefits of this local/remote transparency are:

Solving the Component-Software Problem

Let's see how the fundamental pieces of COM fit together to enable component software. COM addresses the four basic problems associated with component software: basic component interoperability, versioning, language independence, and transparent cross-process interoperability. COM solves these problems while satisfying the requirements of high performance and efficiency mandated by the commercial component marketplace.

COM provides basic component interoperability by defining a binary standard for vtable construction and method calling between components. Calls between COM components in the same process are only a handful of processor instructions slower than a standard direct function call, and no slower than a compile-time bound C++ object invocation.

A good versioning mechanism allows one system component to be updated without requiring updates to other components in the system. COM defines a system in which components continue to support the existing interfaces (used to provide services to older clients) as well as support new and better interfaces (used to provide services to newer clients).

Versioning in COM is implemented by using interfaces and IUnknown::QueryInterface. This mechanism allows a single system component to be updated on its own, without touching the others. This approach completely eliminates the need for things such as version repositories or central management of component versions. A software module is generally updated to add new functionality, or to improve existing functionality. In COM, you add new functionality to your Component Object by adding support for new interfaces. Since the existing interfaces don't change, other components that rely on those interfaces continue to work. Newer components that know about the new interfaces can use them. Because QueryInterface calls are made at run time (without expensive calls to some "capabilities database," as is done in some other system-object models), the capabilities of a Component Object are evaluated each time the component is used. When new features become available, applications that know how to use them will begin to do so immediately.
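A client written with this kind of versioning in mind might look like the sketch below: it probes for a newer interface at run time and falls back to the original one. The sketch is portable C++ with stand-in types. ILookup2, LookupByNameAndCity, PhoneBookV1, PhoneBookV2, and the returned strings are hypothetical, and reference counting is omitted to keep the negotiation itself in view.

```cpp
#include <cassert>
#include <string>

enum class Iid { Lookup, Lookup2 };  // stand-ins for real interface IDs

// Stand-in for the QueryInterface part of IUnknown.
struct IQueryable {
    virtual void* QueryInterface(Iid iid) = 0;  // nullptr if unsupported
protected:
    ~IQueryable() = default;
};

// The original interface: published once and never changed.
struct ILookup : IQueryable {
    virtual std::string LookupByName(const std::string& name) = 0;
};

// New functionality arrives as a new interface (hypothetical).
struct ILookup2 : IQueryable {
    virtual std::string LookupByNameAndCity(const std::string& name,
                                            const std::string& city) = 0;
};

// Older component: supports only the original interface.
class PhoneBookV1 final : public ILookup {
public:
    void* QueryInterface(Iid iid) override {
        return iid == Iid::Lookup ? static_cast<ILookup*>(this) : nullptr;
    }
    std::string LookupByName(const std::string&) override { return "555-0100"; }
};

// Updated component: keeps the old interface and adds the new one.
class PhoneBookV2 final : public ILookup, public ILookup2 {
public:
    void* QueryInterface(Iid iid) override {
        if (iid == Iid::Lookup)  return static_cast<ILookup*>(this);
        if (iid == Iid::Lookup2) return static_cast<ILookup2*>(this);
        return nullptr;
    }
    std::string LookupByName(const std::string&) override { return "555-0100"; }
    std::string LookupByNameAndCity(const std::string&,
                                    const std::string& city) override {
        return city + " 555-0100";
    }
};

// Newer client: prefers ILookup2 when present, otherwise uses ILookup.
std::string Find(ILookup* book) {
    assert(book != nullptr);
    if (void* p = book->QueryInterface(Iid::Lookup2)) {
        return static_cast<ILookup2*>(p)->LookupByNameAndCity("Daffy Duck", "Burbank");
    }
    return book->LookupByName("Daffy Duck");
}
```

The same client binary works against either component, and an older client that knows only ILookup keeps working against PhoneBookV2 without being rebuilt.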

Improving existing functionality is even easier. Because the syntax and semantics of an interface remain constant, you are free to change the implementation of an interface, without breaking other developers' components that rely on the interface. Windows and OLE use this technique to provide improved system support. For example, in OLE today, the Structured Storage service is implemented as a set of interfaces that currently use the C run-time file I/O functions internally. In Cairo (the next version of Windows NT), those same interfaces will write directly to the file system. The syntax and semantics of the interfaces remain constant. Only the implementation changes.

Existing applications will be able to use the new implementation without any changes. The combination of the use of interfaces (immutable, well-defined, "functionality sets" for components) and QueryInterface (the ability to determine at run time the capabilities of a specific Component Object) enables COM to provide an architecture in which components can be dynamically updated, without requiring updates to other reliant components. This is a fundamental strength of COM over other proposed object models. At run time, old and new clients can safely coexist with a given Component Object. Errors can only occur at easily handled times--at bind time or during a QueryInterface call.

Regarding language independence, COM allows you to implement components in a number of different programming languages and use these components from clients that are written using completely different programming languages. Again, this is because COM, unlike object-oriented programming languages, represents a binary-object standard, not a source-code standard. This is a fundamental benefit of a component software architecture over object-oriented programming languages. Objects defined in an OOP language typically interact only with other objects defined in the same language. This necessarily limits their reuse. At the same time, an OOP language can be used in building COM components, so the two technologies are actually quite complementary. COM can be used to "package" and further encapsulate OOP objects into components for widespread reuse, even within very different programming languages.

Achieving cross-process interoperability is, in many respects, the key to solving the component software problem. It would be relatively easy to design a component-software architecture if you assumed all component interactions occurred within the same process space. In fact, other proposed system-object models do make this basic assumption. Most of the work in defining a true component-software model involves the transparent bridging of process and network barriers. The design of COM began with the assumption that interoperability had to occur across process spaces, since most applications could not be expected to be rewritten as DLLs loaded into shared memory. By solving the problem of cross-process interoperability, COM also creates an architecture under which components can communicate across a network.

The Component Object Library is the key to providing transparent cross-process interoperability. This library encapsulates all the "legwork" associated with finding and launching components and with managing the communication between components. It insulates components from location differences, which means that Component Objects can interoperate freely with other Component Objects running in the same process, in a different process, or across the network without having separate code to handle each case. Because components are insulated from location differences, when a new Component Object Library is released with support for cross-network interaction, existing Component Objects will be able to work in a distributed fashion without requiring any source-code changes, recompilation, or redistribution to customers.

COM and the Client/Server Model

The interaction between Component Objects and the users of those COM objects is based on a client/server model. The term "client" already has been used to refer to some piece of code using the services of a Component Object. Because a Component Object supplies services, the implementor of that component is usually called the "server." The client/server architecture enhances system robustness: If a server process crashes or is otherwise disconnected from a client, the client can handle that problem gracefully and even restart the server, if necessary.

Because COM allows clients and servers to exist in different process spaces (as desired by component providers), crash protection can be provided between the different components making up an application. For example, if one component in a compound document fails, the entire document will not crash. In contrast, object models that are only in-process cannot provide this same fault tolerance. The ability to cleanly separate object clients and object servers in different process spaces is very important for a component-software standard that promises to support sophisticated applications. Unlike other competing object models, COM is unique in allowing clients to also represent themselves as servers. Many interesting designs have two or more components using interface pointers on each other, thus becoming clients and servers simultaneously. In this sense, COM also supports the notion of peer-to-peer computing. This is more flexible and useful than other proposed object models, where clients never represent themselves as objects.

Servers come in two flavors: in-process and out-of-process. "In-process" means the server's code executes in the same process space as the client (as a DLL). "Out-of-process" means the code runs in another process on the same machine (as an EXE), or in another process on a remote machine. The resulting three types of servers are called "in-process," "local," and "remote." Implementors of components choose the type of server based on the requirements of implementation and deployment. COM is designed to handle all situations, from those that require the deployment of many small, lightweight in-process components (like OLE Controls, but conceivably even smaller) up to those that require deployment of a huge component, such as a central corporate database server. To client applications, the basic mechanisms remain the same.

Creating Custom Interfaces

To create an interface, the developer uses the Interface Description Language (IDL) to create a description of the interface's methods. From this description, the Microsoft IDL compiler generates program header files so that application code can use that interface. It also creates code to compile into proxy and stub objects that enable an interface to be used cross-process. You could write this code by hand. However, allowing the MIDL compiler to do it for you is far less tedious. The Component Object Library contains proxy and stub objects for all of the standard predefined COM and OLE interfaces, so you will only use the IDL if you want to create a custom interface. Example 5 shows the IDL file used to define the custom interface, ILookup, which is implemented by the PhoneBook object. The IDL used and supplied by Microsoft is based on simple extensions to the IDL used in OSF DCE, a growing industry standard for RPC-based distributed computing.

Conclusion

COM is not a specification for how applications are structured, but rather a specification for how applications interoperate. For this reason, COM is not concerned with the internal structure of an application--that is your job, and it depends on the programming languages and development environments you use. Conversely, programming environments have no set standards for working with objects outside of the immediate application. For example, C++, which works extremely well with objects inside an application, has no support for working with outside objects. COM, through language-independent interfaces, picks up where programming languages leave off, providing network-wide interoperability of components.

In general, only one vendor needs to (or should) implement a COM Library for any particular operating system. For example, Microsoft is implementing COM on Windows, Windows NT, and the Apple Macintosh. Other vendors are implementing COM on other operating systems, including specific versions of UNIX.

It is important to note that COM draws a very clean distinction between the object model and the wire-level protocols for distributed services, which are the same on all platforms, and platform-specific, operating-system services (local security or network transports, for example). Developers are therefore not constrained to new and specific models for the services of different operating systems, yet they can develop components that interoperate with components on other platforms.

Only with a binary standard on a given platform and a single, wire-level protocol for cross-machine component interaction can an object model provide the type of structure necessary for full interoperability between all applications and between all different machines in a network. With a binary and network standard, COM opens the doors for a revolution in innovation without a revolution in programming or programming tools.

The Problem with Implementation Inheritance

Implementation inheritance--the ability of one component to "subclass" or inherit some of its functionality from another component--is a very useful technology for building applications. Implementation inheritance, however, can create many problems in a distributed, evolving object system.

The problem is that the "contract," or relationship, between components in an implementation hierarchy is not clearly defined; it is implicit and ambiguous. When the parent or child component changes its behavior unexpectedly, the behavior of related components may become undefined. This is not a problem when the implementation hierarchy is under the control of a defined group of programmers who can update all of the components simultaneously. But it is precisely this ability to control and change a set of related components simultaneously that differentiates an application, even a complex application, from a true distributed-object system. So while implementation inheritance can be a very good thing for building applications, it is not appropriate for a system object model that defines an architecture for component software.

In a system built of components provided by a variety of vendors, it is critical that a given component provider be able to revise, update, and distribute (or redistribute) a product without breaking existing code in the field that is using a previous revision of the component. In order to achieve this, it is necessary that the actual interface on the component (including both the actual semantic interface and the expected behavior) used by such clients be crystal clear to both parties. Otherwise, how can the component provider be sure to maintain that interface and thus not break existing clients? From observation, the problem with implementation inheritance is that it is significantly easier for programmers to be unclear about the actual interface between a base and derived class than it is to be clear. This usually leads implementors of derived classes to require source code to the base classes; in fact, most application-framework development environments that are based on inheritance provide full source code for this exact reason.

The bottom line is that inheritance, while very powerful for managing source code in a project, is not suitable for creating a component-based system where the goal is for components to reuse each other's implementations without knowing any internal structures of the other objects. Inheritance violates the principle of encapsulation, the most important aspect of an object-oriented system.

--S.W. & C.K.

COM Reusability Mechanisms

The key to building reusable components is black-box reuse, which means that the piece of code attempting to reuse another component knows nothing, and does not need to know anything, about the internal structure or implementation of the component being used. In other words, the code attempting to reuse a component depends upon the behavior of the component and not the exact implementation--implementation inheritance does not achieve black-box reuse.

To achieve black-box reusability, COM supports two mechanisms through which one Component Object may reuse another: containment/delegation and aggregation. For convenience, the object being reused is called the "inner object" and the object making use of that inner object is the "outer object."

These two mechanisms are illustrated in Figure 5. The important part to both these mechanisms is how the outer object appears to its clients. As far as the clients are concerned, both objects implement interfaces A, B, and C. Furthermore, the client treats the outer object as a black box and thus does not care, nor does it need to care, about the internal structure of the outer object--the client only cares about behavior.

Containment is simple to implement for an outer object. The process is like a C++ object that itself contains a C++ string object. The C++ object would use the contained string object to perform certain string functions, even if the outer object is not considered a "string" object in its own right.
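Containment and delegation might be sketched as follows in portable C++. ISpell, Dictionary, Editor, and ContainmentDemo are hypothetical names chosen for the illustration, and the sketch uses ordinary C++ classes rather than real Component Objects.

```cpp
#include <cassert>
#include <string>

struct ISpell {
    virtual bool IsKnownWord(const std::string& word) const = 0;
protected:
    ~ISpell() = default;
};

// Inner object: an existing component that is being reused.
class Dictionary final : public ISpell {
public:
    bool IsKnownWord(const std::string& word) const override {
        return word == "component" || word == "interface";
    }
};

// Outer object: exposes ISpell itself and forwards the work to the inner
// object, which it uses only through that object's public interface.
class Editor final : public ISpell {
public:
    bool IsKnownWord(const std::string& word) const override {
        if (word.empty()) {
            return true;                 // the outer object's own policy
        }
        return inner_.IsKnownWord(word); // delegate everything else
    }
private:
    Dictionary inner_;  // contained, and never handed out to clients
};

// A client sees one black box that implements ISpell.
bool ContainmentDemo() {
    Editor editor;
    const ISpell& asClient = editor;
    assert(asClient.IsKnownWord("component"));
    return !asClient.IsKnownWord("compnent");
}
```

The outer object is free to add behavior before or after delegating, as the empty-string check does here, and the inner object's implementation can change without the client noticing.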

Aggregation is almost as simple to implement. The trick here is for COM to preserve the function of QueryInterface for Component-Object clients even as an object exposes another Component-Object's interfaces as its own. The solution is for the inner object to delegate IUnknown calls in its own interfaces, but also allow the outer object to access the inner object's IUnknown functions directly. COM provides specific support for this solution. Both Containment/Delegation and Aggregation provide for reuse of components without violating the OO principle of encapsulation.

--S.W. & C.K.

Figure 1: Component Object Model serves as the foundation for component-software services.

Figure 2: Virtual function tables (vtables) are a binary standard for accessing component services.

Figure 3: (a) A typical component object that supports three interfaces A, B, and C; (b) interfaces extend toward the clients connected to them; (c) two applications may connect to each other's objects, in which case they extend their interfaces toward each other.

Figure 4: Clients always call in-process code; Component Objects are always called by in-process code. COM provides the underlying transparent RPC.

Figure 5: (a) Containment of an inner object and delegation to its interfaces; (b) aggregation of an inner object, where the outer object exposes one or more of the inner object's interfaces as its own.

Example 1: C++-style interface definition generated by the MIDL compiler for ILookup, a simple custom interface.

interface ILookup : public IUnknown
{
  public:
  // Parameter types match the IDL in Example 5: LPSTR in, WCHAR string out.
  virtual HRESULT __stdcall LookupByName(LPSTR lpName, WCHAR **lplpNumber) = 0;
  virtual HRESULT __stdcall LookupByNumber(LPSTR lpNumber, WCHAR **lplpName) = 0;
};

Example 2: The IUnknown interface is supported by all Component Objects.

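interface IUnknown
{
    // REFIID is a const reference to an IID, so constants such as
    // IID_ILOOKUP can be passed directly.
    virtual    HRESULT  __stdcall QueryInterface(REFIID iid, void** ppvObj) = 0;
    virtual    ULONG    __stdcall AddRef() = 0;
    virtual    ULONG    __stdcall Release() = 0;
};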

Example 3: Calling QueryInterface() on the component PhoneBook.

ILookup *pLookup = NULL;    // interface pointer filled in by QueryInterface
WCHAR *pszNumber = NULL;    // receives the string returned by LookupByName
HRESULT hRes;

// call QueryInterface on the Component Object PhoneBook, asking for
// a pointer to the ILookup interface identified by a unique interface ID.
hRes = pPhoneBook->QueryInterface( IID_ILOOKUP, (void **)&pLookup);
if( SUCCEEDED( hRes ) )
{
    // use ILookup interface pointer
    pLookup->LookupByName("Daffy Duck", &pszNumber);
    // finished using the ILookup interface pointer
    pLookup->Release();
}
else
{
    // failed to acquire ILookup interface pointer
}

Example 4: Two GUIDs, one CLSID for a phone-directory class, and an IID for a custom interface that retrieves phone-directory information.

DEFINE_GUID(CLSID_PHONEBOOK, 0xc4910d70, 0xba7d, 0x11cd, 0x94, 0xe8,
0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3);

DEFINE_GUID(IID_ILOOKUP, 0xc4910d71, 0xba7d, 0x11cd, 0x94, 0xe8,
0x08, 0x00, 0x17, 0x01, 0xa8, 0xa3);

Example 5: IDL file for a custom interface, ILookup, used by the PhoneBook project.

[
    object,
    uuid(c4910d71-ba7d-11cd-94e8-08001701a8a3), // IID of the ILookup interface
    pointer_default(unique)
]
interface ILookup : IUnknown   // ILookup interface derives from IUnknown
{
    import "unknwn.idl";       // Bring in the supplied IUnknown IDL
    HRESULT LookupByName(      // Define member function LookupByName
             [in] LPSTR lpName,
             [out, string] WCHAR ** lplpNumber);
    HRESULT LookupByNumber(    // Define member function LookupByNumber
             [in] LPSTR lpNumber,
             [out, string] WCHAR ** lplpName);
}


Copyright © 1994, Dr. Dobb's Journal