Mark is a senior consultant with Semaphore Training, a consulting and training company specializing in object technology, client/server development, and distributed computing. Mark can be contacted at 76605.2346@compuserve.com.
Distributed-object computing is swiftly shaping up as the next computer-industry battleground. Unlike recent skirmishes involving multimedia or pen-based computing, distributed-object computing will touch everyone working in the modern enterprise. This is because distributed computing represents the direction in which the broad mainstream of information technology will likely evolve, as mainframe-based, enterprise-wide systems link up with desktop PCs and departmental servers.
At the heart of distributed computing are "interoperable" or "component" objects that go beyond the traditional boundaries imposed by programming languages, process address space, or network interfaces (see "Component Object Wars Heat Up," by Ray Valdés, Dr. Dobb's Developer Update, May 1994). What makes an object interoperable is the design of its object model. In this article, I'll survey a set of complex, rapidly changing object technologies and examine the major object-model designs. Despite its length, this discussion is incomplete. Because the technologies I'll discuss here span the entire range of computing from system- and application-level to network-oriented software and languages, a myriad of technical details cannot be addressed. In many cases, the technologies in competition for both market- and mindshare are not directly comparable--too many apples and oranges are in the mix. In other cases, many technologies are in flux, some being nothing more than design ideas and press releases. For a few well-publicized contenders, not much information is available to programmers beyond an initial white paper or two. For other contenders, the implementations are very real, but the prospects of a small company influencing the mainstream are remote.
Distributed processing is a natural outgrowth of the computer-hardware industry's ability to produce smaller, more powerful, and less expensive CPUs. On a per-transaction basis, simple business applications are usually less expensive when executing on smaller computers than on their larger cousins. However, this benefit is reduced--and may become negative--when large, complex applications are pulled off mainframes.
In addition to these trends in hardware, the nature of business-application end users has evolved toward more local autonomy (and attendant responsibility) in the definition and automation of processes. The typical computing installation of the 1970s was a centralized computing resource that housed data and applications accessed by users connected to telecommunications links; today's systems are more accurately described as a network of data resources and local processors, all cooperating to provide for the flow of information between the components of an enterprise. The current realization of this model is called "client/server" because it consists mostly of data models residing on servers and applications (clients) residing on local processors that access the server-based data across a local- or wide-area network.
Distributed processing envisions sharing applications and functionality in the same way that data is currently shared. However, instead of having monolithic applications, as in the current model, local applications in a distributed-processing environment take on the role of controllers: They coordinate the activity of "functions" which provide not only data, but the code to manipulate it. Code and data are provided in a form that can be accessed by multiple processors in a heterogeneous environment, without regard to physical location. For example, a forecasting application with access to sales data from a server might need sophisticated statistical functions to make its predictions. Rather than building the statistical functions into the application, the application would locate a statistics package on the network and ship the data off as part of a request for certain processing services.
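A minimal sketch of this controller model, with the "network" reduced to an in-process registry (all names here are invented for illustration, not drawn from any actual system):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical service registry standing in for a network-wide directory.
// A real distributed environment would resolve names across machines;
// here the "network" is simply a map from service name to implementation.
using StatisticsFn = std::function<double(const std::vector<double>&)>;

std::map<std::string, StatisticsFn>& registry() {
    static std::map<std::string, StatisticsFn> services;
    return services;
}

// A provider somewhere on the "network" registers a statistics package.
void register_mean_service() {
    registry()["statistics.mean"] = [](const std::vector<double>& data) {
        return std::accumulate(data.begin(), data.end(), 0.0) / data.size();
    };
}

// The forecasting application acts as a controller: it locates the
// function by name and ships its data off as part of the request,
// rather than linking the statistical code into itself.
double forecast_average(const std::vector<double>& sales) {
    return registry().at("statistics.mean")(sales);
}
```

In a true distributed environment the lookup would cross process and machine boundaries; the controller code, however, would look much the same.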
In short, the principal requirements for a distributed-computing environment are: location-transparent access to functions and data, without regard to address space or machine; a language-independent means of describing the interfaces to services; naming and directory facilities for locating objects on the network; and a transport mechanism to carry requests, parameters, and results between heterogeneous systems.
How is the idea of distributed objects different from the basic ideas of distributed processing? Simply put, objects are an enabling technology for distributed systems, just as they have been for client/server, multimedia, document processing, and other applications. In the case of distributed systems, the concerns--naming, address-space conversion, transport protocols, interface description, and the like--are many. Each of these areas is complex, and objects are at their best when helping us to abstract and deal with complexity.
Not only do objects provide a tractable way of organizing the complexity of a modern operating system, they can also simplify distributed processing. Objects, with their natural combination of data and behavior and strict separation of interface from implementation, make a neat, useful package for distributing data and processes to end-user applications. In previous-generation structured approaches, a process is considered separately from the data it acts upon, complicating the issue of where to locate each in the design of a distributed system. Objects are complete entities from a problem domain. For example, a video-server object with a formally described interface that internally maintains all the state and data needed to perform its task fits in as any other object would, with the added (and hopefully transparent) consideration that it is not necessarily located in the local address space.
One area of ongoing confusion is the difference between application-level component technologies and the object models that support them. At the lowest level in Figure 1 are "object models" such as the System Object Model (SOM/DSOM) from IBM and Component Object Model (COM) from Microsoft. These system-level technologies are basically intended to solve the problem of tight binary coupling between an application and the objects it relies upon. Consider a C++ application that uses several classes whose methods are implemented in DLLs. Conveniently, a DLL is not part of the application's code and so can be upgraded or altered without affecting the application, as long as the interface remains the same. Unfortunately, this idea falls apart if changes are made to a DLL-based class that alter the size of the object, or even just the layout of the virtual function table. In that case, the calling application may need to be recompiled even though its source has not changed textually.
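A few lines of C++ make the problem concrete (a sketch, not any vendor's actual layout):

```cpp
#include <cassert>

// A sketch of the binary-coupling problem. Suppose version 1 of a
// DLL-based class ships, and clients compile in its size and layout:
struct WidgetV1 {
    virtual ~WidgetV1() {}
    virtual int draw() { return 1; }
    int state;
};

// Version 2 adds a virtual method and a data member. The interface the
// client actually used (draw) is textually unchanged, yet the object's
// size and virtual-table layout are no longer what the already-compiled
// client expects.
struct WidgetV2 {
    virtual ~WidgetV2() {}
    virtual int draw() { return 1; }
    virtual int redraw() { return 2; }  // new vtable slot
    int state;
    double cache;                       // new data: the object grows
};
```

A client that allocated `sizeof(WidgetV1)` bytes, or indexed the old vtable, must be recompiled against version 2 even though its own source is untouched.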
To sever the binary bond between client and server object implementations, an object model is defined at a level of abstraction which subsumes the language object model and renders it transparent. The model is usually expressed in terms of an Interface Definition Language (IDL), which is processed independently of the implementation language. IDLs predate object computing, and many Remote Procedure Call (RPC) mechanisms (such as DCE) use them to define the calling protocol of a procedure. An IDL is a way of defining the interface to a service; often, the mechanism generates stub code callable by the application. On the implementation end, it describes the interface to the code that executes the function and generates skeletons which can be filled in to perform the operation. The RPC mechanism bridges the gap between the two. Since the IDL must be translated (or mapped) into the implementation language, this approach also fulfills the language-independence requirement: The stubs can be generated in any language for which the IDL has a mapping.
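The stub/skeleton arrangement can be sketched in a few lines of C++ (all names hypothetical; the "transport" here is a direct call standing in for the RPC machinery):

```cpp
#include <cassert>
#include <string>

// Hand-written sketch of what an IDL compiler generates. The client
// links against the stub; the implementor fills in the skeleton; the
// dispatch function stands in for the RPC mechanism bridging the two.

// Implementation side: a generated skeleton, filled in by hand.
int spell_check_impl(const std::string& word) {
    return word == "teh" ? 1 : 0;   // 1 = misspelled
}

// The bridge the RPC mechanism would provide (marshaling omitted).
int dispatch(const std::string& op, const std::string& arg) {
    if (op == "spell_check") return spell_check_impl(arg);
    return -1;
}

// Client side: a generated stub, callable like any local function.
// A real stub would marshal the arguments and send them across the wire.
int spell_check(const std::string& word) {
    return dispatch("spell_check", word);
}
```

From the client's point of view, `spell_check` is an ordinary function; the implementation could be rewritten, relocated, or reimplemented in another language without the client noticing.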
The main difference between interface definitions for procedural mechanisms and those for object models is that in the object model the interface is part of a semantic construct that represents an object. Depending on the specific model, this construct may have any or all of the characteristics and advantages expected of objects, including encapsulation, inheritance, and polymorphism. As an alternative to the static IDL approach of issuing requests, many models offer a dynamic means of invoking requests. In dynamic invocation, the interface is determined at run time, and the request data is built up into a structure, which is passed to the system. Some implementations offer both kinds of invocation.
Loose coupling of client and server objects would go a long way toward making objects more reusable. Our ultimate goal, however, is the use of objects across process address spaces, either within a single processor or across multiple processors, in a heterogeneous networked environment. The systems and proposed designs discussed here don't necessarily achieve that. Some systems only work across process boundaries on a single machine, and some don't do even that much. Others work seamlessly across multiple machines in a net, but are subject to other constraints.
High above the fray of the operating-system wars are issues of concern to application designers and users. These are the component-integration facilities as found in OLE (Microsoft), OpenDoc (Apple), and OpenStep (Next). As Figure 1 shows, these facilities rely on lower-level object models in order to implement functions such as linking and embedding, drag-and-drop, in-place activation, and scripting. These facilities all revolve around a "document-centric" end-user model for applications. In this model, a "container" application serves as the framework for presenting the user with a number of individual "objects" or components, each of which is self-contained in terms of its data and the actions which can be taken on that data. Such groupings of objects are often referred to as "compound documents," and so we might refer to all of these technologies as "compound-document technologies." Figure 2 depicts a typical example of a compound document containing text, image, sound, and spreadsheet table objects. (For an introduction to compound documents, see "Compound Documents" by Lowell Williams, DDJ, March 1993.)
Compound documents make a nice model for end-user utilization of shared and distributed objects. They are compelling enough that most of the major vendors of these technologies are either producing their own high-level integration model or cooperating in the development of someone else's. Examples are Microsoft's OLE 2.0, which utilizes COM as its enabler and is a shipping product, and the OpenDoc consortium, whose technology will eventually rest upon IBM's SOM and its fully distributed progeny DSOM. The application frameworks being designed by Taligent (a startup funded by Apple, IBM, and HP) will also rest on SOM/DSOM.
Compound documents are not simply an end-user technology, however. They rely upon the capabilities of objects to describe themselves to applications and export their interfaces for use by those applications. In essence, these objects are dynamically linked modules. Application developers are already finding OLE 2.0 useful both for integration at the object level and for allowing applications to export interfaces which can be dynamically linked to by other applications.
The fundamental idea behind interoperable objects is to pass through existing boundaries such as those in Figure 3. In today's model of object-oriented programming, there is a tight binary coupling between an application and the classes of objects it uses. In many mainstream applications, everything is implemented in a single language running in a single process located on a single machine under a single operating system. The first boundary to fall is address space. An "interprocess object model" allows a process in one address space to request the services of an object in another, or two processes to share an object in a third address space.
The next boundary is the machine. It requires only a short leap of the imagination to move from the idea of objects shared across address spaces to the notion of objects shared among many interconnected processors. This short leap, however, spans a great deal of complexity. Any interprocess object model must be able to translate the data associated with requests between memory models. A technology which crosses the machine boundary must also locate the server object, establish communication with it, pack up the request and parameters and ship them off, then wait for the results, unpack and translate them, and deliver them back to the application. This is the most basic requirement. Add to it the increased need for security, versioning, repositories, name-collision resolution, and a host of other details inherent in distributing objects across a network and you have the makings of, not a short leap of imagination, but a big hurdle of technological complexity. Only those technologies that cross over to an interprocess/interprocessor model can be called "distributed-object technologies."
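One link in that chain--translating request data into a form safe to ship between different memory models--can be sketched as follows (a simplified illustration, not any ORB's actual wire format):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Marshal a request parameter into a byte stream in a fixed (network)
// byte order, so a receiver with a different memory model -- say,
// little-endian talking to big-endian -- can reconstruct it exactly.
std::vector<uint8_t> marshal_int32(int32_t value) {
    std::vector<uint8_t> buf(4);
    uint32_t v = static_cast<uint32_t>(value);
    for (int i = 0; i < 4; ++i)
        buf[i] = static_cast<uint8_t>((v >> (24 - 8 * i)) & 0xFF);
    return buf;
}

// The receiving side reverses the process before delivering the
// parameter to the server object.
int32_t unmarshal_int32(const std::vector<uint8_t>& buf) {
    uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | buf[i];
    return static_cast<int32_t>(v);
}
```

Multiply this by every parameter type, structure, and result in every request, and the scale of the machinery an ORB must supply becomes apparent.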
Two other boundaries don't have much to do with whether a technology is distributed or not, but they do affect the potential for its application in the real world. These boundaries are the programming language and operating system, both important practical considerations for enterprise-wide systems.
The ultimate goal is a model which allows objects written in any language to be shared among applications written in any other language, running on any machine in a network, and under any operating system.
The combatants lining up on the interoperable-object battleground range from large, cross-industry consortia and established system vendors to small, entrepreneurial software houses. Due to space constraints, I'll focus here on those contenders most likely to affect the mainstream, much the way that Windows has done. Unfortunately, many of the smaller contestants who have developed proprietary solutions that actually work now will get only brief mention.
A bird's eye view of the battlefield reveals that the principal tug-of-war is currently between the Object Management Group (OMG) and Microsoft. OMG is a consortium of more than 300 hardware, software, and end-user companies, including every heavyweight in the business and, nominally, Microsoft itself. OMG was founded in 1989 by 11 companies including Digital, Hewlett Packard, Hyperdesk, NCR, and SunSoft. Those companies, along with ObjectDesign, were authors of the "Common Object Request Broker Architecture" (CORBA) specification Version 1.0, released in October 1991. It was followed in March 1992 by Version 1.1; the group is currently working on revision 2.0, due sometime in 1994. CORBA specifies the architecture of an Object Request Broker (ORB), whose job it is to enable and regulate interoperability between objects and applications. The ORB is part of a larger vision called the "Object Management Architecture" (OMA).
It is more than passing strange to compare CORBA to Microsoft's OLE 2.0. Among the aspects of OLE 2.0 is an application-level, component-integration technology that has no real counterpart in the OMG world. OLE is built on a foundation called the "Component Object Model" (COM) which performs some of the same tasks as an ORB, but at a different scale, using different techniques. Also, Microsoft's idea of an object model differs greatly from that of the rest of the industry. In fact, the whole basis for comparing the two technologies rests on the promise that in the future they will have similar capabilities.
If it were simply OMG versus Microsoft, this article would be much shorter. There's more to the story, however. OMG's CORBA specification lays down the plans for an architecture, but does not address implementation. In addition, the spec itself leaves many areas undefined. The result is that, while you can address CORBA's overall design and intent, when turning to real or promised implementations, you are effectively faced with several proprietary technologies. IBM, Digital, Hewlett Packard, Iona, ExperSoft, and SunSoft all have (or have planned) implementations of the CORBA spec.
IBM's System Object Model is one major CORBA-compliant implementation. In some ways, the Microsoft versus OMG contest has evolved into a battle between Microsoft and IBM. Both companies offer technologies that are now shipping; both are engaged in trying to shift the loyalties of desktop users from the competitor's operating system to the homegrown alternative; and both consider their particular visions of shared components as strategic technologies which will serve them well in the larger contest for the operating-system dollar.
While the behemoths line up to do battle, a number of small companies have been quietly producing tools that enable some subset of the full capabilities of distributed-object computing to be realized. Some of these tools are proprietary; others are headed toward CORBA or COM compliance. Examples include RDO from Isis, Snap from Template Software, SynchroWorks from Oberon, ILOG Broker/Server from ILOG Inc., and OpenBase-SIP from Prism Technologies. Each of these vendors has a shipping product and testimonials from users who say they are using it now to create distributed applications. That's more than some of the larger companies I'll cover can claim, which is a bit of irony, but there you have it.
The Object Management Group was founded in 1989 to adopt a standard for the interoperation of software--specifically, object-oriented software--across operating systems and platforms in a heterogeneous networked environment. CORBA is a specification of an architecture and interface which allows applications to make requests of objects in a transparent, independent manner, regardless of language, operating system, or location. The nature of objects--what they are and how they are created, destroyed, and manipulated--is specified in the OMG object model, a part of the OMA.
The OMA spec is OMG's complete vision of the distributed environment. While the CORBA spec focuses solely on the interaction of objects and the mechanisms which enable it, OMA defines a broad architecture of services and relationships within an environment, as well as the object and reference models. As Figure 4 illustrates, OMA is built upon the ORB services defined by CORBA which provide the interaction model for the architecture. The environment is made richer with the addition of Object Services and Common Facilities, both intended to serve as building blocks for assembling the frameworks within which distributed solutions are built.
Object Services is an area covered by yet another OMG specification, Common Object Services Specification (COSS), that defines a set of objects which perform fundamental operations, such as lifecycle, naming, event, and persistence services. The second stage of the COSS spec, expected late in 1994, defines relationships, externalization, transactions, and concurrency control. Additional stages planned for the next two years will address issues such as security, licensing, queries, and versioning.
Common Facilities (CF) are the newest area of effort by the OMG. Unlike CORBA and Object Services, which are low-level fundamental operations, the CF has an application-level focus, and defines objects which provide key workgroup-support functions: printing, mail, database queries, bulletin boards and newsgroups, and compound documents. The OMG envisions this as the layer most often used by developers working within a distributed environment. This spec is also due sometime in 1994.
The CORBA specification describes the OMG object model, which underlies CORBA and all of the OMA, as "classical": Clients send messages to servers, and a message identifies an object and zero or more parameters to the request. The OMG model strictly separates interface from implementation. The model itself is concerned only with interfaces, to the extent that "interface" and "object type" are synonymous. This approach is used by other technologies (such as OLE) and results from the model's obligation to define the interface between components regardless of their implementation language.
In C++ programs, an object is identified by its unique memory address. In the OMG model, objects are identified by "references"--an implementation-defined type guaranteed to identify the same object each time the reference is used in a request. The CORBA spec is silent on how references are implemented. ORB vendors have implemented references as objects which carry enough descriptive information about the referenced object to make them effectively unique. Nevertheless, the CORBA spec explicitly states that references are not guaranteed to be unique. The OMG chose not to define a Universal Unique Identifier scheme in Version 1.1 of the specification because of concerns about management and interaction with legacy applications that have a different idea of an object ID. The lack of a universal means of "federating" (that is, making globally compatible) the names used to reference objects is a failing that the OMG intends to address in Version 2.0 of the specification.
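As an illustration only--the spec deliberately leaves the representation to the vendor--a reference might carry fields like these:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of what a vendor's object reference might carry.
// The only contract CORBA imposes is that the same reference denotes
// the same object each time it is used in a request.
struct ObjectRef {
    std::string host;        // where the implementation lives
    std::string adapter;     // which object adapter manages it
    std::string object_key;  // opaque key meaningful to that adapter
};

// Two distinct reference values may still denote one object, which is
// one reason the spec declines to promise uniqueness of references.
bool denotes_same_object(const ObjectRef& a, const ObjectRef& b) {
    return a.host == b.host && a.adapter == b.adapter &&
           a.object_key == b.object_key;
}
```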
Objects in the OMG model have a life cycle: They are created and destroyed dynamically in response to the issuance of requests. The specification does not define a means of allowing the application to create and destroy objects; however, vendors such as IBM have implemented this capability in their versions. Objects can also participate in any of the normal types of relationships, the most important perhaps being subtype-supertype relationships. Multiple inheritance is also permitted, although in this sense it is limited to interface inheritance only. Since the OMG model does not deal with implementation, there is no provision in the spec for implementation inheritance. Inheritance between object interfaces is specified syntactically using the OMG's IDL. Nothing prevents the developer of a set of server objects from using implementation inheritance in the design of the servers, but the dependency is not made explicit in the Interface Definition syntax. The ORB is unaware that a set of servers accessed through an interface hierarchy is also related by implementation inheritance; this therefore becomes a maintenance and management concern.
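The distinction between interface and implementation inheritance can be sketched in C++ (hypothetical classes, for illustration):

```cpp
#include <cassert>

// Interface inheritance, as IDL expresses it: the subtype promises at
// least the operations of its supertype; no code is shared.
struct Printer {                        // IDL: interface Printer
    virtual ~Printer() {}
    virtual int print() = 0;
};
struct ColorPrinter : Printer {         // IDL: interface ColorPrinter : Printer
    virtual int print_color() = 0;
};

// Implementation inheritance, which the ORB never sees: the server
// author reuses a concrete helper class, but only the interface
// hierarchy above is recorded in the IDL.
struct PrintEngine {
    int render() { return 1; }
};
struct ColorPrinterImpl : ColorPrinter, private PrintEngine {
    int print() override { return render(); }
    int print_color() override { return render() + 1; }
};
```

If `PrintEngine` changes, every implementation built on it is affected, but nothing in the interface definitions reveals that dependency to the ORB--precisely the maintenance concern noted above.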
The OMG model has a strong concept of "types"--identifiable entities which have an associated predicate defined over a set of values. Where the predicate is true, the value is said to satisfy and be a member of the type. Types are used to restrict and characterize operations. The two primary categories of types in the object model are Basic and Constructed. Basic types are nonobject types which represent fundamental data types: signed and unsigned short and long integers, 32- and 64-bit IEEE floating-point numbers, ISO Latin-1 characters, Booleans, enums, strings, and a nonspecific type, any. In addition, a special 8-bit data type, the octet, is guaranteed not to undergo conversion when transferred from one system to another.
Constructed types are more-complex, higher-level entities, the most important of which is the Interface type. An object is an "instance" of an Interface type if it satisfies the set of operations defined by the type. An Interface type is satisfied by any value which references an object that satisfies the interface. Other types include Structs, Sequences, Unions, and Arrays. Structs are pure data structures which operate much like C structs; Unions operate like C unions. Sequences are a variable-length array type which may contain any single type of object, including other Sequences. Arrays are fixed-length arrays of a single type. Figure 5 illustrates the OMG-type hierarchy.
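A rough C++ analogue of two of the constructed types, assuming a hypothetical language mapping (the names are invented; vendor mappings differ):

```cpp
#include <cassert>
#include <string>
#include <vector>

// An IDL Struct maps naturally to a plain data structure:
// IDL: struct Employee { string name; long id; };
struct Employee {
    std::string name;
    long id;
};

// An IDL Sequence is a variable-length array of a single type, which a
// C++ mapping might render as a vector:
// IDL: typedef sequence<Employee> EmployeeList;
using EmployeeList = std::vector<Employee>;
```

Sequences of Sequences, Unions, and fixed-length Arrays round out the set, each with an obvious counterpart in the implementation language once a mapping is defined.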
The job of the Object Request Broker is to manage the interaction between client and server objects. This includes nearly all the responsibilities of a distributed computing system already mentioned, from location and referencing to "marshaling" of request parameters and results. To provide this capability, the CORBA specification defines an architecture of interfaces, all of which may be implemented in different ways by different vendors. Figure 6 depicts the CORBA architecture, which consists of three specific components: client-side interface, implementation-side interface, and ORB core.
The client-side architecture provides clients with interfaces to the ORB and to server objects. It consists of the Dynamic Invocation, IDL stub, and ORB services interfaces. In general, the IDL stub interface comprises functions generated from IDL interface definitions and linked into the client program. The function stubs represent a language mapping between the client language and the ORB implementation. Thus, ORB capabilities can be made available to clients written in any language for which stubs can be generated from IDL specifications. There is currently an accepted language mapping for C; mappings for C++ and Smalltalk are planned. All vendors of CORBA implementations provide a C++ mapping based on a not-yet-approved OMG proposal. The use of the stub interface brings the ORB right into the application programmer's domain: The client interacts with server objects by invoking functions, just as it would for local objects.
The Dynamic Invocation interface is a mechanism for specifying requests at run time, rather than calling linked-in stubs. The dynamic interface is necessary when the object interface cannot be known at compile time. It is accessed using a call (or series of calls) to the ORB in which the object, request, and parameters are specified. The client code is responsible for specifying the types of the parameters and expected results. This information may come from an Interface Repository, about which more will be said later. Most clients will probably use stubs to access object services. In any case, the receiver of the request--the server object--cannot tell whether the request was sent via the stub or dynamic interfaces.
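A sketch of the idea (names invented; a real DII passes the request through the ORB rather than a local function):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Dynamic invocation: instead of calling a compiled-in stub, the client
// builds the request -- target, operation, parameters -- as data at run
// time and hands it to the broker.
struct Request {
    std::string target;       // object reference, greatly simplified
    std::string operation;
    std::vector<long> args;   // the client supplies the parameter types
};

// Stand-in for the ORB plus server object. As the spec requires, the
// server cannot tell whether a request arrived via a stub or was
// constructed dynamically.
long invoke(const Request& req) {
    if (req.target == "calculator" && req.operation == "add") {
        long sum = 0;
        for (long a : req.args) sum += a;
        return sum;
    }
    return -1;   // unknown target or operation
}
```

A static stub would hide exactly this construction behind an ordinary function call; the dynamic interface exposes it so that interfaces discovered at run time (say, from an Interface Repository) can still be invoked.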
The last of the client-side interfaces are the ORB services, functions of the ORB which may be accessed directly by the client code. An example might be retrieving a reference to an object. The details of these services are mostly undefined by the specification.
ORB services are the one component that the architecture of the implementation-side interface shares with the client-side architecture. Additionally, the implementation-side interface consists of the IDL skeleton interface and the Object Adapter. The skeleton interface is an "up-call" interface, through which the ORB calls the method skeletons of the implementation to invoke a method requested by a client. Most functionality provided by the ORB to object implementations is supplied through the IDL skeletons and the Object Adapter. The OMG expects only a few services to be common across all objects and accessed via the ORB core.
The Object Adapter is the means by which object implementations access most ORB services, including generation and interpretation of object references, method invocation, security, activation (the process of locating an object's implementation and starting it running), mapping references to implementations, and object registration. The adapter actually exports three separate interfaces: a private interface to the skeletons, a private interface to the ORB core, and a public interface for use by implementations. The CORBA specification is less than concrete about the services an adapter needs to support, but it is clear that the adapter is intended to isolate object implementations from the ORB core to as great an extent as possible.
The spec envisions a variety of adapters providing services needed by specific kinds of objects. The most generic adapter described is the Basic Object Adapter (BOA). The BOA allows a variety of object implementation schemes to be accommodated, from separate programs for each method, to separate programs for each object, to a shared implementation for all objects of a given type (the C++ model). The specification also describes adapters suited to objects stored in libraries and object-oriented databases.
Most interprocess object models are expressed in terms of a language for defining interfaces. Since the early days of RPC mechanisms, these languages have been known as "Interface Definition Languages" (IDLs). The basic purpose of an IDL is to allow the language-independent expression of interfaces, including the complete signatures (name, parameters, parameter and result types) of methods. This is accomplished by providing a mapping between the IDL syntax and whatever language is used to implement client and server objects. The two need not be implemented using the same language--and in fact it is anticipated that they will not be--as long as mapping is available for the client and server implementation languages.
CORBA IDL is a C-like language with many constructs similar to C++. In fact, the specification credits Ellis and Stroustrup's Annotated C++ Reference Manual as the source for the adaptation which became the CORBA IDL specification. IDL obeys the same lexical rules as C++, while introducing a number of new keywords specific to the needs of a distributed system. If you're familiar with C++, you shouldn't have any trouble adapting to IDL. Writing interface definitions in IDL is a bit like writing class declarations in C++. Since IDL is expressly for interface definition, it lacks the constructs of an implementation language, such as definitions (which actually create storage for a variable or object), flow control, and operators. In particular, there is no concept of public and private parts of the interface declaration, since the notion of encapsulation is implicit in the separation of the IDL interface from the implementation.
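To make the flavor concrete, here is a small interface in IDL (invented for illustration, not drawn from any shipping system):

```idl
// A hypothetical interface in CORBA IDL. It reads much like a C++
// class declaration, but defines only the interface: no method bodies,
// no storage, and no public/private sections.
interface SpellChecker {
    exception UnknownLanguage { string name; };

    // An attribute maps to a pair of accessor operations.
    attribute string language;

    boolean check(in string word, out string suggestion)
        raises (UnknownLanguage);
};
```

The `in` and `out` parameter modes and the `raises` clause are among the keywords IDL adds for distribution: they tell the IDL compiler which direction each parameter travels, and hence what the stubs and skeletons must marshal.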
As an alternative to IDL, the CORBA spec devotes a couple of paragraphs to the idea of repositories for both interface and implementation definitions. On the interface side, the repository is intended to augment the dynamic-invocation interface by providing persistent objects which represent information about a server's interface. With an interface repository, a client should be able to locate an object unknown at compile time, query for the specifics of its interface, and then build a request to be forwarded through the ORB. The implementation repository contains information which allows the ORB to locate and activate objects to fulfill dynamic requests. The spec also envisions this repository being used to contain other incidental information about an object, such as for debugging, versioning, and administration. The specification does not define how either repository is implemented, so vendors have gone their separate ways, as they have with much of the CORBA spec.
The OMG has been criticized for resembling other industry consortia which began with much fanfare about open architectures and cooperation, but ultimately produced little of substance. In the case of CORBA, however, the comparison is unfair; there is broad industry support for the spec, many implementations are available, and serious work is underway to address shortcomings in Version 1.1. CORBA is viewed by some large institutions as the only viable technology that is truly cross platform and cross operating system.
But is it a standard? Yes and no. Within the consortium it is a standard description of an architecture, but it is not a standard for implementation, and it is not as well-defined as it needs to be. The result is that each implementation of CORBA is a proprietary product. There is currently no interoperability between ORBs, though various partnerships have been announced--SunSoft and Iona, for example.
As a technology, CORBA is maturing rapidly. Companies such as Netlinks Technology (founded by two of the key implementors of DEC's ORB) have produced tools which ease the building of distributed applications using CORBA. A number of training companies now offer hands-on courses. Version 2.0 of the specification may deliver on the promise of interoperability. Currently, CORBA implementations are available for nearly all the major operating systems, and if an organization is willing to stick to a single vendor, real-world solutions can be built today.
It would not be accurate to describe IBM's SOM as an implementation of CORBA. Rather, it is a binary standard for objects that are operating-system and language neutral, and whose interfaces conform to CORBA definitions expressed in IDL. DSOM, the distributed-object framework which ships with the SOMobjects Toolkit, is a CORBA-compliant ORB. Dealing with object implementations sets SOM well apart from the CORBA spec, which defines object interfaces strictly without regard to implementation. As do other CORBA-compliant implementations, SOM extends the spec's capabilities: It supports implementation inheritance and polymorphism, provides metaclasses which are manipulated as first-order objects, and allows dynamic addition of methods to a class interface at run time.
SOM is not a distributed technology, nor is it even an interprocess technology. DSOM serves these purposes. SOM was intended specifically to solve the problem of tight binary coupling between an application and the libraries of classes it uses. To accomplish this, SOM relies on interfaces defined in an extended version of CORBA's IDL, which uses the SOM compiler and "emitters" to generate the interface stubs and implementation skeletons described earlier. In addition to language-neutral definition of object interfaces, SOM provides run-time support for objects, which again sets it apart from the OMG model.
IBM's SOM object model is a classical model in the same sense as the OMG model--classes define the characteristics of objects, and method requests identify a single object on which the method is to be executed. SOM is a "singly rooted" object hierarchy: All objects derive from the base class SOMObject, which provides run-time support methods common to all objects in the system. IBM's stated goal, in contrast to Microsoft's, is to provide for loosely coupled object libraries while retaining the commonly agreed-upon principles of object orientation: encapsulation, inheritance, and polymorphism. SOM provides for method overloading, run-time method resolution (polymorphism), and all the common forms of implementation inheritance. Types in SOM are CORBA IDL types, as described in Figure 5. Unlike CORBA, these types are used in the implementation of SOM objects, as well as the definition of the interfaces to them.
Also unlike CORBA and yet more like "pure" object-oriented languages, SOM classes are themselves objects, which are instances of SOM "metaclasses." A metaclass is (roughly) the type of a class. Whereas a class describes a set of potential object instances, a metaclass describes a set of potential classes. In practice, SOM metaclasses function similarly to static-member functions and variables in C++. Metaclasses in SOM define functions that operate on the class as a whole, including methods which execute when an instance of the class is created, functioning much like a C++ constructor. Figure 7 shows the relationship between classes and metaclasses in the SOM object model. Note that SOMClass is the parent of all metaclasses in the same way that SOMObject is the parent of all classes. Interestingly, SOMClass is itself derived from SOMObject. It is from this derivation that metaclasses in the IBM model receive the common methods which allow them to behave as first-order objects in the system. Neither SOMObject nor SOMClass contains member variables, so classes and metaclasses inheriting from them suffer no increase in size.
As with CORBA, the process of creating a SOM object involves using the IDL to define its interface and attributes. Once these are specified, the SOM compiler generates the stub and skeleton bindings in the preferred language. The SOM compiler uses "emitters," back-end code generators which perform the actual mapping of IDL syntax into the implementation language and generate the implementation skeletons. On the client side, the emitter generates the include files that specify the method signatures clients use to invoke methods on objects.
SOM adds to the standard CORBA IDL syntax a number of extensions to support the SOM model or provide convenience in object specification. These include implementation statements, instance variables, and private methods and variables. Implementation statements provide information about an object implementation to the SOM compiler, such as the metaclass of the object, version information, whether or not the object is persistent, the name of the DLL in which it is implemented, and so on. Implementation statements are nested within the interface statement for the object. In addition to the implementation information ("modifiers"), these statements allow the declaration of "instance variables"--declarations of IDL types meant to serve as private data to an instance of the object. These variables are distinct from the attributes declared in the interface statement as defined by CORBA.
SOM IDL allows the declaration of private methods and variables in the specification of an interface. The intent is similar to that of private properties of C++ classes, though the mechanism is quite different. Under normal operation, the SOM compiler ignores private methods and variables, and only the public-interface bindings are generated for client use. A command-line switch enables generation of bindings for the private methods, as well as access methods for private variables, so that these declarations can be provided to modules which need access to them. Methods and attributes declared as private in the specification can thus behave a bit differently from their C++ counterparts. Where C++ private properties are visible only within the class methods, private properties in SOM IDL may be exposed in a controlled manner to any client that needs them. This is analogous to declaring a class or function to be a friend of a C++ class, thus allowing access to private methods and data.
SOM supports interface inheritance in the same manner as CORBA. Subclasses inherit the interface signatures of their parent classes, so that any method available on a parent class is also available on the subclass. Unlike CORBA, subclasses also inherit the procedures which implement those methods, unless the methods are overridden or specialized. Subclasses may also introduce new methods, attributes, and instance variables which will be inherited in turn from any class derived from them. This is consistent with the common model of class inheritance in languages such as C++.
Metaclasses in SOM are also participants in inheritance relationships. These relationships are separate from the inheritance relationships between classes. For example, a class A with a metaclass M_A may be subclassed by a class B. If the class B explicitly specifies its own metaclass M_B, then it does not automatically inherit the relationship between A and its metaclass. In some cases, this can lead to incompatibilities. Suppose that in Figure 8, class A contains a method Foo() which in turn invokes a class method Bar() defined in metaclass M_A. Class B will inherit Foo() from A; however, since B has no relationship with A's metaclass, there is no Bar() for the inherited version of Foo() to invoke. A hierarchy of this type is not allowed in SOM: The SOM compiler will automatically generate an intermediate metaclass, as in Figure 9. This intermediate metaclass M_C is derived from both M_A and M_B, ensuring that class B's metaclass provides the method Bar() upon which B::Foo() depends.
SOM also supports multiple inheritance, which allows a subclass to inherit the interface and implementation of multiple base classes. A classic problem with multiple inheritance is the ambiguities that may arise when a class inherits either the same method from two different bases or different methods with the same signatures. Any multiple-inheritance model must provide a means of disambiguating such method collisions. SOM automatically detects and resolves such situations by giving precedence to the method inherited from the leftmost ancestor of the class. IBM calls this "left-path precedence."
If you decide when implementing a class that left-path precedence is not appropriate, you have two alternatives. You can create a new metaclass which alters the makeup of the method table for the class. This effectively alters the semantics of SOM's default inheritance mechanisms. Alternatively, you can override the inherited method and make a fully qualified call to the parent method you select.
Multiple inheritance in SOM raises a similar metaclass problem. If, for instance, a class C is derived from two classes A and B, and if A and B both declare explicit metaclasses M_A and M_B, then the SOM compiler must generate a new metaclass M_C, which is derived from M_A and M_B, and made the metaclass of class C. The programmer may override this behavior by creating the derived metaclass explicitly and assuring that it supports all the required methods. If all this sounds complicated, that's because it is. The advantage of metaclasses is the availability of information about a class at run time. C++ classes provide capabilities similar to metaclass methods in SOM, by allowing static class methods to be declared, but C++ classes are compile-time constructs about which most information is lost once the program has been built.
Method calls in SOM are bound at run time using a mechanism similar to virtual function calls in C++. Each class has a method table which contains pointers to the procedures that implement its interface methods. Unlike C++, SOM metaclasses can be made to alter the composition of these method tables. The SOM table-lookup mechanism, known as "offset method resolution," allows method calls to behave polymorphically at run time, exactly as C++ virtual functions do. Like C++ virtual function calls, offset resolution requires that the names of the method and the class that introduced it be known at compile time.
In addition to offset resolution, a method call may use name-lookup resolution or dispatch-function resolution. Name-lookup resolution, a dynamic-method binding similar to that in Smalltalk and Objective-C, is more flexible than offset resolution because the name of the method can be unknown at compile time. You can use it when a method is selected at run time based on user input or when a method has been added to a class interface dynamically. As you might expect, it is less efficient than offset resolution, because finding the method procedure involves searching a number of data structures associated with the class. Dispatch-function resolution is different from both offset- and name-lookup resolution. A dispatch function allows the implementor of the class to decide arbitrarily which rules and conditions will be used to find and invoke a procedure which implements a method. It is the most flexible--and most costly--of the three means of binding method invocations in SOM.
The SOM capabilities discussed thus far are for objects which exist in the same process address space as the calling application. While SOM does provide a robust implementation of a language- and operating-system-neutral object model, it is not a distributed-object technology. To address this limitation, IBM ships with the SOMobjects Toolkit a "framework" (a set of SOM classes) known as "Distributed SOM" (DSOM). Where SOM defines an implementation-independent model for objects, DSOM extends this to allow use of objects independent of their location with regard to the calling application.
In its current version, DSOM supports two types of distribution: across process spaces on a single machine, or across multiple machines in a network. The former is an extension to SOM packaged by IBM as "Workstation DSOM," and the latter a CORBA 1.1-compliant ORB packaged as "Workgroup DSOM."
DSOM is currently available on AIX 3.2 (IBM's flavor of UNIX), OS/2 2.0, and Windows 3.1. Workgroup DSOM supports distribution of objects across local-area networks composed of machines running all three operating systems, making it a multiplatform model. Future versions of DSOM will allow distribution across larger, enterprise-wide networks. Transport protocols currently supported include NetWare IPX/SPX on AIX, OS/2 and Windows, NetBIOS on OS/2 and Windows, and TCP/IP on OS/2 and AIX. An application can also define its own transport protocol.
IBM's SOM is a complete, shipping technology currently available for three popular operating systems. In addition to the basic features, the SOMobjects Toolkit includes several frameworks consisting of SOM classes which provide higher-level facilities for application developers. These facilities include: a CORBA-compliant framework for Interface Repositories; a Persistence Framework, for archiving objects between run-time sessions of an application; a Replication Framework that allows an object to be mirrored in multiple address spaces (with locking, synchronization, update propagation, fault-tolerance, and guaranteed consistency among copies); and an Emitter Framework to aid developers in creating new language bindings for SOM IDL. The kit also includes collection classes, utility metaclasses, and event-management classes as well as bindings for C and C++.
Like most models of this kind, SOM and DSOM are complex. One development which may ease the conversion from binary-coupled objects in C++ to SOM objects is the "direct-to-SOM" support in C++ compilers from MetaWare and Symantec, among others. In a direct-to-SOM implementation, the compiler generates SOM classes directly from C++ code, allowing existing class libraries to be recompiled as binary-insulated SOM classes.
Microsoft's OLE 2.0 is the heavyweight wildcard in the race to define standards for language-neutral and distributed-object technologies. The foundation of OLE is its Component Object Model (COM). This model, along with the high-level application integration technology that rests on top of it, represents a clear challenge to CORBA and CORBA-compliant technologies. As expressed in OLE, Microsoft's vision of system-object technology presents a strong contrast to that of CORBA and SOM. It also diverges from some commonly accepted principles of object orientation.
Despite the differences between low-level system-object technology and high-level component-integration facilities, Microsoft has striven to combine the two in the minds of developers. Marketing tactics aside, the reason for this is likely Microsoft's role as a leading vendor of application packages and suites, in contrast with the system vendors (HP, Sun, DEC), who are focusing on CORBA and other low-level technologies.
OLE 2.0 is not Microsoft's first foray into the world of interprocess object communication. To understand the rationale behind OLE, it's worth taking a moment to examine the previous process-interaction model, Dynamic Data Exchange (DDE)--a broadcast protocol whereby an application can set up a channel of communication with a "DDE server" located elsewhere on the machine on which the app is running. DDE is an inherently asynchronous protocol, meaning that once communication is established (itself no mean feat), the caller ships off a request and waits in a loop for the results to come back. Such a mechanism is more complicated than a synchronous function call, due to the possibility of failed communications, timeouts, and other errors which the looping application must detect and recover from. Many developers have found DDE frustrating and error prone, hence its lack of popularity. Microsoft has tried to make it more palatable by adding a library, DDEML, that handles many of the more complex aspects of the protocol, but apparently this has not been enough.
Version 1.0 of OLE was designed mostly as an embedding-and-linking mechanism for compound documents; it used DDE as its underlying communications mechanism. Thus, OLE 1.0 inherited many of the problems associated with an asynchronous broadcast protocol. OLE 2.0 enhances Version 1.0 by defining many system services in addition to 1.0's linking and embedding. These services include Uniform Data Transfer (an expansion on older data exchange protocols such as the clipboard), Structured Storage (a way of providing persistent storage for nested hierarchies of objects), and OLE Automation (a way for applications to expose interface APIs for use by other applications and scripting languages). The most important change made to OLE 1.0, however, is the abandonment of DDE as the underlying protocol in favor of the Component Object Model.
The relationship between COM and OLE 2.0 is shown in Figure 10. COM specifies a binary standard for object interaction. Microsoft provides run-time support for COM via COMPOBJ.DLL, which implements a small API for use in creating and manipulating the entities known as "Windows Objects."
A Windows Object is a functional entity that obeys the object-oriented principle of encapsulation. Clients do not manipulate Windows Objects directly. Instead, the object exposes to its clients various sets of function pointers, known as "interfaces." An interface is effectively a pointer to a table of function pointers. Figure 11 depicts the relationship between an interface table and the object implementation. An object may support any number of interfaces. All Windows Objects must support the most basic interface, IUnknown (by convention, interface names start with "I"), which supports three methods that supply basic functionality to all Windows Objects. These methods are QueryInterface, which allows a client to inquire which interfaces an object supports, and AddRef and Release, which manage reference counting for objects. Reference counting, a mechanism familiar to most object-oriented programmers, lets the system track how many clients hold a pointer to one or more of a given object's interfaces. When the reference count reaches zero, the system can delete the object and recover its resources.
Microsoft has specified a set of 60 or so interfaces which comprise the OLE 2.0 architecture. These include interfaces for In-place Activation, Linking, and Embedding--the core of OLE 2.0's compound-document technology. Interfaces also exist for Drag and Drop, Uniform Data Transfer, Automation, Compound Files, and other useful capabilities. Developers may also define custom interfaces for their objects. However, support for this is currently limited, and in fact Microsoft recommends that COM developers stick to the standard interfaces for the time being.
Microsoft's opinion is that some of the standard mechanisms of object-oriented programming are not properly applied in an interprocess object model. In this view, the particular mechanism that causes the most trouble is inheritance. While implementation inheritance is useful in constructing stand-alone applications, Microsoft believes that inheritance is improper when applied to interprocess object models. The reasons for this lie in the "fragile base class problem," which results from a dependency between a derived class and its parents that is "implicit and ambiguous." Should the base class alter its behavior, that alteration may force changes in derived classes, according to Microsoft. While this is certainly true, experienced object-oriented programmers might point out that the interface between any two classes, whether parent and derived, or client and server, represents a contract which, if changed, will force alterations on the other side of the relationship.
Nevertheless, Microsoft's concern with the potential management problems of implementation inheritance was enough to rule out supporting it in COM. To be fair, it should also be noted that Microsoft has a more compelling argument against inheritance: It intends to use Windows Objects to implement many advanced features of its next-generation operating systems. Eventually, COM object interfaces will take the place of the procedural API through which Windows is now accessed. When using Windows Objects, which are part of the operating system, an application will not have access to the source code for these objects. Such a restriction makes it difficult to use these classes as bases for implementation inheritance. Developers who use third-party libraries for which no source is available will likely sympathize.
In place of implementation inheritance, Microsoft offers a different model of code reuse called "aggregation," which allows an object to be constructed from subobjects. In object-oriented programming languages, aggregation (or "composition" or "containment") may take many forms. The containing object may allow access to the subobjects directly; it may provide forwarding capabilities through which the subobject's methods can be invoked via the owner's interface; or, it may use the subobject entirely for internal purposes. In COM, the first scenario is true aggregation, while the second is containment. Does aggregation function as a complete substitute for implementation inheritance? Not really. Inheritance in object-oriented languages is a syntactic mechanism enforced automatically by the language. Aggregation is a convention subject to implementation in any number of ways. Inheritance usually requires little or no code to support it, whereas aggregation must be completely supported by the programmer. Whether object-oriented programming can be done effectively without implementation inheritance is something for individual developers to decide.
COM identifies objects differently from CORBA and SOM. With CORBA, there's a potentially significant problem with making object names globally compatible across distributed systems. In a dynamic environment, name collisions can cause applications to "link up with" the wrong object, with possibly disastrous results. Microsoft foresaw the problem and adopted a mechanism to cope with it: "Globally Unique Identifiers" (GUIDs), 128-bit integers guaranteed to be "unique across space and time." You can obtain GUIDs for identifying COM objects either by requesting a block of 256 GUIDs from Microsoft or by using a network card and the UUIDGEN.EXE utility shipped with the OLE 2.0 SDK. UUIDGEN uses the date, time of day, and a unique number embedded in the network adapter to create a set of 256 GUIDs. The chance of this tool generating duplicate IDs is, according to Kraig Brockschmidt, "about the same as two random atoms in the universe colliding to form a small avocado" (Inside OLE 2, Microsoft Press, 1994).
In addition to the interface specifications, Microsoft provides run-time support for COM in the form of COMPOBJ.DLL, a library of API functions for object creation and marshaling. Objects are created by requesting them from the API using a GUID. Microsoft has defined GUIDs for the standard interfaces which come predefined with OLE 2.0. When COMPOBJ.DLL creates an object, it returns to the requester a pointer to the first interface of the object, usually IUnknown. COM objects need not be implemented such that they can be created using this mechanism. Such implementation, however, insulates users of the object from its implementation language, and in future versions of the technology will also insulate clients from object location in a distributed system. To make an object addressable from COMPOBJ.DLL using this mechanism, the object must reside in a DLL or executable file and must export a specific set of functions which COMPOBJ.DLL uses to interact with the object during its life cycle.
The other major piece of functionality in this module involves a process Microsoft refers to as "marshaling"--translating and delivering parameters to, and results of, a method invocation across address spaces. The marshaling mechanism in OLE 2.0, "Lightweight Remote Procedure Call" (LRPC), currently works across address spaces on a single machine. In the future, Microsoft intends to implement a more robust RPC mechanism compliant with OSF DCE that will allow object interaction across networks and between Windows and OSF servers. Microsoft claims that objects which conform to the current interface in COMPOBJ.DLL will require no changes--source or binary--to work with the proposed RPC mechanism.
One limitation of the current mechanism is that it does not support generic marshaling. That is, Microsoft has provided code in COMPOBJ.DLL which handles marshaling only for the standard, predefined interfaces currently shipping with the OLE 2.0 SDK. Support for generic marshaling remains in the future, so some developers advise against creating custom interfaces now. At present, creators of custom interfaces must provide their own marshaling mechanisms, a difficult task beyond the resources, if not the abilities, of many programmers. The result is that, today, COM is limited for use in support of the compound-document architecture defined by Microsoft.
How do COM and OLE 2.0 compare with the other technologies? COM is available today only as the set of interfaces which defines OLE 2.0 capabilities. In that sense, there is no possibility of direct comparison to CORBA or IBM's SOM. The intent of those technologies is similar to that of COM, but the implementation of COM is currently too restricted. By comparison, IBM's technology is more complete and far-reaching. Also, IBM's technology is more consistent with generally accepted notions of object-oriented programming. Microsoft claims it intends to create distributed versions of COM (DCOM?) and to implement OLE 2.0 on non-Windows platforms. The Macintosh version of OLE was demonstrated in March 1994 and is reportedly in beta.
Microsoft and Digital Equipment have announced an agreement that will integrate DEC's ObjectBroker, a CORBA-compliant ORB, with Microsoft's COM, creating the Common Object Model (also known as COM). This will allow the two technologies to interoperate to some extent. Microsoft has not ruled out more-direct CORBA compliance, if the market demands this.
Despite the alternatives, the business reality is that, unless some other system overtakes Windows as the desktop leader (now at 60 million installations), Microsoft's stated intent to build future operating systems on top of COM makes this technology one that you ignore at your own financial peril. The likely scenario is that the current Win32 API continues to exist within future systems, with advanced features being provided by COM objects in a gradual migration strategy. OLE 2.0 provides capability for application integration and interoperation that CORBA and DSOM can only hint at--the former through the Common Facilities Compound Document initiative, and the latter through the OpenDoc collaboration with Apple, WordPerfect, Borland, and others.
In truth, there is no easy answer--and there likely won't be one in the near future. Windows developers had better pay attention to COM/OLE 2.0, while OS/2 and AIX developers had better become familiar with SOM and DSOM. If you need to build distributed applications now, then COM is not at all useful. If you are compelled by the market to interoperate with evolving Windows implementations, then moving to COM is a virtual mandate emanating from the company that controls that operating system.
Figure 2 A compound document with text, image, spreadsheet, and sound objects.
Figure 3 Application boundaries for distributed processing.
Figure 4 Object-management architecture.
Figure 5 OMG-type hierarchy.
Figure 6 CORBA architecture. From the programmer's perspective, the standard interfaces are the Dynamic Invocation, ORB, and (Basic) Object Adapter. The IDL stubs are also standard, depending on the language mapping used by the client program.
Figure 7 SOM class relationships.
Figure 8 Example of incompatibility caused by metaclass dependency for method Foo() of class A.
Figure 9 Generated metaclass to resolve incompatibility caused by metaclass dependency for A::Foo().
Figure 10 OLE 2.0 interfaces and the Component Object Model.
Figure 11 Relationship between client, component object, and interface.
Copyright © 1994, Dr. Dobb's Journal