Jeff is a software consultant working with AmigaDOS, OS/2, PC-DOS, and embedded systems. He is also the creator of the Echomail conferencing facility used in Fidonet. Jeff can be contacted at jrush@onramp.net.
As tomorrow's applications grow more complex by adding feature upon feature, a new approach is needed that structures applications for simplicity. Today's users are increasingly overwhelmed with bells and whistles. Each added wrinkle satisfies the interests of one small market segment, but the entire set is considered necessary for the application to have broad appeal. The result is "Swiss-army-knife" applications with thick user manuals, long training cycles, and plenty of support calls. Developers pay a price for this approach as well--long development schedules, arduous test schedules, an increased time-to-market, and lost opportunities. This makes it almost impossible for small developers to participate in the major application categories.
OpenDoc is an enabling technology designed to address these problems. OpenDoc restructures application development in a way that fosters small, reusable, interoperable application components that users can mix and match according to their precise needs. This article provides an overview of this technology, which was designed at Apple, but is now being deployed by a number of vendors (such as IBM, WordPerfect, and Apple). For a brief history, see the accompanying text box entitled, "Origins of OpenDoc."
OpenDoc simplifies applications by letting the end user select and install the features needed and omit the ones not needed. At the same time, it levels the playing field so that big and small companies alike can contribute to a larger market of available feature sets or components. OpenDoc is an open architecture for the creation of compound documents--which can contain many different types of data (such as text, graphics, tables, video, sound, and animation). The documents can be edited, printed, circulated in read-only form, presented as a slide show, and circulated for mark-up and review. OpenDoc pursues the goal of reducing software complexity by using replaceable binary objects. Application modules interoperate in a seamless fashion by using a single, shared user interface (UI). This frees users to view their work in a document-centric fashion, instead of as a set of applications between which they must move. OpenDoc also allows users to build documents that can be used on different platforms, without the hassle of conversion (which is especially useful on heterogeneous networks). Users can customize their work environments by discarding, adding, or replacing objects with those from other vendors. Through the use of distributed objects, such documents can reach across networks and transparently participate in client/server arrangements, pulling data from several sources into a document.
OpenDoc is more than a super-extensible word processor. An OpenDoc document can be used as an audio- and animation-enabled slide show, or as a shared electronic blackboard with replicas of a document coordinated by distributed objects, or even as an electronic voting ballot that tallies opinions as the document circulates. In the latter case, the ballot document could either store the responses internally for later retrieval or transmit them over a network by using remote objects.
Because of OpenDoc's support for connectivity and scripting, documents can be created that track each time they are read or written. Such documents could be used, for example, by a publications department to monitor its corporate manuals and determine their popularity and audience. Authors could be notified by e-mail each time their work is read.
The content in an OpenDoc document can be dynamic and automatically generated by query to a remote database each time the document is opened (such as the case of a bill-of-materials that shows the current cost of goods by way of queries to a parts database). OpenDoc supports multiple representations of information within a single document, allowing documents to be opened in, say, French or German, according to the reader. Scripts can be embedded in documents and activated upon viewing--for example, to first present only an outline and then more detail as desired. A smart table of contents can take the reader to the pointed-to section, depending on who opens the document. Since OpenDoc supports the compression and encryption of parts of documents, a script could request a password, consult with a remote security key database, and unlock certain portions of the document or verify a digital signature on a document.
OpenDoc provides a unified, but customizable, system for creating complex documents, with a consistent UI. Users no longer must run different applications to create the various parts of a document and then import these pieces into one master document. Users no longer must remember which particular application they are in. With some of the early linking-and-embedding technology, it was necessary to be acutely aware of switching between applications. By contrast, OpenDoc presents a single work area (or "shell document") on which users can paste various content types. A single, high-level UI is used for manipulating the environment. Users can customize this by adding new content types, tools such as spelling checkers and thesauri, menu items, and buttons. OpenDoc acts as a high-level toolkit that allows users to tailor the environment for their particular needs. Shell documents can be read-only (for the viewing and distribution of information) or instead be fully functional editors. The user interface for OpenDoc is based upon extensive research originally conducted by Apple. Each platform vendor can modify the UI to conform with platform-specific conventions.
OpenDoc relies on a multimedia-capable storage format called "Bento" to hold documents. This is the same format used by the ScriptX multimedia authoring environment designed at Kaleida Labs (another Apple spin-off). The name "Bento" comes from the Japanese-style lunch box, which has multiple compartments, each containing disparate elements arranged in an aesthetically pleasing manner. The name reflects the format's flexibility: new types of information can be defined without disrupting any existing content types. The format is designed with the real-time requirements of multimedia in mind, so that sound and animation content types can be played back reliably and without interruption. The format also permits alternative versions of a document, either to support different authors working on document drafts or to support alternative content types for using a document on different platforms. For example, a user can make a picture available in one form for Windows and another for the Mac, or create a block of text in both Italian and English, and each user would see the appropriate one.
OpenDoc has provisions for a scripting mechanism that allows users to collaborate across space and time by explicitly writing scripts or by recording their actions and converting them into scripts. These scripts do not have to be low-level macros that only contain raw information such as mouse clicks and keystrokes. Instead, the scripts can encode operations that reference the parts of a document semantically--by section, paragraph, word, and so on--independent of which application object is responsible for which content type. For example, a script could scan the figure captions in a document and index them, or change their fonts to a particular style, then create a summary page and transfer the caption information to that page. OpenDoc scripts are written in an English-like syntax similar to AppleScript and HyperTalk. They can be mechanically translated to other languages (say, a Kanji-like version of the scripting language) without losing their meanings. Also, since the parts of a document are actual objects and are "smart," the script verbs are handled automatically in a manner appropriate to the content type.
OpenDoc permits users to add new capabilities and content types by drawing upon a third-party market of objects, known as "parts" in OpenDoc parlance. Users can select content types from a palette of available "stationery," which then get dropped onto documents. New parts can be purchased and added to the palette of stationery. New tools to manipulate existing types can be added as well, appearing as menu items or buttons. For example, a user can select a spelling-checker object based on the size of its dictionary or suffix-recognition ability, and it will operate across the appropriate content types of text and tables, even reaching into nested types to spell check embedded text (such as bar-chart labels).
Through its reliance on standards, OpenDoc is able to interact with other technologies and architectures. OpenDoc uses IBM's System Object Model (SOM) for its object-messaging facility. SOM conforms with the CORBA object technology fostered by the Object Management Group (OMG); see the article, "IBM's System Object Model," by F.R. Campagnoni, on page 24, in this issue. This means that OpenDoc objects will be able to invoke objects distributed across any of the wide range of platforms that support CORBA now (or will in the future), such as Macintosh, OS/2, DOS, the various incarnations of Windows, NextStep, and even the worlds of UNIX boxes and mainframes. When encountering platforms that use proprietary standards (such as Microsoft's OLE 2.0), OpenDoc documents can work with those objects by using bridging technology such as Open Linking and Embedding of Objects (OLEO). OLEO enables bidirectional interoperability between OpenDoc and Microsoft OLE 2.0, and is being implemented by WordPerfect for Windows. Because of compliance with the CORBA standard or by using bridging components, OpenDoc will be able to interact with NextStep's Portable Distributed Objects (PDO), Sun's Distributed Objects Everywhere (DOE), Novell's AppWare Bus, and Hewlett-Packard's Distributed Object Management Facility (DOMF), as well as a few others.
OpenDoc's object-oriented architecture (see Figure 1) speeds up development of new parts by letting the developer build upon existing objects using multiple inheritance and polymorphism. Developers must define only those characteristics of their object that differ from those of the parent class or classes. For details on the process, see the accompanying text box, "Steps in Creating an OpenDoc Part."
Parts can be designed in a general-purpose manner, and then reused on other product lines or sold on the open market to other developers. Some user documentation can be reused as well, since the fundamental way of interacting with a part is not likely to be different between projects. A spell checker is still a spell checker.
OpenDoc lowers development risk by reducing the complexity of applications. Bugs and schedule slippage can be better controlled by encouraging the creation of small, single-purpose, part-handler objects, with well-defined APIs that can be more easily developed and tested. Such parts can be tested in isolation from other parts and, because they have fewer combinations of inputs, tested more fully. Once a library of such parts exists, creating a new application becomes primarily a matter of selecting the base objects and writing the portions unique to the particular application.
To ensure that a large pool of objects can be made available in the diverse marketplace, OpenDoc uses the packaging mechanism of SOM to create binary executable parts that can be replaced or upgraded with just a file copy. This allows the creation of objects written in any language that conforms to the SOM standard and gives an object the ability to be called from any other such language without the usual incompatibilities (such as differing calling conventions, register usage, and name management/mangling). This technology will allow objects to communicate between process spaces on a single machine or across a network. An object can then call upon the services of other objects located, say, out on the Internet or the corporate LAN. By packaging objects in such an interchangeable manner, it becomes possible for vendors to offer them for sale to other developers for integration into their applications.
As mentioned previously, WordPerfect is developing OLEO, the bridging technology that permits OpenDoc objects to interact with OLE objects. Other such bridges are in the works as well. This creates a larger market of reuse opportunity by letting OpenDoc developers invoke other types of objects and lets them sell SOM objects into the OLE 2.0 market for those customers with OLEO bridges.
OpenDoc is composed of several major subsystems (see Figure 2), each rather independent in its own right, and some usable without the others. Included are: the shell document along with a pool of part handlers, a geometry-negotiation protocol, an object-storage mechanism, an exposed event flow, and an object-packaging technology. To maximize cross-platform portability, OpenDoc does not specify drawing systems, coordinate systems, window systems, or human interface guidelines. To enable the parts to work together, OpenDoc does specify protocols covering storage management, event distribution, the run-time model, and the management of the human interface.
The basic visual framework within which OpenDoc operates is called a "shell document." It resembles the UI of a word processor, but without presuming the type of data that is to be manipulated. The shell document provides an address space, distributes events, and provides basic UI resources such as windows and menus. It relies upon an open-ended pool of part handlers to provide the functionality to handle specialized forms and presentations of data such as text, spreadsheets, graphs, and so on--OpenDoc "parts." The particular style of the framework is platform dependent. It is designed to minimize conflicts with known platform conventions of the Macintosh, Microsoft Windows, Motif, OpenLook, and OS/2.
A part handler is responsible for displaying the part (both on screen and on paper), editing the part, and managing the storage for the part (both in memory and on disk). The part handler and the part itself, together, comprise a high-level object, with the part providing the state information and the part handler providing the behavior. Part handlers come in two flavors: editors and viewers. The viewer is simply a subset of the editor. It lacks the ability to alter the part. The view-only OpenDoc shell takes up less disk space than the editor and can be freely distributed.
Each part handler operates within an arena of space and relies upon a protocol to negotiate the use of the geometric space, which may represent the display screen or the printed page. This protocol is quite dynamic, allowing objects to move and adjust on-the-fly as other objects are added or changed. OpenDoc supports nonrectangular regions, as well as the usual rectangular ones. This is one advantage over the current version of OLE. (Microsoft has said that nonrectangular regions will be supported in future versions of OLE.)
Note that printing is not within the domain of OpenDoc proper; it is up to each part to decide how and what to print. The highest part in the hierarchy, the root part, drives the printing process.
Every document-manipulation system must have a way to store a particular complex arrangement of data to disk. OpenDoc relies on the Bento storage system to hold a document's contents. Bento has the ability to subdivide a storage container to hold wildly differing forms of data, in a structure that lends itself to portability between platforms and that addresses the real-time playback requirements of multimedia.
Bento can coordinate multiple data streams within a single file. It also provides a robust method of annotation for links between objects, whether located in the same file or in another. It has elements that support the tracking of draft revisions and that arbitrate such revisions when several authors are collaborating on the same document. Bento is not, however, a full-blown, object-oriented database of the corporate server variety. That would make Bento too complex. Instead, it focuses on the management of the content of a system of structured files, along with references to external data items. There is nothing, however, in the Bento architecture that would preclude it from being hosted on top of an existing object-oriented database.
Bento treats a file as a highly structured container that can hold multiple objects nested within each other. Each object can have multiple properties, each of which can have multiple values. In Bento, a "value" is a byte stream with an associated type that defines how the byte stream is interpreted. A property serves to indicate the role of the value, but not the type itself. For example, a property may specify that a byte stream has a role of document title. However, it is the byte stream's type that specifies which character set is used to interpret the byte-stream value. A different document may have a title property as well, but with a value typed as a graphic image in order to represent a stylized corporate logo.
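The object/property/value model just described can be sketched in a few lines of Python. This is only an illustration of the idea and not the Bento API: the class names `Value` and `BentoObject`, the property name `"title"`, and the type labels `"text/ascii"` and `"image/logo"` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Value:
    """A byte stream plus the type that says how to interpret it."""
    type_name: str   # hypothetical type label, e.g. "text/ascii"
    data: bytes


@dataclass
class BentoObject:
    """An object holds properties; each property may hold several values."""
    properties: dict = field(default_factory=dict)

    def add_value(self, prop: str, value: Value) -> None:
        self.properties.setdefault(prop, []).append(value)

    def values(self, prop: str) -> list:
        return self.properties.get(prop, [])


def describe(obj: BentoObject, prop: str) -> list:
    """Return the type names of every value stored under a property."""
    return [v.type_name for v in obj.values(prop)]


# Two documents share the "title" property (the role), but their values
# carry different types (the interpretation).
memo = BentoObject()
memo.add_value("title", Value("text/ascii", b"Quarterly Report"))

brochure = BentoObject()
brochure.add_value("title", Value("image/logo", b"\x89placeholder-image-bytes"))
```

Here the property answers "what is this value for?" while the type answers "how do I read these bytes?", which is why both documents can expose a title without agreeing on its representation.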
The byte streams that make up values may be of any length, with random access possible to any point within. In addition, it is possible to store a byte stream in noncontiguous pieces on a particular storage medium (interleaved with other data), to facilitate the real-time playback of multimedia data.
For example, a value containing a sequence of compressed images that make up an animation may be interspersed with another value containing the audio soundtrack. Then, during playback, the storage device does not have to seek between wildly distant regions of the disk. The disjoint nature of these values is hidden from those objects that do not need to know. The ability to have disjoint values also means that values can be edited without rewriting the entire value, which might be quite slow for multimegabyte values that represent animations or sound tracks. Unlike files in an ordinary file system, values also support the ability to snip out or splice in byte sequences in order to adjust the length of a region of a value. By recording the set of pieces that make up a value at any particular time, it is possible to track revisions made to an object. This is how OpenDoc keeps track of revision drafts of documents.
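One way to picture disjoint values is as an ordered list of pieces pointing into a store that is only ever appended to. The sketch below is a simplified model under that assumption, not Bento's actual on-disk layout; the class `SegmentedValue` and its method names are hypothetical.

```python
class SegmentedValue:
    """A value kept as an ordered list of (offset, length) pieces
    that point into an append-only byte store."""

    def __init__(self, initial: bytes = b""):
        self._store = bytearray(initial)   # bytes are only ever appended
        self._pieces = [(0, len(initial))] if initial else []
        self._drafts = []                  # saved copies of earlier piece lists

    def read(self, pieces=None) -> bytes:
        pieces = self._pieces if pieces is None else pieces
        return b"".join(bytes(self._store[o:o + n]) for o, n in pieces)

    def _split(self, pos: int) -> int:
        """Ensure a piece boundary at logical position pos; return its index."""
        seen = 0
        for i, (off, n) in enumerate(self._pieces):
            if pos == seen:
                return i
            if pos < seen + n:
                cut = pos - seen
                self._pieces[i:i + 1] = [(off, cut), (off + cut, n - cut)]
                return i + 1
            seen += n
        if pos == seen:
            return len(self._pieces)
        raise IndexError("position past end of value")

    def save_draft(self) -> int:
        """Remember the current set of pieces as a draft."""
        self._drafts.append(list(self._pieces))
        return len(self._drafts) - 1

    def read_draft(self, draft_id: int) -> bytes:
        return self.read(self._drafts[draft_id])

    def splice_in(self, pos: int, data: bytes) -> None:
        """Insert data at pos without rewriting the existing bytes."""
        i = self._split(pos)
        self._pieces.insert(i, (len(self._store), len(data)))
        self._store += data

    def snip_out(self, pos: int, length: int) -> None:
        """Remove length bytes starting at pos by dropping pieces."""
        start = self._split(pos)
        end = self._split(pos + length)
        del self._pieces[start:end]
```

Because edits touch only the piece list, a multimegabyte value is never copied, and each saved piece list can still reconstruct the value as it stood at that draft.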
Bento supports other interesting features such as layered data transforms (for example, compressing a value, and then encrypting it). Also, users can have out-of-line data references that can refer to external files or to values elsewhere in the same container. This makes possible the scenario in which several multimegabyte 24-bit images stored on a CD-ROM are used in a document. This document can contain references to the files on the CD-ROM, as well as alternative representations in low-resolution, 8-bit images stored within the document itself. If the CD-ROM drive is not available (say, when a user is traveling), the low-resolution images are substituted automatically so that the user can continue to edit while on the road.
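The two ideas in this paragraph, layered transforms and references with a local fallback, can be illustrated as follows. This is a rough sketch and not Bento code: the function names are hypothetical, `zlib` stands in for whatever compressor a container might use, and the XOR routine is a deliberately trivial placeholder for encryption that offers no real security.

```python
import os
import zlib


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for encryption (NOT secure); XOR is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def store_value(data: bytes, key: bytes) -> bytes:
    """Apply the layers in order: compress first, then encrypt."""
    return xor_cipher(zlib.compress(data), key)


def load_value(stored: bytes, key: bytes) -> bytes:
    """Undo the layers in reverse order: decrypt, then decompress."""
    return zlib.decompress(xor_cipher(stored, key))


def resolve_image(external_path: str, inline_low_res: bytes) -> bytes:
    """Prefer the external high-resolution file; fall back to the inline copy."""
    if os.path.isfile(external_path):
        with open(external_path, "rb") as f:
            return f.read()
    return inline_low_res
```

The order of the layers matters: compressing before encrypting keeps the compressor effective, and loading must peel the layers off in the opposite order.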
The physical layout of the Bento storage format is published information, so that even non-OpenDoc environments can retrieve information from within a Bento container.
The flow of events within OpenDoc is made visible to the parts in a document, so that the actions of a user can be recorded and played back. OpenDoc events are not just keystrokes and mouse clicks, but rather semantic actions at a higher level of abstraction. Parts may emit events while they operate. They can also inject events into the flow in order to alter the state or behavior of other objects. OpenDoc provides a powerful referencing facility so that events are meaningful in the context of application usage. For example, an event can indicate that the user deleted the fourth word in the third paragraph in chapter four. This is in contrast to the simple macro-like recording of low-level events used by other systems--in which the mouse pointer moves to a paragraph, the mouse button is clicked, and the Delete key pressed. The effect of this sequence is heavily dependent upon the resolution of the screen, and other similarly unrelated constraints. In addition, such a low-level event stream is not very meaningful to users examining the resulting stored script.
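The difference between semantic and low-level events is easier to see in code. The sketch below assumes a toy document made of paragraphs and words and a single hypothetical verb, `delete_word`; none of these names come from OpenDoc itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticEvent:
    verb: str        # hypothetical verb, e.g. "delete_word"
    paragraph: int   # 1-based paragraph number
    word: int        # 1-based word number within the paragraph


class Document:
    def __init__(self, paragraphs):
        self.paragraphs = [p.split() for p in paragraphs]
        self.log = []   # recorded semantic events

    def apply(self, event: SemanticEvent, record: bool = True) -> None:
        if event.verb != "delete_word":
            raise ValueError(f"unknown verb: {event.verb}")
        del self.paragraphs[event.paragraph - 1][event.word - 1]
        if record:
            self.log.append(event)

    def text(self):
        return [" ".join(words) for words in self.paragraphs]


def replay(log, target: Document) -> None:
    """Play recorded events back against another copy of the document."""
    for event in log:
        target.apply(event, record=False)
```

A recorded event such as `SemanticEvent("delete_word", 3, 4)` names the content it affects, so replaying it on a remote copy gives the same result regardless of screen resolution or window position, which is not true of a stored mouse click.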
By recording the sequence of changes made to documents at a high level of abstraction, OpenDoc can maintain a meaningful revision history and can associate changes made by several authors with each person responsible, for coordinated integration later. The high-level event stream is also more concise than macro-level recording, and can be transmitted over a slow-speed communications link to create an efficient electronic blackboard--in which geographically dispersed authors can simultaneously work on a document and have their actions reflected in the remote copies in close to real time. Also, you can imagine an author receiving revised copies of a document and then invoking a script that walks through the changes made by others, highlighting each change on the screen, and audibly explaining its rationale. The script could prompt the author to approve each change before it is applied to the final draft. As you can see, scripting allows the construction of complex client applications disguised as compound documents.
It is important to note that OpenDoc has no specific scripting language per se, but rather the provision for multiple scripting languages using the Open Scripting Architecture (OSA). The OSA design provides for the interception and injection of UI and semantic events, organized into standardized suites of common events for word processing, graphics, and other application-usage scenarios. However, each platform vendor (or a third party) must provide its own scripting language and interface it to these events. On the Mac, the three OSA-compliant scripting languages are Apple Computer's AppleScript, UserLand Software's Frontier, and CD Software's QuicKeys.
For OpenDoc to succeed, it must be easy to create the objects that make it up. This means that objects created using one set of tools (for example, Borland C++) must be able to interoperate with objects created by another (say, the Watcom compiler). Because of the differing name-mangling conventions, register conventions, and in-memory object representations, this is often not possible with today's objects. Many developers also prefer their C straight, or use Smalltalk, REXX, Pascal, or some other language. To address this and other issues, IBM created SOM as an interoperable way of packaging objects.
SOM provides a language-neutral, load-on-demand, object-calling convention that supports distributed services and field-replaceable components. SOM 1.0 ships with every OS/2 2.x package and is the basis of the Workplace Shell user interface. SOM 2.0 was revised to comply with the CORBA object standard and enhanced to support access to objects distributed across a network. This version of SOM, along with the developer's kit, is included in the Warp II Beta of OS/2 now in circulation. SOM for Windows is also now available for purchase. This is IBM's contribution to the OpenDoc effort. By using SOM to construct the objects that make up OpenDoc, these objects can be written in any language, by any compiler, and still interact. SOM objects on OS/2 and Windows are represented as DLLs, each of which may contain one or more object classes. The interfaces to these libraries are designed so that new methods can be added without breaking existing callers through shifted entry points. Classes under SOM are also objects themselves. So, unlike classes in C++, SOM classes can dynamically change behavior at run time.
Although the alpha seed of OpenDoc did not include SOM, the beta releases for all platforms are expected to. SOM should be fully integrated into the OS/2, Windows, and Macintosh platforms by year's end.
OpenDoc is a future-oriented technology that, if it prevails, will restructure how applications are written and marketed. Even if it does not succeed, it is likely that many of the design ideas in OpenDoc will be imitated and incorporated into the technological mainstream of the mid-1990s.
Origins of OpenDoc
The original ideas behind OpenDoc, including its storage format, known as "Bento," and the Open Scripting Architecture (OSA), all arose at Apple and were intended for the Macintosh platform. Apple approached IBM and other companies for support in forming a nonprofit industry association, Component Integration Laboratories (CI Labs). The charter of CI Labs is to promote OpenDoc. The cost of participating in the organizational stage was low and the following companies formed the founding group: Apple, IBM, Novell/WordPerfect, SunSoft, Taligent, and XSoft (a division of Xerox). Once CI Labs was underway, several of the founding companies made business decisions not to join, or are still evaluating that choice. Among these are SunSoft, Taligent, and XSoft. Lotus has recently signed up.
The technology pieces of OpenDoc are being contributed by various members. Apple is supplying the object protocols, the Open Scripting Architecture (OSA), and the Bento storage subsystem. IBM is supplying the SOM object-messaging facility and the OS/2 port of OpenDoc. WordPerfect, in conjunction with Novell, is providing the Windows port of OpenDoc and the piece that lets it interoperate with Microsoft's OLE 2.0. Novell is giving network support for distributed access to objects. Taligent is participating to ensure that OpenDoc fits in with its application frameworks. (At this writing, there are rumors that WordPerfect may halt its development efforts because of recent "peace accords" between its parent company, Novell, and Microsoft, whose OLE technology is the chief competitor of OpenDoc.)
The stated goal of OpenDoc is to produce a level playing field for application development, so that small companies can participate in major application markets. Ironically, the current members of CI Labs are a few large firms. While they may have little direct financial interest in helping the smaller firms, they have their own reasons for supporting OpenDoc. We can only speculate about some of their motivations. Apple must attract to the Macintosh some of the market momentum that OLE 2.0 has given Windows. IBM must extend its enterprise-wide systems across larger markets. WordPerfect seeks to retain its application market by broadening its feature base and moving into distributed document management.
By making reference sources to the various OpenDoc technology pieces freely available, CI Labs is hoping for rapid deployment across diverse platforms--Mac, OS/2, Windows, various flavors of UNIX, PowerPC, and so on. Some of the pieces (such as SOM for OS/2 and Windows and Bento for the Mac) are available now as separate items. OpenDoc is designed for incremental adoption by application vendors, and each subsystem is fully replaceable by a platform vendor. The alpha version was available in the first half of 1994, with the alpha OS/2 version distributed on IBM's Developers Connection CD-ROM #4. Betas are expected any time, with product to ship in late 1994. The alpha version of OpenDoc is still based on Apple's original C++ version with a big push currently on to port it to IBM's System Object Model (SOM). The betas will reflect this port.
CI Labs is not a standards organization, but rather a support organization chartered to provide reference source, technical documents, examples, and validation suites in an open environment that does not require nondisclosure agreements. It derives its funding from membership fees, not royalties. Membership is open to all. Nonmembers may freely use OpenDoc technology (including source) but only members may vote and hold office. The annual membership fee for CI Labs ranges up to $110,000, computed at $10,000 plus 1 percent of a company's gross revenue. Sponsors pay a one-time fee of $500,000 in addition to the annual fee. Current sponsors include Apple, IBM, and WordPerfect.
CI Labs is not in itself performing any of the ports, but serves as a clearinghouse for the various members. Each member company is conducting a port. Apple is bringing SOM to the Macintosh, IBM is porting OpenDoc to OS/2, and WordPerfect is responsible for the Windows port. Further technical information is available on the Internet via FTP at ftp.cilabs.org or via World Wide Web at ftp://ftp.cilabs.org/pub/. For e-mail, use cilabs@cil.org or call 415-750-8352. WordPerfect can be contacted at opendoc@wordperfect.com. Apple maintains a conference on AppleLink and CI Labs runs several Internet "interest lists" to which you can subscribe by e-mailing Majordomo@cilabs.org. IBM's Development Connection organization can be reached at 800-633-8266 or via e-mail at devcon@vnet.ibm.com.
-- J.R.
Steps in Creating an OpenDoc Part
Define the content model and semantic events of your part. For example, the content model for a simple text editor would consist of lines of text--with semantic events for inserting lines, deleting lines, replacing text, and so forth. For a painting part, the content model may be a rectangular region of pixels, with semantic events to create points, lines, circles, and so forth.
Implement your core data engine. This is where the custom portion of your part is developed. It is the basic set of algorithms and data structures specific to the type of data you are manipulating, independent of any human interface. A key element of this piece is making sure that the human-interface component interacts with the core engine through a well-defined set of calls matching the user model of the core engine. These calls are the semantic events on which scripting is based.
Implement your part's storage-manipulation code. Here you develop the body of code that uses the Storage API to load your part into memory and store it back as needed. It does not mean that your part must be loaded into memory in its entirety (which may be difficult for certain multimedia types), but rather that you create the structures necessary for your part to begin accepting events and rendering requests. Usually, you will not have to worry about the drafting facility of OpenDoc, as the document shell will handle most cases for you. If your part is a container, it must ask the embedded parts to load or store themselves as well.
Implement your part-rendering code. This code examines the frame within which the part resides, determining whether it is on a screen or paper, and performs its geometry negotiation appropriately. It then issues platform-specific calls to draw the content of the part. If your part is some type of container, your code must include support for layout negotiation and update the transformations of each frame (embedded in your container) that is visible.
Implement your user-interface event-handling code. This code supports direct manipulation of your part by handling user-interface events such as mouse clicks and keystrokes. You may need to deal with drag-and-drop and, if your part must display elements outside of your frame (like a ruler), you must get involved in layout negotiation. This portion of your code may use platform-specific OS calls, or you may rely upon an OpenDoc User-Interface parts facility (which is more portable and may be extended by developers). If your new part is some type of container, you must include code to notify parts embedded within yours of changes to your frame, and maintain information about the shape and transformations of your frame yourself.
Implement your scripting code. The scripting code provides accessory functions that resolve external references to a part's content objects (for example, "Line 6" into the actual reference to the sixth line of your core data structure). This is also where you provide functions to take semantic events such as "Delete Line," and actually perform the deletion by calling upon the core data engine. You must also notify dependent parts when the content of linked and exported objects changes.
Implement the desired extension interfaces. This is an area that goes beyond the basic OpenDoc architecture, and in which various extensions are added. It could include full text search, spell checking, or many other interactions. These APIs are reserved for those functional areas where bandwidth or integration requirements prohibit the use of scripting to accomplish them. CI Labs plans to be active in proposing and publishing standard interfaces between parts.
Package your part handler. Now your part is finished and you can prepare documentation to be provided with your part that specifies what part types, semantic events, and extension APIs are handled. The user of your part needs some information about how to use your part and what to expect. For a complex part such as a spreadsheet, this may be a small manual.
Create stationery to bootstrap your new part. For the user to insert new parts of the new type, you must create some stationery that has an empty copy of your part type. The user can then drag a copy of this empty type from the stationery palette into a document.
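Several of the steps above can be tied together in a small skeleton for the simple text-editor example: a core data engine, storage code, reference resolution, and semantic-event handling. This is a loose sketch of how the pieces relate, not the OpenDoc part API; every class and method name here (`LineEngine`, `TextPart`, `externalize`, `internalize`, `resolve`, `handle_event`) is hypothetical, and rendering, UI events, and stationery are omitted.

```python
import re


class LineEngine:
    """Core data engine: lines of text, with no user interface."""

    def __init__(self, lines=None):
        self.lines = list(lines or [])

    def insert_line(self, index: int, text: str) -> None:
        self.lines.insert(index, text)

    def delete_line(self, index: int) -> None:
        del self.lines[index]


class TextPart:
    """Hypothetical part handler wrapped around the core engine."""

    def __init__(self, engine=None):
        self.engine = engine if engine is not None else LineEngine()

    # Storage: write the part out and read it back.
    def externalize(self) -> bytes:
        return "\n".join(self.engine.lines).encode("utf-8")

    @classmethod
    def internalize(cls, data: bytes) -> "TextPart":
        text = data.decode("utf-8")
        return cls(LineEngine(text.split("\n") if text else []))

    # Scripting: resolve an external reference such as "Line 6"
    # into an index the core engine understands.
    def resolve(self, reference: str) -> int:
        match = re.fullmatch(r"Line (\d+)", reference)
        if not match:
            raise ValueError(f"cannot resolve {reference!r}")
        index = int(match.group(1)) - 1
        if not 0 <= index < len(self.engine.lines):
            raise IndexError(reference)
        return index

    # Scripting: map a semantic event onto a core engine call.
    def handle_event(self, verb: str, reference: str) -> None:
        if verb == "Delete Line":
            self.engine.delete_line(self.resolve(reference))
        else:
            raise ValueError(f"unsupported event: {verb}")
```

Keeping the engine free of user-interface code means the same `delete_line` call serves both a keystroke handler and a script that sends "Delete Line" with the reference "Line 6".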
-- J.R.
Figure 1: OpenDoc architecture.
Figure 2: Major blocks of OpenDoc.
Copyright © 1994, Dr. Dobb's Journal