Jim Gettys was an original member of MIT's Project Athena development team and was part of the research project that eventually led to X. Jim is now a consulting engineer with the DEC Cambridge Research Lab and can be reached at 1 Kendall Square, Bldg. 700, Cambridge, MA 02139.
The X Window system is a device-independent, multitasking windowing and graphics system designed to operate across heterogeneous networks. As such, X supports high-performance graphics and window-management mechanisms to provide a hierarchy of overlapping, resizable windows. The X system is based upon a client/server architecture whereby the client (that is, the application program) requests that the server (the program that controls the user interface at the workstation) be responsible for drawing text, windows, and other objects. Through this client/server architecture, applications can run on any machine in the network and can be accessed by any workstation or PC running an X server.
The X Window system was originally designed at the Massachusetts Institute of Technology (MIT) to provide a generic, network windowing environment for dissimilar bit-mapped workstations. The system makes it possible for MIT to network its collection of incompatible computers, which have accumulated at the institute over the years. X, which is now a de facto standard and is publicly available, provides efficient workgroup computing, connecting the dissimilar computing environments that exist in many situations.
The X Window system includes the Xlib graphics subroutine library, the X network protocol (which handles transmission between client applications and server processes running on remote workstations), an X toolkit (which programmers use to build graphical-user interfaces), and several window managers. The window-manager program is distinct from the base window-system server and provides part of the user interface for manipulating existing applications on the screen. The application itself, however, in combination with an X toolkit, actually specifies the bulk of the window interface and defines the application's "look and feel." X contains mechanisms for implementing many interface styles and, unlike some interfaces (such as the Macintosh), does not mandate a single style. Because X is supported by virtually every major workstation vendor and more than 24 organizations, any application residing on a multi-vendor network and adhering to the de facto X standard for windowing on high-performance bit-mapped workstations can be accessed by all X-based workstations. X defines an open-systems architecture (one that is independent of devices, networks, and operating systems) that accommodates a range of workstations. X servers have been implemented on workstations and PCs from Digital Equipment Corp., Apollo Computer, Sun Microsystems, and Apple Computer, as well as on MS-DOS- and OS/2-based PCs.
The X Window system, written in C, provides a network-transparent client/server architecture that spares you from coding communications services between connected clients and servers. Clients establish connections with any number of servers. These servers (called X server processes) handle graphics and windowing functions and receive requests from clients in the form of Xlib graphics library calls. Fast asynchronous communication between clients and servers is performed by the X Network Protocol.
X applications and toolkits usually interface to the Xlib subroutine library on the host. Xlib in turn converts the parameters passed to the procedural interface into the X network protocol format, and translates messages from the server into return values for the application. Xlib also provides a set of utility routines needed by most applications.
In addition to Xlib, applications can interface to a variety of programming libraries as needed. These include the X Toolkit (used to build graphical-user interfaces), industry-standard libraries (such as GKS and PHIGS) that can be layered on top of Xlib, and extension libraries that provide programming interfaces to X server extensions such as 3-D graphics, Display PostScript, and imaging.
X applications allow you, in effect, to reach out across a network via a consistent and intuitive graphical-user interface to access remote applications and high-performance resources anywhere on the network. This network transparency lets you intermix operating systems in workgroups and access applications regardless of the system those applications run on. For example, on a PC you can display, in separate windows, applications running on a VAX/VMS system, on Unix workstations from Digital, Sun Microsystems, and Hewlett-Packard, and on IBM computers. Through X, you can access multiple applications and display and control them within hierarchical-style windows on your workstation. Applications can be running either on the workstation or on remote hosts. The X Window system's network transparency makes remote applications appear as if they were running on your workstation.
Although an X application can be accessed by any bit-mapped graphics workstation on the network, the program actually executes on whatever available computer is best suited to process it. For example, a PC that lacks the performance and memory to run a weather prediction program can access a Cray supercomputer running the program while you interact with the application through a window on the PC. Again, X masks the differences in operating systems.
Applications can be designed with mouse-driven front ends that allow you to scan and navigate through multiple databases scattered throughout a network. A stockbroker, for instance, can review and compare multiple documents presented on his screen at the same time, each containing stock information from various news wires. The broker can "cut and paste" information of the same data formats from these windows into another window, and electronically mail the contents of the window immediately to a client with whom he is talking on the phone. The client, after reviewing the information from a desktop terminal, can place an order to purchase stock while the broker fills in an order form in another window and sends it to the client for confirmation. The stock information can be coming from news wires running on one system that knows nothing about the mail system, and nothing about the order form application and its host.
In addition to providing an architecture for distributed applications, X also simplifies the porting of software. This should be music to your ears if you traditionally have had to customize an application to match each hardware and software environment that you want your application to run on. Through X, one common application interface reaches all platforms because X is independent of machine architecture, operating system, and resolution and display characteristics. Application programs written to support X can often run on another vendor's system simply through recompilation.
The portability of X has been amply demonstrated by the X Testing Consortium, a group of vendors jointly developing tests to help firms validate their X implementations. To date, the test software runs on all the computer architectures of contributing companies with virtually no changes. Developers have had prerelease test software (containing more than 200 programs) up and running in one day.
Because X is device independent, you do not need to rewrite, recompile, or relink an application for each new hardware display. Applications can be written to be independent of monochrome and color displays or displays with varying resolution and other characteristics. Furthermore, every graphics function defined by the system will work on virtually every supported display. If graphics functions were not made to work on all displays, "inquire" operations (like the GKS-graphics-standard inquire) could be used to determine the set of implemented functions for a particular display at run-time. (However, using GKS inquire operations would require run-time analysis for every application, adding overhead and producing inconsistent user interfaces.)
Vendors implementing their own versions of X can extend the system over time. The most recent version of X, Version 11, provides hooks to extend X to support functions such as the Display PostScript imaging model, PEX 3-D graphics (see accompanying box), and standard compound-document interchanges. X11 also adds a graphics state for improved performance and defines precise semantics for output routines. The X11 tape, publicly available for a nominal fee from MIT, includes a sample X server, the Xlib library of X routines, the X Toolkit, several window managers, and other contributed software.
Unlike other windowing systems, X has a basic philosophy of providing only low-level mechanisms and defers policy decisions to developers. X will create a window, but the application must tell it whether the window should have a particular border or color. X gives low-level "raw" functionality in order to provide a simple, clean system on which to build. If policy decisions were made at a low level, the system would not grow so easily or allow for future advances in user-interface technology.
As previously stated, X provides the mechanisms to move, resize, and manipulate windows but does not dictate the actual appearance of the windows. Appearance is defined by toolkits from various developers. One such toolkit is the public-domain Xtk toolkit included with the MIT distribution tape. Another is Digital's DECwindows X User Interface (XUI), a collection of user-interface components called "widgets," which are analogous to objects. These widgets include scroll bars, pop-up menus, window borders, and dialog boxes. Digital's widget library is built on top of the lower-level "intrinsics" found in the X Toolkit. Intrinsics are the basic set of rules governing widgets: how the widgets are created and destroyed, how they receive input events, how they are stored in a resource file to be initiated at runtime, and other characteristics. Using Digital's XUI, you can build interfaces for your applications that maintain a high degree of portability and consistency with the X standard.
To build graphical-user interfaces, you design any number of custom widgets from the X Toolkit, or use sample widgets from the X Toolkit as well as widgets provided or sold separately by vendors. Because all widgets sit on top of the same foundation (the intrinsics), an application and its widgets can be ported from one X-based computer to another. Each computer provides a portable foundation, yet each application is customized and differentiated by its own "look and feel." Consistent graphical-user interfaces enable applications to have similar looks and feels, function similarly, and operate intuitively through mouse-driven graphical icons and pull-down menus. This simplifies the learning process for users.
From the three-level stack of programming interfaces (widgets, intrinsics, and Xlib), an application calls on any one interface, or any combination of interfaces, as required by the program. If you are developing a spreadsheet, for instance, your program might call on intrinsics to customize a widget that displays cells in a spreadsheet, and access Xlib routines to draw graphics on the screen. Although it is possible for applications to directly access the server via the X protocol, you should use the higher-level Xlib graphics routines to manage communication.
The X protocol defines data structures used to transmit requests between clients and servers. X transmission is asynchronous. This enables requests to be sent without waiting for the completion of previous requests. Pipelining techniques in both the server and Xlib speed the processing of requests. Any requests depending on the completion of other requests are blocked, pending execution of those other requests. Errors are also generated asynchronously, and clients must be prepared to receive error messages at arbitrary times.
In general, the X protocol also describes connections between clients and servers, windows (which allow interaction between you and the application), events (which notify the application of mouse and keyboard actions and provide a way to control communication between multiple applications), and graphics routines (which allow an application to draw information on a display). These are described later.
Because X is network and operating system independent, applications can run on any machine in the network. Applications do not generate protocol requests themselves. Instead, applications call Xlib and other layered libraries. X uses asynchronous stream-based interprocess communication instead of the traditional procedure call or kernel-call interface. This asynchronous communication improves network speed by enabling requests to be sent without needing to wait for the completion of previous requests. Nearly any form of reliable data transport may be used. Current implementations include TCP/IP and DECnet.
Pipelining techniques in the server and Xlib help accelerate processing of requests. Some X requests, however, have return values (state queries, for example) that depend on the completion of previous requests. For these requests, Xlib blocks the client until the server has generated a reply and sent it back. Errors are generated asynchronously, so clients must be prepared to receive error replies at arbitrary times after the offending requests.
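The buffering behavior described here can be illustrated with a toy model. The sketch below is not Xlib; the `ToyServer` and `ToyClient` classes are hypothetical stand-ins, and only the request names are borrowed from the X protocol. One-way requests are queued and return at once, while a request that needs a reply first delivers everything queued and then waits.

```python
class ToyServer:
    """Stand-in for an X server: executes requests strictly in order."""

    def __init__(self):
        self.executed = []
        self.round_trips = 0

    def receive(self, batch):
        self.executed.extend(batch)

    def reply(self, query):
        self.round_trips += 1
        return len(self.executed)   # toy answer: how many requests have run


class ToyClient:
    """Stand-in for the client library's output buffer."""

    def __init__(self, server):
        self.server = server
        self.pending = []

    def send(self, request):
        # One-way request: queue it and return immediately.
        self.pending.append(request)

    def flush(self):
        self.server.receive(self.pending)
        self.pending = []

    def query(self, request):
        # A reply is needed, so everything queued is delivered first,
        # and the caller then waits for the answer.
        self.flush()
        return self.server.reply(request)


server = ToyServer()
client = ToyClient(server)
for name in ("CreateWindow", "MapWindow", "PolyRectangle"):
    client.send(name)
queued_before = len(server.executed)    # nothing has reached the server yet
count = client.query("GetGeometry")     # forces a flush and one round trip
```

In this model three drawing requests cost no waiting at all; only the query pays for a round trip, which is why state queries are best kept out of inner loops.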
A connection (that is, the communication path between the server and client program) can exist between processes on the same machine or on different machines. A client program usually has one connection to a server over which requests and events are sent. When processes are on the same machine, the X protocol is often sent using shared memory or other local transport facilities of the system, rather than TCP or DECnet.
To interact with you, a client must first open a connection with an X server using a common transport mechanism. (DECwindows, for example, uses DECnet/OSI or TCP/IP.) The client passes version and authorization protocol information in a packet to the server along with a code that indicates the byte order used by the client. If the client's byte order differs from that of the server's machine architecture, the server will use the byte-order code to swap the bytes of incoming requests. Swapping takes place, for example, between a server running on a Macintosh and a client application running on a VAX processor.
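As an illustration of the byte-order code, the following sketch builds the opening packet in both byte orders and shows how a receiver can use the first byte to decide how to read the rest. The field layout follows my understanding of the X11 core protocol encoding and should be checked against the protocol specification; the function names `setup_request` and `read_major_version` are hypothetical.

```python
import struct


def pad4(data: bytes) -> bytes:
    """Pad to a multiple of four bytes, as the protocol does for strings."""
    return data + b"\0" * (-len(data) % 4)


def setup_request(big_endian: bool, auth_name: bytes = b"",
                  auth_data: bytes = b"") -> bytes:
    # First byte announces the client's byte order: 'B' for most
    # significant byte first, 'l' for least significant byte first.
    order = ">" if big_endian else "<"
    marker = b"B" if big_endian else b"l"
    header = marker + struct.pack(
        order + "xHHHHxx",
        11, 0,                          # protocol major and minor version
        len(auth_name), len(auth_data)  # authorization name and data lengths
    )
    return header + pad4(auth_name) + pad4(auth_data)


def read_major_version(packet: bytes) -> int:
    # The receiver picks the decoding order from the marker byte,
    # which is how it knows whether swapping is needed.
    order = ">" if packet[0:1] == b"B" else "<"
    (major,) = struct.unpack_from(order + "H", packet, 2)
    return major
```

Both `read_major_version(setup_request(True))` and `read_major_version(setup_request(False))` recover version 11, even though the two packets differ byte for byte after the marker.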
If a request to open a connection is successful, a reply is sent back to the client. This reply contains information about the server and the associated display hardware including display resolution, physical dimensions, color-handling abilities, and a vendor identification string.
Once a connection is made, you can interact with multiple applications displayed within windows. A window manager, which is simply another client program, helps you manipulate those windows on the screen. From the programmer's perspective, windows are hierarchical and can be created inside other windows to any depth necessary for an application.
Each screen has a root window that displays a background color or pattern and serves as the root of the window tree for that display. Windows can be fully visible, partially visible, or completely hidden. To display a window, the client sends the server a MapWindow request. Graphics output to a window is clipped to the boundaries of the window. A window, therefore, becomes a virtual graphics terminal for an application, allowing multiple applications to share a screen without overwriting one another's output.
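Clipping to a window's boundaries amounts to intersecting each output rectangle with the window's own rectangle. The helper below is a minimal sketch of that idea only; `clip_to_window` is a hypothetical name, and a real server must also clip against overlapping sibling windows and any client-supplied clip region.

```python
def clip_to_window(rect, window_width, window_height):
    """Clip (x, y, width, height), given in window coordinates, to the window.

    Returns the visible part as (x, y, width, height),
    or None if nothing is visible.
    """
    x, y, w, h = rect
    left, top = max(x, 0), max(y, 0)
    right = min(x + w, window_width)
    bottom = min(y + h, window_height)
    if right <= left or bottom <= top:
        return None
    return (left, top, right - left, bottom - top)
```

An application can therefore draw at coordinates that fall partly or wholly outside its window; the output that lands outside is simply discarded.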
Each window has a height and width and Z position that indicates its position within a stack of other windows. In addition, windows carry other attributes that identify their location on the screen, their mapped state, and their relationship to parent and sibling windows. The border pixel attribute is used to draw a border around a window. Through a background pixel and pixmap attribute, an application can specify either a single pixel value or a complete pixmap (a rectangular array of pixels in main memory) as the window background. Through this attribute, a server can redraw a window's background color itself without sending a request to an application.
An application that creates a window can specify bit and window "gravity" to the server to indicate which pixels should be retained when a window is resized, or how child windows should be repositioned when the parent window is resized. For example, a text application might specify "NorthWestGravity" to indicate that the upper-left information should be preserved when the window is reduced in size.
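The window-gravity half of this mechanism can be sketched as a small calculation: when the parent changes size, the child is moved by a fraction of that change determined by its gravity. The table below reflects my reading of the X11 protocol's gravity rules; the function name `reposition_child` and the use of floor division for odd size changes are assumptions made for illustration.

```python
# Halves of the parent's size change applied to the child's x and y.
# (0, 0) keeps the child fixed to the upper-left corner,
# (2, 2) moves it with the lower-right corner.
GRAVITY_HALVES = {
    "NorthWestGravity": (0, 0), "NorthGravity": (1, 0), "NorthEastGravity": (2, 0),
    "WestGravity": (0, 1), "CenterGravity": (1, 1), "EastGravity": (2, 1),
    "SouthWestGravity": (0, 2), "SouthGravity": (1, 2), "SouthEastGravity": (2, 2),
}


def reposition_child(child_xy, gravity, old_size, new_size):
    """Return the child's new (x, y) after the parent is resized."""
    dw = new_size[0] - old_size[0]      # change in parent width
    dh = new_size[1] - old_size[1]      # change in parent height
    hx, hy = GRAVITY_HALVES[gravity]
    return (child_xy[0] + dw * hx // 2, child_xy[1] + dh * hy // 2)
```

For a parent growing from 200x100 to 300x160, a child at (10, 10) stays put under NorthWestGravity and moves to (110, 70) under SouthEastGravity, which is what a scroll bar pinned to the lower-right corner would want.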
Many Xlib routines are used primarily by a window manager or toolkit rather than by applications. Typical routines include changing the parent of a window, grabbing the pointing device or keyboard, altering event dispatching and processing, changing the keyboard encoding, determining the resident color maps, and modifying the list of hosts that have access to the server.
All pixels in X have uninterpreted color values, although the application can allocate and define a color map to gain control of the mapping between pixel values and colors displayed on screen. X encourages sharing of color maps between applications. Pixel values can be allocated as read only, and shared in a color map (optionally by name), or as read/write and exclusive in a color map. Applications that use low-level X routines are expected to query the hardware capabilities at connection set-up time and adjust their usage accordingly.
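The distinction between shared read-only cells and exclusive read/write cells can be modeled in a few lines. This is a toy model under stated assumptions, not the server's actual data structure: `ToyColormap` and its method names are hypothetical, and real color maps also handle freeing cells, named colors, and hardware limits.

```python
class ToyColormap:
    """Maps pixel values to colors; read-only cells are shared by color."""

    def __init__(self, size):
        self.cells = [None] * size   # pixel value -> (r, g, b), None if free
        self.shared = {}             # (r, g, b) -> pixel of a read-only cell
        self.refcount = {}           # read-only pixel -> number of users

    def _free_cell(self):
        return self.cells.index(None)    # raises ValueError when full

    def alloc_read_only(self, rgb):
        # A second request for the same color reuses the existing cell.
        if rgb in self.shared:
            pixel = self.shared[rgb]
            self.refcount[pixel] += 1
            return pixel
        pixel = self._free_cell()
        self.cells[pixel] = rgb
        self.shared[rgb] = pixel
        self.refcount[pixel] = 1
        return pixel

    def alloc_read_write(self):
        # Exclusive cell: the owner may change its color later.
        pixel = self._free_cell()
        self.cells[pixel] = (0, 0, 0)
        return pixel

    def store(self, pixel, rgb):
        if pixel in self.refcount:
            raise PermissionError("read-only cell cannot be changed")
        self.cells[pixel] = rgb


cmap = ToyColormap(4)
red_a = cmap.alloc_read_only((255, 0, 0))   # first application
red_b = cmap.alloc_read_only((255, 0, 0))   # second application, same cell
private = cmap.alloc_read_write()
cmap.store(private, (0, 0, 255))
```

Sharing matters because a typical display of the period has only a few hundred color-map cells; two applications asking for the same red should consume one cell, not two.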
X graphics routines can be directed to a window or to an arbitrary pixmap. Pixmaps and windows are referred to as "drawables," and every X drawing operation is passed a drawable as a parameter. Rather than having the client pass every parameter that describes a drawing operation on each graphics request, the server keeps this state in a data structure called a graphics context (GC). The GC is passed as an argument to each graphics call and includes information about the foreground and background colors, line widths and styles, polygon fill rule, stipple patterns, text fonts, and a client-supplied clipping region. Applications can create more than one GC to alternate quickly between states on sequential output calls.
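A toy model makes the benefit of the graphics context concrete: the drawing call carries only a drawable, a GC identifier, and coordinates, while the server looks up the rest. `ToyDisplayServer` and its methods are hypothetical names, only a few GC fields are modeled, and the default values shown are my understanding of the X11 defaults.

```python
class ToyDisplayServer:
    """Holds graphics contexts so drawing requests can stay small."""

    def __init__(self):
        self.gcs = {}
        self.next_id = 1
        self.drawn = []

    def create_gc(self, **values):
        gc_id = self.next_id
        self.next_id += 1
        # Assumed defaults; any field can be overridden at creation.
        self.gcs[gc_id] = {"foreground": 0, "background": 1,
                           "line_width": 0, **values}
        return gc_id

    def draw_line(self, drawable, gc_id, x1, y1, x2, y2):
        # Color and width come from server-side state, not the request.
        gc = self.gcs[gc_id]
        self.drawn.append((drawable, (x1, y1, x2, y2),
                           gc["foreground"], gc["line_width"]))


server = ToyDisplayServer()
thin = server.create_gc(foreground=5)
thick = server.create_gc(foreground=9, line_width=4)
server.draw_line("window-1", thin, 0, 0, 10, 10)
server.draw_line("window-1", thick, 0, 10, 10, 0)   # switch state by GC id
```

Switching between the thin and thick styles costs nothing extra on the wire; the client just names a different GC on the next call.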
X applications are event driven with events being sent to an application from a number of sources, including the X server and X toolkit, as well as other applications. Events are generated by the X server when you type on the keyboard or move the mouse or other pointing device. Some event types are generated as side effects of client requests. Each event includes a time stamp, a bitmask indicating the up/down state of all modifier keys and mouse buttons just before the event, the window the mouse is in, and details about the change the event describes.
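The state bitmask carried by each event can be decoded with a table of bit positions. The values below are the modifier and button masks as I recall them from the X11 headers and should be verified there; `decode_state` is a hypothetical helper.

```python
# Bit positions in an event's state field (assumed from the X11 headers).
STATE_BITS = [
    ("Shift", 1 << 0), ("Lock", 1 << 1), ("Control", 1 << 2),
    ("Mod1", 1 << 3), ("Mod2", 1 << 4), ("Mod3", 1 << 5),
    ("Mod4", 1 << 6), ("Mod5", 1 << 7),
    ("Button1", 1 << 8), ("Button2", 1 << 9), ("Button3", 1 << 10),
    ("Button4", 1 << 11), ("Button5", 1 << 12),
]


def decode_state(state):
    """List the modifier keys and buttons that were down before the event."""
    return [name for name, bit in STATE_BITS if state & bit]
```

A state of `0x0105`, for instance, decodes to Shift, Control, and Button1, so an application can tell a plain click from a shift-control-click without tracking key presses itself.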
An application is notified when the pointing device or cursor enters or leaves a window. A single window is globally designated as the "input focus." This window receives all keyboard input until the input focus is set to a different window. An event is generated when a window gains or loses input focus. In X, applications are expected to regenerate, on request, any information displayed in windows. When a window changes size or becomes visible, the server may need to tell the application which parts of the window to redraw. This triggers an event. Some X implementations may provide backing store and save-unders to reduce repainting, but applications must still be able to repaint a window on request.
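The requirement to repaint on request means the application, not the server, owns the content. The sketch below shows one way to meet it: keep the data being displayed and, when told that a band of the window was exposed, work out which lines to redraw. `TextWindow`, `handle_expose`, and the fixed 16-pixel line height are hypothetical, and the exposed height is assumed to be positive.

```python
class TextWindow:
    """Keeps the text it displays so any region can be redrawn later."""

    LINE_HEIGHT = 16    # pixels per text line (illustrative value)

    def __init__(self, lines):
        self.lines = lines

    def handle_expose(self, y, height):
        """Return (line number, text) pairs covering the exposed band."""
        first = y // self.LINE_HEIGHT
        last = (y + height - 1) // self.LINE_HEIGHT
        last = min(last, len(self.lines) - 1)
        return [(i, self.lines[i]) for i in range(first, last + 1)]


window = TextWindow(["alpha", "beta", "gamma", "delta"])
to_redraw = window.handle_expose(20, 20)   # pixels 20..39 were uncovered
```

Here `to_redraw` contains lines 1 and 2 only; redrawing just the damaged lines rather than the whole window keeps repaint traffic small.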
Graphics operations in X are designed to be simple and fast. They are relatively low level compared to PostScript, PHIGS, or GKS, but are still well-suited to create high-performance, visually sophisticated applications. Tasks requiring a higher-level, graphics-oriented interface can use layered graphics libraries or intermix calls to the layered libraries with basic X graphics functions.
Xlib contains about 300 routines that either map directly to X Protocol requests or provide utility functions to the client. Xlib routines allow a client application to create, destroy, manipulate, and configure windows. There are also routines for lines, polygons, arcs, text, block pixel transfers, stipple and tile filling, and color-map manipulation. Requests such as PolyLine and PolyRectangle perform multiple operations on a list of points or rectangles.
Operating on an array of objects is more efficient than making multiple graphics calls because each X request carries its own overhead. The protocol request that draws one or more rectangles is PolyRectangle, which takes a drawable (a pixmap or window), a graphics context, and a list of rectangles as parameters.
Experience with X shows that graphics performance over an Ethernet network is excellent: output usually proceeds at the speed of the display device (and is often faster when the application is running remotely rather than locally!). Although the semantics of server operations are tightly connected to the X protocol, a fair degree of freedom exists in the actual design and implementation of the server itself. The quality of the server implementation is one way vendors can add value to their competing X offerings.
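The saving from batching can be seen by encoding the request by hand. The sketch below packs a PolyRectangle request as I understand the X11 core protocol encoding (opcode 67, a 12-byte header, then 8 bytes per rectangle); the layout should be checked against the protocol specification, and `encode_poly_rectangle` is a hypothetical helper, since real applications call Xlib instead.

```python
import struct

POLY_RECTANGLE = 67     # assumed core-protocol opcode


def encode_poly_rectangle(drawable, gc, rects, big_endian=False):
    """Encode one request that draws every rectangle in rects."""
    order = ">" if big_endian else "<"
    length_words = 3 + 2 * len(rects)       # total length in 4-byte units
    out = struct.pack(order + "BxHII", POLY_RECTANGLE, length_words,
                      drawable, gc)
    for x, y, width, height in rects:
        # Signed position, unsigned size, 16 bits each.
        out += struct.pack(order + "hhHH", x, y, width, height)
    return out


rects = [(0, 0, 10, 10), (20, 0, 10, 10), (40, 0, 10, 10)]
batched = encode_poly_rectangle(0x400001, 0x400002, rects)
separate = b"".join(encode_poly_rectangle(0x400001, 0x400002, [r])
                    for r in rects)
```

Under this encoding the batched request is 36 bytes against 60 for three separate requests, and the server also decodes one header instead of three. The drawable and GC identifiers used here are arbitrary illustrative values.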
The MIT sample server (on the MIT distribution tape) consists of a section of highly portable code, and a section of device-dependent code. The sample server was designed to make device-independent code as large as possible, thus simplifying implementation at the expense of performance. Reimplementing the server to be entirely device dependent may provide the best performance, but would require a major effort to support each new workstation product.
Over time, extensions to support 3-D graphics, imaging, and even live video will be added to both the X architecture and to development tools with the goal of providing added functionality, but not at the expense of compatibility. With this in mind, software developers should consider the following criteria when evaluating a specific implementation of X: quality and robustness of code, performance between clients running X applications and remote workstations running X servers, the vendor's X development environment, and the level of difficulty required to integrate a software developer's own extensions and software into the X environment.
While the PEX project is supported by a number of companies and organizations -- including DEC, Tektronix, Hewlett-Packard, Apollo, Sun Microsystems, and the Open Software Foundation -- the actual implementation work will be done by Sun under the direction of Robert Scheifler, director of the X Consortium. Sun will develop a public implementation of PEX and provide the full network and graphics code necessary to generate 3-D graphics on an X display.
Scheifler, who was the principal architect of X, says "PEX adds a significant new functionality to X." He went on to tell DDJ that the PEX project is especially significant, and personally gratifying, because "it is another indication of how well the industry can pull together when the right technology is recognized." Scheifler added that "interest in PHIGS throughout the world has been heating up recently, especially because it is about to become an official ISO standard. It is the right thing at the right time."
The preliminary release of the software will be in mid-1989, with public release (including documentation) scheduled for late 1990. The PEX implementation will become part of the MIT X Consortium software release and will be available at distribution cost with no licensing restrictions.
On a related topic, Scheifler indicated that similar projects may be announced by the X Consortium sometime in the future, particularly projects targeted at object-oriented programming. "The consensus is that object-oriented programming is fundamental to user-interface building," he said. "We are looking into application development environments and the next generation of toolkits of which object-oriented languages are a key ingredient."
--eds.
Copyright © 1989, Dr. Dobb's Journal