TACKLING LARGE-SCALE PROGRAMMING PROJECTS

Exploring the power a networked distributed computing environment brings to the software development project

by William Courington, Jonathan Feiber, and Masahiro Honda

W. Courington, a technical writer for Sun, holds an A.B. from Occidental College and has been writing about software for 15 years. J. Feiber, the director of programming technologies for Sun, holds a B.A. in computer science from the University of Colorado and has eight years of experience in software engineering. M. Honda is manager of NSE development at Sun. He has a Ph.D. in computer science from the University of Wisconsin at Madison and has 12 years of experience in software engineering.


When the personal computer revolution started back in the 1970s, life was relatively simple. The computers, though more difficult to use than many of today's PCs, were relatively simple machines. The software these computers ran was also simple, largely because of the computers' limited memory and other resources. This software was usually developed by one person.

As the complexity and power of personal computers grew and their use became more widespread, users demanded software that was more sophisticated and easier to use. Nowadays, developing new software products usually involves large teams that include not only programmers, but quality assurance engineers, release engineers, technical writers, and others, all working toward a common goal.

These large-scale programming efforts, often undertaken in a network environment, face several challenges not encountered by the lone developer. Developers now must contend with the difficulties of product complexity, staff interference, and network utilization.

Product complexity involves not only the profusion of files associated with a product, but also the management of different versions of these files and the ability to rebuild these different versions. Staff interference arises from the interactions among project team members, such as when two or more programmers want to modify the same module at the same time. Working on a network adds its own problems, as well as benefits.

Over the years, several tools have been developed to address these problems. The most promising solution today is an object-oriented, network-based, interactive environment, such as the one present in the Network Software Environment (NSE) developed by Sun Microsystems. Using such a system, a complex product appears to be composed of a few components; people modifying the same component appear to have their own copies; and resources scattered around the network appear to be local to each machine.

Objects Make Complexity Manageable

One characteristic of a large-scale software product is a profusion of files. A single project may have hundreds or thousands of source files, object files, libraries, executables, and documents. To impose some order on this mass, developers typically put related files into directories and organize the directories into hierarchies reflecting the product's structure. For example, the files related to a subsystem might be placed in a common directory, and files related to programs in that subsystem might be placed in subordinate directories.

Although a hierarchy of directories is a well-proven organizational tool, it does not address several other product-complexity issues.

As a product evolves, so do its files. Each file typically exists in multiple versions, with each version representing a set of enhancements or bug fixes.

Managing multiple versions of files is the domain of tools known as version control systems. The original and perhaps best-known of these is SCCS, the Unix Source Code Control System. Like directories, version control systems help manage one of the many dimensions of product complexity.

If multiple versions of multiple files are not trouble enough, the time required to rebuild a large system (compile and link all modules) is often so great that complete rebuilds are impractical on a routine basis. The alternative to rebuilding an entire system is to find the source files that have been changed, recompile them, and then relink the system. Although this is simple in principle, faithfully tracking changes is difficult in practice, and imperfect tracking can produce bugs of exquisite obscurity when incompatible modules are linked together.

A class of tools sometimes called system modelers has been developed to automatically find and rebuild only changed files. The Unix make program is the best-known system modeler, and it, too, addresses one more aspect of product complexity.
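The idea behind a system modeler can be sketched in a few lines of Python. This is a minimal illustration, not the make program itself: the function name plan_rebuild, the rules table, and the use of plain numbers for modification times are all assumptions made for the example. A target is considered stale if it is missing, if any of its dependencies is newer, or if any of its dependencies must itself be rebuilt.

```python
def plan_rebuild(goal, rules, mtimes):
    """Return the targets that must be rebuilt to bring goal up to date.

    rules  maps a target name to the list of files it is built from.
    mtimes maps a file name to its modification time; a name that is
           absent is treated as a file that does not exist yet.
    The result lists dependencies before the targets that use them.
    """
    plan = []

    def visit(name):
        if name not in rules:
            return False  # a plain source file is never rebuilt
        # Visit every dependency first so that intermediate targets
        # appear in the plan ahead of the targets built from them.
        rebuilt_dep = False
        for dep in rules[name]:
            if visit(dep):
                rebuilt_dep = True
        stale = (
            rebuilt_dep
            or name not in mtimes
            or any(mtimes.get(dep, 0) > mtimes[name] for dep in rules[name])
        )
        if stale and name not in plan:
            plan.append(name)
        return stale

    visit(goal)
    return plan


# Hypothetical project: only main.c changed after the last build.
rules = {
    "prog": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}
mtimes = {"main.c": 5, "util.c": 1, "util.h": 1,
          "main.o": 2, "util.o": 2, "prog": 3}
print(plan_rebuild("prog", rules, mtimes))  # ['main.o', 'prog']
```

Only main.o and the final program appear in the plan; util.o is left alone because nothing it depends on has changed.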

Directories, version control, and system modeling represent the state of the art in many software projects today. Disciplined use of these tools is a great help, but it is still inadequate because the tools are exclusively file oriented. When the complete software development cycle is considered, it becomes clear that a software product consists of more than just code-related files. The product may also include proposals, schedules, requirements, drawings, documentation, specifications, data dictionaries, test data, test drivers, and test results. There needs to be a way to manage these diverse entities, which may be manipulated with tools supplied by multiple vendors, in a coherent way.

A single general solution to accommodate this diversity of software building blocks can be found in a hierarchy of objects. Everything in an object-oriented system is an object of some type. All objects have names, values (contents), and revisions (change histories).

Figure 1, below, shows a typical object hierarchy. Three types of objects have been defined: files, targets, and components. Additional object types -- for example, data flow diagrams or data dictionaries -- can also be integrated into such a system. All object types, whether developed in house or by outside suppliers, fit into the object hierarchy. Note especially that components can contain all types of objects, including other components.

Files, Targets, and Components

At the bottom of the hierarchy are files. Sun's NSE distinguishes among three types of files: ordinary, source, and derived. Ordinary files are just that; no special facilities are provided for them. Source files have their version history maintained, allowing any version of a source file to be recreated at any time. The NSE can display a summary of the differences between two versions of a source file, or of the changes made across successive versions. To minimize disk space, the NSE stores only the differences between successive versions of a source file.
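Storing only the differences between successive versions can be illustrated with Python's standard difflib module. The sketch below is an assumption about how such a scheme might look, not a description of the NSE's actual storage format: it keeps the first version whole and records each later version as a list of changed regions. The names make_delta, apply_delta, and SourceHistory are invented for the example.

```python
import difflib


def make_delta(old, new):
    """Describe how to turn old into new, keeping only the changed regions.

    Each entry is (start, end, replacement): lines old[start:end] are
    replaced by the replacement lines (either side may be empty).
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = []
    pos = 0
    for start, end, replacement in delta:
        result.extend(old[pos:start])  # unchanged lines before the change
        result.extend(replacement)     # inserted or replaced lines
        pos = end                      # skip the lines the change removed
    result.extend(old[pos:])
    return result


class SourceHistory:
    """Keeps version 1 in full and every later version as a delta."""

    def __init__(self, first_version):
        self.base = list(first_version)
        self.deltas = []

    def add_version(self, lines):
        latest = self.get_version(len(self.deltas) + 1)
        self.deltas.append(make_delta(latest, list(lines)))

    def get_version(self, number):
        if not 1 <= number <= len(self.deltas) + 1:
            raise IndexError("no such version: %r" % number)
        lines = self.base
        for delta in self.deltas[: number - 1]:
            lines = apply_delta(lines, delta)
        return list(lines)
```

Any version can be recreated by replaying deltas from the first version forward. A real system would likely bound the cost of reaching recent versions, for example by keeping the newest version whole instead of the oldest.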

Derived files are created by programs, usually compilers or linkers. Object files, libraries, and executables are examples of derived files. The NSE does not maintain successive versions of derived files because any version of a derived file can be recreated from the corresponding source file version.

Targets encapsulate all the files necessary to build a derived file together with a recipe for building it. The membership of a target must be kept up to date. For example, if a source file is changed to include a new header (include) file, the header must be added to the target. The NSE handles this automatically. Target derived files are built under the control of the make program described earlier. (The recipe is actually a makefile.)
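One way to keep a target's membership current is to scan its source files for include directives. The sketch below assumes C-style quoted includes and an in-memory table of file contents; the function name target_members is hypothetical, and the article does not say how the NSE actually discovers new members.

```python
import re

# Matches lines such as:  #include "util.h"
# Angle-bracket system headers are deliberately ignored in this sketch.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def target_members(source_name, sources):
    """Return source_name plus every header it includes, directly or not.

    sources maps a file name to its text. Files missing from the table
    are listed as members but are not scanned further.
    """
    members = []
    pending = [source_name]
    while pending:
        name = pending.pop()
        if name in members:
            continue  # already handled; also guards against include cycles
        members.append(name)
        pending.extend(INCLUDE_RE.findall(sources.get(name, "")))
    return sorted(members)
```

If main.c gains a new quoted include, rerunning the scan adds the header to the member list without any manual bookkeeping.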

Components are the most versatile objects because they can contain other objects, including other components. This property is similar to directories containing files and other directories. Because components can have other components as members, they are the natural object for representing the structure of a software product. For example, one group of components might represent a system's major subsystems. These components, in turn, might contain components representing minor subsystems. These last components might contain still other components representing programs.
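The three object types can be modeled with a handful of Python classes. This is a minimal sketch under stated assumptions: the class names, field names, and the walk helper are invented for illustration and do not reflect the NSE's internal representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProjectObject:
    """Every object has a name, a value (contents), and a revision history."""
    name: str
    value: str = ""
    revisions: List[str] = field(default_factory=list)


@dataclass
class FileObject(ProjectObject):
    kind: str = "ordinary"  # "ordinary", "source", or "derived"


@dataclass
class Target(ProjectObject):
    # The files needed to build one derived file; in this sketch the
    # build recipe (a makefile) would be kept in the inherited value field.
    members: List[FileObject] = field(default_factory=list)


@dataclass
class Component(ProjectObject):
    # A component may hold any kind of object, including other components.
    members: List[ProjectObject] = field(default_factory=list)


def walk(obj, depth=0):
    """Yield (depth, object) pairs for obj and everything it contains."""
    yield depth, obj
    for member in getattr(obj, "members", []):
        yield from walk(member, depth + 1)
```

Because a Component accepts any ProjectObject as a member, a product, its subsystems, and its programs can all be expressed with the same class, much as directories nest inside directories.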

The most basic component might contain only a target. A more elaborate component could also contain objects related to the target, such as a test-data file, an executable test driver, or a documentation file. In short, any related objects that are naturally examined and changed as a unit are good candidates for grouping together in a component. (The issue of which objects belong in a particular component should be decided by local preference, the same as when determining what goes into a directory.) Components can freely share members, allowing, for example, one copy of a common header file to be shared across many components. Revisions of components are maintained in a manner similar to that used for source files. Old revisions are immutable and can be accessed at any time. When a component is revised, all subcomponents are automatically revised too.

Environments Insulate Activities

A component hierarchy can make the structure of a complex software product intellectually manageable, but it does not address the problems that arise from the interaction of project team members. For example, if two programmers share a copy of a source file (or any object), changes made by one programmer are, at the very least, destabilizing to the other's work. The alternative to sharing is copying, which has its own interaction problems. If, for example, two programmers modify local copies of a master file and replace the master with them, the second programmer's changes obliterate those made by the first.

The NSE has a facility, called an environment, for controlling concurrent access. Unlike traditional concurrency control mechanisms, such as locks, environments are "optimistic" and don't assume that the changes are incompatible. Suppose two programmers need to change the same source file at the same time: One programmer needs to fix a bug while the other wants to add a procedure. A "pessimistic" concurrency control mechanism, in effect, assumes that the changes are incompatible and allows only one programmer to change the file at a time. Environments, on the other hand, allow both programmers to change the same file in parallel so long as they make the changes in different environments. Eventually, of course, the two sets of changes must be merged into a new version of the file. The NSE provides a tool (described later) that automatically merges compatible changes and provides assistance for resolving incompatible changes. Thus, different environments enable programmers to work on the same file in parallel, and the time spent merging is proportional to the degree of incompatibility of their changes. Generally, as a software product matures, incompatible changes become increasingly rare, so merging changes made in parallel requires very little time.

Virtualizing the File System

An environment is an insulated programming workspace that is, in many ways, similar to a process's virtual-address space in a virtual-memory system. Within an environment, a programmer refers to files by their ordinary names and manipulates files with ordinary tools such as compilers, editors, and utilities. However, just as the memory mapping hardware of a virtual-memory computer transparently maps the same address reference made by two different processes into different physical addresses, so the NSE transparently maps references to the same file from different environments to different underlying files. The result: A change made to a file in one environment is invisible in other environments, even though they contain the same file.
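The virtual-memory analogy can be made concrete with a small Python model. It is only a sketch of the idea: the FileStore and Environment classes below are hypothetical, and the real NSE performs this mapping at the file system level rather than in application code. Each environment holds its own table from file names to underlying contents, so two environments can share a file until one of them writes to it.

```python
class FileStore:
    """Holds the underlying file contents; nothing is modified in place."""

    def __init__(self):
        self._blobs = []

    def put(self, contents):
        self._blobs.append(contents)
        return len(self._blobs) - 1  # identifier of the new underlying file

    def get(self, blob_id):
        return self._blobs[blob_id]


class Environment:
    """Maps ordinary file names to underlying files, like a page table."""

    def __init__(self, store, mapping=None):
        self.store = store
        self.mapping = dict(mapping or {})  # file name -> underlying file id

    def child(self):
        # Copy the name table only; the contents themselves stay shared.
        return Environment(self.store, self.mapping)

    def read(self, name):
        return self.store.get(self.mapping[name])

    def write(self, name, contents):
        # Point this environment's name at a fresh underlying file,
        # leaving every other environment's mapping untouched.
        self.mapping[name] = self.store.put(contents)
```

Two child environments created from the same parent initially resolve main.c to the same underlying file. After one of them writes, the same name resolves to different underlying files, and the other environment still sees the original contents.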

Environment Hierarchies

Environments can be arranged in hierarchies, as shown in Figure 2. An environment hierarchy reflects the relationships among project activities. That is, all environments represent activities, and subordinate environments reflect subactivities that can proceed in parallel with their parent activities. Thus, a diagram of a project's environment hierarchy will strongly resemble the project's organization chart. Components are distributed through an environment hierarchy according to their relevance to the activities conducted in each environment.

Although all environments have the same properties (there are no environment types), they can be grouped into three classes corresponding to activities common to many software projects: release environments, integration environments, and development environments.

At the top of an environment hierarchy is a release environment. It is here that the complete product is tested and released to manufacturing or to customers. A release environment contains the component(s) at the top of a product's object hierarchy; these components contain all other components as subcomponents, but the subcomponents are not the concern of the person or department in charge of the release environment.

To begin development of a new major release, a release engineer can create a new release environment as a child of the current one. Copying the latest revision of the components from the old release environment to the new one establishes a new baseline for development.

Integration environments typically correspond to the activities of project groups such as departments. An integration environment usually contains the components (and their subcomponents) comprising a subsystem. A project that has multiple group levels can create corresponding levels of integration environments.

At the lowest level of an environment hierarchy are development environments, which have no children. These are the workspaces of individual staff members, and usually contain the components they are actively working on. The basic business of a software project, changing and compiling source files, is conducted in development environments.

Detecting and Resolving Conflicts

As mentioned earlier, some mechanism must exist to reconcile conflicting concurrent revisions. The NSE provides two operations, called acquire and reconcile, that logically copy components (including their subcomponents) from parent environments to children, and vice versa. (The term logically copy emphasizes that although the user of an environment sees what appears to be local copies of files, to the degree possible, the NSE shares the same copy of files among multiple environments.) The acquire operation copies components from a parent environment; the reconcile operation adds a new revision of a component to the parent environment.

Thus, to modify the current revision of a component, a programmer acquires the revision from his or her department's integration environment. When the modifications are complete, the programmer reconciles the component back to the parent integration environment. Reconciling the components in the top-level integration environments into the release environment produces a new revision of the complete product. Development can continue in the lower-level environments while the new release is being prepared in the release environment.

The NSE encourages parallel development, and does not prevent two programmers from acquiring the same component from a common parent environment, modifying it, and reconciling it back. If, for example, two programmers both acquire revision 5 of a component, each with the intent of creating revision 6, whoever reconciles first does create revision 6 in the parent environment. When the other programmer reconciles, however, the NSE detects a potential conflict. Revision 6 must be merged with the second programmer's changes; the merged result can then be reconciled to the parent to create revision 7. The result is exactly the same as if the programmers had created revisions 6 and 7 sequentially -- except that by working in parallel, they will probably finish the job sooner.
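The revision 5, 6, 7 scenario can be traced with a short Python model of optimistic concurrency. The class and method names are illustrative assumptions, and the merge step is supplied by the caller; the sketch shows only how a parent's revision number lets a late reconcile be detected.

```python
class ReconcileConflict(Exception):
    """Raised when the parent has a newer revision than the child acquired."""


class Parent:
    def __init__(self, contents, revision=1):
        self.contents = list(contents)
        self.revision = revision


class Child:
    def __init__(self, parent):
        self.parent = parent
        self.acquire()

    def acquire(self):
        # Remember which revision this work is based on.
        self.base_revision = self.parent.revision
        self.contents = list(self.parent.contents)

    def reconcile(self):
        if self.parent.revision != self.base_revision:
            raise ReconcileConflict(
                "parent is at revision %d, work is based on revision %d"
                % (self.parent.revision, self.base_revision)
            )
        self.parent.contents = list(self.contents)
        self.parent.revision += 1
        self.base_revision = self.parent.revision

    def resynch(self, merge):
        # merge(parent_contents, child_contents) returns the merged contents.
        self.contents = merge(list(self.parent.contents), self.contents)
        self.base_revision = self.parent.revision


def union_merge(theirs, mine):
    """Toy merge: keep the parent's lines, then add lines only the child has."""
    return theirs + [line for line in mine if line not in theirs]


parent = Parent(["base"], revision=5)
first, second = Child(parent), Child(parent)
first.contents.append("bug fix")
second.contents.append("new procedure")

first.reconcile()              # creates revision 6
try:
    second.reconcile()         # detected: parent is no longer at revision 5
except ReconcileConflict:
    second.resynch(union_merge)
    second.reconcile()         # creates revision 7
print(parent.revision, parent.contents)  # 7 ['base', 'bug fix', 'new procedure']
```

Neither programmer was ever blocked; the only extra work fell on whoever reconciled second.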

The NSE resynch operation effectively acquires the conflicting revision of a component so the conflict can be resolved in the child environment. With conflicting revisions in the same environment, the file resolve tool automatically merges non-conflicting lines from two source file versions and marks conflicting lines for manual resolution. When all conflicting files are resolved, the component can be compiled, tested, and then reconciled back to the parent environment.
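A much-simplified version of such a merge can be written in Python. The sketch makes two assumptions the article does not state: that the tool compares both versions against their common ancestor, and that edits replace lines in place without inserting or deleting any. The function name merge_lines and the conflict markers are invented for the example.

```python
def merge_lines(base, mine, theirs):
    """Merge two edited copies of base, line by line.

    Returns (merged_lines, conflict_count). A line changed on only one
    side is taken from that side; a line changed differently on both
    sides is wrapped in markers for manual resolution.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    conflicts = 0
    for original, m, t in zip(base, mine, theirs):
        if m == t:
            merged.append(m)       # unchanged, or both made the same change
        elif m == original:
            merged.append(t)       # only the other programmer changed it
        elif t == original:
            merged.append(m)       # only this programmer changed it
        else:
            conflicts += 1         # both changed it, in different ways
            merged.extend(["<<<<<<< mine", m, "=======", t, ">>>>>>> theirs"])
    return merged, conflicts
```

With a base of three lines where one programmer edits the second line and the other edits the third, the result contains both edits and reports zero conflicts. A practical tool would also have to align inserted and deleted lines before comparing them.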

Object and environment hierarchies effectively address product and project problems on a single machine, but a network adds yet another dimension to these problems. The new dimension is exemplified by this question: Where in the net is environment ABC, and where is revision 3 of component PDQ? The NSE addresses this question with a network-wide naming scheme. Users refer to environments and objects by simple names; the NSE transparently maps these local names to network addresses and, using Sun's Network File System, automatically mounts and accesses remote file systems as needed.

Summary

The NSE addresses many of the difficult problems that arise in large and complex software development projects. The NSE manages the complexity that comes from a large number of files, multiple versions of each file, multiple revisions of a product in the field, and multiple processor architectures. The NSE does all of these things in a networked, distributed development environment. Although not described in this brief article, the NSE supports the integration of tools that cover every phase of the software development lifecycle. Perhaps most importantly, the NSE understands the issue of scale; it continues to support projects as they grow from tens to hundreds of programmers, and from thousands to millions of lines of code.