TACKLING LARGE-SCALE PROGRAMMING PROJECTS

Exploring the power a networked distributed computing environment brings to the software development project

by William Courington, Jonathan Feiber, and Masahiro Honda

W. Courington, a technical writer for Sun, holds an A.B. from Occidental College and has been writing about software for 15 years. J. Feiber, the director of programming technologies for Sun, holds a B.A. in computer science from the University of Colorado and has eight years of experience in software engineering. M. Honda is manager of NSE development at Sun. He has a Ph.D. in computer science from the University of Wisconsin at Madison and has 12 years of experience in software engineering.

When the personal computer revolution started back in the 1970s, life was relatively simple. The computers, though more difficult to use than many of today's PCs, were uncomplicated machines. The software they ran was equally modest, largely because of the computers' limited memory and other resources, and it was usually developed by one person.

As the complexity and power of personal computers grew and their use became more widespread, users demanded software that was more sophisticated and easier to use. Nowadays, developing new software products usually involves large teams that include not only programmers, but quality assurance engineers, release engineers, technical writers, and others, all working toward a common goal.

These large-scale programming efforts, often undertaken in a network environment, face several challenges not encountered by the lone developer. Developers now must contend with the difficulties of product complexity, staff interference, and network utilization.

Product complexity involves not only the profusion of files associated with a product, but also the management of different versions of these files and the ability to rebuild these different versions. Staff interference arises when the work of team members overlaps, such as when two or more programmers want to modify the same module at the same time. The added dimension of working on a network brings its own problems, as well as benefits.

Over the years, several tools have been developed to address these problems. The most promising solution today is an object-oriented, network-based, interactive environment, such as the one present in the Network Software Environment (NSE) developed by Sun Microsystems. Using such a system, a complex product appears to be composed of a few components; people modifying the same component appear to have their own copies; and resources scattered around the network appear to be local to each machine.

Objects Make Complexity Manageable

One characteristic of a large-scale software product is a profusion of files. A single project may have hundreds or thousands of source files, object files, libraries, executables, and documents. To impose some order on this mass, developers typically put related files into directories and organize the directories into hierarchies reflecting the product's structure. For example, the files related to a subsystem might be placed in a common directory, and files related to programs in that subsystem might be placed in subordinate directories.

Although a hierarchy of directories is a well-proven organizational tool, it does not address several other product-complexity issues.

As a product evolves, so do its files. Each file typically exists in multiple versions, with each version representing a set of enhancements or bug fixes.

Managing multiple versions of files is the domain of tools known as version control systems. The original and perhaps best-known of these is SCCS, the Unix Source Code Control System. Like directories, version control systems help manage one of the many dimensions of product complexity.

As if multiple versions of multiple files were not trouble enough, the time required to rebuild a large system (compile and link all modules) is often so great that complete rebuilds are impractical on a routine basis. The alternative to rebuilding an entire system is to find the source files that have been changed, recompile them, and then relink the system. Although this is simple in principle, faithfully tracking changes is difficult in practice, and imperfect tracking can produce bugs of exquisite obscurity when incompatible modules are linked together.

A class of tools sometimes called system modelers has been developed to automatically find and rebuild only changed files. The Unix make program is the best-known system modeler, and it, too, addresses one more aspect of product complexity.
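The core idea behind such a tool can be sketched in a few lines of Python. This is only an illustration of timestamp comparison, not how make itself is implemented: the function names and the rules table are hypothetical, and the sketch ignores chains of dependencies (a target that depends on another target), which a real system modeler must follow.

```python
import os


def file_mtime(path):
    """Return the modification time of path, or None if it does not exist."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def needs_rebuild(target, sources, mtime=file_mtime):
    """A target is stale if it is missing or older than any of its sources."""
    target_time = mtime(target)
    if target_time is None:
        return True
    for src in sources:
        src_time = mtime(src)
        if src_time is None:
            raise FileNotFoundError(src)
        if src_time > target_time:
            return True
    return False


def stale_targets(rules, mtime=file_mtime):
    """Given {target: [sources]}, list the targets that must be rebuilt."""
    return [t for t, srcs in rules.items() if needs_rebuild(t, srcs, mtime)]
```

The mtime parameter exists only so the logic can be exercised without touching the file system; by default the functions consult real file timestamps.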

Directories, version control, and system modeling represent the state of the art in many software projects today. Disciplined use of these tools is a great help, but it is still inadequate because the tools are exclusively file oriented. When the complete software development cycle is considered, it becomes clear that a software product consists of more than just code-related files. The product may also include proposals, schedules, requirements, drawings, documentation, specifications, data dictionaries, test data, test drivers, and test results. These diverse entities, which may be manipulated with tools supplied by multiple vendors, need to be managed coherently.

A single general solution to accommodate this diversity of software building blocks can be found in a hierarchy of objects. Everything in an object-oriented system is an object of some type. All objects have names, values (contents), and revisions (change histories).

Figure 1, below, shows a typical object hierarchy. Three types of objects have been defined: files, targets, and components. Additional object types -- for example, data flow diagrams or data dictionaries -- can also be integrated into such a system. All object types, whether developed in house or by outside suppliers, fit into the object hierarchy. Note especially that components can contain all types of objects, including other components.
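As a rough illustration, an object hierarchy of this kind might be modeled as follows. The class and field names are assumptions made for the sketch and do not reflect the NSE's actual data structures; the only properties taken from the description above are that every object has a name, a value, and a revision history, and that components can contain any type of object, including other components.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    name: str
    kind: str                                       # "file", "target", or "component"
    revisions: list = field(default_factory=list)   # change history, oldest first
    children: list = field(default_factory=list)    # used only by components

    @property
    def value(self):
        """The current contents: the most recent revision, if any."""
        return self.revisions[-1] if self.revisions else None

    def add(self, child):
        """Place an object of any kind inside a component."""
        if self.kind != "component":
            raise TypeError("only components can contain other objects")
        self.children.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, object) pairs for this object and everything below it."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical product: a component holding a subcomponent, a file, and a target.
product = Obj("product", "component")
ui = product.add(Obj("ui", "component"))
ui.add(Obj("main.c", "file", revisions=["first draft", "bug fix"]))
ui.add(Obj("ui_prog", "target"))
```

A new object type, such as a data flow diagram, would simply be another value of kind in this model.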

Environment Hierarchies

Environments can be arranged in hierarchies, as shown in Figure 2. An environment hierarchy reflects the relationships among project activities. That is, all environments represent activities, and subordinate environments reflect subactivities that can proceed in parallel with their parent activities. Thus, a diagram of a project's environment hierarchy will strongly resemble the project's organization chart. Components are distributed through an environment hierarchy according to their relevance to the activities conducted in each environment.

Although all environments have the same properties (there are no environment types), they can be grouped into three classes corresponding to activities common to many software projects: release environments, integration environments, and development environments.

At the top of an environment hierarchy is a release environment. It is here that the complete product is tested and released to manufacturing or to customers. A release environment contains the component(s) at the top of a product's object hierarchy; these components contain all other components as subcomponents, but the subcomponents are not the concern of the person or department in charge of the release environment.

To begin development of a new major release, a release engineer can create a new release environment as a child of the current one. Copying the latest revision of the components from the old release environment to the new one establishes a new baseline for development.

Integration environments typically correspond to the activities of project groups such as departments. An integration environment usually contains the components (and their subcomponents) comprising a subsystem. A project that has multiple group levels can create corresponding levels of integration environments.

At the lowest level of an environment hierarchy are development environments, which have no children. These are the workspaces of individual staff members, and usually contain the components they are actively working on. The basic business of a software project, changing and compiling source files, is conducted in development environments.
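The three classes can be pictured as positions in a tree. The sketch below is a simplification under stated assumptions: the Environment class and its methods are hypothetical, and classifying purely by position ignores the case, described above, in which a new release environment is created as a child of the current one.

```python
class Environment:
    """One node in an environment hierarchy (names are illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def role(self):
        """Classify by position: top is release, leaves are development."""
        if self.parent is None:
            return "release"
        if not self.children:
            return "development"
        return "integration"

    def path(self):
        """Names from the top of the hierarchy down to this environment."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


# Hypothetical project: one release, one department, two programmers.
release = Environment("release")
graphics = Environment("graphics", release)
alice = Environment("alice", graphics)
bob = Environment("bob", graphics)
```

Because all environments share the same properties, the role here is derived from the tree rather than stored as a type.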

Detecting and Resolving Conflicts

As mentioned earlier, some mechanism must exist to reconcile conflicting concurrent revisions. The NSE provides two operations, called acquire and reconcile, that logically copy components (including their subcomponents) from parent environments to children, and vice versa. (The term logically copy emphasizes that although the user of an environment sees what appear to be local copies of files, the NSE shares the same copy of files among multiple environments to the degree possible.) The acquire operation copies components from a parent environment; the reconcile operation adds a new revision of a component to the parent environment.

Thus, to modify the current revision of a component, a programmer acquires the revision from his or her department's integration environment. When the modifications are complete, the programmer reconciles the component back to the parent integration environment. Reconciling the components in the top-level integration environments into the release environment produces a new revision of the complete product. Development can continue in the lower-level environments while the new release is being prepared in the release environment.

The NSE encourages parallel development, and does not prevent two programmers from acquiring the same component from a common parent environment, modifying it, and reconciling it back. If, for example, two programmers both acquire revision 5 of a component, each with the intent of creating revision 6, whoever reconciles first does create revision 6 in the parent environment. When the other programmer reconciles, however, the NSE detects a potential conflict. Revision 6 must be merged with the second programmer's changes; the merged result can then be reconciled to the parent to create revision 7. The result is exactly the same as if the programmers had created revisions 6 and 7 sequentially -- except that by working in parallel, they will probably finish the job sooner.
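The revision 5, 6, 7 scenario can be traced with a small model. The sketch below is an assumption-laden illustration, not the NSE's implementation: ParentEnv, ChildEnv, Conflict, and accept_merge are hypothetical names, a component is reduced to a single string, and a conflict is detected simply by comparing the revision number a child started from with the parent's current revision.

```python
class Conflict(Exception):
    """Raised when the parent has moved on since the child acquired."""


class ParentEnv:
    def __init__(self, revisions):
        self.revisions = list(revisions)  # revisions[0] is revision 1

    @property
    def current(self):
        return len(self.revisions)


class ChildEnv:
    def __init__(self, parent):
        self.parent = parent
        self.base = None      # revision number this child started from
        self.content = None   # working copy of the component

    def acquire(self):
        """Copy the parent's latest revision into this environment."""
        self.base = self.parent.current
        self.content = self.parent.revisions[-1]

    def reconcile(self):
        """Add the working copy to the parent as a new revision."""
        if self.base != self.parent.current:
            raise Conflict(
                f"based on revision {self.base}, parent is at {self.parent.current}"
            )
        self.parent.revisions.append(self.content)
        self.base = self.parent.current
        return self.base

    def accept_merge(self, merged):
        """Record that the working copy now includes the parent's latest revision."""
        self.base = self.parent.current
        self.content = merged


parent = ParentEnv(["r1", "r2", "r3", "r4", "r5"])
first, second = ChildEnv(parent), ChildEnv(parent)
first.acquire()
second.acquire()
first.content = "r5 + first change"
second.content = "r5 + second change"
first.reconcile()                 # creates revision 6
try:
    second.reconcile()            # parent is now at 6, but second is based on 5
except Conflict:
    second.accept_merge("r5 + first change + second change")
    second.reconcile()            # creates revision 7
```

How the merged content is produced is left to the caller in this sketch.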

The NSE resynch operation effectively acquires the conflicting revision of a component so the conflict can be resolved in the child environment. With conflicting revisions in the same environment, the file resolve tool automatically merges non-conflicting lines from two source file versions and marks conflicting lines for manual resolution. When all conflicting files are resolved, the component can be compiled, tested, and then reconciled back to the parent environment.
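The merging rule itself is easy to state: a line changed on only one side takes that side's version, and a line changed differently on both sides is marked for a human to resolve. The following sketch shows that rule under a deliberately narrow assumption, namely that both programmers edited lines in place so all three versions have the same number of lines. The function name and the conflict markers are illustrative conventions, not the output format of the NSE's file resolve tool.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equal-length line lists.

    Returns (merged_lines, conflict_count). Inserted or deleted lines are
    not handled; a real merge tool would first align the versions with a diff.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], 0
    for b, m, t in zip(base, mine, theirs):
        if m == t:            # both sides agree (changed identically or not at all)
            merged.append(m)
        elif m == b:          # only their side changed this line
            merged.append(t)
        elif t == b:          # only my side changed this line
            merged.append(m)
        else:                 # both changed it differently: mark for manual resolution
            conflicts += 1
            merged += ["<<<<<<< mine", m, "=======", t, ">>>>>>> theirs"]
    return merged, conflicts
```

When the conflict count is zero, the merged lines can be compiled and tested directly; otherwise the marked regions must be edited by hand first.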

Object and environment hierarchies effectively address product and project problems on a single machine, but a network adds yet another dimension to these problems. The new dimension is exemplified by this question: Where in the net is environment ABC, and where is revision 3 of component PDQ? The NSE addresses this question with a network-wide naming scheme. Users refer to environments and objects by simple names; the NSE transparently maps these local names to network addresses and, using Sun's Network File System, automatically mounts and accesses remote file systems as needed.
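One way to picture such a naming scheme is as a lookup table from simple names to locations on the network. The sketch below is purely hypothetical: the NameService class, the host names, and the "/net/<host>" path convention for remotely mounted file systems are assumptions for illustration and are not taken from the NSE.

```python
class NameService:
    """Maps simple environment names to (host, path) locations."""

    def __init__(self):
        self._table = {}

    def register(self, name, host, path):
        self._table[name] = (host, path)

    def resolve(self, name, local_host):
        """Return a path usable on local_host for the named environment."""
        host, path = self._table[name]   # KeyError if the name is unknown
        if host == local_host:
            return path
        # Assumed convention: remote file systems appear under /net/<host>.
        return f"/net/{host}{path}"


names = NameService()
names.register("ABC", "hostA", "/export/envs/ABC")
local_view = names.resolve("ABC", "hostA")
remote_view = names.resolve("ABC", "hostB")
```

The user asks only for "ABC" in either case; which machine holds the files is hidden behind the lookup.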

Summary

The NSE addresses many of the difficult problems that arise in large and complex software development projects. The NSE manages the complexity that comes from a large number of files, multiple versions of each file, multiple revisions of a product in the field, and multiple processor architectures. The NSE does all of these things in a networked, distributed development environment. Although not described in this brief article, the NSE supports the integration of tools that cover every phase of the software development lifecycle. Perhaps most importantly, the NSE understands the issue of scale; it continues to support projects as they grow from tens to hundreds of programmers, and from thousands to millions of lines of code.