High-Level Program Development

Dr. Dobb's Special Report December 2000

In the future, programmers may do less coding

By Bard Bloom, Jim Russell, John Vlissides, and Mark Wegman

The authors are researchers at IBM. They can be contacted at bardb@us.ibm.com, jrr@us.ibm.com, vlis@us.ibm.com, and wegman@us.ibm.com, respectively.
About IBM Research
Automatic Programming

There are at least three trends currently driving software technology that we believe will change software development to such an extent that over the coming years, many programmers will spend more time analyzing business needs than coding. This means that programming may increasingly be done at the level of application-domain concepts, with lower level programming relegated to small parts of the system where they are unavoidable.

The first emerging trend lies in increasing levels of abstraction, which lets programs be written in terms of the problem itself. For instance, a business program might work with "Customers" and "Bills," rather than "database tables" and "sockets."

The second trend involves tools that let people with different interests and points of view work on the same program. The business process programmer, for instance, will work only with Customers and Bills. The database programmer, on the other hand, will map Customers and Bills to database tables -- and all within the same program.

The third developing trend involves optimization techniques. The business-level program working with Customers and Bills would be hideously slow if run directly, but automatic optimization can make it run nearly as well as a hand-crafted program built with lower level concepts.

Luckily, these threads work together, making it possible for development to be completed mostly by domain experts, rather than programmers. And we predict the results will be efficient enough for production use. When automatic optimization fails, as it surely will now and then, advanced tools will allow the areas of inefficiency to be extracted and optimized by hand without disturbing the logic or high-level structure of the program.

In this article, we'll survey some of the advanced technologies driving these trends. For more detailed information, see http://www.research.ibm.com/drdobbs/.

High-Level Business Processes

In the future, high-level programming will be done with specialized tools called "application builders." Although rudimentary application builders have been around for a while, application builders of the future will encompass almost all programming tasks with a small set of domain-specific languages for the most common application domains. Consequently, users will more likely view them as "domain-specific application development tools."

Consider, for example, a common domain-specific programming task -- the design and implementation of a travel expense accounting application. The builder will show different views of the software to different people, presenting to each the aspects that concern him. An executive's view might show which data are available when and how. The person incurring an expense might see and be allowed to change the amount of the expense. The person writing the check should not be allowed to change the amount. The database administrator, working at a much lower level than the executive, will see the database schema. His view is related to the executive's -- both will be aware of concepts such as "the amount of the expense." There might be many such views. Ultimately, a businessperson could describe a business process, and the software that a participant in that process uses would be automatically constructed.

The description of this process involves "actors" (that is, people or their proxies), their actions, and the objects they act on. No current application tool does a good job modeling all three. Lotus Notes, for instance, understands the actors as people and the objects as documents, but it only understands the actions in the sense that it will send mail saying an action needs to be taken. Notes is improving, of course, most recently adding some workflow models. Over time we expect commercial tools will do most of what an application builder should do. Enterprise Builder, a research project being developed by our team in the Programming Languages & Software Engineering group at IBM Research, is well on its way to doing it now.

Enterprise Builder is the internal name of a high-level, visual language for describing and envisioning business processes -- an application builder, in other words. To demonstrate the capabilities of emerging tools such as Enterprise Builder, we'll use it here to model the travel expense accounting system. However, we've also used Enterprise Builder for more robust applications, including modeling customer call centers, internal banking systems, and supply-chain support systems.

Figure 1 captures a process whereby a traveler fills out a travel expense form, virtually staples his receipts to it, and passes it to his manager for approval. If approved, the manager sends the form to the accounting department for payment.

On the left of the diagram are the three roles played by participants in the process. On the bottom are data objects: one for the form and two for lists of related objects. The blocks in the middle of the diagram are actions, representing one user viewing and manipulating some data; they are on the same level as the role of the person who performs them. Lines extend from data objects and forms to the actions that use and modify them. The programmer or the businessperson can delve more deeply into the definitions of any of these parts of the program, for example, to see what types of attributes are available for a data object. Constraints can be placed on the objects: When improper information is entered in a user interface, the system displays an appropriate error message or forbids a nonsensical action. Each action step has further details, telling precisely what information is available to users and how it can be manipulated. Cross referencing is easy, because the model is a richly structured object: The tool can reveal, say, which action steps use a particular field of an object.

Alas, this approach treats people as no more than statements in a program or cogs in a machine, which is both immoral and ineffective. Users may quite properly object to being dehumanized. As an approach toward gaining the benefits of mechanization without restricting human freedom so severely, we are experimenting with processes described in terms of goals, and supplementing activities with preconditions and postconditions. Users could do anything allowable (that is, anything with a precondition that is satisfied) at any time; the system would understand the needs of the process well enough to suggest actions that make progress toward achieving the goals. However, such actions would not be obligatory. Users could, for example, explore the corporation's information system, following references from datum to datum like a big hypertext document, within the limits of their permissions to view corporate data.

Each Enterprise Builder object has more detail behind it. The TravelForm object has panes describing its fields, the user interfaces that apply to it, and the constraints it must satisfy. The role objects include details about their relationship to each other -- for example, that the Approver should be the Traveler's manager whenever possible. Even the lines have information. The connection between an object and a use of that object tells which GUI will be used to manipulate that object at that point.

There is enough information in this model for Enterprise Builder to generate a sample application automatically. Users of the process can try out the sample and comment on it, and developers can use Enterprise Builder to modify it accordingly. Several such iterations should take on the order of hours or days, increasing the chance that users will be happy with the result.

Enterprise Builder hides most implementation details from programmers, making reasonable default assumptions. This is the key to quick and painless generation of an application -- albeit a likely inefficient one. As a first cut, we simply do not worry about efficiency. Optimization can be done later, by programmer or machine.

Any significant program will perform at least a few calculations that are outside the scope of Enterprise Builder's abstractions. For these, we descend to the level of conventional programming. Enterprise Builder provides hooks that make such multilevel programming possible. For example, the travel expense application integrates hand-crafted code for computing taxes. In our case studies, only a small fraction of the many thousands of lines of Java generated by Enterprise Builder have been written by programmers.

We have been able to model several business processes with Enterprise Builder. An early one modeled a customer help desk solution. We interviewed the person in charge of the help desk for an hour and a half as we entered information into Enterprise Builder. We ended up generating a prototype application functional enough to let us shift our focus to the business process and how it could be improved. With tools like Enterprise Builder, programmers may go beyond being passive implementers of business decisions to being an integral part of the decision-making process.

Improving Program Structure

Higher levels of abstraction result in shorter, more comprehensible programs. Nonetheless, when a problem is intrinsically complicated and changeable, the program will necessarily be complicated and changeable. A telltale sign of complicated code is that programmers cannot find the topics they are looking for, or that a conceptually small change prompts modifications to disparate parts of the program. Design patterns help people plan for change by encapsulating variabilities in their code, making it easier to modify without repercussions.

However, even the best extant design falls prey to a problem we call "the tyranny of the dominant decomposition." A program is usually organized along a particular dimension: a class hierarchy perhaps, or a functional module structure. Topics that suit this dimension are easy to address. Other concerns wind up scattered across modules. For example, consider a compiler organized around a class hierarchy that implements an abstract syntax tree. Each class will have a method for intermediate code generation. Scattering the concern of intermediate code generation across the entire syntax tree dimension makes it difficult to view and change as a whole.

The same issue arises in a company's business processes (including their Enterprise Builder implementations). A common decomposition is by department. One person might be responsible for the medical department's procedures, another for the purchasing department's. It may be hard to get all the departments to do the same things in the same way, much as the company's top management may wish for it.

Requirements may be inconsistent for good reasons. The purchasing department might insist that a person's manager be the approver for anything purchased for the person. On the other hand, the medical department might insist that medical matters be kept secret even from the manager. Resolving this conflict will require careful examination of all the ways that approval is granted -- temporarily rearranging the business processes to highlight approval with little regard to division. That is, the original decomposition of the program by division is temporarily lost in favor of the new decomposition, but it will be restored when this task is done.

IBM's experimental Hyper/J tool (http://www.alphaworks.ibm.com/tech/hyperj) allows new ways of decomposing programs and recomposing them into equivalent programs organized along different dimensions. Viewed one way, a segment of code might be related to approvals; viewed another way, it is related to the purchasing department. Hyper/J lets you reorganize code for either concern, gathering all the parts about approval in one place when the approval process should be changed. Hyper/J understands the code well enough to alert programmers to conflicts they may introduce. This mode of programming is more powerful and flexible than standard object-oriented programming, as it keeps any particular decomposition from dominating the program's organization. Combining Hyper/J with Enterprise Builder should greatly increase the maintainability of many programs.

Optimization of High-Level Languages

Enterprise Builder makes application programming easier and more accessible to nonprogrammers. But applications designed at a high level without direct and careful attention to low-level concerns frequently turn out agonizingly slow, bloated, or both. Optimization can compensate, at least in part.

Historically, compilers have included increasingly sophisticated optimizations that have afforded higher level languages with acceptable run-time overhead. The basic optimization technique has been to examine the program unit and determine either that certain operations are completely unnecessary (and hence can be eliminated) or that sequences of operations can be accomplished with simpler, more specific and efficient code sequences. Optimizers look at increasingly large amounts of the program; current optimization techniques look over entire class hierarchies. That strategy has worked well for decades as the technology has moved from assembler, through Fortran and Cobol, to C++.

However, Java presents new challenges to the compiler writer. Garbage collection, dynamic loading, and threads have been around for a long time, but never in the dominant language of the decade. Dynamic loading is particularly difficult to optimize; the compiler does not see the whole program in advance, and so it cannot analyze the code completely. However, recent advances allow optimizations even in the presence of dynamic loading, giving nearly the level of optimization we have seen for other languages. Garbage collection and threading are difficult to optimize because almost any operation may happen at almost any time, so the optimizer must produce fast code that nonetheless can tolerate surprises. Advances allow optimizing garbage collection and threading, to the point where Java should perform competitively with C++ in the near future.

Dynamic Optimizations

Beyond these programming language advancements lie higher order abstractions expressed in the programming language. For example, most modern programming environments include powerful libraries (collection and GUI classes, for instance). These deserve optimizations of their own.

Libraries are written separately from the applications that use them. Library writers cannot specialize their data structures and algorithms to a particular application; the burden of choosing the library class for optimal performance typically falls on the application programmer. That's often done poorly or not at all, because the application programmer doesn't know enough about the details of the library to make an informed choice. The higher level a system is, the more important and numerous such decisions become, but the less able the user is to make them.

Early attempts to solve this problem, such as SETL and VERS, were not good enough for mainstream programmers. They attempted to guess the algorithms and data structures that were most appropriate by looking at the source code, which is impossible in theory and unworkable in practice. Fortunately, an effective approach has recently emerged that considers information collected from a few executions of the program.

Our Dynamic Application Partitioning (DAP; see http://www.research.ibm.com/dap/) system illustrates this approach. DAP automatically determines how best to place objects in a distributed program, perhaps the most important decision in such programs. DAP monitors the execution and records how often particular objects communicate with each other. Then it computes an ideal placement by determining the minimum cut set of a graph. We have constructed linear time heuristics (see the accompanying text box entitled "Automatic Programming") that makes this an efficient computation in practice.

The problem remains that future runs of the program may not be the same as the instrumented run. To address that, DAP automatically examines the object placements and determines characteristics that caused an object to be placed on one machine rather than another. For example, it can detect that factory objects should run on the same server as the objects they create.

Ultimately, we predict this technique will handle caching, replication, and similar issues automatically. Users of high-level tools such as Enterprise Builder will not have to address such issues; application builders and DAP-like analysis will ensure high-quality implementations.

View of the Future

We see traditional optimization technology and a range of dynamic optimizations coalescing in development environments of the future. The result will be a dramatic improvement in the efficiency of executables generated from high-level descriptions. Meanwhile, tools will evolve natural abstractions for their users' domains of expertise, including intuitive ways to manipulate and connect those abstractions. The result will be systems by which nonprogrammers can create domain-specific, production-quality applications quickly.

Even if such an ambitious goal is long in coming, there will be early benefits. For example, Enterprise Builder produces prototype applications quickly before it will produce robust, scaleable, and secure applications. The prototypes will make it easier to get the requirements right. DAP lets us build visualizations of object placement today, helping programmers understand distribution issues even when it can't do the full placement. Hyper/J lets programmers add new functionality to extant applications now. In short, a little bit of the future is already here.

DDJ