Software Manufacturing

Dr. Dobb's Journal April 2004

Automating software component production

By Fred Wild

Fred is the principal at Advantage Software Tech and can be contacted at fred@codepatterns.com.

Software manufacturing is the process of applying automated methods to produce a specific set of software components from models, specifications, or other sources of metadata. In this regard, "automated methods" suggests a "manufacturing line" for creating components. Software manufacturing does not prescribe what those methods should be—only that they should be automated. For its part, "specific components" suggests sets of components that have specific purposes in specific contexts (as opposed to generic components or skeletons of components). This isn't to suggest that you use just one method or mechanism for producing components, but rather that you use the most appropriate means for producing specific components on a given line. And finally, metadata is the raw material used to manufacture software components.

Given this, you might be tempted to credit the code-generation features found in a number of UML tools. Usually, however, these tools offer only degenerate examples, leaving them outside the class of code-creation strategies we would call "software manufacturing." Following the intent of software manufacturing yields more than the ability to generate one-to-one code-artifact equivalents of your design-model elements; you need to think in larger terms. The intent is to create as many of—and as much of—your required set of implementation artifacts as possible. Keeping code artifacts bidirectionally in sync with UML elements on a one-to-one basis is fine for maintaining pictorial representations of what your code contains, but in software manufacturing it is "zero leverage." Instead, you seek as much leverage as possible, with ratios of manufactured artifacts to specification elements that start at 20 to 1 and go up from there.

You also want to treat yourself to the most useful methods of specification (embracing both UML and non-UML metadata as appropriate). Wherever possible, you want to reduce the act of creating specific types of software components to a matter of specification and use of the manufacturing line. You specify the highest level, significant details, and allow the remaining instrumentation and supporting work products (helper classes, interfaces, adapters, IDL, DDL, DTDs, XML, and the like) to be created by applying code patterns, mappings, and bridging concepts. In such a context, the need to reverse engineer previously forward-engineered work products (to create a "round trip" path through the code back into design) becomes much less interesting, if not entirely undesirable. Reverse engineering is analogous to saying, "Here is the sausage, tell me about the pig." Since we are dealing with many simultaneous work products, we don't have a productive reason to translate in that direction.

Manufacturing Engineering: Parallels and Differences

One of my family members made his living as a machine designer, focusing on setting up manufacturing lines for creating various types of goods. He dealt with everything from generalized machines that can be configured to do a range of things (producing sweaters or socks) to custom machines that did one very specific thing (fasten rivets into a particular set of holes).

His general goals were always the same: Optimizing quality (of the end product), meeting production goals (manufacturing line throughput), ensuring reliability (of the manufacturing line), and minimizing the overall cost of goods produced. Such goals were not usually all satisfied to the degree desired in one pass. In each case, machines and processes on the line were updated, tuned, and even replaced until the right balance was finally achieved.

The considerations of manufacturing enter into a product development cycle early on, when the item to be manufactured is in its design stage. Enabling an effective manufacturing process is a key design criterion, referred to as "manufacturability." (Because it is possible for a design to either complicate or simplify manufacturing in major ways, manufacturing engineers get involved early on to ensure things go their way.)

In the history of goods production, manufacturing lines were not always automated. Before automation, many manufacturing plants consisted of row upon row of stations where people performed some repetitive task and pushed the work to the next station. Introducing automation into production lines is part of the evolutionary change that has taken place in that field. (In a few years, people may look back on this period and wonder why it took so long for us to adopt automated means of producing software.)

Manufacturing Software

In terms of manufacturing end deliverables, there is a difference in how software developers approach the problem. We are not trying to manufacture many instances of exactly the same thing, but rather instances of similar things. For example, if a manufacturing line were set up for producing monogrammed shirts, it would be a closer match to cases we often see. Why? Because in such a situation, each shirt is much like the previous, except for sizing information and the actual letters that need to appear in the monogram. Some means of describing these differences would need to be fed into the mechanisms of the production line as parameters to create finished shirts.

In software development—particularly when looking at best practices for specific classes of systems—we see an impressive similarity among components with like purposes. Further, the differences in those components are often well structured and predictable. This suggests that a manufacturing approach may be feasible for such components. If so, what can be implied about our approach to software engineering that might exploit these similarities?

Seeking a Workable Approach

Despite rising and falling hopes in particular methods, tools, and paradigms, we still acknowledge that we should strive to treat software production in an engineering context. We want to, and need to, follow the trends of other industries and produce more software in decreasing time with increasing quality. Software systems and their components follow patterns at many levels of granularity, which have been described in numerous educational materials. In practice, developers also see that they employ similar mechanisms from implementation to implementation, adapting each to the needs of the specific application at hand, so there is a nagging feeling that these similarities and differences can be separated and treated more systematically. These things prompt us to ask, "Why do developers continue to create software systems (and their components) using repetitious, labor-intensive, and error-prone means?" The truth is, for no sustainable reason. The good news is that our ways and means are maturing, and ideas about manufacturing software are being appreciated in terms of how software is both similar to and different from other manufactured work products.

The Software Factory. Sadly, on the rare or endangered species list of software development paradigms (at least in the U.S.) is the idea of the "software factory," which aims to specialize jobs and define workflow in the software production lifecycle to match those of other engineering disciplines (see "The Software Factory: A Historical Interpretation," IEEE Software, March 1989; and Japan's Software Factories: A Challenge to U.S. Management, Oxford University Press, March 1991, ISBN 0195062167, both by M.A. Cusumano).

A software factory centers on formal staging, controls, specialization, and optimization applied to the workflow, processes, and procedures within the software development lifecycle. Adopting the formalities of implementing a software factory is, in practice, an instance of deciding to become a CMM Level 5 organization, and it brings with it all that you expect from such transformational changes. As such, as noble as its goals and principles are, adopting this approach remains an expensive proposition for organizations currently running under the less formal norms of the day. If you can afford to change your organization to adopt a full-blown software factory approach, great! If not, you may still want to use some of the ways and means employed by software factories when optimizing for the production of certain types of components; namely, automating component production (see "Siemens Automotive Brings Software Factories On-Line," by D. Ladd, Corporate News Release, March 2, 2001; http://www.usa.siemensvdo.com/media/2001/pt/0302dihtm).

An Industry of Reusable Software Components. Despite very early vision and the best efforts of a number of people, we have a mostly unfulfilled promise of assembling software systems from libraries of commercially mass-produced components (see M.D. McIlroy's "Mass Produced Software Components," NATO Conference on Software Engineering, March 1968). Instead, we have a shadow of this in the widespread use of popular languages and facilities that provide a set of general capabilities. As valuable to productivity as these are, the components they offer are generally service-level components, not domain-specific components. Service-level components such as strings, collections, RPC mechanisms, UI frameworks, and the like are well appreciated, but it is still left to application developers to assemble business components and applications from them, often manually. Once business components are in hand, we may try to reuse them, but they are, by nature, bound to the mechanisms of the applications at hand, which makes reuse narrow and conditional. Thus, the successes that one can point to in reusable domain-specific asset programs are most often found within specific organizations, not in the general state of the practice as a whole. Even where this approach is practiced, concerns remain over how effectively components can be adapted to fit the application environment in which they are reused. Unless a component can adapt to the needs of the environment, the environment must adapt to the needs of the component. This is not a trivial problem and is still being actively investigated (for instance, see "Classifying Software Components Using Design Characteristics," by C. Clifton and W. Li, Proceedings of the 10th IEEE Knowledge-Based Software Engineering Conference, November 1995; "Retrieving Software Components that Minimize Adaptation Effort," by L. Jilani, J. Desharnais, M. Frappier, and R. Mili, Proceedings of the 12th IEEE International Conference on Automated Software Engineering, November 1997; and "Beyond Components-Connections-Constraints: Dealing with Software Architecture Difficulties," by J. Kyaruzi and J. van Katwijk, Proceedings of the 14th IEEE International Conference on Automated Software Engineering, October 1999).

Looking at the root issues involved in these challenges, you see a reexpression of a principle we all know: one size does not fit all. Domain-specific components are influenced by all of the functional and nonfunctional requirements they need to satisfy. To enable a drop-in style of reuse, it is not enough for components to be domain specific; they must also be implementation-environment specific (for example, to the type of server, type of database, and so on). Success in component reuse remains a matter of separating, and appropriately rebinding, aspects of function from aspects of being well-behaved citizens of their environment.

Domain-Specific Software Architecture (DSSA). DSSA grew out of the need for more rigorous and repeatable techniques in developing mission-critical software systems, sponsored predominantly by DARPA. I first came across the concepts of "domain engineering" at a meeting held at the Software Engineering Institute (SEI) in the early 1990s. Later, work on these concepts migrated to the Software Productivity Consortium (SPC), where it is maintained today under the heading of "Product Line Engineering." Related work on Domain-Specific Languages (DSLs) is also ongoing in certain research facilities ("Domain-Specific Language Design Requires Feature Descriptions," by A. van Deursen and P. Klint, November 30, 2001, Report SEN-R0126, National Research Institute for Mathematics and Computer Science) and is a particular focus of Don Batory and the staff of the Product-Line Architecture Research Group at the University of Texas, Austin (http://www.cs.utexas.edu/users/schwartz/).

DSSA-DSL has a sound basis as a means of formally mapping specifications (the DSL) into implementations (instances of the DSSA). Unfortunately, as an approach, it is more costly to set up than can usually be afforded from the shallow depths of the pockets of mere mortals in the nonacademic sectors. DSSA recipes often call for creating your own domain-specific language, with a custom compiler/translator. The translator must perform all of the required semantic checking and produce a software system or subsystem that is correct and consistent, mapping both the structural and behavioral aspects of the solution described in your language. Given the sophistication involved in developing such a compiler, the lead time can be quite long, and you can end up with quite a research project on your hands.

The DSSA-DSL approach is akin to the path taken in electrical engineering with VLSI circuit design using VHDL or Verilog and hardware synthesis. These languages provide a higher-level means of specifying how a circuit should be structured and behave, and the silicon compiler figures out how the gates should be constructed. The leverage in this approach is huge, and it is a valid claim that, in using such tools, we have gone well beyond what is feasible by manual means. Similar hopes exist for the eventual commercialization and widespread use of DSLs for software systems.

OMG's Model-Driven Architecture (MDA). OMG's MDA (http://www.omg.org/mda/) combines UML modeling with the concept of creating translations between platform-independent and platform-specific models (PIM/PSM). In most examples, a PIM is a UML model of static structure diagrams that describes your logical design elements. PIMs are transformed into PSMs via automated transformation steps (using tools that perform these translations). Since PIMs and PSMs can be cascaded (treating one step's PSM as the PIM input to a subsequent transformation), a chain of transformations can theoretically turn an arbitrarily high-level PIM into increasingly granular and concrete models, ending in the lowest level PSM: code ready for the compiler.
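The cascading idea can be sketched in a few lines. This is a minimal illustration, not any real MDA tool: the dictionary "models" and the two transformation functions (add_persistence, add_java_types) are invented stand-ins for what a real tool would do against full UML/XMI models.

```python
# Minimal sketch of cascaded PIM -> PSM transformations.
# Each step's output model becomes the input model for the next step.

def add_persistence(model):
    """First transformation: map each logical entity to a table name."""
    return {name: {**attrs, "table": name.upper()}
            for name, attrs in model.items()}

def add_java_types(model):
    """Next transformation: map abstract field types to Java types."""
    type_map = {"string": "String", "int": "int", "money": "BigDecimal"}
    return {name: {**attrs,
                   "fields": {f: type_map[t]
                              for f, t in attrs["fields"].items()}}
            for name, attrs in model.items()}

def transform(pim, steps):
    """Cascade: treat the PSM produced by one step as the PIM of the next."""
    model = pim
    for step in steps:
        model = step(model)
    return model

pim = {"Account": {"fields": {"owner": "string", "balance": "money"}}}
psm = transform(pim, [add_persistence, add_java_types])
print(psm["Account"]["table"])              # ACCOUNT
print(psm["Account"]["fields"]["balance"])  # BigDecimal
```

The point is only the shape of the pipeline: each transformation adds platform detail, and the chain ends at a model concrete enough to emit as code.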

The MDA technique comes with cited promises of making UML modeling the method of choice for specifying and producing middleware components, such as .NET, CORBA, and J2EE components, via automated means.

Issues with XMI convergence to a standard have been troubling for many early adopters and vendors who hopped onto the bandwagon of association with this approach and its publicity. Nevertheless, it seems reasonable to predict that things will stabilize within a few more cycles (another year or two).

Metadata-Based Component Production. I have enjoyed the most practical and varied successes using this software manufacturing method (http://www.codepatterns.com/). It has a low setup cost, and the benefits of applying it can be demonstrated in a matter of days. It can be adopted as an add-on to the modeling tools you likely already have in hand and to the information you can extract from legacy systems. Lastly, it requires only a moderate level of sophistication to implement. The crux of the approach is to separate the information needed to derive components (metadata) from the implementations of the components. It can be thought of as more general than OMG's MDA because it does not require UML or XMI as input formats, although it can use UML as a metadata source if the user elects to. Components are produced by applying patterns of implementation to the metadata using an automated mechanism (Figure 1).
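At its smallest, the mechanism is just "metadata in, source files out." The following sketch makes that concrete; the metadata shape and the Java-flavored templates are illustrative inventions, not the format of any particular tool.

```python
# Sketch of metadata-driven component production: apply an
# implementation pattern (here, a simple template) to metadata
# describing each entity, producing one source file per entity.
from string import Template

entity_template = Template("""\
public class $name {
$fields
}""")

field_template = Template("    private $type $field;")

def manufacture(metadata):
    """Return a dict of generated filename -> source text."""
    sources = {}
    for entity in metadata["entities"]:
        fields = "\n".join(
            field_template.substitute(type=t, field=f)
            for f, t in entity["fields"].items())
        sources[entity["name"] + ".java"] = entity_template.substitute(
            name=entity["name"], fields=fields)
    return sources

metadata = {"entities": [
    {"name": "Customer", "fields": {"id": "long", "name": "String"}}]}

for filename, source in manufacture(metadata).items():
    print(f"--- {filename} ---")
    print(source)
```

A real production line would use richer templates and many patterns per entity, but the separation is the same: the metadata says what the components are, and the patterns say how they are implemented.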

Software components are similar to other manufactured components in that they can be designed to follow a set of rules, standards, and conventions. More specifically, they often need to follow prescribed interfaces and behaviors, especially when they are meant to exist within a particular implementation framework, as citizens of that framework. These interfaces and behaviors form the set of rules, or implementation patterns, that their implementations must follow.

For example, consider Java EJBs. The J2EE environment calls for a predictable relationship among Entity Beans, their EJB containers, and the Home and Remote interfaces that each needs to implement. Given the definition of the container-bean coupling in an EJB server, a number of the environmental aspects that components need to obey are fixed. The differences between Entity Beans seem mostly describable in terms of their data elements and methods, so as a start, we can create metadata to describe these differences. I say that this is a start because we know, in practice, that components need to do more than this.

Also consider some of the common aspects imposed by data management and security policies in building a real application. Suppose you are dealing with a financial application and must respond to a requirement to log changes to certain entities, recording the principal who made each change and what the change was. You will also want control over how and where such changes are logged. These are matters of local choices and policies, so you can't expect components or commercial frameworks to provide this part of the solution for you. Nevertheless, mixing in these policy-driven requirements still lends a predictable result. To add this requirement to the component production line, you tag the elements in the metadata that need to log changes, and you codify the pattern that must be applied to make a generated component behave as it should. Elements in the metadata that are not tagged as requiring logging simply will not have such code produced for them when they are generated.
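The tag-driven idea can be sketched as a conditional branch in the generator: tagged elements get the logging pattern woven into their generated code, untagged elements do not. The tag name, the AuditLog call, and the setter shape below are all hypothetical.

```python
# Sketch of tag-driven generation: only metadata elements tagged
# "logged" get change-logging code in their generated setters.
# The generated Java text (AuditLog, principal) is illustrative.

def generate_setter(entity, field, ftype, tags):
    lines = [f"public void set{field.capitalize()}({ftype} value) {{"]
    if "logged" in tags:
        # Policy-driven pattern: record who changed what, old and new value.
        lines.append(
            f'    AuditLog.record(principal, "{entity}.{field}", '
            f"this.{field}, value);")
    lines.append(f"    this.{field} = value;")
    lines.append("}")
    return "\n".join(lines)

metadata = {
    "entity": "Trade",
    "fields": {
        "amount": {"type": "BigDecimal", "tags": ["logged"]},
        "comment": {"type": "String", "tags": []},
    },
}

for field, spec in metadata["fields"].items():
    print(generate_setter(metadata["entity"], field,
                          spec["type"], spec["tags"]))
```

Changing the local logging policy then means changing one pattern and regenerating, rather than editing every affected component by hand.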

It is also useful to see that logging brings additional management implications. We do not want to allow log files or tables to grow unbounded; log entries of a certain age need to be periodically archived and purged. The fact that we have identified the elements in the metadata that need to be logged means we have also identified the associated logs that need to be archived and purged. Thus, another set of patterns can be applied to the metadata to produce the scripts or jobs that run periodically to carry out those functions, using our chosen means for doing so. The more we look for things we can derive from the metadata, the more we will see.
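As an example of such a derived pattern, a purge script can be generated from the same tags that drove the logging code. The table-naming convention (ENTITY_FIELD_LOG) and the CHANGED_AT column are assumptions for the sake of the sketch.

```python
# Sketch: derive a periodic log-purge script from the same tagged
# metadata that drove log generation. Table and column names follow
# an assumed convention (<ENTITY>_<FIELD>_LOG, CHANGED_AT).

def purge_script(metadata, retain_days=90):
    statements = []
    for field, spec in metadata["fields"].items():
        if "logged" in spec["tags"]:
            table = f"{metadata['entity'].upper()}_{field.upper()}_LOG"
            statements.append(
                f"DELETE FROM {table} "
                f"WHERE CHANGED_AT < NOW() - INTERVAL '{retain_days} days';")
    return "\n".join(statements)

metadata = {
    "entity": "Trade",
    "fields": {"amount": {"tags": ["logged"]}, "comment": {"tags": []}},
}
print(purge_script(metadata))
```

One set of tags in the metadata thus yields two coordinated artifacts, the logging code and its maintenance job, which stay consistent because both are regenerated from the same source.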

Because we need to take custom requirements into account, one size does not fit all. But if we have adopted a capability for setting up component production lines, then as soon as we know what we want, we can quickly specify the rules for how given components are created, and apply those manufacturing rules whenever the situation calls for them.

Changes to Development Processes

Software manufacturing is not just about producing code using automated means. Although this is an important aspect, it is more about the process of building software systems in a way that maximizes consistency and quality and places due emphasis on getting a return on investment for careful modeling and specification.

Architecture and Systems Engineering. The stability of implementation platforms and technologies will be either an enabler or a disabler. To be an enabler, decisions have to be made and standards have to be chosen. When applying a manufacturing approach, we must begin with the end in mind. The production line cannot be set up until the platform is defined. Senior architects and system engineering people need to make these decisions as early as possible. Then we can apply manufacturing methods and techniques that are most appropriate for creating each set of components with the platform in mind.

Prototyping. To automate component production, we need to understand the inner workings of our components in precise terms. This means it is a good practice to first build prototypes of a representative set of components. Once you have perfected the prototypes, you can use them as a reference for creating and applying your implementation patterns. You will also see the need arise for specific facts to be expressed in the metadata. These facts become apparent because decision points in automating the patterns will require them.

Analysis and Design. It is important to notice that the approach of developing components on a production line places a great deal of reliance on the correctness of the metadata that is used as input. In some environments, business and systems analysts create object models and data models only as "suggestions" of what the developers might use in the actual system implementation. This needs to be tightened up. The importance of using these inputs as metadata, or as authoritative sources for deriving metadata, should not be underestimated. Once you set up production lines so that you get components based on what you model, the models need to be right. If this level of rigor is not employed in your modeling processes, you will need to change this.

Consolidating Metadata. Notice that instead of framing the question as "We have a model, what components can we generate from it?" in software manufacturing we come from the other direction: "Given the components that we are trying to create, what models do we need to describe them?" Sometimes the answer fits quite nicely into specifications supplied in UML; at other times, different or additional specifications will be required.

When we think in this direction (beginning with the end in mind), we remain open to gathering in metadata that adequately describes our system elements from each of the places that should contribute to filling out the whole picture (Figure 2).
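Consolidation can be as simple as merging facts about each element from every contributing source. In this sketch, the two source formats are hypothetical stand-ins for, say, a UML model export and a database schema dump.

```python
# Sketch of consolidating metadata from multiple sources into one
# picture per entity (cf. Figure 2). Source formats are invented.

def from_model(uml_classes):
    """Facts contributed by the design model: field names and types."""
    return {c["name"]: {"fields": dict(c["attrs"])} for c in uml_classes}

def from_schema(tables):
    """Facts contributed by the database schema: physical table names."""
    return {t["entity"]: {"table": t["table"]} for t in tables}

def consolidate(*sources):
    """Merge the facts from each source, keyed by entity name."""
    merged = {}
    for source in sources:
        for name, facts in source.items():
            merged.setdefault(name, {}).update(facts)
    return merged

uml = [{"name": "Customer", "attrs": [("id", "long"), ("name", "String")]}]
ddl = [{"entity": "Customer", "table": "CUST"}]
meta = consolidate(from_model(uml), from_schema(ddl))
print(meta["Customer"]["table"])   # CUST
print(meta["Ustomer"]["fields"] if False else meta["Customer"]["fields"])
```

Each source contributes only what it authoritatively knows; the consolidated record is what the production line consumes.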

Metadata Management. Once we have gathered the metadata together, we need the ability to compile it in a manner that validates its structure and cross-references. Other validations may be required to ensure that the metadata is consistent and complete, given how it will be used in applying the patterns for producing target components.
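A minimal example of such a compile step follows; the "ref:" type convention and the schema rules are illustrative assumptions, but the shape is general: check the metadata's structure and cross-references before any patterns are applied.

```python
# Sketch of a metadata "compile" pass: validate cross-references
# before generation. The "ref:<Entity>" type convention is invented.

def validate(metadata):
    """Return a list of error strings; empty means the metadata compiles."""
    errors = []
    names = {e["name"] for e in metadata.get("entities", [])}
    for entity in metadata.get("entities", []):
        for field, ftype in entity.get("fields", {}).items():
            # A reference type like "ref:Order" must name a known entity.
            if ftype.startswith("ref:") and ftype[4:] not in names:
                errors.append(
                    f"{entity['name']}.{field}: unknown entity '{ftype[4:]}'")
    return errors

good = {"entities": [
    {"name": "Order", "fields": {"id": "long"}},
    {"name": "LineItem", "fields": {"order": "ref:Order"}}]}
bad = {"entities": [
    {"name": "LineItem", "fields": {"order": "ref:Order"}}]}

print(validate(good))  # []
print(validate(bad))   # ["LineItem.order: unknown entity 'Order'"]
```

Catching a dangling reference here is far cheaper than discovering it as a compile error scattered across dozens of generated components.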

We also need to deal with change management. Since the metadata plus our patterns yield the source code for our components, both the metadata and patterns become our source code and need to be stored under CM control for the same good reasons that we place source code under CM control.

Conclusion

It is happening. We are seeing a trend to treat software component engineering in a fashion similar to how manufacturing engineers treat production lines, and various techniques are immediately available for us to use. The wide adoption of modeling tools, and the ability to bring their contributions together into consolidated metadata, enables us to set up software production lines and automate the repetitive tasks of creating software components against well-understood frameworks and platforms.

We have a number of very promising engineering methods for delivering quality and productivity in producing software using manufacturing techniques. What's more, these are things that we can do within our own software development organizations. In many cases, these methods and techniques can be a much better alternative to outsourcing software development work. We don't need less expensive labor—we need more effective methods. By using software manufacturing methods, we are able to dramatically improve both quality and time-to-completion aspects of producing software, and optimize for cost at the same time.

DDJ