C# and the Outhouse Paradigm

Dr. Dobb's Journal July 2002

By Al Stevens

Al is DDJ's senior contributing editor. He can be contacted at astevens@ddj.com.

I've been slowly getting into C#. There are many differences between this language and C++, and as I consider each of them I have to ask, first, why the difference and, second, what is the reason for the corresponding C++ feature in the first place? A couple of months ago, I looked at the C# Main function and wondered about its implementation. This month, I'll look at inheritance. There are several significant differences between the two languages when it comes to inheritance.

An object-oriented design incorporates abstraction, encapsulation, inheritance, and polymorphism. An object-oriented programming language must support these design characteristics. These requirements go back to the pioneering days of OOP and early OOP languages such as Smalltalk. C++ and C# both comply with these requirements. But they do it differently in some areas, particularly with respect to inheritance.

C++ supports multiple inheritance and C# does not. Many programmers deprecate the use of multiple inheritance because of its inherent problems. (Some programmers deprecate the use of inheritance entirely, preferring to use data members to implement the behavior of derived classes, but that is another issue to be argued at another time.)

One of multiple inheritance's problems is this: When a class derives from multiple base classes, which themselves derive from a common base class, the compiler encounters ambiguity in objects of the bottom-most class. The compiler constructs the common base class multiple times, once for each of the intermediate base classes that derive from it. Consequently, the object has multiple executions of the common base constructor and multiple copies of the common base's data members. Then, when the program references members of the common base class through the bottom-most class, the compiler does not know which of the copies to use. Thus the ambiguity. From this characteristic rises the commonly held view that multiple inheritance is a bad thing, which is probably why many OOP languages do not support it.
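
A minimal C++ sketch of that ambiguity follows. The class names (Base, Left, Right, Bottom) and the two helper functions are hypothetical, invented only for this illustration. The sketch counts constructor calls to show that one bottom-most object carries two copies of the common base.

```cpp
// Hypothetical classes, named only for this illustration.
struct Base {
    int id = 0;
    static int constructions;      // counts how often the Base constructor runs
    Base() { ++constructions; }
};
int Base::constructions = 0;

struct Left   : Base {};           // first intermediate class
struct Right  : Base {};           // second intermediate class
struct Bottom : Left, Right {};    // ends up with two Base subobjects

// Returns how many times the Base constructor runs for one Bottom object.
int base_constructions_per_bottom() {
    Base::constructions = 0;
    Bottom b;
    // b.id = 1;                   // would not compile: 'id' is ambiguous
    b.Left::id = 1;                // qualifying the path selects one copy
    return Base::constructions;
}

// Shows that the two copies of Base::id hold independent values.
bool copies_are_distinct() {
    Bottom b;
    b.Left::id  = 1;
    b.Right::id = 2;
    return b.Left::id == 1 && b.Right::id == 2;
}
```

Calling base_constructions_per_bottom() reports two constructor runs, and the unqualified reference b.id is rejected at compile time, which is why it appears only as a comment.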

C++ addresses this problem with the so-called virtual base class. If the intermediate classes in our hypothetical design declare the common base to be a virtual base class, the compiler knows to instantiate only one copy of the common base component when instantiating objects of classes indirectly and multiply derived from the common base class. This technique works well, but to use it, the designer of the intermediate classes must anticipate their use in a multiple-inheritance class library and declare the inheritance to be virtual. Those intentions are not always obvious at the outset, and users of legacy classes drawn from published sources cannot always change an ordinary base class into a virtual base class for their own convenience without compromising the integrity of the publicly distributed resource. When you fork a design, it stops being a common resource and becomes a maintenance coordination problem. Upgrade retrofits and all that. And that's what's wrong with multiple inheritance, and the C++ virtual base class mechanism does not really fix it. Perhaps a better language design would permit derived classes to declare that base classes up the chain are to be virtual only in the context of the current class hierarchy (see the accompanying text box entitled "Classes and Databases"), but that's not how C++ works.
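
As a rough sketch of the mechanism, assuming hypothetical classes Common, MidA, MidB, and Leaf, the virtual keyword appears where each intermediate class names its base. That placement is why the choice has to be made when those intermediate classes are written, not later by whoever combines them.

```cpp
// Hypothetical classes, named only for this illustration.
struct Common {
    int id = 0;
    static int constructions;      // counts how often the Common constructor runs
    Common() { ++constructions; }
};
int Common::constructions = 0;

struct MidA : virtual Common {};   // 'virtual' is written here, on the
struct MidB : virtual Common {};   // intermediate classes' base lists
struct Leaf : MidA, MidB {};       // shares a single Common subobject

// Returns how many times the Common constructor runs for one Leaf object.
int common_constructions_per_leaf() {
    Common::constructions = 0;
    Leaf leaf;
    leaf.id = 7;                   // unambiguous: there is only one copy
    return Common::constructions;
}

// Shows that both inheritance paths reach the same Common::id.
bool paths_share_one_copy() {
    Leaf leaf;
    leaf.MidA::id = 7;
    return leaf.MidB::id == 7;
}
```

If either MidA or MidB omitted the keyword, Leaf would again hold two copies of Common and the unqualified leaf.id would no longer compile.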

You will hear it argued, too, that multiple inheritance encourages complex design resulting in class hierarchies that are difficult to understand, maintain, and use. I reject that argument. Complexity is a function of perspective. Any design is complex to someone who has not studied it. K&R's classic "Hello World" program is complex to the programmer who has not learned C. If multiple inheritance is necessary to a design, its corresponding complexity is one of the design costs and the designer must pay that cost and deal with that complexity. If multiple inheritance is poorly applied — if any design feature is poorly applied — the resulting complexity is the fault of the designer and not the design tool.

Why does C++ include multiple inheritance? It didn't always. Bjarne Stroustrup did not add multiple inheritance to C++ until Version 2.0, when the language was about six years old. Given that other OOP languages do not support multiple inheritance (which indicates that multiple inheritance is not an absolutely drop-dead, essential, can't-live-without-it design model), and given that it exhibits the problems I just discussed, why does C++ support the feature? Let's ask Bjarne himself and look for his answer in The Design and Evolution of C++ (Addison-Wesley, 1994).

After admitting that he gave multiple inheritance priority over more important features only because the author of a book about a competing language had declared it impossible to do in C++, Stroustrup says the following:

I saw multiple inheritance as a potentially important means of organizing libraries around simpler classes with fewer interclass dependencies.

He gives an example of a derived class that, without multiple inheritance, can inherit the behavior of only one of two base classes, making it necessary to have two independent classes to support the behavior of one of each of the two bases, each of which would mutually exclude the behavior of the other base class, making it impossible to have one derived class that exhibits the behavior of both bases. If that sentence is as confusing as it looks to me, get Bjarne's book and see for yourself.

But is multiple inheritance itself really a desirable design tool given that many programmers think not and given that many OOP languages do not support it? Return with us now to those thrilling days of yesteryear...

Many years ago, I was a maintenance programmer on a big aerospace project during the early days of the Apollo program. I observed that some subsystem designs were intuitively easy to understand from their documentation, whereas others required more study and even one-on-one consultation with their designers. These differences could not always be attributed to levels of complexity in the problem domains. Sometimes, it seemed, the solutions were more complex than the problems they solved. I wondered about it.

Then, one morning while on vacation at my in-laws' farm, I was walking toward the outhouse when it hit me: No one ever looked at an outhouse and asked what it was for. Its purpose is evident from its appearance. (If you are too young or urbane to know what an outhouse is, send an inquiry to editors@ddj.com. Our editor-in-chief knows all about such things.)

I was young and had not heard the engineering cliché, "form follows function," so while relaxing in the subject in question, I set aside the Sears Roebuck catalog and meditated about outhouses and software systems. From that meditation, I coined this phrase: "The solution to a problem should resemble the problem." That is, when someone observes a solution, the solution should remind the observer of the problem being solved. I realized that the architecture of a software subsystem should, when depicted in charts and graphs and source code, tell the reader what the subsystem does, just as the blueprint for an outhouse reveals its noble purpose. This concept became my guiding principle for software design from that day forward. I called it the "outhouse paradigm."

(My wife's mother, a lady of gentle sensitivities, always preferred the term "privy" and wished I wouldn't use the more vulgar term, until Judy explained that it took her a while to get me to say "outhouse.")

So, how does this story relate to multiple inheritance? If a design is to accurately model the problem it solves, it must be able to represent that model in a way that suggests the problem. If the problem domain is a complex network and not a simple single-inheritance hierarchy, then single inheritance cannot accurately represent the problem. Not that you couldn't build a software system that works without multiple inheritance. Obviously, you must be able to — we've been building software systems for years without a lot of the tools now available to us. But will any particular design fit my time-proven outhouse paradigm without employing multiple inheritance? Surely, many will because they don't need it, but let's consider one simple example, an example I've used before to justify the need for multiple inheritance as a design tool.

In September 1996, I answered the question, "When should you use multiple inheritance?" with this explanation:

Consider an Asset class, Building class, Vehicle class, and CompanyCar class. All company cars are vehicles. Some company cars are assets because the organizations own them. Others might be leased. Not all assets are vehicles. Money accounts are assets. Real estate holdings are assets. Some real estate holdings are buildings. Not all buildings are assets. Ad infinitum. When you diagram these relationships, it becomes apparent that multiple inheritance is a likely and intuitive way to model this common problem domain.

And, I would add today, single inheritance is not. Without multiple inheritance, you need many more interclass dependencies and redundancies to represent the complexities of this problem. Consider only the company car situation and Figures 1 and 2. In Figure 1, the behavior of a vehicle must be encapsulated in two classes, the Vehicle class and the VehicleAsset class. The only difference between the two is that the VehicleAsset class derives from Asset to add that behavior. The disadvantage is that when you maintain one, you have to duplicate the maintenance on the other. A similar relationship exists between LeasedCar and OwnedCar. They both encapsulate the specific behavior of a car. Looking at Figure 2, we have the same number of classes, but no duplication of encapsulation across classes. The Vehicle class encapsulates everything about a generic vehicle. Likewise with the Car class.
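
The figures are not reproduced here, so the following C++ sketch is only one reading of the multiple-inheritance arrangement that the text attributes to Figure 2. The class names come from the text. Every member (wheels, book_value, monthly_payment) and both free functions are hypothetical stand-ins for whatever behavior the real classes would carry.

```cpp
// Everything about a generic vehicle lives in one place.
class Vehicle {
public:
    explicit Vehicle(int wheels) : wheels_(wheels) {}
    int wheels() const { return wheels_; }      // hypothetical behavior
private:
    int wheels_;
};

// Everything about a generic asset lives in one place.
class Asset {
public:
    explicit Asset(double value) : book_value_(value) {}
    double book_value() const { return book_value_; }   // hypothetical behavior
private:
    double book_value_;
};

// Car adds car-specific behavior to Vehicle.
class Car : public Vehicle {
public:
    Car() : Vehicle(4) {}
};

// A leased car is a car but not an asset.
class LeasedCar : public Car {
public:
    explicit LeasedCar(double payment) : monthly_payment_(payment) {}
    double monthly_payment() const { return monthly_payment_; }
private:
    double monthly_payment_;
};

// An owned car is both a car and an asset: multiple inheritance.
class OwnedCar : public Car, public Asset {
public:
    explicit OwnedCar(double value) : Asset(value) {}
};

// Code written against either base accepts an OwnedCar unchanged.
int wheel_count(const Vehicle& v) { return v.wheels(); }
double value_on_books(const Asset& a) { return a.book_value(); }
```

An OwnedCar can be passed to both wheel_count and value_on_books, while a LeasedCar is accepted only by wheel_count. Neither the vehicle behavior nor the asset behavior is written twice.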

I must have been right because no one among my vast readership challenged this conclusion back in 1996. So, given that multiple inheritance is, in my opinion and those of my readers, sometimes a desirable design model, does that eliminate C# as a useful OOP language? Certainly not. C# is one of a family of languages that, according to the great plan from Redmond, coexist and collaborate peacefully on the .NET platform. C++ is another member of that family. If a .NET solution requires — according to the outhouse paradigm — multiple inheritance, one can implement that portion of the design with C++. What does one do when a design component requires multiple inheritance and some other language feature that C# supports better? One must do what one always does when faced with mutually exclusive choices. One must compromise.

I expect some readers to respond by saying, "But multiple inheritance is a Bad Thing." You'll just have to do better than that. As I pointed out in '96 and again a couple of months ago, a chain saw can be a bad thing, too, when improperly used. But sometimes nothing else will do.

The C# Interface

The closest C# comes to multiple inheritance is with the interface feature, something that C# took from Java. An interface is, in the view of a C++ programmer, an abstract base class with no data members and only pure virtual public member functions for which the class may provide no implementation. C# classes can derive from one base class and as many interface classes as they wish. Here's the catch: The derived class has to implement all the functions in the interface. Which means that the Asset class in Figure 2 must be an interface and the OwnedCar class in Figure 2 must provide implementations of all the functions in that interface and any data members that a generic asset requires. Which means that if you add an OwnedJet class after your company goes public and starts making dough, you have to implement Asset all over again in the OwnedJet class. Which introduces a maintenance problem.
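
To keep the examples in one language, here is a C++ sketch of what that restriction forces, using the description above of an interface as an abstract base class with no data members and only pure virtual functions. The member functions (book_value, depreciate, wheels, engines) are hypothetical; the point is only that the asset logic must be written once per owning class.

```cpp
// The "interface": no data members, only pure virtual functions.
class Asset {
public:
    virtual ~Asset() = default;
    virtual double book_value() const = 0;      // hypothetical member
    virtual void depreciate(double rate) = 0;   // hypothetical member
};

class Car { public: int wheels()  const { return 4; } };
class Jet { public: int engines() const { return 2; } };

// OwnedCar must supply the asset data and every Asset function itself.
class OwnedCar : public Car, public Asset {
public:
    explicit OwnedCar(double value) : value_(value) {}
    double book_value() const override { return value_; }
    void depreciate(double rate) override { value_ -= value_ * rate; }
private:
    double value_;
};

// OwnedJet repeats the same data and the same code, line for line.
class OwnedJet : public Jet, public Asset {
public:
    explicit OwnedJet(double value) : value_(value) {}
    double book_value() const override { return value_; }
    void depreciate(double rate) override { value_ -= value_ * rate; }
private:
    double value_;
};
```

A change to how assets depreciate now has to be made in both OwnedCar and OwnedJet, which is the maintenance problem described above.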

It will be argued that a common implementation of the Asset interface can be done with a class and that OwnedCar and OwnedJet could each include an object of that class as a data member, with the owning class's interface implementation functions forwarding to the actual implementation. That is a valid argument, albeit a kludge. That style of implementation violates the outhouse paradigm because it also violates the time-honored object-oriented rule that says use inheritance to implement the IS-A relationship among classes and embedded data members to implement the HAS-A relationship and never confuse the two. But it's the only way to do it, which is another way of saying we must shape the solution to fit the tools rather than the problem. You might just as well embed an Asset object into OwnedCar and OwnedJet and be done with it.
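
The workaround can be sketched in C++ terms as follows. AssetImpl is a hypothetical name for the shared implementation class, and book_value and depreciate are hypothetical members. The asset logic now exists once, but each owning class still has to write forwarding functions, and an IS-A relationship ends up implemented with a HAS-A member.

```cpp
// The "interface": no data members, only pure virtual functions.
class Asset {
public:
    virtual ~Asset() = default;
    virtual double book_value() const = 0;      // hypothetical member
    virtual void depreciate(double rate) = 0;   // hypothetical member
};

// One shared implementation of the interface (hypothetical class).
class AssetImpl : public Asset {
public:
    explicit AssetImpl(double value) : value_(value) {}
    double book_value() const override { return value_; }
    void depreciate(double rate) override { value_ -= value_ * rate; }
private:
    double value_;
};

class Car { public: int wheels()  const { return 4; } };
class Jet { public: int engines() const { return 2; } };

// OwnedCar IS-A Asset by declaration but HAS-A AssetImpl to do the work.
class OwnedCar : public Car, public Asset {
public:
    explicit OwnedCar(double value) : asset_(value) {}
    double book_value() const override { return asset_.book_value(); }
    void depreciate(double rate) override { asset_.depreciate(rate); }
private:
    AssetImpl asset_;   // embedded object that holds the real logic
};

// OwnedJet needs the same forwarding functions, but no asset logic.
class OwnedJet : public Jet, public Asset {
public:
    explicit OwnedJet(double value) : asset_(value) {}
    double book_value() const override { return asset_.book_value(); }
    void depreciate(double rate) override { asset_.depreciate(rate); }
private:
    AssetImpl asset_;
};
```

A change to the depreciation rule is now made only in AssetImpl. Adding a function to the Asset interface, though, still means adding a forwarder to every owning class.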

Conclusion: C#'s implementation of inheritance is deficient because you cannot use its class system to effectively portray some typical problem domains. But that's not a big problem. For those domains, we still have C++.

Epilogue

They tore down that outhouse many years ago. The farm had indoor plumbing for several years before that, but you never discard something that works until its replacement has been proven to be reliable. So the respectable old outhouse teaches us another lesson. Don't rush to embrace the latest, trendiest doodad to the exclusion of an old reliable institution until we know whether the newfangled device really works. Until we're flush with confidence. So to speak.

DDJ