VISUAL OBJECT-ORIENTED PROGRAMMING

Standard C with home brewed OOP features

Rob Dye

Rob is a software engineer at National Instruments and a member of the LabView 2.0 development team. He can be reached at 12109 Technology Blvd., Austin, TX 78727.

Graphical, direct-manipulation interfaces are quickly becoming the standard for today's software. Object-oriented programming (OOP) techniques can be naturally applied to such interfaces, because the on-screen objects that users see and manipulate can be directly and conveniently implemented as abstract objects in an OOP language.

LabView is a program with just such an interface. It is a visual programming and execution environment for data acquisition and laboratory automation. Functions are wired together in LabView's diagrammatic, dataflow language to produce executable programs. Graphical display objects provide both interactive and programmatic input and output for the functions. (See Dr. Dobb's Software Engineering Sourcebook, Winter 1988, 28-35 for a more detailed description of LabView.) While LabView itself is not an OOP environment, our implementation of Version 2 of LabView is very much object-oriented.

One thing noteworthy about the LabView object-oriented implementation is that the language is standard C-flavored but incorporates our own home brewed OOP mechanisms. This article describes these mechanisms -- how they were implemented, how messages are dispatched, how inheritance is achieved, and how objects are represented.

Why Not a Real OOP Language?

Why don't we just use a real OOP language to reap OOP benefits? Fundamentally because when development began on LabView 2.0 in mid-1987, no high-performance OOP language was available on the Mac. Several OOP environments have since become available, yet we still feel our OOP implementation has advantages over these others. By building our own OOP features into standard C, we have the freedom to buy into as much object-orientedness as we need and can afford. We can leave out those features that we feel negatively affect performance, and yet build in and fine-tune those that we feel are worth the price of implementing.

Our implementation of LabView 2.0 embodies two OOP concepts: 1. The close binding of objects with the methods that operate on them, and 2. the code-sharing framework accorded by inheritance. These concepts are manifested in several features: A concise way of sending messages to objects and an inconspicuous dispatcher for those messages; a mechanism enabling a class to automatically inherit methods from its superClass; and a class hierarchy that can be traversed at run time by a second dispatcher to allow a class to forward messages to its superClass. In implementing these features, we generally optimized for speed rather than space.

Together, these features give us almost all the benefits that any real OOP language offers. A few features are missing, though. We have no true data encapsulation because it must be provided by the language, and C's file scoping and #include files are minimal encapsulation features. Garbage collection is sometimes listed as a feature of OOP languages. It is not a feature of our architecture -- our objects are responsible for cleaning up after themselves. This is in keeping with the C philosophy that the programmer retains all the power, not to mention all the responsibility.

A class browser like the one in Smalltalk, although certainly not a requirement for an OOP language, would be a nice tool for managing our large collection of classes. To perform this task, we use a group of tools that treat the OOP system as a matrix, with classes labeling rows and messages labeling columns. (See the accompanying box, "Managing Class Attributes," for a discussion of these method table tools.)

These OOP mechanisms give us a more powerful language to work with, but its successful use hinges on programmer discipline. Therefore, before getting into the details of the implementation, we'll describe how these features appear to the programmer. For the most part, they look like the familiar features of the C language.

The template that defines an object's instance variables is created through the use of nested macros, where the nesting mirrors the class ancestry of the object. See Listing One (page 35) for an example adapted from LabView's source code. The example shows the definition of object fields for a class hierarchy four layers deep; the classes shown are involved in the display of input and output values to functions. Figure 1 depicts the logical relationships between these classes, and Figure 2 shows how some instances are graphically represented in LabView.

At the root of the hierarchy is the data display object (DDO). Front panel DDOs inherit all of their fields by nesting the DDO_ClassFields macro within the FPDDO_ClassFields macro, and then supplying a few new fields of their own. These two classes are examples of abstract classes because no object instances of these classes are ever actually created; their purpose is to provide a base class from which many sub-classes may inherit certain methods.

Numeric display objects inherit all the fields of the FPDDO class and supply those fields specific to numbers, such as the numeric representation, a digital display object, and its range of valid numbers.

Finally, we see the actual structure definition of several objects with three typedefs. The standard numeric display object is defined simply in terms of the NumericDDO_ClassFields. The Knob display object inherits all those fields, plus some information relating to its graphical depiction and the scale around its perimeter. The StringDDO inherits from the FPDDO and adds fields necessary for displaying text.

Once a template for a class has been defined, the programmer must be able to refer to an object of that class. In LabView, the objects allocated for a single document (called a "virtual instrument") are contained within a data structure called an ObjHeap. A particular object is therefore referred to unambiguously by a pair of values: A handle (a doubly indirect pointer) to an ObjHeap and an ObjID. The ObjHeap is a relocatable, dynamically sized block of memory in which objects are allocated. An object's ObjID is its offset from the beginning of its ObjHeap. (Handles and memory management are discussed later in this article.)

Because a pair of values is used to refer to an object, accessing an object's field requires dereferencing the heap's handle, then adding the offset to yield a pointer to the object's structure. Macros are used to perform these two steps in a consistent way for each class. The definition of KnobPtr in Listing One is an example of such a macro. Example 1 shows an example of its use.

Example 1: Defining the macro

  KnobDDORec *k;
  k = KnobPtr(heap, self);
  k->knobFlags |= clockwiseFlag;

Another implication of this two-value identifier is that all messages must take at least two arguments: An ObjHeap handle and an ObjID. It does not necessarily mean that all stored references (such as one object holding a link to another) need to have both values, because the ObjHeap is usually known by the context. In LabView, the vast majority of inter-object links are within the same ObjHeap.

One of the features most visible to the programmer is the syntax for sending messages. The message is not an explicit argument to a function called, say, MsgDispatch; rather, the message is the actual name of the function that is called. The message is therefore emphasized, not the dispatching mechanism. For example, to send a message asking an object to copy itself, you would write code like this:

  theCopy = oCopy(h, o,...);

The name of the function or method that actually gets invoked need not be known by the programmer, and frequently it isn't known. The ultimate destination is determined by the class of the object described by the first two arguments to the message. The O (for object) at the beginning of the message name is a convention used to signal those reading the code that this is no ordinary function call. (It also gives minor amusement to those who read the message names as holy incantations.)

  theCopy = supCopy(myClass, h, o,...);

The extra argument myClass is used so that the message-dispatching mechanism can correctly crawl up the class hierarchy tree to the parent class. This mechanism is more complex than that used for normal messaging, and space considerations do not permit its description in this article.

Writing a method is no different from writing any other function in C. The method names, however, adhere to a convention so that the names can be automatically generated by the method table tools. The name is generated by the concatenation of the name of the class to which they belong; the character O; the name of the message to which this method responds; and the word Method. Example 2 shows such a method.

Example 2: A typical method

  KnobOCopyMethod(heap, self, I)
      OHHandle heap;
      ObjID self;
      ...
      {
      ...
      }

Memory Management of Objects

As mentioned earlier, object instances (Figure 3) physically reside in a data structure called an ObjHeap (shown in the middle of the figure). ObjHeaps themselves reside in a data structure, defined by the Macintosh Memory Manager, called a "zone" (shown at the right side of the figure). ObjHeaps are relocatable blocks of memory that must be accessed through a nonrelocatable master pointer. Only one master pointer exists for each relocatable block in a zone, and it belongs to the Memory Manager. When the block must be relocated, the Memory Manager updates the master pointer to point to the block's new location. All other references to the block are handles -- pointers to the master pointer -- thus, we have double indirection. This arrangement allows the Memory Manager to compact the zone by moving all the relocatable blocks to one end of the zone when it becomes fragmented from numerous allocations and deallocations.

ObjHeaps, therefore, live in the domain of the Macintosh Memory Manager. Objects within LabView live in the domain of the ObjHeap Manager. LabView's ObjHeap Manager takes care of object allocation and deallocation within ObjHeaps. At first glance, this extra layer of memory management software may seem to be a source of extraordinary overhead, but it actually results in a significant increase in performance.

One of the lessons we learned (the hard way) from developing LabView 1.0 was that the performance of the Mac Memory Manager begins to degrade severely when the number of blocks allocated in a zone gets to be more than a couple of thousand. This is because allocation of a new block may require searching extensively through a fragmented zone before finding a free block large enough to satisfy the memory request. If such a block can't be found, the zone must be compacted and the search restarted.

All of the OOP languages that have become available recently on the Mac (the Object Pascal environments, Think C 4.0, and the soon-to-be released MPW C++) make the same mistake we made in LabView 1.0; each object is allocated in its own pointer or handle block. Some ambitious programs written in LabView may require as many as 50 linked documents totaling tens (perhaps hundreds) of thousands of objects. Our heavy reliance on the Mac Memory Manager degraded all aspects of performance that relied on memory allocation, even drawing. Other OOP languages that rely on the Memory Manager can be expected to run into the same problems.

Adding an extra layer of memory management improves the performance of both layers. Each layer has to deal with fewer numbers of blocks. And because the lowest layer of object management is our own, we are free to tweak the ObjHeap Manager to enhance performance. As mentioned earlier, we avoided the framework imposed by an OOP language in order to keep this kind of freedom.

More Details.

Another advantage of ObjHeaps has to do with saving all the objects of a document to disk. As explained earlier, a virtual instrument is in large part made up of two ObjHeaps. Each ObjHeap can be written out to disk as a single block. Because inter-object references are simply offsets within an ObjHeap and not memory addresses, the references need not be encoded or transformed in any way to survive the move to disk and then back into memory. This goes a long way towards achieving truly persistent objects, that is, objects that maintain their identity and interrelationships from one invocation of the program to another.

One disadvantage of maintaining objects within an ObjHeap is the expense of calculating a pointer to an object from its ObjHeap handle and offset. This disadvantage is somewhat exacerbated by the fact that ObjHeaps are relocatable, which means that a pointer to an object inside an ObjHeap, once calculated, can go stale across function calls, which can cause memory relocations. Programmers must be careful to refresh such pointers at the appropriate times. We consider this a minor penalty, given the massive global savings in memory management. Judicious use of register variables can further reduce the penalty.

Representation of Objects

An object has a logical extent that corresponds exactly to the struct that defines the object's fields. However, an object's physical extent is larger than its logical extent by three 32-bit integers (plus, perhaps, a few bytes of internal fragmentation). These three integers are in a header that is at a negative offset from the logical beginning of the object. They contain system information that is normally invisible to the programmer: The actual physical size of the object, a scratch field, and a pointer to the data structure representing the class of which the object is an instance.

The physical size of the object is the private data of the ObjHeap Manager. The scratch field is used both by the ObjHeap Manager (for example, during the compaction of an ObjHeap) and by objects during certain message protocols (for example, oCopy, oCompile).

The pointer to the class data structure is the most important field for our purposes in this article. It is the link that binds the object to its methods..

The defProc

All the defining information about a class is held in a data structure we call a defProc. For the most part, it is a table of pointers to the class's methods that is indexed (not searched) by message selectors. defProcs are the source of a space trade-off inherent in our entire OOP mechanism. The average size for a defProc is about 600 bytes.

At the beginning of the defProc is the range, which is the maximum selector value allowed for this class. (See Figure 4.) The range is the offset in the defProc to the last entry in the table of method pointers -- an error-handling method. Should an object be sent a message that is beyond its ken, that is, a message selector that indexes beyond the end of its method table, the error-handling method is invoked. This method performs much the same function as does the doesNotUnderstand method in Smalltalk.

Beyond the end of the method table in the defProc is a fixed-length structure containing a variety of information, including the logical size of an instance of this class, the ASCII name of the class and its superClass, and a pointer to the superClass's defProc. The pointer to the superClass's defProc establishes the inheritance hierarchy and is used at run time to route superMessages.

defProcs for LabView's built-in classes are created and initialized at LabView initialization time, not by the compiler at compile time. (The reason is an unfortunate limit in the amount of static data allowed by the compiler.) Classes may also be defined externally to LabView; their defProcs are read in and initialized the first time an object of that class is instantiated. A description of how external classes are defined is beyond the scope of this article.

One of the steps in the initialization of a defProc establishes the inheritance of all methods from its superClass's defProc. In this step, all the method pointers of the superClass are copied into the defProc being initialized. Once the pointers are copied, the class's initialization procedure continues by poking the addresses of its own methods into the appropriate places in the method table. Because inherited methods are copied directly into the subClass's defProc, it is not necessary to climb up the inheritance tree at run time to find the class that defines the method.

You might think that generating and maintaining these defProc initialization procedures, as well as the indexes in the table for each method, would be quite a nightmare. It could be, if you did it by hand. Our method table tools automatically regenerate the source code for these initializers whenever a change needs to be made in a class methods, or when new messages are introduced.

Message Dispatching

What path does the code follow in getting from the point of sending an object a message to the actual execution of that object's method? The path goes through two functions: One could be called the message glue; the other is the message dispatcher, an assembly language function called DefProcDispatch. Its source is shown in Listing Two.

As we saw earlier, sending a message actually calls a function with the same name as the message -- this is the message glue function. These functions are quite small (as you can see in Listing Two), and their source is automatically generated by the method table tools. All the function does is place its message selector in data register zero and jump to DefProcDispatch.

DefProcDispatch knows the state of the stack upon entry. When the message was originally sent, the parameters were pushed from right to left (as they appear in the source code). Therefore the most recently pushed items on the stack (besides the return address) are the objects ObjID and handle to the ObjHeap. With these items, DefProcDispatch generates a pointer to the object being sent the message and retrieves the pointer to the object's defProc from the object's header. The message selector in DO3 is compared to the range at the beginning of the defProc to make sure that the defProc contains an entry at this index in its method table. If all is well, the method address is retrieved from the table and DefProcDispatch jumps to it.

One of the nice features of this mechanism is that all message traffic goes through DefProcDispatch. Therefore, it is a convenient place to put all sorts of debugging hooks. We use it to check the validity of ObjHeap and ObjID arguments, and to count messages for performance evaluation.

Conclusion

We feel the benefits of object-oriented programming are substantial. Extending and modifying existing objects, as well as experimenting with new sub-classes, is quite easy. Building in our own OOP mechanisms has caused a certain amount of overhead in development time. We had to spend more time on mechanisms rather than on the code that actually gets the job done, but we feel that the benefits are worth this overhead.

The experience of both building and using these mechanisms has been enlightening. We have all come to a greater appreciation of what object-oriented programming is all about and what kind of design considerations go into the making of an OOP language. Certainly this education will be useful to us as we consider the development of future software.

Managing Class Attributes

During the development of a large object-oriented system, class hierarchies and message protocols are frequently changed. OOP languages generally have mechanisms that ease these modifications -- Smalltalk's Browser is perhaps the best example of such a mechanism. With LabView, we had to invent our own mechanisms external to the language, all of which depend on a spreadsheet that contains a matrix of objects and messages.

This spreadsheet, which we call the method database, is essentially a primitive browser. Each row defines a class, each column an attribute of the class. The various attributes are the class's name, its superClass's name, the name of the typedef that defines its objects' fields, and so on. In addition to these attributes there is a column for every message in the system. The entry at the intersection of a class and a message tells whether the class defines its own method for that message, inherits a method from its superClass, responds with an ErrorMethod, or uses a function with a special name.

A program that we wrote reads the text version of this spreadsheet file and generates from it six source code files (as well as a number of includes) that are compiled into LabView. These files include all the class initialization functions for all the built-in classes in LabView, as well as a function that calls them in the correct order to assure that no class is initialized before its superClass. It also generates the message glue functions and the indexes into the defProc method pointer tables for each message selector.

Modifications to the method database, such as introducing a new message or a new object, or changing an object's response to a message from inheriting a method to overriding with its own method, are sufficiently infrequent that the following three-step process is not too much of a hassle. 1. Modify the spreadsheet and save as text. 2. Run the source code generator. 3. Remake all the necessary files. One can easily imagine a program, however, that not only would allow browsing in an easier way, but also would generate the required source code. So much code to write, so little time.

--R.D.

[LISTING ONE]

#define DDO_ClassFields 	/* Data Display Object */ \
	int16 flags;			\
	ObjID owner;			\
	RGBColor structRGB;	\
	Rect bounds;			\
	ObjID label; 

#define FPDDO_ClassFields	/* Front Panel DDO */\
	DDO_ClassFields 		\
	int16 dataStatus;		\
	ObjID graphic;
	
#define NumericDDO_ClassFields	\
	FPDDO_ClassFields				\
	ObjID digitalDisplay;			\
	int16 repCode;		/* int16, uInt32, float32, etc.*/\
	byte	min[sizeof(float96)],		\
			max[sizeof(float96)],		\
			def[sizeof(float96)];

typedef struct {
	NumericDDO_ClassFields
	} StdNumDDORec;

typedef struct {
	NumericDDO_ClassFields
	int16 knobFlags;
	ObjID rotaryScale;
	PicHandle knobPict;
	BitMap refreshBMap;
	} KnobDDORec;

typedef struct {
	FPDDO_ClassFields
	ObjID scrollBar;
	StringHandle dfltString;
	TextRec textParams;
	} StringDDORec;

#define ObjPtr(h,o)			(*(char**)h + o)
#define KnobPtr(h, o)		((KnobDDORec *)ObjPtr(h, o))
#define StringDDOPtr(h, o)	((StringDDORec *)ObjPtr(h, o))
#define StdNumPtr(h, o) 		((StdNumDDORec *)ObjPtr(h, o))




[LISTING TWO] 

/*
	On entry:
		D0	= message selector
		4(A7)	= ObjHeap handle
		8(A7)	= ObjID
*/
DefProcDispatch()
	{
	asm
	{
	move.l	4(A7), A0       ; get ObjHeap handle 
	move.l	(A0), A0	; dereference. A0 now points to ObjHeap
	move.l	8(A7), D1	; move object id to D1 for next instruction
	move.l	-4(A0,D1.L), A0 ; A0+D1 points to object. 
	move.l	(A0), D1	; put range (ErrorMethod offset) into D1
	cmp.l	D0, D1		; compare msg selector with range 
	blt.s	@err_msg	; if (D0 >gt; D1) execute the ErrorMethod 
	move.l	4(A0,D0.L), A1	; D0 has method offset
	jmp	(A1)		; execute the method
    err_msg:						
	move.l	4(A0,D1.L), A1	; D1 has ErrorMethod offset
	jmp	(A1)            ; execute the ErrorMethod
	}	
	}

/*
	Message glue function for oCopy message.
*/
void oCopy()
	asm {
		move.l	oCopyMsg<<2, D0
		jmp	DefProcDispatch
		}
	}