STRUCTURED PROGRAMMING

Pieces of Charlie

Jeff Duntemann KI6RA/7

While hunting for houses in Cave Creek, we saw a large-eared skinny brown dog trot across the highway close ahead of us with no hint of a look over its shoulder. He knew we were there; the ears said it all. This was obviously a creature that knew man and automobiles and wasn't much fazed by either. In short, we met the much-maligned coyote who, at the parting of the genes ages ago, chose the glorious life over the long one, perhaps because he somehow foresaw that we would take wolves and turn them into poodles.

For all his persecution, the coyote is doing surprisingly well in the Valley of the Sun. In defiance of conventional wisdom, there are (if my sources are correct) more coyotes living in North Phoenix today than there were 20 years ago, even with the incredible building growth the area has seen in that time. The two interlocking reasons aren't surprising when you think about them: More places to hide, and more things to eat.

Twenty years ago there were few culverts or storm drains in Phoenix. Now, after two hundred-year floods in close succession, culverts are everywhere, and they carry water perhaps three days out of the year. The rest of the time, they are bone-dry -- perfect for waiting out the blazing Phoenix days or raising pups in perfect safety.

And things to eat, lordy ... who'd chase a stringy old roadrunner when you can gorge on leftover French fries behind the dumpster at Carl's Junior? And for those atavist coyotes who prefer chasing down dinner, there is no shortage of free-roaming house cats. (Though fewer in recent years, I've heard: Nothing teaches a cat owner a sense of responsibility like finding pieces of Charlie underneath the palo verde tree in the empty lot down the street ....)

Ecological Niches

Actually, hearing of the Phoenix coyote population explosion surprised me less than hearing last year that Jensen & Partners was bringing out yet another C compiler, or this year hearing that JPI would be bringing out a new Pascal compiler. It seemed odd (or excessively gutsy) only at first. As with the coyotes' role in keeping cat doo-doo out of my flower beds, I had overlooked an important ecological niche in the structured language world that was not being well-filled: The area of multi-language development.

The idea, in short, is to write a program in C and call routines written in Pascal ... or to write a program in Pascal and call routines written in Modula-2 ... or some other permutation of the major languages. This is a difficult business for a great many reasons, and the major vendors haven't done much to make it easier. Microsoft started out down the right road years ago, by providing specific instructions on linking code written in their various languages, but they never took it very far, and even at its best the process was something I would consider an ordeal. Borland does provide some small .DOC files on calling Turbo Pascal from Turbo C and vice versa, but it seemed lots more trouble than it was worth.

Polymorphic Compilation

The TopSpeed solution to multi-language programming is bold: When you buy a TopSpeed language, you get the TopSpeed interactive development environment (IDE). When you buy a second (or third) TopSpeed language, it installs so that it can be invoked from within the same copy of the IDE (which is not installed a second time), as a peer with any other installed TopSpeed languages. The various languages recognize certain file extensions as their "own" and the IDE will automatically invoke the correct language when a given source file is specified for compilation. In other words, if you compile CHARLIE.C, the IDE invokes TopSpeed C without explicit instruction from you; or if you compile CHARLIE.MOD, the IDE will invoke TopSpeed Modula-2.

There is already a TopSpeed Assembler. TopSpeed Pascal is in its final testing stages, and may well be shipping by the time you read this. JPI has expressed its intention of delivering a TopSpeed C++ and a TopSpeed Ada sometime in 1991, and both will fit right into the larger scheme.

Building CHARLIE.EXE from multiple pieces written in multiple languages requires a central blueprint, and this is provided by JPI's excellent automatic MAKE facility. Unlike MAKE utilities in the traditional C world, TopSpeed MAKE is built into the TopSpeed compilers. The programmer provides a project file containing instructions to the compiler: Memory model to be used, compiler options to invoke, which files are involved and must be linked together, and so on. The MAKE facility follows this plan to produce the final .EXE file, taking into account the time stamps of involved source and object files so that it doesn't recompile anything needlessly.

Why Bother?

JPI has managed to get multi-language development over the line into the realm of the possible. But the question does remain, Why bother at all? Especially today, when you can do just about anything you want in just about any commercial implementation of any structured language. The reasons are few but they can be compelling:

    1. Retaining existing investment in code libraries. Say you're in a shop that has always done its development in Modula-2, and Corporate issues a proclamation that all future development is to be done in C. Hokay ... except that that means maybe five years' worth of existing libraries have to be rewritten. If you can rig things so as to call the existing Modula-2 libraries from the new modules written in C, you can get the work done in C and convert the older libraries to C as time allows ... if you bother at all.

    2. Making use of specialty commercial libraries in different languages. Perhaps you do all your work in C, but the only third-party library you've found that provides a certain difficult function is in Pascal or Modula-2. A real-life example would involve the Solid Link Modula-2 library I described in my July column. I have numerous communications libraries for various languages in my collection, but Solid Link is the only one that implements the ZModem protocol, which is hairy in the extreme to implement yourself. By using TopSpeed C, you could have your C and ZModem too.

    3. Maintaining multiple-platform libraries. If your shop must support platforms other than DOS, like the Sun under Unix, or the Amiga or Macintosh, it can make sense to identify what functions can be implemented identically across all platforms in a single highly standard source file. User interface hassles might confine this to computational things such as fast Fourier transforms and so on, but it can still be worth doing. The only language all platforms might have in common is ANSI C, but a cross-language system would allow you to develop under DOS in the language of your choice and still use the common-platform C libraries.

On the flipside, there are things that are not reasons to work cross-language. This is the big one:

    1. Working cross-language will not gain you execution speed. (Unless, of course, one of the languages is assembler.) One of the significant happenings of the past few years is that code-generation technology has improved across the board, and there is no longer any automatic penalty for working in Modula-2 or Pascal. The recent Modula-2 compilers from Stony Brook and JPI, in fact, are so good that you might incur a performance penalty for working in some implementations of C. Certainly, within the TopSpeed language family, you can assume that the code generation technology for all languages is similar and that code performance will be about the same no matter which language you're using.

And a lesser one:

    2. Working cross-language will not give you additional functionality. If there's anything C can do that current commercial implementations of Modula-2 can't do, I've yet to see it. I'll provisionally say the same for JPI's own dialect of Pascal (which will not be a clone of Turbo Pascal) but we'll address this issue later on when I've had time to play with TopSpeed Pascal.

My own conclusion: If you need to work cross-language, JPI is currently the only way to go.

Memory Addressing and Memory Models

Working cross-language opens up a whole Pandora's box of things you have to keep straight to stay out of lockup-land. The single most important of these is the issue of memory models. Turbo Pascal people have rarely had to think about memory models, because Turbo Pascal only supports one memory model. (Much of the problem in converting old Turbo Pascal 3.0 apps to Turbo Pascal 4.0 and later lies in the fact that the conversion involves a change of memory model.) Most Modula-2 compilers have provided more than one memory model, but the majority of Modula-2 programmers choose the default memory model and simply stick with it to avoid having to understand what changing from one to another really means.

Simply put, a memory model is a set of assumptions about how memory is addressed beneath the surface of a high-level language. But before I go further down that road, let's review how memory is addressed in the 8086/8088 CPU and in real mode of the 286/386/486.

86-family real mode memory is limited to 2^20 bytes, or 1,048,576 bytes, alias 1 Mbyte. An 86-family register, however, contains only 16 bits, which can specify only 2^16 (65,536, or 64K) locations when used as an address. How, then, do you address a full megabyte with 16-bit registers? The answer is to use two registers side-by-side. The high-order 16-bit register specifies one of 65,536 starting points within the megabyte of memory. Each starting point begins 16 bytes higher in memory than the one before it. The low-order register specifies an offset from some starting point. This offset may be up to 65,535 bytes away from the starting point.

The 65,536 bytes beginning at any starting point is called a "segment," and the number of the starting point that begins any given segment is called its "segment address." Segment 0 has its starting point at the very bottom of memory, in the very first byte of the memory system. Segment 1 begins 16 bytes up-memory from Segment 0. Segment 2 begins 32 bytes up-memory from Segment 0, and so on. Segment 65,535 begins only 16 bytes down from the top of the 1,048,576 bytes addressable by the 86 family.

From this you should be able to see that by choosing the right segment and the right offset within that segment, you can uniquely specify any single byte in the whole 1,048,576 of them. Doing this choosing, however, requires two registers, which we generally call a "segment register" and an "offset register." The 86 architecture has several segment registers and several other registers that may act as offset registers. To pinpoint a location in a megabyte of memory, you have to put a segment address in a segment register, and an offset address in another register. Between the two of them, you can point anywhere in memory.

Looking Far and Near

Needless to say, the language compiler worries about all this so that you don't have to. The compiler takes care of keeping data variables in an area of memory that it knows how to find later on, and knows how to find a procedure or function in memory when that procedure or function has to be called. The assumptions that a compiler uses to locate code and data comprise the memory model the compiler is currently using. There are several such models.

First of all, consider a program that has a lot of code but not much data.

If all the data a program will need to work with can fit into a single 64K segment, the compiler arranges things so that all data is put together within one segment, and makes the assumption that data will only be found in that segment. One segment register is given the segment address of this data segment when the program begins execution, and when data must be read or written, only the offset portion of the address must be changed. With only one register to modify, operations on data can be done more quickly than if both a segment and an offset address must be specified every time data is accessed.

When all of a program's data is placed together in a single 64K segment, we call it "near data."

Now, that same program has lots of code, more than will fit in a single segment. So the compiler sets up as many segments as it takes to contain all the program's code, and when one routine calls another, it must specify both the segment address and the offset address of the routine to be called. Code addressed this way is called "far code."

It can work the other way around for both code and data. If a program has only a little code, all that code can be placed together in a single segment with a single fixed segment address. All calls from one routine to another may then be made using a single 16-bit offset address. This scheme (called "near" code) uses less memory and is faster than when a full 32-bit address must be used. Similarly, programs with loads of data (say, several very large arrays) can arrange to place their data in multiple segments and address the data with full 32-bit addresses. This is somewhat slower and bulkier than near data, but it does allow you to use a great deal more data in a program.

All the Myriad Models

The memory model used by a compiler is predicated on what combination of code and data assumptions will be made. The memory model in which there is both near code and near data (and hence only two 64K segments) is called the "small model." The memory model in which there is both far code and far data is called the "large model." In between are two intermediate stages called the "compact model" (near code, far data) and the "medium model" (far code, near data). See Table 1 for a summary of the various models.

Table 1: The standard Intel 86-family memory models

          Tiny   Small   Compact    Medium   Large  Huge
  ------------------------------------------------------
  Code     Near  Near    Near       Far      Far    Far
  Data     Near  Near    Far        Near     Far    Far
  Max.     64K   128K    1MB        1MB      1MB    1MB
    prog.
    size

NOTE: The huge and large models differ primarily in how data is addressed; in the huge model, data items may span multiple segments.

There are two slightly peculiar mutant models, one on each end of the scale. If both code and data are small enough so that both can fit into the same 64K segment without tromping on one another, we call that the "tiny model." The tiny model's sole virtue is that it is the only memory model that can be massaged into a .COM file by the EXE2BIN DOS utility.

There is one more model, the "huge model," that's slightly tougher to explain. In all other models, there is an assumption that no single data item may be larger than a single segment. In other words, even though the large model may have as many data segments as it likes, no data item may span more than one segment. The huge memory model allows a single data item to span more than one segment.

This sounds simple, but when you start to mull what it means the whole concept starts to collapse. You can only make sense of it by understanding how data is addressed by the CPU, and the best example is a large array.

An ordinary array in any data model but the huge model begins at some offset from the segment address. To read an item in the array, the CPU places the offset address of the array in an offset register, and calculates yet another offset based on the desired array index. For example, if the array is an array of records where each record is 32 bytes long, and you want to access the thirteenth element in the array, the CPU multiplies 32 by 12 (you count elements from 0) to calculate this second offset. The second offset is then added to the array offset to find the specific desired array element.

The problem should begin to come clear: The sum of the two offsets must still fit into a single 16-bit register to act as an offset from the array's segment address. A 16-bit register can only count to 65,535 -- hence the array is limited to 64K in size.

In the huge memory model, the CPU must perform some considerably more sophisticated calculations to access any element of a "huge" array. Each element of the array has its own segment and offset address, and both must be calculated each time an element of the array is specified. Needless to say, this takes lots more time than when an array must fit into 64K.

Marrying Models

You have to keep all this stuff in mind when you begin to butt one piece of code from one language up against another piece of code from another language. If you call a piece of near code from a piece of far code, you'll probably crash the system. The near code pushes only one address (the offset address) onto the stack when the call is made, but the far code pops two addresses from the stack when it returns. It'll take the one address the calling code pushed, and grab the next two bytes on the stack as well, no matter what those two bytes actually are ... and then launch off to the 32-bit address represented by the genuine offset address and the bogus segment address. Where it stops, well, nobody knows.

This might sound a touch familiar to Turbo Pascal people. In Turbo Pascal, calls made within a unit are near calls. Calls made to a unit from outside the unit are far calls. The compiler handles this transparently for you unless you're going to define things like INLINE macros or assembly language externals. Then you'd better make sure that all code is forced to be far code by bracketing all the procedure headers involved with the {$F+} and {$F-} compiler directives.

Turbo Pascal, by the way, uses the medium memory model: Near data in one 64K data segment, and far code residing in as many code segments as you need, with each unit getting its own code segment. This can get in the way if you try to declare several very large arrays. The way around Turbo Pascal's near data limitations is to use the heap and create a linked list rather than try to declare an enormous array in one piece. (The heap is wholly an artifact of the high-level language you're using and does not really involve the memory model.)

If you're working cross-language within the TopSpeed environment, you avoid trouble by making sure that all the languages involved in creating old CHARLIE.EXE are working within the same memory model. The TopSpeed languages support all models except the tiny model and the huge model. The default small model is good enough for most small projects -- and certainly for getting the hang of things.

Calling All Conventions

The really ugly barrier to cross-language development, however, lies in something called "calling conventions." Like memory models, calling conventions are sets of assumptions the compiler makes when setting up a program.

When one routine calls another routine, several things must happen: The return address must get pushed onto the stack; any parameters to be passed as part of the call must be pushed onto the stack; control must be transferred to the called routine; and finally, something must return the stack to its previous state when the called routine returns control to the caller.

These things can be done in different orders in different ways. There are two traditionally recognized calling conventions in the 86-family world:

In the Pascal calling convention, parameters are passed from left to right. In other words, given the following call:

Grimbler(Foo,Bar,Bas,Beep);

the parameter Foo will be pushed on the stack first, then Bar, then Bas, then Beep. Just before the called procedure returns control to the caller, it performs some work on the registers that causes the parameters to disappear from the stack. So by the time procedure Grimbler returns control to whatever called it, Foo, Bar, Bas, and Beep are simply gone, and the stack is in the same state it was before the call to Grimbler began.

In the C calling convention, things are pretty much the other way around. Parameters are pushed on the stack from right to left. Consider this C function:

fumbler(foo,bar,bas,beep);

Following the C calling conventions, the beep parameter goes onto the stack first, followed by bas, and then bar, and finally foo. The parameters are pushed this way so that the number of parameters passed to a C function may vary from call to call.

The sincerest hope is that when a variable number of parameters is being passed, the last parameter pushed onto the stack -- and hence the only one the called procedure is certain to be able to identify using stack pointer SP -- is the number of parameters passed on that particular call. The called procedure can then use this count to identify and access the remaining parameters, which lie further up the stack.

Weird? I used to think so, but it's growing on me. The problem is that the cleanup of the stack is not something that can be parameterized. The code must know at compile time how much stack space is used on each call to be able to restore the stack to the state that existed before the call was made. If the same C function can be called with three parameters at one point in the program and with seven parameters at another point in the program, there's no way the function itself can clean up the stack. Only the code that calls the function knows at compile time how much stack space is needed for the call. Therefore, in the C calling convention, the code that calls a function takes the stack back from the called function with all the parameters still there. The caller then removes the parameters from the stack and restores the stack to its pre-call state.

These two conventions are utterly incompatible. You cannot call C code compiled using the C calling conventions from a Pascal routine compiled with the Pascal calling conventions. The tug-o-war over stack cleanup alone will send your DOS session into the bushes, regardless of parameter order.

In most systems that have allowed C and Pascal to call one another, the C code is directed (via a compiler toggle of some sort) to generate a call using the Pascal calling conventions when it calls Pascal code. Similarly, when Pascal calls a C function, that C function must have been compiled using the Pascal calling conventions. I have never yet seen a Pascal compiler that can generate calls using the C calling conventions, but I know of no reason why it couldn't be done.

JPI makes the two languages meet in the middle by creating its own calling convention, in which parameters are passed from left to right, as in Pascal, but in which the caller cleans up the stack, as in C. Furthermore, when CPU registers are available to carry parameters between caller and callee, those registers are used, making for much faster procedure calls.

The central point to be made about calling conventions is that both ends of the call must agree on the convention used. Get confused and you go bye-bye. If you're going to work cross-language, you must understand calling conventions completely. This begins by reading whatever the compiler vendor or vendors provide in the way of calling convention documentation, but the smart hacker goes in with a good debugger and watches exactly what happens -- at an assembly language level -- when a call is made.

Products Mentioned

TopSpeed Modula-2, V2.0
Jensen & Partners International
1101 San Antonio Road, Ste. 301
Mountain View, CA 94043
415-967-3200
Price: $199

From the Land of Lost Books

Many thanks to the people who wrote and called to say they had seen my books on the stands here and there. The bad news is that Scott, Foresman & Company was sold earlier this year to Harper & Row, which last month shut down the Scott, Foresman trade books division. When supplies are gone, that's that -- my books are in limbo, I can't revert rights, and I'm a man without a publisher. So it goes with corporate megamergers.

But the programming business continues to improve. Actor 3.0 has appeared, coincident with Microsoft Windows 3.0. Modula-2 from JPI now has object extensions almost identical to those of Turbo Pascal 5.5. Stony Brook's upgraded Modula-2 and new Pascal products push the frontier of code optimization even further into the stratosphere. More on that in a future column, along with real code for some Modula-2 objects, promise . . . .

. . . drat, there's that cat again! Quick, where's my coyote call?