PROGRAMMING PARADIGMS

A Language Without a Name: Part II

MICHAEL SWAINE

Bob Jervis is the author of Wizard C, which later became Borland's Turbo C. Recently, Bob has been working on a new programming language, a seemingly Quixotic gesture in a time when, by his own admission, the prospects for a new, independently developed language are dim. In last month's installment of this two-part interview, he talked about C++, OOP, Ada and the DoD, and the approach he is taking with his new language.

This month, the interview moves on to the future of software development, a future in which multiprocessor machines will sit on the average desktop and parallel algorithms will be the norm. Ultimately, Bob brings it all around to an answer to the question that inspired the interview in the first place: Why is Bob Jervis writing a new programming language?

DDJ: You've been doing language development for some time. Any thoughts on the quality of development tools, or the direction you see software development tools headed in the future?

BJ: As far as development tools go, I haven't seen anything that contradicts the basic model that Turbo C and Turbo Pascal established. More and more integration. More and more, you're sitting at your screen with this ever-larger array of support tools there at your fingertips. The 640K barrier of DOS has been kind of a throttle on how many tools you get for the simple reason that having all those tools costs memory and costs CPU cycles. 640K [makes it] sort of hard to be squeezing 100K compilers and 100K debuggers and 100K code analyzers in and out and get any performance out of the machine. But I think that once you get into the 32-bit world, things are a little more friendly and you'll see some really powerful development environments.

I think the big difference between the way that Turbo Pascal and Turbo C have done things and the way the 32-bit environments are going to work is more along the lines of Microsoft's Workbench, where rather than having everything compiled into one monolithic .exe, you're going to have a much more loosely connected collection of tools. You're going to be able to plug in your own editor and you're going to be able to plug in a third-party debugger if they have a better one.

I hate terms like software bus, because that sort of term is very misleading about what is really happening, but I think you're going to get some simple publicly documented interfaces that say: The editor supports these features, and here's how you get a list of error messages into the editor, and here's how the debugger works, and here's how you find out where you are in the source and how you set a breakpoint, and here's how the compiler works, and here's how you feed it the source, and here's how you get information out, and so on. It's clearly in the object-oriented paradigm that any particular company's compiler or editor or debugger will inherit that basic class and put in their specific implementation. It will look to the user like one set of windows or related buttons under the editor, but it will be a much more robust set of tools.
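As an illustration of the kind of publicly documented tool interfaces described here, a minimal sketch in Python follows. Nothing in it comes from a real product or from the interview: the class names (Editor, Compiler, Debugger, ListEditor, ToyCompiler) and the build function are hypothetical, chosen only to show a vendor inheriting a basic class and supplying its own implementation.

```python
from abc import ABC, abstractmethod


class Editor(ABC):
    """Hypothetical published interface that any vendor's editor inherits."""

    @abstractmethod
    def show_errors(self, errors):
        """Accept a list of (line_number, message) pairs from a compiler."""


class Compiler(ABC):
    """Hypothetical published interface that any vendor's compiler inherits."""

    @abstractmethod
    def compile(self, source):
        """Return a list of (line_number, message) errors; empty means success."""


class Debugger(ABC):
    """Hypothetical published interface that any vendor's debugger inherits."""

    @abstractmethod
    def set_breakpoint(self, line_number):
        """Stop execution when the given source line is reached."""

    @abstractmethod
    def current_line(self):
        """Report where execution currently is in the source."""


class ListEditor(Editor):
    """One vendor's editor: it simply remembers the errors it was handed."""

    def __init__(self):
        self.errors = []

    def show_errors(self, errors):
        self.errors = list(errors)


class ToyCompiler(Compiler):
    """Another vendor's compiler: flags any line containing the word 'error'."""

    def compile(self, source):
        return [
            (number, "unexpected 'error'")
            for number, text in enumerate(source.splitlines(), 1)
            if "error" in text
        ]


def build(compiler, editor, source):
    """Glue code that knows only the published interfaces, not the vendors."""
    errors = compiler.compile(source)
    editor.show_errors(errors)
    return not errors
```

Because build depends only on the base classes, either tool could be swapped for a third party's version without changing the glue code.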

I think you're going to see more multivendor solutions. In the good old mainframe days, if you went to IBM you got your compiler and you got your libraries and you got your operating system, and only the very marginal niche languages that were too small for your hardware vendor to care about did you get from third parties. As time has gone on, people have become much more comfortable with building solutions out of pieces from several different sources.

I recently got a copy of the DPMI stuff from Microsoft, and I was very impressed that they've actually documented an interface that's important and that lets third-party vendors interact with Windows in a fundamental way. That's really exciting. Microsoft has, to this day, undocumented features in DOS that they refuse to talk about.

DDJ: Last fall I heard Bill Gates questioned about the undocumented features of DOS. He maintained that there were none.

BJ: That's very interesting. Well, if it's undocumented, it's not a feature, right? Too bad it's things that their own utilities use. It blows my mind that the only way you can make a debugger work with a standard .exe and actually work through the operating system is if you use undocumented features. Which is one of the things that I find impressive about DPMI: It really looks like you could write a DOS extender that supports DPMI, and that therefore runs Windows under a non-DOS environment, or not a pure DOS environment. I think that's very encouraging. That's going to go a very long way toward helping a lot of different developers -- and users -- out. I see it as a real positive step.

Of course it'll probably turn out that there's some hidden feature they haven't told you about that'll make Windows run two times faster on their DPMI implementation.

DDJ: Getting back to your project, you are planning for The Language to be a product, right? What are your plans?

BJ: It's been a very interesting project. This spring I sat back after having done all this work and having made some presentations last fall and said: I don't have AT&T or IBM or Microsoft or Borland pushing my language. How am I going to find a role for it? And I said: Well, C didn't become popular for ten years after its original invention. So if we assume ten years out, maybe what I should do is make sure The Language has features that people are going to want ten years from now.

DDJ: I suppose that depends most on what kind of hardware is on the desktop in 2001.

BJ: That's what I looked at: What is the hardware going to look like ten years out? And the answer that came back is that everything is going to be multiprocessor. There may conceivably still be a $10 8088 that you can still buy in the year 2000, but realistically people are going to have two, three, four, five processors sitting on their desks.

So what I concluded is that the way to make this language have a real future is to stop now before I build -- or try to build -- a large customer base and lock myself in, and go back and look at how you program a multiprocessor.

So I started looking at the research materials and realized that there's a lot of room. The state of the research on multiprocessing languages and how to program them is very limited. I just came back from Cray, which has obviously been working on multiprocessors for a number of years. It blew my mind when they talked about performance on the order of four gigaflops. Four gigaflops! You gotta be kidding me. But they said: That's just benchmarks; realistically you can't expect more than about one gigaflop. So they're in a different league. But ten years from now that league is going to be on your desk.

DDJ: I've dug into the research on multiprocessor architectures and programming for this column in the past. The impression that I'm left with is that there are several different models, depending on the number of independent processors and whether each processor has its own memory and so on, and that each of these models is as complex as the familiar sequential model. The parallel universe is bigger than the sequential universe.

BJ: One of the things that Cray has stated [is] that they are going to be going to massively parallel machines. Their [current] project is a 1000-processor multiprocessor.

When you look at those kinds of architectures, you need a different kind of language to talk about programming those kinds of beasts. Cray's experience is that it's next to impossible to take any old random piece of C code or Fortran and to decipher what the hell you're supposed to do with your other half-dozen processors that are sitting around idle. The work they've been doing is how to get these multiprocessors to share a common memory where there aren't too many processors. They're finding A, that it's very hard, and B, that you have to extend the language to support extra features so that the programmer can help the compiler find the parallelism in the program.

[But now] even Cray is admitting that maybe this [model of] one monolithic memory and a bunch of processors snaking through it is not going to scale up to 1000-processor machines very easily. They haven't completely figured out what they're going to do, but a lot of other people like N Cubed and Hypercube are much more aggressively saying: What we're going to have is x many thousand processors and each one's going to have its own dedicated memory and we're going to be sending messages back and forth over some high-speed bus, or some multitude of buses and communications lines, to share information.

DDJ: I assume you've looked at the existing languages. There are a number of languages that have been developed for doing parallel programming. Occam. Parlog.

BJ: There's a lot of research effort, but commercially there hasn't been a lot that really works well with that kind of hardware. There are basically two strategic evolutionary directions that people are taking in trying to program these multiprocessors.

One is what you might call the Prolog camp that says: We're just going to run Prolog or some language related to Prolog, and what you're going to do is distribute your productions and your solution resolution across this multitude of machines, and they're going to execute the productions and explore the alternatives in parallel. The problem with that as I see it -- and it may turn out that when you have enough horsepower and true parallelism it will work out -- is that those kinds of languages haven't proved to be too successful in the marketplace. People have a hard time programming in them.

DDJ: And the other direction?

BJ: It dovetails very nicely with what I was doing with The Language. It's the area of research called Concurrent OOP. What that does is say: To get your parallelism, you take all the objects in your program, and all of a sudden all of those objects become parallel processes. So each object, instead of being a passive patch of memory with some functions somehow associated with it, is an active processing beast. And when you use the Smalltalk terminology of sending messages, you really are sending messages.
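To make the idea of an object as "an active processing beast" concrete, here is a minimal sketch in Python. It is not The Language and not any of the research systems mentioned; ActiveCounter, its send method, and the selectors "add", "total", and "stop" are all hypothetical names. The object owns a thread and a mailbox, and the only way to reach its state is to send it a message.

```python
import queue
import threading


class ActiveCounter:
    """An object that is also a process: it owns a thread and a mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._total = 0  # touched only by the object's own thread
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, selector, *args):
        """Post a message; return a one-slot queue that will hold the reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((selector, args, reply))
        return reply

    def _run(self):
        # The object's own loop: take messages one at a time, in arrival order.
        while True:
            selector, args, reply = self._mailbox.get()
            if selector == "add":
                self._total += args[0]
                reply.put(None)
            elif selector == "total":
                reply.put(self._total)
            elif selector == "stop":
                reply.put(None)
                return
            else:
                reply.put(None)  # unknown selectors are ignored in this sketch


def demo():
    counter = ActiveCounter()
    for n in (1, 2, 3):
        counter.send("add", n)           # asynchronous: the sender does not wait
    total = counter.send("total").get()  # block only when the answer is needed
    counter.send("stop").get()
    return total
```

The sender keeps running after each "add" and waits only when it asks for the total, so any parallelism comes from the objects themselves rather than from the calling code.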

I've run across at least half a dozen different projects that have worked out languages to express parallelism using this stuff, and it looks very interesting, so I'm working on putting those kinds of features into The Language.

There's going to be a large [range of numbers] of processors available in the machine of the year 2000. Low-end machines like the [Intel] Micro 2000 will have a mere half dozen processors, whereas your high-end Cray machines are going to have 1000 to 10,000 processors, and we're going to need programming languages that work on that range of hardware. So what I want to do is [have] the compiler know that [when it's running on] a fairly low-end machine with not many processors, you use less parallelism and compile things more into regular procedure calls between objects, and bind more objects into a single executable. And then when you've got the processors, you split everything up and ship everything across to all the processors.
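The trade-off described here can be sketched in a few lines of Python, with the caveat that this version makes the choice at run time, whereas the interview describes a compiler making it. The names call_all and Cell are hypothetical, and a thread pool stands in for real processors (in standard CPython, threads do not speed up CPU-bound work, so this shows only the shape of the decision).

```python
from concurrent.futures import ThreadPoolExecutor


def call_all(objects, method, processors):
    """Invoke `method` on every object, choosing a strategy by processor count."""
    if processors <= 1:
        # Low end: ordinary procedure calls, one after another.
        return [getattr(obj, method)() for obj in objects]
    # High end: split the calls up and ship them to a pool of workers.
    with ThreadPoolExecutor(max_workers=processors) as pool:
        futures = [pool.submit(getattr(obj, method)) for obj in objects]
        return [future.result() for future in futures]


class Cell:
    """A stand-in object with one method worth calling."""

    def __init__(self, value):
        self.value = value

    def square(self):
        return self.value * self.value
```

The caller writes the same line either way, for example call_all(cells, "square", processors), and gets the results in the same order; only the binding between objects changes.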

I think that we're probably still five years away from a whole lot of people having machines that can really take advantage of that kind of language, because I wouldn't expect a real low-end multiprocessor from Intel, for example, until the 686 or 786. Before they get the microchips out that have a dozen processors, you're going to have chips with two or three processors on them. So we're still a few years away before these things become advantages.

DDJ: So you are writing a language for a market that won't begin to emerge for another five years?

BJ: Well, there is one thing that is happening today that fundamentally is dealing with a very similar sort of problem, and that's distributed [computing]. I went to a presentation recently by one of the guys at Next. They have this thing where you go home at night and it comes in and uses your desktop processor as extra computing horsepower. So I think that there's an immediate use for a language like this. If you can write a single piece of source or write an application that thinks that it's calling the library, and have the compiler and the operating system supporting it turn that into a client-server application operating over a network, you've got an immediate application for these features. And then when the multiprocessors come along, you've got even more applications for them.
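A rough sketch, in Python, of an application that "thinks that it's calling the library" while the call is actually turned into a client-server exchange. In the interview this is the job of the compiler and operating system; here a hand-written proxy does it. MathLibrary, Server, RemoteProxy, and application are hypothetical names, and the transport is an in-process function call standing in for a network.

```python
import json


class MathLibrary:
    """The library the application believes it is calling."""

    def add(self, a, b):
        return a + b


class Server:
    """Decodes a request message, calls the real library, encodes the reply."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        result = getattr(self._target, request["method"])(*request["args"])
        return json.dumps({"result": result}).encode()


class RemoteProxy:
    """Looks like the library, but turns every call into a message."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking bytes, returning bytes

    def __getattr__(self, name):
        # Called only for attributes the proxy lacks, i.e. library methods.
        def call(*args):
            request = json.dumps({"method": name, "args": args}).encode()
            return json.loads(self._transport(request))["result"]
        return call


def application(lib):
    """Application code: identical whether `lib` is local or remote."""
    return lib.add(2, 3)
```

Calling application(MathLibrary()) and application(RemoteProxy(Server(MathLibrary()).handle)) gives the same answer; replacing the transport with a real network connection would not change the application code.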

So that's the technical context in which, if you will, I'm doing version two of The Language. Version one is still pretty much a single-processor, conventional C-like language with OOP extensions. The next generation is going to be much more ambitious in terms of technical scope.

DDJ: Ambitious seems to be an appropriate description. Aren't there some fundamental problems that have to be solved before a language for these kinds of architectures can be commercially viable?

BJ: There are a lot of things that have to be solved. Clearly, if you've got a distributed network where you've got two processors, then a client-server exchanging messages is a sensible way of doing things. To get both processors involved, you have to ship information in the form of a message across the network. But if those two pieces of the application happen to be residing on one machine, then it's less clear what the best way of doing it is. Because now all of a sudden you're paying all this internal overhead to do a generalized message send when all you've got is one little processor ticking away.

So it's not at all clear how you balance the efficiency for conventional single-processor architecture with the flexibility of distributed computing. Most of the research I've seen has not really addressed that very well. For example, message-based operating systems, while they have great flexibility for working across networks, tend not to perform very well compared to more conventional operating systems. So if your programming language makes heavy use of messages, then you're going to be in the same boat.

So I'm trying to explore ways to at least make the more common operations of basic disk I/O flexible enough that if you have a network there, you'll be able to take advantage of it, but if you don't have a network it'll still be relatively efficient. For today, that's probably the single biggest challenge that I face. It's not proving to be real easy. But then again, that's why there aren't ten other people out there selling products like this.

DDJ: That may change, after this interview appears. I have to say, you're exploring some issues that I find fascinating, as well as challenging.

BJ: It's an exciting end of the business. The more I talk to people the more I realize that this is really cutting-edge technology. There's a very good chance that in another five to ten years this kind of work is going to be done all over the world, just because the demand will be there.

That, in a nutshell, is where I think software development is going, too. OOP came along at the right time to help us write windows kinds of applications, and I think these sorts of languages are going to take OOP to the next step. It's the most promising avenue I've run across in the literature for the kind of language in which you still write algorithms most of the time, so you don't have to constantly be thinking, as in Prolog: How do I get it to do A before B? If you want it to do A before B you just write A ; B and it does it. And if you want it to do it in parallel, you have to think about it. You pay for the complexity you need.
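The closing point, that sequencing is the default and parallelism is something you ask for, can be sketched in Python as follows. The helper names sequential and parallel are hypothetical, and threads stand in for whatever construct The Language would actually provide.

```python
import threading


def sequential(a, b):
    """The default: A ; B. A runs to completion, then B."""
    a()
    b()


def parallel(a, b):
    """The opt-in: A and B may run at the same time; order is not guaranteed."""
    threads = [threading.Thread(target=step) for step in (a, b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

With sequential the programmer can rely on A having finished before B starts; with parallel that guarantee is gone, and the extra thought required is the price paid for the extra capability.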