Dear DDJ,
"Roll your own Mini-Interpreters," by Michael Abrash and Dan Illowsky in the September DDJ was fun to read. What they have done, essentially, is show how to make a Forth-like program without even Forth's overhead. Forth is legendary for producing extremely compact code within the Forth environment. The authors went them one better by eliminating even that.
What the article doesn't mention is that there is nothing to prevent high-level language compilers from emitting this type of threaded code. All that is required is to forgo separate compilation and process the entire application, including all subprograms and perhaps even the run-time library, via a globally optimizing compiler (GOC).
A GOC knows the list of all subprograms and the entire calling tree. It can automatically build one or more jump vector tables, and compose the interpreted data streams. Because of the low overhead of procedure calls, even tiny sequences such as A=B+C could be generated as threaded calls rather than as direct machine code. Thus, subprogram granularity much finer than the application code is possible, and the compiler would be able to trade off code-size versus execution-speed over an extraordinary range.
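The scheme described above can be sketched in a few lines of C. This is purely a hypothetical illustration of what a GOC might emit for A=B+C: the names, token values, and table layout are all invented. The "compiled" program is a stream of one-byte tokens indexing a jump vector table of primitive routines, and the inner interpreter is a single loop.

```c
/* Hypothetical sketch of threaded code for A = B + C. The compiler
   (imagined here) builds a jump vector table of primitives; the
   program itself is just a stream of one-byte indices into it. */

typedef void (*prim)(void);

static int stack[16];            /* small evaluation stack */
static int sp = 0;
static int a, b = 2, c = 5;      /* the variables in A = B + C */

static void push_b(void)  { stack[sp++] = b; }
static void push_c(void)  { stack[sp++] = c; }
static void add(void)     { sp--; stack[sp - 1] += stack[sp]; }
static void store_a(void) { a = stack[--sp]; }

/* jump vector table, as composed by the hypothetical GOC */
static prim vectors[] = { push_b, push_c, add, store_a };

/* the interpreted data stream for A = B + C: four tokens */
static const unsigned char thread[] = { 0, 1, 2, 3 };

/* the entire "inner interpreter" is one tiny loop */
int run_thread(void) {
    int i;
    for (i = 0; i < (int)(sizeof thread); i++)
        vectors[thread[i]]();
    return a;
}
```

Each statement costs one byte of "code" plus the shared primitives, which is where the extraordinary size/speed trade-off comes from: the same expression as direct machine code would be several times larger but would skip the dispatch loop.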
Recent press reports show that even the giants like Lotus and Ashton-Tate are killing themselves trying to make their code fit the 640K limit of DOS. Assuming that others have similar problems, there could be quite a market for GOCs. Attention compiler vendors! Here's a chance to make more money.
Of course, global optimization has its cost; namely glacial compilation speed. The right way to use it would be to develop the application on a 386 in protected mode, using fast compilers and whatever memory is needed. It should remain in this environment even through beta testing, until the code can be frozen. Then, assuming a bulletproof and bug-free GOC, it can be compiled once more with global optimization. The compiler must be instructed about just how compact the program must be. For example, "take this program, which needs 843K for code using MSC, and compile it to fit in no more than 403K while minimizing the impact on speed." Who cares if it takes a week or so to finish? The GOC could even be run as a service bureau, compiling on a Cray, and charging as much as one dollar per line for a compile. If it works, it would be well worth it.
Dick Mills
Burlington, Vermont
Dear DDJ,
The article in the September 1989 DDJ entitled "Roll Your Own Minilanguages with Mini-Interpreters" by Abrash and Illowsky offended me. No, not by the content; by the title. No, "roll your own" is not just an expression.
I hold cigarette smokers (and smokers of anything else) in very low regard. Next time, try some synonyms like "create," "devise," or "construct" that have far fewer connotations.
Bob Bryla
Barrington, Illinois
Dear DDJ,
A few remarks on a couple of subjects. One: Since the word polymorphism is being bandied about a lot these days, the following, "loosely typed" definition may prove interesting.
Polymorphism is a polysyllabic noun used to "encapsulate" the idea that: 1) to treat "loosely coupled" items, you use "loosely typed" variables; and 2) "loosely typed" variables can't be bound until run-time, an activity often referred to by way of another buzzword, "late-binding."
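In the same loosely typed spirit, here is about as close as C gets to a demonstration of "late binding." This is a minimal sketch with invented names: the function pointer is the "loosely typed" variable, and which routine it names isn't bound until run time.

```c
#include <string.h>

/* Loose C illustration of late binding: which routine "speak"
   names is not known at compile time. The animals and strings
   are invented for the demo. */

typedef const char *(*speak_fn)(void);

static const char *dog(void) { return "woof"; }
static const char *cat(void) { return "meow"; }

/* compiled once, yet behaves "polymorphically" */
const char *call_it(speak_fn speak) {
    return speak();          /* bound here, at run time */
}
```

Pass `dog` and you get "woof"; pass `cat` and the very same compiled call site yields "meow" -- late binding, minus the buzzwords.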
Two: Several hips and half as many hoorays for the Abrash and Illowsky article in the September issue. The most exciting thing about assembly language is not the control it gives one over the machine, nor the reduction in memory requirements and execution time. As Abrash and Illowsky point out, the most exciting thing about assembly language is that it gives you as much control over the design of your program logic and data layout as you're ever likely to get.
To which I would like to add: If you want even more control, you need your own assembler, perhaps your own editor, and even your own operating system -- only for your own private use, of course (that way there's no effective, market-oriented argument against actually tackling the job). As a fantastic learning experience, working on any of them would be hard to beat. Let's hope DDJ has more .ASM goodies in the queue.
Mark Rhyner
Chicago, Illinois
Dear DDJ,
"LZW Data Compression," by Mark Nelson (DDJ, October 1989) is a nice exposition on the LZW algorithm. But before your readers decide to use this method in any application (except perhaps for purely personal use), they should know that the algorithm is patented.
Terry Welch is listed as the inventor of U.S. Patent 4,558,302, "High Speed Data Compression and Decompression Apparatus and Method," December 1985, assigned to Sperry Corporation (now Unisys). The Unix compress utility and several commercial and shareware programs are apparently infringing on this patent (unless they have licensed it from Unisys).
If you wish to use this method in a commercial setting, you should contact Unisys for a license, or at least consult your legal counsel first.
Ray Gardner
Englewood, Colorado
Mark responds: When I wrote the LZW article I was unaware of any patent on the algorithm. The issue has just surfaced in the press because of concern in CCITT Group 7 over approval of the BTLZ algorithm for data compression in the V.42bis modem standard. Unisys, British Telecom, and IBM apparently all have some claim on the algorithm. Robert Bramson, a patent attorney for Unisys, has been quoted as saying they will license the algorithm for a one-time fee of $20,000.
I have not seen the Unisys patent, so I don't know what their specific claims are. However, I am not aware of any attempt by Unisys to show infringement by software developers. The BTLZ algorithm seems to be concerned with hardware implementations. In the event that they do pursue their claim with software developers, they will be very busy, as there are literally hundreds of potentially infringing programs in the commercial marketplace alone. And they certainly cannot claim a comprehensive patent on basic LZ compression, since Terry Welch, the patent holder, did not invent it.
I agree with Mr. Gardner that anyone who intends to use LZW compression in a commercial product would be wise to consult legal counsel first.
Finally, I would like to suggest that DDJ readers begin a letter-writing campaign directed to members of Congress, the ACM, and the IEEE. The current confusion over copyright and patent issues in the software development world only serves to stifle both creativity and productivity. At present the only way questions regarding the validity of copyrights and patents are being answered is through random decisions from legal proceedings. Copyright and patent laws both need to be updated to work properly in the 1990s.
Dear DDJ,
Jeff Duntemann is the second person I have encountered in print this month who characterizes the evolution of S/370 mainframes as moving toward the role of a gigantic file server. The other guy is the CEO of my present work situation.
A recent discussion (warm, not heated) with a fellow systems grunt who specializes in networks supports Jeff's observation that the mainframe "empire" is resisting the enhancement and distribution of processing that micros bring to the techno-brews we build and support. His positions are: 1) most databases cannot be distributed; 2) the lack of standardized protocols and network architectures eliminates most advantages of micro local processing; and 3) ancient business practices, banking for example, do not mesh well with modern data processing technologies. Well, some of that is true, yet anyone who has jumped into micro coding from a mainframe environment knows the euphoria of megalomaniacal control, has been amazed by the low cost of quality software tools, and has embraced the heady vision of a computer-literate society where programming arts will be as second nature as reading and writing.
IBM and compatible vendors continue to eliminate the need for systems support through automated operations, packaged operating system installation/maintenance, 4GL database administration, system managed storage, and of course function suction into microcode, as Jeff has observed. Systems programming at the operating system level has been reduced to configuration management to a great degree. S/370 mainframes are becoming turnkey and self-configuring, a welcome release from drudgery and business risk. As a result, we systems programmers who were first attracted to the science, art, and technology of S/370 become increasingly bored with the whole mainframe world. Some organizations have tried to cut this boredom by inventing projects based on ALC for their programmers to play with, usually having artificial and redundant purposes. What a waste of talent.
My AT-class system is booted, the coffee is fresh, Turbo C faithfully awaits my attention as the word processor gets stroked. It is Sunday morning, the sky is gray with rain, my desk is strewn with reference books and product catalogs. An application that has never existed before occupies my background wetware as these lines are written; plans for system expansion pop up like menus. I am a happy programmer doing happy things: I am learning, creating, imagining.
Come Monday morning I will return to my small, crowded office and turn my concerns to DASD management. There will be performance statistics to analyze and a time sheet to keep current. A product analysis report will be written, a meeting attended. Decisions will be made slowly and safely, actions will be delayed to an ever-shrinking outage window. I will think occasionally about the 20 MIPS processor, the four-volume database, the worldwide network. Problems will occur and solutions will be defined. I will cover my behind and not rock the boat too violently because this is how you survive in the business world.
Meanwhile there is a whole population of micro programmers out there who are thinking at a level not known in the mainframe world in recent years, whose mainframe concerns amount to a hill of beans. They want access and nothing more. At their fingertips are megabytes of main storage, gigabytes of disk storage, CASE tools, multitasking, hundreds of colors, thousands of bauds. Me, I will hit ENTER at my graphics tube and wait for GDDM to get a few cycles. I will print a document and receive it an hour later. I will scurry to the machine room to check a main console; scurry to a meeting and struggle to stay interested, write a memo, update a time sheet. I will return home to boot my personal system and again become a happy programmer.
Business in general wants the turnkey mainframe. There is nothing wrong with this other than it puts me and many other systems folks out on the street with unmarketable knowledge. Notice I did not write "skills" -- we all have tremendously marketable skills. S/370 knowledge is a good base for OS/2; assembler knowledge a good base for micro assembler; SAS and other high-level language knowledge a good base for Pascal, C, and others. Our skills are inherent: Love of science, art, and technology; the ability to make a machine do what we want; enough business savvy to survive these many years in data processing. So bring on the turnkey mainframe; bring on the local area network; bring on distributed processing; and give me back my machine! Let me build the better system, create the never before seen application, make this puppy run like it never ran before. Get me out of the erector-set mentality of canned software, black box hardware, and Big Blue strategy.
For my colleagues who find all this ranting quite unbecoming of a professional, I would advise opening up those purse strings and getting a home micro system. Get some development software, build a reference/tutorial library. Cook up some application interesting to you and develop it. It does not matter if the program ever sells. It matters that it is your program, does what you want, and that you learn the amazing cost, function, and performance characteristics of the micro computer. Then and only then will we ever have meaningful discussions on the viability of distributed, local, and personal computing.
Two really smart people in one month comparing mainframes to file servers. Doesn't that tell you something?
Ray A. Kampa
Chantilly, Virginia
Dear DDJ,
I would like to describe a problem I recently encountered with the Unix system Bourne shell. It seems that there is a subtle interaction between path searching, file permission flags, and file hashing. A friend asked me to do a regression test for him after he made some last-minute changes to his program. Using FTP, I imported a copy of the last-minute version. (An official beta test version was already installed on my system.) A quick test showed that no errors had been introduced by the changes. In fact, I could detect no difference at all between the beta and last-minute versions. I reported to my friend that everything looked fine. Hours later, he dropped by to demonstrate some exciting feature. Having difficulty finding this feature in my copy of his program, he concluded that I had, in fact, been testing the official beta version. What had happened?
My friend and I reasoned that two things had occurred. First, FTP had stripped the execute permission from my copy. Second, and heretofore unknown to me, the path search function only finds files that have execute permission set. If it encounters the named file without execute permission, it continues its hunt for an executable version. (If the official beta test version had not been installed, I would have received the error message: "name: execute permission denied.")
Being resourceful, my friend and I proceeded to reset the execute permission flag using the chmod command. And when we tried to execute the program again, the path search function still yielded the official beta test version. Now what had happened?
Hashing! The Bourne shell of System V has a feature called file hashing that speeds up path searches. The shell remembers where in your search path invoked commands were last found. So even though path searching would have worked properly now, it was not being used. The hash memory can be erased with the "hash -r" command.
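The whole interaction can be reproduced in a few lines. This is a self-contained sketch (the directory layout and the `demo` script name are made up): two copies of a command sit on the path, and only the later one starts out executable.

```shell
# Reproduce the path-search / hashing interaction in a scratch area.
tmp=$(mktemp -d)
mkdir "$tmp/early" "$tmp/late"
printf '#!/bin/sh\necho late\n'  > "$tmp/late/demo"
printf '#!/bin/sh\necho early\n' > "$tmp/early/demo"
chmod +x "$tmp/late/demo"          # only the later copy is executable
PATH="$tmp/early:$tmp/late:$PATH"

demo > "$tmp/o1"                   # "late": the copy in early/ lacks
                                   # +x, so the path search skips it
chmod +x "$tmp/early/demo"         # the fix that "should" work...
demo > "$tmp/o2"                   # ...typically still "late": the
                                   # shell hashed the old location
hash -r                            # flush the command hash table
demo > "$tmp/o3"                   # "early": the search now resolves
                                   # to the first executable copy
```

The middle step is the trap Mr. Haffner fell into: chmod alone doesn't help until the hash table is flushed (exact behavior at that step can vary by shell).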
Now, at last, my copy of the program was being tested. Unfortunately, it was time to go home, and the actual program test had to wait until the next day. (Yes! Non-trivial errors were discovered and I had to retract my earlier thumbs up.)
Perhaps the Unix developers would consider a revision to the System V specification and redesign the path searching algorithm. Would it cause any problem, I wonder, if, instead of looking for EXECUTABLE FILES, the path search function looked for ANY FILE whose name is identical to that entered on the command line? This might generate more error messages, but users could be more confident that path searching has resolved to the expected version of a file.
Steve Haffner
Issaquah, Washington
Dear DDJ,
As you probably have been made aware, there was a typo in one of the listings for Kent Porter's "Graphics Programming" column in the July issue. The typo caused a program to loop in a recursive call, which would either hang the system or exhaust the stack.
The typo was in the program RESIZE, in the subroutine drawstar. In this subroutine, the first call to draw_rect should have (30), not (-30), as its dy argument. The typo puts the rectangle lower than the arguments to the following floodfill anticipated. Instead of filling an object, the floodfill routine tries to fill the viewport.
This first error revealed two other errors in earlier listings that had not been noticed because filling a viewport had never been tested.
The first is in the EGAPIXEL routine. It checks incorrectly whether a pixel is within the vuport: it rejects a pixel that is "greater than or equal" to the vuport limit instead of just "greater than." A pixel that is in the vuport can be equal to the limit. This caused the floodfill to reject the last fill line as outside the vuport and make a recursive call in the reverse direction, starting an endless loop.
The EGAPIXEL routine does not check if the pixel is less than the vuport. This is why the endless loop doesn't happen until the fill hits the bottom. Depending on someone's needs, the test can be either fixed for both cases or dropped completely.
The second error is in FLOODFILL. The variable dir is not needed at all. It can be fixed as -1 in the first loop and +1 in the second. The third loop can be dropped completely. With these changes, there is no need to add a check in floodfill for the error return condition from EGAPIXEL. Also this change speeds up the fill routine by about 20 percent.
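Since the original listings aren't reproduced here, the off-by-one Mr. Horvath describes can be sketched abstractly (the structure name and both functions are invented; the real EGAPIXEL code is organized differently). The vuport limits are inclusive, so the reject test must be "greater than," not "greater than or equal."

```c
/* Hypothetical sketch of the EGAPIXEL clipping bug. Limits are
   inclusive: a pixel AT the limit is inside the vuport. */

typedef struct { int left, top, right, bottom; } Vuport;

/* buggy version: uses ">=" in effect (accepts only x < right),
   so a pixel exactly on the limit is wrongly rejected; it also
   never checks the low edge, as the letter notes */
int in_vuport_buggy(const Vuport *v, int x, int y) {
    return x < v->right && y < v->bottom;
}

/* corrected: boundary pixels accepted, both edges checked */
int in_vuport_fixed(const Vuport *v, int x, int y) {
    return x >= v->left && x <= v->right &&
           y >= v->top  && y <= v->bottom;
}
```

With the buggy test, a fill line ending exactly on the limit is reported as outside the vuport, which is what sent FLOODFILL recursing back the way it came.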
John Horvath
Acton, Massachusetts
Dear DDJ,
I have been a fan of Jeff Duntemann since the late Borland Turbo Technix. His October column comparing OOP word definitions in various languages was a beauty. However, I would like to play turnabout.
In defining binding he uses the terms caller and call-ee. I quibble with call-ee.
There is a parent-child sequence that goes something like telegraphy, telephony, radio, electronics, computers. We have inherited much terminology from our ancestor technologies. Since time immemorial (or at least, since the beginnings of telephone and telegraph switchboards) the terms caller and called have been used. These are pronounced call-lerr and call-ledd. Each word has two equally stressed syllables.
John P. Reid
Bear, Delaware
Copyright © 1989, Dr. Dobb's Journal