This item is from the "Programming Paradigms" and "Swaine's Flames" Buglist (along with a heartfelt "Gee, thanks!" to all you beta testers out there): "Won't bunking lead to an increase in the 'american populous'?" e-mails Jim Weaver from Oregon, suggesting that "perhaps Cowlishaw's observation [in the March 1996 DDJ article 'A Conversation with Michael Cowlishaw'] about programmers and debuggers applies equally well to writers and spell checkers."
Jim is using a term that I invented in my March 1996 "Swaine's Flames" column to comment on a misspelling that I perpetrated in that same issue's "Programming Paradigms" column. This marks Jim as a Double-Columned Editorial Ego Stroker and guarantees him a mention here. The only problem is, I didn't make the mistake Jim cites. (I refer now to the erroneous use of "populous" for "populace." Lowercasing "American" is Jim's own error--unless it's some kind of political statement.) No, the mistake I made was an entirely different mistake. What I wrote was "populus," which is not even an English word. An editor fixed it into "populous."
As I told Jim, rather than Cowlishaw's observations, I would point to Poncifrete's advice: "Correcting an error is the most error-prone activity in programming. Never do it."
Sound advice in any field, with the possible exception of politics. I'm thinking of New Jersey Congressman Bill Martini's pet phrase, "Correct me if I'm not mistaken," the logic of which makes me dizzy.
But I digress.
Some programmers aspire to Poncifrete's standard of perfection. Such perfectionists really believe, in the tightly knotted ganglia of their neural nets, that they ought to be able to sit down cold and code that routine clean and complete in one pure act of immaculate cerebration. But I want to talk about one kind of perfectionist in particular, the all-inclusive visionary kind. The kind of perfectionist who says, "All of computing can be viewed as (fill in the blank)."
Everything is Compression
Gerry Wolff believes that all of computing can be viewed as compression. Wolff is a professor at the School of Electronic Engineering and Computer Systems at the University of Wales at Bangor. His vision encompasses other fields, too: Besides "Computing as Compression," he has written papers on "Cognition as Compression" and "Language Learning as Compression." To be followed, I have no doubt, by "Coal Production as Compression," a hot topic in Wales, and "Editing as Compression," which--uh, yeah, I can take a hint.
Wolff's views seem well thought out and original. He has demonstrated their applicability in many areas of computation, too (he'd have to). None of his demonstrations have set any records for performance, but he argues that that's because they are hybrids--his paradigm implemented with tools and on hardware based on a different paradigm. We may have to wait for Wolff to build a computing-as-compression machine before we can truly evaluate his program, but he's working on that, too. (You can tap into his research program at http://www.sees.bangor.ac.uk/~gerry/sp_summary.html.)
His basic conjecture is that "all kinds of computing and formal reasoning may usefully be understood as information compression by pattern matching, unification, and search." Several of these terms require definition, not the least of them being the word "usefully."
By "unification," Wolff means something less than logicians and Prolog programmers mean by the term. (In Prolog, correct me if I'm not mistaken, the term means something like "banging two assertions together to see what conclusions they spark.") Wolff's unification is a simple merging of the matching patterns returned by the pattern-matching phase. For example, a simple compression algorithm might look for repeating patterns (aaaaaaa, 161616), use some matching method to recognize the repetitions, and unify the patterns by merging them in some redundancy-reducing fashion (ax7, 16x3). These examples of matching and unification are two methods among many, and that's where the search comes in: Wolff's model, which he calls "SP" (short for "simplicity and power"), involves narrowing the options using some metrics-guided search technique such as hill climbing.
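The aaaaaaa-to-ax7 case can be sketched as ordinary run-length encoding--a toy stand-in of mine, not Wolff's SP model--in which equal neighboring symbols are matched into runs and each run is unified into a single (symbol, count) pair. Note that this character-level sketch captures ax7 but not 16x3; spotting the two-character pattern in 161616 is exactly the kind of alternative an SP-style search would have to weigh.

```haskell
import Data.List (group)

-- Matching phase: group equal adjacent symbols into runs.
-- Unification phase: merge each run into one (symbol, count) pair.
encode :: Eq a => [a] -> [(a, Int)]
encode = map (\run -> (head run, length run)) . group

-- Decompression re-expands each unified pair.
decode :: [(a, Int)] -> [a]
decode = concatMap (\(x, n) -> replicate n x)

-- encode "aaaaaaab" == [('a',7),('b',1)]
```

Decompression is exact here; judging when such a merge actually pays, in Wolff's terms, is what the metrics-guided search is for.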
Wolff thinks that the multiple-alignment technique used in DNA analysis is a good approach to matching and unifying information content, and he proposes using it in his model. He points out that it has been promoted for use in unsupervised learning, fuzzy-pattern recognition, deductive and probabilistic reasoning, and other tasks.
Wolff predicts that computers built and programmed along the SP model would produce less-brittle software, greater flexibility, better performance in the face of uncertain or incomplete information, at least partial automation of the process of software development, and elimination of the need to write search routines (because powerful search techniques would be built into the model). Some of these sound like the predicted benefits of neural-net or fuzzy-logic systems, and SP has a lot in common with those approaches.
The Road Not Taken
I don't know whether or not Wolff is onto something promising, but there is one thing about his notions that I find intriguing. The current trend in software development--toward building from components--makes it hard to judge the overall efficiency of a system. When lots of people build roads to get them where they want to go, and then you design a route to your goal over these roads, all you can say with certainty is that you've chosen an efficient path to your goal using the available roads. This says nothing about the crow-fly distance from where you are to where you want to be. Wolff's SP model, and I hope he'll correct me if I'm not mistaken, is very much concerned with searching for an efficient path from here to there. And that's certainly worth looking into.
Springtime for Hitler
There must be a reason why these grand visions seem so often to pop up in out-of-the-way places like Bangor, Wales. Actually, Bangor is the center of the universe compared to the place where Konrad Zuse created his grand vision.
April 1945 was not a good time to be in Berlin. As the Allied troops approached the city, one lesson to be drawn from recent events was the importance of choosing the right name for your product. No matter that the Versuchsmodell 4 at Göttingen University was merely a computer. Any device called the V4 was bound to attract the attention of the occupation troops, even if it had no connection with the V1 and V2 buzzbombs. Konrad Zuse had invented the V4, as well as the V1 through V3--which were destroyed in the bombing--and he got permission to transport the surviving V4 to a safer location.
There ensued a hairsbreadth escape. Zuse and the V4 trucked south, deep into the Bavarian Alps along the Austrian border, to the tiny town of Hindelang, and then to the even smaller village of Hinterstein. In a small shed in Hinterstein, the V4 found a seemingly good hiding place.
A few days later, the French army was in Hinterstein.
The V4 managed to escape notice, hidden in a basement this time, but the next time it wasn't so lucky. It was discovered by British Secret Service officers, who, however, lost all interest in it when they determined that it had nothing to do with buzzbombs.
What with funds dried up, his staff dispersed, and the need to hide the machine in the basement whenever company was expected, there was no way Zuse could get any work done on the V4. There in Hinterstein, with time on his hands and ideas in his head, Zuse settled into theoretical work. Outside, spring was breaking through with Alpine earnestness. Inside, Zuse began the design of a programming language. Since he had no machine to run it on, this was a purely theoretical enterprise, which suited Zuse all right.
Everything is Bits
Zuse had already developed some thoughts about language. All of computing, he believed, can be viewed as starting at the bit. This was in fact the guiding principle behind his machines, the V1 through V4. Zuse didn't just mean that he wanted to build machines that calculated in binary, although he did mean that, in part, and that was not by any means a given in the early 1940s. Historically, calculating devices had often worked in decimal, and it was not immediately obvious that binary was better.
"The first principle of the Plankalkül," Zuse wrote later, "is: data processing begins with the bit." Building on the bit, you can make programs of arbitrary complexity because you can derive all of predicate and propositional logic; and you can also create the most complex data structures. The bit is it.
There were various other nifty insights in Zuse's language. Program flow in Plankalkül was tightly controlled. It had no GOTO statement; in fact, its version of the IF statement had no ELSE clause. It did, however, have a subscripted FIN statement that caused a jump out of a number of iteration levels.
Zuse introduced into his notation what we would today call invariants, tracking the mathematical relationships between the variables used, an idea that didn't resurface until C.A.R. Hoare brought it up in his 1971 proof of the program FIND. Example 1(a) is an approximation of Zuse's notation. This statement (it's a single statement, although it spans three lines) maps an 11-element floating-point array into an array of 11 ordered pairs, with data types defined elsewhere. Or so they tell me. The notation is a little obscure.
But Plankalkül's most outstanding feature was its hierarchical data structures, the outgrowth of Zuse's insistence on the bit as central to everything. Floating point was not a built-in datatype, but was defined as (3xS0, 7xS0, 22xS0). This means 3 bits, 7 bits, 22 bits (S0 was the sole provided primitive datatype, the bit), and these three segments were used, respectively, for sign and other markers, exponent, and mantissa. Once floating point was defined it could, in turn, be used in defining more complex datatypes, which could be arbitrarily complex, with heterogeneous components of variable length.
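Zuse's bits-up construction can be loosely sketched in a modern notation--Haskell here, purely my illustration, not Plankalkül's two-dimensional notation: a floating-point value is nothing but three fixed-length groups of bits, and the composite type can then appear inside still larger structures.

```haskell
type Bit = Bool  -- S0, the sole primitive: one bit

-- Floating point as (3 x S0, 7 x S0, 22 x S0):
-- sign-and-marker bits, exponent bits, mantissa bits.
data Float32 = Float32 [Bit] [Bit] [Bit]

-- Enforce the 3/7/22 layout when building a value.
mkFloat32 :: [Bit] -> [Bit] -> [Bit] -> Maybe Float32
mkFloat32 s e m
  | length s == 3 && length e == 7 && length m == 22
              = Just (Float32 s e m)
  | otherwise = Nothing

-- The composite type can itself be a component, for
-- example an 11-element array of ordered pairs of floats:
type PairArray = [(Float32, Float32)]
```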
Nothing like this came into programming languages again for a decade. Curiously, Plankalkül didn't have much impact on subsequent programming languages. In particular, it had little effect on the design of Algol, even though some of the Algol designers knew of Plankalkül and even though it had many features Algol was to have, as well as some it lacked. Some who were involved in the Algol spec say that the reason was that Plankalkül had a horrible notation and was, well, too perfect.
Zuse saw it differently, of course. History doesn't always speak with one voice.
History is Bunk
Which brings us back to the "Programming Paradigms" and "Swaine's Flames" Buglist: Because history does not speak with one voice, writing about history is a sure way to get mail. If you don't make your own mistakes, you will repeat the mistakes of your sources, and your readers will nail you either way.
Robert Patterson writes to say, "I'm sure I'm the zillionth person to tell you, but (for the record) I believe that the ENIAC revival of Feb. 14 occurred at the University of Pennsylvania in Philadelphia, rather than at the University of Pittsburgh as you reported in the March '96 issue of Dr. Dobb's. I used to walk by the ENIAC every day when I was attending the U of Penn, and occasionally I went in to peek at it."
Robert is right, of course.
And Gary Brown writes, "I hold in my hot little hand (no mean task when you are touch typing) the manual 'GENIACS: Simple Electric Brain Machines, and How to Make Them' (c) 1955 by Oliver Garfield. I think your reference to Mr. Berkeley may be in error (unless Mr. Garfield only wrote the manual...who knows...). I can still remember putting all the little screws on the masonite disks!"
I'm still researching this one. I welcome e-mail from anybody who knows for sure who created Geniac.
Everything is a Function
Not all my mail goes to the Buglist. Says Matteo Vaccari, "I've read with interest your columns about languages other than C, in Dr. Dobb's. May I suggest you to take a look at an interesting language called Haskell? It is a functional language, with a few twists. For one thing, it is entirely functional: Unlike Lisps and Schemes, there is no imperative subset." Aha, I thought, a pure functional language with no flaw in that crystalline paradigm. That's the kind of thing I'm talking about this month. Haskell is a lazy, functional-programming language designed by a committee of researchers and named for the logician Haskell Curry. A close relative of it has been implemented, at least in part, for DOS, Mac OS, Atari, Amiga, and Sun OS in the form of a free interpreter written by Mark Jones of Oxford University and called "Gofer" (no relation to the Internet utility Gopher) or MacGofer. You can find Gofer and MacGofer at http://www.inesc.pt/free-dir/free-S-1.53.html. A lot of info on Haskell itself is available at http://www.cs.yale.edu/HTML/YALE/CS/haskell/. Haskell has many of the sorts of features you'd expect in a more or less modern programming language, and, as Example 1(b) illustrates, has an unsurprising syntax.
Haskell has the usual built-in datatypes, including lists, which are as useful in Haskell as they are in that other functional language, LISP. List functions like map, which applies a function to all elements of a list, are central to the way you program in Haskell. Strings are implemented as lists of characters.
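Here's a quick sketch of my own (not code from the Gofer distribution) of map at work on lists; because a String is just a list of characters, the same list machinery applies directly to text.

```haskell
import Data.Char (toUpper)

-- map applies a function to every element of a list.
doubled :: [Int]
doubled = map (* 2) [1, 2, 3]   -- [2,4,6]

-- A String is [Char], so map works on strings, too.
shout :: String -> String
shout = map toUpper             -- shout "gofer" == "GOFER"
```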
Pattern matching also is central to Haskell. The declaration of a function, in fact, involves the specification of patterns to be matched when the function is invoked. These patterns are just the formal arguments of the function, so the same assertion could be made regarding any programming language that supports functions, but in Haskell these patterns can be more complex, including lists, tuples, wildcards, and other forms. Haskell also supports a LET construct, which might make its claim to be a pure functional language without assignment statements look suspect, but Haskell's LET is not Basic's LET: It introduces local definitions, not mutable variables.
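For instance--again a sketch of mine, not from the Gofer distribution--a function can be defined by one equation per argument pattern, and let can name local intermediate values:

```haskell
-- One equation per pattern: empty list, then head-and-tail
-- (the underscore is a wildcard matching any head).
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- A tuple pattern in the argument position.
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)

-- let binds local names once; it is not Basic-style assignment.
hypot :: Double -> Double -> Double
hypot a b = let a2 = a * a
                b2 = b * b
            in  sqrt (a2 + b2)   -- hypot 3 4 == 5.0
```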
But I wanted to focus on the essence of Haskell, which is this grand vision of pure functional programming.
The briefest definition of functional programming is "everything is a function," although "programming without assignment" is a close second. It may come as a surprise, not to you but to self-trained Basic hackers, that it is possible to live without an assignment statement. (Zuse, incidentally, invented the assignment statement. That many other computer pioneers also invented it independently doesn't take away from this original act of invention. Like many successes, the assignment statement has many parents.)
But it's more revealing to point out the lack of side effects of pure functional programming. It has several desirable implications: Individual functions can be analyzed in isolation, and you can convince yourself of their correctness without having to consider other parts of the program; you can perform sophisticated program transformations; and you can do lazy evaluation. Lazy evaluation allows for some intuitively satisfying techniques, like defining the Fibonacci series as an infinite list and selecting from that list the nth element. Without lazy evaluation an infinitely long list can only be dealt with indirectly if you don't have infinite time at your disposal, and who does, these days.
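The Fibonacci trick looks like this in Haskell--the standard lazy-list definition, not code from the column: the whole infinite series is named at once, and laziness ensures that only the demanded prefix is ever computed.

```haskell
-- The entire (infinite) Fibonacci series as one list:
-- each element is the sum of the two before it.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Selecting the nth element forces only a finite prefix.
fib :: Int -> Integer
fib n = fibs !! n   -- fib 10 == 55
```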
Some readers would like to rewrite my columns, like Glenn Linderman, who rewrote the closing sentence in my February 1996 column. That column discussed the language ReWrite and ended, "I hope to write about it [Mathematica] soon." Glenn thinks this would have had more punch if I had said, "I hope to ReWrite about it [Mathematica] soon." "Since you've discussed it before," Glenn says, "you could have gotten away with it." Heck, Glenn, I wish I'd thought of that. What a perfect way to end the column.
Example 1: (a) Approximation of Zuse's notation; (b) Haskell syntax.
(a)
P2 / R(V) => R
V/ 0 0
A/ 11 X d1 11 X 2
(b)
fact n = product [1..n]
comb n r = fact n / (fact r * fact (n-r))
comb 5 2
entered at the interpreter's prompt gets the response
10