PROGRAMMING PARADIGMS

Netwurkz and PL/D

Michael Swaine

This month I won't be presenting the object-oriented language interviews I'd planned because of a product that has been sitting on my shelf for months and that I've finally had a chance to start playing with. Writing about this package will lead me back into parallel processing and more specifically into neural networks. I guess I'll get back to object-oriented interviewing next month.

The neural network paradigm, as I've mentioned in past columns, loosely models the structure of the nervous system, with many nodes or neurons linked in a large network whose overall behavior is a function of its inputs, its topology, and properties of the neurons. It is a naturally parallel paradigm.

As I mentioned in the July issue, my friend Jürgen Fey is investigating neural net models. Having built a transputer=based system, Jürgen now wants to see what he can do with it, and neural nets seem like a natural next step, particularly given the ease with which transputers can implement a custom network of processors.

My hardware knowledge doesn t extend much beyond setting jumpers and cleaning edge connectors, so l have not followed Jürgen's example. But l really wanted to play around with neural nets, and I wanted it to be, well, easy. I was looking for some sort of simulation that would run on one of my computers without additional hardware and without my having to write the simulation myself. Obviously, such a simulation would be of no practical value: simulated parallel processing on a single-processor machine is only useful for learning, for experimenting purposes. My purposes.

Netwurkz turned out to be ideal for my purposes.

Nerl Netwurhz

Netwurkz is a neural network simulation written in a system language called PL/D. Netwurkz and PL/D are the brainchildren of Dennis Reinhardt, president of DAIR Computer Systems in Palo Alto, Calif. Reinhardt distributes the source for both the simulation and the compiler itself at a reasonable price.

What Reinhardt has implemented in Netwurkz is the simplest kind of neural network--a deterministic, single-pass, feed-forward net. Its simplicity makes it a good tool for getting started with neural nets, and the implementation, although cumbersome, certainly gives you a sense of what it really means to build a neural net, as you laboriously cobble together your network of neurons.

The physical system that all neural network systems simulate is this: some 1,010 information-processing cells connected via input channels called dendrites and output channels called axons. Each cell typically has one axon and many dendrites. Within the cell, information flows electrochemically from dendrites to cell body to axon, with the cell applying some function to the inputs to p duce the single output signal sent d wn the axon. Between cells, axons connect with dendrites.

Some attempts to model this system have used matrix models, tagging the connections that actually exist in a large sparse matrix that represents every possible connection.

A more efficient way to implement such a model is via list storage, letting individual lists indicate (1) the cell and (2) all the cells whose dendrites reach it: the node and its sources as head and tail of a list. Reinhardt's system language PL/D turns out to be good at list processing. Now, Reinhardt is intrigued by neural networks, and it would make a nice story to say that he created a language and a compiler in order to do neural network research, but the truth seems to be that his language just happens to be well suited to supporting the kind of data structures he needs for his neural network research.

The language also happens to be interesting in its own right.

Programming Language/Dennis

Dennis Reinhardt's real motivation for developing PL/D was a mundane observation. He noticed, not for the first time, nor was he the first to make the observation, that assembly language and high-level languages each have their strengths. He went on to get clearly in mind what those respective strengths were, at least from his perspective, and set out to develop a language that embodied the best of both levels of programming.

History does not record how many languages have been developed from exactly this motivation. History does demonstrate, though, that the same motivation can lead to rather different solutions, depending on the programmer's sense of just what the relative strengths of assembly lannguage and HLLs are and on the programmer's tastes and abilities.

Reinhardt's perspective, at least in part, was that assembly language provides tight control and HLLs provide convenient notation, and he tried to provide both in PL/D.

A simple assembly-language operation might look like:

   LOAD      A
   SUB      B
   STO      C.

These three lines perform one meaningful operation; reflecting this notationally could be accomplished by putting them on one line:

    LOAD A
    SUB B
    STO C.

PL/D does exactly this, with compiler style notation:

A-B-->C; or:
A;
   -B;
       -> C;

(PL/D is forgiving of syntactic variations because it is, at the expression level, essentially translating a text stream into code.) At this level, PL/D is an assembler that accepts a notation for expressions more like that you'd expect from a compiler.

Above the level of the expression, PL/D provides the structures you would expect from a HLL: IF ... THEN ... ELSE ... ENDIF, FOR ... DO ... NEXT, WHILE ... THEN ... ENDWHILE, and CHOOSE ... CASE ... BUT ... ENDCH. It is these structures that make PL/D a compiler.

PL/D has a couple of other nice features. Any declaration made within a subroutine is local to that subroutine, for more structuring. And Reinhardt has built a lot of compiletime orthogonality into the language. All four of the control structures have compile-time analogs with exactly analogous syntax, such as FOR$ ... DO$ ... NEXT$. You could implement the Byte magazine Sieve of Eratosthenes algorithm entirely in compiletime code in PL/D. PL/D allows you to control any libraries you want to include, so you can keep programs small. And it's a one-pass compiler.

OK; I'm not writing a review of PL/D here, and if I were I'd probably find some problems to gripe about. The point is that it is an interesting little language, and it provides a means for exploring simple neural networks.

The Neuron as DATA Statement

The neuron of the neural network model that Reinhardt has created in Netwurkz is implemented as a PLA) DATA statement. The list structure of cell and input cells becomes a DATA statement that lists the cells. These DATA statements are interpreted and executed (sequentially) by a network executive routine. In a true parallel implementation of a neural network, the cells themselves would have access to processors and could fire in parallel.

The PL/D DATA statement takes one of these forms:

     DATA Name v1...vn;      DATA ** v1...vn;

Here, the first form defines Name as the current location counter and allocates v1 through vn to the current through current + (n-1)st starting locations (actually current + 28(n-1), but that's not important to an understanding of this use of the DATA statements). The second form is identical except that it does not define a name.

A neuron in a neural net model typically does something such as summing the values of its dendritic signals or summing all those that exceed some threshold, and there are many variations on this theme. Real neurons often have inhibitory dendrites, whose signals inhibit the signals of other dendrites. Reinhardt has implemented several simple neurons that do things more suited to his immediate tutorial purposes than to modeling the neural processes of the brain. The Netwurkz simulation includes the neurons CON, EQ+, SUM, and MAX.

The CON neuron defines a constant. I question its neural reality; it sounds suspiciously like the infamous "grandfather cell." It is used to build a database. Its syntax is:

     DATA Name CON Value;

The EQ + neuron counts the number of exact matches between corresponding a and b inputs. The count is saved in the location reserved by the 0. Its syntax is:

     DATA Name EQ + 0 n a1 b1 ... an bn;

The SUM neuron adds the values of its n inputs a1 through an and stores the sum in the location reserved by the 0. It also stores a pointer to a reference neuron. This business of providing two outputs from the cell is counter to the one-axon rule for real neurons, but Reinhardt says that the same effect could be achieved with one output (or one axon) just by time-sharing the channel between two messages. Its syntax is:

     DATA Name SUM 0 refad n a1 ... an;

The MAX neuron compares its two inputs, placing the larger value and the address of the neuron that provided it in its two reserved spaces-- another two-axon simplification. Its syntax is:

     DATA Name MAX 00 a1 a2;

With these simple neurons, Reinhardt implements an even simpler spelling checker that has a five-word vocabulary and only recognizes four-letter words. Let me try that second constraint again: It recognizes only words of four or fewer letters. Both these constraints are imposed merely to keep the example simple and easy to follow; they are easily relaxed.

The spelling checker compares the first word in its vocabulary with each of the others to find a best match. It accepts and assigns goodness weights to shifted matches (Ike vs. Mike), elisions (Swine vs. Swaine), and transpositions (Dbob's vs. Dobb's), and it gives preferential weight to initial-letter matches. These few match rules are easy to implement, easy to understand, and easy to adjust the weightings for.

Other than that, the spelling checker is lacking just about everything you'd like it to have It doesn't learn; its vocabulary is insignificant; and as it is currently implemented, adding new words to the vocabulary requires adding new neurons at every level of processing. Giving it a word to check means keying in a number of DATA statements. But what is nice about the Netwurkz implementation is that the model makes these deficiencies so apparent and provides a framework in which to think about how to fix them.