Dr. Dobb's Journal October 2003
In September, I indulged in the venerable magazine "Summer Reading" cliché. This month, I hope you'll indulge me by letting me write about, among other things, some deeper reading. I whiled away several hours this summer reading about reading code, and code that reads writing, and code that reads faces.
In that same September issue, Gregory Wilson reviewed the book Code Reading. I had a personal reaction to the book, and find that I have to add my comments to Gregory's. I hope that they will augment rather than echo his. I will, however, echo his bottom line: this is a worthwhile book.
Way back in 1987, I wrote a book on HyperTalk programming, and I started that book by quoting the founding fiction editor of Esquire magazine:
I've got a shelf of how-to-write books, and they all seem to me pretty much dreadful...Then I've got another shelf of books, some of them seem to me great...Basically, these are how-to-read books...[I]t seems to me that a beginning writer could learn more from any one of them...than he ever could from reading the whole damn shelf of how-to-write ones.
I went on to draw the obvious parallel with books on writing programs, and bemoaned the fact that I couldn't find a good book that taught how to read programs.
"So I tried my hand at writing one," I said, "and this is it."
Well, I don't know how successful I was in helping people learn how to write programs by teaching them how to read small and large examples of HyperTalk scripts, but I haven't changed my mind in 16 years about the importance of learning how to read code.
Now, on one level, what I am saying is painfully obvious. Of course it's important to know how to read code: You need to be able to read your own or others' code in order to maintain it; you need to be able to read cold code written by people No Longer With The Company and to read hot code as it is being keyed in by your partner if you ever do pair programming; you need to read code if you want to reuse open-source software components. Open-source software is all about being able to read the code. In fact, it's pretty safe to say that anyone reading this particular magazine already knows that reading code is an essential part of being a programmer.
But there's reading and there's reading. Not everybody who can talk can give a good speech, and not everybody who knows C++ syntax can write decent programs. Similarly, not everyone who knows C++ syntax can read code well. There are actual techniques and skills in code reading, and they are rarely taught explicitly.
So I was delighted to read, in the introduction to Diomidis Spinellis's Code Reading: The Open Source Perspective (Addison-Wesley, 2003; ISBN 0201799405):
In this book we demonstrate important code-reading techniques and outline common concepts in the form they appear in practice, striving to improve your code-reading ability.
That sounds like what I've been looking for. Does he deliver?
If you read Gregory's review in September, you know that he thinks so. I'll let you decide, and offer some examples to help you.
There are a lot of examples to choose from, because Spinellis helpfully lists all 268 of his code-reading maxims in an appendix. The main body of the book runs through C data types and data structures, control flow, issues in working on large projects, coding standards and documentation, program architecture, code-reading tools, and one large complete example program. That's the Chapter 11 that Gregory talked about in September. Along the way, Spinellis drops lots of code-reading advice, which he then collects in the appendix.
Spinellis starts out with general advice like: "Make it a habit to spend time reading high-quality code that others have written," and "Print on paper code you find hard to understand." He offers tips of the form "X usually means Y," like "A permanent...pointer to a list node often represents the list head," and "Read the expression sizeof(x)/sizeof(x[0]) as the number of elements of the array x." And he is a wealth of good advice on the use of grep, fgrep, diff, and other tools.
But he's altogether pragmatic about where you find your insights into the code. Sometimes, he says, running the code is the quickest way to answer a question about it.
In the midst of a long list of the benefits of reading the documentation, he also mentions some of the reasons why you might want to curse the documentation: "Because documentation is seldom tested and stressed in the way the actual program code is, it can often be erroneous, incomplete, or out of date." And: "Documentation occasionally does not describe the system as implemented but as it should have been or will be implemented." This guy's been looking over my shoulder.
Knowing what software architecture, or what paradigm, the writer of the code had in mind can make all the difference in how hard the code is to decipher, and there are always clues to the architecture. "A tell-tale sign of a data-flow architecture is the use of temporary files or pipelines for communicating between different processes." He points out, though, that a system may not have a single all-encompassing architecture. He recommends trying to understand architecture in terms of frequently used design patterns, even if there's no indication that the authors were thinking in those terms.
Some code, though, was not written to be read. "Limit your expectations when reading wizard-generated code," he says, "and you will not be disappointed."
In the acronym soup of artificial intelligence, WSD stands for "Word Sense Disambiguation." Good thing researchers chose not to call this area of research "Word Meaning Disambiguation," or some researcher would insist that he never said he was interested in WMDs but only in WMD programs, and it would all get political.
Over the years, the pendulum in AI research swings back and forth between approaches that emphasize logic and algorithms and those that emphasize crunching large bodies of real-world data. Lately, the data-crunchers have been making progress in making sense of human utterances, and Mark Stevenson's Word Sense Disambiguation: The Case for Combinations of Knowledge Sources (CSLI Publications, 2003; ISBN 1575863901) is a snapshot of how far that approach has come in extracting sense from text. Unlike Spinellis's book, this is a dry, academic work with many references to academic papers in journals you don't read, and it has no zingers. It has several virtues that make up for these defects, though.
Stevenson is apparently the first person to pull together a large enough corpus of appropriate data to test different approaches to WSD. He has actually done the tests, and he comes up with an approach that beats all competitors.
The WSD problem is to determine, in context, the meaning of an ambiguous word. Does "bank" refer to a place to keep money, or to the edge of a river? Does "sanction" mean to approve or to penalize? Does "bake" refer to an action that changes an object's state ("bake a potato") or to an action that creates something ("bake a cake")? And how about irony, as when "good" means "bad"?
One reason that the data-rich approach didn't take off earlier is the problem of coming up with a representation for the data. Putting a dictionary or thesaurus online doesn't solve the problem entirely. Different data representations led to different approaches. The various approaches to WSD can be sorted into several logical categories, and one open question when Stevenson began his research was: How different are these categories? Are they really measuring different things, or are they different views of the same underlying phenomena?
The benefit of Stevenson's results is that they show that the different approaches are really measuring different things, to a useful extent. Significantly, the contributions of the different approaches to WSD are independent enough that there is real value in a hybrid approach that combines all the methods.
The results are complex, but the general picture is that Stevenson's approach is generally 90-something percent accurate in resolving ambiguities across parts of speech and other variables. The component approaches, of which his approach is made up, generally hit around 60-80 percent, so his approach of combining methods is clearly justified.
But 90-something percent: That doesn't sound so great. Until you realize that's not 90-something percent accuracy in extracting the meaning from text, but 90-something percent correct on just the hard stuff.
Overall, I think that this book and the research in it represent a minor milestone in the machine understanding of human writing. I suppose that the ultimate solution to the problem will only come when machines get smart enough and powerful enough and fed up enough to teach us humans how to write without illogic and ambiguity. Until then, they will have to do their best with the puzzling, ambiguous, unclear documents we typically produce. Boy, was that ever an invitation to criticism. Oh, well.
Emotions in Humans and Artifacts, edited by Robert Trappl, Paolo Petta, and Sabine Payr (MIT Press, 2002; ISBN 0262201429) tries to bridge what may be hard-to-bridge questions: how emotions work in humans, how to simulate emotional expression in human-constructed artifacts, the possible role of emotion in machine learning, among others. But this is an edited volume (although not a collection of papers prepared for some other context), and the approach is probably right. If the goal is to explore the space defined by the intersection of emotions and artifacts, all these questions are germane, and no solid overarching theory is to be expected.
The contributors are working in what sound like interesting fields: researching how to build lifelike computer characters; developing mathematical approaches to the analysis of virtual worlds; emotional modeling for autonomous and social agents; writing a program called Affective Reasoner; developing a life-sized improvising synthetic character for an interactive exhibit; directing the Affective Computing Research Group at MIT; working with Marvin Minsky on the role of emotion in memory, reasoning, and learning; and creating Dogz, Catz, and Babyz.
For me, the central question of the book is asked almost exactly at the center of the book: in the title (and content) of Rosalind W. Picard's essay, "What Does It Mean for a Computer to 'Have' Emotions?" I'll discuss only that essay here, although the whole book is interesting enough.
Picard starts off right, deriding the looseness with which people talk about artifacts "having" emotions. Picard acknowledges that she is an engineer and no philosopher, but (I'll make the argument for her) neither was Alan Turing, and understanding his Turing Test is a must for any philosopher talking about what constitutes intelligence. Picard takes the Turing tack, recasting the question as, "What capabilities would I require a machine to have before I would say that it 'has emotions,' if that is even possible?"
She decides that there are at least four components to what she would demand of this emotive artifact: emotional appearance, multilevel emotional generation, emotional experience, and mind-body interaction. Some of these terms are not self-explanatory.
By "multilevel emotional generation," she means the range of emotional reactions from instinctive fight-or-flight behavior to reactions that seem to involve reasoning and appraisal in humans and that seem to be far removed from obvious direct bodily reactions.
I'm less clear what she means by "mind-body interaction," but she makes the interesting point that sometimes our emotions communicate themselves even when we are doing everything we can to communicate the opposite message. Emotive artifacts should have expressive "body" language, and should have trouble suppressing it.
As for "emotional experience," that lands her right in the morass of computer self-awareness. This does not seem to me to be reducing the problem any.
Picard's arrestingly simple answer to how to produce an artifact that "has" emotion is to make sure that it has some level (some "nuances," she says) of each of these four components. The artifact then has emotion, and it's only a question of how much.
Okay, I think I have badly oversimplified Picard's argument. I hope I have. But I think that's sort of what she has in mind.
As for the question, "Aren't you merely simulating emotion?", she gives it the only answer that can be given: "Call it what you will, if it fills the role of real emotion, we have to treat it as though it were real emotion."
In my July column about Adam Osborne, I misspelled the name Sri Ramana Maharshi. I apologize. It ought to be easier to keep the bugs out of prose than out of code, because (Diomidis Spinellis notwithstanding) it's easier to read prose than code.
So I don't fault Apple for having a few bugs in their zippy Safari browser. And I do appreciate the bug report icon right there on the face of the browser, sort of like a corrections box on the front page of a newspaper, only not. My only complaint is that, no matter how many times I report the bug that most annoys me, they never fix it. And Safari's out of beta now. So I thought I'd go public with my gripe here.
Sometimes, when I'm viewing a web page narrow enough in a window wide enough that a horizontal scroll bar is not needed, the following happens: The horizontal scroll bar disappears (as it should). The drag box below the vertical scroll bar disappears (it shouldn't). The vertical scroll bar grows down to the bottom of the window (which looks weird). Result: I can't resize the window except by poking the green traffic light (which is pretty limited). The behavior doesn't happen all the time, but it happens a lot. Both facts are annoying.
So, Apple, do you think you could take a look at this bug and fix the darned thing? I may be the only person who cares, but I have been buying your products for a long time. Thanks.
Wolfram Research has sent me the latest release of its mathematics software, Mathematica, Version 5. I left Version 4.2 on my machine and ran 5.0 next to it with the intention of running some benchmarks.
That, however, would have involved writing some serious benchmarks, and as anyone who has written serious benchmarks knows, there is actual work involved in the process. Especially in the case of Mathematica, with its multiple paradigms, rich libraries, and flexible data representations. But I'm making excuses. The real reason that I haven't come up with some good speed benchmarks for Mathematica is my own limited experience with the program. Despite having used it off and on for many years, I remain a Saturday-morning dabbler.
Nevertheless, I did dabble with the two versions and convince myself, very informally, that Wolfram is not kidding about the improvements.
First, a reminder about what Mathematica is. It is a system for doing serious math on a computer. It lets you enter equations the way a mathematician does on a blackboard or in other styles; it deals with unlimited precision numbers and computations; it does symbolic computation as well as crunching numbers; it comes with packages of routines for working in many areas of mathematical specialization including algebra, calculus, series and limits, curve fitting, discrete math, number theory, geometry, and statistics. It has amazing graphing capabilities. It's a cross-platform app and employs a client-server model that lets you run processor-hogging computations on a dedicated server.
Version 5 claims to outperform dedicated numerical systems, by which they mean MATLAB, MATRIXx, and O-Matrix, apparently. Speedups are scattered across thousands of routines in Mathematica, and in some cases are reported to be on the order of a thousand-fold. I'm guessing that any routine that could be sped up a thousand-fold was kinda slow, but that's just a guess.
New features in 5.0 include a fully integrated solver for differential algebraic equations ("integrated" is one of those words we need WSD to disambiguate, but I think the meaning is clear from the context here), .NET/Link to integrate (see above parenthesis) Mathematica apps into .NET, optimization for 64-bit platforms, a solver for recurrence equations, and more support for sparse matrix operations and formats.
Mathematica and Wolfram Research are among the brainchildren of Stephen Wolfram, author of A New Kind of Science, about which I have written in this space on several occasions.
DDJ