Doubling Down on the Good and the Okay

Dr. Dobb's Journal March 2002

By Gregory V. Wilson

Greg is a DDJ contributing editor with a special interest in scientific computing and small-scale software engineering. He presently works for Baltimore Technologies, and can be reached at gvwilson@ddj.com.

Program Generators with XML and Java
J. Craig Cleaveland
Prentice Hall, 2001
448 pp., $49.99
ISBN 0130258784

Programming Python, Second Edition
Mark Lutz
O'Reilly & Associates, 2001
1292 pp., $54.95
ISBN 0596000855

Developing Bioinformatics Computer Skills
Cynthia Gibas and Per Jambeck
O'Reilly & Associates, 2001
446 pp., $34.95
ISBN 1565926641

Practical Guide to Testing Object-Oriented Software
John D. McGregor and David A. Sykes
Addison-Wesley, 2001
224 pp., $44.95
ISBN 0201325640

The easiest way to explain a general idea is often through specific examples. J. Craig Cleaveland's Program Generators with XML and Java is a good example, and a good book. Despite its title, it isn't really about Java or XML: It's about treating programs as just another kind of data, which is something that programmers ought to do much more often.

By now, most programmers have had some experience with code generators, such as the "wizards" in Microsoft's Visual C++. Cleaveland's book is a systematic exploration of what wizards can do and how they should be built. His examples start simply, but by the end of the book have worked up to customized templating using the XML Document Object Model (DOM), XSLT, XPath, JavaBeans, and other bleeding-edge technologies.
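To make the idea of treating a program as data concrete, here is a minimal sketch of a generator that reads a small XML specification and emits Java source text. It is not taken from Cleaveland's book: the `record`/`field` specification format and the `generate_java_class` function are assumptions invented for this illustration, and it uses Python's standard `xml.etree.ElementTree` module rather than the DOM and XSLT machinery the book works up to.

```python
import xml.etree.ElementTree as ET

# Hypothetical specification format, invented for this sketch.
SPEC = """
<record name="Point">
  <field name="x" type="int"/>
  <field name="y" type="int"/>
</record>
"""


def generate_java_class(spec_xml):
    """Return Java source text for the record described by spec_xml."""
    record = ET.fromstring(spec_xml)
    class_name = record.get("name")
    fields = [(f.get("type"), f.get("name")) for f in record.findall("field")]

    lines = ["public class %s {" % class_name]
    # One private member per <field> element.
    for type_, name in fields:
        lines.append("    private %s %s;" % (type_, name))
    # One getter per <field> element, named getX for a field called x.
    for type_, name in fields:
        getter = "get" + name[0].upper() + name[1:]
        lines.append("    public %s %s() { return %s; }" % (type_, getter, name))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_java_class(SPEC))
```

The point of the sketch is that the generated class is just a string assembled from a data description, so changing the specification changes the program without anyone editing Java by hand.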

Chapters 2 and 3 of this book deserve special mention. In Chapter 2, Cleaveland looks at the sort of domain analysis procedures that should be used to scope out a code generator. Chapter 3 is then a diary-style description of those procedures in action. I initially found these chapters out of place, but the further I read, the more I appreciated Cleaveland being explicit about the sort of thinking that ought to go on before any code is written or generated.

I had just finished Cleaveland's book when the second edition of Mark Lutz's Programming Python landed on my desk with a loud "thunk." At almost 1300 pages, this book is far bigger than any single volume ought to be; my back would have been a lot happier if O'Reilly & Associates had published it in two parts. On the positive side, everything in it is very useful. Most of the "Python 101" material from the first edition has been stripped out (or moved to Lutz and Ascher's excellent Learning Python, also from O'Reilly & Associates). What's left, and what's been added, is about as comprehensive as any book can be. If you want to build a full-sized application in Python, and don't want to reinvent any wheels, this book is worth the hernia.

I wish I could say equally good things about Cynthia Gibas and Per Jambeck's Developing Bioinformatics Computer Skills. Unfortunately, the authors try to cover so much of this emerging field that they don't do a satisfactory job of any one part. As the title suggests, the book's aim is to teach people with a background in the life sciences the skills they need to work with the vast volumes of data that are being produced by the Human Genome Project and related efforts. This is a laudable goal, but far too much for any one reader to get out of a single book.

Chapters 3-5, for example, cover setting up a Linux workstation, the UNIX filesystem, and the basics of UNIX shell scripting. These are all worthy topics, but the chapters fall between two stools: They are too short to be useful to people who don't already know the material, and too shallow to teach experienced users anything new. Similarly, I simply don't believe that the quick introduction to Perl in Chapter 12 will be comprehensible to anyone who doesn't already know the language well enough not to need it. That said, there is a lot of useful material in the middle of the book on what standard sequence matching engines do, and how they work. I suspect that much of this material will go stale fairly quickly (as everything web-related does), but it's still helpful to have it all pulled together this way.

I was similarly disappointed by John D. McGregor and David A. Sykes's Practical Guide to Testing Object-Oriented Software. The blurb on the back claims that the book "...shows how testing object-oriented software differs from testing procedural software." In fact, I came away feeling that it had instead shown how similar the two are. Take out the UML diagrams, and what you're left with are two key problems: how to know what you're actually testing, and how to do that testing systematically.

The authors' answer to the first problem is to nail down a tight specification of how the system being tested is supposed to act. There's a lot more formalism here than most working programmers are used to, but one of the strengths of this book is that it shows how rigorous analysis and design pay off in increased testability.

The authors tackle the second problem by describing the whole spectrum of QA working practices, from systematic development of traceable test cases to test drivers and instrumented builds. I think most working programmers would enjoy this part of the book more if it included more hands-on material. It's worth borrowing, but probably not something that most DDJ readers will keep on their desks.
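As a rough illustration of what "traceable test cases" run by a "test driver" can look like in practice, here is a minimal sketch. It is not taken from McGregor and Sykes's book: the `Stack` class, the `REQ-1` and `REQ-2` requirement IDs, and the `run_tests` driver are all hypothetical names invented for this example.

```python
# Hypothetical unit under test, invented for this sketch.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def size(self):
        return len(self._items)


def check_push_increases_size():
    stack = Stack()
    stack.push(1)
    return stack.size() == 1


def check_pop_returns_last_pushed():
    stack = Stack()
    stack.push(1)
    stack.push(2)
    return stack.pop() == 2


def check_pop_on_empty_raises():
    try:
        Stack().pop()
    except IndexError:
        return True
    return False


# Each test case names the (hypothetical) requirement it traces back to.
TEST_CASES = [
    ("REQ-1", "push increases size", check_push_increases_size),
    ("REQ-1", "pop returns last pushed item", check_pop_returns_last_pushed),
    ("REQ-2", "pop on empty stack raises", check_pop_on_empty_raises),
]


def run_tests(cases):
    """Run every case and group (description, passed) pairs by requirement ID."""
    results = {}
    for req_id, description, check in cases:
        try:
            passed = bool(check())
        except Exception:
            # An unexpected exception counts as a failure, not a crash.
            passed = False
        results.setdefault(req_id, []).append((description, passed))
    return results


if __name__ == "__main__":
    for req_id, outcomes in sorted(run_tests(TEST_CASES).items()):
        for description, passed in outcomes:
            print(req_id, "PASS" if passed else "FAIL", description)
```

Grouping results by requirement ID is the design choice that makes the cases traceable: a failure points back to the part of the specification it was written to check.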

DDJ