P.J. Plauger is senior editor of The C Users Journal. He is convenor of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Standard C Library, and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.
Internationalization is to the 1990s what portability was to the 1980s. I believe it is the next important technology for software developers to master to improve their return on investment. Portable software can move from machine to machine at less cost than it takes to rewrite it for each platform. Similarly, internationalized software can move from culture to culture without a major investment in adapting it to each locale. (Such adaptation is called "localization.")
Readers of my column "Standard C," and other writings, will know that I've been pursuing for some time the techniques needed to make Standard C a good vehicle for writing international applications. Programmers will need lots of guidance to sail these barely charted waters, more than I can hope to provide even with my ambitious writing plans. Thus, I've been waiting for some good books from others to appear on the subject. Sadly, this isn't one of them.
Uren, et al. give us a 300-page book, nearly half of which is appendices. The preface says the book aims to introduce internationalization and localization to translators and "anybody who designs, builds, markets, and sells any software product outside his or her own country." That, to me, promises a certain level of technical sophistication, both in the intended audience and in the material presented. That material comes in four major parts:
1: Issues
2: Internationalization and Localization for Western European Languages on the IBM PC
3: Other Computers and Other Languages
4: Business Aspects
The appendices that follow repeat some of the tabular data that has gone before. They also essay to explain all sorts of basic concepts of computer hardware and software.
On the whole, I find this a reasonable approach to presentation. Certainly, the IBM PC is the single most important platform for hosting applications, and Western Europe is the single most approachable follow-on market for many existing applications. It is fitting to focus on this particular combination and give others shorter shrift. But I must say that Part 4: Business Aspects is rather lame. It is an arm-waving overview of issues, few of which are discussed in adequate depth.
I began reading Part 1: Issues, with considerable interest. The authors discuss all the cultural surprises, large and small, that bite the naive programmer who first ventures into new cultures. There are obvious issues like extra characters with funny pigeon droppings on them. Not so obvious are all the interesting ways that cultures have adopted to format times, dates, and currency amounts. Downright subtle are the rules (often unarticulated) for ordering words in dictionaries and names in telephone books.
I almost enjoyed this part, except for a few niggling concerns. The first is that the authors seem unaware that all the problems they outline exist in some form well inside the borders of the good old US of A. We have subcultures of our own, thank you, ones large enough to justify tailored software packages. Documents use far more than the 95 characters of ASCII. Military and civilian times have different formats. Accountants have their own ideas about how to represent currency losses. And there are more ways to order text than are dreamed of in your philosophy.
The second concern is that the authors seem more bent on complexifying than on presenting a unifying framework for internationalizing software. (Much of the traffic on the WG20 reflector is also of this "betcha didn't know" variety, reveling in such esoterica as the changing rules for ordering the diphthong ij in Dutch.) It is sobering to note that cultures are endlessly inventive, but that doesn't meet our needs as software designers. We need to know how much adaptation will make people happy. My bet is that the world is a) starved for software that meets folks even halfway; and b) more tolerant than you'd expect of new conventions, particularly if they look to be usable worldwide.
The third concern took root near the end of Part 1, blossomed in Part 2, and exploded riotously by the end of the book. This concoction is chock full of errors. I'm not even talking about errors the authors may have copied from various reference manuals; they wisely begin with a disclaimer on that topic. Rather, I mean errors that show the authors' shallow research and general ignorance of programming.
At first it was a small thing. On page 30, the book tells us that "the term GMT is gradually being replaced by UCT, Universal Coordinated Time." The name is right but the initials should be UTC. (Don't ask why.) A small slip, as is the typo on the next page. But then I started paying closer attention, and chasing terms I knew well through the index. I soon learned that:
- VAX/VMS is a variety of UNIX (page 97), probably to the great surprise of Digital who wrote it as a counter to UNIX (or in indifference to that system, depending on whom you believe)
- the name UNIX derives from "UNICS, an acronym for Uniplexed Information and Computing Service" (page 206), never mind what Kernighan said when he coined the term
- "A very popular language nowadays is C; in some of its various dialects, it can perform both low-level operations at the bit level and, more normally, perform at a high level. It's possible to write some programs in a language like C; that in part, is almost indistinguishable from Pascal. However, Pascal allows for more manipulation of data structures than does C" (page 210, with bizarre punctuation copied verbatim).
Page 210 also has the most confused, and erroneous, description of object-oriented code you're likely to hear outside of a marketing-department pep rally. I'd quote it for your enjoyment, but I think you should get the drift by now.
Those are some of the howlers. I can assure you that more pedestrian errors also abound. Every time the book wanders into areas that I know something about, I find plenty to distrust. That doesn't build trust for me in areas where I need greater guidance.
It is clear that the authors are quick to opine about topics on which they are largely ignorant. They damn UNIX with faint praise and not-so-faint innuendo. They devote just one paragraph to the C Standard (p. 140), half of which is negative. Ironically, that paragraph acknowledges that Standard C is the only standardized programming language with support for the things the book discusses at length. Yet the authors apparently saw no need to present, or even understand, what Standard C has to offer.
Much of the support in Standard C for internationalization is instead attributed, rather vaguely, to UNIX. In some cases, the implication is strong that only Hewlett-Packard's HP-UX really does the job right. The "routines" listed on p. 183 are an undistinguished mishmash of POSIX utilities, Standard C functions, and HP-UX function collections all too vaguely described to be useful. The next page begins with the multibyte functions from Standard C, two of which are misspelled. And so on, and so on.
An even greater irony emerged when I went back to study Part 2 at greater length. The authors refer repeatedly to functions provided by Microsoft and others to internationalize DOS and Windows. If they ever mention that these functions are designed to be called from a C program, I couldn't find where. Instead, the text implicitly and repeatedly takes for granted that you will be writing your application in C. Not a bad role for a language that lacks the expressiveness of Pascal.
I'll stop before I get really nasty. You might want to buy this book if you're desperate for an overview of internationalization issues. It does touch most of the bases. On the other hand, you can't depend on it to give you a very accurate picture of any of the details. At least not if the sampler of errors I found is at all representative. I suspect it is.
Title: Software Internationalization and Localization: an Introduction
Author: Emmanuel Uren, Robert Howard, and Tiziana Perinotti
Publisher: Van Nostrand Reinhold, 1993
ISBN: 0-442-01498-8
Price: $38.95