PROGRAMMER'S BOOKSHELF

Graphics Gems and Fractal Compression

Ray Valdés

This "Programmer's Bookshelf" covers two graphics books--both a bit expensive, somewhat specialized, but definitely worth your consideration.

The first is unreservedly recommended and has the most general appeal. Graphics Gems III is part of a series initiated by Andrew Glassner of Xerox PARC. The current volume, edited by David Kirk, follows on the heels of two successful volumes (one of which was recommended by Michael Abrash in his July 1992 "Graphics Programming" column), with a fourth planned for later this year.

Glassner's concept resembles DDJ's mission: to provide "a collection of tips, techniques and algorithms for the practicing computer_programmer. Many ideas that were once passed on through personal contacts or chance conversations can now be found here_. Many are illustrated with accompanying source code." The books consist of short, accessible bits of software expertise that can be used immediately in solving specific, practical problems--fast stretching of a bitmap, joining two lines with a circular arc fillet, computing the intersection between a triangle and a cube, and the like.

There are no tedious tutorials, no mention of hot products or fashionable paradigms--just no-nonsense, proven chunks of software technology, embodied in either pseudocode, C/C++, or sometimes just in narrative form. To make use of these gems, you need a background in computer graphics, but many items are self-contained, and most are lucidly explained. Chances are, if you know you need a particular item, you'll have the background to understand its exposition. If your application programs rely on an API provided by a graphics library, then you probably won't need this book; in such a case, the coverage here is of interest only if you want to know what's happening on the other side of the API boundary.

Volumes subsequent to the original Graphics Gems are the result of contributions by numerous programmers from both the research, industrial, and university communities. Volume III, for instance, consists of items from about 60 different contributors, subdivided into categories such as image processing, numerical programming, modeling, rendering, ray tracing, and radiosity. Space limits preclude mentioning more than a few items: quaternion interpolation, fast random rotation matrices, interpolating Bézier curves, decomposing linear and affine transformations, fast circle clipping, partitioning a 3-D convex polygon with an arbitrary plane, converting Bézier triangles into rectangular patches, ray tracing with a BSP tree, and antialiasing polygon edges using bit-masks. (For readers interested in contributing to future Graphics Gems, each volume includes a postcard to request an author's packet.)

Glassner's goal of capturing the oral history of high-end computer graphics into a written record is extremely worthy. I recall 12 years ago when I was a member of a team designing a high-end electronic-publishing workstation from the ground up, 1.5 million lines of code in all. Some of us had strong backgrounds in graphics research, others had lengthy programming experience. Even so, we had to struggle to solve many problems that fell in between theoretical research and workaday programming; for example, efficiently scan-converting a rotated ellipse or a Bézier curve. The standard graphics textbooks of the day--such as Foley and van Dam, or Newman and Sproull--provide the basic equations, but neglect critical practical aspects (such as using recursive subdivision for Bézier scan-conversion). Because these topics are not substantive enough to be merit coverage in the research journals, we had to invent our own solutions, or ask other working graphics programmers. In many cases we succeeded, in others we did not. As Glassner says in the foreword, "[With Graphics Gems] we avoid reinventing the wheel, and by sharing this information, we help each other move towards a common goal of amassing a body of useful techniques to be shared throughout the community."

All the C and C++ code in the series is in the public domain, and you don't even have to buy the books to get it; you can just ftp it from Internet sites such as princeton.edu (in the pub/Graphics directory) or weedeater.math.yale.edu. For those without Internet access, the current volume comes with either a Macintosh or IBM format diskette containing code from all three volumes.

Fractal Image Compression, by Michael Barnsley and Lyman Hurd, is of interest to a much smaller audience, but has been eagerly awaited by that group of people. Many of us have been reading for years how iterated function systems (IFS) can be used for fractal compression of scanned images, resulting in compression ratios of 2500:1 or more.

Much of the allure is that the task of generating an image from a given IFS specification is very simple, and can be expressed in about 12 lines of C code and 16 floating-point numbers. Because the images are fractal in nature, the resolution is effectively infinite--the closer you look, the more detail is there.

The hard part is arriving at the small set of numbers that characterize a given image. Or as Barnsley puts it:

If our real world image is one of many basic shapes, such as a leaf or a letter of the alphabet, or a black-and-white silhouette of a fern, or a black cat sitting in a field of snow, or a rook's feather on a white starched sheet, or a black crack in a white teacup, or a snowflake lying on a frozen lump of coal, or a circle or a square, or a Julia set, or the outline of a pine tree or many pine trees against the skyline at dusk, or a component of an image received, or to be sent, via a fax machine, or the graph of a complicated function, or one of a multitude of familiar shapes and forms such that a model is appropriately made in black-and-white alone; in such cases, one can achieve fractal image compression via the IFS compression algorithm, which is an interactive image modeling method based on the collage theorem. The IFS compression algorithm starts from a target image T which lies in [the support]. An affine transformation wi(x)=[matrix formula omitted]=Aix+ti is introduced_"

Although the preceding reads like Finnegan's Wake blended in with a mathematics textbook, most of the book is less verbose and highly technical, requiring a background in topology, measure theory, information science, and image processing. This despite the fact that the authors spend the early chapters providing formal definitions of basic concepts such as metric space, affine transformation, Hausdorff space, and Hutchinson metric. Even so, there aren't enough pages left to define additional terms such as Borel-measurable function, which are used throughout the discussion.

Those who have heard Barnsley speak shouldn't be surprised. He's a brilliant dynamo of ideas and allusions, with enough incandescent brainpower to heat a small winter cabin, his body going through various geometric transformations as he steps lightly from one subject to another. This pedagogical style parallels his approach to handling images, which is to take bits and pieces of an image and paste them in altered form at different places, repeating the process until a sufficient level of detail is achieved. Although I always walk away from a lecture having learned something, I'm also left wondering whether the whole story was presented, or only a lower-resolution "lossy" version, or perhaps it is just that I don't understand.

So how is this hard-to-believe magic achieved? This is the book in which Barnsley promised to open the kimono and reveal the secret of the fractal transform, now that his patent has been approved. The results are mixed. There's a lot of interesting material here but the secret is only partially revealed.

I don't think it's just me, either. Thomas Colthurst, who teaches an MIT class on Advanced IFS Theory, and who has come up with his own methods for fractal image compression, posted his reaction to this book in the comp.graphics newsgroup on the Internet:

Rather a disappointment. It contains precious little about how Barnsley's fractal transform method works, and what it does reveal is nothing more than a variant of the stuff Jacquin and Yisher, Boss, and Jacobs have been doing for a couple of years now. Sometimes it seems like the main point of the book is to allow Barnsley to wave his patents (#5,065,447 and #4,941,193) around.

Colthurst's expectations are probably higher than that of most readers. For me, there was enough substance in the book to merit struggling through it, but the list price should certainly give you pause.

Another way in which the book is frustratingly incomplete is that much of the discussion depends upon theorems such as the IFS, Condensation, Collage, Hutchinson, Elton, and Attractor Computation theorems, which are stated without proof. In all these cases, the reader is referred to Fractals Everywhere (Barnsley's earlier book) for the proofs. Given the high price of the book, and given the presence of somewhat irrelevant material like sections on Huffman coding and Discrete Cosine Transform, the proofs should have been provided in context.

Further confusing matters is the imminent appearance of another book by Barnsley (this time co-authored with Louisa Anson), entitled The Fractal Transform, but not yet available. This collage of coverage parallels the development of IFS technology, which from this vantage point was undertaken by a collage of overlapping companies, all generally within the domain of Barnsley.

Moreover, in contrast to the generous collegial spirit behind Graphic Gems, I counted about a half dozen references to Patent #5,065,447, followed by the repeated admonition: "If you wish to set up an image compression system on a digital computer using the fractal transform, you should apply for a license to: Licensing Department, Iterated Systems, Inc., 5550-A Peachtree Parkway, Norcross, GA 30092." Keep in mind that patent law restricts your use of a particular technology even if you have independently come up with the same algorithm or program.

So how does the fractal transform algorithm work? Ironically, I found the best summary in Colthurst's Internet posting:

[It works] by dividing the image into domain blocks (typically an 8x8 square of pixels), and range blocks (bigger than the domain blocks, say 16x16 pixels) and finding local iterated function system transforms (LIFS transforms) from range blocks to every single domain block. The LIFS transforms will typically translate the image, scale it by a fixed factor (in our case of transforming 16x16 blocks into 8x8 blocks, it shrinks each dimension by one-half), rotate and/or flip it (giving the eight symmetries of the square), and multiply the color or gray scale intensities by some factor so that the average intensities of the domain and range blocks match. The trick, of course, is to find the best range block for every domain block, and to do this quickly. Barnsley does not address this issue at all, and the C code which he provides merely implements the brute force algorithm (it computes the [Hausdorff] distances between every range block and every domain block).

Fractal Image Compression

Michael Barnsley and Lyman Hurd

AK Peters, 1993

244 pp. $49.95

ISBN 1-56881-000-8