Book Reviews


Data Structures Using C

Reviewed By John Ridge


John Ridge works for Questar Corporation and currently writes C programs under Windows for image and data acquisition and analysis. He can be reached at Questar, 608 Ridgedale Road, Dayton, OH 45406.

As the title suggests, the book Data Structures Using C, by Aaron Tenenbaum, Yedidyah Langsam, and Moshe J. Augenstein, provides an introduction to information organization using C. According to the authors, the book is intended for use in a two-semester course. The topics it covers include data structures such as lists, trees, stacks, and graphs. It also discusses sorting, searching, and list management. Typical of most textbooks, there are exercises at the end of each chapter and most major sections.

An earlier work by Messrs. Augenstein and Tenenbaum, Data Structures and PL/1 Programming, covers essentially the same material as the C book, except that PL/I was the language used. I compared the two editions and found that most of the introductory chapters and much of the supporting text was the same. I was delighted that all references to punched cards have been removed from the more recent volume.

Most chapters contain plenty of source code, and provide diagrams when needed. Example programs are, of course, written in C. The authors were careful not to use any compiler-specific features in the sample code. The C code in the new book is essentially a direct translation of the code in the PL/1 volume.

Each chapter in the book builds upon topics discussed in previous sections. This process makes the book easy to follow and reinforces key concepts and ideas.

Chapter One examines the ideas of information and meaning and relates these concepts to their computer representation as bits and bytes. It shows the bit representation of integers, characters, strings, and numbers, including IEEE floating point. In this chapter the authors examine C data types such as arrays, characters, structures, and unions, and explain the relationship of these data types to the ideas of information and meaning. There they introduce pointers, and clearly explain storage allocation and scope of variables.

The authors introduce the concept of the abstract data type in Chapter One. They use the abstract data type and its associated functions to provide the program designer with a machine-independent method for visualizing a problem. Implementation of abstract data types using C data types, and their limitations, are key themes throughout the book.

Chapter Two introduces the reader to the stack. The authors explain all basic stack operations such as PUSH, POP, EMPTY, etc. and give examples. Here they demonstrate the stack as an abstract data type. The C implementation shown uses arrays as the data type for the stack. They present infix, postfix, and prefix notation as examples of stack usage.

Chapter Three covers recursion. As is typical of most treatments of the subject, the authors use the factorial function as the primary example of a recursive algorithm. They also give recursive implementations of the Fibonacci Sequence and the binary search.

The authors show that recursion is allowed in C and is made possible by C's use of an internal stack. This point is a good example of the concepts outlined in Chapter Two.

This chapter also shows the Towers of Hanoi problem, and lists the code for the complete C program. The authors point out that an iterative solution is generally better than a recursive solution due to system overhead. Recursion is then simulated using non-recursive techniques.

Finally, the efficiency of recursion is discussed.

Chapter Four introduces queues and lists. Both are defined semantically and as abstract data types. It provides a C implementation of each type. The specific types of queues discussed are ascending and descending priority queues. It disucsses linked lists in the context of a solution to the problem of fixed arrays used in the structures of the previous chapter. It covers the generic implementation of linked lists and shows the linked implementation of stacks. Queues are an improvement over a fixed array implementation.

The actual C implementation of linked lists includes a more in-depth discussion of malloc, sizeof, etc. The authors explain dynamic allocation and freeing of data objects. They introduce doubly-linked lists as an alternative to circular-linked lists.

To demonstrate the topics introduced in this chapter, the authors simulate a bank waiting line. They define the data structures and write the program around the data structures. This exercise provides a good real-world example of the subjects discussed.

Chapter Five discusses the tree data structure, with the binary tree receiving most of the coverage. The authors use trees to implement inorder, preorder, and postorder arithmetic operations. They provide C functions for creation, traversal, and manipulation of trees.

The authors implement a recursive solution to the inorder traversal of a binary tree to demonstrate that the recursive solution can be more efficient than a non-recursive alternative. They also introduce threaded binary trees as a method to improve the efficiency of tree traversal.

A particularly good example provided is that of using trees to implement the Huffman algorithm, which is used for encoding bit strings. They give code to encode data using this algorithm.

Another interesting application of trees is that of game trees. The authors build a tic-tac-toe program to demonstrate move look-ahead, and the minimax method of determining the best move.

Chapter Six covers the topic of sorting. The authors define basic sorting terms such as file, record, and key. They explain sorting efficiency especially well, and introduce the 0 notation as a method of analyzing sorting efficiency.

This chapter discusses a variety of sorting techniques including bubble sort, quicksort, selection sort, binary tree sort, heapsort, insertion sort, Shell sort, merge sort, and radix sort. In all cases, the authors discuss the efficiency of each sort type, and show the trade-off between file size and sort speed. Heapsort and quicksort are especially well covered.

They describe the sorting methods and implement them using the topics and data structures introduced in the previous chapters. This helps the book flow.

Chapter Seven discusses searching techniques, and is the longest chapter in the book. The authors repeat the definitions of file, record, and key and expand upon these definitions. They discuss primary and secondary keys. They define the search table or dictionary through use of an abstract data type.

A key point made in this chapter is that efficient sorting depends on specific ordering of the data. The first section of the chapter presents the sequential search, indexed sequential search, binary search, and interpolation search.

The next two sections provide a rather detailed coverage of tree searches. They demonstrate tree searching, inserting, and traversing the trees. The authors cover binary and non-binary search trees. Since the efficiency of tree searching depends on the balancing of the trees, they discuss various balancing methods. These methods include B-Trees, B+ trees, and digital search trees. In all examples of tree searching, the authors discuss efficiency considerations.

The last section of this chapter discusses hashing as a search method. The authors cover design of a good hashing function as well as various methods for resolving hash clash. This section covers hashing and its efficiency extensively.

Chapter Eight introduces the topic of graphs. The authors introduce the subject by the application of graphs to determine if a path exists between one city to another based on the existing roads connecting the cities. They devise an algorithm, and give the C data structures necessary to implement a graph solution.

Another interesting real-world example given is that of the shortest path problem used to solve water flow in a water pipe system. The goal in the example is to maximize water flow. The authors give a C program to find the shortest path between two nodes using a weighted graph.

The authors describe the implementation of graphs using linked lists, and discuss the efficiency of this representation. They use a scheduling problem, cooking an egg, to illustrate concepts presented. (The graph of this problem is on the cover of the book.) They also discuss graph-traversal methods.

Chapter Nine discusses storage management as performed by an operating system. This chapter is especially interesting because it ties together all material presented in the previous chapters, and it is a solid introduction to the various methods used by operating systems to manage data requests.

Since linked lists are often used to manage data objects, the authors reintroduce the linked list and discuss it in greater detail. They demonstrate memory management techniques such as allocation, compaction, and garbage collection.

The authors describe dynamic memory management by showing how blocks of memory that vary in size can be managed by the system. They illustrate and compare compaction, best-fit, and worst-fit methods. They also discuss the boundary-tag method and buddy system as efficient liberation methods.

The book includes an especially complete bibliography which contains references to many of the major works of the computer discipline. It is an excellent source of reference listings for the topics discussed in Data Structures in C, as well as the general topics of data structures and algorithms.

Conclusion

This text is a solid introduction to the basic data structures used in computer science. It is logically organized and well-written. The use of C in the program examples should certainly make the book interesting to the large and growing number of C programmers.

Those readers who already have the PL/I version of the book may not want to purchase the new volume, as the essential concepts are well covered in both works. For programmers looking for an excellent reference on data structures, or for computer students whose curriculum requires this book, I recommend adding it to your bookshelf.

Data Structures Using C
Tenenbaum, Langsam, and Augenstein
ISBN: 0-13-199746-7
Prentice-Hall, 1990
662 pages, $47.00.