Columns


Doctor C's Pointers®

Data Structures, Part 7

Rex Jaeschke


Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementors of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA 22091 or via UUCP at rex@aussie.com.

Last month, Part 6 included the first half of a program to implement a doubly-linked list in dynamically allocated memory. This installment contains the remainder of that example. In addition, it shows a way to allow the user to enter commands as case-insensitive text, bypassing the numeric menu options used so far.

Listing 1, Listing 2, Listing 3, Listing 4, Listing 5, and Listing 6 contain the code. Figure 1 shows an example of some output.

A Better Command Interface

Selecting option numbers from a menu is not particularly exciting. Besides, the numbers corresponding to each option are not at all intuitive. A better approach would be to allow users to enter a command name. An even better solution would be to allow them to enter any leading part of that command name so abbreviations would be possible. Listing 7 shows the modifications necessary for this support. (The previous version is changed little except for main and help.)

All main does is to get the command from input, pass it to the lookup function, and call the function that lookup tells it to. To make the program even more tolerant, the commands may be entered using either uppercase or lowercase letters or both. Words in multi-word commands must, however, be separated by exactly one space (although that restriction could easily be eliminated too).

The declaration (and definition) of the function lookup is pretty tricky. It looks like the following:

void (*lookup(const char *pcmd))
     void)
lookup takes one argument (of type char *) and returns a pointer to a function that will be called to process the command passed to it. The function this pointer points to has no arguments and no return value. Since lookup has no business modifying the command string sent to it, the argument has the const qualifier.

The call to lookup is also fairly tricky. It looks like:

(*lookup(command)) ();
The command is passed to lookup which returns a function pointer. That pointer is dereferenced and the underlying function is called. Note that command validation has been relegated to the lookup function which is defined in Listing 8.

cmd_table is an array of structures each of which contains a command string and a corresponding processing function's address. This table is searched to find the matching command. Note that the commands are sorted alphabetically, in order to detect if a command is invalid without necessarily searching the whole table. (If the table were much larger, a binary search could have been used instead.)

If a command is invalid, the obvious thing to do would be to return a NULL pointer and test for it in main. However, this is unnecessary. Instead of treating invalid commands as something special we simply treat them as commands that are processed by a special "command" function. That is, when an invalid command is detected, we return the address of an error handling function:

void bad_cmd (void)
{
   printf("\nInvalid command. Please try again.\n");
}
The main function is now completely ignorant of whether a command is valid or not.

Since main won't call lookup for an empty string, we know the command consists of at least one character. By using strncmp for the comparison, we compare the whole input string with the leading part of each command in the table. Thus, the input commands a, add, Add n, and Add Node are all recognized as requesting the add node option. In fact, all commands can be uniquely identified in one character except for the dump ascending and descending orders. In this case, a command of d or dump would be interpreted as dump ascending since that command would be matched first. A command of at least dump d is needed to get the descending order. If you want to complain that a command such as d is ambiguous, you could have a third member in each structure containing the minimum length needed to make the command strings distinct and use that when doing the comparison.

The help function has now been modified as shown in Listing 9. For completeness, here then is the string conversion function as well.

void stolower(char *pstring)
{
   while ((*pstring = tolower(*pstring)) != '\0') {
      ++pstring;
   }
}

What Goes Around Comes Around

Oftentimes it is possible to use a recursive technique to traverse a list or tree. However, in many cases, the recursive solution is no more efficient. (It is probably less efficient, since it spends a good deal of its life calling and returning from functions.) Typically, the recursive solution is no more readable. In any event, Listing 10 shows a recursive version of dump_des_nodes. You can compare it to the earlier non-recursive version and judge for yourself.

Reader Exercises

The doubly-linked list program did not allow a node's string value to be modified. Think about what it would take to add this option. Here are some tips on what to consider.

1) If the new string value already exists, you need to add the old string value occurrence count to the new string value count and free up the old node and the old string.

2) If a node does not exist with the new string's value, allocate memory for the new string, make the current node point to it, copy the new string out there, free the old string space, remove the node from the current location in the list, and add it back according to the new string value. Don't free up old space, however, until you are guaranteed you have gotten the space for the new string.

3). You could look for short cuts. For example, if the new location is the same as the old there is no point in removing and reinserting the node in the same place. (One special case of this is modifying the only node in the list.)

4). Since duplicates are stored as one string with an occurrence count, when you change one "occurrence" you change them all. If you wanted, however, you could just change one and leave the original node alone with a decremented count.

A more interesting problem has to do with making the alphabetical order case-insensitive. That is, the strings ABC, abcd, and DEF should be stored in that order. However, the strings RED, Red, and red must be stored as distinct (and adjacent) strings not as duplicates.

At a first glance, it appears all you have to do is to write a case-insensitive version of strcmp and call that from locate_node instead of strcmp. However, this isn't the whole story. For example, if the list contained the string entry Red but not one for red and you tried to add the string red, locate_node would report that you got an exact match. Case would be ignored. However, this is incorrect since the new string is not a duplicate of the existing one. They differ in case. So you need a case-insensitive match when looking for the place to insert a new node, but you want a case-sensitive search when checking if a node already exists with a given string value.

To help you out, Listing 11 contains a case-insensitive version of strcmp. Note that it does not actually change either of its input strings. Rather, it makes uppercase copies of each character pair just before it compares them.