November 1991/Doctor C's Pointers

Columns

Doctor C's Pointers®

Data Structures, Part 8

Rex Jaeschke

Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementors of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA 22091 or via UUCP at rex@aussie.com.

Circular Lists
In previous installments I have presented both singly- and doubly- linked lists. All of these lists have used some sort of sentinel to indicate the end of the list — typically a pointer with a value NULL. In the case of doubly- linked lists, a tail as well as a root pointer indicates the end of the list, permitting list traversals in reverse order. Other variations are possible. For example, consider a singly-linked circular list. Instead of the last node pointing to NULL, it points back to the first node. A root pointer can still point to one of the nodes. To process all the nodes in the list, you start at the one pointed to by the root and continue through the list until you return to the starting point. You know you have completed a cycle when the current node's forward pointer is the address of the starting node, assuming of course the node remained intact.
In the case of a doubly-linked list, the backwards pointer of the first node is the address of the last node, so the circle is completed in that direction as well. In this case, a special tail pointer is unnecessary since you can determine the tail from the root. That is, the root pointer points to the first node and the first node's backward pointer points to the last node. (Throughout this article, the first and last nodes refer to the first and last nodes in one cycle. A circular list doesn't actually have end nodes.)
The structure template for a circular list node is the same as for a non-circular list. The difference is that you no longer set or test for NULL terminators, but rather, check to see if you have gone "full circle."
Numerous kinds of applications require the entries in a list to be processed iteratively, either for a given number of iterations or indefinitely. One example is the list of active programs being executed by an operating system. In many cases, the nodes are organized in some order — chronological, alphabetic by name, or in some ascending or descending order of priority, for example.

The First Example
Listing 1 contains a simple application that displays a series of characters at specific cursor positions on the screen. (I will call each character and its position a point.) After displaying all points, the program waits a few seconds, clears the screen, and then redisplays all over again. In this example, the data never changes. In real-world applications, however, either or both the data and coordinates might change. Also, old points may be removed and new ones added. If any data has changed or some other action has caused part or all of the screen to be overwritten, the program must clear the display before refreshing it.
Note that the program always adds new nodes to the end of the existing list. The points are thus displayed in the order in which they were entered. By changing the order in which you give the X and Y coordinate pairs to the program, you can affect how the screen is drawn. For large and/or busy screens it might be best to draw the points in left-to-right and top-to-bottom order. This direction is certainly less distracting visually.
If your display device is not ANSI-compatible, you can substitute the corresponding escape sequences in the "clear screen" and "cursor position" macros.
The code in Listing 1 prompts the user to enter the character to be displayed, its screen line, and column coordinates. (Note that throughout, input validation is kept to a minimum or is non-existent. The purpose here is to outline the method only.) This process repeats until all points have been entered, followed by EOF. The display data is stored one point per node in a singly-linked circular list stored in dynamically allocated memory.
Once the list has been created, it is a trivial matter to cycle through it indefinitely, displaying the points. To test the program, run it and enter the following data:
1 1 a
2 2 b
3 3 c
..
20 20 t
The screen should display the letters a-t in a diagonal starting from the top-left corner. If your system supports I/O redirection, you can test the program with various data files.
I intended the wait function to suspend execution for five seconds. Though this function is not part of Standard C, your compiler library may provide something like it. In any event, you can make a crude version of wait using something like
void wait(unsigned int seconds)
{ void dummy(void);
 int i;

 for (i = 0; i COUNT; ++i)
    dummy();
}
void dummy(void)
{
}
By trial and error you can find a value for COUNT that gives you the required delay. Of course, this approach is crude, inconsistent, and compiler-dependent. In the past, people have written idle loops that contained only a null statement. Good compilers will optimize such code away since nothing needs to be done, resulting in no delay at all. For this reason the body of the loop in wait is a function call, which the compiler cannot easily optimize away unless it does inter-procedural analysis. That is, the compiler won't get rid of wait unless it examines the source of dummy and finds that it, in turn, does nothing.

Empty Vs. Non-Empty Lists
As with non-circular lists, the special case of an empty list can complicate the design. It so happens that in many cases you can eliminate this issue by ensuring the list has at least one entry. This "dummy" entry can never be removed and must not unduly impact the processing of the other nodes. That is, processing the dummy node shouldn't take any significant time and shouldn't adversely impact the program's environment (the state of the display, files, memory, etc.). On the other hand, you don't want to special-case this node since that is precisely what we're trying to get away from.
The changes to the code

static Node dummy_node = { &dummy_node, 1, 1, ' ' }; proot_node = &dummy_node; pprev_node = &dummy_node;
allocate the dummy node permanently and initialize it to display a space at position 1,1. Since the dummy node is the first node in the list, it is the first point displayed after the screen has cleared. As such, drawing a space on a blank screen has no visible effect. If you didn't clear the screen each cycle, you would want this node's character to be non-displayable and free of unwanted side-effects. (Some displays ignore cursor positioning attempts outside the valid coordinate range, while others provide a special position to display characters "off the screen.")
You can remove the tests for an empty list both in the list creation and at the start of the processing cycle.

The Second Example
This problem is a little more complex than the one in Listing 1. A set of predefined processes (pro1, pro2, and pro3) is implemented by corresponding functions. The user selects a set of steps, each of which has a process and priority. The list starts out with a dummy "do nothing" process and grows as the user defines new steps. The list is implemented as a doubly-linked circular list with nodes stored in descending order of priority. Since the dummy step has priority zero and user-defined steps must always have a priority greater than zero, the dummy process node is always last in the list cycle. If a new step's process has a priority higher than all others, the root pointer is made to point to it. That is, the root pointer can change at runtime. Nodes representing processes with the same priority are stored most recent first. A node is inserted in front of its priority duplicates.
To simplify the code, each step simply causes a function to be called that identifies itself and its priority. In reality, each node would contain data to be processed and the processing functions would act on that data. There is little validation; the same process can be used in many steps and at different priorities.

How It Works
The main function cycles through the list, continually executing the processes in order of priority. To add a new step, the user causes a SIGINT signal, usually by pressing CTRL/C or CTRL/D, depending on the operating system. The user-written signal handler proc_cmd then executes while main is temporarily suspended. proc_cmd prompts for an option. Option Two ("add step") causes a new node to be inserted in the list at the appropriate place. The signal handler returns, and main continues execution. Note that main could be executing anywhere in its endless loop when the signal occurs.
Actually, having the main routine do the display while a signal handler does the menu processing is an unusual approach. Normally, you would have the main program manage the user interface and have an asynchronous thread that runs in the background processing the list. I've done this kind of thing on VAX/VMS and other systems, but since the code uses operating-system-specific library calls, I have chosen to present the alternate (and portable) solution here.
In Listing 2 the dummy header node is permanently allocated and is initialized to point to itself in both directions. Its priority is zero (the lowest), and its process is idle, which normally does nothing but return to its caller. The processing functions receive their priority as an argument.
Adding a node in a circular list is simple so long as the list always contains at least one node. Since you never insert a start or end node before the first or after the last node, adding a node is just the same as inserting between two nodes. Every node is always between two nodes, even the only node in a one node list. See Listing 3.
The version of locate_node in Listing 4 is much like the one I've presented in earlier installments. This version handles integer values, and the list is sorted in descending order instead of ascending. The get_free_node in Listing 5 is the same as before and is included for completeness.

Further Work
The second example provides only one production option — adding a new node. Of course, you would probably want a few more options, such as removing a node and changing the priority and/or process of a node. A potential problem arises in this program in that the process functions and the signal handler functions perform I/O to the same stream asynchronously. This may or may not work correctly depending on whether these I/O routines are reentrant.
A complication I alluded to earlier was that of determining when you have gone full-cycle. If the node pointed to by the root is removed while you are cycling though the list, how will you know you have gotten back to the start? The same problem occurs if a new node is added in front of the old start node. Since nodes can come and go, using their addresses to determine end-of-cycle is of no use: a new node could have the same address of a previously deleted node, for example. In the second example you could use the priority value. When the next link's priority is greater than that of the current link, you know you've returned to the start. (Interestingly, this doesn't hold true for a list containing only the idle node but then it probably doesn't matter either.)
As the list exists now, it may contain consecutive nodes with the same priority. For some applications it may be desirable for the primary list to contain only the first occurrence of each duplicate. Though the circular list will contain fewer nodes, each node in that primary list will have a forward pointer to the next node with the same priority. This definition for Node appears in Listing 6.
Each primary node is the head node of a list of duplicate priority nodes. Depending on your needs, these secondary lists could be singly- or doubly- linked, and circular.
Consider the case where the root pointer points to the dummy node instead of to the node with highest priority. Since the dummy node always exists and in the same location, and since it always points forward to the first node in the list, it may be possible to do away with updating the root pointer all together.
No doubt you can think of other ways to extend these examples. Note, too, that once you get into asynchronous operations, you may want to use the volatile keyword to disable certain optimizations. Changes made asynchronously should typically be reflected elsewhere in the program immediately. You might also need to use semaphores to block the cases where you try to remove a node that is currently being processed.