December 1991/Doctor C's Pointers

Columns

Doctor C's Pointers®

Data Structures, Part 8: Circular Lists

Rex Jaeschke

Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementors of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA 22091 or via UUCP at rex@aussie.com.
Previous installments have described both singly- and doubly-linked lists. All of these lists have used some sort of sentinel to indicate the end of the list, typically, a pointer with value NULL). Doubly-linked lists used a tail as well as a root pointer, for traversing the list in reverse order. Other variations are possible. For example, consider a singly-linked list. Instead of the last node pointing to NULL, why not have it point back to the first node? Such a list is called a singly-linked circular list. A singly-linked circular list can still have a root pointer pointing to one of the nodes and can process all the nodes in the list by starting at the one pointed to by the root, then continuing through the list until it gets back to the starting point. You know a cycle is complete when the current node's forward pointer is the address of the starting node, assuming, of course, the node had remained intact. In the case of a doubly-linked list, the backwards pointer of the first node would be the address of the last node so the circle would also be completed in that direction. A doubly linked list does not need a special tail pointer since the tail can be determined from the root. That is, the root pointer points to the first node and that node's backward pointer points to the last node. (Throughout this article, the first and last nodes refer to the first and last nodes in one cycle. In actual fact, there are no end nodes in a circular list.)
The structure template for a circular list node is the same for a non-circular list. The difference is that in a circular list you no longer set or test for NULL terminators but, rather, check to see if the list has gone "full circle."
There are numerous kinds of applications where the entries in a list are to be processed iteratively either for a given number of iterations or indefinitely. One example is the list of active programs being executed by an operating system. In many cases, the nodes are organized in some order. It might be chronological, alphabetic by name, or in some ascending or descending order of priority, for example.

The First Example
To get the feel of a circular list, Listing 1 starts out with a simple program that displays a series of characters at specific cursor positions on the screen. (I call each character and its position a point.) After all points have been displayed, the program waits a few seconds, clears the screen, and then redisplays all over again. In this example, the data never changes. However, in real-world applications either or both of the data and coordinates might change. Also, old points may be removed and new ones added. If any data has changed or some other action has caused part or all of the screen to be overwritten, the display must be cleared before being refreshed.
Note that the nodes are always added to the end of the existing list so the points are displayed in the order they were entered. As such, you can affect the way in which the screen is drawn by changing the order in which the X and Y coordinate pairs are given to the program. For large and/or busy screens it might be better to draw the points in left-to-right and top-to-bottom order. It certainly would be less distracting, visually. If your display device is not ANSI-compatible you can substitute the corresponding escape sequences in the "clear screen" and "cursor position" macros. Initially, the user is prompted to enter the character to be displayed and its screen line and column coordinates. (Note that throughout, validation is kept to the minimum or is non-existent. The purpose here is to outline the method only.) This process is repeated until all points have been entered followed by EOF. The display data is stored one point per node in a singly-linked circular list which is stored in dynamically allocated memory. Once the list has been created it is a trivial matter to cycle through it indefinitely, displaying the points, as in Listing 2.
A simple test is to enter the following data:
1 1 a
2 2 b
3 3 c
..
20 20 t
The screen should show the letters a-t displayed in a diagonal starting from the top left corner. (If your system supports I/O redirection you can test the program with various data files.)
The wait function is intended to suspend execution for five seconds. This function is not part of Standard C although something like it may be provided in your compiler's library. In any event, you can make a crude version of wait using something like the following:
void wait(unsigned int seconds)
{
   void dummy(void);
   int i;
   for (i = 0; i < COUNT; ++i)
      dummy();
}

void dummy(void) { }
By trial and error you can find a value of COUNT that gives you the required delay. (Of course, this approach is crude, inconsistent, and compiler-dependant.) In the past, people have written such idle loops and had them contain only a null statement. Such code will likely be optimized away since nothing needs be done, resulting in no delay at all. For this reason the body of the loop is a function call which the compiler cannot easily optimize away unless it does inter-procedural analysis. (That is, unless it examines the source of dummy and finds that it, in turn, does nothing.)

Empty Versus Non-Empty Lists
As with non-circular lists, the special case of an empty list can complicate the design. But, in many cases you can eliminate this issue by always making sure the list has at least one entry. To do this invent a dummy entry and make sure it can never be removed, and that it also does not unduly impact the processing of the other nodes. That is, when that dummy node is processed it shouldn't take any significant time and it shouldn't adversely impact the program's environment. For example, it should not change the state of the display, files, memory, etc. Don't special-case this node since that's what we're trying to get away from.
The changes to the code are quite small and are as follows:

static Node dummy_node = { &dummy_node, 1, 1, ' ' }; proot_node = &dummy_node; pprev_node = &dummy_node;
The dummy node is allocated permanently and initialized to display a space at position 1,1. Since this is the first node in the list it is the first point displayed after the screen is cleared. As such, drawing a space on a blank screen has no visible effect. If the screen were not cleared each cycle you would want this node's character to be non-displayable and also to not cause any unwanted side-effects. (Some displays ignore cursor positioning attempts outside the valid coordinate range while others provide a special position to display characters "off the screen.")
The tests for an empty list can be removed both in the list creation and at the start of the processing cycle.

The Second Example
Listing 3 begins a problem that is a little more complex. There are a set of predefined processes (pro1, pro2, and pro3) each of which is implemented by a corresponding function. The user selects a set of steps each of which has a process and priority. The list starts out with a dummy "do nothing" process and grows as new steps are defined by the user. The list is implemented as a doubly-linked circular list with nodes stored in descending order of priority. Since the dummy step has priority zero and user-defined steps must always have a priority greater than zero, the dummy process node is always the last in the list cycle. If a new step's process has a priority higher than all others, the root pointer is made to point to it. That is, the root pointer can change at runtime. Nodes representing processes with the same priority are stored most recent first. That is, a node is inserted in the list in front of any of its priority duplicates. To simplify the code, each step simply causes a function to be called that identifies itself and its priority. In reality, each node would contain data to be processed and the processing functions would act on that data. There is little validation; the same process can be used in many steps and at different priorities. Again, the intent here is to introduce the concept and provide a simple demonstration. The way this program works is that the main function simply cycles through the list continually executing the processes in order of priority. To add a new step, the user causes a SIGINT signal. (This is most often done by pressing CTRL-C or CTRL- D, depending on your operating system.) This results in the user-written signal handler proc_cmd being executed while main is temporarily suspended. proc_cmd then prompts for an option of which option 2 means "add step." This results in a new node being inserted in the list at the appropriate place, the signal handler returning, and main continuing execution. Note that main could be executing anywhere in its endless loop when the signal occurs.
Actually, having the main routine do the display while a signal handler does the menu processing, is an unusual approach. Normally, you would have the main program manage the user interface and have an asynchronous thread running in the background processing the list. I've done this kind of thing on VAX/VMS and other systems but since such code uses operating-system-specific library calls I have chosen to present the alternate (and portable) solution here.
The dummy header node is permanently allocated and is initialized to point to itself in both directions. Its process is 'idle' (which normally wouldn't do anything but return to its caller) and its priority is zero, the lowest. The processing functions are passed their priority as an argument. The purpose of this is simply so they have something to display when the program is executed. See Listing 4.
Adding a node is much simpler with a circular list so long as it always contains at least one node. Since there is never a start or end node inserting before the first or after the last node is just the same as inserting between two nodes. Every node is always between two nodes, even the only node in a one node list. See Listing 5.
Listing 6 contains a version of locate_node much like that seen previously; it just handles integer values instead and the list is sorted in descending order instead of ascending.
Listing 7 contains get_free_node. It is the same as before and is included for completeness.

Further Work
The second example provides only one production option, that is, to add a new node. Of course, we would probably want a few more such as remove a node and change the priority and/or process of a node. There is also a potential problem in this program. The process functions and the signal handler functions perform I/O to the same stream asynchronously. This may or may not work correctly depending on whether these I/O routines are reentrant.
A complication alluded to earlier was that of determining when you have gone full-cycle. If the node pointed to by the root is removed while you are cycling though the list, how will you know you have gotten back to the start? The same problem occurs if a new node is added in front of the old start node. Since nodes can come and go, using their addresses to determine end-of-cycle is of no use: A new node could have the same address of a previously deleted node, for example. In our example we could use the priority value. When the next link's priority is greater than that of the current link we know we are back at the start. (Interestingly, that doesn't hold true for a list containing only the idle node but then it probably doesn't matter either.) As it exists now, the list may contain consecutive nodes with the same priority. For some applications it may be desirable for the primary list to contain only the first occurrence of each duplicate. How could this list be represented? Well, we still need our circular list; it will just contain less nodes. However, each node in that primary list would contain a forward pointer to the next node with the same priority. That is, the definition for Node might now be that in Listing 8.
Each primary node then is the head node of a list of duplicate priority nodes. And depending on your needs, these secondary lists could be singly- or doubly-linked, and circular.
Consider the case where the root pointer points to the dummy node instead of to the node with highest priority. Since the dummy node always exists and in the same location and it always points forward to the first node in the list, it may be possible to do away with updating the root pointer all together.
No doubt you can think of other ways to extend these examples. Note too that once you get into asynchronous operations you may need to start using the volatile keyword to disable optimization of certain object accesses. Changes made asynchronously should typically be reflected elsewhere in the program immediately. You might also need to think about using semaphores to block the cases where you try to remove a node that is currently being processed.