Steve Halladay and Mike Wiebel currently work for Storage-Tek in Louisville, CO. Steve and Mike have over ten years of multi-threaded programming experience. Steve holds an M.S. in computer science from Brigham Young University. He is also the author of the multi-threaded software Multi-C and CSIM. Mike has a B.S. in computer science from Colorado State University. You can contact Steve or Mike at (303) 673-6683.
Though trade journals have given significant attention to multi-threaded programming, especially the use of C's setjmp and longjmp facilities, there remains a great deal of confusion in the practical application of multi-threaded techniques. In fact, the examples used to demonstrate multi-threaded functionality are often contrived or unrealistic.
Multi-threaded programming can in fact be applied to everyday programming problems. This article presents one such problem and shows how multi-threaded programming simplifies the program solution.
Parallel Recursive Data Structure Access
Several years ago we undertook the building of a large text retrieval system. The retrieval system would read the text of documents and classify them in terms of subject matter. The process of reading through the documents was processor intensive and normally run in batches after hours.As the retrieval system reads the text, it creates a recursive data structure representing the subject matter of each document. After reading a batch of documents, the system would traverse all the data structures to write them to disk as though they were a single virtual data structure.
The nature of the structure made it desirable to use recursion to traverse the data structure. Recursion, however, would prevent traversing a single data structure concurrently with the others. We combined multi-threaded constructs with a simple recursive algorithm to traverse many data structures concurrently.
Consider, for example, the problem of concurrently traversing multiple binary trees. An individual binary tree is a data structure that lends itself to recursive traversal. However, multiple binary trees make using recursion difficult. In Figure 1, the pre-order traversal of the first binary tree results in visiting nodes 1, 5, 7 and 9. Pre-order traversal of the second binary tree would visit 3, 6, 8 and 10. To traverse both trees as if they were one virtual tree, you would want to visit the nodes as follows: 1, 3, 5, 6, 7, 8, 9 and 10.
The Single Threaded Approach
The program in Listing 1 constructs multiple binary trees that contain random values and then traverses the trees as if they were one virtual tree. Listing 1 contains both the single threaded and multiple threaded examples. The programmer selects the single thread version by compiling the program with the SINGLE_THREAD macro defined.The main function calls BuildTree to create each random binary tree. Binary trees consist of NODEs, each of which has left and right sub-tree pointers, and a key. Once created, the trees array has the address of the root of each tree in the array.
BuildTree creates the random binary trees by generating random keys. BuildTree inserts the random keys into the tree by calling AddKey.AddKey then performs a standard recursive binary tree insertion by traversing left or right depending upon the magnitude of the key value.
Once main creates the random binary trees, MergeTrees traverses them. In the single threaded example, MergeTrees starts at the root node of each tree and moves iteratively from node to node. The single threaded example first moves to the left-most node, pushing those nodes onto a stack. The for and while loops within MergeTrees perform the initial traversal to the left-most node of each tree.
After the single threaded version locates the left-most node of each tree, MergeTrees iteratively finds and prints the lowest key until it has visited all nodes of all trees. To find the lowest key, MergeTrees calls DisplayLowest, which compares the key of the node from each tree and prints the lowest of the keys. Once DisplayLowest prints the lowest key, MergeTrees calls TraverseRightSub.
TraverseRightSub locates the next lowest key within the tree. It moves as far left as possible from the current node's right sub-tree. If no right sub-tree exists, TraverseRightSub will at least pop the stack, causing the parent node to be on the top of the stack. In any event, the effect of TraverseRightSub is to leave the stack with the next lowest value on the top of the stack. This traversal allows DisplayLowest to continue to operate as previously described.
A Multi-Threaded Example
The multi-threaded approach to traversing many concurrent binary trees is a simple extension to traversing a standard single tree. The idea is to traverse each tree individually using the standard recursive algorithm. A separate thread traverses each tree. Before a node in a tree can be visited, its key must be compared with the keys of all other trees about to be visited. The nodes with the lower keys are visited first.Listing 1 also contains a multi-threaded approach. The #else section of the listing contains this example. Compiling the p rogram with the SINGLE_THREAD macro not defined yields the multi-threaded example. This example uses the Multi-C library, which supports portable, non-preemptive multi-threaded constructs such as thread creation, scheduling, inter-thread communication, etc.
The multi-threaded solution creates the random binary trees using the same procedures as in the single threaded version. However, it uses a different implementation of MergeTrees.
In the multi-threaded example MergeTrees creates a separate thread for each binary tree by calling MtCCoroutine, a Multi-C function that creates a new thread of execution. Each new thread begins executing in the PrintTree routine. When the new thread completes the PrintTree routine, it will terminate execution and yield the processor to other threads. The original thread that created the sub-tree threads will continue to execute to completion.
When the original thread completes by returning from main, the sub-tree threads will begin execution in PrintTree, a standard recursive binary tree traversal routine. The first thread created will be the first to start executing. It first traverses the left sub-tree, then visits the current node by calling PrintNode. Finally PrintTree traverses the right sub-tree.
PrintNode guarantees that the node about to be visited has the smallest key of any of the trees' current nodes. PrintNode compares its current key to the other trees' keys using the static variable CurrentKey. When using the Multi-C library, each thread has its own stack, so auto variables are thread-specific. The remainder of memory is accessible to all threads. Therefore threads share static variables (though they are local to a procedure) and can use them as a simple means of inter-thread communication. An alternative solution to inter-thread communication would be to use Multi-C's thread mailboxes. Thread mailboxes allow threads to receive information from other currently executing threads.
CurrentKey is initialized with a number that is the largest integer so that all keys will be less than it. As threads execute PrintNode, each thread compares its key to CurrentKey. If the thread's key is less than CurrentKey, the thread makes its key the CurrentKey and calls MtCYield. MtCYield is a Multi-C library function that causes the thread to suspend execution and allows another thread to begin. The yielding thread resumes execution when MtCYield returns.
As threads yield, the Multi-C scheduler reschedules them for execution using a round robin algorithm. CurrentKey will equal the thread's current key when all other threads have executed and it is the lowest of the threads' keys. Threads wait for this condition to occur while executing a loop within PrintNode. PrintNode finally visits the node when the executing thread's key is equal to the CurrentKey. Then recursion within the thread's tree continues.
Comparing Complexities
To compare the complexities of the two approaches quantitatively, we refer to work initiated several years ago by Maurice Halstead at Purdue. Dr. Halstead started a calculus he called "Software Science," in which he defines a measure of program volume as:
V = N log2 n N = N1 + N2 n = n1 + n2Where:
V = the program volume N = the program length N1 = the total number of operators N2 = the total number of operands n = the vocabulary size n1 = the unique operator count n2 = the unique operand countThe volume program metric allows us to compare relative complexities of the two approaches. Table 1 contains the empirical and calculated values for each approach. While the rules used for counting operators and operands are not necessarily the same as those used by Halstead, they are at least consistent for both examples. We counted only the operators and operands in the code for MergeTrees, since it contains the only differences in the two approaches.Table 1 shows the volume for each implementation of MergeTrees. Program volume is an estimate of effort required to understand the workings of a program. The other metrics in Table 1 are used to calculate the program volume. The volume of the single threaded example is slightly greater than three times the multi-threaded example. This implies the single threaded example is three times more complex than the multi-threaded example.
We also counted the number of lines of code for each example, shown in Table 1, as another measure of program complexity. We defined the number of lines of code in the examples as the total number of lines minus blank lines and lines only containing comments. It is interesting to notice that this measure also shows the single threaded approach is more complex by roughly three times.
Conclusion
The program described here is a tightly coupled multi-threaded program. In such programs each thread performs the same task, but on different data. Loosely coupled multi-threaded programs have threads that perform nearly unrelated activities. Examples of these programs include communications file servers, AI applications, simulations, and games (see the sidebar entitled Loosely Coupled Multi-threaded Programs).Multi-threaded programming is not a cure-all. It is something that every programmer should have in her or his bag of tricks. Since multi-threaded programming is portable, it is applicable without machine dependent restrictions.
Like other programming techniques, multi-threaded programming presents a learning curve. As the programming community becomes familiar with multi-threading techniques, we will discover more ways to use it to simplify our programming lives.