If you can spare one bit of storage in each node, you can traverse a tree without recursive function calls.
Introduction
To traverse a tree in a depth-first pattern, I have usually used implicit recursion (via a function call) or sometimes explicit recursion (via a private stack). In either case I use memory space proportional to the size of the tree (at least to the biggest branch). There are algorithms that do not consume memory, but they impose additional constraints, or need to modify the tree structure itself (see [1, 2]).
I have implemented an algorithm in C that involves neither implicit nor explicit recursion, and uses just one bit of storage to traverse a tree. Using this algorithm, you can supply a pointer to a function to operate on each node visited. I will explain how it works shortly, but first a few definitions:
- A prefix traversal processes a node (applies a function to it) before it visits subnodes of the node.
- A postfix traversal processes a node after visiting the subnodes.
- An infix traversal processes a node between visiting the subnodes (typically used on a binary tree).
You can arrange a traversal that merges all or part of these properties.
The Algorithm
The algorithm uses a parent-child-sibling structure (see Figure 1a). Figure 1b shows the path the algorithm uses to traverse the tree. The algorithm uses only existing information to traverse all of the tree. The problem is to use the right information at the right time.
There are only two cases for the algorithm to consider. The first is when it is positioned at the top of an unexplored subtree; then it should visit the first subnode. The second case is when the algorithm has finished visiting a subtree; then it should visit either a sibling or the parent. The proper course of action can be deduced from the current position and the last operation performed. The algorithm does not need to store any information as it traverses the tree except for one bit.
Figure 2 shows the C code for the core algorithm. (Full source code is available on the CUJ website, www.cuj.com/code.) Function tree_depth_traversal1 implements either prefix- or postfix-order traversal (or both). It consists of a simple loop (after a few preliminary if statements) that operates on the node passed into the function. The two cases mentioned above are represented by the Boolean variable dir_flag.
When dir_flag is TRUE, the algorithm must explore a subtree. The algorithm attempts to move to the first subnode (leaving dir_flag with the value TRUE because it's a new unexplored subtree). If the current node turns out to be a leaf the algorithm considers the subtree explored and sets dir_flag to FALSE.
When dir_flag is FALSE, the algorithm attempts to move to a sibling (and sets dir_flag to TRUE because the sibling is a new unexplored subtree). If there is no sibling the algorithm moves to the parent (and leaves dir_flag FALSE because the subtree that has the parent as its root is completely explored).
The algorithm begins the traversal with a node and a user-supplied function to process each node. The initial value of the flag is obviously TRUE.
The user-supplied function for processing a node controls whether the traversal is prefix-order, post-fix order, or both. The second parameter to the function, first_flag, is a Boolean that indicates whether the function is being called before the node's subtree has been visited, or after. To effect a prefix traversal, define the function to work only when first_flag is TRUE, and to do nothing when first_flag is FALSE. To effect a postfix traversal, define the function to work only when first_flag is FALSE. To get a combined prefix/postfix traversal, define the function to ignore first_flag.
Extra Considerations
The complete source code provides two traversal functions. The first one (the one shown in Figure 2) decides where to move next after processing the current node. The second one (not shown) is easily derived from the first. It decides where to move before processing the current node. This second function could be used to implement an iterator. If the user-supplied function does not modify the tree the two versions are equivalent. You should use the second version if you want to destroy nodes during the (postfix) traversal. (Adding nodes is left as an exercise for the reader.)
The user-supplied function will never receive a NULL node. The test node_exist(current) in the loop (see Figure 2) is present only for safety reasons. The implemented version of this algorithm will never set the current node to NULL if the tree is correctly built.
Conclusion
I have presented an algorithm which does not consume any memory space to explore a tree, without any limit on depth. The implementation enables both prefix-order and postfix-order traversal, but infix-order could be added without any problem, in the algorithm itself or in the user-supplied process function.
Acknowledgements
I would like to thank two teachers from my University, Jean Francois Perrot and Emmanuel Saint-James for teaching me the essence of software engineering; one of my old colleagues, Christian Joubert; and the CUJ editorial staff for the revision and correction of this English composition.
References
[1] H. Shorr, W. M. Waite. "An Efficient Machine-Independent Procedure for Garbage Collection in Various List Structures," Communications of the ACM, 10:501-516, August 1967.
[2] W. A. Burkhard. "Non Recursive Traversals of Trees," The Computer Journal, 18(3), August 1975.
Valery Creux received a DESS (equivalent to an MS) in software engineering from the Pierre & Marie Curie University (Paris, France). His interests include algorithm analysis and design, and he's still waiting for a new volume of The Art of Computer Programming from Knuth. He currently works for VDO Car Communications GmbH, an automotive company in Germany, manufacturing radios and navigation systems. He can be reached at Valery_Creux@yahoo.com.