August 1994/An Efficient Flood Visit Algorithm

Graphics

An Efficient Flood Visit Algorithm

Anton Treuenfels

Anton Treuenfels has a B.A. in General Science from Grinnell College in Grinnel, IA. He is interested in bitmapped graphics, interpreters and telecommunications. He can be reached at 5248 Horizon Dr, Fridley, MN, 55421, (612) 572-8229.

Introduction
A common operation in bitmapped graphics is known as flood-filling an area. The purpose is to change the color of the pixels making up the area. A specific example would be, "Change all the white pixels within the blue lines to red."
This is actually only one form of a more general problem, which can be defined as visiting all adjacent points on a finite plane that share a common property. Together these form the interior points of a region bounded by any points that do not share this property and by the edges of the plane. The region occupied by the interior points need not necessarily be convex. That is, there may be islands or peninsulas bounded by points not sharing the common property embedded within the region of points that do. The locations of any such boundary points are not known in advance. The only givens are a function that can determine whether or not any particular point is an interior point and the location of a single seed point that is an interior point. The goal is to visit all interior points as efficiently as possible.
This formulation does not provide for any particular purpose in making the visits. The purpose is irrelevant to the problem. Changing a property of the interior points, such as their color, is only one possibility. Merely counting them is another. Or the purpose might be to determine whether two or more points belong to the same region. In any case, the problem of visiting all interior points is logically separate from the problem of what to do with them.
One very simple algorithm that solves this problem begins by pushing the seed point onto a stack. Then, while the stack is not empty, it removes and examines the point at the top of the stack. If this point is an interior point that has not been previously visited, it is marked as visited and the four neighboring points located immediately above, below, left, and right are pushed onto the stack. Otherwise, the point is simply discarded.
While this algorithm is logically correct, in practice it is inefficient. Each interior point of the region is visited several times, once the first time it is examined and once each time one of its neighbors is visited. The number of points saved on the stack can also quickly grow quite large and the memory set aside for the stack can easily become exhausted.

A Few Definitions
The efficiency of this algorithm can be improved by working with groups of points rather than single points. To pursue this idea some definitions are needed. Any group of horizontally adjacent points defined by two endpoints and a row is a run. Note that there is nothing sacred about a run being horizontal. In graphics applications, horizontal runs tend to increase the efficiency of point examination because they match up well with the underlying physical organization of most raster-oriented graphics displays.
A run that consists of only visited interior points is a line. Runs with the same endpoints as a line and immediately above and below it on the adjacent rows are shadows of the line. Shadows may fall on lines. That is, lines and shadows may overlap or be coincident. Wherever a line contains at least one point that also belongs to a shadow, that line is a child of the parent line that originally cast the shadow.
The improved version of the first algorithm begins by considering the seed point to be a seed shadow and pushing it onto a stack. Then, while the stack is not empty, it removes the shadow at the top of the stack and examines all of its points. If any run of previously unvisited interior points is found, they are marked as visited and the shadows of this new line are pushed onto the stack.
A couple of points are important to note here. First, a new line may extend beyond a shadow if either endpoint of the shadow is an unvisited interior point. This is why the seed point need only be a point. Second, except in the case of the seed point, every new line is a child line and one of its two shadows will fall partly or wholly on its parent line. Since there is no hope of finding any new lines within the parent line, there is no reason to push onto the stack any part of a child's shadow which falls on its parent.
An algorithm incorporating these observations is much more efficient than the first algorithm. Most interior points are visited only once. The exceptions occur only when new visits to interior points proceed around an island and eventually reach previously visited interior points. Normally, less stack space is required — although each shadow takes more space on the stack than a point, on the whole fewer shadows than points need to be recorded.

A Better Algorithm
It is possible to guarantee that each interior point is visited only once. That, incidentally, removes the requirement that visited interior points be so marked. Lines represent interior points that have been visited, and shadows represent poin ts that will be visited. If shadows on the stack never fall on lines, then visited interior points will not be visited again. Because shadows are always next to lines, this can be arranged.
Preventing a child's shadow from falling on its parent line is a useful first step in this direction. Listing 1, uflood.c, is a straightforward extension of this idea to any shadow and any line. The pending shadow stack is maintained in the form of a linked list. The top shadow is removed to become a seed shadow. Whenever a child line is found within the seed shadow, both of its shadows are pushed onto the stack in their entirety. Each pending shadow that falls on any line is then clipped to remove the coincident portion of the shadow. This modifies the extent of those shadows which fall partly on lines, and eliminates those that fall wholly on lines.
Despite having only the child line with which to compare shadows, the task is not difficult. Wherever the child line is coincident with a shadow, there are two shadows falling on two lines to consider. First, the shadow coincident with the child is falling on the child. Second, one of the child's shadows is falling on a parent of the child. (Note that this implies that a child may have more than one parent, which is in keeping with the original definition given earlier. The alternative would seem to be awkward phrasing similar to "falling on the line which cast the shadow with which the child is coincident").
For simplicity this implementation pushes both entire new shadows of a child on the stack and then makes adjustments whenever a shadow overlaps the child. There is always an overlap between the child line and the seed shadow, and so this is the first adjustment made. The seed shadow itself is not clipped since that might place a portion of the seed shadow back on the stack when it is already off. All the pending shadows in the stack are then checked for overlap and adjusted if necessary. Overlapped shadows are not actually removed from the stack but instead simply marked as invalid, which is easier than trying to maintain the integrity of the stack. These shadows are discarded only when they reach the top of the stack.

Improving Efficiency
This algorithm is logically correct, but it is still inefficient. Time and space are wasted by pushing both entire child shadows on the stack, only to immediately eliminate part or all of one. Much more important, the time required to examine all the pending shadows each time a child line is found increases with the number of pending shadows. This problem is not helped by a pattern of visitation in complex regions that might best be described as somewhat haphazard, necessitating the stacking of a large number of shadows (in one test case, more than could be accommodated by available memory). With a complex region, far more time can be spent trying to discern whether an area has been previously visited than actually visiting new areas.
Listing 2, flood.c, addresses these problems in several ways. First, the portion of a child's shadow that falls on its parent is never placed on the stack, so it never needs to be eliminated. The seed point, which creates the only line which has a full length shadow on each side of it, is handled as a special case.
Second, the number of comparisons required to determine overlap is reduced. Since it is not possible for a child line to overlap any pending shadow that is not on the same row as the child, only pending shadows on the same row as the child need be checked.
To facilitate this checking, a row list is introduced. Whenever this list is empty, the first valid shadow on the pending shadow list is moved to the row list. The rest of the pending list is then examined and any shadow on the same row as the first is also placed on the row list. It is also convenient at this time to arrange these row shadows in left-to-right order.
Shadows to examine are now taken from the head of the row list instead of from the head of the pending list. This shadow is always the leftmost shadow on the row, so any possible overlaps must now occur between a child line on the left and the remaining row shadows on the right. Overlap and non-overlap are easily determined: any row shadow whose left endpoint is coincident with the child line is overlapped. The first row shadow whose left endpoint is not coincident with the child line is also the last shadow that needs to be checked for overlap.
Overlaps now occur only once for each island in the region. In practice, far less time is spent adjusting shadows to remove overlap than is spent detecting overlap in the first place. While it is thus relatively less important to make the adjustment process as fast as possible, it is easy to avoid some needless slow downs.
A child's shadow is most likely to overlap a parent line on the same side of the child as its original parent (defined as the parent row of the seed shadow the child coincides with). This is because visits tend to proceed around islands in a region much as water flows around islands in a stream, traveling along both sides simultaneously and meeting up again on the far side. In view of this a child's shadows are added to the pending list in the order: shadow opposite parent row, shadow to left on parent row, then shadow to right on parent row. When the list is searched, any shadows on the parent side are encountered first.
If a child line overlaps a row shadow, the child's shadow that falls on the parent line must be located. A necessary and sufficient condition for identifying this shadow is that its row match the row of the parent line. The reasoning is as follows: the overlapping child shadow cannot be the shadow to the left of the original parent line because overlaps can only occur to the right of the child. So it must be one of the two remaining shadows, which are known to be on different rows.
Thus, when the first overlap of a row shadow by a child line is detected, it is enough to match rows. (There will always be a match now because there is no longer any attempt to clip shadows cast by the seed point.) Furthermore, when the child's shadow is clipped, any portion to the right of the parent line will be placed at the head of the pending list. As any additional row shadows overlapped by the same child line must be to the right of the first one, the first child shadow whose row matches the parent row sought is always the one overlapping the parent.
The row shadow that overlaps the child line is a bit more troublesome because any clipped version of this shadow must now be put on the row list, not the pending list. However there are just two possibilities to consider: either the row shadow is completely overlapped by the child line or it is exposed only at its right end. In the first case, mark the shadow invalid. In the second case, adjust the left endpoint.
Another change introduced is that shadow nodes not currently needed are kept in a free list rather than returned to heap storage via free. The heap storage allocator malloc is called only when the free list is empty at the time a request for a shadow node is made. If malloc cannot satisfy the storage request, the storage used by the row and pending lists is released and an error indication is returned to the calling function.
This practice improves performance relative to repeatedly calling malloc and free. This gain is not nearly as great as the gains made by optimizing the overlap check, but it is fairly simple to implement and it does help, particularly with complex regions requiring thousands of shadow nodes.

A Few Examples
Having made the case that the purpose of visiting a point need not have anything to do with graphics, it is nevertheless true that the example programs deal only with graphics. Listing 3, floodtst.c, is a program designed to test flood.c and uflood.c in a more-or-less device independent manner. Listing 1 through Listing 3 are compatible with ANSI C.
Listing 4, bgigrh.c, contains supporting graphics functions used to draw test figures. Listing 5, bgipixel.c, is a point-examination function that systematically changes the colors of visited points (so that multiple visits can be easily detected). Listing 6, egapix.c, is a point-examination function that is optimized for speed by saving relevant variables rather than recalculating them each time it is called. Listing 4 through Listing 6 are dependent on library functions provided with Turbo C v2.0. Listing 6 also requires that an EGA or VGA card be in use.
Listing 7, usrdef.h, contains some definitions I commonly use. Listing 2 through Listing 6 contain both headers and source code. Although I do not believe this is a common practice, I personally prefer it because I like to have the headers where they are easy to refer to and easy to change. When a header changes, I simply mark it as a block and write it out to disk. In practice this is probably not terribly different from using #include to bring a header back into the source file it refers to (to avoid duplicating function prototypes, perhaps).
The investigations into the performance of uflood and flood were accomplished with the aid of a profiler (in this case Turbo Profiler v1.0). Some of the results of profiling flood are presented in Table 1 and Table 2.
Table 1 shows the relative percentage of time spent executing each function in the flood.c module when using the function egapix to examine points. Tests 0 and 1 are simple convex regions. Test 2 has one island. Test 3 has two islands with peninsulas. Tests 4 through 7 have lots of islands (792 of them on the EGA display used during testing). Test 8 has a moderate number of islands and peninsulas.
Instead of using a fixed pattern suited to repeated testing, test 9 generates a random pattern of roughly the same size each time it is executed. Its main contribution to performance testing is to give candidate algorithms a chance to fail by presenting them with a situation not thought through beforehand quite carefully enough (a task it accomplishes with annoying regularity).

A Few Results
The results in Table 1 show that even in the worst test case flood spends two thirds of its time examining points and only one third deciding which points to examine. This is a somewhat volatile measure whose meaning must be interpreted cautiously. For example, placing a delay loop into visitshadow, or replacing the point-examination function with a slower one, will increase the percentage of time spent in visitshadow without at all meaning overall performance has been increased. It does seem safe to say that most of the easy opportunities to increase the percentage of time spent examining points by decreasing the percentage spent on other activities have already been taken advantage of.
Table 2 shows the count of shadow nodes required during each test, and how many were obtained from the free list versus how many came from malloc. The results show that a minimum of three nodes will always be required (for the first line and its two shadows).
The count of calls to malloc also indicates the maximum storage required during each test. Perhaps the easiest way to recognize this is to consider that freeshadow, which releases every shadow node allocated by malloc at one time, will call free once for each time malloc was called. This count shows that at least once during the test there were that many nodes combined in the row and pending lists. Assuming that a shadow node occupies about 16 bytes, in the worst test case about 2 KB total storage space is required to keep track of what has been visited and what is left to visit.

Alternative Algorithms
The utility of an algorithm which is guaranteed to visit interior points only once can be appreciated by considering some of the alternative algorithms described earlier. The bland phrase "mark as visited" does not hint at how this is accomplished. One method is to change a property of the point visited, such as its color. In this case care must be taken if this is also the common property used to distinguish whether a point is an interior point or not. For example, suppose a flood-fill problem is stated as, "Change all the white pixels within the blue lines to a red-and-white pattern of pixels." Now a point may be visited, identified as an unvisited interior point, and marked as visited in a way that leaves it essentially unchanged. This causes confusion if the point is visited again.
Another possibility is to use an auxiliary Boolean array to record which points have been visited. Before visiting any point, consult the array to determine if the point has been visited before. A main drawback here is the size of the array, which must contain at least one bit for every point in the plane that might be visited. An entire high-resolution EGA screen, covering 640 x 350 pixels, would require at least 224,000 bits, or 28 KB.
The algorithm used by flood avoids both these problems. It never visits an interior point more than once and is efficient in its use of both time and space. It is also independent of the actual nature of the points it is visiting, and ignorant of the common property that distinguishes interior points from boundary points. This makes the algorithm applicable to any problem requiring these traits for which an effective point-examination function can be devised. It really is best to let the blind robot vacuum cleaner do only one room at a time. Give it the floor plan of the entire house and it will spend more time traveling from room to room than cleaning.

References
Lieberman, Henry, "How to Color in a Coloring Book," Computer Graphics, Vol. 12, No. 3, Aug. 1978.
Polik, William F., "Area Filling Algorithms,'Computer Language, Vol.3, No. 5, May 1986.
Wilton, Richard, PC & PS/2 Video, Microsoft Press, 1987.

Sidebar: "Patterns of Visitation"