Features


Image Processing, Part 10: Segmentation Using Edges and Gray Shades

Dwayne Phillips


The author works as a computer and electronics engineer with the U.S. Department of Defense. He has a PhD in Electrical and Computer Engineering from Louisiana State University. His interests include computer vision, artificial intelligence, software engineering, and programming languages.

Introduction to the Series of Articles

This is the tenth in a series of articles on images and image processing. Previous articles discussed reading, writing, displaying, and printing images (TIFF format), histograms, edge detection, spatial frequency filtering, and sundry image operations. This article will discuss image segmentation using edges and gray shades.

The last article (Phillips, February 1993) discussed image segmentation using histograms. That basic technique examined the histogram of an image, transformed the image into a 1-0 image, and "grew" regions. The results were acceptable given the simplicity of the approach.

Segmentation Using Edges and Gray Shades

There are powerful segmentation techniques that use the edges in an image, grow regions using the gray shades in an image, and use both the edges and gray shades. These techniques work well over a range of images because edges and gray shades are important clues to objects in a scene.

Figure 1 shows the result of using edges to segment an image. The left side shows the output of an edge detector. The right side is the result of grouping the pixels "inside" the edges as objects — a triangle and rectangle. This idea is simple. Detect the edges and group the pixels as objects.

Figure 2 illustrates growing objects using the gray shades in an image. You group a pixel with a neighboring pixel if their gray shades are close enough. You replace the two pixels with their average and then look at the neighbors of this two-pixel object. If the gray shades of the neighbors are close enough, they become part of the object and their values adjust the average gray shade of the object. The left side shows the input, and the right side shows the result of growing objects in this manner. The 1s are the background object produced by grouping the 1s, 2s, and 3s. The triangle of 2s is a grouping of the 7s and 8s, and the rectangle of 3s is the 8s and 9s.

Figure 3 combines the two techniques. The left side shows a gray shade image with the output of an edge detector (*s) superimposed. The right side shows the result of growing regions using the gray shades while ignoring the detected edges (*s). The result is the three objects produced in Figure 2 separated by the edges.

These three simple techniques work well in ideal situations. Most images, however, are not ideal. Real images and image processing routines introduce problems.

Problems

You can encounter three potential problems using these segmentation techniques: the input image can have too many edges and objects, the edge detectors may not be good enough, and you may need to exclude unwanted pixels to grow regions effectively.

The input image can be too complicated and have small, unwanted objects. Photograph 1 shows an aerial image of house trailers, roads, lawns, trees, and a tennis court. The white house trailers are obvious and easy to detect. Other objects (tennis court) have marks or spots that fool the edge detectors and region growing routines. Photograph 2 shows a house, and Photograph 3 shows its edges. Segmentation should detect the roof, windows, and door. The bricks, leaves, and shutter slats are real, but small, so you do not want to detect all of them.

You need high-quality edge detection to use these techniques. Figure 4 demonstrates how a small edge-detector error leads to a big segmentation error. On the left side of the figure, I poked a small hole in the left edge of the rectangle. The right side shows the terrible segmentation result. Edge detectors do not produce these 1-0 images without thresholding them (Phillips, November 1991). Photograph 4 shows the result of edge detection on Photograph 1. You need to threshold the strong (bright) and weak (faint) edges to produce a clean 1-0 image. But you also need a consistent and automatic method to find the threshold point. Detected edges can be too thin and too thick. A surplus of stray, thin edges misleads segmentation, and heavy, extra-thick edges ruin objects. Figure 5 shows how the triple-thick edges on the left side produce the distorted objects on the right side.

The region-growing algorithm (Phillips, January 1992) must limit the size of objects and exclude unwanted pixels. The house in Photograph 2 contains objects of widely varying size. You may want to exclude the bricks to concentrate on large objects or vice versa. You may want to omit certain pixels such as edges. The left side of Figure 6 is a repeat of Figure 1 while the right side shows what happens when the region grower mistakes the edges for objects.

Solutions

You can solve the edge detection problems by preprocessing, better edge detection, and better region growing.

Preprocessing

Preprocessing involves smoothing the input image to remove noise, marks, and unwanted detail. The median filter (Phillips, October 1992), one form of smoothing, sorts the pixels in an nxn area (3x3, 5x5, etc.), and replaces the center pixel with the median value. High- and low-pixel filters, variations of the median filter, sort the pixels in an nxn area and replace the center pixel with either the highest or lowest pixel value.

Figure 7 illustrates the median, high-pixel, and low-pixel filters. The left side shows the input, the image section. The right side shows the output for each filter processing a 3x3 area. The median filter removes the spikes of the larger numbers. The high-pixel filter output has many high values because the input has a large number in most of its 3x3 areas. The low-pixel filter output is all 1s because there is a 1 in every 3x3 area of the input.
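The sort-and-select step these filters share can be sketched in a few lines of C. This is a minimal illustration of the idea behind Listing 1, not the CIPS code itself; the row-major short-valued image layout and the function name low_pixel_at are assumptions:

```c
#include <assert.h>
#include <stdlib.h>

#define N 3   /* filter window is N x N */

/* comparison callback for qsort on short pixels */
static int cmp_short(const void *a, const void *b)
{
    return (int)(*(const short *)a) - (int)(*(const short *)b);
}

/* Return the lowest pixel in the N x N window centered at (row, col)
   of a rows x cols image stored row-major. The caller must keep the
   window inside the image. For the high-pixel filter, return
   elements[N * N - 1] instead; for the median, elements[(N * N) / 2]. */
short low_pixel_at(const short *image, int cols, int row, int col)
{
    short elements[N * N];
    int i, j, k = 0;

    for (i = -(N / 2); i <= N / 2; i++)
        for (j = -(N / 2); j <= N / 2; j++)
            elements[k++] = image[(row + i) * cols + (col + j)];

    qsort(elements, N * N, sizeof(short), cmp_short);
    return elements[0];
}
```

This mirrors the looping portion described for Listing 1: gather the n x n neighborhood into the elements array, sort it, and pick the value the particular filter wants.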

Photograph 5 and Photograph 6 show how the low-pixel filter reduces the clutter in the edge detector output. Photograph 5 is the result of the low-pixel filter applied to Photograph 2. The dark window shutters are larger, and the mortar lines around the bricks are gone. Photograph 6 is the output of the edge detector applied to Photograph 5. Compare this to Photograph 3. The edges around the small objects are gone.

Listing 1 shows the high_pixel and low_pixel subroutines. They create the output image file if needed, read the input image, loop through the data, and write the output. The looping portion places the pixels in the nxn area into the elements array, sorts the array, and places the highest or lowest pixel value into the output image.

Improved Edge Detection

For effective segmentation you need accurate edge detectors with automatic thresholding of edges and the ability to thin edges.

I improved the accuracy of the edge detectors in the C Image Processing System by adding two more edge detectors and correcting three others (Phillips, November 1991, January 1992). One of the new edge detectors, the variance operator, examines a 3x3 area and replaces the center pixel with the variance. The variance operator subtracts each of the eight neighboring pixels from the center pixel, squares each difference, adds up the squared differences, and takes the square root of the sum. The other new edge detector, the range operator, sorts the pixels in an nxn area and subtracts the smallest pixel value from the largest to produce the range.

Figure 8 shows the results of applying the variance and range operators to an array of numbers. Photograph 7 and Photograph 8 show the outcome of applying these operators to the house image of Photograph 2.
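The two operators can be sketched as follows. This is an illustration of the calculations just described, not the Listing 2 code; the row-major layout, the function names, and the integer square root (used to avoid pulling in the math library) are assumptions:

```c
#include <assert.h>

/* integer square root by simple search; an assumption made here
   to keep the sketch self-contained */
static short isqrt_s(long x)
{
    long r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return (short)r;
}

/* Variance operator: square root of the sum of squared differences
   between the center pixel and its neighbors in the 3x3 window.
   The loop visits the center too, but its difference is zero. */
short variance_at(const short *image, int cols, int row, int col)
{
    long sum = 0;
    short center = image[row * cols + col];
    int i, j;

    for (i = -1; i <= 1; i++)
        for (j = -1; j <= 1; j++) {
            long diff = image[(row + i) * cols + (col + j)] - center;
            sum += diff * diff;
        }
    return isqrt_s(sum);
}

/* Range operator: largest minus smallest pixel in the 3x3 window. */
short range_at(const short *image, int cols, int row, int col)
{
    short lo = image[(row - 1) * cols + (col - 1)];
    short hi = lo;
    int i, j;

    for (i = -1; i <= 1; i++)
        for (j = -1; j <= 1; j++) {
            short p = image[(row + i) * cols + (col + j)];
            if (p < lo) lo = p;
            if (p > hi) hi = p;
        }
    return hi - lo;
}
```

On a flat 3x3 area both operators return 0; a single bright neighbor produces a strong response from both, which is why they work as edge detectors.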

Listing 2 shows the source code for the variance and range operators. They create an output image if needed, read the input, loop through the image, and write the output. The looping structure performs the calculations described above.

I corrected a major mistake in the directional edge detectors (Phillips, November 1991) that improved accuracy. The Sobel, Kirsch, and Prewitt edge detectors use eight different 3x3 convolution masks — one mask for each direction. Listing 3 shows the correction to the subroutine perform_convolution (Phillips, November 1991). A reader noticed I was setting the output to the answer of each direction convolution — not the strongest of all eight directions. To correct this, set the output to the convolution sum only if the sum is greater than the output.
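The correction can be illustrated in isolation. This sketch (the function name is illustrative, not taken from perform_convolution) keeps only the strongest of the eight directional convolution sums instead of overwriting the output with each one in turn:

```c
#include <assert.h>

/* Given the eight directional convolution sums for one pixel, return
   the strongest response. The old, incorrect code effectively returned
   sums[7], whichever direction happened to be computed last. */
short strongest_response(const short sums[8])
{
    short out = 0;
    int d;

    for (d = 0; d < 8; d++)
        if (sums[d] > out)   /* keep a sum only if it beats the best so far */
            out = sums[d];
    return out;
}
```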

Photograph 9 and Photograph 10 show the effect of this correction. Photograph 9 is from the early article, and the edges are all on the right side and bottom of the objects. There are no edges on the upper and left sides. Photograph 10 shows how the correction produces edges without holes, with the quality needed for segmentation.

For effective edge detection you need a technique for thresholding the edge detector output consistently and automatically. One technique sets the threshold point at a given percentage of pixels in the histogram. You calculate the histogram for the edge detector output and sum the histogram values beginning with 0. When this sum exceeds a given percent of the total, you have the threshold value. This method produces consistent results without any manual intervention. A good percentage for most edge detectors and images is 50 percent.
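The accumulate-and-compare idea can be sketched as follows. This is an illustration, not the Listing 4 code; the 256-shade histogram layout and the function name find_threshold are assumptions:

```c
#include <assert.h>

/* Walk the histogram from gray shade 0 upward, accumulating counts,
   until the running sum reaches the requested fraction of all pixels.
   That gray shade is the threshold point. */
short find_threshold(const unsigned long histogram[256], float percent)
{
    unsigned long total = 0, sum = 0;
    int i;

    for (i = 0; i < 256; i++)
        total += histogram[i];

    for (i = 0; i < 256; i++) {
        sum += histogram[i];
        if ((float)sum / (float)total >= percent)
            return (short)i;
    }
    return 255;   /* fell through: threshold at the top shade */
}
```

Because the threshold comes from the image's own histogram, the same percent setting adapts automatically to bright and dark edge detector outputs.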

Photograph 11 shows the thresholded edge detector output of Photograph 1. That is, I processed Photograph 1 with the edge detector (Photograph 4 shows the result) and set the threshold at 70 percent. Listing 4 shows the find_cutoff_point subroutine that looks through a histogram to find the threshold point. It takes in the histogram and the desired percent and returns the threshold point. This is a simple accumulate and compare operation.

The erosion operation (Russ 1992) can solve the final problem with edge detectors, removing extra edges and thinning thick edges. Erosion looks at pixels turned on (edge detector outputs) and turns them off if they have enough neighbors that are turned off. Figure 9 and Figure 10 illustrate erosion. In Figure 9, the left side shows edges (1s) around the triangle and rectangle and then several stray edges. The right side shows the result of eroding or removing any 1 that has seven 0 neighbors. In Figure 10, the left side shows very thick edges around the triangle and rectangle. The right side shows the result of eroding any 1 that has three 0 neighbors. The edges are thinner, and the objects inside the edges are more accurate.

Listing 5 shows the erosion subroutine erode_image_array. The looping structure examines every pixel in the_image that equals value. It counts the number of neighboring 0 pixels and sets the out_image to zero if this count exceeds the threshold parameter. The threshold parameter controls the erosion. Threshold was 6 in Figure 9 and 2 in Figure 10.
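A toy version of the erosion step might look like this. It is a sketch of the idea behind erode_image_array under assumed fixed dimensions, not the CIPS routine itself:

```c
#include <assert.h>

#define ROWS 5
#define COLS 5

/* Erode: a pixel equal to `value` is turned off when more than
   `threshold` of its neighbors are zero. The 3x3 scan visits the
   center pixel too, but the center equals `value`, so it never adds
   to the zero count. Results go to `out` so that erasing one pixel
   does not affect the test of its neighbors. Border pixels are
   skipped for simplicity. */
void erode(short in[ROWS][COLS], short out[ROWS][COLS],
           short value, int threshold)
{
    int i, j, a, b, count;

    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            out[i][j] = in[i][j];

    for (i = 1; i < ROWS - 1; i++)
        for (j = 1; j < COLS - 1; j++) {
            if (in[i][j] != value)
                continue;
            count = 0;
            for (a = -1; a <= 1; a++)
                for (b = -1; b <= 1; b++)
                    if (in[i + a][j + b] == 0)
                        count++;
            if (count > threshold)
                out[i][j] = 0;
        }
}
```

With threshold set to 6, a lone stray pixel (eight zero neighbors) disappears, while a pixel inside a solid block survives, which is exactly the behavior Figure 9 illustrates.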

Photograph 12 shows the result of eroding the thick edges of Photograph 6. Note how it thinned the thick edges and removed the stray edges in the house, lawn, and tree. The threshold parameter was 3 for this example.

Improved Region Growing

You need accurate region growing to implement the edge and gray shade segmentation techniques. Figure 11 shows the region-growing algorithm used in the last article (Phillips, February 1993). This worked for binary images containing 0s and a value. If the algorithm found a pixel equal to value, it labeled that pixel and checked its neighbors to see if they also equaled value (step 3).

The region-growing algorithm needs improvements to work with any gray shades, limit the size of regions, and exclude pixels with special values (like edges). Figure 12 shows the new region-growing algorithm. The input image g contains gray shades and may contain special pixels equal to FORGET_IT. The output image array holds the result. There are three new parameters: diff, min_area, and max_area. diff specifies the allowable difference in gray shade for two pixels to merge into the same object. min_area and max_area specify the limits on the size of objects.

The major differences in the algorithm begin at step 4. Instead of checking if g(i,j) == value, the algorithm performs three checks:

1. g(i,j) must not equal FORGET_IT

2. output(i,j) must equal zero

3. g(i,j) must be close enough to the target gray shade

The first two are simple. You want to exclude certain pixels, so you set them to FORGET_IT and ignore them. The output must not be part of an object, so it must be zero.

The third test allows you to work with gray shade images. In step 3, you create a target equal to the average gray shade of the pixels in an object. You group neighboring pixels whose values do not differ by more than the diff parameter. The is_close routine at the bottom of Figure 12 tests for this condition. If the pixel g(i,j) is close enough to the target, you call the pixel_label_and_check_neighbor routine to add that pixel to the object and check its neighbors. The pixel_label_and_check_neighbor routine updates the target or average gray shade of the object.
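The closeness test and the running-average update might be sketched like this. The names follow the text of Figure 12, but the sum/count representation of the object's average is an assumption; the actual routines are in Listing 6:

```c
#include <assert.h>

/* is_close: a pixel may join an object when its gray shade is within
   `diff` of the object's running average gray shade (`target`). */
int is_close(short pixel, short target, short diff)
{
    short delta = (pixel > target) ? (pixel - target) : (target - pixel);
    return delta <= diff;
}

/* When a pixel joins an object, the target is recomputed as the new
   average gray shade, as pixel_label_and_check_neighbor does. Here the
   object is assumed to be tracked as a running sum and pixel count. */
short updated_target(long sum, long count, short new_pixel)
{
    return (short)((sum + new_pixel) / (count + 1));
}
```

Updating the target as the object grows lets a region follow a gradual shading change, while the diff limit still stops it at a genuine boundary.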

The new algorithm limits the size of objects in step 6. It tests count (the size of an object) against min_area and max_area. If the object fails the test, you set all pixels of g in the object to FORGET_IT and set all pixels in output to 0. This removes the object from output and eliminates the pixels from any future consideration in g.

I've already discussed how the new algorithm excludes pixels with certain values via the FORGET_IT value. If you want to remove edges from consideration, you lay the edge detector output on top of the input image and set to FORGET_IT all pixels corresponding to the edges.
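That overlay step might be sketched as follows. FORGET_IT's numeric value, the row-major layout, and the function name are assumptions made for illustration:

```c
#include <assert.h>

#define FORGET_IT 255   /* the special "ignore" value; 255 is an assumption */

/* Lay a thresholded edge image on top of the input: wherever the edge
   image is turned on, mark the corresponding input pixel FORGET_IT so
   region growing skips it. `length` is rows * cols. */
void overlay_edges(short *input, const short *edges,
                   long length, short edge_value)
{
    long i;

    for (i = 0; i < length; i++)
        if (edges[i] == edge_value)
            input[i] = FORGET_IT;
}
```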

Listing 6 shows the source code for the three subroutines outlined in Figure 12. They follow the algorithm closely. The only trick is how to implement the stack (Phillips, February 1993).

The improved region-growing algorithm is the key to the new techniques. It ignores certain pixels and eliminates objects of the wrong size. These small additions produce segmentation results that are much better than those from Part 9.

The Three New Techniques

Now that I've laid all the groundwork, I'll go on to the three new techniques.

Edges Only

The edge_region subroutine shown in Listing 7 implements this technique. The algorithm is

1. Create the output image file if needed

2. Read the input image

3. Edge detect the input image

4. Threshold the edge detector output

5. Erode the edges if desired

6. Set the edge values to FORGET_IT

7. Grow the objects while ignoring the edges

Steps 1 through 6 should give you an image like that shown in Figure 1. Step 7 grows the objects as outlined by the edges. The edge_region subroutine calls any of the edge detectors from this and previous articles, the histogram functions from previous articles, and the find_cutoff_point, erode_image_array, and pixel_grow functions from Listing 4, Listing 5, and Listing 6.

The edge_type parameter specifies which edge detector to use. min_area and max_area pass through to the pixel_grow routine to constrain the size of the objects detected. diff passes through to pixel_grow to set the tolerance on gray shades added to an object. diff has little meaning for this technique because the image in which you grow regions contains only 0s and FORGET_IT pixels. The percent parameter passes through to the find_cutoff_point routine to threshold the edge detector output. The set_value parameter specifies the value of a turned-on pixel in the threshold_image_array and erode_image_array routines. Finally, the erode parameter determines whether you perform erosion on the edge detector output. If erode is not zero, then it is the threshold parameter for erode_image_array.

Gray Shades Only

The short gray_shade_region subroutine in Listing 8 implements this technique. This subroutine creates an output image file if needed and calls the pixel_grow function of Listing 6. pixel_grow does all the work since it handles the gray-shade region growing and limits the sizes of the objects. The diff, min_area, and max_area parameters play the same role as in the edge_region routine described above.

Edges and Gray Shade Combined

The technique for combining edges and gray shades is implemented by the edge_gray_shade_region function in Listing 9. The algorithm is

1. Create the output image file if needed

2. Read the input image

3. Edge detect the input image

4. Threshold the edge detector output

5. Erode the edges if desired

6. Read the input image again

7. Put the edge values on top of the input image setting them to FORGET_IT

8. Grow gray shade regions while ignoring the edges

The differences between edge_region and edge_gray_shade_region are in steps 6 and 7. At this point, edge_gray_shade_region reads the original input image again and overlays it with the detected edges. Step 8 grows gray shade regions while ignoring the detected edges. Steps 1 through 7 generate an image like the left side of Figure 3. Step 8 generates the right side of Figure 3.

Photograph 13 through Photograph 17 illustrate these techniques on the aerial image of Photograph 1. Photograph 13 shows the result of the Sobel edge detector after erosion. The edges outline the major objects in the image fairly well.

Photograph 14 shows the result of the edge-only segmentation of Photograph 1. It is the result of growing the black regions of Photograph 13. This is a good segmentation as it denotes the house trailers, roads, trees, and parking lots. This is not just the negative image of Photograph 13. Regions that were too small or too large were eliminated.

Photograph 15 is the result of the gray-shade-only segmentation of Photograph 1. This segmentation also found the major objects in the image. The combination of edge and gray shade segmentation in Photograph 16 shows the edges of Photograph 13 laid on top of the input image of Photograph 1. Photograph 17 shows the final result of growing gray-shade regions inside these edges. This segmentation has better separation of objects than the gray-shade-only segmentation of Photograph 15. The edges between the objects caused this spacing.

Which segmentation is best? That is a judgment call. All three segmentations, however, are better than those produced by the simple techniques in the last article.

Photograph 18, Photograph 19, and Photograph 20 show the results of the three techniques applied to the house image of Photograph 2. The edge-only segmentation of Photograph 18 is fairly good as it denotes most of the major objects in the image. The gray-shade-only result in Photograph 19 is not very good because all the objects are right next to each other and hard to distinguish. The combination segmentation in Photograph 20 is an excellent result. It detected objects not found in the edge-only technique and also eliminated many of the unwanted bricks.

Integrating the New Techniques

Listing 10 shows the new code for the main routine of the C Image Processing System (CIPS). I added case 18 to allow segmentation using the three techniques discussed here. Given next are the changes to the CIPS main menu and the routine that interacts with the user to obtain all the options.

Listing 11 shows a standalone application program for segmenting entire images using the new techniques. It is command-line driven and calls the functions given in the previous listings.

Conclusions

This installment in the series described three powerful image segmentation techniques that work on complicated images. The techniques, however, are only combinations of existing tools and tricks. Given different images, you might have used different combinations of tools. Experiment, try different combinations, and modify existing tools to create new ones.

References

Phillips, Dwayne. November 1991. "Image Processing, Part 5: Writing Images to Files and Basic Edge Detection," The C Users Journal. Lawrence, KS: R&D Publications.

Phillips, Dwayne. January 1992. "Image Processing, Part 6: Advanced Edge Detection," The C Users Journal. Lawrence, KS: R&D Publications.

Phillips, Dwayne. October 1992. "Image Processing, Part 7: Spatial Frequency Filtering," The C Users Journal. Lawrence, KS: R&D Publications.

Phillips, Dwayne. February 1993. "Image Processing, Part 9: Histogram-Based Image Segmentation," The C Users Journal. Lawrence, KS: R&D Publications.

Russ, John C. 1992. The Image Processing Handbook. Boca Raton, FL: CRC Press.