The author works as a computer and electronics engineer with the U.S. Department of Defense. He has a PhD in Electrical and Computer Engineering from Louisiana State University. His interest include computer vision, artificial intelligence, software engineering and programming languages.
Introduction to the Series of Articles
This is the eleventh in a series of articles on images and image processing. Previous articles discussed reading, writing, displaying, and printing images (TIFF format), histograms, edge detection, spatial frequency filtering, sundry image operations, and image segmentation. (See box called "The C Image Processing Series...so far" on page 91.) This article will discuss manipulating shapes.The last two articles discussed image segmentation. Segmentation took an image and produced a binary output showing the objects of interest. This article will describe several techniques for taking those objects and improving their appearance.
Working with Shapes
You process images, such as segmentation results, to improve their appearance (Phillips February 1993, June 1993). Photograph 1 shows an aerial image, and Photograph 2 a segmentation of it. Photograph 3 shows a house, and Photograph 4 a segmentation of it. These are good segmentations, but they have problems.Segmentation results have "holes" in them. The roof in Photograph 4 should be solid, but has holes. Larger holes can even break objects. The opposite can also be true, as segmentation can join separate objects. Segmentation results often have little or no meaning. The solid objects resemble blobs and are hard to interpret especially if you want to use a computer to interpret them.
The answer to these problems is morphological filtering, or manipulating shapes. Useful techniques include:
These techniques work on binary images where the object equals a value and the background is zero. Figure 1 shows a binary image with the background equal to 0 and the object equal to 200. All of the figures in the article will use the same format.
- erosion and dilation
- opening and closing
- outlining
- thinning and skeletonization
Erosion and Dilation
The erosion and dilation operations make objects smaller and larger. These operations are valuable in themselves and are the foundation for the opening and closing operations. Erosion, discussed briefly in "Image Segmentation Using Edges and Gray Shades" (Phillips June 1993), makes an object smaller by removing or eroding away the pixels on its edges. Figure 2 shows the result of eroding the rectangle in Figure 1.Dilation makes an object larger by adding pixels around its edges. Figure 3 shows the result of dilating the rectangle in Figure 1. I set any zero pixel that was next to a 200 pixel (shown as asterisks).
There are two general techniques for erosion and dilation. The technique introduced in the last article (Phillips June 1993) employs a threshold (Russ 1992). Another technique (Myler and Weeks 1993) uses masks to erode and dilate in desired directions.
The threshold technique looks at the neighbors of a pixel and changes its state if the number of differing neighbors exceeds a threshold. Listing 1 shows the erosion and dilation routines that use this method. The loops in the erosion routine examine every pixel in the_image equal to value. They count the number of zero neighbors and set the pixel in question to zero if the count exceeds the threshold parameter. Figure 2 used a threshold parameter of 3. Compare this to Figure 4 (threshold = 2).
The loops in the dilation routine do the opposite. They count the value pixels next to a zero pixel. If the count exceeds the threshold parameter, set the zero pixel to value. The dilation in Figure 3 used threshold = 0 while Figure 5 used threshold = 2.
The masking technique (Myler and Weeks 1993) lays an nxn (3x3, 5x5, etc.) array of 1s and 0s on top of an input image and erodes or dilates the input. With masks you can control the direction of erosion or dilation. Figure 6 shows four 3x3 masks (you could use 5x5, 7x7, etc. masks). The first two masks modify the input image in the vertical or horizontal directions while the second two perform in both directions.
Figure 7 shows the results of dilating the object of Figure 1 using the four masks of Figure 6. The procedure is:
1. Place the 3x3 mask on the object so that the center of the 3x3 array lies on the edge of the object.
2. Place a 200 everywhere a 1 from the mask lies.
The object in the first part of Figure 7 has been dilated, or stretched, vertically. The second result is a horizontal dilation. The third and fourth show dilation in both directions. These last two differ in dilating the corners of the object.
Mask erosion is the same, but opposite. You lay the 3x3 mask on the image so that the center of the array is on top of a zero. If any of the 1s in the mask overlap a 200 in the image, set the 200 to zero. Vertical mask erosion removes the top and bottom rows from an object. Horizontal mask erosion removes the left and right columns, and the other masks remove pixels from all edges.
Listing 2 shows the routines for mask erosion and dilation. mask_dilation copies the correct directional mask specified by the mask_type parameter and goes into the looping code. The loop moves through the input image and lays the 3x3 mask on top of every pixel in the image. The inner loops examine those places where the 3x3 mask equals 1. If the_image is greater than 1 (non-zero) at that place, set max to the input image value. After exiting the loop, set the out_image to max. This changes zero pixels to value and enlarges or dilates an object in the_image.
The mask_erosion routine performs the opposite function. Its inner loops look at those places where the 3x3 mask is 1 and try to find pixels in the_image that are less than min (that are zero). If there are any zeros in this part of the_image, set out_image to zero. This removes value pixels, makes them zeros, and erodes an object in the_image.
Photograph 5 illustrates directional dilation. The left section shows the segmentation of the house image. The center section shows dilating with a vertical mask, and the right section shows dilating with a horizontal mask.
Opening and Closing
opening and closing help separate and join objects. They are powerful operators that are simple combinations of erosion and dilation. opening spaces objects that are too close together, detaches objects that are touching and should not be, and enlarges holes inside objects. The first part of Figure 8 shows two objects joined by a "thread." The second part shows how opening eliminated the thread and separated the two objects. The rest of the figure shows how opening enlarges a desired hole in an object.Opening involves one or more erosions followed by one dilation. Eroding the object of Figure 8 twice erases the thread. A dilation enlarges the two objects back to their original size, but does not recreate the thread. The left side of Photograph 6 is a segmentation of the house image (Phillips June 1993). The right side is the result of opening (three erosions followed by one dilation). Although excessive, it shows how opening spaces the major objects.
closing joins broken objects and fills in unwanted holes in objects. The first part of Figure 9 shows two objects that should be joined to make a line. The second part shows how closing removes the break in the line. The last two parts of Figure 9 show how closing fills in unwanted holes in objects.
Closing involves one or more dilations followed by one erosion. Dilating the top part of Figure 9 twice enlarges the two objects until they join. An erosion thins them back to their original width. Dilating the third part of Figure 9 twice makes the box bigger and eliminates the hole. Eroding shrinks the box back to its initial size.
Listing 3 shows the routines that perform opening and closing. They call the mask erosion and dilation routines, but calling the threshold erosion and dilation routines would work just as well (homework for the reader). opening calls mask_ dilation one or more times and mask_erosion once. closing calls mask_erosion one or more times and mask_dilation once. These are simple, yet powerful routines.
Special Opening and Closing
The opening and closing operators work well, but sometimes produce undesired side effects. closing merges objects, but sometimes merges objects that it shouldn't. Figure 10 shows such a case. Photograph 7 shows the result of closing applied to Photograph 2. closing closed the holes in the objects, but also joined distinct objects. This distorted the segmentation results. opening enlarges holes in objects, but can break an object. Figure 11 shows a case where opening broke an object and eliminated half of it.The answer is special opening and closing routines that avoid these problems. Figure 12 shows the desired result of such special routines that open and close objects, but do not join or break them.
The first difficulty to overcome is what I call the 2-wide problem. In opening, you erode an object more than once, and an object that is an even number of pixels wide can disappear. The first part of Figure 13 shows a 2-wide object. The second part shows the object after one erosion, and the third part shows it after two erosions. The object will disappear entirely after several more erosions.
A solution to the 2-wide problem is my own variation of the grass fire wavefront technique (Levine 1985). My technique scans across the image from left to right looking for a 0 to value transition. When it finds one, it examines the value pixel to determine if removing it will break an object. If removal does not break the object, it sets the pixel to 0 and continues scanning. Next, it scans the image from right to left and does the same operation. Then it scans from top to bottom, and finally from bottom to top. The different scans will not erode away an object that is 2-wide.
The key to special opening is not breaking the object. One solution places the pixel in question in the center of a 3x3 array. Find every value pixel next to the center pixel. Do all of those pixels have value neighbors other than the center pixel? If yes, then erode or remove the center pixel. If no, then removing the center pixel will break the object. The top part of Figure 14 shows cases where removing the center pixel will break the object. The bottom part shows cases where removing the center pixel will not break the object. Here, every 200 has a 200 neighbor other than the center pixel.
A similar problem in special closing is not joining two separate objects when you dilate or set a pixel. One solution is to place the pixel in question in the center of a 3x3 array. You grow objects in this array and check if the center pixel has neighbors whose values differ (Phillips February 1993). If their values differ, do not set the center pixel because this will join different objects. The top part of Figure 15 shows 3x3 arrays and the results of growing objects. The center pixel has neighbors that are parts of different objects, so you do not set the center pixel. The bottom part of Figure 15 shows another set of 3x3 arrays. Here, the non-zero neighbors of the center pixel all have the same value, and we can set the center pixel.
The source code to implement special opening and closing, shown in Listing 4, is basic but long. The special_opening routine calls thinning (instead of erosion) one or more times before calling dilation once. thinning works around the 2-wide problem while performing basic threshold erosion. thinning has four sections one each for scan (left to right, right to left, top to bottom, and bottom to top) recounted earlier. Whenever thinning finds a pixel to remove, it calls can_thin to prevent breaking an object. can_thin checks the non-zero neighbors of the center pixel. If every non-zero pixel has a non-zero neighbor besides the center pixel, can_thin returns a 1, else it returns a zero.
The special_closing routine calls dilate_not_join one or more times before calling erosion once. dilate_not_join uses the basic threshold technique for dilation and calls can_dilate to prevent joining two separate objects. can_dilate grows objects in a 3x3 array and checks if the center pixel has neighbors with different values. If it does, the neighbors belong to different objects, so it returns a zero. can_dilate grows objects like the routines in the previous two C Image Processing installments (Phillips February 1993, June 1993). can_dilate calls little_label_and_check which resembles routines described in those two articles.
Photograph 8 shows the result of special closing. Compare this with Photograph 2 and Photograph 7. Photograph 2, the original segmentation, is full of holes. Photograph 7 closed these holes, but joined objects and ruined the segmentation result. Photograph 8 closes the holes and keeps the segmentation result correct by not joining the objects.
Photograph 9 and Photograph 10 show how to put everything together to improve segmentation results. Photograph 9 shows the outcome of eroding the segmentation result of Photograph 4. Applying special closing to Photograph 9 produces Photograph 10. Compare Photograph 4 and Photograph 10. Photograph 10 has all the major objects cleanly separated without holes.
Outlining
Outlining is a type of edge detection. It only works for zero-value images, but produces better results than regular edge detectors. Photograph 11 shows the exterior outline of the objects in Photograph 4.Outlining helps you understand an object. Figure 16 and Figure 17 show the interior and exterior outlines of objects. Outlining zero-value images is quick and easy if you have erosion and dilation at your disposal. To outline the interior of an object, you erode the object and subtract the eroded image from the original. To outline the exterior of an object, you dilate the object and subtract the original image from the dilated image. Exterior outlining is easiest to understand. Dilating an object makes it one layer of pixels larger. Subtracting the input from this dilated, larger object yields the outline.
Listing 5 shows the source code for the interior and exterior outline operators. The functions call the mask_erosion a n d mask_dilation routines. They could have called the threshold erosion and dilation routines (homework for the reader). The interior_outline routine erodes the input image and subtracts the eroded image from the original. The exterior_outline routine dilates the input image and subtracts the input image from the dilated image.
Thinning and Skeletonization
Thinning is a data reduction process that erodes an object until it is one pixel wide, producing a skeleton of the object. It is easier to recognize objects such as letters or silhouettes by looking at their bare bones. Figure 18 shows how thinning a rectangle produces a line of pixels.There are two basic techniques for producing the skeleton of an object:
Thinning erodes an object over and over again (without breaking it) until it is one pixel wide. Listing 4 contains the thinning routine. I used it as part of the special_opening routine to erode objects without breaking them. In that context, I set the once_only parameter of thinning to one, so that it would erode an image only one time. Setting once_only to zero causes thinning to keep eroding until the objects in the image are all one pixel wide.
- basic thinning
- medial axis transforms
This basic thinning technique works well, but you cannot recreate the original object from the result of thinning. If you want to do that, you must use the medial axis transform.
The medial axis transform finds the points in an object that form lines down its center, that is, its medial axis. It is easier to understand the medial axis transform if you first understand the Euclidean distance measure (don't you love these big terms that really mean very simple things?). The Euclidean distance measure is the shortest distance from a pixel in an object to the edge of the object. Figure 19 shows a square, its Euclidian distance measure (distance to the edge), and its medial axis transform.
The medial axis transform consists of all points in an object that are minimally distant to more than one edge of the object. Every pixel in the bottom of Figure 19 was the shortest distance to two edges of the object. The advantage of the medial axis transform is you can recreate the original object from the transform (more homework). Figure 20 shows a rectangle (from Figure 18) and its medial axis transform. Photograph 12 shows a block letter A, and going clockwise, the result of exterior outline, medial axis transform, and thinning.
Listing 6 shows the source code to implement the Euclidean distance measure and the medial axis transform. edm calculates the Euclidean distance measure. It loops through the image and calls distance_8 to do most of the work. distance_8 has eight sections to calculate eight distances from any value pixel to the nearest zero pixel. distance_8 returns the shortest distance it found.
The functions mat and mat_d calculate the medial axis transform in a similar manner. mat loops through the image and calls mat_d to do the work. mat_d calculates the eight distances and records the two shortest distances. If these two are equal, the pixel in question is minimally distant to two edges, is part of the medial axis transform, and causes mat_d to return value.
Integrating Shape Operations into the C Image Processing System
Listing 7 shows the new additions to the main routine of the C Image Processing System. Case 19 handles the 14 shape operations discussed herein. The show_menu routine underwent a big change from past installments. We now have 19 options, so I used a two column display for them. The final routine, get_shape_options, interacts with the user to obtain all the options needed to use the techniques.Listing 8 shows an application program that allows you to apply the shape operations to entire image files. This program created the images shown in the photographs of this article.
Conclusions
This installment in the series discussed shape operations or morphological filters. These techniques help you improve the appearance of segmentation results. They are also useful for other situations. As with all the image processing operators in this series, you must experiment. Try the techniques and tools in different combinations until you find what works for the image or class of images at hand.
References
Levine, Martin D. 1985 Vision in Man and Machine. New York, New York: McGraw-Hill.Myler, Harley R. and Arthur R. Weeks. 1993. Computer Imaging Recipes in C. Englewood Cliffs, New Jersey: Prentice Hall Publishing.
Phillips, Dwayne. February 1993. "Image Processing, Part 9: Histogram-Based Image Segmentation," The C Users Journal. Lawrence, KS: R&D Publications.
Phillips, Dwayne. June 1993. "Image Processing, Part 10: Segmentation Using Edges and Gray Shades," The C Users Journal. Lawrence, KS: R&D Publications.
Russ, John C. 1992. The Image Processing Handbook. Boca Raton, Florida: CRC Press.