Graphics and Imaging


Visualizing Depth Images

Dwayne Phillips

A couple of simple transforms can make depth information really stand out in a picture.


This article focuses on techniques for visualizing depth images. A depth image is similar to an ordinary grayscale image, except that the brightness of each pixel represents a "depth" or some other quantity that can be represented by a scalar. An example of a depth image is a map of ocean floor depth produced by sonar measurements. (These images are also called range images, because the gray level represents the range from the sensor.)

Images 1 and 2 show more examples of depth images. Image 1 (from [1] and [2]) is an American 25-cent piece. The brighter areas of the quarter are closer to the viewer than the darker areas. Image 2 is one I created. It is a sinusoid radiating from a point in the upper left-hand corner. The source code to create this is in file myown.c. (Not shown — all source code for this article is available on the CUJ ftp site. See p. 3 for downloading instructions.) This image is similar to the depth images I used in [3] to illustrate random dot stereograms.

The two images shown so far are interesting, but difficult to interpret. Their "depth" or third dimension is not readily apparent. It isn't hard to "see" the coin, but that object is familiar and easy to picture in our minds. The sinusoid picture, however, looks like rings — not a 3-D object.

"Depth" images such as this are becoming more prevalent everyday. It can be helpful to display such images in such a way that the depth is actually represented as a third dimension. The remainder of this article examines two image processing algorithms that help people see the third dimension: embossing and the isometric transform.

Embossing

Embossing is an image processing operation, but it is also a familiar concept in the physical world. There it refers to embedding an image into paper; an embossing stamp crimps the paper to add a third dimension. An example of physical embossing is a notary public notarizing a document. The embossing image processing operation is similar to the edge detection algorithms presented in [4] and [5]. A depth image is convolved with an embossing mask. The embossed image helps the viewer see the third dimension in the depth image.

Figure 1 shows portions of the xemboss program. The convolution operation is the heart of this program. Convolution consists of (1) multiplying each number in the convolution mask by the corresponding number in a 3x3 area of an image; (2) adding these products together; and (3) placing the sum in a single pixel (the center pixel of the corresponding 3x3 area) of the output image. The emboss_convolution routine at the end of Figure 1 implements this operation.
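Since the listing itself is not reproduced here, the following is a minimal, self-contained sketch of the operation. The fixed image size, function name, and the mask shown are all assumptions for illustration; the mask is merely a hypothetical member of the 0 through 6 family (entries summing to 1), not one of the actual masks from Figure 1.

```c
#include <stdio.h>

#define ROWS 5
#define COLS 5

/* Convolve a 3x3 mask over the interior of an image: multiply each
   mask entry by the corresponding pixel in the 3x3 neighborhood,
   sum the nine products, and store the sum at the center pixel of
   the output image. */
void convolve3x3(short in[ROWS][COLS], short out[ROWS][COLS],
                 short mask[3][3])
{
    int i, j, a, b;
    long sum;
    for (i = 1; i < ROWS - 1; i++) {
        for (j = 1; j < COLS - 1; j++) {
            sum = 0;
            for (a = -1; a <= 1; a++)
                for (b = -1; b <= 1; b++)
                    sum += mask[a + 1][b + 1] * in[i + a][j + b];
            out[i][j] = (short)sum;
        }
    }
}
```

With a mask whose entries sum to 1, a uniform-gray input comes back unchanged, which matches the behavior described below for masks 0 through 6.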

The first part of the figure shows 14 embossing masks. The convolution output of masks 7 through 13 will be divided by three. This scaling operation will produce an output level identical to the input levels when the input is a uniform gray shade — all pixels at the same depth. When a convolution area contains differing gray shades (differing depths), the output image will contain darker and lighter areas near regions of maximum and minimum depth. This will accentuate the perception of depth.
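The divide-by-three bookkeeping is easy to check with a few lines of arithmetic. The mask below is only a hypothetical stand-in for the masks 7 through 13 family (one whose entries sum to 3), since the actual masks appear in Figure 1.

```c
/* Convolve a 3x3 mask with a uniform gray shade g and rescale.
   The mask is assumed to have entries summing to 3, as the masks
   7 through 13 evidently do: a flat input of shade g then yields
   a raw sum of 3*g, and the divide-by-three restores g, so flat
   regions keep their original gray level. */
int emboss_flat_region(short mask[3][3], int g)
{
    int sum = 0, a, b;
    for (a = 0; a < 3; a++)
        for (b = 0; b < 3; b++)
            sum += mask[a][b] * g;  /* every neighbor is g */
    return sum / 3;
}
```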

In a similar manner, masks 0 through 6 produce an output identical to the input when the input is a uniform gray shade. Masks 0 through 6 show the variations in depth to a greater degree than masks 7 through 13 do. Sometimes this greater depth perception is better, sometimes not. These masks also effectively superimpose the original gray shade over the embossing result. The images that follow will show this effect.

The major segments of the main routine in Figure 1 read an input image, call the convolution routine, and write the output image. The code that performs the image I/O operations is not listed, but is available in file imageio.c on the CUJ ftp site.

The emboss_convolution routine shown next performs the convolution. This is similar to the edge detection convolution used in [4] and [5]. The copy_array routine (not shown) copies the desired mask into the mask array. Note that when masks 7 through 13 are used, the result is divided by three. This keeps the output image from becoming saturated.

This program is invoked via a command line. The user enters the names of the input and output images and the number of the convolution mask to use.

Figure 2 shows a portion of a depth image at the top and the result of embossing at the bottom. In this particular embossing operation I used mask 10 — a simple mask that increases the perception of depth for vertical edges.

Figure 3 shows the same depth image at the top and the embossing result of mask 3 at the bottom. Note how this result amplifies the perception of depth. In Figure 2, the left edge of the block rose to a gray level of 85 and the right edge dipped to 30. In Figure 3, these rises and dips went to 130 and 0.

Images 3 and 4 show the results of embossing applied to a simple depth example. These two images are composites, with the original depth image in the upper left-hand corner and three embossing results in the remaining quadrants. Image 3 shows an original depth image with two blocks falling away from the viewer (indicated by their darker-than-background shade in the center). The result quadrants were created with (going clockwise) masks 7, 9, and 8. The upper right and lower left embossing results (masks 7 and 8) improve the depth perception. The lower right result (mask 9) seems to make the blocks appear closer than the background. This is because the shadows are on the right and the bottom, and people assume the light shines from the left. Why we make this assumption is a mystery, but we do, and the result is a false perception of depth.

Image 4 shows the result of embossing the same input with (going clockwise) masks 0, 2, and 1. Note the increase in perceived depth. The upper right (mask 0) and lower left (mask 1) results look like buttons that have been depressed. The lower right (mask 2) looks like buttons that are popping up towards the viewer. Again, this is because our minds assume the light is coming from the left. Try to imagine the light shining in from the right. If you can do this, you will see depressed buttons again.

The next two images further illustrate how embossing accentuates depth perception. Image 5 shows the results of four different embossing masks applied to the quarter shown in Image 1. These four results used some of the masks 7 through 13. Image 6 shows similar results using masks 0 through 6. Image 6 tends to pop out at the viewer more; at least I think so. This is a matter of perception, and different people perceive it differently. Images 5 and 6 differ because of the 1 in the center of masks 0 through 6. This 1 has the effect of superimposing the input image's gray value over the result. Which result is better depends on the input image and your desired output.

Regardless of preference, Images 5 and 6 show how embossing helps people see depth in depth images. The shadows in Images 5 and 6 pop out as depth much better than those shown in Image 1.

Isometric Transforms

Another operation that helps us perceive depth is the isometric transform. Most people are probably unfamiliar with the term, but will recognize the operation when they see it. This transform tilts an image and draws curves whose height corresponds to the depth at each point in the depth image. Isometric transforms manipulate a depth image to create hills and valleys.

I present two different ways of performing the isometric transform. The first uses a simpler calculation and allows viewing the "other side" of a depth image. The second presents a better view of the image, but is more complicated, and it requires another operation to view the other side of the depth image.

A Simple Isometric Transform

Figure 4 shows the geometry of the first isometric transform. The transform performs two operations on the input image. The first operation shifts individual rows of the image horizontally, with more shift occurring for each row further down in the image. The angle theta in Figure 4 governs the amount of shift. Theta can be positive or negative; choosing a negative theta is what allows viewing the "other side" of the depth image. The second operation raises the image proportionally to the height in the depth image. This creates the hills in the output.

These two operations create an output image that has more rows and columns than the input image. The output image is wider because of the shift in the rows. It is taller because each pixel was plotted at a different height than its original location, to indicate depth.

Image 7 shows results of the first isometric transform on image 1. The upper left quadrant shows the basic transform. The rows are shifted over, and for each row, a black curve is drawn whose height at any point is proportional to the depth image value in the row. The upper right quadrant demonstrates what happens when the angle theta is negative. This resulting image shows the other side of the hills in the coin.

The lower left quadrant shows a variation on the upper left quadrant. In the upper left, every pixel of the depth image is plotted in the transform. In the lower left, every fifth pixel is plotted. Skipping some of the pixels in the transform provides a different view of the changing depth.

The lower right quadrant shows another variation. In the other three quadrants, a black pixel is placed at the top of the hill created by the transform. In the lower right quadrant, the hill is topped with the gray shade from the depth image. Depending on the input image and the purpose of the transform, this last variation could improve the perception of depth.

Figure 5 shows portions of the program that perform the first isometric transform. The first part of the figure defines constants. The first piece of code in the main routine uses the size of the depth image and the angle theta to calculate the size of the output image and create it. After filling the output image and reading the input image, the code finds the maximum depth from the depth image. This max value is used to scale the hills in the output image.

The code inside the loops over i and j performs the two operations in the isometric transform. The first operation uses the tangent of the angle theta to shift the rows. The second calls the lineup routine to create the hill in the output image. The lineup routine is shown at the end of the figure.

The space variable specifies whether to skip any rows, as was done in the lower left quadrant of Image 7. The value variable specifies whether to put a black pixel at the top of the hill or to use the input image's value, as in the lower right quadrant of Image 7.
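The lineup routine itself is not listed here, so the following is only a guess at its shape, under assumed names and dimensions: it drops a marker at the top of the hill for one input pixel, honoring the space and value options just described.

```c
#define OUT_ROWS 32
#define OUT_COLS 32

/* Plot the top of one hill in the output image.  'space' plots only
   every space-th column (as in the lower left of Image 7); 'value'
   chooses between a black marker (0) and the depth image's own gray
   shade (as in the lower right of Image 7).  This is a sketch of
   what a lineup-style routine might do, not the code from Figure 5. */
void lineup(short out[OUT_ROWS][OUT_COLS], int row, int col,
            short gray, int j, int space, int value)
{
    if (space > 1 && (j % space) != 0)
        return;                        /* skip this column entirely */
    if (row >= 0 && row < OUT_ROWS && col >= 0 && col < OUT_COLS)
        out[row][col] = value ? gray : 0;
}
```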

This program is called from the command line. The user enters the names of the input and output files, the angle theta, and the space and value parameters described in the previous paragraph.

A Better Isometric Transform

Figure 6 shows the geometry of the second isometric transform. Like the first transform, this one also performs two operations on the depth image. However, in this case the shifting operation tilts both the rows and columns of the depth image, not just the rows as in the previous transform. Figure 6 shows how two angles, alpha and beta, create the two shifts. The second operation, drawing the curves to create hills, is the same as in the first transform.

This transform also creates an output image larger than the input image. The larger dimensions depend on the angles alpha and beta.

Figure 7 shows portions of the source code that perform this transform. The basic parts are similar to those shown in Figure 5. The code uses the size of the depth image and the angles alpha and beta to calculate the size of the output image. The max value of the depth image is used to scale the size of the hills in the output image.

The code inside the loops over i and j performs the two operations of the transform. The ii and jj variables hold the results of the shifting equations shown at the bottom of Figure 7. The call to the lineup routine creates the hills in the output image. The space and value variables work the same here as in Figure 5.

This program is also called from the command line. The user enters the names of the input and output images, the angles alpha and beta, and finally, space and value.

Image 8 shows results of this transform performed on Image 2. It is much easier to see the depth with this view than it is with Image 2. The hills and valleys jump out at the viewer. In the left side of this image the transform puts a black pixel at the top of each hill. It is possible to space out these pixels just like in Image 7. In the right side of Image 8 the value from the depth image is placed at the top of each hill. Which is better depends on what you want.

One shortcoming of this program is that it does not handle negative values of alpha and beta, which prohibits viewing the other side of the hills. The reader may want to modify the program to allow this. One workaround is to use a utility program I call flip, which performs 90-degree rotations of an image. The code for this program is not shown, but is available on the CUJ ftp site.
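The flip code is not shown either, but a 90-degree rotation is short enough to sketch; applying it twice gives the 180-degree flip used to produce Image 9. The fixed dimensions here are just for illustration.

```c
#define IN_ROWS 2
#define IN_COLS 3

/* Rotate an IN_ROWS x IN_COLS image 90 degrees clockwise: pixel
   (i, j) of the input lands at (j, IN_ROWS-1-i) of the
   transposed-size output.  Two applications yield a 180-degree
   flip. */
void rotate90(short in[IN_ROWS][IN_COLS], short out[IN_COLS][IN_ROWS])
{
    int i, j;
    for (i = 0; i < IN_ROWS; i++)
        for (j = 0; j < IN_COLS; j++)
            out[j][IN_ROWS - 1 - i] = in[i][j];
}
```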

Image 9 shows the result of flipping Image 1 by 180 degrees. Image 10 shows the isometric transform of the flipped image. This shows the other side of the hills in the image. It is easy to see the hills of the top and back side of the figure on the coin.

Conclusion

This article has presented a first venture into the creation of 3-D images. The embossing and isometric transforms presented make it easier to perceive depth in depth images. I encourage readers to improve the programs shown here. There are many other topics in 3-D images to explore, including the reflection properties of surfaces. There is plenty to learn and lots more fun available.

References

[1] John C. Russ. The Image Processing Handbook, Third Edition (CRC Press, 1999).

[2] The Image Processing Toolkit, Version 2.5. Reindeer Games Inc., http://members.aol.com/ImagProcTK.

[3] Dwayne Phillips. "Image Processing, Part 16: Random Dot Stereograms," C/C++ Users Journal, April 1996.

[4] Dwayne Phillips. "Image Processing, Part 6: Advanced Edge Detection," The C Users Journal, January 1992, pp. 47-63.

[5] Dwayne Phillips. "Image Processing, Part 5: Writing Images to Files and Basic Edge Detection," The C Users Journal, November 1991, pp. 75-102.

Dwayne Phillips has worked as a software and systems engineer with the US government since 1980. He has written Image Processing in C, R&D Publications, 1994; and The Software Project Manager's Handbook, Principles that Work at Work, IEEE Computer Society, 1998. He has a Ph.D. in electrical and computer engineering from Louisiana State University. He can be reached at d.phillips@computer.org.