Dr. Dobb's Journal February, 2005
Shadows are among the most noticeable visual cues and are important for human perception. Consequently, accurate rendering of shadows is paramount for realistic computer-generated images. In this article, I examine five algorithms that generate dynamic shadows in 3D graphics applications. I've implemented these algorithms in the Small Dynamic Shadows Library (available electronically; see Resource Center, page 5), a freely available OpenGL-based library.
Finding the regions of space shadowed from a given light source implies solving the visibility problem for that light source. In other words, if a viewing camera were to replace the light source, all points in space visible to that camera should be illuminated, whereas all points hidden from this imaginary camera are shadowed. In general, generating shadows for moving or animated objects is more difficult than producing static shadows. A common, albeit simplistic, approach is to draw a dark spot underneath moving objects. Even when approximate or geometrically incorrect, this is visually better than not having any shadows at all. The dark spot depicting the shadow can have soft edges, further improving its appearance. Many popular video games (from "Return to Castle Wolfenstein" to "Grand Theft Auto 3") rely on this technique. One approach used to improve the appearance of this spot shadow is to elongate it in the direction opposite the predominant light source. Another is to draw multiple intersecting dark spots. This works well in many instances, especially low-light-intensity environments.
Although faking dynamic dark spot shadows is still in use, it is hardly enough to draw a compelling picture of the virtual world. Luckily, with the improvements in computer performance, it is now possible to consider more advanced approaches producing geometrically correct dynamic shadows.
One popular solution for a constrained situation where dynamic shadows are always cast onto a plane is to find a projection matrix that flattens the object onto that plane, then draw the entire model a second time with that transform and a dark, near-black ambient color (see Figure 1). Although constrained, such a situation often occurs in sports games, where players cast shadows onto the playing field.
It is not too difficult to find the required transformation matrix. If the normal to the plane onto which the projection is sought is n and some point on that plane is P, then the plane can be represented by the equation n·(X-P)=0, where X is any point on that plane. Given a light source L and an arbitrary point M on the model (Figure 2), Example 1(a) holds because the distance from the light source to the projection point, X-L, relates to the distance from the light source to point M in the same way that the distance from the light source to the plane relates to the distance from the light source to point M projected onto the direction of the normal, n·(L-M) (Figure 2). Thus, the projection point relates to the point on the model as in Example 1(b), or, in vector notation, as in Example 1(c). With some simple arithmetic, this transformation can be described as a 4×4 matrix transforming any point M on the model into a point X on the plane; see Example 1(d).
However, there are several potential complications with this algorithm, starting with the possibility of Z fighting. After all, the shadow model is drawn on top of the previously drawn plane, so the Z values of the two will be nearly equal, resulting in some pixels from the shadow appearing interleaved with pixels from the plane. A common solution is to push the projection plane slightly above the shadow-receiving plane.
A second problem is that the matrix you have derived projects onto an infinite plane, whereas what you usually want is more limited (a table top rather than an infinite ground plane, for example). Stenciling techniques can help. When the surface receiving the shadow is drawn, you also simultaneously draw into the stencil buffer, whereas when the shadow model is drawn, the stencil test is performed to check whether the current pixel also belongs to the previously drawn limited surface. In many situations, a destination alpha test can be used as a substitute for a stenciling test.
Finally, to improve performance, you could use separate shadowing and drawing models. Because shadows (as important as they are) are not what catch the eye first, shadow models could be considerably simpler and use a smaller number of triangles than the rendering models.
The main drawback of planar shadows is, of course, the severe limitation placed on shadow receivers. Quite often, it is necessary to cast shadows onto objects of arbitrary shape. Texture mapping and procedural generation of texture coordinates can help. In most graphics pipelines, it is possible to specify a texture-coordinate generation matrix that is applied to spatial coordinates; the result of this multiplication is used as texture coordinates. Thus, you can produce a shadow by first computing a black-on-white image of the shadow caster (see Figure 3) as seen from the position of the light source, then using it as a texture for shadow receivers whose texture coordinates are generated procedurally based on the view transformation as observed from the light-source position.
The first step of the algorithm is to draw the black-on-white image of the shadow from the position of the light source. Assume that the light source is positioned at point L, and some point on the model M is projected into the texel T (see Figure 4). The first step of the transformation is the rotation of the object-space point into the light-space coordinate system. The latter is chosen so that its k axis (Figure 4) points along the direction from the center of the shadow caster to the position of the light source. Directions of the other two axes are not particularly relevant (since the orientation of the image in the plane is not that relevant), so you can arbitrarily choose the i axis to be perpendicular to the Y axis of the object space. The last axis j is mutually perpendicular to the first two.
Thus, the axes of the light-source coordinate system can be computed as in Example 2(a). Knowing the axes, the rotation part of the transformation from the object space into the light-source space can be represented by Example 2(b). To project the point, however, you also need to translate the result of the application of Example 2(b) along the view direction by the distance from the light source to the shadow caster. Furthermore, you apply the perspective transform that determines how big the object is in the image, and finally you potentially need a view-port transform (not represented in the formula) to position the shadow caster's projection in the middle of the image; see Example 2(c).
The perspective transform lets you adapt the size of the projected shadow caster to occupy the entire texture. This transform thus depends on the size of the texture buffer and the bounding radius of the shadow caster.
For the second step of the algorithm, you use a rendered image of the shadow caster as a texture map on shadow-receiving objects. The same transform as that used in the projection is also used for procedural texture-coordinate generation. Thus, a vertex on the shadow receiver obtains a proper texture coordinate corresponding to the view from the light source. However, you may have to alter the view-port transform since the range of texture coordinates may be limited and different from the view-port transform used in the projection. As Figure 5 illustrates, this algorithm lets you cast shadows on objects of an arbitrary shape.
There are several drawbacks to this algorithm. First, a shadow is obtained on both the front-facing (light-source-facing) and back-facing sides of the shadow receiver. This is because there are points on both sides of a closed object that map to the same shadow-map texel under the transform you have found. You can deal with this problem by choosing a very low ambient illumination so that the back-facing polygons are very dark.
It is also somewhat difficult to take care of concave shadow receivers with this method, particularly when the shadow caster happens to be placed in one of the cavities of the receiver. Suppose a single model represents a tunnel through which a shadow caster is moving: the shadow may potentially be mapped onto both the floor and the ceiling of this tunnel. Because both the floor and ceiling are modeled by the same object (and thus use the same texture-coordinate generation matrix), it is impossible to suppress the drawing of the extra shadow. Consequently, it may be necessary to subdivide the scene further and use more complex line-of-sight calculations to determine which objects cast shadows onto which other objects.
Generally, the extra calculations of which objects cast shadows onto which objects complicate practical implementations of this algorithm considerably. The algorithm is also sensitive to the resolution of the texture into which the shadow image is rendered: if it is too small, the shadow is blocky. On top of that, it is hard with this algorithm to model multiple objects casting shadows onto the same receiver. However, there is an elegant solution that falls somewhere in between this algorithm and the shadow Z-buffer algorithm.
Despite these drawbacks, this is one of the more popular algorithms. A large number of excellent games (from "SSX" to "MGS2") use variations of this approach.
Besides the drawbacks of projective textures just described, there are effects that are simply impossible to do with their help alone; for instance, self shadowing is quite problematic. The shadow volumes algorithm overcomes this difficulty.
A shadow volume is an extrusion of the silhouette of a shadow caster that delimits all points in space shadowed by this caster. You can build the volume by finding the silhouette edges of a model and extruding them away from the light source. The shadow volumes algorithm draws shadows as a postprocessing step. It works by first drawing the scene without any shadows and then drawing (in a particular way) the shadow volumes into a separate image while using the Z buffer of the original image. It thus becomes possible to find all pixels that should have been shadowed, after which you can adjust the image of the scene by blending with, or stenciling against, the resulting shadow mask.
The key idea of the algorithm is to first borrow the Z buffer from the drawn image of the scene. When the shadow volumes (which are assumed to be closed polyhedrons) are drawn, their pixels are drawn an odd number of times for shadowed regions and an even number of times for illuminated regions (Figure 6). Indeed, because object B is shadowed, only the front side of the shadow volume is drawn (the back side is prevented from drawing by the Z test, since you have borrowed the Z buffer from the original scene where object B was drawn). This is not the case with object A and the other shadow volume.
This observation lets you formulate the following algorithm: First you draw the scene as is, without any shadow volumes. Following that, you continue using the Z buffer from the first image, but switch off Z buffer updates. If you have a stencil buffer permitting positive and negative values, you draw the front faces of the shadow volumes with value 1 into the stencil buffer (initialized with all zeroes) and draw the back faces with the value of -1. Consequently, you obtain a stencil where illuminated pixels are marked with a 0 and shadowed pixels with a positive integer. You can now apply the stencil to the original image so that only illuminated pixels remain in it. A variation of this approach is possible if the graphics pipeline supports flexible accumulation of colors. In such a case, you can create a binary image where shadowed pixels are black, and blend such an image over the original scene image. Note that you need to paint front- and back-facing sides of the shadow volumes in different colors to account for a situation where two shadow volumes intersect.
The difficulty of this algorithm is, of course, in the necessity to build the shadow volumes. Even with a static light, you may have to recalculate the shadow volumes for dynamic objects whose silhouette, as seen from the light source, may change every frame. To build a shadow volume, you extrude the silhouette edges in the direction away from the light source (Figure 7). An edge of the model is a silhouette edge if one of the triangles joined at that edge faces toward the light source, whereas the other faces away. This can be easily tested by computing the dot product of each triangle's normal and the direction of view for that triangle. Thus, by traversing all edges and computing two dot products per edge, you find all silhouette edges. To do this efficiently, you can precompute an edge-based representation of the model in which the two triangles adjoining every edge are stored.
In real life, many models have artifacts that complicate the process just described. Some edges may be dangling, with only a single adjacent triangle; other edges may join more than two triangles. The most reasonable solution is to edit models designated for use with this algorithm so that they are closed polyhedrons and don't contain the situations just described. At the very least, you can use special, refined shadow models for generating the shadow volumes, while less strict drawing models are used for rendering. Such strategies are more reasonable in practice than trying to complicate a silhouette-detection algorithm by searching for closed loops of edges and so forth.
The length of the volume is also difficult to compute. If it is too long, much of the graphics accelerator's fill rate is wasted. You may also run into a situation where a shadow volume is projected onto a thin wall and you observe a shadow on the opposite side of that wall simply because the volume is too long. If the length is too short (and the volumes are open), nasty artifacts can result. Indeed, it is common not to close the volumes from above (the model itself fills that gap) nor from the bottom, which likely intersects with shadow receivers in the world. There are instances, however, when closing the volumes is still necessary. If the camera moves inside a shadow volume, the clipping plane opens the volume and interferes with the algorithm, and you can't determine shadowed pixels correctly, because the algorithm is based on the assumption that the volumes are closed (thus, every illuminated pixel is drawn at least twice: once from the front-facing surface of the volume and once from the back-facing surface).
One solution is to introduce an extra shadow-drawing pass (at least for the volumes that are likely to be clipped by the front plane). In this pass, you disable the Z test and reverse the integers used to paint front and back faces. The effect of this operation is that you compute the caps on the shadow volumes and can proceed with the actual drawing of the volumes without worrying about a situation where the camera may be inside.
Shadow volumes have several significant shortcomings; one is the necessity to construct volumes for dynamic objects on the fly, every frame. There is an older alternative algorithm that requires neither building triangles dynamically nor severe limitations on an object's shape: the shadow maps algorithm.
In its classical form, the shadow maps algorithm requires some special hardware support. Nonetheless, there are variations of it (or approximations) that can be done in software only.
The classical shadow-maps algorithm requires access to the Z buffer. It first computes the image of the scene from the position of the light source. The Z buffer of this image is saved as a texture and mapped onto the scene viewed by the camera, which can be done in the manner described for projective textures. Of course, the Z values stored in the texture correspond to points visible from the light source. If, for any point, you can compute the distance to the light source directly and compare it with the texel value mapped onto this point, you can determine whether the point is illuminated by the light source. If the distances are the same, it is illuminated; if they are different (namely, the distance computed directly is larger than that stored in the corresponding texel), something else was closer to the light source and the given point is shadowed. In Figure 8, point B is shadowed because its distance to the light source is larger than the distance recorded in the texture (namely, the distance to point C). This is not the case for point A, which is illuminated.
One step of this algorithm (namely, computing the distance to the light source from some point when viewing the scene from the camera) normally requires some hardware assistance, or at least flexibility in the graphics pipeline. However, it is still possible to approximate this algorithm using only commonly available operations. Illumination calculations for attenuated point-light sources allow coloring objects based on their distance to the light sources, and blending several images together permits you to mark the pixels that should have been shadowed.
Consider this algorithm: You first draw the image of the scene from the position of the light source, using only ambient white light for this light source, linearly attenuated with distance. Normally, three attenuation coefficients are present in most renderers, and the intensity of the attenuated light is computed as in Example 3. For this algorithm, you only need the linear attenuation coefficient, set to a value dependent on how large the scene is; that is, dependent on where this light should stop producing any contribution. Thus, with such illumination set up, pixels that are closer to the light source appear brighter, whereas pixels that are farther away are dimmer (see Figure 9).
As the next step of the algorithm, you compute two images from the position of the camera. In one image, you project onto all objects the texture computed during the previous step, using projection from the light source; this is done with the help of the mechanism described for projective textures. For the second image, you compute the scene with the same attenuated light source as was used in the first step. Figure 10 demonstrates the two images, which contain approximations of the distances to the light source. In the left image, the color of the point closest to the light source also colors all pixels farther along the ray; in the right image, every pixel is colored depending on its own distance only. By subtracting the value of every pixel in the right image from the value of the corresponding pixel in the left image, you can mark the shadowed pixels. If the result of the subtraction is positive, the pixel is shadowed; if it is zero, it must have been illuminated (see Figure 11).
By scaling the values of the shadow mask up and clamping from above, you can obtain a binary shadow mask where all shadowed pixels are white and illuminated pixels are black. Note also that negative values indicate points that were not mapped with the computed texture in the left image of Figure 10. These points can be considered illuminated.
As the last step, you draw the scene with normal illumination and subtract the shadow mask from this image. That is, from the value of every pixel of the image, you subtract the value of the corresponding pixel of the shadow mask. The right image in Figure 11 illustrates the result. Note that in the case of OpenGL, the subtraction of images can be done with the help of the accumulation buffer.
A significant drawback of this algorithm is the potential artifacts that may appear on open models illuminated from the inside. A pixel illuminated from the back face of a polygon is not distinguished from a pixel illuminated from the front. Thus, if a light shines into an opening in an object, you may observe a light spot on the side of the object opposite the opening. The algorithm itself cannot detect these situations; better modeling of objects, ensuring that there are no one-polygon-thick surfaces, helps avoid these problems.
The most significant drawback, however, is the necessity to do multiple rendering passes for every light source producing shadows. On the positive side, the algorithm is very general and inherently produces self shadowing without placing significant limits on the shape of the models in the scene. Finally, there is no need to construct geometry on the fly, an expensive process.
The main difference between shadow maps and priority maps is that shadow maps compute approximations of distances per pixel, while priority maps do it on a per-object basis. By sorting all objects based on their distance to the light source, you can assign ambient color to every object so that the closest is the brightest, and the farthest is the dimmest.
The following steps are similar to the shadow maps algorithm. The texture map is drawn from the position of the light source using the assigned ambient colors. Then the camera image is drawn twice: once using the previously computed texture map projected onto all objects, and a second time using the objects' assigned colors. The two images are subtracted to obtain the binary shadow mask, which is then subtracted from the actual image drawn using proper illumination of the scene.
As the resulting image illustrates (see Figure 12), the ability to produce self shadows has been lost. Indeed, the entire object is drawn with the same color, and thus, only the shadows produced by other objects can be computed. The use of sorting in this algorithm also assumes that the objects are not particularly elongated and placed in complex configurations. Otherwise, sorting may not be able to place all objects into an unambiguous order.
The greatest advantage of this algorithm is the simplicity with which it handles multiple shadow casters on multiple shadow receivers, something quite complicated in the case of projected textures alone.
All of the algorithms I've described have strengths and weaknesses. Planar shadows limit the shape of shadow receivers to planes only. Projective textures have difficulties with multiple casters on the same receiver and erroneously produced shadows. Shadow volumes require generating additional geometry and need multiple drawing passes. Shadow maps need too many passes and tend to produce artifacts as do priority maps.
On the other hand, planar shadows are trivial to implement, as are projective textures. Shadow volumes permit self shadowing and easily allow for multiple casters on the same receivers. Shadow maps don't require building any data structures dynamically and place only moderate limitations on the models. Priority maps inherit many good features of projective textures and shadow maps.
DDJ