Graphics


Image Rotation Using OpenGL Texture Maps

Shehrzad Qureshi

Here's a new twist on rotating images efficiently.


Introduction

Image rotation is a computationally expensive operation. The classical method for image rotation works backwards, in a sense. Assuming an image f is to be rotated about its center by θ degrees, the typical software implementation proceeds by starting with the output image g. Each pixel in the rotated image is mapped back into the input image space by a rotation of -θ degrees. So, for each pixel g(x',y'), we must find a corresponding f(x,y) such that:

Equation 1:

x = x' cos θ - y' sin θ

y = x' sin θ + y' cos θ

Due to the trigonometric terms in the above expression, more often than not x and y will not be integers. As a result, an interpolation scheme such as bilinear interpolation is needed to properly reconstruct the image at the point (x',y') (see sidebar).
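The classical backward-mapping approach can be sketched in a few lines of C++. This is a minimal illustration, not the article's actual code: the `Image` struct and `rotateImage` helper are assumptions, and the image is taken to be row-major grayscale.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Minimal row-major grayscale image (illustrative, not the article's type).
struct Image {
    int w, h;
    std::vector<float> px;
    float at(int x, int y) const { return px[y * w + x]; }
};

// Bilinear interpolation at a non-integer point (x, y).
static float bilinear(const Image& f, float x, float y) {
    int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
    int x1 = std::min(x0 + 1, f.w - 1), y1 = std::min(y0 + 1, f.h - 1);
    float dx = x - x0, dy = y - y0;
    return (1 - dx) * (1 - dy) * f.at(x0, y0) + dx * (1 - dy) * f.at(x1, y0)
         + (1 - dx) * dy * f.at(x0, y1) + dx * dy * f.at(x1, y1);
}

// For each output pixel g(x', y'), map back into the input image via
// Equation 1 (about the image center) and interpolate; pixels that map
// outside f are left at 0.
Image rotateImage(const Image& f, float thetaDeg) {
    const float th = thetaDeg * 3.14159265f / 180.0f;
    const float c = std::cos(th), s = std::sin(th);
    const float cx = (f.w - 1) / 2.0f, cy = (f.h - 1) / 2.0f;
    Image g{f.w, f.h, std::vector<float>(f.px.size(), 0.0f)};
    for (int yp = 0; yp < g.h; ++yp)
        for (int xp = 0; xp < g.w; ++xp) {
            float xr = xp - cx, yr = yp - cy;   // coordinates relative to center
            float x = xr * c - yr * s + cx;     // Equation 1
            float y = xr * s + yr * c + cy;
            if (x >= 0 && y >= 0 && x <= f.w - 1 && y <= f.h - 1)
                g.px[yp * g.w + xp] = bilinear(f, x, y);
        }
    return g;
}
```

Note the two nested loops over every output pixel, plus four multiplies per pixel in the interpolation alone; this is exactly the per-pixel cost that the texture-mapping approach pushes onto the graphics hardware.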

This article demonstrates an alternative method of rotating image bitmaps, through the use of OpenGL texture maps. Texture maps are patterns or images that can be “attached” to arbitrary surfaces [1]. This rotation algorithm is alluded to in the ubiquitous “red book” [2], with which all aspiring (and accomplished) OpenGL programmers should be familiar. If an application requires the same image to be rotated interactively, or, as in image registration algorithms, rotated repeatedly by varying perturbed values of θ, then this implementation is very efficient. The main performance bottlenecks are the instantiation of the OpenGL texture map object and the extraction of the image data from the frame buffer. If you have access to a high-performance graphics accelerator (and who doesn’t nowadays?), simply let the hardware do all the work!

Furthermore, the code presented in this article can be extended to incorporate translational displacements and more general forms of spatial warping. Spatially warping an image in this context means applying some geometric transform to each and every pixel: the client supplies the “image warper” with an arbitrary coordinate transformation that is applied to all pixels in the input image. The image rotation problem described in this article can be thought of as a special case of these warpings, with the coordinate transformation being Equation 1.

One way of rendering an image using OpenGL is through the use of glDrawPixels. Basically, the programmer tells OpenGL the raster position, and then OpenGL writes the pixel data directly to the frame buffer. Pixel functions such as glDrawPixels bypass the transformation pipeline. This article describes an alternative way of rendering an image: texture mapping a polygon. The vertex coordinates of our polygon (in this case, a rectangle) can subsequently be transformed by the graphics pipeline. Since the polygon has been texture-mapped with the image data, when I transform the polygon coordinates, I effectively transform the image pixel data. The code that implements this scheme was developed and tested with Visual C++ v6.0 (NT4), using the OpenGL DLL provided by Microsoft (opengl32.dll).

Initializing the Texture Map

Listing 1 contains the OpenGL initialization code for the aforementioned texture-mapping scheme. The model-view and projection transformations are set up for two-dimensional rendering via the gluOrtho2D call. These parameters define the width and height of the orthographic viewing volume, sometimes referred to as a “rectangular parallelepiped.” This is in contrast to a perspective projection, which can be thought of as a “frustum,” or truncated pyramid. As its name suggests, using a perspective projection results in objects closer to the viewing location appearing larger, while objects farther away appear smaller, culminating in a vanishing point that depends on the characteristics of the frustum. For simplicity, this application utilizes an orthographic projection, where the relative distance from the viewing location to the object does not matter (i.e., objects far away are rendered the same as if they are very close). In order to implement the generalized warping techniques alluded to earlier, you would need to use a perspective projection.

In order to avoid unwanted subtle changes in the image dot pitch (spatial resolution), you need to be careful with the inputs to gluOrtho2D. For example, suppose you want to render an MxN image onto a rendering context that is also of size MxN pixels, which can be specified by an appropriate call to glViewport. In this case, gluOrtho2D(0,N,0,M) is actually incorrect. Remember, OpenGL is zero-based, and the effect of this call will be to scale the pixel values so that OpenGL renders an MxN image onto an (M+1)x(N+1) rendering context. In order to maintain spatial resolution, the correct call is gluOrtho2D(0,N-1,0,M-1), or as I have:

gluOrtho2D((-N/2)+.5,(N/2)-.5,
           (-M/2)+.5,(M/2)-.5)

The motivation for the latter form is that I want to rotate the image about its center point; defining the coordinate system in this fashion fixes the OpenGL origin squarely in the center of the image. Since I am going to rotate the image about its center, the subsequent viewing transformations are trivial, as I will soon show.
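The centered bounds in the snippet above can be expressed as a tiny helper, shown here as a sketch (the `centeredBounds` name and struct are illustrative; the article passes the expressions to gluOrtho2D directly). Note that the span in each direction is N-1 and M-1, matching gluOrtho2D(0,N-1,0,M-1) but with the origin shifted to the image center.

```cpp
// Bounds for gluOrtho2D so that an M-row by N-column image maps one texel
// per pixel with the OpenGL origin at the image center. Uses floating-point
// division; for the even (power-of-two) dimensions required of texture maps
// this matches the integer arithmetic (-N/2)+.5 in the original snippet.
struct OrthoBounds { double left, right, bottom, top; };

OrthoBounds centeredBounds(int M, int N) {
    return { -N / 2.0 + 0.5, N / 2.0 - 0.5,
             -M / 2.0 + 0.5, M / 2.0 - 0.5 };
}
```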

The remainder of the initialization code is fairly straightforward. Texture mapping is enabled and configured, and a GL texture map identifier is created. Finally, a display list is compiled, which defines the vertices of the square that are to be texture mapped, and also which maps texel [3] coordinates to these same vertices (see Figure 1).

Display lists are an OpenGL optimization mechanism, whereby instead of constantly redefining primitives directly within the main rendering loop, a “pre-compiled” (to borrow OpenGL parlance) primitive list is called each time through the rendering loop.

The Rendering Loop

Listing 2 contains the code that executes whenever the window needs to be painted (i.e., the function that the MFC framework method OnPaint calls). The CglrotateDlg class encapsulates an attribute m_fTheta, which is the angle of rotation. glRotatef(m_fTheta, 0.0, 0.0, 1.0) essentially spins the model-view coordinate frame about its z axis (think of this axis as being perpendicular to your monitor screen), with the effect on the texture-mapped polygon shown in Figure 2.
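The effect of glRotatef(m_fTheta, 0.0, 0.0, 1.0) on each vertex of the square can be sketched as a plain 2-D rotation; this is an illustration of what the fixed-function pipeline computes, not driver code, and the `Vec2`/`rotateZ` names are assumptions.

```cpp
#include <cmath>

// Rotation about the z axis, as glRotatef(theta, 0, 0, 1) applies it to
// each (x, y) vertex of the texture-mapped square (z is unchanged).
struct Vec2 { float x, y; };

Vec2 rotateZ(Vec2 v, float thetaDeg) {
    float th = thetaDeg * 3.14159265f / 180.0f;
    float c = std::cos(th), s = std::sin(th);
    return { v.x * c - v.y * s, v.x * s + v.y * c };
}
```

For example, rotating the corner (1, 0) by 90 degrees carries it to (approximately) (0, 1); the texel coordinates bound to each vertex ride along unchanged, which is why the image appears rotated.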

Since texture mapping has been enabled, the display list subsequently tells OpenGL to draw the input image onto this polygon, as if it were a decal. In fact, previously I had OpenGL treat the image as if it were a sticker via the call to glTexEnvf. An alternative, which is a key element of other computer graphics techniques such as light mapping, is for OpenGL to modulate the image pixel intensities based upon the polygon’s color. This option could have been specified by first defining colors for each of the polygon’s vertices and then supplying GL_MODULATE instead of GL_DECAL as a parameter to glTexEnvf.

MFC and BMP File Format

The complete distribution (located on the CUJ website at <www.cuj.com/code>) includes two sample BMP files that can be used by this application (any BMP image can be used). BMP image files are stored in the MS Windows DIB (Device Independent Bitmap) format. As is the Microsoft creed, there are countless permutations of DIB storage (apparently beginning with Windows 2000, DIBs can even encapsulate JPEGs!), none of which work well with OpenGL. For a more in-depth treatment of this subject, I encourage the reader to peruse Dale Rogerson’s excellent MSDN tutorial [4]. Suffice it to say, a not-so-well-documented function, auxDIBImageLoadA [5], takes care of the gory details of translating any DIB into an RGB image array that OpenGL understands.

While on the subject of these bitmap images, one major caveat is that OpenGL requires texture maps to have dimensions that are a power of two. This restriction arises from a feature called “mip-mapping,” a mechanism by which OpenGL can efficiently filter texture maps so that they scale nicely. This application doesn’t employ mip-maps; however, the input image size may not be a power of two. There are two courses of action: either the input image can be padded (i.e., margins added where necessary), or the image can be scaled appropriately. This application uses the latter. I know of two convenient functions that accomplish this goal: the GDI function StretchDIBits, and the OpenGL utility function gluScaleImage. In the name of being less Windows-centric, I decided to use gluScaleImage to scale an input image so that its dimensions are a power of two, if this is indeed necessary. The method CglrotateDlg::TexMapScalePow2 does just that.
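Finding the target dimensions for that scaling step is simple; a sketch follows (the `nextPow2` helper is illustrative, not part of TexMapScalePow2, which additionally calls gluScaleImage to resample the pixels).

```cpp
// Round a dimension up to the next power of two, as required of
// OpenGL 1.x texture map dimensions.
int nextPow2(int n) {
    int p = 1;
    while (p < n) p <<= 1;
    return p;
}
```

A 300x200 input, for example, would be resampled to 512x256 before being handed to glTexImage2D.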

Another interesting method deals with the extraction and saving of the rendered image. The callback method CglrotateDlg::OnSaveBmp deals with reading the image data from the frame buffer. Note the call to SetForegroundWindow — this call raises a window and must be called prior to reading the frame buffer [6]. Next, I tell Windows to suspend drawing within the window via a call to LockWindowUpdate. The frame buffer data can now safely be read into an RGB array. Once this is done, the suspension of drawing activity is lifted.

The final problem is saving the image back to disk in the BMP file format. Alas, there is no analogue to auxDIBImageLoadA for saving BMP files. By perusing MSDN, I was able to discern just enough about the file format to save the image data in a raw (uncompressed) RGB format. In fact, it’s quite simple, after one takes into account that the rows of image data must be aligned on DWORD boundaries, and the color channels need to be in BGR order. (As an aside, why on earth did Microsoft choose this particular order? Is it merely to make programmers’ lives difficult, or is there some significance to the fact that blue-green-red is the reverse of the order everyone else is familiar with?)
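Those two details can be sketched as follows; the helper names are illustrative, not the article's actual save routine.

```cpp
#include <cstdint>
#include <vector>

// Each scanline of 24-bit BMP pixels must be padded out to a DWORD
// (4-byte) boundary: round 3 * width up to the next multiple of 4.
int bmpStride(int width) {
    return ((width * 3) + 3) & ~3;
}

// Convert one row of packed RGB bytes (as read from the frame buffer)
// into a DWORD-aligned BGR scanline ready to be written to a BMP file.
std::vector<uint8_t> rgbRowToBmp(const uint8_t* rgb, int width) {
    std::vector<uint8_t> row(bmpStride(width), 0);  // padding bytes stay zero
    for (int x = 0; x < width; ++x) {
        row[x * 3 + 0] = rgb[x * 3 + 2];  // B
        row[x * 3 + 1] = rgb[x * 3 + 1];  // G
        row[x * 3 + 2] = rgb[x * 3 + 0];  // R
    }
    return row;
}
```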

When the application starts, click the right mouse button to bring up a pop-up menu. After loading an image, it can be interactively rotated using the “<” and “>” keys. The “0” key resets the angle of rotation to zero. Figure 3 shows some screenshots using an example image provided, and Table 1 summarizes the methods I overrode with the ClassWizard so that code archeologists have an easier time understanding the class.

In recent years, the use of texture maps has become commonplace in the world of 3-D computer graphics, partly because of their inherent flexibility and efficiency, but also because they are so visually rewarding. I hope I have given you a glimpse as to the power of texture maps, while also illustrating a somewhat unconventional usage of this powerful feature.

References

[1] The term “texture mapping” refers to the process of actually mapping these patterns or images onto arbitrary parameterized surfaces. The mathematics behind how the graphics pipeline actually accomplishes this can be quite involved. A very clear explanation can be found in 3D Computer Graphics by Alan Watt (Addison-Wesley, 1999).

[2] OpenGL Architecture Review Board. OpenGL Programming Guide, 2nd Edition, January 1997.

[3] The term texel is analogous to pixel. Pixel stands for “picture element,” while texel stands for “texture element.”

[4] Dale Rogerson. “OpenGL V: Translating Windows DIBs,” <http://msdn.microsoft.com/library/techart/msdn_gl5.htm>.

[5] Ron Fosner. OpenGL Programming for Windows 95 and Windows NT (Addison-Wesley, 1996).

[6] OpenGL Developers FAQ. “Why don’t I get valid pixel data for an overlapped area when I call glReadPixels() where part of the window is overlapped by another window?”, <www.opengl.org/developers/faqs/technical/rasterization.htm#rast0070>.

Shehrzad Qureshi has a BS in Computer Science from UC Davis, and an MS in Computer Engineering from Santa Clara University. He is a software engineer at Accuray Inc., where he works primarily on medical image processing applications. In his spare time, he enjoys racquetball and reruns of “Law & Order.” He can be reached at shehrzad_q@hotmail.com.