Peter Heinrich is a video and computer game programmer who has worked on products for Amiga, PC, Sega, 3DO, and MacIntosh. He studied mathematics at the University of Oregon, and now lives in Seattle. Nathan Dwyer lives and works in Seattle. His interests are widespread and include programmer/computer interfaces, and graphics.
Introduction
When developing graphics applications, programmers must frequently make a trade off between precision and speed. Many graphics algorithms must store and calculate fractional values, but if those algorithms use floating-point they may incur such a large speed hit as to become unusable. At the other extreme, integer calculations are fast, but simple integers in C cannot represent fractional values. In cases like this, programmers can use a fixed-point representation to strike a good compromise.Fixed-point representation trades range and precision for speed, and is based on a simple observation. Binary numbers are sums of powers of two. Each digit position in a binary number represents a unique power of two, that is, 2 raised to an integral exponent such as 20, 21, 22, etc. Adding places to the right adds positions of decreasing exponents: 2-1, 2-2, 2-3, and so on (see Figure 1a) . The method for converting the number to decimal remains the same as with integers: sum the powers of two whose digits have the value 1. Thus, the number 0.11 represents the decimal number 2-1 + 2-2 = 0.5 + 0.25 = 0.75, and 0101.0110 represents the number 22+20+2-2+2-3 = 4 + 1 + 0.25 + 0.125 = 5.375 (see Figure 1b) . The separator between positive and negative exponents is called the binary point (analogous to the decimal point in decimal numbers).
In a fixed-point representation, the number's digits are stored in a Standard C type variable (such as int or long), but the digits do not represent the same quantities as they would for a simple integer. The fixed-point representation includes an implied binary point, which is assigned a fixed position within the variable. Since some bits must represent the fractional part, fewer bits are available for use by the integer part and the range is decreased. For example, a 16-bit number usually has a range of [ -32768, 32767 ]. If we set the binary point at the fourth bit, the range is decreased to [ -4096, 4095 ], but it can now represent numbers to 1/16 precision. I notate this as 12.4: twelve bits of integer, four bits of mantissa. In exponential notation, fixed-point numbers evaluate to N + f * 2-d, where N is the integer part, f is the fractional part, and d is the number of digits assigned to the fractional component.
Speed Comparisons
Fixed-point is, for the most part, a change in how integer numbers are interpreted. Fixed-point operations require a small amount of extra work, which is at most an extra shift per operation. Thus, fixed-point operates at almost exactly the same speed as the integer equivalent. Generally, this speed is several orders of magnitude faster than floating-point, with or without an FPU.
Range/Precision Trade-offs for Graphics
Fixed-point is particularly applicable to computer graphics. Since computer screens have limited resolution, usually representable by eleven or twelve bits, the fixed-point number's decreased range does not represent too much of a handicap.For example, the 12.4 representation works well for many applications. Recall that 12.4 applies to 16-bit integers, with four bits assigned to the fractional part of the number, giving 0.0625 precision (1/16) and a range of [ -4096, 4095 ]. This range provides enough precision to run a Bresenham line algorithm on a very large screen. For algorithms involving color averaging, it's tempting to use 8.8 representation, since color images commonly contain eight-bit color components. However, this representation runs the risk of overflow. A safer bet is 10.6, which still gives precision to the 1/64. If you need very high precision (as in the trigonometric functions discussed later), you can use 2.14 representation. Here two bits form the integer part to represent both 1 and -1, and numbers can be precise to the 1/16384, or 0.000061. 32-bit integers allow even higher levels of precision while still maintaining a large range.
Basic Fixed-Point Operations
Basic mathematical operations work a little differently with fixed-point numbers than with integers. Consequently, the C programmer must do some bookkeeping to keep track of which numbers are of which type. Also, the program will require conversion routines to convert from fixed-point to integer and vice versa. I describe here the conversion operations, basic math operations, and along the way, demonstrate the bookkeeping required.
Converting
Converting between fixed-point numbers and integers is fairly simple. To convert from integer to fixed-point, shift the integer up by the number of bits in the mantissa (the fractional part). Thus, to convert the integer 00001101 to 4.4 representation, shift it left four places. The resulting fixed-point number is 11010000, and has four bits of precision. To convert from fixed-point to integer, shift the opposite direction.I set these operations up in C by defining a shift constant (FixedPointShiftAmount) and conversion operators written in terms of that constant. Listing 1 shows the code for these conversion functions (FPToInteger and IntegerToFP), as well as other basic operations.
If you use these conversion functions you should either check for overflow errors or ensure that integers being converted to fixed-point format have a compatible range. If all fixed-point related shifts are done with a single constant it is easy to change precision later on.
Fixed-point addition and subtraction work exactly the same as integer addition and subtraction. To see why, consider the numbers in exponential notation, as in Figure 2a. With four bits in the mantissa, 2.5 + 3.125 = 0010.1000 + 0011.0010 = 0101.1010 = 5.625. (Keep in mind that you can't add fixed-point numbers and integers together like this. Nor can you add two fixed-point numbers with different binary points. To add or subtract two numbers they must both be in the same format.)
Multiplication and division are only slightly more complicated. Examine the exponential form of multiplication in Figure 2b. When two fixed-point numbers are multiplied, the result ends up shifted to the left by the number of bits in the mantissa. The code must allow for these extra bits of precision, and an additional shift must take place after the multiplication to normalize the result.
Division is similar except the operation causes a loss rather than a gain in precision. To compensate, the code shifts one argument up before division.
Note that multiplication and division, while not difficult, require more care than addition and subtraction. The extra shifts involved in multiplication and division must be placed carefully, or significant bits in the result may disappear into oblivion. Inattentive programming can produce total loss of precision. This is particularly true when using 32-bit fixed-point numbers. (See sidebar for more information on avoiding loss of precision.)
One property of fixed-point numbers allows for occasional optimizations: multiplying a fixed-point number and an integer results in a fixed-point number that does not require the additional shift. A programmer who carefully monitors mixing of types can use this to advantage.
Advanced Operations
A basic understanding of the principles discussed above will enable use of fixed-point mathematics in real situations. In many cases, switching to fixed-point is transparent. A 3D system, for example, may model physical bodies with position, velocity, acceleration, etc., all of which may be represented in fixed-point with very little impact on the code.However, the basic fixed-point operations discussed above don't fully satisfy the needs of graphics applications. For example, a 3D graphics system will probably rely heavily on trigonometric functions, traditionally accessible only via floating-point numbers. Without fixed-point alternatives for, say, cos and sin, modest speed improvements in other areas may disappear in the noise. Programmers concerned with performance will probably want to deal with fixed-point numbers exclusively. In this section I present several fixed-point trigonometric functions. But first I discuss a concept upon which all trigonometric functions depend: fixed-point angles.
Angle Representation
Angles are used extensively in all sorts of calculations, including trajectories, light-source shading, reflection, and rotation. A program that uses the standard library trig functions will declare these angles as type double and will express the angles in radians. As long as the standard library routines aren't used, however, there's no reason not to choose some other type, or different units more suitable to integer/fixed-point arithmetic.Consider making all angles fit in an unsigned char. An angle variable can then provide 256 divisions in a circle, for a resolution of approximately 1.41° per division. This really isn't much precision, though; it would be nice to be able to represent angles of less than 1°. What if all angles fit in an unsigned short? Then each division would represent approximately 0.00549° because a circle would have 65,536 divisions.
As a side effect, using an integer to hold an angle provides automatic overflow protection. That is, the results of arithmetic operations will always be modulo the maximum value (+1) an integer can represent. For examples, assume angle A is an unsigned short equal to 57,330. Adding the value 32,980 sets A to 24,744, not 90,310 (the result will be modulo 65,536). In terms of angles, this is equivalent to adding 315° to 181° and obtaining 136° (instead of 496°); the result always remains on the unit circle.
Fixed-Point Trig Functions
The trig functions presented here make extensive use of mathematical tables. (Some programmers spurn math tables, preferring algorithmic solutions, but a simple table lookup is frequently the most efficient and elegant way to approach a problem. For example, a sine or cosine math table for angles contained in unsigned chars requires at most 256 entries (even less, as it turns out). Of course, unsigned shorts will require more entries, but the table need not grow to an unreasonable length.Assume the lowest four bits of an angle are insignificant when used as an argument to a trig function. The resulting table needs only 4,096 entries, and if the full-scale range of the angle argument is ± 180°, the trig function can still resolve angles as small as 0.0879°. I've made no reference to the size of the entries in the table, because it is completely arbitrary. Each entry may hold a 2.6, 12.4, or even 4.28 fixed-point value. The size of each value and the position of its binary point depends on the range and precision required. For example, a custom fixed-point version of cos might take a 16-bit argument (whose lower four bits weren't significant) and return a 2.14 fixed-point result. A fixed-point atan2 routine might take two 16.16 values and return an 8-bit result. The needs of the program dictate how the tables shape up.
Listing 2 shows fixed-point substitutes for sin and cos. This listing demonstrates how a single table may be used for both routines. It also shows how the size of the table may be reduced by taking advantage of the functions' symmetry. [For sanity's sake we've abbreviated this table in the printed listing; the full table and source are available on this month's code disk. mb] Listing 3 shows the code for generating the table used in Listing 2.
Fixed-Point in Parametric Equations
One graphics-related task that benefits from fixed-point math is the computation of parametric equations. Parametric formulas are generally functions of a single variable, normally t, that evaluate to a point in space for each value of t in the range [ 0, 1 ]. I describe two optimizations here: using fixed-point to represent t, and computing the equation iteratively.Given two points X1 and X2, the line between them is given by the parametric equation in Figure 3a. This equation can be broken up into the system of equations in Figure 3b.
Listing 4 shows floating-point and fixed-point implementations that compute the equations of Figure 3b. In the fixed-point implementation t is a fixed-point number, while everything else is an integer. Recall that fixed-point/integer multiplication requires no compensating shift operation at the end; the code takes advantage of this extra efficiency to avoid converting all the coordinates into fixed-point. The fixed point implementation runs quite a bit faster without losing the clarity of the code.
To further optimize the algorithm, the program could compute each point on the line iteratively that is, it would compute each point from the previous point. Given that t changes by a discrete amount (dt) each iteration, X(t) can be computed as shown in Figure 3c.
To compute each value of X(t), just add ( X2 - X1 )dt to the previous value of X(t). ( X2 - X1 )dt is a constant that can be computed before the loop is entered; each iteration of the loop involves merely a few additions. Listing 5 contains a typical implementation. In this case, I've kept all the coordinates as fixed-point to maintain precision.
A Test Program
Listing 6 shows a test program, which draws a sine curve, a cosine curve, and several diagonal lines, to exercise the fixed-point functions presented in this article. The program calls functions from the Borland Graphics Interface (BGI), such as initgraph (initialize graphics) and putpixel; if you don't have a Borland compiler you will need to substitute equivalent functions supplied with your compiler. The code disk also contains an EXE file for users who just want to run the test program.
Summary
Fixed-point speeds up arithmetic requiring fractional precision without much loss of code clarity. Graphics algorithms can especially benefit from fixed-point number representations because graphics applications usually operate over a limited range of the numbers. Using fixed-point requires the C programmer to make a conceptual shift in thinking about how values are stored. Once this conceptual shift has been made, fixed-point becomes a simple matter of bookkeeping.