May 1990/A Pseudo-Random Number Generator

Number Crunching

A Pseudo-Random Number Generator

Robert Fruit

Robert Fruit is a graduate of the Illinois Institute of Technology and is president of Simulation Rule, which designs and builds actuarial computer system for microcomputers. He has worked in the actuarial field for 20 years, most of that time spent on computer systems for actuaries. He can be contacted at PO Box 295, Clarendon, IL 60514.
The linear congruent method for creating pseudo-random numbers dominates modern random number generation for three reasons. First, it can create good random number series. Second, it makes efficient use of limited computer resources. Third, its mathematical requirements are a particularly good match to the way electronic computers do arithmetic.
Forty years of mathematics research into the linear congruential method guarantees that if properly used, it will create good random number sequences. The theory also predicts that it is easy to create bad random number generators using the linear congruent method (LCM). In fact, one can create bad LCM random number generators more easily than good ones.

Linear Congruence
The classic description of a congruent system is the hours on an analog clock (the ones with the hands). The clock hands point out the hours, 1, 2, 3, ..., 12. After 12 the next number is 1, and the pattern keeps repeating. The classic description says this is modulo 12. (Modulo is a special number in congruent mathematics which defines the length of a pattern).
This classic definition is wrong about almost everything, except that the hours on a clock are modulo 12. The clock does not follow a classic congruent pattern. For modulo 12, the numbers would go from 0 to 11 (not 1 to 12). One of the first rules of congruent mathematics is that the numbers always stop one short of the modulo number.
The minutes on a clock do follow the correct pattern for a congruent system. If the minute hand is at 59 minutes right now, one minute from now the minute hand will be at zero minutes. The minutes are a congruent system with a modulo value of 60.
Congruent mathematics has its own writing style. Using the minutes example above, the style is

0 = ( 59 + 1 ) mod 60
which, in general, would be written as

A = B mod M A, B, and M are integers.
This statement reads "A is congruent to B modulo M". From the early rules of congruent mathematics, A must be between 0 and M-1. There is no like restriction on what integral value B may have.
A linear congruent function is nothing more than

x' = ( A * x + C ) mod M
It may be little hard to see how this works without an example. Let A = 5, C = 3 and M = 8. Then if x = 0 initially:

x 5*x+3 mod 8 --------------- 0 3 3 3 18 2 2 13 5 5 28 4 4 23 7 7 38 6 6 33 1 1 8 0
This example shows all the parts of a linear congruent random number generator. An initial seed is chosen (zero in the example). The seed passes through the linear formula ( 5*x+3 in the example), and is "confined" by the modulo operation (mod 8). Each result (x') becomes the next value (x) to be fed through the process. The mod 8 column represents the first eight random numbers created by this process.
In this example, the generator produced every value from 0 to M-1. If the same linear formula were "confined" by a mod 7 operation, it would create only five values:

3, 4, 2, 6, 5, 0.
And if this new generator were seeded with a one, it would generate:

1, 1, 1, ...
There are two lessons here. First, the longest run any linear congruent system can have is M, each number from 0 to M-1 appearing once. Second, the values for A, C, and M must be selected carefully if one is to achieve the longest possible run length.
Donald Knuth in his book, The Art of Computer Programming volume 2 / Seminumerical Algorithms, has the most often quoted set of rules for choosing the A and C. These rules guarantee that the run length of the linear congruent series will be M — that is, every value from 0 to M-1 will appear once. The rules are:

C must be relatively prime to M.

A-1 must be a multiple of every prime factor of M.

A-1 is a multiple of 4, if M is a multiple of 4.
For C and M to be relatively prime, as required by the first rule, they may share only the divisor one. The numbers 12 and 19 are relatively prime because one is the only divisor they have in common. The numbers 15 and 18 are not relatively prime because one and three are their common divisors.
To meet this requirement, first find all the prime divisors of M and then make C a prime number that is not a prime factor of M.
To apply the second rule you must break M into its prime factors. If M is 63, its prime factors are 3 squared and 7. According to the second rule, A-1 must be a multiple of every prime of M. So, A can be any of three values:

factors of A-1 A - 1 A ---------------------------------- 3 * 7 21 22 2 * 3 * 7 42 43 3 * 3 * 7 63 64 mod 63 = 1
The values 1, 22, and 43 are the only values for A that will produce the maximum run length of 63. Every other value of A will have shorter run lengths. Notice that 42 has a prime factor that is not a prime factor of M. That is fine as long as 42 also contains at least one of every prime of M.
The third rule is a special case of the second. If M is divisible by four, then A-1 must be divisible by four. For example, if M = 56 (prime factors two cubed and seven), the only values for A that will have the maximum run length are 1 and 29.

factors of A-1 A - 1 A ---------------------------------- 2 * 2 * 7 28 29 2 * 2 * 2 * 7 56 57 mod 56 = 1
Notice that 2 * 7 ( A = 15 ) is not in the list. The second rule would include this factor, but the additional restriction imposed by the third rule excludes it.
Rule 3 can complicate the selection when two can be part of A-1's factors. For example, if M = 18 (prime factors 2 and 3 squared) here are 3 values for A: 1, 7, and 13.

factors of A-1 A - 1 A ---------------------------------- 2 * 3 6 7 2 * 2 * 3 12 13 2 * 3 * 3 18 19 mod 18 = 1
In this case A = 7 will have the maximum run length since 18 is not divisible by four. Only the second rule applies in this case.

Choosing M
For electronic computers, if M is equal to the word size of the computer's arithmetic unit, the modulo operation will be performed automatically for simple multiplication and addition — the high order bits lost during arithmetic overflow are the same bits that would be removed by the modulo operation.
Generally M should be as large as possible, because it determines how long a linear congruent random number generator can go without repeating its pattern.
The factors of M will lead to other patterns that must be carefully watched. For example, the random number generator

#define RAND( seed) (seed = \ ((( seed * 41 ) + 13 ) % 100 ))
produces the 100 values in Figure 1. Notice how the one's digits of all these numbers are all the same for each column.
Similar patterns can occur in generators with a larger M. Consider the following #define
// all items must be longs

#define RAND( seed ) ( seed = \
  ((( seed * 101L ) + 13L ) % 1000L ))
Figure 2 gives the upper right corner of the first three pages of the 3-dimensional array (10 x 10 x 10) of its output. Notice that not only is the column pattern from Figure 1 preserved but the 10 digit repeats in the same place on each page.
These patterns are characteristic of linear congruent random number generators and are determined by the factors of M. In particular, you should avoid M's that are perfect squares or cubes.
Thus, selecting M is fairly complicated: you must choose a number that avoids all undesirable patterns and still produces the maximum run lengths.

Selecting A
A must also be selected with care. It is not enough that the A be selected for the greatest run length of M. For example,

#define RAND( seed ) ( seed = \ ( ( seed + 1 ) % M ) )
always produces the maximum run length, but simply counts through the integers:

0, 1, 2, 3, ..., M-1, 0, 1, ...
Run length isn't enough; the order of the numbers is also important.
The A must be selected so that a large amount of mixing occurs between random numbers. To get good mixing, A must meet two tests. First, A must be at least as large as the square root of M. LCMs work much like the carnival number guessing wheel in Figure 3. Choosing a small A is like giving the wheel a small push — so small that the wheel might not even make one complete revolution. Choosing a large A is like giving the wheel a large push — the wheel will turn several times before it slows to a stop. When the wheel makes several revolutions it is difficult to guess, at the time the wheel is pushed, where it will stop.
In addition to being large, A must also have a characteristic that makes the next value appear especially random. Knuth suggests calculating a value he called potency. The potency for A is s, where s is the smallest integer to satisfy
(A-1)^s = 0 mod M
Every A that meets the three conditions given earlier will have a solution for the potency equation. Listing 1 is a program that will calculate the potency using one of two functions. The first function requires the modulo size which must be less than the size of an unsigned long. The second function assumes the modulo is the size of an unsigned long.
For M = 32, some As have a potency of one (1, 9, 17, and 25). Two of these are shown in Figure 4, 1 and 9. Some As have a potency of two (5, 13, 21, and 29). Two of these, 5 and 13, are shown in Figure 5.
Figure 4 and Figure 5 use interpolation differences to determine the next number in the sequence. When A=1, the results differ by a constant relationship, when A=9, by a linear relationship. It turns out that A=1 is a special case of the linear relationship (the multiplier of the linear term is 0). Obviously higher values for A yield greater mixing. Figure 5 shows what happens when larger values are used. Both of these random sequences need a quadratic formula to determine the next item in the sequence. It does not take much effort to determine that the quadratic results in Figure 5 are significantly better (more random-like) than are the results in Figure 4. Remember that these sequences are created with a linear formula — it is the relationship between that linear formula and congruent arithmetic that gives the effect of higher powered functions.
Potency is a measure of mixing. The higher the potency, the better the mixing. Knuth recommends a potency of at least three, and he mentions that others recommend a potency of at least four.

Applying The Rules
To illustrate how the criteria for M, C and A interact, I'll use them to select values for a "real" LCM random number generator.
First select an M. Begin by checking some modulo calculations with a pocket calculator. Let M = 769,387, A = 7,283, and C = 853. Start with x0 = 0

x1 = ( 7,283 * 0 + 853 ) mod 769,387 = 853 x2 = ( 7,283 * 853 + 853 ) mod 769,387 = 57,303 x3 = ( 7,283 * 57,303 + 853 ) mod 769,387 = error
An eight-digit pocket calculator will overflow on that last calculation, loosing its least significant digits. Without the least significant digits, it is impossible to calculate the modulo value.
A computer has the same problem. If the calculations are attempted using standard mathematics, a similar overflow will occur. Thus M² can never be greater than the word size of the computer's arithmetic unit. For unsigned int with Microsoft's QuickC the M² limit is 65,536 and M can not be larger than 256 — an unacceptable restriction for M. Using unsigned long int instead makes the limit on M 65,536 — still not a great situation.
If you are unwilling to write your own arithmetic routines, even ones in C, there is another way out. Earlier it was mentioned that the word size for the arithmetic unit would do modulo calculation automatically for multiplication and addition. Relying on this "automatic" modulo would let unsigned int accomodate M values to 65,536 — still too small — and unsigned long int acccomodate M values to 4,294,967,296 — an acceptable size.
Choosing this M, though, implies a compromise — many patterns, based on the powers of 2, will be created, but we'll get the needed run length.
Now we select A. Since M only has 2 as a prime factor, and is divisible by 4, both rules 2 and 3 apply. A-1 must be large, but still fit within a unsigned long int, say 2²⁰. Large As like this have a low potency, so a small A-1 with a large potency is also needed, say 2³. To keep from having too simple a multiplier, it is a good idea to also include another power of 2 between the 2 already selected, like 2¹³. Combining these three numbers, and adding 1 gives:

A = 2²⁰ + 2¹³ + 2³ + 1 = 1,056,777
Because we included 2³, the potency of this combined number is 10.
Now select C, preferably a value larger than the square root of M. The only requirement is that C be relatively prime to M. C = 2¹⁸ + 1 must be prime to M.
This makes the definition for this random number generator:

#define RAND( seed ) ( seed = \ ( ( 1056777 * seed ) + 262145 ) )
Because M is the word size unsigned long, there is no % operator in this definition; the modulo operation will occur automatically.
Figure 6 shows the first 100 numbers created by this random number generator. These numbers are not like the digits found in the random number books like 1,000,000 Random Digits. In books like that, each digit occurs randomly. A random number generator creates a sequence of numbers that have the wanted random behavior. Looking at one number does not make it possible to reasonably guess what number in the sequence will be next.
That distinction between the random digits found in a book or table of random digits, and a random sequence of numbers, is important. In a table of random digits, if some random three-digit numbers are wanted, just start selecting three-digit triples anywhere, and random three-digit numbers will be created. With a sequence of random numbers, it is necessary to convert from one sequence into another. This is often done by first converting the random numbers into numbers between 0 and 1. Those random numbers between 0 and 1 are then converted into random three-digit numbers.

Conclusion
One thing should be clear now: creating a good random number generator is not a random process. Hard work goes into choosing an M that best balances the hardware with the need to suppress patterns and have long run lengths — and to choose an A that works best with that M, suppresses unwanted patterns, and has great mixing.

Bibliography
The Art of Computer Programming — volume 2 / Seminumerical Algorithms, Donald Knuth, Addison-Wesley Publishing Company, 1969.
Algorithms, Robert Sedgewick, Addison-Wesley Publishing Company, 1984.
C.R.C Standard Mathematical Tables, twelfth edition, Chemical Rubber Publishing Company, 1963.