Hsi-Chiu Liu is an associate professor of computer science at California State Polytechnic University, Pomona, Pomona, CA 91768.
Electronic data transmission errors are a fact of life and, despite rapid advances in digital communication and computer networks, transmission error control continues to be a major software- and hardware-engineering task. One of the most efficient methods of error detection and correction is algebraic coding, which requires only a minimal amount of bit redundancy in forming code words. Using algebraic operations with a matrix and a vector, code words can be easily encoded before transmission and decoded at the receiving end. When compared to other error control schemes, algebraic coding is potentially capable of correcting multi-bit errors with lower-bit redundancy overhead.
Algebraic codes are a series of code words, each of which is formed by attaching a number of check bits to a message word. The purpose of the attached check bits is to help detect and correct transmission errors. To correct multi-bit errors, more check bits are needed. For an algebraic coding, the inequality n + r + 1 <= 2r is used to determine the minimum number of check bits needed for correcting single-bit errors (n is the number of bits of a message word and r the number of check bits needed). For example, a 4-bit message word will need at least three check-bits in order to be able to correct single-bit errors.
Suppose that a message word is n-bits long and it takes r check bits to form a code word. For each correctly formed code word, there may be (at most) n+r code words, each of which is in a 1-bit error. Each of these erroneous code words is formed by inverting any one of the n+r bits of the correctly formed code word. For a transmission that causes only single-bit errors, there will be n+r+1 possible code words that may be received in transmitting every code word. Because there are 2n message words, a maximum of 2n (n+r+1) code words will be involved in the transmission. For a code word that is composed of n+r bits, there will be a total of 2(n+r) possible code words, even though it is not possible for some of them to be present in the transmission because of the single-bit error transmission. Based on this reasoning, we have 2n (n+r+1) <= 2(n+r). This inequality is easily reduced to n + r+ 1 <= 2r.
Actually, the probability of a multi-bit transmission error is much lower than a single-bit transmission error. For example, if the bit error rate is 10-3, then the probability of a double-bit in error will be 10-6, if the bit error is independent. (That's why I'm only considering single-bit errors in this article.) To generalize this type of coding, we suppose that the message words consist of n bits each. To form a code word, a number of r check bits will be appended to the message word: m1 m2 m3 ... mn c1 c2 c3 ... cr. Note that there are only 2n valid code words out of 2n+r possible ones.
To generate the r check bits algebraically, we have to first predefine a matrix called "H" of the dimension r x (n+r). Consider each code word to be generated as a vector T which consists of message bits followed by check bits. If even parity is adopted for the computing system, appropriate values can be assigned to each of the check bits from the matrix equation H T = 0. See Example 1 for definitions of H and T. Note that the righthand portion of H is an identity matrix. The entries in the lefthand portion of H must be either 0 or 1. These values must be predefined under the following two conditions: No column consists of all 0s; No identical entries are assigned to any two columns. The following problem illustrates how to assign values to the check bits.
[h11 h12 * * * h1n 1 0 0 0 * * * 0 ]
|h21 h22 * * * h2n 0 1 0 0 * * * 0 |
| * |
H = | * |
| * |
[hr1 hr2 * * * hrn 0000 * * * 1 ]
T = [m1 m2 * * * mn c1 c2 * * * cr]
Problem: Given the message word 1101 (n = 4), then the number of check bits is set to 3 (r = 3). If the predefined matrix H is as shown in Example 2(a), find the values of the check bits for the formation of a code word with the given message word.
(a)
[1 1 0 1 1 0 0]
H = |1 0 1 1 0 1 0|
[0 1 1 1 0 0 1]
(b)
T = [1 1 0 1 c1 c2 c3]
(c) 1 + 1 + 0 + 1 + c1 + 0 + 0 = 0
1 + 0 + 0 + 1 + 0 + c2 + 0 = 0
0 + 1 + 0 + 1 + 0 + 0 + c3 = 0
(d) c1 = 1
c2 = 0
c3 = 0
Solution: First, a vector for the message word 1101 is formed. See Example 2(b). Secondly, assume that even parity is adopted. Then, perform the matrix multiplication: H T=0. Three equations will be generated from this matrix multiplication; see Example 2(c). Using modulo-2 addition, the three equations will be reduced to Example 2(d).
Therefore, the valid code word will be 1101100. Following this procedure, all of the valid code words formed with all of the 4-bit message words can be generated; see Example 3.
Message word Code word
------------ ---------
0000 0000000
0001 0001111
0010 0010011
0011 0011100
0100 0100101
0101 0101010
0110 0110110
0111 0111001
1000 1000110
1001 1001001
1010 1010101
1011 1011010
1100 1100011
1101 1101100
1110 1110000
1111 1111111
Let's next consider how to detect and correct transmission errors. Assume that the received code word is a vector R. We also assume that an error vector E is of the same dimension as that of the code word or vector R. The 1s in E represent the error positions in the code word. We then have R = T+E. Now, use the following to perform a matrix multiplication: H R = H(T+E) = H T + H E = 0 + H E = H E.
Therefore, if this multiplication results in product zero (i.e., H E = 0), we then conclude that E consists of all 0s. This means that there is no transmission error detected, and no transmission error has occurred. Otherwise, at least one error was made in transmission. Of course, this elaborate system will do more than just error detection.
The value of the algebraic coding arises in multiple-error detection and correction. The product HE is defined as S, the syndrome. The dictionary defines syndrome as "a number of symptoms occurring together and characterizing a specific disease or condition." In fact, the error syndrome characterizes the specific bit error.
Let's first assume that a single-bit error occurs. E would then be a vector composed of a single one, and the remaining positions would be zeros. If we take the product of this with H to form the syndrome, the result is a vector that is identical to one column of H, that column being the one corresponding to the bit position in error.
To illustrate the single-bit error correction mechanism, consider this problem:
Problem: Given the code word T = 1101100 (before transmission), and the received code word R = 1100100 (after transmission), find the error and correct it.
Solution: First form the product HR; see Example 4(a). Note that the result is equal to the fourth column of H, thus identifying an error as having occurred in the fourth-bit position. This error can be corrected simply by inverting the fourth bit of the received code word.
R = 1100100
(a)
[1 ]
|1 |
[1101100] |0 1|
|1011010| |0 = 1|
[0111001] |1 1|
|0 |
[0 ]
(b)
[1]
S = |0|
[0]
If an error of more than 1-bit error occurs, the syndrome will be equal to the sum of the corresponding columns of H. If, for example, errors occurred in the third and fourth bits of the example above, the syndrome would be as in Example 4(b).
This would be incorrectly interpreted as a single-bit error in the fifth bit position. Even though in this article we restrict our discussion of algebraic decoders to the single-bit error correction case, note that the syndrome can correct two errors, provided that no two columns sum to either zero or to another column or to the sum of two other columns.
Algebraic codes are not difficult to implement. Calculation of the various vector products is equivalent to summing combinations of bit positions. This is accomplished by entering the codes into shift registers, tapping off at the appropriate positions, and feeding these outputs into a summation device. Modulo-2 addition of two inputs is performed in the XOR logic block. One simple implementation of the algebraic coding and decoding system for 4-bit message words is shown in Figure 1 and Figure 2. In Figure 1, the 4-bit message word to be transmitted is input from the left. The correct values of the check bits for this message word are output from the right. So, a code word can be formed with the message word and the check bits. In Figure 2, the received code word is input from the left. The correct message word is output from the right. Note that the implementation setup in Figure 2 is capable of handling both detecting and correcting single-bit errors. Another possible implementation using ROM is shown in Figure 3. Note that in part (a), each message word to be transmitted is used as an address pointing to the ROM location where the needed check bits are stored. Similarly, in part (b), the received code word is also used as address pointing to the ROM location where the correct message word is stored.
Roden, Martin S. Digital Communication Systems Design, Englewood Cliffs, N.J.: Prentice Hall, 1988.
Tanenbaum, Andrew S. Computer Networks, Second Edition, Englewood Cliffs, N.J.: Prentice Hall, 1988.
White, Ben. "Hamming-Code Decoding," Dr. Dobb's Journal, #156, October 1989.