ALGEBRAIC CODES FOR ERROR DETECTION AND CORRECTION

Controlling data transmission errors

Hsi-Chiu Liu

Hsi-Chiu Liu is an associate professor of computer science at California State Polytechnic University, Pomona, Pomona, CA 91768.


Electronic data transmission errors are a fact of life and, despite rapid advances in digital communication and computer networks, transmission error control continues to be a major software- and hardware-engineering task. One of the most efficient methods of error detection and correction is algebraic coding, which requires only a minimal amount of bit redundancy in forming code words. Using algebraic operations with a matrix and a vector, code words can be easily encoded before transmission and decoded at the receiving end. When compared to other error control schemes, algebraic coding is potentially capable of correcting multi-bit errors with lower-bit redundancy overhead.

Code Word Formation

Algebraic codes are a series of code words, each of which is formed by attaching a number of check bits to a message word. The purpose of the attached check bits is to help detect and correct transmission errors. To correct multi-bit errors, more check bits are needed. For an algebraic coding, the inequality n + r + 1 <= 2r is used to determine the minimum number of check bits needed for correcting single-bit errors (n is the number of bits of a message word and r the number of check bits needed). For example, a 4-bit message word will need at least three check-bits in order to be able to correct single-bit errors.

Suppose that a message word is n-bits long and it takes r check bits to form a code word. For each correctly formed code word, there may be (at most) n+r code words, each of which is in a 1-bit error. Each of these erroneous code words is formed by inverting any one of the n+r bits of the correctly formed code word. For a transmission that causes only single-bit errors, there will be n+r+1 possible code words that may be received in transmitting every code word. Because there are 2n message words, a maximum of 2n (n+r+1) code words will be involved in the transmission. For a code word that is composed of n+r bits, there will be a total of 2(n+r) possible code words, even though it is not possible for some of them to be present in the transmission because of the single-bit error transmission. Based on this reasoning, we have 2n (n+r+1) <= 2(n+r). This inequality is easily reduced to n + r+ 1 <= 2r.

Actually, the probability of a multi-bit transmission error is much lower than a single-bit transmission error. For example, if the bit error rate is 10-3, then the probability of a double-bit in error will be 10-6, if the bit error is independent. (That's why I'm only considering single-bit errors in this article.) To generalize this type of coding, we suppose that the message words consist of n bits each. To form a code word, a number of r check bits will be appended to the message word: m1 m2 m3 ... mn c1 c2 c3 ... cr. Note that there are only 2n valid code words out of 2n+r possible ones.

Code Word Encoding

To generate the r check bits algebraically, we have to first predefine a matrix called "H" of the dimension r x (n+r). Consider each code word to be generated as a vector T which consists of message bits followed by check bits. If even parity is adopted for the computing system, appropriate values can be assigned to each of the check bits from the matrix equation H T = 0. See Example 1 for definitions of H and T. Note that the righthand portion of H is an identity matrix. The entries in the lefthand portion of H must be either 0 or 1. These values must be predefined under the following two conditions: No column consists of all 0s; No identical entries are assigned to any two columns. The following problem illustrates how to assign values to the check bits.

Example 1: Definition of H and T

      [h11  h12  * * *  h1n   1 0 0 0 * * * 0 ]
      |h21  h22  * * *  h2n   0 1 0 0 * * * 0 |
      |              *                       |
  H = |              *                       |
      |              *                       |
      [hr1  hr2  * * *  hrn      0000 * * * 1 ]

  T = [m1   m2   * * *  mn c1 c2  * * * cr]

Problem: Given the message word 1101 (n = 4), then the number of check bits is set to 3 (r = 3). If the predefined matrix H is as shown in Example 2(a), find the values of the check bits for the formation of a code word with the given message word.

Example 2: Solving the message word 1101

  (a)
          [1 1 0 1 1 0 0]
      H = |1 0 1 1 0 1 0|
          [0 1 1 1 0 0 1]
  (b)
      T = [1 1 0 1 c1 c2 c3]

  (c) 1 + 1 + 0 + 1 + c1   + 0    + 0     = 0
      1 + 0 + 0 + 1 +   0   + c2  + 0     = 0
      0 + 1 + 0 + 1 +   0    + 0    + c3  = 0

  (d) c1 = 1
      c2 = 0
      c3 = 0

Solution: First, a vector for the message word 1101 is formed. See Example 2(b). Secondly, assume that even parity is adopted. Then, perform the matrix multiplication: H T=0. Three equations will be generated from this matrix multiplication; see Example 2(c). Using modulo-2 addition, the three equations will be reduced to Example 2(d).

Therefore, the valid code word will be 1101100. Following this procedure, all of the valid code words formed with all of the 4-bit message words can be generated; see Example 3.

Example 3: Valid code words formed from 4-bit message words

  Message word     Code word
  ------------     ---------

    0000            0000000
    0001            0001111
    0010            0010011
    0011            0011100
    0100            0100101
    0101            0101010
    0110            0110110
    0111            0111001
    1000            1000110
    1001            1001001
    1010            1010101
    1011            1011010
    1100            1100011
    1101            1101100
    1110            1110000
    1111            1111111

Code Word Decoding

Let's next consider how to detect and correct transmission errors. Assume that the received code word is a vector R. We also assume that an error vector E is of the same dimension as that of the code word or vector R. The 1s in E represent the error positions in the code word. We then have R = T+E. Now, use the following to perform a matrix multiplication: H R = H(T+E) = H T + H E = 0 + H E = H E.

Therefore, if this multiplication results in product zero (i.e., H E = 0), we then conclude that E consists of all 0s. This means that there is no transmission error detected, and no transmission error has occurred. Otherwise, at least one error was made in transmission. Of course, this elaborate system will do more than just error detection.

The value of the algebraic coding arises in multiple-error detection and correction. The product HE is defined as S, the syndrome. The dictionary defines syndrome as "a number of symptoms occurring together and characterizing a specific disease or condition." In fact, the error syndrome characterizes the specific bit error.

Let's first assume that a single-bit error occurs. E would then be a vector composed of a single one, and the remaining positions would be zeros. If we take the product of this with H to form the syndrome, the result is a vector that is identical to one column of H, that column being the one corresponding to the bit position in error.

To illustrate the single-bit error correction mechanism, consider this problem:

Problem: Given the code word T = 1101100 (before transmission), and the received code word R = 1100100 (after transmission), find the error and correct it.

Solution: First form the product HR; see Example 4(a). Note that the result is equal to the fourth column of H, thus identifying an error as having occurred in the fourth-bit position. This error can be corrected simply by inverting the fourth bit of the received code word.

Example 4: Solving the code word T = 1101100 and the received code word

R = 1100100

  (a)
                       [1     ]
                       |1     |
  [1101100]            |0    1|
  |1011010|            |0  = 1|
  [0111001]            |1    1|
                       |0     |
                       [0     ]

  (b)
                 [1]
             S = |0|
                 [0]

If an error of more than 1-bit error occurs, the syndrome will be equal to the sum of the corresponding columns of H. If, for example, errors occurred in the third and fourth bits of the example above, the syndrome would be as in Example 4(b).

This would be incorrectly interpreted as a single-bit error in the fifth bit position. Even though in this article we restrict our discussion of algebraic decoders to the single-bit error correction case, note that the syndrome can correct two errors, provided that no two columns sum to either zero or to another column or to the sum of two other columns.

Implementation

Algebraic codes are not difficult to implement. Calculation of the various vector products is equivalent to summing combinations of bit positions. This is accomplished by entering the codes into shift registers, tapping off at the appropriate positions, and feeding these outputs into a summation device. Modulo-2 addition of two inputs is performed in the XOR logic block. One simple implementation of the algebraic coding and decoding system for 4-bit message words is shown in Figure 1 and Figure 2. In Figure 1, the 4-bit message word to be transmitted is input from the left. The correct values of the check bits for this message word are output from the right. So, a code word can be formed with the message word and the check bits. In Figure 2, the received code word is input from the left. The correct message word is output from the right. Note that the implementation setup in Figure 2 is capable of handling both detecting and correcting single-bit errors. Another possible implementation using ROM is shown in Figure 3. Note that in part (a), each message word to be transmitted is used as an address pointing to the ROM location where the needed check bits are stored. Similarly, in part (b), the received code word is also used as address pointing to the ROM location where the correct message word is stored.

References

Roden, Martin S. Digital Communication Systems Design, Englewood Cliffs, N.J.: Prentice Hall, 1988.

Tanenbaum, Andrew S. Computer Networks, Second Edition, Englewood Cliffs, N.J.: Prentice Hall, 1988.

White, Ben. "Hamming-Code Decoding," Dr. Dobb's Journal, #156, October 1989.