Error Detection and Correction

Outline for Today
1. Parity Check Code
2. Bounds based on Hamming distance
3. Hamming Code

Can You Raed Tihs?

I cnduo't bvleiee taht I culod aulaclty uesdtannrd waht I was rdnaieg. Unisg the icndeblire pweor of the hmuan mnid, aocdcrnig to rseecrah at Cmabrigde Uinervtisy, it dseno't mttaer in waht oderr the lterets in a wrod are, the olny irpoamtnt tihng is taht the frsit and lsat ltteer be in the rhgit pclae. The rset can be a taotl mses and you can sitll raed it whoutit a pboerlm. Tihs is bucseae the huamn mnid deos not raed ervey ltteer by istlef, but the wrod as a wlohe. Aaznmig, huh? Yaeh and I awlyas tghhuot slelinpg was ipmorantt! See if yuor fdreins can raed tihs too.

Source: languagehat.com (original source unknown)

One Example: Error Detection

Bits are occasionally flipped in transmission. For example, a sender may transmit 1101001, and the receiver gets the string 0101011. Adding redundancy can allow us to detect, and possibly correct, some errors of this type.

Simple approach: Repeat each bit

Repeat each bit twice. For each bit x, transmit xx. If the receiver gets two different bits, it requests retransmission. This is an error-detecting code: it allows one error to be detected, but it is not error-correcting, since retransmission is necessary.

Repeat each bit three times. For each bit x, transmit xxx. Now the receiver can correct a single error. (How?)
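One possible answer to the "(How?)" above is a majority vote within each group of three received bits. The following is a minimal sketch of both repetition schemes; the function names are invented for this illustration, and the input lengths are assumed to be exact multiples of the repetition factor.

```python
def encode_repeat(bits: str, r: int) -> str:
    """Repeat every bit r times, e.g. '10' with r=3 becomes '111000'."""
    return "".join(b * r for b in bits)


def detect_repeat2(received: str) -> bool:
    """Return True if some pair of bits disagrees (an error is detected)."""
    pairs = [received[i:i + 2] for i in range(0, len(received), 2)]
    return any(p[0] != p[1] for p in pairs)


def decode_repeat3(received: str) -> str:
    """Decode by majority vote within each group of three bits."""
    groups = [received[i:i + 3] for i in range(0, len(received), 3)]
    return "".join("1" if g.count("1") >= 2 else "0" for g in groups)


sent = encode_repeat("101", 3)       # '111000111'
corrupted = "110000111"              # one bit flipped in the first group
print(decode_repeat3(corrupted))     # prints 101
print(detect_repeat2("1101"))        # prints True: the pair '01' disagrees
```

Majority voting recovers the bit as long as at most one of its three copies is flipped; two flips in the same group would be decoded incorrectly.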

Parity Check Code: Detecting an Odd Number of Bit Flips

Definition: A bit string has odd parity if the number of 1s in the string is odd. A bit string has even parity if the number of 1s in the string is even. Recall: 0 is an even number.

Example: 01100, 000, and 11001001 have even parity. 1000011, 1, and 00010 have odd parity.

Assume we are transmitting blocks of k bits. A block w of length k is encoded as wa, where the value of the parity bit a is chosen so that wa has even parity.

Example: If w = 10110, we send wa = 101101, which has even parity.

If there is a positive, even number of bit flip errors in transmission, the receiver gets a bit string with even parity, and the error(s) go undetected.

If there is an odd number of bit flip errors in transmission, the receiver gets a bit string with odd parity, indicating an error occurred. The receiver requests retransmission.
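The encoding and the receiver's check can be sketched in a few lines. This is an illustrative sketch only; the helper names (`add_parity_bit`, `passes_parity_check`, `flip`) are made up here and do not come from the slides.

```python
def add_parity_bit(w: str) -> str:
    """Append a parity bit a so that the result wa has even parity."""
    a = w.count("1") % 2
    return w + str(a)


def passes_parity_check(received: str) -> bool:
    """The receiver accepts a block only if it has even parity."""
    return received.count("1") % 2 == 0


def flip(s: str, i: int) -> str:
    """Flip the bit at 0-based index i (simulates a transmission error)."""
    return s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]


wa = add_parity_bit("10110")
print(wa)                                          # prints 101101
print(passes_parity_check(flip(wa, 0)))            # prints False: one flip is detected
print(passes_parity_check(flip(flip(wa, 0), 3)))   # prints True: two flips go undetected
```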

Assumption: Bit flips are rare, so we can tolerate a very small percentage of corrupted blocks that have an even number of flips.

Single Parity Check Code: Undetectable Errors

Any time a block of length n (including the parity bit) contains a positive, even number of bit errors, the error cannot be detected.

Let p be the probability of an error in a single bit, with bits flipped independently of one another. The probability of exactly 2 bit flips in the block is:

$$\binom{n}{2} p^2 (1-p)^{n-2}$$

i.e., the number of ways to choose 2 bits from n bits, times $p^2$, the probability of those bits being errors, times $(1-p)^{n-2}$, the probability of the remaining n - 2 bits being correct.

The probability of an undetected error is:

$$\binom{n}{2} p^2 (1-p)^{n-2} + \binom{n}{4} p^4 (1-p)^{n-4} + \cdots$$

For bit strings of length n = 32 and p = 0.001, the probability of an undetectable error is approximately 0.0005.
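The quoted figure can be checked numerically by summing the probability of every positive, even number of bit flips. The sketch below assumes independent bit errors; `prob_undetected` is a name chosen for this example.

```python
from math import comb


def prob_undetected(n: int, p: float) -> float:
    """Probability of a positive, even number of bit flips in n bits."""
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(2, n + 1, 2)   # j = 2, 4, 6, ...
    )


# Roughly 0.00048, which rounds to the 0.0005 quoted for n = 32, p = 0.001.
print(prob_undetected(32, 0.001))
```

The first term (exactly two flips) dominates the sum; the contribution from four or more flips is several orders of magnitude smaller at this error rate.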

2D Parity Check

The block of bits is organized in rows and columns, say an m × n matrix.

The parity bit of each row is calculated and appended to the row before it is transmitted.

The parity bit of each column is calculated, and the parity bit of the entire matrix is computed. These are also transmitted to the receiver.

m + n + 1 parity bits are computed. A total of mn + m + n + 1 bits are sent to the receiver.

2D Parity Check

Example: Original data: 1100, 1011, 0111, 0101

                 row parity
    1 1 0 0   |  0
    1 0 1 1   |  1
    0 1 1 1   |  1
    0 1 0 1   |  0
    ----------+---
    0 1 0 1   |  0   (matrix parity bit)
    col parity bits

Exercise: Describe an error that cannot be detected with this approach.
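The parity bits in this example can be reproduced with a short computation. This is a minimal sketch; `parity_2d` is a hypothetical helper, and the rows are assumed to be equal-length bit strings.

```python
def parity_2d(rows):
    """Return (row parity bits, column parity bits, matrix parity bit)."""
    row_parity = [r.count("1") % 2 for r in rows]
    col_parity = [
        sum(int(r[j]) for r in rows) % 2
        for j in range(len(rows[0]))
    ]
    # Parity of the whole data matrix: total number of 1s, modulo 2.
    matrix_parity = sum(r.count("1") for r in rows) % 2
    return row_parity, col_parity, matrix_parity


rows = ["1100", "1011", "0111", "0101"]
row_p, col_p, mat_p = parity_2d(rows)
print(row_p)   # prints [0, 1, 1, 0]
print(col_p)   # prints [0, 1, 0, 1]
print(mat_p)   # prints 0
```

For the 4 × 4 block this yields 4 + 4 + 1 = 9 parity bits, matching the m + n + 1 count.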

Hamming Distance between Codewords

Example: Suppose we want to send 2-bit strings. Each codeword contains two copies of the string plus a parity bit. If the bit-string is 01, we send the 5-bit string 01x01, where x is the parity bit. So in this example, the sender transmits the codeword 01101.

For this code, there are only four 5-bit codewords: 00000, 01101, 10110, 11011.

When the receiver sees any other string, the error is corrected by replacing it with the codeword that has the least Hamming distance to the received word, where the Hamming distance d between two strings of equal length is the number of positions in which they differ.

Suppose that the string 10100 is received. Then d(10100, 00000) = 2, d(10100, 01101) = 3, d(10100, 10110) = 1, d(10100, 11011) = 4. The closest codeword to the received string is 10110, so the receiver assumes that this was the original string.

The number of errors that can be detected and corrected depends on the Hamming distance between the codewords.
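Nearest-codeword decoding for the four codewords listed above can be sketched as follows. The received string 10100 is chosen here purely for illustration, and the function names are assumptions of this example.

```python
CODEWORDS = ["00000", "01101", "10110", "11011"]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_codeword(received: str) -> str:
    """Return the codeword closest to the received string.

    If several codewords are equally close, min() returns the first one,
    so a tie is resolved arbitrarily rather than reported.
    """
    return min(CODEWORDS, key=lambda c: hamming_distance(received, c))


received = "10100"
print([hamming_distance(received, c) for c in CODEWORDS])   # prints [2, 3, 1, 4]
print(nearest_codeword(received))                           # prints 10110
```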

Hamming Distance and Bounds on Error Correction

Theorem: Let S be a set of codewords and let h be the minimum Hamming distance between any two codewords in S. Then:
1. It is possible to detect any number of errors less than h.
2. It is possible to correct any number of errors less than h/2.

Proof: Assume that codeword x is sent and string y is received.

If $0 < d(x, y) < h$, then y cannot be a codeword, since any two distinct codewords are at distance at least h. So the receiver will detect an error.

If $d(x, y) < h/2$, then we will show that x is the closest codeword to y, and so the receiver can correct the error by replacing y with the closest codeword x. Let z be any codeword other than x, and suppose that $d(y, z) \le d(y, x)$. Then since $d(y, x) < h/2$, it follows that $d(y, z) < h/2$. So, by the triangle inequality,

$$d(x, z) \le d(x, y) + d(y, z) < h/2 + h/2 = h.$$

This is a contradiction, since x and z are both codewords and so $d(x, z) \ge h$. Therefore x is the closest codeword to y.
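The two bounds in the theorem follow directly from the minimum distance h, so they can be computed for any small code by brute force. The sketch below uses the 3-fold repetition code {000, 111} as its example; `min_distance` and `guarantees` are names invented for this illustration.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(codewords) -> int:
    """Minimum Hamming distance h over all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def guarantees(codewords):
    """Return (max errors detectable, max errors correctable).

    Detect: any number of errors less than h, i.e. up to h - 1.
    Correct: any number of errors less than h/2, i.e. up to (h - 1) // 2.
    """
    h = min_distance(codewords)
    return h - 1, (h - 1) // 2


repetition_code = ["000", "111"]
print(min_distance(repetition_code))   # prints 3
print(guarantees(repetition_code))     # prints (2, 1)
```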

Hamming Distance

Exercise: We looked previously at a code that has four 5-bit codewords: 00000, 01101, 10110, 11011. Calculate the minimum Hamming distance between any two codewords.

So we are:
1. Guaranteed to be able to correct any number of errors less than ____.
2. And guaranteed to detect any number of errors less than ____.

Hamming Code

Assume we are transmitting data words of k bits.

Hamming code: Sends l additional bits per word, called the check bits, where l grows only logarithmically with the block length. Allows one error to be corrected for each block (of k + l bits) transmitted.

Hamming Code: How Many Check Bits?

Choose l to be the smallest positive integer so that the binary representation of k + l has l bits. For example:
1. If k = 1, what should l be?
2. If k = 2, what is l?

Question: What is the maximum number of data bits k corresponding to a given number of check bits l?

The positive numbers with l-bit binary representations are between $2^{l-1}$ and $2^l - 1$ (Verify). So we need k such that $k + l \le 2^l - 1$, i.e., $k \le 2^l - l - 1$.
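The rule for choosing l translates into a short search, and the bound on k into a one-line formula. This is a minimal sketch assuming k is at least 1; both function names are hypothetical. The printed examples use k = 4 and k = 11 so as not to answer the two exercise questions above.

```python
def num_check_bits(k: int) -> int:
    """Smallest positive l such that k + l has exactly l binary digits."""
    l = 1
    while (k + l).bit_length() != l:
        l += 1
    return l


def max_data_bits(l: int) -> int:
    """Largest k that l check bits can cover: k <= 2**l - l - 1."""
    return 2**l - l - 1


print(num_check_bits(4))    # prints 3, since 4 + 3 = 7 = 111 in binary
print(num_check_bits(11))   # prints 4, since 11 + 4 = 15 = 1111 in binary
print(max_data_bits(3))     # prints 4
print(max_data_bits(4))     # prints 11
```

The search always terminates: each time l grows by one, the bit length of k + l grows by at most one, so the gap between the two shrinks until they meet.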

Hamming Codes

To make a Hamming code:
1. Label the k + l bit positions from 1 to k + l.
2. The l check bit positions are the positions with indices that are powers of 2: indices 1, 2, 4, 8, ...
3. The k data bits go in the other positions.
4. Choose values for the check bits such that the XOR of the indices of all 1 bits is 0 (Hamming Code Rule).
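The Hamming Code Rule in step 4 is easy to check mechanically. The sketch below assumes the word is written with the highest index on the left and index 1 on the right; the function name is invented for this example. It also illustrates a standard property of this construction that the steps above do not state explicitly: if a valid codeword has a single bit flipped at index i, the XOR of the indices of its 1 bits becomes i, which tells the receiver which bit to flip back.

```python
def xor_of_one_positions(word: str) -> int:
    """XOR together the indices of all 1 bits.

    Assumes the leftmost character is index len(word) and the
    rightmost character is index 1.
    """
    n = len(word)
    result = 0
    for offset, bit in enumerate(word):
        position = n - offset
        if bit == "1":
            result ^= position
    return result


print(xor_of_one_positions("1010101"))   # prints 0: the rule is satisfied
print(xor_of_one_positions("1110101"))   # prints 6: the bit at index 6 was flipped
```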

Check Bit Values

Recall: For any finite set S, f(S) denotes the XOR of all bit-strings in S.

Let C be the set of positions in a codeword where the check bits are 1, and let D be the set of positions where the data bits are 1. By the Hamming Code Rule, $f(C) \oplus f(D) = 0$. Therefore $f(C) = f(D)$.

Since we already know the data bits, i.e., D, compute f(D). Then we set the check bits so that f(C) is the same. We do this by assigning the entries in the bit-string f(D) to the check bits in order from highest to lowest indices.

Example: Suppose we want to transmit 4 data bits. We add 3 check bits, and so all transmitted strings will have 7 bits. Compute the check bits (and the codewords) for these strings: 1011, 0110.

For 1011 (d marks a data bit, c a check bit):

    index:  7 6 5 4 3 2 1
    type:   d d d c d c c
    value:  1 0 1 c 1 c c

Compute $f(D) = 111 \oplus 101 \oplus 011 = 001$. So the check bits at indices 4, 2, 1 are 0, 0, 1, and the transmitted string is 1010101.

Exercise: What is the codeword for 0110?
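The procedure above can be sketched as a small encoder. This is an illustrative sketch under stated assumptions, not a reference implementation: `hamming_encode` is a hypothetical name, data bits are placed from the highest index downward (as in the 1011 example), and the output is written with the highest index on the left.

```python
def hamming_encode(data: str) -> str:
    """Encode a data bit-string using the Hamming Code Rule."""
    k = len(data)
    # Smallest l such that k + l has l binary digits.
    l = 1
    while (k + l).bit_length() != l:
        l += 1
    n = k + l

    bits = {}            # index -> bit value
    f_d = 0              # XOR of the indices of all 1 data bits, i.e. f(D)
    data_bits = iter(data)
    for pos in range(n, 0, -1):
        is_power_of_two = (pos & (pos - 1)) == 0
        if not is_power_of_two:          # data position
            b = int(next(data_bits))
            bits[pos] = b
            if b:
                f_d ^= pos

    # The check bit at index 2**j receives bit j of f(D), so f(C) = f(D).
    for j in range(l):
        bits[1 << j] = (f_d >> j) & 1

    return "".join(str(bits[pos]) for pos in range(n, 0, -1))


print(hamming_encode("1011"))   # prints 1010101
```

Because the check positions are exactly the powers of 2, setting the check bit at index 2**j to bit j of f(D) makes the XOR of the check indices equal to f(D), which is what the rule requires.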