
ECEN 5682 Theory and Practice of Error Control Codes Introduction to Block Codes University of Colorado Spring 2007

Definition: A block code of length n and size M over an alphabet with q symbols is a set of M q-ary n-tuples called codewords. Example: Code #1. Binary code of length n = 5 with M = 4 codewords given by C = {00000, 01011, 10101, 11110}. Definition: The rate R of a q-ary block code of length n with M codewords is given by R = (log_q M)/n.

Definition: The redundancy r of a q-ary block code of length n with M codewords is given by r = n - log_q M. Example: Code #1 has rate R = (log_2 4)/5 = 2/5 = 0.4 and redundancy r = 5 - log_2 4 = 3 bits. Example: Code #2. 5-ary code of length n = 4 with M = 5 codewords given by C = {0000, 1342, 2134, 3421, 4213}. This code has rate R = (log_5 5)/4 = 1/4 = 0.25 and redundancy r = 4 - log_5 5 = 3 symbols.
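As a quick numerical check of these two formulas (an illustration added here, not part of the original slides), the short Python sketch below recomputes R and r for codes #1 and #2; the function name rate_and_redundancy is ad hoc.

  import math

  def rate_and_redundancy(q, n, M):
      # Rate R = log_q(M)/n and redundancy r = n - log_q(M) of a q-ary block code.
      k = math.log(M, q)          # log_q M
      return k / n, n - k

  print(rate_and_redundancy(2, 5, 4))   # code #1: (0.4, 3.0)
  print(rate_and_redundancy(5, 4, 5))   # code #2: (0.25, 3.0)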

The goal when using error control codes is to detect and/or correct transmission errors. Suppose code #1 is used and the (corrupted) codeword v = (00101) is received. Comparing v with all legal codewords and marking the discrepancies with * yields:

  00000   01011   10101   11110
  00101   00101   00101   00101
  -----   -----   -----   -----
  ..*.*   .***.   *....   **.**

The discrepancies are the result of transmission errors. If all error positions are marked with one and all other positions with zero, then the received codeword v = (00101) corresponds to the set of possible errors E = {00101, 01110, 10000, 11011}, when code #1 is used. But which of these 4 errors is the right one?
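For a binary code, marking the discrepancies is the same as forming the componentwise XOR of v with each codeword. The following small Python sketch (an added illustration, not from the original slides) recomputes the candidate error set E for code #1 and v = 00101:

  code1 = ["00000", "01011", "10101", "11110"]
  v = "00101"

  # The error pattern relative to a codeword c has a 1 exactly where v and c differ.
  for c in code1:
      e = "".join("1" if vc != cc else "0" for vc, cc in zip(v, c))
      print(f"codeword {c}: error pattern {e} (weight {e.count('1')})")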

To decide which error out of a set of errors is the right one, one needs to make additional assumptions about the likelihood of errors and error patterns. The two most common models for the occurrence of errors are: (i) Independent and identically distributed (iid) errors with probability ɛ. This requires a memoryless transmission channel model. (ii) Burst errors of length L. If an error occurs, it is very likely that it is followed by L - 1 more errors. Burst errors occur for instance in mobile communications due to fading and in magnetic recording due to media defects. Burst errors can be converted to iid errors by the use of an interleaver.

More generally, and especially for non-binary codes, one also needs a model for the error amplitudes. Two possibilities are (i) Uniformly distributed non-zero error amplitudes. This is a good model for orthogonal signaling. (ii) Non-uniformly distributed non-zero error amplitudes with smaller error magnitudes more likely than larger ones. This is a good model for QAM signaling.

In addition to the models that describe which error pattern e is most likely, a transmission channel model is also needed that specifies how codewords c and error patterns e are combined to form the received codeword v = f(c, e). The most prevalent model is the additive model shown in the following figure. Note that addition is often assumed to be modulo q addition for a q-ary code.

[Figure: additive error model. The codeword c and the error vector e are combined by vector addition (often modulo q) to produce the received codeword v = c + e.]

A concise graphical way to describe simple error models is in the form of a discrete channel model that shows all possible transitions from the channel input X to the channel output Y, together with the associated transition probabilities p_{Y|X}(y|x).

Example: The simplest discrete channel model is the memoryless binary symmetric channel (BSC) shown in the following figure.

[Figure: BSC with input X and output Y; each input bit is received correctly with probability 1 - ɛ and flipped with probability ɛ.]

This channel is completely described by the set of four transition probabilities: p_{Y|X}(0|0) = 1 - ɛ, p_{Y|X}(1|0) = ɛ, p_{Y|X}(0|1) = ɛ, p_{Y|X}(1|1) = 1 - ɛ. Clearly P{Y ≠ X} = ɛ and thus the (uncoded) probability of a bit error is P_b(E) = ɛ.

Thus, if ɛ < 0.5 on a memoryless BSC, fewer errors are more likely than more errors, and the right error pattern is the one with the fewest 1s in it. Note that, since all symbols are binary here, only errors of amplitude 1 are possible and no specification for the distribution of error amplitudes is needed. Example: Suppose q = 5, errors occur iid with Pr{Y ≠ X} = ɛ < 0.5, and the error amplitudes are uniformly distributed. The corresponding channel model is a memoryless 5-ary symmetric channel (5SC) with transition probabilities

  p_{Y|X}(y|x) = ɛ/4,   if y ≠ x,  x, y ∈ {0, 1, 2, 3, 4},
  p_{Y|X}(y|x) = 1 - ɛ, if y = x,  x, y ∈ {0, 1, 2, 3, 4}.

In this case the decoding rule assumes again that the right error pattern is the one with the fewest nonzero symbols in it.

Example: Suppose again q = 5 and errors occur iid with P{Y ≠ X} = ɛ < 0.5. But now assume that only errors of magnitude 1 occur, with +1 and -1 being equally likely. This leads to another memoryless 5SC with transition probabilities

  p_{Y|X}(y|x) = ɛ/2,   if y = x ± 1 (mod 5),  x, y ∈ {0, 1, 2, 3, 4},
  p_{Y|X}(y|x) = 1 - ɛ, if y = x,              x, y ∈ {0, 1, 2, 3, 4},
  p_{Y|X}(y|x) = 0,     otherwise.

Now the decoder decides that the right error pattern is the one with the fewest ±1 (mod 5) symbols in it.

Once an error and a channel model are defined, a distance measure between codewords can be defined. Then one can determine the minimum distance between any two distinct codewords, and this in turn determines how many errors a code can detect and/or correct under the given error and channel models. For the iid error model with (discrete) uniform error amplitude distribution the most appropriate measure is the Hamming distance, which is defined as follows. Definition: The Hamming distance d^(H)(x, y) (or simply d(x, y)) between two q-ary n-tuples x and y is the number of places in which they differ. Example: d(10221, 20122) = 3.

The Hamming distance is probably the most popular distance measure for error control codes. Another measure that is more suitable in cases where smaller error magnitudes are more likely than larger ones is the Lee distance, which is defined next. Definition: The Lee distance d^(L)(x, y) between two q-ary n-tuples x and y is defined as

  d^(L)(x, y) = |x_0 - y_0| + |x_1 - y_1| + ... + |x_{n-1} - y_{n-1}|,

where the magnitude |v| of a q-ary symbol v is computed modulo q as |v| = min{v, q - v}. Definition: The minimum distance d_min of a code C = {c_i, i = 0, 1, ..., M - 1} is the smallest distance between any two distinct codewords of the code, i.e.,

  d_min = min_{i ≠ j} d(c_i, c_j),  c_i, c_j ∈ C.

Example: Code #2 has the following Hamming distances between pairs of distinct codewords:

  d^(H)(x, y)   0000  1342  2134  3421  4213
  0000            -     4     4     4     4
  1342            4     -     4     4     4
  2134            4     4     -     4     4
  3421            4     4     4     -     4
  4213            4     4     4     4     -

The Lee distances between pairs of distinct codewords are:

  d^(L)(x, y)   0000  1342  2134  3421  4213
  0000            -     6     6     6     6
  1342            6     -     6     6     6
  2134            6     6     -     6     6
  3421            6     6     6     -     6
  4213            6     6     6     6     -

Thus, for code #2, d^(H)_min = 4 and d^(L)_min = 6.
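The two tables above are easy to reproduce programmatically. The Python sketch below (an added illustration; the helper names hamming and lee are ad hoc) computes both minimum distances for code #2:

  from itertools import combinations

  def hamming(x, y):
      return sum(a != b for a, b in zip(x, y))

  def lee(x, y, q):
      return sum(min((a - b) % q, (b - a) % q) for a, b in zip(x, y))

  code2 = [(0, 0, 0, 0), (1, 3, 4, 2), (2, 1, 3, 4), (3, 4, 2, 1), (4, 2, 1, 3)]
  print(min(hamming(x, y) for x, y in combinations(code2, 2)))   # 4
  print(min(lee(x, y, 5) for x, y in combinations(code2, 2)))    # 6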

Example: Code #3. Binary code with n = 10 and the following set of M = 4 codewords C = {0010010111, 0100101110, 1001011100, 1110001001}. For this code the Hamming distances between pairs of codewords are

  d^(H)(x, y)   0010010111  0100101110  1001011100  1110001001
  0010010111        -           6           6           6
  0100101110        6           -           6           6
  1001011100        6           6           -           6
  1110001001        6           6           6           -

That is, code #3 has minimum Hamming distance 6.

Theorem: A code with minimum Hamming distance d_min can detect all error patterns with d_min - 1 or fewer nonzero components. Proof: The only error patterns that cannot be detected are those that make the transmitted codeword look like another codeword. But because the smallest Hamming distance between any two distinct codewords is d_min, this can only happen if the error pattern affects d_min or more coordinates of the transmitted codeword. QED Definition: The sphere of radius t about codeword c is the set S_t(c) = {v : d(c, v) ≤ t}, where d(., .) is the distance measure used (e.g., Hamming distance).

Example: Consider the codeword c = 01011 from binary code #1. Then, using Hamming distance as the distance measure,

  S_0(01011) = {01011}
  S_1(01011) = {01011, 11011, 00011, 01111, 01001, 01010}
  S_2(01011) = {01011, 11011, 00011, 01111, 01001, 01010, 10011, 11111, 11001, 11010,
                00111, 00001, 00010, 01101, 01110, 01000}

Theorem: The Hamming distance satisfies the triangle inequality, i.e., for any 3 n-tuples x, y, z

  d(x, y) + d(y, z) ≥ d(x, z).
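The spheres above can be enumerated mechanically by flipping up to t positions of c. The short Python sketch below (an added illustration; hamming_sphere is an ad-hoc name) reproduces the sizes |S_0| = 1, |S_1| = 6, |S_2| = 16:

  from itertools import combinations

  def hamming_sphere(c, t):
      # All binary words within Hamming distance t of the word c (given as a string).
      n = len(c)
      sphere = set()
      for radius in range(t + 1):
          for positions in combinations(range(n), radius):
              w = list(c)
              for p in positions:
                  w[p] = "1" if w[p] == "0" else "0"
              sphere.add("".join(w))
      return sphere

  for t in (0, 1, 2):
      print(t, len(hamming_sphere("01011", t)))   # 1, 6, 16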

d(x, y) + d(y, z) ≥ d(x, z)

Proof: First note that for any u = (u_0, u_1, ..., u_{n-1}) and v = (v_0, v_1, ..., v_{n-1}), the Hamming distance satisfies

  d(u, v) = d(u_0, v_0) + d(u_1, v_1) + ... + d(u_{n-1}, v_{n-1}),  d(u_i, v_i) ∈ {0, 1},

where the addition is over the reals. Consider now a coordinate, say j, where x and z differ, i.e., x_j ≠ z_j. Then there are three possible cases for y_j: (i) y_j = x_j, which implies d(y_j, z_j) = 1. (ii) y_j = z_j, which implies d(y_j, x_j) = 1. (iii) y_j ≠ x_j and y_j ≠ z_j, which implies d(y_j, x_j) = 1 and d(y_j, z_j) = 1. Thus, in all three cases d(x, y) + d(y, z) increases by at least one while d(x, z) increases by exactly one. QED

Theorem: A code with minimum Hamming distance d_min can correct all error patterns with t or fewer nonzero components as long as 2t < d_min. Proof: Because of the triangle inequality, the spheres S_t(c_i) and S_t(c_j) of any two distinct codewords c_i and c_j contain no common elements as long as 2t < d_min. Theorem: The Hamming or sphere-packing bound for q-ary codes with d_min = 2t + 1 states that the redundancy r must satisfy

  r ≥ log_q sum_{j=0}^{t} (n choose j) (q - 1)^j.

Proof: Left as an exercise.

Definition: A code which satisfies the Hamming bound with equality is called a perfect code. Example: Code #4. The binary code with blocklength n = 7 and the following set of M = 16 codewords C = {0000000, 0001011, 0010101, 0011110, 0100110, 0101101, 0110011, 0111000, 1000111, 1001100, 1010010, 1011001, 1100001, 1101010, 1110100, 1111111}, is a perfect code. By inspection d_min = 3 is found and thus all patterns of t = 1 errors are correctable. Therefore

  r ≥ log_2 [(7 choose 0) + (7 choose 1)] = log_2 (1 + 7) = 3.

But r = n - log_2 M = 7 - 4 = 3, i.e., the code satisfies the Hamming bound with equality. This code is known as the binary (7, 4, 3) Hamming code.

Example: Probably the most prominent and celebrated perfect code is the binary Golay code with blocklength n = 23, M = 2^12 codewords and minimum Hamming distance d_min = 7. It can correct all error patterns with up to t = 3 errors and thus the Hamming bound requires that

  r ≥ log_2 [(23 choose 0) + (23 choose 1) + (23 choose 2) + (23 choose 3)]
    = log_2 (1 + 23 + 253 + 1771) = log_2 (2048) = 11.

The actual redundancy of the code is r = 23 - log_2 2^12 = 23 - 12 = 11 and therefore it is perfect. Note: Don't take the name perfect code too literally. Perfect simply means that the spheres of radius t = (d_min - 1)/2 around all codewords fill out the whole codeword space perfectly. It does not necessarily mean that perfect codes are the best error detecting and/or correcting codes.
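Since a code is perfect exactly when M times the volume of a radius-t sphere equals q^n, i.e., when the sphere volume equals q^r, the two examples above can be checked with a few lines of Python (an added illustration; sphere_volume is an ad-hoc name):

  import math

  def sphere_volume(n, t, q=2):
      # Number of q-ary n-tuples within Hamming distance t of a fixed word.
      return sum(math.comb(n, j) * (q - 1) ** j for j in range(t + 1))

  # (7, 4, 3) Hamming code: t = 1, r = 3;  (23, 12, 7) Golay code: t = 3, r = 11
  print(sphere_volume(7, 1), 2 ** 3)       # 8 8       -> perfect
  print(sphere_volume(23, 3), 2 ** 11)     # 2048 2048 -> perfect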

How easy is it to find a block code by trial and error? Example: Binary rate R = 0.8 code of length n = 10. There are (2^10 choose 2^8) ways to choose M = 256 codewords out of 1024 possibilities. Here is the actual number:
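The number itself did not survive this transcription, but it is easy to recompute; the Python sketch below (an added illustration, not part of the original slides) evaluates the binomial coefficient and shows that it has roughly 250 decimal digits, so trying all such codes is hopeless:

  import math

  ways = math.comb(2 ** 10, 2 ** 8)   # choose M = 256 codewords from the 1024 binary 10-tuples
  print(ways)
  print(len(str(ways)), "decimal digits")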

Definition: A q-ary linear (n, k) blockcode C is defined as the set of all linear combinations, taken modulo q, of k linearly independent vectors from V, where V is the set of all q-ary n-tuples. If C has minimum distance d_min, C is called a q-ary (n, k, d_min) code. Definition: A generator matrix G of a linear (n, k) code C is a k × n matrix whose rows form a basis for the k-dimensional subspace C of V. Definition: The q-ary k-tuple u = (u_0, u_1, ..., u_{k-1}) is used to denote a dataword. In general, it is assumed that there are no restrictions on u, i.e., it may take on all possible q^k values and, unless otherwise specified, all these values are equally likely. Other names for u are message or information word.

Definition: For the encoding operation, any one-to-one association between datawords u and codewords c may be used. For a linear code with generator matrix G, the most natural encoding procedure is to use c = u G. Example: Code #5. Ternary (n = 5, k = 2) code with generator matrix

  G = [ 1 0 2 1 2 ]
      [ 0 1 2 2 1 ].

This defines a linear code C whose codewords C = {00000, 01221, 02112, 10212, 11100, 12021, 20121, 21012, 22200} lie in a 2-dimensional subspace of V, where V is the set of all 3-ary 5-tuples.

Example: Code #1 is a linear binary (n, k, d_min) = (5, 2, 3) code with generator matrix

  G = [ 1 0 1 0 1 ]
      [ 0 1 0 1 1 ].

To verify this, generate the set of codewords by taking all possible linear combinations (modulo 2, or modulo q in general) of the rows of G as follows:

  C = { (0, 0) G = (0, 0, 0, 0, 0),  (0, 1) G = (0, 1, 0, 1, 1),
        (1, 0) G = (1, 0, 1, 0, 1),  (1, 1) G = (1, 1, 1, 1, 0) }.
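This exhaustive procedure works for any small linear code. The Python sketch below (an added illustration; the function name codewords is ad hoc) generates the codeword sets of codes #1 and #5 from their generator matrices:

  from itertools import product

  def codewords(G, q):
      # All codewords u*G (mod q) of the linear code generated by the rows of G.
      k, n = len(G), len(G[0])
      return sorted({tuple(sum(u[i] * G[i][j] for i in range(k)) % q for j in range(n))
                     for u in product(range(q), repeat=k)})

  G1 = [[1, 0, 1, 0, 1],
        [0, 1, 0, 1, 1]]          # code #1, q = 2
  G5 = [[1, 0, 2, 1, 2],
        [0, 1, 2, 2, 1]]          # code #5, q = 3
  print(codewords(G1, 2))
  print(codewords(G5, 3))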

Example: Code #2 is a linear 5-ary (4, 1, 4) code with generator matrix G = [1 3 4 2]. The set of codewords is obtained as

  C = { 0 G = (0, 0, 0, 0),  1 G = (1, 3, 4, 2),  2 G = (2, 1, 3, 4),
        3 G = (3, 4, 2, 1),  4 G = (4, 2, 1, 3) }.

Example: Code #3 is a nonlinear binary code with n = 10. The sum of codewords (0, 1, 0, 0, 1, 0, 1, 1, 1, 0) + (1, 0, 0, 1, 0, 1, 1, 1, 0, 0) = (1, 1, 0, 1, 1, 1, 0, 0, 1, 0), for instance, is not a codeword of the code. In fact, the 4 codewords C = {0010010111, 0100101110, 1001011100, 1110001001} are linearly independent and form the basis of a 4-dimensional subspace of the space of all binary 10-tuples.

Definition: The Hamming weight w(c) of a codeword c ∈ C is equal to the number of nonzero components of c. The minimum Hamming weight w_min of a code C is equal to the smallest Hamming weight of any nonzero codeword in C. Definition: The Lee weight w^(L)(c) of a codeword c ∈ C is defined as

  w^(L)(c) = |c_0| + |c_1| + ... + |c_{n-1}|,

where the magnitude |v| of a q-ary symbol v is computed modulo q as |v| = min{v, q - v}. The minimum Lee weight w^(L)_min of a code C is equal to the smallest Lee weight of any nonzero codeword in C.

Theorem: For a linear code d_min = w_min. Proof: For any x, y ∈ C,

  d_min = min_{x ≠ y} d(x, y) = min_{x ≠ y} d(x - y, 0) = min_{c ≠ 0} w(c),

where c = x - y ∈ C because the code is linear. QED

Standard Array for Decoding. A good conceptual, but practically inefficient, way to visualize the decoding operation for a q-ary linear code under the Hamming distance measure is the so-called standard array. It is set up as follows: (1) The first row of the array consists of all the codewords of the code, starting on the left with the all-zero codeword that must be present in every linear code. (2) The first column starts out with all q-ary n-tuples that are in the decoding sphere S_t(0) of radius t about the all-zero codeword c_0 = 0, where t is the maximum number of errors the code can correct. There are P = |S_t(0)| n-tuples in this sphere, and, assuming an additive (modulo q) error model, all are correctable error patterns (including the all-zero error). The elements in this column are called the coset leaders.

Standard Array for Decoding (contd.) (3) Making use of the linearity of the code, anything that applies to the all-zero codeword c_0 also applies to any other codeword c_j by simply translating the origin. Thus, the first P entries of the j-th column make up the decoding sphere S_t(c_j). All entries in the j-th column are obtained by adding (modulo q) the error pattern on the left to the codeword above. (4) Each of the rows in the standard array is called a coset. Altogether, the first P rows contain the M = q^k distinct decoding spheres S_t(c_j) for j = 0, 1, ..., M - 1. If the code is a perfect code, then P M = q^n, else there are (q^n - P M) = (Q - P) M, where Q = q^{n-k}, distinct q-ary n-tuples that do not yet appear in the array. These correspond to error patterns with more than t errors, but in general only a few of these are correctable.

Standard Array for Decoding (contd.) (5) To complete the standard array, organize the (Q - P) M q-ary n-tuples that do not yet appear in the array into Q - P cosets. For each coset, select an error pattern of smallest Hamming weight that has not yet appeared anywhere in the array as a coset leader, and complete the coset by adding the error pattern on the left to the codeword above. Because of the linearity of the code it can be shown that it is always possible to fill the bottom Q - P rows in this way with distinct n-tuples and that the set of all Q rows contains all q^n possible q-ary n-tuples.

Decoding using the standard array consists of looking up the received n-tuple in the array and returning the codeword above it as the result. Definition: A decoder which decodes only received n-tuples within decoding spheres of radius t or less, but not the whole decoding region, is called an incomplete decoder. If one of the n-tuples in the bottom Q - P rows of the standard array is received, an incomplete decoder declares a detected but uncorrectable error pattern. Conversely, a complete decoder assigns a nearby codeword to every received n-tuple. Note: For a perfect t-error correcting code P = Q, and no coset leader has weight greater than t.

The following figure shows the contents of the standard array graphically.

                       codewords
              c0=00..0 : c1        : c2        : ... : c(M-1)       :
             ----------------------------------------------------------
  coset        e1      : c1+e1     : c2+e1     : ... : c(M-1)+e1    :  \
  leaders:     e2      : c1+e2     : c2+e2     : ... : c(M-1)+e2    :   | P rows above line
  1,2,...,t    e3      : c1+e3     : c2+e3     : ... : c(M-1)+e3    :   | (each column is a
  error        :       :  :        :  :        :     :  :           :   | decoding sphere of
  patterns     e(P-1)  : c1+e(P-1) : c2+e(P-1) : ... : c(M-1)+e(P-1):  /  radius t = decoding
             - - - - - - - - - - - - - - - - - - - - - - - - - - - -     region)
  more than    eP        c1+eP       c2+eP       ...   c(M-1)+eP       \
  t errors     :          :           :                 :               | Q-P rows below line
               e(Q-1)    c1+e(Q-1)   c2+e(Q-1)   ...   c(M-1)+e(Q-1)   /

Example: Code #1 has minimum (Hamming) distance d_min = 3 and thus is a single error correcting code. The standard array for this code is:

                  c0      : c1      : c2      : c3      :
  codewords ----> 00000   : 01011   : 10101   : 11110   :
                 -----------------------------------------
         /        00001   : 01010   : 10100   : 11111   :  decoding spheres
  single |        00010   : 01001   : 10111   : 11100   :  of radius 1 =
  errors <        00100   : 01111   : 10001   : 11010   :  decoding regions
         |        01000   : 00011   : 11101   : 10110   :
         \        10000   : 11011   : 00101   : 01110   :
                 - - - - - - - - - - - - - - - - - - - - -
                  00110     01101     10011   : 11000   :
                  01100     00111     11001   : 10010   :

Each row is a coset and the entries in the first column are the coset leaders. The selection of the coset leaders for the last two rows is not unique. For example, 11000 and 10010 could have been used instead of 00110 and 01100.

Because the size of the standard array increases exponentially in the block length n, it is quite clearly not very practical even for moderately large values of n. Therefore the concept of a parity check matrix is introduced, which will lead to a more compact decoding method using syndromes. Definition: Let C ⊆ V, where V is the set of all q-ary n-tuples, be a linear (n, k) code. Then the dual or orthogonal code of C, denoted C⊥ ("C perp"), is defined by

  C⊥ = {u ∈ V : u · w = 0 for all w ∈ C}.

The orthogonal complement of C has dimension n - k and thus C⊥ is a linear (n, n - k) code.

Definition: A parity check matrix H of a linear (n, k) code C is an (n - k) × n matrix whose rows form a basis for the (n - k)-dimensional subspace C⊥ of the set of all q-ary n-tuples V. That is, any parity check matrix of C is a generator matrix of C⊥. Theorem: Let C be a q-ary linear (n, k) code with generator matrix G and parity check matrix H. Then, using arithmetic modulo q,

  c_i H^T = 0, where T denotes transpose, for all c_i ∈ C, and G H^T = 0.

Proof: Follows directly from the definition of C⊥ and the definition of H. QED

Example: Code #5 (the ternary (5, 2, 3) code) has generator matrix G and parity check matrix H given by

  G = [ 1 0 2 1 2 ]
      [ 0 1 2 2 1 ],

  H = [ 1 1 1 0 0 ]
      [ 2 1 0 1 0 ]
      [ 1 2 0 0 1 ].

Thus, C⊥ is the set

  C⊥ = {00000, 12001, 21002, 21010, 00011, 12012, 12020, 21021, 00022,
        11100, 20101, 02102, 02110, 11111, 20112, 20120, 02121, 11122,
        22200, 01201, 10202, 10210, 22211, 01212, 01220, 10221, 22222}.
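The two defining properties of this pair G, H are easy to verify numerically. The Python sketch below (an added illustration under the same modulo-3 arithmetic) checks that G H^T = 0 (mod 3) and that the rows of H generate the 27 codewords of C⊥ listed above:

  from itertools import product

  G5 = [[1, 0, 2, 1, 2],
        [0, 1, 2, 2, 1]]
  H5 = [[1, 1, 1, 0, 0],
        [2, 1, 0, 1, 0],
        [1, 2, 0, 0, 1]]
  q = 3

  # G * H^T should be the all-zero matrix modulo q.
  print(all(sum(g[j] * h[j] for j in range(5)) % q == 0 for g in G5 for h in H5))

  # The dual code is generated by the rows of H; it has q^(n-k) = 27 codewords.
  dual = {tuple(sum(u[i] * H5[i][j] for i in range(3)) % q for j in range(5))
          for u in product(range(q), repeat=3)}
  print(len(dual))   # 27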

Theorem: A linear code C contains a nonzero codeword of Hamming weight w iff a linearly dependent set of w columns of H exists. Proof: Let c ∈ C have weight w. From c H^T = 0 we can thus find a set of w linearly dependent columns of H. Conversely, if H contains a linearly dependent set of w columns, then we can construct a codeword c with nonzero coefficients corresponding to the w columns, such that c H^T = 0. QED Corollary: A linear code C has minimum weight w_min iff every set of w_min - 1 columns of H is linearly independent and at least one set of w_min columns of H is linearly dependent.

Definition: Elementary row operations on a matrix are the following: (i) Interchange of any two rows. (ii) Multiplication of any row by a nonzero scalar. (iii) Replacement of any row by the sum of itself and a multiple of any other row. Definition: A matrix is said to be in row-echelon form if it satisfies the following conditions: (i) The leading term of every nonzero row is a one. (ii) Every column with a leading term has all other entries zero. (iii) The leading term of any row is to the right of the leading term in every previous row. All-zero rows (if any) are placed at the bottom. Note: Under modulo q arithmetic, any matrix can be put into row-echelon form by elementary row operations if q is a prime number.

Example: Let q = 11 and consider the matrix

  A = [ 6 9 1 10  6 10 2 0 ]
      [ 1 7 4  9 10 10 8 6 ]
      [ 2 3 9  7  8  4 7 4 ].

Multiply the first row by 6^(-1) = 2. Then replace the second row by the difference of the second row minus the new first row. Next, subtract 2 times the new first row from the third row and replace the third row with the result to obtain

  A = [ 1 7 2 9 1 9  4 0 ]
      [ 0 0 2 0 9 1  4 6 ]
      [ 0 0 5 0 6 8 10 4 ].

Now start by multiplying the second row by 2^(-1) = 6. Then subtract 2 times this row from the first row. Finally, note that the third row is just a multiple of the second row, so that it can be replaced by an all-zero row. The result is A in row-echelon form:

  A = [ 1 7 0 9  3 8 0 5 ]
      [ 0 0 1 0 10 6 2 3 ]
      [ 0 0 0 0  0 0 0 0 ].
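The same reduction can be automated for any prime modulus. The Python sketch below (an added illustration; row_echelon_mod_p is an ad-hoc name) implements the elementary row operations modulo a prime p and reproduces the row-echelon form above:

  def row_echelon_mod_p(A, p):
      # Reduced row-echelon form of the matrix A over the integers modulo a prime p.
      A = [row[:] for row in A]
      rows, cols = len(A), len(A[0])
      r = 0
      for c in range(cols):
          pivot = next((i for i in range(r, rows) if A[i][c] % p != 0), None)
          if pivot is None:
              continue
          A[r], A[pivot] = A[pivot], A[r]
          inv = pow(A[r][c], -1, p)                 # modular inverse (p prime)
          A[r] = [(x * inv) % p for x in A[r]]
          for i in range(rows):
              factor = A[i][c] % p
              if i != r and factor:
                  A[i] = [(x - factor * y) % p for x, y in zip(A[i], A[r])]
          r += 1
      return A

  A = [[6, 9, 1, 10, 6, 10, 2, 0],
       [1, 7, 4, 9, 10, 10, 8, 6],
       [2, 3, 9, 7, 8, 4, 7, 4]]
  for row in row_echelon_mod_p(A, 11):
      print(row)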

Definition: Two codes which are the same except for a permutation of codeword components are called equivalent. The generator matrices G and G' of equivalent codes are related as follows. The code corresponding to G is the set of all linear combinations of rows of G and is thus unchanged under elementary row operations. Permutation of the columns of G corresponds to permutation of codeword components and therefore two codes are equivalent if their generator matrices G and G' are related by (i) column permutations, and (ii) elementary row operations.

From this it follows that every generator matrix G of a linear code is equivalent to one in row-echelon form. Because G is a k × n matrix whose rows span a k-dimensional subspace, all rows of G must be linearly independent. This proves the following Theorem: Every generator matrix of a q-ary linear code, where q is a prime (or a prime power), is equivalent to one of the form G = [I_k P] or G = [P I_k], where I_k is a k × k identity matrix and P is a k × (n - k) matrix. Definition: A code C with codewords whose first (or last) k components are the unmodified information symbols is called a systematic code. The remaining n - k codeword symbols are called parity symbols. A systematic code has generator matrix G = [I_k P] (or G = [P I_k]).

Let u = (u_0, u_1, ..., u_{k-1}) be the information word and let c = (c_0, c_1, ..., c_{n-1}) be the corresponding codeword. If G is in systematic form,

  G = [ 1 0 0 ... 0  p_{0,k}    p_{0,k+1}    ...  p_{0,n-1}   ]
      [ 0 1 0 ... 0  p_{1,k}    p_{1,k+1}    ...  p_{1,n-1}   ]
      [ 0 0 1 ... 0  p_{2,k}    p_{2,k+1}    ...  p_{2,n-1}   ]
      [ . . .     .  .          .                 .           ]
      [ 0 0 0 ... 1  p_{k-1,k}  p_{k-1,k+1}  ...  p_{k-1,n-1} ],

then the components of c = u G are

  c_j = u_j,  for 0 ≤ j ≤ k - 1,  and
  c_j = u_0 p_{0,j} + u_1 p_{1,j} + ... + u_{k-1} p_{k-1,j},  k ≤ j ≤ n - 1.

This latter set of equations is known as the set of parity-check equations of the code.

Theorem: The systematic form of H corresponding to G = [I_k P] is H = [-P^T I_{n-k}]. Proof: Multiplying corresponding submatrices in G and H^T together yields

  G H^T = [I_k P] [ -P      ] = -I_k P + P I_{n-k} = 0.
                  [ I_{n-k} ]

Thus, H = [-P^T I_{n-k}] satisfies G H^T = 0. QED

Written out explicitly, the systematic form of H is

  H = [ -p_{0,k}    -p_{1,k}    ...  -p_{k-1,k}    1 0 0 ... 0 ]
      [ -p_{0,k+1}  -p_{1,k+1}  ...  -p_{k-1,k+1}  0 1 0 ... 0 ]
      [ -p_{0,k+2}  -p_{1,k+2}  ...  -p_{k-1,k+2}  0 0 1 ... 0 ]
      [  .           .                .            . . .     . ]
      [ -p_{0,n-1}  -p_{1,n-1}  ...  -p_{k-1,n-1}  0 0 0 ... 1 ].

From c H^T = 0 it thus follows for the m-th row of H that

  -c_0 p_{0,k+m} - c_1 p_{1,k+m} - ... - c_{k-1} p_{k-1,k+m} + c_{k+m} = 0,  0 ≤ m ≤ n - k - 1.

Letting j = k + m and using the fact that for a systematic code c_i = u_i, 0 ≤ i ≤ k - 1, one obtains

  c_j = u_0 p_{0,j} + u_1 p_{1,j} + ... + u_{k-1} p_{k-1,j},  k ≤ j ≤ n - 1,

which is the same set of parity check equations as before. This shows that a systematic linear (n, k) code is completely specified either by its generator matrix G or by its parity check matrix H.

Example: Let q = 11 and consider the (8, 2) code with generator matrix

  G = [ 3 10 0 5 9 2 0 4 ]
      [ 1  7 1 9 2 3 2 8 ].

Putting G into row-echelon form yields

  G' = [ 1 7 0 9  3 8 0 5 ]
       [ 0 0 1 0 10 6 2 3 ].

Note that G and G' produce exactly the same set of codewords, but using a different set of basis vectors and therefore a different mapping from datawords u to codewords c. To obtain a generator matrix in systematic form, permute the second and third columns of G' so that

  Gsys = [ 1 0 7 9  3 8 0 5 ]
         [ 0 1 0 0 10 6 2 3 ].

Now, using Hsys = [-P^T I_{n-k}], one easily finds

  Hsys = [ 4 0 1 0 0 0 0 0 ]
         [ 2 0 0 1 0 0 0 0 ]
         [ 8 1 0 0 1 0 0 0 ]
         [ 3 5 0 0 0 1 0 0 ]
         [ 0 9 0 0 0 0 1 0 ]
         [ 6 8 0 0 0 0 0 1 ].

Finally, to obtain a parity check matrix H for the original generator matrix G, all the column permutations that were necessary to obtain Gsys from G need to be undone. Here only columns two and three need to be permuted to obtain

  H = [ 4 1 0 0 0 0 0 0 ]
      [ 2 0 0 1 0 0 0 0 ]
      [ 8 0 1 0 1 0 0 0 ]
      [ 3 0 5 0 0 1 0 0 ]
      [ 0 0 9 0 0 0 1 0 ]
      [ 6 0 8 0 0 0 0 1 ].

A quick check shows that indeed G H^T = 0 modulo 11.
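The construction Hsys = [-P^T I_{n-k}] and the final check are also easy to carry out in a few lines of Python (an added illustration; systematic_H is an ad-hoc name):

  def systematic_H(Gsys, q):
      # Build H = [-P^T  I_{n-k}] (mod q) from a systematic generator Gsys = [I_k  P].
      k, n = len(Gsys), len(Gsys[0])
      P = [row[k:] for row in Gsys]
      H = []
      for m in range(n - k):
          row = [(-P[i][m]) % q for i in range(k)]              # the -P^T part
          row += [1 if j == m else 0 for j in range(n - k)]     # the identity part
          H.append(row)
      return H

  Gsys = [[1, 0, 7, 9, 3, 8, 0, 5],
          [0, 1, 0, 0, 10, 6, 2, 3]]
  Hsys = systematic_H(Gsys, 11)
  for row in Hsys:
      print(row)
  # Check Gsys * Hsys^T = 0 (mod 11):
  print(all(sum(g[j] * h[j] for j in range(8)) % 11 == 0 for g in Gsys for h in Hsys))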

Theorem: Singleton bound. The minimum distance of any linear (n, k) code satisfies d_min ≤ n - k + 1. Proof: Any linear code can be converted to systematic form (possibly after permuting coordinates, which does not affect d_min) and thus G = [I_k P]. Since P is a k × (n - k) matrix, a systematic codeword with only one nonzero information symbol exists, and such a codeword has weight at most 1 + (n - k). The result follows. QED Note: It can be shown that the Singleton bound also applies to non-linear codes.

Definition: Any linear code whose d_min satisfies d_min = n - k + 1 is called maximum distance separable (MDS). Note: The name maximum distance separable code comes from the fact that such a code has the maximum possible (Hamming) distance between codewords and that the codeword symbols can be separated into data symbols and parity check symbols (i.e., the code has a systematic encoder). Example: Code #6. The ternary (4, 2) code with generator matrix

  G = [ 1 0 2 2 ]
      [ 0 1 2 1 ],

is an MDS code. The set of codewords is C = {0000, 0121, 0212, 1022, 1110, 1201, 2011, 2102, 2220}. From this it is easily seen that d_min = 3 = n - k + 1, which proves the claim that this code is MDS.

Definition: Let C be a linear (n, k) code and let v = c + e, where c ∈ C is a codeword and e is an error vector of length n, be a received n-tuple. The syndrome s of v is defined by s = v H^T. Theorem: All vectors in the same coset (cf. standard array decomposition of a linear code) have the same syndrome, unique to that coset. Proof: If v and v' are in the same coset, then v = c_i + e and v' = c_j + e for some e (coset leader) and codewords c_i, c_j. But, for any codeword c, c H^T = 0 and therefore

  s  = v H^T  = c_i H^T + e H^T = e H^T,
  s' = v' H^T = c_j H^T + e H^T = e H^T = s.

Conversely, suppose that s = s'. Then s - s' = (v - v') H^T = 0, which implies that v - v' is a codeword. But that further implies that v and v' are in the same coset. QED

Note: In practice, this theorem has some important consequences. To decode a linear code, one does not need to store the whole standard array. Only the mapping from the syndrome to the most likely error pattern needs to be stored. Example: Syndrome decoding for code #1. This is a binary code with parity check matrix

  H = [ 1 0 1 0 0 ]
      [ 0 1 0 1 0 ]
      [ 1 1 0 0 1 ].

Computing s = e H^T for e = 0, all single error patterns, and some double error patterns, the following table that uniquely relates syndromes to error patterns is obtained.

  Error e   Syndrome s
  00000     000
  00001     001
  00010     010
  00100     100
  01000     011
  10000     101
  --------------------
  00110     110
  01100     111

Note that an incomplete (or bounded distance) decoder would only use the first six entries (above the dividing line) for decoding. The choice of the error patterns for the last two entries is somewhat arbitrary (just as it was in the case of the standard array), and other double error patterns that yield the same syndromes could have been used. Suppose now that v = (11101) was received. To decode v, compute s = v H^T = (011). From the syndrome lookup table the corresponding error pattern is e = (01000). Finally, the corrected codeword c is obtained as c = v - e = (10101).
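The whole procedure is easy to mechanize: compute the syndrome of every low-weight error pattern, keep one minimum-weight pattern per syndrome, and then decode by table lookup. The Python sketch below (an added illustration, not part of the original slides) rebuilds exactly this table for code #1 and decodes v = 11101:

  from itertools import product

  H = [[1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [1, 1, 0, 0, 1]]
  n, r = 5, 3

  def syndrome(v):
      return tuple(sum(v[j] * H[i][j] for j in range(n)) % 2 for i in range(r))

  # For every syndrome keep a minimum-weight error pattern (patterns visited by weight).
  table = {}
  for e in sorted(product(range(2), repeat=n), key=sum):
      table.setdefault(syndrome(e), e)

  v = (1, 1, 1, 0, 1)
  s = syndrome(v)                                    # (0, 1, 1)
  e = table[s]                                       # (0, 1, 0, 0, 0)
  c = tuple((vi - ei) % 2 for vi, ei in zip(v, e))   # (1, 0, 1, 0, 1)
  print(s, e, c)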

Note: To construct a linear q-ary single error correcting code with redundancy r = n - k, one can start from a parity check matrix H whose columns are q-ary r-tuples with the property that they are all distinct, even when multiplied by an arbitrary nonzero q-ary scalar. The resulting codes are called Hamming codes. Example: Natural parameters (n, k, d_min) of commonly used small binary linear codes are:

  (7, 4, 3)
  (15, 11, 3)   (15, 7, 5)     (15, 5, 7)
  (23, 12, 7)
  (31, 26, 3)   (31, 21, 5)    (31, 16, 7)    (31, 11, 11)
  (63, 57, 3)   (63, 51, 5)    (63, 45, 7)    (63, 39, 9)    (63, 36, 11)
  (127, 120, 3) (127, 113, 5)  (127, 106, 7)  (127, 99, 9)   (127, 92, 11)
  (255, 247, 3) (255, 239, 5)  (255, 231, 7)  (255, 223, 9)  (255, 215, 11)

Note that most of the block lengths of these codes are of the form 2^m - 1 for some integer m. Such block lengths are called primitive block lengths.

Modified Linear Blockcodes. Often the natural parameters of block codes are not suitable for a particular application; e.g., for computer storage applications the data length is typically a multiple of 8, whereas the natural parameter k of a code may be a crummy number like 113 (= 14 × 8 + 1). Thus, it may be necessary to change either k or n or both. To explain the different procedures, code #7, which is a linear binary (6, 3, 3) code with G and H as shown below, is used.

  G = [ 1 0 0 1 1 0 ]
      [ 0 1 0 1 0 1 ]
      [ 0 0 1 0 1 1 ],

  H = [ 1 1 0 1 0 0 ]
      [ 1 0 1 0 1 0 ]
      [ 0 1 1 0 0 1 ].

The six different modifications that can be applied to the parameters n and k of a linear block code are: lengthening (n+, k+), shortening (n-, k-), extending (n+, k=), puncturing (n-, k=), augmenting (n=, k+), and expurgating (n=, k-).

Lengthening. Increase blocklength n by adding more data symbols while keeping the redundancy r = n - k fixed. The result is a code that has n and k increased by the same amount. In the best case d_min will be unchanged, but it can drop to as low as 1 and needs to be reexamined carefully. Example: Code #7 lengthened by 1 results in a (7, 4, 3) code with generator and parity check matrices

  G = [ 1 0 0 0 1 1 1 ]
      [ 0 1 0 0 1 1 0 ]
      [ 0 0 1 0 1 0 1 ]
      [ 0 0 0 1 0 1 1 ],

  H = [ 1 1 1 0 1 0 0 ]
      [ 1 1 0 1 0 1 0 ]
      [ 1 0 1 1 0 0 1 ].

Shortening. Decrease blocklength n by dropping data symbols while keeping the redundancy r fixed. The resulting code has n and k reduced by the same amount. In most cases d_min will be unchanged; occasionally d_min may increase. Example: Code #7 shortened by 1 results in a (5, 2, 3) code with generator and parity check matrices

  G = [ 1 0 1 0 1 ]
      [ 0 1 0 1 1 ],

  H = [ 1 0 1 0 0 ]
      [ 0 1 0 1 0 ]
      [ 1 1 0 0 1 ].

Extending. Increase blocklength n by adding more parity check symbols while keeping k fixed. The result is a code that has n and r = n - k increased by the same amount. The minimum distance may or may not increase and needs to be reexamined. A common method to extend a code from n to n + 1 is to add an overall parity check. Example: Code #7 extended by 1 results in a (7, 3, 4) code with generator and parity check matrices

  G = [ 1 0 0 1 1 0 1 ]
      [ 0 1 0 1 0 1 1 ]
      [ 0 0 1 0 1 1 1 ],

  H = [ 1 1 0 1 0 0 0 ]
      [ 1 0 1 0 1 0 0 ]
      [ 0 1 1 0 0 1 0 ]
      [ 1 1 1 0 0 0 1 ].

Puncturing. Decrease blocklength n by dropping parity check symbols while keeping k fixed. The resulting code has n and r = n - k decreased by the same amount. Except for the trivial case of removing all-zero columns from G, the minimum distance decreases. Example: Code #7 punctured by 1 yields a (5, 3, 2) code with generator and parity check matrices

  G = [ 1 0 0 1 1 ]
      [ 0 1 0 1 0 ]
      [ 0 0 1 0 1 ],

  H = [ 1 1 0 1 0 ]
      [ 1 0 1 0 1 ].

Augmenting. Increase datalength k while keeping n fixed by reducing the redundancy r = n - k. The result is a code which has k increased by the same amount as r = n - k is decreased. Because of the reduction of r, the minimum distance generally decreases. Example: Code #7 augmented by 1 gives a (6, 4, 2) code with generator and parity check matrices

  G = [ 1 0 0 1 1 0 ]
      [ 0 1 0 1 0 1 ]
      [ 0 0 1 0 1 1 ]
      [ 0 0 0 1 1 1 ],

  H = [ 0 1 1 1 1 0 ]
      [ 1 0 1 1 0 1 ].

Expurgating. Decrease datalength k while keeping n fixed by increasing the redundancy r = n - k. The resulting code has k decreased by the same amount as r = n - k is increased. The increase in r may or may not lead to an increase in d_min. Example: Code #7 expurgated by 1 gives a (6, 2, 4) code with generator and parity check matrices

  G = [ 1 0 1 1 0 1 ]
      [ 0 1 1 1 1 0 ],

  H = [ 1 1 1 0 0 0 ]
      [ 1 1 0 1 0 0 ]
      [ 1 0 1 0 1 0 ]
      [ 0 1 1 0 0 1 ].

Definition: |u|u+v| construction. Let u = (u_0, u_1, ..., u_{n-1}) and v = (v_0, v_1, ..., v_{n-1}) be two q-ary n-tuples and define

  |u|u+v| = (u_0, u_1, ..., u_{n-1}, u_0 + v_0, u_1 + v_1, ..., u_{n-1} + v_{n-1}),

where the addition is modulo q addition. Let C_1 be a q-ary linear (n, k_1, d_min = d_1) code and let C_2 be a q-ary linear (n, k_2, d_min = d_2) code. A new q-ary code C of length 2n is then defined by

  C = { |u|u+v| : u ∈ C_1, v ∈ C_2 }.

The generator matrix of the (2n, k_1 + k_2) code C is

  G = [ G_1  G_1 ]
      [ 0    G_2 ],

where 0 is a k_2 × n all-zero matrix, G_1 is the generator matrix of C_1 and G_2 is the generator matrix of C_2.

Theorem: The minimum distance of the code C obtained from the |u|u+v| construction is d_min(C) = min{2d_1, d_2}. Proof: Let x = |u|u+v| and y = |u'|u'+v'| be two distinct codewords of C. Then

  d(x, y) = w(u - u') + w(u - u' + v - v'),

where d(., .) denotes Hamming distance and w(.) denotes Hamming weight. Case (i): v = v'. Then u ≠ u' and d(x, y) = 2 w(u - u') ≥ 2 d_1. Case (ii): v ≠ v'. Then, since w(a) + w(b) ≥ w(b - a) for any two n-tuples a and b, d(x, y) ≥ w(v - v') ≥ d_2. In both cases d(x, y) ≥ min{2d_1, d_2}, and this bound is attained by taking either a minimum-weight u ∈ C_1 with v = 0 or u = 0 with a minimum-weight v ∈ C_2. QED
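As a concrete check of this theorem (an added illustration, not from the original slides), one can take C_1 to be code #1, a binary (5, 2, 3) code, and C_2 to be the binary (5, 1, 5) repetition code, a choice made here purely for illustration. The resulting (10, 3) code should have d_min = min{2*3, 5} = 5:

  from itertools import product, combinations

  def codewords(G, q=2):
      k, n = len(G), len(G[0])
      return {tuple(sum(u[i] * G[i][j] for i in range(k)) % q for j in range(n))
              for u in product(range(q), repeat=k)}

  def dmin(code):
      return min(sum(a != b for a, b in zip(x, y)) for x, y in combinations(code, 2))

  G1 = [[1, 0, 1, 0, 1], [0, 1, 0, 1, 1]]   # C1 = code #1
  G2 = [[1, 1, 1, 1, 1]]                    # C2 = length-5 repetition code
  C = {u + tuple((a + b) % 2 for a, b in zip(u, v))
       for u in codewords(G1) for v in codewords(G2)}
  print(len(C), dmin(C))    # 8 5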