Example 1. A source emits symbols X_i, 1 ≤ i ≤ 6, in the BCD format with probabilities P(X_i) as given in the table below, at a rate R_s = 9.6 kbaud (baud = symbol/second).

X_i   P(X_i)   BCD word
A     0.30     000
B     0.10     001
C     0.02     010
D     0.15     011
E     0.40     100
F     0.03     101

1. State (i) the information rate and (ii) the data rate of the source.
2. Apply Shannon-Fano coding to the source signal characterised in the table. Are there any disadvantages in the resulting code words?
3. What is the original symbol sequence of the Shannon-Fano coded signal 110011110000110101100?
4. What is the data rate of the signal after Shannon-Fano coding? What compression factor has been achieved?
5. Derive the coding efficiency of both the uncoded BCD signal as well as the Shannon-Fano coded signal.
6. Repeat parts 2 to 5 but this time with Huffman coding.
Example 1 - Solution.

1. (i) Entropy of source:

H = -\sum_{i=1}^{6} P(X_i) \log_2 P(X_i)
  = -0.30 \log_2 0.30 - 0.10 \log_2 0.10 - 0.02 \log_2 0.02 - 0.15 \log_2 0.15 - 0.40 \log_2 0.40 - 0.03 \log_2 0.03
  = 0.52109 + 0.33219 + 0.11288 + 0.41054 + 0.52877 + 0.15177
  = 2.05724 bits/symbol

Information rate:

R = H · R_s = 2.05724 [bits/symbol] · 9600 [symbols/s] = 19750 [bits/s]

(ii) Data rate = 3 [bits/symbol] · 9600 [symbols/s] = 28800 [bits/s]

2. Shannon-Fano coding: order the symbols by falling probability, then repeatedly split the list into two groups of (nearly) equal total probability, assigning a 0 to the upper group and a 1 to the lower group.

X_i   P(X_i)   I(X_i) [bits]   steps         code
E     0.40     1.32            0             0
A     0.30     1.74            1 0           10
D     0.15     2.74            1 1 0         110
B     0.10     3.32            1 1 1 0       1110
F     0.03     5.06            1 1 1 1 0     11110
C     0.02     5.64            1 1 1 1 1     11111

Disadvantage: the rare code words have the maximum possible length of q - 1 = 6 - 1 = 5, and a buffer of 5 bits is required.
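As a quick numerical cross-check of parts 1 and 2 (not part of the original solution), the short Python sketch below recomputes the entropy, the two rates, and a Shannon-Fano code from the table; the splitting rule and the convention of assigning 0 to the more probable group are assumptions chosen to match the code above.

```python
import math

probs = {"A": 0.30, "B": 0.10, "C": 0.02, "D": 0.15, "E": 0.40, "F": 0.03}
Rs = 9600  # symbol rate in symbols/s

# Part 1: source entropy, information rate and (3-bit BCD) data rate
H = -sum(p * math.log2(p) for p in probs.values())
print(f"H = {H:.5f} bits/symbol")                 # ~2.05724
print(f"information rate = {H * Rs:.1f} bit/s")   # ~19749.5
print(f"BCD data rate    = {3 * Rs} bit/s")       # 28800

# Part 2: Shannon-Fano -- sort by falling probability, then split recursively
# into two groups of (nearly) equal total probability; 0 goes to the upper group.
def shannon_fano(items):
    if len(items) == 1:
        return {items[0][0]: ""}
    total, acc, split, best = sum(p for _, p in items), 0.0, 1, float("inf")
    for k in range(1, len(items)):
        acc += items[k - 1][1]
        if abs(2 * acc - total) < best:
            best, split = abs(2 * acc - total), k
    codes = {s: "0" + c for s, c in shannon_fano(items[:split]).items()}
    codes.update({s: "1" + c for s, c in shannon_fano(items[split:]).items()})
    return codes

ordered = sorted(probs.items(), key=lambda kv: -kv[1])
print(shannon_fano(ordered))   # {'E': '0', 'A': '10', 'D': '110', 'B': '1110', 'F': '11110', 'C': '11111'}
```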
3. Shannon-Fano encoded sequence:

110 0 11110 0 0 0 110 10 110 0 = DEFEEEDADE

4. Average code word length:

d̄ = 0.4 · 1 + 0.3 · 2 + 0.15 · 3 + 0.1 · 4 + 0.05 · 5 = 2.1 [bits/symbol]

Data rate:

d̄ · R_s = 2.1 · 9600 = 20160 [bits/s]

Compression factor:

3 [bits] / d̄ [bits] = 3 / 2.1 = 1.4286

5. Coding efficiency before Shannon-Fano coding:

CE = information rate / data rate = 19750 / 28800 = 68.58%

Coding efficiency after Shannon-Fano coding:

CE = information rate / data rate = 19750 / 20160 = 97.97%

Hence Shannon-Fano coding brought the coding efficiency close to 100%.
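A small sketch (again not part of the original solution) that decodes the given bit stream with the code table above and recomputes the figures of parts 4 and 5; the helper decode() is illustrative only.

```python
code  = {"E": "0", "A": "10", "D": "110", "B": "1110", "F": "11110", "C": "11111"}
probs = {"E": 0.40, "A": 0.30, "D": 0.15, "B": 0.10, "F": 0.03, "C": 0.02}
H, Rs = 2.05724, 9600

def decode(bits, code):
    inv = {w: s for s, w in code.items()}   # code word -> symbol
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inv:                      # prefix code: first hit is the symbol
            out.append(inv[buf])
            buf = ""
    return "".join(out)

print(decode("110011110000110101100", code))          # DEFEEEDADE

d = sum(p * len(code[s]) for s, p in probs.items())   # average code word length
print(d, d * Rs, 3 / d)                                # ~2.1, ~20160, ~1.4286
print(f"coding efficiency = {H / d:.4f}")              # ~0.9796
```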
6. Huffman coding. In each step, the two least probable (groups of) symbols are combined:

step 1: F (0.03) + C (0.02)   → FC (0.05);      F ← 0, C ← 1
step 2: B (0.10) + FC (0.05)  → BFC (0.15);     B ← 0, FC ← 1
step 3: D (0.15) + BFC (0.15) → DBFC (0.30);    D ← 0, BFC ← 1
step 4: A (0.30) + DBFC (0.30) → ADBFC (0.60);  A ← 0, DBFC ← 1
step 5: E (0.40) + ADBFC (0.60) → 1.00;         E ← 1, ADBFC ← 0

Reading the assigned bits from the last step back to the first gives the code words:

X_i   P(X_i)   code
E     0.40     1
A     0.30     00
D     0.15     010
B     0.10     0110
F     0.03     01110
C     0.02     01111

Same disadvantage as with Shannon-Fano coding: the rare code words have the maximum possible length of q - 1 = 6 - 1 = 5, and a buffer of 5 bits is required.

Decoding the bit stream of part 3 with this code:

1 1 00 1 1 1 1 00 00 1 1 010 1 1 00 = EEAEEEEAAEEDEEA

The same data rate and the same compression factor are achieved as with Shannon-Fano coding, and the coding efficiency of the Huffman code is identical to that of the Shannon-Fano code.
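For part 6, here is a minimal heapq-based Huffman construction (a sketch, not the tabular procedure used in the notes). Depending on tie-breaking it may assign different, but equally optimal, code words; the code word lengths, and hence the average length of 2.1 bits/symbol, agree with the table above.

```python
import heapq
from itertools import count

probs = {"E": 0.40, "A": 0.30, "D": 0.15, "B": 0.10, "F": 0.03, "C": 0.02}

tie = count()   # tie-breaker so heap tuples never compare the dict payloads
heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
heapq.heapify(heap)
while len(heap) > 1:
    p1, _, c1 = heapq.heappop(heap)   # two least probable groups
    p2, _, c2 = heapq.heappop(heap)
    merged = {s: "0" + w for s, w in c1.items()}
    merged.update({s: "1" + w for s, w in c2.items()})
    heapq.heappush(heap, (p1 + p2, next(tie), merged))

codes = heap[0][2]
avg = sum(probs[s] * len(w) for s, w in codes.items())
print(codes)                                      # lengths 1, 2, 3, 4, 5, 5 as in the table
print(f"average length = {avg:.2f} bits/symbol")  # 2.10
```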
Example 2. Consider the binary symmetric channel (BSC) shown in the figure below.

[Figure: BSC with input symbols X_0 (probability P(X_0) = p) and X_1 (probability P(X_1) = 1 - p) and output symbols Y_0, Y_1; a symbol is received correctly with probability 1 - p_e and flipped with the channel error probability p_e.]

1. From the definition of mutual information,

I(X, Y) = \sum_i \sum_j P(X_i, Y_j) \log_2 \frac{P(X_i|Y_j)}{P(X_i)}   [bits/symbol]

derive both (i) a formula relating I(X, Y), the source entropy H(X), and the average information lost per symbol H(X|Y), and (ii) a formula relating I(X, Y), the destination entropy H(Y), and the error entropy H(Y|X).

2. State and justify the relation (>, <, =, ≥, or ≤) between H(X|Y) and H(Y|X).

3. Considering the BSC in the figure, we now have p = 1/4 and a channel error probability p_e = 0.1. Calculate all probabilities P(X_i, Y_j) and P(X_i|Y_j), and derive the numerical value for the mutual information I(X, Y).
Example 2 - Solution.

1. (i) Relating I(X, Y) to the source entropy and the average information lost per symbol:

I(X, Y) = \sum_i \sum_j P(X_i, Y_j) \log_2 \frac{P(X_i|Y_j)}{P(X_i)}
        = -\sum_i \sum_j P(X_i, Y_j) \log_2 P(X_i) + \sum_i \sum_j P(X_i, Y_j) \log_2 P(X_i|Y_j)
        = -\sum_i \left( \sum_j P(X_i, Y_j) \right) \log_2 P(X_i) - \left( -\sum_i \sum_j P(X_i, Y_j) \log_2 P(X_i|Y_j) \right)
        = -\sum_i P(X_i) \log_2 P(X_i) - H(X|Y)

I(X, Y) = H(X) - H(X|Y)

(ii) Bayes' rule:

\frac{P(X_i|Y_j)}{P(X_i)} = \frac{P(X_i, Y_j)}{P(X_i) P(Y_j)} = \frac{P(Y_j|X_i)}{P(Y_j)}
Hence, relating I(X, Y) to the destination entropy and the error entropy:

I(X, Y) = \sum_i \sum_j P(X_i, Y_j) \log_2 \frac{P(Y_j|X_i)}{P(Y_j)}
        = -\sum_j \sum_i P(Y_j, X_i) \log_2 P(Y_j) + \sum_j \sum_i P(Y_j, X_i) \log_2 P(Y_j|X_i)
        = H(Y) - H(Y|X)

2. Unless p_e = 0.5 or the source symbols are equiprobable, the symbols Y at the destination are more nearly equiprobable ("more balanced") than the source symbols X, hence H(Y) ≥ H(X). Since I(X, Y) = H(X) - H(X|Y) = H(Y) - H(Y|X), it follows that H(X|Y) ≤ H(Y|X).

3. Joint probabilities:

P(X_0, Y_0) = P(X_0) P(Y_0|X_0) = (1/4) · (9/10) = 0.225
P(X_0, Y_1) = P(X_0) P(Y_1|X_0) = (1/4) · (1/10) = 0.025
P(X_1, Y_0) = P(X_1) P(Y_0|X_1) = (3/4) · (1/10) = 0.075
P(X_1, Y_1) = P(X_1) P(Y_1|X_1) = (3/4) · (9/10) = 0.675

Destination total probabilities:

P(Y_0) = P(X_0) P(Y_0|X_0) + P(X_1) P(Y_0|X_1) = (1/4) · (9/10) + (3/4) · (1/10) = 0.3
P(Y_1) = P(X_0) P(Y_1|X_0) + P(X_1) P(Y_1|X_1) = (1/4) · (1/10) + (3/4) · (9/10) = 0.7

Conditional probabilities:

P(X_0|Y_0) = P(X_0, Y_0) / P(Y_0) = 0.225 / 0.3 = 0.75
P(X_0|Y_1) = P(X_0, Y_1) / P(Y_1) = 0.025 / 0.7 = 0.035714
P(X_1|Y_0) = P(X_1, Y_0) / P(Y_0) = 0.075 / 0.3 = 0.25
P(X_1|Y_1) = P(X_1, Y_1) / P(Y_1) = 0.675 / 0.7 = 0.964286

Mutual information:

I(X, Y) = P(X_0, Y_0) \log_2 \frac{P(Y_0|X_0)}{P(Y_0)} + P(X_0, Y_1) \log_2 \frac{P(Y_1|X_0)}{P(Y_1)}
        + P(X_1, Y_0) \log_2 \frac{P(Y_0|X_1)}{P(Y_0)} + P(X_1, Y_1) \log_2 \frac{P(Y_1|X_1)}{P(Y_1)}
        = 0.3566166 - 0.0701839 - 0.1188722 + 0.2447348
        = 0.4122953 [bits/symbol]
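A brief numerical sketch (not in the original notes) that brute-forces the same result directly from the channel model:

```python
import math

p, pe = 0.25, 0.1
PX   = [p, 1 - p]                       # source probabilities P(X_0), P(X_1)
PYgX = [[1 - pe, pe],                   # P(Y_j | X_0)
        [pe, 1 - pe]]                   # P(Y_j | X_1)

PXY = [[PX[i] * PYgX[i][j] for j in range(2)] for i in range(2)]   # joint probabilities
PY  = [sum(PXY[i][j] for i in range(2)) for j in range(2)]         # destination marginals

I = sum(PXY[i][j] * math.log2(PYgX[i][j] / PY[j])
        for i in range(2) for j in range(2))
print(f"P(Y0) = {PY[0]:.3f}, P(Y1) = {PY[1]:.3f}")   # 0.300, 0.700
print(f"I(X;Y) = {I:.5f} bits/symbol")               # ~0.41230
```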
Example 3. A digital communication system uses a 4-ary signalling scheme. Assume that the 4 symbols -3, -1, 1, 3 are chosen with probabilities 1/8, 1/4, 1/2, 1/8, respectively. The channel is an ideal channel with AWGN, the transmission rate is 2 Mbaud (2 · 10^6 symbols/s), and the channel signal-to-noise ratio is known to be 15.

1. Determine the source information rate.
2. If you are able to employ some capacity-approaching error-correction coding technique and would like to achieve error-free transmission, what is the minimum channel bandwidth required?
Example 3 - Solution.

1. Source entropy:

H = 2 · (1/8) \log_2 8 + (1/4) \log_2 4 + (1/2) \log_2 2 = 3/4 + 1/2 + 1/2 = 7/4 [bits/symbol]

Source information rate:

R = H · R_s = (7/4) · 2 · 10^6 = 3.5 [Mbits/s]

2. To be able to achieve error-free transmission, the information rate must not exceed the channel capacity:

R ≤ C = B \log_2 (1 + S_P / N_P)
3.5 · 10^6 ≤ B \log_2 (1 + 15) = 4 B

Thus B ≥ 0.875 [MHz].
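A one-line check of the bandwidth bound (a sketch; the variable names are mine, not from the notes):

```python
import math

H, Rs, snr = 1.75, 2e6, 15          # bits/symbol, symbols/s, linear S/N
R = H * Rs                          # information rate: 3.5 Mbit/s
B_min = R / math.log2(1 + snr)      # Shannon-Hartley: R <= B log2(1 + S/N)
print(f"R = {R/1e6:.1f} Mbit/s, minimum bandwidth = {B_min/1e6:.3f} MHz")   # 0.875 MHz
```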
Example 4. A predictive source encoder generates a bit stream, and it is known that the probability of a bit taking the value 0 is P(0) = p = 0.95. The bit stream is then encoded by a run-length encoder (RLC) with a codeword length of n = 5 bits.

1. Determine the compression ratio of the RLC.
2. Find the encoder input patterns that produce the following encoder output codewords:
   11111, 11110, 11101, 11100, 11011, 00001, 00000
3. What is the encoder input sequence of the RLC coded signal 110110000011110?
Example 4 - Solution.

1. The codeword length after RLC is n = 5 bits, so each codeword can represent a run of up to N = 2^n - 1 = 31 zeros. The average number of source bits d̄ absorbed per codeword is

d̄ = \sum_{l=0}^{N-1} (l + 1) p^l (1 - p) + N p^N = \frac{1 - p^N}{1 - p}

Compression ratio:

d̄ / n = \frac{1 - p^N}{n (1 - p)} = \frac{1 - 0.95^{31}}{5 · 0.05} ≈ 3.184

2. RLC table (output codeword ← encoder input pattern):

11111 ← 0...0 (31 zeros, run not terminated by a one)
11110 ← 0...0 (30 zeros) 1
11101 ← 0...0 (29 zeros) 1
11100 ← 0...0 (28 zeros) 1
11011 ← 0...0 (27 zeros) 1
  ⋮
00001 ← 01
00000 ← 1

3. Splitting the coded signal into 5-bit codewords, 11011 00000 11110, the encoder input sequence is

0...0 (27 zeros) 1 1 0...0 (30 zeros) 1
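To make the mapping concrete, here is a small run-length coding sketch (the helper names rlc_encode/rlc_decode are illustrative, not from the notes): each 5-bit codeword carries the number of zeros preceding a one, and the all-ones word 11111 marks a full run of 31 zeros with no terminating one.

```python
n = 5              # RLC codeword length in bits
N = 2**n - 1       # maximum run length: 31 zeros
p = 0.95           # probability of a source bit being 0

def rlc_encode(bits):
    out, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
            if run == N:                        # full run of 31 zeros, no terminating one
                out.append(format(N, f"0{n}b"))
                run = 0
        else:                                   # a '1' terminates the current run
            out.append(format(run, f"0{n}b"))
            run = 0
    return " ".join(out)                        # (a trailing partial run is ignored here)

def rlc_decode(codewords):
    out = []
    for cw in codewords.split():
        l = int(cw, 2)
        out.append("0" * l + ("" if l == N else "1"))
    return "".join(out)

d_bar = (1 - p**N) / (1 - p)                    # average source bits per codeword, ~15.92
print(f"compression ratio = {d_bar / n:.3f}")   # ~3.184

stream = "0" * 27 + "1" + "1" + "0" * 30 + "1"
print(rlc_encode(stream))                       # 11011 00000 11110
print(rlc_decode("11011 00000 11110") == stream)   # True
```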