Version: 0.1 This is an early version. A better version would be hopefully posted in the near future.

Chapter 28 Shannon's theorem

By Sariel Har-Peled, December 7, 2009^1

^1 This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

"This has been a novel about some people who were punished entirely too much for what they did. They wanted to have a good time, but they were like children playing in the street; they could see one after another of them being killed - run over, maimed, destroyed - but they continued to play anyhow. We really all were very happy for a while, sitting around not toiling but just bullshitting and playing, but it was for such a terrible brief time, and then the punishment was beyond belief; even when we could see it, we could not believe it."
A Scanner Darkly, Philip K. Dick

28.1 Coding: Shannon's Theorem

We are interested in the problem of sending messages over a noisy channel. We will assume that the channel noise behaves "nicely".

Definition 28.1.1 The input to a binary symmetric channel with parameter $p$ is a sequence of bits $x_1, x_2, \ldots$, and the output is a sequence of bits $y_1, y_2, \ldots$, such that $\mathbf{P}[x_i = y_i] = 1 - p$ independently for each $i$.

Translation: Every transmitted bit has the same probability $p$ of being flipped by the channel, independently of the other bits. The question is how much information we can send on the channel with this level of noise. Naturally, a channel would have some capacity constraints (say, at most 4,000 bits per second can be sent on the channel), and the question is how to send the largest amount of information, so that the receiver can recover the original information sent.
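The definition of the binary symmetric channel is easy to simulate. The following is a minimal Python sketch, not part of the original notes: the function name `binary_symmetric_channel`, the seed, and the choice $p = 0.1$ are illustrative assumptions.

```python
import random

def binary_symmetric_channel(bits, p, rng):
    """Flip each bit independently with probability p (Definition 28.1.1)."""
    return [b ^ 1 if rng.random() < p else b for b in bits]

rng = random.Random(2009)  # fixed seed, used only to make the run reproducible
p = 0.1                    # illustrative noise parameter (an assumption)

# Send many uniformly random bits and measure the fraction that got flipped.
sent = [rng.randrange(2) for _ in range(100_000)]
received = binary_symmetric_channel(sent, p, rng)
flipped_fraction = sum(x != y for x, y in zip(sent, received)) / len(sent)
print(f"fraction of flipped bits: {flipped_fraction:.4f} (p = {p})")
```

With this many bits, the observed fraction of flipped bits should be close to $p$, which is exactly the statement $\mathbf{P}[x_i = y_i] = 1 - p$.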

Now, it is important to realize that handling noise is unavoidable in the real world. Furthermore, there are tradeoffs between channel capacity and noise levels (i.e., we might be able to send considerably more bits on the channel, but the probability of flipping, i.e., $p$, might be much larger). In designing a communication protocol over this channel, we need to figure out the optimal choice as far as the amount of information sent.

Definition 28.1.2 A $(k, n)$ encoding function $\mathrm{Enc} : \{0,1\}^k \to \{0,1\}^n$ takes as input a sequence of $k$ bits and outputs a sequence of $n$ bits. A $(k, n)$ decoding function $\mathrm{Dec} : \{0,1\}^n \to \{0,1\}^k$ takes as input a sequence of $n$ bits and outputs a sequence of $k$ bits.

Thus, the sender would use the encoding function to send its message, and the receiver would use the received string (with the noise in it) to recover the sent message. That is, the sender starts with a message of $k$ bits, blows it up to $n$ bits using the encoding function, to get some robustness to noise, and sends it over the (noisy) channel to the receiver. The receiver takes the given (noisy) message with $n$ bits, and uses the decoding function to recover the original $k$ bits of the message.

Naturally, we would like $k$ to be as large as possible (for a fixed $n$), so that we can send as much information as possible on the channel.

The following celebrated result of Shannon^2 in 1948 states exactly how much information can be sent on such a channel.

Theorem 28.1.3 (Shannon's theorem) For a binary symmetric channel with parameter $p < 1/2$ and for any constants $\delta, \gamma > 0$, where $n$ is sufficiently large, the following holds:
(i) For any $k \leq n(1 - H(p) - \delta)$ there exist $(k, n)$ encoding and decoding functions such that the probability the receiver fails to obtain the correct message is at most $\gamma$ for every possible $k$-bit input message.
(ii) There are no $(k, n)$ encoding and decoding functions with $k \geq n(1 - H(p) + \delta)$ such that the probability of decoding correctly is at least $\gamma$ for a $k$-bit input message chosen uniformly at random.

28.2 Proof of Shannon's theorem

The proof is not hard, but requires some care, and we will break it into parts.

28.2.1 How to encode and decode efficiently

28.2.1.1 The scheme

Our scheme would be simple.
Pick $k \leq n(1 - H(p) - \delta)$. For any number $i = 0, \ldots, \widehat{K} = 2^{k+1} - 1$, randomly generate a binary string $Y_i$ made out of $n$ bits, each one chosen independently and uniformly. Let $Y_0, \ldots, Y_{\widehat{K}}$ denote these codewords.

^2 Claude Elwood Shannon (April 30, 1916 - February 24, 2001), an American electrical engineer and mathematician, has been called "the father of information theory".
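To get a feel for the constraint $k \leq n(1 - H(p) - \delta)$, the following Python sketch computes the binary entropy $H(p) = -p \lg p - (1-p)\lg(1-p)$ and the largest admissible $k$. The helper names and the sample values of $n$, $p$ and $\delta$ are assumptions chosen for illustration; they do not come from the notes.

```python
import math

def binary_entropy(p):
    """H(p) = -p lg p - (1-p) lg(1-p), with the convention H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def max_message_bits(n, p, delta):
    """Largest integer k with k <= n(1 - H(p) - delta); 0 if that bound is negative."""
    return max(0, math.floor(n * (1 - binary_entropy(p) - delta)))

# Illustrative parameters (assumptions, not values from the text).
n, p, delta = 1000, 0.1, 0.05
k = max_message_bits(n, p, delta)
print(f"H({p}) = {binary_entropy(p):.4f}; with n = {n}, delta = {delta}: k <= {k}")
```

Note that the scheme then draws $2^{k+1}$ random codewords, so it is only practical to run for very small $k$; the sketch above merely evaluates the rate.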

For each of these codewords we will compute the probability that if we send this codeword, the receiver would fail. Let $X_0, \ldots, X_K$, where $K = 2^k - 1$, be the $2^k$ codewords with the lowest probability to fail. We assign these words to the $2^k$ messages we need to encode in an arbitrary fashion.

The decoding of a received word $w$ is done by going over all the codewords, and finding all the codewords that are in (Hamming) distance in the range $[p(1 - \varepsilon)n, \; p(1 + \varepsilon)n]$ from $w$. If there is only a single word $X_i$ with this property, we return $i$ as the decoded word. Otherwise, if there are no such words or there is more than one such word, the decoder stops and reports an error.

28.2.1.2 The proof

Intuition. Let $S_i$ be all the binary strings (of length $n$) such that if the receiver gets this word, it would decipher it to be $i$ (here we are still using the extended set of codewords $Y_0, \ldots, Y_{\widehat{K}}$, where $\widehat{K} = 2^{k+1} - 1$). Note that if we remove some codewords from consideration, the set $S_i$ just increases in size. Let $W_i$ be the probability that, when $X_i$ is sent, it is not deciphered correctly. Formally, let $r$ denote the received word. We have that

\[ W_i = \sum_{r \notin S_i} \mathbf{P}[r \text{ received when } X_i \text{ was sent}]. \]

To bound this quantity, let $\Delta(x, y)$ denote the Hamming distance between the binary strings $x$ and $y$. Clearly, if $x$ was sent, the probability that $y$ was received is

\[ w(x, y) = p^{\Delta(x,y)} (1 - p)^{n - \Delta(x,y)}. \]

As such, we have $\mathbf{P}[r \text{ received when } X_i \text{ was sent}] = w(X_i, r)$. Let $S_{i,r}$ be an indicator variable which is 1 if $r \notin S_i$. We have that

\[ W_i = \sum_{r \notin S_i} \mathbf{P}[r \text{ received when } X_i \text{ was sent}] = \sum_{r \notin S_i} w(X_i, r) = \sum_{r} S_{i,r} \, w(X_i, r). \]

The value of $W_i$ is a random variable that depends on our choice of $Y_0, \ldots, Y_{\widehat{K}}$. As such, it is natural to ask what is the expected value of $W_i$.

Consider the ring

\[ R(r) = \{\, x \;\mid\; (1 - \varepsilon) n p \leq \Delta(x, r) \leq (1 + \varepsilon) n p \,\}, \]

where $\varepsilon > 0$ is a small enough constant. Suppose that the codeword $Y_i$ was sent, and $r$ was received. The decoder returns $i$ if $Y_i$ is the only codeword that falls inside $R(r)$.

Lemma 28.2.1 Given that $Y_i$ was sent, and $r$ was received, and furthermore $r \in R(Y_i)$, then the probability of the decoder failing is
\[ \tau = \mathbf{P}[\, r \notin S_i \mid r \in R(Y_i) \,] \leq \frac{\gamma}{8}, \]
where $\gamma$ is the parameter of Theorem 28.1.3.

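Proof: The decoder fails here only if $R(r)$ contains some other codeword $Y_j$ ($j \neq i$) in it. As such,

\[ \tau = \mathbf{P}[\, r \notin S_i \mid r \in R(Y_i) \,] \leq \mathbf{P}[\, Y_j \in R(r), \text{ for any } j \neq i \,] \leq \sum_{j \neq i} \mathbf{P}[\, Y_j \in R(r) \,]. \]

Now, we remind the reader that the $Y_j$s are generated by picking each bit randomly and independently, with probability 1/2. As such, we have

\[ \mathbf{P}[\, Y_j \in R(r) \,] = \sum_{m = (1-\varepsilon)np}^{(1+\varepsilon)np} \frac{\binom{n}{m}}{2^n} \leq \frac{n}{2^n} \binom{n}{\lfloor (1+\varepsilon)np \rfloor}, \]

since $(1 + \varepsilon)p < 1/2$ (for $\varepsilon$ sufficiently small), and as such the last binomial coefficient in this summation is the largest, and the summation has at most $n$ terms. By Corollary 28.3.2 (i), we have

\[ \mathbf{P}[\, Y_j \in R(r) \,] \leq \frac{n}{2^n} \binom{n}{\lfloor (1+\varepsilon)np \rfloor} \leq \frac{n}{2^n} 2^{nH((1+\varepsilon)p)} = n 2^{n(H((1+\varepsilon)p) - 1)}. \]

As such, recalling that there are $\widehat{K} + 1 = 2^{k+1}$ codewords $Y_j$, we have

\[ \tau \leq \sum_{j \neq i} \mathbf{P}[\, Y_j \in R(r) \,] \leq \widehat{K} \, \mathbf{P}[\, Y_1 \in R(r) \,] \leq 2^{k+1} n 2^{n(H((1+\varepsilon)p) - 1)} \leq n 2^{n(1 - H(p) - \delta) + 1 + n(H((1+\varepsilon)p) - 1)} = n 2^{n(H((1+\varepsilon)p) - H(p) - \delta) + 1}, \]

since $k \leq n(1 - H(p) - \delta)$. Now, we choose $\varepsilon$ to be a small enough constant, so that the quantity $H((1 + \varepsilon)p) - H(p) - \delta$ is equal to some (absolute) negative constant, say $-\beta$, where $\beta > 0$. Then, $\tau \leq n 2^{-\beta n + 1}$, and choosing $n$ large enough, we can make $\tau$ smaller than $\gamma/8$, as desired. As such, we just proved that

\[ \tau = \mathbf{P}[\, r \notin S_i \mid r \in R(Y_i) \,] \leq \frac{\gamma}{8}. \]

Lemma 28.2.2 We have that $\sum_{r \notin R(Y_i)} w(Y_i, r) \leq \gamma/8$, where $\gamma$ is the parameter of Theorem 28.1.3.

Proof: This quantity is the probability of sending $Y_i$, when every bit is flipped with probability $p$, and receiving a string $r$ in which the number of flipped bits deviates from its expectation $pn$ by more than $\varepsilon p n$. This quantity can be bounded using the Chernoff inequality. Let $Z = \Delta(Y_i, r)$, and observe that $\mathbf{E}[Z] = pn$, and that $Z$ is the sum of $n$ independent indicator variables. As such,

\[ \sum_{r \notin R(Y_i)} w(Y_i, r) = \mathbf{P}\bigl[\, |Z - \mathbf{E}[Z]| > \varepsilon p n \,\bigr] \leq 2 \exp\Bigl( -\frac{\varepsilon^2}{4} p n \Bigr) < \frac{\gamma}{8}, \]

since $\varepsilon$ is a constant, and for $n$ sufficiently large.

Lemma 28.2.3 For any $i$, we have $\mu = \mathbf{E}[W_i] \leq \gamma/4$, where $\gamma$ is the parameter of Theorem 28.1.3.

Proof: By linearity of expectations, we have

\[ \mu = \mathbf{E}[W_i] = \mathbf{E}\Bigl[ \sum_{r} S_{i,r} \, w(Y_i, r) \Bigr] = \sum_{r} \mathbf{E}[\, S_{i,r} \, w(Y_i, r) \,] = \sum_{r} \mathbf{E}[S_{i,r}] \, w(Y_i, r) = \sum_{r} \mathbf{P}[r \notin S_i] \, w(Y_i, r), \]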

since $S_{i,r}$ is an indicator variable. Setting $\tau = \mathbf{P}[\, r \notin S_i \mid r \in R(Y_i) \,]$, and since $\sum_{r} w(Y_i, r) = 1$, we get

\[
\begin{aligned}
\mu &= \sum_{r \in R(Y_i)} \mathbf{P}[r \notin S_i] \, w(Y_i, r) + \sum_{r \notin R(Y_i)} \mathbf{P}[r \notin S_i] \, w(Y_i, r) \\
&= \sum_{r \in R(Y_i)} \mathbf{P}[\, r \notin S_i \mid r \in R(Y_i) \,] \, w(Y_i, r) + \sum_{r \notin R(Y_i)} \mathbf{P}[r \notin S_i] \, w(Y_i, r) \\
&\leq \sum_{r \in R(Y_i)} \tau \, w(Y_i, r) + \sum_{r \notin R(Y_i)} w(Y_i, r)
\leq \tau + \sum_{r \notin R(Y_i)} w(Y_i, r) \leq \frac{\gamma}{8} + \frac{\gamma}{8} = \frac{\gamma}{4}.
\end{aligned}
\]

To recap, the receiver got $r$ (when we sent $Y_i$), and it would decode it incorrectly only if (i) $r$ is outside of $R(Y_i)$, or (ii) $R(r)$ contains some other codeword $Y_j$ ($j \neq i$) in it. The first event is bounded by Lemma 28.2.2, and the second by Lemma 28.2.1.

In the following, we need the following trivial (but surprisingly deep) observation.

Observation 28.2.4 For a random variable $X$, if $\mathbf{E}[X] \leq \psi$, then there exists an event in the probability space that assigns $X$ a value $\leq \psi$.

This holds since $\mathbf{E}[X]$ is just the average of $X$ over the probability space. As such, there must be an event in the universe where the value of $X$ does not exceed its average value.
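The quantity bounded in Lemma 28.2.3, the failure probability of the ring decoder averaged over the random codewords, can be estimated empirically. The Python sketch below is an illustration under stated assumptions, not the notes' own construction: the function names are hypothetical, codewords are stored as Python integers, and the parameters ($n = 100$, $k = 3$, $p = 0.1$, $\varepsilon = 0.8$) are picked so the run is fast, which puts them far from the asymptotic regime of the theorem (in particular $\varepsilon$ is not small and $k$ is far below $n(1 - H(p) - \delta)$).

```python
import random

def hamming(x, y):
    """Hamming distance between two n-bit strings stored as Python ints."""
    return bin(x ^ y).count("1")

def noise_mask(n, p, rng):
    """n-bit mask whose bits are 1 independently with probability p."""
    mask = 0
    for _ in range(n):
        mask = (mask << 1) | (rng.random() < p)
    return mask

def ring_decode(received, codewords, n, p, eps):
    """Return i if codewords[i] is the only codeword whose distance to the
    received word lies in [(1-eps)np, (1+eps)np]; otherwise return None."""
    lo, hi = (1 - eps) * n * p, (1 + eps) * n * p
    hits = [i for i, c in enumerate(codewords) if lo <= hamming(c, received) <= hi]
    return hits[0] if len(hits) == 1 else None

def estimate_failure(n, k, p, eps, trials, rng):
    """Monte Carlo estimate of the failure probability, averaged over the
    2^(k+1) codewords of one randomly drawn code."""
    codewords = [rng.getrandbits(n) for _ in range(2 ** (k + 1))]
    failures = 0
    for _ in range(trials):
        i = rng.randrange(len(codewords))                # codeword to send
        received = codewords[i] ^ noise_mask(n, p, rng)  # pass it through the channel
        failures += ring_decode(received, codewords, n, p, eps) != i
    return failures / trials

# Illustrative parameters only (assumptions); the seed makes the run reproducible.
rate = estimate_failure(n=100, k=3, p=0.1, eps=0.8, trials=2000, rng=random.Random(1948))
print(f"estimated average failure probability: {rate:.4f}")
```

In this toy setting almost all failures come from the received word falling outside the ring around the sent codeword (the event of Lemma 28.2.2), since two random 100-bit strings are very unlikely to be within distance 18 of each other.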

The above observation is one of the main tools in a powerful technique for proving various claims in mathematics, known as the probabilistic method.

Lemma 28.2.5 For the codewords $X_0, \ldots, X_K$, the probability of failure in recovering them when sending them over the noisy channel is at most $\gamma$.

Proof: We just proved that when using $Y_0, \ldots, Y_{\widehat{K}}$, where $\widehat{K} = 2^{k+1} - 1$, the expected probability of failure when sending $Y_i$ is $\mathbf{E}[W_i] \leq \gamma/4 \leq \gamma/2$. As such, the expected total probability of failure is

\[ \mathbf{E}\Bigl[ \sum_{i=0}^{\widehat{K}} W_i \Bigr] = \sum_{i=0}^{\widehat{K}} \mathbf{E}[W_i] \leq \frac{\gamma}{2} 2^{k+1} = \gamma 2^k, \]

by Lemma 28.2.3 (here we are using the fact that all the random variables we have are symmetric and behave in the same way). As such, by Observation 28.2.4, there exists a choice of the $Y_i$s such that

\[ \sum_{i=0}^{\widehat{K}} W_i \leq 2^k \gamma. \]

Now, we use an argument similar to the one used in proving Markov's inequality. Indeed, the $W_i$ are always nonnegative, and it cannot be that $2^k$ of them have value larger than $\gamma$, because then the summation would give $\sum_{i=0}^{\widehat{K}} W_i > 2^k \gamma$, which is a contradiction. As such, there are at least $2^k$ codewords with failure probability at most $\gamma$. We set our $2^k$ codewords to be these words. Since we picked only a subset of the codewords for our code, the probability of failure for each codeword shrinks, and is at most $\gamma$.

Lemma 28.2.5 concludes the proof of the constructive part of Shannon's theorem.

28.2.2 Lower bound on the message size

We omit the proof of this part.

28.3 From previous lectures

Lemma 28.3.1 Suppose that $nq$ is an integer in the range $[0, n]$. Then
\[ \frac{2^{nH(q)}}{n+1} \leq \binom{n}{nq} \leq 2^{nH(q)}. \]

Lemma 28.3.1 can be extended to handle non-integer values of $nq$. This is straightforward, and we omit the easy details.

Corollary 28.3.2 We have:
(i) $q \in [0, 1/2] \;\Rightarrow\; \binom{n}{\lfloor nq \rfloor} \leq 2^{nH(q)}$.
(ii) $q \in [1/2, 1] \;\Rightarrow\; \binom{n}{\lceil nq \rceil} \leq 2^{nH(q)}$.
(iii) $q \in [1/2, 1] \;\Rightarrow\; \frac{2^{nH(q)}}{n+1} \leq \binom{n}{\lfloor nq \rfloor}$.
(iv) $q \in [0, 1/2] \;\Rightarrow\; \frac{2^{nH(q)}}{n+1} \leq \binom{n}{\lceil nq \rceil}$.

Theorem 28.3.3 Suppose that the value of a random variable $X$ is chosen uniformly at random from the integers $\{0, \ldots, m - 1\}$. Then there is an extraction function for $X$ that outputs on average at least $\lfloor \lg m \rfloor - 1 = \lfloor H(X) \rfloor - 1$ independent and unbiased bits.
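The two-sided bound of Lemma 28.3.1 can be sanity-checked numerically for small $n$. The following Python sketch is only a spot check under assumptions (the function names and the floating-point tolerance are illustrative choices), and of course proves nothing.

```python
import math

def binary_entropy(q):
    """H(q) = -q lg q - (1-q) lg(1-q), with the convention H(0) = H(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def check_binomial_bounds(n):
    """Check 2^{nH(q)}/(n+1) <= C(n, nq) <= 2^{nH(q)} for q = j/n, j = 0..n.
    The comparison is done on base-2 logarithms to avoid huge numbers."""
    slack = 1e-9  # tolerance for floating-point rounding (an assumption)
    for j in range(n + 1):
        exponent = n * binary_entropy(j / n)      # lg of the upper bound
        log_binom = math.log2(math.comb(n, j))    # lg of C(n, nq)
        lower = exponent - math.log2(n + 1)       # lg of the lower bound
        if not (lower - slack <= log_binom <= exponent + slack):
            return False
    return True

for n in (10, 50, 200):
    print(n, check_binomial_bounds(n))
```

Working with logarithms keeps every quantity small, so the check stays accurate even when $\binom{n}{nq}$ itself is astronomically large.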

28.4 Bibliographical Notes

The presentation here follows [MU05, Sec. 9.1-Sec. 9.3].

Bibliography

[MU05] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge, 2005.