How To Find The Optimal Data Compressio

Size: px
Start display at page:

Download "How To Find The Optimal Data Compressio"

Transcription

1 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY Optimal Lossless Data Compressio: No-Asymptotics ad Asymptotics Ioais Kotoyiais, Fellow, IEEE, ad Sergio Verdú, Fellow, IEEE Abstract This paper provides a extesive study of the behavior of the best achievable rate ad other related fudametal limits i variable-legth strictly lossless compressio. I the oasymptotic regime, the fudametal limits of fixed-to-variable lossless compressio with ad without prefix costraits are show to be tightly coupled. Several precise, quatitative bouds are derived, coectig the distributio of the optimal code legths to the source iformatio spectrum, ad a exact aalysis of the best achievable rate for arbitrary sources is give. Fie asymptotic results are proved for arbitrary ot ecessarily prefix compressors o geeral mixig sources. Noasymptotic, explicit Gaussia approximatio bouds are established for the best achievable rate o Markov sources. The source dispersio ad the source varetropy rate are defied ad characterized. Together with the etropy rate, the varetropy rate serves to tightly approximate the fudametal oasymptotic limits of fixed-to-variable compressio for all but very small block legths. Idex Terms Lossless data compressio, fixed-to-variable source codig, fixed-to-fixed source codig, etropy, fiite-block legth fudametal limits, cetral limit theorem, Markov sources, varetropy, miimal codig variace, source dispersio. I. FUNDAMENTAL LIMITS A. Asymptotics: Etropy Rate For a radom source X ={P X }, assumed for simplicity to take values i a fiite alphabet A, the miimum asymptotically achievable source codig rate bits per source sample is the etropy rate, H X = lim = lim H X Eı X X ], 2 where X = X, X 2,...,X ad the iformatio i bits of a radom variable Z with distributio P Z is defied as ı Z a = log 2 P Z a, 3 Mauscript received December, 202; revised November 3, 203; accepted November 5, 203. Date of publicatio November 4, 203; date of curret versio Jauary 5, 204. S. Verdú was supported i part by the Natioal Sciece Foudatio uder Grat CCF , i part by the Ceter for Sciece of Iformatio, ad i part by the NSF Sciece ad Techology Ceter uder Grat CCF I. Kotoyiais was supported by the Europea Uio ad Greek Natioal Fuds through the Operatioal Program Educatio ad Lifelog Learig of the Natioal Strategic Referece Framework Research Fudig Program THALES: Ivestig i kowledge society through the Europea Social Fud. I. Kotoyiais is with the Departmet of Iformatics, Athes U of Eco ad Busiess, Athes 0675, Greece yiais@aueb.gr. S. Verdú is with the Departmet of Electrical Egieerig, Priceto Uiversity, Priceto, NJ USA verdu@priceto.edu. Commuicated by T. Weissma, Associate Editor for Shao Theory. Color versios of oe or more of the figures i this paper are available olie at Digital Object Idetifier 0.09/TIT The foregoig asymptotic fudametal limit holds uder the followig settigs: Almost-lossless -to-k fixed-legth data compressio: Provided that the source is statioary ad ergodic ad the ecodig failure probability does ot exceed 0 <ɛ<, the miimum achievable rate k is give by as. This is a direct cosequece of the Shao- MacMilla theorem 25]. Droppig the assumptio of statioarity/ergodicity, the fudametal limit is the limsup i probability of the ormalized iformatios 3]. 2 Strictly lossless variable-legth prefix data compressio: Provided that the limit i exists for which statioarity is sufficiet the miimal average source codig rate coverges to. This is a cosequece of the fact that for prefix codes the average ecoded legth caot be smaller tha the etropy 26], ad the miimal average ecoded legth achieved by the Huffma code, ever exceeds the etropy plus oe bit. If the limit i does ot exist, the the asymptotic miimal average source codig rate is simply the lim sup of the ormalized etropies 3]. For statioary ergodic sources, the source codig rate achieved by ay prefix code is asymptotically almost surely bouded below by the etropy rate as a result of Barro s lemma 2], a boud which is achieved by the Shao code. 3 Strictly lossless variable-legth data compressio: As we elaborate below, if o prefix costraits are imposed ad the source is statioary ad ergodic, the radom rate of the optimum code coverges i probability to. As oted i 40], 4], prefix costraits at the level of the whole ecoded file are superfluous i most applicatios. Stored files do ot rely o prefix costraits to determie the boudaries betwee files. Istead, a poiter directory cotais the startig ad edig locatios of the sequece of blocks occupied by each file i the storage medium e.g. 36]. It should be oted that otwithstadig his itroductio of the Shao code, i 34], Shao ever imposed prefix costraits, ad cosidered codig of the whole file rather tha symbol-by-symbol. B. Noasymptotics: Optimum Fixed-to-Variable Codes A fixed-to-variable compressor for strigs of legth from a alphabet A is a ijective fuctio f : A {0, } where {0, } ={, 0,, 00, 0, 0,, 000, 00,...}. 4 The legth of a strig a {0, } is deoted by la. Therefore, a block or strig, or file of symbols IEE. Persoal use is permitted, but republicatio/redistributio requires IEEE permissio. See for more iformatio.

2 778 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 a = a, a 2,...,a A is losslessly compressed by f ito a biary strig whose legth is lf a bits. If the file X = X, X 2,...,X to be compressed is geerated by a probability law P X, a basic iformatiotheoretic object of study is the distributio of the rate of the optimal compressor, as a fuctio of the blocklegth ad the distributio P X. The best achievable compressio performace at fiite blocklegths is characterized by the followig fudametal limits: R,ɛ is the lowest rate R such that the compressio rate of the best code exceeds R bits/symbol with probability ot greater tha 0 ɛ<: mi Plf X > R] ɛ. 5 f 2 ɛ, k is the best achievable excess-rate probability, amely, the smallest possible probability that the compressed legth is greater tha or equal to k: ɛ, k = mi Plf X k]. 6 f 3 R,ɛ is the smallest blocklegth at which compressio at rate R is possible with probability at least ɛ; i other words, the miimum required for 5 to hold. 4 R is the miimal average compressio rate: R = mi Elf X ]. 7 f Naturally, the fudametal limits i, 2 ad 3 are equivalet i the sese that kowledge of oe of them as a fuctio of its parameters determies the other two. For example, R,ɛ = k iff ɛ, k + ɛ<ɛ, k. 8 The miima i the fudametal limits 5, 6, 7 are achieved by a optimal compressor f that assigs the elemets of A ordered i decreasig probabilities to the elemets i {0, } ordered lexicographically as i 4. I particular, R,ɛ is the smallest R such that Plf X > R] ɛ. 9 As for 4, we observe that 8 results i: R = Elf X ] 0 = = ɛ, k k= 0 R, x dx 2 We emphasize that the optimum code f is idepedet of the desig target, i that, e.g., it is the same regardless of whether we wat to miimize average legth or the probability that the ecoded legth exceeds KB or MB. I fact, the code f possesses the followig strog stochastic competitive optimality property over ay other compressor f : Plf X k] Plf X k], k 0. 3 A optimal compressor f must satisfy the followig costructive property. Property : Let k = log 2 + A.Fork =, 2,...,k, ay optimal code f assigs strigs of legth 0,, 2,...,k to each of the k = 2 k 4 most likely elemets of A.Iflog 2 + A is ot a iteger, the f assigs strigs of legth k to the A + 2 k least likely elemets of A. Accordig to Property, the legth of the logest codeword assiged by f is log 2 A. To check this, ote that if log 2 + A is a iteger, the the maximal legth is log 2 + A = log 2 A, otherwise, the maximal legth is log 2 + A = log 2 A. Note that Property is a ecessary ad sufficiet coditio for optimality but it does ot determie f uiquely: Not oly does it ot specify how to break ties amog probabilities but ay swap betwee two codewords of the same legth preserves optimality. As i the followig example, it is coveiet, however, to thik of f as the uique compressor costructed by breakig ties lexicographically ad by assigig the elemets of {0, } i the lexicographic order of 4. The, if a is the kth most probable outcome its ecoded versio has legth lf a = log 2 k 5 Example : Suppose = 4, A ={, }, ad the source is memoryless with PX = ]> PX = ]. The the followig compressor is optimal: f4 = f4 = 0 f4 = f4 = 00 f4 = 0 f4 = 0 f4 = f4 = 000 f4 = 00 f4 = 00 f4 = 0 f4 = 00 f4 = 0 f4 = 0 f4 = f4 = Although f is ot a prefix code, the decompressor is able to recover the source file a exactly from f a ad its kowledge of ad P X. Sice the whole source file is compressed, it is ot ecessary to impose a prefix coditio i order for the decompressor to kow where the compressed file starts ad eds. Oe of the goals of this paper is to aalyze how much compressed efficiecy ca be improved i strictly lossless source codig by droppig the the prefix-free costrait at the block level.

3 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 779 C. Noasymptotics: Optimum Fixed-to-Variable Prefix Codes I this subsectio we deal with the case whe the prefix coditio is imposed at the level of the whole ecoded file, which is more efficiet tha imposig it at the level of each symbol a well-studied setup which does ot exploit memory ad which we do ot deal with i this paper. The fixed-tovariable prefix code that miimizes the average legth is the Huffma code, achievig the average compressio rate R p which is strictly larger tha R, defied as i 7 but restrictig the miimizatio to prefix codes. Also of iterest is to ivestigate the optimum rate of the prefix code that miimizes the probability that the legth exceeds a give threshold ; if the miimizatio i 5 is carried out with respect to codes that satisfy the prefix coditio, the the correspodig fudametal limit is deoted by R p,ɛ, ad aalogously ɛ p, k for 6. Note that the optimum prefix code achievig those fudametal limits will, i geeral, deped o k. The followig ew result shows that the correspodig fudametal limits, with ad without the prefix coditio, are tightly coupled: Theorem : Suppose all elemets i A have positive probability. For all : For each iteger k : { ɛ, k k < log ɛ p, k + = 2 A 6 0 k log 2 A. 2 If A is ot a power of 2, the for 0 ɛ<: R p,ɛ= R,ɛ+. 7 If A is a power of 2, the 7 holds for ɛ mi a A P X a, while we have, R p,ɛ= R,ɛ= log 2 A +, 8 for 0 ɛ<mi a A P X a. Proof: : Fix k ad satisfyig 2 k < A. Sice there is o beefit i assigig shorter legths, ay Kraft-iequalitycompliat code f p that miimizes Plf X > k] assigs strigs of legth k to each of the 2 k largest masses of P X. Assigig to all the other elemets of A biary strigs of legth equal to l max = k + log 2 A 2 k + 9 guaratees that the Kraft sum is satisfied. O the other had, accordig to Property, the optimum code f without prefix costraits ecodes each of the 2 k largest masses of P X with strigs of legths ragig from 0 to k. Therefore, Plf p X k + ] =Plf X k], 20 or, equivaletly, ɛ p, k + = ɛ, k. If 2 k A, the a zero-error -to-k code exists, ad hece ɛ p, k + = 0. This was cosidered i 5] usig the miimizatio of the momet geeratig fuctio at a give argumet as a proxy see also 4]. 2: For brevity, let p mi = mi a A P X a. From Property it follows that if A is ot a power of 2, the while if A is a power of 2, the ɛ, log 2 A = 0, 2 ɛ, log 2 A + = 0 22 ɛ, log 2 A = p mi 23 O the other had, implies that: ɛ p, log 2 A + = 0 24 ɛ p, log 2 A = ɛ, log 2 A p mi. 25 Furthermore, R p, ca be obtaied from ɛ p, aalogously to 8: R p,ɛ= i iff ɛ p, i ɛ<ɛ p, i. 26 Together with 8 ad 6, 26 implies that 7 holds if ɛ p mi. If ɛ < p mi, the 2 25 result i 8 whe A is a powerof2.if A is ot a power of 2 ad 0 ɛ<p mi, the, R,ɛ = log 2 A 27 R p,ɛ = log 2 A +, 28 completig the proof. D. Noasymptotics: Optimum Fixed-to-Fixed Almost-Lossless Codes As poited out i 40], 4], the quatity ɛ, k is itimately related to the problem of almost-lossless fixed-tofixed data compressio. Assume the otrivial compressio regime i which 2 k < A. The optimal -to-k fixed-to-fixed compressor assigs a uique strig of legth k to each of the 2 k most likely elemets of A, ad assigs all the others to the remaiig biary strig of legth k, which sigals a ecoder failure. Thus, the source strigs that are decodable error-free by the optimal -to-k scheme are precisely those that are ecoded with legths ragig from 0 to k by the optimum variable-legth code Property. Therefore, ɛ, k, defied i 6 as a fudametal limit of strictly lossless variable-legth codes is, i fact, equal to the miimum error probability of a -to-k code. Accordigly, the results obtaied i this paper apply to the stadard paradigm of almost-lossless fixed-to-fixed compressio as well as to the setup of lossless fixed-to-variable compressio without prefixfree costraits at the block level. The case 2 k A is rather trivial: The miimal probability of ecodig failure for a -to-k code is 0, which agai coicides with ɛ, k, uless A = 2 k, i which case, as we saw i 23, ɛ, k = mi a A P X a. 29

4 780 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 E. Existig Asymptotic Results Optimal Variable-Legth Codes: Based o the correspodece we just saw betwee optimal almost-lossless fixedto-fixed codes ad optimal strictly lossless fixed-to-variable codes, previous results o the asymptotics of fixed-to-fixed compressio ca be brought to bear. I particular the Shao- MacMilla theorem 25], 34] implies that for a statioary ergodic fiite-alphabet source with etropy rate H X, ad for all 0 <ɛ<, lim R,ɛ= H X. 30 Suppose X is geerated by a memoryless source with distributio, P X = P X P X P X, 3 where P X has etropy H X ad varetropy 2 σ 2 X = Varı X X. 32 For the expected legth, Szpakowski ad Verdú 38] show that the behavior of 7 for o-equiprobable sources is R = H X 2 log 2 + O, 33 whichisalsorefiedtoshowthatifı X X is o-lattice, 3 the, R = H X 2 log 2 8πeσ 2 X + o. 34 Yushkevich 44] showed that R,ɛ= H X + σx Q ɛ + o 35 a result that was refied for o-equiprobable memoryless sources such that ı X X is o-lattice by Strasse 37]: 4 R,ɛ = H X + σx Q ɛ 2 log 2 2πσ 2 Xe Q ɛ 2 + 6σ 2 X Q ɛ 2 + o 36 where Qx = 2π x e t 2 /2 dt deotes the stadard Gaussia tail fuctio, ad is the third cetered absolute momet of ı X X. Also i the cotext of memoryless sources with fiitealphabet A all whose letters have positive probabilitites, the expoetial decrease of the error probability is give by e.g. 9] the parametric expressio: lim log ɛ, R = DP X α X 37 R = H X α 38 2 σ 2 X is called miimal codig variace i 9]. 3 A discrete radom variable is lattice if all its masses are o a subset of some lattice {ν + kς ; k Z}. 4 See the discussio i Sectio V regardig Strasse s claimed proof of this result. where R H X, log A, α 0,, ad P Xα a = c Pα X a 39 with c = a A Pα X a. 2 Optimal Prefix Variable-Legth Codes: I cotrast to 33, whe a prefix-free coditio is imposed, we have the well-kow behavior for the average rate, R p = H X + O, 40 for ay source for which the limit i exists, see, e.g., 8]. It follows immediately from Theorem that the prefix-free coditio icurs o loss as far as the limit i 30 is cocered: lim R p,ɛ= H X. 4 Kotoyiais 9] gives a differet kid of Gaussia approximatio for the codelegths lf X of arbitrary prefix codes f o memoryless data X, showig that, with probability oe, lf X is evetually bouded below by a radom variable that has a approximately Gaussia distributio, lf X Z where Z D NHX, σ 2 X; 42 ad σ 2 X is the varetropy as i 32. Therefore, the codelegths lf X will have at least Gaussia fluctuatios of O ; this is further sharpeed i 9] to a correspodig law of the iterated logarithm, statig that, with probability oe, the compressed legths lf X will have fluctuatios of O l l, ifiitely ofte: With probability oe: lim sup lf X H X 2 l l σx. 43 Both results 42 ad 43 are show to hold for Markov sources as well as for a wide class of mixig sources with ifiite memory. As far as the large deviatios of the distributio of the legths, 27] shows for a class of sources with memory that either the prefix costrait or uiversality of the compressor degrade the optimum error expoet, which is i fact achieved by the Lempel-Ziv compressor. F. Outlie of Mai New Results Theorem shows that the prefix costrait oly icurs oe bit pealty as far as the distributio of the optimum rate is cocered. Note that requires the prefix code to be optimized for a give overflow probability. I cotrast, as we commeted i Sectios I-E. ad I-E.2, the pealty i average rate icurred by a hypothetical Huffma code that operated at the whole file level is larger. Sectio II gives a geeral aalysis of the distributio of the legths of the optimal lossless code for ay discrete iformatio source. We adopt a sigle-shot approach which ca be particularized to the covetioal case i which the source produces a strig of symbols of either determiistic or radom legth. I Theorems 3 ad 4 we give simple achievability ad coverse bouds, showig that the distributio fuctio of the optimal codelegths, P lf X t ], is itimately related to the distributio fuctio of the iformatio radom variable,

5 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 78 P ı X X t]. Also we observe that the optimal codelegths lf X are always bouded above by ı X X, ad we give a example where their distributios are oticeably differet. However, sigificat deviatios caot be too probable accordig to Theorem 5. Theorem 7 offers a exact, o-asymptotic expressio for the best achievable rate R,ɛ. A exact expressio for the average probability of error achieved by almost-lossless radom biig, is give i Theorem 8. Geeral o-asymptotic ad asymptotic results for the expected optimal legth, R, are obtaied i Sectio III. I Sectio IV we revisit the refied asymptotic results 42 ad 43 of 9], ad show that they remai valid for geeral ot ecessarily prefix compressors, ad for a broad class of possibly ifiite-memory sources. Sectio V examies i detail the fiite-blocklegth behavior of the fudametal limit R,ɛ for the case of memoryless sources. We prove tight, o-asymptotic ad easily computable bouds for R,ɛ. combiig the results of Theorems 7 ad 8 implies the followig approximatio: Gaussia approximatio I: For a memoryless source with etropy H X ad varetropy σ 2 X, the best achievable rate R,ɛ satisfies R,ɛ HX + σx Q ɛ 2 log 2 + O 44 I view of Theorem, the same holds for R p,ɛ i the case of prefix codes. The approximatio 44 is established by combiig the geeral results of Sectio II with the classical Berry-Essée boud 22], 30]. This approximatio is made precise i a oasymptotic way, ad all the costats ivolved are explicitly idetified. I Sectio VI, achievability ad coverse bouds Theorems 9 ad 20 are established for R,ɛ, i the case of geeral ergodic Markov sources. Those results are aalogous though slightly weaker to those i Sectio V. We also defie the varetropy rate of a arbitrary source as the limitig ormalized variace of the iformatio radom variables ı X X, ad we show that, for Markov chais, it plays the same role as the varetropy defied i 32 for memoryless sources. Those results i particular imply the followig: Gaussia approximatio II: For ay ergodic Markov source with etropy rate H X ad varetropy rate σ 2 X, the blocklegth R,ɛ required for the compressio rate to exceed + ηh X with probability o greater tha ɛ>0, satisfies, + ηh X, ɛ σ 2 X Q 2 ɛ H X η Fially, Sectio VII defies the source dispersio D as the limitig ormalized variace of the optimal codelegths. I effect, the dispersio gauges the time oe must wait for the source realizatio to become typical withi a give probability, as i 45 above, with D i place of σ 2 X. For a large class of sources icludig ergodic Markov chais of ay order, the dispersio D is show to equal the varetropy rate σ 2 X of the source. II. NON-ASYMPTOTICS FOR ARBITRARY SOURCES I this sectio we aalyze the best achievable compressio performace o a completely geeral discrete radom source. I particular, except where oted we do ot ecessarily assume that the alphabet is fiite ad we do ot exploit the fact that i the origial problem we are iterested i compressig ablockof symbols. I this way we eve ecompass the case where the source strig legth is a priori ukow at the decompressor. Thus, we cosider a give probability mass fuctio P X defied o a arbitrary fiite alphabet X, which may but is ot assumed to cosist of variable-legth strigs draw from some alphabet. The results ca the be particularized to the settig i Sectio I, lettig X A ad P X P X. Coversely, we ca simply let = isectioi to yield the settig i this sectio. The best achievable rate R,ɛ at blocklegth = is abbreviated as R ɛ = R,ɛ. By defiitio, R ɛ is the lowest R such that Plf X > R] ɛ, 46 which is equal to the quatile fuctio 5 of the iteger-valued radom variable lf X evaluated at ɛ. A. Achievability Boud Our goal is to express the distributio of the optimal codelegths lf X i terms of the distributio of ı X X.The first result is the followig simple ad powerful relatioship: Theorem 2: Assume that the elemets of X are itegervalued with decreasig probabilities: P X P X 2, ad that f is the optimal code that assigs sequetially ad lexicographically the shortest available strig. The, for all a X 6 : lf a ı X a 47 Proof: Fix i =, 2,... Sice there are i outcomes at least as likely as i we must have 7 P X i i. 48 as otherwise the sum of the probabilities would exceed. Takig log 2 of 48, results i where 5 follows as i 5. ı X i log 2 i 49 log 2 i 50 = lf i 5 5 The quatile fuctio Q: 0, ] R is the iverse of the cumulative distributio fuctio F. Specifically, Qα = mi{x : Fx = α} if the set is oempty; otherwise α lies withi a jump lim x xα Fx <α<fx α ad we defie Qα = x α. 6 A legacy of the Kraft iequality midset, the term ideal codelegth is sometimes used for ı X X, which is either a codelegth or ideal i view of 47. As we illustrate i Figure, the differece betwee both sides of 48 may be substatial. 7 This iequality was used by Wyer 43] to show that for a positive-itegervalued radom variable Z with decreasig probabilities Elog Z] H Z.

6 782 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 Theorem 2 results i the followig achievability result: Theorem 3:40] For ay a 0, P lf X a ] P ı X X a]. 52 Proof: Sice either the actual choice of f or the labelig of the elemets of X ca affect either side of 52, we are free to choose those specified i Theorem 2. The, 52 follows immediately. We ote that either Theorem 2 or Theorem 3 require X to take values o a fiite alphabet. Theorem 3 is the startig poit for the achievability result for R,ɛ established for Markov sources i Theorem 9. B. Coverse Bouds I Theorem 4 we give a correspodig coverse result; cf. 40]. It will be used later to obtai sharp coverse bouds for R,ɛ for memoryless ad Markov sources, i Theorems 8 ad 20, respectively. Theorem 4: For ay oegative iteger k, { P ı X X k + τ] 2 τ } P lf X k ]. 53 sup τ>0 Proof: As i the proof of Theorem 3, we label the values take by X as the positive itegers i decreasig probabilities. Fix a arbitrary τ>0. Defie: L ={i X : P X i 2 k τ } 54 C ={, 2,...2 k }. 55 The, abbreviatig P X B = i B P X i, B X, P ı X X k + τ ] = P X L 56 = P X L C + P X L C c 57 P X L C + P X C c 58 2 k 2 k τ + P X C c 59 < 2 τ + P log 2 X k ] 60 = 2 τ + P lf X k ], 6 where 6 follows i view of 5. Next we give aother geeral coverse boud, similar to that of Theorem 4, where this time we directly compare the codelegths lfx of a arbitrary compressor with the values of the iformatio radom variable ı X X. Whereas from 47 we kow that lfx is always smaller tha ı X X, Theorem 5 says that it caot be much smaller with high probability. This is a atural aalog of the correspodig coverse established for prefix compressors i 2], ad stated as Theorem 6 below. Applied to a fiite-alphabet source, Theorem 5 is the key boud i the derivatio of all the poitwise asymptotic results of Sectio IV, amely, Theorems 2, 3 ad 4. It is also the mai techical igrediet of the proof of Theorem 23 i Sectio VII statig that the source dispersio is equal to its varetropy. Theorem 5: For ay compressor f ad ay τ>0: P lfx ı X X τ ] 2 τ log 2 X Proof: Lettig {A} deote the idicator fuctio of the evet A, the probability i 62 ca be bouded above as PlfX ı X X τ] = { P X x P X x 2 τ lfx} 63 x X 2 τ x X log 2 X 2 τ j=0 2 lfx 64 2 j 2 j, 65 where the sum i 64 is maximized if f assigs a strig of legth j + oly if it also assigs all strigs of legth j. Therefore, 65 holds because that code cotais all the strigs of legths 0,,..., log 2 X plus X 2 log 2 X + 2 log 2 X 66 strigs of legth log 2 X. We saw i Theorem that the optimum prefix code uder the criterio of miimum excess legth probability icurs a pealty of at most oe bit. The followig elemetary coverse is derived i 2], 9]; its short proof is icluded for completeess. Theorem 6: For ay prefix code f, adayτ>0: Proof: to 64, PlfX < ı X X τ] 2 τ. 67 We have, as i the proof of Theorem 5 leadig PlfX ı X X τ] 2 τ 2 lfx x X 68 2 τ, 69 where 69 is Kraft s iequality. C. Exact Fudametal Limit The followig result expresses the o-asymptotic data compressio fudametal limit R ɛ = R,ɛ as a parametric fuctio of the source iformatio spectrum. Theorem 7: For all a 0, the exact miimum rate compatible with give excess-legth probability satisfies, with R ɛ = log 2 + M2 a, 70 ɛ = Pı X X a], 7 where Mβ deotes the umber of masses with probability strictly larger tha β, ad which ca be expressed as: Mβ = β P ı X X <log 2 β ] β P ı X X log 2 t ] dt. 72 Proof: As above, the values take by X are labeled as the positive itegers i order of decreasig probability. By the defiitio of M, for ay positive iteger i, ada > 0, P X i 2 a + M2 a i, 73

7 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 783 Therefore, P log 2 X log 2 + M X 2 a ] = P ı X X a] 74 Moreover, because of 5, for all b 0, P log 2 X b ] = P lf X X b] 75 P log 2 X b ] 76 Particularizig 75 to b = log 2 + M X 2 a ad lettig ɛ< be give by 7 we see that if R = log 2 + M X 2 a, ad Plf X > R] =ɛ 77 Plf X > R] >ɛ 78 if R < log 2 + M X 2 a. Therefore, 70 holds. The proof of 72 follows a sequece of elemetary steps: Mβ = { P X x > } 79 β x X {PX X > β = E } ] 80 P X X {PX X > β = P } ] > t dt 8 0 P X X β = P 0 β < P X X < ] dt 82 t β ] = P 0 β < P X X P P X X ] dt 83 t = β P ı X X <log 2 β ] β P ı X X log 2 t ] dt. 84 While Theorem 7 gives R ɛ = R,ɛ exactly for those ɛ which correspod to values take by the complemetary cumulative distributio fuctio of the iformatio radom variable ı X X, a cotiuous sweep of a > 0 gives a very dese grid of values, uless X whose alphabet size typically grows expoetially with i the fixed-to-variable setup takes values i a very small alphabet. From the value of a we ca obtai the probability i the right side of 7. The optimum code achieves a excess probability ɛ = P lf ] X l a for legths equal to l a = a + log 2 2 a + 2 a M2 a, 85 where the secod term is egative ad represets the exact gai with respect to the iformatio spectrum of the source. For later use we observe that if we let M X + β be the umber of masses with probability larger or equal tha β, the,8 M X + β = { P X x } 86 β x X = E exp ı X X { ı X X log 2 β }] Where typographically coveiet we use expa = 2 a Fig.. Cumulative distributio fuctios of lf X ad ı X X whe X is the umber of tails obtaied i 0,000 fair coi flips. Figure shows the cumulative distributio fuctios of lf X ad ı X X whe X is a biomially distributed radom variable: the umber of tails obtaied i 0,000 fair coi flips. Therefore, ı X X rages from 0 4 log to 0 4 ad, H X = Elf X] =6.29, 89 where all figures are i bits. This example illustrates that i the o-asymptotic regime, the optimum codig legth may be substatially smaller tha the iformatio cf. Footote 6. D. Exact Behavior of Radom Biig The followig result gives a exact expressio for the performace of radom biig for arbitrary sources cf. 32] for the exact performace of radom codig i chael codig, as a fuctio of the cumulative distributio fuctio of the radom variable ı X X via 72. I biig, the compressor is o loger costraied to be a ijective mappig. Whe the label received by the decompressor ca be explaied by more tha oe source realizatio, it chooses the most likely oe, breakig ties arbitrarily. Theorem 8: Averagig uiformly over all biig compressors f: X {, 2,...N}, results i a expected error probability equal to J X J X l E X N l, 90 + l l=0 where M is give i 72, the umber of masses whose probability is equal to P X x is deoted by: Jx = PP X X = P X x], 9 P X x ad x = M P X x +J x 92 N Proof: For the purposes of the proof, it is coveiet to assume that ties are broke uiformly at radom amog the most likely source outcomes i the bi. To verify 90, ote that give that the source realizatio is x 0 : The umber of masses with probability strictly higher tha that of x 0 is M P X x 0.

8 784 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY Correct decompressio of x 0 requires that o x with P X x >P X x 0 be assiged to the same bi as x 0. This occurs with probability: N M P X x If, i additio to x 0, its bi cotais l masses with the same probability as x 0, correct decompressio occurs, assumig 2 is satisfied, with probability +l. 4 The probability that there are l masses with the same probability as x 0 i the same bi is equal to: Jx0 l N J x0 l N l. 94 The, 90 follows sice all the bi assigmets are idepedet. Theorem 8 leads to a achievability boud for both almost-lossless fixed-to-fixed compressio ad lossless fixedto-variable compressio. However, i view of the simplicity ad tightess of Theorem 3, the mai usefuless of Theorem 8 is to gauge the suboptimality of radom biig i the fiite i fact, rather short because of computatioal complexity blocklegth regime. III. MINIMAL EXPECTED LENGTH Recall the defiitio of the best achievable rate R i Sectio I, expressed i terms of f as i 0. A immediate cosequece of Theorem 3 is the boud, R = E lf X ] H X, 95 Ideed, by liftig the prefix coditio it is possible to beat the etropy o average as we saw i the asymptotic results 33 ad 34. Lower bouds o the miimal average legth as a fuctio of H X ca be foud i ], 3], 5], ], 23], 33], 38], 42]. A explicit expressio ca be obtaied easily by labelig the outcomes as the positive itegers with decreasig probabilities as i the proof of Theorem 3: Elf X] =E log 2 X ] 96 = P log 2 X k] 97 = k= PX 2 k ]. 98 k= I cotrast, the correspodig miimum legth for prefix codes ca be computed umerically, i the special case of fiite alphabets, usig the Huffma algorithm, but o exact expressio i terms of P X is kow. Example 2: The average umber of bits required to ecode at which flip of a fair coi the first tail appears is equal to PX 2 k ]= 99 k= = 2 2 j k= j=2 k 2 2k 00 k= , 0 sice, i this case, X is a geometric radom variable with PX = j] =2 j. I cotrast, imposig a prefix costrait disables ay compressio: The optimal prefix code cosists of all, possibly empty, strigs of 0s termiatedby, achievig a average legth of 2. Example 3: Suppose X M elemets: The miimal average legth is E lf X M ] = log 2 M + M which simplifies to is equiprobable o a set of M 2 + log 2 M 2 log 2 M + 02 E lf X M ] = M + log 2 M + 2, 03 M whe M + isapowerof2. 2 For large alphabet sizes M, { lim sup H X M E lf X M ]} = 2 04 M { lim if H X M E lf X M ]} M = + log 2 e log 2 log 2 e, 05 where the etropy is expressed i bits. Theorem 9: For ay source X with fiite etropy rate, H X = lim sup H X <, 06 the ormalized miimal average legth satisfies: lim sup R = H X. 07 Proof: The achievability upper boud i 07 holds i view of 95. I the reverse directio, we ivoke the boud ]: H X Elf X ] log 2 H X + + log 2 e. 08 Upo dividig both sides of 08 by ad takig lim sup the desired result follows, sice for ay δ>0, for all sufficietly large, H X HX + δ. I view of 40, we see that the pealty icurred o the average rate by the prefix coditio vaishes asymptotically i the very wide geerality allowed by Theorem 9. I fact, the same proof we used for Theorem 9 shows the followig result: Theorem 0: For ay ot ecessarily serial, i.e. A is ot ecessarily a Cartesia product source X ={P X A }, lim R H X = lim as log as H X diverges. Elf X ] H X =, 09 IV. POINTWISE ASYMPTOTICS A. Normalized Poitwise Redudacy Before turig to the precise evaluatio of the best achievable rate R,ɛ, i this sectio we examie the asymptotic behavior of the actual codelegths lf X of arbitrary compressors f. More specifically, we examie the differece

9 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 785 betwee the codelegth lf X ad the iformatio fuctio, a quatity sometimes called the poitwise redudacy. Theorem : For ay fiite-alphabet discrete source X ad ay positive diverget determiistic sequece κ such that log lim = 0, 0 κ the followig hold. a For ay sequece {f } of codes, with probability w.p.: lim if κ lf X ı X X 0. b The sequece of optimal codes {f } achieves w.p. : lim lf κ X ı X X = 0, 2 Proof: Part a: We ivoke the geeral coverse i Theorem 5, with X ad A i place of X ad X, respectively. Fixig arbitrary ɛ>0 ad lettig τ = τ = ɛκ, we obtai that P lf X ı X X ] ɛκ 2 log 2 ɛκ log 2 A +, 3 which is summable i. Therefore, the Borel-Catelli lemma implies that the lim sup of the evet o the left side of 3 has zero probability, or equivaletly, with probability, lf X ı X X ɛκ 4 is violated oly a fiite umber of times. Sice ɛ ca be chose arbitrarily small, follows. Part b follows from a ad 47. B. Statioary Ergodic Sources Theorem 9 states that for ay discrete process X the expected rate of the optimal codes f satisfy, lim sup Elf X ] =H X. 5 The ext result shows that if the source is statioary ad ergodic, the the same asymptotic relatio holds ot just i expectatio, but also with probability. Moreover, o compressor ca beat the etropy rate asymptotically with positive probability. The correspodig results for prefix codes were established i 2], 8], 9]. Theorem 2: Suppose that X is a statioary ergodic source with etropy rate H X. i For ay sequece {f } of codes, lim if lf X H X, w.p.. 6 ii The sequece of optimal codes {f } achieves, lim lf X = H X, w.p.. 7 Proof: The Shao-Macmilla-Breima theorem states that ı X X H X, w.p.. 8 Therefore, the result is a immediate cosequece of Theorem with κ =. C. Statioary Ergodic Markov Sources We assume that the source is a statioary ergodic firstorder Markov chai, with trasitio kerel, P X X x x x, x A 2, 9 o the fiite alphabet A. Further restrictig the source to be Markov eables us to aalyze more precisely the behavior of the iformatio radom variables ad, i particular, the zeromea radom variables, Z = ı X X H X, 20 will be see to be asymptotically ormal with variace give by the varetropy rate. This geeralizes the varetropy of a sigle radom variable defied i 32. Defiitio : The varetropy rate of a radom process X is: σ 2 X = lim sup Varı X X. 2 Remarks: If X is a statioary memoryless process each of whose letters X is distributed accordig to P X, the the varetropy rate of X is equal to the varetropy of X. The varetropy of X is zero if ad oly if it is equiprobable o its support. 2 I cotrast to the first momet, we do ot kow whether statioarity is sufficiet for lim sup = lim if i 2. The difficulty appears to be that, ulike ı X, ı 2 X does ot satisfy a chai rule. 3 While the etropy rate of a Markov chai admits a two-letter expressio, the varetropy does ot. I particular, if σ 2 a deotes the varetropy of the distributio P X X a, the the varetropy of the chai is, i geeral, ot give by Eσ 2 X 0 ]. The reaso is the ozero correlatio betwee the radom variables {ı Xk X k X k X k }. 4 The varetropy rate of Markov sources is typically ozero. For example, for a first order Markov chai it was observed i 20], 44] that σ 2 = 0 if ad oly if the source satisfies the followig determiistic equipartitio property: Every strig x + that starts ad eds with the same symbol, has probability give that X = x q, for some costat 0 q. The simplest otrivial example of the varetropy of a source with memory is the followig. Example 4: Cosider a biary symmetric homogeeous statioary Markov chai with PX 2 = 0 X = ] =PX 2 = X = 0] =α 22 The etropy i bits is H X = H X + H X 2 X 23 = + hα 24 Furthermore, it is easy to check that for k =, 2,... Elog 2 P Xk+ X k X k+ X k ] = α log 2 α + α log 2 α 25

10 786 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 while for k = 2, 3,... Elog P Xk+ X k X k+ X k log P X2 X X 2 X ]=h 2 α 26 We ca the write ] E log 2 P X X 2 = E log P Xk+ X k X k+ X k 27 k= = + 2 hα + 2h 2 α + Elog 2 P X2 X X 2 X ] 28 = H 2 X + α α log 2 α 29 α Therefore, we coclude that Varı X X = α α log 2 α 30 α ad cosequetly σ 2 X = α α log 2 α 3 α which coicides with the varetropy of the Beroulli source with bias α. The followig result is kow see 3] ad the refereces therei. We iclude its proof for completeess ad for later referece. Theorem 3: Let X be a statioary ergodic fiite-state Markov chai. i The varetropy rate σ 2 X is also equal to the correspodig lim if of the ormalized variaces i 2, ad it is fiite. ii The ormalized iformatio radom variables are asymptotically ormal, i the sese that, as, ı X X H X N0,σ 2 X, 32 i distributio. iii The ormalized iformatio radom variables satisfy a correspodig law of the iterated logarithm: lim sup lim if ı X X H X 2 l l = σx, w.p. 33 ı X X H X 2 l l = σx, w.p. 34 Proof: i ad ii: Cosider the bivariate Markov chai { X = X, X + } o the alphabet B = {x, y A 2 : P X X y x >0} ad the fuctio f : B R defied by f x, y = ı X X y x. Sice X is statioary ad ergodic, so is { X }, hece, by the cetral limit theorem for fuctios of Markov chais 6], ı X X X X H X X = f X i E f X i ], 35 i= coverges i distributio to the zero-mea Gaussia law with fiite variace Furthermore, sice 2 = lim Varı X X X X. 36 ı X X H X = ı X X X X H X X + ı X X H X, 37 where the secod term is bouded, 32 must hold ad we must have 2 = σ 2 X. iii: Normalizig 35 by 2 l l istead of,we ca ivoke the law of the iterated logarithm for fuctios of Markov chais 6] to show that the lim sup / lim if of the sum behave as claimed. Sice, upo ormalizatio, the last term i 37 vaishes almost surely, ı X X H X must exhibit the same asymptotic behavior. Similarly, the followig result readily follows from Theorem 3 ad Theorem with κ = 2 l l. Theorem 4: Suppose X is a statioary ergodic Markov chai with etropy rate H X ad varetropy rate σ 2 X. The: i lf X HX N0,σ 2 X, 38 ii For ay sequece of codes {f }: lim sup lim if lf X H X 2 l l σx, w.p.; 39 lf X H X 2 l l σx, w.p.. 40 ii The sequece of optimal codes {f } achieves the bouds i 39 ad 40 with equality. As far as the poitwise ad 2 l l asymptotics the optimal codelegths exhibit the same behavior as the Shao prefix code ad arithmetic codig. However, the large deviatios behavior of the arithmetic ad Shao codes is cosiderably worse tha that of the optimal codes without prefix costraits. D. Beyod Markov The Markov sufficiet coditio i Theorem 3 eabled the applicatio of the cetral limit theorem ad the law of the iterated logarithm to the sum i 35. Accordig to Theorem 9. of 3] a more geeral sufficiet coditio is that X be a statioary process with αd = Od 336 ad γd = Od 48, where the mixig coefficiets αd ad γd are defied as: ı γd = max E a A X0 X a X ı X 0 X d a X d 4 αd = sup { PB A PBPA } 42 where the sup is over A F 0 ad B F d. Here F 0 ad F d deote the σ -algebras geerated by the collectios of radom variables X 0, X,... ad X d, X d+,..., respectively. The αd are the strog mixig

11 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 787 coefficiets 4] of X, adtheγd were itroduced by Ibragimov i 6]. Although these mixig coditios may be hard to verify i practice, they are fairly weak i that they require oly polyomial decay of αd ad γd. I particular, ay ergodic Markov chai of ay order satisfies these coditios. From the Lideberg-Feller theorem 7], aother sufficiet coditio for 32 to hold cosists i assumig that the lim sup i 2 is equal to the lim if ad is fiite, ad that for all η>0, where lim Z j = E Z j {Z j >ηv X } ] = 0 43 j= ı X j X j X j X j H X j X j Fig. 2. The optimum rate R, 0., the Gaussia approximatio R, 0. i 49, ad the upper boud R u, 0., for a Beroulli-0. source ad blocklegths V. GAUSSIAN APPROXIMATION FOR MEMORYLESS SOURCES We tur our attetio to the o-asymptotic behavior of the best rate R,ɛ that ca be achieved whe compressig a statioary memoryless source X with values i the fiite alphabet A ad margial distributio P X, whose etropy ad varetropy are deoted by H X ad σ 2 X, respectively. Specifically, we will derive explicit upper ad lower bouds o R,ɛ i terms of the first three momets of the iformatio radom variable ı X X. Although usig Theorem 7 it is possible, i priciple, to compute R,ɛ exactly, it is more desirable to derive approximatios that are both easier to compute ad offer more ituitio ito the behavior of the fudametal limit R,ɛ. Theorems 7 ad 8 imply that, for all ɛ 0, /2, the best achievable rate R,ɛ satisfies, c R,ɛ H X + σx Q ɛ log 2 ] c The upper boud is valid for all, ad the lower boud is valid for 0. Explicit values are derived for the costats 0, c ad c. I view of Theorem which states that miimizig the probability that the ecoded legth exceeds a give boud, the prefix costrait oly icurs oe bit pealty, essetially the same results as i 45 hold for prefix codes: c R p,ɛ H X + σx Q ɛ log 2 ] 2 c The bouds i 45 ad 46 result i the Gaussia approximatio 44 stated i Sectio I. Before establishig the precise o-asymptotic relatios leadig to 45 ad 46, we illustrate their utility. To facilitate this, ote that Theorem 3 immediately yields the followig simple boud: Theorem 5: For all, ɛ>0, R,ɛ R u,ɛ, 47 Fig. 3. The optimum rate R, 0. ad the Gaussia approximatio R, 0. i 49, for a Beroulli-0. source ad blocklegths where R u,ɛ is the quatile fuctio of the iformatio spectrum, i.e., the lowest R such that: ] P ı X X i R ɛ. 48 i= I Figure 2, we exhibit the behavior of the fudametal compressio limit R,ɛ i the case of coi flips with bias 0. = h 0.5 for which H X = 0.5 bits. I particular, we compare R,ɛ ad R u,ɛ for ɛ = 0.. The omootoic ature of both R,ɛ ad R u,ɛ with is ot surprisig: Although, the larger the value of, thelessweare at the mercy of the source radomess, we also eed to compress more iformatio. Figure 2 also illustrates that R,ɛ is tracked rather closely by the Gaussia approximatio, R,ɛ = H X + Q ɛ σx 2 log 2, 49 suggested by 45. Figure 3 focuses the compariso betwee R, 0. ad R, 0. o the short blocklegth rage up to 200 ot show i Figure 2. For > 60, the discrepacy betwee the two ever exceeds 4%. The remaider of the sectio is devoted to justifyig the use of 49 as a accurate approximatio to R,ɛ.Tothat ed, i Theorems 8 ad 7 we establish the bouds give i 45. Their derivatio requires that we overcome two techical hurdles:

12 788 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 The distributio fuctio of the optimal ecodig legth is ot the same as the distributio of i= ı X X i ; 2 The distributio of i= ı X X i is oly approximately Gaussia. To cope with the secod hurdle we will appeal to the classical Berry-Essée boud 22], 30]: Theorem 6: Let {Z i } be idepedet ad idetically distributed radom variables with zero mea ad uit variace, ad let Z be stadard ormal. The, for all adaya: ] P Z i a P Z a ] E Z 3 ] i= Ivokig the Berry-Essée boud, Strasse 37] claimed the followig approximatio for > 9600, δ 6 R,ɛ R,ɛ 40 δ 8, 5 where, { } δ mi σx, ɛ, ɛ, μ / = E ı X X H X 3 ]. 53 Ufortuately, there appears to be a gap i how Strasse 37] justifies the applicatio of a asymptotic form of the Berry- Essée theorem with uiform covergece, i order to boud the error itroduced by takig expectatios with respect to Gaussia approximatios cf. equatios 2.7, 3.8 ad the displayed equatio betwee 3.5 ad 3.6 i 37]. The followig achievability result holds for all blocklegths. Theorem 7: For all 0 <ɛ 2 ad all, R,ɛ H X + σx Q ɛ log log log2 e 2 2πσ 2 X + μ 3 σ 3 X + σ 2 Xφ Q ɛ+ μ 54 3 σ 3 X, as log as the varetropy σ 2 X is strictly positive, where = Q ad φ = are the stadard Gaussia distributio fuctio ad desity, respectively, ad is the third absolute momet of ı X X defied i 53. Proof: The proof follows Strasse s costructio, but the essetial approximatio steps are differet. The positive costat β is uiquely defied by: P ı X X ] log 2 β ɛ, 55 P ı X X ] <log 2 β < ɛ. 56 Sice the iformatio spectrum i.e., the distributio fuctio of the iformatio radom variable ı X X is piecewise costat, log 2 β is the locatio of the jump where the iformatio spectrum reaches or exceeds for the first time the value ɛ. Furthermore, defiig the ormalized costat, λ = log 2 β HX, 57 σx the probability i the left side of 55 satisfies ı X X ] HX P λ σx λ + 2σ 3 X, 58 where we have applied Theorem 6. Aalogously 50 also holds for P Zi < a], we obtai, ı X X ] HX P <λ σx λ 2σ 3 X. 59 Sice ɛ is sadwiched betwee the right sides of 58 ad 59, as we must have λ λ, where, λ = ɛ = Q ɛ. 60 By a simple first-order Taylor boud, λ μ 3 λ + 2σ 3 X 6 = λ + 2σ 3 X ξ 62 = λ + 2σ 3 X φ ξ, 63 for some ξ λ, λ + 2σ 3 X ]. Sice ɛ /2, we have λ 0ad λ /2, so that ξ /2. Ad sice t is strictly icreasig for all t, while φ is strictly decreasig for t 0, from 63 we obtai, λ λ + 2σ 3 X φ λ σ 3 X. The evet E i the left side of 55 cotais all the high probability strigs, ad has probability ɛ. Its cardiality is M + X β, defied i 86 with X X. Therefore, deotig, we obtai, with ϕt = 2 t {t 0} 65 Y i = σx ı XX i H X, 66 R,ɛ log 2 M+ X β 67 = log 2 E exp ı X X { ı X X } ] log 2 β 68 σx = H X + λ + log 2 α, 69 α = E ϕlog 2 β ı X X ] 70 σx = E ϕ λ ] Y i, 7 where we have used 57, expa = 2 a,ad{y i } are idepedet, idetically distributed, with zero mea ad uit variace. i=

13 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 789 Let ᾱ be defied as 7 except that Y i are replaced by Ȳ i which are stadard ormal. The, straightforward algebra yields, ] ᾱ = E 2 σxλ Ȳ {Ȳ λ } = x λσx 2 x e 2σ 2 X 72 2 dx 0 2πσ 2 X 73 log 2 e 2πσ 2 X. 74 To deal with the fact that the radom variables i 7 are ot ormal, we apply the Lebesgue-Stieltjes formula for itegratio by parts to 7. Deotig the distributio of the ormalized sum i 7 by F t, α becomes, α = λ = F λ λ 2 σxλ t df t 75 F t σx2 σxλ t dt loge 2 76 =ᾱ + F λ λ σx log e 2 λ F t t2 σxλ t dt 77 ᾱ + 2σ 3 X + μ λ 3 2σ 2 2 σxλ t dt loge 2 78 X =ᾱ + σ 3 X 79 log2 e 2πσ 2 X + μ 3 σ 3, 80 X where 78 results from applyig Theorem 6 twice. The desired result ow follows from 69 after assemblig the bouds o λ ad α i 64 ad 80, respectively. Next we give a complemetary coverse result. Theorem 8: For all 0 <ɛ< 2 ad all such that > 0 = 4 + μ 2 3 2σ 3 X φq ɛq ɛ 2, 8 the followig lower boud holds, R,ɛ H X + σx Q ɛ log σ 3 X σ 2 XφQ ɛ, 82 as log as the varetropy σ 2 X is strictly positive. Proof: Let η = 2σ 2 X + σx φq, 83 ɛ ad cosider P ı X X i HX + σx ] Q ɛ η i= = P Q ɛ + i= ı X X i H X σx Q ɛ η σx Q ɛ η σx φq ɛ 2σ 3 X 2σ 3 X ] η σx = ɛ +, 87 where 85 follows from Theorem 6, ad 86 follows from Qa Qa + φqa, 88 which holds at least as log as a > > Lettig a = Q η ɛ ad = σx, 89 is equivalet to 8. We proceed to ivoke Theorem 4 with X X, k equal to times the right side of 82, ad τ = 2 log 2.Iviewof the defiitio of R,ɛ ad 84 87, the desired result follows. VI. GAUSSIAN APPROXIMATION FOR MARKOV SOURCES Let X be a irreducible, aperiodic, kth order Markov chai o the fiite alphabet A, with trasitio probabilities, P X X k x k+ x k, x k+ A k+, 90 ad etropy rate H X. Note that we do ot assume that the source is statioary. I Theorem 3 of Sectio IV we established that the varetropy rate defied i geeral i equatio 2, for statioary ergodic chais exists as the limit, σ 2 X = lim Varı X X. 9 A examiatio of the proof shows that, by a applicatio of the geeral cetral limit theorem for uiformly ergodic Markov chais 6], 28], the assumptio of statioarity is ot ecessary, ad 9 holds for all irreducible aperiodic chais. Theorem 9: Suppose X is a irreducible ad aperiodic kth order Markov source, ad let ɛ 0, /2. The, there is a positive costat C such that, for all large eough, R,ɛ HX + σx Q ɛ + C, 92 where the varetropy rate σ 2 X is give by 9 ad it is assumed to be strictly positive. Theorem 20: Uder the same assumptios as i Theorem 9, for all large eough, R,ɛ HX + σx Q ɛ 2 log 2 C, 93 where C > 0 is a fiite costat, possibly differet from tha i Theorem 9.

14 790 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 Remarks: By defiitio, the lower boud i Theorem 20 also applies to R p,ɛ, while i view Theorem, the upper boud i Theorem 9 also applies to R p,ɛ provided C is replaced by C +. 2 Note that, ulike the direct ad coverse codig theorems for memoryless sources Theorems 7 ad 8, respectively i the results of Theorems 9 ad 20 we do ot give explicit bouds for the costat terms. This is because the mai probabilistic tool we use i the proofs the Berry-Essée boud i Theorem 6 does ot have a equally precise couterpart for Markov chais. Specifically, i the proof of Theorem 2 below we appeal to a Berry-Essée boud established by Nagaev i 29], which does ot give a explicit value for the multiplicative costat A i If we restrict our attetio to the much more arrow class of reversible chais, the it is ideed possible to apply the Berry-Essée boud of Ma 24] to obtai explicit values for the costats i Theorems 9 ad 20; but the resultig values are pretty loose, drastically limitig the egieerig usefuless of the resultig bouds. For example, i Ma s versio of the Berry-Essée boud, the correspodig right side of the iequality i Theorem 6 is multiplied by a factor of Therefore, we have opted for the less explicit but much more geeral statemets give above. 4 Similar commets to those i the last two remarks apply to the observatio that Theorem 9 is a weaker boud tha that established i Theorem 7 for memoryless sources, by a /2 log 2 term. Istead of restrictig our result to the much more arrow class of reversible Markov chais, or extedig the ivolved proof of Theorem 7 to the case of a Markovia source, we chose to illustrate how this weaker boud ca be established i full geerality, with a much shorter proof. 5 The proof of Theorem 9 shows that the costat i its statemet ca be chose as C = 2AσX φq ɛ, 94 for all 2A 2 πeφq ɛ 4, 95 where A is the costat appearig i Theorem 2, below. Similarly, from the proof of Theorem 20 we see that the costat i its statemet ca be chose as σxa + C = φq +, 96 ɛ for all, A + 2 Q ɛφq. 97 ɛ Note that, i both cases, the values of the costats ca easily be improved, but they still deped o the implicit costat A of Theorem 2. As metioed above, we will eed a Berry-Essée-type boud o the scaled iformatio radom variables, ı X X HX. σx Beyod the Shao-McMilla-Breima Theorem, several more refied asymptotic results have bee established for this sequece; see, i particular, 6], 3], 37], 44] ad the discussios i 20] ad i Sectio IV. Ulike these asymptotic results, we will use the followig o-asymptotic boud. Theorem 2: For a ergodic, kth order Markov source X with etropy rate H X ad positive varetropy rate σ 2 X, there exists a fiite costat A > 0 such that, for all, sup P ı X X HX>z σx ] Qz A. 98 z R Proof: For itegers i j, we adopt the otatio x j i ad X j i for blocks of strigs x i, x i+,...,x j ad radom variables X i, X i+,...,x j, respectively. For all x +k A +k such that P X k x k >0adP X X j x j x j j k >0, j k for j = k +, k + 2,... + k, wehave ı X x = log 2 = = k+ j=k+ P X k x k j=k+ P X X j j k log 2 P X X j j k x j x j j k log 2 P X k x k +k j=+ P X X j j k x j x j j k 99 x j x j j k 200 f x j+k +, 20 j= where the fuctio f : A k+ R is defied o the state space by ad A ={x k+ A k+ : P X X k x k+ x k >0}. 202 f x k+ = ı X X k x k+ x k 203 = log 2 P X X k x k+ x k, 204 = log 2 +k The we ca boud, j=+ P X X j j k P X k x k x j x j j k. 205 δ 206 = max log P X k x k ] 2 +k j=+ P X X k x j x j j k <, 207 where the maximum is over the positive probability strigs for which we have established 20.

15 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 79 Let {Y } deote the first-order Markov source defied by takig overlappig k + -blocks i the origial chai, Y = X, X +,...,X +k. 208 Sice X is irreducible ad aperiodic, so is {Y }. Now, sice the chai {Y } is irreducible ad aperiodic o a fiite state space, coditio 0.2 of 29] is satisfied, ad sice the fuctio f is bouded, Theorem of 29] implies that there exists a fiite costat A such that, for all, j= sup P f Y j HX σx z R ] > z where the etropy rate is H X = E f Ỹ ] ad σ 2 X = lim E j= Qz A, ] f Ỹ j HX, 20 where {Ỹ } is a statioary versio of {Y }, that is, it has the same trasitio probabilities but its iitial distributio is its uique ivariat distributio, PỸ = x k+ ]=πx k P X X k x k+ x k, 2 where π is the uique ivariat distributio of the origial chai X. Sice the fuctio f is bouded ad the distributio of the chai {Y } coverges to statioarity expoetially fast, it is easy to see that 20 coicides with the source varetropy rate. Let F z, G z deote the complemetary cumulative distributio fuctios F z = P ı X X HX >z σx ], 22 G z = P f Y j HX >z σx. 23 j= Sice F z ad G z are o-icreasig, 20 ad 207 imply that F z G z + δ σ 24 Q z + δ σ A, 25 Qz A, 26 uiformly i z, where 25 follows from 209, ad 26 holds with A = A +δ/ 2π sice Q z = φz is bouded by / 2π. A similar argumet shows F z G z δ/ 27 Qz δ/ + A 28 Qz + A. 29 Sice both 26 ad 29 hold uiformly i z R, together they form the statemet of the theorem. Proof of Theorem 9: Let K = HX + σ Q ɛ + C, 220 where we abbreviate σ = σx ad, for ow C is a arbitrary positive costat. Theorem 3 with X i place of X states Plf X K ] Pı X X K ] 22 Q Q ɛ + C σ + A, 222 where 222 follows from Theorem 2. Sice Q x = φx Q x = xφx, 2πe x 0, 224 a secod-order Taylor expasio of the first term i the right side of 222 gives Plf X K ] ɛ C σ { φq ɛ C 2σ 2πe Aσ }, 225 C ad choosig C as i 94 for satisfyig 95 the right side of 225 is bouded above by ɛ. Therefore, Plf X > K ] ɛ, which, by defiitio implies that R,ɛ K,as claimed. Proof of Theorem 20: Applyig Theorem 4 with X i place of X ad with δ>0adk arbitrary, we obtai Plf X K ] ] P ı X X K + δ 2 δ 226 K HX + δ Q σ A 2 δ, 227 where 227 ow follows from Theorem 2. Lettig δ = δ = 2 log 2 ad yields K = HX + σ Q ɛ δ σa + φq ɛ, 228 Plf X K ] Q Q A + ɛ φq ɛ A >ɛ 230 which holds as log as ɛ 0, /2 ad A + 2 Q ɛφq, 23 ɛ i order to esure that the argumet of the Q fuctio i 229 is oegative. Note that i 229 we have used a two-term Taylor expasio of Q based o Q ɛ > 0adQ x = φx. We coclude from 230 that R,ɛ > K, as claimed.

16 792 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 VII. SOURCE DISPERSION AND VARENTROPY Traditioally, refied aalyses i lossless data compressio have focused attetio o the redudacy, defied as the differece betwee the miimum average compressio rate ad the etropy rate. As metioed i Sectio I-E, the persymbol redudacy is positive ad of order O whe the prefix coditio is eforced, while it is 2 log 2 + O without the prefix coditio. But sice the results i Sectios V ad VI show that the stadard deviatio of the best achievable compressio rate is of order O, the deviatio of the rate from the etropy will be domiated by these stochastic fluctuatios. Therefore, as oted i 9], it is of primary importace to aalyze the variace of the optimal codelegths. To that ed, we itroduce the followig operatioal defiitio: Defiitio 2: The dispersio D measured i bits 2 of a source X is D = lim sup Varlf X, 232 where lf is the legth of the optimum fixed-to-variable lossless code cf. Sectio I-B. As we show i Theorem 23 below, for a broad class of sources, the dispersio D is equal to the source varetropy rate σ 2 X defied i 2. Moreover, i view of the Gaussia approximatio bouds for R,ɛ i Sectios V ad VI ad more geerally, as log as a similar two-term Gaussia approximatio i terms of the etropy rate ad varetropy rate ca be established up to o/ accuracy we ca coclude the followig: By the defiitio of R,ɛ i Sectio I-B, the source blocklegth required for the compressio rate to exceed + ηh X with probability o greater tha ɛ>0is approximated by + ηh X, ɛ σ 2 X Q 2 ɛ H X η = D Q 2 ɛ H 2, 234 X η i.e., by the product of a factor that depeds oly o the source through σ 2 X/H 2 X, ad a factor that depeds oly o the desig requiremets ɛ ad η. Note the close parallel with the otio of chael dispersio itroduced i 32]. Example 5: Coi flips with bias p have varetropy, σ 2 X = p p log 2 p, 235 p so the key parameter i 234 which characterizes the time horizo required for the source to become typical is σ 2 X H 2 X = D p 2 log2 p H 2 = p p 236 X hp where h deotes the biary etropy fuctio i bits. I view of Example 4, the ormalized dispersio for the biary symmetric Markov chai with trasitio probability p is also give by 236. Example 6: For a memoryless source whose margial is the geometric distributio, P X k = q q k, k 0, 237 Fig. 4. sources. Normalized dispersio as a fuctio of etropy for memoryless the ratio of varetropy to squared etropy is σ 2 X H 2 X = D log2 q 2 H 2 = q. 238 X hq Figure 4 compares the ormalized dispersio to the etropy for the Beroulli, geometric ad Poisso distributios. We see that, as the source becomes more compressible lower etropy per letter, the horizo over which we eed to compress i order to squeeze most of the redudacy out of the source gets loger. Defiitio 3: A source X takig values o the fiite alphabet A is a liear iformatio growth source if the probability of every strig is either zero or is asymptotically lower bouded by a expoetial, that is, if there is a fiite costat A ad ad a iteger N 0 suchthat,forall N 0,every ozero-probability strig x A satisfies ı X x A. 239 Ay memoryless source belogs to the class of liear iformatio growth. Also ote that every irreducible ad aperiodic Markov chai is a liear iformatio growth source: Writig q for the smallest ozero elemet of the trasitio matrix, ad π for the smallest ozero probability for X, we easily see that 239 is satisfied with N 0 = 2, A = log 2 /q + log 2 q/π. The class of liear iformatio growth sources is related, at least at the level of ituitio, to the class of fiite-eergy processes cosidered by Shields 35] ad to processes satisfyig the Doebli-like coditio of Kotoyiais ad Suhov 2]. We proceed to show a iterestig regularity result for liear iformatio growth sources: Lemma : Suppose X is a ot ecessarily statioary or ergodic liear iformatio growth source. The: 2 ] lim E lf X ı X X = Proof: For brevity, deote l = lf X ad ı = ı X X, respectively. Select a arbitrary τ. The expectatio of iterest is El ı 2 ]=El ı 2 {l ı τ }] +El ı 2 {l < ı τ }]. 24

17 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 793 Sice l ı always, o the evet {l ı τ } we have l ı 2 τ 2. Also, by the liear iformatio growth assumptio we have the boud 0 ı l ı C for a fiite costat C ad all large eough. Combiig these two observatios with Theorem 5, we obtai that El ı 2 ] τ 2 + C2 2 P{l < ı τ } 242 τ 2 + C2 2 2 τ log 2 A τ 2 + C 3 2 τ, 244 for some C < ad all large eough. Takig τ = 3log 2, dividig by ad lettig gives the claimed result. Note that we have actually proved a stroger result, amely, 2 ] E lf X ı X X = Olog Liear iformatio growth is sufficiet for dispersio to equal varetropy rate: Theorem 22: If the source X has liear iformatio growth ad fiite varetropy rate, the: D = σ 2 X. 246 Proof: For otatioal coveiece, we abbreviate H X as H, ad as i the previous proof we write, l = lf X ad ı = ı X X. Expadig the defiitio of the variace of l, we obtai, Varlf X = E l El ] 2] ] = E l ı + ı H El ı ] 248 = El ı 2 ]+Eı H 2 ] E 2 l ı ] +2 El ı ı H ], 249 ad therefore, usig the Cauchy-Schwarz iequality twice, Varlf X Varı X X = El ı 2 ] E 2 l ı ] +2 El ı ı H ] El ı 2 ] +2 El ı 2 ]σı X X. 25 where σ idicates the stadard deviatio. Dividig by ad lettig, we obtai that the first term teds to zero by Lemma, ad the secod term becomes the square root of 4 El ı 2 ] Varı X X 252 which also teds to zero by Lemma ad the fiite-varetropy rate assumptio. Therefore, lim Varlf X Varı X X =0, 253 which, i particular, implies that σ 2 X = D. I view of 245, if we ormalize by log, istead of i the last step of the proof of Theorem 22, we obtai the stroger result: Varlf X Varı X X =O log Also, Lemma ad Theorem 22 remai valid if istead of the liear iformatio growth coditio we ivoke the weaker assumptio that there exists a sequece ɛ = o, such that max x : P X x =0 ı X x = o 2 ɛ / For Markov chais, the various results satisfied by varetropy ad dispersio are as follows. Theorem 23: Let X be a irreducible, aperiodic ot ecessarily statioary Markov source with etropy rate H X. The: The varetropy rate σ 2 X defied i 2 exists as the limit σ 2 X = lim Varı X X The dispersio D defied i 232 exists as the limit D = lim Varlf X D = σ 2 X. 4 The varetropy rate or, equivaletly, the dispersio ca be characterized i terms of the best achievable rate R,ɛ as σ 2 R,ɛ H X 2 X = lim lim ɛ 0 2l 258 ɛ = lim lim ɛ 0 as log as σ 2 X is ozero. R,ɛ H X 2 Q, 259 ɛ Proof: The limitig expressio i part was already established i Theorem 3 of Sectio IV; see also the discussio leadig to 9 i Sectio VI. Recallig that every irreducible ad aperiodic Markov source is a liear iformatio growth source, combiig part with Theorem 22 immediately yields the results of parts 2 ad 3. Fially, part 4 follows from the results of Sectio VI. Uder the preset assumptios, Theorems 9 ad 20 together imply that there is a fiite costat C such that R,ɛ H X σxq ɛ log 2 + C, for all ɛ 0, /2 ad all large eough. Therefore, lim R,ɛ H X 2 = σ 2 XQ ɛ Dividig by 2 l ɛ, lettig ɛ 0, ad recallig the simple fact that Q ɛ 2 2l ɛ see, e.g., 39, Sectio 3.3] proves 259 ad completes the proof of the theorem. From Theorem 22 it follows that, for a broad class of sources icludig all ergodic Markov chais with ozero varetropy rate, Var lf lim X Var ı X X =. 262 Aalogously to Theorem 0, we could explore whether 262 might hold uder broader coditios, icludig the geeral

18 794 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 60, NO. 2, FEBRUARY 204 settig of possibly o-serial sources. However, cosider the followig simple example. Example 7: As i Example 3, let X M be equiprobable o asetofm elemets, the, H X M = log 2 M 263 Var ı X M X M = lim sup Var lf X M = M 4 lim if Var lf X M = M To verify 265 ad 266, defie the fuctio, K sk = i 2 2 i 267 i= = K + 3 2K + K It is straightforward to check that ] E l 2 f X M = s log M 2 M log 2 M 2 2 log 2 M + M 269 Together with 02, 269 results i, Var lf X M = 3ξ M ξm 2 + o, 270 with ξ M = 2+ log 2 M, M 27 which takes values i, 2]. O that iterval, the parabola 3x x 2 takes a miimum value of 2 ad a maximum value of 3/2 2, ad 265, 266 follow. Although the ratio of optimal codelegth variace to the varetropy rate may be ifiity as illustrated i Example 7, we do have the followig couterpart of the first-momet result i Theorem 0 for the secod momets: Theorem 24: For ay ot ecessarily serial source X ={P X }, lim El2 f X ] ] =, 272 E ı 2 X X as log as the deomiator diverges. Proof: Theorem 3 implies that ] El 2 f X ] E ı 2 X X. 273 Therefore, the lim sup i 272 is bouded above by. To establish the correspodig lower boud, fix a arbitrary ϑ>0. The, El 2 f X ] = ] P l 2 f X k 274 k = P l ] k 275 k = ] P l k 276 k k P ı X X + ϑ ] ] k 2 ϑ k, 277 where 277 follows by lettig τ = ϑ k i the coverse Theorem 4. Therefore, El 2 f X ] C ϑ + P ı 2 X X + ϑ 2 k 2] 278 k D ϑ + k P ı 2 X ] X + ϑ 3 k 279 D ϑ + + ϑ 3 Eı 2 X X ]. 280 where C ϑ, D ϑ are positive scalars that oly vary with ϑ. Note that 278 holds because a k is summable for all 0 < a < ; 279 holds because + ϑk k 2 for all sufficietly large k; ad 280 holds because the mea of a oegative radom variable is the itegral of the complemetary cumulative distributio fuctio, which i tur satisfies k+ k Fx dx Fk +, 28 Dividig both sides of by the secod momet Eı 2 X X ] ad lettig, we coclude that the ratio i 272 is bouded below by + ϑ 3.Siceϑ ca be take to be arbitrarily small, this proves that the lim if 272 is bouded below by, as required. REFERENCES ] N. Alo ad A. Orlitsky, A lower boud o the expected legth of oeto-oe codes, IEEE Tras. If. Theory, vol. 40, o. 5, pp , Sep ] A. R. Barro, Logically smooth desity estimatio, Ph.D. dissertatio, Dept. Electr. Eg., Staford Uiv., Staford, CA, USA, Sep ] C. Bludo ad R. de Prisco, New bouds o the expected legth of oe-to-oe codes, IEEE Tras. If. Theory, vol. 42, o., pp , Ja ] B. C. Bradley, Basic properties of strog mixig coditios, i Depedece i Probability ad Statistics, E. Wilel ad M. S. Taqqu, Eds. Bosto, MA, USA: Birkhäuser, 986, pp ] J. Cheg, T. K. Huag, ad C. Weidma, New bouds o the expected legth of optimal oe-to-oe codes, IEEE Tras. If. Theory, vol. 53, o. 5, pp , May ] K. L. Chug, Markov Chais with Statioary Trasitio Probabilities. New York, NY, USA: Spriger-Verlag, ] K. L. Chug, A Course i Probability Theory, 2d ed. New York, NY, USA: Academic, ] T. Cover ad J. Thomas, Elemets of Iformatio Theory, 2d ed. New York, NY, USA: Wiley, ] I. Csiszár ad J. Körer, Iformatio Theory: Codig Theorems for Discrete Memoryless Systems, 2d ed. Cambridge, U.K.: Cambridge Uiv. Press, 20. 0] A. Dembo ad I. Kotoyiais, Source codig, large deviatios, ad approximate patter matchig, IEEE Tras. If. Theory, vol. 48, o. 6, pp , Ju ] J. G. Duham, Optimal oiseless codig of radom variables Corresp., IEEE Tras. If. Theory, vol. 26, o. 3, p. 345, May ] P. Hall, Rates of Covergece i the Cetral Limit Theorem. Lodo, U.K.: Pitma, ] T. S. Ha ad S. Verdú, Approximatio theory of output statistics, IEEE Tras. If. Theory, vol. 39, o. 3, pp , May ] T. C. Hu, D. J. Kleitma, ad J. K. Tamaki, Biary trees optimum uder various criteria, SIAM J. Appl. Math., vol. 37, o. 2, pp , ] P. Humblet, Geeralizatio of Huffma codig to miimize the probability of buffer overflow Corresp., IEEE Tras. If. Theory, vol. 27, o. 2, pp , Mar ] I. A. Ibragimov, Some limit theorems for statioary processes, Theory Probab. Appl., vol. 7, o. 4, pp , 962.

19 KONTOYIANNIS AND VERDÚ: OPTIMAL LOSSLESS DATA COMPRESSION 795 7] M. Hayashi, Secod-order asymptotics i fixed-legth source codig ad itrisic radomess, IEEE Tras. If. Theory, vol. 54, o. 0, pp , Oct ] J. C. Kieffer, Sample coverses i source codig theory, IEEE Tras. If. Theory, vol. 37, o. 2, pp , Mar ] I. Kotoyiais, Secod-order oiseless source codig theorems, IEEE Tras. If. Theory, vol. 43, o. 3, pp , Jul ] I. Kotoyiais, Asymptotic recurrece ad waitig times for statioary processes, J. Theoretical Probab., vol., o. 3, pp , ] I. Kotoyiais ad Y. M. Suhov, Prefixes ad the etropy rate for log-rage sources, Probability Statistics ad Optimizatio, F. P. Kelly, Ed. New York, NY, USA: Wiley, ] V. Yu. Korolev ad I. G. Shevtsova, O the upper boud for the absolute costat i the Berry Essée iequality, Theory Probab. Appl., vol. 54, o. 4, pp , ] S. K. Leug-Ya-Cheog ad T. Cover, Some equivaleces betwee Shao etropy ad Kolmogorov complexity, IEEE Tras. If. Theory, vol. 24, o. 3, pp , May ] B. Ma, Berry Essée cetral limit theorems for Markov chais, Ph.D. dissertatio, Dept. Math., Harvard Uiv., Cambridge, MA, USA, ] B. McMilla, The basic theorems of iformatio theory, A. Math. Statist., vol. 24, pp , Ju ] B. McMilla, Two iequalities implied by uique decipherability, IRE Tras. If. Theory, vol. 2, o. 4, pp. 5 6, Dec ] N. Merhav, Uiversal codig with miimum probability of codeword legth overflow, IEEE Tras. If. Theory, vol. 37, o. 3, pp , May ] S. P. Mey ad R. L. Tweedie, Markov Chais ad Stochastic Stability, 2d ed. Cambridge, U.K.: Cambridge Uiv. Press, ] S. V. Nagaev, More exact limit theorems for homogeeous Markov chais, Theory Probab. Appl., vol. 6, o., pp. 62 8, ] V. V. Petrov, Limit Theorems of Probability Theory: Sequeces of Idepedet Radom Variables. Oxford, U.K.: Oxford Sciece, ] W. Philipp ad W. Stout, Almost Sure Ivariace Priciples for Partial Sums of Weakly Depedet Radom Variables. Providece, RI, USA: AMS, ] Y. Polyaskiy, H. V. Poor, ad S. Verdú, Chael codig rate i the fiite blocklegth regime, IEEE Tras. If. Theory, vol. 56, o. 5, pp , May ] J. Rissae, Tight lower bouds for optimum code legth, IEEE Tras. If. Theory, vol. 28, o. 2, pp , Mar ] C. E. Shao, A mathematical theory of commuicatio, Bell Syst. Tech. J., vol. 27, o. 4, pp , ] P. C. Shields, The Ergodic Theory of Discrete Sample Paths Graduate Studies i Mathematics. Providece, RI, USA: AMS, ] G. Somasudaram ad A. Shrivastava, Iformatio Storage ad Maagemet: Storig, Maagig, ad Protectig Digital Iformatio i Classic, Virtualized, ad Cloud Eviromets. New York, NY, USA: Wiley, ] V. Strasse, Asymptotische abschäzuge i Shaos iformatiostheorie, i Proc. Tras. Third Prague Cof. If. Theory, Statist., Decisio Fuct., Radom Process., 964, pp ] W. Szpakowski ad S. Verdú, Miimum expected legth of fixed-tovariable lossless compressio without prefix costraits, IEEE Tras. If. Theory, vol. 57, o. 7, pp , Jul ] S. Verdú, Multiuser Detectio. Cambridge, U.K.: Cambridge Uiv. Press, ] S. Verdú, teachig it, preseted at the IEEE It. Symp. If. Theory, Nice, Frace, Ju ] S. Verdú, Teachig lossless data compressio, IEEE If. Theory Soc. Newsletter, vol. 6, o., pp. 8 9, Apr ] E. I. Verriest, A achievable boud for optimal oiseless codig of a radom variable, IEEE Tras. If. Theory, vol. 32, o. 4, pp , Jul ] A. D. Wyer, A upper boud o the etropy series, If. Cotrol, vol. 20, o. 2, pp. 76 8, ] A. A. Yushkevich, O limit theorems coected with the cocept of etropy of Markov chais, Uspekhi Mat. Nauk, vol. 8, o. 557, pp , 953. Ioais Kotoyiais was bor i Athes, Greece, i 972. He received the B.Sc. degree i mathematics i 992 from Imperial College Uiversity of Lodo, U.K., ad i 993 he obtaied a distictio i Part III of the Cambridge Uiversity Pure Mathematics Tripos. He received the M.S. degree i statistics ad the Ph.D. degree i electrical egieerig, both from Staford Uiversity, Staford, CA, i 997 ad 998, respectively. From Jue 998 to August 200, he was a Assistat Professor with the Departmet of Statistics, Purdue Uiversity, West Lafayette, IN ad also, by courtesy, with the Departmet of Mathematics, ad the School of Electrical ad Computer Egieerig. From August 2000 util July 2005, he was a Assistat, the Associate Professor, with the Divisio of Applied Mathematics ad with the Departmet of Computer Sciece, Brow Uiversity, Providece, RI. Sice March 2005, he has bee with the Departmet of Iformatics, Athes Uiversity of Ecoomics ad Busiess, where he is curretly a Professor. Dr. Kotoyiais was awarded the Maig edowed Assistat Professorship i 2002, ad was awarded a hoorary Master of Arts degree Ad Eudem, i 2005, both by Brow Uiversity. I 2004, he was awarded a Sloa Foudatio Research Fellowship. He has served two terms as a Associate Editor for the IEEE TRANSACTIONS ON INFORMATION THEORY. His research iterests iclude data compressio, applied probability, iformatio theory, statistics, simulatio, ad mathematical biology. Sergio Verdú S 80 M 84 SM 88 F 93 is o the faculty of the School of Egieerig ad Applied Sciece at Priceto Uiversity. A member of the Natioal Academy of Egieerig, Verdú is the recipiet of the 2007 Claude E. Shao Award ad of the 2008 IEEE Richard W. Hammig Medal.

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Universal coding for classes of sources

Universal coding for classes of sources Coexios module: m46228 Uiversal codig for classes of sources Dever Greee This work is produced by The Coexios Project ad licesed uder the Creative Commos Attributio Licese We have discussed several parametric

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Theorems About Power Series

Theorems About Power Series Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real o-egative umber R, called the radius

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

THE ABRACADABRA PROBLEM

THE ABRACADABRA PROBLEM THE ABRACADABRA PROBLEM FRANCESCO CARAVENNA Abstract. We preset a detailed solutio of Exercise E0.6 i [Wil9]: i a radom sequece of letters, draw idepedetly ad uiformly from the Eglish alphabet, the expected

More information

Irreducible polynomials with consecutive zero coefficients

Irreducible polynomials with consecutive zero coefficients Irreducible polyomials with cosecutive zero coefficiets Theodoulos Garefalakis Departmet of Mathematics, Uiversity of Crete, 71409 Heraklio, Greece Abstract Let q be a prime power. We cosider the problem

More information

Class Meeting # 16: The Fourier Transform on R n

Class Meeting # 16: The Fourier Transform on R n MATH 18.152 COUSE NOTES - CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

Infinite Sequences and Series

Infinite Sequences and Series CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

MARTINGALES AND A BASIC APPLICATION

MARTINGALES AND A BASIC APPLICATION MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

1 The Gaussian channel

1 The Gaussian channel ECE 77 Lecture 0 The Gaussia chael Objective: I this lecture we will lear about commuicatio over a chael of practical iterest, i which the trasmitted sigal is subjected to additive white Gaussia oise.

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

Present Values, Investment Returns and Discount Rates

Present Values, Investment Returns and Discount Rates Preset Values, Ivestmet Returs ad Discout Rates Dimitry Midli, ASA, MAAA, PhD Presidet CDI Advisors LLC dmidli@cdiadvisors.com May 2, 203 Copyright 20, CDI Advisors LLC The cocept of preset value lies

More information

INFINITE SERIES KEITH CONRAD

INFINITE SERIES KEITH CONRAD INFINITE SERIES KEITH CONRAD. Itroductio The two basic cocepts of calculus, differetiatio ad itegratio, are defied i terms of limits (Newto quotiets ad Riema sums). I additio to these is a third fudametal

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length

A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length Joural o Satisfiability, Boolea Modelig ad Computatio 1 2005) 49-60 A Faster Clause-Shorteig Algorithm for SAT with No Restrictio o Clause Legth Evgey Datsi Alexader Wolpert Departmet of Computer Sciece

More information

Factors of sums of powers of binomial coefficients

Factors of sums of powers of binomial coefficients ACTA ARITHMETICA LXXXVI.1 (1998) Factors of sums of powers of biomial coefficiets by Neil J. Cali (Clemso, S.C.) Dedicated to the memory of Paul Erdős 1. Itroductio. It is well ow that if ( ) a f,a = the

More information

Research Article Sign Data Derivative Recovery

Research Article Sign Data Derivative Recovery Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5

More information

Perfect Packing Theorems and the Average-Case Behavior of Optimal and Online Bin Packing

Perfect Packing Theorems and the Average-Case Behavior of Optimal and Online Bin Packing SIAM REVIEW Vol. 44, No. 1, pp. 95 108 c 2002 Society for Idustrial ad Applied Mathematics Perfect Packig Theorems ad the Average-Case Behavior of Optimal ad Olie Bi Packig E. G. Coffma, Jr. C. Courcoubetis

More information

Lecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k.

Lecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k. 18.409 A Algorithmist s Toolkit September 17, 009 Lecture 3 Lecturer: Joatha Keler Scribe: Adre Wibisoo 1 Outlie Today s lecture covers three mai parts: Courat-Fischer formula ad Rayleigh quotiets The

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

UC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Solutions 9 Spring 2006

UC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Solutions 9 Spring 2006 Exam format UC Bereley Departmet of Electrical Egieerig ad Computer Sciece EE 6: Probablity ad Radom Processes Solutios 9 Sprig 006 The secod midterm will be held o Wedesday May 7; CHECK the fial exam

More information

The Stable Marriage Problem

The Stable Marriage Problem The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,

More information

Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies Core Technical Syllabus Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

A Recursive Formula for Moments of a Binomial Distribution

A Recursive Formula for Moments of a Binomial Distribution A Recursive Formula for Momets of a Biomial Distributio Árpád Béyi beyi@mathumassedu, Uiversity of Massachusetts, Amherst, MA 01003 ad Saverio M Maago smmaago@psavymil Naval Postgraduate School, Moterey,

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

Notes on exponential generating functions and structures.

Notes on exponential generating functions and structures. Notes o expoetial geeratig fuctios ad structures. 1. The cocept of a structure. Cosider the followig coutig problems: (1) to fid for each the umber of partitios of a -elemet set, (2) to fid for each the

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is 0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values

More information

Annuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL.

Annuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL. Auities Uder Radom Rates of Iterest II By Abraham Zas Techio I.I.T. Haifa ISRAEL ad Haifa Uiversity Haifa ISRAEL Departmet of Mathematics, Techio - Israel Istitute of Techology, 3000, Haifa, Israel I memory

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

Lecture 4: Cheeger s Inequality

Lecture 4: Cheeger s Inequality Spectral Graph Theory ad Applicatios WS 0/0 Lecture 4: Cheeger s Iequality Lecturer: Thomas Sauerwald & He Su Statemet of Cheeger s Iequality I this lecture we assume for simplicity that G is a d-regular

More information

A RANDOM PERMUTATION MODEL ARISING IN CHEMISTRY

A RANDOM PERMUTATION MODEL ARISING IN CHEMISTRY J. Appl. Prob. 45, 060 070 2008 Prited i Eglad Applied Probability Trust 2008 A RANDOM PERMUTATION MODEL ARISING IN CHEMISTRY MARK BROWN, The City College of New York EROL A. PEKÖZ, Bosto Uiversity SHELDON

More information

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics Chair for Network Architectures ad Services Istitute of Iformatics TU Müche Prof. Carle Network Security Chapter 2 Basics 2.4 Radom Number Geeratio for Cryptographic Protocols Motivatio It is crucial to

More information

4.3. The Integral and Comparison Tests

4.3. The Integral and Comparison Tests 4.3. THE INTEGRAL AND COMPARISON TESTS 9 4.3. The Itegral ad Compariso Tests 4.3.. The Itegral Test. Suppose f is a cotiuous, positive, decreasig fuctio o [, ), ad let a = f(). The the covergece or divergece

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

3 Basic Definitions of Probability Theory

3 Basic Definitions of Probability Theory 3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

Analysis Notes (only a draft, and the first one!)

Analysis Notes (only a draft, and the first one!) Aalysis Notes (oly a draft, ad the first oe!) Ali Nesi Mathematics Departmet Istabul Bilgi Uiversity Kuştepe Şişli Istabul Turkey aesi@bilgi.edu.tr Jue 22, 2004 2 Cotets 1 Prelimiaries 9 1.1 Biary Operatio...........................

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

arxiv:1506.03481v1 [stat.me] 10 Jun 2015

arxiv:1506.03481v1 [stat.me] 10 Jun 2015 BEHAVIOUR OF ABC FOR BIG DATA By Wetao Li ad Paul Fearhead Lacaster Uiversity arxiv:1506.03481v1 [stat.me] 10 Ju 2015 May statistical applicatios ivolve models that it is difficult to evaluate the likelihood,

More information

Metric, Normed, and Topological Spaces

Metric, Normed, and Topological Spaces Chapter 13 Metric, Normed, ad Topological Spaces A metric space is a set X that has a otio of the distace d(x, y) betwee every pair of poits x, y X. A fudametal example is R with the absolute-value metric

More information

19 Another Look at Differentiability in Quadratic Mean

19 Another Look at Differentiability in Quadratic Mean 19 Aother Look at Differetiability i Quadratic Mea David Pollard 1 ABSTRACT This ote revisits the delightfully subtle itercoectios betwee three ideas: differetiability, i a L 2 sese, of the square-root

More information

THIN SEQUENCES AND THE GRAM MATRIX PAMELA GORKIN, JOHN E. MCCARTHY, SANDRA POTT, AND BRETT D. WICK

THIN SEQUENCES AND THE GRAM MATRIX PAMELA GORKIN, JOHN E. MCCARTHY, SANDRA POTT, AND BRETT D. WICK THIN SEQUENCES AND THE GRAM MATRIX PAMELA GORKIN, JOHN E MCCARTHY, SANDRA POTT, AND BRETT D WICK Abstract We provide a ew proof of Volberg s Theorem characterizig thi iterpolatig sequeces as those for

More information

Entropy of bi-capacities

Entropy of bi-capacities Etropy of bi-capacities Iva Kojadiovic LINA CNRS FRE 2729 Site école polytechique de l uiv. de Nates Rue Christia Pauc 44306 Nates, Frace iva.kojadiovic@uiv-ates.fr Jea-Luc Marichal Applied Mathematics

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series

Our aim is to show that under reasonable assumptions a given 2π-periodic function f can be represented as convergent series 8 Fourier Series Our aim is to show that uder reasoable assumptios a give -periodic fuctio f ca be represeted as coverget series f(x) = a + (a cos x + b si x). (8.) By defiitio, the covergece of the series

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

NATIONAL SENIOR CERTIFICATE GRADE 12

NATIONAL SENIOR CERTIFICATE GRADE 12 NATIONAL SENIOR CERTIFICATE GRADE MATHEMATICS P EXEMPLAR 04 MARKS: 50 TIME: 3 hours This questio paper cosists of 8 pages ad iformatio sheet. Please tur over Mathematics/P DBE/04 NSC Grade Eemplar INSTRUCTIONS

More information

Nr. 2. Interpolation of Discount Factors. Heinz Cremers Willi Schwarz. Mai 1996

Nr. 2. Interpolation of Discount Factors. Heinz Cremers Willi Schwarz. Mai 1996 Nr 2 Iterpolatio of Discout Factors Heiz Cremers Willi Schwarz Mai 1996 Autore: Herausgeber: Prof Dr Heiz Cremers Quatitative Methode ud Spezielle Bakbetriebslehre Hochschule für Bakwirtschaft Dr Willi

More information

Tradigms of Astundithi and Toyota

Tradigms of Astundithi and Toyota Tradig the radomess - Desigig a optimal tradig strategy uder a drifted radom walk price model Yuao Wu Math 20 Project Paper Professor Zachary Hamaker Abstract: I this paper the author iteds to explore

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

Chapter 5 O A Cojecture Of Erdíos Proceedigs NCUR VIII è1994è, Vol II, pp 794í798 Jeærey F Gold Departmet of Mathematics, Departmet of Physics Uiversity of Utah Do H Tucker Departmet of Mathematics Uiversity

More information

Plug-in martingales for testing exchangeability on-line

Plug-in martingales for testing exchangeability on-line Plug-i martigales for testig exchageability o-lie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8 CME 30: NUMERICAL LINEAR ALGEBRA FALL 005/06 LECTURE 8 GENE H GOLUB 1 Positive Defiite Matrices A matrix A is positive defiite if x Ax > 0 for all ozero x A positive defiite matrix has real ad positive

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction THE ARITHMETIC OF INTEGERS - multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,

More information

Exploratory Data Analysis

Exploratory Data Analysis 1 Exploratory Data Aalysis Exploratory data aalysis is ofte the rst step i a statistical aalysis, for it helps uderstadig the mai features of the particular sample that a aalyst is usig. Itelliget descriptios

More information

WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER?

WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER? WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER? JÖRG JAHNEL 1. My Motivatio Some Sort of a Itroductio Last term I tought Topological Groups at the Göttige Georg August Uiversity. This

More information

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. Powers of a matrix We begi with a propositio which illustrates the usefuless of the diagoalizatio. Recall that a square matrix A is diogaalizable if

More information

Stock Market Trading via Stochastic Network Optimization

Stock Market Trading via Stochastic Network Optimization PROC. IEEE CONFERENCE ON DECISION AND CONTROL (CDC), ATLANTA, GA, DEC. 2010 1 Stock Market Tradig via Stochastic Network Optimizatio Michael J. Neely Uiversity of Souther Califoria http://www-rcf.usc.edu/

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1) BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information