1570 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 6, JUNE 2002

Finite-Length Analysis of Low-Density Parity-Check Codes on the Binary Erasure Channel

Changyan Di, Student Member, IEEE, David Proietti, I. Emre Telatar, Member, IEEE, Thomas J. Richardson, and Rüdiger L. Urbanke

Invited Paper

Abstract: In this paper, we are concerned with the finite-length analysis of low-density parity-check (LDPC) codes when used over the binary erasure channel (BEC). The main result is an expression for the exact average bit and block erasure probability for a given regular ensemble of LDPC codes when decoded iteratively. We also give expressions for upper bounds on the average bit and block erasure probability for regular LDPC ensembles and the standard random ensemble under maximum-likelihood (ML) decoding. Finally, we present what we consider to be the most important open problems in this area.

Index Terms: Belief propagation, binary erasure channel (BEC), finite-length analysis, low-density parity-check (LDPC) codes.

I. INTRODUCTION

In this paper, we are concerned with the finite-length analysis of low-density parity-check (LDPC) codes when used over the binary erasure channel (BEC). The main result is an expression for the exact average bit and block erasure probability for a given regular ensemble when decoded iteratively with message-passing algorithms as in, e.g., [11]. For an introduction into the terminology and basic results of LDPC codes we refer the reader to [3]-[9], [11]-[15].

For a particular code $G$¹ in a given ensemble $\mathcal{C}(n, \lambda, \rho)$, let $P_b^{\mathrm{IT}}(G, \epsilon)$ denote the expected bit erasure probability if $G$ is used to transmit over a BEC with parameter $\epsilon$ and if the received word is decoded iteratively by the standard belief propagation decoder. Here, the expectation is over all realizations of the channel. Let $\mathbb{E}_{\mathcal{C}(n,\lambda,\rho)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ denote the corresponding ensemble average. The following two results are well known, see [7], [9].

Manuscript received September 9, 2001; revised January 15, 2002. The material in this paper was presented at the 39th Annual Allerton Conference on Communication, Control and Computing, Allerton, UIUC, October 3-5, 2001. C. Di, D. Proietti, and R. L.
Urbanke are with the Swiss Federal Institute of Technology-Lausanne, LTHC-IC, CH-1015 Lausanne, Switzerland (e-mail: changyan.di@epfl.ch; david.proietti@epfl.ch; rudiger.urbanke@epfl.ch). I. E. Telatar is with the Swiss Federal Institute of Technology-Lausanne, LTHI-IC, CH-1015 Lausanne, Switzerland (e-mail: emre.telatar@epfl.ch). T. J. Richardson is with Flarion Technologies, Bedminster, NJ 07921 USA (e-mail: richardson@flarion.com). Communicated by S. Shamai, Guest Editor. Publisher Item Identifier S 0018-9448(02)04026-9.

¹ More precisely, $G$ denotes the bipartite graph representing the code.

[Concentration Around Ensemble Average] For any given deviation there exists a positive exponent such that the probability that $P_b^{\mathrm{IT}}(G, \epsilon)$ differs from the ensemble average by more than this deviation decays exponentially in the block length $n$, at a rate given by this exponent.

[Convergence of Ensemble Average to Cycle-Free Case] There exists a constant that bounds the gap between the ensemble average for block length $n$ and the ensemble average of the cycle-free case, and this gap vanishes as $n$ tends to infinity.

In words, the first statement asserts that the behavior of the individual codes concentrates around the ensemble average and that this concentration is exponential in the block length. The second statement asserts that the ensemble average converges to the ensemble average of the cycle-free case as the block length tends to infinity.² Note, though, that the convergence to the cycle-free case is likely to be polynomial at best, whereas the convergence to the ensemble average is exponential in the block length.³

The above two statements suggest the following. Fix the block length $n$ and consider individual elements $G$ of $\mathcal{C}(n, \lambda, \rho)$. Although the behavior of individual codes can differ significantly from that of the cycle-free (asymptotic) case for moderate block lengths, the behavior of individual instances is likely to be concentrated around the ensemble average. Let us demonstrate this point by means of an example. Consider the situation depicted in Fig. 1. The two solid curves represent $\mathbb{E}_{\mathcal{C}(512, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (left solid curve) and $\mathbb{E}_{\mathcal{C}(\infty, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (right solid curve), respectively. As we can see, for a block length of $n = 512$, the average bit erasure probability differs significantly from the one of the cycle-free case. Also plotted are curves corresponding to $P_b^{\mathrm{IT}}(G, \epsilon)$ for several randomly chosen instances $G$ of $\mathcal{C}(512, x^2, x^5)$ (dashed curves).
These curves follow the ensemble average very closely down to fairly small bit erasure probabilities.

² Recall that in the limit of infinite block length, the support tree up to any fixed given depth of a randomly chosen node or edge is cycle free with probability that goes to one. We, therefore, use the phrases "cycle free" and "infinite block length" interchangeably.

³ For the erasure channel more precise statements about the convergence speeds can be gained by an analysis of the error floor, see [10], [16].

0018-9448/02$17.00 © 2002 IEEE

From the above observations we can see that the ensemble average plays a significant role in the analysis of finite-length codes and that, therefore, computable expressions for $\mathbb{E}_{\mathcal{C}(n,\lambda,\rho)}[P_b^{\mathrm{IT}}(G, \epsilon)]$
are of considerable value. Viewing the decoding operation from a standard message-passing point of view, it is hard to see how one could derive analytic expressions for this ensemble average. Cycles in the graph seem to render the finite-length problem quite intractable. The crucial innovation in this paper is to use as a starting point a combinatorial characterization of decoding failures. This combinatorial characterization was originally proposed in [12] in the context of the efficient encoding of LDPC codes.

Fig. 1. Concentration of the bit erasure probability $P_b^{\mathrm{IT}}(G, \epsilon)$ for specific instances $G \in \mathcal{C}(512, x^2, x^5)$ (dashed curves) around the ensemble average $\mathbb{E}_{\mathcal{C}(512, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (left solid curve). (It is noteworthy that there appear to be two dominant modes of behavior.) Also shown is the performance of the cycle-free case, $\mathbb{E}_{\mathcal{C}(\infty, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (right solid curve).

Fig. 2. A specific element $G$ from the ensemble $\mathcal{C}(10, x^2, x^5)$.

To recall some notation, an ensemble of LDPC codes $\mathcal{C}(n, \lambda, \rho)$ is characterized by its block length $n$, a variable node degree distribution $\lambda(x) = \sum_i \lambda_i x^{i-1}$, and a check node degree distribution $\rho(x) = \sum_i \rho_i x^{i-1}$. Here, $\lambda_i$ ($\rho_i$) is equal to the probability that a randomly chosen edge is connected to a variable (check) node of degree $i$. To be specific, consider regular ensembles of the form $\mathcal{C}(n, x^{l-1}, x^{r-1})$. For example, a typical element of $\mathcal{C}(10, x^2, x^5)$ is shown in Fig. 2. Note that each variable node participates in exactly three checks and that each check node checks exactly six variable nodes. The following definition characterizes the key object needed to study the finite-length performance of LDPC codes over the BEC.

Definition 1.1 [Stopping Sets]: A stopping set $S$ is a subset of $V$, the set of variable nodes, such that all neighbors of $S$ are connected to $S$ at least twice.

As one can see from Fig. 3, for the particular graph $G$ shown the indicated set of variable nodes is a stopping set. Note, in particular, that the empty set is a stopping set. The space of stopping sets is closed under unions, i.e., if $S_1$ and $S_2$ are both stopping sets then so is $S_1 \cup S_2$. (To see this note that if $c$ is a neighbor of $S_1 \cup S_2$ then it must be a neighbor of at least one of $S_1$ or $S_2$; assume that $c$ is a neighbor of $S_1$.
Since $S_1$ is a stopping set, $c$ has at least two connections to $S_1$ and therefore at least two connections to $S_1 \cup S_2$.) Each subset of $V$ thus clearly contains a unique maximal stopping set (which might be the empty set).

Fig. 3. A set of four variable nodes that forms a stopping set.

The next lemma shows the crucial role that stopping sets play in the process of iterative decoding of LDPC codes when used over the BEC.

Lemma 1.1 [Combinatorial Characterization of Iterative Decoder Performance]: Let $G$ be a given element from the ensemble. Assume that we use $G$ to transmit over the BEC and that we decode the received word in an iterative fashion until either the codeword has been recovered or until the decoder fails to progress further. Let $\mathcal{E}$ denote the subset of the set of variable nodes which is erased by the channel. Then the set of erasures which remain when the decoder stops is equal to the unique maximal stopping set of $\mathcal{E}$.

Proof: Let $S$ be a stopping set contained in $\mathcal{E}$. We claim that the iterative decoder cannot determine the variable nodes contained in $S$. This is true, since even if all other bits were known, every neighbor of $S$ has at least two connections to the set $S$ and so all messages to $S$ will be erasure messages. It follows that the decoder cannot determine the variables contained
in the unique maximal stopping set contained in $\mathcal{E}$. Conversely, if the decoder terminates at a set $S$, then all messages entering this subset must be erasure messages, which happens only if all neighbors of $S$ have at least two connections to $S$. In other words, $S$ must be a stopping set and, since no erasure contained in a stopping set can be determined by the iterative decoder, it must be the maximal such stopping set.

In order now to determine the exact (block) erasure probability under iterative decoding it remains to find the probability that a random subset of the set of variable nodes (the set of erasures $\mathcal{E}$) of a randomly chosen element from the ensemble contains a nonempty stopping set. We show in Theorem 2.1 that this can be done exactly. In Section III, we consider the maximum-likelihood (ML) performance of LDPC ensembles as well as of the standard random ensemble. It is instructive to study the ML performance since this makes it possible to distinguish how much of the incurred performance loss of iterative coding systems is due to the suboptimal decoding and how much is due to the particular choice of codes. Finally, in Section V, we present what we consider to be the most important open problems in this area.

II. FINITE-LENGTH ANALYSIS

A. LDPC Codes Under Belief Propagation Decoding

The characterization of decoding failures stated in Lemma 1.1 reduces the task of the exact determination of the performance of iterative decoders to a combinatorial problem. In this section, we present a solution to that combinatorial problem. In the sequel, if $p(x)$ is a power series, $p(x) = \sum_i p_i x^i$, we denote by $\mathrm{coef}(p(x), x^i)$ its $i$th coefficient.

Theorem 2.1: Let $P_b^{\mathrm{IT}}(G, \epsilon)$ denote the bit erasure probability when transmitting over a BEC with erasure probability $\epsilon$ using a code $G$, $G \in \mathcal{C}(n, x^{l-1}, x^{r-1})$, and a belief propagation decoder. Hereby we assume that we iterate until either all erasures have been determined or the decoder fails to progress further. In a similar manner, let $P_B^{\mathrm{IT}}(G, \epsilon)$ denote the block erasure probability. Define the functions $T(v, c, d)$, $N(v, c, d)$, $M(v, c, d)$, and $O(v, s, c, d)$ by the recursions

$$T(v, c, d) := \binom{cr + d}{vl} (vl)! \qquad (2.1)$$

$$N(v, c, d) := T(v, c, d) - M(v, c, d) \qquad (2.2)$$

$$M(v, c, d) := \sum_{s > 0} \binom{v}{s} O(v, s, c, d) \qquad (2.3)$$

Fig. 4.
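There are $v$ variable nodes of degree $l$, $c$ check nodes of degree $r$, and one super check node of degree $d$.

$$O(v, s, c, d) := \sum_{j} \binom{c}{j}\, \mathrm{coef}\big(((1+x)^r - 1 - rx)^j (1+x)^d,\; x^{sl}\big)\, (sl)!\; N(v - s,\; c - j,\; d + jr - sl) \qquad (2.4)$$

and the associated boundary condition. (Note, in (2.4), that the coefficient vanishes if $d + jr - sl < 0$, so $N$ need not be defined for this case.) Then

$$\mathbb{E}_{\mathcal{C}(n, x^{l-1}, x^{r-1})}[P_b^{\mathrm{IT}}(G, \epsilon)] = \sum_{e=0}^{n} \binom{n}{e} \epsilon^e (1-\epsilon)^{n-e} \sum_{s=0}^{e} \binom{e}{s} \frac{s}{n} \frac{O(e, s, c, 0)}{T(e, c, 0)}$$

$$\mathbb{E}_{\mathcal{C}(n, x^{l-1}, x^{r-1})}[P_B^{\mathrm{IT}}(G, \epsilon)] = \sum_{e=0}^{n} \binom{n}{e} \epsilon^e (1-\epsilon)^{n-e} \sum_{s>0} \binom{e}{s} \frac{O(e, s, c, 0)}{T(e, c, 0)} = \sum_{e=0}^{n} \binom{n}{e} \epsilon^e (1-\epsilon)^{n-e} \left(1 - \frac{N(e, c, 0)}{T(e, c, 0)}\right)$$

where $c = nl/r$ is the number of check nodes.

Proof: Consider the situation depicted in Fig. 4. There are $v$ variable nodes of degree $l$, $c$ check nodes of degree $r$, and one super check node of degree $d$.⁴ Label the variable node sockets in some arbitrary but fixed way with elements from the set $[vl] = \{1, \dots, vl\}$ and, in a similar manner, label the check node sockets in some arbitrary but fixed way with elements from the set $[cr + d]$. Let two maps, one on the variable node sockets and one on the check node sockets, describe the association of variable and check node sockets to their respective nodes, so that, e.g., if the first map sends 3 to 5 then this signifies that the third variable node socket emanates from the fifth variable node. We always label the regular check nodes by $[c]$ and set the label of the super check node to $c + 1$.

⁴ As we will see shortly, it is the introduction of this extra check node which makes it possible to write down the recursions.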
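The recursions of Theorem 2.1 can be evaluated directly with exact integer arithmetic for small parameters. The following is a minimal sketch, not the authors' implementation. It assumes that (2.1)-(2.4) take the form suggested by the counting argument in the proof (total count $T$, stopping-set-free count $N$, count $M$ of constellations with a stopping set, and count $O$ for a fixed maximal stopping set of size $s$), and that the block erasure probability is the binomially weighted average of $1 - N/T$. The names `make_counts`, `coef`, and `block_erasure_probability` are hypothetical.

```python
from functools import lru_cache
from math import comb, factorial


def make_counts(l, r):
    """Build T, N, M, O for variable degree l and check degree r (assumed forms)."""

    def coef(j, d, k):
        # Coefficient of x^k in ((1+x)^r - 1 - r*x)^j * (1+x)^d.
        base = [0, 0] + [comb(r, i) for i in range(2, r + 1)]
        poly = [comb(d, i) for i in range(d + 1)]
        for _ in range(j):
            nxt = [0] * (len(poly) + len(base) - 1)
            for a, pa in enumerate(poly):
                for b, pb in enumerate(base):
                    nxt[a + b] += pa * pb
            poly = nxt
        return poly[k] if 0 <= k < len(poly) else 0

    def T(v, c, d):  # assumed form of (2.1): all constellations
        return comb(c * r + d, v * l) * factorial(v * l)

    def N(v, c, d):  # assumed form of (2.2): stopping-set-free constellations
        return T(v, c, d) - M(v, c, d)

    @lru_cache(maxsize=None)
    def M(v, c, d):  # assumed form of (2.3): at least one stopping set
        return sum(comb(v, s) * O(v, s, c, d) for s in range(1, v + 1))

    @lru_cache(maxsize=None)
    def O(v, s, c, d):  # assumed form of (2.4): fixed maximal stopping set of size s
        total = 0
        for j in range(c + 1):
            rest = d + j * r - s * l  # sockets relegated to the super check node
            if rest < 0:
                continue  # the coefficient vanishes here, N is not needed
            total += (comb(c, j) * coef(j, d, s * l) * factorial(s * l)
                      * N(v - s, c - j, rest))
        return total

    return T, N, M, O


def block_erasure_probability(n, l, r, eps):
    """Average block erasure probability over the (l, r)-regular ensemble."""
    T, N, _, _ = make_counts(l, r)
    c = n * l // r  # number of check nodes, assumed to be an integer
    total = 0.0
    for e in range(n + 1):
        p_e = comb(n, e) * eps ** e * (1 - eps) ** (n - e)
        total += p_e * (1 - N(e, c, 0) / T(e, c, 0))
    return total
```

For the tiny case $n = 2$, $l = 3$, $r = 6$ there is a single check node, every erased variable node is connected to it three times, and so any nonempty erasure set is a stopping set; the sketch then returns $1 - (1 - \epsilon)^2$. The nested polynomial multiplication keeps the code short but grows quickly, so this form is only practical for small block lengths.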
For simplicity, we will refer to a particular realization of connecting the variable node sockets to the check node sockets as a constellation. More precisely, a constellation is an injective map from $[vl]$ to $[cr + d]$ (so $vl \le cr + d$ is required), so that each variable node socket is connected to the check node socket that is its image. Let $\mathcal{T}(v, c, d)$ denote the set of all such maps and let $T(v, c, d) = |\mathcal{T}(v, c, d)|$. Since there are $\binom{cr + d}{vl}$ degrees of freedom in choosing which of the check node sockets are connected and a further $(vl)!$ ways of permuting the corresponding edges, $T(v, c, d)$ is as given in (2.1).

We will say that a constellation contains a stopping set if it contains a nonempty subset $S$ of the variable nodes such that any regular check node which is connected to $S$ is connected to $S$ at least twice. Note that this definition is slightly more general than the one given in Definition 1.1 since in our current setup we have in addition a super check node of degree $d$. In particular, in this extended definition, no restrictions are placed on the number of connections from the stopping set to the super check node.

Clearly, the set $\mathcal{T}$ can be partitioned into the set of maps that contain no stopping set, call this set $\mathcal{N}$, and the set of maps that contain at least one stopping set, call this set $\mathcal{M}$. Letting $N = |\mathcal{N}|$ and $M = |\mathcal{M}|$ we, therefore, have the relationship (2.2).

Consider now $\mathcal{M}$, the set of constellations that contain at least one stopping set. Observe that if $S_1$ and $S_2$ are two stopping sets then their union is a stopping set. It follows that each element of $\mathcal{M}$ contains a unique maximal stopping set. Therefore, we have

$$\mathcal{M}(v, c, d) = \bigcup_{S \subseteq [v],\, S \ne \emptyset} \mathcal{O}(v, S, c, d) \qquad (2.5)$$

where $\mathcal{O}(v, S, c, d)$ denotes the set of constellations which have $S$ as their unique maximal stopping set. By some abuse of notation, let $O(v, s, c, d) = |\mathcal{O}(v, S, c, d)|$ with $s = |S|$, where we have used the fact that the cardinality of $\mathcal{O}(v, S, c, d)$ only depends on the cardinality of $S$ but not on the specific choice of variable nodes. Since there are $\binom{v}{s}$ choices of $S$ of size $s$ and since the union in (2.5) is disjoint we get (2.3).

It remains to prove the recursion (2.4) which links $O$ to $N$. Consider the situation depicted in Fig. 5, where a specific set $S$ of cardinality $s$ is chosen.
We are interested in counting the elements of $\mathcal{O}(v, S, c, d)$. Note that by definition of $\mathcal{O}(v, S, c, d)$, $S$ is the unique maximal stopping set. First, this implies that $S$ is a stopping set. Consider those elements of $\mathcal{O}(v, S, c, d)$ for which the set $S$ is connected to $j$ (out of the $c$) regular check nodes. There are $\binom{c}{j}$ ways of choosing these $j$ check nodes. Next, there are $\mathrm{coef}(((1+x)^r - 1 - rx)^j (1+x)^d, x^{sl})$ ways of choosing the check node sockets to which the $sl$ sockets of the set $S$ are connected. Finally, the $sl$ edges emanating from $S$ can be permuted in $(sl)!$ ways.

Fig. 5. There are $v$ variable nodes of degree $l$, $c$ check nodes of degree $r$, and one super check node of degree $d$. Further, $S$ is a subset of $V$, the set of variable nodes, of cardinality $s$.

So far we have only been concerned with edges that emanate from $S$. We still need to ensure that we only count those constellations for which $S$ is the maximal stopping set. Consider a nonempty set $S' \subseteq V \setminus S$. Assume that $S'$ has the property that any regular check node which is connected to $S'$ but not to $S$ is connected to $S'$ at least twice. Then clearly $S \cup S'$ is also a stopping set and so $S$ is not the maximal stopping set. Conversely, assume that $S$ is not the maximal stopping set. Let $\tilde{S}$ be the maximal stopping set and consider $S' = \tilde{S} \setminus S$. By definition, every regular check node which is connected to $\tilde{S}$ is connected to $\tilde{S}$ at least twice. Therefore, every regular check node which is connected to $S'$ but not to $S$ is connected to $S'$ at least twice. We conclude that $S$ will be the unique maximal stopping set iff $V \setminus S$ does not contain a nonempty subset $S'$ with the property that every regular check node which is connected to $S'$ but not to $S$ is connected to $S'$ at least twice. How many constellations are there which fulfill this property? A moment's thought shows that this number is equal to $N(v - s, c - j, d + jr - sl)$: there are $v - s$ variable nodes in $V \setminus S$; there are a further $c - j$ regular check nodes which are not neighbors of $S$; and the remaining $d + jr - sl$ available sockets can be combined and relegated to the super check node.

The bit erasure and block erasure probability can be expressed in a straightforward manner in terms of the quantities defined above. The decoder terminates in the unique maximal stopping set contained in the set of erased bits.
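For very small parameters the counting argument can be checked by exhaustive enumeration. The sketch below is illustrative only: it lists every constellation as an injective map from variable node sockets to check node sockets, finds the maximal stopping set by repeatedly peeling variable nodes that some regular check node sees exactly once, and tallies constellations by the size of that set. The socket-to-node convention (socket `i` belongs to variable node `i // l`, check socket `t` to regular check node `t // r`, with the last `d` sockets forming the super check node) and the names `maximal_stopping_set` and `census` are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial


def maximal_stopping_set(sigma, v, l, c, r):
    """Return the maximal stopping set of the constellation sigma.

    sigma[i] is the check node socket connected to variable node socket i.
    Sockets c*r and above belong to the super check node, which places no
    restriction on stopping sets.
    """
    remaining = set(range(v))
    while True:
        touched = {}
        for socket, target in enumerate(sigma):
            node = socket // l
            if node in remaining and target < c * r:
                touched.setdefault(target // r, []).append(node)
        # A regular check node with exactly one edge into the remaining set
        # rules its single neighbor out of every stopping set.
        peel = {nodes[0] for nodes in touched.values() if len(nodes) == 1}
        if not peel:
            return remaining
        remaining -= peel


def census(v, l, c, r, d):
    """Count constellations by the size of their maximal stopping set."""
    sizes = Counter()
    for sigma in permutations(range(c * r + d), v * l):
        sizes[len(maximal_stopping_set(sigma, v, l, c, r))] += 1
    return sizes
```

As a usage example, `census(1, 2, 2, 2, 0)` enumerates $\binom{4}{2} \cdot 2! = 12$ constellations: the 4 in which both edges of the single variable node land in the same check node contain a stopping set, and the other 8 are stopping-set free. The count for size zero plays the role of $N$ and the sum over positive sizes the role of $M$. Enumeration grows factorially, so this is only a sanity check for toy sizes.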
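If we are interested in the average fraction of erased bits remaining, then a maximal stopping set of size $s$ will cause $s$ erasures. If we are interested in the block erasure probability then each nonempty stopping set counts equally. From these observations the stated formulas for the erasure probabilities follow in a straightforward manner. For the second expression giving the block erasure probability we argue as follows: the quantity $1 - N(e, c, 0)/T(e, c, 0)$, in which $N$ counts the stopping-set-free constellations among all $T$ constellations on $e$ erased variable nodes,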
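is the probability that a randomly chosen subset of size $e$ contains a nonempty stopping set. If we multiply this quantity with the probability that the size of the erasure set is equal to $e$ and sum over all $e$ then we get the block erasure probability. We can simplify the expression by verifying that this quantity is equal to one once $e$ is sufficiently large.

Example 1: Consider the ensemble $\mathcal{C}(n, x^2, x^5)$. Fig. 6 shows $\mathbb{E}_{\mathcal{C}(n, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [5]$. Also shown is the limit $\mathbb{E}_{\mathcal{C}(\infty, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (cycle-free case). This limiting curve can be determined in the following way. Recall that the threshold $\epsilon^*$ associated to a degree distribution pair $(\lambda, \rho)$ can be characterized as⁵

$$\epsilon^*(\lambda, \rho) := \sup\{\epsilon \in [0, 1] : \epsilon \lambda(1 - \rho(1 - x)) < x \ \ \forall x \in (0, \epsilon]\}.$$

Assume now that the initial erasure probability $\epsilon$ is strictly above this threshold. In this case, the decoder will not terminate successfully and a fixed fraction of erasures will remain. To determine this fraction define $x(\epsilon)$, where $\epsilon \in (\epsilon^*, 1]$, as the largest solution in $(0, \epsilon]$ of

$$\epsilon \lambda(1 - \rho(1 - x)) = x.$$

In words, $x(\epsilon)$ is the erasure probability of the messages emitted from the variable nodes at the point where the decoder terminates. To this corresponds an erasure probability of the messages emitted by the check nodes of $1 - \rho(1 - x(\epsilon))$. From this quantity it is now easy to see that the corresponding bit erasure probability is equal to $\epsilon \Lambda(1 - \rho(1 - x(\epsilon)))$, where $\Lambda(x)$ is the variable node degree distribution from the node perspective. Therefore, the limiting curve is given in parametric form as the pair $\epsilon$ and $\epsilon \Lambda(1 - \rho(1 - x(\epsilon)))$, $\epsilon \in (\epsilon^*, 1]$.

For the specific example of the (3,6)-regular code it is more convenient to parameterize the curve by $x$ (instead of $\epsilon$). We know from [1] the threshold $\epsilon^*$ and the corresponding value $x^*$. From $\epsilon \lambda(1 - \rho(1 - x)) = x$ we get $\epsilon(x) = x / (1 - (1 - x)^5)^2$, so that the limit curve is given in parametric form by

$$\left( \frac{x}{(1 - (1 - x)^5)^2},\; x\,(1 - (1 - x)^5) \right), \qquad x \in [x^*, 1].$$

Fig. 6. $\mathbb{E}[P_b^{\mathrm{IT}}(G, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [5]$. Also shown is the limit $\mathbb{E}_{\mathcal{C}(\infty, x^2, x^5)}[P_b^{\mathrm{IT}}(G, \epsilon)]$ (cycle-free case).

⁵ Note that the range of $x$ in this definition can be chosen to be $x \in (0, 1]$ rather than $x \in (0, \epsilon]$ since for $x \in (\epsilon, 1]$ the inequality is automatically fulfilled if it is fulfilled for $x = \epsilon$.

B. Efficient Evaluation of Expressions

It is clear that the recursions stated in Theorem 2.1 quickly become impractical to evaluate as the block length grows (this is in fact the reason why in Fig. 6 we only depicted the curves up to length $n = 2^5$!). For the cases $l = 2$ or $l = 3$ the following recursions are substantially easier to evaluate. Fig. 7 shows the average block erasure probability for the ensemble considered there for block lengths $n = 2^i$, $i \in [10]$, as determined by the following expressions.

Fig. 7. $\mathbb{E}[P_B^{\mathrm{IT}}(G, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [10]$.

Theorem 2.2: Let two auxiliary counting functions be recursively defined by the recursion (2.6)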
with the boundary condition [...]. Define [...]. Then [...], and [...] otherwise.

The basic idea in deriving these recursions is simple although the details become quite cumbersome. Consider a constellation which does not contain a stopping set. Then it must contain at least one degree-one check node. Peel off this check node, i.e., remove it together with its connected variable node, any edges connected to these nodes and any further check nodes which, after removal of these edges, have degree zero. The result will be a smaller constellation which again does not contain a stopping set and so we can apply this procedure recursively. Reversing this procedure, we see that constellations which do not contain stopping sets can be built up one variable node at a time. This gives rise to the stated recursions. Some care has to be taken to make sure that we count each constellation only once since in general constellations might contain more than just one check node of degree one and so the same constellation can be constructed in general in many ways starting from suitable smaller constellations.

In the above recursions, the arguments denote the number of variable nodes of a constellation, the number of used check nodes, the number of check nodes of degree one, and the degree of the super check node. In more detail: Consider a stopping-set-free constellation which has a given number of variable nodes, uses a given number of check nodes, of which a given number have degree one, and is labeled by the standard label sets for the variable and check nodes, respectively. Consider the set of all such constellations and its cardinality. We will now describe how we can prune and grow constellations. This will give rise to the desired recursion.

Fix an element from this set. For each variable node, consider the number of its neighboring check nodes of degree one. We will call this number the multiplicity of the variable node. To prune an element of the set, pick a variable node of multiplicity at least one and delete it, together with its degree-one check node neighbors, from the constellation. The parameters of the new constellation change accordingly.
In order to make this constellation an element of the corresponding smaller set we have to ensure that its labeling is the standard one for the variable and check nodes, respectively. We do this in the natural way, i.e., for the pruned constellation all labels smaller than the label of the deleted variable node remain unchanged whereas all labels larger than it are decreased by one. An equivalent procedure is applied at the check node side where we have deleted as many nodes as the multiplicity of the deleted variable node. The above procedure can be inverted, i.e., if we start with this pruned constellation and add a variable node with the deleted label as well as check nodes with the deleted labels then we can recover our original constellation by connecting the edges in an appropriate way. Hereby, in adding, e.g., the variable node with a given label we have to increase all labels of variable nodes with labels equal to at least this label by exactly one and a similar remark applies for the check nodes.

Consider the subset of constellations which contain a particular variable node of a given multiplicity connected to particular degree-one check nodes. Now note that each element in this subset can be reconstructed in a unique way from an element of the smaller set by adding this variable node and these check nodes. It follows that a given element of the set can be reconstructed in exactly as many ways as the number of its variable nodes which have multiplicity at least one. Note that, by definition, the sum of the multiplicities of all variable nodes is equal to the number of check nodes of degree one. Therefore, the above statement can be rephrased in the following manner. If we weigh each reconstruction by the multiplicity of the inserted variable node then this weighted sum of reconstructions equals the number of degree-one check nodes times the cardinality of the set.

Consider now the recursion for $l = 2$ in more detail. Without much loss of generality we assume here that $d = 0$, i.e., that there is no super check node. The general case is a quite straightforward extension. On the left-hand side of the recursion we write the number of degree-one check nodes times the cardinality of the set, which by our remarks above is equal to the weighted sum of reconstructions. There are only three possible ways of reaching an element of the set by adding one variable node of degree two to a smaller stopping-set-free constellation. Consider the first case. Here we can choose the label of the new variable node and the label of the new check node in a number of ways fixed by the sizes of the respective label sets.
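Further, there are $r$ choices for the socket of the new degree-one check node and, as a moment's thought shows, a number of choices for the socket of the second edge that is fixed by the sockets still available. Next, look at the second case. As before, we can choose the label of the new variable node and the label of the new check node, and there are $r$ choices for the socket of the new check node. The second edge is now connected to a check node which previously had degree one, so we can choose one of these check nodes and further one out of its $r - 1$ free sockets. Finally, consider the third case, in which the new variable node has multiplicity two. As before, we can choose the label of the new variable node and the labels of the two new check nodes, and we have $r^2$ choices for the sockets of the two check nodes. Since we count weighted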
reconstructions we also have to add a factor of two. In summary we get the recursion [...].

We can simplify the above recursions by noting that several factors are common to all terms and only depend on the parameters of the constellation. This gives rise to the recursion stated in (2.6). Rather than explaining the case $l = 3$ in detail we refer the reader to [16], where the above approach has been generalized to arbitrary $l$ and a systematic derivation is given.

There are many more alternative ways in which expressions for the average block or bit erasure probability can be derived. We mention one which is particular to the case $l = 2$. Note that in this case, a stopping-set-free constellation cannot contain a double edge, i.e., each variable node connects two distinct check nodes. Therefore, stopping-set-free constellations can be represented as regular graphs, whose nodes are the check nodes of the bipartite graph and whose edges are in one-to-one correspondence with variable nodes of the bipartite graph. A moment's thought now shows that stopping-set-free constellations on the bipartite graph correspond to a forest in the corresponding regular graph. We can, therefore, equivalently count the number of forests, where each node in the regular graph has degree at most $r$ and where sockets and edges are labeled.

III. ML DECODING

It is instructive to compare the performance of an LDPC ensemble under iterative decoding to that of the same LDPC ensemble under ML decoding as well as the performance of the standard random ensemble under ML decoding. The reason for our interest in these quantities is that they indicate how much of the performance loss of iterative coding systems is due to the choice of codes and how much is due to the choice of the suboptimal decoding algorithm. We note that we assume an ML decoder which determines all those bits which are uniquely specified by the channel observations but does not break ties and therefore we will deal with true erasure probabilities rather than error probabilities.

A. Standard Random Ensemble Under ML Decoding

Theorem 3.1: Consider the ensemble of binary linear codes of length $n$ and dimension $k$ defined by means of their $(n-k) \times n$ parity-check matrix $H$, where each entry of $H$ is an independent and identically distributed (i.i.d.) Bernoulli random variable with parameter one-half. Let $P_b^{\mathrm{ML}}(H, \epsilon)$ denote the bit erasure probability of a particular code defined by the parity-check matrix $H$ when used to transmit over a BEC with erasure probability $\epsilon$ and when decoded by an ML decoder. Let $P_B^{\mathrm{ML}}(H, \epsilon)$ denote the corresponding block erasure probability. Then

$$\mathbb{E}[P_b^{\mathrm{ML}}(H, \epsilon)] = \sum_{e=1}^{n} \binom{n}{e} \epsilon^e (1-\epsilon)^{n-e}\, \frac{e}{n} \sum_{j} M(n-k, e-1, j)\, 2^{\,j - e(n-k)} \qquad (3.1)$$

$$\mathbb{E}[P_B^{\mathrm{ML}}(H, \epsilon)] = \sum_{e=0}^{n} \binom{n}{e} \epsilon^e (1-\epsilon)^{n-e} \left(1 - M(n-k, e, e)\, 2^{-e(n-k)}\right) \qquad (3.2)$$

where $M(l, m, j)$ is the number of binary $l \times m$ matrices of rank $j$ (so that $M(n-k, e, e) = 0$ for $e > n-k$). An enumeration is given in Appendix A.

Proof: First consider the block erasure probability. Let $\mathcal{E}$ denote the set of erasures and let $H_{\mathcal{E}}$ denote the submatrix of $H$ which consists of those columns of $H$ which are indexed by $\mathcal{E}$. In a similar manner, let $x_{\mathcal{E}}$ denote those components of the codeword $x$ which are indexed by $\mathcal{E}$. From the defining equation $H x^T = 0$ we conclude that

$$H_{\mathcal{E}}\, x_{\mathcal{E}}^T = H_{\bar{\mathcal{E}}}\, x_{\bar{\mathcal{E}}}^T \qquad (3.3)$$

where $\bar{\mathcal{E}} = [n] \setminus \mathcal{E}$. Now note that if $x$ denotes the transmitted codeword and $\mathcal{E}$ denotes the set of erasures then $s^T := H_{\bar{\mathcal{E}}}\, x_{\bar{\mathcal{E}}}^T$, the right-hand side of (3.3), is known to the receiver. In standard terminology, $s$ is called the syndrome. Consider now the equation $H_{\mathcal{E}}\, x_{\mathcal{E}}^T = s^T$. Since, by assumption, $x$ is a valid codeword, we know that this equation has at least one solution. It has multiple solutions, i.e., the ML decoder will not be able to recover the codeword uniquely, iff $H_{\mathcal{E}}$ has rank less than $|\mathcal{E}|$. From (A1) we know that, with $e = |\mathcal{E}|$, this happens with probability $1 - M(n-k, e, e)\, 2^{-e(n-k)}$ if $e \le n-k$, and with probability one otherwise. From this, (3.2) follows in a straightforward manner.

Next consider the bit erasure probability. We claim that a bit $x_i$, $i \in \mathcal{E}$, cannot be recovered by an ML decoder iff $h_i$, the $i$th column of $H$, is an element of the space spanned by the columns of $H_{\mathcal{E} \setminus \{i\}}$. To see this we argue as follows. Write the basic equation in the form

$$H_{\mathcal{E} \setminus \{i\}}\, x_{\mathcal{E} \setminus \{i\}}^T = s^T + x_i h_i.$$

Since, by assumption, $x$ is a codeword we know that there is at least one choice of $x_i$ such that this set of equations has solutions. The ML decoder will not be able to determine $x_i$ if this equation has solutions for both choices of $x_i$. But this happens iff $h_i$ is contained in the column space spanned by $H_{\mathcal{E} \setminus \{i\}}$, as claimed. From (A1) we know that the probability that $H_{\mathcal{E} \setminus \{i\}}$ has rank $j$ is equal to $M(n-k, e-1, j)\, 2^{-(e-1)(n-k)}$. Further, assuming that $H_{\mathcal{E} \setminus \{i\}}$ has rank $j$, the probability that $h_i$ is an element of the space spanned by the columns of $H_{\mathcal{E} \setminus \{i\}}$ is equal to $2^{\,j-(n-k)}$. From these two observations (3.1) follows easily.

Example 2: Fig. 8 shows $\mathbb{E}[P^{\mathrm{ML}}(H, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [10]$. Also shown is the union bound which is derived in Appendix B. As we can see, for increasing lengths the union bound expressions become more and more accurate.

Fig. 8. $\mathbb{E}[P^{\mathrm{ML}}(H, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [10]$ (solid curves). Also shown is the union bound (dashed curves). As we can see, for increasing lengths the union bound expressions become more and more accurate.

B. LDPC Ensemble Under ML Decoding

We have so far not succeeded in deriving exact expressions for the ML performance of LDPC ensembles. From the previous section though one can see that the union bound on the ML erasure probability for the standard random ensemble is reasonably tight except for very short lengths. Therefore, it is meaningful to derive the union bound of the ML performance of LDPC ensembles as well. This is done in Appendix B. Stronger bounds, especially away from the error floor region, can be obtained using more powerful techniques, see, e.g., [4].

Example 3: Fig. 9 shows the union bound on the quantity $\mathbb{E}[P^{\mathrm{ML}}(H, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [10]$.

Fig. 9. Union bound on the quantity $\mathbb{E}[P^{\mathrm{ML}}(H, \epsilon)]$ as a function of $\epsilon$ for $n = 2^i$, $i \in [10]$.

IV. INTERPRETATION

In comparing Fig. 7 with Fig. 9 (assuming that the shown union bound is reasonably tight) and Fig. 6 with Fig. 8, we see, at least for the ensemble considered, that most of the performance loss is due to the structure of the codes themselves. Notice that for this ensemble the performance under iterative decoding is only slightly worse (at least in the "error floor region") than the performance under ML decoding. In particular, even under ML decoding the curves show an error floor region which is so characteristic of iterative coding systems. We remark that this effect is even more pronounced since we look here at block error curves. The corresponding bit error curves would show this error floor to a lesser degree.

V.
OUTLOOK

Although the exact characterizations of the average bit and block erasure probabilities given in this paper are quite encouraging, much work remains to be done. We briefly gather here what we consider to be the most interesting open problems.

1) In Fig. 1 we see that the individual bit erasure curves fall into two categories. There is one curve which shows a fairly pronounced error floor, whereas all other curves exhibit a much steeper slope. In the region where the individual curves diverge, the ensemble average is to a large degree dominated by those "bad" graphs. This suggests that one can define an expurgated ensemble and that the concentration of the individual behavior around the average of this expurgated ensemble holds down to much lower erasure probabilities. The question is how to find a suitable definition of such an expurgated ensemble and whether one can still find the ensemble average of the expurgated ensemble. Some progress in this direction has been made in [10].

2) The exact evaluation of the average bit and block erasure probabilities is, in general, a nontrivial task and it would be highly desirable to find simpler expressions. It is particularly frustrating that not even the simple recursion for the cycle code case seems amenable to an analytic attack. For example, if we try the obvious path employing generating functions, the resulting partial differential equation does not seem to admit an analytic solution. Simple bounds on these quantities would also be useful.

3) Once simpler expressions for the regular case have been found, it is natural to investigate if exact expressions can also be given for the irregular case.

4) These expressions can then be used to find the optimum degree distribution pairs for a given length.

5) Find exact expressions for the bit and block erasure probability for LDPC ensembles under ML decoding. Comparing then the expressions for the iterative decoding of
LDPC ensembles with the ones for the ML decoding of LDPC ensembles and the ones for the ML decoding of standard random ensembles it will then be possible to assess how much loss is due to the structure of the codes and how much loss is due to the suboptimum decoding. A related but simpler problem is to find the threshold for LDPC codes below which the block erasure probability can be made arbitrarily small. It should be interesting to see for which codes the thresholds for bit and block erasure probability are different and for which they are the same. Some partial answers to the last question can be found in [10].

6) Find exact expressions for the bit and block error probability of LDPC ensembles under iterative decoding for more general channels.

7) Apply the same analysis to other ensembles, e.g., repeat accumulate (RA) ensembles [2].

8) In this paper, we assumed that the decoder proceeds until no further progress is achieved. What is the distribution of the number of required iterations? Also, since measurements by MacKay and Kanter have indicated that the distribution of the number of required iterations has slowly decaying tails it is interesting to see how the error probabilities behave if we perform a fixed number of iterations.

APPENDIX A
FULL RANK MATRICES

Lemma A.1: Let $M(l, m, k)$ denote the number of binary matrices of dimension $l \times m$ and rank $k$. By symmetry $M(l, m, k) = M(m, l, k)$. For $l \le m$,

$$M(l, m, k) = \begin{cases} 1, & k = 0 \\ \prod_{i=0}^{l-1} (2^m - 2^i), & k = l \\ M(l-1, m, k)\, 2^k + M(l-1, m, k-1)\,(2^m - 2^{k-1}), & 0 < k < l \\ 0, & \text{otherwise.} \end{cases} \qquad (A1)$$

Proof: Clearly, there is exactly one matrix of rank zero, namely, the all-zero matrix, so that $M(l, m, 0) = 1$. Next, note that $M(1, m, 1) = 2^m - 1$, since any nonzero binary vector of length $m$ forms a $1 \times m$ matrix of rank one. Further by induction, any $(l-1) \times m$ matrix of rank $l - 1$ can be extended to an $l \times m$ matrix of rank $l$ in exactly $2^m - 2^{l-1}$ ways, and conversely, any $l \times m$ matrix of rank $l$ can be mapped to a unique $(l-1) \times m$ matrix of rank $l - 1$ by deleting the last row. It follows that $M(l, m, l) = M(l-1, m, l-1)\,(2^m - 2^{l-1})$ and, therefore, that $M(l, m, l) = \prod_{i=0}^{l-1} (2^m - 2^i)$. Finally, to prove the recursion we argue as follows. Consider the number of matrices of dimension $l \times m$ and rank $k$. Split these matrices into those matrices such that after deletion of the last row the resulting matrices of dimension $(l-1) \times m$ have rank $k$ and those that have rank $k - 1$. The first such group has by definition cardinality $M(l-1, m, k)$ and each element in this group can be extended to a matrix of rank $k$ in exactly $2^k$ distinct ways. The second group has cardinality $M(l-1, m, k-1)$ and each element in this group can be extended to a matrix of rank $k$ in exactly $2^m - 2^{k-1}$ distinct ways.

APPENDIX B
UNION BOUNDS

It is useful to derive union bounds on the block and bit erasure probabilities of the standard random ensemble as well as for LDPC ensembles under ML decoding. We start with the standard random ensemble.

A. Random Ensembles

Lemma B.1 [Union Bound for Standard Random Ensembles Under ML Decoding]: [...]

Proof: First note that [...]. Therefore, [...]
where $w(\cdot)$ denotes the weight of its argument, from which the block erasure probability follows in a straightforward manner.

B. LDPC Ensembles

In exactly the same manner we can derive bounds on the erasure probabilities for LDPC codes under ML decoding.

Lemma B.2 [Union Bound for LDPC Codes Under ML Decoding]: [...]

Proof: We have [...]

ACKNOWLEDGMENT

The authors wish to thank Igal Sason and the reviewers for their many helpful comments on an earlier draft of this paper.

REFERENCES

[1] L. Bazzi, T. Richardson, and R. Urbanke, "Exact thresholds and optimal codes for the binary symmetric channel and Gallager's decoding algorithm A," IEEE Trans. Inform. Theory, 1999, to be published.
[2] D. Divsalar, H. Jin, and R. McEliece, "Coding theorems for 'turbo-like' codes," in Proc. 1998 Allerton Conf., 1998, p. 210.
[3] R. Gallager, "Low-density parity-check codes," IRE Trans. Inform. Theory, vol. IT-8, pp. 21-28, Jan. 1962.
[4] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA: MIT Press, 1963.
[5] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Analysis of low density codes and improved designs using irregular graphs," in Proc. 30th Annu. ACM Symp. Theory of Computing, 1998, pp. 249-258.
[6] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Improved low-density parity-check codes using irregular graphs and belief propagation," in Proc. 1998 IEEE Int. Symp. Information Theory, 1998, p. 117.
[7] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann, "Practical loss-resilient codes," in Proc. 29th Annu. ACM Symp. Theory of Computing, 1997, pp. 150-159.
[8] D. J. C. MacKay, "Good error correcting codes based on very sparse matrices," IEEE Trans. Inform. Theory, vol. 45, pp. 399-431, Mar. 1999.
[9] T. Richardson, A. Shokrollahi, and R. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 619-637, Feb. 2001.
[10] T. Richardson, A. Shokrollahi, and R. Urbanke, "Error floor analysis of various low-density parity-check ensembles for the binary erasure channel," submitted to IEEE Int. Symp. Information Theory, Lausanne, 2002.
[11] T. Richardson and R.
Urbanke, "The capacity of low-density parity check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, Feb. 2001.
[12] T. Richardson and R. Urbanke, "Efficient encoding of low density parity check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 638-656, Feb. 2001.
[13] A. Shokrollahi, "New sequences of linear time erasure codes approaching the channel capacity," in Proc. 13th Conf. Applied Algebra, Error Correcting Codes, and Cryptography (Lecture Notes in Computer Science). New York: Springer-Verlag, 1999, pp. 65-76.
[14] A. Shokrollahi, "Capacity-achieving sequences," in Codes, Systems, and Graphical Models, B. Marcus and J. Rosenthal, Eds. New York: Springer-Verlag, 2000, vol. 123, IMA Volumes in Mathematics and its Applications, pp. 153-166.
[15] A. Shokrollahi and R. Storn, "Design of efficient erasure codes with differential evolution," in Proc. Int. Symp. Information Theory, Sorrento, 2000.
[16] J. Zhang and A. Orlitsky, "Finite length analysis of LDPC codes with large left degrees," submitted to IEEE Int. Symp. Information Theory, Lausanne, Switzerland, 2002.
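The enumeration of binary matrices by rank in Lemma A.1 of Appendix A can be cross-checked numerically. The sketch below is an illustration, not the authors' code. It assumes the recursion reads $M(l, m, k) = 2^k M(l-1, m, k) + (2^m - 2^{k-1}) M(l-1, m, k-1)$, which is the form the row-extension argument in the proof leads to, and compares it with a brute-force count that uses Gaussian elimination over GF(2). The names `num_rank`, `gf2_rank`, and `brute_force` are hypothetical.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def num_rank(l, m, k):
    """Number of binary l x m matrices of rank k (assumed recursion)."""
    if k < 0 or k > min(l, m):
        return 0
    if k == 0:
        return 1  # only the all-zero matrix has rank zero
    # Append a last row to an (l-1) x m matrix: the rank stays k if the new
    # row lies in the row space (2^k choices), and rises from k-1 to k if it
    # lies outside the row space (2^m - 2^(k-1) choices).
    return (num_rank(l - 1, m, k) * 2 ** k
            + num_rank(l - 1, m, k - 1) * (2 ** m - 2 ** (k - 1)))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit serves as the pivot column
            rows = [row ^ pivot if row & low else row for row in rows]
    return rank


def brute_force(l, m, k):
    """Count l x m binary matrices of rank k by enumerating all of them."""
    return sum(1 for rows in product(range(2 ** m), repeat=l)
               if gf2_rank(rows) == k)
```

Dividing `num_rank(n - k, e, e)` by $2^{e(n-k)}$ gives the probability that a uniformly random $(n-k) \times e$ binary matrix has full column rank, which is the event that an ML decoder recovers all $e$ erasures in the setting of Theorem 3.1.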