An Algorithm for Computing Nucleic Acid BasePairing Probabilities Including Pseudoknots


ROBERT M. DIRKS,1 NILES A. PIERCE2
1 Department of Chemistry, California Institute of Technology, Pasadena, California
2 Departments of Applied & Computational Mathematics and Bioengineering, California Institute of Technology, Mail Code , Pasadena, California

Received 21 January 2004; Accepted 19 March 2004. DOI /jcc. Published online in Wiley InterScience (www.interscience.wiley.com).

Abstract: Given a nucleic acid sequence, a recent algorithm allows the calculation of the partition function over secondary structure space including a class of physically relevant pseudoknots. Here, we present a method for computing base-pairing probabilities starting from the output of this partition function algorithm. The approach relies on the calculation of recursion probabilities that are computed by backtracking through the partition function algorithm, applying a particular transformation at each step. This transformation is applicable to any partition function algorithm that follows the same basic dynamic programming paradigm. Base-pairing probabilities are useful for analyzing the equilibrium ensemble properties of natural and engineered nucleic acids, as demonstrated for a human telomerase RNA and a synthetic DNA nanostructure.

Wiley Periodicals, Inc. J Comput Chem 25: , 2004

Key words: DNA; RNA; base-pairing probabilities; partition function; pseudoknots

Introduction

Thermodynamic models based on nucleic acid secondary structure and nearest-neighbor identities [1-5] underlie dynamic programming algorithms for predicting the minimum energy secondary structure [6-10] and calculating the partition function over secondary structure space. In their original forms, these algorithms exclude the possibility of pseudoknots, a biologically relevant class of secondary structures [13] that also arises in DNA nanotechnology applications [14,15]. Pseudoknots result when two base pairs i·j and d·e, with $i < d$, fail to satisfy the nesting property $i < d < e < j$ (see, e.g., Fig. 1). Recent extensions of the structure prediction and partition function [18] algorithms allow the inclusion of certain pseudoknots.
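The nesting property can be checked mechanically. The sketch below is illustrative only and is not taken from the article: the function names `crossing_pairs` and `is_pseudoknotted` are hypothetical, bases are numbered from 1, and two pairs that are sequentially disjoint (j < d) are treated as unknotted, so only the crossing arrangement i < d < j < e is reported.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (d, e) with i < d < j < e.

    `pairs` is an iterable of 2-tuples of 1-based positions; the order of
    the two positions inside a tuple does not matter.
    """
    # Put the 5' base first in each pair, then sort so that i <= d below.
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (d, e) in combinations(normalized, 2):
        # Nested pairs satisfy i < d < e < j and disjoint pairs satisfy
        # j < d; only the interleaved case counts as a crossing.
        if i < d < j < e:
            crossings.append(((i, j), (d, e)))
    return crossings


def is_pseudoknotted(pairs):
    """True if at least two base pairs violate the nesting property."""
    return bool(crossing_pairs(pairs))
```

A structure containing pairs 1·10 and 5·15 is flagged as pseudoknotted, whereas a stem 1·10, 2·9 followed by a separate pair 12·20 is not.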
For an ensemble of secondary structures $s \in \Omega$, the partition function $Q = \sum_{s \in \Omega} e^{-\Delta G(s)/RT}$ may be used to compute the probability

$$p(s^*) = \frac{1}{Q}\, e^{-\Delta G(s^*)/RT} \qquad (1)$$

that secondary structure $s^*$ is sampled at thermodynamic equilibrium. The ensemble equilibrium can also be characterized by the matrix of base-pairing probabilities with entries $p_{i,j}$ corresponding to the probability that base i is paired with base j in $\Omega$. McCaskill's original article [11] defines elegant dynamic programs to compute the partition function and base-pairing probabilities over the ensemble of unpseudoknotted secondary structures. The partition function algorithm builds up recursively from short subsequences to the full strand, and then the pair probabilities are computed by working backwards to short subsequences using intermediate results from the partition function calculation. In the absence of pseudoknots, the partition function algorithm is sufficiently succinct that McCaskill is able to determine the form of the pair probability backtrack algorithm simply by considering the few possible forms of enclosing secondary structure for any given base pair. Although this approach is simple and efficient, it is not easily

Correspondence to: Niles A. Pierce; e-mail:
Contract/grant sponsor: NSF graduate research fellowship (R.M.D.).
Contract/grant sponsor: Defense Advanced Research Projects Agency (DARPA) and Air Force Research Laboratory under F (N.A.P.).
Contract/grant sponsor: Ralph M. Parsons Foundation (N.A.P.).
Contract/grant sponsor: Charles Lee Powell Foundation (N.A.P.)
Wiley Periodicals, Inc.
1296 Dirks and Pierce Vol. 25, No. 10 Journal of Computational Chemistry

description in the same notation). [The complexity may be reduced to $O(N^3)$ by exploiting the formulation of the nearest-neighbor energy model for long interior loops [18,21].] Partition function recursions are nonredundant in the sense that every secondary structure in the ensemble is visited exactly once using a unique sequence of recursions. The algorithm computes the partition function $Q_{i,j}$ for each subsequence [i, j] ignoring all bases exterior to [i, j], starting from subsequences of length $l = 1$ and building up incrementally to $l = N$. The recursions that define $Q_{i,j}$ rely on additional restricted partition functions $Q^b_{i,j}$ and $Q^m_{i,j}$. $Q^b_{i,j}$ represents the partition function for subsequence [i, j] given that i and j are base paired and $Q^m_{i,j}$ is used to calculate multiloop contributions. At the end of the recursive process, the full partition function Q is given by $Q_{1,N}$ and the values of $Q_{i,j}$, $Q^b_{i,j}$, $Q^m_{i,j}$ are stored in matrices for $1 \le i, j \le N$. These intermediate results will play a critical role in the new algorithm described below.

Recursion Probabilities

Figure 1. Secondary structures of competing pseudoknot and hairpin constructs in human telomerase RNA. The wildtype sequence is shown. For the two-point mutant implicated in dyskeratosis congenita, GC is replaced by AG in the shaded boxes, disrupting two base pairs in the pseudoknot construct. For the experimental studies of the hairpin structure [20], the 18 nucleotides at the 3′ end are excluded to prevent formation of the pseudoknot.

generalizable to algorithmic extensions, such as the inclusion of pseudoknots. Here, we describe a general method for mechanically transforming the new pseudoknot partition function algorithm [18] to compute recursion probabilities, which can be used in turn to compute base-pairing probabilities. The transformation approach is generalizable to any future partition function extensions that follow the same dynamic programming paradigm. Base-pairing probabilities assist in the analysis of biologically relevant pseudoknots. Here, we examine human telomerase RNA, which exists at equilibrium in both hairpin and pseudoknotted forms.
[19] A two-point mutation, implicated in the disease dyskeratosis congenita, alters the thermodynamic balance between these competing structures [20]. This shift in equilibrium is clearly identifiable when the base-pairing probabilities for the two sequences are compared. Base-pairing probabilities that permit pseudoknots are also useful in analyzing synthetic DNA nanostructures [14,15].

Following the execution of the partition function calculation, a second algorithm can be implemented to calculate probability matrices, $P$, $P^b$, $P^m$, corresponding to the $Q$, $Q^b$, $Q^m$ matrices. The values stored in these P-type matrices will be termed recursion probabilities. Recursion probabilities can be intuitively described as follows. Consider sampling the ensemble of secondary structures $s \in \Omega$ where the probability of selecting structure $s^*$ is given by the Boltzmann probability (1). For each secondary structure $s^*$, the contribution to Q is computed by a unique recursion sequence involving specific $Q_{i,j}$, $Q^b_{i,j}$, and $Q^m_{i,j}$ intermediates. Associating these intermediates with structure $s^*$, the recursion probability $P_{i,j}$, $P^b_{i,j}$ or $P^m_{i,j}$ corresponds to the probability that the sampled structure $s^*$ requires the use of the corresponding intermediate $Q_{i,j}$, $Q^b_{i,j}$ or $Q^m_{i,j}$ to calculate the partition function contribution. Recent work by Ding and Lawrence [22] exploits quantities related to recursion probabilities to statistically sample the distribution of unpseudoknotted secondary structures for a given sequence. Here, we develop a general approach for computing P-type ma-

Algorithm

For clarity, we begin by considering the class of secondary structures excluding pseudoknots and then address the additional complexity that arises when pseudoknots are introduced.

Partition Function Recursions

For a strand of length N, the partition function may be computed over all unpseudoknotted secondary structures in $O(N^4)$ using the algorithm [10,11] summarized in Figure 2 (see ref. 18 for a detailed

Figure 2. $O(N^4)$ partition function algorithm that excludes pseudoknots.
Algorithm for Computing Nucleic Acid Base-Pairing Probabilities 1297

Figure 3. Recursion diagram corresponding to recursive update (2), depicting the addition to $Q_{i,j}$ of partition function contributions for those structures with rightmost base pair d·e. See ref. 18 for a thorough description of the partition function algorithm (with or without pseudoknots) in terms of recursion diagrams.

trices given a set of Q-type matrices and corresponding partition function recursions. An algorithm for computing recursion probabilities can be formulated in a mechanical way starting from a set of partition function recursions. The crux of this formulation is the repeated application of a single transformation to the partition function code. In particular, updates of the form

$$Q_{i,j} \mathrel{+}= Q_{i,d-1}\, Q^b_{d,e} \qquad (2)$$

(equivalent to the recursion diagram of Fig. 3) are converted to the following series of statements

Starting from the partition function algorithm of Figure 2, the recursion probability algorithm is obtained by performing three modifications: (1) the two outermost loops are altered so that the algorithm starts with the full strand of length $l = N$ and decrements down to subsequences of length $l = 1$; (2) all recursive updates are transformed as in (3); (3) the order of the recursion blocks ($Q^b$, [$Q$, $Q^m$]) is reversed ([$P$, $P^m$], $P^b$). This last modification is necessary because the recursion order in the partition function algorithm ensures that if one quantity (e.g., $Q_{i,j}$) recurses to another quantity of the same length (e.g., $Q^b_{i,j}$) then the lower level quantity (i.e., $Q^b_{i,j}$) is calculated first. The reverse ordering is needed for the recursion probability algorithm, because $P^b_{i,j}$ cannot be used until it has been fully computed in the $P_{i,j}$ loop. The pseudocode in Figure 4 details the outcome of these transformations for the unpseudoknotted case. This modified algorithm reverses the flow of the partition function calculation and incrementally determines all recursion probabilities (frequencies of families of structures), based on the probabilities of all superstructures that directly contain them.
$$\begin{aligned} p &= P_{i,j}\, Q_{i,d-1}\, Q^b_{d,e} / Q_{i,j} && \text{(conditional probability)} \\ P_{i,d-1} &\mathrel{+}= p \\ P^b_{d,e} &\mathrel{+}= p \end{aligned} \qquad (3)$$

Specifically, the right-hand side (RHS) of each recursive update is divided by the left-hand side (LHS), and the P term corresponding to the new denominator is multiplied by this quotient. The resulting probabilities, temporarily stored as p, are subsequently added to every P-type value corresponding to the Q-type terms on the RHS of the original statement (2). To understand this transformation, recall that $Q_{i,j}$, $Q^m_{i,j}$ and $Q^b_{i,j}$ are partition functions for structural subclasses of the full sequence. In recursive updates such as (2), the ratio of the RHS to the fully computed LHS corresponds to the probability that a structure drawn from an equilibrium ensemble defined by the LHS partition function is in the subensemble defined by the RHS partition function. As an example, transformation (3) states that for any i, d, e, j, the structures represented by $Q_{i,j}$ partially consist of substructures represented by $Q_{i,d-1}$ and $Q^b_{d,e}$. Consequently, once the probability $P_{i,j}$ is determined, it can be used to augment $P_{i,d-1}$ and $P^b_{d,e}$ because the frequencies of the corresponding substructures within the $Q_{i,j}$ ensemble can be derived from $Q_{i,d-1}$ and $Q^b_{d,e}$. By backtracking through the partition function algorithm and transforming all recursive updates analogously to (3), probabilities can be calculated for each recursion.

Once recursion probabilities are computed for all i and j, the base-pairing probability $p_{i,j}$ is simply $P^b_{i,j}$, because $Q^b_{i,j}$ is associated with every structure $s \in \Omega$ in which i·j appears, and i·j is associated with exactly one $Q^b_{i,j}$. By starting from a more complicated $O(N^3)$ partition function algorithm [18,21], the computational complexity of the recursion probability algorithm can also be reduced to $O(N^3)$ as described in the Appendix.

Pseudoknots

The procedure outlined above for obtaining recursion probability algorithms is equally applicable to a new partition function algorithm that includes pseudoknots (see the pseudocode of Fig. 21 in ref. 18). For the unpseudoknotted algorithm, all base pairs stem

Figure 4. $O(N^4)$ recursion probability algorithm that excludes pseudoknots.
For simplicity, we omit details such as checking for updates with zero in the denominator (in which case the numerator will also evaluate to zero and the expression should be skipped).
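The conversion of update (2) into statements (3) can be tried out on a deliberately simplified model. The sketch below is not the algorithm of Figure 4: it assumes a toy energy model in which every base pair contributes a fixed, made-up free energy (`PAIR_ENERGY`), hairpins need at least `MIN_LOOP` unpaired bases, and loop energies and the multiloop matrix $Q^m$ are ignored. All names are hypothetical. The forward pass uses the rightmost-pair update (2), the backward pass applies (3) with the block order reversed, and a brute-force enumeration of all unpseudoknotted structures is included so the result $p_{i,j} = P^b_{i,j}$ can be checked directly.

```python
import math

RT = 0.6163        # kcal/mol near 37 °C
MIN_LOOP = 3       # minimum number of unpaired bases in a hairpin (assumed)
PAIR_ENERGY = {    # made-up free energies per base pair, kcal/mol
    ("G", "C"): -3.0, ("C", "G"): -3.0, ("A", "U"): -2.0,
    ("U", "A"): -2.0, ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def pair_weight(seq, i, j):
    """Boltzmann weight of pair i.j (1-based); 0 if the pair is not allowed."""
    if j - i - 1 < MIN_LOOP:
        return 0.0
    dg = PAIR_ENERGY.get((seq[i - 1], seq[j - 1]))
    return 0.0 if dg is None else math.exp(-dg / RT)


def partition(seq):
    """Forward pass: Q[i][j] and Qb[i][j], from short to long subsequences."""
    n = len(seq)
    Q = [[0.0] * (n + 2) for _ in range(n + 2)]
    Qb = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 2):
        Q[i][i - 1] = 1.0                      # empty subsequence
    for l in range(1, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            Qb[i][j] = pair_weight(seq, i, j) * Q[i + 1][j - 1]
            Q[i][j] = 1.0                      # structure with no pairs
            for d in range(i, j + 1):
                for e in range(d + 1, j + 1):
                    Q[i][j] += Q[i][d - 1] * Qb[d][e]   # update (2)
    return Q, Qb


def pair_probabilities(seq):
    """Backward pass: recursion probabilities P and Pb, long to short."""
    n = len(seq)
    Q, Qb = partition(seq)
    P = [[0.0] * (n + 2) for _ in range(n + 2)]
    Pb = [[0.0] * (n + 2) for _ in range(n + 2)]
    P[1][n] = 1.0                              # the full strand is always used
    for l in range(n, 0, -1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            if P[i][j] > 0.0:                  # P block first (reversed order)
                for d in range(i, j + 1):
                    for e in range(d + 1, j + 1):
                        p = P[i][j] * Q[i][d - 1] * Qb[d][e] / Q[i][j]
                        P[i][d - 1] += p       # statements (3)
                        Pb[d][e] += p
            if Qb[i][j] > 0.0:                 # then the Pb block
                # Qb has a single term in this toy model, so RHS/LHS is 1.
                p = Pb[i][j] * pair_weight(seq, i, j) * Q[i + 1][j - 1] / Qb[i][j]
                P[i + 1][j - 1] += p
    return Pb, Q[1][n]


def structures(seq, i, j):
    """Yield (weight, pairs) for every unpseudoknotted structure on [i, j]."""
    if i > j:
        yield 1.0, ()
        return
    yield from structures(seq, i + 1, j)       # base i unpaired
    for k in range(i + 1, j + 1):              # base i paired with base k
        w = pair_weight(seq, i, k)
        if w > 0.0:
            for w1, p1 in structures(seq, i + 1, k - 1):
                for w2, p2 in structures(seq, k + 1, j):
                    yield w * w1 * w2, ((i, k),) + p1 + p2


def brute_force_pair_probabilities(seq):
    """Reference values obtained by summing Boltzmann weights explicitly."""
    weighted = list(structures(seq, 1, len(seq)))
    Q = sum(w for w, _ in weighted)
    probs = {}
    for w, pairs in weighted:
        for pair in pairs:
            probs[pair] = probs.get(pair, 0.0) + w / Q
    return probs, Q
```

For a short sequence such as `"GGGAAAUCCC"`, `pair_probabilities` and `brute_force_pair_probabilities` should agree to floating-point precision, which is a convenient sanity check when transforming a larger set of recursions by hand.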
from $Q^b$ recursions, so the values stored in $P^b$ are precisely the desired probabilities (i.e., $p_{i,j} = P^b_{i,j}$). For the pseudoknotted case, $P^b_{i,j}$ only gives the probability that i and j form a nested pair. The full base-pairing probability must also take into consideration those base pairs that are nonnested and lead to pseudoknotted structures (termed gap-spanning pairs in ref. 18). For these gap-spanning pairs, there is no single recursion probability that represents the contribution to $p_{i,j}$. However, this contribution may be succinctly represented in terms of Q-type and P-type matrices for the full pseudoknotted algorithm. A new set of quantities, $P^{bg}_{i,j}$, will be used to store the base-pairing probabilities of i·j gap-spanning pairs in pseudoknots. The most pertinent recursion probability, $P^g_{i,d,e,j}$, stores the probability of a gap structure with outer gap-spanning pair i·j and inner gap-spanning pair d·e corresponding to the partition function recursion $Q^g_{i,d,e,j}$ (see Fig. 19 in ref. 18). Due to the structure of the $Q^g_{i,d,e,j}$ recursion, the sum of $P^g_{i,d,e,j}$ over all values of d, e precisely gives the probability of an outer pair i·j

$$P^{bg}_{i,j} = \sum_{d,e} P^g_{i,d,e,j}. \qquad (4)$$

However, the sum of $P^g_{i,d,e,j}$ over all values of i, j does not give the probability of an inner pair d·e, because the same inner pair may be present in multiple recursions required to define the same secondary structure. To correctly determine the probabilities of inner gap-spanning pairs, only the portion of $P^g$ that corresponds to calling $Q^g$ directly from $Q^{gl}$ should be included

$$P^{bg}_{d,f} = \sum_{i,e,j} P^{gl}_{i,e,f,j}\, Q^g_{i,d,f,j}\, Q^z_{d+1,e}\, \exp(-\beta_2/RT) / Q^{gl}_{i,e,f,j}. \qquad (5)$$

Here, $Q^{gl}$ and $Q^z$ are partition function recursions used to define the interior structure of a pseudoknot, and $\beta_2$ is a pseudoknot energy parameter (see Figs. 18 and 12 in ref. 18). Allowing pseudoknots, the total probability of a base pair i·j is then $p_{i,j} = P^b_{i,j} + P^{bg}_{i,j}$. Pseudocode detailing the algorithm for computing recursion probabilities in the pseudoknotted case is provided in Figure 5, where the calculation of $P^{bg}_{i,j}$ using (4) and (5) has been embedded at little additional cost.
[Note that (4) and (5) use different indices for $P^{bg}$ to maintain consistency with the pseudocode.] In the Appendix, we describe how to reduce the complexity of the pseudoknotted algorithm from $O(N^6)$ to $O(N^5)$.

Methods

The standard energy model [4] and pseudoknot extension [18] are implemented as described previously [18], including dangle energies and penalties for helices not terminated by a G·C pair. These terms do not change the structure of the recursions described in the pseudocode and are omitted for clarity. Coaxial stacking contributions are not included in the physical model, as it is unclear how to treat different stackings associated with the same secondary structure in the context of the partition function. To maintain consistency with a recent design study [23], dangle energies are treated analogously to the d2 option in the Vienna package [10]. Following this approach, dangle energies are included even if two helices are separated by one or zero bases, providing some compensation for the neglect of coaxial stacking bonuses.

Applications

The recursion probability algorithm provides a simple, general method for calculating the frequency of various substructures in the ensemble of states for a given nucleic acid. Base-pairing probabilities derived from the recursion probabilities are particularly useful for analyzing secondary structure via dot plot analyses [11]. A traditional dot plot depicts the probabilities of forming all possible i·j base pairs. In the case of pseudoknots, the dot plot can be decomposed into two dot plots: one for nested pairs and one for nonnested gap-spanning pairs. To see the utility of this decomposition, calculations were run on wildtype and mutant sequences of a pseudoknot construct derived from human telomerase RNA [20]. Experimental evidence suggests that this pseudoknot exists in equilibrium with an alternative, hairpin structure, and that this equilibrium functions as a biological switch [19]. A two-point mutant, found in a small percentage of people, shifts the equilibrium towards the hairpin structure, leading to a disease known as dyskeratosis congenita.
[19] Feigon and coworkers [20] examine this shift in equilibrium for segments of the wildtype and mutant sequences described in Figure 1, revealing that the pseudoknot conformation dominates the hairpin for the wildtype sequence (95% to 5%) but competes roughly equally in the mutant sequence (50% to 50%). Using preliminary pseudoknot parameters [18], energies were computed for both the wildtype sequence and the two-point mutant on the pseudoknotted and hairpin structures. The predicted energies in Table 1 match reasonably well with experimental values [20]. For the wildtype sequence, the disparity between the pseudoknot and hairpin energies suggests an equilibrium that favors the more stable pseudoknot. In contrast, the energies for the double mutant sequence suggest a more balanced equilibrium. Figures 6 and 7 illustrate that the hairpin conformation has a significant impact on the pair probabilities for the mutant, but not for the wildtype sequence.

Base-pairing probabilities can also be used to construct metrics for evaluating nucleic acid designs. The secondary structure s may be described by a symmetric $N \times N$ matrix S with entries $S_{i,j} = 1$ if s contains base pair i·j and $S_{i,j} = 0$ otherwise. We augment this matrix by an additional column with entries $S_{i,N+1} = 1$ if base i is unpaired and $S_{i,N+1} = 0$ otherwise. Hence, each row sum is one. For a given sequence of length N, the metric [23]

$$n(s^*) = N - \sum_{1 \le i \le N} \; \sum_{1 \le j \le N+1} p_{i,j}\, S^*_{i,j}$$

represents the average number of nucleotides that differ from the target secondary structure $s^*$ at thermodynamic equilibrium. This
Figure 5. $O(N^6)$ recursion probability algorithm that includes a class of pseudoknots. Modifications required to produce an $O(N^5)$ version of the algorithm are noted in comments. See the Appendix for details.
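The design metric $n(s^*)$ introduced in the text can be evaluated directly from a table of base-pairing probabilities. The following is a minimal sketch under stated assumptions rather than the authors' implementation: the function name and argument layout are hypothetical, probabilities are supplied as a dictionary keyed by (i, j) with i < j and 1-based positions, and the unpaired column is filled in as $p_{i,N+1} = 1 - \sum_j p_{i,j}$ so that each row of the augmented probability matrix sums to one.

```python
def expected_incorrect_nucleotides(pair_probs, target_pairs, n):
    """Average number of nucleotides that differ from the target structure.

    pair_probs   -- dict mapping (i, j), i < j, to the probability of pair i.j
    target_pairs -- iterable of (i, j) pairs in the target structure s*
    n            -- sequence length N
    """
    # Partner of each paired base in the target structure.
    partner = {}
    for i, j in target_pairs:
        partner[i] = j
        partner[j] = i

    # Total probability that each base is paired with any partner.
    paired_prob = [0.0] * (n + 1)
    for (i, j), p in pair_probs.items():
        paired_prob[i] += p
        paired_prob[j] += p

    # Sum of p_{i,j} S*_{i,j} over the augmented N x (N+1) matrix.
    correct = 0.0
    for i in range(1, n + 1):
        if i in partner:
            a, b = sorted((i, partner[i]))
            correct += pair_probs.get((a, b), 0.0)
        else:
            correct += 1.0 - paired_prob[i]    # unpaired column N+1
    return n - correct
```

Each base contributes the probability that it is in its target state (paired with its intended partner, or unpaired), so the value is zero only when the ensemble consists of the target structure alone.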
Table 1. Energy Comparisons for Human Telomerase RNA Constructs.

Energies (kcal/mol)
RNA        Conformation   ΔG exp     ΔG calc
Wildtype   Pseudoknot
           Hairpin        9.8 a      11.5 c
Mutant     Pseudoknot
           Hairpin        10.5 a     11.5 c

a Experiments were performed on partial sequences that excluded the 18 nucleotides on the 3′ end to prevent the formation of pseudoknots [20]. This truncation does not affect the corresponding ΔG calc.
b A related pseudoknot structure that is otherwise identical but omits the three consecutive A·U pairs in the stem with the bulge loop is predicted to be 0.5 kcal/mol more stable.
c The secondary structure energy calculation ignores the four consecutive noncanonical base pairs that are observed to close the interior loop in the hairpin stem [20].

is a less stringent metric than $p(s^*)$, the probability that the sequence exactly adopts structure $s^*$; even if $p(s^*)$ is not close to unity, $n(s^*)$ can still be small if the equilibrium ensemble is dominated by structures that differ only slightly from $s^*$. It is illustrative to compare the two metrics on a real design problem involving pseudoknots. For example, Winfree et al. [14] designed and constructed DNA double-crossover molecules [24] that interact to form a two-dimensional lattice with a pseudoknotted unit cell. These sequence designs were performed using sequence symmetry minimization [25] to ensure that incorrectly paired subsequences of length six would always contain at least one mismatch and most incorrectly paired subsequences of length five would also contain a mismatch [14]. Lacking DNA pseudoknot parameters, we examine an RNA analog of their sequence for the portion of the pseudoknotted unit cell depicted in Figure 8a. The probability of adopting the target structure is $p(s^*) = 0.1$ and the average number of incorrect nucleotides is $n(s^*) = 4.0$. The low value of

Figure 6. Dot plots for wildtype human telomerase RNA. (a) Pseudoknot (bottom left) and hairpin (top right) constructs.
For (b) and (c), large dots indicate $p_{i,j} \ge 0.5$ and small dots indicate $0.5 > p_{i,j}$. (b) Base-pairing probabilities including pseudoknots (bottom left) and excluding pseudoknots (top right). (c) A decomposition of the full base-pairing probabilities into gap-spanning pairs (bottom left) and nested pairs (top right). Note that there are no nested pairs with significant probability, indicating that pseudoknot conformations are dominating the equilibrium.

Figure 7. Dot plots for double mutant human telomerase RNA. The plots are analogous to those of Figure 6. The key difference is observed in (c), where the hairpin stem appears as both gap-spanning pairs and nested pairs, indicating the increased significance of hairpin conformations.

$p(s^*)$ might possibly indicate a cause for concern, but for a structure with 90 nucleotides and helices of length eight, the average number of incorrect nucleotides is relatively small. Hence, it is not surprising that the sequence behaves well experimentally, demonstrating the correct base-pairing topology despite slight predicted variations at the ends of helices. The dot plot in Figure 8 illustrates the similarity between the average structure and the desired target. Interestingly, design methods described in previous work [23] can be used, in conjunction with the pseudoknot partition function algorithm, to find sequences that achieve $p(s^*) > 0.98$ and $n(s^*) < 1$. It is unclear whether these sequences would provide any experimental benefit for this system (even given a perfect energy model), because the difference between $n(s^*) = 4$ and $n(s^*) < 1$ may be lost in experimental noise. By contrast, if a sequence produced $p(s^*) = 0.1$ with $n(s^*) \gg 4$, then the equilibrium ensemble could include important structures differing significantly from the target structure.

Conclusions

A general transformation rule extends nucleic acid partition function algorithms to calculate recursion probabilities, which in turn can be used to compute base-pairing probabilities. We use this approach to derive an algorithm for computing base-pairing probabilities starting from a partition function algorithm that includes a class of pseudoknots.
The same strategy will apply to future partition function extensions that follow the same dynamic programming paradigm. To demonstrate the utility of base-pairing probabilities, calculations were performed on a pseudoknot/hairpin construct thought to represent an important biological switch. In agreement with experimental evidence, the computational results indicate that the pseudoknot dominates the hairpin for the wildtype sequence, but not for the double mutant. Base-pairing probabilities were also used to examine the ensemble properties of a synthetic nucleic acid sequence designed to assemble into a pseudoknotted double-crossover molecule. The average number of incorrect nucleotides was found to be small, suggesting that the relatively low computed probability of adopting the
target secondary structure should not significantly affect the experimental performance of the molecule.

Software Download

The algorithms described in this article are available for download as part of the NUPACK software suite.

Acknowledgments

We wish to thank C. Ueda for discussions on human telomerase RNA and E. Winfree for discussions on the DNA lattice.

Figure 8. Computational examination of a pseudoknotted DNA nanostructure. (a) Secondary structure for a double-crossover molecule that forms a portion of the unit cell in a two-dimensional lattice [14]. For our computational study, we join the blue and orange strands (arrows denote 3′) into a single strand using auxiliary nucleotides (green) to facilitate the use of the single-stranded partition function algorithm [18]. In the absence of DNA pseudoknot parameters, we consider the RNA analog 5′ CCAACUCCUAGCGAUUUUUCGCUAGGUUUACCA GAUCCACAAGCCGACGUUACAUUUUGGAUCUGGUAAG UUGGUGUAACGUCGGCUUGU 3′, where the interior hyphens denote the boundaries of the auxiliary linker segment. (b) Dot plot analysis of the designed sequence. The bottom left depicts the base pairs in the target structure, and the upper right depicts the base-pairing probabilities. Large dots indicate $p_{i,j} \ge 0.5$ and small dots indicate $0.5 > p_{i,j}$. The circles indicate the major differences between the target structure and the calculated pair probabilities.

Appendix: Reducing Computational Complexity

The algorithms presented in the main text provide an inefficient treatment of interior loops. By exploiting the form of the interior loop potential function, the computational complexity of the partition function algorithms excluding and including pseudoknots can be reduced by a factor of N, where N is the sequence length [18,21]. A detailed description of the fastiloops treatment is provided in ref. 18 and the corresponding Supplementary Material. The fastiloops modification detracts from the simplicity of the presentation because the necessary recursions do not conform to the same structure as the other terms in the algorithm.
Here, we describe the extension of this approach to recursion probability algorithms. In the unpseudoknotted case, pseudocode for an $O(N^3)$ partition function algorithm is provided in Figure 11 of ref. 18, which employs the fastiloops function of Supplementary Material Figure S2. To this point, we have assumed that all Q-type values are accessible at the end of the partition function calculation. For the fastiloops methods, the values $Q^x$, $Q^{x1}$ and $Q^{x2}$ are computed on the fly and discarded to save memory. Hence, for the recursion probability algorithm, it is necessary to recompute the $Q^x$-type terms at the same time that the corresponding $P^x$-type terms are calculated. An $O(N^3)$ recursion probability algorithm that excludes pseudoknots is described in Figure A1, which references the function fastiloopsn3 of Figure A2. If pseudoknots are included, the computational complexity of the recursion probability algorithm in Figure 5 is reduced to $O(N^5)$ using fastiloopsn5 described in Figure A3.

A few aspects of the fastiloopsn3 and fastiloopsn5 routines deserve mention. It is advisable to review the relevant sections of ref. 18 and the corresponding Supplementary Material before proceeding. An interior loop with closing pair i·j and interior pair d·e has energy $\Delta G^{\text{interior}}_{i,d,e,j}$, sides of lengths

$$L_1 = d - i - 1, \qquad L_2 = j - e - 1, \qquad (6)$$

and size $L_1 + L_2$. Loops with both $L_1 \ge 4$ and $L_2 \ge 4$ are termed extensible and their contributions to the partition function algorithm are calculated using $Q^x$. Furthermore, $Q^x$ also contains information about possible extensible loops for which the definitions of $L_1$, $L_2$ are the same but i and j are not required to base pair. The partition function algorithm examines subsequences of length $l = j - i + 1$, starting with $l = 1$ and ending with $l = N$. $Q^x$ is efficiently calculated using the extension identity [see eq. (15) of ref. 18],

Q 1,s2 s 2 L 14 L 1L 2s2 s 2 L 24 L 1L 2s2 Q,s ep 1 s 2 1 s/rt (7)
8 1302 Drks and Perce Vol. 25, No. 10 Journal of Computatonal Chemstry whch relates Q,s (for susequences of lenth l ) to Q 1,s2 (for susequences of lenth l 2). The frst lne seeds Q wth cases that are oth etensle (L 1 4 and L 2 4) and at an end of the strand ( 1 or j N). For mplementaton purposes, the second lne of (8) s calculated durn the l, loop and temporarly stored 2 n Q 1,s2. The frst lne of (8) s added to ths contruton n the l 2, 1 loop. We retan the conventon that L 1 and L 2 are defned wth respect to the loop nde n whch they are calculated (.e., l, for the second lne and l 2, 1 for the frst lne). Dervaton of the alorthm to compute P requres careful consderaton. The quanttes Q and Q 2 contan ncomplete partton functon nformaton for possle etensle loops, ut they do not represent susequence partton functons n the manner of other Qtype matrces. In a normal recurson relaton, Fure A1. O(N 3 ) recurson proalty alorthm that ecludes pseudoknots. The alorthm proceeds from loner susequences to shorter ones, so n contrast to the analoous partton functon alorthm (see F. 11 of ref. 18), Q 1 and Q 2 refer to susequences whose lenths are shorter (y 1 and 2, respectvely) than the current susequence of lenth l. whch relates Q,s (for susequences of lenth l ) to Q 1,s2 (for susequences of lenth l 2). The frst lne seeds Q wth cases at an etenson order (L 1 4 or L 2 4) for susequent etenson to loner susequences. For concseness, we have ntroduced the defnton s ep 1 s 2 L 1 L 2 3 e, d, e 1, d 1/RTQ d,e, where d and e are defned mplctly n terms of L 1 and L 2. For mplementaton purposes, the second lne of (7) s calculated 2 durn the l, loop and temporarly stored n Q 1,s2. The frst lne of (7) s added to ths contruton n the l 2, 1 loop. As a result of ths two step procedure, we adopt the conventon that L 1 and L 2 are defned wth respect to the loop nde n whch they are calculated (.e., l, for the second lne and l 2, 1 for the frst lne). 
Ths conventon facltates the comparson of the etenson dentty wth pseudocode. The recurson proalty alorthm eamnes susequences of lenth l startn wth l N and endn wth l 1. To recompute Q n ths contet, we use the contracton dentty Q 1,s2 1 L 14,L 24 L 1L 2s2 s 2 jn L 14,L 24 L 1L 2s2 s 2 Q,s s L 14 s L 24 L 1L 2s L 1L 2s ep 1 s 2 1 s/rt (8) Fure A2. Pseudocode for computn nteror loop contrutons to P n O(N 3 ) as an alternatve to the O(N 4 ) nteror loop recurson of Fure 4.
9 Alorthm for Computn Nuclec Acd BaseParn Proaltes 1303 calculaton of P requres nformaton aout whch Q quanttes ultmately contrute to secondary structures n the ensemle. As a result, the etenson dentty (7) cannot smply e transformed usn the standard recurson proalty approach, whch assumes that oth sdes of the equaton represent susequence partton functons that are assured of contrutn to the equlrum ensemle. Ths realzaton suests computn P,s y addn the proaltes of all nternal loops that rely on Q,s to ncorporate nformaton n the partton functon. To calculate P,s (for a fed l ), note that Q,s wll e nvoked for all nteror loops (, d, e, j) wth nteror par d e and closn par j such that j j 0, L 1 4, L 2 4, L 1 L 2 s, (9) where L 1, L 2 and s are defned wth respect to and j. Hence, a partcular loop (, d, e, j) s dentfed wth a set of Q,s terms that are related y the etenson dentty (7). Alternatvely, a partcular Q,s term s dentfed wth all of the nteror loops (, d, e, j) to whch t ultmately contrutes va the etenson dentty. Consequently, from the noton of recurson proaltes ntroduced earler, P,s (for a fed l ) should e the sum of the proaltes of all nteror loops (, d, e, j) that satsfy the propertes (9). For the case where 1 N j (the case 1 N j yelds analoous results), t follows that P,s 1 p, d, e, j, (10) L 14,L 24 L 1L 2s where p(, d, e, j) s the proalty of the (, d, e, j) nteror loop n the equlrum ensemle of secondary structures. Because 2 P 1,s2 s defned smlarly, wth l and s decremented y 2, t follows that Fure A3. Pseudocode for computn nteror loop contrutons to P n O(N 5 ) as an alternatve to the O(N 6 ) nteror loop recurson of Fure 5. Qtype matrces on the rhthand sde are susequence partton functons descrn a local structural motf that contrutes to the larer susequence partton functon on the lefthand sde. Q,s contans nformaton aout possle etensle loops that may not actually est (f and j are not complementary). 
The etenson dentty (7) passes ths potentally useful nformaton on to 2 Q 1,s2. Consder, for eample, a chan of Q values related y the etenson dentty n a case where no complementary j ase par s encountered whle ncrementn l y 2 untl an end of the strand s reached. In ths scenaro, the values of Q computed n ths chan should not contrute to the correspondn recurson proaltes P ecause the values of Q are not dentfed wth any secondary structure n the equlrum ensemle. Hence, the 2 P 1,s2 1 1 p, d, e, j, (11) L 15,L 25 L 1L 2s where L 1 and L 2 are temporarly defned wth respect to and j to retan the sze constrant L 1 L 2 s. Comparn (10) and (11), we then dentfy the relatonshp 2 P 1,s2 P,s L 15,L 25 L 1L 2s 1 p, d, e, j 1, jj1 p, d, e, j L14,L24 L 1L 2s 1 p, d, e, j L15,L24, L 1L 2s where L 1 and L 2 contnue to e defned wth respect to and j. Fnally, we shft the ndces n the frst lne so that L 1 and L 2 are defned wth respect to 1 and j 1
More informationAsRigidAsPossible Image Registration for Handdrawn Cartoon Animations
AsRgdAsPossble Image Regstraton for Handdrawn Cartoon Anmatons Danel Sýkora Trnty College Dubln John Dnglana Trnty College Dubln Steven Collns Trnty College Dubln source target our approach [Papenberg
More informationcan basic entrepreneurship transform the economic lives of the poor?
can basc entrepreneurshp transform the economc lves of the poor? Orana Bandera, Robn Burgess, Narayan Das, Selm Gulesc, Imran Rasul, Munsh Sulaman Aprl 2013 Abstract The world s poorest people lack captal
More informationFinance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C.
Fnance and Economcs Dscusson Seres Dvsons of Research & Statstcs and Monetary Affars Federal Reserve Board, Washngton, D.C. Banks as Patent Fxed Income Investors Samuel G. Hanson, Andre Shlefer, Jeremy
More informationWhat to Maximize if You Must
What to Maxmze f You Must Avad Hefetz Chrs Shannon Yoss Spegel Ths verson: July 2004 Abstract The assumpton that decson makers choose actons to maxmze ther preferences s a central tenet n economcs. Ths
More informationShould marginal abatement costs differ across sectors? The effect of lowcarbon capital accumulation
Should margnal abatement costs dffer across sectors? The effect of lowcarbon captal accumulaton Adren VogtSchlb 1,, Guy Meuner 2, Stéphane Hallegatte 3 1 CIRED, NogentsurMarne, France. 2 INRA UR133
More informationFrom Computing with Numbers to Computing with Words From Manipulation of Measurements to Manipulation of Perceptions
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: FUNDAMENTAL THEORY AND APPLICATIONS, VOL. 45, NO. 1, JANUARY 1999 105 From Computng wth Numbers to Computng wth Words From Manpulaton of Measurements to Manpulaton
More informationTrueSkill Through Time: Revisiting the History of Chess
TrueSkll Through Tme: Revstng the Hstory of Chess Perre Dangauther INRIA Rhone Alpes Grenoble, France perre.dangauther@mag.fr Ralf Herbrch Mcrosoft Research Ltd. Cambrdge, UK rherb@mcrosoft.com Tom Mnka
More informationThe Global Macroeconomic Costs of Raising Bank Capital Adequacy Requirements
W/1/44 The Global Macroeconomc Costs of Rasng Bank Captal Adequacy Requrements Scott Roger and Francs Vtek 01 Internatonal Monetary Fund W/1/44 IMF Workng aper IMF Offces n Europe Monetary and Captal Markets
More informationWhy Don t We See Poverty Convergence?
Why Don t We See Poverty Convergence? Martn Ravallon 1 Development Research Group, World Bank 1818 H Street NW, Washngton DC, 20433, USA Abstract: We see sgns of convergence n average lvng standards amongst
More informationCiphers with Arbitrary Finite Domains
Cphers wth Arbtrary Fnte Domans John Black 1 and Phllp Rogaway 2 1 Dept. of Computer Scence, Unversty of Nevada, Reno NV 89557, USA, jrb@cs.unr.edu, WWW home page: http://www.cs.unr.edu/~jrb 2 Dept. of
More informationMULTIPLE VALUED FUNCTIONS AND INTEGRAL CURRENTS
ULTIPLE VALUED FUNCTIONS AND INTEGRAL CURRENTS CAILLO DE LELLIS AND EANUELE SPADARO Abstract. We prove several results on Almgren s multple valued functons and ther lnks to ntegral currents. In partcular,
More informationEnsembling Neural Networks: Many Could Be Better Than All
Artfcal Intellgence, 22, vol.37, no.2, pp.239263. @Elsever Ensemblng eural etworks: Many Could Be Better Than All ZhHua Zhou*, Janxn Wu, We Tang atonal Laboratory for ovel Software Technology, anng
More information(Almost) No Label No Cry
(Almost) No Label No Cry Gorgo Patrn,, Rchard Nock,, Paul Rvera,, Tbero Caetano,3,4 Australan Natonal Unversty, NICTA, Unversty of New South Wales 3, Ambata 4 Sydney, NSW, Australa {namesurname}@anueduau
More informationAsRigidAsPossible Shape Manipulation
AsRgdAsPossble Shape Manpulaton akeo Igarash 1, 3 omer Moscovch John F. Hughes 1 he Unversty of okyo Brown Unversty 3 PRESO, JS Abstract We present an nteractve system that lets a user move and deform
More informationUPGRADE YOUR PHYSICS
Correctons March 7 UPGRADE YOUR PHYSICS NOTES FOR BRITISH SIXTH FORM STUDENTS WHO ARE PREPARING FOR THE INTERNATIONAL PHYSICS OLYMPIAD, OR WISH TO TAKE THEIR KNOWLEDGE OF PHYSICS BEYOND THE ALEVEL SYLLABI.
More informationIncome per natural: Measuring development as if people mattered more than places
Income per natural: Measurng development as f people mattered more than places Mchael A. Clemens Center for Global Development Lant Prtchett Kennedy School of Government Harvard Unversty, and Center for
More informationThe Developing World Is Poorer Than We Thought, But No Less Successful in the Fight against Poverty
Publc Dsclosure Authorzed Pol c y Re s e a rc h Wo r k n g Pa p e r 4703 WPS4703 Publc Dsclosure Authorzed Publc Dsclosure Authorzed The Developng World Is Poorer Than We Thought, But No Less Successful
More information