A parallel algorithm for the extraction of structured motifs
1 A parallel algorithm for the extraction of structured motifs
Alexandra Carvalho, MEI 2002/03
Computação em Sistemas Distribuídos 2003 p.1/27
2 Plan of the talk
Biological model of regulation
- Nucleic acids: DNA and RNA
- Classification of living organisms: Prokaryotes and Eukaryotes
- Transcription and Translation
- Promoter and Regulatory Sequences
Computational model of regulation
- Suffix tree and generalized suffix tree
- Single models extraction [M.-F. Sagot, Latin, 1998]
- Structured models extraction [L. Marsan and M.-F. Sagot, J. Computational Biology, 2000]
Parallelization [A. Carvalho, A. Freitas, A. Oliveira and M.-F. Sagot, submitted, 2003]
- The PARTITION UP TO ε problem
- The SimpleCut algorithm
- The tree partition problem
- The PSMILE algorithm
- Experimental results
4 Nucleic acids
Nucleotides:
- storage and retrieval of biological information
- building blocks for the construction of nucleic acids
Two main types of nucleic acids:
- DNA (DeoxyriboNucleic Acid): contains the bases A, C, G, and T; double-stranded molecule
- RNA (RiboNucleic Acid): contains the bases A, C, G, and U; single-stranded molecule
5 Classification of living organisms
Prokaryotes:
- Greek words: pro "before" and karyon "nucleus"
- bacteria and prokaryotes are generally used interchangeably
- most prokaryotes live as single-celled organisms
Eukaryotes:
- Greek words: eu "well" and karyon "nucleus"
- yeast is a eukaryotic single-celled organism
6 Transcription and Translation
7 Promoter and Regulatory Sequences
9 Structured motifs
Definition (model). A model is an element of $\Sigma^+$.
Definition (structured model). A structured model is a pair $(m, d)$ where:
- $m = (m_i)_{1 \le i \le p}$, denoting the $p$ boxes
- $d = (d_{\min_i}, d_{\max_i}, \delta_i)_{1 \le i \le p-1}$, denoting the $p-1$ intervals of distance
[Figure: $p$ boxes $m_1, m_2, \ldots, m_p$ of lengths $k_1, k_2, \ldots, k_p$; box $m_i$ allows $e_i$ substitutions, and the first two boxes are separated by a distance in $[d_{1\min}, d_{1\max}]$.]
An attempt to model the combinatorics of regulation.
13 Structured motifs
Definition (e-occurrence). A model $m$ $e$-occurs in the input sequences if there exists a word $u$ in the input sequences such that $\mathrm{HammingDistance}(m, u) \le e$ (the minimum number of substitutions needed to transform $u$ into $m$).
Definition (valid model, quorum). A model is valid if it $e$-occurs in at least $q$ input sequences, where $q$ is called the quorum.
[Figure: example with $e_1 = 2$, $e_2 = 1$, $k_1 = [6,8]$, $k_2 = [6,8]$, distance interval $[16,18]$, and $q = 3$; the drawing labels one pair of boxes "too far" and one distance 17.]
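The two definitions above can be checked directly on small inputs. The following is a minimal brute-force sketch, not the suffix-tree method of the talk; the function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is defined for equal-length words")
    return sum(x != y for x, y in zip(a, b))


def e_occurs(model: str, seq: str, e: int) -> bool:
    """True if some window u of seq satisfies hamming(model, u) <= e."""
    k = len(model)
    return any(hamming(model, seq[i:i + k]) <= e
               for i in range(len(seq) - k + 1))


def is_valid(model: str, seqs: Sequence[str], e: int, q: int) -> bool:
    """True if the model e-occurs in at least q of the input sequences."""
    return sum(e_occurs(model, s, e) for s in seqs) >= q


# Hypothetical toy input, not taken from the slides.
seqs = ["ACGTAC", "TTGTAA", "CCCCCC"]
print(is_valid("CGTA", seqs, e=1, q=2))  # True: 1-occurs in the first two sequences
print(is_valid("CGTA", seqs, e=1, q=3))  # False: no 1-occurrence in the third sequence
```

Scanning every window costs time proportional to the total input length per model, which is why the talk moves to suffix trees for the actual extraction.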
14 Input sequences
>strand + guab inositol-monophosphate dehydrogenase
>strand - yaa yaa
>strand + yaaj similar to hypothetical proteins
>strand - yaai similar to isochorismatase
>strand + mets methionyl-tRNA synthetase
15 Suffix Tree
Definition (suffix tree). A suffix tree of an $n$-character string $S$ is a rooted directed tree with exactly $n$ leaves such that:
- leaves are numbered 1 to $n$
- each internal node has at least two children
- each edge is labeled with a nonempty substring of $S$
- no two edges out of a node can have edge-labels beginning with the same character
The key feature of the suffix tree is that, for any leaf $i$, the label of the path from the root to leaf $i$ spells out exactly the suffix of $S$ that starts at position $i$.
References: Weiner, IEEE Symposium on Switching and Automata Theory, 1973; Ukkonen, Algorithmica, 1995.
Theorem. Suffix trees can be built in linear time.
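To make the definition concrete, here is a minimal sketch that builds a suffix tree by inserting the suffixes one at a time. It is a naive quadratic-time construction, not the linear-time algorithms of Weiner or Ukkonen cited above. The `Node` class, the function names, and the appended terminator `$` (which guarantees that no suffix is a prefix of another, so the tree for `S$` has n + 1 leaves) are assumptions made for illustration.

```python
class Node:
    def __init__(self):
        self.children = {}  # first character -> (edge label, child Node)
        self.leaf = None    # 1-based start position of the suffix, for leaves


def build_suffix_tree(s: str, end: str = "$") -> Node:
    """Naive O(n^2) suffix tree of s + end; end must not occur in s."""
    if end in s:
        raise ValueError("terminator must not occur in the string")
    text = s + end
    root = Node()
    for start in range(len(text)):
        node, rest = root, text[start:]
        while True:
            first = rest[0]
            if first not in node.children:
                leaf = Node()
                leaf.leaf = start + 1
                node.children[first] = (rest, leaf)
                break
            label, child = node.children[first]
            m = 0  # length of the common prefix of the edge label and rest
            while m < len(label) and m < len(rest) and label[m] == rest[m]:
                m += 1
            if m < len(label):
                # Split the edge so that the two labels diverge at a new node.
                mid = Node()
                mid.children[label[m]] = (label[m:], child)
                node.children[first] = (label[:m], mid)
                child = mid
            node, rest = child, rest[m:]
    return root


def leaves(node: Node, prefix: str = ""):
    """Yield (leaf number, path label) for every leaf below node."""
    if node.leaf is not None:
        yield node.leaf, prefix
    for label, child in node.children.values():
        yield from leaves(child, prefix + label)


root = build_suffix_tree("banana")
for number, path in sorted(leaves(root)):
    print(number, path)  # leaf i spells the suffix starting at position i
```

The printed path labels are exactly the suffixes of `banana$`, which is the key feature stated in the definition.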
16 Suffix Tree
[Figure: suffix tree for an example string, built up over seven frames.]
31 Generalized Suffix Tree
[Figure: generalized suffix tree for two example strings, built up over nine frames; leaves are marked with the terminator #.]
32 Generalized Suffix Tree with Colors
[Figure: the generalized suffix tree for the two example strings, with nodes annotated by bit vectors such as [0,1] and [1,0].]
[ ]: bit vectors called Colors
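The role of the colors can be sketched as follows: each node carries one bit per input sequence, set when the word spelled by the path to that node occurs in that sequence. For brevity this sketch uses an uncompressed suffix trie instead of a generalized suffix tree, and stores each color as an integer bitmask; the dictionary layout, the function names, and the two toy strings are hypothetical.

```python
def build_colored_trie(seqs):
    """Trie of all suffixes of all sequences, with a color bitmask per node."""
    root = {"children": {}, "color": 0}
    for idx, s in enumerate(seqs):
        bit = 1 << idx
        root["color"] |= bit
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].setdefault(
                    ch, {"children": {}, "color": 0})
                node["color"] |= bit  # this word occurs in sequence idx
    return root


def color_of(root, word, n_seqs):
    """Color vector [b_1, ..., b_n] of the node spelled by word."""
    node = root
    for ch in word:
        node = node["children"].get(ch)
        if node is None:
            return [0] * n_seqs  # word occurs in no sequence
    return [(node["color"] >> i) & 1 for i in range(n_seqs)]


seqs = ["ACGT", "CGA"]  # hypothetical toy input
trie = build_colored_trie(seqs)
print(color_of(trie, "CG", 2))  # [1, 1]: occurs in both sequences
print(color_of(trie, "GT", 2))  # [1, 0]: only in the first
print(color_of(trie, "GA", 2))  # [0, 1]: only in the second
```

Counting the set bits of a color gives the number of sequences in which the word occurs, which is what a quorum test needs.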
33 Extraction of Single Models
Definition (e-node-occurrence). An $e$-node-occurrence of a model $m$ is represented by a pair $(v, e_v)$ where $v$ is a tree node and $e_v \le e$ is the Hamming distance between the label of the path from the root to $v$ and $m$.
Notation ($\nu(e, k)$). The number of distinct words at Hamming distance at most $e$ from a $k$-long word:
$$\nu(e, k) = \sum_{i=0}^{e} \binom{k}{i} (|\Sigma| - 1)^i \le k^e |\Sigma|^e.$$
Notation ($n_k$). The number of tree nodes at depth $k$ of a suffix tree.
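As a sanity check on the neighbourhood size, the sum can be evaluated and compared with an explicit enumeration. This is an illustrative sketch; the function names and the default DNA alphabet are assumptions.

```python
from itertools import product
from math import comb


def nu(e: int, k: int, sigma: int = 4) -> int:
    """Number of words at Hamming distance at most e from a k-long word."""
    return sum(comb(k, i) * (sigma - 1) ** i for i in range(e + 1))


def neighbourhood(word: str, e: int, alphabet: str = "ACGT") -> set:
    """All words of the same length within Hamming distance e of word."""
    return {
        "".join(w)
        for w in product(alphabet, repeat=len(word))
        if sum(a != b for a, b in zip(w, word)) <= e
    }


print(nu(1, 2))                     # 7 = 1 + 2 * 3
print(len(neighbourhood("AC", 1)))  # 7, matching the formula
print(nu(2, 6), 6 ** 2 * 4 ** 2)    # 154 576: the bound k^e |Sigma|^e holds
```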
56 Extraction of Single Models
M.-F. Sagot, Latin, 1998
Example: $k = 2$, $e = 1$, $q = 2$, two input sequences.
[Figure: step-by-step traversal of the colored suffix tree of the two input sequences; at each step the candidate model is shown with its node-occurrences, written as (node, errors) pairs with errors 0 or 1, next to the node colors [1,0] and [0,1].]
Proposition. The single motifs extraction takes $O(N n_k \nu(e, k))$ time.
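The idea of growing a model one symbol at a time, while carrying along its occurrences and their error counts, can be sketched without a suffix tree. The version below tracks occurrences as (sequence, position) pairs instead of tree nodes, so it illustrates the pruning logic only and does not reproduce the stated time bound. The function name and the toy input are hypothetical.

```python
def extract_models(seqs, k, e, q, alphabet="ACGT"):
    """Return all k-long models that e-occur in at least q sequences."""
    # Every window start is an occurrence of the empty model with 0 errors.
    start = {(s, p): 0
             for s, seq in enumerate(seqs)
             for p in range(len(seq) - k + 1)}
    found = []

    def extend(model, occ):
        # Prune: extending a model can only lose occurrences.
        if len({s for s, _ in occ}) < q:
            return
        if len(model) == k:
            found.append(model)
            return
        depth = len(model)
        for a in alphabet:
            nxt = {}
            for (s, p), err in occ.items():
                err2 = err + (seqs[s][p + depth] != a)
                if err2 <= e:
                    nxt[(s, p)] = err2
            extend(model + a, nxt)

    extend("", start)
    return found


# Hypothetical input with the slide's parameters k = 2, e = 1, q = 2.
print(extract_models(["AC", "GT"], k=2, e=1, q=2))  # ['AT', 'GC']
```

Replacing the position dictionary by node-occurrences in a colored suffix tree merges all windows that spell the same word, which is where the factor $n_k$ in the proposition comes from.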
57 Extraction of Structured Models: SMILE
L. Marsan and M.-F. Sagot, Journal of Computational Biology, 2000
ExtractModels(Model m, Block i)
1. for each node-occurrence $v$ of $m$
2.   if ($i > 1$)
3.     put in PotentialStarts the children of $v$ at levels $(i-1)k + (i-1)d_{\min_{i-1}}$ to $(i-1)k + (i-1)d_{\max_{i-1}}$
4.   else
5.     put $v$ in PotentialStarts
6. for each model $m_i$ obtained by doing a recursive depth-first traversal from the root of the virtual model tree $M$ while simultaneously traversing from the node-occurrences in PotentialStarts
7.   if ($i < p$)
8.     ExtractModels($m = m_1 \ldots m_i$, $i + 1$)
9.   else
10.    KeepModel( $(m_1, \ldots, m_p)$, $((d_{\min_1}, d_{\max_1}), \ldots, (d_{\min_p}, d_{\max_p}))$ )
64 Extraction of Structured Models: SMILE
L. Marsan and M.-F. Sagot, Journal of Computational Biology, 2000
Example: $p = 2$, $k_1 = 2$, $d = 1$, $k_2 = 2$, $e_1 = 1$, $e_2 = 1$, $q = 2$, two input sequences.
[Figure: step-by-step traversal of the colored suffix tree of the two input sequences, showing the node-occurrences (node, errors) of the first box and the jump of distance $d$ to the potential starts of the second box.]
Proposition. The structured motifs extraction takes $O(N n_{pk+(p-1)d_{\max}} \nu^p(e, k))$ time.
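For two boxes, the definition of a valid structured model can be tested by exhaustive enumeration. The sketch below is a brute-force check over all pairs of boxes, not the SMILE algorithm: it uses no suffix tree and does not reuse the occurrences of the first box when searching for the second. The function names and the toy sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def occurs(seq, m1, m2, dmin, dmax, e1, e2):
    """True if seq contains m1 (<= e1 errors), then a gap in [dmin, dmax], then m2 (<= e2 errors)."""
    k1, k2 = len(m1), len(m2)
    for p in range(len(seq) - k1 + 1):
        if hamming(m1, seq[p:p + k1]) > e1:
            continue
        for d in range(dmin, dmax + 1):
            s2 = p + k1 + d  # start of the second box
            if s2 + k2 <= len(seq) and hamming(m2, seq[s2:s2 + k2]) <= e2:
                return True
    return False


def structured_models(seqs, k1, k2, dmin, dmax, e1, e2, q, alphabet="ACGT"):
    """All pairs (m1, m2) that occur in at least q sequences."""
    def words(k):
        return ["".join(w) for w in product(alphabet, repeat=k)]

    return [(m1, m2)
            for m1 in words(k1)
            for m2 in words(k2)
            if sum(occurs(s, m1, m2, dmin, dmax, e1, e2) for s in seqs) >= q]


# Hypothetical input; box lengths and distance follow the slide (k1 = k2 = 2, d = 1).
seqs = ["ACGTT", "ACCTT"]
print(structured_models(seqs, 2, 2, 1, 1, 0, 0, q=2))  # [('AC', 'TT')]
```

The enumeration visits $|\Sigma|^{k_1 + k_2}$ candidate pairs, so it is only usable for very short boxes.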
65 PARTITION UP TO ε
The PARTITION UP TO ε problem:
- $l$ gold bars
- $w_i \ge 0$ is the weight of the $i$th gold bar
- any gold bar can be cut into $c$ equal parts
Optimization version: the problem is how to share the gold between $r$ persons, with the minimum number of gold bars $z$, in such a way that each person gets the same share of gold up to some weight $\varepsilon > 0$.
Decision version: the problem is to decide whether it is possible to share the gold between $r$ persons, with $z$ gold bars, in such a way that each person gets the same share of gold up to some weight $\varepsilon \ge 0$.
Proposition. The PARTITION UP TO ε problem is NP-complete in the strong sense.
66 PARTITION UP TO ε
SimpleCut(Partition i, GoldBars l, Persons r, Weights w_j, CutFactor c, WorkOverload ε)
1. find the smallest $t$ such that $\max_j w_j / c^t \le \varepsilon$
2. for each $j \in \{1, \ldots, l\}$
3.   let $V_j = \left[ \sum_{k=1}^{j-1} w_k c^t, \ \sum_{k=1}^{j} w_k c^t \right)$
4. let $w = \sum_{j=1}^{l} w_j$
5. let $\gamma = w c^t \bmod r$
6. let $\delta = \lfloor w c^t / r \rfloor$
7. let $I'_i = [(i-1)(\delta+1), \ i(\delta+1))$ for all $i \le \gamma$, and $I'_i = [\gamma(\delta+1) + (i-(\gamma+1))\delta, \ \gamma(\delta+1) + (i-\gamma)\delta)$ otherwise
8. transform $I'_i = [a, b)$ into $I_i = [f(a), f(b))$ with $f : [0, w c^t] \to [0, l c^t]$ given by $f(x) = (j-1) c^t + \lfloor (x - \inf(V_j)) / w_j \rfloor$ for all $x \in V_j$, and $f(x) = l c^t$ if $x = w c^t$
72 PARTITION UP TO ε
Example: $l = 2$ gold bars with weights $w_1 = 2$ and $w_2 = 1$, $r = 3$, $\varepsilon = 1$. The intervals below correspond to a cut factor $c = 2$, since $V_1 = [0, w_1 c^t) = [0, 4)$ with $t = 1$.
1. The smallest $t$ such that $\max_j w_j / c^t \le \varepsilon$ is $t = 1$.
2-3. $V_1 = [0, 4)$, $V_2 = [4, 6)$.
4-6. $w = 3$, $\gamma = 6 \bmod 3 = 0$, $\delta = \lfloor 6 / 3 \rfloor = 2$.
7. $I'_1 = [0, 2)$, $I'_2 = [2, 4)$, $I'_3 = [4, 6)$.
8. $I_1 = [0, 1)$, $I_2 = [1, 2)$, $I_3 = [2, 4)$.
73 PARTITION UP TO ε
Proposition. The SimpleCut algorithm requires $O(l)$ time.
Proposition. The SimpleCut algorithm has a ratio bound $\rho(l, r, (w_i)_{1 \le i \le l}, c, \varepsilon) = O(\max_i w_i / \varepsilon)$.
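The eight steps of SimpleCut translate almost line by line into code. The sketch below is one reading of the pseudocode, assuming positive integer weights, a cut factor of at least 2, and floor division in step 8; it returns all $r$ intervals at once rather than only the one for a given $i$. The function name `simple_cut` and the tuple representation of half-open intervals are choices made here for illustration.

```python
from bisect import bisect_right


def simple_cut(weights, r, c, eps):
    """Return (t, [I_1, ..., I_r]); each I_i is a half-open pair (lo, hi)."""
    l = len(weights)
    t = 0
    while max(weights) > eps * c ** t:  # step 1: smallest t with max w_j / c^t <= eps
        t += 1
    scale = c ** t
    bounds = [0]                        # steps 2-3: V_j = [bounds[j-1], bounds[j])
    for w_j in weights:
        bounds.append(bounds[-1] + w_j * scale)
    total = bounds[-1]                  # step 4: w * c^t
    gamma = total % r                   # step 5
    delta = total // r                  # step 6

    def f(x):                           # step 8: map weight units to piece indices
        if x == total:
            return l * scale
        j = bisect_right(bounds, x)     # 1-based j with x in V_j
        return (j - 1) * scale + (x - bounds[j - 1]) // weights[j - 1]

    intervals = []
    for i in range(1, r + 1):           # step 7: gamma intervals get one extra unit
        if i <= gamma:
            a, b = (i - 1) * (delta + 1), i * (delta + 1)
        else:
            base = gamma * (delta + 1)
            a, b = base + (i - gamma - 1) * delta, base + (i - gamma) * delta
        intervals.append((f(a), f(b)))
    return t, intervals


# Weights (2, 1), r = 3, eps = 1, assuming c = 2 as implied by V_1 = [0, 4).
print(simple_cut([2, 1], r=3, c=2, eps=1))  # (1, [(0, 1), (1, 2), (2, 4)])
```

With these inputs the sketch reproduces the intervals $I_1 = [0,1)$, $I_2 = [1,2)$, $I_3 = [2,4)$ of the worked example.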
76 Parallelization
Reducing the tree partition problem to the PARTITION UP TO ε problem
Input of the SimpleCut algorithm for the $i$th grid node:
- $l = |\Sigma|$
- $r$ matches the number of grid nodes
- the weight $w_j$ of each symbol of the alphabet is obtained by scanning the input sequences
- $c = |\Sigma|$
- $\varepsilon$ is a user parameter
Output of the SimpleCut algorithm for the $i$th grid node:
- the number $t$ of cuts gives the depth $t + 1$ of the tree where the partition is defined
- an interval $I_i$ corresponding to the tree nodes at depth $t + 1$ assigned to the $i$th grid node
SimpleCut(Partition i, AlphabetSize l, GridNodes r, Weights w_j, AlphabetSize c, WorkOverload ε)
1. find the smallest $t'$ such that $\max_j w_j / c^{t'} \le \varepsilon$
2. let $t = \min(\mathrm{depth}(M) - 1, t')$
82 Parallelization
Example: alphabet symbols $\sigma_j$ with weights $w_j$, $r = 5$, $\varepsilon = 1$. The intervals below correspond to $l = c = |\Sigma| = 4$ and weights $w_1 = 2$, $w_2 = 1$, $w_3 = 1$, $w_4 = 2$, since each $V_j$ has length $w_j c^t$ with $c^t = 4$.
1-2. $t' = 1$ and $t = \min(\mathrm{depth}(M) - 1, t') = 1$.
3-4. $V_1 = [0, 8)$, $V_2 = [8, 12)$, $V_3 = [12, 16)$, $V_4 = [16, 24)$.
5-7. $w = 6$, $\gamma = 24 \bmod 5 = 4$, $\delta = \lfloor 24 / 5 \rfloor = 4$.
8. $I'_1 = [0, 5)$, $I'_2 = [5, 10)$, $I'_3 = [10, 15)$, $I'_4 = [15, 20)$, $I'_5 = [20, 24)$.
9. $I_1 = [0, 2)$, $I_2 = [2, 6)$, $I_3 = [6, 11)$, $I_4 = [11, 14)$, $I_5 = [14, 16)$.
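Each interval $I_i$ selects a range of nodes at depth $t + 1$ of the virtual model tree, that is, a set of model prefixes of length $t + 1$. The sketch below lists those prefixes under two assumptions that are not stated on the slides: the alphabet is A, C, G, T, and the nodes at a given depth are numbered in lexicographic order. The function name is hypothetical.

```python
def prefixes_for(interval, depth, alphabet="ACGT"):
    """Model prefixes of the given length whose index lies in [lo, hi)."""
    lo, hi = interval
    base = len(alphabet)
    out = []
    for idx in range(lo, hi):
        digits = []
        for _ in range(depth):
            idx, digit = divmod(idx, base)  # peel off the last symbol first
            digits.append(alphabet[digit])
        out.append("".join(reversed(digits)))
    return out


# Intervals I_1 .. I_5 of the example, at depth t + 1 = 2.
intervals = [(0, 2), (2, 6), (6, 11), (11, 14), (14, 16)]
for i, interval in enumerate(intervals, start=1):
    print(i, prefixes_for(interval, depth=2))
# 1 ['AA', 'AC']
# 2 ['AG', 'AT', 'CA', 'CC']
# 3 ['CG', 'CT', 'GA', 'GC', 'GG']
# 4 ['GT', 'TA', 'TC']
# 5 ['TG', 'TT']
```

Under these assumptions the five grid nodes receive disjoint sets of prefixes that together cover all 16 two-letter words, so every model is explored by exactly one grid node.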
83 Parallelization
PExtractModels(Model m, Block i, PartitionSet I_i of M)
1. for each node-occurrence $v$ of $m$
2.   if ($i > 1$)
3.     put in PotentialStarts the children of $v$ at levels $(i-1)k + (i-1)d_{\min_{i-1}}$ to $(i-1)k + (i-1)d_{\max_{i-1}}$
4.   else
5.     put $v$ in PotentialStarts
6. for each model $m_i \in I_i$ obtained by doing a recursive depth-first traversal from the root of the virtual model tree $M$ while simultaneously traversing from the node-occurrences in PotentialStarts
7.   if ($i < p$)
8.     PExtractModels($m = m_1 \ldots m_i$, $i + 1$, $I_i$)
9.   else
10.    KeepModel( $(m_1, \ldots, m_p)$, $((d_{\min_1}, d_{\max_1}), \ldots, (d_{\min_p}, d_{\max_p}))$ )
84 Parallelization
PSmile(GridNode i, WorkOverload ε)
1. compute weights $(w_i)_{1 \le i \le |\Sigma|}$;
2. build suffix tree $T$;
3. create colors on $T$;
4. let $I_i$ = SimpleCut($i$, $|\Sigma|$, $r$, $(w_i)_{1 \le i \le |\Sigma|}$, $|\Sigma|$, $\varepsilon$);
5. call PExtractModels(, $I_i$);
Proposition. Assume $|\Sigma|$ fixed and $w_i = 1$ for $1 \le i \le |\Sigma|$. The parallel algorithm PSmile is work-efficient with respect to the sequential version when r = O(ν p 2 (e, k)) and ε w 1 r.
85 Parallelization
Experimental results
[Table: for 2 boxes and for 3 boxes, the number of models and the time (sec) on each of four grid nodes, followed by the total, the parallel time, the sequential time, and the speed-up.]
86 Ongoing and future work
- Implementation of a more efficient sequential algorithm to extract structured models [L. Marsan and M.-F. Sagot, J. Computational Biology, 2000]
- Establishing an even more efficient algorithm to extract structured models [A. Carvalho, A. Freitas, A. Oliveira and M.-F. Sagot, in preparation, 2003]
- Comparison between algorithms that attempt to model the combinatorics of regulation
A Partition-Based Efficient Algorithm for Large Scale. Multiple-Strings Matching
A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching Ping Liu Jianlong Tan, Yanbing Liu Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing,
GenBank, Entrez, & FASTA
GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,
HMM : Viterbi algorithm - a toy example
MM : Viterbi algorithm - a toy example.5.3.4.2 et's consider the following simple MM. This model is composed of 2 states, (high C content) and (low C content). We can for example consider that state characterizes
Process Mining by Measuring Process Block Similarity
Process Mining by Measuring Process Block Similarity Joonsoo Bae, James Caverlee 2, Ling Liu 2, Bill Rouse 2, Hua Yan 2 Dept of Industrial & Sys Eng, Chonbuk National Univ, South Korea jsbae@chonbukackr
GENERATING THE FIBONACCI CHAIN IN O(log n) SPACE AND O(n) TIME J. Patera
ˆ ˆŠ Œ ˆ ˆ Œ ƒ Ÿ 2002.. 33.. 7 Š 539.12.01 GENERATING THE FIBONACCI CHAIN IN O(log n) SPACE AND O(n) TIME J. Patera Department of Mathematics, Faculty of Nuclear Science and Physical Engineering, Czech
6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010. Class 4 Nancy Lynch
6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, 2010 Class 4 Nancy Lynch Today Two more models of computation: Nondeterministic Finite Automata (NFAs)
SIMS 255 Foundations of Software Design. Complexity and NP-completeness
SIMS 255 Foundations of Software Design Complexity and NP-completeness Matt Welsh November 29, 2001 [email protected] 1 Outline Complexity of algorithms Space and time complexity ``Big O'' notation Complexity
LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b
LZ77 The original LZ77 algorithm works as follows: A phrase T j starting at a position i is encoded as a triple of the form distance, length, symbol. A triple d, l, s means that: T j = T [i...i + l] =
A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms
A Multiple Sliding Windows Approach to Speed Up String Matching Algorithms Simone Faro and Thierry Lecroq Università di Catania, Viale A.Doria n.6, 95125 Catania, Italy Université de Rouen, LITIS EA 4108,
On Binary Signed Digit Representations of Integers
On Binary Signed Digit Representations of Integers Nevine Ebeid and M. Anwar Hasan Department of Electrical and Computer Engineering and Centre for Applied Cryptographic Research, University of Waterloo,
Image Compression through DCT and Huffman Coding Technique
International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: [email protected] M. Moorthi
Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework
Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Sergio De Agostino Computer Science Department Sapienza University of Rome Internet as a Distributed System Modern
Suffix Tree Construction and Storage with Limited Main Memory
Universität Bielefeld Technische Fakultät Abteilung Informationstechnik Forschungsberichte Suffix Tree Construction and Storage with Limited Main Memory Klaus-Bernd Schürmann Jens Stoye Report 2003-06
Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs
CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like
Classification/Decision Trees (II)
Classification/Decision Trees (II) Department of Statistics The Pennsylvania State University Email: [email protected] Right Sized Trees Let the expected misclassification rate of a tree T be R (T ).
Vector storage and access; algorithms in GIS. This is lecture 6
Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector
EE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27
EE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27 The Problem Given a set of N objects, do any two intersect? Objects could be lines, rectangles, circles, polygons, or other geometric objects Simple to
The Role of Size Normalization on the Recognition Rate of Handwritten Numerals
The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,
Web Document Clustering
Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,
Sorting Hierarchical Data in External Memory for Archiving
Sorting Hierarchical Data in External Memory for Archiving Ioannis Koltsidas School of Informatics University of Edinburgh [email protected] Heiko Müller School of Informatics University of Edinburgh
Load Balancing. Load Balancing 1 / 24
Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait
R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
Near Optimal Solutions
Near Optimal Solutions Many important optimization problems are lacking efficient solutions. NP-Complete problems unlikely to have polynomial time solutions. Good heuristics important for such problems.
Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
A Non-Linear Schema Theorem for Genetic Algorithms
A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland
Load Balancing between Computing Clusters
Load Balancing between Computing Clusters Siu-Cheung Chau Dept. of Physics and Computing, Wilfrid Laurier University, Waterloo, Ontario, Canada, NL 3C5 e-mail: [email protected] Ada Wai-Chee Fu Dept. of Computer
Less naive Bayes spam detection
Less naive Bayes spam detection Hongming Yang Eindhoven University of Technology Dept. EE, Rm PT 3.27, P.O.Box 53, 5600MB Eindhoven The Netherlands. E-mail:[email protected] also CoSiNe Connectivity Systems
Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm
R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*
Name Class Date. binomial nomenclature. MAIN IDEA: Linnaeus developed the scientific naming system still used today.
Section 1: The Linnaean System of Classification 17.1 Reading Guide KEY CONCEPT Organisms can be classified based on physical similarities. VOCABULARY taxonomy taxon binomial nomenclature genus MAIN IDEA:
System Interconnect Architectures. Goals and Analysis. Network Properties and Routing. Terminology - 2. Terminology - 1
System Interconnect Architectures CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.4 System Interconnect Architectures Direct networks for static connections Indirect
Scheduling Shop Scheduling. Tim Nieberg
Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations
External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1
External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing
Using Lexical Similarity in Handwritten Word Recognition
Using Lexical Similarity in Handwritten Word Recognition Jaehwa Park and Venu Govindaraju Center of Excellence for Document Analysis and Recognition (CEDAR) Department of Computer Science and Engineering
Introduction to Scheduling Theory
Introduction to Scheduling Theory Arnaud Legrand Laboratoire Informatique et Distribution IMAG CNRS, France [email protected] November 8, 2004 1/ 26 Outline 1 Task graphs from outer space 2 Scheduling
A Business Process Driven Approach for Generating Software Modules
A Business Process Driven Approach for Generating Software Modules Xulin Zhao, Ying Zou Dept. of Electrical and Computer Engineering, Queen s University, Kingston, ON, Canada SUMMARY Business processes
A new binary floating-point division algorithm and its software implementation on the ST231 processor
19th IEEE Symposium on Computer Arithmetic (ARITH 19) Portland, Oregon, USA, June 8-10, 2009 A new binary floating-point division algorithm and its software implementation on the ST231 processor Claude-Pierre
Nucleotides and Nucleic Acids
Nucleotides and Nucleic Acids Brief History 1 1869 - Miescher Isolated nuclein from soiled bandages 1902 - Garrod Studied rare genetic disorder: Alkaptonuria; concluded that specific gene is associated
Approximation Algorithms
Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms
Lempel-Ziv Factorization: LZ77 without Window
Lempel-Ziv Factorization: LZ77 without Window Enno Ohlebusch May 13, 2016 1 Sux arrays To construct the sux array of a string S boils down to sorting all suxes of S in lexicographic order (also known as
Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees
Learning Outcomes COMP202 Complexity of Algorithms Binary Search Trees and Other Search Trees [See relevant sections in chapters 2 and 3 in Goodrich and Tamassia.] At the conclusion of this set of lecture
ONLINE DEGREE-BOUNDED STEINER NETWORK DESIGN. Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015
ONLINE DEGREE-BOUNDED STEINER NETWORK DESIGN Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015 ONLINE STEINER FOREST PROBLEM An initially given graph G. s 1 s 2 A sequence of demands (s i, t i ) arriving
Generating models of a matched formula with a polynomial delay
Generating models of a matched formula with a polynomial delay Petr Savicky Institute of Computer Science, Academy of Sciences of Czech Republic, Pod Vodárenskou Věží 2, 182 07 Praha 8, Czech Republic
Scheduling Single Machine Scheduling. Tim Nieberg
Scheduling Single Machine Scheduling Tim Nieberg Single machine models Observation: for non-preemptive problems and regular objectives, a sequence in which the jobs are processed is sufficient to describe
Scan-Line Fill. Scan-Line Algorithm. Sort by scan line Fill each span vertex order generated by vertex list
Scan-Line Fill Can also fill by maintaining a data structure of all intersections of polygons with scan lines Sort by scan line Fill each span vertex order generated by vertex list desired order Scan-Line
Euclidean Minimum Spanning Trees Based on Well Separated Pair Decompositions Chaojun Li. Advised by: Dave Mount. May 22, 2014
Euclidean Minimum Spanning Trees Based on Well Separated Pair Decompositions Chaojun Li Advised by: Dave Mount May 22, 2014 1 INTRODUCTION In this report we consider the implementation of an efficient
