# The Tree-to-Tree Correction Problem

## Transcription

KUO-CHUNG TAI

North Carolina State University, Raleigh, North Carolina

ABSTRACT. The tree-to-tree correction problem is to determine, for two labeled ordered trees $T$ and $T'$, the distance from $T$ to $T'$ as measured by the minimum cost sequence of edit operations needed to transform $T$ into $T'$. The edit operations investigated allow changing one node of a tree into another node, deleting one node from a tree, or inserting a node into a tree. An algorithm is presented which solves this problem in time $O(V \cdot V' \cdot L^2 \cdot L'^2)$, where $V$ and $V'$ are the numbers of nodes respectively of $T$ and $T'$, and $L$ and $L'$ are the maximum depths respectively of $T$ and $T'$. Possible applications are to the problems of measuring the similarity between trees, automatic error recovery and correction for programming languages, and determining the largest common substructure of two trees.

KEY WORDS AND PHRASES: tree correction, tree modification, tree similarity

CR CATEGORIES: 3.79, 4.12, 4.22, 5.23

### 1. Introduction

The string-to-string correction problem, which is to determine the distance between two strings as measured by the minimum cost sequence of edit operations needed to transform one string into the other, was investigated in [7, 8, 10, 12, 13, 14]. In [13] Wagner and Fischer considered the following three edit operations: changing a character into another character, deleting a character, and inserting a character; and they presented an algorithm that computes the distance between two strings in time $O(m \cdot n)$, where $m$ and $n$ are the lengths of the two given strings. The various attempts to achieve high-dimensional generalizations of strings include trees, graphs, webs, and plex structures. Trees are considered the most important nonlinear structures arising in computer algorithms [6]. Tree automata, tree grammars, and their applications for syntactic pattern recognition have been studied (e.g. [2]).
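
For reference, the string case mentioned above can be sketched as a short dynamic program. The following Python function is an illustrative sketch in the style of the Wagner and Fischer algorithm [13]; the function name, the parameter names, and the default unit costs are assumptions made for this example and do not come from the paper.

```python
def string_edit_distance(a, b, change_cost=1, delete_cost=1, insert_cost=1):
    """Minimum cost of turning string a into string b (illustrative sketch).

    D[i][j] holds the distance from the first i characters of a
    to the first j characters of b, so the table has O(m * n) entries.
    """
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # delete every character of a[:i]
        D[i][0] = D[i - 1][0] + delete_cost
    for j in range(1, n + 1):          # insert every character of b[:j]
        D[0][j] = D[0][j - 1] + insert_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Changing a character into itself costs nothing.
            change = 0 if a[i - 1] == b[j - 1] else change_cost
            D[i][j] = min(
                D[i - 1][j - 1] + change,     # change a[i-1] into b[j-1]
                D[i - 1][j] + delete_cost,    # delete a[i-1]
                D[i][j - 1] + insert_cost,    # insert b[j-1]
            )
    return D[m][n]


print(string_edit_distance("kitten", "sitting"))  # prints 3
```

Each table entry is filled in constant time from three neighbours, which is where the $O(m \cdot n)$ bound comes from.
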
In this paper we define the notion of distance between two labeled ordered trees and present an algorithm that computes the distance in time $O(V \cdot V' \cdot L^2 \cdot L'^2)$, where $V$ and $V'$ are the numbers of nodes of the two trees and $L$ and $L'$ are the maximum depths of the two trees. The set of allowable edit operations includes: (1) changing one node into another node (changing the label of the node); (2) deleting one node from a tree; and (3) inserting a node into a tree. Since each string can be considered a tree of depth two (with a virtual root added), the string-to-string correction problem is just a special case of the tree-to-tree correction problem with $L = L' = 2$. Possible applications of the notion of distance between two trees are discussed at the end of this paper.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. Author's address: Department of Computer Science, North Carolina State University, P.O. Box 5972, Raleigh, NC. Journal of the Association for Computing Machinery, Vol. 26, No. 3, July 1979.

### 2. Edit Operations on Trees

In this paper all trees we discuss are rooted, ordered, and labeled. Let $T$ be a tree. $|T|$
denotes the number of nodes of $T$. $T[i]$ denotes the node of $T$ whose position in the preorder for nodes of $T$ is $i$. The preorder traversal on $T$ is to first visit the root of $T$ and then traverse the subtrees of $T$ from left to right, each subtree traversed in preorder. The following diagram illustrates how nodes of a tree $T$ are denoted:

[Diagram omitted.]

LEMMA 2.1. Assume that $T[i_1]$ and $T[i_2]$ are nodes of $T$ with $T[i_1]$ being an ancestor of $T[i_2]$. Then

(1) $i_1 < i_2$;

(2) For any $i$ such that $i_1 < i \le i_2$, $T[i]$ is a descendant of $T[i_1]$;

(3) For the father $T[i_3]$ of $T[i_2]$, $T[i_3]$ is $T[i_2 - 1]$ or an ancestor of $T[i_2 - 1]$, and $T[i_3]$ is on the path from $T[i_1]$ to $T[i_2 - 1]$.

PROOF. The proof follows from the definition of preorder traversal. Q.E.D.

Let $\Lambda$ denote the null node. An edit operation is written $b \to c$, where each of $b$ and $c$ is either a node or $\Lambda$. $b \to c$ is a change operation if $b \ne \Lambda$ and $c \ne \Lambda$, a delete operation if $b \ne \Lambda = c$, and an insert operation if $b = \Lambda \ne c$. Let $T'$ be the tree that results from the application of an edit operation $b \to c$ to tree $T$; this is written $T \Rightarrow T'$ via $b \to c$. The relations between $T$ and $T'$ are described as follows:

(1) If $b \to c$ is a change operation, then node $b$ is replaced by node $c$, i.e., $T =$ [diagram] and $T' =$ [diagram] for some trees $T_1$ and $T_2$.

(2) If $b \to c$ is a delete operation, then node $b$ is deleted, i.e., $T =$ [diagram] and $T' =$ [diagram] for some trees $T_1$ and $T_2$.

(3) If $b \to c$ is an insert operation, then node $c$ is inserted, i.e., $T =$ [diagram] and $T' =$ [diagram]
for some trees $T_1$ and $T_2$.

Each node of a tree except the root is associated with a unique edge connecting the node and its father. For the deletion or insertion of a node, both the node and its associated edge are included. Without loss of generality it may be assumed that the roots of all trees have the same unique label and that the root of each tree remains unchanged during editing. Examples are given below to illustrate the applications of the three edit operations.

[Diagrams: three example transformations on a tree with root $R$ and nodes including $A$, $E$, and $F$: a change via $B \to C$, a deletion via $B \to \Lambda$, and an insertion via $\Lambda \to C$.]

Let $S$ be a sequence $s_1, s_2, \ldots, s_m$ of edit operations. $S$ transforms tree $T$ to tree $T'$ if there is a sequence of trees $T_0, T_1, \ldots, T_m$ such that $T = T_0$, $T' = T_m$ and $T_{i-1} \Rightarrow T_i$ via $s_i$ for $1 \le i \le m$. Let $r$ be an arbitrary cost function which assigns to each edit operation $b \to c$ a nonnegative real number $r(b \to c)$. Extend $r$ to a sequence of edit operations $S = s_1, s_2, \ldots, s_m$ by letting $r(S) = \sum_{i=1}^{m} r(s_i)$. Without loss of generality it may be assumed that $r(b \to b) = 0$ and that $r(b \to a) + r(a \to c) \ge r(b \to c)$. The distance $d(T, T')$ from tree $T$ to tree $T'$ is defined to be the minimum cost of all sequences of edit operations which transform $T$ into $T'$, i.e.,

$$d(T, T') = \min\{r(S) \mid S \text{ is a sequence of edit operations which transforms } T \text{ into } T'\}.$$

The number of different sequences of edit operations which transform $T$ into $T'$ is infinite. Therefore, it is impossible to enumerate all valid sequences and find the minimum cost. In the next section, structures called mappings are defined so that $d(T, T')$ can be computed in polynomial time.

### 3. Mappings

Intuitively, a mapping is a description of how a sequence of edit operations transforms $T$ into $T'$, ignoring the order in which edit operations are applied. Consider the following diagram:

[Diagram: trees $T$ and $T'$ with dotted lines joining some nodes $T[i]$ to nodes $T'[j]$.]

A dotted line from $T[i]$ to $T'[j]$ indicates that $T[i]$ should be changed to $T'[j]$ if $T[i] \ne T'[j]$, or that $T[i]$ remains unchanged, but becomes $T'[j]$, if $T[i] = T'[j]$. Nodes of $T$ not touched by dotted lines are to be deleted and nodes of $T'$ not touched are to be inserted. Such a diagram is called a mapping, which shows a way of transforming $T$ to $T'$. Formally, we define a triple $(M, T, T')$ to be a mapping from $T$ to $T'$, where $M$ is any set of pairs of integers $(i, j)$ satisfying:

(1) $1 \le i \le |T|$, $1 \le j \le |T'|$;

(2) For any two pairs $(i_1, j_1)$ and $(i_2, j_2)$ in $M$,

(a) $i_1 = i_2$ iff $j_1 = j_2$;

(b) $i_1 < i_2$ iff $j_1 < j_2$;

(c) $T[i_1]$ is an ancestor (descendant) of $T[i_2]$ iff $T'[j_1]$ is an ancestor (descendant) of $T'[j_2]$.

Where there is no resulting confusion, we do not distinguish between the triple $(M, T, T')$ and the set $M$. Each pair $(i, j)$ in $M$ is interpreted to be a line segment joining $T[i]$ and $T'[j]$. Condition (2a) ensures that each node of both $T$ and $T'$ is touched by at most one line. Conditions (2b) and (2c) ensure that after untouched nodes of both $T$ and $T'$ are deleted, $T$ and $T'$ become similar (i.e., they have the same structure), and the one-to-one correspondence between nodes of $T$ and $T'$ which preserves the structure is indicated exactly by the lines of $M$. By deleting untouched nodes, the preceding diagram becomes:

[Diagram: the trees $T$ and $T'$ of the preceding diagram with their untouched nodes removed.]

Let $M$ be a mapping from $T$ to $T'$ and let $I$ and $J$ be the sets of untouched nodes of $T$ and $T'$ respectively. We define the cost of $M$:

$$\mathrm{cost}(M) = \sum_{(i,j) \in M} r(T[i] \to T'[j]) + \sum_{i \in I} r(T[i] \to \Lambda) + \sum_{j \in J} r(\Lambda \to T'[j]).$$

Thus, the cost of $M$ is just the cost of the sequence of edit operations which consists of a change operation $T[i] \to T'[j]$ for each line $(i, j)$ of $M$, a delete operation $T[i] \to \Lambda$ for each node $T[i]$ not touched by a line of $M$, and an insert operation $\Lambda \to T'[j]$ for each node $T'[j]$ not touched by a line of $M$. In order to show that $d(T, T')$ can be determined by a minimum cost mapping from $T$ to $T'$, the following two lemmas need to be verified:

LEMMA 3.1. Let $M_1$ be a mapping from $T_1$ to $T_2$ and let $M_2$ be a mapping from $T_2$ to $T_3$. Then:

(1) $M_1 \circ M_2 = \{(i, k) \mid (i, j) \in M_1 \text{ and } (j, k) \in M_2 \text{ for some } j\}$ is a mapping from $T_1$ to $T_3$;

(2) $\mathrm{cost}(M_1 \circ M_2) \le \mathrm{cost}(M_1) + \mathrm{cost}(M_2)$.

PROOF. (1) Let $(i_1, k_1)$ and $(i_2, k_2)$ be two lines of $M_1 \circ M_2$. Then there exist $j_1$ and $j_2$ such that $(i_1, j_1), (i_2, j_2) \in M_1$ and $(j_1, k_1), (j_2, k_2) \in M_2$.
By the definition of a mapping:

(a) ($i_1 = i_2$ iff $j_1 = j_2$) and ($j_1 = j_2$ iff $k_1 = k_2$);

(b) ($i_1 < i_2$ iff $j_1 < j_2$) and ($j_1 < j_2$ iff $k_1 < k_2$);

(c) ($T_1[i_1]$ is an ancestor (descendant) of $T_1[i_2]$ iff $T_2[j_1]$ is an ancestor (descendant) of $T_2[j_2]$), and ($T_2[j_1]$ is an ancestor (descendant) of $T_2[j_2]$ iff $T_3[k_1]$ is an ancestor (descendant) of $T_3[k_2]$).

Therefore, $M_1 \circ M_2$ is a mapping from $T_1$ to $T_3$.

(2) The proof that $\mathrm{cost}(M_1 \circ M_2) \le \mathrm{cost}(M_1) + \mathrm{cost}(M_2)$ follows closely the proof of Lemma 1 in [13] and is omitted. This proof relies on the assumption that $r(b \to c) \le r(b \to a) + r(a \to c)$. Q.E.D.

LEMMA 3.2. For any sequence $S$ of edit operations which transforms $T$ into $T'$, there exists a mapping $M$ from $T$ to $T'$ such that $\mathrm{cost}(M) \le r(S)$.

PROOF. It can be shown by induction on $m$ that if $S = s_1, s_2, \ldots, s_m$ is a sequence of edit operations and $T_0, T_1, \ldots, T_m$ is the sequence of trees from $T_0$ to $T_m$ via $S$, then there
exists a mapping $M$ from $T_0$ to $T_m$ such that $\mathrm{cost}(M) \le r(S)$. This proof is similar to the proof of Theorem 1 in [13] and is omitted. Q.E.D.

Since $d(T, T') = \min\{r(S) \mid S \text{ is a sequence of edit operations which transforms } T \text{ into } T'\}$, we have:

THEOREM 3.1. $d(T, T') = \min\{\mathrm{cost}(M) \mid M \text{ is a mapping from } T \text{ to } T'\}$.

Hence the search for a minimal cost sequence of edit operations has been reduced to a search for a minimal cost mapping. Let $T(i_1 : i_2)$ denote the portion of $T$ which consists of nodes $T[i_1], T[i_1 + 1], \ldots, T[i_2]$, and their associated edges, where $i_1$ is an ancestor of $i_2$; and similarly define $T'(j_1 : j_2)$. Let $D(i+1, j+1)$ be the distance from tree $T(1 : i+1)$ to tree $T'(1 : j+1)$, i.e., $D(i+1, j+1) = d(T(1 : i+1), T'(1 : j+1))$. Consider the following diagram:

[Diagram: the trees $T(1 : i+1)$ and $T'(1 : j+1)$, with roots $T[1]$ and $T'[1]$.]

Suppose $M$ is a minimum cost mapping from $T(1 : i+1)$ to $T'(1 : j+1)$, i.e., $\mathrm{cost}(M) = D(i+1, j+1)$. Then at least one of the following three cases must hold:

Case 1: $T[i+1]$ is not touched by a line of $M$. Then $\mathrm{cost}(M) = D(i, j+1) + r(T[i+1] \to \Lambda)$, corresponding to the cost of transforming $T(1 : i)$ to $T'(1 : j+1)$ plus the cost of deleting $T[i+1]$.

Case 2: $T'[j+1]$ is not touched by a line of $M$. Then $\mathrm{cost}(M) = D(i+1, j) + r(\Lambda \to T'[j+1])$, corresponding to the cost of transforming $T(1 : i+1)$ to $T'(1 : j)$ plus the cost of inserting $T'[j+1]$.

Case 3: $T[i+1]$ and $T'[j+1]$ are touched by lines $(i+1, q)$ and $(p, j+1)$ of $M$. By the definition of a mapping, ($i+1 = p$ iff $q = j+1$) and ($i+1 > p$ iff $q > j+1$). Since $i+1 \ge p$ and $j+1 \ge q$, $i+1 = p$ and $j+1 = q$. Thus, $T[i+1]$ and $T'[j+1]$ are both touched by the same line $(i+1, j+1)$. Note that if $M' = M - \{(i+1, j+1)\}$ is not a minimum cost mapping from $T(1 : i)$ to $T'(1 : j)$, then $\mathrm{cost}(M) = \mathrm{cost}(M') + r(T[i+1] \to T'[j+1]) > D(i, j) + r(T[i+1] \to T'[j+1])$.
Thus, it may be true that $D(i+1, j+1) > D(i, j) + r(T[i+1] \to T'[j+1])$, for a minimum cost mapping from $T(1 : i)$ to $T'(1 : j)$ may not be extendable to a mapping by adding $(i+1, j+1)$. For example, consider the following trees $T_1$ and $T_2$:

[Diagram: $T_1$, a tree with three nodes, and $T_2$, a tree with four nodes; the node labels include $A$, $B$, $C$, and $D$.]

Assume that the cost of each change, delete, or insert operation is 1 for all nodes. $\{(1, 1), (2, 2)\}$ is a minimum cost mapping from $T_1(1 : 2)$ to $T_2(1 : 3)$. However, $\{(1, 1), (2, 2), (3, 4)\}$ is not a mapping from $T_1(1 : 3)$ to $T_2(1 : 4)$ because $T_1[2]$ is an ancestor of $T_1[3]$ but $T_2[2]$ is not an ancestor of $T_2[4]$. Define

$$\mathrm{MIN\_M}(i+1, j+1) = \min\{\mathrm{cost}(M) \mid M \text{ is a mapping from } T(1 : i+1) \text{ to } T'(1 : j+1) \text{ such that } (i+1, j+1) \in M\}.$$

From the above argument we have the following theorem:

THEOREM 3.2.

$$D(i+1, j+1) = \min\{D(i, j+1) + r(T[i+1] \to \Lambda),\ D(i+1, j) + r(\Lambda \to T'[j+1]),\ \mathrm{MIN\_M}(i+1, j+1)\}$$

for all $i, j$, $1 \le i < |T|$, $1 \le j < |T'|$.

For the special case that $\mathrm{MIN\_M}(i+1, j+1) = D(i, j) + r(T[i+1] \to T'[j+1])$ for all $i, j$, $1 \le i < |T|$, and $1 \le j < |T'|$, $D(|T|, |T'|)$ can be computed in time $O(|T| \cdot |T'|)$ (see [13]). The computation of $\mathrm{MIN\_M}(i+1, j+1)$ is not trivial. In the next section we explore some properties of mappings and then show that $\mathrm{MIN\_M}(i+1, j+1)$ can be computed in polynomial time. With the assumption that the root of any tree remains unchanged during editing, every mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ contains $(1, 1)$ and $D(1, 1) = 0$.

LEMMA 3.3. $D(1, 1) = 0$,

$$D(i, 1) = \sum_{k=2}^{i} r(T[k] \to \Lambda), \quad 1 < i \le |T|,$$

and

$$D(1, j) = \sum_{k=2}^{j} r(\Lambda \to T'[k]), \quad 1 < j \le |T'|.$$

### 4. Computation of $\mathrm{MIN\_M}(i+1, j+1)$

We first show that a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ may be decomposed into submappings such that each submapping is a mapping from one portion of $T(1 : i+1)$ to one portion of $T'(1 : j+1)$. Consequently, for a minimum cost mapping, each of its submappings is also a minimum cost mapping.

LEMMA 4.1. Assume that $T[p]$ and $T'[q]$ are ancestors of $T[i+1]$ and $T'[j+1]$, respectively, and that $M$ is a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ such that $(p, q)$ and $(i+1, j+1)$ are in $M$. Let $M_1$ and $M_2$ be the subsets of $M$ defined by

$$M_1 = \{(m, n) \mid (m, n) \in M,\ 1 \le m \le p \text{ and } 1 \le n \le q\}$$

and

$$M_2 = \{(m, n) \mid (m, n) \in M,\ p \le m \le i \text{ and } q \le n \le j\}.$$

Then

(1) $M_1$ is a mapping from $T(1 : p)$ to $T'(1 : q)$; $M_2$ is a mapping from $T(p : i)$ to $T'(q : j)$; $M = M_1 \cup M_2 \cup \{(i+1, j+1)\}$; and $\mathrm{cost}(M) = \mathrm{cost}(M_1) + \mathrm{cost}(M_2) - r(T[p] \to T'[q]) + r(T[i+1] \to T'[j+1])$.
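(2) $\mathrm{cost}(M) = \min\{\mathrm{cost}(M') \mid M' \text{ is a mapping from } T(1 : i+1) \text{ to } T'(1 : j+1) \text{ such that } (p, q) \text{ and } (i+1, j+1) \text{ are in } M'\}$ iff $\mathrm{cost}(M_1) = \min\{\mathrm{cost}(M_1') \mid M_1' \text{ is a mapping from } T(1 : p) \text{ to } T'(1 : q) \text{ such that } (p, q) \in M_1'\}$ and $\mathrm{cost}(M_2) = \min\{\mathrm{cost}(M_2') \mid M_2' \text{ is a mapping from } T(p : i) \text{ to } T'(q : j) \text{ such that } (p, q) \in M_2' \text{ and } M_2' \cup \{(i+1, j+1)\} \text{ is a mapping}\}$.

PROOF. Consider the following diagram:

[Diagram: the trees $T(1 : i+1)$ and $T'(1 : j+1)$.]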

(1) For each $(m, n)$ in $M$, $m \ge p$ if and only if $n \ge q$. Therefore, $M = M_1 \cup M_2 \cup \{(i+1, j+1)\}$ and $M_1$ and $M_2$ have exactly one common line, $(p, q)$.

(2) Only if: Assume that $\mathrm{cost}(M_1) > \min\{\mathrm{cost}(M_1') \mid M_1' \text{ is a mapping from } T(1 : p) \text{ to } T'(1 : q) \text{ with } (p, q) \text{ in } M_1'\} = \mathrm{cost}(M_1'')$, where $M_1''$ is a mapping from $T(1 : p)$ to $T'(1 : q)$ with $(p, q)$ in $M_1''$. Then $M'' = M_1'' \cup M_2 \cup \{(i+1, j+1)\}$ is a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ with $(p, q)$ and $(i+1, j+1)$ in $M''$, and $\mathrm{cost}(M'') < \mathrm{cost}(M)$. Thus $\mathrm{cost}(M) > \min\{\mathrm{cost}(M') \mid M' \text{ is a mapping from } T(1 : i+1) \text{ to } T'(1 : j+1) \text{ with } (p, q) \text{ and } (i+1, j+1) \text{ in } M'\}$. The same conclusion holds if $\mathrm{cost}(M_2) > \min\{\mathrm{cost}(M_2') \mid M_2' \text{ is a mapping from } T(p : i) \text{ to } T'(q : j) \text{ with } (p, q) \text{ in } M_2' \text{ and } M_2' \cup \{(i+1, j+1)\} \text{ is a mapping}\}$.

If: The proof is similar to the "only if" proof. Q.E.D.

Assume that $M$ is a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ with $(i+1, j+1)$ in $M$. Let $T[s_M]$ and $T'[t_M]$ be the latest ancestors of $T[i+1]$ and $T'[j+1]$, respectively, touched by lines of $M$. (It has been assumed that every mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ contains the line $(1, 1)$, and therefore $T[s_M]$ and $T'[t_M]$ must exist.) Since $(i+1, j+1)$ is in $M$, $(s_M, t_M)$ is in $M$. The following diagram illustrates the meaning of nodes $T[s_M]$ and $T'[t_M]$:

[Diagram: the trees $T(1 : i+1)$ and $T'(1 : j+1)$, with slashes marking untouched nodes on the paths below $T[s_M]$ and $T'[t_M]$.]

In the above diagram, $f(x)$ denotes the father of node $x$. By Lemma 2.1, $T[f(i+1)]$ ($T'[f(j+1)]$) is on the path from $T[s_M]$ ($T'[t_M]$) to $T[i]$ ($T'[j]$). Slashes crossing the line from $s_M$ to $f(i+1)$ (not including $s_M$) indicate that any descendant of $T[s_M]$ on the path from $T[s_M]$ to $T[f(i+1)]$ is not touched by any line of $M$. Slashes in $T'(1 : j+1)$ are defined similarly.

LEMMA 4.2.
$$
\begin{aligned}
\mathrm{MIN\_M}(i+1, j+1) = {} & r(T[i+1] \to T'[j+1]) \\
& + \min_{s,t} \{\, \min\{\mathrm{cost}(M_1) \mid M_1 \text{ is a mapping from } T(1 : s) \text{ to } T'(1 : t) \text{ with } (s, t) \text{ in } M_1\} \\
& \quad + \min\{\mathrm{cost}(M_2) \mid M_2 \text{ is a mapping from } T(s : i) \text{ to } T'(t : j) \text{ such that } (s, t) \text{ is in } M_2 \\
& \qquad \text{and any descendant of } T[s]\ (T'[t]) \text{ on the path from } T[s]\ (T'[t]) \text{ to } T[f(i+1)]\ (T'[f(j+1)]) \\
& \qquad \text{is not touched by any line of } M_2\} \\
& \quad - r(T[s] \to T'[t]) \,\}
\end{aligned}
$$

where $T[s]$ and $T'[t]$ are ancestors of $T[i+1]$ and $T'[j+1]$, respectively.

PROOF. Let $R$ denote the right side of the above formula and let $L$ denote $\mathrm{MIN\_M}(i+1, j+1)$, which is $\min\{\mathrm{cost}(M) \mid M \text{ is a mapping from } T(1 : i+1) \text{ to } T'(1 : j+1) \text{ with } (i+1, j+1) \text{ in } M\}$.

Assume that $M$ is a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$ with $(i+1, j+1)$ in $M$. Then $M$ contains $(s_M, t_M)$, where $T[s_M]$ and $T'[t_M]$ are the latest ancestors of $T[i+1]$ and $T'[j+1]$, respectively, touched by lines of $M$. By Lemma 4.1, $M$ can be decomposed into submappings $M_1$ and $M_2$ such that $M_1$ is a mapping from $T(1 : s_M)$ to $T'(1 : t_M)$, $M_2$ is a
mapping from $T(s_M : i)$ to $T'(t_M : j)$, and $M = M_1 \cup M_2 \cup \{(i+1, j+1)\}$. It follows that $L \ge R$.

Assume that $T[s]$ and $T'[t]$ are ancestors of $T[i+1]$ and $T'[j+1]$, respectively. Let $M_1$ be a mapping from $T(1 : s)$ to $T'(1 : t)$ with $(s, t)$ in $M_1$ and let $M_2$ be a mapping from $T(s : i)$ to $T'(t : j)$ such that $(s, t)$ is in $M_2$ and any descendant of $T[s]$ ($T'[t]$) on the path from $T[s]$ ($T'[t]$) to $T[f(i+1)]$ ($T'[f(j+1)]$) is not touched by any line of $M_2$. Then $M = M_1 \cup M_2 \cup \{(i+1, j+1)\}$ is a mapping from $T(1 : i+1)$ to $T'(1 : j+1)$. Therefore $L \le R$. From $L \ge R$ and $L \le R$, it follows that $L = R$. Q.E.D.

Assume that $s \le u \le i$, $t \le v \le j$, $T[u]$ is on the path from $T[s]$ to $T[i]$, and $T'[v]$ is on the path from $T'[t]$ to $T'[j]$. Define $E[s : u : i,\ t : v : j]$ to be $\min\{\mathrm{cost}(M) \mid M$ is a mapping from $T(s : i)$ to $T'(t : j)$ such that $(s, t)$ is in $M$ and no descendant of $T[s]$ ($T'[t]$) on the path from $T[s]$ ($T'[t]$) to $T[u]$ ($T'[v]$) is touched by a line of $M\}$. Then we have the following theorem from Lemma 4.2:

THEOREM 4.1.

$$\mathrm{MIN\_M}(i+1, j+1) = r(T[i+1] \to T'[j+1]) + \min_{s,t} \{\mathrm{MIN\_M}(s, t) + E[s : f(i+1) : i,\ t : f(j+1) : j] - r(T[s] \to T'[t])\}$$

where $T[s]$ and $T'[t]$ are ancestors of $T[i+1]$ and $T'[j+1]$, respectively.

Now the remaining problem is how to compute $E[s : u : i,\ t : v : j]$, where $s \le u \le i$, $t \le v \le j$, and $T[u]$ ($T'[v]$) is on the path from $T[s]$ ($T'[t]$) to $T[i]$ ($T'[j]$). First, assume that $s \le u < i$ and $t \le v < j$. Then $T[u]$ ($T'[v]$) has a son on the path from $T[u]$ ($T'[v]$) to $T[i]$ ($T'[j]$). Consider the following diagram:

[Diagram: the trees $T(s : i)$ and $T'(t : j)$, with roots $T[s]$ and $T'[t]$.]

where $T[x]$ ($T'[y]$) is the son of $T[u]$ ($T'[v]$) on the path from $T[u]$ ($T'[v]$) to $T[i]$ ($T'[j]$).

LEMMA 4.3. Assume that $s \le u < i$ and $t \le v < j$. Then

$$E[s : u : i,\ t : v : j] = \min\{E[s : x : i,\ t : v : j],\ E[s : u : i,\ t : y : j],\ E[s : u : x-1,\ t : v : y-1] + E[x : x : i,\ y : y : j]\}.$$

PROOF.
Let $M$ be a mapping from $T(s : i)$ to $T'(t : j)$ such that $\mathrm{cost}(M) = E[s : u : i,\ t : v : j]$, $(s, t)$ is in $M$, and no descendant of $T[s]$ ($T'[t]$) on the path from $T[s]$ ($T'[t]$) to $T[u]$ ($T'[v]$) is touched by a line of $M$. Then at least one of the following three cases must hold:

Case 1: $T[x]$ is not touched by a line of $M$. Then $E[s : u : i,\ t : v : j] = E[s : x : i,\ t : v : j]$.

Case 2: $T'[y]$ is not touched by a line of $M$. Then $E[s : u : i,\ t : v : j] = E[s : u : i,\ t : y : j]$.

Case 3: $T[x]$ and $T'[y]$ are touched by lines $(x, q)$ and $(p, y)$ of $M$. Assume that $p > x$. Since $T[i]$ is a descendant of $T[x]$ and $i \ge p > x$, by Lemma 2.1 $T[p]$ is a descendant of $T[x]$. By the definition of a mapping, $y > q$ and $T'[y]$ is a descendant of $T'[q]$. Thus, $T'[q]$ is a descendant of $T'[t]$ on the path from $T'[t]$ to $T'[v]$. However, this contradicts the assumption that no descendant of $T'[t]$ on the path from $T'[t]$ to $T'[v]$ is touched by a line of $M$. Also, $p < x$ will cause a similar contradiction. Therefore, $p = x$ and $q = y$. Let $M_1$ and $M_2$ be defined by

$$M_1 = \{(m, n) \mid (m, n) \text{ is in } M,\ m < x \text{ and } n < y\}$$

and

$$M_2 = \{(m, n) \mid (m, n) \text{ is in } M,\ m \ge x \text{ and } n \ge y\}.$$

Then $M_1$ is a mapping from $T(s : x-1)$ to $T'(t : y-1)$, $M_2$ is a mapping from $T(x : i)$ to $T'(y : j)$, and $\mathrm{cost}(M) = \mathrm{cost}(M_1) + \mathrm{cost}(M_2)$. Since $\mathrm{cost}(M) = E[s : u : i,\ t : v : j]$, it follows that $\mathrm{cost}(M_1) = E[s : u : x-1,\ t : v : y-1]$ and $\mathrm{cost}(M_2) = E[x : x : i,\ y : y : j]$. (Since $T[u]$ is the father of $T[x]$, by Lemma 2.1 $T[u]$ is on the path from $T[s]$ to $T[x-1]$ and likewise for $T'[v]$.) Therefore, $E[s : u : i,\ t : v : j] = E[s : u : x-1,\ t : v : y-1] + E[x : x : i,\ y : y : j]$. Q.E.D.

To compute $E[s : u : i,\ t : v : j]$, we now consider the cases in which one or both of $T[x]$ and $T'[y]$ do not exist, i.e., $u = i$ or $v = j$. Let $M$ be a mapping from $T(s : i)$ to $T'(t : j)$ such that $\mathrm{cost}(M) = E[s : u : i,\ t : v : j]$, $(s, t)$ is in $M$, and no descendant of $T[s]$ ($T'[t]$) on the path from $T[s]$ ($T'[t]$) to $T[u]$ ($T'[v]$) is touched by a line of $M$.

Case 1: $u = i$ and $v < j$. There are two subcases to be considered:

[Diagram: the trees $T(s : i)$ and $T'(t : j)$, with roots $T[s]$ and $T'[t]$.]

(a) $s = u = i$. Then $T(s : i)$ contains exactly one node, $T[s]$, and no descendant of $T'[t]$ is touched by a line of $M$. By Lemma 2.1, $T'[f(j)]$ is on the path from $T'[t]$ to $T'[j-1]$. Therefore, $E[s : u : i,\ t : v : j] = E[s : u : i,\ t : f(j) : j-1] + r(\Lambda \to T'[j])$.

(b) $s < u = i$. By Lemma 2.1, $T[f(i)]$ is on the path from $T[s]$ to $T[i-1]$. Since no descendant of $T[s]$ on the path from $T[s]$ to $T[f(i)]$ is touched by a line of $M$, it follows that $E[s : u : i,\ t : v : j] = E[s : f(i) : i-1,\ t : v : j] + r(T[i] \to \Lambda)$.

Case 2: $u < i$ and $v = j$. This case is similar to Case 1.

(a) $t = v = j$. Then $E[s : u : i,\ t : v : j] = E[s : f(i) : i-1,\ t : v : j] + r(T[i] \to \Lambda)$.

(b) $t < v = j$. Then $E[s : u : i,\ t : v : j] = E[s : u : i,\ t : f(j) : j-1] + r(\Lambda \to T'[j])$.

Case 3: $u = i$ and $v = j$. There are four subcases to be considered:

[Diagram: the trees $T(s : i)$ and $T'(t : j)$, with roots $T[s]$ and $T'[t]$.]

(a) $s = u = i$ and $t = v = j$. Then $E[s : u : i,\ t : v : j] = r(T[i] \to T'[j])$.

(b) $s = u = i$ and $t < v = j$. Then $E[s : u : i,\ t : v : j] = E[s : u : i,\ t : f(j) : j-1] + r(\Lambda \to T'[j])$.

(c) $s < u = i$ and $t = v = j$. Then $E[s : u : i,\ t : v : j] = E[s : f(i) : i-1,\ t : v : j] + r(T[i] \to \Lambda)$.

(d) $s < u = i$ and $t < v = j$. Then neither $T[i]$ nor $T'[j]$ is touched by a line of $M$. Therefore, $E[s : u : i,\ t : v : j] = E[s : f(i) : i-1,\ t : v : j] + r(T[i] \to \Lambda)$, or $E[s : u : i,\ t : f(j) : j-1] + r(\Lambda \to T'[j])$, or $E[s : f(i) : i-1,\ t : f(j) : j-1] + r(T[i] \to \Lambda) + r(\Lambda \to T'[j])$.

Although eight subcases are considered in Cases 1, 2, and 3, only four different formulas are used.

### 5. An Algorithm for the Tree-to-Tree Correction Problem

Based on the results shown in Sections 3 and 4, an algorithm is presented which computes the distance from tree $T$ to tree $T'$ in polynomial time. This algorithm consists of the following three steps:

(1) Compute $E[s : u : i,\ t : v : j]$ for all $s, u, i, t, v, j$, where $1 \le i \le |T|$, $1 \le j \le |T'|$, $T[u]$ ($T'[v]$) is on the path from $T[1]$ ($T'[1]$) to $T[i]$ ($T'[j]$), $T[s]$ ($T'[t]$) is on the path from $T[1]$ ($T'[1]$) to $T[u]$ ($T'[v]$);

(2) Compute $\mathrm{MIN\_M}(i, j)$ for all $i, j$, where $1 \le i \le |T|$ and $1 \le j \le |T'|$;

(3) Compute $D(i, j)$ for all $i, j$, where $1 \le i \le |T|$ and $1 \le j \le |T'|$.

Define $f^n(x) = f(f^{n-1}(x))$ for $n \ge 1$ and $x > 1$, where $f(x)$ is the father of node $x$, and $f^0(x) = x$. The following is an algorithm for step (1):

    for i = 1, 2, ..., |T| do
      for j = 1, 2, ..., |T'| do
        for u = i, f(i), f^2(i), ..., 1 do
          for s = u, f(u), f^2(u), ..., 1 do
            for v = j, f(j), f^2(j), ..., 1 do
              for t = v, f(v), f^2(v), ..., 1 do
                if s = u = i ∧ t = v = j then
                  E[s:u:i, t:v:j] = r(T[i] → T'[j])
                else if s = u = i ∨ t < v = j then
                  E[s:u:i, t:v:j] = E[s:u:i, t:f(j):j-1] + r(Λ → T'[j])
                else if s < u = i ∨ t = v = j then
                  E[s:u:i, t:v:j] = E[s:f(i):i-1, t:v:j] + r(T[i] → Λ)
                else
                  E[s:u:i, t:v:j] = min(E[s:x:i, t:v:j], E[s:u:i, t:y:j],
                                        E[s:u:x-1, t:v:y-1] + E[x:x:i, y:y:j])

($T[x]$ is the son of $T[u]$ on the path from $T[u]$ to $T[i]$, and $T'[y]$ is the son of $T'[v]$ on the path from $T'[v]$ to $T'[j]$.)
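
The loops of step (1) range over ancestor chains $i, f(i), f^2(i), \ldots, 1$. As a hedged illustration of how these chains could be obtained, the following Python sketch numbers the nodes of a tree in preorder and records the father of each node. The nested `(label, children)` tuple representation and the names `index_preorder` and `chain_to_root` are assumptions made for this example; the paper does not prescribe a data structure.

```python
def index_preorder(tree):
    """Number the nodes of a (label, children) tree in preorder, from 1.

    Returns (labels, father): 1-indexed lists with a dummy entry at 0.
    father[1] == 0 marks the root (hypothetical convention for this sketch).
    """
    labels, father = [None], [0]

    def visit(node, parent):
        label, children = node
        labels.append(label)
        father.append(parent)
        me = len(labels) - 1            # preorder position of this node
        for child in children:          # left to right, each in preorder
            visit(child, me)

    visit(tree, 0)
    return labels, father


def chain_to_root(father, x):
    """Yield x, f(x), f^2(x), ..., 1."""
    while x >= 1:
        yield x
        x = father[x]


tree = ("R", [("A", [("E", []), ("F", [])]), ("B", [])])
labels, father = index_preorder(tree)
print(labels[1:])                       # ['R', 'A', 'E', 'F', 'B']
print(list(chain_to_root(father, 4)))   # [4, 2, 1]
```

Because a father always precedes its sons in preorder, `father[i] < i` holds for every node, in line with part (1) of Lemma 2.1.
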
The following is an algorithm for step (2):

    MIN_M(1, 1) ← 0;
    for i = 2, 3, ..., |T| do
      for j = 2, 3, ..., |T'| do
      begin
        MIN_M(i, j) ← INFINITE;
        for s = f(i), f^2(i), ..., 1 do
          for t = f(j), f^2(j), ..., 1 do
          begin
            temp ← MIN_M(s, t) + E[s:f(i):i-1, t:f(j):j-1] - r(T[s] → T'[t]);
            MIN_M(i, j) ← min(temp, MIN_M(i, j))
          end;
        MIN_M(i, j) ← MIN_M(i, j) + r(T[i] → T'[j])
      end;

Finally, an algorithm for step (3) is given below:

    D(1, 1) ← 0;
    D(i, 1) ← D(i-1, 1) + r(T[i] → Λ) for i = 2, 3, ..., |T|;
    D(1, j) ← D(1, j-1) + r(Λ → T'[j]) for j = 2, 3, ..., |T'|;
    for i = 2, 3, ..., |T| do
      for j = 2, 3, ..., |T'| do
        D(i, j) ← min(D(i, j-1) + r(Λ → T'[j]), D(i-1, j) + r(T[i] → Λ), MIN_M(i, j));
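
Because the three steps are intricate, it can help to check an implementation on small inputs against a direct search over mappings, which by Theorem 3.1 gives the same distance. The Python sketch below is such a brute-force check, not the algorithm of this section: it runs in exponential time. It assumes constant costs for change, delete, and insert (with a change between equal labels costing nothing), the nested `(label, children)` tuple representation, and that the pair $(1, 1)$ is always in the mapping; all function and parameter names are hypothetical.

```python
def index_tree(tree):
    """Preorder labels and fathers (1-indexed) of a (label, children) tree."""
    labels, father = [None], [0]

    def visit(node, parent):
        label, children = node
        labels.append(label)
        father.append(parent)
        me = len(labels) - 1
        for child in children:
            visit(child, me)

    visit(tree, 0)
    return labels, father


def is_ancestor(father, a, b):
    """True if node a is a proper ancestor of node b."""
    b = father[b]
    while b:
        if b == a:
            return True
        b = father[b]
    return False


def brute_force_distance(t1, t2, r_change=1, r_delete=1, r_insert=1):
    """Minimum cost over all mappings from t1 to t2 (exponential time)."""
    la, fa = index_tree(t1)
    lb, fb = index_tree(t2)
    n1, n2 = len(la) - 1, len(lb) - 1
    best = [float("inf")]

    def cost(pairs):
        changes = sum(0 if la[i] == lb[j] else r_change for i, j in pairs)
        return (changes + r_delete * (n1 - len(pairs))
                + r_insert * (n2 - len(pairs)))

    def extend(i, pairs):
        if i > n1:
            best[0] = min(best[0], cost(pairs))
            return
        extend(i + 1, pairs)                    # T[i] stays untouched
        # Taking j beyond the last matched j keeps conditions (2a) and (2b).
        for j in range(pairs[-1][1] + 1, n2 + 1):
            # Condition (2c): ancestor relations must agree with all pairs.
            if all(is_ancestor(fa, p, i) == is_ancestor(fb, q, j)
                   for p, q in pairs):
                extend(i + 1, pairs + [(i, j)])

    extend(2, [(1, 1)])                         # roots are always paired
    return best[0]


# Illustrative trees (hypothetical): a chain A-B-D, and A with sons B and C(D).
t1 = ("A", [("B", [("D", [])])])
t2 = ("A", [("B", []), ("C", [("D", [])])])
print(brute_force_distance(t1, t2))  # prints 2
```

For these two trees the cheapest mapping is $\{(1, 1), (2, 3), (3, 4)\}$: one change and one insertion, since pairing both $B$ nodes would prevent pairing the $D$ nodes under condition (2c).
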

Now the tree-to-tree correction problem has been completely solved. The following theorem shows the time complexity of the proposed algorithm for computing the distance from one tree to another.

THEOREM 5.1. Given two trees $T$ and $T'$, the proposed algorithm computes the distance from $T$ to $T'$ in time $O(V \cdot V' \cdot L^2 \cdot L'^2)$, where $V$ and $V'$ are the numbers of nodes respectively of $T$ and $T'$, and $L$ and $L'$ are the maximum depths respectively of $T$ and $T'$.

PROOF. Step (1) computes $E[s : u : i,\ t : v : j]$ for all $s, u, i, t, v, j$, where $1 \le i \le V$, $1 \le j \le V'$, $T[u]$ ($T'[v]$) is on the path from $T[1]$ ($T'[1]$) to $T[i]$ ($T'[j]$), $T[s]$ ($T'[t]$) is on the path from $T[1]$ ($T'[1]$) to $T[u]$ ($T'[v]$). Thus, step (1) takes $O(V \cdot V' \cdot L^2 \cdot L'^2)$ time. Step (2) computes $\mathrm{MIN\_M}(i, j)$ for all $i, j$, where $1 \le i \le V$ and $1 \le j \le V'$. Thus, step (2) takes $O(V \cdot V' \cdot L \cdot L')$ time. Step (3) computes $D(i, j)$ for all $i, j$, where $1 \le i \le V$ and $1 \le j \le V'$. Thus, step (3) takes $O(V \cdot V')$ time. From steps (1), (2), and (3), the distance $D(V, V')$ from $T$ to $T'$ can be computed in time $O(V \cdot V' \cdot L^2 \cdot L'^2)$. Q.E.D.

### 6. Conclusion

The notion of distance between two trees can be applied to measuring the similarity between two trees. Since trees have been used for different applications, the similarity between trees can have different interpretations. One possible application is to the problem of syntactic error recovery and correction for programming languages. In [11] it was suggested that for the selection of an error recovery or correction, the similarity between a corrected string and its replacement should be based on the two strings as well as their associated parse trees.

The longest common subsequence problem, which is a special case of the string-to-string correction problem, has received much attention [1, 3, 4, 5, 13]. The notion of longest common subsequence between two strings can be extended for trees as well.
Tree $T''$ is a substructure of tree $T$ if there exists a mapping $M$ from $T''$ to $T$ such that every node of $T''$ is touched by a line of $M$ and for every $(p, q) \in M$, $T''[p] = T[q]$. Tree $T''$ is a common substructure of trees $T$ and $T'$ if $T''$ is a substructure of both $T$ and $T'$. Tree $T''$ is a largest common substructure of $T$ and $T'$ if there is no common substructure of $T$ and $T'$ that has more nodes than $T''$. With constant cost $W_C$, $W_D$, $W_I$ for changing, deleting, and inserting any node, and with $W_C = W_I + W_D$, the tree resulting from $T$ by deleting nodes of $T$ that are changed or deleted during a minimum cost transformation from $T$ to $T'$ is a largest common substructure of $T$ and $T'$.

In [7, 12] the string-to-string correction problem was extended by allowing the operation of interchanging two adjacent characters. Wagner [12] showed that under certain restrictions the extended string-to-string correction problem can be solved in deterministic polynomial time, but the general problem is NP-complete. How to extend the tree-to-tree correction problem by allowing the operation of interchanging two adjacent nodes is currently being investigated.

The tree-to-tree correction problem may also be extendable by modifying the definitions of change, delete, and insert operations. One example is to add the restriction that the delete and insert operations can only be applied to the leaves of trees. For this restricted problem, one more condition should be added to the definition of a mapping $M$: For every $(i, j) \in M$, if $i \ne 1 \ne j$, then $(f(i), f(j)) \in M$. This condition implies that if $T[i]$ and $T'[j]$ are touched by the same line, then they have the same level number. The algorithm presented in Section 5 can be simplified to solve the
restricted problem, but the simplification process is not trivial. In [9] Selkow proposed a simple algorithm which solves this restricted problem in time $O(|T| \cdot |T'|)$.

ACKNOWLEDGMENTS. I would like to thank the referees for their helpful suggestions for improving the readability of this paper. The notion of $E[s : u : i,\ t : v : j]$ was motivated by a suggestion given by one referee. Using this notation, the computation of $\mathrm{MIN\_M}(i, j)$ has been substantially simplified.

REFERENCES

1. AHO, A.V., HIRSCHBERG, D.S., AND ULLMAN, J.D. Bounds on the complexity of the longest common subsequence problem. J. ACM 23, 1 (Jan. 1976).
2. FU, K.S., AND BHARGAVA, B.K. Tree systems for syntactic pattern recognition. IEEE Trans. Comptrs. C-22, 12 (Dec. 1973).
3. HIRSCHBERG, D.S. A linear space algorithm for computing maximal common subsequences. Comm. ACM 18, 6 (June 1975).
4. HIRSCHBERG, D.S. Algorithms for the longest common subsequence problem. J. ACM 24, 4 (Oct. 1977).
5. HUNT, J.W., AND SZYMANSKI, T.G. A fast algorithm for computing longest common subsequences. Comm. ACM 20, 5 (May 1977).
6. KNUTH, D.E. The Art of Computer Programming, Vol. 1: Fundamental Algorithms. Addison-Wesley, Reading, Mass., sec. ed.
7. LOWRANCE, R., AND WAGNER, R.A. An extension of the string-to-string correction problem. J. ACM 22, 2 (April 1975).
8. SANKOFF, D. Matching sequences under deletion/insertion constraints. Proc. Nat. Acad. Sci. USA 69, 1 (Jan. 1974).
9. SELKOW, S.M. The tree-to-tree editing problem. Inform. Processing Letters 6, 6 (Dec. 1977).
10. SELLERS, P.H. An algorithm for the distance between two finite sequences. J. Combin. Theory, Ser. A, 16 (1974).
11. TAI, K.C. Syntactic error correction in programming languages. Ph.D. Th., Dept. Comptr. Sci., Cornell U., Ithaca, N.Y.
12. WAGNER, R.A. On the complexity of the extended string-to-string correction problem. Proc. Seventh Annual ACM Symp. on Theory of Comptng., Albuquerque, New Mex., 1975.
13. WAGNER, R.A., AND FISCHER, M.J. The string-to-string correction problem. J. ACM 21, 1 (Jan. 1974).
14. WONG, C.K., AND CHANDRA, A.K. Bounds for the string editing problem. J. ACM 23, 1 (Jan. 1976).

RECEIVED MARCH 1977; REVISED JANUARY 1979


### Any set with a (unique) η-representation is Σ 3, and any set with a strong η-representation is

η-representation OF SETS AND DEGREES KENNETH HARRIS Abstract. We show that a set has an η-representation in a linear order if and only if it is the range of a 0 -computable limitwise monotonic function.

### Chapter 4. Trees. 4.1 Basics

Chapter 4 Trees 4.1 Basics A tree is a connected graph with no cycles. A forest is a collection of trees. A vertex of degree one, particularly in a tree, is called a leaf. Trees arise in a variety of applications.

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

### 136 CHAPTER 4. INDUCTION, GRAPHS AND TREES

136 TER 4. INDUCTION, GRHS ND TREES 4.3 Graphs In this chapter we introduce a fundamental structural idea of discrete mathematics, that of a graph. Many situations in the applications of discrete mathematics

### Approximation Algorithms. Scheduling. Approximation algorithms. Scheduling jobs on a single machine

Approximation algorithms Approximation Algorithms Fast. Cheap. Reliable. Choose two. NP-hard problems: choose 2 of optimal polynomial time all instances Approximation algorithms. Trade-off between time

### Gheorghe Antonoiu and Pradip K Srimani Department of Computer Science Colorado State University Ft. Collins, CO Technical Report CS

Computer Science Technical Report A Self-Stabilizing Leader Election Algorithm for Tree Graphs Gheorghe Antonoiu and Pradip K Srimani Department of Computer Science Colorado State University Ft. Collins,

### OPTIMAL SELECTION BASED ON RELATIVE RANK* (the "Secretary Problem")

OPTIMAL SELECTION BASED ON RELATIVE RANK* (the "Secretary Problem") BY Y. S. CHOW, S. MORIGUTI, H. ROBBINS AND S. M. SAMUELS ABSTRACT n rankable persons appear sequentially in random order. At the ith

### Solutions to Homework 6

Solutions to Homework 6 Debasish Das EECS Department, Northwestern University ddas@northwestern.edu 1 Problem 5.24 We want to find light spanning trees with certain special properties. Given is one example

### 8.1 Min Degree Spanning Tree

CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree

### Dynamic Programming. Applies when the following Principle of Optimality

Dynamic Programming Applies when the following Principle of Optimality holds: In an optimal sequence of decisions or choices, each subsequence must be optimal. Translation: There s a recursive solution.

### 1. What s wrong with the following proofs by induction?

ArsDigita University Month : Discrete Mathematics - Professor Shai Simonson Problem Set 4 Induction and Recurrence Equations Thanks to Jeffrey Radcliffe and Joe Rizzo for many of the solutions. Pasted

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### 1 Basic Definitions and Concepts in Graph Theory

CME 305: Discrete Mathematics and Algorithms 1 Basic Definitions and Concepts in Graph Theory A graph G(V, E) is a set V of vertices and a set E of edges. In an undirected graph, an edge is an unordered

### Full and Complete Binary Trees

Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

### Brendan D. McKay Computer Science Dept., Vanderbilt University, Nashville, Tennessee 37235

139. Proc. 3rd Car. Conf. Comb. & Comp. pp. 139-143 SPANNING TREES IN RANDOM REGULAR GRAPHS Brendan D. McKay Computer Science Dept., Vanderbilt University, Nashville, Tennessee 37235 Let n~ < n2

### [2], [3], which realize a time bound of O(n. e(c + 1)).

SIAM J. COMPUT. Vol. 4, No. 1, March 1975 FINDING ALL THE ELEMENTARY CIRCUITS OF A DIRECTED GRAPH* DONALD B. JOHNSON Abstract. An algorithm is presented which finds all the elementary circuits-of a directed

### Network File Storage with Graceful Performance Degradation

Network File Storage with Graceful Performance Degradation ANXIAO (ANDREW) JIANG California Institute of Technology and JEHOSHUA BRUCK California Institute of Technology A file storage scheme is proposed

### THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER

THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER JASPER DEANTONIO Abstract. This paper is a study of the Turing Degrees, which are levels of incomputability naturally arising from sets of natural numbers.

### Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

### 6.852: Distributed Algorithms Fall, 2009. Class 2

.8: Distributed Algorithms Fall, 009 Class Today s plan Leader election in a synchronous ring: Lower bound for comparison-based algorithms. Basic computation in general synchronous networks: Leader election

### Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007)

MAT067 University of California, Davis Winter 2007 Linear Maps Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) As we have discussed in the lecture on What is Linear Algebra? one of

### CMSC 451: Graph Properties, DFS, BFS, etc.

CMSC 451: Graph Properties, DFS, BFS, etc. Slides By: Carl Kingsford Department of Computer Science University of Maryland, College Park Based on Chapter 3 of Algorithm Design by Kleinberg & Tardos. Graphs

### Lecture 7: Approximation via Randomized Rounding

Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining

Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

### Sample Problems in Discrete Mathematics

Sample Problems in Discrete Mathematics This handout lists some sample problems that you should be able to solve as a pre-requisite to Computer Algorithms Try to solve all of them You should also read

### Labeling outerplanar graphs with maximum degree three

Labeling outerplanar graphs with maximum degree three Xiangwen Li 1 and Sanming Zhou 2 1 Department of Mathematics Huazhong Normal University, Wuhan 430079, China 2 Department of Mathematics and Statistics

### Min-cost flow problems and network simplex algorithm

Min-cost flow problems and network simplex algorithm The particular structure of some LP problems can be sometimes used for the design of solution techniques more efficient than the simplex algorithm.

### Trees and Fundamental Circuits

Trees and Fundamental Circuits Tree A connected graph without any circuits. o must have at least one vertex. o definition implies that it must be a simple graph. o only finite trees are being considered

### Fairness in Routing and Load Balancing

Fairness in Routing and Load Balancing Jon Kleinberg Yuval Rabani Éva Tardos Abstract We consider the issue of network routing subject to explicit fairness conditions. The optimization of fairness criteria

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### On line construction of suffix trees 1

(To appear in ALGORITHMICA) On line construction of suffix trees 1 Esko Ukkonen Department of Computer Science, University of Helsinki, P. O. Box 26 (Teollisuuskatu 23), FIN 00014 University of Helsinki,

### Trees. Definition: A tree is a connected undirected graph with no simple circuits. Example: Which of these graphs are trees?

Section 11.1 Trees Definition: A tree is a connected undirected graph with no simple circuits. Example: Which of these graphs are trees? Solution: G 1 and G 2 are trees both are connected and have no simple

### A. V. Gerbessiotis CS Spring 2014 PS 3 Mar 24, 2014 No points

A. V. Gerbessiotis CS 610-102 Spring 2014 PS 3 Mar 24, 2014 No points Problem 1. Suppose that we insert n keys into a hash table of size m using open addressing and uniform hashing. Let p(n, m) be the

### IT2lxmin (depth( Tt), leaves( T)) x min (depth(t2), leaves(t2))) and space O(Ir, x lt21) compared with

SIAM J. COMPUT. Vol. 18, No. 6, pp. 12451262, December 1989 (C) 1989 Society for Industrial and Applied Mathematics 011 SIMPLE FAST ALGORITHMS FOR THE EDITING DISTANCE BETWEEN TREES AND RELATED PROBLEMS*

### Comparing Leaf and Root Insertion

30 Research Article SACJ, No. 44., December 2009 Comparing Leaf Root Insertion Jaco Geldenhuys, Brink van der Merwe Computer Science Division, Department of Mathematical Sciences, Stellenbosch University,

### Bandwidth Efficient All-to-All Broadcast on Switched Clusters

Bandwidth Efficient All-to-All Broadcast on Switched Clusters Ahmad Faraj Pitch Patarasuk Xin Yuan Blue Gene Software Development Department of Computer Science IBM Corporation Florida State University

### FIND ALL LONGER AND SHORTER BOUNDARY DURATION VECTORS UNDER PROJECT TIME AND BUDGET CONSTRAINTS

Journal of the Operations Research Society of Japan Vol. 45, No. 3, September 2002 2002 The Operations Research Society of Japan FIND ALL LONGER AND SHORTER BOUNDARY DURATION VECTORS UNDER PROJECT TIME

### 1 Definitions. Supplementary Material for: Digraphs. Concept graphs

Supplementary Material for: van Rooij, I., Evans, P., Müller, M., Gedge, J. & Wareham, T. (2008). Identifying Sources of Intractability in Cognitive Models: An Illustration using Analogical Structure Mapping.

### Planar Tree Transformation: Results and Counterexample

Planar Tree Transformation: Results and Counterexample Selim G Akl, Kamrul Islam, and Henk Meijer School of Computing, Queen s University Kingston, Ontario, Canada K7L 3N6 Abstract We consider the problem

### Constant Time Generation of Free Trees

Constant Time Generation of Free Trees Michael J. Dinneen Department of Computer Science University of Auckland Class: Combinatorial Algorithms CompSci 7 Introduction This is a combinatorial algorithms

### Graphs without proper subgraphs of minimum degree 3 and short cycles

Graphs without proper subgraphs of minimum degree 3 and short cycles Lothar Narins, Alexey Pokrovskiy, Tibor Szabó Department of Mathematics, Freie Universität, Berlin, Germany. August 22, 2014 Abstract

### ECEN 5682 Theory and Practice of Error Control Codes

ECEN 5682 Theory and Practice of Error Control Codes Convolutional Codes University of Colorado Spring 2007 Linear (n, k) block codes take k data symbols at a time and encode them into n code symbols.

### DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS ASHER M. KACH, KAREN LANGE, AND REED SOLOMON Abstract. We construct two computable presentations of computable torsion-free abelian groups, one of isomorphism

### Classification/Decision Trees (II)

Classification/Decision Trees (II) Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Right Sized Trees Let the expected misclassification rate of a tree T be R (T ).

### The Student-Project Allocation Problem

The Student-Project Allocation Problem David J. Abraham, Robert W. Irving, and David F. Manlove Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK Email: {dabraham,rwi,davidm}@dcs.gla.ac.uk.

### Discrete Mathematics, Chapter 5: Induction and Recursion

Discrete Mathematics, Chapter 5: Induction and Recursion Richard Mayr University of Edinburgh, UK Richard Mayr (University of Edinburgh, UK) Discrete Mathematics. Chapter 5 1 / 20 Outline 1 Well-founded

### root node level: internal node edge leaf node CS@VT Data Structures & Algorithms 2000-2009 McQuain

inary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from each

### AN ALGORITHM FOR DETERMINING WHETHER A GIVEN BINARY MATROID IS GRAPHIC

AN ALGORITHM FOR DETERMINING WHETHER A GIVEN BINARY MATROID IS GRAPHIC W. T. TUTTE. Introduction. In a recent series of papers [l-4] on graphs and matroids I used definitions equivalent to the following.

### / Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry

600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Steiner Tree and TSP Date: 01/29/15 Scribe: Katie Henry 2.1 Steiner Tree Definition 2.1.1 In the Steiner Tree problem the input

### Basic Notions on Graphs. Planar Graphs and Vertex Colourings. Joe Ryan. Presented by

Basic Notions on Graphs Planar Graphs and Vertex Colourings Presented by Joe Ryan School of Electrical Engineering and Computer Science University of Newcastle, Australia Planar graphs Graphs may be drawn

### INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

### A note on properties for a complementary graph and its tree graph

A note on properties for a complementary graph and its tree graph Abulimiti Yiming Department of Mathematics Institute of Mathematics & Informations Xinjiang Normal University China Masami Yasuda Department

### The Complexity of DC-Switching Problems

arxiv:1411.4369v1 [cs.cc] 17 Nov 2014 The Complexity of DC-Switching Problems Karsten Lehmann 1,2 Alban Grastien 1,2 Pascal Van Hentenryck 2,1 Abstract: This report provides a comprehensive complexity

### Mathematical Induction. Lecture 10-11

Mathematical Induction Lecture 10-11 Menu Mathematical Induction Strong Induction Recursive Definitions Structural Induction Climbing an Infinite Ladder Suppose we have an infinite ladder: 1. We can reach

### Finding and counting given length cycles

Finding and counting given length cycles Noga Alon Raphael Yuster Uri Zwick Abstract We present an assortment of methods for finding and counting simple cycles of a given length in directed and undirected

Lecture 10 Union-Find The union-nd data structure is motivated by Kruskal's minimum spanning tree algorithm (Algorithm 2.6), in which we needed two operations on disjoint sets of vertices: determine whether

### 2. FINDING A SOLUTION

The 7 th Balan Conference on Operational Research BACOR 5 Constanta, May 5, Roania OPTIMAL TIME AND SPACE COMPLEXITY ALGORITHM FOR CONSTRUCTION OF ALL BINARY TREES FROM PRE-ORDER AND POST-ORDER TRAVERSALS

### 5.1 Bipartite Matching

CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson

### Diameter and Treewidth in Minor-Closed Graph Families, Revisited

Algorithmica manuscript No. (will be inserted by the editor) Diameter and Treewidth in Minor-Closed Graph Families, Revisited Erik D. Demaine, MohammadTaghi Hajiaghayi MIT Computer Science and Artificial

### On Data Recovery in Distributed Databases

On Data Recovery in Distributed Databases Sergei L. Bezrukov 1,UweLeck 1, and Victor P. Piotrowski 2 1 Dept. of Mathematics and Computer Science, University of Wisconsin-Superior {sbezruko,uleck}@uwsuper.edu

### Trees. Tree Definitions: Let T = (V, E, r) be a tree. The size of a tree denotes the number of nodes of the tree.

Trees After lists (including stacks and queues) and hash tables, trees represent one of the most widely used data structures. On one hand trees can be used to represent objects that are recursive in nature

### Single machine parallel batch scheduling with unbounded capacity

Workshop on Combinatorics and Graph Theory 21th, April, 2006 Nankai University Single machine parallel batch scheduling with unbounded capacity Yuan Jinjiang Department of mathematics, Zhengzhou University

### CS 598CSC: Combinatorial Optimization Lecture date: 2/4/2010

CS 598CSC: Combinatorial Optimization Lecture date: /4/010 Instructor: Chandra Chekuri Scribe: David Morrison Gomory-Hu Trees (The work in this section closely follows [3]) Let G = (V, E) be an undirected

### Tenacity and rupture degree of permutation graphs of complete bipartite graphs

Tenacity and rupture degree of permutation graphs of complete bipartite graphs Fengwei Li, Qingfang Ye and Xueliang Li Department of mathematics, Shaoxing University, Shaoxing Zhejiang 312000, P.R. China

### Practice Final Exam Solutions

1 (1 points) For the following problem, the alphabet used is the standard -letter English alphabet, and vowels refers only to the letters A, E, I, O, and U (a) ( points) How many strings of five letters

### Catalan Numbers. Thomas A. Dowling, Department of Mathematics, Ohio State Uni- versity.

7 Catalan Numbers Thomas A. Dowling, Department of Mathematics, Ohio State Uni- Author: versity. Prerequisites: The prerequisites for this chapter are recursive definitions, basic counting principles,

### CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992

CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992 Original Lecture #6: 28 January 1991 Topics: Triangulating Simple Polygons

### CARDINALITY, COUNTABLE AND UNCOUNTABLE SETS PART ONE

CARDINALITY, COUNTABLE AND UNCOUNTABLE SETS PART ONE With the notion of bijection at hand, it is easy to formalize the idea that two finite sets have the same number of elements: we just need to verify

### max cx s.t. Ax c where the matrix A, cost vector c and right hand side b are given and x is a vector of variables. For this example we have x

Linear Programming Linear programming refers to problems stated as maximization or minimization of a linear function subject to constraints that are linear equalities and inequalities. Although the study

### Cycles in a Graph Whose Lengths Differ by One or Two

Cycles in a Graph Whose Lengths Differ by One or Two J. A. Bondy 1 and A. Vince 2 1 LABORATOIRE DE MATHÉMATIQUES DISCRÉTES UNIVERSITÉ CLAUDE-BERNARD LYON 1 69622 VILLEURBANNE, FRANCE 2 DEPARTMENT OF MATHEMATICS

### DEGREES OF CATEGORICITY AND THE HYPERARITHMETIC HIERARCHY

DEGREES OF CATEGORICITY AND THE HYPERARITHMETIC HIERARCHY BARBARA F. CSIMA, JOHANNA N. Y. FRANKLIN, AND RICHARD A. SHORE Abstract. We study arithmetic and hyperarithmetic degrees of categoricity. We extend

### The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

CHAPTER 5 The Tree Data Model There are many situations in which information has a hierarchical or nested structure like that found in family trees or organization charts. The abstraction that models hierarchical

### Key words. Mixed-integer programming, mixing sets, convex hull descriptions, lot-sizing.

MIXING SETS LINKED BY BI-DIRECTED PATHS MARCO DI SUMMA AND LAURENCE A. WOLSEY Abstract. Recently there has been considerable research on simple mixed-integer sets, called mixing sets, and closely related

### Scheduling Single Machine Scheduling. Tim Nieberg

Scheduling Single Machine Scheduling Tim Nieberg Single machine models Observation: for non-preemptive problems and regular objectives, a sequence in which the jobs are processed is sufficient to describe

### Diversity Coloring for Distributed Data Storage in Networks 1

Diversity Coloring for Distributed Data Storage in Networks 1 Anxiao (Andrew) Jiang and Jehoshua Bruck California Institute of Technology Pasadena, CA 9115, U.S.A. {jax, bruck}@paradise.caltech.edu Abstract

### SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by

SUBGROUPS OF CYCLIC GROUPS KEITH CONRAD 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by g = {g k : k Z}. If G = g, then G itself is cyclic, with g as a generator. Examples

### Victims Compensation Claim Status of All Pending Claims and Claims Decided Within the Last Three Years

Claim#:021914-174 Initials: J.T. Last4SSN: 6996 DOB: 5/3/1970 Crime Date: 4/30/2013 Status: Claim is currently under review. Decision expected within 7 days Claim#:041715-334 Initials: M.S. Last4SSN: 2957

### Small independent edge dominating sets in graphs of maximum degree three

Small independent edge dominating sets in graphs of maimum degree three Grażna Zwoźniak Grazna.Zwozniak@ii.uni.wroc.pl Institute of Computer Science Wrocław Universit Small independent edge dominating

### Automata and Languages

Automata and Languages Computational Models: An idealized mathematical model of a computer. A computational model may be accurate in some ways but not in others. We shall be defining increasingly powerful

### On strong fairness in UNITY

On strong fairness in UNITY H.P.Gumm, D.Zhukov Fachbereich Mathematik und Informatik Philipps Universität Marburg {gumm,shukov}@mathematik.uni-marburg.de Abstract. In [6] Tsay and Bagrodia present a correct

### GRAPH THEORY and APPLICATIONS. Trees

GRAPH THEORY and APPLICATIONS Trees Properties Tree: a connected graph with no cycle (acyclic) Forest: a graph with no cycle Paths are trees. Star: A tree consisting of one vertex adjacent to all the others.

### Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras

Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture No. # 31 Recursive Sets, Recursively Innumerable Sets, Encoding

### Lecture 5 - CPA security, Pseudorandom functions

Lecture 5 - CPA security, Pseudorandom functions Boaz Barak October 2, 2007 Reading Pages 82 93 and 221 225 of KL (sections 3.5, 3.6.1, 3.6.2 and 6.5). See also Goldreich (Vol I) for proof of PRF construction.