When Simulation Meets Antichains (on Checking Language Inclusion of NFAs)


Parosh Aziz Abdulla (1), Yu-Fang Chen (1), Lukáš Holík (2), Richard Mayr (3), and Tomáš Vojnar (2)
(1) Uppsala University, (2) Brno University of Technology, (3) University of Edinburgh

Abstract. We describe a new and more efficient algorithm for checking universality and language inclusion on nondeterministic finite word automata (NFA) and tree automata (TA). To the best of our knowledge, the antichain-based approach proposed by Wulf et al. was the most efficient one so far. Our idea is to exploit a simulation relation on the states of finite automata to accelerate the antichain-based algorithms. Normally, a simulation relation can be obtained fairly efficiently, and it can help the antichain-based approach to prune out a large portion of unnecessary search paths. We evaluate the performance of our new method on NFA/TA obtained from random regular expressions and from the intermediate steps of regular model checking. The results show that our approach significantly outperforms the previous antichain-based approach in most of the experiments.

1 Introduction

The language inclusion problem for regular languages is important in many application domains, e.g., formal verification. Many verification problems can be formulated as a language inclusion problem. For example, one may describe the actual behaviors of an implementation in an automaton A and all of the behaviors permitted by the specification in another automaton B. Then, the problem of whether the implementation meets the specification is equivalent to the problem L(A) ⊆ L(B).

Methods for proving language inclusion can be categorized into two types: those based on simulation (e.g., [6]) and those based on the subset construction (e.g., [5, 8–10]). Simulation-based approaches first compute a simulation relation on the states of two automata A and B and then check if all initial states of A can be simulated by some initial state of B. Since simulation can be computed in polynomial time, simulation-based methods are usually very efficient. Their main drawback is that they are incomplete.
Simulation preorder implies language inclusion, but not vice versa. On the other hand, methods based on the subset construction are complete but inefficient because in many cases they cause an exponential blow-up in the number of states. Recently, Wulf et al. [11] proposed the antichain-based approach. To the best of our knowledge, it was the most efficient one among all of the methods based on the subset construction. Although the antichain-based method significantly outperforms the classical subset construction, it still suffers from the exponential blow-up problem in many cases.

In this paper, we describe a new approach that combines the simulation-based and the antichain-based approaches. The computed simulation relation is used for pruning out unnecessary search paths of the antichain-based method.

To simplify the presentation, we first consider the problem of checking universality for a word automaton A. In a similar manner to the classical subset construction, we
start from the set of initial states and search for sets of states (here referred to as macro-states) which are not accepting (i.e., we search for a counterexample of universality). The key idea is to define an easy-to-check ordering ≼ on the states of A which implies language inclusion (i.e., p ≼ q implies that the language of the state p is included in the language of the state q). From ≼, we derive an ordering on macro-states which we use in two ways to optimize the subset construction: (1) searching from a macro-state need not continue in case a smaller macro-state has already been analyzed; and (2) a given macro-state is represented by (the subset of) its maximal elements. In this paper, we take the ordering ≼ to be the well-known maximal simulation relation on the automaton A. In fact, the antichain algorithm of [11] coincides with the special case where the ordering ≼ is the identity relation. Subsequently, we describe how to generalize the above approach to the case of checking language inclusion between two automata A and B, by extending the ordering to pairs each consisting of a state of A and a macro-state of B.

In the second part of the paper, we extend our algorithms to the case of tree automata. First, we define the notion of open trees which we use to characterize the languages defined by tuples of states of the tree automaton. We identify here a new application of the so-called upward simulation relation from [1]. We show that it implies (open tree) language inclusion, and we describe how we can use it to optimize existing algorithms for checking the universality and language inclusion properties.

We have implemented our algorithms and carried out an extensive experimentation using NFA obtained from several different sources. These include NFA from random regular expressions and also 1069 pairs of NFA generated from the intermediate steps of abstract regular model checking [4] while verifying the correctness of the bakery algorithm, a producer-consumer system, the bubble sort algorithm, an algorithm that reverses a circular list, and a Petri net model of the readers/writers protocol.
We have also considered tree automata derived from intermediate steps of abstract regular tree model checking. The experiments show that our approach significantly outperforms the previous antichain-based approach in almost all of the considered cases. (Furthermore, in those cases where simulation is sufficient to prove language inclusion, our algorithm has polynomial running time.)

The remainder of the paper is organized as follows. Section 2 contains some basic definitions. In Section 3, we begin the discussion by applying our idea to solve the universality problem for NFA. The problem is simpler than the language inclusion problem and thus we believe that presenting our universality checking algorithm first makes it easier for the reader to grasp the idea. The correctness proof of our universality checking algorithm is given in Section 4. In Section 5, we discuss our language inclusion checking algorithm for NFA. Section 6 defines basic notations for tree automata, and in Section 7, we present the algorithms for checking universality and language inclusion for tree automata. The experimental results are described in Section 8. Finally, in Section 9, we conclude the paper and discuss further research directions.

2 Preliminaries

A Nondeterministic Finite Automaton (NFA) A is a tuple (Σ,Q,I,F,δ) where: Σ is an alphabet, Q is a finite set of states, I ⊆ Q is a non-empty set of initial states, F ⊆ Q is
a set of final states, and δ ⊆ Q × Σ × Q is the transition relation. For convenience, we use p -a-> q to denote the transition from the state p to the state q with the label a. A word u = u_1...u_n is accepted by A from the state q_0 if there exists a sequence q_0 u_1 q_1 u_2 ... u_n q_n such that q_n ∈ F and q_{j-1} -u_j-> q_j for all 0 < j ≤ n. Define L(A)(q) := {u | u is accepted by A from the state q} (the language of the state q in A). Define the language L(A) of A as ⋃_{q ∈ I} L(A)(q). We say that A is universal if L(A) = Σ*.

Let A = (Σ,Q_A,I_A,F_A,δ_A) and B = (Σ,Q_B,I_B,F_B,δ_B) be two NFAs. Define their union automaton A ∪ B := (Σ, Q_A ∪ Q_B, I_A ∪ I_B, F_A ∪ F_B, δ_A ∪ δ_B). We define the post-image of a state Post(p) := {p' | ∃a ∈ Σ : (p,a,p') ∈ δ}.

A simulation on A = (Σ,Q,I,F,δ) is a relation ≼ ⊆ Q × Q such that p ≼ r only if (i) p ∈ F ⟹ r ∈ F and (ii) for every transition p -a-> p', there exists a transition r -a-> r' such that p' ≼ r'. It can be shown that for each automaton A = (Σ,Q,I,F,δ), there exists a unique maximal simulation. The following is a well-known lemma.

Lemma 1. Given a simulation ≼ on an NFA A, p ≼ r ⟹ L(A)(p) ⊆ L(A)(r).

For convenience, we call a set of states in A a macro-state, i.e., a macro-state is a subset of Q. A macro-state is accepting if it contains at least one accepting state, otherwise it is rejecting. For a macro-state P, define L(A)(P) := ⋃_{p ∈ P} L(A)(p). We say that a macro-state P is universal if L(A)(P) = Σ*. For two macro-states P and R, we write P ≼^∀∃ R as a shorthand for ∀p ∈ P. ∃r ∈ R : p ≼ r. We define the post-image of a macro-state Post(P) := {P' | ∃a ∈ Σ : P' = {p' | ∃p ∈ P : (p,a,p') ∈ δ}}. We use A^⊆ to denote the set of relations over the states of A that imply language inclusion, i.e., if ≼ ∈ A^⊆, then we have p ≼ r ⟹ L(A)(p) ⊆ L(A)(r).

3 Universality of NFAs

The universality problem for an NFA A = (Σ,Q,I,F,δ) is to decide whether L(A) = Σ*. The problem is PSPACE-complete. The classical algorithm for the problem first determinizes A with the subset construction and then checks if every reachable macro-state is accepting. The algorithm is inefficient since in many cases the determinization will cause an exponential blow-up in the number of states.
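As a point of reference, the classical subset-construction check can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary encoding of δ (mapping a pair (state, symbol) to a set of successor states) and the function names are assumptions made here, not part of the paper. The sketch explores macro-states on the fly and reports non-universality as soon as a rejecting macro-state is reached.

```python
from collections import deque

def post(P, a, delta):
    """Successor macro-state of P under symbol a (assumed dict encoding of delta)."""
    return frozenset(q for p in P for q in delta.get((p, a), ()))

def is_universal_classical(alphabet, initial, finals, delta):
    """Breadth-first subset construction; False as soon as a rejecting macro-state appears."""
    start = frozenset(initial)
    seen, queue = {start}, deque([start])
    while queue:
        P = queue.popleft()
        if not any(p in finals for p in P):
            return False          # some word leads only to non-final states
        for a in alphabet:
            succ = post(P, a, delta)
            if succ not in seen:  # every distinct macro-state is explored once
                seen.add(succ)
                queue.append(succ)
    return True

# Hypothetical two-state examples over the alphabet {a, b}.
loops = {("s1", "a"): {"s1"}, ("s1", "b"): {"s1"}}
leaky = {("s1", "a"): {"s1"}, ("s1", "b"): {"s2"}}
print(is_universal_classical({"a", "b"}, {"s1"}, {"s1"}, loops))
print(is_universal_classical({"a", "b"}, {"s1"}, {"s1"}, leaky))
```

The `seen` set is what can grow exponentially in the number of states; the optimizations discussed in this section aim at keeping the set of stored macro-states small.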
Note that for universality checking, we can stop the subset construction immediately and conclude that A is not universal whenever a rejecting macro-state is encountered. An example of a run of this algorithm is given in Fig. 1. The automaton A used in Fig. 1 is universal because all reachable macro-states are accepting.

In this section, we propose a more efficient approach to universality checking. In a similar manner to the classical algorithm, we run the subset construction procedure and check if any rejecting macro-state is reachable. However, our algorithm augments the subset construction with two optimizations, henceforth referred to as Optimization 1 and Optimization 2, respectively.

Optimization 1 is based on the fact that if the algorithm encounters a macro-state R whose language is a superset of the language of a visited macro-state P, then there is no need to continue the search from R. The intuition behind this is that if a word is not accepted from R, then it is also not accepted from P. For instance, in Fig. 1(b), the search need not continue from the macro-state {s2,s3} since its language is a superset of the language of the initial macro-state {s1,s2}. However, in general it is difficult to
check if L(A)(P) ⊆ L(A)(R) before the resulting DFA is completely built. Therefore, we suggest to use an easy-to-compute alternative based on the following lemma.

[Fig. 1. Universality Checking Algorithms. (a) Source NFA A. (b) A run of the algorithms. The areas labeled "Optimization 1", "Antichain", "Classical" are the macro-states generated by our approach with the maximal simulation and Optimization 1, the antichain-based approach, and the classical approach, respectively. (c) Optimization 1 and 2.]

Lemma 2. Let P, R be two macro-states, A be an NFA, and ≼ be a relation in A^⊆. Then, P ≼^∀∃ R implies L(A)(P) ⊆ L(A)(R).

Note that in Lemma 2, ≼ can be any relation on the states of A that implies language inclusion. This includes any simulation relation (Lemma 1). When ≼ is the maximal simulation or the identity relation, it can be efficiently obtained from A before the subset construction algorithm is triggered and used to prune out unnecessary search paths.

An example of how the described optimization can help is given in Fig. 1(b). If ≼ is the identity, the universality checking algorithm will not continue the search from the macro-state {s1,s2,s4} because it is a superset of the initial macro-state. In fact, the antichain-based approach [11] can be viewed as a special case of our approach when ≼ is the identity. Notice that, in this case, only 7 macro-states are generated (the classical algorithm generates 13 macro-states). When ≼ is the maximal simulation, we do not need to continue from the macro-state {s2,s3} either because s1 ≼ s3 and hence {s1,s2} ≼^∀∃ {s2,s3}. In this case, only 3 macro-states are generated. As we can see from the example, a better reduction of the number of generated states can be achieved when a weaker (i.e., coarser) relation such as the maximal simulation is used.

Optimization 2 is based on the observation that L(A)(P) = L(A)(P \ {p1}) if there is some p2 ∈ P \ {p1} with p1 ≼ p2. This fact is a simple consequence of Lemma 2 (note that P ≼^∀∃ P \ {p1}).
Since the two macro-states P and P \ {p1} have the same language, if a word is not accepted from P, it is not accepted from P \ {p1} either. On the other hand, if all words in Σ* can be accepted from P, then they can also be accepted from P \ {p1}. Therefore, it is safe to replace the macro-state P with P \ {p1}. Consider the example in Fig. 1. If ≼ is the maximal simulation relation, we can remove the state s2 from the initial macro-state {s1,s2} without changing its language,
because s2 ≼ s1. This change will propagate to all the searching paths. With this optimization, our approach generates only 3 macro-states, all of which are singletons. The result after the two optimizations are applied is shown in Fig. 1(c).

Algorithm 1 describes our approach in pseudocode.

Algorithm 1: Universality Checking
Input: An NFA A = (Σ,Q,I,F,δ) and a relation ≼ ∈ A^⊆.
Output: TRUE if A is universal. Otherwise, FALSE.
 1  if I is rejecting then return FALSE;
 2  Processed := ∅;
 3  Next := {Minimize(I)};
 4  while Next ≠ ∅ do
 5      Pick and remove a macro-state R from Next and move it to Processed;
 6      foreach P ∈ {Minimize(R') | R' ∈ Post(R)} do
 7          if P is rejecting then return FALSE;
 8          else if ¬∃S ∈ Processed ∪ Next s.t. S ≼^∀∃ P then
 9              Remove all S from Processed ∪ Next s.t. P ≼^∀∃ S;
10              Add P to Next;
11  return TRUE

In this algorithm, the function Minimize(R) implements Optimization 2. The function does the following: it chooses a new state r1 from R, removes r1 from R if there exists another state r2 in R such that r1 ≼ r2, and then repeats the procedure until all of the states in R are processed. Lines 8–10 of the algorithm implement Optimization 1.

Overall, the algorithm works as follows. As long as the set Next of macro-states waiting to be processed is non-empty (and no rejecting macro-state has been found), the algorithm chooses one macro-state from Next and moves it to the Processed set. Moreover, it generates all successors of the chosen macro-state, minimizes them, and adds them to Next unless there is already some smaller macro-state in Next or in Processed. If a new macro-state is added to Next, the algorithm at the same time removes all bigger macro-states from both Next and Processed. Note that the pruning of the Next and Processed sets together with checking whether a new macro-state should be added into Next can be done within a single iteration through Next and Processed. We discuss correctness of the algorithm in the next section.

4 Correctness of the Optimized Universality Checking

In this section, we prove correctness of Algorithm 1. Due to the space limitation, we only present an overview.
A more detailed proof can be found in Appendix A. Let A = (Σ,Q,I,F,δ) be the input automaton. We first introduce some definitions and notations that will be used in the proof. For a macro-state P, define Dist(P) ∈ ℕ ∪ {∞} as the length of the shortest word in Σ* that is not in L(A)(P) (if L(A)(P) = Σ*, Dist(P) = ∞). For a set of macro-states MStates, the function Dist(MStates) ∈ ℕ ∪ {∞} returns the length of the shortest word in Σ* that is not in the language of some macro-state in MStates. More precisely, if MStates = ∅, Dist(MStates) = ∞, otherwise, Dist(MStates) = min_{P ∈ MStates} Dist(P). The predicate Univ(MStates) is true if and only if all the macro-states in MStates are universal, i.e., ∀P ∈ MStates : L(A)(P) = Σ*.
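Before turning to the invariants, it may help to see the sets Processed and Next of Algorithm 1 in executable form. The following Python sketch is an illustration only, under assumptions that are not part of the paper: δ is encoded as a dictionary from (state, symbol) to a set of successors, the relation ≼ is passed in as a set of pairs `rel` (for example the identity or a precomputed simulation), states are strings, and the helper names `leq`, `minimize`, and `is_universal` are hypothetical.

```python
def leq(P, R, rel):
    """P ≼^∀∃ R: every state of P is related to some state of R."""
    return all(any((p, r) in rel for r in R) for p in P)

def minimize(R, rel):
    """Optimization 2: drop a state if a different remaining state is above it."""
    kept = set(R)
    for r1 in sorted(R):
        if any(r2 != r1 and (r1, r2) in rel for r2 in kept):
            kept.discard(r1)
    return frozenset(kept)

def is_universal(alphabet, initial, finals, delta, rel):
    def rejecting(P):
        return not any(p in finals for p in P)

    if rejecting(initial):
        return False
    processed, nxt = set(), {minimize(initial, rel)}
    while nxt:
        R = nxt.pop()
        processed.add(R)
        for a in sorted(alphabet):
            succ = frozenset(q for p in R for q in delta.get((p, a), ()))
            P = minimize(succ, rel)
            if rejecting(P):
                return False
            # Optimization 1: skip P if a smaller macro-state is already stored;
            # otherwise store P and discard every stored macro-state above it.
            if not any(leq(S, P, rel) for S in processed | nxt):
                processed = {S for S in processed if not leq(P, S, rel)}
                nxt = {S for S in nxt if not leq(P, S, rel)}
                nxt.add(P)
    return True

# Hypothetical example: s1 alone accepts every word, and s1 simulates s2.
alphabet = {"a", "b"}
delta = {("s1", "a"): {"s1"}, ("s1", "b"): {"s1"}, ("s2", "a"): {"s2"}}
identity = {("s1", "s1"), ("s2", "s2")}
simulation = identity | {("s2", "s1")}
print(is_universal(alphabet, {"s1", "s2"}, {"s1"}, delta, identity))
print(is_universal(alphabet, {"s1", "s2"}, {"s1"}, delta, simulation))
```

With the identity relation the sketch behaves like the antichain-based search; with the simulation relation the initial macro-state is already reduced to the singleton {s1} by `minimize`, so fewer macro-states are stored.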
Lemma 3 describes the invariants used to prove the partial correctness of Algorithm 1.

Lemma 3. The following two loop invariants hold in Algorithm 1:
1. ¬Univ(Processed ∪ Next) ⟹ ¬Univ({I}).
2. ¬Univ({I}) ⟹ Dist(Processed) > Dist(Next).

Due to the finite number of macro-states, we can show that Algorithm 1 eventually terminates. Algorithm 1 returns FALSE only if either the set of initial states is rejecting, or the minimized version of some successor R' of a macro-state R chosen from Next on line 5 is found rejecting. In the latter case, due to Lemma 2, R' is also rejecting. Then, R is non-universal, and hence Univ(Processed ∪ Next) is false. By Lemma 3 (Invariant 1), A is not universal. The algorithm returns TRUE only when Next becomes empty. When Next is empty, Dist(Next) = ∞, so Dist(Processed) > Dist(Next) is not true. Therefore, by Lemma 3 (Invariant 2), A is universal. This gives the following theorem.

Theorem 1. Algorithm 1 always terminates and returns TRUE iff the input automaton A is universal.

5 The Language Inclusion Problem

The technique described in Section 3 can be generalized to solve the language-inclusion problem. Let A and B be two NFAs. The language inclusion problem for A and B is to decide whether L(A) ⊆ L(B). This problem is also PSPACE-complete. The classical algorithm for solving this problem builds on-the-fly the product automaton A × B̄ of A and the complement of B and searches for an accepting state. A state in the product automaton A × B̄ is a pair (p,P) where p is a state in A and P is a macro-state in B. For convenience, we call such a pair (p,P) a product-state. A product-state is accepting iff p is an accepting state in A and P is a rejecting macro-state in B. We use L(A,B)(p,P) to denote the language of the product-state (p,P) in A × B̄. The language of A is not contained in the language of B iff there exists some accepting product-state (p,P) reachable from some initial product-state. Indeed, L(A,B)(p,P) = L(A)(p) \ L(B)(P), and the language of A × B̄ consists of words which can be used as witnesses of the fact that L(A) ⊆ L(B) does not hold.
In a similar manner to universality checking, the algorithm can stop the search immediately and conclude that the language inclusion does not hold whenever an accepting product-state is encountered. An example of a run of the classical algorithm is given in Fig. 2. We find that L(A) ⊆ L(B) is true and the algorithm generates 13 product-states (Fig. 2(c), the area labeled "Classical").

Optimization 1 that we use for universality checking can be generalized for language inclusion checking as follows. Let A = (Σ,Q_A,I_A,F_A,δ_A) and B = (Σ,Q_B,I_B,F_B,δ_B) be two NFAs such that Q_A ∩ Q_B = ∅. We denote by A ∪ B the NFA (Σ, Q_A ∪ Q_B, I_A ∪ I_B, F_A ∪ F_B, δ_A ∪ δ_B). Let ≼ be a relation in (A ∪ B)^⊆. During the process of constructing the product automaton and searching for an accepting product-state, we can stop the search from a product-state (p,P) if (a) there exists some visited product-state (r,R) such that p ≼ r and R ≼^∀∃ P, or (b) ∃p' ∈ P : p ≼ p'. Optimization 1(a) is justified by Lemma 4, which is very similar to Lemma 2 for universality checking.

Lemma 4. Let A, B be two NFAs, (p,P), (r,R) be two product-states, where p, r are states in A and P, R are macro-states in B, and ≼ be a relation in (A ∪ B)^⊆. Then, p ≼ r and R ≼^∀∃ P implies L(A,B)(p,P) ⊆ L(A,B)(r,R).
[Fig. 2. Language Inclusion Checking Algorithms. (a) NFA A. (b) NFA B. (c) A run of the algorithms while checking L(A) ⊆ L(B), with areas labeled "Classical", "Antichain", "Optimization 1(a)", and "Optimization 1(b)".]

By the above lemma, if a word takes the product-state (p,P) to an accepting product-state, it will also take (r,R) to an accepting product-state. Therefore, we do not need to continue the search from (p,P).

Let us use Fig. 2(c) to illustrate Optimization 1(a). As we mentioned, the antichain-based approach can be viewed as a special case of our approach when ≼ is the identity. When ≼ is the identity, we do not need to continue the search from the product-state (p2,{q1,q2}) because {q2} ⊆ {q1,q2}. In this case, the algorithm generates 8 product-states (Fig. 2(c), the area labeled "Antichain"). In the case that ≼ is the maximal simulation, we do not need to continue the search from product-states (p1,{q2}), (p1,{q1,q2}), and (p2,{q1,q2}) because q1 ≼ q2 and the algorithm already visited the product-states (p1,{q1}) and (p2,{q2}). Hence, the algorithm generates only 6 product-states (Fig. 2(c), the area labeled "Optimization 1(a)").

If the condition of Optimization 1(b) holds, we have that the language of p (w.r.t. A) is a subset of the language of P (w.r.t. B). In this case, any word that takes p to an accepting state in A also takes P to an accepting macro-state in B. Hence, we do not need to continue the search from the product-state (p,P) because all of its successor states are rejecting product-states. Consider again the example in Fig. 2(c). With Optimization 1(b), if ≼ is the maximal simulation on the states of A ∪ B, we do not need to continue the search from the first product-state (p1,{q1}) because p1 ≼ q1. In this case, the algorithm can conclude that the language inclusion holds immediately after the first product-state is generated (Fig. 2(c), the area labeled "Optimization 1(b)").
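To illustrate how the test of Optimization 1(b) can be set up, the following Python sketch computes the maximal simulation on a union automaton by a naive fixpoint refinement and then checks the condition ∃p' ∈ P : p ≼ p'. The refinement loop is a simple textbook-style procedure chosen here for brevity, not the algorithm of [7] used in the experiments; the encoding of δ as a dictionary and the small automata (A accepting one or more a's, B accepting any number of a's) are hypothetical and are not the automata of Fig. 2.

```python
from itertools import product

def maximal_simulation(states, finals, delta):
    """Return all pairs (p, r) such that r simulates p (naive fixpoint refinement)."""
    alphabet = {a for (_, a) in delta}
    # Condition (i): a final state can only be simulated by a final state.
    sim = {(p, r) for p, r in product(states, repeat=2)
           if p not in finals or r in finals}
    changed = True
    while changed:
        changed = False
        for (p, r) in list(sim):
            # Condition (ii): every move of p must be matched by a move of r.
            ok = all(any((p2, r2) in sim for r2 in delta.get((r, a), ()))
                     for a in alphabet
                     for p2 in delta.get((p, a), ()))
            if not ok:
                sim.discard((p, r))
                changed = True
    return sim

def covered(p, P, rel):
    """Optimization 1(b): some state of the macro-state P is above p."""
    return any((p, q) in rel for q in P)

# Union automaton A ∪ B with disjoint state sets {p1, p2} and {q1}.
states = {"p1", "p2", "q1"}
finals = {"p2", "q1"}
delta = {("p1", "a"): {"p2"}, ("p2", "a"): {"p2"}, ("q1", "a"): {"q1"}}
sim = maximal_simulation(states, finals, delta)
print(covered("p1", {"q1"}, sim))
```

In this example the initial product-state (p1,{q1}) is covered, so a search using this relation would not need to expand it.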
Observe that from Lemma 4, it holds that for any product-state (p,P) such that p1 ≼ p2 for some distinct p1, p2 ∈ P, L(A,B)(p,P) = L(A,B)(p,P \ {p1}) (as P ≼^∀∃ P \ {p1}). Optimization 2 that we used for universality checking can therefore be generalized for language inclusion checking too.

We give the pseudocode of our optimized inclusion checking in Algorithm 2, which is a straightforward extension of Algorithm 1. In the algorithm, the definition of the Minimize(R) function is the same as what we have defined in Algorithm 1. The function Initialize(PStates) applies Optimization 1 on the set of product-states PStates to avoid unnecessary searching. More precisely, it returns a maximal subset of PStates such that (1) for any two elements (p,P), (q,Q) in the subset, p ⋠ q ∨ Q ⋠^∀∃ P and (2) for any element (p,P) in the subset, ∀p' ∈ P : p ⋠ p'. We define the post-image of a product-state Post((p,P)) := {(p',P') | ∃a ∈ Σ : (p,a,p') ∈ δ, P' = {p'' | ∃p ∈ P : (p,a,p'') ∈ δ}}.
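The optimized inclusion check described above (Optimizations 1(a), 1(b), and 2 together with Initialize and the product-state post-image) can be sketched in Python as follows. The sketch rests on assumptions made here for illustration: each NFA is a triple (initial states, final states, δ as a dictionary), states are strings, the relation is given as a set of pairs `rel` over the disjoint union of the state sets, and the helper `try_add` (a hypothetical name) is reused to seed Next, which is one possible way of realizing Initialize.

```python
def leq(P, R, rel):
    """P ≼^∀∃ R: every state of P is related to some state of R."""
    return all(any((p, r) in rel for r in R) for p in P)

def minimize(R, rel):
    """Optimization 2: drop a state if a different remaining state is above it."""
    kept = set(R)
    for r1 in sorted(R):
        if any(r2 != r1 and (r1, r2) in rel for r2 in kept):
            kept.discard(r1)
    return frozenset(kept)

def is_included(alphabet, nfa_a, nfa_b, rel):
    """Decide L(A) ⊆ L(B); each NFA is a triple (initial, finals, delta)."""
    (init_a, fin_a, delta_a), (init_b, fin_b, delta_b) = nfa_a, nfa_b

    def accepting(p, P):
        return p in fin_a and not any(q in fin_b for q in P)

    def try_add(p, P, processed, nxt):
        if any((p, q) in rel for q in P):              # Optimization 1(b)
            return
        if any((p, s) in rel and leq(S, P, rel)        # Optimization 1(a)
               for (s, S) in processed | nxt):
            return
        for store in (processed, nxt):                 # drop product-states that (p, P) subsumes
            for (s, S) in list(store):
                if (s, p) in rel and leq(P, S, rel):
                    store.discard((s, S))
        nxt.add((p, P))

    if any(accepting(i, init_b) for i in init_a):
        return False
    processed, nxt = set(), set()
    start = minimize(init_b, rel)
    for i in init_a:                                   # plays the role of Initialize
        try_add(i, start, processed, nxt)
    while nxt:
        r, R = nxt.pop()
        processed.add((r, R))
        for a in sorted(alphabet):
            succ_b = frozenset(q2 for q in R for q2 in delta_b.get((q, a), ()))
            P = minimize(succ_b, rel)
            for p in delta_a.get((r, a), ()):
                if accepting(p, P):
                    return False
                try_add(p, P, processed, nxt)
    return True

# Hypothetical example: A accepts one or more a's, B accepts any number of a's.
A = ({"p1"}, {"p2"}, {("p1", "a"): {"p2"}, ("p2", "a"): {"p2"}})
B = ({"q1"}, {"q1"}, {("q1", "a"): {"q1"}})
identity = {(s, s) for s in ("p1", "p2", "q1")}
print(is_included({"a"}, A, B, identity), is_included({"a"}, B, A, identity))
```

Passing the identity as `rel` corresponds to the antichain-based search; passing a simulation on the union automaton enables the additional pruning, and in the example above a relation containing (p1,q1) would already discard the initial product-state.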
Algorithm 2: Language Inclusion Checking
Input: NFA A = (Σ,Q_A,I_A,F_A,δ_A), B = (Σ,Q_B,I_B,F_B,δ_B). A relation ≼ ∈ (A ∪ B)^⊆.
Output: TRUE if L(A) ⊆ L(B). Otherwise, FALSE.
if there is an accepting product-state in {(i,I_B) | i ∈ I_A} then return FALSE;
Processed := ∅;
Next := Initialize({(i,Minimize(I_B)) | i ∈ I_A});
while Next ≠ ∅ do
    Pick and remove a product-state (r,R) from Next and move it to Processed;
    foreach (p,P) ∈ {(r',Minimize(R')) | (r',R') ∈ Post((r,R))} do
        if (p,P) is an accepting product-state then return FALSE;
        else if ¬∃p' ∈ P s.t. p ≼ p' then
            if ¬∃(s,S) ∈ Processed ∪ Next s.t. p ≼ s ∧ S ≼^∀∃ P then
                Remove all (s,S) from Processed ∪ Next s.t. s ≼ p ∧ P ≼^∀∃ S;
                Add (p,P) to Next;
return TRUE

Correctness: For a product-state (p,P), define Dist((p,P)) ∈ ℕ ∪ {∞} as the length of the shortest word in the language of (p,P), or ∞ if that language is empty. The value Dist(PStates) ∈ ℕ ∪ {∞} is the length of the shortest word in the language of some product-state in PStates, or ∞ if PStates is empty. The predicate Incl(PStates) is true iff for all product-states (p,P) in PStates, L(A)(p) ⊆ L(B)(P). The correctness of Algorithm 2 can now be proved in a very similar way to Algorithm 1, using the following invariants:
1. ¬Incl(Processed ∪ Next) ⟹ ¬Incl({(i,I_B) | i ∈ I_A}).
2. ¬Incl({(i,I_B) | i ∈ I_A}) ⟹ Dist(Processed) > Dist(Next).

6 Tree Automata Preliminaries

To be able to present a generalization of the above methods for the domain of tree automata, we now introduce some needed preliminaries on tree automata.

A ranked alphabet Σ is a set of symbols together with a ranking function # : Σ → ℕ. For a ∈ Σ, the value #(a) is called the rank of a. For any n ≥ 0, we denote by Σ_n the set of all symbols of rank n from Σ. Let ε denote the empty sequence. A tree t over a ranked alphabet Σ is a partial mapping t : ℕ* → Σ that satisfies the following conditions: (1) dom(t) is a finite, prefix-closed subset of ℕ* and (2) for each v ∈ dom(t), if #(t(v)) = n ≥ 0, then {i | vi ∈ dom(t)} = {1,...,n}. Each sequence v ∈ dom(t) is called a node of t.
For a node v, we define the i-th child of v to be the node vi, and the i-th subtree of v to be the tree t' such that t'(v') = t(viv') for all v' ∈ ℕ*. A leaf of t is a node v which does not have any children, i.e., there is no i ∈ ℕ with vi ∈ dom(t). We denote by T(Σ) the set of all trees over the alphabet Σ.

A (finite, non-deterministic, bottom-up) tree automaton (abbreviated as TA in the sequel) is a quadruple A = (Q,Σ,Δ,F) where Q is a finite set of states, F ⊆ Q is a set of final states, Σ is a ranked alphabet, and Δ is a set of transition rules. Each transition rule is a triple of the form ((q1,...,qn),a,q) where q1,...,qn,q ∈ Q, a ∈ Σ, and #(a) = n. We use (q1,...,qn) -a-> q to denote that ((q1,...,qn),a,q) ∈ Δ. In the special case where n = 0, we speak about the so-called leaf rules, which we sometimes abbreviate as -a-> q.
Let A = (Q,Σ,Δ,F) be a TA. A run of A over a tree t ∈ T(Σ) is a mapping π : dom(t) → Q such that, for each node v ∈ dom(t) of arity #(t(v)) = n where q = π(v), if q_i = π(vi) for 1 ≤ i ≤ n, then Δ has a rule (q1,...,qn) -t(v)-> q. We write t ⟹_π q to denote that π is a run of A over t such that π(ε) = q. We use t ⟹ q to denote that t ⟹_π q for some run π. The language accepted by a state q is defined by L(A)(q) = {t | t ⟹ q}, while the language of A is defined by L(A) = ⋃_{q ∈ F} L(A)(q).

7 Universality and Language Inclusion of Tree Automata

To optimize universality and inclusion checking on word automata, we used relations that imply language inclusion. For the case of universality and inclusion checking on tree automata, we now propose to use relations that imply inclusion of languages of the so-called open trees (i.e., leafless trees or, equivalently, trees whose leaves are replaced by a special symbol denoting a "hole") that are accepted from tuples of tree automata states. We formally define the notion below. Notice that in contrast to the notion of a language accepted from a state of a word automaton, which refers to possible "futures" of the state, the notion of a language accepted at a state of a TA refers to possible "pasts" of the state. Our notion of languages of open trees accepted from tuples of tree automata states speaks again about the future of states, which turns out useful when trying to optimize the (antichain-based) subset construction for TA.

Consider a special symbol □ ∉ Σ with rank 0, called a hole. An open tree over Σ is a tree over Σ ∪ {□} such that all its leaves are labeled by □ (see footnote 1). We use T□(Σ) to denote the set of all open trees over Σ. Given states q1,...,qn ∈ Q and an open tree t with leaves v_1,...,v_n, a run π of A on t from (q1,...,qn) is defined in a similar way as the run on a tree except that for each leaf v_i, 1 ≤ i ≤ n, we have π(v_i) = q_i. We use t(q1,...,qn) ⟹_π q to denote that π is a run of A on t from (q1,...,qn) such that π(ε) = q. The notation t(q1,...,qn) ⟹ q is explained in a similar manner to runs on trees.
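The two notions of a run can be made concrete with a small Python sketch. The encoding is an assumption made here for illustration only: a tree is a nested tuple (symbol, child, ...), a hole of an open tree is written as the 0-based index of the corresponding leaf, and the rules Δ are a dictionary from (symbol, tuple of child states) to a set of target states. The function names `root_states` and `accepts` are hypothetical.

```python
from itertools import product

def root_states(tree, rules, from_states=()):
    """All q with t ⟹ q (or t(q1,...,qn) ⟹ q for an open tree), computed bottom-up."""
    if isinstance(tree, int):
        # A hole: the run is forced to take the state supplied for this leaf.
        return {from_states[tree]}
    symbol, children = tree[0], tree[1:]
    child_sets = [root_states(c, rules, from_states) for c in children]
    result = set()
    for combo in product(*child_sets):      # for a leaf symbol, combo is the empty tuple
        result |= rules.get((symbol, combo), set())
    return result

def accepts(tree, rules, finals):
    """t ∈ L(A): some run over t ends in a final state."""
    return bool(root_states(tree, rules) & finals)

# Hypothetical TA over leaf symbols a, b and a binary symbol f.
rules = {
    ("a", ()): {"q"},
    ("b", ()): {"r"},
    ("f", ("q", "q")): {"q"},
    ("f", ("q", "r")): {"s"},
}
finals = {"q"}
print(accepts(("f", ("a",), ("a",)), rules, finals))
print(root_states(("f", 0, 1), rules, ("q", "r")))
```

The first call runs the automaton over the closed tree f(a,a); the second runs it over the open tree f(□,□) from the tuple of states (q,r).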
The language of A accepted from a tuple (q1,...,qn) of states is then L□(A)(q1,...,qn) = {t ∈ T□(Σ) | t(q1,...,qn) ⟹ q for some q ∈ F}. Finally, we define the language accepted from a tuple of macro-states (P1,...,Pn), where each P_i ⊆ Q, as the set L□(A)(P1,...,Pn) = ⋃ {L□(A)(q1,...,qn) | (q1,...,qn) ∈ P1 × ... × Pn}. We define Post_a(q1,...,qn) := {q | (q1,...,qn) -a-> q}. For a tuple of macro-states, we let Post_a(P1,...,Pn) := ⋃ {Post_a(q1,...,qn) | (q1,...,qn) ∈ P1 × ... × Pn}.

Let us use t□ to denote the open tree that arises from a tree t ∈ T(Σ) by replacing all the leaf symbols of t by □, and for every leaf symbol a ∈ Σ_0, let I_a = {q | -a-> q} be the so-called a-initial macro-state. Languages accepted at final states of A correspond to the languages accepted from tuples of initial macro-states of A as stated in Lemma 5.

Lemma 5. Let t be a tree over Σ with leaves labeled by a_1,...,a_n. Then t ∈ L(A) if and only if t□ ∈ L□(A)(I_a1,...,I_an).

Footnote 1: Note that no internal nodes of an open tree can be labeled by □ as #(□) = 0.

7.1 Upward Simulation

We now work towards defining suitable relations on states of TA allowing us to optimize the universality and inclusion checking. We extend relations ≼ ⊆ Q × Q on states to tuples of states such that (q1,...,qn) ≼ (r1,...,rn) iff q_i ≼ r_i for each 1 ≤ i ≤ n. We define
the set A^⊑ of relations that imply inclusion of languages of tuples of states such that ≼ ∈ A^⊑ iff (q1,...,qn) ≼ (r1,...,rn) implies L□(A)(q1,...,qn) ⊆ L□(A)(r1,...,rn). We define an extension of simulation relations on states of word automata that satisfies the above property as follows. An upward simulation on A is a relation ≼ ⊆ Q × Q such that if q ≼ r, then (1) q ∈ F ⟹ r ∈ F and (2) if (q1,...,qn) -a-> q' where q = q_i, then (q1,...,q_{i-1},r,q_{i+1},...,qn) -a-> r' where q' ≼ r'. Upward simulations were discussed in [1], together with an efficient algorithm for computing them (see footnote 2).

Lemma 6. For the maximal upward simulation ≼ on A, we have ≼ ∈ A^⊑.

The proof of this lemma can be obtained as follows. We first show that the maximal upward simulation ≼ has the following property: If (q1,...,qn) -a-> q' in A, then for every (r1,...,rn) with (q1,...,qn) ≼ (r1,...,rn), there is r' ∈ Q such that q' ≼ r' and (r1,...,rn) -a-> r'. From (q1,...,qn) -a-> q' and q1 ≼ r1, we have that there is some rule (r1,q2,...,qn) -a-> s1 such that q' ≼ s1. From the existence of (r1,q2,...,qn) -a-> s1 and from q2 ≼ r2, we then get that there is some rule (r1,r2,q3,...,qn) -a-> s2 such that s1 ≼ s2, etc. Since the maximal upward simulation is transitive [1], we obtain the property mentioned above. This in turn implies Lemma 6.

7.2 Tree Automata Universality Checking

We now show how upward simulations can be used for optimized universality checking on tree automata. Let A = (Σ,Q,F,Δ) be a tree automaton. We define T□_n(Σ) as the set of all open trees over Σ with n leaves. We say that an n-tuple (q1,...,qn) of states of A is universal if L□(A)(q1,...,qn) = T□_n(Σ), that is, all open trees with n leaves constructible over Σ can be accepted from (q1,...,qn). A set of macro-states MStates is universal if all tuples of macro-states from MStates are universal. From Lemma 5, we can deduce that A is universal (i.e., L(A) = T(Σ)) if and only if {I_a | a ∈ Σ_0} is universal.

The following lemma allows us to design a new TA universality checking algorithm in a similar manner to Algorithm 1 using Optimizations 1 and 2 from Section 3.

Lemma 7.
For any ≼ ∈ A^⊑ and two tuples of macro-states of A, we have (R1,...,Rn) ≼^∀∃ (P1,...,Pn) implies L□(A)(R1,...,Rn) ⊆ L□(A)(P1,...,Pn).

Footnote 2: In [1], upward simulations are parameterized by some downward simulation. However, upward simulations parameterized by a downward simulation greater than the identity cannot be used in our framework since they do not generally imply inclusion of languages of tuples of states.

Algorithm 3 describes our approach to checking universality of tree automata in pseudocode. It closely resembles Algorithm 1. There are two main differences: (1) The initial value of the Next set is the result of applying the function Initialize to the set {Minimize(I_a) | a ∈ Σ_0}. Initialize returns the set of all macro-states in {Minimize(I_a) | a ∈ Σ_0} which are minimal w.r.t. ≼^∀∃ (i.e., those macro-states with the best chance of finding a counterexample to universality). (2) The computation of the Post-image of a set of macro-states is a bit more complicated. More precisely, for each symbol a ∈ Σ_n, n ∈ ℕ, we have to compute the post-image of each n-tuple of macro-states from the set. We design the algorithm such that we avoid computing the Post-image of a tuple more than once. We define the Post-image Post(MStates)(R) of a set of
macro-states MStates w.r.t. a macro-state R ∈ MStates. It is the set of all macro-states P = Post_a(P1,...,Pn) where a ∈ Σ_n, n ∈ ℕ, and R occurs at least once in the tuple (P1,...,Pn) of macro-states from MStates. Formally, Post(MStates)(R) = ⋃_{a ∈ Σ} {Post_a(P1,...,Pn) | n = #(a), P1,...,Pn ∈ MStates, R ∈ {P1,...,Pn}}.

Algorithm 3: Tree Automata Universality Checking
Input: A tree automaton A = (Σ,Q,F,Δ) and a relation ≼ ∈ A^⊑.
Output: TRUE if A is universal. Otherwise, FALSE.
if ∃a ∈ Σ_0 such that I_a is rejecting then return FALSE;
Processed := ∅;
Next := Initialize({Minimize(I_a) | a ∈ Σ_0});
while Next ≠ ∅ do
    Pick and remove a macro-state R from Next and move it to Processed;
    foreach P ∈ {Minimize(R') | R' ∈ Post(Processed)(R)} do
        if P is a rejecting macro-state then return FALSE;
        else if ¬∃Q ∈ Processed ∪ Next s.t. Q ≼^∀∃ P then
            Remove all Q from Processed ∪ Next s.t. P ≼^∀∃ Q;
            Add P to Next;
return TRUE

The following theorem states correctness of Algorithm 3, which can be proved using similar invariants as in the case of Algorithm 1 when the notion of distance from an accepting state is suitably defined (see Appendix B for more details).

Theorem 2. Algorithm 3 always terminates and returns TRUE if and only if the input tree automaton A is universal.

7.3 Tree Automata Language Inclusion Checking

We are interested in testing language inclusion of two tree automata A = (Σ,Q_A,F_A,Δ_A) and B = (Σ,Q_B,F_B,Δ_B). From Lemma 5, we have that L(A) ⊆ L(B) iff for every tuple a_1,...,a_n of symbols from Σ_0, L□(A)(I^A_a1,...,I^A_an) ⊆ L□(B)(I^B_a1,...,I^B_an). In other words, for any a_1,...,a_n ∈ Σ_0, every open tree that can be accepted from a tuple of states from I^A_a1 × ... × I^A_an can also be accepted from a tuple of states from I^B_a1 × ... × I^B_an. This justifies a similar use of the notion of product-states as in Section 5. We define the language of a tuple of product-states as L□(A,B)((q1,P1),...,(qn,Pn)) := L□(A)(q1,...,qn) \ L□(B)(P1,...,Pn). Observe that L(A) ⊆ L(B) iff the language of every n-tuple (for any n ∈ ℕ) of product-states from the set {(i,I^B_a) | a ∈ Σ_0, i ∈ I^A_a} is empty.
Our algorithm for testing language inclusion of tree automata will check whether it is possible to reach a product-state of the form (q,P) with q ∈ F_A and P ∩ F_B = ∅ (that we call accepting) from a tuple of product-states from {(i,I^B_a) | a ∈ Σ_0, i ∈ I^A_a}. The following lemma allows us to use Optimization 1(a) and Optimization 2 from Section 5.
Lemma 8. Let ≼ ∈ (A ∪ B)^⊑. For any two tuples of states and two tuples of macro-states such that (p1,...,pn) ≼ (r1,...,rn) and (R1,...,Rn) ≼^∀∃ (P1,...,Pn), we have L□(A,B)((p1,P1),...,(pn,Pn)) ⊆ L□(A,B)((r1,R1),...,(rn,Rn)).

It is also possible to use Optimization 1(b) where we stop searching from product-states of the form (q,P) such that q ≼ r for some r ∈ P. However, note that this optimization is of limited use for tree automata. Under the assumption that the automata A and B do not contain useless states, the reason is that for any q ∈ Q_A and r ∈ Q_B, if q appears at a left-hand side of some rule of arity more than 1, then no reflexive relation from (A ∪ B)^⊑ allows q ≼ r (see footnote 3).

Algorithm 4 describes our method for checking language inclusion of TA in pseudocode. It closely follows Algorithm 2.

Algorithm 4: Tree Automata Language Inclusion Checking
Input: TAs A and B over an alphabet Σ. A relation ≼ ∈ (A ∪ B)^⊑.
Output: TRUE if L(A) ⊆ L(B). Otherwise, FALSE.
if there exists an accepting product-state in ⋃_{a ∈ Σ_0} {(i,I^B_a) | i ∈ I^A_a} then return FALSE;
Processed := ∅;
Next := Initialize(⋃_{a ∈ Σ_0} {(i,Minimize(I^B_a)) | i ∈ I^A_a});
while Next ≠ ∅ do
    Pick and remove a product-state (r,R) from Next and move it to Processed;
    foreach (p,P) ∈ {(r',Minimize(R')) | (r',R') ∈ Post(Processed)(r,R)} do
        if (p,P) is an accepting product-state then return FALSE;
        else if ¬∃p' ∈ P s.t. p ≼ p' then
            if ¬∃(q,Q) ∈ Processed ∪ Next s.t. p ≼ q ∧ Q ≼^∀∃ P then
                Remove all (q,Q) from Processed ∪ Next s.t. q ≼ p ∧ P ≼^∀∃ Q;
                Add (p,P) to Next;
return TRUE

Algorithm 4 differs from Algorithm 2 in two main points. First, the initial value of the Next set is the result of applying the function Initialize on the set {(i,Minimize(I^B_a)) | a ∈ Σ_0, i ∈ I^A_a}, where Initialize is the same function as in Algorithm 2. Second, the computation of the Post-image of a set of product-states means that for each symbol a ∈ Σ_n, n ∈ ℕ, we construct the Post-image of each n-tuple of product-states from the set. Like in Algorithm 3, we design the algorithm such that we avoid computing the Post-image of a tuple more than once. We define the post-image Post(PStates)(r,R) of a set of product-states PStates w.r.t.
product-state $(r,R) \in \mathit{PStates}$. It is the set of all product-states $(q,P)$ such that there is some $a \in \Sigma$, $\#(a) = n$, and some $n$-tuple $((q_1,P_1),\ldots,(q_n,P_n))$ of product-states from $\mathit{PStates}$ that contains at least one occurrence of $(r,R)$, where $q \in \mathit{Post}_a(q_1,\ldots,q_n)$ and $P = \mathit{Post}_a(P_1,\ldots,P_n)$.

Theorem 3. Algorithm 4 always terminates and returns TRUE iff $L(A) \subseteq L(B)$.

Footnote 3: To see this, assume that an open tree $t$ is accepted from $(q_1,\ldots,q_n) \in Q_A^n$, $q = q_i$, $1 \le i \le n$. If $q \preceq r$, then by the definition of $\preceq$, $t \in L^{\Box}(A \cup B)(q_1,\ldots,q_{i-1},r,q_{i+1},\ldots,q_n)$. However, that cannot happen, as $A \cup B$ does not contain any rules with left-hand sides containing both states from $A$ and states from $B$.
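The worklist loop of Algorithm 4 and the Post image defined above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' OCaml implementation: the relation $\preceq$ is fixed to the identity (so Minimize changes nothing, the test on line 8 never fires, and the subsumption tests on lines 9 and 10 reduce to set inclusion), and the function name `ta_included` and the encoding of rules as `(symbol, children, target)` triples are hypothetical.

```python
from itertools import product


def ta_included(rules_a, final_a, rules_b, final_b):
    """Return True iff L(A) is included in L(B) for two bottom-up tree automata.

    A rule is a triple (symbol, children, target); leaf rules have children == ().
    Assumption: the relation is the identity, so Minimize changes nothing and
    the subsumption tests on lines 9 and 10 become plain set inclusions.
    """
    def accepting(q, P):
        # q accepts in A while no state of P accepts in B
        return q in final_a and not (P & final_b)

    def post_b(sym, macros):
        # B-states reached by one `sym` rule whose i-th child lies in macros[i]
        return frozenset(t for (s, kids, t) in rules_b
                         if s == sym and len(kids) == len(macros)
                         and all(k in m for k, m in zip(kids, macros)))

    processed, nxt = set(), set()

    def try_add(p, P):
        # Lines 9-11 with the identity relation
        if any(q == p and Q <= P for (q, Q) in processed | nxt):
            return
        for old in [(q, Q) for (q, Q) in processed | nxt if q == p and P <= Q]:
            processed.discard(old)
            nxt.discard(old)
        nxt.add((p, P))

    # Lines 1-3: product-states induced by the leaf rules of A
    for (s, kids, t) in rules_a:
        if not kids:
            P = post_b(s, ())
            if accepting(t, P):
                return False
            try_add(t, P)

    while nxt:                                   # line 4
        cur = nxt.pop()                          # line 5
        processed.add(cur)
        snapshot = list(processed)
        succs = set()
        for (s, kids, t) in rules_a:             # line 6: Post(Processed)(r, R)
            if not kids:
                continue
            for tup in product(snapshot, repeat=len(kids)):
                # keep only tuples that contain (r, R) and match the rule's children
                if cur in tup and all(ps[0] == k for ps, k in zip(tup, kids)):
                    succs.add((t, post_b(s, tuple(ps[1] for ps in tup))))
        for (p, P) in succs:
            if accepting(p, P):                  # line 7
                return False
            try_add(p, P)
    return True
```

Enumerating all tuples with `itertools.product` is exponential in the rule arity and is chosen only for readability. Requiring that each tuple contains the freshly processed product-state is what keeps a tuple from being expanded more than once.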
Fig. 3. Language inclusion checking on NFAs generated from a regular model checker: (a) detailed results; (b) average execution time for different NFA pair sizes (in seconds).

8 Experimental Results

In this section, we describe our experimental results. We concentrated on experiments with inclusion checking, since it is more common than universality checking in various symbolic verification procedures, decision procedures, etc. We compared our approach, parameterized by maximal simulation (or, for tree automata, maximal upward simulation), with the previous pure antichain-based approach of [11] and with the classical subset-construction-based approach. We implemented all of the above in OCaml. We used the algorithm in [7] for computing maximal simulations. In order to make the figures easier to read, we often do not show the results of the classical algorithm, since in all of the experiments that we have done, the classical algorithm performed much worse than the other two approaches.

8.1 The Results on NFA

For language inclusion checking of NFA, we tested our approach on examples generated from the intermediate steps of a tool for abstract regular model checking [4]. In total, we have 1069 pairs of NFA generated from different verification tasks, which included verifying a version of the bakery algorithm, a system with a parameterized number of producers and consumers communicating through a double-ended queue, the bubble sort algorithm, an algorithm that reverses a circular list, and a Petri net model of the readers/writers protocol (cf. [4, 3] for a detailed description of the verification problems). In Fig. 3(a), the horizontal axis is the sum of the sizes of the pairs of automata whose language inclusion we check, and the vertical axis is the execution time (the time for computing the maximal simulation is included). Each point denotes a result from inclusion testing for a pair of NFA. Fig. 3(b) shows the average results for different NFA sizes. From the figure, one can see that our approach has a much better performance than the antichain-based one.
Also, the difference between our approach and the antichain-based approach becomes larger when the size of the NFA pairs increases. If we compare the average results on the smallest 1000 NFA pairs, our approach is 60% slower than the antichain-based approach. For the largest NFA pairs (those with size larger than 5000), our approach is 5.32 times faster than the antichain-based approach.

We also tested our approach using NFA generated from random regular expressions. We have two different tests: (1) language inclusion does not always hold and (2) language inclusion always holds (see footnote 4). The result of the first test is in Fig. 4(a).

Footnote 4: To get a sufficient number of tests for the second case, we generate two NFA $A$ and $B$ from random regular expressions, build their union automaton $C = A \cup B$, and test $L(A) \subseteq L(C)$.
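The construction of the always-holding test cases can be sketched as follows. This is a minimal Python sketch under assumptions: an NFA is encoded as a hypothetical `(initial, final, delta)` triple with `delta` mapping `(state, symbol)` to a set of successors, and `included` is a plain on-the-fly subset-construction check (the classical baseline), not the antichain or simulation-based algorithm evaluated in this section.

```python
def nfa_union(a, b):
    """Disjoint union of two NFAs given as (initial, final, delta) triples.

    States are tagged with 0 or 1 so that the two state spaces cannot clash.
    """
    def tag(nfa, i):
        init, fin, delta = nfa
        return ({(i, q) for q in init},
                {(i, q) for q in fin},
                {((i, q), s): {(i, r) for r in rs}
                 for (q, s), rs in delta.items()})

    (i0, f0, d0), (i1, f1, d1) = tag(a, 0), tag(b, 1)
    return (i0 | i1, f0 | f1, {**d0, **d1})


def included(a, b, alphabet):
    """Check that L(a) is included in L(b) by exploring pairs (q, P),
    where q is a state of a and P is a macro-state of b."""
    (ia, fa, da), (ib, fb, db) = a, b
    start = {(q, frozenset(ib)) for q in ia}
    seen, todo = set(start), list(start)
    while todo:
        q, P = todo.pop()
        if q in fa and not (P & fb):
            return False          # some word is accepted by a but not by b
        for s in alphabet:
            P2 = frozenset(r for p in P for r in db.get((p, s), ()))
            for q2 in da.get((q, s), ()):
                if (q2, P2) not in seen:
                    seen.add((q2, P2))
                    todo.append((q2, P2))
    return True
```

With `c = nfa_union(a, b)`, the call `included(a, c, alphabet)` returns `True` for any `a` and `b` over that alphabet, which is what makes the construction a convenient source of positive instances.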
Fig. 4. Language inclusion checking on NFA generated from regular expressions: (a) language inclusion does not always hold; (b) language inclusion always holds.

In Fig. 4(a), the horizontal axis is the sum of the sizes of the pairs of automata whose language inclusion we check, and the vertical axis is the execution time (the time for computing the maximal simulation is included). From Fig. 4(a), we can see that the performance of our approach is much more stable. It seldom produces extreme results. In all of the cases we tested, it always terminates within 10 seconds. In contrast, the antichain-based approach needs more than 100 seconds in the worst case. The result of the second test is in Fig. 4(b), where the horizontal axis is the length of the regular expression and the vertical axis is the average execution time of 30 cases in milliseconds. From Fig. 4(b), we observe that our approach has a much better performance than the antichain-based approach if the language inclusion holds. When the length of the regular expression is 900, our approach is almost 20 times faster than the antichain-based approach.

When the maximal simulation relation $\preceq$ is given, a natural way to accelerate the language inclusion checking is to use $\preceq$ to minimize the size of the two input automata by merging $\preceq$-equivalent states. In this case, the simulation relation becomes sparser. A question arises whether our approach still has a better performance than the antichain-based approach in this case. Therefore, we also evaluated our approach under this setting. Here again, we used the NFA pairs generated from abstract regular model checking [4]. The results show that although the antichain-based approach gains some speed-up when combined with minimization, it is still slower than our approach. The main reason is that in many cases, simulation holds only in one direction, but not in the other. Our approach can also utilize this type of relation. In contrast, the minimization algorithm merges only simulation-equivalent states.
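The difference between one-directional simulation and simulation equivalence can be made concrete with a small sketch. The code below is a naive fixpoint computation written for illustration only; it is not the algorithm of [7] used in the experiments, and the names `max_simulation` and `split_pairs` as well as the dict-based transition encoding are assumptions.

```python
def max_simulation(states, final, delta, alphabet):
    """Naive fixpoint computation of the maximal simulation preorder.

    Returns the set of pairs (p, q) such that q simulates p.
    delta maps (state, symbol) to a set of successor states.
    """
    # Start from all pairs that respect acceptance, then refine.
    sim = {(p, q) for p in states for q in states
           if p not in final or q in final}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(sim):
            # every move of p must be matched by some move of q
            ok = all(any((p2, q2) in sim for q2 in delta.get((q, s), ()))
                     for s in alphabet
                     for p2 in delta.get((p, s), ()))
            if not ok:
                sim.discard((p, q))
                changed = True
    return sim


def split_pairs(sim):
    """Separate strictly one-directional pairs from equivalent pairs."""
    one_way = {(p, q) for (p, q) in sim if p != q and (q, p) not in sim}
    both = {(p, q) for (p, q) in sim if p != q and (q, p) in sim}
    return one_way, both


# State 2 can do everything state 1 can, but not vice versa.
delta = {(1, "a"): {3}, (2, "a"): {3}, (2, "b"): {3}}
sim = max_simulation({1, 2, 3}, {3}, delta, "ab")
one_way, both = split_pairs(sim)
```

In this example `both` is empty, so merging equivalent states would not shrink the automaton at all, while `one_way` still contains the pair `(1, 2)` that a simulation-aware pruning step could exploit.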
We have also evaluated the performance of our approach using backward language inclusion checking combined with maximal backward simulation. As Wulf et al. [11] have shown in their paper, backward language inclusion checking of two automata is in fact equivalent to the forward version on the reversed automata. This can be easily generalized to our case. The result is very consistent with what we have obtained; our algorithm is still significantly better than the antichain-based approach.

8.2 The Results on TA

For language inclusion checking on TA, we tested our approach on 86 tree automata pairs generated from the intermediate steps of a regular tree model checker [2] while verifying the algorithm of rebalancing red-black trees after insertion or deletion of a leaf
node. The results are given in Table 1. Our approach has a much better performance when the size of a TA pair is large. For TA pairs of size smaller than 200, our approach is on average 1.39 times faster than the antichain-based approach. However, for those of size above 1000, our approach is on average 6.8 times faster than the antichain-based approach.

Table 1. Language inclusion checking on TA. Columns: Size; Antichain (sec.); Simulation (sec.); Diff.; # of Pairs. [Of the table body, only the "# of Pairs" column survives: 29, 15, 14, 13, 5, 10.]

9 Conclusion

We have introduced several original ways to combine simulation relations with antichains in order to optimize algorithms for checking universality and inclusion on NFA. We have also shown how the proposed techniques can be extended to the domain of tree automata. This was achieved by introducing the notion of languages of open trees accepted from tuples of tree automata states and using the maximal upward simulations parameterized by the identity proposed in our earlier work [1]. We have implemented the proposed techniques and performed a number of experiments showing that our techniques can provide a very significant improvement over currently known approaches. In the future, we would like to perform even more experiments, including, e.g., experiments where our techniques will be incorporated into the entire framework of abstract regular (tree) model checking or into some automata-based decision procedures. Apart from that, it is also interesting to develop the described techniques for other classes of automata (notably Büchi automata) and use them in a setting where the transitions of the automata are represented not explicitly but symbolically, e.g., using BDDs.

References

1. P.A. Abdulla, A. Bouajjani, L. Holík, L. Kaati, and T. Vojnar. Computing Simulations over Tree Automata. In Proc. of TACAS'08, LNCS 4963.
2. A. Bouajjani, P. Habermehl, L. Holík, T. Touili, and T. Vojnar. Antichain-Based Universality and Inclusion Testing over Nondet. Finite Tree Automata. In CIAA'08, LNCS 5148.
3. A. Bouajjani, P. Habermehl, P. Moro, and T. Vojnar. Verifying Programs with Dynamic 1-Selector-Linked Structures in Regular Model Checking.
In TACAS'05, LNCS 3440.
4. A. Bouajjani, P. Habermehl, and T. Vojnar. Abstract Regular Model Checking. In Proc. of CAV'04, LNCS. Springer.
5. J. A. Brzozowski. Canonical Regular Expressions and Minimal State Graphs for Definite Events. In Mathematical Theory of Automata.
6. D. L. Dill, A. J. Hu, and H. Wong-Toi. Checking for Language Inclusion Using Simulation Preorders. In Proc. of CAV'92, LNCS 575. Springer.
7. L. Holík and J. Šimáček. Optimizing an LTS-Simulation Algorithm. Technical Report FIT-TR, Brno University of Technology.
8. J. E. Hopcroft. An n.log n Algorithm for Minimizing States in a Finite Automaton. Technical Report CS-TR, Stanford University.
9. A. R. Meyer and L. J. Stockmeyer. The Equivalence Problem for Regular Expressions with Squaring Requires Exponential Space. In Proc. of the 13th Annual Symposium on Switching and Automata Theory. IEEE CS.
10. F. Møller.
11. M. De Wulf, L. Doyen, T. A. Henzinger, and J.-F. Raskin. Antichains: A New Algorithm for Checking Universality of Finite Automata. In Proc. of CAV'06, LNCS. Springer, 2006.
A Correctness of the NFA Universality Checking

The following lemma is implied directly by the fact that if $L(A)(P) \subseteq L(A)(R)$, then the shortest word rejected by $R$ is also rejected by $P$.

Lemma 9. Let $P$ and $R$ be two macro-states such that $L(A)(P) \subseteq L(A)(R)$. We have $\mathit{Dist}(P) \le \mathit{Dist}(R)$.

Lemma 3. The below two loop invariants hold in Algorithm 1:
1. $\neg\mathit{Univ}(\mathit{Processed} \cup \mathit{Next}) \implies \neg\mathit{Univ}(\{I\})$.
2. $\neg\mathit{Univ}(\{I\}) \implies \mathit{Dist}(\mathit{Processed}) > \mathit{Dist}(\mathit{Next})$.

Proof. It is trivial to see that the invariants hold at the entry of the loop, taking into account Lemma 2 covering the effect of the Minimize function. We show that the invariants continue to hold when the loop body is executed from a configuration of the algorithm in which the invariants hold. We use $\mathit{Processed}_{old}$ and $\mathit{Next}_{old}$ to denote the values of Processed and Next when the control is on line 4 before executing the loop body, and we use $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ to denote their values when the control gets back to line 4 after executing the loop body once. We assume that $\mathit{Next}_{old} \neq \emptyset$, and we write $R$ for the macro-state picked on line 5.

Let us start with Invariant 1. Assume first that $\mathit{Univ}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ holds. Then, $R$ must be universal, which holds also for all of its successors and, due to Lemma 2, also for their minimized versions, which may be added to Next on line 10. Hence, $\mathit{Univ}(\mathit{Processed}_{new} \cup \mathit{Next}_{new})$ holds after executing the loop body, and thus Invariant 1 holds too. Now assume that $\neg\mathit{Univ}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ holds. Then, $\neg\mathit{Univ}(\{I\})$ holds, and hence Invariant 1 must hold for $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ too.

We proceed to Invariant 2 and we assume that $\neg\mathit{Univ}(\{I\})$ holds (the other case being trivial). Hence, $\mathit{Dist}(\mathit{Processed}_{old}) > \mathit{Dist}(\mathit{Next}_{old})$ holds. We distinguish two cases:

1. $\mathit{Dist}(R) = \infty$ or $\exists Q \in \mathit{Processed}_{old}: \mathit{Dist}(Q) \le \mathit{Dist}(R)$. In this case, $\mathit{Dist}(\mathit{Processed})$ will not decrease on line 5. From $\mathit{Dist}(\mathit{Processed}_{old}) > \mathit{Dist}(\mathit{Next}_{old})$, there exists some macro-state $R'$ in $\mathit{Next}_{old}$ s.t. $\mathit{Dist}(R') = \mathit{Dist}(\mathit{Next}_{old}) < \mathit{Dist}(\mathit{Processed}_{old}) \le \mathit{Dist}(Q) \le \mathit{Dist}(R)$. Therefore, $\mathit{Dist}(\mathit{Next})$ will not change on line 5 either. Moreover, for any macro-state $P$, removing $Q$ s.t.
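$P \preceq^{\forall\exists} Q$ from Next and Processed on line 9 and then adding $P$ to Next on line 10 cannot invalidate $\mathit{Dist}(\mathit{Processed}_{new}) > \mathit{Dist}(\mathit{Next}_{new})$ since $\mathit{Dist}(P) \le \mathit{Dist}(Q)$ due to Lemmas 2 and 9. Hence, Invariant 2 must hold for $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ too.

2. $\mathit{Dist}(R) \neq \infty$ and $\neg\exists Q \in \mathit{Processed}_{old}: \mathit{Dist}(Q) \le \mathit{Dist}(R)$. In this case, the value of $\mathit{Dist}(\mathit{Processed})$ decreases to $\mathit{Dist}(R)$ on line 5. Clearly, $\mathit{Dist}(R) \neq 0$, or else we would have terminated before. Then there must be some successor $R'$ of $R$ which is either rejecting (and the loop stops without getting back to line 4) or one step closer to rejection, meaning that $\mathit{Dist}(R') < \mathit{Dist}(R)$. Moreover, $R'$ either appears in $\mathit{Next}_{new}$ or there already exists some $R'' \in \mathit{Next}_{old}$ such that $R'' \preceq^{\forall\exists} R'$, meaning that $\mathit{Dist}(\mathit{Processed}_{new}) > \mathit{Dist}(\mathit{Next}_{new})$. It is impossible that $\exists R'' \in \mathit{Processed}_{old}: R'' \preceq^{\forall\exists} R'$, because $\forall R'' \in \mathit{Processed}_{old}: \mathit{Dist}(R'') > \mathit{Dist}(R) > \mathit{Dist}(R')$, and from Lemmas 2 and 9, $R'' \preceq^{\forall\exists} R'$ implies $\mathit{Dist}(R'') \le \mathit{Dist}(R')$. Furthermore, if some macro-state is removed from Processed on line 9, $\mathit{Dist}(\mathit{Processed})$ can only grow, and hence we are done.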
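The distance measure used in this proof can be computed directly, which may help in checking small instances of Lemma 9 by hand. The sketch below assumes that $\mathit{Dist}(P)$ is the length of a shortest word not accepted from the macro-state $P$ (with value infinity when $P$ is universal), which matches how it is used above; the function name `dist` and the dict-based transition encoding are hypothetical.

```python
from collections import deque


def dist(macro, final, delta, alphabet):
    """Length of a shortest word rejected from the macro-state, or inf.

    delta maps (state, symbol) to a set of successor states.
    """
    start = frozenset(macro)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        P, d = queue.popleft()
        if not (P & final):
            # P is rejecting, so the word read so far is not accepted
            return d
        for s in alphabet:
            P2 = frozenset(r for p in P for r in delta.get((p, s), ()))
            if P2 not in seen:
                seen.add(P2)
                queue.append((P2, d + 1))
    return float("inf")
```

For instance, with `final = {0, 1}` and `delta = {(0, "a"): {0}, (0, "b"): {0}, (1, "a"): {1}}`, the language of state 1 is included in that of state 0, and accordingly `dist({1}, ...)` is 1 while `dist({0}, ...)` is infinite. Breadth-first search over macro-states guarantees that the first rejecting macro-state found is reached by a shortest word.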
Lemma 10 (Termination). Algorithm 1 eventually terminates.

Proof. For the algorithm not to terminate, it would have to be the case that some macro-state is repeatedly added into Next. However, once some macro-state $R$ is added into Next, there will always be some macro-state $Q \in \mathit{Processed} \cup \mathit{Next}$ such that $Q \preceq^{\forall\exists} R$. This holds since $R$ either stays in Next, moves to Processed, or is replaced by some $Q$ such that $Q \preceq^{\forall\exists} R$ in each iteration of the loop. Hence, $R$ cannot be added to Next for the second time since a macro-state is added to Next on line 10 only if there is no $Q \in \mathit{Processed} \cup \mathit{Next}$ such that $Q \preceq^{\forall\exists} R$.

Theorem 1. The algorithm terminates with the return value FALSE if the input automaton $A$ is not universal. Otherwise, it terminates with the return value TRUE.

Proof. From Lemma 10, the algorithm eventually terminates. It returns FALSE only if either the set of initial states is rejecting, or the minimized version $R'$ of some successor $S$ of a macro-state $R$ chosen from Next on line 5 is found rejecting. In the latter case, due to Lemma 2, $S$ is also rejecting. Then $R$ is nonuniversal, and hence $\mathit{Univ}(\mathit{Processed} \cup \mathit{Next})$ is false. By Lemma 3 (Invariant 1), $A$ is not universal. The algorithm returns TRUE only when Next becomes empty. When Next is empty, $\mathit{Dist}(\mathit{Processed}) > \mathit{Dist}(\mathit{Next})$ is not true. Therefore, by Lemma 3 (Invariant 2), $A$ is universal.

B Correctness of the TA Universality Checking

In this section, we prove correctness of Algorithm 3 in a very similar way to Algorithm 1, using suitably modified notions of distances and ranks. Let $A = (Q, \Sigma, \Delta, F)$ be a TA. For $n \ge 0$ and an $n$-tuple of macro-states $(Q_1,\ldots,Q_n)$ where $Q_i \subseteq Q$ for $1 \le i \le n$, we let $\mathit{Dist}(Q_1,\ldots,Q_n) = 0$ iff $Q_i \cap F = \emptyset$ for some $i \in \{1,\ldots,n\}$. We define $\mathit{Dist}(Q_1,\ldots,Q_n) = k \in \mathbb{N}^+ \cup \{\infty\}$ iff $Q_i \cap F \neq \emptyset$ for all $i \in \{1,\ldots,n\}$ and $k = \min(\{|t| \mid t \in T^{\Box}_n(\Sigma) \wedge t \notin L^{\Box}(A)(Q_1,\ldots,Q_n)\})$. Here, we assume $\min(\emptyset) = \infty$. For a set $\mathit{MStates}$ of macro-states over $Q$, we let $\mathit{Rank}(\mathit{MStates}) = \min(\{\mathit{Dist}(Q_1,\ldots,Q_n) \mid n \ge 1 \wedge \forall 1 \le i \le n: Q_i \in \mathit{MStates}\})$ and we define $\mathit{Univ}(\mathit{MStates}) \iff \mathit{Rank}(\mathit{MStates}) = \infty$.

Lemma 11. The below two loop invariants hold in Algorithm 3:
1. $\neg\mathit{Univ}(\mathit{Processed} \cup \mathit{Next}) \implies \neg\mathit{Univ}(\{I_a \mid a \in \Sigma_0\})$.
2. $\neg\mathit{Univ}(\{I_a \mid a \in \Sigma_0\}) \implies \mathit{Rank}(\mathit{Processed}) > \mathit{Rank}(\mathit{Processed} \cup \mathit{Next})$.

Proof. It is trivial to see that the invariants hold at the entry of the loop, taking into account Lemma 7. We show that the invariants continue to hold when the loop body is executed from a configuration of the algorithm in which the invariants hold. We use $\mathit{Processed}_{old}$ and $\mathit{Next}_{old}$ to denote the values of Processed and Next when the control is on line 4 before executing the loop body, and we use $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ to denote their values when the control gets back to line 4 after executing the loop body once. We assume that $\mathit{Next}_{old} \neq \emptyset$.

Let us start with Invariant 1. Assume first that $\mathit{Univ}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ holds. Then, $R$ can appear within tuples constructed over $\mathit{Processed}_{old} \cup \mathit{Next}_{old}$ which are universal only. In such a case, all macro-states $Q$ reachable from all tuples $T$ built over
$\mathit{Processed}_{old} \cup \mathit{Next}_{old}$ are such that when we add them to $\mathit{Processed}_{old} \cup \mathit{Next}_{old}$, the resulting set will still allow building universal tuples only. Otherwise, one could take a nonuniversal tuple containing some of the newly added macro-states $Q$, replace $Q$ by the tuple $T$ from which it arose, and obtain a nonuniversal tuple over $\mathit{Processed}_{old} \cup \mathit{Next}_{old}$, which is impossible. Hence, the possibility of adding the new macro-states to Next on line 10 cannot cause nonuniversality of $\mathit{Processed}_{new} \cup \mathit{Next}_{new}$, which due to Lemma 7 holds when adding the minimized macro-states too. Moreover, removing elements from Next or Processed cannot cause nonuniversality either. Hence, Invariant 1 holds over $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ in this case. Next, let us assume that $\neg\mathit{Univ}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ holds. Then, $\neg\mathit{Univ}(\{I_a \mid a \in \Sigma_0\})$ holds, and hence Invariant 1 must hold for $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$ too.

We proceed to Invariant 2 and we assume that $\neg\mathit{Univ}(\{I_a \mid a \in \Sigma_0\})$ holds (the other case being trivial). Hence, $\mathit{Rank}(\mathit{Processed}_{old}) > \mathit{Rank}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ holds. We distinguish two cases:

1. In order to build a tuple $T$ over $\mathit{Processed}_{old}$ and $\mathit{Next}_{old}$ that is of Dist equal to $\mathit{Rank}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$, one needs to use a macro-state $Q$ in $\mathit{Next}_{old} \setminus \{R\}$. The macro-state $Q$ stays in $\mathit{Next}_{new}$ or is replaced by a smaller macro-state added to Next on line 10 that, due to Lemma 7, can only allow one to build tuples of the same or even smaller Dist. Likewise, the macro-states accompanying $Q$ in $T$ stay in $\mathit{Next}_{new}$ or $\mathit{Processed}_{new}$ or are replaced by smaller macro-states added to Next on line 10, allowing one to build tuples of the same or smaller Dist, due to Lemma 7. Hence, moving $R$ to Processed on line 5 cannot cause the invariant to break. Moreover, adding some further macro-states to Next on line 10 can only cause $\mathit{Rank}(\mathit{Processed} \cup \mathit{Next})$ to decrease, while removing macro-states from Processed on line 9 can only cause $\mathit{Rank}(\mathit{Processed})$ to grow. Finally, replacing a macro-state in Next by a smaller one as a combined effect of lines 9 and 10 can again just decrease $\mathit{Rank}(\mathit{Processed} \cup \mathit{Next})$, due to Lemma 7.
Hence, in this case, Invariant 2 must hold over $\mathit{Processed}_{new}$ and $\mathit{Next}_{new}$.

2. One can build some tuple $T$ over $\mathit{Processed}_{old}$ and $\mathit{Next}_{old}$ that is of Dist equal to $\mathit{Rank}(\mathit{Processed}_{old} \cup \mathit{Next}_{old})$ using $\mathit{Processed}_{old} \cup \{R\}$ only. In this case, there must be tuples constructible over $\mathit{Processed}_{old} \cup \{R\}$ and containing $R$ that are not universal. We can distinguish the following subcases:
(a) From some of the tuples built over $\mathit{Processed}_{old} \cup \{R\}$ and containing $R$, a nonaccepting macro-state is reached via a single transition of $A$, and the algorithm stops without getting back to line 4.
(b) Otherwise, some of the macro-states that appear in $\mathit{Post}(\mathit{Processed}, R)$ and that will be added in the minimized form to Next must allow one to construct tuples which are of Dist smaller than those based on $R$. This holds since if a macro-state $Q$ is reached from some tuple $T$ containing $R$ by a single transition, we can replace $T$ in larger tuples leading to nonacceptance by $Q$, and hence decrease the size of the open tree needed to reach nonacceptance. Taking into account Lemma 7 to cover the effect of the minimization and using a similar reasoning as above for covering the effect of lines 9 and 10, it is then clear that Invariant 2 will remain to hold in this case.
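The invariant and termination arguments in these appendices all reason about the same kind of worklist loop. For the word case, its shape can be sketched as follows. This is a minimal Python sketch and not the authors' OCaml implementation: the names `universal`, `minimize` and `below` and the dict-based transition encoding are assumptions, states are assumed to be mutually comparable (e.g. integers) so that ties between equivalent states can be broken, and the line numbers in the comments are only those that the proofs above attribute to Algorithm 1.

```python
def universal(init, final, delta, alphabet, leq=lambda p, q: p == q):
    """Antichain-style universality check parameterized by a relation leq.

    leq(p, q) is assumed to be a preorder implying L(p) included in L(q);
    the default identity gives a plain antichain-based check.
    """
    def rejecting(P):
        return not (P & final)

    def minimize(P):
        # Drop every state dominated by another state of P; among
        # equivalent states keep the smallest one.
        return frozenset(
            q for q in P
            if not any(r != q and leq(q, r) and (not leq(r, q) or r < q)
                       for r in P))

    def below(P, R):
        # every state of P is dominated by some state of R
        return all(any(leq(p, r) for r in R) for p in P)

    I = frozenset(init)
    if rejecting(I):
        return False
    processed, nxt = set(), {minimize(I)}
    while nxt:                                        # line 4
        R = nxt.pop()                                 # line 5
        processed.add(R)
        for s in alphabet:
            succ = frozenset(r for q in R for r in delta.get((q, s), ()))
            P = minimize(succ)
            if rejecting(P):
                return False
            if not any(below(S, P) for S in processed | nxt):
                for S in [S for S in processed | nxt if below(P, S)]:
                    processed.discard(S)              # line 9
                    nxt.discard(S)
                nxt.add(P)                            # line 10
    return True
```

The guard before line 9 is the fact used in the termination proof: a macro-state is added only if nothing already stored lies below it, so no macro-state can enter Next twice.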
Lemma 12. Algorithm 3 eventually terminates.

Proof. An analogy of the proof of Lemma 10.

Theorem 2 can now be proved in a very similar way as Theorem 1.

C Correctness of the TA Language Inclusion Checking

We prove correctness of Algorithm 4 in a very similar way to Algorithm 2, using suitably modified notions of distances and ranks. Let $A = (\Sigma, Q_A, F_A, \Delta_A)$ and $B = (\Sigma, Q_B, F_B, \Delta_B)$ be two tree automata. For $n \ge 0$ and an $n$-tuple of product-states $((q_1,P_1),\ldots,(q_n,P_n))$, we let $\mathit{Dist}((q_1,P_1),\ldots,(q_n,P_n)) = 0$ iff $\varepsilon \in L^{\Box}(A,B)((q_1,P_1),\ldots,(q_n,P_n))$. Otherwise we define $\mathit{Dist}((q_1,P_1),\ldots,(q_n,P_n)) = k \in \mathbb{N}^+ \cup \{\infty\}$ iff $k = \min(\{|t| \mid t \in T^{\Box}_n(\Sigma) \wedge t \in L^{\Box}(A,B)((q_1,P_1),\ldots,(q_n,P_n))\})$. Here, we assume $\min(\emptyset) = \infty$. For a set $\mathit{PStates}$ of product-states, we let $\mathit{Rank}(\mathit{PStates}) = \min(\{\mathit{Dist}((q_1,P_1),\ldots,(q_n,P_n)) \mid n \ge 1 \wedge \forall 1 \le i \le n: (q_i,P_i) \in \mathit{PStates}\})$. The predicate $\mathit{Incl}(\mathit{PStates})$ is defined to be true iff $\mathit{Rank}(\mathit{PStates}) = \infty$.

Lemma 13. The following two loop invariants hold in Algorithm 4:
1. $\neg\mathit{Incl}(\mathit{Processed} \cup \mathit{Next}) \implies \neg\mathit{Incl}(\bigcup_{a \in \Sigma_0} \{(i, I^B_a) \mid i \in I^A_a\})$.
2. $\neg\mathit{Incl}(\bigcup_{a \in \Sigma_0} \{(i, I^B_a) \mid i \in I^A_a\}) \implies \mathit{Rank}(\mathit{Processed}) > \mathit{Rank}(\mathit{Processed} \cup \mathit{Next})$.

The proof is similar to that of Lemma 11.

Lemma 14. Algorithm 4 eventually terminates.

Proof. An analogy of the proof of Lemma 10.

Theorem 3 can now be proved in a very similar way as Theorem 1.