Solutions for Selected Exercises from Introduction to Compiler Design Torben Æ. Mogensen Lst updte: My 30, 2011 1 Introduction This document provides solutions for selected exercises from Introduction to Compiler Design. Note tht in some cses there cn be severl eqully vlid solutions, of which only one is provided here. If your own solutions differ from those given here, you should use your own judgement to check if your solution is correct. 2 Exercises for chpter 1 Exercise 1.1 ) 0 42 b) The number must either be one-digit number, two-digit number different from 42 or hve t lest three significnt digits: 0 ([0 9] [1 3][0 9] 4[0 1] 4[3 9] [5 9][0 9] [1 9][0 9][0 9] + ) c) The number must either be two-digit number greter thn 42 or hve t lest three significnt digits: 0 (4[3 9] [5 9][0 9] [1 9][0 9][0 9] + ) 1
Exercise 1.2 ) 2 ε ε 5 ε 1 3 6 7 8 ε b 4 b) A = ε-closure({1}) = {1, 2, 3, 4, 5} B = move(a, ) = ε-closure({1, 6}) = {1, 6, 2, 3, 4, 5} C = move(a, b) = ε-closure({6}) = {6} D = move(b, ) = ε-closure({1, 6, 7}) = {1, 6, 7, 2, 3, 4, 5} move(b, b) = ε-closure({6}) = C E = move(c, ) = ε-closure({7}) = {7} move(c, b) = ε-closure({}) = {} F = move(d,) = ε-closure({1,6,7,8}) = {1,6,7,8,2,3,4,5} move(d, b) = ε-closure({6}) = C G = move(e, ) = ε-closure({8}) = {8} move(e, b) = ε-closure({}) = {} move(f, ) = ε-closure({1, 6, 7, 8}) = F move(f, b) = ε-closure({6}) = C move(g, ) = ε-closure({}) = {} move(g, b) = ε-closure({}) = {} Sttes F nd G re ccepting since they contin the ccepting NFA stte 8. In digrm form, we get: 2
A B D F b b b b C E G Exercise 1.5 We strt by noting tht there re no ded sttes, then we divide into groups of ccepting nd non-ccepting sttes: We now check if group A is consistent: 0 = {0} A = {1,2,3,4} A b 1 A 2 A 3 A 0 4 A 0 We see tht we must split A into two groups: B = {1, 2} C = {3, 4} And we now check these, strting with B: B b 1 B 2 C So we need to split B into it individul sttes. The only non-singleton group left is C, which we now check: C b 3 C 0 4 C 0 This is consistent, so we cn see tht we could only combine sttes 3 nd 4 into group C. The resulting digrm is: 3
0 b b 1 C 2 Exercise 1.7 ) b 0 b 1 b 2 3 b) b 0 b 1 2 b c) 1 b b b 0 2 4
Exercise 1.8 ( ( ( 0 1 2 3 ) ) ) Exercise 1.9 ) The number must be 0 or end in two zeroes: 0 1 1 0 0 1 0 2 1 b) We use tht reding 0 is the sme s multiplying by 2 nd reding 1 is the sme s multiplying by two nd dding 1. So of we hve reminder m, reding 0 gives us reminder (2m)mod 5 nd reding 1 gives us reminder (2m + 1)mod 5. We cn mke the following trnsition tble: m 0 1 0 0 1 1 2 3 2 4 0 3 1 2 4 3 4 The stte corresponding to m = 0 is ccepting. We must lso strt with reminder 0, but since the empty string isn t vlid number, we cn t use the ccepting stte s strt stte. So we dd n extr strt stte 0 tht hs the sme trnsitions s 0, but isn t ccepting: 0 1 0 0 0 0 0 1 1 2 3 1 0 4 1 1 0 5 1
c) If n = 2 b, the binry number for n is the number for followed by b zeroes. We cn mke DFA for n odd number in the sme wy we did for 5 bove by using the rules tht reding 0 in stte m gives us trnsition to stte (2m)mod nd reding 1 in stte m gives us trnsition to stte (2m + 1)mod. If we (for now) ignore the extr strt stte, this DFA hs sttes. This is miniml becuse nd 2 (the bse number of binry numbers) re reltive prime ( complete proof requires some number theory). If b = 0, the DFA for n is the sme s the DFA constructed bove for, but with one extr strt stte s we did for the DFA for 5, so the totl number of sttes is + 1. If b > 0, we tke the DFA for nd mke b extr sttes: 0 1, 0 2,..., 0 b. All of these hve trnsitions to stte 1 on 1. Stte 0 is chnged so it goes to stte 0 1 on 0 (insted of to itself). For i = 1,..., b 1, stte 0 i hs trnsition on 1 to 0 (i+1) while 0 b hs trnsition to itself on 0. 0 b is the only ccepting stte. The strt stte is stte 0 from the DFA for. This DFA will first recognize number tht is n odd multiple of (which ends in 1) nd then check tht there re t lest b zeroes fter this. The totl number of sttes is + b. So, if n is odd, the number of sttes for DFA tht recognises numbers divisible by n is n, but if n = 2 b, where is odd nd b > 0, then the number of sttes is + b. Exercise 1.10 ) φ s = s becuse L(φ) L(s) = /0 L(s) = L(s) φs = φ becuse there re no strings in φ to put in front of strings in s sφ = φ becuse there re no strings in φ to put fter strings in s φ = ε becuse φ = ε φφ = ε φ = ε b) ε c) As there cn now be ded sttes, the minimiztion lgorithm will hve to tke these into considertion s described in section 2.8.2. Exercise 1.11 In the following, we will ssume tht for the regulr lnguge L, we hve n NFA N with no ded sttes. 6
Closure under prefix. When N reds string w L, it will t ech prefix of w be t some stte s in N. By mking s ccepting, we cn mke N ccept this prefix. By mking ll sttes ccepting, we cn ccept ll prefixes of strings in L. So n utomton N p tht ccepts the prefixes of strings in L is mde of the sme sttes nd trnsitions s N, with the modifiction tht ll sttes in N p ccepting. Closure under suffix. When N reds string w L, where w = uv, it will fter reding u be in some stte s. If we mde s the strt stte of N, N would hence ccept the suffix v of w. If we mde ll sttes of N into strt sttes, we would hence be ble to ccept ll suffixes of strings in L. Since we re only llowed one strt stte, we insted dd new strt stte nd ε-trnsitions from this to ll the old sttes (including the originl strt stte, which is no longer the strt stte). So n utomton N s tht ccepts ll suffixes of strings in L is mde of the sme sttes nd trnsitions s N plus new strt stte s 0 tht hs ε-trnsitions to ll sttes in N. Closure under subsequences. A subsequence of string w cn be obtined by deleting (or jumping over) ny number of the letters in w. We cn modify N to jump over letters by for ech trnsition s c t on letter c dd n ε-trnsition s ε t between the sme pir of sttes. So n utomton N b tht ccepts ll subsequences of strings in L is mde of the sme sttes nd trnsitions s N, with the modifiction tht we dd n ε-trnsitions s ε t whenever N hs trnsition s c t. Closure under reversl. We ssume N hs only one ccepting stte. We cn sfely mke this ssumption, since we cn mke it so by dding n extr ccepting stte f nd mke ε-trnsitions from ll the originl ccepting sttes to f nd then mke f the only ccepting stte. We cn now mke N ccept the reverses of the strings from L by reversing ll trnsitions nd swp strt stte nd ccepting stte. So n utomton N r tht ccepts ll reverses of strings in L is mde the following wy: 1. Copy ll sttes (but no trnsitions) from N to N r. 2. The copy of the strt stte s 0 from N is the only ccepting stte in N r. 3. Add new strt stte s 0 to N r nd mke ε-trnsitions from s 0 to ll sttes in N r tht re copies of ccepting sttes from N. 4. When N hs trnsition s c t, dd trnsition t c s to N r, where s nd t re the copies in N r of the sttes s nd t from N. 7
3 Exercises for chpter 2 Exercise 2.3 If we t first ignore mbiguity, the obvious grmmr is P P (P) P PP i.e., the empty string, prentesis round blnced sequence nd conctention of two blnced sequences. But s the lst production is both left recursive nd right recursive, the grmmr is mbiguous. An unmbiguous grmmr is: P P (P)P which combines the two lst productions from the first grmmr into one. Exercise 2.4 ) b) S S SbS S bss Explntion: The empty string hs the sme number of s nd bs. If string strts with n, we find b to mtch it nd vice vers. A AA A SS S S SbS S bss Explntion: Ech excess hs (possibly) empty sequences of equl numbers of s nd bs. 8
c) d) D A D B A AA A SS B BB B SbS S S SbS S bss Explntion: If there re more s thn bs, we use A from bove nd otherwise we use similrly constructed B. S S SSbS S SbSS S bsss Explntion: If the string strts with n, we find lter mcthing s nd bs, if it strts with b, we find two mtching s. Exercise 2.5 ) B ε B O 1 C 1 B O 2 C 2 O 1 (B O 1 [B)B O 2 [B O 2 O 1 O 1 C 1 )B C 1 (B]B C 2 ]B C 2 C 1 C 1 9
B is blnced, O 1 /C 1 re open one nd close one, nd O 2 /C 2 re open two nd close two. b) B O 1 C 1 [ B ) B ( B ] B ε ε ε ε B C 2 O 2 C 1 C 1 [ B ( B ] B ) B ε ε ε ε Exercise 2.6 The string id id hs these two syntx trees: A A A id id A A A id id We cn mke these unmbiguous grmmrs: ) : A A id A B B B B id b) : A A A B B B id B id The trees for the string id id with these two grmmrs re: 10
A ) A B B id id b) A A B B id id Exercise 2.9 We first find the equtions for Nullble: This trivilly solves to Nullble(A) = Nullble(BA) Nullble(ε) Nullble(B) = Nullble(bBc) Nullble(AA) Nullble(A) = true Nullble(B) = true Next, we set up the equtions for FIRST: FIRST(A) = FIRST(BA) FIRST(ε) FIRST(B) = FIRST(bBc) FIRST(AA) Given tht both A nd B re Nullble, we cn reduce this to which solve to FIRST(A) = FIRST(B) FIRST(A) {} FIRST(B) = {b} FIRST(A) FIRST(A) = {, b} FIRST(B) = {, b} Finlly, we dd the production A $ nd set up the constrints for FOLLOW: 11
{$} FOLLOW(A) FIRST(A) FOLLOW(B) {} FOLLOW(A) {c} FOLLOW(B) FIRST(A) FOLLOW(A) FOLLOW(B) FOLLOW(A) which we solve to FOLLOW(A) = {, b, c, $} FOLLOW(B) = {, b, c} Exercise 2.10 Exercise 2.11 Exp numexp 1 Exp ( Exp ) Exp 1 Exp 1 + Exp Exp 1 Exp 1 Exp Exp 1 Exp 1 Exp Exp 1 Exp 1 / Exp Exp 1 Exp 1 Nullble for ech right-hnd side is trivilly found to be: The FIRST sets re lso esily found: Nullble(Exp2 Exp ) = f lse Nullble(+ Exp2 Exp ) = f lse Nullble( Exp2 Exp ) = f lse Nullble() = true Nullble(Exp3 Exp2 ) = f lse Nullble( Exp3 Exp2 ) = f lse Nullble(/ Exp3 Exp2 ) = f lse Nullble() = true Nullble(num) = f lse Nullble(( Exp )) = f lse 12
Exercise 2.12 FIRST (Exp2 Exp ) = {num, (} FIRST (+ Exp2 Exp ) = {+} FIRST ( Exp2 Exp ) = { } FIRST () = {} FIRST (Exp3 Exp2 ) = {num, (} FIRST ( Exp3 Exp2 ) = { } FIRST (/ Exp3 Exp2 ) = {/} FIRST () = {} FIRST (num) = {num} FIRST (( Exp )) = {(} We get the following constrints for ech production (bbreviting FIRST nd FOLLOW to FI nd FO nd ignoring trivil constrints like FO(Exp ) FO(Exp 1 )): Exp Exp$ : $ FO(Exp) Exp numexp 1 : FO(Exp) FO(Exp 1 ) Exp ( Exp ) Exp 1 : ) FO(Exp), FO(Exp) FO(Exp 1 ) Exp 1 + Exp Exp 1 : FI(Exp 1 ) FO(Exp), FO(Exp 1 ) FO(Exp) Exp 1 Exp Exp 1 : FI(Exp 1 ) FO(Exp), FO(Exp 1 ) FO(Exp) Exp 1 Exp Exp 1 : FI(Exp 1 ) FO(Exp), FO(Exp 1 ) FO(Exp) Exp 1 / Exp Exp 1 : FI(Exp 1 ) FO(Exp), FO(Exp 1 ) FO(Exp) Exp 1 : As FI(Exp 1 ) = {+,,, /}, we get Exercise 2.13 FO(Exp) = FO(Exp 1 ) = {+,,, /, ), $} The tble is too wide for the pge, so we split it into two, but for lyout only (they re used s single tble). num + Exp Exp Exp$ Exp Exp numexp 1 Exp 1 Exp 1 +ExpExp 1 Exp 1 Exp 1 ExpExp 1 Exp 1 Exp 1 ExpExp 1 Exp 1 13
/ ( ) $ Exp Exp Exp$ Exp Exp (Exp)Exp 1 Exp 1 Exp 1 /ExpExp 1 Exp 1 Exp 1 Exp 1 Note tht there re severl conflicts for Exp 1, which isn t surprising, s the grmmr is mbiguous. Exercise 2.14 ) b) c) E nume E E + E E E E E E nume E E Aux E Aux +E Aux E Nullble FIRST E nume f lse {num} E E Aux f lse {num} E true {} Aux +E f lse {+} Aux E f lse { } FOLLOW E {+,, $} E {+,, $} Aux {+,, $} d) num + $ E E nume E E E Aux E E E Aux Aux +E Aux E 14
Exercise 2.19 ) We dd the production T T. b) We dd the production T T $ for clculting FOLLOW. We get the constrints (omitting trivilly true constrints): which solves to T T $ : $ FOLLOW(T ) T T : FOLLOW(T ) FOLLOW(T ) T T >T : > FOLLOW(T ) T T T : FOLLOW(T ) T int : c) We number the productions: nd mke NFAs for ech: 0 T A B FOLLOW(T ) = {$} FOLLOW(T ) = {$, >, } 0: T T 1: T T >T 2: T T T 3: T int 1 T C -> D T E F 2 T G * H T I J 3 int K L 15
We then dd epsilon-trnsitions: A C E G I ε C, G, K C, G, K C, G, K C, G, K C, G, K nd convert to DFA (in tbulr form): stte NFA sttes int -> * T 0 A, C, G, K s1 g2 1 L 2 B, D, H s3 s4 3 E, C, G, K s1 g5 4 I, C, G, K s1 g6 5 F, D, H s3 s4 6 J, D, H s3 s4 nd dd ccept/reduce ctions ccording to the FOLLOW sets: stte NFA sttes int -> * $ T 0 A, C, G, K s1 g2 1 L r3 r3 r3 2 B, D, H s3 s4 cc 3 E, C, G, K s1 g5 4 I, C, G, K s1 g6 5 F, D, H s3/r1 s4/r1 r1 6 J, D, H s3/r2 s4/r2 r2 d) The conflict in stte 5 on -> is between shifting on -> or reducing to production 1 (which contins ->). Since -> is right-ssocitive, we shift. The conflict in stte 5 on * is between shifting on * or reducing to production 1 (which contins ->). Since * binds tighter, we shift. The conflict in stte 6 on -> is between shifting on -> or reducing to production 2 (which contins *). Since * binds tighter, we reduce. The conflict in stte 6 on * is between shifting on * or reducing to production 2 (which contins *). Since * is left-ssocitive, we reduce. The finl tble is: 16
stte int -> * $ T 0 s1 g2 1 r3 r3 r3 2 s3 s4 cc 3 s1 g5 4 s1 g6 5 s3 s4 r1 6 r2 r2 r2 Exercise 2.20 The method from section 3.16.3 cn be used with stndrd prser genertor nd with n unlimited number of precedences, but the restructuring of the syntx tree fterwrds is bothersome. The precedence of n opertor needs not be known t the time the opertor is red, s long s it is known t the end of reding the syntx tree. Method ) requires non-stndrd prser genertor or modifiction of generted prser, but it lso llows n unlimited number of precedences nd it doesn t require restructuring fterwrds. The precedence of n opertor needs to be known when it is red, but this knowledge cn be quired erlier in the sme prse. Method b) cn be used with stndrd prser genertor. The lexer hs rule for ll possible opertor nmes nd looks up in tble to find which token to use for the opertor (similr to how, s described in section 2.9.1, identifiers cn looked up in tble to see if they re keywords or vribles). This tble cn be updted s result of prser ction, so like method ), precedence cn be declred erlier in the sme prse, but not lter. The min disdvntge is tht the number of precedence levels nd the ssocitivity of ech level is fixed in dvnce, when the prser is constructed. Exercise 2.21 ) The grmmr describes the lnguge of ll even-length plindromes, i.e., strings tht re the sme when red forwrds or bckwrds. b) The grmmr is unmbiguous, which cn be proven by induction on the length of the string: If it is 0, the lst production is the only tht mtches. If greter thn 0, the first nd lst chrcters in the string uniquely selects the first or second production (or fils, if none mtch). After the first nd lst chrcters re removed, we re bck to the originl prsing problem, but on shorter string. By the induction hypothesis, this will hve unique syntx tree. 17
c) We dd strt production A A) nd number the productions: 0: A A 1: A A 2: A b A b 3: A We note tht FOLLOW(A) = {, b, $} nd mke NFAs for ech production: 0 A A B 1 C A D E F 2 b G A H b I J 3 K We then dd epsilon-trnsitions: A D H ε C, G, K C, G, K C, G, K nd convert to DFA (in tbulr form) nd dd ccept/reduce ctions: stte NFA sttes b $ A 0 A, C, G, K s1/r3 s2/r3 r3 g3 1 D, C, G, K s1/r3 s2/r3 r3 g4 2 H, C, G, K s1/r3 s2/r3 r3 g5 3 B cc 4 E s6 5 I s7 6 F r1 r1 r1 7 J r2 r2 r2 18
d) Consider the string. In stte 0, we shift on the first to stte 1. Here we re given choice between shifting on the second or reducing with the empty reduction. The right ction is reduction, so r3 on in stte 1 must be preserved. Consider insted the string. After the first shift, we re left with the sme choice s before, but now the right ction is to do nother shift (nd then reduce). So s1 on in stte 1 must lso be preserved. Removing ny of these two ctions will, hence, mke legl string unprseble. So we cn t remove ll conflicts. Some cn be removed, though, s we cn see tht choosing some ctions will led to sttes from which there re no legl ctions. This is true for the r3 ctions in nd b in stte 0, s these will led to stte 3 before reching the end of input. The r3 ction on b in stte 1 cn be removed, s this would indicte tht we re t the middle of the string with n before the middle nd b fter the middle. Similrly, the r3 ction on in stte 2 cn be removed. But we re still left with two conflicts, which cn not be removed: stte b $ A 0 s1 s2 r3 g3 1 s1/r3 s2 r3 g4 2 s1 s2/r3 r3 g5 3 cc 4 s6 5 s7 6 r1 r1 r1 7 r2 r2 r2 4 Exercises for chpter 3 Exercise 3.2 In Stndrd ML, nturl choice for simple symbol tbles re lists of pirs, where ech pir consists of nme nd the object bound to it. The empty symbol tble is, hence the empty list: vl emptytble = []} Binding symbol to n object is done by prepending the pir of the nme nd object to the list: fun bind(nme,object,tble) = (nme,object)::tble} 19
Looking up nme in tble is serching (from the front of the list) for pir whose first component is the nme we look for. To hndle the possibility of nme not being found, we let the lookup function return n option type: If nme is bound, we return SOME obj, where obj is the object bound to the nme, otherwise, we return NONE: fun lookup(nme,[]) = NONE lookup(nme,(nme1,obj)::tble) = if nme=nme1 then SOME obj else lookup(nme,tble) Entering nd exiting scopes don t require ctions, s the symbol tbles re persistent. Exercise 3.3 The simplest solution is to convert ll letters in nmes to lower cse before doing ny symbol-tble opertions on the nmes. Error messges should, however, use the originl nmes, so it is bd ide to do the conversion lredy during lexicl nlysis or prsing. 5 Exercises for chpter 5 Exercise 5.3 We use synthesized ttribute tht is the set of exceptions tht re thrown but not cught by the expression. If t S 1 ctch i S 2, i is not mong the exceptions tht S 1 cn throw, we give n error messge. If the set of exceptions tht the top-level sttement cn throw is not empty, we lso give n error messge. Check S (S) = cse S of throw id {nme(id)} S 1 ctch id S 2 thrown 1 = Check S (S 1 ) thrown 2 = Check S (S 2 ) i f nme(id) thrown 1 then (thrown 1 \ {nme(id)}) thrown 2 else error() S 1 or S 2 Check S (S 1 ) Check S (S 2 ) other {} 20
CheckTopLevel S (S) = thrown = Check S (S) i f thrown {} then error() 6 Exercises for chpter 6 Exercise 6.1 New temporries re t2,... in the order they re generted. Indenttion shows sub-expression level. t2 := 2 t5 := t0 t6 := t1 t4 := t5+t6 t8 := t0 t9 := t1 t7 := t8*t9 t3 := CALL _g(t4,t7) r := t2+t3 21
Exercise 6.9 ) Trns Stt (Stt,vtble, ftble,endlbel) = cse Stt of Stt 1 ; Stt 2 lbel 1 = newlbel() code 1 = Trns Stt (Stt 1,vtble, ftble,lbel 1 ) code 2 = Trns Stt (Stt 2,vtble, ftble,endlbel) code 1 ++[LABEL lbel 1 ]++code 2 id := Exp plce = lookup(vtble, nme(id)) Trns Exp (Exp,vtble, ftble, plce) if Cond lbel 1 = newlbel() then Stt 1 code 1 = Trns Cond (Cond,lbel 1,endlbel,vtble, ftble) code 2 = Trns Stt (Stt 1,vtble, ftble,endlbel) code 1 ++[LABEL lbel 1 ]++code 2 if Cond lbel 1 = newlbel() then Stt 1 lbel 2 = newlbel() else Stt 2 code 1 = Trns Cond (Cond,lbel 1,lbel 2,vtble, ftble) code 2 = Trns Stt (Stt 1,vtble, ftble,endlbel) code 3 = Trns Stt (Stt 2,vtble, ftble,endlbel) code 1 ++[LABEL lbel 1 ]++code 2 ++[GOTO endlbel, LABEL lbel 2 ] ++code 3 while Cond lbel 1 = newlbel() do Stt 1 lbel 2 = newlbel() code 1 = Trns Cond (Cond,lbel 2,endlbel,vtble, ftble) code 2 = Trns Stt (Stt 1,vtble, ftble,lbel 1 ) [LABEL lbel 1 ]++code 1 ++[LABEL lbel 2 ]++code 2 ++[GOTO lbel 1 ] repet Stt 1 lbel 1 = newlbel() until Cond lbel 3 = newlbel() code 1 = Trns Stt (Stt 1,vtble, ftble,lbel 3 ) code 2 = Trns Cond (Cond,endlbel,lbel 1,vtble, ftble) [LABEL lbel 1 ]++code 1 ++[LABEL lbel 3 ]++code 2 b) New temporries re t2,... in the order they re generted. Indenttion follows sttement structure. LABEL l1 t2 := t0 22
t3 := 0 IF t2>t3 THEN l2 ELSE endlb LABEL l2 t4 := t0 t5 := 1 t0 := t4-t5 t6 := x t7 := 10 IF t6>t6 THEN l3 ELSE l1 LABEL l3 t8 := t0 t9 := 2 t0 := t8/t9 GOTO l1 LABEL endlb 23