CPS 22 Theory of Computation

REGULAR LANGUAGES

Regular Expressions

Regular expressions are like mathematical expressions such as (5+3) * 4, except that they are built using the regular operations. (By the way, regular expressions show up in various languages: Perl, Java, Python, etc. - great for pattern-matching operations.)

Examples: a*b, (0 ∪ 1)* = (0+1)*

Definition. R is a regular expression if R is one of the following:
1. ε
2. a, for some a ∈ Σ
3. ∅
4. (R1 ∪ R2), where R1 and R2 are regular expressions
5. R1 ∘ R2, where R1 and R2 are regular expressions
6. (R1)*, where R1 is a regular expression

Note: this is an inductive definition - regular expressions are defined in terms of smaller regular expressions.
Note: the + symbol will at times be used for union, as in (0+1)*.

R⁺ = RR*
R⁺ ∪ ε = R*
L((0+1)* … (0+1)*) = {w | w contains a … in the middle}

(Note: other regular expression examples on p. 65)
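The inductive definition above maps naturally onto a small recursive data type. The following is a minimal sketch, not part of the original notes: the class names (`Epsilon`, `Symbol`, `Empty`, `Union`, `Concat`, `Star`) and the helpers `matches` and `plus` are hypothetical, and the membership test is a deliberately naive recursion on the definition (it tries every way of splitting the string), chosen for clarity rather than speed.

```python
from dataclasses import dataclass

# Hypothetical class names, one per clause of the inductive definition.
@dataclass(frozen=True)
class Epsilon:            # clause 1: ε
    pass

@dataclass(frozen=True)
class Symbol:             # clause 2: a, for some a in Σ
    char: str

@dataclass(frozen=True)
class Empty:              # clause 3: ∅
    pass

@dataclass(frozen=True)
class Union:              # clause 4: (R1 ∪ R2)
    left: object
    right: object

@dataclass(frozen=True)
class Concat:             # clause 5: R1 ∘ R2
    left: object
    right: object

@dataclass(frozen=True)
class Star:               # clause 6: (R1)*
    inner: object

def matches(r, w):
    """Decide whether the string w is in L(r), by recursion on the definition."""
    if isinstance(r, Epsilon):
        return w == ""
    if isinstance(r, Symbol):
        return w == r.char
    if isinstance(r, Empty):
        return False
    if isinstance(r, Union):
        return matches(r.left, w) or matches(r.right, w)
    if isinstance(r, Concat):
        # Try every way of splitting w into a left part and a right part.
        return any(matches(r.left, w[:i]) and matches(r.right, w[i:])
                   for i in range(len(w) + 1))
    if isinstance(r, Star):
        # ε is always in L(R*); otherwise peel off a non-empty first piece.
        return w == "" or any(
            matches(r.inner, w[:i]) and matches(r, w[i:])
            for i in range(1, len(w) + 1))
    raise TypeError(f"not a regular expression: {r!r}")

def plus(r):
    """R⁺ as shorthand for R R*."""
    return Concat(r, Star(r))

a_star_b = Concat(Star(Symbol("a")), Symbol("b"))      # a*b
sigma_star = Star(Union(Symbol("0"), Symbol("1")))     # (0+1)*
print(matches(a_star_b, "aaab"))    # True
print(matches(sigma_star, "0110"))  # True
```

Because `Star` only peels off non-empty pieces, the recursion always terminates, even when the starred expression itself matches ε.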
Equivalence of Regular Expressions and Finite Automata

Theorem. A language is regular if and only if some regular expression describes it.

We need to prove two directions:
1. If a language is described by a regular expression, then it is regular.
2. If a language is regular, then there is a regular expression that describes it.

Part 1. This is the easy part - "If a language is described by a regular expression, then it is regular."

Say that a regular expression R describes some language A. We will convert R to an NFA N that recognizes A. Then A has to be regular.

R has one of six possible forms:
1. R = ε. Then N has a single state, which is both the start state and an accept state, and no transitions.
2. R = a, for some a ∈ Σ. Then N has a start state and a separate accept state, joined by a single transition labeled a.
3. R = ∅. Then N has a single, non-accepting start state and no transitions.
4. R = R1 ∪ R2.
5. R = R1 ∘ R2.
6. R = (R1)*.

In the last three cases, the constructions given in the proofs that the class of regular languages is closed under the regular operations can be used here as well. That is, we assume R1 and R2 are recognized by NFAs N1 and N2, and use the same constructions to create N from N1 and N2.

Convert the following regular expression to an NFA: (0+1)* … (0+1)*
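The six cases of Part 1 can be sketched in code. This is an illustrative sketch under stated assumptions, not the construction as drawn in the handout: regular expressions are written as nested tuples such as `("union", r1, r2)`, and `NFABuilder`, `eps_closure`, and `accepts` are hypothetical names. The union, concatenation, and star cases follow the usual closure constructions (a new start state with ε-arrows, ε-arrows from old accept states, and so on). The sample expression (0+1)*1 is my own example, not one from the notes.

```python
from collections import defaultdict
from itertools import count

EPS = None  # label used for ε-transitions

class NFABuilder:
    """Builds one NFA fragment per sub-expression; states are fresh integers."""

    def __init__(self):
        self.delta = defaultdict(set)   # (state, label) -> set of next states
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def build(self, r):
        """Return (start_state, accept_states) for the regex tree r."""
        kind = r[0]
        if kind == "eps":                       # case 1: one accepting start state
            q = self.new_state()
            return q, {q}
        if kind == "sym":                       # case 2: start --a--> accept
            q, f = self.new_state(), self.new_state()
            self.delta[(q, r[1])].add(f)
            return q, {f}
        if kind == "empty":                     # case 3: no accept state at all
            return self.new_state(), set()
        if kind == "union":                     # case 4: new start, ε to both parts
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            q = self.new_state()
            self.delta[(q, EPS)] |= {s1, s2}
            return q, f1 | f2
        if kind == "concat":                    # case 5: ε from N1's accepts to N2's start
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            for f in f1:
                self.delta[(f, EPS)].add(s2)
            return s1, f2
        if kind == "star":                      # case 6: new accepting start, loop back
            s1, f1 = self.build(r[1])
            q = self.new_state()
            self.delta[(q, EPS)].add(s1)
            for f in f1:
                self.delta[(f, EPS)].add(s1)
            return q, f1 | {q}
        raise ValueError(f"unknown node kind: {kind!r}")

def eps_closure(delta, states):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def accepts(r, w):
    """Build the NFA for r and simulate it on the string w."""
    builder = NFABuilder()
    start, finals = builder.build(r)
    current = eps_closure(builder.delta, {start})
    for ch in w:
        moved = set()
        for q in current:
            moved |= builder.delta.get((q, ch), set())
        current = eps_closure(builder.delta, moved)
    return bool(current & finals)

# (0+1)*1 : strings over {0,1} that end in 1 (sample expression, an assumption)
ends_in_1 = ("concat", ("star", ("union", ("sym", "0"), ("sym", "1"))), ("sym", "1"))
print(accepts(ends_in_1, "0101"))  # True
print(accepts(ends_in_1, "10"))    # False
```

The NFA produced this way carries many ε-transitions, which is exactly why a hand-built NFA for the same expression can usually be made much smaller.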
Note: using the letter "e" to represent ε.
Note: For the above resulting NFA, major elimination of states and transitions can take place. Double e's can be ripped out. What else can be eliminated?

Part 2: We need to show that if a language is regular, then it is described by a regular expression. "If a language is regular, then there is a regular expression that describes it."

If a language A is regular, then it is recognized by a DFA M. We will show how to convert an arbitrary DFA M into an equivalent regular expression. The main idea is that we will gradually eliminate the states of M.

Strategy: DFA --> GNFA --> regular expression

GNFA - generalized nondeterministic finite automaton

Definition (informal; details in the book). A GNFA is a finite automaton, except:
1. There is only one start state, one final state, and the two are distinct.
2. The start state has arrows going to every other state.
3. There are no arrows going to the start state.
4. The final state has arrows coming from every other state.
5. There are no arrows leaving the final state.
6. Every state (except start & final) has arrows going to every other state.
7. The labels of the arrows are regular expressions.
To convert a DFA into a GNFA:
1. Create a new start state, with an ε-transition to the previous start state and ∅-transitions to all other states.
2. Create a new final state, with ε-transitions from the previous final state and ∅-transitions from all other states.
3. Add ∅-transitions from every state to every state (other than start & final), wherever no arrow exists yet.

Example: [Figure: a DFA and the GNFA built from it]
(Note: a more instructive example would be a DFA with multiple accept states.)

The resulting GNFA always has ≥ 2 states. (It actually always has ≥ 3 states, but now we will start removing states until it has exactly 2 states.)

To convert from GNFA to RE:
We will remove states incrementally, and at the same time build up a regular expression. Whenever the GNFA has > 2 states, we eliminate a state as follows. The state q below is any state other than the (unique) start and (unique) final states. That state q has possibly a number of transitions to it, including a self-transition, and a number of transitions from it.

[Figure: a state q with incoming transitions, a self-loop, and outgoing transitions]
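The whole DFA --> GNFA --> regular expression strategy can be sketched as follows. This is a minimal sketch under stated assumptions, not the handout's own procedure: `dfa_to_regex` and its helpers are hypothetical names, the added states are called `"q_start"` and `"q_final"` (assumed not to clash with DFA state names), alphabet symbols are assumed to be single characters, and ∅ is represented by `None`. The step inside the loop is the state-removal rule these notes go on to describe: the path through q contributes (label into q)(loop on q)*(label out of q), which is unioned with any existing direct label.

```python
import re

EMPTY = None   # stands for the label ∅ (this arrow can never be followed)
EPS = "ε"

def paren(x):
    # Single symbols (and ε) need no parentheses; everything else gets them.
    return x if len(x) == 1 else f"({x})"

def union(x, y):
    if x is EMPTY:
        return y            # ∅ + E = E
    if y is EMPTY:
        return x
    return x if x == y else f"{x}+{y}"   # E + E = E

def concat(x, y):
    if x is EMPTY or y is EMPTY:
        return EMPTY        # ∅E = E∅ = ∅
    if x == EPS:
        return y            # εE = E
    if y == EPS:
        return x            # Eε = E
    return paren(x) + paren(y)

def star(x):
    if x is EMPTY or x == EPS:
        return EPS          # ∅* = ε and ε* = ε
    return paren(x) + "*"

def dfa_to_regex(states, delta, start, accepts):
    """delta maps (state, symbol) -> state. Returns a regex string, or None for ∅."""
    S, F = "q_start", "q_final"          # hypothetical names for the added states
    sources = [S] + list(states)         # states that may have outgoing arrows
    targets = list(states) + [F]         # states that may have incoming arrows
    # DFA -> GNFA: every allowed pair gets an arrow, labeled ∅ by default.
    label = {(p, r): EMPTY for p in sources for r in targets}
    label[(S, start)] = EPS
    for q in accepts:
        label[(q, F)] = EPS
    for (p, a), r in delta.items():
        label[(p, r)] = union(label[(p, r)], a)   # parallel edges a, b become a+b
    # GNFA -> RE: remove the original states one at a time.
    for q in states:
        sources.remove(q)
        targets.remove(q)
        loop = star(label[(q, q)])
        for p in sources:
            for r in targets:
                via_q = concat(concat(label[(p, q)], loop), label[(q, r)])
                label[(p, r)] = union(via_q, label[(p, r)])
    return label[(S, F)]

# Sample DFA (an assumption, not from the notes): strings over {0,1} ending in 1.
ends_in_1 = dfa_to_regex(
    states=["a", "b"],
    delta={("a", "0"): "a", ("a", "1"): "b", ("b", "0"): "a", ("b", "1"): "b"},
    start="a",
    accepts={"b"},
)
print(ends_in_1)

# Cross-check with Python's `re`, where union is written | instead of +.
pattern = re.compile(ends_in_1.replace("+", "|"))
print(all((pattern.fullmatch(w) is not None) == w.endswith("1")
          for w in ["", "1", "01", "10", "0011", "110"]))
```

The output is heavily parenthesized because `paren` wraps anything longer than one character; the helper functions only apply the simplest identities (those involving ∅ and ε), so further simplification by hand is usually possible.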
For every pair of states p and r (including start & final), q is between p and r: p always has a transition to q, and r always has a transition from q. We remove q, and in its place put a regular expression describing how to get from p to r through q:

[Figure: p --R1--> q --R3--> r, with a self-loop labeled R2 on q, becomes p --R1 R2* R3--> r]

The above 2 GNFA segments are equivalent. This equivalence means: to get from p to r, either go through the arrow from p to r, or go through the old q-path, in which case you need to go through the old p-to-q transition, then possibly loop in q an arbitrary number of times, and then go through the old q-to-r path. So if the arrow from p to r already carries a label, say R4, its new label is R1 R2* R3 + R4.

Continue this process of eliminating states until we are left with just the start and final states. Then the label E on the single remaining arrow is the regular expression we need! E describes exactly the strings that the old DFA accepted.

Multiple edges can be merged: two parallel arrows labeled a and b between the same pair of states become a single arrow labeled a+b.

Practice.
1. Find a DFA accepting the language + ( + )*
Step 1. Find an NFA.
[Figure: NFA on states q1, q2, q3, q4]

Step 2. Transform the NFA to a DFA.
[Figure: subset-construction table and the resulting DFA, whose states are the subsets {q1}, {q2}, {q1, q4}, {q3}]

2. Find the regular expression accepted by this NFA.
[Figure: the NFA for this exercise]

Answer: (+)*(+)+ = (+)*

Simplifying formulas for REs (prove them as an exercise!):
Note: E represents the alphabet; the identities below hold just as well when E, F, and G are arbitrary regular expressions.

∅ + E = E
εE = Eε = E
∅E = E∅ = ∅
(E*)* = E*
∅* = ε
ε* = ε
E + E = E
E(F + G) = EF + EG
(F + G)E = FE + GE

The UNIX-style + operator matches a string one or more times (not zero times). We could define it here as E⁺ = EE* = E*E. Then, check as an exercise that E* = E⁺ + ε.

References:
Michael Sipser, Introduction to the Theory of Computation (2nd ed.).
Ding-Zhu Du and Ker-I Ko, Problem Solving in Automata, Languages, and Complexity.