Ambiguity, closure properties, and Chomsky's normal form



Overview of ambiguity and Chomsky's standard form

The notion of ambiguity
Convention: we assume that in every substitution, the leftmost remaining variable is the one that is replaced. When this convention is applied to all substitutions in a derivation, we speak of a leftmost derivation.
Definition: A CFG is said to be ambiguous if some string in its language has two different leftmost derivations. In that case, the string necessarily has two different parse trees.
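To make the definition concrete, here is a minimal sketch that counts the leftmost derivations of a string by brute force. The grammar E -> E+E | a is a hypothetical illustration, not one of the grammars from these slides, and the function name is also an assumption. The search assumes single-character symbols and no rule with an empty right-hand side, so sentential forms never shrink.

```python
from collections import deque

# Hypothetical grammar (not from the slides): E -> E+E | a
RULES = {"E": ["E+E", "a"]}


def count_leftmost_derivations(rules, start, target):
    """Count distinct leftmost derivations of `target` from `start`.

    Assumes single-character symbols and no empty right-hand sides,
    so any form longer than the target can be discarded.
    """
    count = 0
    queue = deque([start])  # each queue entry stands for one derivation path
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if there is one.
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            if form == target:
                count += 1
            continue
        # The terminal prefix left of that variable can no longer change.
        if len(form) > len(target) or form[:pos] != target[:pos]:
            continue
        for rhs in rules[form[pos]]:
            queue.append(form[:pos] + rhs + form[pos + 1:])
    return count


if __name__ == "__main__":
    # More than one leftmost derivation means the grammar is ambiguous.
    print(count_leftmost_derivations(RULES, "E", "a+a+a"))
```

A count greater than one for any string is enough to show that the grammar is ambiguous; a count of one for the strings tried proves nothing in general.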

Equivalence and ambiguity
Definition: Two context-free grammars are said to be equivalent if and only if they generate the same context-free language.
Remark: An ambiguous context-free grammar may have an equivalent unambiguous context-free grammar.
Definition: Given an ambiguous grammar, the process of finding an equivalent grammar that is not ambiguous is called disambiguation. If no such grammar exists, the language is called inherently ambiguous.

Example of an ambiguous context-free grammar
Let G = ({S, A}, {0, 1}, R, S), where R = { [productions lost in extraction] } and w = [...] ∈ L(G).
Then the string w has two leftmost derivations: [derivations lost in extraction]

The grammar is not inherently ambiguous
The ambiguity stems from the fact that one of the recursive rules for A can be used in either the first or the second substitution. In either case the same string w ∈ L(G) is derived.
However, L(G) is not inherently ambiguous. The following context-free grammar eliminates the above ambiguity.

Disambiguation
Let G' = ({S, A, B}, {0, 1}, R', S), where R' = { [productions lost in extraction] }.
Claim: L(G') = L(G), and under G' each string is derived by a unique sequence of leftmost substitutions. That is, the new grammar derives the language unambiguously.
Proof: exercise!

Closure properties
Theorem 9.1: Context-free languages are closed under the regular operations.
Sketch of the proof: Let L_1 and L_2 be context-free languages. Let G_1 = (V_1, A, R_1, S_1) and G_2 = (V_2, A, R_2, S_2) be context-free grammars such that L_1 = L(G_1) and L_2 = L(G_2) respectively. We may assume without loss of generality that V_1 ∩ V_2 = ∅.

Proof (cont.)
Define the context-free grammars:
1. U = ({S} ∪ V_1 ∪ V_2, A, R_1 ∪ R_2 ∪ {S → S_1 | S_2}, S)
2. C = ({S} ∪ V_1 ∪ V_2, A, R_1 ∪ R_2 ∪ {S → S_1 S_2}, S)
3. T = ({S} ∪ V_1, A, R_1 ∪ {S → S_1 S | λ}, S)
Here we assume S ∉ V_1 ∪ V_2.
Claim: L(U) = L_1 ∪ L_2, L(C) = L_1 L_2, and L(T) = L_1*.
The rest of the proof (i.e. verification of the claim) is left as an exercise.
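The three constructions can be sketched in code as follows. This is an illustration under stated assumptions, not part of the original proof: a grammar is represented as a dict with a "rules" map (variable to list of right-hand sides) and a "start" variable, the function names are hypothetical, and strings_up_to is a crude brute-force enumerator that is only adequate for tiny grammars such as the two samples below.

```python
def _check_disjoint(start, *grammars):
    """Enforce the proof's assumptions: disjoint variable sets, fresh start."""
    seen = set()
    for g in grammars:
        variables = set(g["rules"])
        assert not (seen & variables), "variable sets must be disjoint"
        seen |= variables
    assert start not in seen, "new start variable must be fresh"


def union_grammar(g1, g2, start="S"):
    _check_disjoint(start, g1, g2)
    rules = {**g1["rules"], **g2["rules"]}
    rules[start] = [[g1["start"]], [g2["start"]]]   # S -> S1 | S2
    return {"rules": rules, "start": start}


def concat_grammar(g1, g2, start="S"):
    _check_disjoint(start, g1, g2)
    rules = {**g1["rules"], **g2["rules"]}
    rules[start] = [[g1["start"], g2["start"]]]     # S -> S1 S2
    return {"rules": rules, "start": start}


def star_grammar(g1, start="S"):
    _check_disjoint(start, g1)
    rules = dict(g1["rules"])
    rules[start] = [[g1["start"], start], []]       # S -> S1 S | lambda
    return {"rules": rules, "start": start}


def strings_up_to(g, max_len):
    """Terminal strings of length <= max_len, found by leftmost expansion.

    The bound on the form length is a crude cut-off that keeps the search
    finite; it may miss strings for grammars with many vanishing variables.
    """
    rules, seen, found = g["rules"], set(), set()
    stack = [(g["start"],)]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        n_terminals = sum(1 for s in form if s not in rules)
        if n_terminals > max_len or len(form) > 2 * max_len + 2:
            continue
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            found.add("".join(form))
            continue
        for rhs in rules[form[pos]]:
            stack.append(form[:pos] + tuple(rhs) + form[pos + 1:])
    return found


# Sample grammars (assumptions): L1 = {0^n 1^n : n >= 1}, L2 = {1^m : m >= 1}
G1 = {"rules": {"S1": [["0", "S1", "1"], ["0", "1"]]}, "start": "S1"}
G2 = {"rules": {"S2": [["1", "S2"], ["1"]]}, "start": "S2"}
```

Comparing strings_up_to on the combined grammar with the set computed directly from the two languages gives a quick sanity check of the claim on short strings; it does not replace the proof.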

Applications
1. The language M = {0^{n_1} 1^{n_1} 0^{n_2} 1^{n_2} : n_1, n_2 ≥ 0} is context-free, since it is the concatenation of L = {0^n 1^n : n ≥ 0} with itself.
2. The language {0^{n_1} 1^{n_1} ... 0^{n_k} 1^{n_k} : k ∈ N, n_i ≥ 0 (i = 1, ..., k)} is also context-free, since it is the star-closure of L.

Chomsky normal form
Theorem 9.2 (Chomsky normal form): Any context-free grammar is equivalent to a grammar G = (V, A, R, S) whose productions are each of one of the following forms:
Z → UT, where Z, U, T ∈ V and U ≠ S, T ≠ S; or
Z → a, where a ∈ A; or
S → λ.
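As a small illustration of the three allowed production forms, the following sketch checks whether a grammar is in Chomsky normal form. The representation (a dict from each variable to its list of right-hand sides, with every variable present as a key) and the name is_cnf are assumptions made for this example. It also assumes the reading of the theorem in which the start variable may not appear on a right-hand side.

```python
def is_cnf(rules, start):
    """Return True if every production has one of the three CNF forms.

    rules: dict mapping each variable to a list of right-hand sides,
    where a right-hand side is a list of symbols. A symbol counts as a
    variable exactly when it is a key of `rules`.
    """
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                ok = head == start                 # only S -> lambda
            elif len(body) == 1:
                ok = body[0] not in rules          # Z -> a, a terminal
            elif len(body) == 2:
                # Z -> U T with U, T variables other than the start variable
                ok = all(s in rules and s != start for s in body)
            else:
                ok = False
            if not ok:
                return False
    return True


# Hypothetical grammars for {0^n 1^n : n >= 0}
NOT_CNF = {"S": [["0", "S", "1"], []]}
IN_CNF = {
    "S0": [["A", "T"], ["A", "B"], []],
    "S": [["A", "T"], ["A", "B"]],
    "T": [["S", "B"]],
    "A": [["0"]],
    "B": [["1"]],
}
```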

Method 9.1: Transforming a context-free grammar into a grammar in Chomsky normal form
Given a context-free grammar G = (V, A, R, S):
1. Add a new start variable S_0 and add the rule S_0 → S.
2. Remove all rules of the form u → λ, u ∈ V, u ≠ S_0, adding the rules that are necessary to preserve the grammar's productions.
3. Eliminate unit rules. These are rules of the form u → v, u, v ∈ V, and their elimination requires adding the rule u → w wherever v → w appears, unless u → w was already eliminated.
4. Convert rules of the form u → a_1 a_2 a_3 ... a_k, k ≥ 3, to a set of rules of the form u → a_1 u_1, u_1 → a_2 u_2, ..., u_{k-2} → a_{k-1} a_k, where the u_i's are new variables.
5. In rules of the form u → a_1 ... a_k, k ≥ 2, replace each a_i that is a terminal by a new variable u_i and add the rule u_i → a_i.
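The last two steps of the method are mechanical, and the sketch below shows one possible way to carry them out. It covers steps 4 and 5 only and assumes that λ-rules and unit rules have already been removed. The function name, the grammar representation, and the fresh variable names "_N1", "_N2", ... are all assumptions made for illustration.

```python
def steps_4_and_5(rules):
    """Apply steps 4 and 5 to a grammar without lambda-rules or unit rules.

    rules: dict mapping each variable to a list of right-hand sides
    (lists of symbols). Returns a new dict; the input is not modified.
    """
    variables = set(rules)
    result = {head: [] for head in rules}
    fresh_count = 0

    def fresh():
        # Hypothetical naming scheme for new variables.
        nonlocal fresh_count
        fresh_count += 1
        name = f"_N{fresh_count}"
        assert name not in variables, "fresh name clashes with a variable"
        result[name] = []
        return name

    # Step 4: split right-hand sides of length >= 3 into chains of pairs.
    for head, bodies in rules.items():
        for body in bodies:
            body = list(body)
            current = head
            while len(body) > 2:
                nxt = fresh()
                result[current].append([body[0], nxt])
                current, body = nxt, body[1:]
            result[current].append(body)

    # Step 5: in right-hand sides of length 2, replace each terminal
    # by a new variable that derives exactly that terminal.
    terminal_var = {}
    for head in list(result):
        for body in result[head]:
            if len(body) != 2:
                continue
            for i, sym in enumerate(body):
                if sym in result:          # already a variable
                    continue
                if sym not in terminal_var:
                    terminal_var[sym] = fresh()
                    result[terminal_var[sym]].append([sym])
                body[i] = terminal_var[sym]
    return result


# Hypothetical input: S -> 0S1 | 01
SAMPLE = {"S": [["0", "S", "1"], ["0", "1"]]}
```

For the sample input, the rule S → 0S1 is first split into S → 0 _N1 and _N1 → S 1, and the terminals 0 and 1 are then replaced by the new variables _N2 and _N3.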

Example
The context-free grammar G = ({S}, {0, 1}, R, S) with R = { S → [...] | λ } is not in Chomsky normal form. We will find an equivalent context-free grammar N, which is in Chomsky normal form, using the previous procedure (Method 9.1).

Example (cont.)
1. Define a new start variable. The old productions are kept: S → [...] | λ

Example (cont.)
2. Eliminate unitary productions. In our case we have: [productions lost in extraction]
Recall that the removal is performed by eliminating them and adding all rules necessary to preserve the original grammar's derivations.

Example (cont.)
The derivations that involve these unitary productions (and have to be preserved) are: [derivations lost in extraction]
Thus, the new set of rules is: [rules lost in extraction]

Example (cont.)
The latter set of rules defines an equivalent grammar that is not yet in Chomsky normal form. Indeed, no rule except the one that associates the start variable with the null string is in Chomsky normal form. We'll fix this in the next step:
3. Decompose all remaining productions that are not in Chomsky normal form into sets of productions with two variables or a single terminal on the right-hand side.

Example (cont.)
This applies to three of the current productions: [productions lost in extraction]
We transform each of them into a set of productions in Chomsky normal form as follows.

Example (cont.)
Consider the first production: [production lost in extraction]
This production is clearly equivalent to the combined action of the following productions: [productions lost in extraction]

Example (cont.)
As for the second production, we have: [productions lost in extraction]
Since some of these productions were already defined, we replace the variables: [substitutions lost in extraction]

Example (cont.)
After replacing, we get: [productions lost in extraction]

Example (cont.)
Similarly, using the already defined productions, we transform the remaining productions: [productions lost in extraction]

Example (cont.)
In summary, we get the grammar N: [definition of N and its rule set lost in extraction]

Does it work?
Let's derive a few strings: [derivations lost in extraction]

What's so important about Chomsky's normal form?
Theorem 9.3: A context-free grammar in Chomsky normal form derives a string of length n ≥ 1 in exactly 2n - 1 substitutions.
Proof: Let G be a context-free grammar in Chomsky normal form and let w = w_1 ... w_n ∈ L(G). Since each w_i is a terminal, it could only be produced by substituting a rule of the form v_i → w_i. Thus, the derivation of w involves n applications of such rules. But the formation of the string v_1 ... v_n of variables can only be done by applying rules of the form u → zv, where u, z, v are variables. Each such rule lengthens the string of variables by exactly one symbol, starting from the single start variable, so there are exactly n - 1 applications of such rules involved in the generation of v_1 ... v_n. In total, the derivation uses n + (n - 1) = 2n - 1 substitutions.
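The count in Theorem 9.3 can be checked experimentally on a small grammar. The sketch below enumerates every leftmost derivation of a target string and records its number of steps. The grammar (a Chomsky-normal-form grammar for {0^n 1^n : n ≥ 1}) and the function name are assumptions made for illustration; they are not taken from the slides.

```python
# Hypothetical CNF grammar for {0^n 1^n : n >= 1}:
#   S -> A T | A B,  T -> S B,  A -> 0,  B -> 1
RULES = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": [("0",)],
    "B": [("1",)],
}


def derivation_lengths(rules, start, target):
    """Step counts of all leftmost derivations of `target` from `start`.

    Assumes a CNF grammar without a lambda-rule, so sentential forms
    never shrink and can be pruned once they exceed the target length.
    """
    lengths = []
    stack = [((start,), 0)]
    while stack:
        form, steps = stack.pop()
        if len(form) > len(target):
            continue
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            if "".join(form) == target:
                lengths.append(steps)
            continue
        # Symbols left of the leftmost variable are terminals and are final.
        if "".join(form[:pos]) != target[:pos]:
            continue
        for rhs in rules[form[pos]]:
            stack.append((form[:pos] + rhs + form[pos + 1:], steps + 1))
    return lengths


if __name__ == "__main__":
    for w in ["01", "0011", "000111"]:
        print(w, derivation_lengths(RULES, "S", w), 2 * len(w) - 1)
```

Every recorded step count should equal 2n - 1 for a target of length n, whichever derivation is taken.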

A note on context-sensitive grammars
Definition: A context-sensitive grammar is a quadruple G = (V, A, R, S), where V, A and S are just as in a context-free grammar, but the set of rules R includes rules of the form aub → cvd, where u and v are variables, and a, b, c, and d are strings of variables and literals.

Example
Let G = ({S, R, T, U, V, W}, {a, b, c}, R, S), where
R = { S → aRc, R → aRT | b, bTc → bbcc, bTT → bbUT, UT → UU, UUc → VUc → Vcc, UV → VV, bVc → bbcc, bVV → bbWV, WV → WW, WWc → TWc → Tcc, WT → TT }
This grammar generates the canonical non-context-free language {a^n b^n c^n : n ≥ 1}.

A derivation
The derivation of the string aaabbbccc is:
S ⇒ aRc ⇒ aaRTc ⇒ aaaRTTc ⇒ aaabTTc ⇒ aaabbUTc ⇒ aaabbUUc ⇒ aaabbVUc ⇒ aaabbVcc ⇒ aaabbbccc
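The derivation above can be replayed mechanically. The following sketch checks that each step is obtained from the previous one by applying a single rule at a single position. It assumes that the chained rules on the example slide, UUc → VUc → Vcc and WWc → TWc → Tcc, each abbreviate two separate rules; the function and variable names are hypothetical.

```python
# Rules of the example grammar as (left-hand side, right-hand side) pairs.
# Assumption: "UUc -> VUc -> Vcc" is read as UUc -> VUc and VUc -> Vcc,
# and likewise for "WWc -> TWc -> Tcc".
RULES = [
    ("S", "aRc"), ("R", "aRT"), ("R", "b"),
    ("bTc", "bbcc"), ("bTT", "bbUT"), ("UT", "UU"),
    ("UUc", "VUc"), ("VUc", "Vcc"), ("UV", "VV"),
    ("bVc", "bbcc"), ("bVV", "bbWV"), ("WV", "WW"),
    ("WWc", "TWc"), ("TWc", "Tcc"), ("WT", "TT"),
]

DERIVATION = [
    "S", "aRc", "aaRTc", "aaaRTTc", "aaabTTc", "aaabbUTc",
    "aaabbUUc", "aaabbVUc", "aaabbVcc", "aaabbbccc",
]


def one_step(src, dst):
    """True if `dst` results from `src` by one rule applied at one position."""
    for lhs, rhs in RULES:
        start = src.find(lhs)
        while start != -1:
            if src[:start] + rhs + src[start + len(lhs):] == dst:
                return True
            start = src.find(lhs, start + 1)
    return False


def is_valid_derivation(steps):
    """Check every consecutive pair of sentential forms."""
    return all(one_step(a, b) for a, b in zip(steps, steps[1:]))


if __name__ == "__main__":
    print(is_valid_derivation(DERIVATION))
```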

Note on normal forms
Principle: Every context-sensitive grammar which does not generate the empty string can be transformed into an equivalent grammar in Kuroda normal form.
Remark: The normal form will not in general be a context-sensitive grammar, but will be a non-contracting grammar.