If-Then-Else Problem (a motivating example for LR grammars)



Similar documents
CS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

Bottom-Up Syntax Analysis LR - metódy

Bottom-Up Parsing. An Introductory Example

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

Compiler Construction

Lexical analysis FORMAL LANGUAGES AND COMPILERS. Floriano Scioscia. Formal Languages and Compilers A.Y. 2015/2016

FIRST and FOLLOW sets a necessary preliminary to constructing the LL(1) parsing table

Pushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, School of Informatics University of Edinburgh als@inf.ed.ac.

Context free grammars and predictive parsing

Syntaktická analýza. Ján Šturc. Zima 208

Approximating Context-Free Grammars for Parsing and Verification

Design Patterns in Parsing

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. CSC467 Compilers and Interpreters Fall Semester, 2005

How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Lecture 9. Semantic Analysis Scoping and Symbol Table

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

SOLUTION Trial Test Grammar & Parsing Deficiency Course for the Master in Software Technology Programme Utrecht University

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Basic Parsing Algorithms Chart Parsing

Intel Assembler. Project administration. Non-standard project. Project administration: Repository

Scanner. tokens scanner parser IR. source code. errors

A Programming Language Where the Syntax and Semantics Are Mutable at Runtime

CSCI 3136 Principles of Programming Languages

Programming Languages CIS 443

Compiler I: Syntax Analysis Human Thought

Yacc: Yet Another Compiler-Compiler

Adaptive LL(*) Parsing: The Power of Dynamic Analysis

Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology

Antlr ANother TutoRiaL

03 - Lexical Analysis

Parsing Expression Grammar as a Primitive Recursive-Descent Parser with Backtracking

Compiler Construction

Formal Grammars and Languages

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Unified Language for Network Security Policy Implementation

Compiler Construction

Programming Languages

Business Process- and Graph Grammar-Based Approach to ERP System Modelling

Compiler Design July 2004

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Programming Language Pragmatics

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, Class 4 Nancy Lynch

In-Memory Database: Query Optimisation. S S Kausik ( ) Aamod Kore ( ) Mehul Goyal ( ) Nisheeth Lahoti ( )

Grammars and Parsing. 2. A finite nonterminal alphabet N. Symbols in this alphabet are variables of the grammar.

6.231 Dynamic Programming and Stochastic Control Fall 2008

Software Engineering and Service Design: courses in ITMO University

A Test Case Generator for the Validation of High-Level Petri Nets

Natural Language to Relational Query by Using Parsing Compiler

CUP User's Manual. Scott E. Hudson Graphics Visualization and Usability Center Georgia Institute of Technology. Table of Contents.

2) Write in detail the issues in the design of code generator.

ANTLR - Introduction. Overview of Lecture 4. ANTLR Introduction (1) ANTLR Introduction (2) Introduction

NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR

Transition-Based Dependency Parsing with Long Distance Collocations

Parsing by Example. Diplomarbeit der Philosophisch-naturwissenschaftlichen Fakultät der Universität Bern. vorgelegt von.

Outline of today s lecture

Programming Languages

Natural Language Database Interface for the Community Based Monitoring System *

Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment

CS5310 Algorithms 3 credit hours 2 hours lecture and 2 hours recitation every week

Flex/Bison Tutorial. Aaron Myles Landwehr CAPSL 2/17/2012

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

Language Processing Systems

7. Building Compilers with Coco/R. 7.1 Overview 7.2 Scanner Specification 7.3 Parser Specification 7.4 Error Handling 7.5 LL(1) Conflicts 7.

The Mjølner BETA system

Eindhoven University of Technology

Implementation of Recursively Enumerable Languages using Universal Turing Machine in JFLAP

Implementing Programming Languages. Aarne Ranta

Systematically Enhancing Black-Box Web Vulnerability Scanners

CGT 581 G Procedural Methods L systems

Migrating Web Applications

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

Fast nondeterministic recognition of context-free languages using two queues

Programming Language Concepts for Software Developers

Briki: a Flexible Java Compiler

Automata and Computability. Solutions to Exercises

Semantic parsing with Structured SVM Ensemble Classification Models

Performance Analysis and Optimization on Lucene

CS154. Turing Machines. Turing Machine. Turing Machines versus DFAs FINITE STATE CONTROL AI N P U T INFINITE TAPE. read write move.

Master of Sciences in Informatics Engineering Programming Paradigms 2005/2006. Final Examination. January 24 th, 2006

Semester Review. CSC 301, Fall 2015

Language Interface for an XML. Constructing a Generic Natural. Database. Rohit Paravastu

SRM UNIVERSITY FACULTY OF ENGINEERING & TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF SOFTWARE ENGINEERING COURSE PLAN

Simulation-Based Security with Inexhaustible Interactive Turing Machines

C Compiler Targeting the Java Virtual Machine

Transcription:

If-Then-Else Problem (a motivating example for LR grammars) If x then y else z If a then if b then c else d this is analogous to a bracket notation when left brackets >= right brackets: [ [ ] ([ i ] j, i>=j) Grammar: S [ S C S λ C ] [ [ ] SSλC or SSCλ ambiguous C λ 24

Solving the If-Then-Else Problem The ambiguity exists at the language level as well. The semantics need to be defined properly: e.g., the then part belongs to the closest matching if S [ S S S1 S1 [ S1 ] S1 λ This grammer is still not LL(1), nor is it LL(k) Show that this is so. 25

Parsing the If-Then-Else Construct LL(k) parsers can look ahead k tokens. (LL(k) will not be discussed in this class, but be able to explain in English what LL(k) means and recognize a simple LL(k) grammar.) For the If-Then-Else construct, a parsing strategy is needed that can look ahead at the entire RHS of a production (not just k tokens) before deciding what production to take. LR parsers can do that. 26

LR Parsers A Shift-Reduce Parser: Basic idea: put tokens on a stack until an entire production is found. Issues: recognize the end point of a production find the length of the production (RHS) find the corresponding nonterminal (i.e., the LHS of the production) 27

Data Structures for Shift-Reduce Parsers At each state, given the next token, a goto table defines the successor state an action table defines whether to shift (put the next state and token on the stack) reduce (a RHS is found, process the production) terminate (parsing is complete) 28

Example of Shift-Reduce Parsing Consider the simple Grammar: <program> begin <stmts> end $ <stmts> SimpleStmt ; <stmts> <stmts> begin <stmts> end ; <stmts> <stmts> λ Shift Reduce Driver Algorithm on page 142, Fig 6.1 29

LR Parser Generators (OR: HOW TO COME UP WITH GOTO AND ACTION TABLES) Basic idea: shift tokens onto the stack; at any step keep the set of productions that match the read-in tokens. Reduce the RHS of recognized productions to the corresponding nonterminal on the LHS of the production by replacing the RHS tokens on the stack wit the LHS non-terminal. 30

LR(k) Parsers LR(0) parsers: no lookahead predict which production to use by looking only at the symbols already read. LR(k) parsers: k symbol lookahead most powerful class of deterministic bottom-up parsers (in terms of grammars read) 31

Terminology for LR Parsing Configuration: A X 1... X i X i+1... X j Configuration set: marks the point to which the production has been recognized all the configurations that apply at a given point in the parse. For example: A B CD A B GH T B Z 32

Configuration Closure Set Include all configurations necessary to recognize the next symbol after the. For example: S E$ E E + T T T ID (E) closure0({s E$})={ S E $ E E+T E T T ID T (E) } 33

Successor Configuration Set Starting with the initial configuration set s0 = closure0({s α $}), an LR(0) parser will find the successor, given a (next) symbol X. X can be either a terminal (a token from the scanner) or a nonterminal (the result of a reduction) Determining the successor s = go_to(s,x) : 1. pick all configurations in s of the form A β X γ 2. take closure0 of this set 34

Building the Characteristic Finite State Machine (CFSM) Nodes are configuration sets Arcs are go_to relationships Example: S S$ S ID State 0: State 1: ID S S$ S ID S ID State 2: State 3: S $ S S $ S S $ 35

Building the go_to Table Building the go_to table is straightforward from the CFSM: For the previous example the table looks like this: State Symbol ID $ S 0 1 2 1 2 3 3 36

Building the Action Table Given the configuration set s: We shift if the next token matches a terminal after the in some configuration in A α a β s and a V t, else error We reduce production P if the is at the end of a production B α s and production P is B α 37

Exercise Create CFSM, go_to table, and action table for S E$ E E + T T T ID (E) 1: S E$ 2: E E + T 3: E T 4: T ID 5: T (E) 38

LR(0) and LR(k) Grammars For LR(0) grammars the action table entries just described are unique. For most useful grammars we cannot decide on shift or reduce based on the symbols read. Instead, we have to look ahead k tokens. This leads to LR(k). However, it is possible to create an LR(0) grammar that is equivalent to any given LR(k) grammar (provided there is an end marker). This is only of theoretical interest because this grammar may be very complex and unreadable. 39