Bottom-Up Parsing. An Introductory Example



Similar documents
CS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

If-Then-Else Problem (a motivating example for LR grammars)

Basic Parsing Algorithms Chart Parsing

Syntaktická analýza. Ján Šturc. Zima 208

Bottom-Up Syntax Analysis LR - metódy

Pushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, School of Informatics University of Edinburgh als@inf.ed.ac.

FIRST and FOLLOW sets a necessary preliminary to constructing the LL(1) parsing table

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. CSC467 Compilers and Interpreters Fall Semester, 2005

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

SOLUTION Trial Test Grammar & Parsing Deficiency Course for the Master in Software Technology Programme Utrecht University

Compiler I: Syntax Analysis Human Thought

Last not not Last Last Next! Next! Line Line Forms Forms Here Here Last In, First Out Last In, First Out not Last Next! Call stack: Worst line ever!

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Yacc: Yet Another Compiler-Compiler

Data Structures and Algorithms V Otávio Braga

Chapter 4: Computer Codes

Lecture 9. Semantic Analysis Scoping and Symbol Table

Formal Grammars and Languages

Context free grammars and predictive parsing

Approximating Context-Free Grammars for Parsing and Verification

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

Automata and Computability. Solutions to Exercises

Pushdown Automata. place the input head on the leftmost input symbol. while symbol read = b and pile contains discs advance head remove disc from pile

PES Institute of Technology-BSC QUESTION BANK

3.1. RATIONAL EXPRESSIONS

ÖVNINGSUPPGIFTER I SAMMANHANGSFRIA SPRÅK. 15 april Master Edition

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

Introduction to the 1st Obligatory Exercise

Automata and Formal Languages

Lexical analysis FORMAL LANGUAGES AND COMPILERS. Floriano Scioscia. Formal Languages and Compilers A.Y. 2015/2016

03 - Lexical Analysis

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Pseudo code Tutorial and Exercises Teacher s Version

Negative Integer Exponents

Fast nondeterministic recognition of context-free languages using two queues

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

Preliminary Mathematics

Outline of today s lecture


Algorithms and Data Structures

Regular Expressions and Automata using Haskell

FIRST AND FOLLOW Example

Fundamentele Informatica II

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Common Data Structures

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b

CS5236 Advanced Automata Theory

Chapter 2: Algorithm Discovery and Design. Invitation to Computer Science, C++ Version, Third Edition

Data Structures Using C++ 2E. Chapter 5 Linked Lists

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

Compiler Construction

Computer Science 281 Binary and Hexadecimal Review

DATA STRUCTURES USING C

Parsing by Example. Diplomarbeit der Philosophisch-naturwissenschaftlichen Fakultät der Universität Bern. vorgelegt von.

Recursion vs. Iteration Eliminating Recursion

Data Structures. Topic #12

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

St S a t ck a ck nd Qu Q eue 1

C H A P T E R Regular Expressions regular expression

Keyboard Basics. By Starling Jones, Jr.

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

PLG: a Framework for the Generation of Business Process Models and their Execution Logs

Lecture 2 Matrix Operations

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation

Stacks. Stacks (and Queues) Stacks. q Stack: what is it? q ADT. q Applications. q Implementation(s) CSCU9A3 1

Outline. Computer Science 331. Stack ADT. Definition of a Stack ADT. Stacks. Parenthesis Matching. Mike Jacobson

1 Solving LPs: The Simplex Algorithm of George Dantzig

Module 2 Stacks and Queues: Abstract Data Types

Web Data Extraction: 1 o Semestre 2007/2008

We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: b + 90c = c + 10b

Grammars and Parsing. 2. A finite nonterminal alphabet N. Symbols in this alphabet are variables of the grammar.

FORDHAM UNIVERSITY CISC Dept. of Computer and Info. Science Spring, Lab 2. The Full-Adder

Semester Review. CSC 301, Fall 2015

Chapter 4 -- Decimals

8 Divisibility and prime numbers

MATHEMATICAL NOTATION COMPARISONS BETWEEN U.S. AND LATIN AMERICAN COUNTRIES

PSTricks. pst-ode. A PSTricks package for solving initial value problems for sets of Ordinary Differential Equations (ODE), v0.7.

Generalizing Overloading for C++2000

ECE 250 Data Structures and Algorithms MIDTERM EXAMINATION /5:15-6:45 REC-200, EVI-350, RCH-106, HH-139

NFAs with Tagged Transitions, their Conversion to Deterministic Automata and Application to Regular Expressions

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

C Compiler Targeting the Java Virtual Machine

HW3: Programming with stacks

Sowing Games JEFF ERICKSON

Introduction to Data Structures

Special Notice. Rules. Weiss Schwarz Comprehensive Rules ver Last updated: October 15 th Outline of the Game

Design Patterns in Parsing

Year 9 set 1 Mathematics notes, to accompany the 9H book.

Introduction to Fractions

=

Texas Essential Knowledge and Skills Correlation to Video Game Design Foundations 2011 N Video Game Design

8.2. Solution by Inverse Matrix Method. Introduction. Prerequisites. Learning Outcomes

CAs and Turing Machines. The Basis for Universal Computation

Transcription:

Bottom-Up Parsing Bottom-up parsing is more general than top-down parsing Just as efficient Builds on ideas in top-down parsing Bottom-up is the preferred method in practice Reading: Section 4.5 An Introductory Example Bottom-up parsers don t need leftfactored grammars Hence we can revert to the natural grammar for our example: E T + E T T int * T int (E) Consider the string: int * int + int

The Idea Bottom-up parsing reduces a string to the start symbol by inverting productions: int * int + int int * T + int T + int T + T T + E E T int T int * T T int E T E T + E Observation Read productions from bottom-up parse in reverse (i.e., from bottom to top) This is a rightmost derivation! int * int + int int * T + int T + int T + T T + E E T int T int * T T int E T E T + E

A Bottom-up Parse int * int + int int * T + int T + int T + T T + E E int E T E * T T int + int A bottom-up parser traces a rightmost derivation in reverse! Bottom-up Parse in Detail int * (int + int) + int

Trivial Bottom-Up Parsing Algorithm Let I = input string repeat pick a non-empty substring β of I where X βis a production if no such β, backtrack replace one β by X in I until I = S (the start symbol) or all possibilities are exhausted Questions Does this algorithm terminate? How fast is the algorithm? Does the algorithm handle all cases? How do we choose the substring to reduce at each step?

Where Do Reductions Happen Important Fact #1 has an interesting consequence: Let αβω be a step of a bottom-up parse Assume the next reduction is by X β Then ω is a string of terminals Why? Because αxω αβωis a step in a right-most derivation Notation Idea: Split string into two substrings Right substring is as yet unexamined by parsing (a string of terminals) Left substring has terminals and non-terminals The dividing point is marked by a The is not part of the string Initially, all input is unexamined x 1 x 2... x n

Shift-Reduce Parsing Bottom-up parsing uses two kinds of actions: Shift Reduce Shift Shift: Move one place to the right Shifts a terminal to the left string ABC xyz ABCx yz

Reduce Apply an inverse production at the right end of the left string If A xy is a production, then Cbxy ijk CbA ijk Example with Reductions Only int * int + int int * int + int int * int + int int * int + int reduce T int int * T + int reduce T int * T T + int T + int T + int reduce T int T + T reduce E T T + E reduce E T + E E

The Example with Shift- Reduce Parsing int * int + int int * int + int int * int + int int * int + int int * T + int T + int T + int T + int T + T T + E E reduce T int reduce T int * T reduce T int reduce E T reduce E T + E Shift-Reduce Parse in Detail int * (int + int) + int

The Stack Left string can be implemented by a stack Top of the stack is the Shift pushes a terminal on the stack Reduce pops 0 or more symbols off of the stack (production rhs) and pushes a nonterminal on the stack (production lhs) Key Issue (will be resolved by algorithms) How do we decide when to or reduce? Consider step int * int + int We could reduce by T int giving T * int + int A fatal mistake: No way to reduce to the start symbol E

Conflicts Generic -reduce strategy: If there is a handle on top of the stack, reduce Otherwise, But what if there is a choice? If it is legal to or reduce, there is a -reduce conflict If it is legal to reduce by two different productions, there is a reduce-reduce conflict Source of Conflicts Ambiguous grammars always cause conflicts But beware, so do many nonambiguous grammars

Conflict Example Consider our favorite ambiguous grammar: E E + E E * E (E) int One Shift-Reduce Parse int * int + int... E * E + int E + int E + int E + int E + E E... reduce E E * E reduce E int reduce E E + E

Another Shift-Reduce Parse int * int + int... E * E + int E * E + int E * E + int E * E + E E * E E... reduce E int reduce E E + E reduce E E * E Example Notes In the second step E * E + int we can either or reduce by E E * E Choice determines associativity of + and * As noted previously, grammar can be rewritten to enforce precedence Precedence declarations are an alternative

Precedence Declarations Revisited Precedence declarations cause -reduce parsers to resolve conflicts in certain ways Declaring * has greater precedence than + causes parser to reduce at E * E + int More precisely, precedence declaration is used to resolve conflict between reducing a * and ing a + Precedence Declarations Revisited (Cont.) The term precedence declaration is misleading These declarations do not define precedence; they define conflict resolutions Not quite the same thing!