FIRST and FOLLOW sets a necessary preliminary to constructing the LL(1) parsing table



Similar documents
Syntaktická analýza. Ján Šturc. Zima 208

FIRST AND FOLLOW Example

If-Then-Else Problem (a motivating example for LR grammars)

Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

Pushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, School of Informatics University of Edinburgh als@inf.ed.ac.

CS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. CSC467 Compilers and Interpreters Fall Semester, 2005

Context free grammars and predictive parsing

Bottom-Up Parsing. An Introductory Example

Bottom-Up Syntax Analysis LR - metódy

Compiler Construction

SOLUTION Trial Test Grammar & Parsing Deficiency Course for the Master in Software Technology Programme Utrecht University

Lexical analysis FORMAL LANGUAGES AND COMPILERS. Floriano Scioscia. Formal Languages and Compilers A.Y. 2015/2016

Basic Parsing Algorithms Chart Parsing

Design Patterns in Parsing

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Formal Grammars and Languages

Compiler I: Syntax Analysis Human Thought

Flex/Bison Tutorial. Aaron Myles Landwehr CAPSL 2/17/2012

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Lecture Notes on Database Normalization

Grammars and Parsing. 2. A finite nonterminal alphabet N. Symbols in this alphabet are variables of the grammar.

Stanford Math Circle: Sunday, May 9, 2010 Square-Triangular Numbers, Pell s Equation, and Continued Fractions

1.4 Compound Inequalities

Lecture 9. Semantic Analysis Scoping and Symbol Table

Compiler Design July 2004

We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: b + 90c = c + 10b

The previous chapter provided a definition of the semantics of a programming

Chapter 7 Uncomputability

To give it a definition, an implicit function of x and y is simply any relationship that takes the form:

How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

Base Conversion written by Cathy Saxton

03 - Lexical Analysis

Computer Science 281 Binary and Hexadecimal Review

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

Architectural Design Patterns for Language Parsers

Static vs. Dynamic. Lecture 10: Static Semantics Overview 1. Typical Semantic Errors: Java, C++ Typical Tasks of the Semantic Analyzer

16. Recursion. COMP 110 Prasun Dewan 1. Developing a Recursive Solution

Lecture 1. Basic Concepts of Set Theory, Functions and Relations

Notes on Excel Forecasting Tools. Data Table, Scenario Manager, Goal Seek, & Solver

Automata Theory. Şubat 2006 Tuğrul Yılmaz Ankara Üniversitesi

Introduction to Turing Machines

9.2 Summation Notation

=

2.2. Instantaneous Velocity

CS / IT B.E. VII Semester Examination, December 2013 Cloud Computing Time : Three Hours Maximum Marks : 70

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b

Discrete Mathematics: Homework 7 solution. Due:

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Pushdown Automata. place the input head on the leftmost input symbol. while symbol read = b and pile contains discs advance head remove disc from pile

MATHEMATICS OF FINANCE AND INVESTMENT

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Turing Machines: An Introduction

A Formal Theory of Inductive Inference. Part II

Compilers Lexical Analysis

NATURAL LANGUAGE QUERY PROCESSING USING SEMANTIC GRAMMAR

SPSS Workbook 1 Data Entry : Questionnaire Data

Business Process- and Graph Grammar-Based Approach to ERP System Modelling

Using MS Excel to Analyze Data: A Tutorial

7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix

Think it s not worth the hassle to switch your checking account? With Fairport Savings Bank, it takes just minutes to switch!

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

28 CHAPTER 1. VECTORS AND THE GEOMETRY OF SPACE. v x. u y v z u z v y u y u z. v y v z

#1-12: Write the first 4 terms of the sequence. (Assume n begins with 1.)

Fibonacci Numbers and Greatest Common Divisors. The Finonacci numbers are the numbers in the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,...

Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment

the points are called control points approximating curve

Semester Review. CSC 301, Fall 2015

Yacc: Yet Another Compiler-Compiler

Exam 3 Review. FV = PV (1 + i) n. Format. What to Bring/Remember. Time Value of Money. Solving for Other Variables Example. Solving for Other Values

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

Course Manual Automata & Complexity 2015

The Time Value of Money

Automata and Formal Languages

Regular Expressions and Automata using Haskell

Algorithms and Data Structures

Sequentially Indexed Grammars

13 MATH FACTS a = The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions.

3.2 Matrix Multiplication

Chapter 7: Functional Programming Languages

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal

Models of Sequential Computation

Synchronous Binarization for Machine Translation

Microsoft Access Basics

Antlr ANother TutoRiaL

Software quality improvement via pattern matching

Intel Assembler. Project administration. Non-standard project. Project administration: Repository

Row Echelon Form and Reduced Row Echelon Form

Transcription:

FIRST and FOLLOW sets a necessary preliminary to constructing the LL(1) parsing table Remember: A predictive parser can only be built for an LL(1) grammar. A grammar is not LL(1) if it is: 1. Left recursive, or 2. Not left factored. However, grammars that are not left recursive and are left factored may still not be LL(1). To see if a grammar is LL(1), try to build the parse table for the predictive parser by the method we are about to describe. If any element in the table contains more than one grammar rule right-hand side, then the grammar is not LL(1). To build the table, we must must compute FIRST and FOLLOW sets for the grammar. 1

FIRST Sets Ultimately, we want to define FIRST sets for the right-hand sides of each of the grammar s productions. The idea is that for a sequence α of symbols, FIRST(α) is the set of terminals that begin the strings derivable from α, and also, if α can derive ǫ, then ǫ is in FIRST(α). By allowing ǫ to be in FIRST(α), we can avoid defining the notion of nullable as Appel does in the book. More formally: FIRST(α) = {t (t is a terminal and α tβ) or (t = ǫ and α ǫ)} To define FIRST(α) for arbitrary α, we start by defining FIRST(X), for a single symbol X (a terminal, a nonterminal, or ǫ): X is a terminal: FIRST(X) = X X is ǫ: FIRST(X) = ǫ 2

X is a nonterminal. In this case, we must look at all grammar productions with X on the left, i.e., productions of the form: X Y 1 Y 2 Y 3... Y k where each Y k is a single terminal or nonterminal (or there is just one Y, and it is ǫ). For each such production, we perform the following actions: Put FIRST(Y 1 ) {ǫ} into FIRST(X). If ǫ is in FIRST(Y 1 ), then put FIRST(Y 2 ) {ǫ} into FIRST(X). If ǫ is in FIRST(Y 2 ), then put FIRST(Y 3 ) {ǫ} into FIRST(X).... If ǫ is in FIRST(Y i ) for 1 i k (all production right- hand sides) then put ǫ into FIRST(X). For example, compute the FIRST sets of each of the non-terminals in the following grammar: exp term exp 3

exp term exp ǫ term factor term term / factor term ǫ factor INTLITERAL ( exp ) Once we have computed FIRST(X) for each terminal and nonterminal X, we can compute FIRST(α) for every production s right-hand-side α. In general, alpha will be of the form: X 1 X 2... X n where each X is a single terminal or nonterminal, or there is just one X 1 and it is ǫ. The rules for computing FIRST(α) are essentially the same as the rules for computing the first set of a nonterminal. Put FIRST(X 1 ) {ǫ} into FIRST(α) If ǫ is in FIRST(X 1 ) put FIRST(X 2 ) {ǫ} into FIRST(α). 4

... If ǫ is in the FIRST set for every X k, put ǫ into FIRST(α). For the example grammar above, compute the FIRST sets for each production s right-hand side. Why do we care about the FIRST(α) sets? During parsing, suppose the top-of- stack symbol is nonterminal A, that there are two productions A α and A β, and that the current token is a. Well, if FIRST(α) includes a... 5

FOLLOW sets FOLLOW sets are only defined for single nonterminals. The definition is as follows: For a nonterminal A, FOLLOW(A) is the set of terminals that can appear immediately to the right of A in some partial derivation; i.e., terminal t is in FOLLOW(A) if: S +... A t... where t is a terminal Furthermore, if A can be the rightmost symbol in a derivation, then EOF is in FOLLOW(A). It is worth noting that epsilon is never in a FOLLOW set. Notationally: 6

FOLLOW(A) = {t (t is a terminal and S + α A t β) or (t is EOF and S αa)} Some conditions under which symbols a, c, and EOF are in the FOLLOW set of nonterminal A: S S S / \ / \ / \ / \ / \ / \ / \ / X Y\ / \ / /\ \ / / \ / \ A \ /\ (A) /\ A B /\ / \ / / \ (a)b c epsilon / \ / / \ EOF is in FOLLOW(A) (c)d e f a is in FOLLOW(A) \ c is in FOLLOW(A) 7

How to compute FOLLOW(A) for each nonterminal A: If A is the start nonterminal, put EOF in FOLLOW(A) Find the productions with A on the right-hand-side: For each production X αaβ, put FIRST(β) {ǫ} in FOLLOW(A) If ǫ is in F IRST(β) then put F OLLOW(X) into F OLLOW(A) For each production X αa, put FOLLOW(X) into FOLLOW(A) Note that To compute FIRST(A) you must look for A on a production s left-hand side. To compute FOLLOW(A) you must look for A on a production s right-hand side. FIRST and FOLLOW sets are always sets of terminals (plus, perhaps, ǫ for FIRST sets, and EOF for follow sets). Nonterminals are never in a FIRST or a FOLLOW set. 8

Example: Compute FOLLOW sets of the non-terminals in the following language: S B c D B B a b c S D d ǫ Why do we care about FOLLOW sets? Suppose during parsing we have X on top of the stack and a is the current token. What if X has two productions X α and X β but a is not in the FIRST set of either α or β? 9

How to build parse tables and tell if a grammar is LL(1) To build the table, we fill in the rows one at a time for each nonterminal X as follows: for each production X α for each terminal t in First(α): put α in Table[X,t] if ǫ is in First(α) then: for each terminal t in Follow(X): put α in Table[X,t] The grammar is not LL(1) if and only if there is more than one entry for any cell in the table. Try building a parse table for the following grammar: S B c D B B a b c S D d ǫ 10