Compiler Design July 2004



Similar documents
Compiler Construction

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

Intermediate Code. Intermediate Code Generation

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. CSC467 Compilers and Interpreters Fall Semester, 2005

Lexical analysis FORMAL LANGUAGES AND COMPILERS. Floriano Scioscia. Formal Languages and Compilers A.Y. 2015/2016

Compiler Construction

Compiler I: Syntax Analysis Human Thought

Semantic Analysis: Types and Type Checking

CSCI 3136 Principles of Programming Languages

03 - Lexical Analysis

How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

C Compiler Targeting the Java Virtual Machine

Programming Languages CIS 443

The previous chapter provided a definition of the semantics of a programming

Lecture 9. Semantic Analysis Scoping and Symbol Table

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology

Syntaktická analýza. Ján Šturc. Zima 208

Semester Review. CSC 301, Fall 2015

Instruction Set Architecture (ISA)

CA Compiler Construction

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

AUTOMATED TEST GENERATION FOR SOFTWARE COMPONENTS

Objects for lexical analysis

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

Glossary of Object Oriented Terms

Symbol Tables. Introduction

Introduction to Automata Theory. Reading: Chapter 1

1/20/2016 INTRODUCTION

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Pushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, School of Informatics University of Edinburgh als@inf.ed.ac.

If-Then-Else Problem (a motivating example for LR grammars)

Object Oriented Software Design

Programming Language Pragmatics

Moving from CS 61A Scheme to CS 61B Java

2) Write in detail the issues in the design of code generator.

Algorithms and Data Structures

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

1 Introduction. 2 An Interpreter. 2.1 Handling Source Code

Yacc: Yet Another Compiler-Compiler

UIL Computer Science for Dummies by Jake Warren and works from Mr. Fleming

Scanner. tokens scanner parser IR. source code. errors

Chapter 7D The Java Virtual Machine

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

DATA STRUCTURES USING C

High-Level Programming Languages. Nell Dale & John Lewis (adaptation by Michael Goldwasser)

Chapter 5 Names, Bindings, Type Checking, and Scopes

PL / SQL Basics. Chapter 3

Chapter 2: Elements of Java

Language Processing Systems

Textual Modeling Languages

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Variables, Constants, and Data Types

Flex/Bison Tutorial. Aaron Myles Landwehr CAPSL 2/17/2012

CS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

C++ INTERVIEW QUESTIONS

Object Oriented Software Design

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

Design Patterns in Parsing

The C Programming Language course syllabus associate level

Pushdown Automata. place the input head on the leftmost input symbol. while symbol read = b and pile contains discs advance head remove disc from pile

KITES TECHNOLOGY COURSE MODULE (C, C++, DS)

Programming Languages

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Bottom-Up Syntax Analysis LR - metódy

C++ Language Tutorial

GENERIC and GIMPLE: A New Tree Representation for Entire Functions

CSC4510 AUTOMATA 2.1 Finite Automata: Examples and D efinitions Definitions

1 The Java Virtual Machine

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

Departamento de Investigación. LaST: Language Study Tool. Nº 143 Edgard Lindner y Enrique Molinari Coordinación: Graciela Matich

A Grammar for the C- Programming Language (Version S16) March 12, 2016

University of Twente. A simulation of the Java Virtual Machine using graph grammars

Data Structure with C

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

Introduction to Java Applications Pearson Education, Inc. All rights reserved.

The programming language C. sws1 1

A Programming Language Where the Syntax and Semantics Are Mutable at Runtime

Approximating Context-Free Grammars for Parsing and Verification

Programming Languages

Pemrograman Dasar. Basic Elements Of Java

Ed. v1.0 PROGRAMMING LANGUAGES WORKING PAPER DRAFT PROGRAMMING LANGUAGES. Ed. v1.0

Summit Public Schools Summit, New Jersey Grade Level / Content Area: Mathematics Length of Course: 1 Academic Year Curriculum: AP Computer Science A

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T)

Programming Project 1: Lexical Analyzer (Scanner)

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

7. Building Compilers with Coco/R. 7.1 Overview 7.2 Scanner Specification 7.3 Parser Specification 7.4 Error Handling 7.5 LL(1) Conflicts 7.

n Introduction n Art of programming language design n Programming language spectrum n Why study programming languages? n Overview of compilation

csce4313 Programming Languages Scanner (pass/fail)

6.170 Tutorial 3 - Ruby Basics

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

SQL Server Database Coding Standards and Guidelines

Transcription:

irst Prev Next Last CS432/CSL 728: Compiler Design July 2004 S. Arun-Kumar sak@cse.iitd.ernet.in Department of Computer Science and ngineering I. I.. Delhi, Hauz Khas, New Delhi 110 016. July 30, 2004 Page 1 of 100

irst Prev Next Last Contents first he Bigpicture last first Lexical Analysis last first Syntax Analysis last first Context-free Grammars last first Ambiguity last first Shift-Reduce Parsing last first Parse rees last Page 2 of 100 first Semantic Analysis last first Synthesized Attributes last first Inherited Attributes last Contd...

irst Prev Next Last Contents... Contd first Abstract Syntax rees last first Symbol ables last first Intermediate Representation last IR: Properties ypical Instruction Set IR: Generation first Runtime Structure last Page 3 of 100

irst Prev Next Last Page 4 of 100

he Big Picture: 1 stream of characters SCANNR stream of tokens Page 5 of 100 irst Prev Next Last

he Big Picture: 2 stream of characters SCANNR stream of tokens PARSR parse tree Page 6 of 100 irst Prev Next Last

he Big Picture: 3 stream of characters SCANNR stream of tokens PARSR parse tree SMANIC ANALYZR abstract syntax tree Page 7 of 100 irst Prev Next Last

he Big Picture: 4 stream of characters SCANNR stream of tokens PARSR parse tree SMANIC ANALYZR abstract syntax tree Page 8 of 100 I.R. COD GNRAOR ermediate representation irst Prev Next Last

he Big Picture: 5 stream of characters SCANNR stream of tokens PARSR parse tree SMANIC ANALYZR abstract syntax tree Page 9 of 100 I.R. COD GNRAOR ermediate representation OPIMIZR optimized ermediate representation irst Prev Next Last

he Big Picture: 6 stream of characters SCANNR stream of tokens PARSR parse tree SMANIC ANALYZR abstract syntax tree Page 10 of 100 I.R. COD GNRAOR ermediate representation OPIMIZR optimized ermediate representation COD GNRAOR target code irst Prev Next Last

he Big Picture: 7 stream of characters SCANNR stream of tokens PARSR parse tree RROR SMANIC ANALYZR SYMBOL HANDLR abstract syntax tree I.R. COD GNRAOR ABL MANAGR Page 11 of 100 ermediate representation OPIMIZR optimized ermediate representation COD GNRAOR target code irst Prev Next Last

he Big Picture: 7 stream of characters SCANNR stream of tokens PARSR RROR parse tree SMANIC ANALYZR SYMBOL HANDLR abstract syntax tree I.R. COD GNRAOR ABL MANAGR Page 12 of 100 ermediate representation OPIMIZR optimized ermediate representation COD GNRAOR target code Scanner Parser Semantic Analysis Symbol able IR Optimization Contents irst Prev Next Last

irst Prev Next Last Page 13 of 100

irst Prev Next Last Scanning: 1 akes a stream of characters and identifies tokens from the lexemes. liminates comments and redundant whitepace. Page 14 of 100 Keeps track of line numbers and column numbers and passes them as parameters to the other phases to enable error-reporting to the user.

irst Prev Next Last Scanning: 2 Whitespace: A sequence of space, tab, newline, carriage-return, form-feed characters etc. Lexeme: A sequence of non-whitespace characters delimited by whitespace or special characters (e.g. operators like +, -, *). xamples of lexemes. reserved words, keywords, identifiers etc. ach comment is usually a single lexeme preprocessor directives Page 15 of 100

irst Prev Next Last Scanning: 3 oken: A sequence of characters to be treated as a single unit. xamples of tokens. Reserved words (e.g. begin, end, struct, if etc.) Keywords (eger, true etc.) Operators (+, &&, ++ etc) Identifiers (variable names, procedure names, parameter names) Literal constants (numeric, string, character constants etc.) Punctuation marks (:,, etc.) Page 16 of 100

irst Prev Next Last Scanning: 4 Identification of tokens is usually done by a Deterministic inite-state automaton (DA). he set of tokens of a language is represented by a large regular expression. his regular expression is fed to a lexical-analyser generator such as Lex, lex or ML-Lex. A giant DA is created by the Lexical analyser generator. Page 17 of 100

irst Prev Next Last Scanning: 5 a e,g z identifier if identifier f 0 9,a z 0 9,a z 2 3 4 0 9 white space 14 i other chars 1 a h,j z ( 0 9 5 0 9 6 real 0 9 0 9 7 8 real any char Page 18 of 100 13 error 9 error * 10 * 11 ) 12 comment he Big Picture

irst Prev Next Last Syntax Analysis Consider the following two languages over an alphabet A = {a, b}. R = {a n b n n < 100} P = {a n b n n > 0} R may be finitely represented by a regular expression (even though the actual expression is very long). However, P cannot actually be represented by a regular expression A regular expression is not powerful enough to represent languages which require parenthesis matching to arbitrary depths. All high level programming languages require an underlying language of expressions which require parentheses to be nested and matched to arbitrary depth. Page 19 of 100

irst Prev Next Last C-Grammars: Definition A context-free grammar (CG) G = N,, P, S consists of a set N of nonterminal symbols, a set of terminal symbols or the alphabet, a set P of productions or rewrite rules, each production is of the form X α, where X N is a nonterminal and α (N ) is a string of terminals and nonterminals a start symbol S N. Page 20 of 100

irst Prev Next Last CG: xample G = {S}, {a, b}, P, S, where S ab and S asb are the only productions in P. Derivations look like this: S ab S asb aabb S asb aasbb aaabbb L(G), the language generated by G is {a n b n n > 0}. Page 21 of 100 Actually can be proved by induction on the length and structure of derivations.

irst Prev Next Last CG: mpty word G = {S}, {a, b}, P, S, where S SS asb ε generates all sequences of matching nested parentheses, including the empty word ε. A leftmost derivation might look like this: S SS SSS SS asbs abs abasb... A rightmost derivation might look like this: S SS SSS SS SaSb Sab asbab... Other derivations might look like God alone knows what! Could be quite confusing! S SS SSS SS... Page 22 of 100

irst Prev Next Last CG: Derivation trees 1 Derivation sequences put an artificial order in which productions are fired. instead look at trees of derivations in which we may think of productions as being fired in parallel. here is then no highlighting in red to determine which copy of a nonterminal was used to get the next member of the sequence. Of course, generation of the empty word ε must be shown explicitly in the tree. Page 23 of 100

irst Prev Next Last CG: Derivation trees 2 S S S S S ε Page 24 of 100 a S b a S b ε a S b Derivation tree of abaabb ε

irst Prev Next Last CG: Derivation trees 3 S S S S S a S b Page 25 of 100 a S b ε a S b ε Another Derivation tree of ε abaabb

irst Prev Next Last CG: Derivation trees 4 S S S S S a S b Page 26 of 100 ε a S b a S b Yet another Derivation tree of abaabb ε ε

irst Prev Next Last Page 27 of 100

irst Prev Next Last Ambiguity: 1 I C + I y z C 4 Consider the sentence y + 4 z. Page 28 of 100

irst Prev Next Last Ambiguity: 2 I C + I y z C 4 Consider the sentence y + 4 z. Page 29 of 100 + *

irst Prev Next Last Ambiguity: 3 I C + I y z C 4 Consider the sentence y + 4 z. Page 30 of 100 + * * + I I

irst Prev Next Last Ambiguity: 4 I C + I y z C 4 Consider the sentence y + 4 z. Page 31 of 100 + * * + I I I I C y C z

irst Prev Next Last Ambiguity: 5 I C + I y z C 4 Consider the sentence y + 4 z. Page 32 of 100 + * * + I I I I C y C z z y 4 4

irst Prev Next Last Page 33 of 100

irst Prev Next Last Parsing: 0 r1. r2 r3 / D r4 D r5 D a b ( ) a a / b Page 34 of 100

irst Prev Next Last Parsing: 1 r1. r2 r3 / D r4 D r5 D a b ( ) a / b Page 35 of 100 Principle: Reduce whenever possible. Shift only when reduce is impossible a Shift

irst Prev Next Last Parsing: 2 r1. r2 r3 / D r4 D r5 D a b ( ) a / b Page 36 of 100 D Reduce by r5

irst Prev Next Last Parsing: 3 r1. r2 r3 / D r4 D r5 D a b ( ) a / b Page 37 of 100 Reduce by r4

irst Prev Next Last Parsing: 4 r1. r2 r3 / D r4 D r5 D a b ( ) a / b Page 38 of 100 Reduce by r2

irst Prev Next Last Parsing: 5 r1. r2 r3 / D r4 D r5 D a b ( ) a / b Page 39 of 100 Shift

irst Prev Next Last Parsing: 6 r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 40 of 100 a Shift

irst Prev Next Last Parsing: 7 r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 41 of 100 D Reduce by r5

irst Prev Next Last Parsing: 8 r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 42 of 100 Reduce by r4

irst Prev Next Last Parsing: 8a r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 43 of 100 Reduce by r4

irst Prev Next Last Parsing: 9a r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 44 of 100 Reduce by r1

irst Prev Next Last Parsing: 10a r1. r2 r3 / D r4 D r5 D a b ( ) b Page 45 of 100 / Shift

irst Prev Next Last Parsing: 11a r1. r2 r3 / D r4 D r5 D a b ( ) Page 46 of 100 b / Shift

irst Prev Next Last Parsing: 12a r1. r2 r3 / D r4 D r5 D a b ( ) Page 47 of 100 D / Reduce by r5

irst Prev Next Last Parsing: 13a r1. r2 r3 / D r4 D r5 D a b ( ) Page 48 of 100 / Reduce by r4

irst Prev Next Last Parsing: 14a r1. r2 r3 / D r4 D r5 D a b ( ) Page 49 of 100 Get back! Stuck! / Reduce by r2

irst Prev Next Last Parsing: 14b r1. r2 r3 / D r4 D r5 D a b ( ) Page 50 of 100 Get back! / Reduce by r2

irst Prev Next Last Parsing: 13b r1. r2 r3 / D r4 D r5 D a b ( ) Page 51 of 100 Get back! / Reduce by r4

irst Prev Next Last Parsing: 12b r1. r2 r3 / D r4 D r5 D a b ( ) Page 52 of 100 Get back! D / Reduce by r5

irst Prev Next Last Parsing: 11b r1. r2 r3 / D r4 D r5 D a b ( ) Page 53 of 100 Get back! b / Shift

irst Prev Next Last Parsing: 10b r1. r2 r3 / D r4 D r5 D a b ( ) b Page 54 of 100 Get back! / Shift

irst Prev Next Last Parsing: 9b r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 55 of 100 Get back to where you once belonged! Reduce by r1

irst Prev Next Last r1. r2 r3 / D r4 D Parsing: 8b r5 D a b ( ) modified Principle: Reduce whenever possible, but but depending upon lookahead Shift instead of reduce here! / b Page 56 of 100 Shift reduce conflict Reduce by r4

irst Prev Next Last Parsing: 8 r1. r2 r3 / D r4 D r5 D a b ( ) / b Page 57 of 100 Reduce by r4

irst Prev Next Last Parsing: 9 r1. r2 r3 / D r4 D r5 D a b ( ) b Page 58 of 100 / Shift

irst Prev Next Last Parsing: 10 r1. r2 r3 / D r4 D r5 D a b ( ) Page 59 of 100 b / Shift

irst Prev Next Last Parsing: 11 r1. r2 r3 / D r4 D r5 D a b ( ) Page 60 of 100 D / Reduce by r5

irst Prev Next Last Parsing: 12 r1. r2 r3 / D r4 D r5 D a b ( ) Page 61 of 100 Reduce by r3

irst Prev Next Last Parsing: 13 r1. r2 r3 / D r4 D r5 D a b ( ) Page 62 of 100 Reduce by r1

irst Prev Next Last Parse rees: 0 r1. r2 r4 D r3 / D r5 D a b ( ) Page 63 of 100 a a / b

irst Prev Next Last Parse rees: 1 r1. r2 r4 D r3 / D r5 D a b ( ) Page 64 of 100 D a a / b

irst Prev Next Last Parse rees: 2 r1. r2 r4 D r3 / D r5 D a b ( ) Page 65 of 100 D a a / b

irst Prev Next Last Parse rees: 3 r1. r2 r4 D r3 / D r5 D a b ( ) Page 66 of 100 D a a / b

irst Prev Next Last Parse rees: 3a r1. r2 r4 D r3 / D r5 D a b ( ) Page 67 of 100 D a a / b

irst Prev Next Last Parse rees: 3b r1. r2 r4 D r3 / D r5 D a b ( ) Page 68 of 100 D a a / b

irst Prev Next Last Parse rees: 4 r1. r2 r4 D r3 / D r5 D a b ( ) Page 69 of 100 D D a a / b

irst Prev Next Last Parse rees: 5 r1. r2 r4 D r3 / D r5 D a b ( ) Page 70 of 100 D D a a / b

irst Prev Next Last Parse rees: 5a r1. r2 r4 D r3 / D r5 D a b ( ) Page 71 of 100 D D a a / b

irst Prev Next Last Parse rees: 5b r1. r2 r4 D r3 / D r5 D a b ( ) Page 72 of 100 D D a a / b

irst Prev Next Last Parse rees: 6 r1. r2 r4 D r3 / D r5 D a b ( ) Page 73 of 100 D D D a a / b

irst Prev Next Last Parse rees: 7 r1. r2 r4 D r3 / D r5 D a b ( ) Page 74 of 100 D D D a a / b

irst Prev Next Last Parse rees: 8 r1. r2 r4 D r3 / D r5 D a b ( ) Page 75 of 100 D D D a a / b

irst Prev Next Last Parsing: Summary: 1 All high-level languages are designed so that they may be parsed in this fashion with only a single token lookahead. Parsers for a language can be automatically constructed by parger-generators such as Yacc, Bison, ML- Yacc. Shift-reduce conflicts if any, are automatically detected and reported by the parser-generator. Shift-reduce conflicts may be avoided by suitably redesigning the context-free grammar. Page 76 of 100

irst Prev Next Last Parsing: Summary: 2 Very often shift-reduce conflicts may occur because of the prefix problem. In such cases many parsergenerators resolve the conflict in favour of shifting. here is also a possiblility of reduce-reduce conflicts. his usually happens when there is more than one nonterminal symbol to which the contents of the stack may reduce. A minor reworking of the grammar to avoid redundant non-terminal symbols will get rid of reduce-reduce conflicts. he Big Picture Page 77 of 100

irst Prev Next Last Semantic Analysis: 1 very Programming langauge can be used to program any computable function, assuming of course, it has unbounded memory, and unbounded time he parser of a programming language provides the framework within which the target code is to be generated. he parser also provides a structuring mechanism that divides the task of code generation o bits and pieces determined by the individual nonterminals and production rules. However, contex-free grammars are not powerful enough to represent all computable functions. xample, the language {a n b n c n n > 0}. Page 78 of 100

irst Prev Next Last Semantic Analysis: 2 here are context-sensitive aspects of a program that cannot be represented/enforced by a context-free grammar definition. xamples include correspondence between formal and actual parameters type consistency between declaration and use. scope and visibility issues with respect to identifiers in a program. Page 79 of 100

irst Prev Next Last Page 80 of 100

irst Prev Next Last Synthesized Attributes: 0 / ( ) n / n Page 81 of 100 ( ) n n

irst Prev Next Last Synthesized Attributes: 1 / ( ) n / n Page 82 of 100 ( ) Synthesized Attributes 4 3 2 1 n 4 n

irst Prev Next Last Synthesized Attributes: 2 / ( ) n / n Page 83 of 100 ( ) Synthesized Attributes 4 3 2 1 4 n 4 n

irst Prev Next Last Synthesized Attributes: 3 / ( ) n / n Page 84 of 100 ( ) Synthesized Attributes 4 3 2 1 4 4 n 4 n

irst Prev Next Last Synthesized Attributes: 4 / ( ) n / n Page 85 of 100 ( ) 4 Synthesized Attributes 4 3 2 1 4 4 n 4 n

irst Prev Next Last Synthesized Attributes: 5 / ( ) n / n Page 86 of 100 ( ) 4 Synthesized Attributes 4 3 2 1 4 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 6 / ( ) n / n Page 87 of 100 ( ) 4 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 7 / ( ) n / n Page 88 of 100 ( ) 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 8 / ( ) n / n Page 89 of 100 ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 9 / ( ) n / Page 90 of 100 3 n ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 10 / ( ) n 3 / Page 91 of 100 3 n ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 11 / ( ) n 3 / Page 92 of 100 3 n 2 ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 12 / ( ) n 3 / 2 Page 93 of 100 3 n 2 ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 13 / ( ) n 1 3 / 2 Page 94 of 100 3 n 2 ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last Synthesized Attributes: 14 1 / ( ) n 1 3 / 2 Page 95 of 100 3 n 2 ( ) 3 4 1 Synthesized Attributes 4 3 2 1 4 1 4 n 1 4 n

irst Prev Next Last An Attribute Grammar 1 0 1 0.val := sub( 1.val,.val) 3 / 1 2.val :=.val 3 n 2 0 1 / 0.val := div( 1.val,.val) Page 96 of 100 4 ( 3 ) 1.val :=.val 4 4 n 1 1 ().val :=.val 4 n n.val := n.val

irst Prev Next Last Inherited Attributes: 0 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 97 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 1 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 98 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 2 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 99 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 3 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 100 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 4 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 101 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 5 D L L, I z I L, Page 102 of 100 I y D L I x y z x,

irst Prev Next Last Inherited Attributes: 6 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 103 of 100 L, I y I D L I x y z x,

irst Prev Next Last Inherited Attributes: 7 C-style declarations generating x, y, z. D L float L L,I I I x y z D L L, I z Page 104 of 100 L, I y I D L I x y z x,

irst Prev Next Last An Attribute Grammar D L D L L.in :=.type L, I.type :=. L, I z float.type := float.f loat Page 105 of 100 I y L 0 L 1,I L 1 := L 0.in x L I I.in := L.in I id id.type := I.in

irst Prev Next Last Page 106 of 100

irst Prev Next Last Abstract Syntax: 0 / n () Suppose we want to evaluate an expression (4 1)/2. What we actually want is a tree that looks like this: / Page 107 of 100 2 4 1

irst Prev Next Last valuation: 0 / 2 Page 108 of 100 4 1

irst Prev Next Last valuation: 1 / 2 Page 109 of 100 4 1

irst Prev Next Last valuation: 2 / 3 2 Page 110 of 100

irst Prev Next Last valuation: 3 / 3 2 Page 111 of 100

irst Prev Next Last valuation: 4 1 Page 112 of 100 But what we actually get during parsing is a tree that looks like...

irst Prev Next Last Abstract Syntax: 1... HIS! / n ( ) Page 113 of 100 / n n n n n

irst Prev Next Last Abstract Syntax: 2 We use attribute grammar rules to construct the abstract syntax tree (AS)!. But in order to do that we first require two procedures for tree construction. makeleaf(literal) : Creates a node with label literal and returns a poer to it. makebinarynode(opr, opd1, opd2) : Creates a node with label opr (with fields which po to opd1 and opd2) and returns a poer to the newly created node. Now we may associate a synthesized attribute called ptr with each terminal and nonterminal symbol which pos to the root of the subtree created for it. Page 114 of 100

irst Prev Next Last Abstract Syntax: 3 0 1 0.ptr := makebinarynode(, 1.ptr,.ptr).ptr :=.ptr 0 1 / 0.ptr := makebinarynode(/, 1.ptr,.ptr).ptr :=.ptr ().ptr :=.ptr Page 115 of 100 n.ptr := makeleaf(n.val) he Big Picture

irst Prev Next Last Symbol able:1 he store house of context-sensitive and run-time information about every identifier in the source program. All accesses relating to an identifier require to first find the attributes of the identifier from the symbol table Usually organized as a hash table provides fast access. Compiler-generated temporaries may also be stored in the symbol table Page 116 of 100

irst Prev Next Last Symbol able:2 Attributes stored in a symbol table for each identifier: type size scope/visibility information base address addresses to location of auxiliary symbol tables (in case of records, procedures, classes) address of the location containing the string which actually names the identifier and its length in the string pool Page 117 of 100

irst Prev Next Last Symbol able:3 A symbol table exists through out the compilation and run-time. Major operations required of a symbol table: insertion search deletions are purely logical (depending on scope and visibility) and not physical Keywords are often stored in the symbol table before the compilation process begins. Page 118 of 100

irst Prev Next Last Symbol able:4 Accesses to the symbol table at every stage of the compilation process, Scanning: Insertion of new identifiers. Parsing: Access to the symbol table to ensure that an operand exists (declaration before use). Semantic analysis: Determination of types of identifiers from declarations type checking to ensure that operands are used in type-valid contexts. Checking scope, visibility violations. Page 119 of 100

irst Prev Next Last Symbol able:5 IR generation:. Memory allocation and relative a address calculation. Optimization: All memory accesses through symbol table arget code: ranslation of relative addresses to absolute addresses in terms of word length, word boundary etc. he Big picture a i.e.relative to a base address that is known only at run-time Page 120 of 100

irst Prev Next Last Intermediate Representation Intermediate representations are important for reasons of portability. (more or less) independent of specific features of the high-level language. xample. Java byte-code for any high-level language. (more or less) independent of specific features of any particular target architecture (e.g. number of registers, memory size) number of registers memory size word length Page 121 of 100

irst Prev Next Last IR Properties: 1 1. It is fairly low-level containing instructions common to all target architectures and assembly languages. How low can you stoop?... 2. It contains some fairly high-level instructions that are common to most high-level programming languages. How high can you rise? 3. o ensure portability an unbounded number of variables and memory locations no commitment to Representational Issues 4. o ensure type-safety memory locations are also typed according to the data they may contain, no commitment is made regarding word boundaries, and the structure of individual data items. Page 122 of 100 Next

irst Prev Next Last IR: Representation? No commitment to word boundaries or byte boundaries No commitment to representation of vs. float, float vs. double, packed vs. unpacked, strings where and how?. Back to IR Properties:1 Page 123 of 100

irst Prev Next Last IR: How low can you stoop? most arithmetic and logical operations, load and store instructions etc. so as to be erpreted easily, the erpreter is fairly small, execution speeds are high, to have fixed length instructions (where each operand position has a specific meaning). Back to IR Properties:1 Page 124 of 100

irst Prev Next Last IR: How high can you rise? typed variables, temporary variables instead of registers. array-indexing, random access to record fields, parameter-passing, poers and poer management no limits on memory addresses Back to IR Properties:1 Page 125 of 100

irst Prev Next Last A typical instruction set: 1 hree address code: A suite of instructions. ach instruction has at most 3 operands. an opcode representing an operation with at most 2 operands two operands on which the binary operation is performed a target operand, which accumulates the result of the (binary) operation. If an operation requires less than 3 operands then one or more of the operands is made null. Page 126 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) Jumps (conditional and unconditional) Procedures and parameters Arrays and array-indexing Poer Referencing and Dereferencing Page 127 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) x := y bop z, where bop is a binary operation x := uop y, where uop is a unary operation x := y, load, store, copy or register transfer Jumps (conditional and unconditional) Procedures and parameters Arrays and array-indexing Poer Referencing and Dereferencing Page 128 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) Jumps (conditional and unconditional) goto L Unconditional jump, x relop y goto L Conditional jump, where relop is a relational operator Procedures and parameters Arrays and array-indexing Poer Referencing and Dereferencing Page 129 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) Jumps (conditional and unconditional) Procedures and parameters call p n, where n is the number of parameters return y, return value from a procedures call param x, parameter declaration Arrays and array-indexing Poer Referencing and Dereferencing Page 130 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) Jumps (conditional and unconditional) Procedures and parameters Arrays and array-indexing x := a[i] array indexing for r-value a[j] := y array indexing for l-value Note: he two opcodes are different depending on whether l-value or r-value is desired. x and y are always simple variables Poer Referencing and Dereferencing Page 131 of 100

irst Prev Next Last A typical instruction set: 2 Assignments (LOAD-SOR) Jumps (conditional and unconditional) Procedures and parameters Arrays and array-indexing Poer Referencing and Dereferencing x := ˆy referencing: set x to po to y x := *y dereferencing: copy contents of location poed to by y o x *x := y dereferencing: copy r-value of y o the location poed to by x Picture Page 132 of 100

irst Prev Next Last Poers x *x *y y x := ^y @y x *y y x *x @z y *z x @z y x := *y Page 133 of 100 *z z *z z x @z *y *z y z *x := y @z x *y *y y z

irst Prev Next Last IR: Generation Can be generated by recursive traversal of the abstract syntax tree. Can be generated by syntax-directed translation as follows: or every non-terminal symbol N in the grammar of the source language there exist two attributes N.place, which denotes the address of a temporary variable where the result of the execution of the generated code is stored N.code, which is the actual code segment generated. In addition a global counter for the instructions generated is maained as part of the generation process. It is independent of the source language but can express target machine operations without committing to too much detail. Page 134 of 100

irst Prev Next Last IR: Infrastructure Given an abstract syntax tree, with also denoting its root node..place address of temporary variable where result of execution of the is stored. newtemp returns a fresh variable name and also installs it in the symbol table along with relevant information.code the actual sequence of instructions generated for the tree. newlabel returns a label to mark an instruction in the generated code which may be the target of a jump. emit emits an instructions (regarded as a string). Page 135 of 100

irst Prev Next Last IR: Infrastructure Colour and font coding of IR code generation Green: Nodes of the Abstract Syntax ree Brown: Characters and strings of the Intermediate Representation Red: Variables and data structures of the language in which the IR code generator is written blue: Names of relevant procedures used in IR code generation. Black: All other stuff. Page 136 of 100

irst Prev Next Last IR: xample id.place.code := id.place; := emit() 0 1 2 0.place := newtemp; 0.code := 1.code 2.code emit( 0.place := 1.place 2.place) Page 137 of 100

irst Prev Next Last IR: xample S id := S.code :=.code emit(id.place:=.place) S 0 while do S 1 Page 138 of 100 S 0.begin := newlabel; S 0.after := newlabel; S 0.code := emit(s 0.begin:).code emit(if.place= 0 goto S 0.after) S 1.code emit(gotos 0.begin) emit(s 0.after:)

irst Prev Next Last IR: xample S id := S.code :=.code emit(id.place:=.place) S 0 while do S 1 Page 139 of 100 S 0.begin := newlabel; S 0.after := newlabel; S 0.code := emit(s 0.begin:).code emit(if.place= 0 goto S 0.after) S 1.code emit(gotos 0.begin) emit(s 0.after:)

irst Prev Next Last IR: Generation While generating the ermediate representation, it is sometimes necessary to generate jumps o code that has not been generated as yet (hence the address of the label is unknown). his usually happens while processing forward jumps short-circuit evaluation of boolean expressions It is usual in such circumstances to either fill up the empty label entries in a second pass over the the code or through a process of backpatching (which is the maenance of lists of jumps to the same instruction number), wherein the blank entries are filled in once the sequence number of the target instruction becomes known. Page 140 of 100

irst Prev Next Last A Calling Chain Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Body of P21 Body of P2 Call P21 Call P21 Page 141 of 100 Procedure P1 Locals of P1 Body of P1 Main body Call P2 Call P1

irst Prev Next Last Run-time Structure: 1 Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Page 142 of 100 Body of P21 Body of P2 Procedure P1 Locals of P1 Body of P1 Main body Globals

irst Prev Next Last Run-time Structure: 2 Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Page 143 of 100 Body of P21 Body of P2 Procedure P1 Locals of P1 Body of P1 Main body Return address to Main Dynamic link to Main Locals of P1 Static link to Main ormal par of P1 Globals

irst Prev Next Last Run-time Structure: 3 Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Body of P21 Body of P2 Procedure P1 Locals of P1 Body of P1 Main body Return address to last of P1 Dynamic link to last P1 Locals of P2 Static link to last P1 ormal par P2 Return address to Main Dynamic link to Main Locals of P1 Static link to Main ormal par of P1 Globals Page 144 of 100

irst Prev Next Last Run-time Structure: 4 Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Return address to last of P2 Dynamic link to last P2 Locals of P21 Static link last P2 ormal par P21 Body of P21 Body of P2 Procedure P1 Locals of P1 Body of P1 Main body Return address to last of P1 Dynamic link to last P1 Locals of P2 Static link to last P1 ormal par P2 Return address to Main Dynamic link to Main Locals of P1 Static link to Main ormal par of P1 Globals Page 145 of 100

irst Prev Next Last Run-time Structure: 5 Main program Globals Procedure P2 Locals of P2 Procedure P21 Locals of P21 Body of P21 Body of P2 Procedure P1 Locals of P1 Body of P1 Main body Return address to last of P21 Dynamic link to last P21 Locals of P21 Static link to last P2 ormal par P21 Return address to last of P2 Dynamic link to last P2 Locals of P21 Static link last P2 ormal par P21 Return address to last of P1 Dynamic link to last P1 Locals of P2 Static link to last P1 ormal par P2 Return address to Main Dynamic link to Main Locals of P1 Static link to Main ormal par of P1 Globals Page 146 of 100

irst Prev Next Last Page 147 of 100