Intermediate Code. Intermediate Code Generation




CS 5300 - SJAllan, Intermediate Code

Intermediate Code Generation
The front end of a compiler translates a source program into an intermediate representation
Details of the target language are left to the back end
Benefits include:
  Retargeting
  Machine-independent code optimization
[Figure: Parser -> Static Checker -> Intermediate Code Generator -> (intermediate code) -> Code Generator]

Intermediate Languages
Consider the code: a := b * -c + b * -c
A syntax tree graphically depicts code
Postfix notation is a linearized representation of a syntax tree:
  a b c uminus * b c uminus * + assign
[Figure: syntax tree]

Three-Address Code
Three-address code is a sequence of statements of the general form x := y op z
x, y, and z are names, constants, or compiler-generated temporaries
op can be any operator
Three-address code is a linearized representation of a syntax tree
Explicit names correspond to interior nodes of the graph
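As a minimal sketch of how postfix notation falls out of a syntax tree, the block below builds the tree for a := b * -c + b * -c and linearizes it with a postorder walk. The Node class and the helper names postfix and example_tree are illustrative assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical syntax-tree node: a label plus up to two children."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def postfix(node: Optional[Node]) -> List[str]:
    """Postorder walk: children first, then the node itself."""
    if node is None:
        return []
    return postfix(node.left) + postfix(node.right) + [node.label]


def example_tree() -> Node:
    """Syntax tree for a := b * -c + b * -c."""
    def term() -> Node:
        return Node("*", Node("b"), Node("uminus", Node("c")))
    return Node("assign", Node("a"), Node("+", term(), term()))


if __name__ == "__main__":
    print(" ".join(postfix(example_tree())))
```

Running the script prints the same postfix string as the slide, which is why postfix is described as a linearized form of the tree.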

Three-Address Code Example
  t1 := -c
  t2 := b * t1
  t3 := -c
  t4 := b * t3
  t5 := t2 + t4
  a := t5

Types of Three-Address Statements
1. Assignment statements:
   a. x := y op z, where op is a binary operator
   b. x := op y, where op is a unary operator
2. Copy statements:
   a. x := y
3. Unconditional jumps:
   a. goto L
4. Conditional jumps:
   a. if x relop y goto L
5. Procedure calls: param x, call p, n, and return y
6. Indexed assignments:
   a. x := y[i]
   b. x[i] := y
7. Address and pointer assignments:
   a. x := &y, x := *y, and *x := y

Generating Three-Address Code
Temporary names are made up for the interior nodes of a syntax tree
The synthesized attribute S.code represents the code for the assignment S
The nonterminal E has attributes:
  E.place is the name that holds the value of E
  E.code is a sequence of three-address statements evaluating E
The function newtemp returns a distinct name each time it is called
The function newlabel returns a distinct label each time it is called

Assignments (|| denotes concatenation of code sequences)
  S -> id := E     S.code := E.code || gen(id.place ':=' E.place)
  E -> E1 + E2     E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
  E -> E1 * E2     E.code := E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)

Assignments (continued)
  E -> -E1      E.code := E1.code || gen(E.place ':=' 'uminus' E1.place)
  E -> (E1)     E.place := E1.place; E.code := E1.code
  E -> id       E.place := id.place; E.code := ''

While Statement (code layout)
  S.begin:  E.code
            if E.place = 0 goto S.after
            S1.code
            goto S.begin
  S.after:  ...
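The assignment rules (E.place, E.code, and newtemp) can be sketched as a small recursive generator. This is an illustration under stated assumptions: expressions are nested tuples, the class name TacGen and its methods are hypothetical, code is appended to one list instead of being concatenated attribute by attribute, and unary minus is printed as -c to match the earlier three-address example.

```python
import itertools
from typing import List, Tuple, Union

# Hypothetical expression encoding: a name, ("uminus", e), or (op, e1, e2).
Expr = Union[str, Tuple]


class TacGen:
    """Sketch of a syntax-directed three-address code generator."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.code: List[str] = []

    def new_temp(self) -> str:
        """Plays the role of newtemp: a fresh name on every call."""
        return f"t{next(self._counter)}"

    def expr(self, e: Expr) -> str:
        """Emit code for e and return E.place, the name holding its value."""
        if isinstance(e, str):              # E -> id
            return e
        if e[0] == "uminus":                # E -> -E1
            operand = self.expr(e[1])
            place = self.new_temp()
            self.code.append(f"{place} := -{operand}")
            return place
        op, left, right = e                 # E -> E1 op E2
        left_place = self.expr(left)
        right_place = self.expr(right)
        place = self.new_temp()
        self.code.append(f"{place} := {left_place} {op} {right_place}")
        return place

    def assign(self, name: str, e: Expr) -> List[str]:
        """S -> id := E: code for E followed by the copy into id."""
        place = self.expr(e)
        self.code.append(f"{name} := {place}")
        return self.code


if __name__ == "__main__":
    term = ("*", "b", ("uminus", "c"))
    for line in TacGen().assign("a", ("+", term, term)):
        print(line)
```

For a := b * -c + b * -c this produces the six statements t1 through t5 and a := t5 in the order shown in the Three-Address Code Example.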

Example: S -> while E do S1
  S.begin := newlabel;
  S.after := newlabel;
  S.code := gen(S.begin ':') || E.code ||
            gen('if' E.place '=' '0' 'goto' S.after) ||
            S1.code || gen('goto' S.begin) || gen(S.after ':')

Quadruples
A quadruple is a record structure with four fields: op, arg1, arg2, and result
The op field contains an internal code for an operator
Statements with unary operators do not use arg2
Operators like param use neither arg2 nor result
The target label of a conditional or unconditional jump is stored in result
The contents of fields arg1, arg2, and result are typically pointers to symbol table entries
If so, temporaries must be entered into the symbol table as they are created
Constants need to be handled differently

Quadruples Example
       op      arg1  arg2  result
  (0)  uminus  c           t1
  (1)  *       b     t1    t2
  (2)  uminus  c           t3
  (3)  *       b     t3    t4
  (4)  +       t2    t4    t5
  (5)  :=      t5          a

Triples
Triples refer to a temporary value by the position of the statement that computes it
Statements can be represented by a record with only three fields: op, arg1, and arg2
Avoids the need to enter temporary names into the symbol table
Contents of arg1 and arg2:
  Pointer into symbol table (for programmer-defined names)
  Pointer into triple structure (for temporaries)
Constants still need to be handled differently
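To make the relationship between the two record layouts concrete, the sketch below converts a list of quadruples into triples by replacing each temporary with the position of the statement that computes it. The Quad type and the to_triples function are hypothetical names, and the sketch assumes every temporary is assigned exactly once.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Quad(NamedTuple):
    """Hypothetical quadruple record: op, arg1, arg2, result."""
    op: str
    arg1: str
    arg2: Optional[str]
    result: str


Triple = Tuple[str, Optional[str], Optional[str]]


def to_triples(quads: List[Quad]) -> List[Triple]:
    """Rewrite quadruples as triples (assumes single-assignment temporaries)."""
    defined_at: Dict[str, int] = {}   # temporary name -> index of its triple
    triples: List[Triple] = []

    def ref(arg: Optional[str]) -> Optional[str]:
        # A temporary becomes a parenthesized statement number.
        if arg in defined_at:
            return f"({defined_at[arg]})"
        return arg

    for q in quads:
        if q.op == ":=":
            # A copy into a program variable becomes an assign triple.
            triples.append(("assign", q.result, ref(q.arg1)))
        else:
            triples.append((q.op, ref(q.arg1), ref(q.arg2)))
            defined_at[q.result] = len(triples) - 1
    return triples


if __name__ == "__main__":
    quads = [
        Quad("uminus", "c", None, "t1"),
        Quad("*", "b", "t1", "t2"),
        Quad("uminus", "c", None, "t3"),
        Quad("*", "b", "t3", "t4"),
        Quad("+", "t2", "t4", "t5"),
        Quad(":=", "t5", None, "a"),
    ]
    for index, triple in enumerate(to_triples(quads)):
        print(index, triple)
```

The temporaries t1 to t5 never appear in the output, which illustrates why triples avoid entering temporary names into the symbol table.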

Triples Example
       op      arg1  arg2
  (0)  uminus  c
  (1)  *       b     (0)
  (2)  uminus  c
  (3)  *       b     (2)
  (4)  +       (1)   (3)
  (5)  assign  a     (4)

Declarations
A symbol table entry is created for every declared name
Information includes name, type, relative address of storage, etc.
Relative address consists of an offset:
  For globals, the offset is from the base of the static data area
  For names local to a procedure, the offset is from the field for local data in the activation record
Types are assigned attributes type and width (size)
Becomes more complex if we need to deal with nested procedures or records

Declarations
  P -> D                   offset := 0
  D -> D ; D
  D -> id : T              enter(id.name, T.type, offset); offset := offset + T.width
  T -> integer             T.type := integer; T.width := 4
  T -> real                T.type := real; T.width := 8
  T -> array[num] of T1    T.type := array(num, T1.type); T.width := num * T1.width
  T -> ^T1                 T.type := pointer(T1.type); T.width := 4

Translating Assignments
  S -> id := E     p := lookup(id.name);
                   if p != NULL then emit(p ':=' E.place) else error
  E -> E1 + E2     emit(E.place ':=' E1.place '+' E2.place)
  E -> E1 * E2     emit(E.place ':=' E1.place '*' E2.place)
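The offset bookkeeping in the declaration rules can be sketched as follows. Types are encoded as strings or tuples, and the names width and layout are illustrative assumptions; the widths (4 for integer and pointer, 8 for real) are the ones given in the rules.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical type encoding: "integer", "real",
# ("array", num, element_type) or ("pointer", target_type).
Type = Union[str, Tuple]


def width(t: Type) -> int:
    """T.width as defined by the declaration rules."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":
        return t[1] * width(t[2])
    if t[0] == "pointer":
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(decls: List[Tuple[str, Type]]) -> Tuple[Dict[str, Tuple[Type, int]], int]:
    """Assign each declared name a relative address, as enter() would record it."""
    offset = 0                        # P -> D: offset := 0
    table: Dict[str, Tuple[Type, int]] = {}
    for name, t in decls:             # D -> id : T
        table[name] = (t, offset)     # enter(id.name, T.type, offset)
        offset += width(t)            # offset := offset + T.width
    return table, offset


if __name__ == "__main__":
    table, total = layout([
        ("i", "integer"),
        ("x", "real"),
        ("a", ("array", 10, "real")),
        ("p", ("pointer", "integer")),
    ])
    for name, (t, off) in table.items():
        print(name, off, t)
    print("total size:", total)
```

With these four declarations the names land at offsets 0, 4, 12, and 92, and the data area is 96 bytes wide.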

Translating Assignments (continued)
  E -> -E1      emit(E.place ':=' 'uminus' E1.place)
  E -> (E1)     E.place := E1.place
  E -> id       p := lookup(id.name);
                if p != NULL then E.place := p else error

Addressing Array Elements
The location of the i-th element of array A is: base + (i - low) * w
  w is the width of each element
  low is the lower bound of the subscript
  base is the relative address of A[low]
The expression for the location can be rewritten as: i * w + (base - low * w)
The subexpression in parentheses is a constant
That subexpression can be evaluated at compile time
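A short numeric check of the rewritten address expression may help. The function names and the sample values (base 100, low 1, w 4) are assumptions chosen for illustration.

```python
def direct_address(base: int, low: int, w: int, i: int) -> int:
    """Location of A[i] computed as base + (i - low) * w."""
    return base + (i - low) * w


def compile_time_constant(base: int, low: int, w: int) -> int:
    """The part that does not depend on i: base - low * w."""
    return base - low * w


def run_time_address(i: int, w: int, c: int) -> int:
    """What remains to be computed at run time: i * w + c."""
    return i * w + c


if __name__ == "__main__":
    base, low, w = 100, 1, 4
    c = compile_time_constant(base, low, w)
    for i in range(low, low + 5):
        assert run_time_address(i, w, c) == direct_address(base, low, w, i)
    print("constant part:", c)
```

Folding base - low * w into a single constant leaves one multiplication and one addition per access at run time.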

Two-Dimensional Arrays
Stored in row-major form
The address of A[i1, i2] is: base + ((i1 - low1) * n2 + i2 - low2) * w
  where n2 = high2 - low2 + 1
We can rewrite the above as: ((i1 * n2) + i2) * w + (base - ((low1 * n2) + low2) * w)
The last term can be computed at compile time

Type Conversions
There are multiple types (e.g. integer, real) for variables and constants
Compiler may need to reject certain mixed-type operations
At times, a compiler needs to generate type conversion instructions
An attribute E.type holds the type of an expression
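Returning to the row-major formula for two-dimensional arrays, the sketch below checks that the rewritten form agrees with the direct one over a small array. The function names and the sample bounds are assumptions made for the example.

```python
def direct_2d(base, low1, low2, high2, w, i1, i2):
    """base + ((i1 - low1) * n2 + i2 - low2) * w, with n2 = high2 - low2 + 1."""
    n2 = high2 - low2 + 1
    return base + ((i1 - low1) * n2 + i2 - low2) * w


def constant_2d(base, low1, low2, high2, w):
    """Compile-time term: base - ((low1 * n2) + low2) * w."""
    n2 = high2 - low2 + 1
    return base - ((low1 * n2) + low2) * w


def run_time_2d(i1, i2, n2, w, c):
    """Run-time part: ((i1 * n2) + i2) * w + c."""
    return ((i1 * n2) + i2) * w + c


if __name__ == "__main__":
    base, low1, low2, high2, w = 0, 1, 1, 20, 4
    n2 = high2 - low2 + 1
    c = constant_2d(base, low1, low2, high2, w)
    for i1 in range(1, 11):
        for i2 in range(1, 21):
            assert run_time_2d(i1, i2, n2, w, c) == direct_2d(
                base, low1, low2, high2, w, i1, i2)
    print("constant term:", c)
```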

Semantic Action: E -> E1 + E2
  if E1.type = integer and E2.type = integer then begin
    emit(E.place ':=' E1.place 'int+' E2.place);
    E.type := integer
  end
  else if E1.type = real and E2.type = real then ...
  else if E1.type = integer and E2.type = real then begin
    u := newtemp;
    emit(u ':=' 'inttoreal' E1.place);
    emit(E.place ':=' u 'real+' E2.place);
    E.type := real
  end
  else if E1.type = real and E2.type = integer then ...
  else E.type := type_error;

Example: x := y + i * j
In this example, x and y have type real; i and j have type integer
The intermediate code is shown below:
  t1 := i int* j
  t3 := inttoreal t1
  t2 := y real+ t3
  x := t2
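The semantic action above can be sketched as a generator that inserts inttoreal where operand types differ. Several things here are assumptions: the function name translate, the tuple encoding of expressions, the treatment of the two cases the slide leaves out (handled symmetrically), and the choice to allocate the result temporary before the conversion temporary so the numbering matches the example. Only integer and real are modelled, so the type_error branch is omitted.

```python
import itertools
from typing import Dict, List, Tuple, Union

Expr = Union[str, Tuple]  # a name, or (op, left, right)


def translate(target: str, expr: Expr, types: Dict[str, str]) -> List[str]:
    """Emit three-address code for target := expr with int/real conversions."""
    counter = itertools.count(1)
    code: List[str] = []

    def new_temp() -> str:
        return f"t{next(counter)}"

    def to_real(place: str, typ: str) -> str:
        # Insert an inttoreal conversion only for integer operands.
        if typ == "real":
            return place
        u = new_temp()
        code.append(f"{u} := inttoreal {place}")
        return u

    def gen(e: Expr) -> Tuple[str, str]:
        """Return (E.place, E.type) after emitting code for e."""
        if isinstance(e, str):
            return e, types[e]
        op, left, right = e
        lp, lt = gen(left)
        rp, rt = gen(right)
        place = new_temp()
        if lt == "integer" and rt == "integer":
            code.append(f"{place} := {lp} int{op} {rp}")
            return place, "integer"
        lp = to_real(lp, lt)
        rp = to_real(rp, rt)
        code.append(f"{place} := {lp} real{op} {rp}")
        return place, "real"

    place, _ = gen(expr)
    code.append(f"{target} := {place}")
    return code


if __name__ == "__main__":
    types = {"x": "real", "y": "real", "i": "integer", "j": "integer"}
    for line in translate("x", ("+", "y", ("*", "i", "j")), types):
        print(line)
```

For x := y + i * j the sketch emits the same four statements as the example, including the conversion of t1 into t3.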

Boolean Expressions
Boolean expressions compute logical values
Often used with flow-of-control statements
Methods of translating boolean expressions:
  Numerical methods:
    True is represented as 1 and false is represented as 0
    Nonzero values are considered true and zero values are considered false
  Flow-of-control methods:
    Represent the value of a boolean by the position reached in a program
    Often not necessary to evaluate the entire expression

Numerical Representation
Expressions are evaluated left to right using 1 to denote true and 0 to denote false
Example: a or b and not c
  t1 := not c
  t2 := b and t1
  t3 := a or t2
Another example: a < b
  100: if a < b goto 103
  101: t := 0
  102: goto 104
  103: t := 1
  104:

Numerical Representation
  E -> E1 or E2      emit(E.place ':=' E1.place 'or' E2.place)
  E -> E1 and E2     emit(E.place ':=' E1.place 'and' E2.place)
  E -> not E1        emit(E.place ':=' 'not' E1.place)
  E -> (E1)          E.place := E1.place;

Numerical Representation (continued)
  E -> id1 relop id2   emit('if' id1.place relop.op id2.place 'goto' nextstat+3);
                       emit(E.place ':=' '0');
                       emit('goto' nextstat+2);
                       emit(E.place ':=' '1');
  E -> true            emit(E.place ':=' '1')
  E -> false           emit(E.place ':=' '0')
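The numerical rules can be sketched with an emitter that tracks nextstat, the index of the next statement to be generated. The class name NumericBool, its methods, and the starting index of 100 are assumptions made for illustration.

```python
from typing import List


class NumericBool:
    """Sketch of the numerical translation of boolean expressions."""

    def __init__(self, start: int = 100) -> None:
        self.start = start
        self.code: List[str] = []
        self._temps = 0

    @property
    def nextstat(self) -> int:
        """Index of the next three-address statement to be emitted."""
        return self.start + len(self.code)

    def emit(self, text: str) -> None:
        self.code.append(f"{self.nextstat}: {text}")

    def new_temp(self) -> str:
        self._temps += 1
        return f"t{self._temps}"

    def relop(self, left: str, op: str, right: str) -> str:
        """E -> id1 relop id2: four statements that set E.place to 0 or 1."""
        place = self.new_temp()
        self.emit(f"if {left} {op} {right} goto {self.nextstat + 3}")
        self.emit(f"{place} := 0")
        self.emit(f"goto {self.nextstat + 2}")
        self.emit(f"{place} := 1")
        return place

    def binary(self, op: str, p1: str, p2: str) -> str:
        """E -> E1 or E2 and E -> E1 and E2."""
        place = self.new_temp()
        self.emit(f"{place} := {p1} {op} {p2}")
        return place


if __name__ == "__main__":
    g = NumericBool()
    t1 = g.relop("a", "<", "b")
    t2 = g.relop("c", "<", "d")
    t3 = g.relop("e", "<", "f")
    t4 = g.binary("and", t2, t3)
    g.binary("or", t1, t4)
    print("\n".join(g.code))
```

The usage at the bottom translates a<b or c<d and e<f; the and is generated before the or because and binds more tightly.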

Example: a<b or c<d and e<f
Three-address code:
  100: if a < b goto 103
  101: t1 := 0
  102: goto 104
  103: t1 := 1
  104: if c < d goto 107
  105: t2 := 0
  106: goto 108
  107: t2 := 1
  108: if e < f goto 111
  109: t3 := 0
  110: goto 112
  111: t3 := 1
  112: t4 := t2 and t3
  113: t5 := t1 or t4
MIPS code:
  slt t1, a, b
  slt t2, c, d
  slt t3, e, f
  and t4, t2, t3
  or  t5, t1, t4

Flow-of-Control
The function newlabel will return a new symbolic label each time it is called
Each boolean expression will have two new attributes:
  E.true is the label to which control flows if E is true
  E.false is the label to which control flows if E is false
Attribute S.next of a statement S:
  Inherited attribute whose value is the label attached to the first instruction to be executed after the code for S
  Used to avoid jumps to jumps

Flow-of-Control
  S -> if E then S1
    E.true := newlabel;
    E.false := S.next;
    S1.next := S.next;
    S.code := E.code || gen(E.true ':') || S1.code

Flow-of-Control
  S -> if E then S1 else S2
    E.true := newlabel;
    E.false := newlabel;
    S1.next := S.next;
    S2.next := S.next;
    S.code := E.code || gen(E.true ':') || S1.code ||
              gen('goto' S.next) || gen(E.false ':') || S2.code

Flow-of-Control
  S -> while E do S1
    S.begin := newlabel;
    E.true := newlabel;
    E.false := S.next;
    S1.next := S.begin;
    S.code := gen(S.begin ':') || E.code || gen(E.true ':') ||
              S1.code || gen('goto' S.begin)

Boolean Expressions Revisited
  E -> E1 or E2
    E1.true := E.true;
    E1.false := newlabel;
    E2.true := E.true;
    E2.false := E.false;
    E.code := E1.code || gen(E1.false ':') || E2.code
  E -> E1 and E2
    E1.true := newlabel;
    E1.false := E.false;
    E2.true := E.true;
    E2.false := E.false;
    E.code := E1.code || gen(E1.true ':') || E2.code

Boolean Expressions Revisited
  E -> not E1
    E1.true := E.false;
    E1.false := E.true;
    E.code := E1.code
  E -> (E1)
    E1.true := E.true;
    E1.false := E.false;
    E.code := E1.code
  E -> id1 relop id2
    E.code := gen('if' id1.place relop.op id2.place 'goto' E.true) || gen('goto' E.false)
  E -> true
    E.code := gen('goto' E.true)
  E -> false
    E.code := gen('goto' E.false)

Revisited: a<b or c<d and e<f
      if a < b goto Ltrue
      goto L1
  L1: if c < d goto L2
      goto Lfalse
  L2: if e < f goto Ltrue
      goto Lfalse
The code generated is inefficient
What is the problem?
Why was the code generated that way?
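The jumping-code rules for boolean expressions can be sketched as one recursive function that receives E.true and E.false as arguments, since both are inherited attributes. The tuple encoding of expressions and the names jumping_code and label_factory are assumptions; labels are placed on their own lines for simplicity.

```python
import itertools
from typing import Callable, List, Tuple


def label_factory() -> Callable[[], str]:
    """Plays the role of newlabel: L1, L2, ... on successive calls."""
    counter = itertools.count(1)
    return lambda: f"L{next(counter)}"


def jumping_code(expr: Tuple, true_label: str, false_label: str,
                 new_label: Callable[[], str]) -> List[str]:
    """Translate a boolean expression into conditional and unconditional jumps."""
    kind = expr[0]
    if kind == "relop":                       # E -> id1 relop id2
        _, left, op, right = expr
        return [f"if {left} {op} {right} goto {true_label}",
                f"goto {false_label}"]
    if kind == "not":                         # E -> not E1: swap the targets
        return jumping_code(expr[1], false_label, true_label, new_label)
    if kind == "or":                          # E1.false := newlabel
        mid = new_label()
        return (jumping_code(expr[1], true_label, mid, new_label)
                + [f"{mid}:"]
                + jumping_code(expr[2], true_label, false_label, new_label))
    if kind == "and":                         # E1.true := newlabel
        mid = new_label()
        return (jumping_code(expr[1], mid, false_label, new_label)
                + [f"{mid}:"]
                + jumping_code(expr[2], true_label, false_label, new_label))
    raise ValueError(f"unknown expression kind: {kind!r}")


if __name__ == "__main__":
    expr = ("or", ("relop", "a", "<", "b"),
            ("and", ("relop", "c", "<", "d"), ("relop", "e", "<", "f")))
    print("\n".join(jumping_code(expr, "Ltrue", "Lfalse", label_factory())))
```

Because every relational test emits both a conditional and an unconditional jump, the output contains a goto L1 immediately before the label L1, which is one visible source of the inefficiency the slide asks about.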

Another Example
  while a < b do
    if c < d then
      x := y + z
    else
      x := y - z

  L1:    if a < b goto L2
         goto Lnext
  L2:    if c < d goto L3
         goto L4
  L3:    t1 := y + z
         x := t1
         goto L1
  L4:    t2 := y - z
         x := t2
         goto L1
  Lnext:

Mixed-Mode Expressions
Boolean expressions often have arithmetic subexpressions, e.g. (a + b) < c
If false has the value 0 and true has the value 1, arithmetic expressions can have boolean subexpressions
  Example: (a < b) + (b < a) has value 0 if a and b are equal and 1 otherwise
Some operators may require both operands to be boolean
Other operators may take both types of arguments, including mixed arguments

Revisited: E -> E1 + E2
  E.type := arith;
  if E1.type = arith and E2.type = arith then begin
    /* normal arithmetic add */
    E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
  end
  else if E1.type = arith and E2.type = bool then begin
    E2.place := newtemp;
    E2.true := newlabel;
    E2.false := newlabel;
    E.code := E1.code || E2.code ||
              gen(E2.true ':' E.place ':=' E1.place + 1) ||
              gen('goto' nextstat+1) ||
              gen(E2.false ':' E.place ':=' E1.place)
  end
  else if ...

Case (Switch) Statements
Implemented as:
  Sequence of if statements
  Jump table

Labels and Goto Statements
The definition of a label is treated as a declaration of the label
Labels are typically entered into the symbol table
  Entry is created the first time the label is seen
  This may be before the definition of the label if it is the target of any forward goto
When a compiler encounters a goto statement:
  It must ensure that there is exactly one appropriate label in the current scope
  If so, it must generate the appropriate code; otherwise, an error should be indicated

Procedures
The procedure is an extremely important, very commonly used construct
It is imperative that a compiler generate good calls and returns
Much of the support for procedure calls is provided by a run-time support package
  S -> call id ( Elist )
  Elist -> Elist , E
  Elist -> E

Calling Sequence
Calling sequences can differ for different implementations of the same language
Certain actions typically take place:
  Space must be allocated for the activation record of the called procedure
  The passed arguments must be evaluated and made available to the called procedure
  Environment pointers must be established to enable the called procedure to access appropriate data
  The state of the calling procedure must be saved
  The return address must be stored in a known place
  An appropriate jump statement must be generated

Return Statements
Several actions must also take place when a procedure terminates:
  If the called procedure is a function, the result must be stored in a known place
  The activation record of the calling procedure must be restored
  A jump to the calling procedure's return address must be generated
There is no exact division of run-time tasks between the calling and called procedure

Pass by Reference
The param statements can be used as placeholders for arguments
The called procedure is passed a pointer to the first of the param statements
Any argument can be obtained by using the proper offset from the base pointer
Arguments other than simple names:
  First generate three-address statements needed to evaluate these arguments
  Follow this by a list of param three-address statements

Using a Queue
  S -> call id ( Elist )    for each item p on queue do emit('param' p);
                            emit('call' id.place)
  Elist -> Elist , E        append E.place to the end of queue
  Elist -> E                initialize queue to contain only E.place
The code to evaluate arguments is emitted first, followed by param statements and then a call
If desired, could augment rules to count the number of parameters
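The queue-based translation of a call can be sketched as below. The function name translate_call and the tuple encoding of argument expressions are assumptions, and the sketch includes the optional augmentation of counting the parameters so the call statement has the form call p, n.

```python
import itertools
from collections import deque
from typing import Deque, List, Tuple, Union

Expr = Union[str, Tuple]  # a name, or (op, left, right)


def translate_call(proc: str, args: List[Expr]) -> List[str]:
    """Emit argument evaluation first, then the param list, then the call."""
    counter = itertools.count(1)
    code: List[str] = []
    queue: Deque[str] = deque()

    def place_of(arg: Expr) -> str:
        """Emit code for an argument and return the name holding its value."""
        if isinstance(arg, str):
            return arg
        op, left, right = arg
        lp, rp = place_of(left), place_of(right)
        temp = f"t{next(counter)}"
        code.append(f"{temp} := {lp} {op} {rp}")
        return temp

    for arg in args:                  # Elist rules: enqueue each E.place
        queue.append(place_of(arg))

    count = len(queue)
    while queue:                      # S rule: one param per queued place
        code.append(f"param {queue.popleft()}")
    code.append(f"call {proc}, {count}")
    return code


if __name__ == "__main__":
    for line in translate_call("p", ["x", ("+", "a", "b"), ("*", "c", "d")]):
        print(line)
```

A queue is used so the param statements come out in the same left-to-right order in which the arguments were evaluated.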