The Translation of Functional Programming Languages


11 The language PuF

We only regard a mini-language PuF ("Pure Functions"). We do not treat, as yet:

  - side effects;
  - data structures.

A program is an expression e of the form:

  e ::=  b
      |  x
      |  (□_1 e)
      |  (e_1 □_2 e_2)
      |  (if e_0 then e_1 else e_2)
      |  (e' e_0 ... e_{k-1})
      |  (fn x_0, ..., x_{k-1} → e)
      |  (let x_1 = e_1; ...; x_n = e_n in e_0)
      |  (letrec x_1 = e_1; ...; x_n = e_n in e_0)

An expression is therefore

  - a basic value, a variable, the application of an operator, or
  - a function application, a function abstraction, or
  - a let-expression, i.e. an expression with locally defined variables, or
  - a letrec-expression, i.e. an expression with simultaneously defined local variables.

For simplicity, we only allow int and bool as basic types.
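
For concreteness, the abstract syntax of PuF can be written down as an OCaml datatype. This is only a sketch of ours; the constructor names (Basic, Var, ...) and the choice to represent operators by their names are assumptions, not part of the lecture.

  type var = string

  (* One constructor per alternative of the grammar above. *)
  type expr =
    | Basic  of int                        (* basic value b (bool encoded as int)       *)
    | Var    of var                        (* variable x                                *)
    | Unop   of string * expr              (* (□_1 e), operator given by its name       *)
    | Binop  of string * expr * expr       (* (e_1 □_2 e_2)                             *)
    | If     of expr * expr * expr         (* (if e_0 then e_1 else e_2)                *)
    | App    of expr * expr list           (* (e' e_0 ... e_{k-1})                      *)
    | Fn     of var list * expr            (* (fn x_0, ..., x_{k-1} → e)                *)
    | Let    of (var * expr) list * expr   (* (let x_1 = e_1; ...; x_n = e_n in e_0)    *)
    | Letrec of (var * expr) list * expr   (* (letrec x_1 = e_1; ...; x_n = e_n in e_0) *)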

Example: The following well-known function computes the factorial of a natural number:

  letrec fac = fn x → if x ≤ 1 then 1 else x · fac (x - 1)
  in fac 7

As usual, we only use the minimal amount of parentheses.

There are two semantics:

  CBV: Arguments are evaluated before they are passed to the function (as in SML);
  CBN: Arguments are passed unevaluated; they are only evaluated when their value is needed (as in Haskell).
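
For comparison, the same program transliterated into OCaml, which uses CBV like SML (our own rendering, nothing MaMa-specific):

  let rec fac x = if x <= 1 then 1 else x * fac (x - 1)
  let () = print_int (fac 7)    (* prints 5040 *)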

12 Architecture of the MaMa

We already know the following components:

  C  = Code store: contains the MaMa program; each cell contains one instruction.
  PC = Program Counter: points to the instruction to be executed next.

  S  = Runtime stack: each cell can hold a basic value or an address.
  SP = Stack Pointer: points to the topmost occupied cell; as in the CMa, implicitly represented.
  FP = Frame Pointer: points to the current stack frame.

We also need a heap H:

[figure: a heap object consists of a tag field together with code-pointer, value, and heap-pointer fields, depending on its kind]

... it can be thought of as an abstract data type, capable of holding data objects of the following form:

  B  v                      Basic value
  C  cp gp                  Closure   (code pointer cp, global vector pointer gp)
  F  cp ap gp               Function  (code pointer cp, argument vector pointer ap, global vector pointer gp)
  V  n v[0] ... v[n-1]      Vector of length n
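
The four kinds of heap objects can again be sketched as an OCaml datatype. The reading of the fields (cp = code pointer, ap = argument vector, gp = global vector) follows the use made of them later in the course; representing heap references as plain integer addresses is our own simplification.

  type heap_addr = int          (* index of an object in the heap H *)

  type heap_object =
    | B of int                                  (* Basic value                              *)
    | C of int * heap_addr                      (* Closure:  cp (code address), gp          *)
    | F of int * heap_addr * heap_addr          (* Function: cp, ap (arguments), gp         *)
    | V of heap_addr array                      (* Vector v[0] ... v[n-1]; n = array length *)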

The instruction new (tag, args) creates a corresponding object (B, C, F, V) in H and returns a reference to it.

We distinguish three different kinds of code for an expression e:

  code_V e   (generates code that) computes the Value of e, stores it in the heap and returns a reference to it on top of the stack (the normal case);
  code_B e   computes the value of e and returns it on top of the stack (only for Basic types);
  code_C e   does not evaluate e, but stores a Closure of e in the heap and returns a reference to the closure on top of the stack.

We start with the code schemata for the first two kinds:

13 Simple expressions

Expressions consisting only of constants, operator applications, and conditionals are translated like expressions in imperative languages:

  code_B b ρ sd             =  loadc b

  code_B (□_1 e) ρ sd       =  code_B e ρ sd
                               op_1

  code_B (e_1 □_2 e_2) ρ sd =  code_B e_1 ρ sd
                               code_B e_2 ρ (sd + 1)
                               op_2

  code_B (if e_0 then e_1 else e_2) ρ sd  =  code_B e_0 ρ sd
                                             jumpz A
                                             code_B e_1 ρ sd
                                             jump B
                                         A:  code_B e_2 ρ sd
                                         B:  ...

Note:

ρ denotes the current address environment in which the expression is translated. Address environments have the form:

  ρ : Vars → {L, G} × Z

The extra argument sd, the stack difference, simulates the movement of the SP when instruction execution modifies the stack. It is needed later to address variables.

The instructions op_1 and op_2 implement the operators □_1 and □_2, in the same way as the instructions neg and add implement negation resp. addition in the CMa.

For all other expressions, we first compute the value in the heap and then dereference the returned pointer:

  code_B e ρ sd  =  code_V e ρ sd
                    getbasic

getbasic

[figure: before, the top of the stack holds a reference to a heap object B 17; afterwards it holds the basic value 17]

  if (H[S[SP]] != (B,_))
      Error "not basic!";
  else
      S[SP] = H[S[SP]].v;

For code_V and simple expressions, we define analogously:

  code_V b ρ sd             =  loadc b; mkbasic

  code_V (□_1 e) ρ sd       =  code_B e ρ sd
                               op_1; mkbasic

  code_V (e_1 □_2 e_2) ρ sd =  code_B e_1 ρ sd
                               code_B e_2 ρ (sd + 1)
                               op_2; mkbasic

  code_V (if e_0 then e_1 else e_2) ρ sd  =  code_B e_0 ρ sd
                                             jumpz A
                                             code_V e_1 ρ sd
                                             jump B
                                         A:  code_V e_2 ρ sd
                                         B:  ...
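
The two schemata for simple expressions can be transcribed almost literally into OCaml. The sketch below assumes the expr type from above, introduces a hypothetical instr type for the MaMa instructions used so far, and leaves everything that is only treated later (variables, applications, let) to a catch-all case.

  (* Fragment of the MaMa instruction set needed for simple expressions. *)
  type instr =
    | Loadc of int | Mkbasic | Getbasic
    | Op1 of string | Op2 of string
    | Jumpz of string | Jump of string | Label of string

  (* Fresh jump targets A, B, ... *)
  let fresh_label =
    let n = ref 0 in
    fun () -> incr n; Printf.sprintf "_%d" !n

  let rec code_b e rho sd =
    match e with
    | Basic b            -> [Loadc b]
    | Unop (op, e1)      -> code_b e1 rho sd @ [Op1 op]
    | Binop (op, e1, e2) -> code_b e1 rho sd @ code_b e2 rho (sd + 1) @ [Op2 op]
    | If (e0, e1, e2)    ->
        let a = fresh_label () and b = fresh_label () in
        code_b e0 rho sd @ [Jumpz a]
        @ code_b e1 rho sd @ [Jump b]
        @ (Label a :: code_b e2 rho sd) @ [Label b]
    | _                  -> code_v e rho sd @ [Getbasic]   (* all other expressions *)

  and code_v e rho sd =
    match e with
    | Basic b            -> [Loadc b; Mkbasic]
    | Unop (op, e1)      -> code_b e1 rho sd @ [Op1 op; Mkbasic]
    | Binop (op, e1, e2) -> code_b e1 rho sd @ code_b e2 rho (sd + 1) @ [Op2 op; Mkbasic]
    | If (e0, e1, e2)    ->
        let a = fresh_label () and b = fresh_label () in
        code_b e0 rho sd @ [Jumpz a]
        @ code_v e1 rho sd @ [Jump b]
        @ (Label a :: code_v e2 rho sd) @ [Label b]
    | _                  -> failwith "treated in the following sections"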

mkbasic

[figure: before, the top of the stack holds the basic value 17; afterwards it holds a reference to a new heap object B 17]

  S[SP] = new (B, S[SP]);

14 Accessing Variables

We must distinguish between local and global variables.

Example: Regard the function f:

  let c = 5
  in let f = fn a → let b = a · a
                    in b + c
  in f c

The function f uses the global variable c and the local variables a (as formal parameter) and b (introduced by the inner let).

The binding of a global variable is determined when the function is constructed (static scoping!), and later only looked up.

Accessing Global Variables

The bindings of global variables of an expression or a function are kept in a vector in the heap (Global Vector). They are addressed consecutively, starting with 0.

When an F-object or a C-object is constructed, the Global Vector for the function or the expression is determined and a reference to it is stored in the gp-component of the object.

During the evaluation of an expression, the (new) register GP (Global Pointer) points to the current Global Vector.

In contrast, local variables should be administered on the stack ...

⟹  General form of the address environment:

  ρ : Vars → {L, G} × Z

Accessing Local Variables

Local variables are administered on the stack, in stack frames.

Let e ≡ e' e_0 ... e_{m-1} be the application of a function e' to arguments e_0, ..., e_{m-1}.

Warning: The arity of e' does not need to be m :-)

PuF functions have curried types:  f : t_1 → t_2 → ... → t_n → t

  - f may therefore receive fewer than n arguments (under-supply);
  - f may also receive more than n arguments, if t is a functional type (over-supply).
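
The same phenomenon can be observed in OCaml, where functions are curried as well (our own illustration, not part of the slides):

  let add3 x y z = x + y + z        (* arity 3:  int -> int -> int -> int      *)

  let g = add3 1 2                  (* under-supply: a partial application     *)
  let _ = g 10                      (* supplying the missing argument: 13      *)

  let pair x = fun y -> (x, y)      (* arity 1, but the result is a function   *)
  let _ = pair 1 2                  (* over-supply: the extra argument 2 is    *)
                                    (* consumed by the returned function       *)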

Possible stack organisations:

[figure: FP points at the base; above it lie the arguments e_0 (bottom) up to e_{m-1}, with the F-object for e' on top]

  +  Addressing of the arguments can be done relative to FP.
  -  The local variables of e' cannot be addressed relative to FP.
  -  If e' is an n-ary function with n < m, i.e., we have an over-supplied function application, the remaining m - n arguments will have to be shifted.

If e' evaluates to a function which has already been partially applied to the parameters a_0, ..., a_{k-1}, these have to be sneaked in underneath e_0:

[figure: the previously supplied arguments a_0, ..., a_{k-1} are inserted below e_0, ..., e_{m-1}, just above FP]

Alternative:

[figure: the arguments in reverse order, e_{m-1} at the bottom just above FP and e_0 at the top, with the F-object for e' above them]

  +  The further arguments a_0, ..., a_{k-1} and the local variables can be allocated above the arguments.

[figure: the further arguments a_0, ..., a_{k-1} are pushed above e_0, ..., e_{m-1}]

  -  Addressing of arguments and local variables relative to FP is no longer possible. (Remember: m is unknown when the function definition is translated.)

Way out:

We address both arguments and local variables relative to the stack pointer SP!!!

However, the stack pointer changes during program execution ...

[figure: sd denotes the distance between the current SP and the value sp_0 that SP had on entry to the function body; the arguments e_0, ..., e_{m-1} lie below sp_0, above FP]

The difference between the current value of SP and its value sp_0 at the entry of the function body is called the stack distance, sd.

Fortunately, this stack distance can be determined at compile time for each program point, by simulating the movement of the SP.

The formal parameters x_0, x_1, x_2, ... successively receive the non-positive relative addresses 0, -1, -2, ..., i.e., ρ x_i = (L, -i).

The absolute address of the i-th formal parameter consequently is

  sp_0 - i = (SP - sd) - i

The local let-variables y_1, y_2, y_3, ... will be successively pushed onto the stack:

[figure: stack with the formal parameters x_{k-1}, ..., x_1, x_0 at relative addresses -(k-1), ..., -1, 0 below sp_0, and the let-variables y_1, y_2, y_3 at relative addresses 1, 2, 3 above; SP lies sd cells above sp_0]

The y_i have positive relative addresses 1, 2, 3, ..., that is: ρ y_i = (L, i).

The absolute address of y_i is then

  sp_0 + i = (SP - sd) + i

With CBN, we generate for the access to a variable:

  code_V x ρ sd  =  getvar x ρ sd
                    eval

The instruction eval checks whether the value has already been computed or whether its evaluation still has to be done (⟹ will be treated later :-)

With CBV, we can just delete eval from the above code schema.

The (compile-time) macro getvar is defined by:

  getvar x ρ sd  =  let (t, i) = ρ x
                    in case t of
                         L →  pushloc (sd - i)
                         G →  pushglob i
                       end
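
A possible OCaml rendering of getvar (a standalone sketch of ours; the address environment is taken to be an association list, and only the instructions needed here are declared):

  type binding = L of int | G of int                 (* (L, i) local, (G, i) global *)
  type instr   = Pushloc of int | Pushglob of int | Eval

  (* rho : (string * binding) list *)
  let getvar x rho sd =
    match List.assoc x rho with
    | L i -> [Pushloc (sd - i)]
    | G i -> [Pushglob i]

  (* With CBN, variable access additionally forces evaluation: *)
  let code_v_var x rho sd = getvar x rho sd @ [Eval]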

The access to local variables:

pushloc n

  S[SP+1] = S[SP - n];
  SP++;

Correctness argument:

Let sp and sd be the values of the stack pointer resp. stack distance before the execution of the instruction. The value of the local variable with address i is then loaded from S[a] with

  a = sp - (sd - i) = (sp - sd) + i = sp_0 + i

... exactly as it should be :-)

The access to global variables is much simpler:

pushglob i

  SP = SP + 1;
  S[SP] = GP→v[i];

Example: Regard e ≡ (b + c) for ρ = {b ↦ (L, 1), c ↦ (G, 0)} and sd = 1.

With CBN, we obtain (the numbers trace the simulated stack distance sd before each instruction):

  code_V e ρ 1  =  getvar b ρ 1     =   1  pushloc 0
                   eval                 2  eval
                   getbasic             2  getbasic
                   getvar c ρ 2         2  pushglob 0
                   eval                 3  eval
                   getbasic             3  getbasic
                   add                  3  add
                   mkbasic              2  mkbasic

15 let-expressions

As a warm-up, let us first consider the treatment of local variables :-)

Let e ≡ let y_1 = e_1; ...; y_n = e_n in e_0 be a let-expression.

The translation of e must deliver an instruction sequence that

  - allocates local variables y_1, ..., y_n;
  - in the case of
      CBV: evaluates e_1, ..., e_n and binds the y_i to their values;
      CBN: constructs closures for the e_1, ..., e_n and binds the y_i to them;
  - evaluates the expression e_0 and returns its value.

Here, we consider the non-recursive case only, i.e. where y_j only depends on y_1, ..., y_{j-1}. We obtain for CBN:

  code_V e ρ sd  =  code_C e_1 ρ sd
                    code_C e_2 ρ_1 (sd + 1)
                    ...
                    code_C e_n ρ_{n-1} (sd + n - 1)
                    code_V e_0 ρ_n (sd + n)
                    slide n                      // deallocates the local variables

where ρ_j = ρ ⊕ {y_i ↦ (L, sd + i) | i = 1, ..., j}.

In the case of CBV, we use code_V for the expressions e_1, ..., e_n.

Warning! All the e_i must be associated with the same binding for the global variables!
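
The scheme can be written as a small OCaml function. This sketch takes code_C and code_V for arbitrary expressions as parameters (they are only completed later in the course); binding and Slide reuse the names from the previous sketch and are redeclared here only so the fragment stands on its own.

  type binding = L of int | G of int
  type instr   = Slide of int          (* the remaining instructions are omitted here *)

  (* bindings = [(y_1, e_1); ...; (y_n, e_n)], e0 = body, rho = association list *)
  let code_v_let ~code_c ~code_v bindings e0 rho sd =
    let n = List.length bindings in
    let rec loop bs rho_j j =
      match bs with
      | [] -> code_v e0 rho_j (sd + n) @ [Slide n]          (* body, then deallocate     *)
      | (y, e) :: rest ->
          code_c e rho_j (sd + j)                           (* closure for e_{j+1} (CBN) *)
          @ loop rest ((y, L (sd + j + 1)) :: rho_j) (j + 1)
    in
    loop bindings rho 0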

Example: Consider the expression e ≡ let a = 19; b = a · a in a + b for ρ = ∅ and sd = 0.

We obtain (for CBV; the numbers again trace the stack distance sd before each instruction):

  0  loadc 19
  1  mkbasic
  1  pushloc 0
  2  getbasic
  2  pushloc 1
  3  getbasic
  3  mul
  2  mkbasic
  2  pushloc 1
  3  getbasic
  3  pushloc 1
  4  getbasic
  4  add
  3  mkbasic
  3  slide 2

The instruction slide k deallocates the space for the locals again:

slide k

  S[SP - k] = S[SP];
  SP = SP - k;
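
To make the instruction semantics of this chapter concrete, here is a small executable model of the stack instructions introduced so far. It is our own sketch: the heap is a hash table of tagged objects, and both basic values and heap addresses live in the same int-valued stack, which glosses over details the real MaMa keeps apart.

  type obj = B of int | V of int array           (* only the object kinds needed here *)

  type instr =
    | Loadc of int | Mkbasic | Getbasic
    | Pushloc of int | Pushglob of int
    | Slide of int

  type state = {
    stack        : int array;                    (* S: basic values or heap addresses *)
    mutable sp   : int;                          (* SP: topmost occupied cell         *)
    heap         : (int, obj) Hashtbl.t;         (* H                                 *)
    mutable next : int;                          (* next free heap address            *)
    mutable gp   : int;                          (* GP: address of the Global Vector  *)
  }

  let new_obj st o =
    let a = st.next in Hashtbl.add st.heap a o; st.next <- a + 1; a

  let step st = function
    | Loadc b    -> st.sp <- st.sp + 1; st.stack.(st.sp) <- b
    | Mkbasic    -> st.stack.(st.sp) <- new_obj st (B st.stack.(st.sp))
    | Getbasic   -> (match Hashtbl.find st.heap st.stack.(st.sp) with
                     | B v -> st.stack.(st.sp) <- v
                     | _   -> failwith "not basic!")
    | Pushloc n  -> st.stack.(st.sp + 1) <- st.stack.(st.sp - n); st.sp <- st.sp + 1
    | Pushglob i -> (match Hashtbl.find st.heap st.gp with
                     | V v -> st.sp <- st.sp + 1; st.stack.(st.sp) <- v.(i)
                     | _   -> failwith "GP does not point to a vector")
    | Slide k    -> st.stack.(st.sp - k) <- st.stack.(st.sp); st.sp <- st.sp - k

Extending step with the arithmetic instructions (add, mul, ...) is straightforward; with that extension, folding step over the instruction sequence of the previous example leaves a reference to an object B 380 on top of the stack (19 + 19 · 19).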