Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz 2007 03 16



Similar documents
Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

n Introduction n Art of programming language design n Programming language spectrum n Why study programming languages? n Overview of compilation

How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

Jonathan Worthington Scarborough Linux User Group

System Structures. Services Interface Structure

CSCI 3136 Principles of Programming Languages

CSCI E 98: Managed Environments for the Execution of Programs

1. Overview of the Java Language

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Evolution of the Major Programming Languages

1/20/2016 INTRODUCTION

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

Functional Programming. Functional Programming Languages. Chapter 14. Introduction

Programming Languages

General Introduction

Programming Languages

02 B The Java Virtual Machine

Programming Language Concepts for Software Developers

1 Introduction. 2 An Interpreter. 2.1 Handling Source Code

The Design of the Inferno Virtual Machine. Introduction

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Online Recruitment System 1. INTRODUCTION

Chapter 3: Operating-System Structures. Common System Components

Semantic Analysis: Types and Type Checking

2) Write in detail the issues in the design of code generator.

Fachbereich Informatik und Elektrotechnik SunSPOT. Ubiquitous Computing. Ubiquitous Computing, Helmut Dispert

Language Processing Systems

Lecture 1 Introduction to Android

Cloud Computing. Up until now

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

Objectives. Chapter 2: Operating-System Structures. Operating System Services (Cont.) Operating System Services. Operating System Services (Cont.

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Programming Language Pragmatics

Programming Languages

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0

New Generation of Software Development

Compiler I: Syntax Analysis Human Thought

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Java Garbage Collection Basics

CS 40 Computing for the Web

HOTPATH VM. An Effective JIT Compiler for Resource-constrained Devices

CS 51 Intro to CS. Art Lee. September 2, 2014

Functional Programming

CIS570 Modern Programming Language Implementation. Office hours: TDB 605 Levine

Multi-core Programming System Overview

What is a programming language?

MICHIGAN TEST FOR TEACHER CERTIFICATION (MTTC) TEST OBJECTIVES FIELD 050: COMPUTER SCIENCE

Restraining Execution Environments

Software quality improvement via pattern matching

How To Trace

2 Introduction to Java. Introduction to Programming 1 1

Tail call elimination. Michel Schinz

Java and Real Time Storage Applications

Semester Review. CSC 301, Fall 2015

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

Glossary of Object Oriented Terms

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

Lecture 9. Semantic Analysis Scoping and Symbol Table

Compiler Construction

Moving from CS 61A Scheme to CS 61B Java

Garbage Collection in the Java HotSpot Virtual Machine

CPU Organisation and Operation

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013

CSE 373: Data Structure & Algorithms Lecture 25: Programming Languages. Nicki Dell Spring 2014

CA Compiler Construction

OKLAHOMA SUBJECT AREA TESTS (OSAT )

03 - Lexical Analysis

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

McGraw-Hill The McGraw-Hill Companies, Inc.,

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

University of Twente. A simulation of the Java Virtual Machine using graph grammars

Lecture 1: Introduction

PC Based Escape Analysis in the Java Virtual Machine

CSC Software II: Principles of Programming Languages

Briki: a Flexible Java Compiler

MapReduce. MapReduce and SQL Injections. CS 3200 Final Lecture. Introduction. MapReduce. Programming Model. Example

A Comparison of Programming Languages for Graphical User Interface Programming

An Easier Way for Cross-Platform Data Acquisition Application Development

Write Barrier Removal by Static Analysis

Developing Embedded Software in Java Part 1: Technology and Architecture

Comp 411 Principles of Programming Languages Lecture 34 Semantics of OO Languages. Corky Cartwright Swarat Chaudhuri November 30, 20111

HVM TP : A Time Predictable and Portable Java Virtual Machine for Hard Real-Time Embedded Systems JTRES 2014

Tail Recursion Without Space Leaks

Programming Languages CIS 443

Can You Trust Your JVM Diagnostic Tools?

Course MS10975A Introduction to Programming. Length: 5 Days

Trace-Based and Sample-Based Profiling in Rational Application Developer

The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures

Organization of Programming Languages CS320/520N. Lecture 05. Razvan C. Bunescu School of Electrical Engineering and Computer Science

1.1 WHAT IS A PROGRAMMING LANGUAGE?

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

Virtual Servers. Virtual machines. Virtualization. Design of IBM s VM. Virtual machine systems can give everyone the OS (and hardware) that they want.

Transcription:

Advanced compiler construction Michel Schinz 2007 03 16 General course information Teacher & assistant Course goals Teacher: Michel Schinz Michel.Schinz@epfl.ch Assistant: Iulian Dragos INR 321, 368 64 Iulian.Dragos@epfl.ch The goal of this course is to teach you: 1. how to compile high-level functional and objectoriented programming languages, and 2. how to optimise the generated code. To achieve these goals, the course is split in two parts: 1. a part covering virtual machines, memory management, closure conversion, etc. 2. a part covering data-flow analysis, SSA form, etc. 3 4 Evaluation Grading scheme The grade will be attributed according to the following scheme: Evaluation will be based on: a project, made in groups of two persons at most, and an individual exam oral or written at the end of the semester. Notice that the exam will take place during the last week of the semester, not after it. Part Weight Project 1: garbage collector 15% Project 2: closure conversion 15% Project 3: tail call elimination 10% Advanced project 30% Exam 30% 5 6

Project overview Project parts You will have to improve a compiler and a virtual machine (VM) for minischeme a tiny dialect of Scheme, itself a dialect of Lisp. For example, the map function is minischeme is written: (define map (lambda (f l) (if (null? l) nil (cons (f (head l)) (map f (tail l)))))) The compiler is written in Scala, the VM in C. The project is split in two parts: 1. a common part, during which all groups have to complete the same simple tasks (e.g. add garbage collection to the VM), 2. an individual part, during which all groups have to choose two advanced tasks, try to complete them and describe their work in a short report (e.g. implement a JIT compiler for the VM). 7 8 Resources The course has a Web page: http://lamp.epfl.ch/teaching/advanced_compiler Moreover, to handle project submissions we will use moodle please register to this course on the system! http://moodle.epfl.ch/ Course name: Advanced Compiler Construction Enrolment key: ACT Questions can either be asked during the exercise sessions, or through the course s newsgroup: epfl.ic.cours.act Course overview 9 What is a compiler? Your current view of a compiler must be something like this: What is a compiler, really? Real compilers are often more complicated Character stream Lexical analysis Syntactical analysis Name & type analysis Code generation Scanner Parser Analyser Generator Token stream Tree Attributed tree Executable code multiple simplification and optimisation phases Scanner Parser Analyser Generator sophisticated run time system 11 12

Additional phases Simplification phases transform the program so that complex concepts of the language (e.g. pattern matching, anonymous functions, ) are translated using simpler ones. Optimisation phases transform the program so that it hopefully makes better use of some resource e.g. CPU cycles, memory, etc. Of course, all these phases must preserve the meaning of the original program! Simplification phases Example of simplification phase: Java compilers have a simplification phase that transforms nested classes to toplevel ones. class Out { void f1() { class In { void f2() { f1(); class Out { void f1() { class Out$In { final Out this$0; Out$In(Out o) { this$0 = o; void f2() { this$0.f1(); 13 14 Optimisation phases Example of optimisation phase: Java compilers optimise expressions involving constant values. That includes removing dead code, i.e. code that can never be executed. class C { public final static boolean debug =!true; int f() { if (debug) { System.out.println("C.f() called"); return 10; dead code, removed during compilation Intermediate representations To manipulate the program, simplification and optimisation phases must represent it in some way. One possibility is to use the representation produced by the parser the abstract syntax tree (AST). The AST is perfectly suited to certain tasks, but other intermediate representations (IR) exist and are more appropriate in some situations. 15 16 Kinds of IRs Graphical IRs Intermediate representations can broadly be split in three categories: graphical IRs, which represent the program as a graph, linear IRs, which represent the program as a linear sequence of instructions, and hybrid IRs, which are partly graphical, partly linear. Graphical intermediate representations represent the program as a graph. They are often used in the initial phases of the compiler. In particular, the AST produced by the parser is a graphical IR. Examples: ASTs, some kinds of control-flow graphs, etc. 17 18

Graphical IR example Linear IRs x!12 x!y y!5 if x<y x!2*y This is an example of a control-flow graph (CFG). Nodes are instructions, and edges represent the possible flow of control: if there is an edge from n 1 to n 2, then control can flow directly from n 1 to n 2. Linear intermediate representations represent the program as a sequence of instructions. They are often used in the final phases of the compiler, since machine code itself is linear. Examples: three-address code, stack languages, etc. z!x/y 19 20 Linear IR example Hybrid IRs x!12 y!5 if x<y goto L1 x!y goto L2 L1: x!2*y L2: z!x/y Hybrid intermediate representations have graphical and linear components. For example, most control-flow graphs are hybrid, unlike the one presented before: the nodes in the CFG are linear sequences of instructions called basic blocks but the CFG itself is a graph. 21 22 Hybrid IR example SSA form basic block x!y x!12 y!5 if x<y z!x/y x!2*y This CFG is equivalent to the one presented before, but its nodes are basic blocks and not single instructions. It therefore contains fewer nodes. A basic block is a maximal sequence of instructions such that control always enters at the top and leaves at the bottom. Static single-assignment (SSA) form is an intermediate representation with an important characteristic: all variables are assigned exactly once. This characteristic makes a lot of optimisations easier. For example, identifying common sub-expressions is trivial. Transforming an imperative program to SSA form implies the introduction of so-called "-functions. 23 24

SSA form example Intermediate languages x1!12 y1!5 if x1<y1 goto L1 x2!y1 goto L2 L1: x3!2*y1 L2: x4!"(x2,x3) z1!x4/y1 The "-function magically selects the correct x, depending on the flow of control Intermediate representations that can be represented textually as a program are often called intermediate languages. Intermediate languages are similar to normal programming languages, but designed with different goals. For example, simplicity is usually preferred to conciseness. Some intermediate languages are typed. This can help debugging the compiler, as the result of each phase can be type-checked. 25 26 Run time system Implementing a high-level programming language usually means more than just writing a compiler! A complete run time system (RTS) must be written, to assist the execution of compiled programs by providing various services like memory management, threads, etc. For example, the Java Virtual Machine is the run time system for Java, Scala and many other programming languages. It handles (lazy) class loading, byte-code verification and interpretation, just-in-time compilation, threading, garbage collection, etc. and provides a debugging interface. A Java Virtual Machine is actually more complex to develop than a Java compiler! Memory management Most modern programming languages offer automatic memory management: the programmer allocates memory explicitly, but deallocation is performed automatically. The deallocation of memory is usually performed by a part of the run time system called the garbage collector (GC). A garbage collector periodically frees all memory that has been allocated by the program but is not reachable anymore. 27 28 Virtual machines VM pros and cons Instead of targeting a real processor, a compiler can target a virtual one, usually called a virtual machine (VM). The produced code is then interpreted by a program emulating the virtual machine. Virtual machines are interesting for several reasons: the compiler can target a single, usually high-level architecture, the program can easily be monitored during execution, e.g. to prevent malicious behaviour, or provide debugging facilities, the distribution of compiled code is easier. The main (only?) disadvantage of virtual machines is their speed: it is always slower to interpret a program in software than to execute it directly in hardware. 29 30

Dynamic (JIT) compilation Summary To make virtual machines faster, dynamic, or just-in-time (JIT) compilation was invented. The idea is simple: Instead of interpreting a piece of code, the virtual machine translates it to machine code, and hands that code to the processor for execution. This is usually faster than interpretation. Compilers for high-level languages are more complex than the ones you ve studied, since: they must translate high-level concepts like patternmatching, anonymous functions, etc. to lower-level equivalents, they must be accompanied by a sophisticated run time system, and they should produced optimised code. This course will be focused on these aspects of compilers and run time systems. 31 32