Language Processing Systems



Similar documents
How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

Programming Languages

CA Compiler Construction

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

CSCI 3136 Principles of Programming Languages

n Introduction n Art of programming language design n Programming language spectrum n Why study programming languages? n Overview of compilation

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Programming Languages

Compiler I: Syntax Analysis Human Thought

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

Lecture 27 C and Assembly

Compiler Construction

Programming Languages

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

Artificial Intelligence. Class: 3 rd

Automated Repair of Binary and Assembly Programs for Cooperating Embedded Devices

Programming Language Pragmatics

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Scanning and parsing. Topics. Announcements Pick a partner by Monday Makeup lecture will be on Monday August 29th at 3pm

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

Intel Assembler. Project administration. Non-standard project. Project administration: Repository

Semester Review. CSC 301, Fall 2015

C Compiler Targeting the Java Virtual Machine

Division of Mathematical Sciences

1/20/2016 INTRODUCTION

Compiler Construction

2) Write in detail the issues in the design of code generator.

DEGREE PLAN INSTRUCTIONS FOR COMPUTER ENGINEERING

Compilers I - Chapter 4: Generating Better Code

Introducción. Diseño de sistemas digitales.1

Semantic Analysis: Types and Type Checking

Levels of Programming Languages. Gerald Penn CSC 324

Instruction Set Architecture (ISA)

Lecture 1: Introduction

How To Write Portable Programs In C

LSN 2 Computer Processors

Compiler and Language Processing Tools

CSC Software II: Principles of Programming Languages

Algorithm & Flowchart & Pseudo code. Staff Incharge: S.Sasirekha

Computer Programming. Course Details An Introduction to Computational Tools. Prof. Mauro Gaspari:

Lecture 9. Semantic Analysis Scoping and Symbol Table

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

Lexical analysis FORMAL LANGUAGES AND COMPILERS. Floriano Scioscia. Formal Languages and Compilers A.Y. 2015/2016

1.1 WHAT IS A PROGRAMMING LANGUAGE?

Computer Science. Master of Science

What is a programming language?

X86-64 Architecture Guide

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Application Architectures

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Virtual Machines as an Aid in Teaching Computer Concepts

Course MS10975A Introduction to Programming. Length: 5 Days

KS3 Computing Group 1 Programme of Study hours per week

CS 40 Computing for the Web

03 - Lexical Analysis

Image credits:

GENERIC and GIMPLE: A New Tree Representation for Entire Functions

Texas Essential Knowledge and Skills Correlation to Video Game Design Foundations 2011 N Video Game Design

McGraw-Hill The McGraw-Hill Companies, Inc.,

02 B The Java Virtual Machine

Administrative Issues

64-Bit NASM Notes. Invoking 64-Bit NASM

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

Generating a Recognizer from Declarative Machine Descriptions

CHAPTER 1 ENGINEERING PROBLEM SOLVING. Copyright 2013 Pearson Education, Inc.

CSE 307: Principles of Programming Languages

Chapter 2 Logic Gates and Introduction to Computer Architecture

EMC Publishing. Ontario Curriculum Computer and Information Science Grade 11

CS61: Systems Programing and Machine Organization

Optimizing compilers. CS Modern Compilers: Theory and Practise. Optimization. Compiler structure. Overview of different optimizations

Principles of Programming Languages Topic: Introduction Professor Louis Steinberg

Introduction to Virtual Machines

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

Assessment for Master s Degree Program Fall Spring 2011 Computer Science Dept. Texas A&M University - Commerce

Flex/Bison Tutorial. Aaron Myles Landwehr CAPSL 2/17/2012

Datavetenskapligt Program (kandidat) Computer Science Programme (master)

ATSBA: Advanced Technologies Supporting Business Areas. Programming with Java. 1 Overview and Introduction

Module: Software Instruction Scheduling Part I

Masters in Information Technology

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

School of Computer Science

High-Level Programming Languages. Nell Dale & John Lewis (adaptation by Michael Goldwasser)

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Machine Programming II: Instruc8ons

AP Computer Science AB Syllabus 1

Syntaktická analýza. Ján Šturc. Zima 208

Natural Language to Relational Query by Using Parsing Compiler

Chapter 13: Program Development and Programming Languages

EE361: Digital Computer Organization Course Syllabus

Applying Clang Static Analyzer to Linux Kernel

CS 51 Intro to CS. Art Lee. September 2, 2014

How Compilers Work. by Walter Bright. Digital Mars

Machine-Level Programming II: Arithmetic & Control

How To Trace

Transcription:

Language Processing Systems

Evaluation Active sheets 10 % Exercise reports 30 % Midterm Exam 20 % Final Exam 40 %

Contact Send e-mail to hamada@u-aizu.ac.jp Course materials at www.u-aizu.ac.jp/~hamada/education.html Check every week for update

Books Andrew W. Appel : Modern Compiler Implementation in C A. Aho, R. Sethi and J. Ullman, Compilers: Principles, Techniques and Tools (The Dragon Book), Addison Wesley S. Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufman, 1997

Books

Goals understand the structure of a compiler understand how the components operate understand the tools involved scanner generator, parser generator, etc. understanding means [theory] be able to read source code [practice] be able to adapt/write source code

The Course covers: Introduction Lexical Analysis Syntax Analysis Semantic Analysis Intermediate Code Generation Code Generation Code Optimization (if there is time)

Related to Compilers Interpreters (direct execution) Assemblers Preprocessors Text formatters (non-wysiwyg) Analysis tools

Today s Outline Introduction to Language Processing Systems Why do we need a compiler? What are compilers? Anatomy of a compiler

Why study compilers? Better understanding of programming language concepts Wide applicability Transforming data is very common Many useful data structures and algorithms Bring together: Data structures & Algorithms Formal Languages Computer Architecture Influence: Language Design Architecture (influence is bi-directional)

Issues Driving Compiler Design Correctness Speed (runtime and compile time) Degrees of optimization Multiple passes Space Feedback to user Debugging

Why Study Compilers? Compilers enable programming at a high level language instead of machine instructions. Malleability, Portability, Modularity, Simplicity, Programmer Productivity Also Efficiency and Performance

Compilers Construction touches many topics in Computer Science Theory Finite State Automata, Grammars and Parsing, data-flow Algorithms Graph manipulation, dynamic programming Data structures Symbol tables, abstract syntax trees Systems Allocation and naming, multi-pass systems, compiler construction Computer Architecture Memory hierarchy, instruction selection, interlocks and latencies Security Detection of and Protection against vulnerabilities Software Engineering Software development environments, debugging Artificial Intelligence Heuristic based search

Power of a Language Can use to describe any action Not tied to a context Many ways to describe the same action Flexible

How to instruct a computer How about natural languages? English?? Open the pod bay doors, Hal. I am sorry Dave, I am afraid I cannot do that We are not there yet!! Natural Languages: Powerful, but Ambiguous Same expression describes many possible actions

Programming Languages Properties need to be precise need to be concise need to be expressive need to be at a high-level (lot of abstractions)

High-level Abstract Description to Low-level Implementation Details President General Sergeant My poll ratings are low, lets invade a small nation Cross the river and take defensive positions Forward march, turn left Stop!, Shoot Foot Soldier

1. How to instruct the computer Write a program using a programming language High-level Abstract Description Microprocessors talk in assembly language Low-level Implementation Details Program written in a Programming Languages Compiler Assembly Language Translation

1. How to instruct the computer Input: High-level programming language Output: Low-level assembly instructions Compiler does the translation: Read and understand the program Precisely determine what actions it require Figure-out how to faithfully carry-out those actions Instruct the computer to carry out those actions

Input to the Compiler Standard imperative language (Java, C, C++) State Variables, Structures, Arrays Computation Expressions (arithmetic, logical, etc.) Assignment statements Control flow (conditionals, loops) Procedures

Output of the Compiler State Registers Memory with Flat Address Space Machine code load/store architecture Load, store instructions Arithmetic, logical operations on registers Branch instructions

Example (input program) int sumcalc(int a, int b, int N) { int i, x, y; x = 0; y = 0; for(i = 0; i <= N; i++) { x = x + (4*a/b)*i + (i+1)*(i+1); x = x + b*y; } return x; }

Example (Output assembly code) sumcalc: pushq %rbp movq %rsp, %rbp movl %edi, -4(%rbp) movl %esi, -8(%rbp) movl %edx, -12(%rbp) movl $0, -20(%rbp) movl $0, -24(%rbp) movl $0, -16(%rbp).L2: movl -16(%rbp), %eax cmpl -12(%rbp), %eax jg.l3 movl -4(%rbp), %eax leal 0(,%rax,4), %edx leaq -8(%rbp), %rax movq %rax, -40(%rbp) movl %edx, %eax movq -40(%rbp), %rcx cltd idivl (%rcx) movl %eax, -28(%rbp) movl -28(%rbp), %edx imull -16(%rbp), %edx movl -16(%rbp), %eax incl %eax imull %eax, %eax addl %eax, %edx leaq -20(%rbp), %rax addl %edx, (%rax) movl -8(%rbp), %eax movl %eax, %edx imull -24(%rbp), %edx leaq -20(%rbp), %rax addl %edx, (%rax) leaq -16(%rbp), %rax incl (%rax) jmp.l2.l3: movl -20(%rbp), %eax leave ret.size sumcalc,.-sumcalc.section.lframe1:.long.lecie1-.lscie1.lscie1:.long 0x0.byte 0x1.string "".uleb128 0x1.sleb128-8.byte 0x10.byte 0xc.uleb128 0x7.uleb128 0x8.byte 0x90.uleb128 0x1.align 8.LECIE1:.long.LEFDE1-.LASFDE1.long.LASFDE1-.Lframe1.quad.LFB2.quad.LFE2-.LFB2.byte 0x4.long.LCFI0-.LFB2.byte 0xe.uleb128 0x10.byte 0x86.uleb128 0x2.byte 0x4.long.LCFI1-.LCFI0.byte 0xd.uleb128 0x6.align 8

Anatomy of a Computer Program written in a Programming Languages Compiler Assembly Language Translation

What is a compiler? A compiler is a program that reads a program written in one language and translates it into another language. program in some source language compiler executable code for target machine Traditionally, compilers go from high-level languages to low-level languages.

Example X=a+b*10 compiler MOV id3, R2 MUL #10.0, R2 MOV id2, R1 ADD R2, R1 MOV R1, id1

What is a compiler? Intermediate representation program in some source language front-end analysis semantic representation back-end synthesis compiler executable code for target machine

Compiler Architecture Front End Back End Source language Scanner (lexical analysis) tokens Parser (syntax analysis) Parse tree Semantic Analysis AST IC generator Intermediate Language Code Optimizer OIL Code Generator Target language Error Handler Symbol Table

front-end: from program text to AST program text lexical analysis tokens front-end syntax analysis AST context handling annotated AST

front-end: from program text to AST Scanner program text token description language grammar Parser scanner generator parser generator lexical analysis tokens syntax analysis AST context handling Semantic analysis Semantic representation annotated AST

Semantic representation program in some source language front-end analysis semantic representation back-end synthesis compiler executable code for target machine heart of the compiler intermediate code linked lists of pseudo instructions abstract syntax tree (AST)

AST example expression grammar expression expression + term expression - term term term term * factor term / factor factor factor identifier constant ( expression ) example expression b*b 4*a*c

parse tree: b*b 4*a*c expression expression - term term term * factor term * factor term * factor identifier factor identifier factor identifier c identifier b constant a b 4

AST: b*b 4*a*c - * * b b * c 4 a

annotated AST: b*b 4*a*c - type: real loc: reg1 * type: real loc: reg1 * type: real loc: reg2 b type: real loc: sp+16 b type: real loc: sp+16 * type: real loc: reg2 c type: real loc: sp+24 identifier constant term expression 4 type: real loc: const a type: real loc: sp+8

position = initial + rate * 60 Example Scanner id 1 := id 2 + id 3 * 60 Parser := id 1 + id 2 * id 3 60 Semantic Analyzer := id 1 + id 2 * id 3 int-to-real 60

AST exercise (5 min.) expression grammar expression expression + term expression - term term term term * factor term / factor factor factor identifier constant ( expression ) example expression b*b (4*a*c) draw parse tree and AST

Answers

answer parse tree: b*b 4*a*c expression expression - term term term * factor term * factor term * factor identifier factor identifier factor identifier c identifier b constant a b 4

answer parse tree: b*b (4*a*c) expression expression - term term factor term * factor expression ( ) factor identifier identifier b 4*a*c b

Break

Advantages of Using Front-end and Backend 1. Retargeting - Build a compiler for a new machine by attaching a new code generator to an existing front-end. 2. Optimization - reuse intermediate code optimizers in compilers for different languages and different machines. Note: the terms intermediate code, intermediate language, and intermediate representation are all used interchangeably.

Compiler structure program in some source language front-end analysis back-end synthesis executable code for target machine program in some source language front-end analysis semantic representation L+M modules = LxM compilers back-end synthesis compiler back-end synthesis executable code for target machine executable code for target machine

Limitations of modular approach performance generic vs specific loss of information program in some source language program in some source language front-end analysis front-end analysis semantic representation back-end synthesis back-end synthesis compiler executable code for target machine executable code for target machine variations must be small same programming paradigm similar processor architecture back-end synthesis executable code for target machine

Front-end and Back-end Suppose you want to write 3 compilers to4 computer platforms: C++ Java FORTRAN MIPS SPARC Pentium PowerPC We need to write 12 programs

Front-end and Back-end But we can do it better C++ FE BE MIPS Java FE IR BE BE SPARC Pentium FORTRAN FE BE PowerPC We need to write 7 programs only IR: Intermediate Representation FE: Front-End BE: Back-End

Front-end and Back-end Suppose you want to write compilers fromm source languages ton computer platforms. A naïve solution requiresn*m programs: C++ Java FORTRAN but we can do it with n+m programs: C++ Java FORTRAN FE FE FE IR BE BE BE BE MIPS SPARC Pentium IR: Intermediate Representation FE: Front-End BE: Back-End PowerPC MIPS SPARC Pentium PowerPC

Compiler Example position=initial+rate*60 compiler MOV id3, R2 MUL #60.0, R2 MOV id2, R1 ADD R2, R1 MOV R1, id1

position := initial + rate * 60 Scanner id 1 := id 2 + id 3 * 60 Parser := id 1 + id 2 * id 3 60 Semantic Analyzer := id 1 + id 2 * id 3 int-to-real 60 Example Intermediate Code Generator temp1 := int-to-real (60) temp2 := id 3 * temp1 temp3 := id 2 + temp2 id1 := temp3 Code Optimizer temp1 := id 3 * 60.0 id1 := id 2 + temp1 Code Generator MOV id3, R2 MUL #60.0, R2 MOV id2, R1 ADD R2, R1 MOV R1, id1

Resident Compiler Compiler Development Test Cycle Development Compiler Source Code Development Compiler Application Output Verifier Application Source Code Expected Application Output Application Output Compiled Application Application Input

A Simple Compiler Example Our goal is to build a very simple compiler its source program are expressions formed from digits separated by plus (+) and minus (-) signs in infix form. The target program is the same expression but in a postfix form. Infix expression compiler Postfix expression Infix expression: Refer to expressions in which the operations are put between its operands. Example: a+b*10 Postfix expression: Refer to expressions in which the operations come after its operands. Example: ab10*+

Infix to Postfix translation 1. If E is a digit then its postfix form is E 2. If E=E 1 +E 2 then its postfix form is E 1`E 2`+ 3. If E=E 1 -E 2 then its postfix form is E 1`E 2`- 4. If E=(E 1 ) then E and E 1 have the same postfix form Where in 2 and 3 E 1` and E 2` represent the postfix forms of E 1 and E 2 respectively.

END

Interpreter vs Compiler Source Program Input Interpreter Output Source Program Compiler Input Target Program Output

Typical Compiler Source Program Lexical Analyzer Syntax Analyzer Semantic Analyzer Intermediate Code Generator Code Optimizer Code Generator Target Program