Semantic Analysis: Types and Type Checking



Similar documents
Lecture 9. Semantic Analysis Scoping and Symbol Table

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

Static vs. Dynamic. Lecture 10: Static Semantics Overview 1. Typical Semantic Errors: Java, C++ Typical Tasks of the Semantic Analyzer

Chapter 5 Names, Bindings, Type Checking, and Scopes

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013

CSCI 3136 Principles of Programming Languages

CS143 Handout 18 Summer 2012 July 16 th, 2012 Semantic Analysis

Symbol Tables. Introduction

Compiler Construction

Programming Languages

Compilers. Introduction to Compilers. Lecture 1. Spring term. Mick O Donnell: michael.odonnell@uam.es Alfonso Ortega: alfonso.ortega@uam.

Compiler Construction

Parsing Technology and its role in Legacy Modernization. A Metaware White Paper

The C Programming Language course syllabus associate level

Memory management. Announcements. Safe user input. Function pointers. Uses of function pointers. Function pointer example

Programming Languages CIS 443

Moving from CS 61A Scheme to CS 61B Java

CS 106 Introduction to Computer Science I

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

Intermediate Code. Intermediate Code Generation

Lab Experience 17. Programming Language Translation

1/20/2016 INTRODUCTION

Compiler I: Syntax Analysis Human Thought

Object Oriented Software Design

C++ INTERVIEW QUESTIONS

C Compiler Targeting the Java Virtual Machine

03 - Lexical Analysis

2) Write in detail the issues in the design of code generator.

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Object-Oriented Design Lecture 4 CSU 370 Fall 2007 (Pucella) Tuesday, Sep 18, 2007

Chapter 6: Programming Languages

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

Chapter 2: Elements of Java

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

C++FA 3.1 OPTIMIZING C++

The programming language C. sws1 1

Basic Programming and PC Skills: Basic Programming and PC Skills:

Programming Project 1: Lexical Analyzer (Scanner)

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

C++FA 5.1 PRACTICE MID-TERM EXAM

KITES TECHNOLOGY COURSE MODULE (C, C++, DS)

Semester Review. CSC 301, Fall 2015

CSE 373: Data Structure & Algorithms Lecture 25: Programming Languages. Nicki Dell Spring 2014

Tutorial on C Language Programming

Lecture 11 Doubly Linked Lists & Array of Linked Lists. Doubly Linked Lists

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

Pemrograman Dasar. Basic Elements Of Java

arrays C Programming Language - Arrays

Flex/Bison Tutorial. Aaron Myles Landwehr CAPSL 2/17/2012

Organization of Programming Languages CS320/520N. Lecture 05. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Object Oriented Software Design

Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment

Thomas Jefferson High School for Science and Technology Program of Studies Foundations of Computer Science. Unit of Study / Textbook Correlation

CS 111 Classes I 1. Software Organization View to this point:

Glossary of Object Oriented Terms

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

Integrating Formal Models into the Programming Languages Course

PART-A Questions. 2. How does an enumerated statement differ from a typedef statement?

J a v a Quiz (Unit 3, Test 0 Practice)

Summit Public Schools Summit, New Jersey Grade Level / Content Area: Mathematics Length of Course: 1 Academic Year Curriculum: AP Computer Science A

C++ Programming Language

Keywords are identifiers having predefined meanings in C programming language. The list of keywords used in standard C are : unsigned void

Introduction to Java Applications Pearson Education, Inc. All rights reserved.

Java Basics: Data Types, Variables, and Loops

Stacks. Linear data structures

The Cool Reference Manual

First Java Programs. V. Paúl Pauca. CSC 111D Fall, Department of Computer Science Wake Forest University. Introduction to Computer Science

Course Name: ADVANCE COURSE IN SOFTWARE DEVELOPMENT (Specialization:.Net Technologies)

ARIZONA CTE CAREER PREPARATION STANDARDS & MEASUREMENT CRITERIA SOFTWARE DEVELOPMENT,

CS193D Handout 06 Winter 2004 January 26, 2004 Copy Constructor and operator=

Programming Languages

CA Compiler Construction

Contents. 9-1 Copyright (c) N. Afshartous

Operator Overloading. Lecture 8. Operator Overloading. Running Example: Complex Numbers. Syntax. What can be overloaded. Syntax -- First Example

Anatomy of Programming Languages. William R. Cook

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Curriculum Map. Discipline: Computer Science Course: C++

Habanero Extreme Scale Software Research Project

Sources: On the Web: Slides will be available on:

Introduction to Automata Theory. Reading: Chapter 1

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

Topics. Parts of a Java Program. Topics (2) CS 146. Introduction To Computers And Java Chapter Objectives To understand:

Programming Language Pragmatics

Masters programmes in Computer Science and Information Systems. Object-Oriented Design and Programming. Sample module entry test xxth December 2013

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Master of Sciences in Informatics Engineering Programming Paradigms 2005/2006. Final Examination. January 24 th, 2006

Java Interview Questions and Answers

Computer Programming I

Compiler Design July 2004

13 File Output and Input

6.170 Tutorial 3 - Ruby Basics

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

Comp 411 Principles of Programming Languages Lecture 34 Semantics of OO Languages. Corky Cartwright Swarat Chaudhuri November 30, 20111

Transcription:

Semantic Analysis Semantic Analysis: Types and Type Checking CS 471 October 10, 2007 Source code Lexical Analysis tokens Syntactic Analysis AST Semantic Analysis AST Intermediate Code Gen lexical errors syntax errors semantic errors 1 CS 471 Fall 2007 Beyond Syntax Errors What s wrong with this code? (Note: it parses perfectly) foo(int a, char * s){ int bar() { int f[3]; int i, j, k; char *p; float k; foo(f[6], 10, j); break; i->val = 5; j = i + k; printf( %s,%s.\n,p,q); goto label23; Undeclared identifier Multiply declared identifier Index out of bounds Wrong number or types of args to call Incompatible types for operation Break statement outside switch/loop Goto with no label 2 CS 471 Fall 2007 3 CS 471 Fall 2007 Kinds of Checks Uniqueness checks Certain names must be unique Many languages require variable declarations Flow-of-control checks Match control-flow operators with structures Example: break applies to innermost loop/switch Type checks Check compatibility of operators and operands Program Checking Why do we care? Obvious: Report mistakes to programmer Avoid bugs: f[6] will cause a run-time failure Help programmer verify intent How do these checks help compilers? Allocate right amount of space for variables Select right machine operations Proper implementation of control structures 4 CS 471 Fall 2007 5 CS 471 Fall 2007 1

Goals of a Semantic Analyzer Find all possible remaining errors that would make program invalid Static checks Done by the compiler Detect and report errors by analyzing the sources Dynamic checks Done at run time Detect and handle errors as they occur What are the pros and cons? Efficiency? Completeness? Developer vs. user experience? Language flexibility? Types What is a type? The notion varies from language to language Consensus A set of values A set of operations allowed on those values Certain operations are legal for each type It doesn t make sense to add a function pointer and an integer in C It does make sense to add two integers But both have the same assembly language implementation! 6 CS 471 Fall 2007 7 CS 471 Fall 2007 Why Do We Need Type Systems? Consider the assembly language fragment addi $r1, $r2, $r3 What are the types of $r1, $r2, $r3? Type Systems A language s type system specifies which operations are valid for which types The goal of type checking is to ensure that operations are used with the correct types Enforces intended interpretation of values Type systems provide a concise formalization of the semantic checking rules 8 CS 471 Fall 2007 9 CS 471 Fall 2007 Type Checking Overview Four kinds of languages: Statically typed: All or almost all checking of types is done as part of compilation Dynamically typed: Almost all checking of types is done as part of program execution (no compiler) as in Perl, Ruby Mixed Model : Java Untyped: No type checking (machine code) The Type Wars Competing views on static vs. dynamic typing Static typing proponents say: Static checking catches many programming errors at compile time Avoids overhead of run-time type checks Dynamic typing proponents say: Static type systems are restrictive Rapid prototyping easier in a dynamic type system In practice, most code is written in statically typed languages with an escape mechanism Unsafe casts in C, Java The best or worst of both worlds? 10 CS 471 Fall 2007 11 CS 471 Fall 2007 2

Type Checking and Type Inference Type Checking is the process of verifying fully typed programs Given an operation and an operand of some type, determine whether the operation is allowed Type Inference is the process of filling in missing type information Given the type of operands, determine the meaning of the operation the type of the operation OR, without variable declarations, infer type from the way the variable is used The two are different, but are often used interchangeably Issues in Typing Does the language have a type system? Untyped languages (e.g. assembly) have no type system at all When is typing performed? Static typing: At compile time Dynamic typing: At runtime How strictly are the rules enforced? Strongly typed: No exceptions Weakly typed: With well-defined exceptions Type equivalence & subtyping When are two types equivalent? What does "equivalent" mean anyway? When can one type replace another? 12 CS 471 Fall 2007 13 CS 471 Fall 2007 Components of a Type System Built-in types Rules for constructing new types Where do we store type information? Rules for determining if two types are equivalent Rules for inferring the types of expressions Component: Built-in Types Integer usual operations: standard arithmetic Floating point usual operations: standard arithmetic Character character set generally ordered lexicographically usual operations: (lexicographic) comparisons Boolean usual operations: not, and, or, xor 14 CS 471 Fall 2007 15 CS 471 Fall 2007 Component: Type Constructors Arrays array(i,t) denotes the type of an array with elements of type T, and index set I multidimensional arrays are just arrays where T is also an array operations: element access, array assignment, products Strings bitstrings, character strings operations: concatenation, lexicographic comparison Records (structs) Groups of multiple objects of different types where the elements are given specific names. Component: Type Constructors Pointers addresses operations: arithmetic, dereferencing, referencing issue: equivalency Function types A function such as "int add(real, int)" has type real int int 16 CS 471 Fall 2007 17 CS 471 Fall 2007 3

Component: Type Equivalence Name equivalence Types are equiv only when they have the same name Structural equivalence Types are equiv when they have the same structure Example C uses structural equivalence for structs and name equivalence for arrays/pointers Tiger uses name equivalence for both let type a = {x:int, y:int type b = {x:int, y:int var i : a := var j : b := in i := j end let type a = {x:int,y:int type c = a var i : a := var j : c := in i := j end Component: Type Equivalence Type Coercion If x is float, is x=3 acceptable? Disallow Allow and implicitly convert 3 to float "Allow" but require programmer to explicitly convert 3 to float What should be allowed? float to int? int to float? What if multiple coercions are possible? Consider 3 + "4" 18 CS 471 Fall 2007 19 CS 471 Fall 2007 Formalizing Types: Rules of Inference We have seen two examples of formal notation specifying parts of a compiler Regular expressions (for the lexer) Context-free grammars (for the parser) The appropriate formalism for type checking is logical rules of inference Rules of Inference Inference rules have the form If Hypothesis is true, then Conclusion is true Type checking computes via reasoning If E 1 and E 2 have certain types, then E 3 has certain type Rules of inference are a compact notation for If-Then statements 20 CS 471 Fall 2007 21 CS 471 Fall 2007 From English to an Inference Rule The notation is easy to read (with practice) Start with a simplified system and gradually add features Building blocks Symbol is and Symbol is if-then x:t is x has type T From English to an Inference Rule (2) If e 1 has type int and e 2 has type int, then e 1 + e 2 has type int (e 1 has type int e 2 has type int) e 1 + e 2 has type int (e 1 : int e 2 : int) e 1 + e 2 : int 22 CS 471 Fall 2007 23 CS 471 Fall 2007 4

From English to an Inference Rule (3) The statement (e 1 : int e 2 : int) e 1 + e 2 : int is a special case of (Hypothesis 1... Hypothesis n ) Conclusion This is an inference rule Notation for Inference Rules By tradition inference rules are written Hypothesis 1 Hypothesis n Conclusion Our type rules have hypotheses and conclusions of the form: e : T means it is provable that... 24 CS 471 Fall 2007 25 CS 471 Fall 2007 Two Rules Example: 1 + 2 i is an integer i : int [int] e 1 : int e 2 : int e 1 + e 2 : int [Add] 1 is an integer 2 is an integer 1 : int 2 : int 1 + 2 : int 26 CS 471 Fall 2007 27 CS 471 Fall 2007 A Few More Rules e : boolean not e : boolean [Not] What s the Point? Provides a concise (several pages vs. several hundred pages) representation of a language Hence, you ll see a LOT of this in a PL course e 1 : int e 2 : int e 1 < e 2 : boolean [Less] e 1 : boolean e 2 : boolean e 1 && e 2 : boolean [And] But that s enough for our purposes 28 CS 471 Fall 2007 29 CS 471 Fall 2007 5

Variables/Identifiers Need an environment that keeps track of types of all identifiers in scope { int i, n = ; for (i=0; i < n; i++) boolean b= i int n int i int n int b boolean? Symbol Tables purpose: keep track of names declared in the program names of variables, functions, classes, types, symbol table entry: associates a name with a set of attributes, e.g.: kind of name (variable, type, function, etc) type (int, float, etc) nesting level size memory location (i.e., where will it be found at runtime) 30 CS 471 Fall 2007 31 CS 471 Fall 2007 Symbol Tables Symbol table (also called environments) Symbol table (static) Environment(dynamic) Can be represented as set of name type pairs (bindings) {a string, b int Functions: Type Lookup(String id) Void Add(String id, Type binding) Environments Represents a set of mappings in the symbol table σ0 function f(a:int, b:int, c:int) = σ1 = σ0 + a int ( print_int(a+c); Lookup let var j := a+b in σ1 σ2 = σ1 + j int var a := hello in print(a); print_int(j) end; σ1 print_int(b) ) σ0 32 CS 471 Fall 2007 33 CS 471 Fall 2007 Imperative vs. Functional Environments Functional style keep all copies of σ0 σ1 σ2 Imperative style modify σ1 until it becomes σ2 - destroys σ1 - can undo σ2 to get back to σ1 again - single environment σ that becomes σ1 σ2 σ3 - latest bindings destroyed, old bindings restored NOTE: Functional/imperative environment management can be used regardless of whether the language is functional imperative or object-oriented Implementation Options Array simple, but linear LookUp time However, we may use a sorted array for reserved words, since they are generally few and known in advance Binary Tree O(log n) lookup time if kept balanced Hash Table most common implementation O(1) LookUp time How would you implement an imperative environment? 34 CS 471 Fall 2007 35 CS 471 Fall 2007 6

Scope Issues One idea is to have a global symbol table and save the scope information for each entry. When an identifier goes out of scope, scan the table and remove the corresponding entries We may even link all same-scope entries together for easier removal Careful: deleting from a hash table that uses open addressing is tricky We must mark a slot as Deleted, rather than Empty, otherwise later LookUp operations may fail Alternatively, we can maintain a separate, local table for each scope. Structure Tables Where should we store struct field names? Separate mini symbol table for each struct Conceptually easy Separate table for all struct field names We need to somehow uniquely map each name to its structure (e.g. by concatenating the field name with the struct name) No special storage struct field names are stored in the regular symbol table Again we need to be able to map each name to its structure 36 CS 471 Fall 2007 37 CS 471 Fall 2007 How symbol tables work (1) How symbol tables work (2) 38 CS 471 Fall 2007 39 CS 471 Fall 2007 How symbol tables work (3) How symbol tables work (4) 40 CS 471 Fall 2007 41 CS 471 Fall 2007 7

How symbol tables work (5) How symbol tables work (6) 42 CS 471 Fall 2007 43 CS 471 Fall 2007 8