Lexical Analysis and Scanning. Honors Compilers, Feb 5th 2001, Robert Dewar


2 The Input
Read string input
  Might be sequence of characters (Unix)
  Might be sequence of lines (VMS)
Character set
  ASCII
  ISO Latin-1
  ISO (16-bit = Unicode)
  Others (EBCDIC, JIS, etc.)

3 The Output
A series of tokens
  Punctuation: ( ) ; , [ ]
  Operators: + - ** :=
  Keywords: begin end if
  Identifiers: Square_Root
  String literals: "hello this is a string"
  Character literals: 'x'
  Numeric literals: 123 4_5.23e+2 16#ac#

4 Free form vs Fixed form
Free form languages
  White space does not matter
    Tabs, spaces, new lines, carriage returns
  Only the ordering of tokens is important
Fixed format languages
  Layout is critical
    Fortran, label in cols 1-5
    COBOL, areas A and B
  Lexical analyzer must worry about layout

5 Punctuation
Typically individual special characters
  Such as + -
  Lexical analyzer does not know : from :
Sometimes double characters
  E.g. (* treated as a kind of bracket
Returned just as identity of token
  And perhaps location
  For error message and debugging purposes

6 Operators
Like punctuation
  No real difference for lexical analyzer
  Typically single or double special chars
    Operators + -
    Operations :=
Returned just as identity of token
  And perhaps location

7 Keywords
Reserved identifiers
  E.g. BEGIN END in Pascal, if in C
Maybe distinguished from identifiers
  E.g. bold mode vs plain mode in Algol-68
Returned just as token identity
  With possible location information
Unreserved keywords (e.g. PL/1)
  Handled as identifiers (parser distinguishes)

8 Identifiers
Rules differ
  Length, allowed characters, separators
Need to build table
  So that junk1 is recognized as junk1
  Typical structure: hash table
Lexical analyzer returns token type
  And key to table entry
  Table entry includes location information

9 More on Identifier Tables
Most common structure is hash table
  With fixed number of headers
  Chain according to hash code
  Serial search on one chain
  Hash code computed from characters
No hash code is perfect!
Avoid any arbitrary limits
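
The identifier table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `NameTable`, the polynomial hash, and the default of 64 headers are hypothetical choices, not taken from the lecture.

```python
class NameTable:
    """Chained hash table with a fixed number of headers (hypothetical sketch)."""

    def __init__(self, num_headers=64):
        self.headers = [[] for _ in range(num_headers)]  # one chain per header
        self.names = []                                   # key -> spelling, grows freely

    def _hash(self, name):
        # Hash code computed from the characters (simple polynomial hash).
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) % len(self.headers)
        return h

    def lookup(self, name):
        """Return the key for name, entering the name on first sight."""
        chain = self.headers[self._hash(name)]
        for key in chain:                  # serial search on one chain
            if self.names[key] == name:
                return key
        key = len(self.names)              # no arbitrary limit on table size
        self.names.append(name)
        chain.append(key)
        return key
```

A scanner would return the token type together with the key from `lookup`, so that every occurrence of junk1 maps to the same table entry.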

10 String Literals
Text must be stored
  Actual characters are important
  Not like identifiers
  Character set issues
Table needed
  Lexical analyzer returns key to table
  May or may not be worth hashing

11 Character Literals
Similar issues to string literals
Lexical analyzer returns
  Token type
  Identity of character
Note: cannot assume character set of host machine, may be different

12 Numeric Literals
Also need a table
Typically record value
  E.g. 123 = 0123 = 01_23 (Ada)
  But cannot use int for values
    Because may have different characteristics
Float stuff much more complex
  Denormals, correct rounding
  Very delicate stuff
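
As a rough illustration of recording the value rather than the spelling, the Python sketch below evaluates Ada-style integer literals. The function name `ada_integer_value` is hypothetical, exponents and real literals are deliberately ignored, and Python's unbounded integers stand in for a value representation that does not depend on the host machine's int.

```python
def ada_integer_value(text):
    """Value of an Ada-style integer literal such as 123, 01_23 or 16#ac#.

    Simplified sketch: no exponent part, no validation of digit placement.
    """
    text = text.replace("_", "")                  # underscores only separate digits
    if "#" in text:
        base_text, digits, _ = text.split("#")    # "16#ac#" -> "16", "ac", ""
        return int(digits, int(base_text))
    return int(text, 10)
```

With this, the three spellings 123, 0123 and 01_23 all yield the same table value.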

13 Handling Comments
Comments have no effect on program
  Can therefore be eliminated by scanner
  But may need to be retrieved by tools
Error detection issues
  E.g. unclosed comments
Scanner does not return comments

14 Case Equivalence
Some languages have case equivalence
  Pascal, Ada
Some do not
  C, Java
Lexical analyzer ignores case if needed
  This_Routine = THIS_RouTine
  Error analysis may need exact casing
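
One possible way to ignore case while still keeping the exact casing for error messages is sketched below in Python. The class `CaseInsensitiveNames` and its fields are hypothetical names used only for this illustration.

```python
class CaseInsensitiveNames:
    """Folds identifiers to lower case but remembers the first spelling seen."""

    def __init__(self):
        self.first_spelling = {}   # folded name -> spelling at first occurrence

    def enter(self, spelling):
        key = spelling.lower()                         # comparison ignores case
        self.first_spelling.setdefault(key, spelling)  # keep exact casing for messages
        return key
```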

15 Issues to Address
Speed
  Lexical analysis can take a lot of time
  Minimize processing per character
  I/O is also an issue (read large blocks)
We compile frequently
  Compilation time is important
  Especially during development

16 General Approach
Define set of token codes
  An enumeration type
  A series of integer definitions
  These are just codes (no semantics)
Some codes associated with data
  E.g. key for identifier table
May be useful to build tree node
  For identifiers, literals etc.
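
A minimal Python sketch of this approach follows. The particular code names and the `Token` record with its `data` and `location` fields are assumptions made for the example; a real scanner would define one code per token of its language.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class TokenCode(Enum):
    """Token codes are just codes, with no semantics attached."""
    IDENTIFIER = auto()
    INTEGER_LITERAL = auto()
    STRING_LITERAL = auto()
    KEYWORD_BEGIN = auto()
    KEYWORD_END = auto()
    ASSIGN = auto()
    SEMICOLON = auto()


@dataclass(frozen=True)
class Token:
    code: TokenCode
    data: Optional[int] = None   # e.g. key into the identifier or literal table
    location: int = 0            # source position, for error messages
```

For example, `Token(TokenCode.IDENTIFIER, data=7, location=120)` carries a table key, while `Token(TokenCode.SEMICOLON)` carries none.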

17 Interface to Lexical Analyzer
Convert entire file to a file of tokens
  Lexical analyzer is separate phase
Parser calls lexical analyzer
  Get next token
  This approach avoids extra I/O
  Parser builds tree as we go along

18 Implementation of Scanner
Given the input text
  Generate the required tokens
  Or provide token by token on demand
Before we describe implementations
  We take this short break
  To describe relevant formalisms

19 Relevant Formalisms
Type 3 (Regular) Grammars
Regular Expressions
Finite State Machines

20 Regular Grammars
Regular grammars
  Non-terminals (arbitrary names)
  Terminals (characters)
  Two forms of rules
    Non-terminal ::= terminal
    Non-terminal ::= terminal Non-terminal
  One non-terminal is the start symbol
Regular (type 3) grammars cannot count
  No concept of matching nested parens

21 Regular Grammars
Regular grammars
  E.g. grammar of reals with no exponent
    REAL ::= 0 REAL1 (repeat for 1 .. 9)
    REAL1 ::= 0 REAL1 (repeat for 1 .. 9)
    REAL1 ::= . INTEGER
    INTEGER ::= 0 INTEGER (repeat for 1 .. 9)
    INTEGER ::= 0 (repeat for 1 .. 9)
  Start symbol is REAL

22 Regular Expressions
Regular expressions (RE) defined by
  Any terminal character is an RE
  Alternation: RE | RE
  Concatenation: RE1 RE2
  Repetition: RE* (zero or more REs)
Languages of REs = languages of type 3 grammars
  Regular expressions are more convenient

23 Specifying REs in Unix Tools
Single characters: a b c d \x
Alternation: [bcd] [b-z] ab|cd
Match any character: .
Match sequence of characters: x* y+
Concatenation: abc[d-q]
Optional: [0-9]+(\.[0-9]*)?
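
These notations can be tried out directly. The sketch below uses Python's standard `re` module, whose syntax agrees with the Unix tools for the constructs listed here; the helper name `matches` is a hypothetical convenience, and the decimal point is escaped because an unescaped "." matches any character.

```python
import re


def matches(pattern, text):
    """True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None


# A few of the forms listed above, checked against sample strings.
EXAMPLES = [
    (r"[bcd]", "c", True),                 # alternation over single characters
    (r"[b-z]", "a", False),                # character range
    (r"ab|cd", "cd", True),                # alternation of sequences
    (r"x*y+", "xxy", True),                # zero or more x, one or more y
    (r"x*y+", "xx", False),                # at least one y is required
    (r"abc[d-q]", "abcm", True),           # concatenation
    (r"[0-9]+(\.[0-9]*)?", "12.", True),   # optional fraction part
    (r"[0-9]+(\.[0-9]*)?", ".5", False),   # leading digits are required
]
```

Looping over `EXAMPLES` and comparing `matches(pattern, text)` with the third field confirms each line.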

24 Finite State Machines
Languages and Automata
  A language is a set of strings
  An automaton is a machine that determines if a given string is in the language or not
FSMs are automata that recognize regular languages (regular expressions)

25 Definitions of FSM
A set of labeled states
Directed arcs labeled with character
A state may be marked as terminal
Transition from state S1 to S2
  If and only if arc from S1 to S2
  Labeled with next character (which is eaten)
Recognized if ends up in terminal state
One state is distinguished start state

26 Building FSM from Grammar
One state for each non-terminal
A rule of the form
  Nont1 ::= terminal
  Generates transition from S1 to final state
A rule of the form
  Nont1 ::= terminal Nont2
  Generates transition from S1 to S2
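
Applied to the grammar of reals given earlier, this construction yields one state per non-terminal plus a final state. The Python sketch below hand-codes that machine; the function name `is_real` and the state name FINAL are assumptions of the example. The two INTEGER rules share a digit, so here the state reached after a fraction digit serves both as INTEGER and as the final state, which keeps the machine deterministic.

```python
DIGITS = "0123456789"


def is_real(text):
    """Recognize reals with no exponent, e.g. 12.5, following the grammar's states."""
    state = "REAL"                                   # start symbol
    for ch in text:
        if state == "REAL" and ch in DIGITS:         # REAL ::= digit REAL1
            state = "REAL1"
        elif state == "REAL1" and ch in DIGITS:      # REAL1 ::= digit REAL1
            state = "REAL1"
        elif state == "REAL1" and ch == ".":         # REAL1 ::= . INTEGER
            state = "INTEGER"
        elif state in ("INTEGER", "FINAL") and ch in DIGITS:
            state = "FINAL"                          # INTEGER ::= digit [INTEGER]
        else:
            return False                             # no arc for this character
    return state == "FINAL"                          # must end in the terminal state
```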

27 Building FSMs from REs
Every RE corresponds to a grammar
For all regular expressions
  A natural translation to FSM exists
We will not give details of algorithm here

28 Non-Deterministic FSM
A non-deterministic FSM
  Has at least one state
  With two arcs to two separate states
  Labeled with the same character
Which way to go?
  Implementation requires backtracking
  Nasty

29 Deterministic FSM
For all states S
  For all characters C
    There is either ONE or NO arc
    From state S
    Labeled with character C
Much easier to implement
  No backtracking

30 Dealing with ND FSM
Construction naturally leads to ND FSM
For example, consider FSM for
  [0-9]+ | [0-9]+\.[0-9]+ (integer or real)
We will naturally get a start state
  With two sets of branches
  And thus non-deterministic

31 Converting to Deterministic
There is an algorithm for converting
  From any ND FSM
  To an equivalent deterministic FSM
Algorithm is in the text book
Example (given in terms of REs)
  [0-9]+ | [0-9]+\.[0-9]+
  becomes
  [0-9]+(\.[0-9]+)?
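
The slides leave the algorithm to the text book. For orientation only, here is a Python sketch of the standard subset construction (without epsilon moves), applied to an ND FSM for the integer-or-real example. The function names and the state names S, I, R1, R2, R3 are hypothetical.

```python
def nfa_to_dfa(nfa, start, finals):
    """Subset construction. nfa maps (state, char) -> set of states; no epsilon moves."""
    start_set = frozenset([start])
    dfa, dfa_finals = {}, set()
    todo, seen = [start_set], {start_set}
    while todo:
        current = todo.pop()
        if current & finals:                  # contains a final ND state
            dfa_finals.add(current)
        chars = {c for (s, c) in nfa if s in current}
        for c in chars:
            target = frozenset(t for s in current for t in nfa.get((s, c), ()))
            dfa[(current, c)] = target        # exactly one arc per (state, char)
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa, start_set, dfa_finals


def accepts(dfa, start, finals, text):
    state = start
    for ch in text:
        state = dfa.get((state, ch))
        if state is None:                     # no arc: reject
            return False
    return state in finals


def number_nfa():
    """ND FSM for [0-9]+ | [0-9]+\\.[0-9]+ with two branches out of the start state."""
    nfa = {}
    for d in "0123456789":
        nfa[("S", d)] = {"I", "R1"}           # same character, two target states
        nfa[("I", d)] = {"I"}
        nfa[("R1", d)] = {"R1"}
        nfa[("R2", d)] = {"R3"}
        nfa[("R3", d)] = {"R3"}
    nfa[("R1", ".")] = {"R2"}
    return nfa, "S", {"I", "R3"}
```

For this example the resulting deterministic machine has four states, matching the shape of [0-9]+(\.[0-9]+)?.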

32 Implementing the Scanner
Three methods
  Completely informal, just write code
  Define tokens using regular expressions
    Convert REs to ND finite state machine
    Convert ND FSM to deterministic FSM
    Program the FSM
  Use an automated program
    To achieve above three steps

33 Ad Hoc Code (forget FSMs)
Write normal hand code
  A procedure called Scan
  Normal coding techniques
Basically scan over white space and comments till non-blank character found
Base subsequent processing on character
  E.g. colon may be : or :=
  / may be operator or start of comment
Return token found
Write aggressive efficient code
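
A small Python sketch of such a hand-written Scan routine is given below. It is not the lecture's code: the token kinds are hypothetical, only a few token classes are handled, and Ada-style `--` comments are assumed (so it is `-` rather than `/` that may start a comment).

```python
def scan(src, pos):
    """Return (kind, text, next_pos) for the token starting at or after pos."""
    n = len(src)
    # Scan over white space and comments until a non-blank character is found.
    while pos < n:
        if src[pos] in " \t\r\n":
            pos += 1
        elif src.startswith("--", pos):           # comment runs to end of line
            while pos < n and src[pos] != "\n":
                pos += 1
        else:
            break
    if pos == n:
        return ("eof", "", pos)
    start, ch = pos, src[pos]
    # Base subsequent processing on the first character.
    if ch.isalpha():
        while pos < n and (src[pos].isalnum() or src[pos] == "_"):
            pos += 1
        return ("identifier", src[start:pos], pos)
    if ch.isdigit():
        while pos < n and src[pos].isdigit():
            pos += 1
        return ("integer", src[start:pos], pos)
    if ch == ":":                                  # colon may be : or :=
        if src.startswith(":=", pos):
            return ("assign", ":=", pos + 2)
        return ("colon", ":", pos + 1)
    return ("punct", ch, pos + 1)


def tokens(src):
    """Collect all tokens of src by calling scan repeatedly."""
    pos, out = 0, []
    while True:
        kind, text, pos = scan(src, pos)
        if kind == "eof":
            return out
        out.append((kind, text))
```

A parser would normally call `scan` on demand; `tokens` is only a convenience for trying the routine on a whole string.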

34 Using FSM Formalisms
Start with regular grammar or RE
  Typically found in the language standard
For example, for Ada: Chapter 2. Lexical Elements
  digit ::= 0 | 1 | ... | 9
  decimal-literal ::= integer [.integer] [exponent]
  integer ::= digit {[underline] digit}
  exponent ::= E [+] integer | E - integer

35 Using FSM Formalisms, cont.
Given REs or grammar
  Convert to finite state machine
  Convert ND FSM to deterministic FSM
Write a program to recognize
  Using the deterministic FSM

36 Implementing FSM (Method 1)
Each state is code of the form:

    <<state1>>
       case Next_Character is
          when 'a' => goto state3;
          when 'b' => goto state1;
          when others => End_of_token_processing;
       end case;
    <<state2>>
       ...

37 Implementing FSM (Method 2)
There is a variable called State

    loop
       case State is
          when state1 =>
             case Next_Character is
                when 'a' => State := state3;
                when 'b' => State := state1;
                when others => End_token_processing;
             end case;
          when state2 => ...
       end case;
    end loop;

38 Implementing FSM (Method 3)

    T : array (State, Character) of State;

    while More_Input loop
       Curstate := T (Curstate, Next_Char);
       if Curstate = Error_State then ...
    end loop;
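
Method 3 can be made concrete with a small table-driven recognizer. The Python sketch below is an illustration for [0-9]+(\.[0-9]+)? only; the state names, the 128-column ASCII table, and the function `longest_token` are assumptions of the example. It also remembers the last accepting position, which is one common way to return the longest token when the machine runs into the error state.

```python
ERROR = -1
START, INT, DOT, FRAC = 0, 1, 2, 3
FINAL = {INT, FRAC}


def build_table():
    """T[state][character code] -> next state, for [0-9]+(\\.[0-9]+)?"""
    table = [[ERROR] * 128 for _ in range(4)]
    for d in "0123456789":
        table[START][ord(d)] = INT
        table[INT][ord(d)] = INT
        table[DOT][ord(d)] = FRAC
        table[FRAC][ord(d)] = FRAC
    table[INT][ord(".")] = DOT
    return table


T = build_table()


def longest_token(text, pos=0):
    """Length of the longest number starting at pos, or 0 if there is none."""
    state, last_accept, i = START, 0, pos
    while i < len(text):
        code = ord(text[i])
        state = T[state][code] if code < 128 else ERROR
        if state == ERROR:
            break
        i += 1
        if state in FINAL:
            last_accept = i - pos      # remember the most recent accepting point
    return last_accept
```

For the input 12.x the machine consumes the dot, fails on x, and falls back to the two characters accepted so far.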

39 Automatic FSM Generation
Our example, FLEX
  See home page for manual in HTML
FLEX is given
  A set of regular expressions
  Actions associated with each RE
It builds a scanner
  Which matches REs and executes actions

40 Flex General Format
Input to Flex is a set of rules:
  Regexp   actions (C statements)
  Regexp   actions (C statements)
Flex scans the longest matching Regexp
  And executes the corresponding actions

41 An Example of a Flex scanner

    DIGIT [0-9]
    ID    [a-z][a-z0-9]*
    %%
    {DIGIT}+ {
        printf("an integer %s (%d)\n", yytext, atoi(yytext));
    }
    {DIGIT}+"."{DIGIT}* {
        printf("a float %s (%g)\n", yytext, atof(yytext));
    }
    if|then|begin|end|procedure|function {
        printf("a keyword: %s\n", yytext);
    }

42 Flex Example (continued)

    {ID}            printf("an identifier %s\n", yytext);
    "+"|"-"|"*"|"/" { printf("an operator %s\n", yytext); }
    "--".*\n        /* eat Ada style comment */
    [ \t\n]+        /* eat white space */
    .               printf("unrecognized character");
    %%

43 Assembling the flex program

    %{
    #include <stdio.h>
    #include <stdlib.h> /* for atoi and atof */
    %}
    <<flex text we gave goes here>>
    %%
    int main(int argc, char **argv)
    {
        yyin = fopen(argv[1], "r");
        yylex();
        return 0;
    }
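
To see the longest-match rule at work without running flex, the rules of the example can be imitated in Python. This is a sketch, not flex itself: `RULES` and `lex` are hypothetical names, the patterns are rewritten in Python `re` syntax, and the tie-break in favour of the earlier rule follows flex's documented behaviour.

```python
import re

# Rules in the same order as the flex example; "skip" rules produce no token.
RULES = [
    ("integer",    re.compile(r"[0-9]+")),
    ("float",      re.compile(r"[0-9]+\.[0-9]*")),
    ("keyword",    re.compile(r"if|then|begin|end|procedure|function")),
    ("identifier", re.compile(r"[a-z][a-z0-9]*")),
    ("operator",   re.compile(r"[+\-*/]")),
    ("skip",       re.compile(r"--.*\n")),       # Ada style comment
    ("skip",       re.compile(r"[ \t\n]+")),     # white space
]


def lex(text):
    """Return (kind, text) pairs, always taking the longest matching rule."""
    pos, out = 0, []
    while pos < len(text):
        best_kind, best_len = "unrecognized", 0
        for kind, pattern in RULES:
            m = pattern.match(text, pos)
            # Strictly longer wins, so on a tie the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best_kind, best_len = kind, m.end() - pos
        if best_len == 0:
            best_len = 1                          # the catch-all "." rule
        if best_kind != "skip":
            out.append((best_kind, text[pos:pos + best_len]))
        pos += best_len
    return out
```

Note how `if` is reported as a keyword (tie with the identifier rule, earlier rule wins) while `ifx` is an identifier (longer match wins), and how `--` starts a comment rather than being read as two operators.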

44 Running flex
flex is a program that is executed
  The input is as we have given
  The output is a running C program
For Ada fans
  Look at aflex
For C++ fans
  flex can run in C++ mode
  Generates appropriate classes

45 Choice Between Methods?
Hand written scanners
  Typically much faster execution
  And pretty easy to write
  And easier for good error recovery
Flex approach
  Simple to use
  Easy to modify token language

46 The GNAT Scanner
Hand written (scn.adb/scn.ads)
Basically a call does
  Super quick scan past blanks/comments etc.
  Big case statement
  Process based on first character
  Call special routines
    Namet.Get_Name for identifier (hashing)
    Keywords recognized by special hash
    Strings (stringt.ads)
    Integers (uintp.ads)
    Reals (ureal.ads)

47 More on the GNAT Scanner
Entire source read into memory
  Single contiguous block
Source location is index into this block
  Different index range for each source file
See sinput.adb/ads for source mgmt
See scans.ads for definitions of tokens

48 More on GNAT Scanner
Read scn.adb code
  Very easy reading, e.g.

49 ASSIGNMENT TWO
Write a flex or aflex program
  Recognize tokens of an Algol-68s program
  Print out tokens in style of flex example
Extra credit
  Build hash table for identifiers
  Output hash table key

50 Preprocessors
Some languages allow preprocessing
  This is a separate step
  Input is source
  Output is expanded source
Can either be done as separate phase
  Or embedded into the lexical analyzer
  Often done as separate phase
  Need to keep track of source locations

51 Nasty Glitches
Separation of tokens
  Not all languages have clear rules
  FORTRAN has optional spaces
    DO10I=1.6
      identifier operator literal
      DO10I = 1.6
    DO10I=1,6
      keyword stmt loopvar operator literal punc literal
      DO 10 I = 1 , 6
Modern languages avoid this kind of thing!
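
The FORTRAN case shows that the scanner cannot decide what DO10I is until it has looked further along the statement. A rough Python sketch of that lookahead follows. The names `classify` and `has_top_level_comma` are hypothetical, and the check is heavily simplified: it ignores character constants and treats any top-level comma after the = as the mark of a DO loop.

```python
import re

# Shape of a possible DO statement once blanks are removed: DO label var = rest
DO_SHAPE = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=(.*)")


def has_top_level_comma(text):
    """True if text contains a comma outside any parentheses."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False


def classify(stmt):
    """Decide between a DO loop and an assignment by looking ahead for a comma."""
    text = stmt.replace(" ", "").upper()      # blanks are not significant
    m = DO_SHAPE.fullmatch(text)
    if m and has_top_level_comma(m.group(3)):
        return ("do-loop", m.group(1), m.group(2))
    return ("assignment",)
```

With this, `DO10I=1.6` is classified as an assignment to the identifier DO10I, while `DO10I=1,6` and `DO 10 I = 1, 6` are both classified as a loop with label 10 and loop variable I.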


More information

Ed. v1.0 PROGRAMMING LANGUAGES WORKING PAPER DRAFT PROGRAMMING LANGUAGES. Ed. v1.0

Ed. v1.0 PROGRAMMING LANGUAGES WORKING PAPER DRAFT PROGRAMMING LANGUAGES. Ed. v1.0 i PROGRAMMING LANGUAGES ii Copyright 2011 Juhász István iii COLLABORATORS TITLE : PROGRAMMING LANGUAGES ACTION NAME DATE SIGNATURE WRITTEN BY István Juhász 2012. március 26. Reviewed by Ágnes Korotij 2012.

More information

Lecture Set 2: Starting Java

Lecture Set 2: Starting Java Lecture Set 2: Starting Java 1. Java Concepts 2. Java Programming Basics 3. User output 4. Variables and types 5. Expressions 6. User input 7. Uninitialized Variables CMSC 131 - Lecture Outlines - set

More information

C Programming Dr. Hasan Demirel

C Programming Dr. Hasan Demirel C How to Program, H. M. Deitel and P. J. Deitel, Prentice Hall, 5 th edition (3 rd edition or above is also OK). Introduction to C Programming Dr. Hasan Demirel Programming Languages There are three types

More information

INTRODUCTION TO FLOWCHARTING

INTRODUCTION TO FLOWCHARTING CHAPTER 1 INTRODUCTION TO FLOWCHARTING 1.0 Objectives 1.1 Introduction 1.2 Flowcharts 1.3 Types of Flowcharts 1.3.1 Types of flowchart 1.3.2 System flowcharts 1.4 Flowchart Symbols 1.5 Advantages of Flowcharts

More information

Project 2: Bejeweled

Project 2: Bejeweled Project 2: Bejeweled Project Objective: Post: Tuesday March 26, 2013. Due: 11:59PM, Monday April 15, 2013 1. master the process of completing a programming project in UNIX. 2. get familiar with command

More information

Characters and Strings. Constants

Characters and Strings. Constants Characters and Strings Constants Characters are the fundamental building blocks of source programs Character constants One character surrounded by single quotes A or? Actually an int value represented

More information

Flex/Bison Tutorial. Aaron Myles Landwehr aron+ta@udel.edu CAPSL 2/17/2012

Flex/Bison Tutorial. Aaron Myles Landwehr aron+ta@udel.edu CAPSL 2/17/2012 Flex/Bison Tutorial Aaron Myles Landwehr aron+ta@udel.edu 1 GENERAL COMPILER OVERVIEW 2 Compiler Overview Frontend Middle-end Backend Lexer / Scanner Parser Semantic Analyzer Optimizers Code Generator

More information

Informatica e Sistemi in Tempo Reale

Informatica e Sistemi in Tempo Reale Informatica e Sistemi in Tempo Reale Introduction to C programming Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa October 25, 2010 G. Lipari (Scuola Superiore Sant Anna)

More information

1 (1x17 =17 points) 2 (21 points) 3 (5 points) 4 (3 points) 5 (4 points) Total ( 50points) Page 1

1 (1x17 =17 points) 2 (21 points) 3 (5 points) 4 (3 points) 5 (4 points) Total ( 50points) Page 1 CS 1621 MIDTERM EXAM 1 Name: Problem 1 (1x17 =17 points) 2 (21 points) 3 (5 points) 4 (3 points) 5 (4 points) Total ( 50points) Score Page 1 1. (1 x 17 = 17 points) Determine if each of the following statements

More information

Static vs. Dynamic. Lecture 10: Static Semantics Overview 1. Typical Semantic Errors: Java, C++ Typical Tasks of the Semantic Analyzer

Static vs. Dynamic. Lecture 10: Static Semantics Overview 1. Typical Semantic Errors: Java, C++ Typical Tasks of the Semantic Analyzer Lecture 10: Static Semantics Overview 1 Lexical analysis Produces tokens Detects & eliminates illegal tokens Parsing Produces trees Detects & eliminates ill-formed parse trees Static semantic analysis

More information

Model Paper Computer Science Objective. Paper Code Time Allowed: 20 minutes

Model Paper Computer Science Objective. Paper Code Time Allowed: 20 minutes Note: This is Model Paper for guidance of students & teachers. Q. Model Paper Computer Science Objective Intermediate Part II ( th Class) Examination Session -4 and onward Total marks: 7 Paper Code Time

More information

3) Some coders debug their programs by placing comment symbols on some codes instead of deleting it. How does this aid in debugging?

3) Some coders debug their programs by placing comment symbols on some codes instead of deleting it. How does this aid in debugging? Freshers Club Important 100 C Interview Questions & Answers 1) How do you construct an increment statement or decrement statement in C? There are actually two ways you can do this. One is to use the increment

More information

C Compiler Targeting the Java Virtual Machine

C Compiler Targeting the Java Virtual Machine C Compiler Targeting the Java Virtual Machine Jack Pien Senior Honors Thesis (Advisor: Javed A. Aslam) Dartmouth College Computer Science Technical Report PCS-TR98-334 May 30, 1998 Abstract One of the

More information

CS 106 Introduction to Computer Science I

CS 106 Introduction to Computer Science I CS 106 Introduction to Computer Science I 01 / 21 / 2014 Instructor: Michael Eckmann Today s Topics Introduction Homework assignment Review the syllabus Review the policies on academic dishonesty and improper

More information

Problem Solving With C++ Ninth Edition

Problem Solving With C++ Ninth Edition CISC 1600/1610 Computer Science I Programming in C++ Professor Daniel Leeds dleeds@fordham.edu JMH 328A Introduction to programming with C++ Learn Fundamental programming concepts Key techniques Basic

More information

First Java Programs. V. Paúl Pauca. CSC 111D Fall, 2015. Department of Computer Science Wake Forest University. Introduction to Computer Science

First Java Programs. V. Paúl Pauca. CSC 111D Fall, 2015. Department of Computer Science Wake Forest University. Introduction to Computer Science First Java Programs V. Paúl Pauca Department of Computer Science Wake Forest University CSC 111D Fall, 2015 Hello World revisited / 8/23/15 The f i r s t o b l i g a t o r y Java program @author Paul Pauca

More information

Automata and Languages

Automata and Languages Automata and Languages Computational Models: An idealized mathematical model of a computer. A computational model may be accurate in some ways but not in others. We shall be defining increasingly powerful

More information

Chapter 1 Java Program Design and Development

Chapter 1 Java Program Design and Development presentation slides for JAVA, JAVA, JAVA Object-Oriented Problem Solving Third Edition Ralph Morelli Ralph Walde Trinity College Hartford, CT published by Prentice Hall Java, Java, Java Object Oriented

More information

Regular Expressions. Languages. Recall. A language is a set of strings made up of symbols from a given alphabet. Computer Science Theory 2

Regular Expressions. Languages. Recall. A language is a set of strings made up of symbols from a given alphabet. Computer Science Theory 2 Regular Expressions Languages Recall. A language is a set of strings made up of symbols from a given alphabet. Computer Science Theory 2 1 String Recognition Machine Given a string and a definition of

More information

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 7 Scanner Parser Project Wednesday, September 7 DUE: Wednesday, September 21 This

More information

Chapter 2 Assemblers -- Basic Assembler Functions

Chapter 2 Assemblers -- Basic Assembler Functions Chapter 2 Assemblers -- Basic Assembler Functions Outline Basic assembler functions A simple SIC assembler Assembler algorithm and data structure Basic assembler functions Translating mnemonic operation

More information

Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment

Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment Programming Assignment II Due Date: See online CISC 672 schedule Individual Assignment 1 Overview Programming assignments II V will direct you to design and build a compiler for Cool. Each assignment will

More information

Compiler Compiler: Flex and Bison

Compiler Compiler: Flex and Bison Compiler Compiler: Flex and Bison 1 Today s goal Learn how to use lexical scanner Flex and context parser Bison. Learn how to write Flex grammar files. Learn how to write Bison grammar files. Learn how

More information

CS106A, Stanford Handout #38. Strings and Chars

CS106A, Stanford Handout #38. Strings and Chars CS106A, Stanford Handout #38 Fall, 2004-05 Nick Parlante Strings and Chars The char type (pronounced "car") represents a single character. A char literal value can be written in the code using single quotes

More information

Strings in C++ and Java. Questions:

Strings in C++ and Java. Questions: Strings in C++ and Java Questions: 1 1. What kind of access control is achieved by the access control modifier protected? 2 2. There is a slight difference between how protected works in C++ and how it

More information

C++ Programming: From Problem Analysis to Program Design, Fifth Edition. Chapter 2: Basic Elements of C++

C++ Programming: From Problem Analysis to Program Design, Fifth Edition. Chapter 2: Basic Elements of C++ C++ Programming: From Problem Analysis to Program Design, Fifth Edition Chapter 2: Basic Elements of C++ Objectives In this chapter, you will: Become familiar with the basic components of a C++ program,

More information

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program. Name: Class: Date: Exam #1 - Prep True/False Indicate whether the statement is true or false. 1. Programming is the process of writing a computer program in a language that the computer can respond to

More information

C A short introduction

C A short introduction About these lectures C A short introduction Stefan Johansson Department of Computing Science Umeå University Objectives Give a short introduction to C and the C programming environment in Linux/Unix Go

More information

The C Programming Language course syllabus associate level

The C Programming Language course syllabus associate level TECHNOLOGIES The C Programming Language course syllabus associate level Course description The course fully covers the basics of programming in the C programming language and demonstrates fundamental programming

More information

Optimization Techniques in C. Team Emertxe

Optimization Techniques in C. Team Emertxe Optimization Techniques in C Team Emertxe Optimization Techniques Basic Concepts Programming Algorithm and Techniques Optimization Techniques Basic Concepts What is Optimization Methods Space and Time

More information

C programming. Intro to syntax & basic operations

C programming. Intro to syntax & basic operations C programming Intro to syntax & basic operations Example 1: simple calculation with I/O Program, line by line Line 1: preprocessor directive; used to incorporate code from existing library not actually

More information

1.3 Data Representation

1.3 Data Representation 8628-28 r4 vs.fm Page 9 Thursday, January 2, 2 2:4 PM.3 Data Representation 9 appears at Level 3, uses short mnemonics such as ADD, SUB, and MOV, which are easily translated to the ISA level. Assembly

More information

SANKALCHAND PATEL COLLEGE OF ENGINEERING, VISNAGAR ODD/EVEN ACADEMICSEMESTER (2014-15) ASSIGNMENT / QUESTION BANK (2110003) [F.Y.B.E.

SANKALCHAND PATEL COLLEGE OF ENGINEERING, VISNAGAR ODD/EVEN ACADEMICSEMESTER (2014-15) ASSIGNMENT / QUESTION BANK (2110003) [F.Y.B.E. SANKALCHAND PATEL COLLEGE OF ENGINEERING, VISNAGAR ODD/EVEN ACADEMICSEMESTER (2014-15) ASSIGNMENT / QUESTION BANK Subject: Computer Programming and Utilization (2110003) [F.Y.B.E.: ALL BRANCHES] Unit 1

More information

How to Write a Simple Makefile

How to Write a Simple Makefile Chapter 1 CHAPTER 1 How to Write a Simple Makefile The mechanics of programming usually follow a fairly simple routine of editing source files, compiling the source into an executable form, and debugging

More information

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages Introduction Compiler esign CSE 504 1 Overview 2 3 Phases of Translation ast modifled: Mon Jan 28 2013 at 17:19:57 EST Version: 1.5 23:45:54 2013/01/28 Compiled at 11:48 on 2015/01/28 Compiler esign Introduction

More information