Recall Compiler Structure
- Barnaby Farmer
- 8 years ago
Scanning - Outline

2. Scanning
- The basics
- Ad-hoc scanning
- FSM based techniques
- A Lexical Analysis tool - Lex (a scanner generator)

Recall Compiler Structure

Source Code
- Front End: Lexical Analysis (Scanning), Syntax Analysis (Parsing), Semantic Analysis
- Back End: Machine-Independent Optimization, Code Generation, Machine-Dependent Optimization
Machine Code
Compiler Structure - Another View

Input Language → Lexical Analyzer → (Lexemes or Tokens) → Syntax Analyzer → (Phrase Structure) → Code Generator → Output Language

Lexical Analysis - What is it?

- The input to a compiler/interpreter is a source program, which arrives as a sequence (stream) of characters and is therefore essentially unstructured.
- Processing individual characters is tedious and highly inefficient.
- As such, the first thing we have to do is add some basic structure to the source code.
Lexical Analysis - What is it?

- A Lexical Analyzer (a.k.a. scanner) converts a stream of characters into a stream of tokens, i.e. it tokenizes the input.
- This is a many:1 transformation, and thus later phases of compilation only need to deal with comparatively few tokens.
- A token (a.k.a. lexeme or syntactic unit) is a fundamental component of a program.

Lexical Analysis - Tokens

- Tokens are typically the bottom level entities in syntax diagrams.
- Typical tokens include:
  - identifiers (e.g. variable names, etc.)
  - keywords
  - operators
  - literals (i.e. constant values)
  - punctuation
- Consider a simple program and its tokens:
Lexical Analysis - Tokens

Source Code (crlf marks the end of each line):
    PROGRAM test crlf
    VAR x INTEGER crlf
    BEGIN crlf
    x x crlf
    END { test }

Tokens (grouped by source line):
    keyword PROGRAM, ident test
    keyword VAR, ident x, punct, keyword INTEGER, punct
    keyword BEGIN
    ident x, operator, ident x, operator, literal, punct
    keyword END, punct

Other Scanner Functions

- A scanner also removes white space from a program.
  - White space consists of spaces, tabs, carriage returns, comments, and the like: anything put into the source code solely for readability that does not affect the functional specification provided by the program.
- Some scanners also enter symbols in the symbol table (more later).
Ad-hoc Scanning

- There are many applications outside of compiler construction that require simple scanning functions, e.g. recognizing numeric values in financial and other applications.
- These applications either implement their own recognition functions or rely on library routines or language-based pattern matching to provide the needed functionality.

Ad-hoc Scanning

- Manual recognition of tokens involves a multitude of IF, WHILE, and SWITCH statements.
- This approach is ugly, extremely tedious, highly error prone, and difficult to understand, maintain, and extend.
- Using existing routines for pattern matching is a significant improvement.
Ad-hoc Scanning

- In many cases (e.g. the C language) these facilities are provided by library routines.
  - #include <string.h>: index, strlen, strcat, etc.
- In other cases (e.g. some variants of Pascal) they are incorporated into the language.
  - substring functions, sets, etc.
  - or consider the language Perl!
- Both of these reflect the prevalence and importance of such functionality.

Ad-hoc Scanning

- Anyone who has had to do a significant amount of such scanning/pattern matching knows how awkward it is.
  - Data verification and the processing of command line arguments are other examples.
- Scanning in a compiler/interpreter is typically far worse.
  - Even simple languages have complex lexemes.
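As an illustration of the library-routine style of ad-hoc scanning described above, here is a minimal sketch in C that recognizes a decimal integer with the standard `strtol` routine. The function name `parse_long` is hypothetical and not taken from the slides; the choice to reject trailing characters is likewise an assumption about what "recognizing a numeric value" should mean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accepts text only if all of it is one decimal integer. */
bool parse_long(const char *text, long *out)
{
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);

    /* Reject: no digits consumed, trailing characters, or out-of-range value. */
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;

    *out = value;
    return true;
}
```

Even this small case needs three separate checks, which hints at why hand-written recognition of richer lexemes becomes awkward quickly.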
Grammars

- A generative grammar is a set of rules to generate valid phrases in a particular language.
- Grammar G = {V, T, P, S}:
  - V: finite set of nonterminals or variables
  - T: finite set of terminals or tokens
  - P: finite set of productions
  - S: a nonterminal called the start symbol
- Noam Chomsky defined classes of complexity of generative grammars.
- The hierarchy of four classes, each of which properly contains the next, is called the Chomsky hierarchy.

Grammars - Chomsky hierarchy

- Type 0: Unrestricted Grammars
- Type 1: Context-Sensitive Grammars (CSGs)
- Type 2: Context-Free Grammars (CFGs)
- Type 3: Regular Grammars (RGs)
Unrestricted Grammars

- This type of grammar is too complex for programming languages: efficient parsers cannot be constructed for it.
- This grammar consists of productions of the form α → β.

Context-Sensitive Grammars

- Most computer languages fall into this class of grammars.
- The productions in this class are of the form α₁Aα₂ → α₁βα₂.
  - A becomes β in the context of α₁ and α₂.
  - In general these grammars are still too complex for efficient computer analysis.
- The context sensitivity of programming languages is handled by other means, so that context free grammars can be used for programming languages.
Context Free Grammars

- A production of a context free grammar (CFG) is of the form A → α, where A is a variable and α is a string of symbols.
- In CFGs, the derivations on variables are independent of what surrounds them.
- To generate phrases in the language, strings of terminals are derived by repeated expansion of non-terminals.
- CFGs permit the construction of efficient syntax analyzers.

Context Free Grammars

Example (productions of the grammar):
    <S> → a <A> b
    <A> → <B> c
    <B> → d

The language generated by the above grammar is the single string adcb.
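As a worked check of the example, the derivation below expands one nonterminal per step. The symbol ⇒ ("derives in one step") is standard notation but is an assumption here, since the slides do not show it.

```latex
\[
\langle S\rangle \;\Rightarrow\; a\,\langle A\rangle\,b
\;\Rightarrow\; a\,\langle B\rangle\,c\,b
\;\Rightarrow\; a\,d\,c\,b
\]
```

Because each nonterminal has exactly one production, no other derivation exists, so adcb is the only string in the language.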
Regular Grammars

- A CFG is a regular grammar (RG) if all of its productions have one of the following two forms:
  - Right linear: A → ωB or A → ω, where A, B are non-terminals and ω is a string of terminals (possibly empty)
  - Left linear: A → Bω or A → ω, where A, B are non-terminals and ω is a string of terminals (possibly empty)
- RGs are too restrictive for most purposes.
- Very efficient parsers can be built for them.

Regular Grammars

- The reason for the efficiency is that language generation from an RG can be performed without remembering our current position in the production that is currently being expanded.
- This lack of memory makes RGs incapable of generating languages with arbitrarily nested structures.
- In compilers, RGs are used to describe words, and CFGs are used to describe phrases constructed from these words.
Regular Expressions (REs)

- Regular expressions are a simplified form of grammar used to represent RGs.
- ε (epsilon, the empty string) is a regular expression that matches the empty string.
- A symbol (terminal) s in the language is an RE that matches s.
- If R is an RE, (R)* matches zero or more occurrences of the pattern R; this is known as the closure of R.
- If R is an RE, (R)+ matches one or more occurrences of the pattern R.

Regular Expressions (REs)

- If R and S are REs, (R) | (S) matches either the pattern R or the pattern S (alternation).
- If R and S are REs, (R)(S) matches the catenation of pattern R followed by pattern S.
- Example:
    <int> ::= (0 | 1 | ... | 9)+
    <int_no_leading_zero> ::= (1 | 2 | ... | 9) (0 | 1 | ... | 9)*
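The two example patterns can be tried out with the POSIX regular expression routines in `<regex.h>`. This is a sketch under assumptions: the helper names (`matches`, `is_int`, `is_int_no_leading_zero`) are hypothetical, the patterns are written in POSIX extended syntax with character classes standing in for the digit alternations, and `^`/`$` anchors are added so the whole string must match.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when all of `text` matches the extended RE `pattern`. */
bool matches(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                      /* invalid pattern: report no match */
    bool ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}

/* <int>: one or more digits. */
bool is_int(const char *s)
{
    return matches("^[0-9]+$", s);
}

/* <int_no_leading_zero>: a nonzero digit followed by zero or more digits. */
bool is_int_no_leading_zero(const char *s)
{
    return matches("^[1-9][0-9]*$", s);
}
```

Note that, as written in the example, the no-leading-zero pattern also rejects the single digit 0.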
Better Scanning Techniques

- This has motivated the development of both techniques and tools for doing scanning.
- The most common of these are based on what are known as finite state machines (FSMs), which recognize regular languages.
- The key to being able to do this is the existence of certain restrictions placed on the format of programming languages.
  - E.g. tokens are usually separated by delimiters.

FSM-based Scanning

- The most common techniques used for building scanners are based on finite state machines (FSMs).
- FSMs can easily be used to recognize language constructs (tokens) which are described by regular languages.
Regular Languages Revisited

- A regular language is one that can be described by regular expressions.
- A regular expression consists of simple, atomic elements combined using only three operations: catenation, alternation, and repetition.

Regular Languages Revisited

- Catenation (a.k.a. concatenation or sequencing) is represented by physical adjacency.
  - E.g. the regular expression <letter> <digit> simply represents (depending on the definition of letter and digit) a sequence composed of a letter followed by a digit.
  - We would use the ::= (equivalence) operator to associate a definition with <letter> or <digit>.
Regular Languages Revisited

- Alternation allows selection from a number of choices and is commonly represented by the | operator.
  - E.g. <digit> ::= 0 | 1 | ... | 9
- Certain shorthand forms are also commonly used with alternation (especially ellipses).
  - E.g. <alpha> ::= a | b | ... | z | A | B | ... | Z

Regular Languages Revisited

- Finally, repetition permits the expression of constructs which are to be repeated some number of times.
- There are two operators used for this purpose: superscript + and superscript *.
  - E.g. <word> ::= <letter>+
  - This implies 1 or more letters (* would imply 0 or more letters).
Regular Languages Revisited

- Finally, parentheses ( '(' and ')' ) are used for grouping regular expressions.
- Normally, the repetition operators have the highest precedence, followed by catenation and then alternation.
- These 3 simple operations permit us to easily express the tokens that occur in existing programming languages.

Regular Languages Revisited

- Consider the following regular expressions for a few common tokens and token types we might encounter (note the use of quotes):
    <assignop> ::= ':='
    <alphanum> ::= <alpha> | <digit>
    <ident> ::= (<alpha> | '_' | '$') <alphanum>*
    <intconst> ::= <digit>+
- Not everything is this simple to specify.
Regular Languages Revisited

- For this reason, there are a couple of other shortcuts that make life easier.
- These are notational conveniences only and can easily be represented using the basic constructs.
- Logical Negation ( ^ or ~ )
  - commonly used with other constructs
    <comment> ::= '{' (~'}')* '}'

Regular Languages Revisited

- ~a implies anything in U-{a}, where U is the full set of symbols.
- Negation can be done simply by enumerating everything in U-{a}.
  - E.g. if U = {a, b, c, d, e} then we could write (~a)* or, alternatively, (b | c | d | e)*.
- Optional Constructs
  - Sometimes it becomes tedious to list a number of similar options, which could be more conveniently expressed by saying some constructs are optional.
Regular Languages Revisited

- The most common notation for an optional construct is the use of square brackets.
  - E.g. <signedintconst> ::= ['+' | '-'] <intconst>
- The preceding example is equivalent to the following:
    <signedintconst> ::= <intconst> | '+' <intconst> | '-' <intconst>
- If we could specify the number of times a repetition could take place, we could do it another way too.

Regular Languages Revisited

- Consider:
    <signedintconst> ::= ('+' | '-')^0..1 <intconst>
- The superscript 0..1 is intended to imply that repetition can take place at most once (0 or 1 times).
- This illustrates yet another possible construct which, like the others, may be expressed using only catenation, alternation, and repetition, albeit more verbosely.
Regular Languages Revisited

- Let's try something a bit more challenging: what does a real constant look like?
  - It might have a sign for the mantissa.
  - The mantissa consists of some digits followed by a decimal point, possibly followed by some more digits (the fractional part).
  - There might be an exponent as well, which could be signed.

Regular Languages Revisited

- Let's do this in pieces...
    <realconst> ::= <mantissa> ['E' <exponent>]
- Consider the exponent first; it's just a signed integer constant:
    <exponent> ::= ['+' | '-'] <intconst>
  where
    <intconst> ::= <digit>+
    <digit> ::= 0 | 1 | ... | 9
Regular Languages Revisited

- Now let's try the mantissa:
    <mantissa> ::= ['+' | '-'] <intconst> '.' [<intconst>]
- As with programming, divide and conquer works well to handle the complexity of regular expression specification.
- Also, the use of the optional constructs greatly simplifies this specification.
  - As an exercise, try doing the real constant without [ and ].

Regular Languages - FSMs

- A good way to start developing a scanner is to produce regular expressions for the tokens you wish to recognize.
- The regular expressions themselves, however, are not the basis of the scanning process.
- This requires a Finite State Machine (FSM) specification.
Finite State Machines

- Fortunately, there is a direct mapping from regular expressions to the FSMs that implement them.
- An FSM is an abstract machine which can be in one of a finite number of states, which makes state transitions based on inputs, and which performs specific actions in specific states or on transitions between states.
  - Compare Moore and Mealy machines from digital logic.

Finite State Machines

- FSMs are commonly represented graphically.
  - Nodes in the graph represent individual states and are assigned meaningful names.
  - Edges represent transitions between the states and are labeled with the input values which cause the state transitions.
- An FSM-based scanner takes its input from the source code character stream.
Finite State Machines

- The FSM-based scanner performs certain actions, which include recognizing specific characters, accumulating the characters in a particular token, and returning completed tokens to form the output token stream.
- We'll begin by just recognizing some simple tokens and worry about actually building the tokens later.

Finite State Machines

    <digit> ::= 0 | 1 | ... | 9
    <intconst> ::= <digit>+

[FSM diagram: state intconst, with transitions labeled <digit> and other]
Finite State Machines

    <ident> ::= (<alpha> | '_' | '$') <alphanum>*

[FSM diagram: state ident, with transitions labeled alpha,_,$ and alphanum and other]

Equivalence of REs and FSMs

- For each regular expression (RE), there is an FSM that recognizes strings conforming to the regular expression.
- Consider the three basic RE operations.
- Catenation: a b
  [FSM diagram: start, then a, then b, then done]
Equivalence of REs and FSMs

- Alternation: a | b | c
  [FSM diagram: start to done, with one transition for each of a, b, c]
- Repetition: a*
  [FSM diagram: start and done, with transitions labeled a, U-{a}, and ε]

A Sample Regular Language

    <comment> ::= '{' (~'}')* '}'
    <letter> ::= a | ... | z | A | ... | Z
    <digit> ::= 0 | ... | 9
    <ident> ::= <letter> (<letter> | <digit>)*
    <numconst> ::= <digit>+ ['.' <digit>+]
    <strconst> ::= (~ )*
    <assignop> ::= ':=' | ':+=' | ':-=' | ':*=' | ':/='
    <negop> ::= '~' | '~<' | '~>' | '~='
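To make the RE-to-FSM correspondence concrete, the `<numconst>` definition above can be hand-coded as a small state machine. This is only a sketch: the state names and the function `is_numconst` are hypothetical, and the explicit START and REJECT states are additions made so the function can report failure.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical states for <numconst> ::= <digit>+ ['.' <digit>+] */
enum num_state {
    NUM_START,   /* nothing read yet */
    NUM_INT,     /* one or more digits read (accepting) */
    NUM_DOT,     /* digits followed by '.', a digit is now required */
    NUM_FRAC,    /* fractional digits read (accepting) */
    NUM_REJECT   /* input cannot be a <numconst> */
};

bool is_numconst(const char *s)
{
    enum num_state st = NUM_START;

    for (; *s != '\0' && st != NUM_REJECT; s++) {
        bool digit = isdigit((unsigned char)*s) != 0;
        switch (st) {
        case NUM_START: st = digit ? NUM_INT : NUM_REJECT; break;
        case NUM_INT:   st = digit ? NUM_INT
                           : (*s == '.') ? NUM_DOT : NUM_REJECT; break;
        case NUM_DOT:   st = digit ? NUM_FRAC : NUM_REJECT; break;
        case NUM_FRAC:  st = digit ? NUM_FRAC : NUM_REJECT; break;
        case NUM_REJECT: break;
        }
    }
    /* Only the states reached after a complete digit run are accepting. */
    return st == NUM_INT || st == NUM_FRAC;
}
```

Each `case` corresponds to one node of the graph and each branch to one labeled edge, which is why the switch grows quickly as more token types are added.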
A Sample FSM for the language

[FSM diagram. States: Leading, Comment, Ident, Lit 1, Lit 2, Lit 3, Lit 4, Assign?, Assign!, Neg, Finish. Edge labels: {, }, letter, digit, letter/digit, ., :, ~, =, +-*/, ><=, ;,. [ ], other]

Building a Scanner

- How does a scanner interact with the parser?
- Consider the following:

Source Program → Lexical Analyzer → Syntax Analyzer → Parse Tree
(the Syntax Analyzer calls get next token(); the Lexical Analyzer returns a token)
Scanner Actions

- As the scanner changes from state to state, it must do something with the characters it scans in order to build the tokens to be returned to the parser calling it.
- In some cases, it must append the character seen onto a developing token and consume it so the next input character is visible.
  - E.g. when scanning characters in an identifier

Scanner Actions

- In other cases it must preserve the character and return a completed token.
  - E.g. MaxVal := -999;
  - After scanning the : we know that we have found the end of the identifier MaxVal, so we want to return that to the parser, but we do not want to lose the :, so we must preserve it.
- Another possible action is to simply consume a character.
  - E.g. characters in comments
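The difference between "append and consume" and "preserve" can be sketched in C as follows. The function `scan_ident` and its index-based interface are hypothetical choices made to keep the example self-contained, and accepting '_' inside identifiers is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copies the identifier starting at src[pos] into out
 * (capacity cap) and returns the index of the first character NOT consumed.
 */
size_t scan_ident(const char *src, size_t pos, char *out, size_t cap)
{
    size_t len = 0;

    while (isalnum((unsigned char)src[pos]) || src[pos] == '_') {
        if (len + 1 < cap)
            out[len++] = src[pos];   /* append the character to the token */
        pos++;                       /* consume it */
    }
    if (cap > 0)
        out[len] = '\0';

    /* Preserve: src[pos] ended the identifier but is left for the next token. */
    return pos;
}
```

For the input `MaxVal:=-999;` starting at index 0, the token text is `MaxVal` and the returned index points at the `:`, which the next scan will see first.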
Implementing the FSM

- A finite state machine may be easily implemented using a table driven technique.
- Table driven techniques are highly methodical:
  - Comparatively easy to handle changes and/or extensions to the grammar
  - Straightforward code that is not error-prone
  - Easy to maintain the code

Implementing the FSM

- Regard the scanner as a device which takes a character stream as input and produces a token stream as output.
- At any given point in time...
  - The device is in a specific state.
  - Based on the current state and the next input character, it will perform a specific action and move into a new (possibly different) state.
Scanner Actions - detail

- Typical actions include:
  - C: Consume
  - AC: Append and Consume
  - PI: Preserve and build ID token
  - PL: Preserve and build Literal token
  - PK: Preserve and build Keyword token
  - PP: Preserve and build Punctuation token
  - CO: Consume and build Operator token
  - CL: Consume and build Literal token
- What actions you need depends on the

Sample FSM with actions

[FSM diagram: the sample FSM (states Leading, Comment, Ident, Lit 1, Lit 2, Lit 3, Lit 4, Assign?, Assign!, Neg, Finish) with each edge annotated with an action. Label/action pairs visible in the figure include letter AC, digit AC, : AC, ~ AC, . AC, { C, } C, letter/digit AC, = CO, ><= CO, +-*/ AC, other PI, other PL, other PP, other PO, other C, CL, and ;,. [ ] CP]
A scanner mainline

    STATIC GLOBAL ipchar;
    GLOBAL str, token, preserve

    str = ""
    state = Leading
    WHILE (state <> Finish) DO
        preserve = NO
        CALL action[state, ipchar]
        state = nextstate[state, ipchar]
        IF NOT preserve THEN ipchar = getchar()
    RETURN(token)

Action table

Rows are the current state; columns are the input character: <alpha>, <digit>, ., +, :, =, {, etc.
Entries by row, in the order they appear:

     1. Leading    AC AC CP AC CO AC CO C
     2. Comment    C C C
     3. Ident      AC AC PI PI PI PI PI PI
     4. Lit 1      PL AC
     5. Lit 2
     6. Lit 3
     7. Lit 4      etc.
     8. Assign?
     9. Assign!
    10. Neg
    11. Finish
    etc.
Next State table

[Table: rows are the current states 1. Leading, 2. Comment, 3. Ident, 4. Lit 1, 5. Lit 2, 6. Lit 3, 7. Lit 4, 8. Assign?, 9. Assign!, 10. Neg, 11. Finish, etc.; columns are the input characters <alpha>, <digit>, ., +, :, =, {, etc.; each entry names the next state]

Additional code

- All we have to do now is add action routines.
  - append adds the current character onto a string representing the token being recognized
  - consume vs. preserve is handled by the preserve flag
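The mainline, the two tables, and the action routines fit together roughly as in the following C sketch. It is a deliberately reduced version and not the slides' scanner: only the Leading, Ident, Lit 1, and Finish states are kept, the input is a string cursor rather than getchar(), and the character class `CH_END` and the action `ACT_STOP` are additions so that end of input can be handled. All identifiers are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

enum state  { LEADING, IDENT, LIT1, FINISH, NSTATES };
enum chcls  { CH_ALPHA, CH_DIGIT, CH_OTHER, CH_END, NCLASSES };
enum action { ACT_C, ACT_AC, ACT_PI, ACT_PL, ACT_STOP };
enum kind   { TOK_NONE, TOK_IDENT, TOK_LITERAL };

struct token { enum kind kind; char text[32]; };

/* action[state][class]; the FINISH row is never consulted. */
static const enum action action_tab[NSTATES][NCLASSES] = {
    [LEADING] = { ACT_AC, ACT_AC, ACT_C,  ACT_STOP },
    [IDENT]   = { ACT_AC, ACT_AC, ACT_PI, ACT_PI   },
    [LIT1]    = { ACT_PL, ACT_AC, ACT_PL, ACT_PL   },
};

/* nextstate[state][class] */
static const enum state next_tab[NSTATES][NCLASSES] = {
    [LEADING] = { IDENT,  LIT1,  LEADING, FINISH },
    [IDENT]   = { IDENT,  IDENT, FINISH,  FINISH },
    [LIT1]    = { FINISH, LIT1,  FINISH,  FINISH },
};

enum chcls classify(char ch)
{
    if (ch == '\0') return CH_END;
    if (isalpha((unsigned char)ch)) return CH_ALPHA;
    if (isdigit((unsigned char)ch)) return CH_DIGIT;
    return CH_OTHER;
}

/* Returns the next token and advances *cursor past the consumed characters. */
struct token next_token(const char **cursor)
{
    struct token tok = { TOK_NONE, "" };
    size_t len = 0;
    enum state state = LEADING;

    while (state != FINISH) {
        char ch = **cursor;
        enum chcls cls = classify(ch);
        int preserve = 0;

        switch (action_tab[state][cls]) {
        case ACT_C:    break;                                   /* consume only */
        case ACT_AC:   if (len + 1 < sizeof tok.text)
                           tok.text[len++] = ch;                /* append */
                       break;
        case ACT_PI:   tok.kind = TOK_IDENT;   preserve = 1; break;
        case ACT_PL:   tok.kind = TOK_LITERAL; preserve = 1; break;
        case ACT_STOP: preserve = 1; break;                     /* end of input */
        }
        state = next_tab[state][cls];
        if (!preserve)
            (*cursor)++;                                        /* consume */
    }
    tok.text[len] = '\0';
    return tok;
}
```

Extending the scanner to more token types means adding rows and columns to the two tables and a case per new action, while the loop itself stays unchanged; that is the maintainability argument made for the table driven technique.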
A Lexical Analyzer Generator

- Building a scanner manually (even using the FSM technique) is tedious.
- We know that the mapping from regular expressions to FSMs is straightforward, so why don't we automate the process?
- Then we just type in regular expressions and get back code to implement a scanner.
- That is exactly what lex does.

How lex works

- Lex Source Program (lex.l) → Lex Compiler → lex.yy.c
- lex.yy.c → C Compiler → a.out
- input stream → a.out → sequence of tokens
lex Specifications

- lex programs are divided into three components:

    declarations (variables defined, include files specified, etc.)
    %%
    translation rules, each of the form: pattern (using REs)  { action: C/C++ statements }
    %%
    auxiliary procedures (support routines for the C/C++ statements above)

Sample lex program

    %{
    /*
     * this sample demonstrates (very) simple recognition:
     * a verb/not a verb.
     */
    /* includes and defines should go in this section */
    %}
    %%
    [\t ]+      /* ignore white space */ ;
    is |
    am |
    are |
    were |
    was |
    be |
    being |
    been |
    do |
    does |
    did |
    have |
    had |
    go          { printf("%s: is a verb\n", yytext); }
    [a-zA-Z]+   { printf("%s: is not a verb\n", yytext); }
    .|\n        { ECHO; /* normal default anyway */ }
    %%
    int main(void)
    {
        yylex();
        return 0;
    }
CSC45 AUTOMATA 2. Finite Automata: Examples and Definitions Finite Automata: Examples and Definitions A finite automaton is a simple type of computer. Itsoutputislimitedto yes to or no. It has very primitive
More informationLEX/Flex Scanner Generator
Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 1 LEX/Flex Scanner Generator Compilers: CS31003 Computer Sc & Engg: IIT Kharagpur 2 flex - Fast Lexical Analyzer Generator We can use flex a to automatically
More informationqwertyuiopasdfghjklzxcvbnmqwerty uiopasdfghjklzxcvbnmqwertyuiopasd fghjklzxcvbnmqwertyuiopasdfghjklzx cvbnmqwertyuiopasdfghjklzxcvbnmq
qwertyuiopasdfghjklzxcvbnmqwerty uiopasdfghjklzxcvbnmqwertyuiopasd fghjklzxcvbnmqwertyuiopasdfghjklzx cvbnmqwertyuiopasdfghjklzxcvbnmq Introduction to Programming using Java wertyuiopasdfghjklzxcvbnmqwertyui
More informationPushdown automata. Informatics 2A: Lecture 9. Alex Simpson. 3 October, 2014. School of Informatics University of Edinburgh als@inf.ed.ac.
Pushdown automata Informatics 2A: Lecture 9 Alex Simpson School of Informatics University of Edinburgh als@inf.ed.ac.uk 3 October, 2014 1 / 17 Recap of lecture 8 Context-free languages are defined by context-free
More informationApplies to Version 6 Release 5 X12.6 Application Control Structure
Applies to Version 6 Release 5 X12.6 Application Control Structure ASC X12C/2012-xx Copyright 2012, Data Interchange Standards Association on behalf of ASC X12. Format 2012 Washington Publishing Company.
More informationC Compiler Targeting the Java Virtual Machine
C Compiler Targeting the Java Virtual Machine Jack Pien Senior Honors Thesis (Advisor: Javed A. Aslam) Dartmouth College Computer Science Technical Report PCS-TR98-334 May 30, 1998 Abstract One of the
More informationCS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing
CS143 Handout 08 Summer 2008 July 02, 2007 Bottom-Up Parsing Handout written by Maggie Johnson and revised by Julie Zelenski. Bottom-up parsing As the name suggests, bottom-up parsing works in the opposite
More informationSome Scanner Class Methods
Keyboard Input Scanner, Documentation, Style Java 5.0 has reasonable facilities for handling keyboard input. These facilities are provided by the Scanner class in the java.util package. A package is a
More informationKITES TECHNOLOGY COURSE MODULE (C, C++, DS)
KITES TECHNOLOGY 360 Degree Solution www.kitestechnology.com/academy.php info@kitestechnology.com technologykites@gmail.com Contact: - 8961334776 9433759247 9830639522.NET JAVA WEB DESIGN PHP SQL, PL/SQL
More informationMemory Systems. Static Random Access Memory (SRAM) Cell
Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled
More informationChapter 2: Elements of Java
Chapter 2: Elements of Java Basic components of a Java program Primitive data types Arithmetic expressions Type casting. The String type (introduction) Basic I/O statements Importing packages. 1 Introduction
More informationSources: On the Web: Slides will be available on:
C programming Introduction The basics of algorithms Structure of a C code, compilation step Constant, variable type, variable scope Expression and operators: assignment, arithmetic operators, comparison,
More informationComputational Mathematics with Python
Boolean Arrays Classes Computational Mathematics with Python Basics Olivier Verdier and Claus Führer 2009-03-24 Olivier Verdier and Claus Führer Computational Mathematics with Python 2009-03-24 1 / 40
More informationCS 106 Introduction to Computer Science I
CS 106 Introduction to Computer Science I 01 / 21 / 2014 Instructor: Michael Eckmann Today s Topics Introduction Homework assignment Review the syllabus Review the policies on academic dishonesty and improper
More informationestatistik.core: COLLECTING RAW DATA FROM ERP SYSTEMS
WP. 2 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing (Bonn, Germany, 25-27 September
More informationXML Schema Definition Language (XSDL)
Chapter 4 XML Schema Definition Language (XSDL) Peter Wood (BBK) XML Data Management 80 / 227 XML Schema XML Schema is a W3C Recommendation XML Schema Part 0: Primer XML Schema Part 1: Structures XML Schema
More informationFirst Java Programs. V. Paúl Pauca. CSC 111D Fall, 2015. Department of Computer Science Wake Forest University. Introduction to Computer Science
First Java Programs V. Paúl Pauca Department of Computer Science Wake Forest University CSC 111D Fall, 2015 Hello World revisited / 8/23/15 The f i r s t o b l i g a t o r y Java program @author Paul Pauca
More informationFTP client Selection and Programming
COMP 431 INTERNET SERVICES & PROTOCOLS Spring 2016 Programming Homework 3, February 4 Due: Tuesday, February 16, 8:30 AM File Transfer Protocol (FTP), Client and Server Step 3 In this assignment you will
More informationA Programming Language Where the Syntax and Semantics Are Mutable at Runtime
DEPARTMENT OF COMPUTER SCIENCE A Programming Language Where the Syntax and Semantics Are Mutable at Runtime Christopher Graham Seaton A dissertation submitted to the University of Bristol in accordance
More informationInformatique Fondamentale IMA S8
Informatique Fondamentale IMA S8 Cours 1 - Intro + schedule + finite state machines Laure Gonnord http://laure.gonnord.org/pro/teaching/ Laure.Gonnord@polytech-lille.fr Université Lille 1 - Polytech Lille
More informationNew York University Computer Science Department Courant Institute of Mathematical Sciences
New York University Computer Science Department Courant Institute of Mathematical Sciences Course Title: Data Communications & Networks Course Number: g22.2662-001 Instructor: Jean-Claude Franchitti Session:
More informationCS106A, Stanford Handout #38. Strings and Chars
CS106A, Stanford Handout #38 Fall, 2004-05 Nick Parlante Strings and Chars The char type (pronounced "car") represents a single character. A char literal value can be written in the code using single quotes
More informationAutomata and Computability. Solutions to Exercises
Automata and Computability Solutions to Exercises Fall 25 Alexis Maciel Department of Computer Science Clarkson University Copyright c 25 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata
More informationInformatica e Sistemi in Tempo Reale
Informatica e Sistemi in Tempo Reale Introduction to C programming Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa October 25, 2010 G. Lipari (Scuola Superiore Sant Anna)
More informationPython Loops and String Manipulation
WEEK TWO Python Loops and String Manipulation Last week, we showed you some basic Python programming and gave you some intriguing problems to solve. But it is hard to do anything really exciting until
More informationEmbedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C
Embedded Systems A Review of ANSI C and Considerations for Embedded C Programming Dr. Jeff Jackson Lecture 2-1 Review of ANSI C Topics Basic features of C C fundamentals Basic data types Expressions Selection
More informationASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters
The char Type ASCII Encoding The C char type stores small integers. It is usually 8 bits. char variables guaranteed to be able to hold integers 0.. +127. char variables mostly used to store characters
More informationBarcode Labels Feature Focus Series. POSitive For Windows
Barcode Labels Feature Focus Series POSitive For Windows Inventory Label Printing... 3 PFW System Requirement for Scanners... 3 A Note About Barcode Symbologies... 4 An Occasional Misunderstanding... 4
More informationSo far we have considered only numeric processing, i.e. processing of numeric data represented
Chapter 4 Processing Character Data So far we have considered only numeric processing, i.e. processing of numeric data represented as integer and oating point types. Humans also use computers to manipulate
More informationLumousoft Visual Programming Language and its IDE
Lumousoft Visual Programming Language and its IDE Xianliang Lu Lumousoft Inc. Waterloo Ontario Canada Abstract - This paper presents a new high-level graphical programming language and its IDE (Integration
More informationUniversity of Wales Swansea. Department of Computer Science. Compilers. Course notes for module CS 218
University of Wales Swansea Department of Computer Science Compilers Course notes for module CS 218 Dr. Matt Poole 2002, edited by Mr. Christopher Whyley, 2nd Semester 2006/2007 www-compsci.swan.ac.uk/~cschris/compilers
More informationWA2099 Introduction to Java using RAD 8.0 EVALUATION ONLY. Student Labs. Web Age Solutions Inc.
WA2099 Introduction to Java using RAD 8.0 Student Labs Web Age Solutions Inc. 1 Table of Contents Lab 1 - The HelloWorld Class...3 Lab 2 - Refining The HelloWorld Class...20 Lab 3 - The Arithmetic Class...25
More informationCS321. Introduction to Numerical Methods
CS3 Introduction to Numerical Methods Lecture Number Representations and Errors Professor Jun Zhang Department of Computer Science University of Kentucky Lexington, KY 40506-0633 August 7, 05 Number in
More informationProgramming Languages
Programming Languages Programming languages bridge the gap between people and machines; for that matter, they also bridge the gap among people who would like to share algorithms in a way that immediately
More informationCA4003 - Compiler Construction
CA4003 - Compiler Construction David Sinclair Overview This module will cover the compilation process, reading and parsing a structured language, storing it in an appropriate data structure, analysing
More informationCSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals
CSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals 1 Recall From Last Time: Java Program import java.util.scanner; public class EggBasket { public static void main(string[]
More informationIf-Then-Else Problem (a motivating example for LR grammars)
If-Then-Else Problem (a motivating example for LR grammars) If x then y else z If a then if b then c else d this is analogous to a bracket notation when left brackets >= right brackets: [ [ ] ([ i ] j,
More informationMATLAB Programming. Problem 1: Sequential
Division of Engineering Fundamentals, Copyright 1999 by J.C. Malzahn Kampe 1 / 21 MATLAB Programming When we use the phrase computer solution, it should be understood that a computer will only follow directions;
More informationVIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR
VIRTUAL LABORATORY: MULTI-STYLE CODE EDITOR Andrey V.Lyamin, State University of IT, Mechanics and Optics St. Petersburg, Russia Oleg E.Vashenkov, State University of IT, Mechanics and Optics, St.Petersburg,
More informationChapter 13 - The Preprocessor
Chapter 13 - The Preprocessor Outline 13.1 Introduction 13.2 The#include Preprocessor Directive 13.3 The#define Preprocessor Directive: Symbolic Constants 13.4 The#define Preprocessor Directive: Macros
More informationConcepts and terminology in the Simula Programming Language
Concepts and terminology in the Simula Programming Language An introduction for new readers of Simula literature Stein Krogdahl Department of Informatics University of Oslo, Norway April 2010 Introduction
More informationLimitation of Liability
Limitation of Liability Information in this document is subject to change without notice. THE TRADING SIGNALS, INDICATORS, SHOWME STUDIES, PAINTBAR STUDIES, PROBABILITYMAP STUDIES, ACTIVITYBAR STUDIES,
More informationComputational Mathematics with Python
Numerical Analysis, Lund University, 2011 1 Computational Mathematics with Python Chapter 1: Basics Numerical Analysis, Lund University Claus Führer, Jan Erik Solem, Olivier Verdier, Tony Stillfjord Spring
More information