flex: Regular Expressions and Lexical Scanning




Using flex to Build a Scanner
    flex generates lexical scanners: programs that discover tokens.
    Tokens are the smallest meaningful units of a program (or other string).
    flex is freeware available from the Free Software Foundation and GNU Project (http://www.gnu.org/).
    It's also available in the Cygwin (Unix emulator for Windows) download (http://cygwin.com/).

Regular Expressions and flex
    flex input: a file containing regular expressions and some code.
    The regular expressions define tokens such as if, then, while, and classes of tokens such as identifier, float.
    flex output: C (or C++) code for a lexical scanner.

(Standard) Regular Expressions on Alphabet A
    Ø
    ε
    a          (for each symbol a in A)
    e1 e2      (where e1 and e2 are regular exps)
    e1 | e2    (where e1 and e2 are regular exps)
    e*         (where e is a regular expression)
    (e)        (where e is a regular expression)

Language Denoted by a Regular Expression
    L(Ø) = Ø
    L(ε) = {ε}
    L(a) = {a}
    L(e1 e2) = L(e1) L(e2)
    L(e1 | e2) = L(e1) ∪ L(e2)
    L(e*) = L(e)*
    L((e)) = L(e)
    Precedence: *, then concatenation, then |

Examples on Alphabet A = {a, b}
    a is a regular expression denoting {a}.
    a | b is a regular expression denoting {a, b}.
    a* is a regular expression denoting {ε, a, aa, aaa, …}.
    a(a|b)*a is a regular expression denoting {strings that begin and end with a}.

Examples for You
    RE for strings of even length
    RE for strings with […] as a substring
    RE for strings with exactly one […]
    RE for strings that do NOT have […] as a substring
    RE for INTEGER (possibly preceded by + or -, no leading zeroes; alphabet is {0,…,9,+,-})

flex Regular Expressions
    Can have operators other than concatenation, union, and closure.
    For every flex expression, there exists an equivalent standard regular expression.
    Advantage of standard regular expressions: easy to prove theorems.
    Advantage of flex regular expressions: easier to express many languages.

flex Regular Expressions
    Regular expression    Meaning
    [abc]                 a or b or c
    [a-z]                 A small letter
    [a-zA-Z]              A small letter or capital letter
    [\t\n]                Tab or newline
    [a-z]*                0 or more small letters
    [a-z]+                1 or more small letters
    [a-z]?                0 or 1 small letter
    .                     Any character except \n
    [^abc]                A character other than a, b, or c
    \.                    A period
    {exp}                 The value of exp
    exp1 | exp2           Anything matching exp1 or exp2

flex Example 1: Section 1
    %{
    char unused_var;
    %}
    %option noyywrap
    /* regular definitions */
    delim       [ \t\n]
    ws          {delim}+
    letter      [A-Za-z]
    digit       [0-9]
    identifier  {letter}({letter}|{digit}|_)*

flex Example 1: Section 2
    {ws}          {/* Do nothing. */ }
    select        {printf("found token SELECT\n");}
    from          {printf("found token FROM\n");}
    \,            {printf("found token COMMA\n");}
    quit          {return 0; /* yylex() returns int, so return a value */}
    {identifier}  {printf("found token IDENTIFIER\n");}

flex Example 1: Section 3
    int main() {
        yylex();
        return 0;
    }

Using flex
    sql.l -> flex -> lex.yy.c -> C compiler -> lexical scanner

Using the Scanner
    [Diagram: myquery.sql, containing text such as SELECT SID FROM, is read by the lexical scanner through yylex()]

flex Sections
    Section 1: C (or C++) code to be copied to the scanner, enclosed in %{ %}; flex options; and regular definitions
    Section 2: token definitions and semantic actions
    Section 3: additional definitions, usually functions, to be copied to the end of the scanner code
    In the input file, the sections are separated by lines containing only %%.

flex Example 2: Section 1, Code Copied to Scanner
    %{
    #define SELECT 1
    #define FROM 2
    #define COMMA 3
    #define IDENTIFIER 4
    #define QUIT 5
    char *yylval;
    %}

flex Example 2: Section 1, cont.
    %option noyywrap
    /* regular definitions */
    delim       [ \t\n]
    ws          {delim}+
    letter      [A-Za-z]
    digit       [0-9]
    identifier  {letter}({letter}|{digit}|_)*

flex Example 2: Section 2
    {ws}          {/* Do nothing. */ }
    select        {return( SELECT );}
    from          {return( FROM );}
    \,            {return( COMMA );}
    {identifier}  {/* allocate room for the lexeme plus its terminating '\0' */
                   yylval = (char *) malloc(sizeof(char) * (strlen(yytext) + 1));
                   strcpy(yylval, yytext);
                   return( IDENTIFIER );}

flex Example 2: Section 3
    int main() {
        int my_token = 1;
        /* yylex() returns 0 at end of input */
        while (my_token != 0) {
            my_token = yylex();
            if (my_token == IDENTIFIER)
                printf("returned token is %d with value %s\n", my_token, yylval);
            else
                printf("returned token is %d\n", my_token);
        }
        return 0;
    }

Flex Start Conditions
    <COND1>regexp            match regexp if condition COND1 holds
    <COND1,INITIAL>regexp    match regexp if condition COND1 or INITIAL holds
    <*>regexp                match regexp under any condition
    BEGIN(COND1)             in a semantic action, switches to COND1

Declaring Flex Start Conditions
    %x COND1    declares COND1 to be an exclusive start condition
    %s COND2    declares COND2 to be an inclusive (shared) start condition
    Given the rules
        regexp1
        <COND1>regexp2
        <COND2>regexp3
    if COND1 is current, regexp2 is active; if COND2 is current, regexp1 and regexp3 are active.

Example: Flex Start Conditions
    %s ATT_VALUE STR_CONST
    <TAG_NAME>"="              {BEGIN(ATT_VALUE); return(eq);}
    <ATT_VALUE>{string_const}  {BEGIN(TAG_NAME); return(str_const);}
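To make the effect of BEGIN concrete, here is a minimal hand-written C sketch of what the two start-condition rules above do. It is not the code flex generates. The struct Scanner type, the next_token function, the TOK_* codes, and the extra rule that recognizes a tag name are all hypothetical names introduced for illustration, and a string constant is assumed to be text between double quotes.

```c
#include <ctype.h>

/* Hypothetical token codes; the slide only names eq and str_const. */
enum Token { TOK_EOF = 0, TOK_EQ, TOK_STR_CONST, TOK_NAME, TOK_ERROR };

/* The two start conditions used by the rules in the example. */
enum StartCond { TAG_NAME, ATT_VALUE };

struct Scanner {
    const char *pos;      /* next unread character */
    enum StartCond cond;  /* current start condition */
};

enum Token next_token(struct Scanner *s)
{
    /* Assumption: whitespace is skipped under every start condition. */
    while (*s->pos == ' ' || *s->pos == '\t' || *s->pos == '\n')
        s->pos++;
    if (*s->pos == '\0')
        return TOK_EOF;

    switch (s->cond) {
    case TAG_NAME:
        if (*s->pos == '=') {              /* <TAG_NAME>"=" */
            s->pos++;
            s->cond = ATT_VALUE;           /* BEGIN(ATT_VALUE) */
            return TOK_EQ;
        }
        if (isalpha((unsigned char)*s->pos)) {   /* hypothetical name rule */
            while (isalnum((unsigned char)*s->pos))
                s->pos++;
            return TOK_NAME;
        }
        break;
    case ATT_VALUE:
        if (*s->pos == '"') {              /* <ATT_VALUE>{string_const} */
            const char *q = s->pos + 1;
            while (*q != '\0' && *q != '"')
                q++;
            if (*q == '"') {
                s->pos = q + 1;
                s->cond = TAG_NAME;        /* BEGIN(TAG_NAME) */
                return TOK_STR_CONST;
            }
        }
        break;
    }
    s->pos++;                              /* no active rule matches here */
    return TOK_ERROR;
}
```

Scanning id="x" starting in TAG_NAME yields a name, then the equals token, then a string constant. The same quoted string seen while TAG_NAME is current is rejected, because the string rule is only active under ATT_VALUE.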

Distinguishing Identifiers
    %x SELCLAUSE FROMCLAUSE
        SELCLAUSE and FROMCLAUSE are exclusive start conditions.
    identifier  {letter}({letter}|{digit}|_)*
        Regular definition of "identifier".

Distinguishing Identifiers, cont'd
    <INITIAL>select    {BEGIN(SELCLAUSE); return( SELECT );}
        Accept "select" if in the INITIAL start condition; switch to the SELCLAUSE start condition.
    <SELCLAUSE>from    {BEGIN(FROMCLAUSE); return( FROM );}
        Accept "from" if in the SELCLAUSE start condition; switch to the FROMCLAUSE start condition.

Distinguishing Identifiers, cont'd
    <SELCLAUSE>{identifier}     {return( ATTIDENTIFIER );}
        Accept identifiers if in the SELCLAUSE start condition.
    <FROMCLAUSE>{identifier}    {return( RELIDENTIFIER );}
        Accept identifiers if in the FROMCLAUSE start condition.

Finite Automata
    An automaton that accepts {[…], […]}
    [Figure: state diagram with states 0, 1, 2, 3]

Finite Automaton for Digits
    Digit = 0 | 1 | … | 9
    [Figure: a single transition labeled 0, 1, …, 9]

Finite Automaton for Identifier
    Identifier := {letter}({letter}|{digit}|_)*
    [Figure: a transition labeled a, …, z, A, …, Z leads to the accepting state, which has a self-loop labeled a, …, z, A, …, Z, 0, …, 9, _]
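The identifier automaton can be simulated directly in a few lines of C. The sketch below is one possible encoding, not the code flex emits: the function name is_identifier and the state numbering (0 for the start state, 1 for the accepting state, -1 for a dead state) are assumptions made for illustration. It relies on isalpha and isalnum, which match exactly A-Z, a-z, and 0-9 in the default "C" locale.

```c
#include <ctype.h>

/* DFA for {letter}({letter}|{digit}|_)*
   state 0: start (nothing read yet)
   state 1: accepting (a letter followed by letters, digits, underscores)
   state -1: dead (no identifier can start with what was read) */
int is_identifier(const char *s)
{
    int state = 0;
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        switch (state) {
        case 0:
            state = isalpha(c) ? 1 : -1;
            break;
        case 1:
            state = (isalnum(c) || c == '_') ? 1 : -1;
            break;
        }
        if (state < 0)
            return 0;   /* dead state: reject immediately */
    }
    return state == 1;  /* accept only if we ended in the accepting state */
}
```

For example, is_identifier("SID") and is_identifier("x_1") return 1, while is_identifier("_x"), is_identifier("1x"), and is_identifier("") return 0, since the first character must be a letter.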

Nondeterministic Finite Automaton: Language = strings with substring […]
    [Figure: NFA with states 0, 1, 2, 3]

Deterministic Finite Automaton: Language = strings with substring […]
    [Figure: DFA with states 0, 1, 2, 3]

Example DFAs for You
    DFA for strings in L( (a(b|c))* )
    DFA for strings of even length
    DFA for strings that have […] as a substring
    DFA for INTEGER (possibly preceded by + or -, no leading zeroes)

Theory
    For every regular expression, there is an NFA that accepts the same language (and vice versa).
    For every NFA, there is a DFA that accepts the same language (and vice versa).

What flex Does
    Converts regular expressions to nondeterministic automata (NFAs)
    Converts nondeterministic automata to deterministic automata (DFAs)
    Minimizes deterministic automata
    Outputs code to simulate minimized DFAs

Conclusions
    Lexical scanning is the first phase of compilation/interpretation.
    Lexical scanning is useful for many programs, not just translators.
    flex and lex are the most popular of many scanner generators.
    flex is based on an elegant theory of languages and machines.
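The NFA-to-DFA step can be illustrated by tracking the set of NFA states that are possible after each input character; each distinct set plays the role of one DFA state. The C sketch below assumes the substring is abb, purely as an example, since the substring used in the slides is not recoverable from this transcript. The function names step and contains_abb and the bitmask encoding (bit i set means NFA state i is possible) are likewise illustrative choices, not flex internals.

```c
/* Assumed NFA over {a, b} for "contains abb":
   state 0: start, loops on a and b, and moves to 1 on a
   state 1: moves to 2 on b
   state 2: moves to 3 on b
   state 3: accepting, loops on a and b */
static unsigned step(unsigned set, char c)
{
    unsigned next = 0;
    if (set & 1u) {                       /* state 0 */
        next |= 1u;                       /* self-loop on a or b */
        if (c == 'a')
            next |= 2u;                   /* 0 -a-> 1 */
    }
    if ((set & 2u) && c == 'b')
        next |= 4u;                       /* 1 -b-> 2 */
    if ((set & 4u) && c == 'b')
        next |= 8u;                       /* 2 -b-> 3 */
    if (set & 8u)
        next |= 8u;                       /* state 3 loops on a or b */
    return next;
}

int contains_abb(const char *s)
{
    unsigned set = 1u;                    /* initially only state 0 is possible */
    for (; *s != '\0'; s++) {
        if (*s != 'a' && *s != 'b')
            return 0;                     /* symbol outside the alphabet */
        set = step(set, *s);
    }
    return (set & 8u) != 0;               /* accept if state 3 is reachable */
}
```

With this encoding, contains_abb("aabba") returns 1 and contains_abb("abab") returns 0. A scanner generator would compute the reachable sets once, ahead of time, and emit a transition table instead of recomputing them for every character.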