csce4313 Programming Languages Scanner (pass/fail)



John C. Lusth
Revision Date: January 18, 2005

This is your first pass/fail assignment. You may develop your code using any procedural language, but you must ensure it runs correctly on turing.csce.uark.edu before submission.

Your task is to write a scanner for Java. A scanner picks through the source code of a program, identifying the individual tokens. To scan a file, it is necessary to understand lexical analysis.

Lexical Analysis

Lexical analysis is the process of breaking up a language stream into its tokens (words) and categorizing those tokens as to their types (parts of speech). These token-type pairs are known as lexemes. Formally, a lexical analyzer maps a stream of characters onto a stream of lexemes.

At a minimum, a lexeme is an object that holds two pieces of information: the type of the token and, in some cases, the actual token itself. As will be seen further on, a lexeme representing a semicolon would generally use only the type field, since the actual word ; can easily be reconstructed from the type information. The basic definition of a lexeme class might look like:

    class lexeme implements types
    {
        String type;
        String word;
    }

Note: for programming in a non-object-oriented language, substitute the appropriate data structure for the above class.

With respect to the lexemes found in a programming language, it may be efficient to store some of the words as something other than strings. For example, numbers may best be stored as native numbers rather than as strings. To reflect this idea, we add a field corresponding to each of the primitive types in the language we are to implement:

    class lexeme extends types
    {
        String type;
        String string;
        int integer;
        double real;
    }

Since memory is relatively cheap, we will not worry about the fact that for any given lexeme, two of the three value fields will remain empty.
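As one possible reading of the lexeme class described above, here is a minimal Java sketch. The handout only fixes the four fields; the constructors, the toString format, the holder class Types (used here instead of inheriting from types), and the demo class are all assumptions made for illustration.

```java
// Sketch only. The constant names mirror token types mentioned in the
// handout; keeping them in a separate holder class is an assumption.
final class Types {
    static final String INTEGER = "INTEGER";
    static final String VARIABLE = "VARIABLE";
    static final String SEMICOLON = "SEMICOLON";

    private Types() { }
}

final class Lexeme {
    final String type;
    final String string;   // filled in for VARIABLE and STRING lexemes
    final int integer;     // filled in for INTEGER lexemes
    final double real;     // reserved for real numbers, unused in this sketch

    // Type-only lexeme, such as SEMICOLON.
    Lexeme(String type) {
        this(type, null, 0);
    }

    // Lexeme that carries its word as a string.
    Lexeme(String type, String string) {
        this(type, string, 0);
    }

    // Lexeme that carries a native integer value.
    Lexeme(String type, int integer) {
        this(type, null, integer);
    }

    private Lexeme(String type, String string, int integer) {
        this.type = type;
        this.string = string;
        this.integer = integer;
        this.real = 0.0;
    }

    // Hypothetical display format: the type, then the value if there is one.
    @Override
    public String toString() {
        if (type.equals(Types.INTEGER)) return type + " " + integer;
        if (string != null) return type + " " + string;
        return type;
    }
}

class LexemeDemo {
    public static void main(String[] args) {
        System.out.println(new Lexeme(Types.INTEGER, 10));        // INTEGER 10
        System.out.println(new Lexeme(Types.VARIABLE, "alpha"));  // VARIABLE alpha
        System.out.println(new Lexeme(Types.SEMICOLON));          // SEMICOLON
    }
}
```

Overriding toString lets a driver print a lexeme directly with System.out.println, and the unused value fields simply keep their defaults.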
As an example of lexical analysis with respect to computer languages, consider some simple expressions in Java:

    int a = 2;
    double zeta;
    if (a != 0)
        a = 64.0 / a;
    zeta = log(a*a,2);
    System.out.println("the value of zeta is " + zeta);

The parts of speech in this language are keywords, variables, operators, numbers and punctuation (in order of appearance).

Now consider a program for taking this source code and producing the lexical stream. The function that reads a portion of the input and produces the next lexeme has historically been called lex. How does lex decide what type a particular token is? Through its morphology! A plus sign, for example, has a distinctive shape, both as a character and as a bit pattern representing that character. As a more complicated example, a variable may have the following shape:

    VARIABLE: a letter followed by any number of letters or digits

An integer has a more restrictive shape:

    INTEGER: a digit followed by any number of digits

Often, in a language, a keyword has the same general shape as a variable, yet its specific shape is different. For example, the keyword if has the following exact shape:

    IF: an i followed by an f

Putting these ideas together, it becomes rather simple to write the lex function:

    lexeme lex()
    {
        char ch;

        skipWhiteSpace();

        ch = Input.read();

        if (Input.failed) return new lexeme(ENDofINPUT);

        switch(ch)
        {
            // single character tokens
            case '(': return new lexeme(OPEN_PAREN);
            case ')': return new lexeme(CLOSE_PAREN);
            case ',': return new lexeme(COMMA);
            case '+': return new lexeme(PLUS); //what about ++ and +=?
            case '*': return new lexeme(TIMES);
            case '-': return new lexeme(MINUS);
            case '/': return new lexeme(DIVIDES);
            case '<': return new lexeme(LESS_THAN);
            case '>': return new lexeme(GREATER_THAN);
            case '=': return new lexeme(ASSIGN);
            case ';': return new lexeme(SEMICOLON);
            //add any other cases here

            default:
                // multi-character tokens (only numbers,
                // variables/keywords, and strings)
                if (Character.isDigit(ch))
                {
                    Input.pushback(ch);
                    return lexNumber();
                }
                else if (Character.isLetter(ch))
                {
                    Input.pushback(ch);
                    return lexVariable(); //and keywords!
                }
                else if (ch == '\"')
                {
                    Input.pushback(ch);
                    return lexString();
                }
                else
                    return new lexeme(UNKNOWN, ch);
        }
    }

In the (partial) implementation of lex given above, the function lexVariable is charged with looking for the specific shapes of the keywords within the general shape of variables. The implementations of lexNumber, lexVariable, lexString, and skipWhiteSpace are left to the reader. The tokens ENDofINPUT, PLUS, MINUS, etc. are String constants, defined in the class/module types, and are used to fill out the type component of a lexeme. Note that when input is exhausted, lex repeatedly returns the ENDofINPUT lexeme.

When given the input...

    int alpha = 10;
    while (alpha > 0)
    {
        System.out.println("alpha is " + alpha);
        alpha = alpha - 1;
    }

...repeated calls to lex might produce the following stream of lexemes:

    TYPE_INT
    VARIABLE alpha
    ASSIGN
    INTEGER 10
    SEMICOLON
    WHILE
    OPEN_PAREN
    VARIABLE alpha
    GREATER_THAN
    INTEGER 0
    CLOSE_PAREN
    OPEN_BRACE
    VARIABLE System
    DOT
    VARIABLE out
    DOT
    VARIABLE println
    OPEN_PAREN
    STRING "alpha is "
    PLUS
    VARIABLE alpha
    CLOSE_PAREN
    SEMICOLON
    VARIABLE alpha
    ASSIGN
    VARIABLE alpha
    MINUS
    INTEGER 1
    SEMICOLON
    CLOSE_BRACE
    ENDofINPUT

A program which repeatedly calls lex and displays the resulting lexemes is called a scanner:

    void scanner(String fileName)
    {
        lexeme token;
        lexer i = new lexer(fileName);

        token = i.lex();
        while (token.type != ENDofINPUT)
        {
            System.out.println(token);
            token = i.lex();
        }
    }

Lexical analysis is the first phase of implementing a programming language. The next phase, covered in detail in the next pass/fail assignment, ensures that the order of the lexemes makes logical sense.

Specifics

Specifically, you are to write a scanner for Java. You should implement the following classes/modules:

    types
    lexeme
    lexer
    scanner

Your implementation must be based upon one-character-at-a-time input (e.g. you may not use Java's string tokenizer if you are writing in Java). That is, your lex method reads a single character and then analyzes what to do with it as above. If you are the least bit confused about this requirement, see me.

One should be able to run the scanner on a file of Java source code by typing in the command...

    scanner f

...where f is the name of the Java file to be scanned.
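To make the one-character-at-a-time requirement and the pushback step concrete, here is a rough Java sketch of two multi-character helpers. It assumes java.io.PushbackReader as the input source, returns plain strings instead of lexeme objects to stay short, and recognizes only three keywords; the class name, method names, and return format are hypothetical and not part of the assignment.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;

// Sketch only: every name here is an assumption made for illustration.
class MultiCharSketch {
    private static final int RADIX = 10;
    private final PushbackReader input;

    MultiCharSketch(Reader source) {
        input = new PushbackReader(source);   // one character of pushback
    }

    // Reads a run of letters and digits, then looks for the specific
    // shapes of keywords within the general shape of a variable.
    String lexVariable() throws IOException {
        StringBuilder word = new StringBuilder();
        int ch = input.read();
        while (ch != -1 && Character.isLetterOrDigit(ch)) {
            word.append((char) ch);
            ch = input.read();
        }
        if (ch != -1) input.unread(ch);       // first character past the token
        String text = word.toString();
        if (text.equals("if")) return "IF";
        if (text.equals("while")) return "WHILE";
        if (text.equals("int")) return "TYPE_INT";
        return "VARIABLE " + text;
    }

    // Reads a run of digits and accumulates a native int
    // (overflow is not handled in this sketch).
    String lexNumber() throws IOException {
        int value = 0;
        int ch = input.read();
        while (ch != -1 && Character.isDigit(ch)) {
            value = value * RADIX + Character.digit(ch, RADIX);
            ch = input.read();
        }
        if (ch != -1) input.unread(ch);
        return "INTEGER " + value;
    }
}

class MultiCharDemo {
    public static void main(String[] args) throws IOException {
        MultiCharSketch in = new MultiCharSketch(new StringReader("10abc"));
        System.out.println(in.lexNumber());     // INTEGER 10
        System.out.println(in.lexVariable());   // VARIABLE abc
    }
}
```

Each helper reads one character past its token and pushes that character back, so the following call starts at the right place. That is why the demo can lex "10abc" as an integer followed by a variable.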

Passing the assignment

To pass, you must, at a minimum, correctly scan the file test.java. You do not need to be able to lex real numbers, but you will need to be able to skip over both /* */ comments and // comments.

Coding Standards

For this assignment, there are three possible scores: a strong pass worth 100% [...] A weak pass will be awarded if the program is successful but the following coding standards and the coding standards in the syllabus are not followed. You will receive no credit if your program fails to compile or you do not supply a makefile.

Compiling

  - Compile at the highest level of warnings (-Wall for gcc, g++)
  - Supply code that compiles with no warnings or errors
  - (C, C++ programmers) main returns int
  - (C, C++ programmers) no implicit typing of functions or variables

Style

  - Each object should have its own module/class
  - (C, C++ programmers) The public interface for that object is placed in the header file
  - (C, C++ programmers) Typedef all structures
  - (C++ programmers) Use namespaces and modern include files
  - In general, methods should not contain very much code. If they do, see if there is some way to reduce the complexity with helper functions. In this particular program, the lex method will be longish because of the switch statement
  - There should be no repetitious code
  - Use symbolic constants (no numeric constants in code or declarations)
  - Programs should not contain explicit limits

Portability

  - Do not supply code that is tied to a specific operating system
  - Do not call system
  - Do not include conio.h or any other OS-specific header file

Miscellaneous

  - Do not fflush(stdin); it doesn't do anything useful
  - Sign your work (all files)
  - Comment appropriately (not too much, not too little, but just right)
  - Indent no more than four spaces
  - Don't change the length of a tab in your editor
  - Do not read unlimited length strings into fixed length buffers!
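The requirement to skip both /* */ and // comments could be folded into the whitespace-skipping step. The following is a hedged Java sketch, again assuming java.io.PushbackReader; the class name, the method signature, and the two-character pushback limit are assumptions, not part of the handout.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Sketch only: names and structure are hypothetical.
class SkipSketch {
    // A lone '/' must be restored along with the character after it.
    static final int PUSHBACK_LIMIT = 2;

    // Consumes whitespace, // comments, and /* */ comments, leaving the
    // reader positioned at the first character of the next token.
    static void skipWhiteSpace(PushbackReader in) throws IOException {
        int ch;
        while ((ch = in.read()) != -1) {
            if (Character.isWhitespace(ch)) continue;
            if (ch != '/') {
                in.unread(ch);                 // start of a real token
                return;
            }
            int next = in.read();
            if (next == '/') {
                // line comment: discard through the end of the line
                while ((ch = in.read()) != -1 && ch != '\n') { }
            } else if (next == '*') {
                // block comment: discard through the closing */
                int prev = 0;
                while ((ch = in.read()) != -1 && !(prev == '*' && ch == '/')) {
                    prev = ch;
                }
            } else {
                // a divide operator: restore both characters, in order
                if (next != -1) in.unread(next);
                in.unread('/');
                return;
            }
        }
    }
}

class SkipDemo {
    public static void main(String[] args) throws IOException {
        String source = "  // note\n /* block */ x";
        PushbackReader in =
            new PushbackReader(new StringReader(source), SkipSketch.PUSHBACK_LIMIT);
        SkipSketch.skipWhiteSpace(in);
        System.out.println((char) in.read());   // x
    }
}
```

Because this routine runs only between tokens, a slash inside a string literal never reaches it. An unterminated block comment simply runs to the end of input in this sketch, where a fuller scanner might report an error.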

Submitting the assignment

To submit your assignment, delete all object or class files from your working directory, leaving only source code, a makefile, a readme file, and any test cases you may have. Then, while in your working directory on turing.csce.uark.edu, type the command

    submit csce4313 lusth scanner

The submit program will bundle up all the files in your current directory and ship them to me. This includes subdirectories as well, since all the files in any subdirectories will also be shipped to me. You will receive a weak pass if you supply extraneous files (such as .o files or .class files), so be careful.

Make sure that I will be able to call your scanner with the command named scanner. The command scanner must take a filename as a command line argument. You must supply a makefile that will compile your scanner when responding to the command make. If you do not supply a makefile, you will receive a failing grade for this assignment.

You may submit as many times as you want before the deadline; new submissions replace old submissions. You must sign your work and give credit where credit is due! You must submit from turing.csce.uark.edu!