A simple Monadic Parsec based parser in Haskell. Mario Lang Revision: April 10, 2003

Similar documents
Formal Methods for Software Development

Testing and Tracing Lazy Functional Programs using QuickCheck and Hat

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

ASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters

Session 7 Fractions and Decimals

Simple File Input & Output

Paper Creating Variables: Traps and Pitfalls Olena Galligan, Clinops LLC, San Francisco, CA

Compiler Construction

Project 2: Bejeweled

AN INTRODUCTION TO UNIX

Basics of I/O Streams and File I/O

1 Description of The Simpletron

Lecture 3. Arrays. Name of array. c[0] c[1] c[2] c[3] c[4] c[5] c[6] c[7] c[8] c[9] c[10] c[11] Position number of the element within array c

Outline. Conditional Statements. Logical Data in C. Logical Expressions. Relational Examples. Relational Operators

Visual Logic Instructions and Assignments

TIMESHEET EXCEL TEMPLATES USER GUIDE. MS-Excel Tool User Guide

J a v a Quiz (Unit 3, Test 0 Practice)

Introduction to Python

Anatomy of Programming Languages. William R. Cook

3.GETTING STARTED WITH ORACLE8i

Running your first Linux Program

Not agree with bug 3, precision actually was. 8,5 not set in the code. Not agree with bug 3, precision actually was

Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA.

Lesson Share TEACHER S NOTES. Making arrangements by Claire Gibbs. Activity sheet 1. Procedure. Lead-in. Worksheet.

How To Port A Program To Dynamic C (C) (C-Based) (Program) (For A Non Portable Program) (Un Portable) (Permanent) (Non Portable) C-Based (Programs) (Powerpoint)

Application Unit, MDRC AB/S 1.1, GH Q R0111

One Day. Helen Naylor. ... Level 2. Series editor: Philip Prowse. Cambridge University Press One Day.

Important Tips when using Ad Hoc

EXCEL FINANCIAL USES

CISC 181 Project 3 Designing Classes for Bank Accounts

public static void main(string[] args) { System.out.println("hello, world"); } }

Sources: On the Web: Slides will be available on:

Grade 6 Math Circles. Binary and Beyond

1 Introduction. 2 An Interpreter. 2.1 Handling Source Code

Introduction to SQL for Data Scientists

Moving from CS 61A Scheme to CS 61B Java

Hands-On UNIX Exercise:

Time Clock Import Setup & Use

First Java Programs. V. Paúl Pauca. CSC 111D Fall, Department of Computer Science Wake Forest University. Introduction to Computer Science

If this PDF has opened in Full Screen mode, you can quit by pressing Alt and F4, or press escape to view in normal mode. Click here to start.

13 Classes & Objects with Constructors/Destructors

C++ INTERVIEW QUESTIONS

STEAM STUDENT SET: INVENTION LOG

Template and Daily Schedules

2 When is a 2-Digit Number the Sum of the Squares of its Digits?

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. CSC467 Compilers and Interpreters Fall Semester, 2005

The Windows Command Prompt: Simpler and More Useful Than You Think

Object-Oriented Programming in Java

Lesson 4 Web Service Interface Definition (Part I)

The Little Man Computer

Programming and Reasoning with Side-Effects in IDRIS

Nemo 96HD/HD+ MODBUS

Continuous Integration Part 2

On the M-BUS Interface you can add up to 60 M-BUS Devices (60 unit loads).

Cleo Host Loginid Import Utility and Cleo Host Password Encryption Utility

Processing transparencies - a step-by-step guide

Configuring the Scheduler

Detecting Pattern-Match Failures in Haskell. Neil Mitchell and Colin Runciman York University

TrendWorX32 SQL Query Engine V9.2 Beta III

Advanced Bash Scripting. Joshua Malone

Dynamic Programming Problem Set Partial Solution CMPSC 465

Chapter 6 - Condor finder

This session. Abstract Data Types. Modularisation. Naming modules. Functional Programming and Reasoning

Setting up a basic database in Access 2003

Table of Contents. Access this document and other HRIS information at Page 1

Lesson 26: Reflection & Mirror Diagrams

B) Mean Function: This function returns the arithmetic mean (average) and ignores the missing value. E.G: Var=MEAN (var1, var2, var3 varn);

Tom wants to find two real numbers, a and b, that have a sum of 10 and have a product of 10. He makes this table.

VLOOKUP Functions How do I?

GENERATIONS QUICK START GUIDE

Reading a 999 File and how to fix Rejections using ClinicPro s Reference Files

MAKER FOR VTIGER CRM

ACADEMIC TECHNOLOGY SUPPORT

Selecting Features by Attributes in ArcGIS Using the Query Builder

Informatica e Sistemi in Tempo Reale

User s Guide for the Texas Assessment Management System

I PUC - Computer Science. Practical s Syllabus. Contents

Excel 2007 Basic knowledge

Ummmm! Definitely interested. She took the pen and pad out of my hand and constructed a third one for herself:

LabVIEW Day 6: Saving Files and Making Sub vis

Chapter 15 Functional Programming Languages

How To Proofread

Welcome to GAMS 1. Jesper Jensen TECA TRAINING ApS This version: September 2006

3. Mathematical Induction

Lab 1: Introduction to C, ASCII ART and the Linux Command Line Environment

CSE 326: Data Structures. Java Generics & JUnit. Section notes, 4/10/08 slides by Hal Perkins

Creating Basic Excel Formulas

Introduction to SAS Informats and Formats

Performing Simple Calculations Using the Status Bar

Masters programmes in Computer Science and Information Systems. Object-Oriented Design and Programming. Sample module entry test xxth December 2013

Instructions for Labs

5. CHANGING STRUCTURE AND DATA

Programming Fundamentals. Lesson 20 Sections

I look forward to doing business with you and hope we get the chance to meet soon

Example of a Java program

Real SQL Programming. Persistent Stored Modules (PSM) PL/SQL Embedded SQL

Teaching Statement Richard A. Eisenberg

Employee Central Payroll Time Sheet

Sequential Program Execution

rma_product_return version BoostMyShop

Transcription:

A simple Monadic Parsec based parser in Haskell Mario Lang e-mail: mlang@delysid.org Revision: April 10, 2003 1

Abstract This program implements a simple validating parser for a timesheet ASCII file. It is implemented in Haskell, using the monadic parser combinator library Parsec. Contents 1 Introduction 3 1.1 Timesheet format........................... 3 2 Module and import definitions 3 3 Convenience 3 4 Representing and parsing time 4 4.1 Internal representation........................ 4 4.2 Parsing time............................. 5 5 A day of work 5 6 Putting things together 6 7 Main 6 8 Testing 8 9 Conclusion 8 9.1 Open Questions............................ 8 10 Building 9 2

1 Introduction The following module implements a timesheet validating parser written in Haskell. It uses Parsec a Monadic parser combinator library. This is the first program I wrote in Haskell, so it is much of a learning project for me. If you have any suggestions, please send them via e-mail. 1.1 Timesheet format The format of my personal timesheet is very simple: Mi 02.01.: 08:50-15:50-01:00 Do 03.01.: 09:45-17:30-00:15 Fr 04.01.: 09:50-17:40-00:10 Sa 05.01.: 14:00-22:30 +08:30 So =+07:05 Mo 07.01.: 10:10-17:30-00:40 Di 08.01.: 09:45-17:50 +00:05 Mi 09.01.: 11:00-17:00-02:00 Do 10.01.: 09:30-17:15-00:15 Fr 11.01.: 11:45-19:00-00:45 Sa So =+03:30 We assume that on a normal weekday (Monad through Friday) I have to work 8 hours. The last column of a line from Monad through Saturday is simply the remaining time for this day. On a Sunday, we have the remaining time as a whole. This makes the file easy to read and also easy to verify for correctness. I use this file-format personally to record how much I ve worked. The following code implements two goals at once: 1. Validate that the individual sums were entered correctly, and 2. sum up the whole timesheet and return the remaining time. 2 Module and import definitions module STUNDEN where import PARSEC import PARSECCHAR import MAYBE import MONAD(liftM ) 3 Convenience As we have some common characters in our file-format, we are going to define those here: dot,colon,comma :: CHARPARSER MINUTES CHAR dot = char. ; colon = char : ; comma = char, 3

4 Representing and parsing time Time in our specialized time-sheet format always looks like this: [+ ]hh:mm 4.1 Internal representation So, how can we represent this internally. We want to calculate with time. Well, it s basicly nothing else than Minutes since midnight: We define a newtype to be able to derive instances of this time: newtype MINUTES = MINUTES INT deriving (EQ,ORD) And now we define simple mathematical operations on Minutes: instance NUM MINUTES where (MINUTES a) + (MINUTES b) = MINUTES (a+b) (MINUTES a) (MINUTES b) = MINUTES (a b) (MINUTES a) (MINUTES b) = MINUTES (a b) frominteger = MINUTES. frominteger abs (MINUTES x ) = MINUTES (abs x ) negate (MINUTES x ) = MINUTES ( x ) signum (MINUTES x ) = MINUTES (signum x ) Also, it would be nice if we could use the function show on a value of type Minutes and get the representation we decided to use. instance SHOW MINUTES where show (MINUTES m) where hh m = zeropad (m `div` 60) mm m = zeropad (m `mod` 60) zeropad x m 0 = + : hh m + : + mm m m < 0 = - : hh ( m) + : + mm ( m) x < 10 = 0 : show x otherwise = show x To illustrate what the above code actually does, here is an example interactive session using our newly defined data constructor and show function. Call: Minutes 960 Result: +16:00 Call: Minutes 0 - Minutes 23 Result: -00:23 Call: Minutes (-99) Result: -01:39 4

4.2 Parsing time Well, to be able to parse our hh:mm format, we are going to define a two-digit parser first: (Note: One shouldn t really use read in Parsec-based programs. Any good alternative here?) number :: CHARPARSER MINUTES INT number = count 2 digit >>= return.read Now we can define a simple hh:mm parser. time :: CHARPARSER MINUTES MINUTES time = number >>= λh colon >> number >>= λm return (MINUTES (h 60+m)) A interval (09:00-17:00) indicates a time interval. duration :: CHARPARSER MINUTES MINUTES duration = time >>= λt 1 char - >> time >>= λt 2 return (t 2 t 1 ) And we also want to be able to have several intervals on one day, separated by a comma. durations :: CHARPARSER MINUTES MINUTES durations = liftm (duration `sepby ` comma) 5 A day of work A day is represented using the german two-letter initials. Lets first define a data type for internal use: data DAY = MO DI MI DO FR SA SO deriving (EQ,ORD,ENUM,SHOW) Now we define a parser which tries to match one of our two-letter days, and if it matches one, it returns it. dayofweek :: CHARPARSER MINUTES DAY dayofweek = foldr 1 (< >) (map isday [MO..]) where isday x = try (string (show x )) >> return (x ) A day of work looks like this: Mi 02.01.: 08:50-15:50-01:00 The final column is the sum of remaining time for this day. workday :: CHARPARSER MINUTES MINUTES workday = dayofweek >>= λdow space >> number >> dot >> number >> dot >> colon >> space >> durations >>= λdur 5

updatestate (+(( towork dow )+dur )) >> many 1 space >> getstate >>= λbalance (if dow = SO then string ( = + show balance) else string (show (( towork dow )+dur ))) >> many (oneof \t\n ) >> getstate >>= return where towork day day `elem` [MO.. FR] = 8 60 otherwise = 0 6 Putting things together The whole timesheet is a string composed of several lines. workentries :: CHARPARSER MINUTES MINUTES workentries = many workday >> eof >> getstate >>= return Finally we need a function to handle the final parser result, so that we can print error messages, or the remaining time if everything went OK. parsework :: STRING IO () parsework = putstrln. either error result. runparser workentries (MINUTES 0) where result min = ( Remaining time + show min +! ) error = ( +) parse error at. show 7 Main Very similar to C, a Haskell program also needs a Main module (and a main function) if you want to link it into an executable file. Our Main module is very simple, it only tries to figure out if a command-line argument was passed, and if yes, use that as filename for the input file. Here it is: module MAIN where import SYSTEM (getargs) import IO (hgetcontents, openfile, stdin, IOMODE(READMODE)) import STUNDEN (parsework ) main = getargs >>= λargs (if null args then hgetcontents stdin else openfile (args!!0) READMODE >>= λfh hgetcontents fh) 6

>>= parsework 7

8 Testing Now that we have implemented all necessary functions and data-types, lets put in a test-case to illustrate how the whole thing actually works. The following is a test-case on our program. First we show the given input data, below you can see the ouput the program produces. Input: Mi 02.01.: 08:50-15:50-01:00 Do 03.01.: 09:45-17:30-00:15 Fr 04.01.: 09:50-17:40-00:10 Sa 05.01.: 14:00-22:30 +08:30 So 06.01.: =+07:05 Output: Remaining time +07:05! Now this is fine, but how does the parser react if we give it incorrect data? Input: Mi 02.01.: 08:50-15:50-01:00 Do 03.01.: 09:45-17:30-00:15 Fr 04.01.: 09:50-17:40-00:10 Sa 05.01.: 14:00-22:30 +08:30 So 06.01.: =+07:00 Output: parse error at (line 5, column 24): unexpected 0 expecting =+07:05 9 Conclusion We clearly can see that the implicit error handling of Parsec is a real advantage here. Without ever explicitly specifying any kind of coloumn or line number handling, we get all this for free! 9.1 Open Questions It would be nice if we could implement non-terminating Warnings. For instance, the parsework function could only warn if the sum of a day was wrong, but fail (terminate) if the sum of a week is wrong. I didn t find a way yet to pass warnings on in the Parser Monad, because the return type is Either, and Left returns the final result of the parser, whereas Riight returns an ParserError type which doesn t contain the state of the parser as far as I understand. Any hints on how to easily implment warnings would be nice. 8

10 Building The following listing shows how to compile and build all the files related to this project. It is written using make, a standard UNIX tool. Listing: Makefile The building process in a nutshell # Some d e f a u l t v a l u e s HC = ghc HFLAGS = package t e x t HCFLAGS = O2 # We are going to b u i l d t h o s e f i l e s RESULTS = Stunden. ps Stunden. pdf Stunden # S t a r t o f b u i l d r u l e s : a l l : $ (RESULTS) Stunden : Stunden. o Main. o $ (HC) $ (HFLAGS) o $@ $+ s t r i p $@ Stunden. dvi : Stunden. l h s Main. l h s Makefile Stunden c l e a n : rm f. hi. o. inp. r e s $ (RESULTS) r e l e a s e : $ (MAKE) a l l rm f. hi. o cd.. & & t a r c z v f Stunden. t a r. gz Stunden # I m p l i c i t r u l e s f o r H a s k e l l compilatio n : %.o : %. l h s $ (HC) $ (HCFLAGS) $ (HFLAGS) o $@ c $< # I m p l i c i t r u l e s f o r pdf / ps g e n e r a t i o n : %. toc %. aux %. l o g %. inp %. r e s : %. l h s s h e l l e s c a p e=y l a t e x $< %. dvi : %. l h s %. toc s h e l l e s c a p e=y l a t e x $< %. ps : %. dvi dvips $< %. pdf : %. dvi dvipdf $< # make s p e c i f i c s e t t i n g s 9

.PRONY: c l e a n r e l e a s e.intermediate : Stunden. aux Stunden. dvi Stunden. inp Stunden. l o g Stunden. r e s # DO NOT DELETE: Beginning o f H a s k e l l dependencies Stunden. o : Stunden. l h s Main. o : Main. l h s Main. o : Stunden. hi # DO NOT DELETE: End o f H a s k e l l dependencies 10