Computer Programming in Perl: Internet and Text Processing




Instructor: Dmitriy Genzel
June 28 - July 16

1 Course Description

This course will teach you how to program in Perl, the programming language behind much of the web functionality you use daily. The beginning of the course will be spent learning the language, and the rest of it applying the language to various tasks both offline and online, with a special focus on web tasks and text processing.

Possible projects include developing web forums, search engines, programs that can execute other programs, programs that appear to be intelligent (e.g., chatbots), and network and graphical applications (perhaps games). You will choose some of the projects yourself.

The purpose of this class is to show you what programming is all about (namely, having fun), what can be done in Perl, how to do it, and how to learn more about Perl on your own. The course also provides an introduction to computer science.

Previous programming experience is recommended. An ability to think logically, solve problems, and learn fast is fundamental. This course is for those seriously interested in learning programming and Perl; this is not a gentle introduction to computers. Enrollment is limited to 20 students.

2 Goals and Objectives

The goals of the course are:
1. To serve as an introduction to programming, Perl, and Computer Science
2. To explore in depth the topics of interest to you
3. To make you interested in exploring programming on your own

By the end of the course, you should (in order of decreasing importance):
1. Be able to program: Given a task, know how to perform it
2. Know some Perl: Be able to read Perl code
3. Know about various Perl resources (primarily Perl modules at CPAN): Be able to use them on your own
4. Have a general notion of Computer Science and its subfields: Be able to name at least five fields and name important problems they deal with
5. Know a little about software engineering: Be able to write a 1000-line program you can read six months from now
6. Know about other programming languages and how they compare to Perl: Know when to use or not to use Perl
7. Acquire some non-programming skills relevant to learning CS: Be able to write and present in a comprehensible way

3 Learning Format

There will be two kinds of sessions: lectures and supervised labs. You will also attend labs on your own to do your assignments. The lectures will be used to present course materials useful to you and also for your presentations; labs will be used to help you with your home assignments/projects and to get some programming experience under the instructor's eye. In addition, I will be in the lab during my office hours while you do your homework.

Some of the lectures are designated as "Special topics", which means that during these lectures I want to cover topics of interest to you. Please let me know what you want me to cover as soon as possible. If you offer no suggestions, I will cover the topics given in parentheses after the lecture name. See the schedule for details.

I will hold office hours for three hours after the afternoon section (2:30-5:30pm).

4 Readings

One does not learn how to program (or even learn a particular programming language) by reading about it. However, it is a good idea to read about something before you try it, even if your inclination is to jump right in. Also, you may prefer to use the book to look something up, even though real hackers use online documentation for this. Most books are (also) available online for those on the Brown campus. See the course webpage or the Brown library for information.

We will use the following readings:

Required: Learning Perl by Randal L. Schwartz and Tom Phoenix, 3rd edition, known as the Llama book. Buy at the bookstore or use online (not recommended).
Required: Perl manpages. Available on your system. Type perldoc perl at the command prompt.
Recommended: Programming Perl by Larry Wall, Tom Christiansen, and Jon Orwant, 3rd edition, known as the Camel book. Buy at the bookstore if you plan to use Perl outside of class, or use online while at Brown.
Recommended: CPAN: Comprehensive Perl Archive Network. Online, at http://www.cpan.org/
Useful: Perl Cookbook by Tom Christiansen and Nathan Torkington, known as the Cookbook. Find on reserve in the library or use online.
Useful: Mastering Perl/Tk by Stephen Lidie and Nancy Walsh, not really known as anything other than the title. Use online.

You are expected to read Learning Perl as we go along, but it is not required if you are very confident that you don't need it. The material in the book is often complementary to the lectures and you will find it very useful. Do not be scared if you are asked to read three chapters in a day. Simply skim them; I will not cover all the material there.

Programming Perl is the Perl Bible and should be used as a primary reference if you prefer a dead-tree form rather than the digital one. For digital documentation use the man/perldoc pages.

Perl Cookbook is a collection of recipes for common tasks. If you have a task that seems common to you, like sorting a list, opening a socket, or listing a directory, you will find a code snippet to do it there. Make sure you understand how it works before you use it, though!

Mastering Perl/Tk is a book about Tk, a GUI library for Perl. We will use it occasionally in class, and it will be very useful if you decide to do a GUI final project.

CPAN should be used for module documentation (although you should first check the man pages and HTML documentation on your machine).

I will also make the lectures available online immediately following the class, so you can consult them. This does not mean, however, that you can skip the classes; there's more to the class than just the slides. Many of the classes won't be lectures anyway. I will check attendance.

5 Assignments: General

The only way to learn to program and to learn a new programming language is to actually write programs. Therefore, the primary kind of assignments will be programming projects. They will gradually increase in difficulty, culminating in a final project (chosen by you) of significant complexity. Please see the appendix for some suggested projects. One of the major points of this course is for you to have fun while programming, and to create a major piece of software which is worth being proud of.

There will also be a few non-programming assignments. The purpose of these is to provide some breadth to your experience. Whether you like it or not (I don't), computer scientists and programmers need to be able to write and present clearly. You will write one short paper and give at least one presentation and a final project demo. The paper will involve some research.

There will be no group projects (except possibly the final one, if you convince me). This means that the work you submit should be your own. You are welcome to talk to other students and ask their advice, but please don't copy their code.

There will also be discussion questions due the next day. I will ask one or two of you to discuss with me or another student the question I assigned. I will ask for volunteers first, but everyone will go through it, so it is in your interest to volunteer if you have something to say on the topic.

All assignments will be due at 10am on the due date.

There will be no tests. The evaluation will be based on your assignments. I will provide a numeric grade based on the following (for programming assignments):
1. The program produces no syntactic or other Perl error for any user input
2. The program solves the problem
3. The solution is the most efficient possible
4. The program is written in a good style and is easy to read
5. Significant effort was made, or an improvement in quality (compared to previous assignments) was accomplished

The final evaluation (there is no grade) will be based on (in order of decreasing importance):
1. Homework (programming)
2. Final project
3. Attendance
4. Non-programming assignments

More weight will be given to the later homeworks.

6 Doing Homework

Normally lectures won't take the whole class.
This is intentional, since you learn by doing and lecturing takes time away from that. So when the lecture part is finished we will automatically switch into lab mode and you will start doing your homework. This is why there will be a lot of homework. I hope that even the brightest among you will not be able to finish all of it before running out of time. I expect you to do as much homework as you are able, and I expect your abilities to increase very fast. You will mostly be doing homework during the office hour period, since this is when the lab is open and I am around to answer questions.

7 Assignment Listing

All homework will involve book exercises for the chapter(s) we covered that day. In addition, the following problems will be included (many are optional):

Short programming assignments (due next day):
S1. Taxes, ASCII graphics
S2. Instant run-off voting
S3. HTML, text processing
S4. CGI comments form
S5. Regular expressions
S6. References: trees
S7. Simple calculator

Medium programming assignments (due in two days):
For the following assignments you are expected to submit a status report (how far along you are, etc.) or the actual things you got to work the day after the assignment is distributed. The assignment will provide details.
M1. A chat bot
M2. Paint program
M3. Choose one of the following:
    GUI: A calendar application
    Web: A web forum
    Net: Mirroring software

Final project-related assignments:
F1. Preliminary proposal. Short description of the proposed project. May be revised until F2 is submitted.
F2. Detailed proposal. Includes a list of features to be implemented (basic and optional). Needs to be approved before the work is started.
F3. Early status check. Short report on what's done so far. Request for change in functionality.
F4. Mid-project deadline. Submit code. 3/4 of basic features should be implemented. This is followed by a meeting with the customer to discuss.
F5. Final status update. Short report on what's working and what's not. 95% of basic features should be implemented.
F6. Final demo. 15-minute presentation in front of the class.

Non-programming assignments:
N1. Write a short research paper (3-4 pages) on a general topic related to CS. The topic will be assigned to you, but you will have some latitude to change it if you really hate it. I may ask you to present this (instead of N2) if it is especially good or bad.
N2. Prepare a presentation on some technical problem you faced and how you solved it. For example, describe how to use some CPAN module to accomplish a particular task. This may be waived for some people on the basis of N1 (if they presented).

All presentations should be no more than 15 minutes (including time for questions).

8 Schedule [tentative]

In the Date column, (m) means morning section and (a) means afternoon section. Items in the Out and Due columns refer to assignment numbers. Items in the Read column refer to chapters in Learning Perl which are covered by that class.

Date          Description                                   Read        Out     Due

Week 1
June 28 (m)   Course goals, syllabus, introduction to Perl
        (a)   Scalar data                                   Ch. 2       S1
June 29 (m)   Lists and arrays                              Ch. 3       S2      S1
        (a)   Subroutines                                   Ch. 4       F1
June 30 (m)   Hashtables, Input/Output                      Ch. 5-6     S3      S2
        (a)   HTML                                                      N1
July 1 (m)    Files and directories                         Ch. 11-13   S4      S3
        (a)   Modules, basic CGI
July 2 (m)    Regular expressions                           Ch. 7       S5      S4
        (a)   Regular expressions (cont.)                   Ch. 8-9     M1
July 3 (m)    References; Final project showcase                        S6      S5, N1-src
        (a)   no class

Week 2
July 6 (m)    More control structures, strings, sorting     Ch. 10, 15  S7      M1, S6
        (a)   Perl/Tk                                                   F2      F1
July 7 (m)    Perl/Tk (cont.)                                           M2      S7
        (a)   Lab                                                               N1
July 8 (m)    Simple databases; DBI, DBM                    Ch. 16              F2
        (a)   Lab
July 9 (m)    Internet tools, processes, advanced topics    Ch. 14, 17  M3      M2
        (a)   Presentations for assignment N1

Week 3
July 12 (m)   Special topics (Databases)                                F3, N2  M3
        (a)   Lab: special topics
July 13 (m)   Special topics (Perl objects)                             F4      F3
        (a)   Lab: special topics
July 14 (m)   FP: meeting with the customer                             F5      F4
        (a)   FP: meeting with the customer                                     N2 topic
July 15 (m)   Lab: help with final project                              F6      F5
        (a)   Presentation of N2                                                N2
July 16 (m)   Demo for final projects                                           F6
        (a)   Demo for final projects

The July 3 (Saturday) class is optional, but recommended.

9 Contact Information

Dmitriy Genzel
Phone (office): 401-863-7672
Email: dg@cs.brown.edu
Office: CIT Room 551
Address: Box 1910, Brown University, Providence, RI 02912
Class webpage: http://www.cs.brown.edu/ dg/summer04/
TA: Haruyoshi Sakai, hsakai@cs.brown.edu

(See the appendix for potential projects.)

Appendix: The list of potential projects

Some ideas for possible projects:

Obvious:
- An extension of any earlier project (including N1)
- A web forum (bulletin board)
- A blog
- A simple game (e.g., Tetris, puzzle, Life)
- A chat client (GUI or non-GUI)
- A calendar GUI application
- A music player
- A music collection organizer
- A chess program that lets two people play

Less obvious: Web/Internet:
- A search engine
- A web proxy (anonymizer, ad blocker, etc.)
- A web server
- A P2P secret chat network

Less obvious: Algorithms:
- Some image manipulation task (e.g., find borders, etc.)
- Some crypto task (e.g., substitution ciphers, http://sicp.ai.mit.edu/fall-2002/)
- Spam filter: perceptron or something else
- LZ compression
- Implementing some paper in NLP
- Text processing, concordances, like http://www.opensourceshakespeare.org/
- Word (or sentence) alignment for machine translation
- Interpreter for LOGO

Less obvious: Simulation:
- Simulated societies (Sugarscape)
- Looking up at the stars
- Physics simulation (gravity (solar system), any force (field lines))
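
To give a flavor of the text-processing ideas above, here is a minimal sketch of a concordance in Perl: it records, for each word, the lines on which the word appears. This is only an illustration, not a course-provided solution. The subroutine name build_concordance, the word pattern, and the sample lines are all assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map each lowercased word to the list of
# line numbers (1-based) on which it occurs.
sub build_concordance {
    my @lines = @_;
    my %where;
    my $line_no = 0;
    for my $line (@lines) {
        $line_no++;
        my %seen;    # record each word at most once per line
        for my $match ($line =~ /([A-Za-z']+)/g) {
            my $word = lc $match;
            push @{ $where{$word} }, $line_no unless $seen{$word}++;
        }
    }
    return \%where;
}

# Sample input chosen for illustration only.
my @text = (
    "To be, or not to be, that is the question:",
    "Whether 'tis nobler in the mind to suffer",
);
my $concordance = build_concordance(@text);

# Print each word followed by the line numbers where it was found.
for my $word (sort keys %$concordance) {
    printf "%-10s %s\n", $word, join(", ", @{ $concordance->{$word} });
}
```

A fuller project would probably read from files, keep some surrounding context for each hit, and treat punctuation and apostrophes more carefully than this simple pattern does.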