Lab3: Dictionary Array


Due Date: Saturday, 14 Feb 2009 by midnight

Background:
In Lab2, we learned how to use a static 2D array of characters with a maximum size of around 172,000x32. The space required to store this array is around 5MB, definitely a waste of memory when most files hold fewer than 172,000 words and most words are much shorter than 32 characters. In Lab3, we will implement the same dictionary structure, but via a dynamically allocated array of char pointers, each pointing to a dynamically allocated string. You will apply all that you have learned so far about arrays, pointers, passing pointers, malloc and free in this lab.

WARNING: This is the first time you will be dealing with dynamic pointers. This program will definitely take more than one afternoon of coding. Start early so that you have time to work through all the debugging issues.

What You'll Need
You will need lab3.c, dictlib.h, dictlib.c, dictlib-solution.o and the solution executable to get started. You can copy them from the download site at /afs/andrew/course/15/123/downloads/lab3

You will need to carefully study the header file, dictlib.h, and implement its functions in dictlib.c for part 2 of the assignment. We will develop some demo code in class and recitations. It is important to understand that all the strings are hanging from the pointers in the array. Also pay attention to the syntax. Notice that when we wish to assign something into one of the pointers in the array we use (*array)[index] and that we MUST parenthesize (*array). Why?

Start with the small input files first (in particular, use 10-words.txt to make sure basic functionality works before going on to write the double-capacity function, which you can then test with 105-words.txt and beyond).

Downloading Files:
You can download the files from /afs/andrew/course/15/123/downloads/lab3

Assignment:
This assignment is divided into two parts. In part 1, you will develop the main program that works with the provided dictlib-solution.o. This will give you the opportunity to develop the main program by using our dictionary library. First run our sample solution executable by following the directions below.

% chmod +x solution
% ./solution input/inputfile output/outputfile

This will display the menu. Try a few things to see how they work. You may want to test this with the small file 10-words.txt so you can manually inspect the output.

Now develop your main program (lab3.c) to do the same thing. Understand how to use each function listed in dictlib.h and develop the main program. You don't need to have dictlib.c yet. We will test your main with the provided dictlib-solution.o.

Your program must do the following.

1. Load the Dictionary
The loadarray() function will open the input file, read in all the strings, and then close the input file. Your load must also update the wordcount. As you read in words from the file, keep the dictionary in sorted (lexical) order at all times. You may not load the entire array in original order and then call a sort afterwards. Keeping an array sorted in this manner is called insert-in-order.

2. Develop the Menu program
After loading the dictionary into the array, you must call a menu() function that offers the user a list of operations to interact with the dictionary. The menu offered to the user should look something like this.

Choose: 'P'rint, 'S'earch, 'I'nsert, 'R'emove, 'C'ount, 'Q'uit :

The meanings of the options are:

'P'rint : A little bit of formatting is required here: call printarray() to print as many words as will fit on an 80-char line, separated by spaces, without going over 80 chars; then go to the next line and repeat until all the words in the dictionary are printed. You may want to use a temporary buffer to do this.

'S'earch : Prompt for a word, e.g., foo, then call searchforword(). After returning from the call to searchforword() print something like: "foo found" OR "foo NOT found".

'I'nsert : Prompt for a word and insert it into the dictionary at the proper position to maintain order. Use insertword(). Do not store duplicate words in the dictionary. This option should report back to the user something like "foo inserted" OR "foo ignored (duplicate)".

'R'emove : Prompt for a word, then call removeone() to delete it. This option should report back to the user something like "foo removed" OR "foo ignored (not found)". Memory allocated for the word must be freed.

'C'ount : Print the current wordcount to the console.

'Q'uit : Print a message that the program is ending and that the dictionary will be sent to the output file specified on the command line. Then call savearray() to save the contents of the dictionary to the output file. It is to be saved in exactly the same format as the input file, i.e., one word per line.

You need to develop the menu and test it with the sample library dictlib-solution.o. To compile your code use:

% gcc -ansi -pedantic -Wall lab3.c dictlib-solution.o -o mysolution
% ./mysolution input/inputfile output/outputfile

This should behave exactly like our solution executable. Now you are ready to move on to part 2.

Part 2
In this part, you will develop your own dictlib.c file that will produce the same outputs as dictlib-solution.o. The following functions must be developed.

1. /* loading from the input file */
   int loadarray(char *infilename, char ***array, int *count, int *capacity); [10 pts]
2. /* searching for a specific word */
   int searchforword(char **array, int count, char *response); [10 pts]
3. /* menu */
   int menu(char ***array, int *count, int *capacity); [10 pts]
4. /* inserting a new word */
   int insertword(char ***array, int *count, int *capacity, char word[]); [10 pts]
5. /* removing word */
   int removeone(char **array, int *wordcount, char word[]); [10 pts]
6. /* saving the dictionary to a file */
   void savearray(char *filename, char **array, int count); [10 pts]
7. /* double size the array if there isn't enough space */
   void darray(char ***array, int count, int *capacity); [10 pts]
8. /* free the entire array. We will be testing all free with valgrind */
   void freeall(char **array, int count); [10 pts]
9. /* print the array and all its entries */
   void printarray(char **array, int count); [10 pts]
10. Style points (style points are based on indentation, proper use of variable names, the structure of your program, handing in the proper files, etc. Your TA can provide more guidance on this. Please ask.) [10 pts]

DO NOT change the function prototypes, as they will be tested from automated scripts.

Main requirements for Lab 3
Do not use any additional data structures in your attempt to be clever/fast on the load.

You should document your code as best you can. You should especially document anything that is "clever" or unusual.

Do not write 2 different insert() functions. Just write one function that does not care whether the string came from the input file or from the user in the 'I' menu function. The insert function itself should not print the "inserted" / "ignored (duplicate)" message. Let it return a value and check the value after the call. This avoids console output during the initial load.

Do not use strcpy to shuffle the words during insertion. Instead, copy the pointers!

You must NEVER create any garbage (leaked memory). We will be using Valgrind.

You will note that most function prototypes take addresses of variables from the calling program and manipulate the content directly. As such you have to deal with many * (a pointer), ** (a pointer to a pointer OR an array of pointers) or *** (an address of an array of pointers). It is important to learn how to dereference the various *s.
Dereferencing an int* yields an int, dereferencing an int** yields an int*, and dereferencing an int*** yields an int** (or an array of int*s). Be sure to come to class so you can learn all about *, **, and ***.

Unlike the previous assignment, we do not allocate memory in advance. Memory is allocated as needed. Your initial allocation for the array should be space for 50 pointers. If the array ever gets full (i.e., wordcount equals the current capacity) and there's a word to insert, you will double the capacity of the array (i.e., malloc a new array of twice the current size, copy the string pointers over into the front half of the newly allocated array, and free the space used by the old array). This is essentially what a growable Java collection such as ArrayList does when its backing array runs out of room. It is very important to free the old memory; otherwise, your program may run out of memory and seg fault. You are allowed to use other functions like calloc or realloc. Use Valgrind to make sure you have no memory leaks from this program.

Program Management through Source File Decomposition
This is our first C program where we solve the problem using multiple source (*.c) and header (*.h) files. You MUST break up your source code into a main file and a pair of .h/.c files (e.g., dictlib.h and dictlib.c). You cannot change the prototypes given in dictlib.h. The main file should only have the includes at the top and the main function below. All other function definitions should be in a separate .c file and their corresponding prototypes in a separate .h file (which must be protected correctly with an #ifndef include guard).

Compiling Code
You MUST compile your source files separately. To compile the main file, lab3.c, type

> gcc -c -ansi -Wall -pedantic lab3.c

This will create the object file lab3.o. The flags -Wall and -pedantic ask the compiler to list all compiler warnings. Be sure to remove all warnings before submission. To compile dictlib.c type

> gcc -c -ansi -Wall dictlib.c

This will create the object file dictlib.o. Now to create the executable (called exec) you can type

> gcc -o exec lab3.o dictlib.o

We strongly encourage you to create a makefile to automate the program management. We will discuss makefiles in class/recitations.

Data Files
We are using the same data files as in Lab2:

10-words.txt (test file of 10 words - use this first!)
105-words.txt (test file of 105 words)
42K-words.txt
172K-words.txt (DON'T use this one until you are SURE you're done!)

Testing the program
You MUST test your program by typing ./exec inputfile outfile. After the input file is read, the program will display the menu until the user types 'Q'.

Testing for memory leaks
Run valgrind after you are done with everything.

% gcc -g -Wall -pedantic -ansi dictlib.c lab3.c
% valgrind --tool=memcheck --leak-check=full ./a.out inputfile outputfile

Be sure that you eliminate all "definitely lost" errors from the valgrind output.

Grading your program
Your program will be graded as follows. The following grading criteria are strictly enforced for ALL assignments.

1. A program that does not compile - 0 points
2. A program that is 0-24 hours late - max grade is 50%
3. A program that is 25-48 hours late - max grade is 20%
4. A program that is more than 48 hours late - 0 points
5. Only one late can be used per assignment (you have a total of 3 for the semester)

See feedback.txt in the download folder to see the grading criteria used.

Handing in your Solution
For this assignment, all three files (lab3.c, dictlib.c, dictlib.h) should be in a zip file. Create the zip file first:

> zip lab3.zip lab3.c dictlib.c dictlib.h

Then submit your zip file with:

> cp lab3.zip /afs/andrew.cmu.edu/course/15/123/handin/lab3/yourid