Lab3: Dictionary Array
Due Date: Saturday, 14 Feb 2009 by midnight

Background:
In Lab2, we learned how to use a static 2D array of characters of max size around 172,000x32. The space required to store this array is around 5MB, definitely a waste of memory when most files are smaller than 172,000 words and most words are way smaller than 32 characters. In Lab3, we will implement the same dictionary structure, but via a dynamically allocated array of char pointers, each pointing to a dynamically allocated string. You will apply all that you have learned so far about arrays, pointers, passing pointers, malloc and free in this lab.

WARNING: This is the first time you will be exposed to dealing with dynamic pointers. You must start this program early to make sure you can take care of all the debugging issues early on. This program will definitely take more than one afternoon of coding. Get started early.

What You'll Need
You will need lab3.c, dictlib.h, dictlib.c, dictlib-solution.o and solution (the executable) to get started. You can copy them from the download site at /afs/andrew/course/15/123/downloads/lab3

You will need to carefully study the header file, dictlib.h, and implement those functions in dictlib.c for part 2 of the assignment. We will develop some demo code in class and recitations. It is important to understand that all the strings are hanging from the pointers in the array. Also pay attention to the syntax. Notice that when we wish to assign something into one of the pointers in the array we use (*array)[index], and that we MUST parenthesize (*array). Why?

Start with the small input files first. In particular, you should use 10-words.txt to make sure basic functionality works before going on to write the double-capacity function, which you can then test with 105-words.txt and beyond.
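On the question of why (*array) must be parenthesized: in C, the subscript operator [] binds more tightly than the unary * operator. The sketch below illustrates the difference. The helper store_word is a hypothetical name used only for this illustration; it is not one of the functions in dictlib.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of dictlib.h): store a private copy of
   word in slot index of the caller's array. The parameter array is the
   ADDRESS of the caller's char ** variable. Returns 1 on success and
   0 if malloc fails. */
int store_word(char ***array, int index, const char *word)
{
    char *copy;

    assert(array != NULL && *array != NULL && word != NULL);

    copy = malloc(strlen(word) + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return 0;
    strcpy(copy, word);

    /* [] binds tighter than unary *, so *array[index] would parse as
       *(array[index]). That indexes off the address of the caller's
       single variable and reads memory that is not ours whenever
       index is not 0. The types still match, so the compiler gives
       no warning. (*array)[index] first follows array to the
       caller's char **, and only then selects slot index. */
    (*array)[index] = copy;
    return 1;
}
```

A caller that owns a char **dict with room for at least three pointers could write store_word(&dict, 2, "cherry") and would then find the copied string at dict[2].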
Downloading Files:
You can download files from /afs/andrew/course/15/123/downloads/lab3

Assignment:
This assignment is divided into two parts. In part 1, you will develop the main program that works with the provided dictlib-solution.o. This will give you the opportunity to develop the main program by using our dictionary library.

First run our sample solution executable. To run the solution executable, follow the directions below.

    % chmod +x solution
    % ./solution input/inputfile output/outputfile

This will display the menu; try a few options to see how they work. You may want to test this with a small file such as 10-words.txt so you can manually inspect the output.

Now develop your main program (lab3.c) to do the same thing. Understand how to use each function listed in dictlib.h and develop the main program. You don't need to have dictlib.c yet. We will test your main with the provided dictlib-solution.o

Your program must do the following.

1. Load the Dictionary
The loadarray() function opens the input file, reads in all the strings, and then closes the input file. Your load must also update the wordcount. As you read in words from the file, keep the dictionary in sorted (lexical) order at all times. You may not load the entire array in original order and then call a sort afterwards. Keeping an array sorted in this manner is called insert-in-order.

2. Develop the Menu program
After loading the dictionary into the array, you must call a menu() function that offers the user a list of operations to interact with the dictionary. The menu offered to the user should look something like this.
    Choose: 'P'rint, 'S'earch, 'I'nsert, 'R'emove, 'C'ount, 'Q'uit :

The meanings of the options are:

'P'rint : A little bit of formatting is required here. Call printarray() to print out as many words as will fit on an 80-character line, separated by a space, without going over 80 characters. Then go to the next line and repeat until all the words in the dictionary are printed. You may want to use a temporary buffer to do this.

'S'earch : Prompt for a word, e.g., foo, then call searchforword(). After returning from the call to searchforword(), print something like "foo found" OR "foo NOT found".

'I'nsert : Prompt for a word and insert it into the dictionary at the proper position to maintain order. Use insertword(). Do not store duplicate words in the dictionary. This option should report back to the user something like "foo inserted" OR "foo ignored (duplicate)".

'R'emove : Prompt for a word, then call removeone() to delete it. This option should report back to the user something like "foo removed" OR "foo ignored (not found)". Memory allocated for the word must be freed.

'C'ount : Print the current wordcount to the console.

'Q'uit : Print a message that the program is ending and that the dictionary will be sent to the output file specified on the command line. Then call savearray() to save the contents of the dictionary to the output file. It is to be saved in exactly the same format as the input file, i.e., one word per line.

You need to develop the menu and test it with the sample library dictlib-solution.o. To compile your code use:

    % gcc -ansi -pedantic -Wall lab3.c dictlib-solution.o -o mysolution
    % ./mysolution input/inputfile output/outputfile

This should behave exactly like our solution executable. Now you are ready to move on to part 2.

Part 2
In this part, you will develop your own dictlib.c file that produces the same outputs as dictlib-solution.o. The following functions must be developed.

1. /* loading from the input file */
   int loadarray(char *infilename, char ***array, int *count, int *capacity); [10 pts]

2. /* searching for a specific word */
   int searchforword(char **array, int count, char *response); [10 pts]

3. /* menu */
   int menu(char ***array, int *count, int *capacity); [10 pts]

4. /* inserting a new word */
   int insertword(char ***array, int *count, int *capacity, char word[]); [10 pts]

5. /* removing word */
   int removeone(char **array, int *wordcount, char word[]); [10 pts]

6. /* saving the dictionary to a file */
   void savearray(char *filename, char **array, int count); [10 pts]

7. /* double size the array if there isn't enough space */
   void darray(char ***array, int count, int *capacity); [10 pts]

8. /* free the entire array. We will be testing all free with valgrind */
   void freeall(char **array, int count); [10 pts]

9. /* print the array and all its entries */
   void printarray(char **array, int count); [10 pts]

10. Style points (style points are based on indentation, proper use of variable names, structure of your program, handing in the proper files, etc. Your TA can provide more guidance on this. Please ask.) [10 pts]

DO NOT change the function prototypes, as they will be tested by automated scripts.

Main requirements for Lab 3

Do not use any additional data structures in your attempt to be clever/fast on the load.

You should document your code as best you can. You should especially document anything that is "clever" or unusual.

Do not write 2 different insert() functions. Just write one function that does not care whether the string came from the input file or from the user in the 'I' menu function. The insert function itself should not write the inserted / ignored message. Let it return a value and check the value after the call. This avoids console output during the initial load.

Do not use strcpy to shuffle the words during insertion. Instead, copy the pointers!

You must NEVER create any garbage. We will be using Valgrind.

You will note that most function prototypes take addresses of variables from the calling program and manipulate the content directly. As such you have to deal with many * (a pointer), ** (a pointer to a pointer OR array of pointers) or *** (an address of an array of pointers). It is important to learn how to dereference the various *s.
For example, dereferencing an int* leads to an int, dereferencing an int** leads to an int*, and dereferencing an int*** leads to an int** (or array of int*s). Be sure to come to class so you can learn all about *, **, and ***.

Unlike the previous assignment, we do not allocate memory in advance. Memory is allocated as needed. Your initial allocation for the array should be space for 50 pointers. If the array ever gets full (i.e., wordcount equals current capacity) and there's a word to insert, you will double the capacity of the array (i.e., malloc a new array of twice the current size, copy the string pointers over into the front half of the newly allocated array, and free the space used by the old array). This is similar to what resizable collections in Java do when they run out of room. It is very important to free the old memory; otherwise, your program may run out of memory and segfault. You are allowed to use other functions like calloc or realloc. Use Valgrind to make sure you have no memory leaks in this program.

Program Management through Source File Decomposition
This is our first C program where we try to solve the problem by using multiple source (*.c) and header (*.h) files. You MUST break up your source code into a main file and a pair of .h / .c files (e.g., dictlib.h and dictlib.c). You cannot change the prototypes given in dictlib.h. The main file should only have the includes at the top and the main function below. All other function definitions should be in a separate .c file and their corresponding prototypes in a separate .h file (which must be protected correctly with an #ifndef).

Compiling Code
You MUST compile your source files separately. To compile lab3.c type

    > gcc -c -ansi -Wall -pedantic lab3.c

This will create the object file lab3.o. You are also asking the compiler to list all compiler warnings by using the flags -Wall and -pedantic. Be sure to remove all warnings before submission. To compile dictlib.c type

    > gcc -c -ansi -Wall dictlib.c

This will create the object file dictlib.o. Now to create the executable (called exec) you can type

    > gcc -o exec lab3.o dictlib.o

We strongly encourage you to create a makefile to automate the program management. We will discuss makefiles in class/recitations.
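Looking back at the capacity-doubling rule described earlier (allocate a new array of twice the size, copy the pointers into its front half, free the old array), the steps could be sketched roughly as follows. This is only an illustration under stated assumptions: grow_array is a hypothetical name that does not appear in dictlib.h, and returning 0 when malloc fails is a choice made for this sketch, whereas the required darray() prototype returns void.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of dictlib.h): double the capacity of
   the pointer array. Returns 1 on success, or 0 if malloc fails, in
   which case the caller's array and capacity are left untouched. */
int grow_array(char ***array, int count, int *capacity)
{
    char **bigger;
    int i;

    assert(array != NULL && capacity != NULL && *capacity > 0);

    bigger = malloc(2 * (size_t)(*capacity) * sizeof *bigger);
    if (bigger == NULL)
        return 0;

    /* Copy the POINTERS, not the characters: every string stays where
       it already lives on the heap. */
    for (i = 0; i < count; i++)
        bigger[i] = (*array)[i];

    free(*array);        /* release only the old pointer array */
    *array = bigger;     /* the caller's variable now sees the new one */
    *capacity *= 2;
    return 1;
}
```

Only the old array of pointers is freed here. The strings themselves are still reachable through the new array, so freeing them at this point would be a bug rather than a cleanup.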
Data Files
We are using the same data files as in Lab2:

    10-words.txt (test file of 10 words - use this first!)
    105-words.txt (test file of 105 words)
    42K-words.txt
    172K-words.txt (DON'T use this one until you are SURE you're done!)

Testing the program
A general command to test the program looks as follows. You MUST test your program by typing ./exec inputfile outfile. After the input file is read, the program will display the menu until the user types 'Q'.

Testing for memory leaks
Run valgrind after you are done with everything.

    % gcc -g -Wall -pedantic -ansi dictlib.c lab3.c
    % valgrind --tool=memcheck --leak-check=full ./a.out inputfile outputfile

Be sure that you remove all "definitely lost" error messages from the valgrind output.

Grading your program
Your program will be graded as follows. The following grading criteria are strictly enforced for ALL assignments.

1. A program that does not compile - 0 points
2. A program that is 0-24 hours late - max grade is 50%
3. A program that is 25-48 hours late - max grade is 20%
4. A program that is more than 48 hours late - 0 points
5. Only one late day can be used per assignment (you have a total of 3 for the semester)

See feedback.txt in the download folder to see the grading criteria used.

Handing in your Solution
For this assignment, all three files (lab3.c, dictlib.c, dictlib.h) should be in a zip file. Create the zip file first.

    > zip lab3.zip lab3.c dictlib.c dictlib.h

Then copy it to your handin directory to submit your zip file.

    > cp lab3.zip /afs/andrew.cmu.edu/course/15/123/handin/lab3/yourid