CS 246 Winter 2016 Assignment 3 Instructors: Peter Buhr and Rob Schluntz Due Date: Monday, March 7, 2016 at 22:00

March 1, 2016

This assignment examines intermediate-level C++ and classes. Use it to become familiar with these facilities, and ensure you use the specified concepts in your assignment solution, i.e., writing a C-style solution for questions is unacceptable, and will receive little or no marks. (You may freely use the code from these example programs.)

1. Given the C++ program in Figure 1, compile the program with and without preprocessor variable DYN defined.

    $ g++ -DDYN new.cc
    $ g++ new.cc

Compare the two versions of the program with respect to performance by doing the following for each version:

- Run the program and time the execution using the time command:

    $ /usr/bin/time -f "%Uu %Ss %E" ./a.out
    3.21u 0.02s 0:03.32

  (Output from time differs depending on the shell, so use the system time command.) Compare the user time (3.21u) only, which is the CPU time consumed solely by the execution of user code (versus system and real time).
- Use the program command-line argument (if necessary) to adjust the number of times the experiment is performed to get user times approximately in the range 0.1 to 100 seconds. (Timing results below 0.1 seconds are inaccurate.) Use the same command-line value for all experiments.
- Run both the experiments again after recompiling the programs with compiler optimization turned on (i.e., compiler flag -O2).

    $ g++ -O2 -DDYN new.cc
    $ g++ -O2 new.cc

- Include 4 timing results to validate the experiments.
- Explain the relative differences in the timing results with respect to stack and dynamic allocation.
- State the performance difference when compiler optimization is used.
- Explain the use of 0 instead of NULL to initialize a pointer. (Hint: change the 0 to NULL, comment out the #include, and compile the program.)
- Explain why the call to delete with an address of 0 does not produce an error.

2. Write a C++ program to verify a string of bytes is a valid Unicode Transformation Format 8-bit character (UTF-8). UTF-8 allows any universal character to be represented while maintaining full backwards-compatibility with ASCII encoding, which is achieved by using a variable-length encoding. The following table provides a summary of the Unicode value ranges in hexadecimal, and how they are represented in binary for UTF-8.

    Unicode ranges      UTF-8 binary encoding
    000000-00007F       0xxxxxxx
    000080-0007FF       110xxxxx 10xxxxxx
    000800-00FFFF       1110xxxx 10xxxxxx 10xxxxxx
    010000-10FFFF       11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
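The table can be read as a rule for classifying the first byte of a character. The following is a minimal sketch of that reading only, using bit masks rather than the bit-field structure the assignment requires; the names utf8_length, utf8_is_continuation, and utf8_payload_bits are hypothetical and are not part of the assignment's interface.

```cpp
#include <cassert>

// Hypothetical helper: number of bytes in a UTF-8 character, judged from
// its first byte alone. Returns 0 if the byte cannot start a character.
unsigned int utf8_length( unsigned char lead ) {
    if ( (lead & 0x80) == 0x00 ) return 1;      // 0xxxxxxx
    if ( (lead & 0xE0) == 0xC0 ) return 2;      // 110xxxxx
    if ( (lead & 0xF0) == 0xE0 ) return 3;      // 1110xxxx
    if ( (lead & 0xF8) == 0xF0 ) return 4;      // 11110xxx
    return 0;                                   // 10xxxxxx or 11111xxx
} // utf8_length

// Hypothetical helper: true if the byte has the extra-byte form 10xxxxxx.
bool utf8_is_continuation( unsigned char b ) {
    return (b & 0xC0) == 0x80;
} // utf8_is_continuation

// Hypothetical helper: number of x bits available in a character of the
// given length (7, 11, 16, or 21), counted from the table rows.
unsigned int utf8_payload_bits( unsigned int length ) {
    assert( 1 <= length && length <= 4 );
    static const unsigned int bits[] = { 7, 11, 16, 21 };
    return bits[length - 1];
} // utf8_payload_bits
```

For instance, under this reading a first byte of 0xC2 announces a two-byte character with 11 data bits, while 0xB0 and 0xFF cannot begin a character at all.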

    #include <cstdlib>                      // atoi

    void alloc( unsigned int size, unsigned int times ) {
        for ( unsigned int i = 0; i < times; i += 1 ) {
    #ifdef DYN
            volatile int * arr = new int[size];
            arr[0] = 5;
            delete [ ] arr;
    #else
            volatile int arr[size];         // ignore volatile, prevents elimination of declaration & assignment
            arr[0] = 5;
    #endif
        } // for
    } // alloc

    int main( int argc, char * argv[ ] ) {
        int times = 100000000;
        switch ( argc ) {
          case 2:
            times = atoi( argv[1] );
        } // switch
        alloc( 10, times );
        volatile int * arr = 0;             // ignore volatile, prevents elimination of declaration & deallocation
        delete arr;
    } // main

    Figure 1: Stack versus Dynamic Allocation

For example, the symbol £ is represented by Unicode value 0xA3 (binary 1010 0011). Since £ falls within the range of 0x80 to 0x7FF, it is encoded by the UTF-8 bit string 110xxxxx 10xxxxxx. To fit the character into the eleven bits of the UTF-8 encoding, it is padded on the left with zeroes to 00010100011. The UTF-8 encoding becomes 11000010 10100011, where the x's are replaced with the 11-bit binary encoding, giving the UTF-8 character encoding 0xC2A3 for symbol £. Note, UTF-8 is a minimal encoding; e.g., it is incorrect to represent the value 0 by any encoding other than the first UTF-8 binary encoding.

Use unformatted I/O to read the Unicode bytes and the data structure in Figure 2 to decode the bytes. The shell interface to the utf8 program is as follows:

    utf8 [ filename ]

(Square brackets indicate optional command line parameters, and do not appear on the actual command line.) If no input file name is specified, input comes from standard input. Output is sent to standard output. Issue appropriate runtime error messages for incorrect usage or if a file cannot be opened. The input file contains an unknown number of packed UTF-8 characters, meaning there is no newline separation, but a newline can appear as a UTF-8 character.

Structure the program in two translation units.
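The worked example above (0xA3 becoming 0xC2A3) can be checked by running the encoding in the forward direction. The following is a rough sketch under that aim only: it is the reverse of what the assignment's read routine does, and the name utf8_encode is hypothetical. It picks the shortest form for each range, which is the minimal-encoding rule stated above.

```cpp
#include <cassert>

// Hypothetical helper: encode Unicode value cp into out[0..3] using the
// shortest UTF-8 form, and return the number of bytes written.
unsigned int utf8_encode( unsigned long cp, unsigned char out[4] ) {
    assert( cp <= 0x10FFFF );                   // largest value in the table
    if ( cp <= 0x7F ) {                         // 0xxxxxxx
        out[0] = static_cast<unsigned char>( cp );
        return 1;
    } // if
    if ( cp <= 0x7FF ) {                        // 110xxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>( 0xC0 | (cp >> 6) );
        out[1] = static_cast<unsigned char>( 0x80 | (cp & 0x3F) );
        return 2;
    } // if
    if ( cp <= 0xFFFF ) {                       // 1110xxxx 10xxxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>( 0xE0 | (cp >> 12) );
        out[1] = static_cast<unsigned char>( 0x80 | ((cp >> 6) & 0x3F) );
        out[2] = static_cast<unsigned char>( 0x80 | (cp & 0x3F) );
        return 3;
    } // if
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out[0] = static_cast<unsigned char>( 0xF0 | (cp >> 18) );
    out[1] = static_cast<unsigned char>( 0x80 | ((cp >> 12) & 0x3F) );
    out[2] = static_cast<unsigned char>( 0x80 | ((cp >> 6) & 0x3F) );
    out[3] = static_cast<unsigned char>( 0x80 | (cp & 0x3F) );
    return 4;
} // utf8_encode
```

Calling utf8_encode( 0xA3, bytes ) fills bytes with 0xC2 and 0xA3 and returns 2, matching the hand calculation.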
One translation unit contains routine read:

    wchar_t read( istream & infile, character & ch );

which reads in sufficient bytes from infile to accumulate a valid UTF-8 character in ch.data, with the length of the UTF-8 character set in ch.length. It also returns the Unicode value of the UTF-8 character, e.g., for the UTF-8 character 0xC2A3, the value 0xA3 is returned. Routine read does not print. If read finds an error in the format of the UTF-8 character, it raises a UTF8err exception containing an appropriate message, e.g.:

    throw UTF8err( "length" );

indicating a problem with the encoded length of a UTF-8 character.

The other translation unit contains the main program, which handles the command-line arguments, calls read until end-of-file is raised, and does all necessary printing of valid UTF-8 characters or errors. Print the bytes of the UTF-8 character in hexadecimal. Hint: to print a character in hexadecimal use the following cast:

    char ch = 0xff;
    cout << hex << (unsigned int)(unsigned char)ch << endl;

    struct UTF8err {                        // exception
        const char * msg;
        UTF8err( const char * msg ) : msg( msg ) {}
    };

    struct character {
        union UTF8 {
            unsigned char ch;
            struct {                        // types for 1st utf-8 byte
                unsigned char dt : 7;
                unsigned char ck : 1;       // check
            } t1;
            struct {
                unsigned char dt : 5;
                unsigned char ck : 3;       // check
            } t2;
            struct {
                unsigned char dt : 4;
                unsigned char ck : 4;
            } t3;
            struct {
                unsigned char dt : 3;
                unsigned char ck : 5;
            } t4;
            struct {                        // type for extra utf-8 bytes
                unsigned char dt : 6;
                unsigned char ck : 2;
            } dt;
        } data[4];                          // bytes in UTF-8 character
        unsigned int length;                // number of bytes in UTF-8 character
    }; // character

    Figure 2: UTF8 Data Structure

For example, given the input file:

    $ od -t x1 infile
    0000000 23 d7 90 d7 c2 c2 a3 b0 e0 e3 e9 80 80 e0 93 90
    0000020 ff f0 90 89 f0 f0 90 89 80 01

the program prints:

    0x23 : valid value 0x23
    0xd790 : valid value 0x5d0
    0xd7c2 : invalid padding
    0xc2a3 : valid value 0xa3
    0xb0 : invalid length
    0xe0e3 : invalid padding
    0xe98080 : valid value 0x9000
    0xe09390 : invalid range
    0xff : invalid length
    0xf09089f0 : invalid padding
    0xf0908980 : valid value 0x10240
    0x01 : valid value 0x1

3. Write a C++ class named string that contains a sequence of UTF-8 characters. Since the name string is already used for C++ string, it is important to prevent name clashes between the new UTF-8 strings and std::string. To prevent conflicts, place the UTF-8 string in its own namespace, called utf8. The interface for the UTF-8 string is:

    struct string {
        string();
        string(const string &);
        string(const char * );
        ~string();
        string & operator=(const string &);     // copy assignment operator
        void push_back( character ch );         // add one UTF-8 character to the end of the string
        void reserve( unsigned int n );         // if n > capacity, resize string to have enough space
                                                // for n UTF-8 characters
        character * chars;                      // dynamically allocated array of UTF-8 characters
        unsigned int length;                    // # of UTF-8 characters currently in the chars array
        unsigned int capacity;                  // maximum # of UTF-8 characters that chars can store
    };
    // IMPLEMENT INPUT, OUTPUT, ADDITION

Implement the appropriate constructors and destructor, and member routines push_back and reserve for the string type. Furthermore, overload the input, output, assignment, and addition operators for the UTF-8 string type. The following example illustrates how a UTF-8 string is used.

    using utf8::string;
    string s1;                  // create an empty string (length is zero, chars is NULL)
    string s2( "foobar" );      // create a UTF-8 string initialized with the character string foobar
    string s3( s2 );            // initialize UTF-8 string s3 with a copy of the UTF-8 string in s2
    string s4( "\xc2\xa3" );    // initialize with UTF-8 pound symbol
    cin >> s1 >> s4;            // read in whitespace-delimited UTF-8 strings from stdin
    cout << s1 << " " << s2 << " " << s3 << " " << s4 << endl; // print UTF-8 strings to stdout
    s1 = s1 + s4;               // concatenate UTF-8 strings s1 and s4
    s2 = s2 + "baz";            // concatenate UTF-8 string s2 and character string baz

Implementation notes

The declaration of the string type can be found in utf8string.h. For your submission you should add all routine and member definitions to utf8string.cc.

You are not allowed to use the C++ string type to solve this question. However, you may include the header cstring and use the functions declared therein. In particular, you may find memcpy useful.
For memory allocation, you must follow this allocation scheme: every default constructed string begins with a capacity of 0. The first time data is stored in a default constructed string, it is given a capacity of 5 and space is allocated accordingly. If the string was not allocated with the default constructor, you may choose a different, reasonable initial capacity. If at any point this capacity proves insufficient, you must double the capacity (for example, capacities can go from 5 to 10 to 20 to 40...). Note that there is no realloc in C++, so doubling the size of an array necessitates allocating a new array and copying items over. Your program must not leak memory.

Becoming familiar with cin.peek() and the isspace function located in the <cctype> library may aid you in solving this question. In particular, note that cin.peek() does not by default skip leading whitespace. Also note that cin.peek() returns an int.

The provided driver (q3.cc) can be compiled with your solution to test (and then debug) your code. Please keep in mind that the purpose of the test harness is to provide a convenient means of verifying that code you are asked to write is working correctly. Therefore, although some effort has been expended to make the harness reasonably robust, we do not guarantee that it is perfect, as that is not the point. The test harness should function correctly if you use it as intended; it may fail horribly if you abuse it. But the point of your testing is to verify your code, rather than the harness.
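The allocation scheme above (capacity 0, then 5, then doubling, with a fresh array and a copy at each step) can be illustrated on a plain array of int. This is a sketch only, not the required utf8::string implementation: the type IntBuffer and the free routines reserve, push_back, and release are hypothetical stand-ins, and having reserve keep doubling until the request fits is an assumption rather than something the assignment states.

```cpp
#include <cassert>
#include <cstring>                              // memcpy

// Hypothetical stand-in for the string type: a growable array of int.
struct IntBuffer {
    int * items = nullptr;                      // dynamically allocated array
    unsigned int length = 0;                    // # of items currently stored
    unsigned int capacity = 0;                  // maximum # of items that fit
};

// Grow to hold at least n items: start at 5, then double (assumed policy).
void reserve( IntBuffer & buf, unsigned int n ) {
    if ( n <= buf.capacity ) return;            // already large enough
    unsigned int cap = buf.capacity == 0 ? 5 : buf.capacity;
    while ( cap < n ) cap *= 2;
    int * fresh = new int[cap];                 // no realloc: allocate anew
    if ( buf.length > 0 ) {                     // copy existing items over
        std::memcpy( fresh, buf.items, buf.length * sizeof(int) );
    } // if
    delete [] buf.items;                        // delete of a null pointer is safe
    buf.items = fresh;
    buf.capacity = cap;
} // reserve

// Append one item, growing first if the array is full.
void push_back( IntBuffer & buf, int value ) {
    if ( buf.length == buf.capacity ) reserve( buf, buf.length + 1 );
    buf.items[buf.length] = value;
    buf.length += 1;
    assert( buf.length <= buf.capacity );
} // push_back

// Free the storage so nothing leaks, and return to the empty state.
void release( IntBuffer & buf ) {
    delete [] buf.items;
    buf = IntBuffer();
} // release
```

Pushing six items into an empty IntBuffer takes the capacity from 0 to 5 on the first push and from 5 to 10 on the sixth, with the first five items copied into the new array.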

As a hint, some of the operations are easier to implement than others. In addition, some of the operations are useful as helper functions for implementing the more difficult operations.

Submission Guidelines

Please follow these guidelines carefully. Review the Assignment Guidelines and C++ Coding Guidelines before starting each assignment. Each text file, i.e., *.*txt file, must be ASCII text and not exceed 500 lines in length, where a line is a maximum of 120 characters. Name your submitted files as follows:

1. new.txt contains the information required by question 1.

2. utf8char.h, utf8char.{cc,c,cpp}, q2.{cc,c,cpp} code for question 2. The program must be divided into separate compilation units with file names given above. Program documentation must be present in your submitted code. Output for this question is checked via a marking program, so it must match exactly with the given program.

3. q2utf8.testtxt test documentation for question 2, which includes the input and output of your tests. Write a brief description for each test explaining what aspects of the program it is testing and how you decided if the program passed the test.

4. utf8char.h, utf8char.{cc,c,cpp}, utf8string.h, utf8string.{cc,c,cpp} code for question 3. The program must be divided into separate compilation units with file names given above. Program documentation must be present in your submitted code. Output for this question is checked via a marking program, so it must match exactly with the given program.

Use the following Makefile to compile the programs for questions 2 and 3 (do not submit this file):

    CXX = g++-4.9                                   # compiler
    CXXFLAGS = -g -Wall -Werror -std=c++11 -MMD     # compiler flags
    MAKEFILE_NAME = ${firstword ${MAKEFILE_LIST}}   # makefile name

    OBJECTS2 = utf8char.o q2.o                      # object files forming executable
    EXEC2 = utf8ch                                  # executable name

    OBJECTS3 = utf8char.o utf8string.o q3.o         # object files forming executable
    EXEC3 = utf8str                                 # executable name

    OBJECTS = ${OBJECTS2} ${OBJECTS3}
    EXECS = ${EXEC2} ${EXEC3}
    DEPENDS = ${OBJECTS:.o=.d}                      # substitute ".o" with ".d"

    .PHONY : all clean

    all : ${EXECS}

    ${EXEC2} : ${OBJECTS2}                          # link step
        ${CXX} $^ -o $@

    ${EXEC3} : ${OBJECTS3}                          # link step
        ${CXX} $^ -o $@

    ${OBJECTS} : ${MAKEFILE_NAME}                   # OPTIONAL : changes to this file => recompile

    -include ${DEPENDS}                             # include *.d files containing program dependences

    clean :                                         # remove files that can be regenerated
        rm -f ${DEPENDS} ${OBJECTS} ${EXECS}

Put this Makefile in the directory with your programs, name your source files appropriately, and then execute shell command make utf8ch or make utf8str in the directory to compile a program (make without an argument compiles all the programs). This Makefile is used by Marmoset to build programs, so make sure your programs compile with it. Do not make any changes to the Makefile.

Follow these guidelines. Your grade depends on it!