Chapter 2. Fundamental File Processing Operations Kim Joung-Joon Database Lab. jjkim9@db.konkuk.ac.kr
Chapter Objectives
  Describe the process of linking a logical file within a program to an actual physical file or device
  Describe the procedures used to create, open, and close files
  Introduce the C++ input and output classes
  Explain the use of overloading in C++
  Describe the procedures used for reading from and writing to files
  Introduce the concept of position within a file and describe procedures for seeking different positions
  Provide an introduction to the organization of hierarchical file systems
  Present the Unix view of a file and describe Unix file operations and commands based on this view
File Structures (2) Konkuk University (DB Lab.)
Chapter Outline
  2.1 Physical Files and Logical Files
  2.2 Opening Files
  2.3 Closing Files
  2.4 Reading and Writing
  2.5 Seeking
  2.6 Special Characters in Files
  2.7 The Unix Directory Structure
  2.8 Physical Devices and Logical Files
  2.9 File-Related Header Files
  2.10 Unix File System Commands
2.1 Physical Files and Logical Files
  File: a particular collection of bytes
  Physical file: a file that exists on a disk or tape
  Logical file: a file as it is used inside the program
  (ex) each statement links the logical file inp_file to the physical file myfile.dat
    select inp_file assign to "myfile.dat".    (Cobol)
    assign(inp_file, 'myfile.dat');            (Turbo Pascal)
2.2 Opening Files
  Two options
    (1) open an existing file: positioned at the beginning of the file, ready to start reading and writing
    (2) create a new file: ready for use after creation
  C++ and C (fcntl.h)
    fd = open(filename, flags [, pmode]);    => pp.17-18
  (ex)
    fd = open(filename, O_RDWR | O_CREAT, 0751);              (open for reading and writing; create the file if it does not exist)
    fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);    (as above, but discard any existing contents)
    fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);     (create a new file; fail if the file already exists)
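The effect of the flag combinations can be checked with a small sketch. It assumes a POSIX system; the function names create_exclusive and open_truncated and the permission mode 0644 are hypothetical choices for illustration, not taken from the slides.

```cpp
#include <fcntl.h>     // open and the O_* flags
#include <unistd.h>    // close, unlink

// Hypothetical helper: try to create `path` as a brand-new file.
// Returns true only if this call created it; with O_CREAT | O_EXCL,
// open fails when the file already exists.
bool create_exclusive(const char* path) {
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0644);  // 0644 is an assumed mode
    if (fd < 0) return false;
    close(fd);
    return true;
}

// Hypothetical helper: open `path` for reading and writing, creating it if
// needed and discarding any previous contents (O_TRUNC).
// Returns the file descriptor, or -1 on error.
int open_truncated(const char* path) {
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
}
```

Calling create_exclusive twice on the same name should succeed the first time and fail the second, which is the property that makes O_EXCL useful for avoiding accidental overwrites.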
2.3 Closing Files
  Closing a file
    makes the logical name or file descriptor available for use with another file (i.e., breaks the link)
    ensures that everything has been written to the file (i.e., the buffer for the file is flushed, so everything we have written is sent to the file)
  Files are closed automatically by the OS when a program terminates normally
    (=> an explicit close is needed for protection against data loss and for reuse of logical filenames)
2.4 Reading and Writing (1/4)
  Read and Write Functions
    Before reading or writing, we must have already opened the file.
  Low-level read and write
    READ(Source_file, Destination_addr, Size)
    WRITE(Destination_file, Source_addr, Size)
      Source_file, Destination_file: logical file name
      Destination_addr, Source_addr: first address of the memory block
      Size: byte count
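On Unix, the generic READ and WRITE operations correspond closely to the read and write system calls. The following is a minimal sketch under that assumption; the function name write_then_read, the 64-byte buffer, and the mode 0644 are hypothetical.

```cpp
#include <fcntl.h>     // open and the O_* flags
#include <unistd.h>    // read, write, close
#include <string>

// Hypothetical helper: write `data` to `path`, then read the file back.
// Returns the bytes read, or an empty string on any error.
std::string write_then_read(const char* path, const std::string& data) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return "";
    // WRITE(Destination_file, Source_addr, Size)
    ssize_t written = write(fd, data.data(), data.size());
    close(fd);
    if (written != static_cast<ssize_t>(data.size())) return "";

    fd = open(path, O_RDONLY);
    if (fd < 0) return "";
    std::string result;
    char buf[64];      // memory block that receives the bytes
    ssize_t n;
    // READ(Source_file, Destination_addr, Size); read returns 0 at end of file
    while ((n = read(fd, buf, sizeof buf)) > 0)
        result.append(buf, static_cast<std::string::size_type>(n));
    close(fd);
    return result;
}
```

Both calls take the same three kinds of argument as the generic forms: a file (here a descriptor), a memory address, and a byte count.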
2.4 Reading and Writing (2/4)
  Files with C Streams and C++ Stream Classes
    stream: a file or some other source or consumer of data
  (1) C streams (C input/output)
    use the standard C functions in stdio.h
    stdio.h contains definitions of the types and the operations on C streams
    stdin and stdout: standard input and output streams
    file = fopen(filename, type);    => pp.21-22
    fread, fgetc, fgets, fwrite, fputc, fputs, fscanf, fprintf
  (2) C++ stream classes
    use the stream classes of iostream.h and fstream.h
    cin, cout: predefined stream objects for the standard input and standard output files
    fstream: class for access to files; has two constructors and the methods open, read, and write
    >> (extraction) and << (insertion): overloaded for input and output
2.4 Reading and Writing (3/4)
  Programs in C++ to Display the Contents of a File
    1. Display a prompt for the name of the input file
    2. Read the user's response from the keyboard into a variable called filename
    3. Open the file for input
    4. While there are still characters to be read from the input file
       a. Read a character from the file
       b. Write the character to the terminal screen
    5. Close the input file
  Ex) Figure 2.2: using C streams
      Figure 2.3: using C++ stream classes
  => See Appendix D
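Steps 3 to 5 of the outline above can be sketched with the C++ stream classes as follows. This is not the textbook's Figure 2.3; the function name display_file is hypothetical, and the sketch uses the standard header <fstream> rather than the older <fstream.h>.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper covering steps 3 to 5: open the file, copy it to `out`
// one character at a time, and close it.
// Returns the number of characters displayed, or -1 if the file cannot be opened.
long display_file(const std::string& filename, std::ostream& out) {
    std::ifstream file(filename);      // step 3: open the file for input
    if (!file) return -1;
    long count = 0;
    char ch;
    while (file.get(ch)) {             // step 4: loop ends when no character can be read
        out.put(ch);                   // write the character to the output stream
        ++count;
    }
    file.close();                      // step 5: close the input file
    return count;
}
```

Steps 1 and 2 would typically be a prompt written to std::cout and a filename read from std::cin, followed by a call such as display_file(filename, std::cout).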
2.4 Reading and Writing (4/4)
  Detecting End-of-file
    C: the fread call returns the number of elements read (0 at end-of-file)
    C++: use the function fail to check for end-of-file
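The two end-of-file checks can be compared side by side. The sketch below counts the bytes in a file once with C streams and once with C++ streams; the function names count_bytes_c and count_bytes_cpp are hypothetical.

```cpp
#include <cstdio>
#include <fstream>

// C streams: fread returns the number of elements read, so a return value
// of 0 (here, anything other than 1) signals end-of-file or an error.
long count_bytes_c(const char* path) {
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return -1;
    long count = 0;
    char ch;
    while (std::fread(&ch, 1, 1, fp) == 1) ++count;
    std::fclose(fp);
    return count;
}

// C++ streams: after a read that could not deliver a character,
// fail() returns true.
long count_bytes_cpp(const char* path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return -1;
    long count = 0;
    char ch;
    while (true) {
        file.get(ch);
        if (file.fail()) break;        // end-of-file (or a read error)
        ++count;
    }
    return count;
}
```

In both versions the test comes after the read attempt: end-of-file is only detected once a read has failed to return data.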
2.5 Seeking (1/3)
  Seeking: controlling the movement of the read/write pointer
  Seek(Source_file, Offset)
    Source_file: logical file name
    Offset: the number of positions to move from the start of the file
  (ex) Seek(data, 373)    moves directly from the origin to position 373
2.5 Seeking (2/3)
  Seeking with C Streams
    fseek(file, byte_offset, origin)
      sets the read/write pointer to any byte in a file
      byte_offset: a long integer
      origin: 0 = beginning of the file, 1 = current position, 2 = end of the file
      in standard C, fseek returns 0 on success; ftell(file) reports the resulting position
  (ex) fseek(file, 373L, 0);
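A short sketch of seeking with C streams follows. It uses the standard names SEEK_SET and SEEK_END for origins 0 and 2; the function names byte_at and file_size are hypothetical.

```cpp
#include <cstdio>

// Hypothetical helper: return the byte stored `offset` bytes from the start
// of the file, or -1 if the file cannot be opened or has no such byte.
int byte_at(const char* path, long offset) {
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return -1;
    int result = -1;
    if (std::fseek(fp, offset, SEEK_SET) == 0) {   // SEEK_SET: origin 0 (start of file)
        int c = std::fgetc(fp);
        if (c != EOF) result = c;
    }
    std::fclose(fp);
    return result;
}

// Hypothetical helper: seek to the end of the file and ask ftell for the
// position, which for a binary stream is the file size in bytes.
long file_size(const char* path) {
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return -1;
    long size = -1;
    if (std::fseek(fp, 0L, SEEK_END) == 0)         // SEEK_END: origin 2 (end of file)
        size = std::ftell(fp);
    std::fclose(fp);
    return size;
}
```

Note that the position comes from ftell, not from the return value of fseek, which only reports success or failure.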
2.5 Seeking (3/3)
  Seeking with C++ Stream Classes
    almost exactly the same as in C streams
  Two syntactic differences
    (1) an object of fstream has two file pointers, a get pointer and a put pointer
        => seekg for the get pointer and seekp for the put pointer
    (2) seek operations are methods of the stream classes
        => file.seekg(byte_offset, origin)
           file.seekp(byte_offset, origin)
           where origin is ios::beg, ios::cur, or ios::end
  (ex) file.seekg(373, ios::beg);
       file.seekp(373, ios::beg);
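The following sketch uses seekp to position a write and seekg to position the read that checks it. The function name patch_and_read is hypothetical, and the code assumes the standard header <fstream> with std::ios rather than the older <fstream.h>.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: overwrite the byte at `offset` in an existing file,
// then read that byte back. Returns the byte read, or '\0' on failure.
char patch_and_read(const std::string& path, std::streamoff offset, char value) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return '\0';
    file.seekp(offset, std::ios::beg);   // position for the write
    file.put(value);
    file.seekg(offset, std::ios::beg);   // position for the read (a seek is required
                                         // when switching from writing to reading)
    char ch = '\0';
    file.get(ch);
    return file ? ch : '\0';
}
```

Because the stream is opened with both ios::in and ios::out, the existing file is updated in place instead of being truncated.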
2.7 The UNIX Directory Structure (1/2)
  UNIX file system
    a tree-structured organization with two kinds of files (i.e., regular files (programs and data) and directories)
    devices such as tape or disk drives are also files (in the dev directory)
  /
    indicates the root directory
    separates directory names from the file name
  absolute pathname and relative pathname for file identification
  current directory: .    parent directory: ..
2.7 The UNIX Directory Structure (2/2)
  [Figure: sample UNIX directory tree rooted at / (root). Node labels in extracted order: bin, usr, usr6, dev, adb, cc, yacc, bin, lib, lib, mydir, console, kbd, TAPE, libc.a, libm.a, libdf.a, addr, DF]
2.8 Physical Devices and Logical Files
  Physical Devices as Files
    a file in UNIX is a sequence of bytes (=> very few operations)
    magnetic disks and devices like the keyboard and the console are also files (/dev/kbd, /dev/console)
    a file is represented logically by an integer (the file descriptor)
2.8 Physical Devices and Logical Files
  The Console, the Keyboard, and Standard Error
    defined in stdio.h
      stdin (standard input): keyboard
      stdout (standard output): console
      stderr (standard error): console
  Read and write
    read: gets reads from stdin
    write: printf writes to stdout
2.8 Physical Devices and Logical Files
  I/O Redirection and Pipes
    for switching between standard I/O (stdin and stdout) and regular file I/O
  I/O redirection: specify alternate files for input or output at execution time
    < file    (redirect stdin to "file")
    > file    (redirect stdout to "file")
    (ex) list > myfile
  Pipe: use the output of one program as input to another program without an intermediate file
    program1 | program2
    any stdout output of program1 => the stdin input to program2
    (ex) list | sort
2.9 File-Related Header Files
  Header files (/usr/include) define special names and values
    C streams: stdio.h
    C++ streams: iostream.h and fstream.h
    Unix operations: fcntl.h and file.h
  EOF, stdin, stdout, stderr: stdio.h
  O_RDONLY, O_WRONLY, O_RDWR: file.h
2.10 Unix File System Commands
  Unix Commands
    cat filenames
    tail filenames
    cp file1 file2
    mv file1 file2
    rm filenames
    chmod mode filename
    ls
    mkdir name
    rmdir name
  => Consult a Unix manual for more information
A.1 File I/O in Pascal (1/2)
  included in the language definition
  provides high-level access to reading and writing
  in C, a file is a sequence of bytes, but in Pascal, a file is a sequence of records
A.2 File I/O in Pascal (2/2)
  File I/O functions
    assign(input_file, 'myfile.dat');    // associate a logical file with a physical file
    reset(input_file);                   // open an existing file
    rewrite(input_file);                 // create a new file
    append(input_file);                  // open to add data to an existing file
    read(input_file, var);               // read from the file into a variable
    readln(input_file, var);             // read from the file into a variable, then skip to the next line
    write(input_file, var);              // write a variable to the file
    writeln(input_file, var);            // write a variable to the file, then end the line
    close(input_file);                   // close the file
A.3 File I/O in C
  Low-level I/O: UNIX system calls
    fd1 = open(filename1, rwmode);
    fd2 = open(filename2, rwmode);
    read(fd1, buf, n);
    write(fd2, buf, n);
    lseek(fd1, offset, origin);
    close(fd1);
    close(fd2);
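The calls listed above are enough to copy one file to another. The following is a sketch under stated assumptions: a POSIX system, a hypothetical function name copy_file_lowlevel, an arbitrary 4096-byte buffer, and an assumed mode of 0644 for the new file.

```cpp
#include <fcntl.h>     // open and the O_* flags
#include <unistd.h>    // read, write, close, lseek

// Hypothetical helper: copy `src` to `dst` using only system calls.
// Returns the number of bytes copied, or -1 on error.
long copy_file_lowlevel(const char* src, const char* dst) {
    int fd1 = open(src, O_RDONLY);
    if (fd1 < 0) return -1;
    int fd2 = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd2 < 0) { close(fd1); return -1; }

    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd1, buf, sizeof buf)) > 0) {     // 0 means end of file
        ssize_t done = 0;
        while (done < n) {                             // write may accept fewer bytes than asked
            ssize_t w = write(fd2, buf + done, static_cast<size_t>(n - done));
            if (w < 0) { close(fd1); close(fd2); return -1; }
            done += w;
        }
        total += n;
    }
    close(fd1);
    close(fd2);
    return n < 0 ? -1 : total;                         // n < 0 means read failed
}
```

Afterwards, lseek(fd, 0, SEEK_END) on the destination returns its size and can be used to confirm that every byte arrived.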
A.4 <stdio.h>
  fp = fopen(s, mode)     /* open file s; mode "r", "w", "a" for read, write, append (returns NULL for error) */
  c = getc(fp)            /* get character; getchar() is getc(stdin) */
  putc(c, fp)             /* put character; putchar(c) is putc(c, stdout) */
  ungetc(c, fp)           /* put character back on input file fp; at most 1 char can be pushed back at one time */
  scanf(fmt, a1, ...)     /* read characters from stdin into a1, ... according to fmt. Each a_i must be a pointer. Returns EOF or number of fields converted */
  fscanf(fp, ...)         /* read from file fp */
  printf(fmt, a1, ...)    /* format a1, ... according to fmt, print on stdout */
  fprintf(fp, ...)        /* print ... on file fp */
  fgets(s, n, fp)         /* read at most n characters into s from fp. Returns NULL at end of file */
  fputs(s, fp)            /* print string s on file fp */
  fflush(fp)              /* flush any buffered output on file fp */
  fclose(fp)              /* close file fp */
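A few of these functions combine naturally into a round trip: fprintf writes formatted text and fscanf reads it back. The function names write_numbers and sum_numbers below are hypothetical, and the sketch is compiled as C++ using <cstdio>.

```cpp
#include <cstdio>

// Hypothetical helper: write the integers 1..n to `path`, one per line.
// Returns true if the file was written and closed without error.
bool write_numbers(const char* path, int n) {
    std::FILE* fp = std::fopen(path, "w");
    if (!fp) return false;                    // fopen returns NULL on error
    for (int i = 1; i <= n; ++i)
        std::fprintf(fp, "%d\n", i);
    return std::fclose(fp) == 0;              // fclose flushes buffered output
}

// Hypothetical helper: read integers from `path` until fscanf can no longer
// convert a field, and return their sum (-1 if the file cannot be opened).
long sum_numbers(const char* path) {
    std::FILE* fp = std::fopen(path, "r");
    if (!fp) return -1;
    long sum = 0;
    int value;
    while (std::fscanf(fp, "%d", &value) == 1)   // returns the number of fields converted
        sum += value;
    std::fclose(fp);
    return sum;
}
```

The loop stops when fscanf returns EOF at the end of the file, or 0 if it meets text that is not a number.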
A.5 File I/O in C++
  #include <fstream.h>
  File streams: fstream, ifstream, ofstream
  (ex)
    ifstream f1("input.fil");
    ofstream f2("output.fil", ios::out | ios::nocreate);
    fstream f3("inout.fil", ios::in | ios::out);
    f1.get(ch);    f1.eof();
    f2.put(ch);    f2.bad();
    f1.seekg();    f2.seekp();
    f3.close();
A.6 <iostream.h> (1/3)
  Class Hierarchy
    class ios
    class istream: virtual public ios
    class ostream: virtual public ios
    class iostream: public istream, public ostream
A.7 <iostream.h> (2/3)
  class ostream: virtual public ios {
  public:
      ostream& put(char);
      ostream& write(char*, int);
      ostream& seekp(int);
      ostream& operator<<(char);
      ostream& operator<<(int);
      ostream& operator<<(char*);
      ostream& operator<<(long);
      ostream& operator<<(short);
      ostream& operator<<(float);
      ...
  };

  class istream: virtual public ios {
  public:
      istream& get(char*, int, char = '\n');
      istream& get(char);
      istream& read(char*, int);
      istream& gets(char**, char = '\n');
      istream& seekg(int);
      istream& operator>>(char&);
      istream& operator>>(int&);
      istream& operator>>(char*);
      istream& operator>>(long&);
      ...
  };
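The declarations above overload << and >> once per built-in type, and the same mechanism extends to user-defined types. In the sketch below, the Point struct and the to_text function are hypothetical, and the code uses the standard headers <iostream> and <sstream> rather than <iostream.h>.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, for illustration only.
struct Point {
    int x;
    int y;
};

// Overload the insertion operator for Point. Returning the stream is what
// allows chained output such as: out << p1 << p2.
std::ostream& operator<<(std::ostream& out, const Point& p) {
    return out << '(' << p.x << ", " << p.y << ')';
}

// Hypothetical helper: format a Point through the overloaded operator.
std::string to_text(const Point& p) {
    std::ostringstream buffer;
    buffer << p;               // overload resolution picks operator<<(ostream&, const Point&)
    return buffer.str();
}
```

The compiler chooses among the overloads by the type of the right-hand operand, so std::cout << Point{3, 4} and std::cout << 42 call different functions with the same name.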
A.8 <iostream.h> (3/3)
  class iostream: public istream, public ostream {
  public:
      iostream() { }
  };
A.9 Copy Program in C++
  // Standard headers are used here in place of the pre-standard <fstream.h> and <libc.h>.
  #include <cstdlib>     // exit
  #include <fstream>     // ifstream, ofstream
  #include <iostream>    // cerr
  using namespace std;

  // Print an error message (optionally followed by a second string) and terminate.
  void error(const char *s, const char *s2 = "") {
      cerr << s << ' ' << s2 << '\n';
      exit(1);
  }

  int main(int argc, char *argv[]) {
      if (argc != 3) error("wrong number of arguments");
      ifstream src(argv[1]);     // input file stream
      if (!src) error("cannot open input file", argv[1]);
      ofstream dest(argv[2]);    // output file stream
      if (!dest) error("cannot open output file", argv[2]);
      char ch;
      while (src.get(ch)) dest.put(ch);    // copy one character at a time
      // The loop should end only because the input reached end-of-file.
      if (!src.eof() || dest.bad()) error("something strange happened!");
      return 0;
  }