File Management Chapters 10, 11, 12
Requirements
For long-term storage:
- It must be possible to store a large amount of information
- The information must survive termination of the processes that use it
- Multiple processes must be able to access the information concurrently
Solution: store information in units called files
Filesystem: the part of the OS responsible for providing file services to users and applications
File Management Functions
- Identify and locate a selected file
- Use a directory to describe the location of all files plus their attributes
- Describe user access control on a shared system
- Blocking for access to files
- Allocate files to free blocks
- Manage free storage for available blocks
Files
You should already be familiar with:
- types (regular, directory, character/block special)
- naming
- protections (e.g., -rwxr-xr-x bits in Linux)
- operations (create, delete, open, read, etc.)
Two types of file access:
- Sequential: read all bytes/records in order from the beginning
- Random: read bytes/records in any order; use seek to move to a given location in the file; essential for database files
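The two access types can be contrasted with a small sketch. It assumes fixed-length records and uses an in-memory file from Python's `io` module; `RECORD_SIZE`, `read_sequential`, and `read_random` are hypothetical names chosen for illustration.

```python
import io

RECORD_SIZE = 8  # hypothetical fixed record length in bytes

# Build an in-memory "file" holding five fixed-length records.
data = io.BytesIO(b"".join(f"rec{i:05d}".encode() for i in range(5)))

def read_sequential(f):
    """Sequential access: read every record in order from the beginning."""
    f.seek(0)
    records = []
    while True:
        chunk = f.read(RECORD_SIZE)
        if not chunk:
            break
        records.append(chunk)
    return records

def read_random(f, index):
    """Random access: seek straight to record `index`, then read it."""
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE)

all_records = read_sequential(data)
third = read_random(data, 3)
print(len(all_records), third)
```

With fixed-length records the byte offset of record `i` is simply `i * RECORD_SIZE`, which is why seek makes random access cheap.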
File Organization
How are files organized physically? Five main organizations:
- pile
- sequential file
- indexed sequential file
- indexed file
- direct (hashed) file
The Pile File Organization
- Data collected in chronological order
- Records may have different fields
- No structure
- Record access is by exhaustive search
Pile
Sequential File
- Fixed-length records
- All records have the same fields (same order and length)
- Key field: uniquely identifies the record
- Records are stored in key sequence
- New records go in a separate log file (later consolidated with the main file)
Sequential File
Indexed Sequential File
- Index provides a fast lookup capability
- Each index entry contains a key field and a pointer into the main file
- Only selected keys from the main file are stored in the index, so a lookup still requires some sequential search of the main file
- New records are stored in a separate overflow file
Indexed Sequential File
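A lookup in an indexed sequential file can be sketched as follows. This is a minimal illustration, not a full implementation: the main file is a Python list in key order, the index holds every fourth key (the `STRIDE` value is an arbitrary assumption), and the overflow file is not modeled.

```python
from bisect import bisect_right

# Main file: records stored in key sequence as (key, payload).
main_file = [(k, f"record-{k}") for k in range(0, 100, 5)]  # keys 0, 5, ..., 95

# Sparse index: one entry for every STRIDE-th record -> (key, position in main file).
STRIDE = 4  # hypothetical index density
index = [(main_file[i][0], i) for i in range(0, len(main_file), STRIDE)]
index_keys = [k for k, _ in index]

def lookup(key):
    """Find the last index entry with key <= target, then scan sequentially."""
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every key in the file
    start = index[slot][1]
    for k, payload in main_file[start:start + STRIDE]:
        if k == key:
            return payload
        if k > key:
            break  # passed the place where the key would be
    return None

print(lookup(35), lookup(36))
```

The index narrows the search to at most `STRIDE` records, which is the "some sequential search" the slide mentions.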
Indexed File
- Multiple indices, one for each key field used for searching
- Exhaustive index: one entry per record
- Partial index: entries only for records of interest
- No restriction on placement of records
- Every record must be pointed to by at least one index
Indexed File
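As a rough sketch of the idea, the snippet below keeps records in arbitrary order and builds one exhaustive index and one partial index over them. The record fields and index names are made up for illustration, and a list position stands in for a disk address.

```python
# Records may be placed anywhere; the list position stands in for a disk address.
records = [
    {"id": 17, "name": "carol", "dept": "ops"},
    {"id": 3, "name": "alice", "dept": "dev"},
    {"id": 42, "name": "bob", "dept": "dev"},
]

# Exhaustive index on "id": one entry per record.
by_id = {rec["id"]: pos for pos, rec in enumerate(records)}

# Partial index on "name": only records of interest (here, dept == "dev").
dev_by_name = {
    rec["name"]: pos for pos, rec in enumerate(records) if rec["dept"] == "dev"
}

def fetch(index, key):
    """Look the key up in the chosen index and follow the pointer to the record."""
    pos = index.get(key)
    return None if pos is None else records[pos]

print(fetch(by_id, 42), fetch(dev_by_name, "carol"))
```

Every record appears in `by_id`, so the rule that each record is reachable from at least one index holds even though `carol` is absent from the partial index.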
Direct (hashed) file
- Directly access a block at a known address
- Key field required for each record
- For very fast access
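A minimal sketch of direct access, assuming integer keys and a simple modulo hash (both are illustrative choices; the slides do not specify a hash function). `NUM_BLOCKS`, `put`, and `get` are hypothetical names.

```python
NUM_BLOCKS = 8  # hypothetical number of blocks in the file

# Each block holds a small list of (key, value) records.
blocks = [[] for _ in range(NUM_BLOCKS)]

def block_address(key: int) -> int:
    """Map the key field directly to a block number."""
    return key % NUM_BLOCKS

def put(key, value):
    """Store a record in the block its key hashes to."""
    bucket = blocks[block_address(key)]
    for i, (k, _) in enumerate(bucket):
        if k == key:
            bucket[i] = (key, value)  # overwrite an existing record
            return
    bucket.append((key, value))

def get(key):
    """Fetch a record by reading only the one block its key hashes to."""
    for k, v in blocks[block_address(key)]:
        if k == key:
            return v
    return None

put(10, "a")
put(18, "b")  # 10 and 18 hash to the same block and share it
print(get(10), get(18), get(2))
```

Only one block is examined per lookup regardless of file size, which is where the speed comes from.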
Directories
What is a directory?
- Directory: a file owned by the OS and accessible by file management routines
- Contains information about files: attributes, location, ownership
You should already be familiar with:
- pathnames
- naming directories
- permissions
- operations (mkdir, rmdir, etc.)
File Directories (a) Single-Level System (b) Two-Level System
File Directories (c) Hierarchical Directory System
File System Implementation
A possible file system layout
Physical Allocation of Files
How are files stored on disk? Several approaches:
- contiguous allocation
- chained allocation
- indexed allocation
Need some data structure to keep track of where files are on disk: file allocation table (FAT)
Also need to track which disk blocks are free
Methods of File Allocation: Contiguous allocation
- A contiguous set of blocks is allocated to a file at the time of creation
- Only a single entry per file in the file allocation table: starting block and length of the file
- External fragmentation will occur
Contiguous Allocation
Contiguous Allocation After Compaction
Contiguous Allocation External fragmentation will occur
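The fragmentation problem can be reproduced in a few lines. The sketch below assumes a 12-block disk and a first-fit placement policy (the slides do not name a policy); `allocate` and `delete` are hypothetical helpers.

```python
DISK_BLOCKS = 12
disk = [None] * DISK_BLOCKS  # None marks a free block
fat = {}                     # file name -> (starting block, length)

def allocate(name, length):
    """First-fit (an assumed policy): use the first run of `length` free blocks."""
    run_start, run_len = 0, 0
    for i, owner in enumerate(disk):
        if owner is None:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                for b in range(run_start, run_start + length):
                    disk[b] = name
                fat[name] = (run_start, length)
                return True
        else:
            run_len = 0  # run interrupted by an allocated block
    return False

def delete(name):
    """Free a file's blocks using its single FAT entry."""
    start, length = fat.pop(name)
    for b in range(start, start + length):
        disk[b] = None

allocate("A", 4)
allocate("B", 4)
allocate("C", 4)
delete("A")
delete("C")
free_blocks = disk.count(None)  # total free space after the deletions
fits = allocate("D", 6)         # needs 6 contiguous blocks
print(free_blocks, fits)
```

Eight blocks are free, yet the 6-block file cannot be placed because the free space is split into two runs of four. Compaction would merge those runs.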
Methods of File Allocation: Chained allocation
- Allocation is on the basis of individual blocks
- Each block contains a pointer to the next block in the chain
- Only a single entry per file in the file allocation table: starting block and length of the file
- No external fragmentation
- Best for sequential files
- No accommodation of the principle of locality
Chained Allocation
Chained Allocation After Consolidation
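Reading a chained file means following pointers block by block. In this sketch the disk is a dictionary keyed by block number, and the block numbers, contents, and `END` marker are invented for illustration.

```python
END = -1  # hypothetical end-of-chain marker

# Each disk block holds (data, pointer to the next block of the file).
disk = {
    9: ("he", 2),
    2: ("ll", 14),
    14: ("o!", END),
}

# The file allocation table keeps one entry per file: starting block and length.
fat = {"greeting": (9, 3)}

def read_file(name):
    """Follow the chain from the starting block, one block at a time."""
    block, length = fat[name]
    chunks, visited = [], []
    for _ in range(length):
        data, next_block = disk[block]
        chunks.append(data)
        visited.append(block)
        block = next_block
    return "".join(chunks), visited

print(read_file("greeting"))
```

The visited blocks (9, 2, 14) are scattered across the disk, which illustrates why chaining does not accommodate locality and why reaching the last block requires reading every block before it.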
Methods of File Allocation: Indexed allocation
- The file allocation table contains a separate one-level index for each file
- The index is itself stored in a block
- The index has one entry for each portion allocated to the file
- The FAT contains the block number of the index
Indexed Allocation
Indexed Allocation: Variable Length Portions
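A small sketch of indexed allocation with block-sized portions. The block numbers and file name are made up, and the disk is modeled as a dictionary.

```python
# Disk blocks by number; block 7 is the index block for the file "report".
disk = {
    7: [21, 4, 30],  # index block: one entry per allocated block
    21: "AAAA",
    4: "BBBB",
    30: "CC",
}

# The file allocation table stores only the number of the file's index block.
fat = {"report": 7}

def read_block(name, n):
    """Direct access to the n-th block of a file via its index block."""
    index_block = disk[fat[name]]
    return disk[index_block[n]]

def read_file(name):
    """Sequential access: visit the data blocks in index order."""
    return "".join(disk[b] for b in disk[fat[name]])

print(read_block("report", 2), read_file("report"))
```

Unlike chained allocation, the third block is reached with one index lookup rather than by walking the two blocks before it.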
FATs vs. I-nodes
File Allocation Table (FAT):
- tracks location of files on disk
- disadvantage: entire table must be in RAM
Index-nodes (I-nodes):
- small data structure, one per file
- lists attributes and disk addresses of the file's blocks
- advantage: I-node need be in RAM only when the file is opened
Example I-Node
I-nodes on Disk
Implementing Directories
(a) A simple directory: fixed-size entries, with disk addresses and attributes stored in the directory entry
(b) A directory in which each entry just refers to an i-node
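Option (b) can be sketched as a directory that maps names to i-node numbers, with attributes and block addresses held in the i-node itself. The field names, i-node numbers, and file names below are assumptions for illustration and do not reflect any particular file system's on-disk format.

```python
from dataclasses import dataclass, field

@dataclass
class INode:
    owner: str
    mode: str
    size: int
    blocks: list = field(default_factory=list)  # disk addresses of the file's blocks

# I-node table indexed by i-node number (values are illustrative).
inode_table = {
    12: INode("alice", "-rw-r--r--", 1500, [40, 41]),
    27: INode("bob", "-rwxr-xr-x", 300, [95]),
}

# Directory entries hold only a name and an i-node number.
directory = {"notes.txt": 12, "run.sh": 27}

def lookup(name):
    """Resolve a file name to its i-node through the directory."""
    return inode_table[directory[name]]

print(lookup("run.sh").blocks)
```

Keeping attributes out of the directory entry keeps entries small and uniform.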
Implementing Directories
Two ways of handling long file names in a directory:
(a) In-line
(b) In a heap
Free Space Management
Must know which blocks on disk are available
Bit Tables:
- One bit per block on disk
- Can still be sizeable, and finding a free block requires searching the table
Chained Free Portions:
- Chain together free blocks on disk
- Disk will become fragmented
Indexing:
- Use an index table, as in indexed file allocation
Free Block List:
- Each block is assigned a number, and the list of free block numbers is stored in a reserved location on disk
- Doesn't require a search like the bit table
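The bit table approach can be sketched as below. The class and method names are hypothetical, and the convention that a 0 bit means "free" is an assumption.

```python
class BitTable:
    """One bit per disk block; 0 = free, 1 = in use (assumed convention)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark(self, block, used=True):
        if used:
            self.bits[block // 8] |= 1 << (block % 8)
        else:
            self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def find_free(self):
        """Linear scan for the first free block; this is the search cost."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                return block
        return None

table = BitTable(16)
for b in (0, 1, 2, 5):
    table.mark(b)
first_free = table.find_free()
table.mark(1, used=False)
after_release = table.find_free()
print(first_free, after_release, len(table.bits))
```

On size: a 16 GB disk with 512-byte blocks has 2^25 blocks, so its bit table occupies 2^25 bits, or 4 MB.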