Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University

Similar documents

File Management. Chapter 12

Chapter 12 File Management

Chapter 12 File Management. Roadmap

Chapter 12 File Management

Chapter 12 File Management

File Management. Chapter 12

FILE MANAGEMENT CHAPTER

File Management. File Management

File Management Chapters 10, 11, 12

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

CHAPTER 17: File Management

File-System Implementation

Outline. File Management Tanenbaum, Chapter 4. Files. File Management. Objectives for a File Management System

Two Parts. Filesystem Interface. Filesystem design. Interface the user sees. Implementing the interface

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing

Physical Data Organization

Record Storage and Primary File Organization

File Management. COMP3231 Operating Systems. Kevin Elphinstone. Tanenbaum, Chapter 4

Filing Systems. Filing Systems

COS 318: Operating Systems. File Layout and Directories. Topics. File System Components. Steps to Open A File

Topics in Computer System Performance and Reliability: Storage Systems!

Chapter 8: Structures for Files. Truong Quynh Chi Spring- 2013

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.

Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition

COS 318: Operating Systems

1 File Management. 1.1 Naming. COMP 242 Class Notes Section 6: File Management

Storage and File Structure

File System Management

Chapter 11 I/O Management and Disk Scheduling

Prof. Dr. Ing. Axel Hunger Dipl.-Ing. Bogdan Marin. Operation Systems and Computer Networks Betriebssysteme und Computer Netzwerke

CSE 120 Principles of Operating Systems

Part III Storage Management. Chapter 11: File System Implementation

CHAPTER 13: DISK STORAGE, BASIC FILE STRUCTURES, AND HASHING

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation

IFSM 310 Software and Hardware Concepts. A+ OS Domain 2.0. A+ Demo. Installing Windows XP. Installation, Configuration, and Upgrading.

Chapter 6, The Operating System Machine Level

Database Systems. Session 8 Main Theme. Physical Database Design, Query Execution Concepts and Database Programming Techniques

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Lecture 1: Data Storage & Index

Storage in Database Systems. CMPSCI 445 Fall 2010

Chapter 11: File System Implementation. Chapter 11: File System Implementation. Objectives. File-System Structure

The Classical Architecture. Storage 1 / 36

University of Dublin Trinity College. Storage Hardware.

6. Storage and File Structures

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Windows NT File System. Outline. Hardware Basics. Ausgewählte Betriebssysteme Institut Betriebssysteme Fakultät Informatik

Operating Systems, 6 th ed. Test Bank Chapter 7

Outline. Windows NT File System. Hardware Basics. Win2K File System Formats. NTFS Cluster Sizes NTFS

CS 153 Design of Operating Systems Spring 2015

Storing Data: Disks and Files

Chapter Contents. Operating System Activities. Operating System Basics. Operating System Activities. Operating System Activities 25/03/2014

& Data Processing 2. Exercise 2: File Systems. Dipl.-Ing. Bogdan Marin. Universität Duisburg-Essen

OPERATING SYSTEMS FILE SYSTEMS

Introduction. What is an Operating System?

Distributed File Systems

System Architecture. CS143: Disks and Files. Magnetic disk vs SSD. Structure of a Platter CPU. Disk Controller...

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

Principles of Database Management Systems. Overview. Principles of Data Layout. Topic for today. "Executive Summary": here.

Lecture 17: Virtual Memory II. Goals of virtual memory

Unit Storage Structures 1. Storage Structures. Unit 4.3

Chapter 1 File Organization 1.0 OBJECTIVES 1.1 INTRODUCTION 1.2 STORAGE DEVICES CHARACTERISTICS

Linux Kernel Architecture

2) What is the structure of an organization? Explain how IT support at different organizational levels.

Operating system Dr. Shroouq J.

Big Data With Hadoop

Database 2 Lecture I. Alessandro Artale

Chapter 11: File System Implementation. Operating System Concepts 8 th Edition

Chapter 11 I/O Management and Disk Scheduling

Storage and File Systems. Chester Rebeiro IIT Madras

FAWN - a Fast Array of Wimpy Nodes

Network Attached Storage. Jinfeng Yang Oct/19/2015

Overview. File Management. File System Properties. File Management

Lecture 18: Reliable Storage

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

Operating Systems 4 th Class

Operating Systems. Virtual Memory

Review. Lecture 21: Reliable, High Performance Storage. Overview. Basic Disk & File System properties CSC 468 / CSC /23/2006

File Systems Management and Examples

Computer Architecture

Page 1 of 5. IS 335: Information Technology in Business Lecture Outline Operating Systems

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

OPERATING SYSTEM - MEMORY MANAGEMENT

FAT32 vs. NTFS Jason Capriotti CS384, Section 1 Winter Dr. Barnicki January 28, 2000

Operating System Structures

HP-UX System and Network Administration for Experienced UNIX System Administrators Course Summary

Distributed File Systems Part I. Issues in Centralized File Systems

Block1. Block2. Block3. Block3 Striping

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Transcription:

Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University

File Management File management system has traditionally been considered part of the operating system. Applications are stored as files Input for application is stored in files Output from application is stored in files File management has become increasingly distributed and removed from the operating system

Objectives for a File Management System Guarantee that the data in the file are valid Optimize performance Provide I/O support for a variety of storage device types Minimize or eliminate the potential for lost or destroyed data Provide a standardized set of I/O interface routines Provide I/O support for multiple users Security

User Program Pile Sequential Indexed Sequential Indexed Hashed Logical I/O Basic I/O Supervisor Basic File System Disk Device Driver Tape Device Driver Figure 12.1 File System Software Architecture [GROS86]

Device Drivers Lowest level Communicates directly with peripheral devices Responsible for starting I/O operations on a device Processes the completion of an I/O request

Basic File System Physical I/O Deals with exchanging blocks Concerned with the placement of blocks Concerned with buffering blocks in main memory

File Structure Directory management Access method Records Blocking Physical blocks in main memory buffers Disk scheduling Physical blocks in secondary storage (disk) User & program comands Operation, File name File manipulation functions I/O Free storage management File allocation User access control File management concerns Operating system concerns Figure 12.2 Elements of File Management

File Organization Character Files Piles Sequential Files Indexed Sequential Files Indexed File Direct, or Hashed, File

Character Files File is organized as a sequence of characters Basic Operations Seek. Each character has a position in the file Seek may allow random access or be more restricted Read. User is allowed to read any number of bytes At lower level, blocks are read Overwrite Append Truncate Very common file organization

File Organization with Records A more sophisticated way is to organize files as records A record consists of a fixed set of fields (a structure) Each field has a data type and length All records do not necessarily have the same fields Character Files can be viewed as a file composed of very simples Records

Piles Data are collected in the order they arrive Purpose is to accumulate a mass of data and save it Records may have different fields This requires records to be self describing At the very least the length of each record needs to be stored Record access is by exhaustive search

Piles Variable-length records Variable set of fields Chronological order (a) Pile File Fixed-length records Fixed set of fields in fix Sequential order based (b) Seq Exhaustive index Exhaustive index Index levels n Main File

Sequential Files Fixed format used for records All records are identical Same length, same fields in same order Field names and lengths are attributes of the file Can support random access One field is the key field Uniquely identifies the record Records often stored in key sequence

Sequential Files h records fields order a) Pile File Fixed-length records Fixed set of fields in fixed order Sequential order based on key field (b) Sequential File Exhaustive index Exhaustive index Partial index Main File

Sequential Files If files are stored in key sequence, adding a record becomes expensive Requires moving records in the file To reduce the cost, records are not immediately added to the file New records are placed in a log file or transaction file Batch update is performed to merge the log file with the master file A special case needs to be made to search the log file

Indexed Sequential File Index provides provides a lookup capability to quickly search the file by key Index contains a key field and a pointer to the main file Not every single entry need be indexed Index is used to get close to desired record Sequential search is used to find desired record

Variable-length records Variable set of fields Chronological order Fixed-length records Fixed set of fields in fix Sequential order based Indexed Sequential (a) Pile File File Exhaustive index (b) Seq Exhaustive index Index levels 2 1 n Index Main File Overflow File (c) Indexed Sequential File Primary F (variable-length (d) Indexe Figure 12.3 Common File Organizati

Indexed Sequential File New records are stored in overflow file Overflow file is periodically merged with main file Main file includes pointers to entries in overflow file

Advantage of Index Example: a file contains 1 million records On average 500,000 accesses are required to find a record in a sequential file If an index contains 1000 entries it will take 500 accesses on average to find nearest key in index, followed by 500 accesses on average in the main file. A binary search can be used to find the nearest key in index with 10 accesses, followed by 500 accesses to main file

Indexed File Uses multiple indices for different key fields May contain an exhaustive index that contains one entry for every record in the main file May also contain a partial index

h records fields order Fixed-length records Fixed set of fields in fixed order Sequential order based on key field a) Pile File Exhaustive index Indexed (b) Sequential File Exhaustive index Partial index Main File Overflow File ed Sequential File Primary File (variable-length records) (d) Indexed File re 12.3 Common File Organizations

Direct File Similar to a sequential file, but records are not ordered Records are accessed directly by key Similar to an STL map Implemented using a hash table Hash function is used to find location of record in file Some sort of chaining is used to handle collisions

Implementation of Different File Types Most OS s do not provide direct support for the above file organizations All of the above organizations can be built on top of character files Piles can be used for web logs Relational databases can be built out of indexed sequential files UNIX dmb files are direct files

Files have several attributes Name Access Privileges Location File Directories To make locating files easy, files are organized into directories The Directory is a file Provides a mapping between file names and files A Directory is a sequential file sorted by the file name key.

Hierarchical Directories Directories can contain files an other directories The directories form a tree Their is some master director The root directory in UNIX Logical Organization Allows users to organize files and directories Physical Organization Allows system to quickly access traverse hierarchy

Master Directory Subirectory Subirectory Subirectory Subirectory Subirectory File File File File Figure 12.4 Tree-Structured Directory

Master Directory System User A User B User C Directory "User C" Directory "Word" Unit A Directory "User B" Draw Word Directory "User A" Directory "Draw" ABC Directory "Unit A" File "ABC" ABC File "ABC" Pathname: /User B/Word/UnitA/ABC Figure 12.5 Example of Tree-Structured Directory

Naming Conventions Files are located by following a path from root directory Several files can have the same file name as long they have unique path names Files can be referenced relative to the current directory

File Sharing In multiuser system, files may be shared among users Two issues Access rights Management of simultaneous access

None Access Rights User may not know of the existence of the file User is not allowed to read the user directory that includes the file Knowledge User can only determine that the file exists and who its owner is

Access Rights Execution The user can load and execute a program but not copy it Reading The user can read the file for any purpose, including copying and execution Appending The user can add data to the file but cannot modify or delete any of the file s contents

Access Rights Updating The user can modify, delete and add data. Changing Protection User can change access rights granted to other users Deletion User can delete file

Files have an owner Access Rights Owner has all rights previously listed Owner may grant rights to others users Specific users User groups Everybody

Simultaneous Access System may provide ability to lock a file locks may be at the file level user may be able to lock individual records Mutual exclusion and deadlock are issues for shared access

Record Blocking Records must be organized as blocks for I/O to be performed Records are the logical unit of access of a file Blocks are the unit of I/O with secondary storage Two issues: Should blocks be fixed or variable length? What should be relative size of a block compared to the average record size? Tradeoffs: Smaller block, more I/O operations needed Larger block, more records in one I/O operation

R1 R2 R3 R4 R4 R5 R6 Track 1 R6 R7 Fixed Blocking R9 R9 R10 R8 R11 R12 R13 Track 2 Variable Blocking: Spanned R1 R2 R3 R4 R5 R1 R2 R3 R4 Track 1 Track 1 R6 R5 R7 R6 R8 R7 R9 R10 R8 Track 2 Track 2 Variable Blocking: Unspanned Fixed Blocking Data Waste due to record fit to block size Gaps R1 due to hardware R2 design R3 R4 Waste R4 due to block R5 size constraint R6 from fixed record size Waste due to block fit to track size Track 1 R6 R7 R8 R9 R9 R10 R11 R12 R13 Figure 12.6 Record Blocking Methods [WIED87] Track 2 Variable Blocking: Spanned

R6 R7 R8 R9 R9 R10 R11 R12 R13 R5 R6 R7 R8 Variable Variable Blocking: Spanned Spanned Fixed Blocking Track 2 Track 2 R1 R2 R3 R4 R5 R1 R2 R3 R4 R4 R5 R6 Track 1 Track 1 R6 R6 R7 R7 R8 R9 R8 R9 R10 R9 R10 R11 R12 R13 Track 2 Track 2 Variable Blocking: Unspanned Variable Blocking: Spanned Data Waste due to record fit to block size Gaps R1 due to hardware R2 design R3 Waste R4 due to block size R5 constraint from fixed record size Waste due to block fit to track size Track 1 R6 R7 Figure 12.6 Record Blocking Methods [WIED87] R8 Variable Blocking: Unspanned R9 R10 Track 2 Data Waste due to record fit to block size

R1 R2 R3 R4 R4 R5 R6 Track 1 Variable R6 R7 Blocking: R8 R9 R9Unspanned R10 R11 R12 R13 Track 2 Variable Blocking: Spanned R1 R2 R3 R4 R5 Track 1 R6 R7 R8 R9 R10 Track 2 Variable Blocking: Unspanned Data Gaps due to hardware design Waste due to block fit to track size Waste due to record fit to block size Waste due to block size constraint from fixed record size Figure 12.6 Record Blocking Methods [WIED87]

Secondary Storage Management Space must be allocated to files Must keep track of the space available for allocation

Preallocation VS Dynamic Allocation Preallocation Need maximum size for the file at creation time Difficult to reliably estimate the maximum potential size of the file Tend to overestimate file size so we do not run out of space Dynamic Allocation Allocate space to a file in portions as needed

Portion Size Portion Size: the size of the portion allocated to a file Variable, large contiguous: Advantages: better performance, avoids waste, file allocation tables are small Disadvantage: space is hard to reuse Blocks: Advantages: greater flexibility Disadvantage: requires large tables or complex structures for their allocation

Free Space Fragmentation Similar problem as allocating dynamic partitions First Fit Best Fit Nearest Fit Choose the unused group of sufficient size that is closest to the previous allocation for the file to increase locality

Methods of File Allocation Contiguous allocation Single set of blocks is allocated to a file at the time of creation Only a single entry in the file allocation table: starting block and length of file External fragmentation will occur, compaction is needed Need to declare the size of the file at the time of creation

File A 0 1 2 3 4 5 6 7 8 9 File B 10 11 12 13 14 File Name File A File B File C File D File E File Allocation Table Start Block 2 9 18 30 26 Length 3 5 8 2 3 15 16 17 18 19 20 21 File C 22 23 24 File E 25 26 27 28 29 File D 30 31 32 33 34 Figure 12.7 Contiguous File Allocation

File A 0 1 2 3 4 File B 5 6 7 8 9 File C 10 11 12 13 14 File E File D 15 16 17 18 19 File Name File A File B File C File D File E File Allocation Table Start Block 0 3 8 19 16 Length 3 5 8 2 3 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Figure 12.8 Contiguous File Allocation (After Compaction)

Chained allocation Methods of File Allocation Allocation on basis of individual block Each block contains a pointer to the next block in the chain Only single entry in the file allocation table: starting block and length of file No external fragmentation Best for sequential files Does not take advantage of locality

File B 0 1 2 3 4 5 6 7 8 9 File Name File B File Allocation Table Start Block 1 Length 5 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Figure 12.9 Chained Allocation

File B 0 1 2 3 4 5 6 7 8 9 File Name File B File Allocation Table Start Block 0 Length 5 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Figure 12.10 Chained Allocation (After Consolidation)

Indexed allocation Methods of File Allocation File allocation table contains a separate on-level index for each file The index has one entry for each portion to the file The file allocation table contains block number for the index

File B 0 1 2 3 4 5 6 7 8 9 File Allocation Table File Name File B Index Block 24 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 8 3 14 28 30 31 32 33 34 Figure 12.11 Indexed Allocation with Block Portions

File B 0 1 2 3 4 5 6 7 8 9 File Allocation Table File Name File B Index Block 24 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 Start Block 1 28 14 Length 3 4 1 30 31 32 33 34 Figure 12.12 Indexed Allocation with Variable-Length Portions

Bit Tables Chained Free Portions Indexing Free Block List Free Space Management

Types of files Ordinary Directory Special Named I-nodes UNIX File Management A control structure that contains the key information needed by the OS for a particular file.

Direct(0) Direct(1) Direct(2) Direct(3) Direct(4) Direct(5) Direct(6) Direct(7) Direct(8) Direct(9) single indirect double indirect triple indirect Inode address fields Blocks on disk Figure 12.13 UNIX Block Addressing Scheme

Flush the log file Log File Service Write the cache Log the transaction Read/write the file I/O Manager NTFS Driver Fault Tolerant Driver Disk Driver Read/write a mirrored or striped volume Read/write the disk Cache Manager Access the mapped file or flush the cache Load data from disk into memory Virtual Memory Manager Figure 12.15 Windows NTFS Components [CUST94]