Mass-Storage Devices: Disks. CSCI 5103 Operating Systems. Basic Disk Functionality. Magnetic Disks

Transcription:

Mass-Storage Devices: Disks
CSCI 5103 Operating Systems
Instructor: Abhishek Chandra

  - Disk Structure and Attachment
  - Disk Scheduling
  - Disk Management
  - RAID Structure
  - Stable Storage

Magnetic Disks
  - Most common form of secondary storage
  - Basic structure: Platter with read-write head
    - Multiple platters can be in a single disk drive
  - Disk geometry:
    - Track: Concentric circular region which contains data
    - Cylinder: Set of tracks across multiple platters
    - Sector: Chunk of data within a track

Basic Disk Functionality
  - Platter rotates at high speeds
  - Disk head flies close to the platter surface
  - Transfers data from one or more sectors to external I/O bus
  - Reading/writing data
    - Seek time: Time to move head to right track
    - Latency time: Time for right sector to come under head
    - Transfer time: Time to move desired no. of bytes
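The three timing components above add up to the time of a single disk access. The following is a minimal sketch of that arithmetic, assuming the common convention that average rotational latency is half of one rotation. The function names and the example figures (9 ms seek, 7200 RPM, 100 MB/s) are illustrative assumptions, not values from the slides.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average latency: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation / 2.0


def transfer_time_ms(num_bytes: int, rate_mb_per_s: float) -> float:
    """Time to move num_bytes at a sustained rate given in MB/s (1 MB = 10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1_000_000) * 1000.0


def access_time_ms(seek_ms: float, rpm: float, num_bytes: int, rate_mb_per_s: float) -> float:
    """Access time = seek time + latency time + transfer time."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s))


if __name__ == "__main__":
    # Hypothetical drive: 9 ms average seek, 7200 RPM, 100 MB/s, one 4 KiB read.
    print(round(access_time_ms(9.0, 7200, 4096, 100.0), 3), "ms")
```

For small requests the seek and latency terms dominate the transfer term, which is why the order of requests matters so much for throughput.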

Other Storage Media
  - Solid-State Disks (SSDs)
    - Non-volatile memory. E.g.: Flash, DRAM+battery
    - Faster but costlier than magnetic disks
    - No moving parts, no seek/rotational latency
    - Reads faster than writes, limited write lifetime
    - Usage: Smaller devices, caches in multi-tier storage, metadata storage in storage arrays
  - Magnetic Tapes
    - Large capacity and low cost
    - Much slower than disks, sequential access
    - Usage: Backup and archiving

Disk Attachment
  - Disk drive connected to a host computer
  - Host controller: Controller device at host
    - Issues I/O instructions, controls data transfer to memory
  - Device controller: Controller on the disk
    - Operates disk hardware based on I/O command
    - Can cache data read from disk
  - I/O bus: Channel connecting the two
    - Transfers data from device to host
    - May write directly to memory via memory-mapped I/O

Types of Disk Attachment
  - Host-attached storage
    - Disk attached directly to host machine
    - I/O happens via local I/O ports
    - E.g.: IDE, ATA, SCSI, Fibre Channel
  - Network-attached storage (NAS)
    - Disks are accessed over a network
    - Data transferred over TCP/IP
    - E.g.: iSCSI
  - Storage-area network (SAN)
    - Devices connected over a separate network
    - Clients access data via servers
    - E.g.: FC, iSCSI, InfiniBand

Disk Scheduling
  - File system may issue several disk requests
  - Each request contains info about:
    - I/O type (read/write)
    - Disk location (track, sector no.)
    - Amount of data (how many bytes or sectors)
  - Question: In what order should these requests be scheduled on the disk?

Disk Scheduling Algorithms
  - Goals:
    - Maximize disk throughput
    - Also achieve fairness, low response times, etc.
  - How is disk scheduling different from CPU scheduling?

FCFS
  - Service requests in order of arrival
  - Problems?

Shortest-Seek-Time-First (SSTF)
  - Tries to minimize seek time for each request
  - Move to the closest request in terms of seek time
  - Problem?

SCAN (Elevator) Algorithm
  - Sweeps from one end of disk to the other
  - Services requests in order during each sweep
  - What happens to the request wait times for requests at different ends of the disk?
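To compare FCFS, SSTF, and SCAN concretely, the sketch below computes the total head movement each policy incurs on one request queue. It simplifies by using the track distance as a stand-in for seek time and by treating all requests as already queued. The function names and the sample queue are illustrative assumptions, not part of the slides.

```python
def head_movement(path, head):
    """Total number of tracks the head crosses when visiting path in order."""
    total, pos = 0, head
    for track in path:
        total += abs(track - pos)
        pos = track
    return total


def fcfs_path(requests, head):
    """FCFS: service requests in arrival order."""
    return list(requests)


def sstf_path(requests, head):
    """SSTF: always move to the pending request closest to the current position."""
    pending, path, pos = list(requests), [], head
    while pending:
        nearest = min(pending, key=lambda t: abs(t - pos))
        pending.remove(nearest)
        path.append(nearest)
        pos = nearest
    return path


def scan_path(requests, head):
    """SCAN: sweep down to track 0, then reverse and sweep up.

    Track 0 appears in the path because SCAN travels to the end of the disk
    even when no request is waiting there.
    """
    lower = sorted((t for t in requests if t < head), reverse=True)
    upper = sorted(t for t in requests if t >= head)
    return lower + [0] + upper


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical pending tracks
    start = 53
    for name, policy in [("FCFS", fcfs_path), ("SSTF", sstf_path), ("SCAN", scan_path)]:
        print(name, head_movement(policy(queue, start), start))
```

SSTF reduces head movement sharply compared with FCFS here, but a steady stream of nearby requests could postpone a distant one indefinitely. That is one possible answer to the "Problem?" prompt on the SSTF slide.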

Circular SCAN (C-SCAN)
  - Moves back to beginning of disk after each sweep
  - Services the tracks in circular order
  - Achieves a more uniform wait time than SCAN

LOOK Algorithm
  - Variant of SCAN, C-SCAN
  - Does not go to end of disk
  - Goes only as far as the last request in each direction

Selection of Disk Scheduling Algorithm
  - Depends on:
    - Workload, desired metrics
    - File allocation method: contiguous vs. indexed
    - Placement of data vs. metadata
  - Disk controller-level scheduling:
    - Has better understanding of disk geometry, rotational latencies

Why not leave scheduling to disk controller?
  - Has limited queue space
  - Non-performance considerations:
    - Fairness
    - Priority: e.g., metadata vs. data
    - Differentiation: e.g., demand paging vs. file I/O
    - I/O type: reads vs. writes
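The C-SCAN and LOOK variants described above differ from SCAN only in how far the head travels and where it turns around. Here is a rough sketch of both, assuming a disk with tracks numbered 0 to `max_track` and counting the C-SCAN return sweep as head movement. The names and the sample queue are illustrative assumptions.

```python
def head_movement(path, head):
    """Total number of tracks the head crosses when visiting path in order."""
    total, pos = 0, head
    for track in path:
        total += abs(track - pos)
        pos = track
    return total


def c_scan_path(requests, head, max_track):
    """C-SCAN: sweep upward to the last track, jump back to track 0, sweep upward again."""
    upper = sorted(t for t in requests if t >= head)
    lower = sorted(t for t in requests if t < head)
    if not lower:
        return upper
    # max_track and 0 are in the path because the head goes to the end of the
    # disk and returns to the beginning before servicing the remaining requests.
    return upper + [max_track, 0] + lower


def look_path(requests, head, direction="down"):
    """LOOK: like SCAN, but reverse at the last pending request, not the disk edge."""
    lower = sorted((t for t in requests if t < head), reverse=True)
    upper = sorted(t for t in requests if t >= head)
    return lower + upper if direction == "down" else upper + lower


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical pending tracks
    print("C-SCAN", head_movement(c_scan_path(queue, 53, 199), 53))
    print("LOOK  ", head_movement(look_path(queue, 53), 53))
```

C-SCAN moves the head further than LOOK on the same queue. What it buys is that every track is reached once per circular sweep, which is the more uniform wait time the slide mentions.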

Disk Management
  - How does the OS manage disk storage and usage?
    - Formatting
    - Handling bad blocks
    - Swap space management

Disk Formatting
  - Physical formatting
    - Create sectors: header, trailer, and data area
    - Header, trailer: sector number, ECC
    - Done at disk manufacturing time
  - OS formatting:
    - Partitioning: Divide disk into groups of cylinders
    - Logical formatting: Initial file system data structures (free blocks, inodes, etc.) stored on disk
    - Raw partition: No file system structures created, sequence of logical blocks provided

Disk Partitions
  - Raw partition: No file system
    - E.g.: Swap space, databases
  - Boot partition: Used to boot system
    - Sequence of blocks, loaded into memory
    - Boot loader executed from pre-defined location
    - Finds kernel on disk and loads into memory
  - Root partition: Contains kernel and other system files
    - Mounted at boot time

Handling Bad Blocks
  - Some sectors may be corrupted
  - How to handle bad blocks?
    - Manual handling: Find and isolate bad blocks during formatting or consistency checking
    - Sector sparing: Spare sectors maintained on disk
      - Disk controller logically replaces bad sector with spare
    - Sector slipping: Cascading copy of sequence of sectors to a distant spare sector
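Sector sparing, as listed above, amounts to a small remapping table inside the disk controller. The sketch below is a toy model under that assumption. The class name, the placement of spares directly after the regular sectors, and the method names are all hypothetical and do not describe any real controller's firmware.

```python
class SparingController:
    """Toy controller that remaps bad logical sectors to spare physical sectors."""

    def __init__(self, num_sectors: int, num_spares: int):
        self.num_sectors = num_sectors
        # Assumption: spares live right after the regular sectors.
        self.free_spares = list(range(num_sectors, num_sectors + num_spares))
        self.remap = {}  # logical sector number -> spare physical sector

    def mark_bad(self, logical: int) -> int:
        """Logically replace a bad sector with the next free spare."""
        if not self.free_spares:
            raise RuntimeError("no spare sectors left")
        spare = self.free_spares.pop(0)
        self.remap[logical] = spare
        return spare

    def physical(self, logical: int) -> int:
        """Translate a logical sector number to the sector actually accessed."""
        if not 0 <= logical < self.num_sectors:
            raise IndexError("sector out of range")
        return self.remap.get(logical, logical)


if __name__ == "__main__":
    ctrl = SparingController(num_sectors=100, num_spares=2)
    ctrl.mark_bad(17)
    print(ctrl.physical(16), ctrl.physical(17), ctrl.physical(18))
```

The remapped sector is no longer physically adjacent to its neighbours, so a sequential read across it incurs an extra seek. Sector slipping avoids this by shifting the following sectors toward the spare instead.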

Swap Space Management
  - Where is swap space allocated?
    - Swap file: Uses file system services for creation, allocation, I/O
    - Swap partition: Raw partition, has its own storage manager
  - Pros and cons:
    - Swap file: Easier to implement, higher overhead
    - Swap partition: More efficient in terms of access time, might waste more space

Linux Swap Space
  - Allows either a swap file or a swap partition
  - Swap space used only for:
    - Anonymous memory: Stack, heap, uninitialized data
      - Code segments read directly from file, thrown away
    - Shared pages
  - Page slots: contain swapped-out pages
  - Swap map: array of counters for page slots
    - Counter indicates no. of mappings to page

Stable Storage
  - Ensure that data is never lost or corrupted
    - Disk writes happen correctly
  - Protect against:
    - Disk write errors
    - Disk blocks going bad
    - CPU crash during disk writes
  - Useful for transactions, backups, and logging

Stable Storage Implementation
  - Maintain two physical blocks for each logical block
    - Each block typically on a separate disk
  - Assumptions:
    - Write errors can be detected (e.g., using ECC)
    - Probability of both blocks getting corrupted is negligible
  - Possible outcomes of a disk write
    - Success: New data written correctly
    - Partial failure: Data corruption, detectable error
    - Total failure: Error before start of write, so old data is correct
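The swap map described on the Linux Swap Space slide is an array of per-slot counters, where each counter records how many mappings refer to the swapped-out page. The following is a simplified sketch of that bookkeeping only. The class and method names are hypothetical and are not taken from the Linux kernel.

```python
class SwapMap:
    """Array of counters, one per page slot; 0 means the slot is free."""

    def __init__(self, num_slots: int):
        self.counts = [0] * num_slots

    def allocate(self) -> int:
        """Claim the first free slot for a page being swapped out."""
        for slot, count in enumerate(self.counts):
            if count == 0:
                self.counts[slot] = 1
                return slot
        raise MemoryError("swap area full")

    def share(self, slot: int) -> None:
        """Record one more mapping to an already swapped-out (shared) page."""
        if self.counts[slot] == 0:
            raise ValueError("slot is not in use")
        self.counts[slot] += 1

    def release(self, slot: int) -> bool:
        """Drop one mapping; return True when the slot becomes free again."""
        if self.counts[slot] == 0:
            raise ValueError("slot is not in use")
        self.counts[slot] -= 1
        return self.counts[slot] == 0


if __name__ == "__main__":
    swap = SwapMap(num_slots=4)
    slot = swap.allocate()
    swap.share(slot)           # a second process maps the same page
    print(swap.release(slot))  # one mapping remains, slot stays in use
    print(swap.release(slot))  # last mapping gone, slot is free
```

Counting mappings, rather than keeping a single used/free bit, is what lets a shared page stay in its slot until the last process that maps it is done.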

Stable Storage Operation
  - Stable write:
    - Write to first block, then write to second block
    - If both writes succeed correctly, then success
  - Stable read:
    - First read from block 1
    - If error detected, read from block 2
  - Recovery:
    - If both blocks good and same data, no action
    - If one block has error, copy data from good block
    - If both blocks good but different data, copy from block 1 to 2

RAID
  - Redundant Array of Independent Disks
    - A set of disks used together to store data
  - Goals: Reliability, performance
    - Reliability: Data placed redundantly on multiple disks
    - Performance: Data can be read in parallel from multiple disks

RAID Techniques: Reliability
  - Mirroring: Create a duplicate
    - Each logical disk consists of two physical disks
    - Each write sent to both, read from either
  - Error correcting codes: Add redundant parity bits to detect and/or correct errors
    - Parity bits placed on separate disks from data
    - Single parity bit: Can detect one bit flip
    - Hamming code: Multiple parity bits added to detect and correct bit error
    - Block-parity: ECC for block of data. E.g.: Parity block, Reed-Solomon code

RAID Techniques: Performance
  - Striping: Split data across multiple disks
    - Bit-level: Split the bits of each byte across disks
    - Block-level: File blocks split across disks
    - Others: byte-level, sector-level striping
  - Performance improvement:
    - Higher throughput of small accesses (via load balancing)
    - Lower response time for large accesses (via parallelism)
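The stable write, stable read, and recovery rules from the Stable Storage Operation slide can be modelled with two in-memory blocks. This sketch assumes a CRC-32 checksum from Python's `zlib` stands in for the disk's ECC as the error detector. The `Block` and `StablePair` classes are hypothetical illustrations, not an actual storage API.

```python
import zlib


class Block:
    """One physical block plus a checksum used to detect a bad write."""

    def __init__(self):
        self.data = b""
        self.crc = zlib.crc32(b"")

    def write(self, data: bytes) -> None:
        self.data = data
        self.crc = zlib.crc32(data)

    def is_good(self) -> bool:
        return zlib.crc32(self.data) == self.crc


class StablePair:
    """Two physical blocks backing one logical block."""

    def __init__(self):
        self.b1, self.b2 = Block(), Block()

    def stable_write(self, data: bytes) -> None:
        # Write block 1 first; touch block 2 only after block 1 is verified.
        self.b1.write(data)
        if not self.b1.is_good():
            raise IOError("write to block 1 failed")
        self.b2.write(data)
        if not self.b2.is_good():
            raise IOError("write to block 2 failed")

    def stable_read(self) -> bytes:
        if self.b1.is_good():
            return self.b1.data
        if self.b2.is_good():
            return self.b2.data
        raise IOError("both copies are bad")

    def recover(self) -> None:
        good1, good2 = self.b1.is_good(), self.b2.is_good()
        if good1 and good2:
            if self.b1.data != self.b2.data:
                self.b2.write(self.b1.data)  # crash hit between the two writes
        elif good1:
            self.b2.write(self.b1.data)
        elif good2:
            self.b1.write(self.b2.data)
        else:
            raise IOError("both copies are bad")


if __name__ == "__main__":
    pair = StablePair()
    pair.stable_write(b"old")
    pair.b1.write(b"new")  # simulate a crash after only the first write
    pair.recover()
    print(pair.stable_read(), pair.b2.data)
```

The two writes are never issued at the same time, so at any instant at least one block holds a complete, verifiable value: either the old data or the new data.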

RAID Levels
  - Different combinations of striping and redundancy
  - RAID 0: Non-redundant striping
    - Block-level striping, no redundancy
    - Performance vs. reliability?
  - RAID 1: Mirroring
    - No striping
    - Performance vs. reliability?
  - RAID 2: Memory-style ECC
    - Add parity bits for error correction: Hamming code
    - Bit-level striping of data+parity bits
    - How many additional disks needed for 4 data disks?

RAID Levels (continued)
  - RAID 3: Bit-interleaved parity
    - Disk controller can detect which sector is bad
    - Need only one parity bit to detect and correct error
    - Space overhead?
  - RAID 4: Block-interleaved parity
    - Block-level striping
    - Parity block on additional disk
  - RAID 5: Block-interleaved distributed parity
    - Similar to RAID 4
    - Spreads data and parity blocks across all disks
    - Benefit?

RAID Levels (continued)
  - RAID 6: P+Q redundancy
    - Similar to RAID 5
    - Adds additional redundant blocks to recover from multiple disk failures
  - Nested RAID: Combine multiple RAID levels
    - Goal: Achieve both performance and reliability
    - RAID 0+1: Striping followed by mirroring
      - RAID 1 applied on set of RAID 0s treated as disks
    - RAID 1+0: Mirroring followed by striping
      - RAID 0 applied on set of RAID 1s treated as disks
      - More robust to multiple disk failures
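Two pieces of arithmetic underlie the levels above: the XOR parity block used by RAID 4 and RAID 5, and the number of check bits a Hamming code needs for RAID 2. The sketch below illustrates both, assuming equal-sized blocks held as Python `bytes` and the standard single-error-correcting Hamming condition 2^r >= m + r + 1. The function names are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (the parity block of a stripe)."""
    if not blocks:
        raise ValueError("need at least one block")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_lost_block(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


def hamming_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # hypothetical data blocks on three disks
    parity = xor_blocks(stripe)
    # The disk holding b"BBBB" fails; rebuild it from the other blocks and the parity.
    print(rebuild_lost_block([stripe[0], stripe[2]], parity))
    print(hamming_check_bits(4))
```

Because XOR is its own inverse, one parity block per stripe suffices when the failed disk is known. This is the observation behind RAID 3 needing only a single parity bit once the controller can identify the bad sector.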