Introduction to I/O and Disk Management




Secondary Storage Management

Disks: just like memory, only different.

Why have disks?
  Memory is small. Disks are large.
    - Short-term storage for memory contents (e.g., swap space).
    - Reduce what must be kept in memory (e.g., code pages).
  Memory is volatile. Disks are forever (?!)
    - File storage.

             GB/dollar              dollar/GB
  RAM        0.013 (0.015, 0.01)    $77 ($68, $95)
  Disks      3.3 (1.4, 1.1)         $0.30 ($0.71, $0.90)

  Capacity: 2 GB vs. 1 TB (2 GB vs. 400 GB, 1 GB vs. 320 GB)

How to approach persistent storage

Disks first, then file systems.
  - Bottom up.
  - Focus on device characteristics that dominate performance or reliability (they become the focus of software).
Disk capacity (along with processor performance) is one of the crown jewels of computer engineering.
File systems have won, but at what cost victory?
  - iPod, iPhone, TiVo, PDAs, laptops, and desktops all have file systems.
  - Google is made possible by a file system.
File systems rock because they are:
  - Persistent.
  - Hierarchical (non-cyclical (mostly)).
  - Rich in metadata (remember cassette tapes?)
  - Indexable (hmmm, a weak point?)
The price is complexity of implementation.

Different types of disks

Advanced Technology Attachment (ATA)
  - Standard interface for connecting storage devices (e.g., hard drives and CD-ROM drives).
  - Also referred to as IDE (Integrated Drive Electronics), ATAPI, and UDMA.
  - ATA standards only allow cable lengths in the range of 18 to 36 inches. CHEAP.
Small Computer System Interface (SCSI)
  - Requires a controller on the computer and on the disk.
  - Controller commands are sophisticated and allow reordering.
USB or FireWire connections to an ATA disk
  - These are new bus technologies, not new control.
Microdrive: impressively small motors

Different types of disks (continued)

Bandwidth ratings
  - These are unachievable. 50 MB/s is the max off the platters.
  - Peak rate refers to transfer from the disk device's memory cache.
SATA II (serial ATA)
  - 3 Gb/s (still only 50 MB/s off the platter, so why do we care?)
  - Cables are smaller and can be longer than PATA.
SCSI 320 MB/s
  - Enables multiple drives on the same bus.

  Mode      Speed
  UDMA0     16.7 MB/s
  UDMA1     25.0 MB/s
  UDMA2     33.3 MB/s
  UDMA3     44.4 MB/s
  UDMA4     66.7 MB/s
  UDMA5     100.0 MB/s
  UDMA6     133 MB/s

Flash: An upcoming technology

Flash memory is gaining popularity.
  - One Laptop per Child has 1 GB of flash (no disk).
  - Vista supports flash as an accelerator.
  - Future is hybrid flash/disk or just flash?
  - Erased a block at a time (100,000 write-erase cycles).
  - Pages are 512 bytes or 2,048 bytes.
  - Read 18 MB/s, write 15 MB/s.
  - Lower power than (spinning) disk.

             GB/dollar              dollar/GB
  RAM        0.013 (0.015, 0.01)    $77 ($68, $95)
  Disks      3.3 (1.4, 1.1)         $0.30 ($0.71, $0.90)
  Flash      0.1                    $10

Anatomy of a Disk: basic components

[Figure: labeled disk diagram showing track, block/sector, head, cylinder, surface, platter, and spindle.]

Disk structure: the big picture

[Figure: physical structure of disks.]

Anatomy of a Disk

Seagate 73.4 GB Fibre Channel Ultra 160 SCSI disk
Specs:
  - 12 platters
  - 24 heads
  - 12 arms
  - 14,100 tracks
  - Variable # of sectors/track
  - 512 bytes/sector
  - 10,000 RPM
      Average latency: 2.99 ms
  - Seek times
      Track-to-track: 0.6/0.9 ms
      Average: 5.6/6.2 ms
      Includes acceleration and settle time.
  - 160-200 MB/s peak transfer rate
  - 1-8K cache

Anatomy of a Disk

Example: Seagate Cheetah ST373405LC (March 2002)
Specs:
  - Capacity: 73 GB
  - 8 surfaces per pack
  - # cylinders: 29,549
  - Total number of tracks per system: 236,394
  - Variable # of sectors/track (776 sectors/track on average)
  - 10,000 RPM
      Average latency: 2.9 ms
  - Seek times
      Track-to-track: 0.4 ms
      Average/max: 5.1 ms/9.4 ms
  - 50-85 MB/s peak transfer rate
  - 4 MB cache
  - MTBF: 1,200,000 hours

Disk Operations

Read/write operations
  Present the disk with a sector address
    - Old: DA = (drive, surface, track, sector)
    - New: logical block address (LBA)
  Heads are moved to the appropriate track
    - seek time
    - settle time
  The appropriate head is enabled
  Wait for the sector to appear under the head
    - rotational latency
  Read/write the sector
    - transfer time

Read time: seek time + latency + transfer time (5.6 ms + 2.99 ms + 0.014 ms)

Disk access latency

Which component of disk access time is the longest?
  A. Rotational latency
  B. Transfer latency
  C. Seek latency
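The read-time sum above can be reproduced with a short sketch. It assumes that the average rotational latency is half a revolution and that a track holds 424 sectors; the 424 figure is an assumption chosen because it yields a per-sector transfer time close to the 0.014 ms quoted above. The function names are illustrative, not part of any disk API.

```python
def revolution_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def read_time_ms(seek_ms: float, rpm: float,
                 sectors_per_track: int, sectors: int = 1) -> float:
    """Estimate read time as seek + average rotational latency + transfer."""
    rev = revolution_ms(rpm)
    rotational_latency = rev / 2                 # on average, half a revolution
    transfer = rev * sectors / sectors_per_track  # fraction of a revolution read
    return seek_ms + rotational_latency + transfer


if __name__ == "__main__":
    # 10,000 RPM disk with a 5.6 ms average seek (values from the slides);
    # 424 sectors per track is an assumed value.
    rev = revolution_ms(10_000)
    print(f"revolution: {rev:.2f} ms")
    print(f"one-sector transfer: {rev / 424:.3f} ms")
    print(f"one-sector read: {read_time_ms(5.6, 10_000, 424):.2f} ms")
```

The seek and rotational terms are two orders of magnitude larger than the single-sector transfer term, which is the point of the quiz question.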

Disk Addressing

Software wants a simple disk virtual address space consisting of a linear array of sectors.
  - Sectors are numbered 1..N, each 512 bytes (typical size).
  - Writing 8 surfaces at a time writes a 4 KB page.
Hardware has structure:
  - Which platter?
  - Which track within the platter?
  - Which sector within the track?
The hardware structure affects latency.
  - Reading from sectors in the same track is fast.
  - Reading from the same cylinder group is faster than seeking.

Disk Addressing

Mapping a 3-D structure to a 1-D structure

[Figure: file blocks 0..n on one side; on the other, a disk indexed by sector (0..s-1), surface (0..2p-1), and track (0..t-1), with a question mark over the mapping between them.]

Mapping criterion: block n+1 should be as close as possible to block n.
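One way to satisfy the "block n+1 close to block n" criterion is sketched below. It is a simplification under stated assumptions: every track holds the same number of sectors (real disks vary this by zone), numbering is 0-based rather than 1..N, and consecutive blocks fill a track, then the next surface of the same cylinder, then the next cylinder. The function names are hypothetical.

```python
def lba_to_chs(lba: int, surfaces: int, sectors_per_track: int):
    """Map a linear block address to (cylinder, surface, sector).

    Consecutive addresses stay on one track as long as possible, then move
    to the next surface in the same cylinder (no seek needed), and only
    then move to the next cylinder (a short track-to-track seek).
    """
    cylinder, rest = divmod(lba, surfaces * sectors_per_track)
    surface, sector = divmod(rest, sectors_per_track)
    return cylinder, surface, sector


def chs_to_lba(cylinder: int, surface: int, sector: int,
               surfaces: int, sectors_per_track: int) -> int:
    """Inverse of lba_to_chs under the same layout assumptions."""
    return (cylinder * surfaces + surface) * sectors_per_track + sector


if __name__ == "__main__":
    # 8 surfaces and 424 sectors per track are illustrative values.
    for lba in (0, 423, 424, 8 * 424):
        print(lba, lba_to_chs(lba, surfaces=8, sectors_per_track=424))
```

With this ordering, a long sequential read pays for a seek only once per cylinder.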

The Impact of File Mappings

File access times: contiguous allocation
Array elements map to contiguous sectors on disk.

  Access time = seek time + latency + transfer time
  Transfer time = time per revolution × number of revolutions required to transfer the data

Seek time and latency are the constant terms; transfer time is the variable term.

Case 1: Elements map to the middle tracks of the platter
  5.6 + 3.0 + 6.0 × (2,048 / 424) = 8.6 + 29.0 = 37.6 ms
Case 2: Elements map to the inner tracks of the platter
  5.6 + 3.0 + 6.0 × (2,048 / 212) = 8.6 + 58.0 = 66.6 ms
Case 3: Elements map to the outer tracks of the platter
  5.6 + 3.0 + 6.0 × (2,048 / 636) = 8.6 + 19.3 = 27.9 ms
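The three contiguous cases can be checked with a few lines of arithmetic. This is a sketch of the same model used above (one seek, one average rotational delay, then a transfer that spans as many revolutions as the data needs); like the slides, it ignores head-switch and track-switch time. The constant and function names are illustrative.

```python
SEEK_MS = 5.6      # average seek time
LATENCY_MS = 3.0   # average rotational latency (half of a 6 ms revolution)
REV_MS = 6.0       # time per revolution at 10,000 RPM


def contiguous_read_ms(sectors: int, sectors_per_track: int) -> float:
    """Seek once, wait once, then stream the data off consecutive tracks."""
    revolutions = sectors / sectors_per_track
    return SEEK_MS + LATENCY_MS + REV_MS * revolutions


if __name__ == "__main__":
    for zone, per_track in (("middle", 424), ("inner", 212), ("outer", 636)):
        total = contiguous_read_ms(2048, per_track)
        print(f"{zone} tracks: {total:.1f} ms")
```

The only thing that changes between the cases is the number of sectors per track, so outer tracks, which hold more sectors, deliver the same data in fewer revolutions.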

Disk Addressing

The impact of file mappings: non-contiguous allocation
Array elements map to random sectors on disk.
  - Each sector access results in a disk seek.

  2,048 × (5.6 + 3.0) ms = 17.6 seconds

[Figure: file blocks 0..n scattered across a disk indexed by sector (0..s-1), surface (0..2p-1), and track (0..t-1).]

Practical Knowledge

If the video you are playing off your hard drive skips, defragment your file system.
OS block allocation policy is complicated. Defragmentation allows the OS to revisit layout with global information.
Unix file systems need defragmentation less than Windows file systems, because they have better allocation policies.
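The cost of the non-contiguous layout can be checked the same way. The sketch below follows the estimate given for random placement: every sector pays a full average seek plus average rotational latency, and the tiny per-sector transfer time is ignored. The function name and defaults are illustrative.

```python
def scattered_read_ms(sectors: int, seek_ms: float = 5.6,
                      latency_ms: float = 3.0) -> float:
    """Each sector access costs one seek plus one rotational delay."""
    return sectors * (seek_ms + latency_ms)


if __name__ == "__main__":
    total_ms = scattered_read_ms(2048)
    print(f"random placement: {total_ms / 1000:.1f} s")
```

Compared with tens of milliseconds for a contiguous layout of the same 2,048 sectors, this gap of several hundred times is what defragmentation tries to win back.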

Defragmentation Decisions

Files written when the disk is nearly full are more likely to be fragmented.
  A. True
  B. False

Disk Head Scheduling: maximizing disk throughput

In a multiprogramming/timesharing environment, a queue of disk I/O requests can form.

[Figure: the CPU and other I/O feed a queue of (surface, track, sector) requests to the disk.]

The OS maximizes disk I/O throughput by minimizing head movement through disk head scheduling.

Disk Head Scheduling: examples

Assume a queue of requests exists to read/write tracks

  83  72  14  147  16  150

and the head is on track 65.

[Figure: track axis from 0 to 150 with the head at track 65.]

FCFS scheduling results in the head moving about 550 tracks. (The queue is drawn with the next request at the right, so requests are served in the order 150, 16, 147, 14, 72, 83; an exact count for that order gives 552 tracks.)

Can we do better?

Disk Head Scheduling: minimizing head movement

Greedy scheduling: shortest seek time first (SSTF)

Rearrange queue
  from:  83  72  14  147  16  150
  to:    14  16  150  147  83  72   (served right to left: 72, 83, 147, 150, 16, 14)

[Figure: track axis from 0 to 150.]

SSTF scheduling results in the head moving 221 tracks.

Can we do better?

Disk Head Scheduling: SCAN scheduling

Rearrange queue
  from:  83  72  14  147  16  150
  to:    150  147  83  72  14  16   (served right to left: 16, 14, 72, 83, 147, 150)

[Figure: track axis from 0 to 150.]

SCAN scheduling: move the head in one direction until all requests in that direction have been serviced, and then reverse. Also called elevator scheduling.
Moves the head 187 tracks.

Disk Head Scheduling: other variations

C-SCAN scheduling ("Circular" SCAN)
  - Move the head in one direction until an edge of the disk is reached, and then reset to the opposite edge.

[Figure: track axis from 0 to 150.]

LOOK scheduling
  - Same as C-SCAN except the head is reset when no more requests exist between the current head position and the approaching edge of the disk.
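The head-movement totals for FCFS, SSTF, and SCAN can be reproduced with a small simulation. It is a sketch under one assumption: the request queue is read from right to left, so the arrival order is 150, 16, 147, 14, 72, 83. That reading reproduces the 221-track and 187-track figures exactly and gives 552 for FCFS, close to the quoted 550. SCAN here reverses after the last request in the current direction, as described above, and starts by moving toward track 0. All function names are illustrative.

```python
def total_movement(start: int, order: list) -> int:
    """Sum of track distances travelled when serving requests in this order."""
    moved, pos = 0, start
    for track in order:
        moved += abs(track - pos)
        pos = track
    return moved


def fcfs(start: int, queue: list) -> list:
    """First come, first served: keep the arrival order."""
    return list(queue)


def sstf(start: int, queue: list) -> list:
    """Shortest seek time first: always serve the closest pending request."""
    pending, pos, order = list(queue), start, []
    while pending:
        nearest = min(pending, key=lambda t: abs(t - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


def scan(start: int, queue: list, toward_zero: bool = True) -> list:
    """Sweep one way serving requests in track order, then reverse."""
    if toward_zero:
        first = sorted((t for t in queue if t <= start), reverse=True)
        second = sorted(t for t in queue if t > start)
    else:
        first = sorted(t for t in queue if t >= start)
        second = sorted((t for t in queue if t < start), reverse=True)
    return first + second


if __name__ == "__main__":
    arrivals = [150, 16, 147, 14, 72, 83]  # assumed arrival order
    head = 65
    for name, policy in (("FCFS", fcfs), ("SSTF", sstf), ("SCAN", scan)):
        order = policy(head, arrivals)
        print(f"{name}: order {order}, moves {total_movement(head, order)} tracks")
```

SSTF is greedy and can starve requests far from the head, which is one reason the sweep-based policies are preferred even when their totals are similar.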

Disk Performance: disk partitioning

Disks are typically partitioned to minimize the largest possible seek time.
  - A partition is a collection of cylinders.
  - Each partition is a logically separate disk.

[Figure: a disk divided into Partition A and Partition B.]

Disks: Technology Trends

Disks are getting smaller in size.
  - Smaller disks spin faster, give the head a smaller distance to travel, and weigh less.
Disks are getting denser.
  - More bits per square inch: small disks with large capacities.
Disks are getting cheaper.
  - 2×/year since 1991.
Disks are getting faster.
  - Seek time, rotational latency: 5-10%/year (2-3× per decade)
  - Bandwidth: 20-30%/year (~10× per decade)
Overall: disk capacities are improving much faster than performance.
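The per-decade multipliers quoted for the technology trends come from compounding the annual rates. The sketch below shows that arithmetic; the function name is illustrative, and the rates are the ones listed above.

```python
def decade_factor(annual_rate: float, years: int = 10) -> float:
    """Compound an annual improvement rate over a number of years."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    trends = (
        ("seek/rotation, 5%/year", 0.05),
        ("seek/rotation, 10%/year", 0.10),
        ("bandwidth, 20%/year", 0.20),
        ("bandwidth, 30%/year", 0.30),
    )
    for label, rate in trends:
        print(f"{label}: {decade_factor(rate):.1f}x per decade")
```

Doubling every year compounds to roughly a thousandfold change per decade, which is why capacity and cost pull so far ahead of seek time and bandwidth.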

Management of Multiple Disks: using multiple disks to increase disk throughput

Disk striping (RAID-0)
  - Blocks are broken into sub-blocks that are stored on separate disks (similar to memory interleaving).
  - Provides higher disk bandwidth through a larger effective block size.

[Figure: an OS disk block holding the values 8 9 10 11 12 13 14 15 0 1 2 3, split into sub-blocks across physical disks 1, 2, and 3.]

Management of Multiple Disks: using multiple disks to improve reliability and availability

To increase the reliability of the disk, redundancy must be introduced.
  - Simple scheme: disk mirroring (RAID-1)
  - Write to both disks, read from either.

[Figure: the same bit pattern stored on the primary disk and on the mirror disk.]
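Striping and mirroring can both be sketched in a few lines. This is a toy model, not a driver: disks are plain Python lists and dictionaries, sub-blocks are dealt out round-robin (an assumed placement; the figure does not fix one), and the names `stripe`, `unstripe`, and `Mirror` are hypothetical.

```python
def stripe(units: list, num_disks: int) -> list:
    """RAID-0 style: deal sub-blocks out to the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for i, unit in enumerate(units):
        disks[i % num_disks].append(unit)
    return disks


def unstripe(disks: list) -> list:
    """Reassemble the original sequence from the striped disks."""
    n = len(disks)
    total = sum(len(d) for d in disks)
    return [disks[i % n][i // n] for i in range(total)]


class Mirror:
    """RAID-1 style: write to both disks, read from either."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, addr: int, data: bytes) -> None:
        self.primary[addr] = data
        self.mirror[addr] = data

    def read(self, addr: int, primary_failed: bool = False) -> bytes:
        source = self.mirror if primary_failed else self.primary
        return source[addr]


if __name__ == "__main__":
    block = [8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3]
    disks = stripe(block, 3)
    print(disks)
    print(unstripe(disks) == block)
```

Striping lets all three disks transfer in parallel but adds no redundancy; mirroring doubles the storage cost in exchange for surviving the loss of either disk.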

Who controls the RAID?

Hardware
  + Tends to be reliable (hardware implementers test)
  + Offloads parity computation from the CPU
      Hardware is a bit faster for rewrite-intensive workloads
  - Dependent on card for recovery (replacements?)
  - Must buy card (for the PCI bus)
  - Serial reconstruction of lost disk
Software
  - Software has bugs
  - Ties up CPU to compute parity
  + Other OS instances might be able to recover
  + No additional cost
  + Parallel reconstruction of lost disk

Management of Multiple Disks: using multiple disks to increase disk throughput

RAID (redundant array of inexpensive disks)
  - Byte-wise striping of the disks (RAID-3) or block-wise striping of the disks (RAID-0/4/5)
  - Provides better performance and reliability

Example: storing the byte-string 101 in a RAID-3 system

[Figure: disks 1, 2, and 3 holding 1, 0, and 1 respectively.]

Improving Reliability and Availability: RAID-4

Block-interleaved parity striping
  - Allows one to recover from the crash of any one disk.
  - Example: storing 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3

[Figure: RAID-4 layout showing the bit patterns of the example values on Disk 1, Disk 2, and Disk 3, with the corresponding parity bits on a dedicated Parity Disk.]

Improving Reliability and Availability: RAID-5

Block-interleaved parity striping

[Figure: RAID-5 layout of one block across Disk 1 through Disk 5, four data sub-blocks plus a parity sub-block.]
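The recovery property of block-interleaved parity follows from XOR: the parity block is the XOR of the data blocks in a stripe, so any single missing block equals the XOR of everything that survives. The sketch below uses the example values 8 through 15 and 0 through 3, grouped into stripes of three data blocks; that grouping is an assumption, since the exact placement in the figure is not recoverable. The function names are illustrative.

```python
from functools import reduce
from operator import xor


def parity(blocks: list) -> int:
    """Parity block for one stripe: XOR of all data blocks."""
    return reduce(xor, blocks)


def recover(surviving_blocks: list, parity_block: int) -> int:
    """Rebuild the single missing data block from the survivors and parity."""
    return reduce(xor, surviving_blocks, parity_block)


if __name__ == "__main__":
    values = [8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3]
    # Assumed layout: each stripe puts one block on each of three data disks.
    stripes = [values[i:i + 3] for i in range(0, len(values), 3)]
    for data in stripes:
        p = parity(data)
        # Pretend the middle disk crashed and rebuild its block.
        rebuilt = recover([data[0], data[2]], p)
        print(f"data {data} parity {p:04b} rebuilt middle block {rebuilt}")
```

RAID-4 keeps every parity block on one dedicated disk, while RAID-5 spreads the parity blocks across all disks so that no single disk absorbs every parity update; the XOR arithmetic is the same in both.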

Improving Reliability and Availability: RAID-5 (continued)

Block-interleaved parity striping

[Figure: RAID-5 layout across Disk 1 through Disk 5 for Block x, Block x+1, Block x+2, and Block x+3. Each block is split into four data sub-blocks plus one parity sub-block, and the parity sub-block is stored on a different disk for each block.]