Why disk arrays?




1/19 Why disk arrays?
CPU speeds increase faster than disk speeds
- Time won't really help workloads where the disk is the bottleneck
Some applications (audio/video) require big files
Disk arrays make one logical disk out of many physical disks
More disks yield two benefits:
- Higher data transfer rates on large data accesses
- Higher I/O rates on small data accesses
Old concept, popularized by Berkeley in the 90s
- A good survey is [Chen], from which these slides draw heavily

2/19 Reliability implications of arrays
An array with 100 disks is nearly 100× as likely to experience a failure!
- (1 − ε)^100 ≈ 1 − 100ε for small per-disk failure probability ε
- In a plain array, one disk failure can bring down the whole array
- Each disk with a 200,000-hour MTBF → array MTBF of about 2,000 hours (3 months) [approximate, since it double-counts two disks failing]
Idea: use redundancy to improve reliability
- But that makes writes slower, since the redundant info must be updated
- Raises consistency issues in the face of power failures
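
The arithmetic above is easy to sanity-check. A minimal Python sketch; the per-disk failure probability eps is a made-up example value:

disks = 100
mtbf_disk = 200_000            # hours, per-disk MTBF from the slide

# With independent failures, array MTBF ~ per-disk MTBF / #disks:
print(mtbf_disk / disks)       # 2000.0 hours, roughly 3 months

# First-order approximation (1 - eps)^n ~ 1 - n*eps for small eps:
eps = 1e-4                     # hypothetical per-disk failure probability
exact = 1 - (1 - eps) ** disks
approx = disks * eps
print(exact, approx)           # both come out near 0.01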

3/19 Disk array basics
Many different ways to configure arrays
- Common terminology introduced by Berkeley
- RAID (redundant array of inexpensive disks)
- RAID levels correspond to schemes with different properties
Data striping balances load across disks
- Fine-grained striping → high transfer rates for all requests
- Coarse-grained striping → allows parallel small requests to different disks
Redundancy improves reliability
- Many ways to compute and distribute the redundant information

4/19 RAID 0
Non-redundant storage (JBOD: just a bunch of disks)
- E.g., stripe sequential 128K logical regions across different disks (see the mapping sketch below)
Offers the best possible write performance (only one write per write)
Offers the best possible storage efficiency (no redundancy)
Offers good read performance
Use when speed is more important than reliability (e.g., scientific computing)
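
A minimal sketch of the block mapping such striping implies; the stripe unit, disk count, and the raid0_map helper are all illustrative assumptions, not any particular controller's scheme:

STRIPE_BLOCKS = 32        # e.g., a 128K stripe unit with 4K blocks (assumed)
NDISKS = 4

def raid0_map(logical_block):
    # split the logical address into (stripe, offset), round-robin stripes
    stripe = logical_block // STRIPE_BLOCKS
    offset = logical_block % STRIPE_BLOCKS
    disk = stripe % NDISKS
    physical = (stripe // NDISKS) * STRIPE_BLOCKS + offset
    return disk, physical

print(raid0_map(0))       # (0, 0)
print(raid0_map(32))      # (1, 0): the next stripe lands on the next disk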

5/19 RAID 1
Mirrored storage: each block stored on two disks
Writing is slower (twice as many operations)
Storage efficiency is 1/2 of optimal
Small reads potentially better than RAID 0
- Can read from the disk with the shortest seek (sketched below)
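
A minimal sketch of that read optimization, assuming the controller tracks each mirror's head position (the cylinder numbers are invented):

heads = [120, 870]        # hypothetical current cylinder of each mirror's head

def pick_mirror(target_cylinder):
    # choose the mirror whose head has the shorter distance to travel
    return min((abs(heads[m] - target_cylinder), m) for m in (0, 1))[1]

print(pick_mirror(100))   # 0: mirror 0's head is much closer
print(pick_mirror(900))   # 1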

6/19 RAID 2
Use Hamming codes, like ECC memory
- Multiply the data by a generator matrix G = (I | A):

      | 1 0 0 0 1 1 1 |
  G = | 0 1 0 0 0 1 1 |
      | 0 0 1 0 1 0 1 |
      | 0 0 0 1 1 1 0 |

- With data D = (d1 d2 d3 d4), the encoding is

  E = (D·G)^T = (d1, d2, d3, d4, d1+d3+d4, d1+d2+d4, d1+d2+d3)^T

7/19 Hamming codes (continued)
Decode by multiplying by H = (A^T | I):

      | 1 0 1 1 1 0 0 |
  H = | 1 1 0 1 0 1 0 |
      | 1 1 1 0 0 0 1 |

For any valid codeword, H·E = (0 0 0)^T
Can recover any two missing bits
Can even recover from one incorrect bit!
- If one bit of H·E is 1, the corresponding extra bit is wrong
- If two bits of H·E are 1, d2, d3, or d4 is wrong
- If all 3 bits of H·E are 1, d1 is wrong
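
Putting the two slides above together, a minimal sketch of the (7,4) code: encode with G, then detect and correct a single flipped bit via H (helper names are hypothetical; all arithmetic is mod 2):

G = [
    [1,0,0,0,1,1,1],
    [0,1,0,0,0,1,1],
    [0,0,1,0,1,0,1],
    [0,0,0,1,1,1,0],
]
H = [
    [1,0,1,1,1,0,0],
    [1,1,0,1,0,1,0],
    [1,1,1,0,0,0,1],
]

def encode(d):                    # d = [d1, d2, d3, d4]
    return [sum(d[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def syndrome(e):                  # all zeros iff the codeword is valid
    return [sum(H[i][j] * e[j] for j in range(7)) % 2 for i in range(3)]

def correct(e):                   # fix a single flipped bit in place
    s = syndrome(e)
    if any(s):
        # the syndrome equals the H column of the bad bit; find and flip it
        bad = [[H[i][j] for i in range(3)] == s for j in range(7)].index(True)
        e[bad] ^= 1
    return e

cw = encode([1, 0, 1, 1])
print(syndrome(cw))               # [0, 0, 0]
cw[2] ^= 1                        # corrupt one bit
print(correct(cw) == encode([1, 0, 1, 1]))   # True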

8/19 Properties of RAID 2
Small reads about like RAID 0
- Though more contention for the data disks
Writes must update multiple error-correction disks
Storage more efficient than RAID 1 (uses more than half of the disks for data)
Recovers from errors (RAID 1 assumes fail-stop)
- Is this overkill?
- Most disks are mostly fail-stop...
- Or data corruption involves higher-layer FS/OS bugs or hardware errors, so it's better to do an end-to-end check

9/19 RAID 3
Bit-interleaved parity
Assume fail-stop disks and add one parity disk
With data D = (d1 d2 d3 d4):

  E = (d1, d2, d3, d4, d1+d2+d3+d4)^T

Any read touches all data disks
Any write touches all data disks plus the parity disk
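
Since the parity is just the XOR of the data, rebuilding any one lost unit is the XOR of everything that survives. A minimal sketch with toy 4-bit values standing in for disk contents:

from functools import reduce

data = [0b1010, 0b0110, 0b1111, 0b0001]       # d1..d4
parity = reduce(lambda a, b: a ^ b, data)     # d1 ^ d2 ^ d3 ^ d4

# The disk holding d3 fails (fail-stop, so we know which one). Rebuild it:
survivors = data[:2] + data[3:]
rebuilt = reduce(lambda a, b: a ^ b, survivors, parity)
print(rebuilt == data[2])                     # True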

10/19 RAID 4
Block-interleaved parity
- Interleave data in blocks instead of bits
Reads smaller than the striping unit can access just one disk
Writes must update the data block and compute and update the parity block
- Small writes require two reads plus two writes (sketched below)
- Heavy contention for the parity disk (all writes touch it)
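
A minimal sketch of that small-write parity update (values are arbitrary examples): the new parity is old parity XOR old data XOR new data, so no other data disk needs to be read:

old_data, old_parity = 0b0110, 0b1011
new_data = 0b1100

# Two reads (old data, old parity) have already happened; now compute:
new_parity = old_parity ^ old_data ^ new_data
# ...and two writes follow: new_data to the data disk, new_parity to parity.
print(bin(new_parity))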

11/19 RAID 5
Block-interleaved distributed parity
- Distribute parity uniformly over all the disks
- Want to access disks sequentially when sequentially accessing logical blocks:

            Disk 0  Disk 1  Disk 2  Disk 3  Disk 4
  Stripe 0:    0       1       2       3      p0
  Stripe 1:    5       6       7      p1       4
  Stripe 2:   10      11      p2       8       9
  Stripe 3:   15      p3      12      13      14
  Stripe 4:   p4      16      17      18      19

Better load balancing than RAID 4
Small writes still require read-modify-write
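
A minimal sketch that reproduces the left-symmetric layout in the table above, assuming one-block stripe units (raid5_map is a hypothetical helper):

NDISKS = 5

def raid5_map(block):
    stripe = block // (NDISKS - 1)        # NDISKS-1 data blocks per stripe
    i = block % (NDISKS - 1)
    parity_disk = (NDISKS - 1) - (stripe % NDISKS)
    data_disk = (parity_disk + 1 + i) % NDISKS
    return data_disk, parity_disk

for b in range(8):                        # first two rows of the table
    print(b, raid5_map(b))
# Blocks 0-3 land on disks 0-3 with parity on disk 4; block 4 lands on
# disk 4 with parity on disk 3, so sequential access walks the disks in order.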

12/19 RAID 6
P+Q redundancy
- Two parity blocks instead of one
- With Reed-Solomon codes, can lose any two blocks
Compute parity on bytes, not bits
- Treat bytes as elements of the 256-element field with generator g:

  P = d0 ⊕ d1 ⊕ d2 ⊕ d3
  Q = d0 ⊕ g·d1 ⊕ g²·d2 ⊕ g³·d3

- If we lose P and a data block, we can reconstruct from Q
- If we lose two data blocks, solve two equations with two unknowns
Small writes require read-modify-write of both parity blocks plus the data
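
A minimal sketch of the P+Q arithmetic; it assumes the common GF(256) reduction polynomial 0x11d and generator g = 2, which the slide leaves unspecified:

def gf_mul(a, b):                 # multiply in GF(256), reducing by 0x11d
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)         # a^255 = 1 in GF(256), so a^-1 = a^254

g = 2
d = [0x12, 0x34, 0x56, 0x78]      # d0..d3, one byte from each data disk
P = d[0] ^ d[1] ^ d[2] ^ d[3]
Q = d[0] ^ gf_mul(g, d[1]) ^ gf_mul(gf_pow(g, 2), d[2]) ^ gf_mul(gf_pow(g, 3), d[3])

# Lose P and d2: strip the surviving terms out of Q, then divide by g^2.
rest = d[0] ^ gf_mul(g, d[1]) ^ gf_mul(gf_pow(g, 3), d[3])
print(gf_mul(Q ^ rest, gf_inv(gf_pow(g, 2))) == d[2])   # True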

13/19 RAID Summary
Efficiency of reading/writing/storing data with G disks:

  RAID level  SmRd   SmWr          BigRd    BigWr    Space
  0           1      1             1        1        1
  1           1      1/2           1        1/2      1/2
  3           1/G    1/G           (G-1)/G  (G-1)/G  (G-1)/G
  5           1      max(1/G,1/4)  1        (G-1)/G  (G-1)/G
  6           1      max(1/G,1/6)  1        (G-2)/G  (G-2)/G

Notes:
- RAID 4 is strictly inferior to RAID 5
- RAID 2 is inferior to RAID 5 for fail-stop disks
- RAID 1 is just RAID 5 with parity group size G = 2
- RAID 3 is just like RAID 5 with a very small stripe unit

14/19 When/how do disks fail?
Disks can fail very early in their lifetimes (manufacturing errors)
They also tend to fail late in their lifetimes (when the disk wears out)
A systematic manufacturing defect can make an entire batch of disks fail early
- Beware disks with consecutive serial numbers!
Environmental factors can kill a bunch of disks at once (e.g., an air conditioning failure)
Disks can fail when a bad block is read
- A bad block may exist for a while before being detected
See [Pinheiro] for a more detailed take

15/19 Dealing with failures
Basic idea:
- Add a new disk
- Reconstruct the failed disk's state on the new disk
Must store metadata during recovery
- Which disks have failed?
- How much of the failed disk has been reconstructed?
System crashes become very serious in conjunction with a disk failure
- Parity may be inconsistent (particularly bad for P+Q)
- You could lose a block other than the one you were writing
- MUST log enough info in NVRAM to recover parity (sketched below)
- Makes software-only implementations of RAID risky
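
A minimal sketch of the NVRAM-logging idea, with an in-memory set standing in for real NVRAM and every name hypothetical: log which stripe is mid-update before touching disk, so recovery knows exactly whose parity to recompute.

stripes = {0: [3, 7, 2, 6]}        # stripe -> data blocks + parity (last)
nvram_log = set()                  # stands in for battery-backed NVRAM

def small_write(stripe, i, new_data):
    nvram_log.add(stripe)                      # 1. log intent
    old = stripes[stripe][i]
    stripes[stripe][i] = new_data              # 2. write data
    stripes[stripe][-1] ^= old ^ new_data      # 3. write parity
    nvram_log.discard(stripe)                  # 4. commit: clear the log

def recover_after_crash():
    for s in list(nvram_log):                  # only logged stripes suspect
        p = 0
        for b in stripes[s][:-1]:
            p ^= b
        stripes[s][-1] = p                     # recompute parity from data
        nvram_log.discard(s)

small_write(0, 1, 9)
d1, d2, d3, p = stripes[0]
print(p == d1 ^ d2 ^ d3)                       # True: parity stays consistent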

16/19 Maximizing availability
Want to keep operating after a failure
Demand reconstruction
- Assumes a spare disk is immediately (or already) installed
- Reconstruct blocks as they are accessed
- A background thread reconstructs all remaining blocks
Parity sparing
- Replaces the parity block with the reconstructed data block
- Needs extra metadata to keep track of this

17/19 Unrecoverable RAID failures
Double disk failures (or triple, with P+Q redundancy)
A system crash followed by a disk failure
A disk failure, then discovering a bad block while reading during reconstruction

18/19 Tuning RAID
What is the optimal size of a data stripe in a RAID 0 disk array?

  sqrt( P · X · (L − 1) · Z / N )    (worked numerically below)

- P: average positioning time
- X: disk transfer rate
- L: concurrency of the workload
- Z: request size
- N: size of the array in disks
What about in RAID 5?
- Reads: similar to RAID 0
- Writes: the optimum is a factor of 4 smaller than for reads (for 16 disks)
- It seems to vary WITH the number of disks, while reads vary inversely!
Conclusion: very workload dependent!
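
To make the stripe-size formula concrete, a numeric sketch; every parameter value here is a made-up example, not a measurement:

from math import sqrt

P = 0.010            # average positioning time: 10 ms
X = 50e6             # transfer rate: 50 MB/s
L = 5                # concurrent requests in the workload
Z = 64 * 1024        # request size: 64 KB
N = 16               # disks in the array

stripe = sqrt(P * X * (L - 1) * Z / N)
print(f"{stripe / 1024:.0f} KB")   # roughly 88 KB with these numbers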

19/19 Midterm reminder
Midterm exam 4:15pm-5:30pm Monday, Gates B-01
- Bring papers and notes!
- No electronic devices allowed
Check the Google group for extra office hours / section announcements
Refresh the course home page to see extra office hours