HDD Ghosts, Goblins and Failures in RAID Storage Systems



Similar documents
Difference between Enterprise SATA HDDs and Desktop HDDs. Difference between Enterprise Class HDD & Desktop HDD

RAID Technology Overview

Hard Disk Drives: The Good, The Bad and The Ugly

Enhanced Reliability Modeling of RAID Storage Systems

Case for storage. Outline. Magnetic disks. CS2410: Computer Architecture. Storage systems. Sangyeun Cho

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

Definition of RAID Levels

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

CS420: Operating Systems

January A Better RAID Strategy for High Capacity Drives in Mainframe Storage

Enterprise-class versus Desktop-class Hard Drives

Reliability of Data Storage Systems

Price/performance Modern Memory Hierarchy

Using RAID6 for Advanced Data Protection

RAID Basics Training Guide

3.5 Dual Bay USB 3.0 RAID HDD Enclosure

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

HARD DRIVE CHARACTERISTICS REFRESHER

Enterprise-class versus Desktopclass

an analysis of RAID 5DP

RAID Overview: Identifying What RAID Levels Best Meet Customer Needs. Diamond Series RAID Storage Array

Storing Data: Disks and Files

Implementation of Reliable Fault Tolerant Data Storage System over Cloud using Raid 60

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

What is RAID and how does it work?

Reliability and Fault Tolerance in Storage

NetApp RAID-DP : Dual-Parity RAID 6 Protection Without Compromise

Storage Technologies - 2

Storage node capacity in RAID0 is equal to the sum total capacity of all disks in the storage node.

Disk Array Data Organizations and RAID

Sistemas Operativos: Input/Output Disks

How To Create A Multi Disk Raid

RAID Made Easy By Jon L. Jacobi, PCWorld

Energy aware RAID Configuration for Large Storage Systems

Dependable Systems. 9. Redundant arrays of. Prof. Dr. Miroslaw Malek. Wintersemester 2004/05

Reliability and Recovery: How Can We Make Better Storage Devices?

Reliability of Systems and Components

Lecture 36: Chapter 6

Solving Data Loss in Massive Storage Systems Jason Resch Cleversafe

Configuring ThinkServer RAID 500 and RAID 700 Adapters. Lenovo ThinkServer

Mean time to meaningless: MTTDL, Markov models, and storage system reliability

Why disk arrays? CPUs improving faster than disks

Hydra esata. 4-Bay RAID Storage Enclosure. User Manual January 16, v1.0

An Introduction to RAID 6 ULTAMUS TM RAID

Intel RAID Controllers

VERY IMPORTANT NOTE! - RAID

PARALLELS CLOUD STORAGE

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

Dual/Quad 3.5 SATA to USB 3.0 & esata External Hard Drive RAID/Non-RAID Enclosure w/fan. User s Manual

How To Understand And Understand The Power Of Aird 6 On Clariion

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

PIONEER RESEARCH & DEVELOPMENT GROUP

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

How To Improve Performance On A Single Chip Computer

Chapter 9: Peripheral Devices: Magnetic Disks

The Introduction of ORICO 3549RUS3

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Maintenance Best Practices for Adaptec RAID Solutions

RAID Levels and Components Explained Page 1 of 23

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

SAN Conceptual and Design Basics

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Data Corruption In Storage Stack - Review

Silent data corruption in SATA arrays: A solution

Dynamode External USB3.0 Dual RAID Encloure. User Manual.

Contents. Overview. Drive Policy RAID 500 features. Disable BGI RAID 700 features. Management Tasks Choosing the RAID Level.

How To Write A Disk Array

Guide to SATA Hard Disks Installation and RAID Configuration

CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS

Data Storage - II: Efficient Usage & Errors

RAID Level Descriptions. RAID 0 (Striping)

HP Smart Array Controllers and basic RAID performance factors

Chapter 6 External Memory. Dr. Mohamed H. Al-Meer

QuickSpecs. HP Smart Array 5312 Controller. Overview

Distribution One Server Requirements

Hydra Super-S LCM. 4-Bay RAID Storage Enclosure for four 3.5-inch Serial ATA Hard Drives. User Manual August 18, v1.0

RAID Implementation for StorSimple Storage Management Appliance

CS 6290 I/O and Storage. Milos Prvulovic

How to recover a failed Storage Spaces

DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

William Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory

Guide to SATA Hard Disks Installation and RAID Configuration

CS161: Operating Systems

Striped Set, Advantages and Disadvantages of Using RAID

ES-1 Elettronica dei Sistemi 1 Computer Architecture

RAID. Contents. Definition and Use of the Different RAID Levels. The different RAID levels: Definition Cost / Efficiency Reliability Performance

RAID Storage System of Standalone NVR

RAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead

Sonexion GridRAID Characteristics

1 Storage Devices Summary

Introduction to I/O and Disk Management

Hydra Super-S Combo. 4-Bay RAID Storage Enclosure (3.5 SATA HDD) User Manual July 29, v1.3

Technology Update White Paper. High Speed RAID 6. Powered by Custom ASIC Parity Chips

Guide to SATA Hard Disks Installation and RAID Configuration

Transcription:

HDD Ghosts, Goblins and Failures in RAID Storage Systems Jon G. Elerath, Ph. D. February 19, 2013

Agenda RAID > HDDs > RAID Redundant Array of Independent Disks What is RAID storage? How does RAID work? Common configurations Why is RAID Needed? HDD ghosts, goblins and failures (errors and failures) HDD Field Reliability for RAIDed HDDs HDD Reliability Metric Realities of RAID Operations Impact on RAID Reliability Estimates 2

HDDs in RAID Storage Systems What is RAID storage? Redundant Array of Independent Disks How does RAID work? Data Redundancy Parity: Exclusive logic Single parity creates one bit for a data stripe Dual parity creates two parity bits across different stripes of data Why is it needed? 3

RAID Storage Systems Common configurations RAID 0 No redundancy Any failure causes data loss RAID 1 Mirrored disks (n+n) all data is saved on two identical HDDs requires both disks to fail concurrently RAID 4 and RAID 5 (n+1) n data disks + 1 parity disk requires 2 failures to lose data RAID 6 (RAID DP) (n+2) n data disks + 2 parity disks requires 3 failures to lose data 4

RAID Operation (Overly Simplified) RAID 0 No redundancy Data 1 Data 2 Data 3 Data 4 Data 5 Data 6 RAID 1 Mirrored Disks Data 1 Data 1 Data 2 Data 2 Data 3 Data 3 5

RAID Operation (Unbelievably Oversimplified) RAID 5 n+1 Redundancy RAID 6 n+2 Redundancy cpu Data Disks cpu Data Disks 1 ECC Disk Equivalent 2 ECC Disk Equivalents 6

HDDs in RAID Storage Systems What is RAID storage? Redundant Array of Independent Disks How does RAID work? Data Redundancy Parity: Exclusive logic Single parity creates one bit for a data stripe Dual parity creates two parity bits across different stripes of data Why is RAID needed? Disk drives suck A perception by Dan Warmenhoven, CEO NetApp (circa 2005) 7

HDD Events (Ghosts, Goblins and Failures) Uncorrectable Cannot Find Data Cannot Read Data Correctable in RAID Data Missing Servo Track SMART Limit Hit Error During Write Written & Destroyed Electronics Can t Stay on Track Read Head Media Inherent Bit Error High Fly Write Scratched Media Corrosion Thermal Asperity 8

HDD Events (Ghosts, Goblins and Failures) Servo Track Electronics Uncorrectable Cannot Find Data Can t Stay on Track Read Head SMART Limit Hit Cannot Read Data Correctable Servo data is written at regular intervals on every data Data track of every disk surface. The servo data is used to Missing control the positioning of the read/write heads. Tracks on a HDD are not perfectly circular. Head position is continuously measured Error and Written compared to where it should be. A During position Write error signal & Destroyed is used to reposition the head over the track. Repeatable run out is part of normal HDD head positioning control. Non repeatable run out (NRRO) cannot be corrected by the HDD Thermal firmware Media since it is non repeatable. NRRO, caused Asperity by mechanical tolerances from the motor bearings, actuator Inherent arm bearings, noise, vibration and servo loop Corrosionresponse Bit Error errors, can cause the head positioning to take too long to High Fly Scratched lock onto a track and ultimately produce an error. Write Media 9

HDD Events (Ghosts, Goblins and Failures) Uncorrectable Cannot Find Data Cannot Read Data Correctable in RAID Data Missing Servo Track SMART Limit Hit Error During Write Written & Destroyed Electronics Can t Stay on Track Read Head Media Inherent Bit Error High Fly Write Scratched Media Corrosion Thermal Asperity 10

HDD Events (Ghosts, Goblins and Failures) Uncorrectable Cannot Find Data Cannot Read Data Correctable in RAID Data Missing Servo Track SMART Limit Hit Error During Write Written & Destroyed Electronics Can t Stay on Track Read Head Most events are heads and media related Media Inherent Bit Error High Fly Write Scratched Media Corrosion Thermal Asperity 11

Uncorrectable Errors and Field Failures (RAID) Uncorrectable errors constitute what most people think of as failures HDD crashed Won t spin Most field data is based on these ONLY Field data do not match vendor specifications HDD reliability depends on the HDD s Manufacturer Product Family from the manufacturer Capacity Age Use conditions 12

Mean Time Between Failures (MTBF) MTBF is NOT a good metric for measuring HDD reliability (Elerath, J. G. 1998. Specifying Reliability in the Disk Drive Industry. Proc. Annual Rel. and Maint. Symposium, IEEE. pp194 199.) What does MTBF really mean to a consumer? Does NOT include correctable events Failure Rate, h(t) Combined Failure Rate 0.000008 0.000007 0.000006 0.000005 0.000004 0.000003 0.000002 0.000001 0 0 43800 87600 131400 175200 Time (hrs) early h(t) useful h(t) wearout h(t) total h(t) 13

Real HDD Field Reliability Data Vintage affects reliability Probability of Failure 0.02 0.02 0.01 8.00E-3 4.00E-3 0 Vintage 5 Vintage 2 Vintage 4 Vintage 3 Vintage 1 Composite 0 4000.00 8000.00 12000.00 16000.00 2000 Time to Failure, hrs Slightly decreasing failure rates, then increasing rate (esp. HDD#2) Probability of Failure 99. 90. 50. 10. 5. 1. 0.5 0.1 0.05 6.0 3.0 2.0 1.6 1.4 1.2 1.0 0.9 0.8 0.7 0.6 0.5 HDD #1 HDD #2 HDD #3 0.01 Time to Failure, hrs 14

Reality of RAID Operations RAID 4 & RAID 5 (n+1) Most likely combination of failures is a unknown data corruption in a small amount of data (several sectors) followed by a concurrent, noncorrectable error in a different HDD During reconstruction onto a new HDD, the corrupted data cannot be recovered RAID 6 (n+2) Most likely combinations are 1 data corruption, followed by two uncorrectable failures (HDD failures) simultaneously. The second failure occurs before the first can be incorporated into the RAID group and reconstructed. Far less likely than RAID 4 or RAID 5 data losses, but still measurable. 15

Reality of RAID Operations Read after write is too slow; not used. Checked only on READ. Data corruptions can occur During READ operations During WRITE operations Whenever an HDD is spinning Sometimes when the HDD is not spinning (depending on the design) Small Data Corruptions All HDDs have error recovery algorithms built into the HDD Small errors are corrected on the fly (less than one revolution) Large Data Corruptions Require ECC and data from other HDDs in the RAID group Disk scrubbing by the RAID controller proactively finds these larger errors and corrects them to prevent data loss at the RAID group level 16

Impact on RAID Reliability Estimates Mean Time To Data Loss (MTTDL) Concept statistically flawed 1 Doesn t include undiscovered defects Ridiculously optimistic RAID 4/5 Reliability Estimates Monte Carlo Simulations 1 Simple Equation 2 RAID 6 Monte Carlo Simulations Simple Equation 3 DDFs per 1000 RAID Groups 25 20 15 10 5 0 RAID Group Size = 14 Non-HPP Simulation NetApp RAID Eq. NetApp Field Data (FC) MTTDL 0 4380 8760 13140 Age, Hours http://raideqn.netapp.com 1 Elerath & Pecht. A Highly Accurate Method for Assessing Reliability of Redundant Arrays of Inexpensive Disks (RAID). Trans. on Computers. 58, 3, March 2009. 2 Elerath. A Simple Equation for Estimating Reliability of an N+1 Redundant Array of Independent Disks (RAID). DSN 2009. 3 Elerath & Schindler. Beyond MTTDL: A closed form RAID 6 reliability equation. WIP. FAST 2013. 17

Conclusion HDD head/disk interface is critical to achieving RAID reliability RAID designers must do background media scrubbing Scrub frequency is critical to RAID reliability Larger HDDs (2+ TB) have more opportunities to develop media defects and require long times to scrub Field data does not map to true defect rates 18