CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC

Similar documents

Price/performance Modern Memory Hierarchy

Disk Storage & Dependability

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

How To Improve Performance On A Single Chip Computer

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Definition of RAID Levels

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

Lecture 36: Chapter 6

CS 6290 I/O and Storage. Milos Prvulovic

Reliability and Fault Tolerance in Storage

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

Fault Tolerance & Reliability CDA Chapter 3 RAID & Sample Commercial FT Systems

1 Storage Devices Summary

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

PIONEER RESEARCH & DEVELOPMENT GROUP

How To Write A Disk Array

Why disk arrays? CPUs improving faster than disks

CS161: Operating Systems

Disk Array Data Organizations and RAID

Data Storage - II: Efficient Usage & Errors

Storing Data: Disks and Files

RAID Basics Training Guide

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

How To Create A Multi Disk Raid

Solving Data Loss in Massive Storage Systems Jason Resch Cleversafe

CS420: Operating Systems

Dependable Systems. 9. Redundant arrays of. Prof. Dr. Miroslaw Malek. Wintersemester 2004/05

Read this before starting!

COMP 7970 Storage Systems

Filing Systems. Filing Systems

RAID Levels and Components Explained Page 1 of 23

Chapter 6 External Memory. Dr. Mohamed H. Al-Meer

Computer Systems Structure Main Memory Organization

RAID. Storage-centric computing, cloud computing. Benefits:

HARD DRIVE CHARACTERISTICS REFRESHER

Sistemas Operativos: Input/Output Disks

File System Design and Implementation

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Summer Student Project Report

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

What is RAID and how does it work?

Using RAID6 for Advanced Data Protection

RAID 5 rebuild performance in ProLiant

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

RAID Made Easy By Jon L. Jacobi, PCWorld

RAID: Redundant Arrays of Independent Disks

RAID Level Descriptions. RAID 0 (Striping)

CS 153 Design of Operating Systems Spring 2015

RAID Technology Overview

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

RAID Implementation for StorSimple Storage Management Appliance

An Introduction to RAID. Giovanni Stracquadanio

Chapter 9: Peripheral Devices: Magnetic Disks

IBM ^ xseries ServeRAID Technology

Striped Set, Advantages and Disadvantages of Using RAID

Lecture 23: Multiprocessors

Guide to SATA Hard Disks Installation and RAID Configuration

RAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead

Silent data corruption in SATA arrays: A solution

RAID Overview: Identifying What RAID Levels Best Meet Customer Needs. Diamond Series RAID Storage Array

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

Reliability of Systems and Components

Hard Disk Drives and RAID

What is RAID? data reliability with performance

Achieving High Availability

RAID. Contents. Definition and Use of the Different RAID Levels. The different RAID levels: Definition Cost / Efficiency Reliability Performance

Input/output (I/O) I/O devices. Performance aspects. CS/COE1541: Intro. to Computer Architecture. Input/output subsystem.

Block1. Block2. Block3. Block3 Striping

Homework # 2. Solutions. 4.1 What are the differences among sequential access, direct access, and random access?

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

System Availability and Data Protection of Infortrend s ESVA Storage Solution

IBM System x GPFS Storage Server

Programming NAND devices

RAID technology and IBM TotalStorage NAS products

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

William Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory

IncidentMonitor Server Specification Datasheet

Guide to SATA Hard Disks Installation and RAID Configuration

Distributed Storage Networks and Computer Forensics

Database Management Systems

Algorithms and Methods for Distributed Storage Networks 5 Raid-6 Encoding Christian Schindelhauer

Data Corruption In Storage Stack - Review

An Introduction to RAID 6 ULTAMUS TM RAID

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Q & A From Hitachi Data Systems WebTech Presentation:

How To Understand And Understand The Power Of Aird 6 On Clariion

Lecture 13: I/O: RAID, I/O Benchmarks, UNIX File System Performance Professor David A. Patterson Computer Science 252 Fall 1996

California Software Labs

Transcription:

CS 61C: Great Ideas in Computer Architecture Dependability: Parity, RAID, ECC Instructor: Justin Hsia 8/08/2013 Summer 2013 Lecture #27 1

Review of Last Lecture MapReduce Data Level Parallelism Framework to divide up data to be processed in parallel Mapper outputs intermediate (key, value) pairs Optional Combiner in between for better load balancing Reducer combines intermediate values with same key Handles worker failure and does redundant execution 8/08/2013 Summer 2013 Lecture #27 2

Agenda Dependability Administrivia RAID Error Correcting Codes 8/08/2013 Summer 2013 Lecture #27 3

Six Great Ideas in Computer Architecture 1 Layers of Representation/Interpretation 2 Technology Trends 3 Principle of Locality/Memory Hierarchy 4 Parallelism 5 Performance Measurement & Improvement 6 Dependability via Redundancy 8/08/2013 Summer 2013 Lecture #27 4

Great Idea #6: Dependability via Redundancy Redundancy so that a failing piece doesn t make the whole system fail 1+1=2 2 of 3 agree 1+1=2 1+1=2 1+1=1 FAIL! 8/08/2013 Summer 2013 Lecture #27 5

Great Idea #6: Dependability via Redundancy Applies to everything from datacenters to memory Redundant datacenters so that can lose 1 datacenter but Internet service stays online Redundant routes so can lose nodes but Internet doesn t fail Redundant disks so that can lose 1 disk but not lose data (Redundant Arrays of Independent Disks/RAID) Redundant memory bits of so that can lose 1 bit but no data (Error Correcting Code/ECC Memory) 8/08/2013 Summer 2013 Lecture #27 6

Dependability Restoration Service accomplishment Service delivered as specified Failure Fault: failure of a component May or may not lead to system failure Applies to any part of the system Service interruption Deviation from specified service 8/08/2013 Summer 2013 Lecture #27 7

Dependability Measures Reliability: Mean Time To Failure (MTTF) Service interruption: Mean Time To Repair (MTTR) Mean Time Between Failures (MTBF) MTBF = MTTR + MTTF Availability = MTTF / (MTTF + MTTR) = MTTF / MTBF Improving Availability Increase MTTF: more reliable HW/SW + fault tolerance Reduce MTTR: improved tools and processes for diagnosis and repair 8/08/2013 Summer 2013 Lecture #27 8

Reliability Measures 1) MTTF, MTBF measured in hours/failure eg average MTTF is 100,000 hr/failure 2) Annualized Failure Rate (AFR) Average rate of failures per year (%) AFR Disks MTTF 8760 hr yr 1 Disks Total disk failures/yr 8760 hr/yr MTTF 8/08/2013 Summer 2013 Lecture #27 9

Availability Measures Availability = MTTF / (MTTF + MTTR) usually written as a percentage (%) Want high availability, so categorize by number of 9s of availability per year 1 nine: 90% => 36 days of repair/year 2 nines: 99% => 36 days of repair/year 3 nines: 999% => 526 min of repair/year 4 nines: 9999% => 53 min of repair/year 5 nines: 99999% => 5 min of repair/year 8/08/2013 Summer 2013 Lecture #27 10

Dependability Example 1000 disks with MTTF = 100,000 hr and MTTR = 100 hr MTBF = MTTR + MTTF = 100,100 hr Availability = MTTF/MTBF = 09990 = 999% 3 nines of availability! AFR = 8760/MTTF = 00876 = 876% Faster repair to get 4 nines of availability? 00001 MTTF = 09999 MTTR MTTR = 10001 hr 8/08/2013 Summer 2013 Lecture #27 11

Dependability Design Principle No single points of failure Chain is only as strong as its weakest link Dependability Corollary of Amdahl s Law Doesn t matter how dependable you make one portion of system Dependability limited by part you do not improve 8/08/2013 Summer 2013 Lecture #27 12

Question: There s a hardware glitch in our system that makes the Mean Time To Failure (MTTF) decrease Are the following statements TRUE or FALSE? 1) Our system s Availability will increase 2) Our system s Annualized Failure Rate (AFR) will increase (A) (B) (C) (D) 1 F 2 F F T T F 13

Agenda Dependability Administrivia RAID Error Correcting Codes 8/08/2013 Summer 2013 Lecture #27 14

Administrivia Project 3 (individual) due Sun 8/11 Final Review Tue 8/13, 7 10pm in 10 Evans Final Fri 8/16, 9am 12pm, 155 Dwinelle 2 nd half material + self modifying MIPS MIPS Green Sheet provided again Two two sided handwritten cheat sheets Can re use your midterm cheat sheet! Dead Week WTh 8/14 15 Optional Lab 13 (EC) due anytime next week 8/08/2013 Summer 2013 Lecture #27 15

Agenda Dependability Administrivia RAID Error Correcting Codes 8/08/2013 Summer 2013 Lecture #27 16

Arrays of Small Disks Can smaller disks be used to close the gap in performance between disks and CPUs? Conventional: 4 disk types Disk Array: 1 disk type 35 Low End 525 10 14 High End 35 8/08/2013 Summer 2013 Lecture #27 17

Replace Large Disks with Large Number of Small Disks! (Data from 1988 disks) Capacity Volume Power Data Rate I/O Rate MTTF Cost IBM 3390K 20 GBytes 97 cu ft 3 KW 15 MB/s 600 I/Os/s 250 KHrs $250K IBM 35" 0061 320 MBytes 01 cu ft 11 W 15 MB/s 55 I/Os/s 50 KHrs $2K x72 23 GBytes 11 cu ft 1 KW 120 MB/s 3900 IOs/s??? ~700 Hrs Hrs $150K 9X 3X 8X 6X Disk Arrays have potential for large data and I/O rates, high MB/ft 3, high MB/KW, but what about reliability? 8/08/2013 Summer 2013 Lecture #27 18

RAID: Redundant Arrays of Inexpensive Disks Files are striped across multiple disks Concurrent disk accesses improve throughput Redundancy yields high data availability Service still provided to user, even if some components (disks) fail Contents reconstructed from data redundantly stored in the array Capacity penalty to store redundant info Bandwidth penalty to update redundant info 8/08/2013 Summer 2013 Lecture #27 19

RAID 0: Data Striping logical record 10010011 11001101 striped physical records Stripe data across all disks Generally faster accesses (access disks in parallel) No redundancy (really AID ) Bit striping shown here, can do in larger chunks 1 0 1 1 0 0 1 1 0 1 0 0 1 1 0 1 8/08/2013 Summer 2013 Lecture #27 20

RAID 1: Disk Mirroring recovery group Each disk is fully duplicated onto its mirror Very high availability can be achieved Bandwidth sacrifice on write: Logical write = two physical writes Logical read = one physical read Most expensive solution: 100% capacity overhead 8/08/2013 Summer 2013 Lecture #27 21

Parity Bit Describes whether a group of bits contains an even or odd number of 1 s Define 1 = odd and 0 = even Can use XOR to compute parity bit! Adding the parity bit to a group will always result in an even number of 1 s ( even parity ) 100 Parity: 1, 101 Parity: 0 If we know number of 1 s must be even, can we figure out what a single missing bit should be? 10?11 missing bit is 1 8/08/2013 Summer 2013 Lecture #27 22

RAID 3: Parity Disk Logical data is bytestriped across disks Parity disk P contains parity bytes of other disks If any one disk fails, can use other disks to recover data! We have to know which disk failed X Y Z Must update Parity data on EVERY write Logical write = min 2 to max N physical reads and writes parity new = data old data new parity old P 0 2 P 3 5 P 6 8 8/08/2013 Summer 2013 Lecture #27 23 D 0 D 3 D 6 D 2 D 4 D 7 D 3 D 5 D 8 P

Updating the Parity Data Examine small write in RAID 3 (1 byte) 1 logical write = 2 physical reads + 2 physical writes Same concept applies for later RAIDs, too What if writing halfword (2 B)? Word (4 B)? D 0 D 0 P P 0 0 0 0 0 0 1 1 0 1 0 1 0 1 1 0 1 0 0 1 1 0 1 0 1 1 0 0 1 1 1 1 new XOR data D 0 + D 0 D 1 D 2 D 3 P old data (1 Read) 1 only if bit changed XOR (3 Write) P D 0 D 1 D 2 D 3 + P old parity (2 Read) flip if changed (4 Write) 8/08/2013 Summer 2013 Lecture #27 24

RAID 4: Higher I/O Rate Logical data is now block striped across disks Parity disk P contains all parity blocks of other disks Because blocks are large, can handle small reads in parallel Must be blocks in different disks Still must update Parity data on EVERY write Logical write = min 2 to max N physical reads and writes Performs poorly on small writes 8/08/2013 Summer 2013 Lecture #27 25

RAID 4: Higher I/O Rate Insides of 5 disks D0 D1 D2 D3 P D4 D5 D6 D7 P D8 D9 D10 D11 P Increasing Logical Disk Address Example: small read D0 & D5, large write D12 D15 D12 D13 D14 D15 D16 D17 D18 D19 D20 D21 D22 D23 P 8/08/2013 Summer 2013 Lecture #27 26 Disk Columns P P Stripe

Inspiration for RAID 5 When writing to a disk, need to update Parity Small writes are bottlenecked by Parity Disk: Write to D0, D5 both also write to P disk D0 D1 D2 D3 P D4 D5 D6 D7 P 8/08/2013 Summer 2013 Lecture #27 27

RAID 5: Interleaved Parity Independent writes possible because of interleaved parity D0 D1 D2 D3 P D4 D5 D6 P D7 D8 D9 P D10 D11 D12 P D13 D14 D15 Increasing Logical Disk Addresses Example: write to D0, D5 uses disks 0, 1, 3, 4 P D16 D17 D18 D19 D20 D21 D22 D23 P Disk Columns 8/08/2013 Summer 2013 Lecture #27 28

Agenda Dependability Administrivia RAID Error Correcting Codes 8/08/2013 Summer 2013 Lecture #27 29

Error Detection/Correction Codes Memory systems generate errors (accidentally flipped bits) DRAMs store very little charge per bit Soft errors occur occasionally when cells are struck by alpha particles or other environmental upsets Hard errors occur when chips permanently fail Problem gets worse as memories get denser and larger 8/08/2013 Summer 2013 Lecture #27 30

Error Detection/Correction Codes Protect against errors with EDC/ECC Extra bits are added to each M bit data chunk to produce an N bit code word Extra bits are a function of the data Each data word value is mapped to a valid code word Certain errors change valid code words to invalid ones (ie can tell something is wrong) 8/08/2013 Summer 2013 Lecture #27 31

Detecting/Correcting Code Concept Space of all possible bit patterns: Error changes bit pattern to an invalid code word 2 N patterns, but only 2 M are valid code words Detection: fails code word validity check Correction: can map to nearest valid code word 8/08/2013 Summer 2013 Lecture #27 32

Hamming Distance Hamming distance = # of bit changes to get from one code word to another p = 011011, q = 001111, Hdist(p,q) = 2 p = 011011, q = 110001, Hdist(p,q) =? 3 Richard Hamming (1915 98) Turing Award Winner If all code words are valid, then min Hdist between valid code words is 1 Change one bit, at another valid code word 8/08/2013 Summer 2013 Lecture #27 33

3 Bit Visualization Aid Want to be able to see Hamming distances Show code words as nodes, Hdist of 1 as edges For 3 bits, show each bit in a different dimension: Bit 1 Bit 2 Bit 0 8/08/2013 Summer 2013 Lecture #27 34

Minimum Hamming Distance 2 Half the available code words are valid Let 000 be valid If 1 bit error, is code word still valid? No! So can detect If 1 bit error, know which code word we came from? No! Equidistant, so cannot correct 8/08/2013 Summer 2013 Lecture #27 35

Minimum Hamming Distance 3 Nearest 000 (one 1) Only a quarter of the available code words are valid Let 000 be valid Nearest 111 (one 0) How many bit errors can we detect? Two! Takes 3 errors to reach another valid code word If 1 bit error, know which code word we came from? Yes! 8/08/2013 Summer 2013 Lecture #27 36

Parity: Simple Error Detection Coding Add parity bit when writing block of data: b 7 b 6 b 5 b 4 b 3 b 2 b 1 b 0 p Check parity on block read: Error if odd number of 1s Valid otherwise b 7 b 6 b 5 b 4 b 3 b 2 b 1 b 0 p + + error Minimum Hamming distance of parity code is 2 Parity of code word = 1 indicates an error occurred: 2 bit errors not detected (nor any even #of errors) Detects an odd # of errors 8/08/2013 Summer 2013 Lecture #27 37

Parity Examples 1) Data 0101 0101 4 ones, even parity now Write to memory 0101 0101 0 to keep parity even 2) Data 0101 0111 5 ones, odd parity now Write to memory: 0101 0111 1 to make parity even 3) Read from memory 0101 0101 0 4 ones even parity, so no error 4) Read from memory 1101 0101 0 5 ones odd parity, so error What if error in parity bit? Can detect! 8/08/2013 Summer 2013 Lecture #27 38

Get To Know Your Instructor 8/08/2013 Summer 2013 Lecture #27 39

Agenda Dependability Administrivia RAID Error Correcting Codes (Cont) 8/08/2013 Summer 2013 Lecture #27 40

How to Correct 1 bit Error? Recall: Minimum distance for correction? Three Richard Hamming came up with a mapping to allow Error Correction at min distance of 3 Called Hamming ECC for Error Correction Code 8/08/2013 Summer 2013 Lecture #27 41

Hamming ECC (1/2) Use extra parity bits to allow the position identification of a single error Interleave parity bits within bits of data to form code word Note: Number bits starting at 1 from the left 1) Use all bit positions in the code word that are powers of 2 for parity bits (1, 2, 4, 8, 16, ) 2) All other bit positions are for the data bits (3, 5, 6, 7, 9, 10, ) 8/08/2013 Summer 2013 Lecture #27 42

Hamming ECC (2/2) 3) Set each parity bit to create even parity for a group of the bits in the code word The position of each parity bit determines the group of bits that it checks Parity bit p checks every bit whose position number in binary has a 1 in the bit position corresponding to p Bit 1 (0001 2 ) checks bits 1,3,5,7, (XXX1 2 ) Bit 2 (0010 2 ) checks bits 2,3,6,7, (XX1X 2 ) Bit 4 (0100 2 ) checks bits 4 7, 12 15, (X1XX 2 ) Bit 8 (1000 2 ) checks bits 8 15, 24 31, (1XXX 2 ) 8/08/2013 Summer 2013 Lecture #27 43

Hamming ECC Example (1/3) A byte of data: 10011010 Create the code word, leaving spaces for the parity bits: _ 1 _ 2 1 3 _ 4 0 5 0 6 1 7 _ 8 1 9 0 10 1 11 0 12 8/08/2013 Summer 2013 Lecture #27 44

Hamming ECC Example (2/3) Calculate the parity bits: Parity bit 1 group (1, 3, 5, 7, 9, 11):? _ 1 _ 0 0 1 _ 1 0 1 0 0 Parity bit 2 group (2, 3, 6, 7, 10, 11): 0? 1 _ 0 0 1 _ 1 0 1 0 1 Parity bit 4 group (4, 5, 6, 7, 12): 0 1 1? 0 0 1 _ 1 0 1 0 1 Parity bit 8 group (8, 9, 10, 11, 12): 0 1 1 1 0 0 1? 1 0 1 0 0 8/08/2013 Summer 2013 Lecture #27 45

Hamming ECC Example (3/3) Valid code word: 011100101010 Recover original data: 011100101010 Suppose we see 0 1 1 2 1 3 1 4 0 5 0 6 1 7 0 8 1 9 1 10 1 11 0 12 instead fix the error! Check each parity group Parity bits 2 and 8 are incorrect As 2+8=10, bit position 10 is the bad bit, so flip it! Corrected value: 011100101010 8/08/2013 Summer 2013 Lecture #27 46

Hamming ECC Cost Space overhead in single error correction code Form p + d bit code word, where p = # parity bits and d = # data bits Want the p parity bits to indicate either no error or 1 bit error in one of the p + d places Need 2 p p + d + 1, thus p log 2 (p + d + 1) For large d, p approaches log 2 (d) Example: d = 8 p = log 2 (p+8+1) p = 4 d = 16 p = 5; d = 32 p = 6; d = 64 p = 7 8/08/2013 Summer 2013 Lecture #27 47

Hamming Single Error Correction, Double Error Detection (SEC/DED) Adding extra parity bit covering the entire SEC code word provides double error detection as well! 1 2 3 4 5 6 7 8 p 1 p 2 d 1 p 3 d 2 d 3 d 4 p 4 Let H be the position of the incorrect bit we would find from checking p 1, p 2,and p 3 (0 means no error) and let P be parity of complete code word H=0 P=0, no error H 0 P=1, correctable single error (p 4 =1 odd # errors) H 0 P=0, double error detected (p 4 =0 even # errors) H=0 P=1, an error occurred in p 4 bit, not in rest of word 8/08/2013 Summer 2013 Lecture #27 48

SEC/DED: Hamming Distance 4 1 bit error (one 0) Nearest 1111 1 bit error (one 1) Nearest 0000 2 bit error (two 0 s, two 1 s) halfway between 8/08/2013 Summer 2013 Lecture #27 49

Modern Use of RAID and ECC (1/3) Typical modern code words in DRAM memory systems: 64 bit data blocks (8 B) with 72 bit codes (9 B) d = 64 p = 7, +1 for DED What happened to RAID 2? Bit striping with extra disks just for ECC parity bits Very expensive computationally and in terms of physical writes ECC implemented in all current HDDs, so RAID 2 became obsolete (redundant, ironically) 8/08/2013 Summer 2013 Lecture #27 50

Modern Use of RAID and ECC (2/3) RAID 6: Recovering from two disk failures! RAID 5 with an extra disk s amount of parity blocks (also interleaved) Extra parity computation more complicated than Double Error Detection (not covered here) When useful? Operator replaces wrong disk during a failure Disk bandwidth is growing more slowly than disk capacity, so MTTR a disk in a RAID system is increasing (increases the chances of a 2 nd failure during repair) 8/08/2013 Summer 2013 Lecture #27 51

Modern Use of RAID and ECC (3/3) Common failure mode is bursts of bit errors, not just 1 or 2 Network transmissions, disks, distributed storage Contiguous sequence of bits in which first, last, or any number of intermediate bits are in error Caused by impulse noise or by fading signal strength; effect is greater at higher data rates Other tools: cyclic redundancy check, Reed Solomon, other linear codes 8/08/2013 Summer 2013 Lecture #27 52

Summary Great Idea: Dependability via Redundancy Reliability: MTTF & Annual Failure Rate Availability: % uptime = MTTF/MTBF RAID: Redundant Arrays of Inexpensive Disks Improve I/O rate while ensuring dependability http://wwwaccscom/p_and_p/raid/basicraidhtml Memory Errors: Hamming distance 2: Parity for Single Error Detect Hamming distance 3: Single Error Correction Code + encode bit position of error Hamming distance 4: SEC/Double Error Detection 8/08/2013 Summer 2013 Lecture #27 53