Using RAID6 for Advanced Data Protection




© 2006 Infortrend Corporation. All rights reserved.

Table of Contents

The Challenge of Fault Tolerance
A Compelling Technology: RAID6
Parity
Why Use RAID6
How RAID6 Works
RAID6 Recovery
RAID Configuration Tradeoffs
A Balanced Solution: Infortrend RAID6

The Challenge of Fault Tolerance

RAID technology is used to protect data and to provide more efficient data access and capacity utilization in disk-based subsystems. By placing data on multiple disks, I/O (input/output) operations can overlap in a balanced way, improving performance. Storing data redundantly across multiple disk drives increases the effective Mean Time Between Failures (MTBF) of the array and, with it, the fault tolerance of the entire system.

To meet requirements such as utilization, performance, and data protection, there are several types of RAID implementation, including RAID0, RAID1, RAID5, and combinations such as RAID10 and RAID50. RAID0 provides the highest performance but no redundancy, while RAID1 mirrors the data stored on one hard drive to another and can only be performed with two hard drives. With RAID5, parity data is distributed across the drives rather than stored on a dedicated hard drive; in the event of a drive failure, the controller can regenerate the lost data of the failed drive by recalculating it from the data and parity on the remaining drives. The preferred RAID level depends greatly on the user's needs and budget.

New technologies have led to the availability of inexpensive, high-capacity disk drives in SAS (Serial Attached SCSI) and SATA (Serial ATA) formats, allowing storage managers to build low-cost, high-capacity arrays to protect their data. However, these affordable options pose their own risks, as SATA drives fail more often than Fibre Channel (FC) or SCSI drives. In a typical RAID5 implementation, if one drive fails, another takes over. Failure of two drives at once would cause loss of data and system downtime. Therefore, as these less reliable drives become common, advanced protection is needed to guard against multiple drive failures and to provide fault tolerance and high availability of data.

A Compelling Technology: RAID6

RAID6 is essentially an extension of RAID5 that allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity). Data is striped at block level across a set of drives, just as in RAID5, except that a second set of parity is calculated and written across all the drives. Because RAID5 uses only one set of parity codes, data will be permanently lost if another disk fails or access errors occur during a long rebuild. With the protection of RAID6, data can be reconstructed even if a second drive fails. With minimal performance impact and capacity consumption, RAID6 provides extremely high fault tolerance and can sustain two simultaneous drive failures.
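The capacity cost of the second parity set can be made concrete with a little arithmetic. The following is a minimal sketch, assuming equal-sized drives and that RAID5 consumes one drive's worth of space for parity while RAID6 consumes two; the function name `usable_capacity` and the 500 GB drive size are illustrative assumptions, not values from this paper.

```python
RAID5_PARITY = 1  # one drive's worth of capacity holds parity
RAID6_PARITY = 2  # two drives' worth of capacity hold parity (P and Q)


def usable_capacity(num_drives, drive_capacity_gb, parity_drives):
    """Return usable capacity in GB for an array of equal-sized drives.

    parity_drives is the number of drives' worth of space used for parity.
    """
    if num_drives <= parity_drives:
        raise ValueError("array needs more drives than parity sets")
    return (num_drives - parity_drives) * drive_capacity_gb


if __name__ == "__main__":
    # Hypothetical 500 GB drives; array sizes chosen only for illustration.
    for n in (4, 8, 16):
        r5 = usable_capacity(n, 500, RAID5_PARITY)
        r6 = usable_capacity(n, 500, RAID6_PARITY)
        print(f"{n} drives: RAID5 {r5} GB usable, RAID6 {r6} GB usable")
```

The relative cost of the extra parity drive shrinks as the array grows, which is one reason dual parity is most attractive in larger arrays.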

RAID5 and RAID6 Comparison

                  RAID5                               RAID6
Parity            Single                              Dual
Protection        One drive                           Two drives
Implementation    Requires N+1 drives, minimum of 3   Requires N+2 drives, minimum of 4
Performance       Medium impact on write operation    More impact on sequential write
                  and rebuild                         operation vs. RAID5

Why Use RAID6

As described above, RAID5 protects against a single drive failure without downtime; however, if a second drive goes down, data will be lost. While two drive failures are less likely than one, the probability is an increasing concern because of the factors listed below:

1. Growing adoption of less reliable drives: The benefits of SATA drives include lower prices and high capacity; however, their MTBF is lower than that of FC or SCSI drives. The increased use of these drives increases the probability of two drives failing at the same time.

2. High capacity means longer rebuilding time: Greater capacity on a single drive means more time is needed to rebuild data if one drive fails. Subsystems are heavily loaded during the rebuild process, so a second drive is more likely to be damaged or to fail during this longer rebuilding time.

3. Human error: When one drive fails, someone has to replace the damaged drive with a new one. An error, such as removing the wrong drive, can cause a further drive failure, and data would be lost.

4. The failure rate rises significantly as the number of hard drives increases: Increasing the number of hard drives in an array raises the expected rate at which a first drive fails. During system recovery onto a spare drive, the expected rate of a second drive failure is multiplied in the same way. A subsystem composed of multiple hard drives needs added protection to guarantee data availability in case two drives fail simultaneously.

Given the greater chance that two drives could fail simultaneously, the appeal of implementing RAID6 in a storage array is clear.
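How drive reliability, rebuild time, and drive count combine can be sketched with a simple model. The example below assumes independent drives with a constant failure rate of 1/MTBF (an exponential lifetime model), which is a simplification and not a claim made by this paper; the function name and all MTBF, rebuild-time, and drive-count figures are hypothetical and chosen only to show the trend.

```python
import math


def second_failure_probability(num_drives, mtbf_hours, rebuild_hours):
    """Probability that at least one surviving drive fails during a rebuild.

    Assumes independent drives with a constant failure rate of 1 / mtbf_hours.
    num_drives counts the whole array, including the drive that already failed.
    """
    survivors = num_drives - 1
    return 1.0 - math.exp(-survivors * rebuild_hours / mtbf_hours)


if __name__ == "__main__":
    # Hypothetical scenarios: (label, drives, MTBF in hours, rebuild hours).
    scenarios = [
        ("small array, short rebuild", 5, 1_000_000, 6),
        ("lower MTBF drives", 5, 500_000, 6),
        ("larger drives, longer rebuild", 5, 500_000, 24),
        ("more drives in the array", 16, 500_000, 24),
    ]
    for label, n, mtbf, hours in scenarios:
        p = second_failure_probability(n, mtbf, hours)
        print(f"{label}: {p:.6%}")
```

Under this model the risk grows with each of the factors in the list: a lower MTBF, a longer rebuild, and more drives each increase the exponent, and their effects multiply.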
How RAID6 Works

RAID6 algorithms work with two sets of parity data, P and Q, generated by two linearly independent equations. The first parity data P, the same as the parity data in RAID5, is generated by the equation:

$$P = D_0 \oplus D_1 \oplus D_2 \oplus \cdots \oplus D_n$$

The second parity data Q is generated by the equation:

$$Q = (A_0 \otimes D_0) \oplus (A_1 \otimes D_1) \oplus (A_2 \otimes D_2) \oplus \cdots \oplus (A_n \otimes D_n)$$

where $\oplus$ means the XOR operation, $\otimes$ means RAID6 multiplication, $D_i$ is the data block, and $A_i$ is a coefficient.

Figure 1

For example, according to Figure 1, parity data P is given by equation (1):

$$P = D_3 \oplus D_4 \oplus D_5 \tag{1}$$

and parity data Q is given by equation (2):

$$Q = (\alpha \otimes D_3) \oplus (\beta \otimes D_4) \oplus (\gamma \otimes D_5) \tag{2}$$

where $\alpha$, $\beta$, and $\gamma$ are specific numbers.
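The P and Q equations can be illustrated with a short sketch. The paper does not define "RAID6 multiplication" or the coefficients, so this example makes two assumptions that are common in RAID6 implementations but are not confirmed for Infortrend's scheme: multiplication is carried out byte by byte in the Galois field GF(2^8) with the reducing polynomial 0x11D, and the coefficient for block i is 2 raised to the power i in that field. The names `gf_mul`, `gf_pow`, and `compute_pq` are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11D            # reduce back into 8 bits
        b >>= 1
    return product


def gf_pow(base, exponent):
    """Raise a byte to a non-negative integer power in GF(2^8)."""
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, base)
    return result


def compute_pq(blocks):
    """Return (P, Q) parity blocks for a list of equal-length data blocks.

    P is the XOR of all blocks; Q is the XOR of each block multiplied by
    its coefficient. The coefficient 2**i is an assumption of this sketch.
    """
    size = len(blocks[0])
    p = bytearray(size)
    q = bytearray(size)
    for i, block in enumerate(blocks):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
    p, q = compute_pq(stripe)
    print(p.hex(), q.hex())
```

Because each block has a different nonzero coefficient, the Q equation is linearly independent of the P equation, which is what makes recovery from two losses possible.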

RAID6 Recovery

Recovery of P and Q data failure

The rebuild procedure will regenerate parity data P and Q from the data blocks.

Recovery of P and single data block failure

The rebuild procedure uses parity data Q and all remaining data blocks to rebuild the failed data block. After the failed data block is rebuilt, parity data P can be regenerated from all data blocks in the same way it was originally generated. For example, assume parity data P and data block $D_3$ in Figure 1 have failed. From

$$Q = (\alpha \otimes D_3) \oplus (\beta \otimes D_4) \oplus (\gamma \otimes D_5)$$

the failed data block is rebuilt as

$$D_3 = \alpha^{-1} \otimes \left[ Q \oplus (\beta \otimes D_4) \oplus (\gamma \otimes D_5) \right]$$

Parity data P can then be recomputed using the normal operation:

$$P = D_3 \oplus D_4 \oplus D_5$$

Recovery of Q and single data block failure

The rebuild process is similar to a RAID5 recovery. The failed data block is rebuilt from parity data P and all remaining data blocks. After the data block has been rebuilt, parity data Q can be regenerated. For example, assume parity data Q and data block $D_3$ in Figure 1 have failed. The data in $D_3$ can be rebuilt from parity data P and the other data blocks:

$$P = D_3 \oplus D_4 \oplus D_5$$

$$D_3 = P \oplus D_4 \oplus D_5$$

Then, parity data Q can be regenerated by the normal operation:

$$Q = (\alpha \otimes D_3) \oplus (\beta \otimes D_4) \oplus (\gamma \otimes D_5)$$
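The two single-block recovery cases can be sketched as follows. As before, this assumes that RAID6 multiplication is byte-wise GF(2^8) arithmetic with the polynomial 0x11D; the coefficient values chosen for alpha, beta, and gamma are hypothetical, since the paper only says they are "specific numbers". All function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def gf_inv(a):
    """Multiplicative inverse in GF(2^8): a**254, since a**255 == 1."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


ALPHA, BETA, GAMMA = 2, 4, 8  # hypothetical nonzero, distinct coefficients


def make_p(d3, d4, d5):
    """P parity: plain XOR of the three data blocks."""
    return bytes(a ^ b ^ c for a, b, c in zip(d3, d4, d5))


def make_q(d3, d4, d5):
    """Q parity: XOR of the blocks after multiplying by their coefficients."""
    return bytes(gf_mul(ALPHA, a) ^ gf_mul(BETA, b) ^ gf_mul(GAMMA, c)
                 for a, b, c in zip(d3, d4, d5))


def rebuild_d3_from_q(q, d4, d5):
    """P and D3 lost: D3 = alpha^-1 * [Q xor beta*D4 xor gamma*D5]."""
    inv_alpha = gf_inv(ALPHA)
    return bytes(gf_mul(inv_alpha, qb ^ gf_mul(BETA, b) ^ gf_mul(GAMMA, c))
                 for qb, b, c in zip(q, d4, d5))


def rebuild_d3_from_p(p, d4, d5):
    """Q and D3 lost: D3 = P xor D4 xor D5, exactly as in RAID5."""
    return bytes(pb ^ b ^ c for pb, b, c in zip(p, d4, d5))


if __name__ == "__main__":
    d3, d4, d5 = b"RAID", b"six!", b"demo"
    p, q = make_p(d3, d4, d5), make_q(d3, d4, d5)
    print(rebuild_d3_from_q(q, d4, d5))
    print(rebuild_d3_from_p(p, d4, d5))
```

In both cases the lost parity block is regenerated afterwards by calling `make_p` or `make_q` on the now complete set of data blocks.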

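Recovery of two data block failures

The rebuild procedure to recover data from two failed data drives is the most complicated case. The parity data generating equations give two equations in two unknowns. Using matrix inversion, the two unknowns can be solved and the lost data can be rebuilt. Assume data blocks $D_3$ and $D_4$ have failed. From the parity data generating equations:

$$P = D_3 \oplus D_4 \oplus D_5$$

$$Q = (\alpha \otimes D_3) \oplus (\beta \otimes D_4) \oplus (\gamma \otimes D_5)$$

come the following two equations:

$$D_3 \oplus D_4 = P \oplus D_5 \tag{3}$$

and

$$(\alpha \otimes D_3) \oplus (\beta \otimes D_4) = Q \oplus (\gamma \otimes D_5) \tag{4}$$

The following is obtained from equations (3) and (4):

$$\begin{pmatrix} 1 & 1 \\ \alpha & \beta \end{pmatrix} \begin{pmatrix} D_3 \\ D_4 \end{pmatrix} = \begin{pmatrix} P \oplus D_5 \\ Q \oplus (\gamma \otimes D_5) \end{pmatrix}$$

Using matrix inversion, the two unknowns, $D_3$ and $D_4$, are solved:

$$\begin{pmatrix} D_3 \\ D_4 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ \alpha & \beta \end{pmatrix}^{-1} \begin{pmatrix} P \oplus D_5 \\ Q \oplus (\gamma \otimes D_5) \end{pmatrix}$$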
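The matrix inversion step can be written out explicitly. The sketch below again assumes byte-wise GF(2^8) arithmetic with the polynomial 0x11D and hypothetical coefficient values, neither of which is specified in this paper. In such a field subtraction equals XOR, so the determinant of the 2x2 matrix is alpha XOR beta, and with x as the right-hand side of the first equation and y as the right-hand side of the second, the solution is D3 = (alpha XOR beta)^-1 * (beta*x XOR y) and D4 = (alpha XOR beta)^-1 * (alpha*x XOR y).

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return product


def gf_inv(a):
    """Multiplicative inverse in GF(2^8): a**254, since a**255 == 1."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


ALPHA, BETA, GAMMA = 2, 4, 8  # hypothetical; ALPHA must differ from BETA


def compute_pq(d3, d4, d5):
    """Generate P and Q for a three-block stripe."""
    p = bytes(a ^ b ^ c for a, b, c in zip(d3, d4, d5))
    q = bytes(gf_mul(ALPHA, a) ^ gf_mul(BETA, b) ^ gf_mul(GAMMA, c)
              for a, b, c in zip(d3, d4, d5))
    return p, q


def rebuild_d3_d4(p, q, d5):
    """Recover D3 and D4 from P, Q, and the surviving block D5."""
    det_inv = gf_inv(ALPHA ^ BETA)       # inverse of the matrix determinant
    d3 = bytearray()
    d4 = bytearray()
    for pb, qb, c in zip(p, q, d5):
        x = pb ^ c                       # P xor D5
        y = qb ^ gf_mul(GAMMA, c)        # Q xor gamma*D5
        d3.append(gf_mul(det_inv, gf_mul(BETA, x) ^ y))
        d4.append(gf_mul(det_inv, gf_mul(ALPHA, x) ^ y))
    return bytes(d3), bytes(d4)


if __name__ == "__main__":
    d3, d4, d5 = b"RAID", b"six!", b"demo"
    p, q = compute_pq(d3, d4, d5)
    print(rebuild_d3_d4(p, q, d5))
```

The inverse exists only when the two coefficients differ, which is why each data block in a stripe needs its own distinct coefficient.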

RAID Configuration Tradeoffs

When configuring a storage array, system managers must balance their needs for high performance and data security. For example, if high performance is the priority, RAID0 is the best option. For increased data protection, RAID5 is an excellent choice, but as described in this paper, RAID6 provides valuable extra protection against the loss of data due to a second drive failure in large-capacity disk drives, especially SATA drives used in enterprise environments. There is a performance loss compared to RAID5, but this can be an acceptable tradeoff for the improved data safety.

A Balanced Solution: Infortrend RAID6

Infortrend has developed an efficient new RAID6 scheme for its RAID subsystems that provides the highest fault tolerance with minimal loss of performance. The RAID6 array can tolerate the failure of two disk drives or, in degraded mode, one drive failure and bad blocks on another drive. In the event of disk drive failure, the controller can regenerate the lost data of the failed drive(s) without interruption to I/O access. Compared to existing RAID6 solutions, Infortrend's RAID6 technology delivers significant performance improvements and features. Users will get twice the parity protection with only a 0 to 5 percent performance loss compared to RAID5.

Summary

RAID6 provides an extremely high level of fault tolerance and can sustain two simultaneous drive failures without downtime or data loss. This makes it well suited to the rigorous fault tolerance requirements of mission-critical data.