Data Striping

Data Striping

The idea behind data striping is to distribute data among several disks so that it can be accessed in parallel. Data striping takes place at a low system level (it is not user driven) and should be distinguished from data partitioning in databases (which is user or application driven).

Reasons for striping:
- increase disk bandwidth by retrieving data from several disks concurrently
- decrease seek time (all disks perform the seek in parallel)
- handle several disk requests in parallel

Data striping is implemented in disk arrays, or RAID systems.

[Figure: one I/O stream spread as data blocks across several disks]
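As a minimal sketch of the idea (not part of the original slides), the following Python snippet maps a logical block number to a disk and an offset under round-robin striping; the function name and the four-disk setup are illustrative assumptions.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, offset on that disk)
    under simple round-robin striping."""
    disk = logical_block % num_disks     # which disk holds the block
    offset = logical_block // num_disks  # position of the block on that disk
    return disk, offset

# Blocks 0..7 on a 4-disk array land on disks 0, 1, 2, 3, 0, 1, 2, 3.
for b in range(8):
    print(b, locate_block(b, num_disks=4))
```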

Fine vs. coarse grain striping

FINE GRAIN: Fine grained disk arrays use small data blocks so that every request is serviced using all the disks at the same time. The idea is to maximize the disk bandwidth (data transfer). The penalty for positioning the disk heads is paid sequentially, i.e., for every request, since requests are dealt with one after another and all disks are used for every request. Only one logical I/O request can be serviced at a time.

COARSE GRAIN: Coarse grain disk arrays use large data blocks so that:
- small requests can be serviced in parallel, since each accesses only a few disks
- large requests can still benefit from high transfer rates by using many disks

For small requests, the seek penalty is not sequential, since several disks are used at the same time.
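To make the trade-off concrete, here is a small sketch (names and parameters are assumptions, not from the slides) that counts how many disks a request touches for a given striping unit: with a fine-grained unit every request spans all disks, while with a coarse-grained unit a small request stays on a single disk.

```python
def disks_touched(offset_bytes: int, size_bytes: int,
                  stripe_unit: int, num_disks: int) -> int:
    """Number of distinct disks a request [offset, offset+size) touches."""
    first = offset_bytes // stripe_unit                    # first stripe unit hit
    last = (offset_bytes + size_bytes - 1) // stripe_unit  # last stripe unit hit
    units = last - first + 1
    return min(units, num_disks)

# A 4 KB request: fine grain (512 B units) touches all 4 disks,
# coarse grain (64 KB units) usually touches a single disk.
print(disks_touched(0, 4096, stripe_unit=512, num_disks=4))    # -> 4
print(disks_touched(0, 4096, stripe_unit=65536, num_disks=4))  # -> 1
```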

Fault tolerance

Disk arrays have the problem of using many independent disks:
- the probability of having a failure is the probability of any of the disks failing
- if the probability of a single disk failing is P, the probability of a failure in a disk array with N disks is 1 - (1 - P)^N, which is approximately N x P for small P

Failures in disk arrays are dealt with by using redundancy and/or mirroring:
- parity information is used to both detect and correct disk errors
- mirroring is based on replication of data blocks

The striping and parity depend on the block size:
- bit-interleaved: each block is one bit, e.g., a byte can be stored across 8 disks. Parity is then on a per-byte basis
- block-interleaved: each block contains several bytes (up to kilobytes). Parity is on a per-block basis

The combination of parity and striping unit gives rise to the different RAID levels.
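A quick sanity check of the N x P figure against the exact expression 1 - (1 - P)^N; the values of P and N are illustrative.

```python
# Probability that at least one of N independent disks fails,
# given per-disk failure probability P.
P, N = 0.01, 8
exact = 1 - (1 - P) ** N   # 1 - P(no disk fails)
approx = N * P             # first-order approximation used on the slide
print(exact, approx)       # ~0.0773 vs 0.08
```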

RAID level 0

RAID level 0 stripes the data across the disks but without adding any redundancy:
- the data is divided into blocks (of arbitrary size) and the blocks are uniformly distributed across the disks
- I/O bandwidth is greatly improved (N times that of a single disk) for both reading and writing by using multiple disk channels in parallel
- a failure in one of the drives makes the entire RAID unavailable (no fault tolerance)
- easily implemented in software
- no constraints on the number of disks

[Figure: data blocks A-P striped round-robin over four disks: A, E, I, M / B, F, J, N / C, G, K, O / D, H, L, P]

RAID level 1

Bit interleaved. Fault tolerance by mirroring (no parity):
- read operations are performed on the copy that offers the smallest seek time
- two read operations can be performed in parallel; writes are sequential
- 50% of the disk capacity is used for redundancy purposes; I/O bandwidth is only half of RAID level 0 (N/2)
- recovery from failures is trivial
- requires at least two disks (and twice as many as RAID level 0)
- it can handle multiple disk failures (as long as both copies of the same mirrored pair do not fail)

[Figure: data bits mirrored onto redundant disks]
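As an illustration only (a toy model, not a controller implementation), the following sketch shows the two defining behaviors of mirroring: every write goes to both copies, and a read may be served by either copy.

```python
class Mirror:
    """Toy RAID 1 pair: writes hit both disks, reads pick a copy."""
    def __init__(self):
        self.disks = [{}, {}]   # two block maps standing in for two disks

    def write(self, block: int, data: bytes) -> None:
        for d in self.disks:    # both copies are updated on every write
            d[block] = data

    def read(self, block: int, prefer: int = 0) -> bytes:
        # a real controller would pick the copy with the smaller seek time
        return self.disks[prefer][block]

m = Mirror()
m.write(0, b"hello")
assert m.read(0, prefer=1) == b"hello"   # either copy can serve the read
```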

RAID level 2

Bit interleaved. Fault tolerance based on Hamming codes (instead of full mirroring):
- Hamming codes implement parity for overlapping segments of data; they require less space than full mirroring but must be implemented in hardware
- recovery is more complex (it depends on the parity of several segments)
- I/O bandwidth is (N - log N), with log N being the number of disks needed for storing the parity
- it can handle multiple disk failures (depending on which disks fail)

[Figure: data blocks A-L on data disks, plus parity disks holding parities over overlapping segments, e.g., f(J,K), f(K,L)]
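The use of Hamming codes can be illustrated with the textbook Hamming(7,4) code; this is a generic example of overlapping parity groups, not the exact layout of any RAID 2 implementation. Each parity bit covers an overlapping subset of the data bits, and the combined parity checks (the syndrome) identify the failing position.

```python
def hamming74_encode(d1, d2, d3, d4):
    """Hamming(7,4): three parity bits over overlapping data-bit groups."""
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]   # codeword positions 1..7

def hamming74_syndrome(c):
    """Return 0 if no error, otherwise the 1-based failing position."""
    p1, p2, d1, p3, d2, d3, d4 = c
    s1 = p1 ^ d1 ^ d2 ^ d4     # checks positions 1, 3, 5, 7
    s2 = p2 ^ d1 ^ d3 ^ d4     # checks positions 2, 3, 6, 7
    s3 = p3 ^ d2 ^ d3 ^ d4     # checks positions 4, 5, 6, 7
    return s1 + 2 * s2 + 4 * s3

code = hamming74_encode(1, 0, 1, 1)
code[4] ^= 1                     # corrupt position 5 (data bit d2)
print(hamming74_syndrome(code))  # -> 5, the corrupted position
```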

RAID level 3

Bit interleaved. There is a disk devoted to storing the bit-wise parity of the other disks:
- I/O bandwidth is much better than levels 1 and 2 (N - 1)
- it can only handle one request at a time (no inter-request parallelism)
- recovery is relatively simple (use the parity to restore the lost disk)
- tolerates one disk failure
- this is fine grain striping (adequate for applications that use large files, e.g., multimedia)

[Figure: bits A-P over four data disks plus a parity disk holding XOR(A,B,C,D), XOR(E,F,G,H), XOR(I,J,K,L), XOR(M,N,O,P)]
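The XOR parity and the recovery it enables can be shown in a few lines of Python (block contents are illustrative; the same arithmetic applies whether the interleaving is per bit or per block):

```python
from functools import reduce

def parity(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR parity over equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"aaaa", b"bbbb", b"cccc", b"dddd"]   # four data disks
p = parity(data)

# Reconstruct a lost disk (say disk 2) from the survivors plus parity:
# XORing everything except the missing block yields the missing block.
survivors = [data[0], data[1], data[3], p]
assert parity(survivors) == data[2]
```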

RAID level 4

Block interleaved (blocks of arbitrary size; the block size is called the striping unit). There is a disk devoted to storing the block-wise parity of the other disks:
- write operations are sequential (all of them need to update the parity disk)
- read operations can be done in parallel when they are on different blocks
- the parity disk is not used in read operations (limiting bandwidth)
- tolerates one disk failure
- this is coarse grain striping (adequate for standard databases with few update operations)

[Figure: data blocks A-P over four data disks plus a parity disk holding XOR(A,B,C,D), XOR(E,F,G,H), XOR(I,J,K,L), XOR(M,N,O,P)]

RAID level 5

Block interleaved (blocks of arbitrary size; the block size is called the striping unit). The block-wise parity is uniformly distributed across all disks:
- write operations can be done in parallel
- read operations can be done in parallel when they are on different blocks
- tolerates one disk failure; recovery is somewhat complex
- overall good performance
- small writes can be quite inefficient (because they require reading other blocks to compute the parity)
- most popular approach (also in software)

[Figure: rotated parity layout: data blocks A-P over five disks, with the parity block XOR(a,b,c,d), XOR(e,f,g,h), ... placed on a different disk in each stripe]
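One common way to rotate the parity (the "left-symmetric" layout; an assumption, since real controllers use several variants) can be sketched as follows:

```python
def raid5_layout(stripe: int, num_disks: int):
    """Left-symmetric RAID 5 layout sketch: the parity block moves
    one disk to the left with every stripe."""
    parity_disk = (num_disks - 1 - stripe) % num_disks
    data_disks = [d for d in range(num_disks) if d != parity_disk]
    return parity_disk, data_disks

for s in range(4):
    print(s, raid5_layout(s, num_disks=5))
# parity sits on disk 4, then 3, then 2, ... rotating one disk per stripe
```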

Comparison of RAID levels

Bit interleaved:
- RAID level 1 (mirrored)
- RAID level 2 (Hamming codes)
- RAID level 3 (parity disk)

Block interleaved:
- RAID level 4 (parity disk)
- RAID level 5 (rotated parity)

Compare these for small and large write and read operations...

RAID level 10

RAID level 10 uses a RAID level 0 controller to stripe the data. Each striping unit is then mirrored by a RAID level 1 controller:
- same fault tolerance as RAID level 1
- requires at least 4 disks
- I/O bandwidth can be slightly better than level 1 because each level 1 controller has fewer disks to manage

[Figure: a RAID level 0 controller striping bits over two RAID level 1 controllers, each mirroring its share]
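A minimal mapping sketch, assuming mirrored pairs numbered so that pair i consists of disks 2i and 2i+1 (an illustrative convention): striping selects a pair, and the write then goes to both disks of that pair.

```python
def raid10_targets(logical_block: int, num_pairs: int):
    """RAID 10 sketch: stripe over mirrored pairs; a write to a block
    goes to both disks of one pair (disk numbering is illustrative)."""
    pair = logical_block % num_pairs    # RAID 0 striping over the pairs
    return (2 * pair, 2 * pair + 1)     # RAID 1: both disks of the pair

print(raid10_targets(5, num_pairs=2))  # block 5 -> pair 1 -> disks (2, 3)
```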

RAID level 53

It has the wrong name (it should be 30). RAID level 53 uses a level 0 controller to stripe the data and then gives each striping unit to a level 3 controller:
- same fault tolerance as RAID level 3

[Figure: a RAID level 0 controller feeding two RAID level 3 controllers; one holds blocks A, E, I, M and B, F, J, N plus parities XOR(a,b), XOR(e,f), XOR(i,j), XOR(m,n); the other holds C, G, K, O and D, H, L, P plus parities XOR(c,d), XOR(g,h), XOR(k,l), XOR(o,p)]

RAID level 0 + 1

RAID 0 + 1 uses a level 1 controller for mirroring the data and level 0 controllers for striping the mirrored disks:
- worse failure behavior than level 10
- it can execute reads in parallel (unlike level 10)

[Figure: a RAID level 1 controller mirroring bits over two RAID level 0 controllers, each striping its copy]

Small writes: read-modify-write

In several RAID levels, small writes are a problem because they modify the parity but may not touch all the data fragments that define that parity. Small writes in these cases would require reading all the data fragments that are needed for the parity, even those that have nothing to do with the write operation itself.

The read-modify-write approach instead reads the data to be modified (and the old parity) before writing. With the old data, the old parity, and the new data, the new parity can be computed without having to read all the other fragments. This allows writes to be performed in parallel (depending on the RAID level).

[Figure: a new data block being written; the old data block and the old parity block are read, XORed with the new data, and the result is written as the new parity]
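Here is the parity update as code, a minimal sketch of the read-modify-write rule new_parity = old_parity XOR old_data XOR new_data (the helper names are assumptions):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_rmw(old_data: bytes, old_parity: bytes,
                    new_data: bytes) -> bytes:
    """New parity from old data, old parity and new data only;
    the untouched blocks of the stripe are never read."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Sanity check against recomputing the parity over the full stripe.
stripe = [b"aaaa", b"bbbb", b"cccc"]
full = lambda blocks: bytes(t[0] ^ t[1] ^ t[2] for t in zip(*blocks))
old_p = full(stripe)
new_p = small_write_rmw(stripe[1], old_p, b"xxxx")  # overwrite block 1
stripe[1] = b"xxxx"
assert new_p == full(stripe)
```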

Small writes: regenerate-write

An alternative is to read all the other data blocks needed for calculating the parity and then to regenerate the parity with the new data block. With regenerate-write, small writes use all the disks and can only be performed one at a time.

[Figure: a new data block being written; all the other data blocks of the stripe are read and XORed with the new data to produce the new parity]
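And the regenerate-write alternative as the same kind of sketch (names assumed): it reads every other data block of the stripe instead of the old data and old parity.

```python
def small_write_regenerate(other_blocks: list[bytes],
                           new_data: bytes) -> bytes:
    """Read every other data block of the stripe and recompute
    the parity from scratch together with the new block."""
    p = new_data
    for blk in other_blocks:    # touches all remaining data disks
        p = bytes(x ^ y for x, y in zip(p, blk))
    return p

# Overwriting block 1 of a 3-block stripe: read blocks 0 and 2 and XOR
# them with the new data; same parity as the read-modify-write approach.
print(small_write_regenerate([b"aaaa", b"cccc"], b"xxxx"))
```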