# HDP Code: A Horizontal-Diagonal Parity Code to Optimize I/O Load Balancing in RAID-6

Save this PDF as:

Size: px
Start display at page:

## Transcription

4 (a) Vertical parity layout of P-Code with prime disks (p = 7). 15 Number of I/O operations (a) Anti-diagonal parity layout of X-Code with prime disks (p = 7) Column number (b) I/O distribution among different columns when six writes to different columns occur (Continuous data elements C,6, C 1,, C 1,1, C 1,, C 1,3, C 1,4 and C 1,5 are written, each column has just one write to data element and it is a uniform write mode to various disks). Fig.. Load balancing problem in P-Code for an 7-disk array (The I/O operations in columns and 5 are very high which also may lead to a sharp decrease of reliability and performance of a storage system). (b) Reduce I/O cost when column fails in X-Code. Fig. 4. higher I/O cost when single disk failures in X-Code. Fig. 3. Reduce I/O cost when column fails in RDP code. workload imbalance as discussed in Section II-A. ) Diagonal/Anti-diagonal Parity (DP/ADP): Diagonal or anti-diagonal parity typically appears in horizontal codes or vertical codes, such as in RDP [] and X-Code [4] (Figure 1(c) and 4(a)), which achieve a well-balanced I/O. The antidiagonal parity of X-Code can be calculated by, From Equation 1, we can see that typical HP can be calculated by a number of XOR computations. For a partial stripe write on continuous data elements, it could have lower cost than the other parities due to the fact that these elements can share the same HP (e.g., partial write cost on continuous data elements C,1 and C, is shown in Figure 1(b)). Through the layout of HP, horizontal codes can be optimized to reduce the recovery time when data disk fails [4]. However, there is an obvious disadvantage for HP, which is n 3 C n 1,i = C k, i k n () k= From Equation, some modular computation to calculate the corresponding data elements (e.g., i k n ) are introduced compared to Equation 1. For DP/ADP, it has a little effect on reducing the probability of single disk failure as discussed in Section II-B.

5 TABLE II SUMMARY OF DIFFERENT PARITIES Parities Computation Workload Reduce I/O cost of cost balance single disk failure HP very low unbalance a lot for data disks, little for parity disks DP/ADP medium balance some for all disks VP high mostly balance (unbalance in P-Code) none 3) Vertical Parity (VP): Vertical parity normally appears in vertical codes, such as in B-Code [41] and P-Code [3]. Figure (a) shows the layout of vertical parity. The construction of vertical parity is more complex: first, some data elements are selected as a set, upon which the modular calculations are done. Thus, the computation cost for a vertical parity is higher than other parities. Except for some exceptions like P-Code shown in Figure (a), most vertical codes achieve a wellbalanced I/O. Due to the complex layout, reducing the I/O cost of any single disk is infeasible for vertical codes like P-Code. Table II summarizes the major features of different parities, all of which suffer from unbalanced I/O and the poor efficiency on reducing the I/O cost of single disk failure. To address these issues, we propose a new parity named HDP to take advantage of both horizontal and diagonal/anti-diagonal parities to achieve low computation cost, high reliability, and balanced I/O. The layout of HDP is shown in Figure 5(a). In next section we will discuss how HDP parity is used to build HDP code. III. HDP CODE To overcome the shortcomings of existing MDS codes, in this section we propose the HDP Code, which takes the advantage of both both horizontal and diagonal/anti-diagonal parities in MDS codes. HDP is a solution for p 1 disks, where p is a prime number. A. Data/Parity Layout and Encoding of HDP Code HDP Code is composed of a p 1-row-p 1-column square matrix with a total number of (p 1) elements. There are three types of elements in the square matrix: data elements, horizontal-diagonal parity elements, and anti-diagonal parity elements. C i,j ( i p, j p ) denotes the element at the ith row and the jth column. Two diagonals of this square matrix are used as horizontal-diagonal parity and anti-diagonal parity, respectively. Horizontal-diagonal parity and anti-diagonal parity elements of HDP Code are constructed according to the following encoding equations: Horizontal parity: p C i,i = C i,j (j i) (3) j= Anti-diagonal parity: C i,p i = p C i+j+ p,j j= (j p i and j p 3 i p ) Figure 5 shows an example of HDP Code for an 6-disk array (p = 7). It is a 6-row-6-column square matrix. Two diagonals of this matrix are used for the horizontal-diagonal parity elements (C,, C 1,1, C,, etc.) and the anti-diagonal parity elements (C,5, C 1,4, C,3, etc.). The encoding of the horizontal-diagonal parity of HDP Code is shown in Figure 5(a). We use different icon shapes to denote different sets of horizontal-diagonal elements and the corresponding data elements. Based on Equation 3, all horizontal elements are calculated. For example, the horizontal element C, can be calculated by C,1 C, C,3 C,4 C,5. The encoding of anti-diagonal parity of HDP Code is given in Figure 5(b). According to Equation 3, the anti-diagonal elements can be calculated through modular arithmetic and XOR operations. For example, to calculate the anti-diagonal element C 1,4 (i = 1), first we should fetch the proper data elements (C i+j+ p,j). If j =, i + j + = 4 and 4 p = 4, we get the first data element C 4,. The following data elements, which take part in XOR operations, can be calculated in the same way (the following data elements are C 5,1, C,3, C,5 ). Second, the corresponding anti-diagonal element (C 1,4 ) is constructed by an XOR operation on these data elements, i.e., C 1,4 = C 4, C 5,1 C,3 C,5. B. Construction Process According to the above data/parity layout and encoding scheme, the construction process of HDP Code is straightforward: Label all data elements. Calculate both horizontal and anti-diagonal parity elements according to Equations 3 and 4. C. Proof of Correctness To prove the correctness of HDP Code, we take the the case of one stripe for example here. The reconstruction of multiple stripes is just a matter of scale and similar to the reconstruction of one stripe. In a stripe, we have the following lemma and theorem, Lemma 1: We can find a sequence of a two-integer tuple (T k, T k ) where T k = p + k+1+ 1+( 1) k (f f 1 ), p 1 T k = 1+( 1)k f ( 1)k+1 f (k =, 1,, p 3) with < f f 1 < p 1, all two-integer tuples (, f 1 ), (, f ),, (p, f 1 ), (p, f ) occur exactly once in the sequence. The similar proof of this lemma can be found in many papers on RAID-6 codes [3] [] [4] [3]. (4)

6 (a) Horizontal-diagonal parity coding of HDP Code: a horizontaldiagonal parity element can be calculated by XOR operations among the corresponding data elements in the same row. For example, C, = C,1 C, C,3 C,4 C,5. (b) Anti-diagonal parity coding of HDP Code: an anti-diagonal parity element can be calculated by XOR operations among the corresponding data elements in anti-diagonal. For example, C 1,4 = C 4, C 5,1 C,3 C,5. Fig. 5. HDP Code (p = 7). Theorem 1: A p 1-row-p 1-column stripe constructed according to the formal description of HDP Code can be reconstructed under concurrent failures from any two columns. Proof: The two failed columns are denoted as f 1 and f, where < f 1 < f < p. In the construction of HDP Code, any two horizontaldiagonal parity elements cannot be placed in the same row, as well for any two anti-diagonal parity elements. For any two concurrently failed columns f 1 and f, based on the layout of HDP Code, two data elements C p 1 f+f 1,f 1 and C f f 1 1,f can be reconstructed since the corresponding anti-diagonal parity element does not appear in the other failed column. For the failed columns f 1 and f, if a data element C i,f on column f can be reconstructed, we can reconstruct the missing data element C i+f1 f p,f 1 on the same anti-diagonal parity chain if its corresponding anti-diagonal parity elements exist. Similarly, a data element C i,f1 in column f 1 can be reconstructed, we can reconstruct the missing data element C i+f f 1 p,f on the same anti-diagonal parity chain if its anti-diagonal parity element parity element exist. Let us consider the construction process from data element C f f 1 1,f on the f th column to the corresponding endpoint (element C f1,f 1 on the f 1 th column). In this reconstruction process, all data elements can be reconstructed and the reconstruction sequence is based on the sequence of the two-integer tuple in Lemma 1. Similarly, without any missing parity elements, we may start the reconstruction process from data element C f 1,f 1 on the f 1 th column to the corresponding endpoint (element C f,f on the f th column). In conclusion, HDP Code can be reconstructed under concurrent failures from any two columns. D. Reconstruction Process We first consider how to recover a missing data element since any missing parity element can be recovered based on Equations 3 and 4. If the horizontal-diagonal parity element and the related p data elements exist, we can recover the missing data element (assuming it s C i,f1 in column f 1 and f 1 p ) using the following equation, p C i,f1 = C i,j (j f 1 ) (5) j= If there exists an anti-diagonal parity element and its p 3 data elements, to recover the data element (C i,f1 ), first we should recover the corresponding anti-diagonal parity element. Assume it is in row r and this anti-diagonal parity element can be represented by C r,p r based on Equation 4, then we have: i = r + f 1 + p (6) So r can be calculated by (k is an arbitrary positive integer): (i f 1 + p )/ (i f 1 = ±k + 1) r = (i f 1 + p )/ (i f 1 = k) (i f 1 )/ (i f 1 = k) According to Equation 4, the missing data element can be recovered, C i,f1 = C r,p r p C i+j+ p,j j= (j f 1 and j p 3 i p ) (7) ()

8 TABLE III 5 RANDOM INTEGER NUMBERS TABLE IV DIFFERENT L VALUE IN HORIZONTAL CODES Code Uniform SW requests Uniform CW requests RDP p p p p EVENODD p p 3 HDP 1 1 at the end of a stripe, the data element at the beginning of the stripe will be written 5. Based on the description of ideal sequence, for HDP Code shown in Figure 5 (which has (p 3) (p 1) total data elements in a stripe), the ideal sequence of CW requests to (p 3) (p 1) data elements is: C,1 C,, C, C,3, C,3 C,4, and so on,, the last partial stripe write in this sequence is C p,p 3 C,. We also select two access patterns combined with different types of read/write requests: Uniform access. Each request occurs only once, so for read requests, each data element is read once; for write reqeusts, each data element is written once; for continuous write requests, each data element is written twice. Random access. Since the number of total data elements in a stripe is less than 49 when p = 5 and p = 7, we use 5 random numbers (ranging from 1 to 1) generated by a random integer generator [9] as the frequencies of partial stripe writes in the sequence one after another. These 5 random numbers are shown in Table III. For example, in HDP Code, the first number in the table, 1 is used as the frequency of a CW request to elements C,1 C,. We define O j (C i,j ) as the number of I/O operations in column j, which is caused by a request to data element(s) with beginning element C i,j. And we use O(j) to delegate the total number of I/O operations of requests to column j in a stripe, which can be calculated by, O(j) = O j (C i,j ) (9) We use O max and O min to delegate maximum/minimum number of I/O operations among all columns, obviously, O max = max O(j), O min = min O(j) (1) We evaluate HDP Code and other codes in terms of the following metrics. We define metric of load balancing (L) to evaluate the ratio between the columns with the highest I/O O max and the lowest I/O O min. The smaller value of L is, the better load balancing is. L can be calculated by, 5 For continuous write mode. L = O max O min (11) For uniform access in HDP Code, O max is equal to O min. According to Equation 11, B. Numerical Results L = O max O min = 1 (1) In this subsection, we give the numerical results of HDP Code compared to other typical codes using above metrics. First, we calculate the metric of load balancing (L values) for different codes 6 with various p shown in Figure 9 and Figure 1. The results show that HDP Code has the lowest L value and thus the best balanced I/O compared to the other codes. Second, to discover the trend of unbalanced I/O in horizontal codes, especially for uniform write requests, we also calculate the L value in RDP, EVENODD and HDP. For HDP Code, which is well balanced as calculated in Equation 1 for any uniform request. For RDP and EVEN- ODD codes, based on their layout, the maximum number of I/O operations appears in their diagonal parity disk and the minimum number of I/O operations appears in their data disk. By this method, we can get different L according to various values of p. For example, for uniform SW requests in RDP code, L = O max = (p 1) + (p ) = p 4+ 1 O min (p 1) p 1 (13) We summarize the results in Table IV and give the trend of L in horizontal codes with different p values in Figure 11 and 1. With the increasing number of disks (p becomes larger), RDP and EVENODD suffer extremely unbalanced I/O while HDP code gets well balanced I/O in the write-intensive environment. V. RELIABILITY ANALYSIS By using special horizontal-diagonal and anti-diagonal parities, HDP Code also provides high reliability in terms of fast recovery on single disk and parallel recovery on double disks. A. Fast Recovery on Single Disk As discussed in Section II-B, some MDS codes like RDP can be optimized to reduce the recovery time when single disk fails. This approach also can be used for HDP Code to get higher reliability. For example, as shown in Figure 13, 6 In the following figures, because there is no I/O in parity disks, the minimum I/O O min = thus L =.

9 1 6 4 L Uniform-R Uniform-SW Uniform-CW Uniform- 5R5SW Uniform- 5R5CW HDP Code EVENODD RDP Random-R Random-SW Random-CW Random- 5R5SW Random- 5R5CW Fig. 9. Metric of load balancing L under various codes (p = 5) L Uniform-R Uniform-SW Uniform-CW Uniform-5R5SW Uniform-5R5CW Random-R Random-SW Random-CW Random-5R5SW Random-5R5CW HDP Code EVENODD RDP P-Code Fig. 1. Metric of load balancing L under various codes (p = 7). L L P RDP EVENODD HDP P RDP EVENODD HDP Fig. 11. Metric of load balancing L under uniform SW requests with different p in various horizontal codes. Fig. 1. Metric of load balancing L under uniform CW requests with different p in various horizontal codes. when column fails, not all elements need to be read for recovery. Because there are some elements like C,4 can be shared to recover two failed elements in different parities. By this approach, when p = 7, HDP Code reduces up to 37% read operations and 6% total I/O per stripe to recover single disk, which can decrease the recovery time thus increase the reliability of the disk array. We summarize the reduced read I/O and total I/O in different codes in Table V. We can see that HDP achieves highest gain on reduced I/O compard to RDP and X-Code. Based on the layout of HDP Code, we also keep well balanced read I/O when one disk fails. For example, as shown in Figure 14, the remaining disks all 4 read operations, which is well balanced. B. Parallel Recovery on Double Disks According to Figures 6 and 7, HDP Code can be recovered by two recovery chains all the time. It means that HDP Code can be reconstructed in parallel when any two disks fail, which

### H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6

H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6 Chentao Wu 1, Shenggang Wan 2, Xubin He 1, Qiang Cao 2, and Changsheng Xie 2 1 Department of Electrical & Computer Engineering,

### Distributed Storage Networks and Computer Forensics

Distributed Storage Networks 5 Raid-6 Encoding Technical Faculty Winter Semester 2011/12 RAID Redundant Array of Independent Disks Patterson, Gibson, Katz, A Case for Redundant Array of Inexpensive Disks,

### Algorithms and Methods for Distributed Storage Networks 5 Raid-6 Encoding Christian Schindelhauer

Algorithms and Methods for Distributed Storage Networks 5 Raid-6 Encoding Institut für Informatik Wintersemester 2007/08 RAID Redundant Array of Independent Disks Patterson, Gibson, Katz, A Case for Redundant

### Theoretical Aspects of Storage Systems Autumn 2009

Theoretical Aspects of Storage Systems Autumn 2009 Chapter 2: Double Disk Failures André Brinkmann Data Corruption in the Storage Stack What are Latent Sector Errors What is Silent Data Corruption Checksum

### Using Shared Parity Disks to Improve the Reliability of RAID Arrays

Using Shared Parity Disks to Improve the Reliability of RAID Arrays Jehan-François Pâris Department of Computer Science University of Houston Houston, TX 7704-010 paris@cs.uh.edu Ahmed Amer Department

### A Performance Comparison of Open-Source Erasure Coding Libraries for Storage Applications

A Performance Comparison of Open-Source Erasure Coding Libraries for Storage Applications Catherine D. Schuman James S. Plank Technical Report UT-CS-8-625 Department of Electrical Engineering and Computer

### STAR : An Efficient Coding Scheme for Correcting Triple Storage Node Failures

STAR : An Efficient Coding Scheme for Correcting Triple Storage Node Failures Cheng Huang, Member, IEEE, and Lihao Xu, Senior Member, IEEE Abstract Proper data placement schemes based on erasure correcting

### technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

technology brief RAID Levels March 1997 Introduction RAID is an acronym for Redundant Array of Independent Disks (originally Redundant Array of Inexpensive Disks) coined in a 1987 University of California

### Functional-Repair-by-Transfer Regenerating Codes

Functional-Repair-by-Transfer Regenerating Codes Kenneth W Shum and Yuchong Hu Abstract In a distributed storage system a data file is distributed to several storage nodes such that the original file can

### RAID Triple Parity. Peter Corbett. Atul Goel ABSTRACT. Categories and Subject Descriptors. General Terms. Keywords

RAID Triple Parity Atul Goel NetApp Inc atul.goel@netapp.com Peter Corbett NetApp Inc peter.corbett@netapp.com ABSTRACT RAID triple parity (RTP) is a new algorithm for protecting against three-disk failures.

### Surviving Two Disk Failures. Introducing Various RAID 6 Implementations

Surviving Two Failures Introducing Various RAID 6 Implementations Notices The information in this document is subject to change without notice. While every effort has been made to ensure that all information

### RAID: Redundant Arrays of Inexpensive Disks this discussion is based on the paper: on Management of Data (Chicago, IL), pp.109--116, 1988.

: Redundant Arrays of Inexpensive Disks this discussion is based on the paper:» A Case for Redundant Arrays of Inexpensive Disks (),» David A Patterson, Garth Gibson, and Randy H Katz,» In Proceedings

### Online Remote Data Backup for iscsi-based Storage Systems

Online Remote Data Backup for iscsi-based Storage Systems Dan Zhou, Li Ou, Xubin (Ben) He Department of Electrical and Computer Engineering Tennessee Technological University Cookeville, TN 38505, USA

### EMC CLARiiON RAID 6 Technology A Detailed Review

A Detailed Review Abstract This white paper discusses the EMC CLARiiON RAID 6 implementation available in FLARE 26 and later, including an overview of RAID 6 and the CLARiiON-specific implementation, when

### A Cost-based Heterogeneous Recovery Scheme for Distributed Storage Systems with RAID-6 Codes

A Cost-based Heterogeneous Recovery Scheme for Distributed Storage Systems with RAID-6 Codes Yunfeng Zhu, Patrick P. C. Lee, Liping Xiang, Yinlong Xu, Lingling Gao School of Computer Science and Technology,

### A Cost-based Heterogeneous Recovery Scheme for Distributed Storage Systems with RAID-6 Codes

A Cost-based Heterogeneous Recovery Scheme for Distributed Storage Systems with RAID-6 Codes Yunfeng Zhu, Patrick P. C. Lee, Liping Xiang, Yinlong Xu, Lingling Gao School of Computer Science and Technology,

### Three-Dimensional Redundancy Codes for Archival Storage

Three-Dimensional Redundancy Codes for Archival Storage Jehan-François Pâris Darrell D. E. Long Witold Litwin Department of Computer Science University of Houston Houston, T, USA jfparis@uh.edu Department

### A A Hybrid Approach of Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation

A A Hybrid Approach of Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation LIPING XIANG and YINLONG XU, University of Science and Technology of China JOHN C.S. LUI, The Chinese

### 200 Chapter 7. (This observation is reinforced and elaborated in Exercises 7.5 and 7.6, and the reader is urged to work through them.

200 Chapter 7 (This observation is reinforced and elaborated in Exercises 7.5 and 7.6, and the reader is urged to work through them.) 7.2 RAID Disks are potential bottlenecks for system performance and

### Towards High Security and Fault Tolerant Dispersed Storage System with Optimized Information Dispersal Algorithm

Towards High Security and Fault Tolerant Dispersed Storage System with Optimized Information Dispersal Algorithm I Hrishikesh Lahkar, II Manjunath C R I,II Jain University, School of Engineering and Technology,

### E cient parity placement schemes for tolerating up to two disk failures in disk arrays

Journal of Systems Architecture 46 (2000) 1383±1402 www.elsevier.com/locate/sysarc E cient parity placement schemes for tolerating up to two disk failures in disk arrays Nam-Kyu Lee a, Sung-Bong Yang a,

### Research on massive data storage in virtual roaming system based on RAID5 RAN Feipeng, DAI Huayang, XING Wujie, WANG Xiang, Li Xuesong

International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2015) Research on massive data storage in virtual roaming system based on RAID5 RAN Feipeng, DAI Huayang,

### Energy aware RAID Configuration for Large Storage Systems

Energy aware RAID Configuration for Large Storage Systems Norifumi Nishikawa norifumi@tkl.iis.u-tokyo.ac.jp Miyuki Nakano miyuki@tkl.iis.u-tokyo.ac.jp Masaru Kitsuregawa kitsure@tkl.iis.u-tokyo.ac.jp Abstract

### Implementation and Evaluation of a Popularity-Based Reconstruction Optimization Algorithm in Availability-Oriented Disk Arrays

Implementation and Evaluation of a Popularity-Based Reconstruction Optimization Algorithm in Availability-Oriented Disk Arrays Lei Tian ltian@hust.edu.cn Hong Jiang jiang@cse.unl.edu Dan Feng dfeng@hust.edu.cn

### Data Storage - II: Efficient Usage & Errors

Data Storage - II: Efficient Usage & Errors Week 10, Spring 2005 Updated by M. Naci Akkøk, 27.02.2004, 03.03.2005 based upon slides by Pål Halvorsen, 12.3.2002. Contains slides from: Hector Garcia-Molina

### Locality Based Protocol for MultiWriter Replication systems

Locality Based Protocol for MultiWriter Replication systems Lei Gao Department of Computer Science The University of Texas at Austin lgao@cs.utexas.edu One of the challenging problems in building replication

### A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage

A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage James S. Plank University of Tennessee plank@cs.utk.edu Lihao Xu Wayne State University Jianqiang Luo Wayne

### Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads

Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads Osama Khan, Randal Burns Department of Computer Science Johns Hopkins University James Plank, William Pierce

### PIONEER RESEARCH & DEVELOPMENT GROUP

SURVEY ON RAID Aishwarya Airen 1, Aarsh Pandit 2, Anshul Sogani 3 1,2,3 A.I.T.R, Indore. Abstract RAID stands for Redundant Array of Independent Disk that is a concept which provides an efficient way for

### Increasing the capacity of RAID5 by online gradual assimilation

Increasing the capacity of RAID5 by online gradual assimilation Jose Luis Gonzalez,Toni Cortes joseluig,toni@ac.upc.es Departament d Arquiectura de Computadors, Universitat Politecnica de Catalunya, Campus

### Reliability of Data Storage Systems

Zurich Research Laboratory Ilias Iliadis April 2, 25 Keynote NexComm 25 www.zurich.ibm.com 25 IBM Corporation Long-term Storage of Increasing Amount of Information An increasing amount of information is

### DiskReduce: RAID for Data-Intensive Scalable Computing

DiskReduce: RAID for Data-Intensive Scalable Computing Bin Fan binfan@cs.cmu.edu Wittawat Tantisiriroj wtantisi@cs.cmu.edu Garth Gibson garth@cs.cmu.edu Lin Xiao lxiao@cs.cmu.edu ABSTRACT Data-intensive

### Overview. How distributed file systems work Part 1: The code repair problem Regenerating Codes. Part 2: Locally Repairable Codes

Overview How distributed file systems work Part 1: The code repair problem Regenerating Codes Part 2: Locally Repairable Codes Part 3: Availability of Codes Open problems 1 current hadoop architecture

### RAID Basics Training Guide

RAID Basics Training Guide Discover a Higher Level of Performance RAID matters. Rely on Intel RAID. Table of Contents 1. What is RAID? 2. RAID Levels RAID 0 RAID 1 RAID 5 RAID 6 RAID 10 RAID 0+1 RAID 1E

A Piggybacing Design Framewor for Read-and Download-efficient Distributed Storage Codes K V Rashmi, Nihar B Shah, Kannan Ramchandran, Fellow, IEEE Department of Electrical Engineering and Computer Sciences

### Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs

Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs Kevin M. Greenan ParaScale greenan@parascale.com Xiaozhou Li HP Labs xiaozhou.li@hp.com Jay J. Wylie HP

### RAID-DP: NetApp Implementation of Double- Parity RAID for Data Protection

Technical Report RAID-DP: NetApp Implementation of Double- Parity RAID for Data Protection Jay White & Chris Lueth, NetApp May 2010 TR-3298 ABSTRACT This document provides an in-depth overview of the NetApp

### Exploring RAID Configurations

Exploring RAID Configurations J. Ryan Fishel Florida State University August 6, 2008 Abstract To address the limits of today s slow mechanical disks, we explored a number of data layouts to improve RAID

### RAID Overview: Identifying What RAID Levels Best Meet Customer Needs. Diamond Series RAID Storage Array

ATTO Technology, Inc. Corporate Headquarters 155 Crosspoint Parkway Amherst, NY 14068 Phone: 716-691-1999 Fax: 716-691-9353 www.attotech.com sales@attotech.com RAID Overview: Identifying What RAID Levels

### A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster K. V. Rashmi 1, Nihar B. Shah 1, Dikang Gu 2, Hairong Kuang

### Note: Correction to the 1997 Tutorial on Reed-Solomon Coding

Note: Correction to the 1997 utorial on Reed-Solomon Coding James S Plank Ying Ding University of ennessee Knoxville, N 3799 [plank,ying]@csutkedu echnical Report U-CS-03-504 Department of Computer Science

### Increased Reliability with SSPiRAL Data Layouts

Increased Reliability with SSPiRAL Data Layouts Ahmed Amer University of Pittsburgh amer@cs.pitt.edu Darrell D. E. Long University of California, Santa Cruz darrell@cs.ucsc.edu Thomas Schwarz Santa Clara

### Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays

1 Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays Farzaneh Rajaei Salmasi Hossein Asadi Majid GhasemiGol rajaei@ce.sharif.edu asadi@sharif.edu ghasemigol@ce.sharif.edu

### Lecture 36: Chapter 6

Lecture 36: Chapter 6 Today s topic RAID 1 RAID Redundant Array of Inexpensive (Independent) Disks Use multiple smaller disks (c.f. one large disk) Parallelism improves performance Plus extra disk(s) for

### Reliability and Fault Tolerance in Storage

Reliability and Fault Tolerance in Storage Dalit Naor/ Dima Sotnikov IBM Haifa Research Storage Systems 1 Advanced Topics on Storage Systems - Spring 2014, Tel-Aviv University http://www.eng.tau.ac.il/semcom

### Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

Operating Systems RAID Redundant Array of Independent Disks Submitted by Ankur Niyogi 2003EE20367 YOUR DATA IS LOST@#!! Do we have backups of all our data???? - The stuff we cannot afford to lose?? How

### RAID-DP : NETWORK APPLIANCE IMPLEMENTATION OF RAID DOUBLE PARITY FOR DATA PROTECTION A HIGH-SPEED IMPLEMENTATION OF RAID 6

RAID-DP : NETWORK APPLIANCE IMPLEMENTATION OF RAID DOUBLE PARITY FOR DATA PROTECTION A HIGH-SPEED IMPLEMENTATION OF RAID 6 Chris Lueth, Network Appliance, Inc. TR-3298 [12/2006] ABSTRACT To date, RAID

### Comprehending the Tradeoffs between Deploying Oracle Database on RAID 5 and RAID 10 Storage Configurations. Database Solutions Engineering

Comprehending the Tradeoffs between Deploying Oracle Database on RAID 5 and RAID 10 Storage Configurations A Dell Technical White Paper Database Solutions Engineering By Sudhansu Sekhar and Raghunatha

### Data Storage - II: Efficient Usage & Errors. Contains slides from: Naci Akkök, Hector Garcia-Molina, Pål Halvorsen, Ketil Lund

Data Storage - II: Efficient Usage & Errors Contains slides from: Naci Akkök, Hector Garcia-Molina, Pål Halvorsen, Ketil Lund Overview Efficient storage usage Disk errors Error recovery INF3 7.3.26 Ellen

### A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems

A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems James S Plank Department of Computer Science University of Tennessee February 19, 1999 Abstract It is well-known that Reed-Solomon

### A Reed-Solomon Code for Disk Storage, and Efficient Recovery Computations for Erasure-Coded Disk Storage

A Reed-Solomon Code for Disk Storage, and Efficient Recovery Computations for Erasure-Coded Disk Storage M. S. MANASSE Microsoft Research, Mountain View, California, USA C. A. THEKKATH Microsoft Research,

### Storing Data: Disks and Files

Storing Data: Disks and Files (From Chapter 9 of textbook) Storing and Retrieving Data Database Management Systems need to: Store large volumes of data Store data reliably (so that data is not lost!) Retrieve

### A Fault Tolerant Video Server Using Combined Raid 5 and Mirroring

Proceedings of Multimedia Computing and Networking 1997 (MMCN97), San Jose, CA, February 1997 A Fault Tolerant Video Server Using Combined Raid 5 and Mirroring Ernst W. BIERSACK, Christoph BERNHARDT Institut

### Redundant Arrays of Inexpensive Disks (RAIDs)

38 Redundant Arrays of Inexpensive Disks (RAIDs) When we use a disk, we sometimes wish it to be faster; I/O operations are slow and thus can be the bottleneck for the entire system. When we use a disk,

### Efficient LDPC Code Based Secret Sharing Schemes and Private Data Storage in Cloud without Encryption

Efficient LDPC Code Based Secret Sharing Schemes and Private Data Storage in Cloud without Encryption Yongge Wang Department of SIS, UNC Charlotte, USA yonwang@uncc.edu Abstract. LDPC codes, LT codes,

### RAID 5 rebuild performance in ProLiant

RAID 5 rebuild performance in ProLiant technology brief Abstract... 2 Overview of the RAID 5 rebuild process... 2 Estimating the mean-time-to-failure (MTTF)... 3 Factors affecting RAID 5 array rebuild

### The Assignment Problem and the Hungarian Method

The Assignment Problem and the Hungarian Method 1 Example 1: You work as a sales manager for a toy manufacturer, and you currently have three salespeople on the road meeting buyers. Your salespeople are

### RAID. Storage-centric computing, cloud computing. Benefits:

RAID Storage-centric computing, cloud computing. Benefits: Improved reliability (via error correcting code, redundancy). Improved performance (via redundancy). Independent disks. RAID Level 0 Provides

### Sonexion GridRAID Characteristics

Sonexion GridRAID Characteristics Mark Swan Performance Team Cray Inc. Saint Paul, Minnesota, USA mswan@cray.com Abstract This paper will present performance characteristics of the Sonexion declustered

### Computer Organization II. CMSC 3833 Lecture 61

RAID, an acronym for Redundant Array of Independent Disks was invented to address problems of disk reliability, cost, and performance. In RAID, data is stored across many disks, with extra disks added

### Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

Overview of I/O Performance and RAID in an RDBMS Environment By: Edward Whalen Performance Tuning Corporation Abstract This paper covers the fundamentals of I/O topics and an overview of RAID levels commonly

### Big Data & Scripting storage networks and distributed file systems

Big Data & Scripting storage networks and distributed file systems 1, 2, in the remainder we use networks of computing nodes to enable computations on even larger datasets for a computation, each node

### A Tutorial on RAID Storage Systems

A Tutorial on RAID Storage Systems Sameshan Perumal and Pieter Kritzinger CS04-05-00 May 6, 2004 Data Network Architectures Group Department of Computer Science University of Cape Town Private Bag, RONDEBOSCH

### SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

SSDs and RAID: What s the right strategy Paul Goodwin VP Product Development Avant Technology SSDs and RAID: What s the right strategy Flash Overview SSD Overview RAID overview Thoughts about Raid Strategies

### Load Balancing on a Non-dedicated Heterogeneous Network of Workstations

Load Balancing on a Non-dedicated Heterogeneous Network of Workstations Dr. Maurice Eggen Nathan Franklin Department of Computer Science Trinity University San Antonio, Texas 78212 Dr. Roger Eggen Department

### Non-Redundant (RAID Level 0)

There are many types of RAID and some of the important ones are introduced below: Non-Redundant (RAID Level 0) A non-redundant disk array, or RAID level 0, has the lowest cost of any RAID organization

### Today s Papers. RAID Basics (Two optional papers) Array Reliability. EECS 262a Advanced Topics in Computer Systems Lecture 4

EECS 262a Advanced Topics in Computer Systems Lecture 4 Filesystems (Con t) September 15 th, 2014 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley Today

### Striping in a RAID Level 5 Disk Array

Striping in a RAID Level 5 Disk Array Peter M. Chen Computer Science and Engineering Division Electrical Engineering and Computer Science Department University of Michigan Edward K. Lee DEC Systems Research

### Summer Student Project Report

Summer Student Project Report Dimitris Kalimeris National and Kapodistrian University of Athens June September 2014 Abstract This report will outline two projects that were done as part of a three months

### VIOLA - A Scalable and Fault-Tolerant Video-on-Demand System

VIOLA - A Scalable and Fault-Tolerant Video-on-Demand System Y.B. Lee Information Networking Laboratories Faculty of Engineering The Chinese University of Hong Kong Tel:(852)-2603-7360 Fax:(852)-2603-7327

### Disk Array Data Organizations and RAID

Guest Lecture for 15-440 Disk Array Data Organizations and RAID October 2010, Greg Ganger 1 Plan for today Why have multiple disks? Storage capacity, performance capacity, reliability Load distribution

### RAID Technology Overview

RAID Technology Overview HP Smart Array RAID Controllers HP Part Number: J6369-90050 Published: September 2007 Edition: 1 Copyright 2007 Hewlett-Packard Development Company L.P. Legal Notices Copyright

### RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

RAID HARDWARE On board SATA RAID controller SATA RAID controller card RAID drive caddy (hot swappable) Anne Watson 1 RAID The word redundant means an unnecessary repetition. The word array means a lineup.

### CS 153 Design of Operating Systems Spring 2015

CS 153 Design of Operating Systems Spring 2015 Lecture 22: File system optimizations Physical Disk Structure Disk components Platters Surfaces Tracks Arm Track Sector Surface Sectors Cylinders Arm Heads

### Data Backup and Archiving with Enterprise Storage Systems

Data Backup and Archiving with Enterprise Storage Systems Slavjan Ivanov 1, Igor Mishkovski 1 1 Faculty of Computer Science and Engineering Ss. Cyril and Methodius University Skopje, Macedonia slavjan_ivanov@yahoo.com,

### FPGA area allocation for parallel C applications

1 FPGA area allocation for parallel C applications Vlad-Mihai Sima, Elena Moscu Panainte, Koen Bertels Computer Engineering Faculty of Electrical Engineering, Mathematics and Computer Science Delft University

### Diversity Coloring for Distributed Data Storage in Networks 1

Diversity Coloring for Distributed Data Storage in Networks 1 Anxiao (Andrew) Jiang and Jehoshua Bruck California Institute of Technology Pasadena, CA 9115, U.S.A. {jax, bruck}@paradise.caltech.edu Abstract

Storage management: talk roadmap! Why disk arrays? Failures Redundancy! RAID! Performance considerations normal and degraded modes! Disk array designs and implementations! Case study: HP AutoRAID 2000-03-StAndrews-arrays,

### Linear Codes. Chapter 3. 3.1 Basics

Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length

### RAID Storage Systems with Early-warning and Data Migration

National Conference on Information Technology and Computer Science (CITCS 2012) RAID Storage Systems with Early-warning and Data Migration Yin Yang 12 1 School of Computer. Huazhong University of yy16036551@smail.hust.edu.cn

### Optimized mapping of pixels into memory for H.264/AVC decoding

Optimized mapping of pixels into memory for H.264/AVC decoding Youhui Zhang a), Yuejian Xie, and Weimin Zheng Department of Computer Science and Technology, Tsinghua University, Beijng, 100084, China.

### Striping in a RAID Level 5 Disk Array

Proceedings of the 1995 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems Striping in a RAID Level 5 Disk Array Peter M. Chen Computer Science Division Electrical Engineering and

### A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture Yangsuk Kee Department of Computer Engineering Seoul National University Seoul, 151-742, Korea Soonhoi

### OceanStor 9000 InfoProtector Technical White Paper. Issue 01. Date 2014-02-13 HUAWEI TECHNOLOGIES CO., LTD.

OceanStor 9000 InfoProtector Technical White Paper Issue 01 Date 2014-02-13 HUAWEI TECHNOLOGIES CO., LTD. Copyright Huawei Technologies Co., Ltd. 2014. All rights reserved. No part of this document may

### DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

DELL RAID PRIMER DELL PERC RAID CONTROLLERS Joe H. Trickey III Dell Storage RAID Product Marketing John Seward Dell Storage RAID Engineering http://www.dell.com/content/topics/topic.aspx/global/products/pvaul/top

### Distributed Data Management Part 2 - Data Broadcasting in Mobile Networks

Distributed Data Management Part 2 - Data Broadcasting in Mobile Networks 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Mobile Data Management - 1 1 Today's Questions 1.

### Symmetrical Pair Scheme: a Load Balancing Strategy to Solve Intra-movie Skewness for Parallel Video Servers

Symmetrical Pair Scheme: a Load Balancing Strategy to Solve Intra-movie Skewness for Parallel Video Servers Song Wu and Hai Jin Huazhong University of Science & Technology, Wuhan, China Email: {wusong,

### Distributed Dynamic Load Balancing for Iterative-Stencil Applications

Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,

### Filing Systems. Filing Systems

Filing Systems At the outset we identified long-term storage as desirable characteristic of an OS. EG: On-line storage for an MIS. Convenience of not having to re-write programs. Sharing of data in an

Proc. 8 th International Symposium on Stabilization, Safety and Securiity of Distributed Systems (SSS 006), Dallas, TX, Nov. 006, to appear Self-Adaptive Disk Arrays Jehan-François Pâris 1*, Thomas J.

### a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

### IBM System x GPFS Storage Server

IBM System x GPFS Storage Server Schöne Aussicht en für HPC Speicher ZKI-Arbeitskreis Paderborn, 15.03.2013 Karsten Kutzer Client Technical Architect Technical Computing IBM Systems & Technology Group

### Combining Low IO-Operations During Data Recovery with Low Parity Overhead in Two-Failure Tolerant Archival Storage Systems

2015 IEEE 21st acific Rim International Symposium on Dependable Computing Combining Low IO-Operations During Data Recovery with Low arity Overhead in Two-Failure Tolerant Archival Storage Systems Thomas

### HARD DRIVE CHARACTERISTICS REFRESHER

The read/write head of a hard drive only detects changes in the magnetic polarity of the material passing beneath it, not the direction of the polarity. Writes are performed by sending current either one

### A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of

### CS420: Operating Systems

NK YORK COLLEGE OF PENNSYLVANIA HG OK 2 RAID YORK COLLEGE OF PENNSYLVAN James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz,