Introduction to RAID

Team members: 王麒鈞, 吳炫逸, 孫維隆 (all Electrical Engineering, Year 1)

A. Motivation

Gosh, my hard disk is broken again, and my computer can't boot normally. I didn't even get a chance to burn my cartoons and dramas onto DVD, and downloading them again will cost me a lot of time. Is there a way to back up my data automatically whenever I download a file? Yes: that is RAID (Redundant Array of Independent Disks, or Redundant Array of Inexpensive Disks).

B. RAID Levels

The basic concept of RAID is to combine several small, cheap drives into an array that offers greater capacity, reliability, and speed than a single drive. That is why the "I" has two meanings: Independent and Inexpensive.

RAID spreads data across many drives by using disk striping (RAID 0), disk mirroring (RAID 1), or disk striping with parity (RAID 5). Depending on the level chosen, the benefit of RAID is one or more of increased data integrity, fault tolerance, throughput, or capacity compared to a single drive.

To spread data evenly over every drive, the data must be divided into many same-sized chunks (usually 32 KB or 64 KB). Depending on the RAID level chosen, each chunk is written to one or more drives of the array. When data is read, the reverse process is used. This creates the illusion that many drives are one big drive.

RAID 0

Simply put, we divide the data into parts and store them on two or more disks. Because the parts can be read and written in parallel, the array works faster than a single disk does. RAID 0 does not store any redundant data, so for a given amount of data it has the lowest disk capacity requirement. But if any disk in a RAID 0 array fails, the array has no ability to recover the data.

[Figure: a logical drive whose blocks (Block1, Block2) are laid out across the member disks; labels: Striping, Mirroring]

RAID 1

We store the same data on two or more disks. In other words, we COPY the data and keep the copies on different disks as backups. RAID 1 has the highest disk capacity requirement, but it provides the most reliable data and the best recovery ability.

[Figure: RAID 1. A logical drive whose Block1 and Block2 are mirrored onto each disk]
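The chunking, striping, and mirroring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real RAID controller is implemented: the names `CHUNK_SIZE`, `split_into_chunks`, `stripe`, `unstripe`, and `mirror` are hypothetical, each "disk" is just a Python list, and the chunk size is 4 bytes instead of 32 KB or 64 KB so that the result is easy to read.

```python
from typing import List

CHUNK_SIZE = 4  # bytes; tiny on purpose (real arrays use e.g. 32 KB or 64 KB)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Divide the data into same-sized chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def stripe(data: bytes, num_disks: int) -> List[List[bytes]]:
    """RAID 0 style: chunk k goes to disk k % num_disks, with no redundancy."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for k, chunk in enumerate(split_into_chunks(data)):
        disks[k % num_disks].append(chunk)
    return disks


def unstripe(disks: List[List[bytes]]) -> bytes:
    """Reverse process: read the chunks back in round-robin order."""
    chunks = []
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


def mirror(data: bytes, num_disks: int) -> List[List[bytes]]:
    """RAID 1 style: every disk holds a full copy of all chunks."""
    return [split_into_chunks(data) for _ in range(num_disks)]


data = b"ABCDEFGHIJ"
striped = stripe(data, 2)   # each disk holds only part of the data
mirrored = mirror(data, 2)  # each disk holds all of the data
assert unstripe(striped) == data
assert b"".join(mirrored[1]) == data  # any single mirror can serve the read
```

In the striped layout, losing one list loses part of the data for good, while in the mirrored layout any surviving list is enough. The price is that mirroring stores the data `num_disks` times.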

RAID 0+1

RAID 0 and RAID 1 are both simple, and each has its own advantages and disadvantages. Combining them as RAID 0+1 gives a practical way to put RAID to use. First, we divide the data into many parts and store them across two (or three, four, and so on) disks, as in RAID 0. Second, we copy the divided data and store the copies on another two (or three, four, and so on) disks, as in RAID 1. The result is a combined array with the advantages of both RAID 0 and RAID 1. In other words, the combined array works efficiently and can still recover its data.

[Figure: RAID 0+1. A logical drive whose Block1 and Block2 are striped across disks, with each striped disk mirrored]

RAID 2 & RAID 3

RAID 2 and RAID 3 have the highest I/O speed for a single request, because the controller runs all drives simultaneously: each datum is divided into bits or bytes and spread over all the drives. But they cannot service multiple requests at the same time. While one datum is being read, every drive is busy, and none has time to read another datum. So they are rarely used today.

RAID 4 & RAID 5

RAID 4 and RAID 5 both use parity to provide fault tolerance. Parity is computed from the data on the other drives of the array. When one drive of the array stops working, its data can be computed from the parity and the data on the remaining drives. When the broken drive is replaced by a drive of the same specification, the original data can be rebuilt on the new one.

In RAID 4, parity is stored on one dedicated drive, so the write speed is limited by that parity drive. RAID 5 removes this limitation by spreading the parity over all drives in the array. The only remaining speed limit is the parity computation itself, which still costs a lot of time.

Now let's see how RAID 5 works when a drive is broken. Suppose that RAID 5 uses XOR (the operator we learned in Chapter 1) to compute parity. The table shows why XOR can be used for this: in every row, each column is the XOR of the other two, so whichever drive is broken, its contents can be computed from the two that remain.

data | data | parity
0    | 0    | 0
0    | 1    | 1
1    | 0    | 1
1    | 1    | 0
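The parity computation and the recovery of a broken drive described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an actual controller algorithm: parity is taken to be the byte-wise XOR of the data blocks in a stripe, the function names `xor_blocks`, `write_stripe`, and `rebuild_block` are hypothetical, and the rule that moves the parity block to a different drive for each stripe (`stripe_index % num_drives`) is just one simple rotation chosen for the example.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(stripe_index: int, data_blocks: List[bytes],
                 num_drives: int) -> List[bytes]:
    """Lay out one stripe: num_drives - 1 data blocks plus one parity block.

    The parity position rotates with the stripe index (an assumed rotation),
    so no single drive holds all the parity as it would in RAID 4.
    """
    assert len(data_blocks) == num_drives - 1
    parity_drive = stripe_index % num_drives
    blocks = list(data_blocks)
    blocks.insert(parity_drive, xor_blocks(data_blocks))
    return blocks  # blocks[i] is what drive i stores for this stripe


def rebuild_block(stripe: List[bytes], broken_drive: int) -> bytes:
    """Recompute the block of a broken drive from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != broken_drive]
    return xor_blocks(survivors)


# Three drives: each stripe holds two data blocks and one parity block.
stripe0 = write_stripe(0, [b"\x0f\xa0", b"\x33\x0c"], num_drives=3)
stripe1 = write_stripe(1, [b"\x55\xff", b"\x01\x02"], num_drives=3)

# Whichever drive breaks, its block in every stripe can be rebuilt.
for stripe in (stripe0, stripe1):
    for broken in range(3):
        assert rebuild_block(stripe, broken) == stripe[broken]
```

The same `xor_blocks` call is used both to create the parity and to rebuild a lost block, because XOR-ing all blocks of a stripe except one always yields the missing one, whether that one held data or parity.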

How RAID 5 works is shown in the illustrations.

[Figure: RAID 5 in three panels, each showing a logical drive with data and parity blocks. Normal situation: write data and write parity. One drive broken: the missing data is computed from the parity. Rebuild data: the data is computed and written to the replacement drive.]

For a comparison of the different RAID levels, see the table.

Name   | Description of disk array                        | Data reliability         | Data Transfer Rate | Max I/O Transfer Rate
RAID 0 | Stores data in parallel, with no fault tolerance | Lower than a single disk | Very high          | High for both reads and writes

[...] doesn't it become popular? The answer is clear: most of us don't need it. Actually, only servers that must store great amounts of data in a short time, with protection against faults, need RAID. For PCs it isn't necessary, and for supercomputers only a high transaction rate of CPU and RAM is needed. Therefore, only servers that hold a lot of data, like those of BBS boards, need RAID.

Still, there is room to improve this interesting technology: even for the most common level, RAID 5, the speed is still limited by the parity calculation, and the word "redundant" implies some waste of resources. Even though RAID is a useful concept, new approaches might replace it in the future.

E. Reference

aid-intro.html

