Rebuild Strategies for Clustered Redundant Disk Arrays


Gang Fu, Alexander Thomasian, Chunqi Han and Spencer Ng
Computer Science Department, New Jersey Institute of Technology, Newark, NJ 07102, USA

Abstract

RAID5 tolerates single disk failures by recreating lost data blocks on demand, but this results in a doubling of the load of the surviving disks for a pure read workload. This increase may be unacceptable if the original load was high. Clustered RAID (CRAID) with a parity group size G smaller than the number of disks (G < N) was proposed so that the increase in load is α = (G-1)/(N-1) < 1, but at the cost of a higher parity overhead 1/G. There have been two implementations of CRAID: (i) the balanced incomplete block design (BIBD) data layout and (ii) the nearly random permutation (NRP) data layout. In this study we consider the latter implementation, since it provides more flexibility in varying α for a fixed N. Rebuild is a systematic reconstruction of the contents of the failed disk on a spare disk, which involves reading rebuild units (say tracks) from a subset of the surviving disks. We compare the effect of processing rebuild requests using the vacationing server model (VSM) and the permanent customer model (PCM), which process rebuild requests at a lower or the same priority as user requests, respectively. We also investigate the effect of a control policy to ensure the progress of the rebuild process, since the spare disk may become a bottleneck in this case. The effect of various parameters on the completion time of rebuild processing and on the mean disk response time is also explored.

Authors supported by NSF through a Grant in Computer Systems Architecture.
Hitachi Global Storage Technologies, San Jose Research Center, San Jose, CA.

1 Introduction

RAID5 with rotated parity is a popular design, which tolerates single disk failures and balances disk loads via striping. Striping partitions a file into stripe units (SUs) and allocates them in a round-robin manner on N disks, inserting a parity SU, which is the exclusive-or (XOR) of the contents of the N-1 SUs in the same row or stripe as the parity. A capacity equal to the capacity of one disk is dedicated to parity. The parity blocks are kept up to date as data is updated, which is especially costly when small, randomly placed data blocks are updated, hence the small write penalty. Caching of modified blocks in a non-volatile RAM allows the destaging of dirty blocks to be deferred, so that both the data and the associated parity block can be written to disk at a lower priority than read requests.

When a single disk fails, a requested data block on the failed disk can be reconstructed by XORing the contents of the surviving N-1 disks, which are accessed by an (N-1)-way fork-join request. There is an increase in load, since each surviving disk needs to process the fork-join requests in addition to its own load. At worst, the load is doubled when all requests are reads. The increase in load may be unacceptable when the disk utilization was already high in normal mode, so that the increase results in disk saturation in degraded mode. This is more likely to occur with an infinite source model, where the arrival rate of requests is not reduced by a slowdown of the disk request rate. Given that disk requests belong to various categories, some with lower priorities than others, one solution to deal with overload is to shed low-priority loads up to the point that the overload is alleviated. This can be accomplished by a front-end such as Facade [9].
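
The following minimal Python sketch (our own illustration, not taken from the paper) shows the reconstruction just described: the lost stripe unit is recreated by XORing the corresponding stripe units of the surviving N-1 disks. The stripe layout and the SU size are assumptions made for the example.

    from functools import reduce

    SU_SIZE = 4096  # assumed stripe unit size in bytes (illustrative only)

    def xor_blocks(blocks):
        """XOR a list of equal-sized byte blocks."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def reconstruct_lost_su(stripe, failed_disk):
        """Recreate the SU of the failed disk from the N-1 surviving SUs of the stripe.

        stripe: list of N byte blocks (data SUs plus the parity SU), indexed by disk.
        """
        survivors = [su for d, su in enumerate(stripe) if d != failed_disk]
        return xor_blocks(survivors)

    # Example: 4 data SUs plus 1 parity SU; disk 2 fails.
    data = [bytes([i]) * SU_SIZE for i in range(4)]
    parity = xor_blocks(data)
    stripe = data + [parity]
    assert reconstruct_lost_su(stripe, failed_disk=2) == data[2]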

Clustered RAID, proposed in [12], trades disk capacity against the load increase. While the number of SUs over which the parity is computed, referred to as the parity group size, is G = N in RAID5, clustered RAID allows G < N. To reconstruct a single block, only a fraction α = (G-1)/(N-1) of the surviving disks needs to be accessed, whereas previously the declustering ratio was α = 1. Balanced incomplete block designs (BIBD) [8, 13] and nearly random permutation (NRP) data layouts [11] are two approaches to balance disk loads from the viewpoint of parity updates, while reducing the increase in disk load in degraded mode. The NRP data layout provides higher flexibility than the BIBD layout. We briefly describe the two methods in the next section and point out their similarity.

The paper is organized as follows. In Sections 2.1 and 2.2 we briefly describe the BIBD and NRP layouts. Rebuild processing is discussed in Section 3. The simulation model is described in Section 4. Simulation results are given in Section 5, which is followed by conclusions in Section 6.

2 Clustered RAID

Complete block designs are unacceptable, since given N disks and parity groups of size G, the number of combinations C(N, G), e.g., with N = 20 and G = 10, is too large to yield a balanced load with finite capacity disks. In what follows we describe two data layouts that work.
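
As a quick illustration of the capacity-versus-load tradeoff (a sketch, not taken from the paper), the lines below tabulate the declustering ratio α = (G-1)/(N-1) and the parity overhead 1/G for a few group sizes; N = 20 and the G values are arbitrary choices for the example.

    N = 20  # assumed number of disks

    for G in (5, 10, 15, 20):
        alpha = (G - 1) / (N - 1)   # fraction of surviving disks hit per reconstruction
        overhead = 1 / G            # fraction of capacity devoted to parity
        print(f"G={G:2d}  alpha={alpha:.2f}  parity overhead={overhead:.2f}")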

2.1 BIBD Data Layout

A BIBD data layout is a grouping of N distinct objects into b blocks, such that each block contains G objects, each object occurs in exactly r blocks, and each pair of objects appears in exactly L blocks [6]. Only three of the five variables are free, since bG = Nr and r(G-1) = L(N-1). Consider the BIBD data layout in [13]: N = 10, G = 4, the number of parity groups is b = 15, the number of domains (different parity groups) per disk is r = 6, and the number of parity groups common to any pair of disks is L = 2. (The table listing the parity group numbers assigned to each disk number is not reproduced here.) Block designs exist only for certain values of N and G, e.g., a layout for N = 33 with G = 12 does not exist, but G = 11 or G = 13 can be used instead [8].
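
A small sanity check (ours, not part of the paper) of the two counting identities bG = Nr and r(G-1) = L(N-1) for the layout cited above; the helper simply verifies that a proposed parameter set is consistent.

    def bibd_consistent(N, G, b, r, L):
        """Check the two counting identities every BIBD must satisfy."""
        return b * G == N * r and r * (G - 1) == L * (N - 1)

    # Layout from [13]: N=10 disks, groups of G=4, b=15 parity groups,
    # r=6 groups per disk, any pair of disks shares L=2 groups.
    assert bibd_consistent(N=10, G=4, b=15, r=6, L=2)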

2.2 Nearly Random Permutation Data Layout

This method, proposed in [11], can be summarized as follows:

1. The whole disk array space is organized as a matrix, where the N columns correspond to disks and there are M rows or stripes consisting of N SUs (stripe units). The SUs in the N columns are numbered 0 : N-1, and the NM SUs in the array are numbered 0 : NM-1.

2. The parity group size is G < N. Parity groups are placed sequentially on the NM SUs in the array, so that parity group i occupies SUs iG through iG + G - 1. We refer to this as the initial logical allocation. With N = 10 and G = 4 the initial data layout is shown below (Pi-j stands for the parity of SUs Di through Dj). It can be seen that parity SUs appear on only half of the disks, so that the write load would not be balanced.

   Initial allocation (disks 0-9):
   D0   D1     D2   P0-2    D3   D4     D5   P3-5    D6   D7
   D8   P6-8   D9   D10     D11  P9-11  D12  D13     D14  P12-14
   D15  D16    D17  P15-17  D18  D19    D20  P18-20  D21  D22
   D23  P21-23 D24  D25     D26  P24-26 D27  D28     D29  P27-29

3. For row I, generate a random permutation of {0, 1, ..., N-1}, given as P_I = {P_0, P_1, ..., P_(N-1)}. I is used as a seed to the pseudo-random number generator that produces the permutation; Algorithm 235 [4] is used in our study. (Algorithm 235 shuffles the array a[i], i = 0, ..., n-1: for i := n step -1 until 2 do { j := entier(i * random + 1); b := a[i]; a[i] := a[j]; a[j] := b }.) Thus, given a block number, N, and the stripe unit size, we can compute the row of the block and its disk number. If mod(N, G) = 0 then step 3 is repeated M times, i.e., 1 ≤ I ≤ M. If mod(N, G) ≠ 0 then the random permutation is reused K = LCM(N, G)/N times, where LCM(N, G) is the least common multiple of N and G.

For example, assume the random permutation P_1 = {0, 9, 7, 6, 2, 1, 5, 3, 4, 8} for row one. Then we have the following data allocation for the first two rows, since K = LCM(10, 4)/10 = 2. Note that the permutations generated on successive rows ensure an (approximately) equal number of parity blocks per disk.

   Final allocation (disks 0-9):
   D0   D4     D3   P3-5    D6   D5     P0-2  D2   D7      D1
   D8   P9-11  D11  D13     D14  D12    D10   D9   P12-14  P6-8

Note that the SUs of a parity group straddling two rows are in this case mapped onto different columns, so that (i) the parity SU resides on a different disk than its data SUs, and (ii) all SUs can be accessed in parallel. These requirements are discussed more formally in the next section.
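
The sketch below (our own illustration, with the seeding and the permutation generator as assumptions) follows the NRP recipe: a row index seeds a pseudo-random generator, a Fisher-Yates shuffle stands in for Algorithm 235 to produce that row's permutation, and a logical SU number is then mapped to its physical (row, disk) location. The permutation reuse required when mod(N, G) ≠ 0 is omitted for brevity.

    import random

    def row_permutation(row, N):
        """Permutation of {0,...,N-1} for a given row, seeded by the row index."""
        rng = random.Random(row)
        perm = list(range(N))
        rng.shuffle(perm)   # Fisher-Yates, in the spirit of Algorithm 235 [4]
        return perm

    def su_location(logical_su, N):
        """Map a logical SU number (parity groups laid out sequentially) to (row, disk)."""
        row = logical_su // N      # stripe/row in the initial logical allocation
        col = logical_su % N       # column before the permutation is applied
        return row, row_permutation(row, N)[col]

    # Example: where does logical SU 23 land on a 10-disk array?
    print(su_location(23, N=10))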

2.3 Discussion and Other Designs

Six properties of ideal layouts are given in [8]: (i) Single failure correcting: the SUs of the same stripe are mapped to different disks. (ii) Balanced load due to parity: all disks have the same number of parity SUs mapped onto them. (iii) Balanced load in failed mode: the reconstruction workload should be balanced across all disks. (iv) Large write optimization: each stripe should contain G-1 contiguous data SUs, where G is the parity group size. (v) Maximal read parallelism: reading n ≤ N disk blocks entails accessing n disks. (vi) Efficient mapping: the function that maps physical to logical addresses is easily computable.

The Permutation Development Data Layout (PDDL) is a mapping function described in [16], which has excellent properties and good performance at light loads (like the PRIME data layout [2]) and at heavy loads (like the DATUM data layout [1]).

3 Rebuild Processing

The rebuild process is a systematic reconstruction of the contents of the failed disk, which is started immediately after a disk fails, provided a hot spare is available. The smallest unit of reconstruction is called the rebuild unit (RU), which is usually a stripe unit or a fraction of it. An RU size equal to one or multiple disk tracks was considered in earlier studies, but this is not appropriate for zoned disks, since the number of (512 byte) sectors per track varies from track to track, which would complicate buffer allocation and the control logic for rebuild.

Of interest are the time to complete the rebuild, T_rebuild(ρ), and the response time of user requests versus time, R(t), 0 < t < T_rebuild(ρ), where ρ is the disk utilization before the disk failure occurred (a.k.a. normal mode). When the disk is otherwise idle, the rebuild time equals the product of the number of tracks and the disk rotation time (plus delays due to track and cylinder skews). After a disk failure a RAID5 disk array operates at a higher utilization ρ' = βρ, where β > 1; in the worst case, when all requests are reads, β = 2. ρ is specified explicitly, since it has a first-order effect on rebuild time. Other factors affecting rebuild time are discussed below.

A distinction is made in [8] between stripe-oriented, or more appropriately rebuild-unit (RU) oriented, and disk-oriented rebuild. In stripe-oriented rebuild each RU is rebuilt by a dedicated process, which accesses RUs from all surviving disks, XORs them, and writes the resulting value to the spare disk. Denoting the maximum number of concurrent rebuild processes by P, when P = 1 there is a synchronization after reconstructing each rebuild unit, but as P is increased the effect of this inefficiency is diminished. Disk-oriented rebuild dedicates a process to each disk, which reads RUs from its surviving disk asynchronously. The number of RUs read from surviving disks that have contributed to an RU yet to be written to the spare disk varies, so this policy has a higher buffer requirement than stripe-oriented rebuild with a small number of processes. It is shown in [8] that disk-oriented rebuild outperforms stripe-oriented rebuild, so the stripe-oriented rebuild policy is not considered further in this study.

Regarding buffer requirements, we consider two types of buffers: (i) B_disk, i.e., temporary buffers dedicated to each disk, and (ii) B_spare, i.e., buffers dedicated to writing to the spare disk. Since the unit of buffering is an RU (rebuild unit), the two types of buffers are interchangeable and there is no reason to unnecessarily move data from one buffer to the other. Once an RU is read into one of the B_disk buffers, it is XORed as soon as possible with the appropriate RU in B_spare, so that B_spare eventually holds the XOR of the RUs from all appropriate disks, at which point it can be written to the spare disk. The time to XOR an RU and free its temporary buffer is a function of the memory bandwidth and the parallelism in the XOR unit. We assume that the XOR operation is fast enough that the B_disk buffers are never exhausted, so that the reading of RUs in disk-oriented rebuild is never stalled for this reason. We will, however, consider the effect of B_spare on rebuild performance.

Rebuild requests can be processed at the same priority as user requests, which is the case with the permanent customer model (PCM) [11]. In PCM one rebuild request is processed at a time, i.e., a new rebuild request is inserted at the end of the disk queue as soon as the previous one is completed. Alternatively, rebuild processing can be started as soon as a disk becomes idle, i.e., completes the processing of pending user requests, and is stopped after a user request arrives. This rebuild policy corresponds to the vacationing server model (VSM) in queueing theory and has been investigated in [8, 18, 20]. In effect, an idle server (resp. disk) takes successive vacations (resp. reads successive RUs), but returns from vacation (resp. stops reading RUs) when a user request arrives at the disk, so that rebuild requests are processed at a lower (nonpreemptive) priority than user requests.
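
To make the buffer handling described above concrete, here is a minimal sketch (our illustration; the data structure and the in-memory representation are assumptions): each RU read by a per-disk rebuild process is folded into the corresponding B_spare slot, and the slot is written to the spare disk once all contributing surviving disks have been folded in.

    class SpareSlot:
        """Accumulates the XOR of the surviving RUs of one rebuild unit."""
        def __init__(self, ru_size, contributors):
            self.value = bytearray(ru_size)
            self.remaining = contributors      # surviving disks still to be XORed in

        def fold(self, ru_bytes):
            for i, b in enumerate(ru_bytes):
                self.value[i] ^= b
            self.remaining -= 1
            return self.remaining == 0         # True -> ready to write to the spare disk

    # Usage: with G = 4, each lost RU needs G-1 = 3 surviving contributions.
    slot = SpareSlot(ru_size=512, contributors=3)
    for disk in range(3):
        ready = slot.fold(bytes([disk + 1]) * 512)
    print("write to spare disk:", ready)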

Rebuild options in RAID5 are classified in [12] as follows:

Baseline rebuild: Materialized blocks on the spare disk are updated, but not used to satisfy read requests.

Read redirection: Materialized blocks on the spare disk are updated and also used to satisfy read requests. This provides a faster response for read requests intended for the failed disk, and it reduces the utilization of the surviving disks, resulting in improved response times at these disks and faster rebuild.

Piggy-backing at the block level: Reconstructed data are materialized onto the spare disk as a side effect of a read targeted to the failed disk. Rebuild processing cost is not reduced unless all of the blocks on a track have been rebuilt; in fact it results in degraded performance, as shown by the simulation results in [8]. Piggy-backing at the track level is proposed and evaluated in [5]. It is shown to reduce rebuild time when the initial disk load is low enough to tolerate the increased load incurred when the reading of a block, requiring half a disk rotation on average, is replaced by a full rotation to read the whole track with a zero-latency read capability.

Several policies that improve the response time of user requests by allowing them to preempt rebuild requests are proposed in [19]. (i) Split-seek option: after the seek to read a track is completed, interrupt the rebuild process if user disk requests are pending; otherwise keep reading consecutive tracks until a user request arrives. (ii) Split-latency/transfer option: allows preemption even after the search and transfer have started. (iii) Preemptable seeks: not considered, because this requires intimate knowledge of disk characteristics. While there is some improvement in response time, it comes at the cost of an increased rebuild time; in effect a larger number of user requests are affected because the rebuild time is elongated.

The few studies dealing with rebuild processing [8, 11, 20] leave several questions unanswered.

A recent study evaluates the relative performance of VSM and PCM in (unclustered) RAID5 disk arrays and investigates the effect of buffer size, piggybacking, read redirection, etc. [5].

4 Performance Evaluation of Clustered RAID

Analytic modeling has been used in the past to evaluate RAID5 performance in normal, degraded, and rebuild modes, e.g., [3, 10, 18, 11, 20]. In normal mode there are no disk failures; in degraded mode the system operates with a single failed disk. All of the above studies with one exception ([10]) utilize the M/G/1 queueing model, which allows general rather than exponential service times. It has been shown by validation against detailed simulation results that an M/G/1 model provides very accurate estimates of the mean response time, even for disks with zoning, when rebuilding a RAID5 disk array according to the VSM policy [20]. More recently, two-disk-failure-tolerant arrays such as RAID6, EVENODD, and RM2 have been analyzed in normal and degraded modes using the vacationing server model [7].

Analytical modeling has several shortcomings, e.g., there are no analytical models for the SATF disk scheduling policy, and it is difficult to incorporate the effect of passive resources, such as buffers. Another reason for adopting simulation rather than an analytic solution method is the set of approximations required for analysis, which would have required validation by simulation anyway.
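
For reference, the M/G/1 mean response time that the analytic studies cited above rely on follows the Pollaczek-Khinchine formula; the helper below is a generic sketch (not the models in [3, 10, 18, 11, 20]) computed from the arrival rate and the first two moments of the disk service time, with the example numbers as assumptions.

    def mg1_mean_response_time(lam, x_mean, x_second_moment):
        """Pollaczek-Khinchine: R = E[X] + lam * E[X^2] / (2 * (1 - rho))."""
        rho = lam * x_mean
        assert rho < 1, "queue is unstable"
        return x_mean + lam * x_second_moment / (2.0 * (1.0 - rho))

    # Example with assumed values: 11 ms mean access time, E[X^2] = 150 ms^2,
    # 60 requests per second (times in seconds).
    print(mg1_mean_response_time(lam=60.0, x_mean=0.011, x_second_moment=0.000150))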

In this study we use simulation to study CRAID5 performance with the NRP data layout at varying declustering ratios. Our RAID5 simulator utilizes a detailed simulator of single disks, which can handle different disk drives whose characteristics are given at [14]. Here we present results for the IBM 18ES disk drive, which has a 9 GB capacity and rotates at 7200 RPM, so that the rotation time is 8.33 ms. It has 11 zones, with the number of sectors per track varying by zone. The average seek time is 7.16 ms and the average access time is over 11 ms.

While our simulator can handle traces, such as those available at [17], for efficiency reasons we utilize a characterization of OLTP workloads given in [15], which shows that in an OLTP environment 96% of requests are to 4 KB blocks and 4% to 24 KB blocks, and that these blocks are randomly placed. We simplify this model and assume that all disk requests are to 4 KB blocks, because positioning time dominates the service time. We also assume that the arrival process is Poisson, since this allows us to vary the arrival rate of requests and, for example, obtain the mean response time characteristic of the system in normal and degraded modes. In degraded mode, as discussed in the previous section, we are interested in the rebuild time, assuming rebuild is started as soon as a disk fails. The user response time R(t) in rebuild mode is plotted as a function of time. The results provided are based on repeated simulation runs.
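
The synthetic workload just described can be generated along the following lines (a sketch under the stated simplifications: Poisson arrivals of 4 KB requests to uniformly random blocks; the request rate, duration and seed are assumptions).

    import random

    def generate_requests(rate, duration, num_blocks, seed=1):
        """Yield (arrival_time, block_number) pairs: Poisson arrivals of 4 KB
        requests to uniformly random block addresses."""
        rng = random.Random(seed)
        t = 0.0
        while True:
            t += rng.expovariate(rate)     # exponential inter-arrival times
            if t > duration:
                return
            yield t, rng.randrange(num_blocks)

    # Example: 100 requests/s for 10 simulated seconds over a 9 GB disk of 4 KB blocks.
    num_blocks = 9 * 10**9 // 4096
    workload = list(generate_requests(rate=100.0, duration=10.0, num_blocks=num_blocks))
    print(len(workload), workload[:2])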

5 Experimental Results

The parameter space to be investigated includes: (i) VSM versus PCM; (ii) the effect of read redirection and dynamic control; (iii) the impact of buffer size; (iv) the impact of array size; (v) the impact of parity de-clustering; (vi) the impact of rebuild unit size. In this paper we mainly focus on the rebuild time and on the response time of user requests, rather than of rebuild requests.

5.1 VSM versus PCM

Figure 1: Mean user response time and rebuild time with VSM and PCM. (a) Declustering ratio α = 0.75. (b) Declustering ratio α = 0.25.

In both VSM and PCM, preemption is not allowed, so that the disk will not process user requests until the current rebuild request is completed. Figures 1(a) and (b) show the mean user response time and the rebuild progress versus elapsed time in VSM and PCM at different declustering ratios. The following observations can be made: (i) the user response time in VSM is always lower than in PCM; (ii) the rebuild time in VSM is always shorter than in PCM.

It is natural that the user response time in VSM is lower than in PCM, because in VSM rebuild requests are processed at a lower priority and hence have much less impact on user requests. However, it is somewhat counter-intuitive that the rebuild time in VSM is shorter than in PCM. The reason is that during rebuild the utilization of the bottleneck disk(s), either the surviving disks or the spare disk, is approximately 100%, and the disk utilization due to user requests is approximately the same regardless of which rebuild policy is applied. Therefore, the utilization remaining for rebuild is a constant, and the key to improving rebuild efficiency is to reduce the mean rebuild service time per RU. In VSM, rebuild requests are processed only when the disk is idle, and it is more likely that the disk is still idle when the first rebuild request finishes, so that a second rebuild request can follow without incurring an additional seek. Therefore, the mean service time per RU in VSM is shorter than in PCM. In other words, the probability that consecutive rebuild reads are interrupted is P_interrupt(VSM) = 1 - exp(-λ X_RU) in VSM and P_interrupt(PCM) = 1 - exp(-λ (X_RU + W_RU)) in PCM, where X_RU is the mean service time for reading/writing an RU and W_RU is the mean waiting time of a rebuild request in PCM. Since W_RU > 0, P_interrupt(VSM) < P_interrupt(PCM).

Since VSM is superior to PCM in both rebuild time and user response time, we only consider VSM hereafter.
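
The comparison of interruption probabilities can be spelled out numerically; the helper below is illustrative, with the values of λ, X_RU and W_RU chosen as assumptions for the example.

    import math

    def p_interrupt(lam, x_ru, w_ru=0.0):
        """Probability that a user request arrives before the next rebuild read
        can start, i.e. that consecutive rebuild reads are interrupted."""
        return 1.0 - math.exp(-lam * (x_ru + w_ru))

    # Assumed values: 50 user requests/s, 10 ms to read an RU, 8 ms mean wait under PCM.
    lam, x_ru, w_ru = 50.0, 0.010, 0.008
    print("VSM:", p_interrupt(lam, x_ru))          # ~0.39
    print("PCM:", p_interrupt(lam, x_ru, w_ru))    # ~0.59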

Figure 2: The Effect of Read Redirection.

5.2 The effect of read redirection and dynamic control

Read redirection allows the spare disk to process user read requests that access the part of the spare disk that has already been rebuilt. Read redirection can lower the user response time and shorten the rebuild process significantly. Figure 2 shows the overall mean user response time and rebuild time with and without read redirection. However, read redirection may retard the rebuild progress when the spare disk becomes a bottleneck, since it increases the load on the spare disk. This typically happens when the array has a low declustering ratio (α), so that the surviving disks feed the spare disk with data faster than it can absorb it. When this happens, in order to shorten the rebuild time, we can control the fraction of read requests that are redirected to the spare disk so that the write rate on the spare disk matches the read rate on the surviving disks. Hence the idea of dynamic read redirection control, which was first presented in [12].
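
The paper does not spell out a control law, so the following is only one plausible sketch of dynamic read redirection control: the fraction of reads redirected to the spare disk is throttled whenever the spare disk's rebuild (write) backlog grows, so that its write rate keeps up with the read rate on the surviving disks. All thresholds and the adjustment step are assumptions.

    def adjust_redirect_fraction(fraction, spare_backlog, high=8, low=2, step=0.1):
        """Lower the redirected fraction when the spare disk's rebuild backlog
        (pending RU writes) is high; raise it again when the backlog drains."""
        if spare_backlog > high:
            fraction = max(0.0, fraction - step)
        elif spare_backlog < low:
            fraction = min(1.0, fraction + step)
        return fraction

    # Example: start fully redirected, then the spare disk falls behind.
    f = 1.0
    for backlog in (1, 3, 9, 12, 10, 4, 1):
        f = adjust_redirect_fraction(f, backlog)
        print(backlog, round(f, 1))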

Figure 3: The Effect of Dynamic Control on Read Redirection at various declustering ratios: (a) α = 1, (b) α = 0.75, (c) α = 0.5, (d) α = 0.25.

The effect of dynamic control of read redirection at various declustering ratios is shown in Figures 3(a)-(d). It can be observed that the improvement due to dynamic control is more noticeable when α is small and disk utilizations are high.

5.3 The impact of buffer size

Figure 4 shows the user response time and rebuild time versus buffer size in VSM. In the simulation we use a shared buffer for all disks. After a rebuild unit is read from a surviving disk, it is immediately XORed onto the rebuild working buffer, which will be written to the spare disk. When all surviving RUs of a parity group have been read, the working buffer contains the reconstructed data and can be materialized to the spare disk. Dedicated buffers for each disk are required before an RU can be XORed onto the working buffer, but their sizes are negligibly small and are therefore ignored.

Figure 4: The impact of buffer size.

It can be observed that: (i) A larger buffer size leads to a shorter rebuild time. Due to temporary load imbalance, disks may go out of sync: several disks are too busy and the rebuild requests on those disks lag far behind the rebuild requests on the others. As a result, the rebuild buffer fills up and even the idle disks cannot read RUs anymore; the rebuild process is then suspended until the bottleneck disks get time to process rebuild requests. Obviously, the probability of such a suspension is smaller with larger buffer sizes. As with caching, the improvement from increasing the buffer size is significant at small sizes, but drops off quickly once the buffer is large enough. A proper rebuild buffer size is related to the RU size and the workload variance across disks. (ii) The user response times are not sensitive to the buffer size. In VSM, rebuild requests are processed at a lower priority, so the processing rate of rebuild requests does not affect user requests significantly.

5.4 The impact of rebuild unit size

Figure 5: The impact of rebuild unit size.

Figure 5 plots the user response time and rebuild time versus the rebuild unit (RU) size. It shows that a larger RU size leads to a higher user response time but a shorter rebuild time. The reason is that a larger RU takes a longer service time per rebuild request, and a user request is typically delayed by a rebuild request by half of its service time on average. On the other hand, efficiency is gained since each seek reads more blocks; consequently, the rebuild time is shortened.
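
The tradeoff can be made explicit with a back-of-the-envelope sketch (ours, with assumed positioning-time and transfer-rate parameters): the extra delay seen by a user request is roughly half the RU service time, while the rebuild data moved per seek grows with the RU size.

    SEEK_PLUS_LATENCY = 0.011   # assumed mean positioning time, seconds
    TRANSFER_RATE = 20e6        # assumed transfer rate, bytes/second

    def ru_service_time(ru_bytes):
        return SEEK_PLUS_LATENCY + ru_bytes / TRANSFER_RATE

    for ru_kb in (16, 64, 256, 1024):
        x = ru_service_time(ru_kb * 1024)
        user_penalty = x / 2.0                    # mean residual delay seen by a user request
        rebuild_rate = ru_kb * 1024 / x           # rebuild data moved per unit of busy time
        print(f"RU={ru_kb:5d} KB  penalty={user_penalty*1000:5.1f} ms  "
              f"rebuild={rebuild_rate/1e6:5.1f} MB/s")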

5.5 The impact of array size

Figure 6: The impact of array size.

Figure 6 shows the user response time and rebuild time versus the array size, with the declustering ratio α held fixed. We can see that the array size does not affect the user response time and rebuild time significantly. The array size may affect the rebuild speed through the fork-join effect: a larger array size results in a higher variance for fork-join requests. However, this variance is absorbed by the buffer as long as the buffer size is large enough.

5.6 The impact of parity de-clustering

Figure 7: The impact of parity de-clustering on user response time.

Figure 7 shows disk utilization versus the user response time just after the rebuild process starts (i.e., the worst-case user response time). It is clear that the degraded-mode user response time suffers a rise compared to normal mode, and that this rise is smaller for smaller α.

Figure 8: The impact of parity de-clustering on rebuild time.

The declustering ratio (α) can substantially affect the rebuild process in two ways. Firstly, the overhead brought to each surviving disk by the rebuild process depends on the declustering ratio: the smaller the α, the lower the rebuild overhead, since only a small fraction of the disks needs to be accessed to rebuild a block. This overhead in turn determines the bandwidth left for rebuild requests. Secondly, for each rebuild request, a smaller α implies a weaker fork-join effect and therefore makes rebuilding easier. As shown in Figure 8, a smaller α results in a lower user response time and a shorter rebuild time.

6 Conclusions and Future Work

We first study the effect of the declustering ratio (α) on the performance of a clustered RAID array. We vary the arrival rate of disk requests, which are all reads, until we get close to a disk utilization of one in normal mode. When a single disk fails, the maximum throughput drops to one half of the throughput in normal mode, while with α = 0.5 it drops by a factor of 1.5.

Also shown in the figure is the response time at the beginning of the rebuild process, before read redirection takes effect. The increase in response time with respect to degraded mode is equal to the mean residual reading time of rebuild units, i.e., this difference increases with larger rebuild units.

Disk-oriented rebuild using the vacationing server model (VSM) outperforms rebuild using the permanent customer model (PCM) both in terms of response time and rebuild time. The former is attributable to the fact that user requests are processed at a higher priority than rebuild requests, while the latter is due to the fact that with VSM the chances are higher that multiple rebuild accesses can be processed uninterrupted, thus minimizing the number of seeks incurred for this purpose.

Read redirection has a positive effect: with read redirection the response time of read requests improves gradually until it reaches the normal mode response time, while there is no improvement without read redirection. The effect of dynamic control of read redirection is that the rebuild time decreases while the response time increases. This is not a fair comparison, since we need to take into account the fact that fewer requests will be processed in degraded mode when the rebuild is completed earlier.

A buffer size that is selected too small results in a significant degradation in rebuild time, since the reading of further rebuild units is blocked when the buffer fills. Increasing the rebuild unit size results in a lower rebuild time, accompanied by a small increase in response time. Array size has no effect on rebuild time and response time as long as the other parameters remain fixed.

References

[1] G. A. Alvarez, W. A. Burkhard, and F. Cristian. Tolerating multiple failures in RAID architectures with optimal storage and uniform declustering, Proc. 24th ISCA, 1997.

[2] G. A. Alvarez, W. A. Burkhard, L. J. Stockmeyer, and F. Cristian. Declustered disk array architectures with optimal and near optimal parallelism, Proc. 25th ISCA, 1998.

[3] S.-Z. Chen and D. F. Towsley. The design and evaluation of RAID5 and parity striping disk array architectures, J. Parallel and Distributed Computing 10(1/2), 1993.

[4] R. Durstenfeld. Algorithm 235: Random Permutation, Comm. ACM 7(7): 420, 1964.

[5] G. Fu, A. Thomasian, C. Han, and S. Ng. Rebuild strategies for redundant disk arrays, Proc. IEEE/NASA Conf. on Mass Storage Systems and Technologies.

[6] M. Hall. Combinatorial Theory, Wiley.

[7] C. Han and A. Thomasian. Performance of two-disk failure tolerant disk arrays, Proc. Symp. on Performance Evaluation of Computer and Telecomm. Systems - SPECTS 03.

[8] M. C. Holland, G. A. Gibson, and D. P. Siewiorek. Architectures and algorithms for on-line failure recovery in redundant disk arrays, Distributed and Parallel Databases 11(3), 1994.

[9] C. R. Lumb, A. Merchant, and G. Alvarez. Facade: Virtual storage devices with performance guarantees, Proc. File and Storage Technologies - FAST Conf.

[10] J. Menon. Performance of RAID5 disk arrays with read and write caching, Distributed and Parallel Databases 11(3), 1994.

[11] A. Merchant and P. S. Yu. Analytic modeling of clustered RAID with mapping based on nearly random permutation, IEEE Trans. Computers 45(3), 1996.

[12] R. Muntz and J. C. S. Lui. Performance analysis of disk arrays under failure, Proc. 16th Int'l VLDB Conf., 1990.

[13] S. W. Ng and R. L. Mattson. Uniform parity distribution in disk arrays with multiple failures, IEEE Trans. Computers 43(4), 1994.

[14]

[15] K. K. Ramakrishnan, P. Biswas, and R. Karedla. Analysis of file I/O traces in commercial computing environments, Proc. Joint ACM SIGMETRICS/Performance 92 Conf., 1992.

[16] T. J. E. Schwarz, J. Steinberg, and W. A. Burkhard. Permutation development data layout (PDDL) disk array declustering, Proc. 5th IEEE Symp. on High Performance Computer Architecture - HPCA, 1999.

[17]

[18] A. Thomasian and J. Menon. Performance analysis of RAID5 disk arrays with a vacationing server model, Proc. 10th ICDE Conf., 1994.

[19] A. Thomasian. Rebuild options in RAID5 disk arrays, Proc. 7th IEEE Symp. on Parallel and Distributed Systems, San Antonio, TX, Oct. 1995.

[20] A. Thomasian and J. Menon. RAID5 performance with distributed sparing, IEEE Trans. Parallel and Distributed Systems 8(6), June 1997.


RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1 RAID HARDWARE On board SATA RAID controller SATA RAID controller card RAID drive caddy (hot swappable) Anne Watson 1 RAID The word redundant means an unnecessary repetition. The word array means a lineup.

More information

How To Virtualize A Storage Area Network (San) With Virtualization

How To Virtualize A Storage Area Network (San) With Virtualization A New Method of SAN Storage Virtualization Table of Contents 1 - ABSTRACT 2 - THE NEED FOR STORAGE VIRTUALIZATION 3 - EXISTING STORAGE VIRTUALIZATION METHODS 4 - A NEW METHOD OF VIRTUALIZATION: Storage

More information

CSE 120 Principles of Operating Systems

CSE 120 Principles of Operating Systems CSE 120 Principles of Operating Systems Fall 2004 Lecture 13: FFS, LFS, RAID Geoffrey M. Voelker Overview We ve looked at disks and file systems generically Now we re going to look at some example file

More information

The idea behind RAID is to have a number of disks co-operate in such a way that it looks like one big disk.

The idea behind RAID is to have a number of disks co-operate in such a way that it looks like one big disk. People often ask: Should I RAID my disks? The question is simple, unfortunately the answer is not. So here is a guide to help you decide when a RAID array is advantageous and how to go about it. This guide

More information

Performance Comparison of Assignment Policies on Cluster-based E-Commerce Servers

Performance Comparison of Assignment Policies on Cluster-based E-Commerce Servers Performance Comparison of Assignment Policies on Cluster-based E-Commerce Servers Victoria Ungureanu Department of MSIS Rutgers University, 180 University Ave. Newark, NJ 07102 USA Benjamin Melamed Department

More information

The Classical Architecture. Storage 1 / 36

The Classical Architecture. Storage 1 / 36 1 / 36 The Problem Application Data? Filesystem Logical Drive Physical Drive 2 / 36 Requirements There are different classes of requirements: Data Independence application is shielded from physical storage

More information

Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR

Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR December, 2012 Peter McGonigal petermc@sgi.com Abstract Dynamic Disk Pools (DDP) offer an exciting new approach to traditional RAID sets by substantially

More information

Capacity Planning Process Estimating the load Initial configuration

Capacity Planning Process Estimating the load Initial configuration Capacity Planning Any data warehouse solution will grow over time, sometimes quite dramatically. It is essential that the components of the solution (hardware, software, and database) are capable of supporting

More information

Dynamic Disk Pools Technical Report

Dynamic Disk Pools Technical Report Dynamic Disk Pools Technical Report A Dell Technical White Paper Dell PowerVault MD3 Dense Series of Storage Arrays 9/5/2012 THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL

More information

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture

A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture A Robust Dynamic Load-balancing Scheme for Data Parallel Application on Message Passing Architecture Yangsuk Kee Department of Computer Engineering Seoul National University Seoul, 151-742, Korea Soonhoi

More information

Hitachi Path Management & Load Balancing with Hitachi Dynamic Link Manager and Global Link Availability Manager

Hitachi Path Management & Load Balancing with Hitachi Dynamic Link Manager and Global Link Availability Manager Hitachi Data System s WebTech Series Hitachi Path Management & Load Balancing with Hitachi Dynamic Link Manager and Global Link Availability Manager The HDS WebTech Series Dynamic Load Balancing Who should

More information

Moving Beyond RAID DXi and Dynamic Disk Pools

Moving Beyond RAID DXi and Dynamic Disk Pools TECHNOLOGY BRIEF Moving Beyond RAID DXi and Dynamic Disk Pools NOTICE This Technology Brief contains information protected by copyright. Information in this Technology Brief is subject to change without

More information

Amadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator

Amadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator WHITE PAPER Amadeus SAS Specialists Prove Fusion iomemory a Superior Analysis Accelerator 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com SAS 9 Preferred Implementation Partner tests a single Fusion

More information

Supplemental File of S 2 -RAID: Parallel RAID Architecture for Fast Data Recovery

Supplemental File of S 2 -RAID: Parallel RAID Architecture for Fast Data Recovery JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO. 1, JANUARY 27 1 Supplemental File of S 2 -RAID: Parallel RAID Architecture for Fast Data Recovery Jiguang Wan, Jibin Wang, Changsheng Xie, and Qing Yang, Fellow,

More information

Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6

Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6 Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6 Winter Term 2008 / 2009 Jun.-Prof. Dr. André Brinkmann Andre.Brinkmann@uni-paderborn.de Universität Paderborn PC² Agenda Multiprocessor and

More information