Enhanced Reliability Modeling of RAID Storage Systems


Enhanced Reliability Modeling of RAID Storage Systems

Jon G. Elerath, Network Appliance, Inc.
Michael Pecht, University of Maryland

Abstract

A flexible model for estimating the reliability of RAID storage systems is presented. This model corrects errors associated with the common assumption that system times to failure follow a homogeneous Poisson process. Separate generalized failure distributions are used to model catastrophic failures and usage-dependent data corruptions for each hard drive. Catastrophic failure restoration is represented by a three-parameter Weibull, so the model can include a minimum time to restore as a function of data transfer rate and hard drive storage capacity. Data can be scrubbed as a background operation to eliminate corrupted data that, in the event of a simultaneous catastrophic failure, would result in double disk failures. Field-based times-to-failure data and mathematical justification for a new model are presented. Model results have been verified and predict between 2 and 1,500 times as many double disk failures as estimated using the current mean time to data loss method.

1. Introduction

Storage systems consisting of redundant arrays of inexpensive disks (RAID) were developed circa 1988 to improve storage system reliability [1]. Reliability estimates were created assuming that both hard disk drive (HDD) failures and RAID system failures follow a homogeneous Poisson process. If these assumptions are accepted, the time to failure for the common RAID 4 or RAID 5 system can be expressed as the mean time to data loss (MTTDL), or an average time to double-disk failures (DDF). MTTDL is commonly turned into an hourly rate, using the exponentially distributed component failure rate. This means the probability of system failure in any time interval is constant. For example, operating 100 RAID groups for 87,600 hours will have the same probability of failure as operating 87,600 groups for 100 hours. That is, assuming renewal theory, it is assumed that the number of DDFs can be estimated by multiplying the system failure rate by time, N(t) = λt, where N(t) is the estimated number of failures and λ is the constant system failure rate per unit time (in the same units as t). This is an attempt to invoke the relationship N(t) = λt ≈ 1 − exp(−λt), assuming that the approximation error is less than 1% when λt is sufficiently small. However, the failure rate of a component, h(t), is statistically different from the failure rate of the system, more correctly referred to as the rate of occurrence of failure [2]-[5]. As noted by Ascher [6], there is little connection between the properties of component hazard rates and the properties of the process that produces a sequence of failures. That is, times between successive system failures can become increasingly larger even though each component hazard rate is increasing [3]. Even if the HDD follows a homogeneous Poisson process (HPP), there is no statistical basis for assuming the system will be an HPP.

A second contributory problem is the assumption that HDD failures follow an HPP. Recent field data analyses show that HDD failure rates are anything but constant. Data for specific HDD products often indicate sub-populations such as infant mortality or wear-out. Different vintages of the same HDD from the same manufacturer may exhibit varying failure distributions. A third issue with current methods is that undiscovered data corruptions can occur at any time in the life of an HDD.
These defects were acknowledged by Kari [9], but he assumed that they were caused only by media deterioration and were independent of usage. While Schwarz included latent defects to optimize scrub algorithms [10], he still assumed the system follows a homogeneous Poisson process with constant failure rates. The significance of undiscovered latent defects (LDs) is apparent when a catastrophic (operational) failure ultimately occurs. The latent defect, combined with the operational failure, constitutes a DDF, defeating the reliability gains of (N+1) RAID. These three issues raise the question of the usefulness of MTTDL models in estimating the number of DDFs in a RAID group. This paper presents a new model that includes latent defects and does not assume an HPP for the HDD or the system. HDD failure modes and mechanisms are briefly presented to justify the need

for discerning between operational failures and latent defects, which are modeled explicitly. Data scrubbing [10], the remedy for latent defects, is also incorporated in the model. Times to fail are supported through actual field data; times to restore are modeled and acknowledge a minimum and maximum time to restore a failed HDD and reconstruct the lost data. The model is evaluated using a sequential Monte Carlo simulation. The expected number of DDFs predicted by the MTTDL method is compared to the number estimated by the proposed model, showing that the previous assumptions result in incorrect predictions.

2. Field reliability data

A number of recent papers shed light on the distributions underlying HDD failures [7], [8], [10]-[13]. Figure 1 and Figure 2 present Weibull probability plots for new, unpublished data. Several noteworthy observations can be made from the aggregate of these data:
- HDD failure rates are rarely constant.
- Failure distributions exhibit decreasing failure rates, late-life increasing failure rates, early-life increasing failure rates, vintage-based improvements, and vintage-based deterioration.
- Distributions change as a result of both design and manufacturing process changes.

The above observations are a significant departure from the assumption of constant failure rates. In Figure 1, data for three different products are plotted assuming a two-parameter Weibull distribution (a straight line indicates a good fit). Only HDD #1 appears to follow a Weibull distribution. Both of the other two datasets are clearly not linear and indicate abrupt changes in the distribution. HDD #2 shows two separate linear sections, denoting that two distributions dominate at different points in time, with the last one, sometime after 10,000 hours, having a marked increase in failure rate (the data plot bends upwards). Failure analyses showed the slope change was due to a change in failure mechanisms. HDD #3 shows two inflection points. Initially, the failure rate is high but decreasing, and follows the slope of HDD #1 (β = 0.9). A significant decrease occurs (for the population) followed by a significant increase (the plot line bends upward). This population has the characteristics of both competing risks and population mixtures. In mixed populations, some of the HDDs have a failure mechanism that the others do not have and so do not, in fact, fail from that mechanism. An example in an HDD is particle contamination [11]. A mixture of populations is likely responsible for the first inflection point for HDD #3 in Figure 1 (decrease in failure rate) and competing risks for the second (upturn in failure rate).

Figure 1. Cumulative probability of failure versus time to failure (hrs). Only HDD #1 fits a Weibull distribution (straight line).

Figure 2. HDD vintage effects. Vintage 1: β1=1.0987, η1=4.5444e+5 (F=198, S=10433); Vintage 2: β2=1.2162, η2=1.2566e+5 (F=992, S=23064); Vintage 3: β3=1.4873, η3=7.5012e+4 (F=921, S=22913).

In Figure 2, the three lines represent three nonconsecutive HDD vintages from one manufacturer. Vintage 1 has an essentially constant failure rate (β=1.09), whereas the others are increasing (β=1.2 and β=1.4). In a Weibull distribution the shape parameter, β, indicates

whether the failure rate is decreasing (β<1.0), constant (β=1.0), or increasing (β>1.0).

3. HDD failure modes and mechanisms

In MTTDL calculations, all HDDs are assumed to have a single failure rate for catastrophic failures, and latent defects are ignored. But latent defects are significant and must be included. Failure modes and mechanisms based on HDD electro-mechanical and magnetic events are summarized in Figure 3, grouped by one of two possible consequences: operational (catastrophic) failures or latent defects. Each group has its own unique failure distribution and consequence at the system level. All read failures can be classified as 1) the HDD being incapable of finding the data or 2) the data being missing or corrupted. Figure 3 shows a structured list of the major causes of inability to read data. The failure mechanisms presented here are not novel [10], [14], but neither are they readily available from HDD manufacturers. The novelty is their use in the model.

Operational Failures (Cannot find data):
- Bad servo-track
- Bad electronics
- Can't stay on track
- Bad read head
- SMART limit exceeded

Latent Defects (Data missing):
- Error during writing: bad media, inherent bit-error rate, high-fly writes
- Written but destroyed: thermal asperities, corrosion, scratched media

Figure 3. Breakdown for read error causes.

3.1 Cannot find data

The inability to "find" data is most often caused by operational failures, which can occur any time the HDD disks are spinning and the heads are staying on track. Heads must read servo wedges that are permanently recorded onto the media during the manufacturing process and cannot be reconstructed with RAID if they are destroyed. These segments contain no user data, but provide information used solely to control the positioning of the read/write heads for all movements. If servo-track data is destroyed or corrupted, the head cannot correctly position itself, resulting in loss of access to user data even though the user's data is uncorrupted. Servo tracks can be damaged by scratches or thermal asperities.

Tracks on an HDD are never perfectly circular. The present head position is continuously measured and compared to where it should be, and a position error signal is used to properly reposition the head over the track. This repeatable run-out is all part of normal HDD head-positioning control. Non-repeatable run-out caused by mechanical tolerances from the motor bearings, excessive wear, actuator arm bearings, noise, vibration, and servo-loop response errors can cause the head positioning to take too long to lock onto a track and ultimately produce an error. High rotational speeds exacerbate this mechanism in both ball and fluid-dynamic bearings.

HDDs use self-monitoring, analysis, and reporting technology (SMART) to predict impending failure based on performance data. For example, data reallocations are expected and many spare sectors are available on each HDD, but an excessive number in a specific time interval will exceed the SMART threshold, resulting in a SMART trip.

Currently, most head failures are due to changes in magnetic properties. Electro-static discharge (ESD), physical impact (contamination), and high temperatures can accelerate magnetic degradation. ESD-induced degradation is difficult to detect and can propagate to full failure when exposed to localized heat from thermal asperities (T/As). The HDD electronics are attached to the outside of the HDD; DRAM and cracked chip-capacitors have also been known to cause failure.
3.2 Data missing

Data is sometimes written poorly initially, but can be corrupted after being written. Unless corrected, missing and corrupted data become latent defects.

1. Errors during writing

The bit-error rate (BER) is a statistical measure of the effectiveness of all the electrical, mechanical, magnetic, and firmware control systems working together to write (or read) data. Most bit-errors occur on a read command and are corrected, but since written data is rarely checked immediately after writing, bit-errors can also occur during writes. BER accounts for some fraction of defective data written to the HDD, but a greater source of errors is the magnetic recording media coating the disks. Writing on scratched, smeared, or pitted media can result in corrupted data. Scratches can be caused by loose hard particles (TiW, Si2O3, C) becoming lodged between the head and the media surface. Smears, caused by soft particles such as stainless

steel and aluminum, will also corrupt data. Pits and voids are caused by particles that were originally embedded in the media during the sputtering process and subsequently dislodged during the final processing steps, the polishing process to remove embedded contaminants, or field use. Hydrocarbon contamination (machine oil) on the disk surface can result in write errors as well. A common cause of poorly written data is the high-fly write. The heads are aerodynamically designed to have a negative pressure and maintain a small, fixed distance above the disk surface at all times. If the aerodynamics are perturbed, the head can fly too high, resulting in weakly (magnetically) written data that cannot be read. All disks have a very thin film of lubricant on them as protection from head-disk contact, but lubrication build-up on the head can increase the flying height.

2. Data written but destroyed

Most RAID reliability models assume that data will remain undestroyed except by degradation of the magnetic properties of the media ("bit-rot"). While it is correct that media can degrade, this failure mechanism is not a significant cause. Data can become corrupted any time the disks are spinning, even when data is not being written to or read from the disk. Three common causes of erasure are thermal asperities, corrosion, and scratches/smears. Thermal asperities are instances of high heat for a short duration caused by head-disk contact. This is usually the result of heads hitting small bumps created by particles embedded in the media surface during the manufacturing process. The heat generated by a single contact may not be sufficient to thermally erase data, but may be sufficient after many contacts. Heads are designed to push particles away, but contaminants can still become lodged between the head and disk. Hard particles used in the manufacture of an HDD, such as Al2O3, TiW, and C, can cause surface scratches and data erasure any time the disk is rotating. Other soft materials such as stainless steel can come from assembly tooling. Soft particles tend to smear across the surface of the media, rendering the data unreadable. Corrosion, although carefully controlled, can also cause data erasure and may be accelerated by T/A-generated heat.

4. Model logic

RAID reduces the probability of data loss by grouping together multiple inexpensive hard disk drives in a redundant configuration and adding error correction using parity. Most RAID configurations use a single additional HDD within the RAID group for redundancy. As part of the write process, an exclusive-OR calculation generates parity bits that are also written to the RAID group. Error correcting codes (ECC) on the HDD and parity across the HDDs are common methods to ensure accurate data transfer and recording. ECC uses Boolean operations to encode blocks of data, interleaving the data and the ECC bits. On each read command, user data and ECC are read. If a data inconsistency occurs, the data is corrected on-the-fly (in less than one revolution), data integrity is preserved, and performance is not degraded. ECC strength is enhanced by interleaving multiple blocks of data so that errors covering a large physical area (many bits) can be corrected. ECC is faster than data recovery across multiple HDDs, but since ECC is read with every block of user data, excessive ECC use can degrade performance.

4.1 Previous models

MTTDL was introduced as the measure of RAID group reliability nearly 20 years ago [1].
Researchers have attempted to improve RAID reliability models, but the primary change has been to introduce Markov models, resulting in a probability of failure rather than an MTTDL [7], [15], [16]. Ultimately, all past work is based on the assumption of constant failure and repair rates. A review of the methods used to assess reliability in papers [1], [17]-[20] identified the following deficiencies:
1. Failure rates are not constant in time.
2. Failure rates change based on production vintage.
3. Failure distributions can be mixtures of multiple distributions because of production vintages.
4. Repair rates are not constant, and there exists a minimum time to complete restoration.
5. Permanent errors can occur at any time.
6. Latent defects must be considered in the model.
7. RAID system failures are assumed to follow a homogeneous Poisson process.

MTTDL attempts to estimate the average time between simultaneous failures of two hard disk drives in an (N+1) RAID group. Disk drives are assumed to have constant failure rates, λ, the reciprocal of the mean time to failure, and constant repair (restoration) rates, μ, the reciprocal of the mean time to restore; RAID group failures are assumed to follow a homogeneous Poisson process. Based on these assumptions, an N+1 RAID group has an MTTDL as shown in equation 1.

MTTDL = [(2N + 1)λ + μ] / [N(N + 1)λ²]        eq. 1
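As a quick sanity check of equation 1 and the constant-rate assumption behind it, the short sketch below evaluates the MTTDL and the implied expected number of DDFs using the example values quoted just below (MTBF = 461,386 hrs, MTTR = 12 hrs, N = 7, 1,000 RAID groups, 10 years). It is only an illustration of the arithmetic, not part of the authors' model.

# Numeric sketch of eq. 1 (and the usual mu >> lambda simplification); illustrative only.
N = 7                          # N+1 = 8 HDDs in the RAID group
lam = 1.0 / 461_386.0          # constant HDD failure rate per hour (1/MTBF)
mu = 1.0 / 12.0                # constant restoration rate per hour (1/MTTR)

mttdl_eq1 = ((2 * N + 1) * lam + mu) / (N * (N + 1) * lam ** 2)   # eq. 1, hours
mttdl_simplified = mu / (N * (N + 1) * lam ** 2)                  # mu >> lambda, hours

HOURS_PER_YEAR = 8_760
print(mttdl_eq1 / HOURS_PER_YEAR)         # ~36,000 years
print(mttdl_simplified / HOURS_PER_YEAR)  # ~36,160 years (the 36,162 yr quoted with eq. 3)

# Under the same HPP assumption, the expected DDF count is just group-years / MTTDL:
# 1,000 RAID groups operated for 10 years gives roughly 0.28 expected DDFs.
print(1_000 * 10 / (mttdl_simplified / HOURS_PER_YEAR))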

Since the repair rate is usually much larger than the failure rate, the MTTDL expression can be simplified.

MTTDL_Indep ≈ μ / [N(N + 1)λ²] = MTTF²_disk / [N(N + 1) · MTTR_disk]        eq. 2

From the MTTDL, the expected number of failures, E[N(t)], in a time interval is estimated by multiplying the time interval by the number of systems and dividing by the MTTDL. Equation 3 shows the estimate for an MTTDL of 36,162 years (MTBF = 461,386 hrs; MTTR = 12 hrs; N = 7), 1,000 RAID groups, and 10 years of operation.

N(t) = (10 yrs × 1,000 RAID groups) / (36,162 yrs per failure) ≈ 0.28 failures        eq. 3

This calculation does not include latent defects or non-constant failure or restoration rates. Non-constant failure rates invalidate the MTTDL.

4.2 NHPP-latent defect model

The state diagram in Figure 4 is used to convey the model logic at a high level. The model is evaluated using Monte Carlo simulation rather than a Markov model because estimating the number of failures in time from a probability model can be erroneous [21]. Four distributions are required, denoted d in Figure 4: time to operational failure, time to latent defect, time to operational repair, and time to scrub (latent defect repair). System failure occurs when two HDDs fail simultaneously, depicted as states 3 and 5. An operational failure (Op) is one in which no data on the HDD can be read, even though the data may have no defect. Removal and replacement of the HDD is the only resolution for operational failures. A latent defect (Ld) refers to unknown or undetected data corruption. Latent defects are corrected only when the corrupted data is read; correction requires reading the data on the other HDDs in the RAID group and the associated parity bits. If only a few blocks of data are corrupted, the reconstructed data is written to another good section of the HDD and the faulty section is mapped out to prevent reuse.

The order of occurrence of operational failures and latent defects is significant. If an operational failure occurs after the existence of a latent defect on a different HDD, the data cannot be reconstructed on the replacement HDD because the required redundant data is corrupted or missing. Thus, a latent defect followed by an operational failure results in a DDF. Write errors that occur during reconstruction of an HDD will be corrected the next time the data is read or will remain as latent defects, but their creation during a reconstruction does not constitute a DDF. The probability of suffering a usage-related data corruption in an unread area during the time of reconstruction is small, so DDFs rarely occur during reconstructions. Multiple HDDs with latent defects do not constitute a DDF unless the defects happen to coexist in blocks from a single data stripe across more than one HDD, an extremely rare event that is not modeled.

Figure 4. State diagram for an N+1 RAID group. State 1: fully functional (all HDDs in the RAID group are operating). State 2: degraded, one latent defect (1 Ld). State 4: degraded, one operational failure (1 Op). States 3 and 5: failed (DDF). Transitions are of the form g[(N+1); dLd], g[dScrub], g[(N+1); dOp], g[(N); dOp], g[dRestore]. Note 1: the Op failure must be on a different HDD than the one with the Ld. Note 2: the excessive-reallocation transition does not have an explicit rate; it is included in the measured rate of Op failures from field data.

Recently, latent defects have been recognized by some system integrators and have been reduced by data scrubbing. Schwarz [10] presented a Markov model for mirrored HDDs in an off-line archive system including scrub optimization, but the analysis did not include large RAID groups with latent defects.
During scrubbing, data on the HDD is read and checked against its parity bits even though the data is not being requested by the user. The corrupted data is corrected, bad spots on the media are mapped out, and the data is saved to good locations on the HDD. Since this is a background activity, it may be rather slow so that it does not impede performance. Depending on the foreground I/O demand, the scrub time may be as short as the maximum HDD and data-bus transfer rates permit, or may be as long as weeks. In summary, the two scenarios that result in a DDF are 1) two simultaneous operational failures and 2) an operational failure that occurs after a latent defect has been introduced and before it is corrected. Multiple simultaneous latent defects do not constitute a failure.
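As an illustration of these two scenarios, the minimal Python sketch below encodes the DDF decision for a single operational failure, given the failure and defect windows on the other HDDs in the group. The record format is hypothetical, not the authors' data structure.

def is_ddf(op_failure_time, other_hdd_events):
    """Decide whether an operational failure on one HDD constitutes a DDF.

    other_hdd_events lists (kind, start, cleared) tuples for *other* HDDs in the
    RAID group: kind 'op' is an operational failure restored at time 'cleared';
    kind 'ld' is a latent defect scrubbed/corrected at time 'cleared'."""
    for kind, start, cleared in other_hdd_events:
        if kind == 'op' and start < op_failure_time < cleared:
            return True   # scenario 1: two simultaneous operational failures
        if kind == 'ld' and start < op_failure_time < cleared:
            return True   # scenario 2: op failure after an uncorrected latent defect
    return False

# Example: a latent defect at 4,000 h that is not scrubbed until 4,168 h exposes the
# group; an operational failure on another HDD at 4,100 h is therefore a DDF.
print(is_ddf(4_100, [('ld', 4_000, 4_168)]))   # True
print(is_ddf(4_200, [('ld', 4_000, 4_168)]))   # False: the defect was already corrected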

The model logic is partially depicted in Figure 4. In state 1, the data and parity HDDs are good, there are no latent defects, and a spare HDD is available. In state 2, one or more HDDs have latent defects. Failure transitions depend on the number of HDDs available and the distribution of time to failure or restoration. A generic functional notation, g[a; b], is used to represent transitions and the critical variables a and b without conveying any specific operation. For example, the transition from state 1 to state 2 is a function of the N+1 HDDs developing a latent defect according to the failure distribution dLd. From state 2, an operational failure in any of the N HDDs other than the one with the latent defect results in state 3, a DDF state. The transition from state 2 to 3 is governed by the operational failure distribution dOp. A transition from state 2 to state 4 occurs when the time to reallocate a sudden burst of media defects on a single HDD exceeds a user-specified threshold, resulting in a "time-out" error or a SMART trip such as excessive block reallocations. In this transition, massive media problems render the HDD inoperative, just like any other operational failure, so the frequency of the transition from state 2 to state 4 is included in the operational failure distribution dOp. A third transition from state 2 is back to state 1; this represents repair of latent defects according to the scrubbing distribution dScrub. State 4 represents one operational failure. The transition to state 4 from state 1 is a function of the number of HDDs in the RAID group and the operational failure distribution dOp. There are two transitions out of state 4. A second simultaneous operational failure results in a transition to DDF state 5. Otherwise, the failed HDD is replaced with a new HDD and the data is reconstructed according to the restoration distribution dRestore, returning the RAID group to state 1 with full operability. The distribution dRestore includes the delay time to physically incorporate the spare HDD and has a minimum time to reconstruct based on the HDD capacity, the maximum transfer rate, and concurrent I/O.

5. Sequential Monte Carlo modeling

In a sequential Monte Carlo simulation, the time-dependent, or chronological, behavior of the system is simulated [22]. For each HDD in the RAID group, each transition distribution in Figure 4 is sampled. The operating and failure times are accumulated until a specified mission time is exceeded. This research uses a mission of 87,600 hours (10 years). During that time, the sequence of HDD failures, repairs, latent defects, scrubs, and DDFs is tracked. Each sequence of sampling required to reach the mission is a single simulation and represents one possible system operating chronology. If 10,000 simulations are run to develop the cumulative failure function described in [23], it is equivalent to monitoring the number of DDFs for 10,000 systems over the mission life. Figure 5 shows the sequential sampling process used. For simplicity, only four HDD slots are shown. The graph looks like a digital timing diagram, with the high signal representing the operating (non-defective) condition and the low signal representing the failed (defective) condition. Throughout this process, each HDD slot in the RAID group carries its own times-to-failure (both TTOp and TTLd) and times-to-restore (TTR and TTScrub) distributions. When a DDF involves an HDD with a latent defect, the TTR for the failure is the same as the concomitant operational failure time.
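The sketch below is a much-simplified rendition of this sequential sampling idea, intended only to make the mechanics concrete. The three-parameter Weibull form is the one defined in section 6; the mission length, group size, TTOp characteristic life, and TTR parameters are taken from the text, while the TTOp shape and the TTLd/TTScrub parameters are illustrative placeholders. The bookkeeping (for example, the shifted restart times of Figure 5) is deliberately simplified and is not the authors' implementation.

import math, random

MISSION = 87_600.0      # 10-year mission, hours (from the text)
GROUP_SIZE = 8          # N+1 HDDs per RAID group

# Transition distributions as (gamma, eta, beta).  TTOp's eta and the TTR values come
# from the text; the TTOp beta and the TTLd/TTScrub parameters are assumed placeholders.
TTOP    = (0.0, 461_386.0, 1.1)   # operational failure
TTR     = (6.0, 12.0, 2.0)        # restore/reconstruct after an operational failure
TTLD    = (0.0, 9_259.0, 1.0)     # latent defect; eta ~ 1 / (1.08e-4 err/hr), assumed
TTSCRUB = (6.0, 168.0, 3.0)       # scrub completion, assumed 168 h characteristic life

def weib3(params):
    """Inverse-CDF sample from a three-parameter Weibull (location, scale, shape)."""
    gamma, eta, beta = params
    return gamma + eta * (-math.log(1.0 - random.random())) ** (1.0 / beta)

def outage_intervals(fail_dist, repair_dist):
    """[start, end) windows during which one slot is failed (or carries a defect)."""
    t, windows = 0.0, []
    while True:
        t += weib3(fail_dist)                 # next failure or defect
        if t > MISSION:
            return windows
        t_clear = t + weib3(repair_dist)      # restored or scrubbed
        windows.append((t, t_clear))
        t = t_clear

def simulate_group():
    op = [outage_intervals(TTOP, TTR) for _ in range(GROUP_SIZE)]
    ld = [outage_intervals(TTLD, TTSCRUB) for _ in range(GROUP_SIZE)]
    ddf = 0
    for i in range(GROUP_SIZE):
        for start, _ in op[i]:
            # DDF if, at the moment slot i fails operationally, any *other* slot is
            # either down with an operational failure or carries an unscrubbed defect.
            exposed = any(s < start < e
                          for j in range(GROUP_SIZE) if j != i
                          for s, e in op[j] + ld[j])
            ddf += exposed
    return ddf

random.seed(1)
groups = 1000
print(sum(simulate_group() for _ in range(groups)), "DDFs per", groups, "RAID groups in 10 years")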
Figure 5. Timing diagram for sampling TTFs and TTRs (four HDD slots shown). Initially, a TTF and TTR are sampled for each HDD slot, t1 to t8. Then, pair-wise comparisons (e.g., "Is t1 < t3 < t2?") are made, as indicated below the diagram, to determine whether a newly sampled failure falls within another slot's failure-to-restoration window (a DDF); after each comparison, a new TTF and TTR are sampled for the slot just processed, and restart times are shifted to coincide with the completion of the concurrent restoration.

The simulation begins by sampling a TTOp and a TTLd for every HDD and storing the times in separate arrays. For the two HDDs with the shortest times to failure (or defect), a time to restore (or time to scrub) is sampled. If two operational failures exist simultaneously, a DDF occurs. Since two latent defects will not fail the system, there is no DDF if the shortest and second-shortest event times are both latent defects. If one event is an operational failure and one

is a latent defect, a DDF exists when the operational failure occurs after the latent defect has occurred and before the scrub process corrects the corrupted data from the latent defect. A system failure does not occur if the shortest time is an operational failure and the second-shortest is a latent defect. Once a DDF has occurred, a subsequent one cannot occur until the first is restored. If no DDF is detected, then the TTR (or TTScrub) that has already been sampled and used in the preceding comparison is added to the earliest time to failure. A new TTOp (or TTLd) is sampled and added to the previous sum, the HDDs are again sorted, and a slot is removed from further sampling once its cumulative time exceeds the mission time. This process is reiterated until the cumulative operating times for all HDD slots have exceeded the mission time.

6. Transition distributions

The four component-related distributions required for this model are time to operational failure, time to restore an operational failure, time to generation of a latent defect, and time to scrub HDDs for latent defects. The simulations in this paper use a three-parameter Weibull probability density function, f(t), of the form

f(t) = (β/η) · ((t − γ)/η)^(β−1) · exp[−((t − γ)/η)^β]

where γ is the location parameter, η is the characteristic life, and β is the shape parameter.

6.1 Time to operational failure (TTOp)

To illustrate the improvement over the MTTDL method, a single TTOp distribution is used: a Weibull failure distribution with a slightly increasing failure rate. The characteristic life, η, is 461,386 hours, and the shape parameter, β, is slightly greater than 1. These parameters are from a field population of over 120,000 HDDs that operated for up to 6,000 hours each.

6.2 Time to restore (TTR)

A constant restoration rate implies that the probability of completing the restoration in any time interval is as likely as in any other interval of equal length. Therefore, it is just as likely to complete restoration in the interval 0 to 48 hours as in the interval 1,000 to 1,048 hours. But this is clearly unrealistic for two reasons. First, there is a finite amount of time required to reconstruct all the data on the HDD. It is a function of the HDD capacity, the data rate of the HDD, the data rate of the data-bus, the number of HDDs on the data-bus, and the amount of I/O transferred as a foreground process. Reconstruction is performed on a high-priority basis but does not stop all other I/O to accelerate completion. This model recognizes that there is a minimum time before which the probability of being fully restored is zero. Fibre Channel HDDs can sustain up to 100 MB/s data transfer rates, although 50 MB/s is more common. The data-bus to which the RAID group is attached has only a 2 Gb/s capability. Thus, in a RAID group of 14, a 144 GB HDD on a Fibre Channel interface will require a minimum of three hours with no other I/O to reconstruct the failed HDD. A 500 GB Serial ATA HDD on a 1.5 Gb/s data-bus will require 10.4 hours to read all other HDDs and reconstruct a replaced HDD. The added I/O associated with continuing to serve data will lengthen the time to restore an operational failure. Second, some operating systems place a limit on the amount of I/O that takes place during reconstruction, thereby assuring that reconstruction will complete in a prescribed amount of time; this results in a maximum reconstruction time. The minimum time of six hours is used for the location parameter.
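The rough sketch below reproduces this minimum-reconstruction-time arithmetic under one simplifying assumption: a rebuild streams every surviving drive plus the replacement drive across the shared data-bus at its raw payload rate, with no competing I/O and no protocol overhead. With that assumption the Serial ATA case lands near the 10.4-hour figure above, while the Fibre Channel case comes out somewhat below the quoted three hours, so treat it only as an order-of-magnitude illustration.

# Minimum time to rebuild one failed HDD, bus-limited; see assumptions above.
def min_rebuild_hours(capacity_gb, group_size, bus_gbit_per_s):
    bus_mb_per_s = bus_gbit_per_s * 1000.0 / 8.0     # raw payload rate of the shared bus
    total_mb = group_size * capacity_gb * 1000.0     # (group_size - 1) reads + 1 write
    return total_mb / bus_mb_per_s / 3600.0

print(min_rebuild_hours(144, 14, 2.0))   # Fibre Channel case: ~2.2 h (text quotes ~3 h)
print(min_rebuild_hours(500, 14, 1.5))   # Serial ATA case: ~10.4 h, matching the text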
The shape parameter of 2 generates a right-skewed distribution, and the characteristic life is 12 hours.

6.3 Time to latent defect (TTLd)

Personal conversations with engineers from four of the world's leading HDD manufacturers support the contention that HDD failure rates are usage dependent, but the exact transfer function of reliability as a function of use (number of reads and writes, lengths of reads and writes, sequential versus random) is not known (or they aren't telling anyone). These analyses approximate usage by combining read errors per Byte read and the average number of Bytes read per hour. The result is shown in Table 1, and the following discussion is the justification. Schwarz [10] claims the rate of data corruption is five times the rate of HDD operating failures. Network Appliance completed a study in late 2004 on 282,000 HDDs used in RAID architectures. The read error rate (RER), averaged over three months, was 8x10^-14 errors per Byte read. At the same time, another analysis of 66,800 HDDs showed an RER of approximately 3.2x10^-13 errors per Byte. A more recent analysis of 63,000 HDDs over five months showed a much improved 8x10^-15 errors per Byte read. In these studies, data corruption was verified by the HDD manufacturer as an HDD problem and not a result of the operating system controlling the RAID group.
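The entries of Table 1 (given with the hourly read volumes in the next paragraph) are just the product of these read-error rates and an assumed Bytes-read-per-hour figure; the snippet below spells out that arithmetic.

# Latent-defect generation rate = read-error rate (errors/Byte) x Bytes read per hour.
read_error_rates = {"low": 8.0e-15, "med": 8.0e-14, "high": 3.2e-13}   # errors per Byte
bytes_read_per_hour = {"low": 1.35e9, "high": 1.35e10}                 # Bytes per hour

for rer_name, rer in read_error_rates.items():
    for rate_name, bph in bytes_read_per_hour.items():
        print(f"RER {rer_name}, read rate {rate_name}: {rer * bph:.2e} errors/hr")

# For example, the medium RER at the high read rate gives ~1.1e-3 errors/hr, i.e. a
# mean time to latent defect on the order of 900 hours for a single HDD.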

While Gray [25] asserts that it is reasonable to transfer 4.32x10^12 Bytes/day/HDD, the study of 63,000 HDDs read 7.3x10^17 Bytes of data in five months, an approximate read rate of 2.7x10^11 Bytes/day/HDD. The following studies used a high of 1.35x10^10 Bytes/hour and a low of 1.35x10^9 Bytes/hour. Using combinations of the RERs and the number of Bytes read yields the hourly read failure rates in Table 1.

Table 1. Range of average read error rates (errors per hour per HDD)
                              Bytes read per hour
Read errors per Byte          Low rate (1.35x10^9)     High rate (1.35x10^10)
Low  (8.0x10^-15)             1.08x10^-5 Err/hr        1.08x10^-4 Err/hr
Med  (8.0x10^-14)             1.08x10^-4 Err/hr        1.08x10^-3 Err/hr
High (3.2x10^-13)             4.32x10^-4 Err/hr        4.32x10^-3 Err/hr

6.4 Time to scrub (TTScrub)

Latent defects (data corruptions) can occur any time the disks are spinning. However, these defects can be eliminated by background scrubbing, which is essentially preventive maintenance on data errors. Scrubbing occurs during times of idleness or low I/O activity. During scrubbing, data is read and compared to the parity. If they are consistent, no action is taken. If they are inconsistent, the corrupted data is recovered and rewritten to the HDD. If the media is defective, the recovered data is written to new physical sectors on the HDD and the bad blocks are mapped out. Scrubbing is a background activity performed on an as-possible basis so it does not affect performance. If not scrubbed, the period of time to accumulate latent defects starts when the HDD first begins operation in the system. The latent defect rate is assumed to be constant with respect to time (β=1) and is based on the error generation rate and the hourly data transfer rate. As with full HDD data reconstruction, the time required to scrub an entire HDD is a random variable that depends on the HDD capacity and the amount of foreground activity. The minimum time to cover the entire HDD is based on capacity and foreground I/O. The operating system may invoke a maximum time to complete scrubbing. In all cases the shape parameter, β, is 3, which produces a Normal-shaped distribution after the delay set by the location parameter, γ.

7. Results

Analyses were conducted to study the effects of parametric variants of a base case with the parameters shown in Table 2. All analyses have an 87,600-hour (10-year) mission and 8 HDDs in a RAID group.

Table 2. Base case input parameters: γ, η, and β for each of the four distributions (operational failure distributions TTOp and TTR; latent defect distributions TTLd and TTScrub).

Four variants of the base case, none of which include latent defects or scrubbing, are shown in Figure 6. Line "c-c" has constant rates for both failures and restorations. Line "f(t)-c" has time-dependent failure rates and constant restoration rates. Line "c-r(t)" has constant failure rates and time-dependent restoration rates. For line "f(t)-r(t)," failures and restorations are both time dependent, as per Table 2. The last line is based on the MTTDL assuming constant failure and restoration rates. As expected, the model result "c-c" follows the MTTDL line closely. The plot shows the model will produce the same results as the MTTDL under the same (time-independent rate) assumptions but is sensitive to time-dependent failure and restoration rates. The directions of change are counter-intuitive, but result from the shift in the probability density function "mass" when the characteristic life is not changed.

Figure 6. Model compared to MTTDL without latent defects (DDFs per 1,000 RAID groups versus time in hours, for the MTTDL, c-c, f(t)-c, c-r(t), and f(t)-r(t) cases).

The difference between the MTTDL and the model is on the order of 2 to 1.
If latent defects are not included, this difference may not be enough to warrant the use of this complex model. However, when latent defects are added to the analysis, the differences become great. Figure 7 compares the base case (including latent defects and 168 hour scrub) to the case of latent defects without scrubbing, which introduces significantly more DDFs. Notice that in both of these studies, the plot lines are not linear, showing the effects of the time-dependent failure and restoration rates. The increasing rate of occurrence of

failure (ROCOF) is verified by finding the number of DDFs that occur in any fixed time interval (Figure 8).

Figure 7. Effects of latent defects with no scrub and with a 168 hr scrub (DDFs per 1,000 RAID groups versus time in hours).

Figure 8. ROCOFs for the plots in Figure 7.

In Figure 9, additional scrub durations are compared. Again, the plots exhibit a non-linear (time-dependent) ROCOF. Remember from Figure 6 that the MTTDL without latent defects predicts only 0.27 DDFs/1000 RAID groups in 10 years.

Figure 9. Effects of scrub durations (DDFs per 1,000 RAID groups versus time in hours; scrub durations include 12 hr, 48 hr, and 168 hr).

The assumption of constant failure rates is inherent to the MTTDL calculations. However, Figure 10 clearly shows the potential inaccuracy resulting from that assumption even when this new model is used. A shape parameter of 0.8 may actually produce 83% more DDFs than when beta is 1.0. Similarly, if the actual beta is 1.4, there may be only 30% of the DDFs predicted using constant failure rates.

Figure 10. Effects of the operational failure shape parameter for a given characteristic life (DDFs per 1,000 RAID groups versus time in hours, for β = 0.8, 1, 1.12, and 1.4).

This research and new model show a clear difference between the estimated number of DDFs as a function of time based on the MTTDL and on the new model. The number of DDFs predicted by the model is, in all cases, greater than that predicted by the MTTDL when latent defects are included. Without scrubbing, and assuming the distributions in Table 2, this model estimates that in 1,000 RAID groups there will be over 1,200 DDFs in the 10-year mission, contrary to the 0.3 predicted by the MTTDL. Table 3 shows the ratio of DDFs expected with the new model to the number estimated using the MTTDL during the first year alone. The highest ratio, >2,500, occurs when latent defects are included but there is no scrubbing. Even if scrubbing is completed in 168 hours, the new model predicts over 360 times as many DDFs as the MTTDL method.

Table 3. DDF comparisons: DDFs in the first year and the ratio to the MTTDL estimate, for the MTTDL, the base case without scrub, and several scrub durations.

8. Conclusions

The MTTDL calculations exclude latent defects and implicitly assume the rate of occurrence of failure for any RAID group is an HPP (constant in time). The model results show that correctly including time-dependent failure rates and restoration rates along with latent defects yields estimates of DDFs that are as

much as 4,000 times greater than the MTTDL-based estimates. Additionally, the ROCOF for a RAID group is not linear in time and depends heavily on the underlying component failure distributions. Field data show HDD failure rates are not constant in time and vary from vintage to vintage. Latent defects are inevitable, and scrubbing latent defects is imperative to RAID (N+1) reliability. Short scrub durations can improve reliability, but at some point the extensive scrubbing required to support high-capacity HDDs will unacceptably impact performance. This model provides a tool by which RAID designers can better evaluate the impact of the latent defect occurrence rate, which may be 100 times greater than the operational failure rate, and of the scrubbing rate. The RAID architect can use this model to drive the design, providing insights as to the best RAID group size based on a specific manufacturer's HDDs and the impact of an increasing failure rate. For systems that currently do not scrub, consumers can see that this is a recipe for disaster. It appears that, eventually, RAID 6 will be required to meet high reliability requirements.

9. References

[1] D. A. Patterson, G. A. Gibson, and R. H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)," Proc. ACM Conference on Management of Data (SIGMOD), Chicago, IL, June.
[2] W. A. Thompson, "On the Foundations of Reliability," Technometrics, vol. 23, no. 1, Feb. 1981.
[3] H. E. Ascher, "A Set-of-Numbers is NOT a Data-Set," IEEE Trans. on Reliability, vol. 48, no. 2, June.
[4] L. H. Crow, "Evaluating the Reliability of Repairable Systems," Proc. Annual Reliability & Maintainability Symp.
[5] W. Nelson, "Graphical Analyses of System Repair Data," Journal of Quality Technology, vol. 20, no. 1, Jan.
[6] H. Ascher, "[Statistical Methods in Reliability]: Discussion," Technometrics, vol. 25, no. 4, Nov.
[7] J. G. Elerath and S. Shah, "Disk Drive Reliability Case Study: Dependence Upon Head Fly-Height and Quantity of Heads," Proc. Annual Reliability & Maintainability Symp.
[8] S. Shah and J. G. Elerath, "Disk Drive Vintage and Its Affect on Reliability," Proc. Annual Reliability & Maintainability Symp.
[9] H. H. Kari, "Latent Sector Faults and Reliability of Disk Arrays," Ph.D. Dissertation, TKO-A33, Helsinki University of Technology, Espoo, Finland, 1997.
[10] T. J. E. Schwarz et al., "Disk Scrubbing in Large Archival Storage Systems," IEEE Computer Society Symposium, MASCOTS.
[11] S. Shah and J. G. Elerath, "Reliability Analysis of Disk Drive Failure Mechanisms," Proc. Annual Reliability & Maintainability Symp.
[12] E. Pinheiro, W. D. Weber, and L. A. Barroso, "Failure Trends in a Large Disk Drive Population," Proc. 5th USENIX Conference on File and Storage Technologies (FAST '07), Feb.
[13] B. Schroeder and G. Gibson, "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" Proc. 5th USENIX Conference on File and Storage Technologies (FAST), Feb.
[14] V. Prabhakaran et al., "IRON File Systems," SOSP '05, Brighton, UK, Oct. 2005.
[15] R. Geist and K. Trivedi, "An Analytic Treatment of the Reliability and Performance of Mirrored Disk Subsystems," Twenty-Third Inter. Symp. on Fault-Tolerant Computing (FTCS), June.
[16] M. Malhotra, "Specification and solution of dependability models of fault tolerant systems," Ph.D. Dissertation, Dept. of Computer Science, Duke University, May 14.
[17] D. A. Patterson et al., "Introduction to Redundant Arrays of Inexpensive Disks (RAID)," Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage (COMPCON), Feb.
[18] P. M. Chen et al., "RAID: High-Performance, Reliable Secondary Storage," ACM Computing Surveys.
[19] W. V. Courtright, II, "A Transactional Approach to Redundant Disk Array Implementation," Ph.D. Thesis, School of Computer Science, Carnegie Mellon University, May.
[20] T. J. E. Schwarz and W. A. Burkhard, "Reliability and Performance of RAIDs," IEEE Transactions on Magnetics, vol. 31, no. 2, Mar.
[21] W. A. Thompson, "The Rate of Failure Is the Density, Not the Failure Rate," The American Statistician, Editorial, vol. 42, no. 4, Nov.
[22] C. L. T. Borges et al., "Composite Reliability Evaluation by Sequential Monte Carlo Simulation on Parallel and Distributed Operating Environments," IEEE Trans. on Power Systems, vol. 16, no. 2, May.
[23] D. Trindade and S. Nathan, "Simple Plots for Monitoring Field Reliability of Repairable Systems," Proc. Annual Reliability & Maintainability Symp.
[24] P. Corbett et al., "Row-Diagonal Parity for Double Disk Failure Correction," Proc. 3rd USENIX Conference on File and Storage Technologies, San Francisco.
[25] J. Gray and C. van Ingen, "Empirical Measurements of Disk Failure Rates and Error Rates," Microsoft Research Technical Report, MSR-TR, Dec.


More information

Designing a Cloud Storage System

Designing a Cloud Storage System Designing a Cloud Storage System End to End Cloud Storage When designing a cloud storage system, there is value in decoupling the system s archival capacity (its ability to persistently store large volumes

More information

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck 1/19 Why disk arrays? CPUs speeds increase faster than disks - Time won t really help workloads where disk in bottleneck Some applications (audio/video) require big files Disk arrays - make one logical

More information

Chapter 9: Peripheral Devices: Magnetic Disks

Chapter 9: Peripheral Devices: Magnetic Disks Chapter 9: Peripheral Devices: Magnetic Disks Basic Disk Operation Performance Parameters and History of Improvement Example disks RAID (Redundant Arrays of Inexpensive Disks) Improving Reliability Improving

More information

Availability and Cost Monitoring in Datacenters. Using Mean Cumulative Functions

Availability and Cost Monitoring in Datacenters. Using Mean Cumulative Functions Availability and Cost Monitoring in Datacenters Using Mean Cumulative Functions David Trindade, Swami Nathan Sun Microsystems Inc. {david.trindade,swami.nathan} @sun.com Keywords : Availability analysis,

More information

VERY IMPORTANT NOTE! - RAID

VERY IMPORTANT NOTE! - RAID Disk drives are an integral part of any computing system. Disk drives are usually where the operating system and all of an enterprise or individual s data are stored. They are also one of the weakest links

More information

Self-Adaptive Disk Arrays

Self-Adaptive Disk Arrays Proc. 8 th International Symposium on Stabilization, Safety and Securiity of Distributed Systems (SSS 006), Dallas, TX, Nov. 006, to appear Self-Adaptive Disk Arrays Jehan-François Pâris 1*, Thomas J.

More information

Taurus Super-S3 LCM. Dual-Bay RAID Storage Enclosure for two 3.5-inch Serial ATA Hard Drives. User Manual March 31, 2014 v1.2 www.akitio.

Taurus Super-S3 LCM. Dual-Bay RAID Storage Enclosure for two 3.5-inch Serial ATA Hard Drives. User Manual March 31, 2014 v1.2 www.akitio. Dual-Bay RAID Storage Enclosure for two 3.5-inch Serial ATA Hard Drives User Manual March 31, 2014 v1.2 www.akitio.com EN Table of Contents Table of Contents 1 Introduction... 1 1.1 Technical Specifications...

More information

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367 Operating Systems RAID Redundant Array of Independent Disks Submitted by Ankur Niyogi 2003EE20367 YOUR DATA IS LOST@#!! Do we have backups of all our data???? - The stuff we cannot afford to lose?? How

More information

Striped Set, Advantages and Disadvantages of Using RAID

Striped Set, Advantages and Disadvantages of Using RAID Algorithms and Methods for Distributed Storage Networks 4: Volume Manager and RAID Institut für Informatik Wintersemester 2007/08 RAID Redundant Array of Independent Disks Patterson, Gibson, Katz, A Case

More information

Energy aware RAID Configuration for Large Storage Systems

Energy aware RAID Configuration for Large Storage Systems Energy aware RAID Configuration for Large Storage Systems Norifumi Nishikawa norifumi@tkl.iis.u-tokyo.ac.jp Miyuki Nakano miyuki@tkl.iis.u-tokyo.ac.jp Masaru Kitsuregawa kitsure@tkl.iis.u-tokyo.ac.jp Abstract

More information

How To Write A Disk Array

How To Write A Disk Array 200 Chapter 7 (This observation is reinforced and elaborated in Exercises 7.5 and 7.6, and the reader is urged to work through them.) 7.2 RAID Disks are potential bottlenecks for system performance and

More information

Disk Array Data Organizations and RAID

Disk Array Data Organizations and RAID Guest Lecture for 15-440 Disk Array Data Organizations and RAID October 2010, Greg Ganger 1 Plan for today Why have multiple disks? Storage capacity, performance capacity, reliability Load distribution

More information

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems Chapter 10: Mass-Storage Systems Physical structure of secondary storage devices and its effects on the uses of the devices Performance characteristics of mass-storage devices Disk scheduling algorithms

More information

Increasing the capacity of RAID5 by online gradual assimilation

Increasing the capacity of RAID5 by online gradual assimilation Increasing the capacity of RAID5 by online gradual assimilation Jose Luis Gonzalez,Toni Cortes joseluig,toni@ac.upc.es Departament d Arquiectura de Computadors, Universitat Politecnica de Catalunya, Campus

More information

Moving Beyond RAID DXi and Dynamic Disk Pools

Moving Beyond RAID DXi and Dynamic Disk Pools TECHNOLOGY BRIEF Moving Beyond RAID DXi and Dynamic Disk Pools NOTICE This Technology Brief contains information protected by copyright. Information in this Technology Brief is subject to change without

More information

Flash Memory Technology in Enterprise Storage

Flash Memory Technology in Enterprise Storage NETAPP WHITE PAPER Flash Memory Technology in Enterprise Storage Flexible Choices to Optimize Performance Mark Woods and Amit Shah, NetApp November 2008 WP-7061-1008 EXECUTIVE SUMMARY Solid state drives

More information

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS TECHNOLOGY BRIEF August 1999 Compaq Computer Corporation Prepared by ISSD Technology Communications CONTENTS Executive Summary 1 Introduction 3 Subsystem Technology 3 Processor 3 SCSI Chip4 PCI Bridge

More information

Q & A From Hitachi Data Systems WebTech Presentation:

Q & A From Hitachi Data Systems WebTech Presentation: Q & A From Hitachi Data Systems WebTech Presentation: RAID Concepts 1. Is the chunk size the same for all Hitachi Data Systems storage systems, i.e., Adaptable Modular Systems, Network Storage Controller,

More information

Embedded Systems Lecture 9: Reliability & Fault Tolerance. Björn Franke University of Edinburgh

Embedded Systems Lecture 9: Reliability & Fault Tolerance. Björn Franke University of Edinburgh Embedded Systems Lecture 9: Reliability & Fault Tolerance Björn Franke University of Edinburgh Overview Definitions System Reliability Fault Tolerance Sources and Detection of Errors Stage Error Sources

More information

RAID: Redundant Arrays of Independent Disks

RAID: Redundant Arrays of Independent Disks RAID: Redundant Arrays of Independent Disks Dependable Systems Dr.-Ing. Jan Richling Kommunikations- und Betriebssysteme TU Berlin Winter 2012/2013 RAID: Introduction Redundant array of inexpensive disks

More information

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (http://www.cs.princeton.edu/courses/cos318/)

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (http://www.cs.princeton.edu/courses/cos318/) COS 318: Operating Systems Storage Devices Kai Li Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Today s Topics Magnetic disks Magnetic disk performance

More information

Guide to SATA Hard Disks Installation and RAID Configuration

Guide to SATA Hard Disks Installation and RAID Configuration Guide to SATA Hard Disks Installation and RAID Configuration 1. Guide to SATA Hard Disks Installation... 2 1.1 Serial ATA (SATA) Hard Disks Installation... 2 2. Guide to RAID Configurations... 3 2.1 Introduction

More information

Best Practices RAID Implementations for Snap Servers and JBOD Expansion

Best Practices RAID Implementations for Snap Servers and JBOD Expansion STORAGE SOLUTIONS WHITE PAPER Best Practices RAID Implementations for Snap Servers and JBOD Expansion Contents Introduction...1 Planning for the End Result...1 Availability Considerations...1 Drive Reliability...2

More information

William Stallings Computer Organization and Architecture 8 th Edition. External Memory

William Stallings Computer Organization and Architecture 8 th Edition. External Memory William Stallings Computer Organization and Architecture 8 th Edition Chapter 6 External Memory Types of External Memory Magnetic Disk RAID Removable Optical CD-ROM CD-Recordable (CD-R) CD-R/W DVD Magnetic

More information

Distribution One Server Requirements

Distribution One Server Requirements Distribution One Server Requirements Introduction Welcome to the Hardware Configuration Guide. The goal of this guide is to provide a practical approach to sizing your Distribution One application and

More information

CHAPTER 4 RAID. Section Goals. Upon completion of this section you should be able to:

CHAPTER 4 RAID. Section Goals. Upon completion of this section you should be able to: HPTER 4 RI s it was originally proposed, the acronym RI stood for Redundant rray of Inexpensive isks. However, it has since come to be known as Redundant rray of Independent isks. RI was originally described

More information

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology SSDs and RAID: What s the right strategy Paul Goodwin VP Product Development Avant Technology SSDs and RAID: What s the right strategy Flash Overview SSD Overview RAID overview Thoughts about Raid Strategies

More information

CS420: Operating Systems

CS420: Operating Systems NK YORK COLLEGE OF PENNSYLVANIA HG OK 2 RAID YORK COLLEGE OF PENNSYLVAN James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz,

More information

"Reliability and MTBF Overview"

Reliability and MTBF Overview "Reliability and MTBF Overview" Prepared by Scott Speaks Vicor Reliability Engineering Introduction Reliability is defined as the probability that a device will perform its required function under stated

More information

Version : 1.0. SR3620-2S-SB2 User Manual. SOHORAID Series

Version : 1.0. SR3620-2S-SB2 User Manual. SOHORAID Series Version : 1.0 SR3620-2S-SB2 User Manual SOHORAID Series Introduction About this Manual Thank you for using the product of RAIDON Technology Inc. This user manual will introduce the STARDOM SR3620-2S-SB2

More information

An Introduction to RAID 6 ULTAMUS TM RAID

An Introduction to RAID 6 ULTAMUS TM RAID An Introduction to RAID 6 ULTAMUS TM RAID The highly attractive cost per GB of SATA storage capacity is causing RAID products based on the technology to grow in popularity. SATA RAID is now being used

More information

Technology Update White Paper. High Speed RAID 6. Powered by Custom ASIC Parity Chips

Technology Update White Paper. High Speed RAID 6. Powered by Custom ASIC Parity Chips Technology Update White Paper High Speed RAID 6 Powered by Custom ASIC Parity Chips High Speed RAID 6 Powered by Custom ASIC Parity Chips Why High Speed RAID 6? Winchester Systems has developed High Speed

More information

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1 RAID HARDWARE On board SATA RAID controller SATA RAID controller card RAID drive caddy (hot swappable) Anne Watson 1 RAID The word redundant means an unnecessary repetition. The word array means a lineup.

More information

ICMP HDD. Installation manual

ICMP HDD. Installation manual ICMP HDD Installation manual R5905769/02 17/04/2015 Barco nv Noordlaan 5, B-8520 Kuurne Phone: +32 56.36.82.11 Fax: +32 56.36.883.86 Support: www.barco.com/en/support Visit us at the web: www.barco.com

More information

Storage Options for Document Management

Storage Options for Document Management Storage Options for Document Management Document management and imaging systems store large volumes of data, which must be maintained for long periods of time. Choosing storage is not simply a matter of

More information

Why disk arrays? CPUs improving faster than disks

Why disk arrays? CPUs improving faster than disks Why disk arrays? CPUs improving faster than disks - disks will increasingly be bottleneck New applications (audio/video) require big files (motivation for XFS) Disk arrays - make one logical disk out of

More information

RAID Levels and Components Explained Page 1 of 23

RAID Levels and Components Explained Page 1 of 23 RAID Levels and Components Explained Page 1 of 23 What's RAID? The purpose of this document is to explain the many forms or RAID systems, and why they are useful, and their disadvantages. RAID - Redundant

More information

Dell Reliable Memory Technology

Dell Reliable Memory Technology Dell Reliable Memory Technology Detecting and isolating memory errors THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS

More information

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems

Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems Fault Tolerance & Reliability CDA 5140 Chapter 3 RAID & Sample Commercial FT Systems - basic concept in these, as with codes, is redundancy to allow system to continue operation even if some components

More information

RAID Storage Systems with Early-warning and Data Migration

RAID Storage Systems with Early-warning and Data Migration National Conference on Information Technology and Computer Science (CITCS 2012) RAID Storage Systems with Early-warning and Data Migration Yin Yang 12 1 School of Computer. Huazhong University of yy16036551@smail.hust.edu.cn

More information

Maintenance Best Practices for Adaptec RAID Solutions

Maintenance Best Practices for Adaptec RAID Solutions Maintenance Best Practices for Adaptec RAID Solutions Note: This document is intended to provide insight into the best practices for routine maintenance of Adaptec RAID systems. These maintenance best

More information

How To Fix A Fault Fault Fault Management In A Vsphere 5 Vsphe5 Vsphee5 V2.5.5 (Vmfs) Vspheron 5 (Vsphere5) (Vmf5) V

How To Fix A Fault Fault Fault Management In A Vsphere 5 Vsphe5 Vsphee5 V2.5.5 (Vmfs) Vspheron 5 (Vsphere5) (Vmf5) V VMware Storage Best Practices Patrick Carmichael Escalation Engineer, Global Support Services. 2011 VMware Inc. All rights reserved Theme Just because you COULD, doesn t mean you SHOULD. Lessons learned

More information

Getting Started With RAID

Getting Started With RAID Dell Systems Getting Started With RAID www.dell.com support.dell.com Notes, Notices, and Cautions NOTE: A NOTE indicates important information that helps you make better use of your computer. NOTICE: A

More information

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect High-Performance SSD-Based RAID Storage Madhukar Gunjan Chakhaiyar Product Test Architect 1 Agenda HDD based RAID Performance-HDD based RAID Storage Dynamics driving to SSD based RAID Storage Evolution

More information

VIA / JMicron RAID Installation Guide

VIA / JMicron RAID Installation Guide VIA / JMicron RAID Installation Guide 1. Introduction to VIA / JMicron RAID Installation Guide. 3 2. VIA RAID Installation Guide. 3 2.1 VIA BIOS RAID Installation Guide.. 3 2.1.1 Introduction of RAID.

More information

Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com

Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...

More information

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University COS 318: Operating Systems Storage Devices Kai Li and Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall13/cos318/ Today s Topics! Magnetic disks!

More information