Whitepaper

Abstract

This whitepaper describes the procedure for rebuilding a degraded RAID. It covers the RAID levels and their rebuild procedures, the rebuild time and the factors it depends on, and provides hints for estimating the rebuild time. The whitepaper assumes general knowledge of storage systems, storage system architecture and RAID technology.
Table of Contents

1. Product families covered by this document
2. What happens if one or more disks in a RAID array fail?
   2.1 Disk failure in a RAID 1 or RAID 1+0
   2.2 Disk failure in a RAID 3 or RAID 3+0 (Dedicated Parity RAID)
   2.3 Disk failure in a RAID 5 or RAID 6 (Distributed Parity RAID)
   2.4 Disk failure in a RAID 5 with hot-spare disk
3. The key factors influencing the rebuild time
   3.1 The computing power of the RAID controller
       3.1.1 The computing power bottleneck
       3.1.2 Recommended configurations for EonStor DS G6 and G7 storage systems
   3.2 The write performance of a single drive in a RAID array
       3.2.1 NL-SAS vs. SATA III hard drive
       3.2.2 SATA vs. SATA III hard drive with Native Command Queuing (NCQ) enabled
       3.2.3 The drive bottleneck
   3.3 The capacity of a failed hard drive
4. Estimating the rebuild time
   4.1 Rebuild time, if the drive write throughput is the bottleneck
   4.2 Rebuild time, if the computing power is the bottleneck
5. Appendix
   5.1 Infortrend web links
   5.2 Drive vendor links
1. Product families covered by this document

This document applies to the following Infortrend storage system families:

EonStor DS G6 Family
EonStor DS G7 Family

For more information regarding individual product models, please visit www.infortrend.com.
2. What happens if one or more disks in a RAID array fail?

Single disks or single-level striped arrays (RAID 0) lose all data if a disk fails. You have to replace the failed disk and then restore the lost data from a backup before you can continue operation. Opting for a mirrored array (RAID 1 or RAID 1+0) or a parity array (RAID 5 / 6 and RAID 5+0 / 6+0) solves this problem, despite lower performance and a higher TCA / TCO. There is an extremely small chance that two disks fail simultaneously; if data protection and integrity are the highest priority and cost and performance are secondary, consider using RAID 6 or RAID 6+0 to be prepared for that eventuality.

Figure 1: RAID 0, disk failure

2.1 Disk failure in a RAID 1 or RAID 1+0

The failed disk has to be replaced by a new disk first. The RAID controller then writes all data from the mirror to the newly installed disk (the rebuild), which can be considered a disk copy from the mirror to the new disk. There is no need to restore a previously made backup, which might contain outdated data. The RAID controller copies the data in the background, so operation continues.

Figure 2: RAID 1, disk failure
2.2 Disk failure in a RAID 3 or RAID 3+0 (Dedicated Parity RAID)

If one disk fails, the data can be rebuilt by reading all remaining disks (all but the failed one) and writing the rebuilt data to the newly replaced disk; writing to this single disk is enough to rebuild the array. There are two cases to distinguish when rebuilding a degraded RAID (a RAID is degraded if one or more disks fail). If the dedicated parity disk fails, the rebuild is a matter of recalculating the parity information by reading all remaining data and writing the parity to the new dedicated disk. If a data disk fails, the data has to be rebuilt from the remaining data and the parity; this is the most time-consuming case when rebuilding a degraded RAID.

Figure 3: RAID 3, disk failure

2.3 Disk failure in a RAID 5 or RAID 6 (Distributed Parity RAID)

If a disk fails, the data can be rebuilt by reading all remaining disks, rebuilding the data, recalculating the parity information and writing the data and parity information to the new disk. This is time-consuming. The rebuild time is related to the drive capacity and the number of drives in the RAID array or RAID sub-arrays (RAID 5+0 or RAID 6+0), as well as to the computing power of the controller.

Figure 4: RAID 5, disk failure
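To illustrate the reconstruction described above: in a parity RAID, the parity block of each stripe is the XOR of the stripe's data blocks, so a failed disk's block can be recovered by XOR-ing the blocks of all surviving disks. The following is a minimal Python sketch of that calculation; the helper name and the toy two-byte blocks are ours, for illustration only:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into a single block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# One stripe with three data blocks; the parity block is their XOR.
data = [b"\x01\x02", b"\x0f\x00", b"\xa0\x55"]
parity = xor_blocks(data)

# If the disk holding data[1] fails, its block is the XOR of all survivors.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

The controller runs this XOR pass for every stripe of the failed drive, which is why the rebuild time scales with the drive capacity (see section 3.3).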
2.4 Disk failure in a RAID 5 with hot-spare disk

When a RAID array is not protected by a hot-spare disk, the failed disk has to be removed and replaced by a new one; the controller then detects the new disk and starts rebuilding the RAID array. Using a hot-spare disk avoids this replacement procedure: in case of a disk failure, the hot-spare disk is automatically incorporated into the RAID array and takes over the role of the failed disk.

Figure 5: RAID 5 + Hot-Spare, disk failure

3. The key factors influencing the rebuild time

There are three main factors that affect the rebuild time:

- The computing power of the RAID controller
- The write performance of a single drive in the RAID array
- The capacity of the failed drive

3.1 The computing power of the RAID controller

The computing power of a RAID controller is the maximum throughput per second at which it can run the XOR calculation. As shown in the table below, an EonStor DS G7 storage system reduces the rebuild time to 62% of that of an EonStor DS G6 storage system in a RAID 5 configuration with 15 hard drives.

Model | Computing power | Disk write performance (*1)   | Configuration (RAID 5) | Rebuild time  | Time saving
G6    | 850 MB/s        | NL-SAS Toshiba 1TB (122 MB/s) | 7 HDDs                 | 2 hr. 35 min. | -
G6    | 850 MB/s        | NL-SAS Toshiba 1TB (122 MB/s) | 15 HDDs                | 4 hr. 54 min. | -
G7    | 1350 MB/s       | NL-SAS Toshiba 1TB (122 MB/s) | 7 HDDs                 | 2 hr. 35 min. | 0%
G7    | 1350 MB/s       | NL-SAS Toshiba 1TB (122 MB/s) | 15 HDDs                | 3 hr. 5 min.  | 38%

Table 1: Computing power of the RAID controller

(*1) The WRITE throughput used here is the result of performance tests. Vendor numbers might not match reality!
3.1.1 The computing power bottleneck

The computing power of a RAID controller is limited (see Table 1). The workload of the RAID controller increases as more drives are added to a RAID array / RAID sub-array, and the rebuild time then increases significantly as well.

Example:

1. Configuration:
EonStor DS G6 computing power = 850 MB/s
EonStor DS G7 computing power = 1350 MB/s
RAID 5 array = 6 x SAS 6G 400 GB SSD with 253.04 MB/s max. seq. WRITE throughput

2. Calculation:
Max. total READ throughput = (drive # - failed drive #) * max. WRITE throughput
Max. total READ throughput = (6 - 1) * 253.04 MB/s = 1265.20 MB/s
Compare the computing power with the total READ throughput required during the rebuild process.

3. Result:
EonStor DS G6: 850 MB/s < 1265.20 MB/s -> computing power = bottleneck
EonStor DS G7: 1350 MB/s > 1265.20 MB/s -> drive write throughput = bottleneck

(A short script automating this check is shown at the end of section 3.)

3.1.2 Recommended configurations for EonStor DS G6 and G7 storage systems

1. EonStor DS G6:

Figure 6: ESDS G6 number of HDD recommendation

We recommend using 8 or fewer hard drives in a RAID array / sub-array for a suitable rebuild time. If you want to use more than 8 hard drives in a RAID 5 or RAID 6, we recommend using RAID 5+0 or RAID 6+0 and balancing the number of hard drives among the RAID sub-arrays (<= 8 hard drives per RAID sub-array).
2. EonStor DS G7:

Figure 7: ESDS G7 number of HDD recommendation

A suitable rebuild time is achieved using 16 or fewer hard drives in a RAID array or RAID sub-array. If you want to use a RAID 5 with more than 10 hard drives, we recommend using RAID 5+0. Even though the difference in rebuild time between 10 and 16 hard drives is quite small, we suggest using fewer than 10 drives in a RAID 5 array or RAID 5+0 sub-array to minimize the risk of a 2nd hard drive failure. For RAID 6 we recommend using 16 or fewer hard drives, as the RAID protects against 2 drive failures; consider using RAID 6+0 when using more than 16 hard drives.

3.2 The write performance of a single drive in a RAID array

The relevant write performance of a single drive in the RAID array is the maximum throughput per second that can be written to the hot-spare disk.

3.2.1 NL-SAS vs. SATA III hard drive

The table below shows the time saved using the same EonStor DS G7 storage system in RAID 5, once with NL-SAS hard drives and once with SATA III hard drives. Using NL-SAS hard drives reduces the rebuild time to 43% of the time needed with SATA III hard drives.

Model | Computing power | Write performance (*1)             | Configuration (RAID 5) | Rebuild time   | Time saving
G7    | 1350 MB/s       | SATA Hitachi 2TB (43 MB/s w/o NCQ) | 7 HDDs                 | 13 hrs 30 mins | -
G7    | 1350 MB/s       | NL-SAS Toshiba 2TB (122 MB/s)      | 7 HDDs                 | 5 hrs 50 mins  | 57%

Table 2: Write performance of NL-SAS vs. SATA hard drive
3.2.2 SATA vs. SATA III hard drive with Native Command Queuing (NCQ) enabled

A RAID controller that supports Native Command Queuing (NCQ) for SATA hard drives can reduce the rebuild time significantly compared to a RAID controller that does not support NCQ. The table shows an EonStor DS G6 storage system without NCQ support compared to an EonStor DS G7 storage system with NCQ enabled. The computing power is not relevant in this case, as the rebuild time using 7 hard drives is the same on EonStor DS G6 and G7 (see 3.1).

Model | Computing power | Write performance (*1)             | Configuration (RAID 5) | Rebuild time   | Time saving
G6    | 850 MB/s        | SATA Hitachi 2TB (43 MB/s w/o NCQ) | 7 HDDs                 | 13 hrs 30 mins | -
G7    | 1350 MB/s       | SATA Hitachi 2TB (130 MB/s w/ NCQ) | 7 HDDs                 | 4 hrs 54 mins  | 64%

Table 3: Write performance of SATA hard drive w/o NCQ vs. w/ NCQ

3.2.3 The drive bottleneck

In general, the maximum READ throughput of a single hard drive is higher than its maximum WRITE throughput.

NL-SAS drive           | Max. seq. READ throughput | Max. seq. WRITE throughput (*1)
Toshiba 1TB MK1001TRKB | 147 MB/s                  | 122 MB/s

Table 4: READ throughput vs. WRITE throughput

During the rebuild process the write throughput equals the per-disk read throughput: if one block is read from each single disk per second, one block is written to the hot-spare disk per second; if two blocks are read from each single disk per second, two blocks are written to the hot-spare disk per second.

Example:

1. Configuration:
A.) EonStor DS G7 with 7 x NL-SAS 6G HDD (1TB), max. seq. WRITE throughput 122 MB/s
B.) EonStor DS G7 with 7 x SATA III HDD (1TB), max. seq. WRITE throughput (w/o NCQ) 43 MB/s
RAID 5 array = 7 x HDD + 1 x hot-spare HDD
2. Calculation:
Max. total READ throughput = (drive # - failed drive #) * max. WRITE throughput
A.) Max. total READ throughput = (7 - 1) * 122 MB/s = 732 MB/s
B.) Max. total READ throughput = (7 - 1) * 43 MB/s = 258 MB/s

3. Result:
A.) 1350 MB/s > 732 MB/s -> max. total WRITE throughput = bottleneck
B.) 1350 MB/s > 258 MB/s -> max. total WRITE throughput = bottleneck

Configuration | Drive                 | Max. WRITE throughput (*1) | Rebuild time       | Test environment
A.)           | NL-SAS 6G Toshiba 1TB | 122 MB/s                   | 2 hours 09 minutes | EonStor DS G7, RAID 5, LD: 1, hard drives: 7, hot-spare: 1
B.)           | SATA III Hitachi 1TB  | 43 MB/s                    | 6 hours 09 minutes | EonStor DS G7, RAID 5, LD: 1, hard drives: 7, hot-spare: 1

Table 5: WRITE throughput SATA III hard drive vs. NL-SAS hard drive

3.3 The capacity of a failed hard drive

The capacity of the failed hard drive is another key factor affecting the rebuild time: the higher the capacity of the failed hard drive, the longer the rebuild takes. Using a hard drive with smaller capacity (e.g. a 1 TB instead of a 2 TB hard drive) reduces the rebuild time to roughly 44%, a time saving of 56% (see Table 6).

Model | Drive type          | Configuration (RAID 5) | Rebuild time  | Time saving
G7    | NL-SAS Toshiba 2TB  | 7 HDDs                 | 5 hrs 49 mins | -
G7    | NL-SAS Toshiba 1TB  | 7 HDDs                 | 2 hrs 35 mins | 56%

Table 6: Capacity of a failed hard drive
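Before estimating the rebuild time, it is useful to determine which of the two limits applies. Below is a minimal Python sketch, assuming the simple model from sections 3.1.1 and 3.2.3; the function name is ours and not part of any Infortrend tool:

```python
def rebuild_bottleneck(computing_power, drive_write, num_drives, num_failed=1):
    """Return which factor limits the rebuild (sections 3.1.1 / 3.2.3).

    computing_power: controller XOR throughput in MB/s
    drive_write:     max. sequential WRITE throughput of one drive in MB/s
    """
    # Total READ throughput required during rebuild, in MB/s.
    total_read = (num_drives - num_failed) * drive_write
    if computing_power < total_read:
        return "computing power"
    return "drive write throughput"

# Section 3.1.1 example: 6 x SAS SSD at 253.04 MB/s each.
print(rebuild_bottleneck(850, 253.04, 6))   # EonStor DS G6 -> computing power
print(rebuild_bottleneck(1350, 253.04, 6))  # EonStor DS G7 -> drive write throughput
```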
4. Estimating the rebuild time

The rebuild time can be estimated using the following formulas.

4.1 Rebuild time, if the drive write throughput is the bottleneck

The following formula is suitable for one of the following configurations (drives per RAID array / sub-array):

Drive type        | EonStor DS G7 | EonStor DS G6
SAS 6G SSD        | <= 6          | <= 3
NL-SAS HDD        | <= 11         | <= 7
SATA II / III HDD | <= 11         | <= 7

Table 7: Suitable configurations

Rebuild time (in seconds) = DC / DW

DC = drive capacity in MB (#GB * 1000^3 / 1024^2)
DW = drive WRITE throughput in MB/s (*1)

Example:

1. Configuration:

Drive                 | Max. WRITE throughput | Test environment
NL-SAS 6G Toshiba 1TB | 122 MB/s              | EonStor DS G7, RAID 5, LD: 1, hard drives: 7, hot-spare: 1

Table 8: Rebuild time example configuration

2. Calculation:
Rebuild time in minutes = (953,674 / 122) / 60

3. Result:
EonStor DS G7, rebuild time = 130 min.
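The same estimate as a small Python sketch, assuming the plain formula above; the helper names are ours:

```python
def capacity_mb(gigabytes):
    """Drive capacity in MB: #GB * 1000**3 bytes / 1024**2 bytes per MB."""
    return gigabytes * 1000**3 / 1024**2

def rebuild_minutes_write_bound(gigabytes, drive_write):
    """Section 4.1: rebuild time = DC / DW, converted to minutes."""
    return capacity_mb(gigabytes) / drive_write / 60

# 1 TB NL-SAS drive at 122 MB/s, as in Table 8:
print(round(rebuild_minutes_write_bound(1000, 122)))  # -> 130
```

Note that 1 TB = 1000 GB yields 953,674 MB with the binary-MB conversion used above, which is where the 953,674 in the calculation comes from.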
4.2 Rebuild time, if the computing power is the bottleneck

In all other cases the rebuild time can be estimated using the following formula.

Rebuild time (in seconds) = DC / (CP / (DN - DF))

DC = drive capacity in MB
DN = number of drives
DF = number of failed drives
CP = computing power (EonStor DS G6 = 850 MB/s, EonStor DS G7 = 1,350 MB/s)

Example:

1. Configuration:

Configuration | EonStor model | Max. computing power | Test environment
A.)           | G7            | 1350 MB/s            | Hitachi SSD 400GB, RAID 5, LD: 1, drives: 7, hot-spare: 1
B.)           | G6            | 850 MB/s             | Hitachi SSD 400GB, RAID 5, LD: 1, drives: 7, hot-spare: 1

Table 9: Rebuild time example configuration

2. Calculation:
A.) Rebuild time in minutes = (381,470 / (1350 / (7 - 1))) / 60
B.) Rebuild time in minutes = (381,470 / (850 / (7 - 1))) / 60

3. Result:
A.) EonStor DS G7, rebuild time = 28 min.
B.) EonStor DS G6, rebuild time = 45 min.
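And the corresponding Python sketch for the computing-power-bound case, again with names of our own choosing:

```python
def rebuild_minutes_cp_bound(gigabytes, computing_power, num_drives, num_failed=1):
    """Section 4.2: rebuild time = DC / (CP / (DN - DF)), in minutes."""
    dc = gigabytes * 1000**3 / 1024**2                        # drive capacity in MB
    per_drive = computing_power / (num_drives - num_failed)   # effective MB/s per drive
    return dc / per_drive / 60

# 400 GB SSDs in a 7-drive RAID 5, as in Table 9:
print(round(rebuild_minutes_cp_bound(400, 1350, 7)))  # A.) EonStor DS G7 -> 28
print(round(rebuild_minutes_cp_bound(400, 850, 7)))   # B.) EonStor DS G6 -> 45
```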
5. Appendix

5.1 Infortrend web links

Infortrend home: http://www.infortrend.com
Infortrend support and resources: http://www.infortrend.com/global/support/support
EonStor DS overview: http://www.infortrend.com/global/products/families/esds
EonStor DS G7 overview: http://www.infortrend.com/event2011/2011_global/201112_esds_g7/esds_g7.html

5.2 Drive vendor links

Links to the drive vendors whose drives were used for testing:

1. HGST:
HUSSL4040ASS600: http://www.hgst.com/solid-state-drives/ultrastar-ssd400s
HUSML4040ASS600: http://www.hgst.com/solid-state-drives/ultrastar-ssd400m
HUA723020ALA640: http://www.hgst.com/tech/techlib.nsf/techdocs/ec6d440c3f64dbcc8825782300026498/$file/US7K3000_ds.pdf

2. Toshiba:
MK1001TRKB: http://storage.toshiba.com/techdocs/mkx001trkb_2tskb_series_data_sheet.pdf
MK2001TRKB: http://storage.toshiba.com/techdocs/mkx001trkb_2tskb_series_data_sheet.pdf

Copyright 2012 Infortrend Technology, Inc. All rights reserved. Infortrend, ESVA, EonStor, EonNAS and EonPath are trademarks or registered trademarks of Infortrend. All other marks and names mentioned herein may be trademarks of their respective owners. The information contained herein is subject to change without notice. The content is provided "as is", without express or implied warranties of any kind.

WP_ED_2012035_GL_1.3