Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR
December 2012
Peter McGonigal

Abstract
Dynamic Disk Pools (DDP) offer an exciting new approach to traditional RAID sets by substantially improving rebuild times, limiting critical exposure during drive failures, reducing the performance penalty suffered during a rebuild, and significantly simplifying storage administration. This paper uses the IOR benchmark tool to compare the performance of DDP with traditional RAID-6 and describes the benefits DDP can offer.

v1.6
Acknowledgement
I would like to thank both Jerry Lohr and Scott Shaw for the help and advice they provided during this testing.
1.0 Introduction
The release of the SGI InfiniteStorage System Manager (ISSM) software provides a number of significant new features. This paper reports on one of those features, Dynamic Disk Pools (DDP), and specifically how DDP performs in comparison to RAID-6.

With RAID-6, data is striped across a set of physical drives, and every write requires two parity updates on different drives. This dual parity is used to store and recover data in the event of a single or dual drive failure; RAID-6 is designed to tolerate the failure of two drives. In addition, further drives can be reserved as hot spares to act as stand-by replacements for failing drives.

DDP uses an intelligent algorithm (a novel pseudo-random placement algorithm, CRUSH 1 ) that defines which drives are used and distributes data, spare capacity, and protection information accordingly. When the disk pool is created, it automatically includes all the storage needed to reconstruct and re-allocate (rebalance) data if drive failures occur. Depending on the size of the pool (i.e. the number of drives in the pool), DDP reserves a number of reconstruction locations known as the Preservation Capacity. The Preservation Capacity provides rebuild locations for potential drive failures, unlike RAID-6, which uses dedicated, stranded hot spares. Preservation Capacity is typically expressed as a drive count (e.g. a pool of 64 drives will have a preservation capacity of 4, meaning sufficient storage has been reserved to rebuild up to 4 failed drives). A pool can range from a minimum of 11 drives up to a fully configured storage array (e.g. 384 drives in the case of the SGI IS5500 Storage Array).
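As a concrete sketch of the Preservation Capacity arithmetic, using the 64-drive example above (the 300GB drive size is an assumption matching the test array described later, and the actual reserved drive count for a given pool size is determined by the firmware):

```shell
# Preservation Capacity is spread across all pool drives rather than
# parked on dedicated hot spares. Example figures from the text:
drives=64            # drives in the pool
preservation=4       # preservation capacity, expressed as a drive count
drive_gb=300         # per-drive capacity in GB (assumed for illustration)

# Capacity reserved for rebuilds vs. raw capacity left for data:
echo "reserved: $(( preservation * drive_gb )) GB"
echo "raw data: $(( (drives - preservation) * drive_gb )) GB"
```

Because every drive in the pool contributes a little of this reserved space, all drives participate in a rebuild, which is the basis of the faster reconstruction times measured in section 4.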
The main benefits of using DDP include:
Improved Data Protection
 o greater drive failure protection than traditional RAID-6, depending on the size of the pool
 o faster reconstruction after a drive failure than traditional RAID-6
Enhanced Performance Consistency
 o better performance than RAID-6 during drive failure and reconstruction
Simple Storage Management
 o no RAID groups to manage
 o no hot spares to allocate and manage
 o easier administration than RAID-6

1 Refer to for more details on CRUSH.
2.0 Test Environment
Both the DDP and RAID-6 configurations were set up using ISSM version G8.21. The test environment, including the hardware and storage setups, is described below.

2.1 Test Hardware
The test environment consisted of a single SGI C2108-RP2 server and a single SGI IS5500 Storage Array.

A single SGI InfiniteStorage 5500 (IS5500) Storage Array base enclosure configured with:
 o dual controllers
 o 60x Seagate ST SS (300GB 6Gb SAS 15K) drives
 o management software version: G8.21
 o controller firmware:
 o performance key enabled
 o total unconfigured capacity = 16TB

Connected to a single SGI Rackable C2108-RP2 server configured with:
 o 2x Intel SandyBridge E core processors
 o 384GB of memory
 o 4x dual-port 8Gb FC HBAs
 o software:
    SLES-11 SP2
    SGI Foundation Software 2.6
    SGI Performance Suite 1.4
    SGI XVM 2.6
    SGI MPI 1.4
    IOR

2.2 DDP Configuration
For the DDP test environment we configured the IS5500 Storage Array as follows:
A single DDP configured with:
 o 60x Seagate ST SS (300GB 6Gb SAS 15K) drives in the pool
 o Preservation Capacity = 3 drives
 o 8x volumes, each configured with 1230GB capacity
 o one FC 8Gb host path mapped to each volume
 o XVM used to stripe the 8x volumes together (/dev/lxvm/ddpvol)
The following ISSM GUI window provides a summary of the DDP test configuration used.

Note:
(1) Used the following xvm command to create the DDP volume. I was unsure whether this is the best way of handling volumes in a DDP:

    xvm stripe -unit <stripe-unit> -volname DDPvol slice/lun*

(2) Found setting up DDP more straightforward than setting up RAID-6 using ISSM.

2.3 RAID-6 Configuration
For the RAID-6 test environment we configured the IS5500 Storage Array as follows:
RAID-6 setup:
 o eight (8) RAID-6 (4+2) volume groups plus two hot spares
 o therefore using 50 (i.e. 48 in volume groups plus 2 hot spares) of the 60x Seagate ST SS (300GB 6Gb SAS 15K) drives available
 o one volume per volume group
 o one FC 8Gb host path mapped to each volume
 o XVM used to stripe the 8x volumes together (/dev/lxvm/r6vol)

The following ISSM window provides a summary of the RAID-6 test configuration used.

Note:
(1) Used the following xvm command to create the RAID-6 volume:

    xvm stripe -unit <stripe-unit> -volname R6vol slice/lun*

2.4 Volume Initialization
The time taken to initialize each of the eight (8) DDP volumes was on average 3 hours and 28 minutes; the maximum time for a single volume was 4 hours and 15 minutes and the minimum was 3 hours and 10 minutes. With the eight volumes being initialized in parallel, the total time to complete initialization was 4 hours and 17 minutes.

Initialization times do not hold up immediate use of the storage, since both DDP and RAID-6 volumes support Immediate Availability Format (IAF). The IAF feature provided by SANtricity allows users to access volumes during initialization. This enables file-systems and applications to be set up while the volumes initialize, giving users zero wait time. Initialization does have some impact on performance, so full performance will not be reached until initialization has completed.

2.5 IOR
The IOR (v2.10.3) I/O benchmark tool was used for all performance testing. Testing conducted on both the RAID-6 and DDP setups included:
 o sequential reads and writes
 o random reads and writes
 o sequential reads and writes while manually failing a single drive and initiating reconstruction

Each IOR run generated 16 I/O threads, with each I/O thread pinned to one processor core.
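The full scripts appear in the Appendix. A minimal sketch of an equivalent IOR invocation is shown below; the mount point, file name, and per-task block size are assumptions for illustration, and MPI_DSM_CPULIST is the SGI MPT mechanism assumed here for pinning MPI ranks to cores:

```shell
# Sketch of a sequential IOR run: 16 MPI tasks, each pinned to one core.
# -a POSIX : I/O API (MPI-IO runs would use -a MPIIO instead)
# -w -r    : perform both write and read phases
# -e       : fsync after the write phase to flush caches
# -t 4m    : transfer (I/O) size; the tests varied this from 1k to 4m
# -b 4g    : amount of data written per task (illustrative assumption)
# -o ...   : target file on the striped XVM volume (assumed mount point)
export MPI_DSM_CPULIST=0-15   # SGI MPT: pin MPI ranks 0..15 to cores 0..15
mpirun -np 16 ./IOR -a POSIX -w -r -e -t 4m -b 4g -o /mnt/ddpvol/ior.testfile
```

The same shape of command, with the I/O size swept across the range of interest, produces the bandwidth curves plotted in the following sections.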
3.0 DDP vs RAID-6: Testing Sequential and Random READs and WRITEs

3.1 Sequential READs and WRITEs
This testing compares DDP and RAID-6 sequential I/O performance on the IS5500 Storage Array, running IOR to generate sequential READs and WRITEs with I/O sizes ranging from 1KiB to 4MiB, in both POSIX and MPI-IO modes. The results of these tests are plotted in the graphs below.
Note:
(1) RAID-6 usually provides better sequential READ bandwidth than DDP.
(2) DDP provides better sequential WRITE bandwidth than RAID-6.
(3) For some reason, sequential WRITE bandwidth is better than sequential READ bandwidth when using the MPI-IO API.

3.2 Random READs and WRITEs
This testing compares DDP and RAID-6 random I/O performance on the IS5500 Storage Array, running IOR to generate random READs and WRITEs with I/O sizes ranging from 64KiB to 4MiB, in both POSIX and MPI-IO modes.

Note:
(1) Didn't conduct any random READs and WRITEs using small I/O sizes. Initial testing showed that random small-I/O READs and WRITEs would take far too long, so testing started from 64KiB I/O sizes.
(2) Random WRITE bandwidth is nearly always better than random READ bandwidth (the exception being 4MiB POSIX RAID-6 random READs, which provide better bandwidth than the corresponding random WRITEs).
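The random runs can be sketched in the same way as the sequential ones; IOR's -z option randomizes the offset of each transfer (paths and block size are again assumptions for illustration):

```shell
# Sketch of a random IOR run: -z randomizes transfer offsets.
# Transfer sizes from 64 KiB up to 4 MiB were tested; 64k shown here.
export MPI_DSM_CPULIST=0-15   # SGI MPT: pin MPI ranks to cores
mpirun -np 16 ./IOR -a POSIX -w -r -e -z -t 64k -b 1g -o /mnt/r6vol/ior.testfile
```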
4.0 DDP vs RAID-6: Testing Performance during a Single Drive Failure
The testing also included forcing a single drive failure and reconstruction while running IOR to generate sequential READs and WRITEs with an I/O size of 4MiB, recording both the time taken to reconstruct/rebalance the Storage Array and the (degraded) READ and WRITE performance during reconstruction.

4.1 Single Drive Failure and Reconstruction using DDP
The graphs show a slight drop in read and write performance during the period of the single drive failure and disk pool reconstruction/rebalancing. The default rebuild/rebalance priority settings were used.
The following ISSM GUI window shows the event log, which provides details from the beginning of the single drive failure through to the completion of the pool reconstruction. As can be seen from the log, the total time taken from the start of the drive failure through to the completion of the disk pool reconstruction was approximately 36 minutes (refer to the red-circled log entries).
The following ISSM GUI window shows the DDP hardware drive layout during the single drive failure. It shows that the drive in drawer 3, slot 4 has failed.

4.2 Single Drive Failure and Hot-Spare Reconstruction using RAID-6
The plot shows a slight drop in read and write performance during the period of the single drive failure and hot-spare reconstruction.
The following ISSM GUI window shows the event log with the date and time stamps from the beginning of the single drive failure through to the completion of the hot-spare reconstruction. As can be seen from the log, the total time taken from the drive failure through to the completion of reconstruction was approximately 5 hours and 12 minutes (refer to the red-circled log entries).

The following ISSM GUI window shows the RAID-6 hardware drive layout during the single drive failure. It shows:
 o the drive in drawer 3, slot 2 has failed
 o the drive in drawer 5, slot 10 is a hot spare and is currently active
 o the drive in drawer 5, slot 8 is another hot spare, currently in stand-by
Note:
(1) The chart plotting RAID-6 performance during the single drive failure and hot-spare reconstruction doesn't cover the full period of reconstruction. The chart only plots up to 2:23:04 am on the 30th of November; the reconstruction didn't complete until 3:01:24 am on the 30th of November.
(2) Reconstruction from a single drive failure was nearly nine (9) times faster for DDP than for RAID-6 (i.e. 312 minutes for RAID-6 versus 36 minutes for DDP).
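The nine-times figure follows directly from the two logged durations:

```shell
# Rebuild-time comparison from the event logs:
# RAID-6 hot-spare reconstruction took 5 h 12 min; DDP rebalance took 36 min.
raid6_min=$(( 5 * 60 + 12 ))
ddp_min=36
echo "RAID-6: ${raid6_min} minutes"
awk -v a="$raid6_min" -v b="$ddp_min" 'BEGIN { printf "speedup: %.1fx\n", a / b }'
# prints "RAID-6: 312 minutes" then "speedup: 8.7x"
```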
5.0 Summary
In conclusion, the testing clearly shows that DDP is the winner when it comes to reducing the time to recover from a single drive failure (approximately 9 times faster). In regards to bandwidth while performing I/O:
 o RAID-6 is generally slightly better than DDP for sequential READs.
 o DDP is generally slightly better than RAID-6 for sequential WRITEs.
 o RAID-6 is better under some circumstances for random READs and random WRITEs.
 o DDP shows slight performance degradation while experiencing a single drive failure and performing a reconstruction with default priority settings.
 o RAID-6 appears to have minimal to no performance impact while experiencing a single drive failure and performing a hot-spare reconstruction with default priority settings.

Further testing is required to see the impact on performance when handling single drive failures and hot-spare reconstructions with RAID-6 under other workload scenarios, and to see the impact on performance for both RAID-6 and DDP during dual drive failures and reconstructions.
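A side note on capacity: from the volume sizes in the two test configurations (eight 1230GB DDP volumes versus eight RAID-6 (4+2) groups of 300GB drives), the usable capacities of the two setups are roughly comparable:

```shell
# Usable capacity implied by the two test configurations.
ddp_gb=$(( 8 * 1230 ))        # eight 1230 GB DDP volumes
raid6_gb=$(( 8 * 4 * 300 ))   # eight (4+2) groups: 4 data drives of 300 GB each
echo "DDP usable:    ${ddp_gb} GB"     # prints 9840 GB
echo "RAID-6 usable: ${raid6_gb} GB"   # prints 9600 GB
```

So the rebuild-time and administration benefits of DDP in this test did not come at a cost in usable capacity.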
Appendix
The following scripts were used during testing to run the IOR benchmark.

IOR script used for sequential READs & WRITEs

IOR script used for random READs & WRITEs

IOR script used during drive failure and reconstruction