NAND Flash Media Management Through RAIN



Similar documents
HP Smart Array Controllers and basic RAID performance factors

A Comparison of Client and Enterprise SSD Data Path Protection

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

Automated Data-Aware Tiering

Storing Data: Disks and Files

RAID Implementation for StorSimple Storage Management Appliance

RAID 6 with HP Advanced Data Guarding technology:

Flash-optimized Data Progression

RAID Basics Training Guide

XtremIO DATA PROTECTION (XDP)

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

WHITE PAPER. Drobo TM Hybrid Storage TM

Technologies Supporting Evolution of SSDs

NAND Flash Architecture and Specification Trends

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

PIONEER RESEARCH & DEVELOPMENT GROUP

Accelerating Server Storage Performance on Lenovo ThinkServer

OCZ s NVMe SSDs provide Lower Latency and Faster, more Consistent Performance

Sistemas Operativos: Input/Output Disks

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

Flash Controller Architecture for All Flash Arrays

Chapter 9: Peripheral Devices: Magnetic Disks

SSD Performance Tips: Avoid The Write Cliff

The Pitfalls of Deploying Solid-State Drive RAIDs

EMC XTREMIO EXECUTIVE OVERVIEW

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

Don t Let RAID Raid the Lifetime of Your SSD Array

Assessing RAID ADG vs. RAID 5 vs. RAID 1+0

Data Storage - II: Efficient Usage & Errors

NAND Basics Understanding the Technology Behind Your SSD

Why disk arrays? CPUs improving faster than disks

Real-Time Storage Tiering for Real-World Workloads

Flash Memory Technology in Enterprise Storage

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

In-Block Level Redundancy Management for Flash Storage System

How To Write A Disk Array

Definition of RAID Levels

An In-Depth Look at Variable Stripe RAID (VSR)

SSDs tend to be more rugged than hard drives with respect to shock and vibration because SSDs have no moving parts.

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology

RAID-DP: NetApp Implementation of Double- Parity RAID for Data Protection

89 Fifth Avenue, 7th Floor. New York, NY White Paper. HP 3PAR Adaptive Flash Cache: A Competitive Comparison

ioscale: The Holy Grail for Hyperscale

RealStor 2.0 Provisioning and Mapping Volumes

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

Silent data corruption in SATA arrays: A solution

PowerVault MD1200/MD1220 Storage Solution Guide for Applications

Nasir Memon Polytechnic Institute of NYU

Frequently Asked Questions March s620 SATA SSD Enterprise-Class Solid-State Device. Frequently Asked Questions

Lecture 36: Chapter 6

Evaluation Report: Supporting Microsoft Exchange on the Lenovo S3200 Hybrid Array

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Comparison of NAND Flash Technologies Used in Solid- State Storage

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

Flash 101. Violin Memory Switzerland. Violin Memory Inc. Proprietary 1

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

Data Distribution Algorithms for Reliable. Reliable Parallel Storage on Flash Memories

RAID 5 rebuild performance in ProLiant

Ensuring Robust Data Integrity

Solid State Drive Technology

How To Improve Write Speed On An Nand Flash Memory Flash Drive

All-Flash Storage Solution for SAP HANA:

How to recover a failed Storage Spaces

WHITEPAPER It s Time to Move Your Critical Data to SSDs

Understanding endurance and performance characteristics of HP solid state drives

Comprehending the Tradeoffs between Deploying Oracle Database on RAID 5 and RAID 10 Storage Configurations. Database Solutions Engineering

Benefits of Intel Matrix Storage Technology

Energy aware RAID Configuration for Large Storage Systems

RAID Overview

Storage Technologies for Video Surveillance

Software-defined Storage at the Speed of Flash

Guide to SATA Hard Disks Installation and RAID Configuration

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

Important Differences Between Consumer and Enterprise Flash Architectures

RAID by Sight and Sound

Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

Solid State Drive vs. Hard Disk Drive Price and Performance Study

Enhancing IBM Netfinity Server Reliability

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE. Matt Kixmoeller, Pure Storage

Reduce Latency and Increase Application Performance Up to 44x with Adaptec maxcache 3.0 SSD Read and Write Caching Solutions

Using Synology SSD Technology to Enhance System Performance. Based on DSM 5.2

Optimizing SQL Server Storage Performance with the PowerEdge R720

Solid State Drive (SSD) FAQ

Disk Storage & Dependability

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS

DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

Best Practices for Deploying Citrix XenDesktop on NexentaStor Open Storage

Frequently Asked Questions August s840z SAS SSD Enterprise-Class Solid-State Device. Frequently Asked Questions

Chapter 10: Mass-Storage Systems

How To Write On A Flash Memory Flash Memory (Mlc) On A Solid State Drive (Samsung)

Best Practices RAID Implementations for Snap Servers and JBOD Expansion

VERY IMPORTANT NOTE! - RAID

Transcription:

Flash Media Management Through RAIN Scott Shadley, Senior roduct Marketing Manager Micron Technology, Inc. Technical Marketing Brief What Is RAIN, and Why Do SSDs Need It? This brief compares redundant array of independent (RAIN) protection in enterprise solid state drives (SSDs) to the redundant array of independent (and inexpensive) disks (RAID) protection scheme for hard disk drives (HDDs) and details how RAIN technology benefits systems using Micron s SSD products. Existing rotection Techniques in Enterprise Storage In the early days of using hard disk drives (HDDs), it became apparent that a single device had enough potential for failure to warrant the creation of a protection scheme to prevent unwanted data loss. Several data protection mechanisms were developed, including the RAID protection scheme. Each mechanism implemented a slightly different form in terms of how it worked and its impact on array performance. For HDD arrays, the most common implementation became RAID 5 (rotating single-bit parity) because it protected a single drive from failing with minimal impact on performance. Figure shows an example of how RAID 5 works. In this example, parity data is rotated across four unique disks. The parity (in red) represents the data required for the system to recover any of the actual blocks of stored information. As with all parity protection, this parity occupies space on the disks, and therefore, affects the overall capacity of the array, which is important when considering how this technique and protection scheme can be applied to an SSD. Block a Block 2a Block 3a arity Block b Block 2b arity Block 4a Block c arity Block 3b Block 4b arity Block 2c Block 3c Block 4c Disk Disk 2 Disk 3 Disk 4 Figure : RAID 5 in HDDs

Flash Media Management Through RAIN ECC Is No Longer Enough to rotect in SSDs Like HDDs, SSD devices also have the potential for failure and data loss. The original method for protecting data in SSD designs was simply to add the required levels of error correction code (ECC) to the given pages and then recover data using ECC. When the first SSDs were shipped, this initial level of protection was sufficient to minimize the chance of failure, much like with the early HDD products. But as geometries continue to shrink, and controller complexity increases, a protection scheme beyond simple ECC is needed like RAID was needed for HDDs. Similar solutions to protect the user data in SSDs have emerged as the market matures, but unlike with HDD arrays, where the industry collectively settled on a very few specific versions of array-level RAID implementations (most commonly, single-bit parity RAID 5 and double-bit parity RAID both rotating), SSDs utilize approaches at the device level. Thus, no two versions of SSD data protection technology are guaranteed to be exactly the same. With many competitive architectures being created by multiple SSD manufacturers, it is important to understand the details of the approach being used. These details can be thought of as tools in the SSD designer s toolkit and can be tuned and adjusted to best match the SSD design to the intended use, workload, endurance, and a host of other factors. Data Data Data ECC protection on every page Figure 2: Original method of SSD protection Required levels of ECC added to every page To help differentiate between array-level (traditional HDD) parity protection and SSD-specific protection, the term RAIN is used in reference to SSD protection. SSD Architecture Review To show how RAIN can be beneficial in SSD design and to show the relevance of parity-based RAIN, it is important to first look at how data is organized in the SSD. Unlike HDD designs that have rotational storage organizations (platter, track, sector), SSDs use Flash devices as the storage media inside, enabling new design flexibility. The SSD layout in Figure 3 shows some similarities between how SSD devices are arranged compared to the multiple-drive arrangement of HDD arrays. This high level of similarity suggests possible integration of RAID into an single SSD device rather than using many individual drives. This represents the first level at which RAIN comes into play within a RAIN-enabled SSD. Most SSD controllers use parallelism in order to increase SSD performance and locate stored data across the many smaller Flash devices. For example, to create a 350GB SSD, smaller 8GB devices, along with parallel data placement and retrieval, are used to create this individual drive capacity. 2

Flash Media Management Through RAIN Controller CB Figure 3a: hysical device arrangement on an SSD Controller Figure 3b: Logical data placement/arrangement on an SSD This parallelism used in SSDs works like an HDD RAID 0 array where data is spread across multiple HDDs in the array. This is also referred to as striped data, without parity. The data architecture inside a component can be organized in many ways, and each controller s architecture and firmware can have a unique implementation. Moving data the most efficient way possible is the key to a successful design. The better the methodology, the faster and more compelling a user will find the end product. How Does RAIN Work in Micron SSDs? RAIN technology adds user data protection that extends beyond ECC, minimally impacts drive performance, and optimizes management. With a high degree of parallelism already in place within the SSD (striping), adding a parity protection architecture is the next logical step. However, there are many possible ways to implement this parity protection scheme each with potential tradeoffs that must be considered when designing the SSD. 3

Flash Media Management Through RAIN X 0.8X ~45% (0.8X) x 0.85 Total capacity User capacity reduction due to overprovisioning User capacity reduction after addition of RAIN Figure 4: How implementing RAIN can affect the end-user capacity of the drive Figure 4 shows a hypothetical example of how implementing a protection scheme like RAIN can impact the amount of capacity available to the end user. Like in the HDD environment, where this is already standard practice, having SSD capacity consumed by a protection scheme is an acceptable tradeoff for the data protection benefits. The example above assumes the following: SSD RAW capacity: X = 52GB Overprovisioning level: Z = 0.8 ((00-22)/00) RAIN design with parity element for each storage elements: Y = /8 (a : ratio) = 0.85 Therefore: User-available capacity of 52GB drive: 52GB x 0.85 x 0.8 = 350GB Rain Options for SSD Designers Unlike HDD arrays where the most common descriptions are the parity-generated and parity-placement mechanisms (for example RAID 5 or RAID 0), the common descriptions associated with RAIN differ slightly. Most commonly, the description states the stripe level or stripe length, which expresses how many user data elements are associated with a single parity element. This is often referred to as N to where N is the number of user data elements. For example, if an SSD associates seven user data elements to one parity element, that SSD would be described as having a : RAIN stripe. This specific RAIN stripe implementation is shown below in Figure 5a. The in this diagram represents the parity element being generated and stored along with the user data it is protecting. 0 2 3 4 5 8 9 0 2 3 4 5 28 29 30 3 Figure 5a: : RAIN stripes 4

Flash Media Management Through RAIN 0 2 3 4 5 4 5 8 9 20 2 22 23 24 25 Figure 5b: 5: RAIN stripes 0 2 3 4 5 8 9 0 2 3 4 5 2 2 Figure 5c: 2: RAIN stripes Other RAIN stripes exist, as shown above in Figure 5b (5: stripe) and Figure 5c (28: stripe), and work similarly. It is important to understand that the stripe length is a parameter that is fixed when the SSD is designed and is optimized for a myriad of factors. Many different parity lengths can be considered and implemented, and each have a unique impact on the overall design, capacity, and, potentially, performance of the drive. In the case of Micron s SSD firmware development, the differences in the performance of these data stripes are carefully characterized to determine the best performance of the SSD and the best protection method required by the SSD design fundamentals. Table shows how data stripe lengths were considered during the design of Micron s 320h CIe SSD. Only the + version of the parity architecture has an impact on performance. This method of protection consumes 50% of the in parity; therefore, it does not provide a viable choice for SSD design because it is simply not economical. Attribute Sequential read @ 4KB transfer size Sequential write @ 4KB transfer size Random read @ 4KB transfer size Random write @ 4KB transfer size (full drive) 25GB Raw Capacity SSD No arity + 3 + +.5 GB/s.5 GB/s.5 GB/s.2 GB/s.5 GB/s.5 GB/s.5 GB/s.2 GB/s 35,000 35,000 35,000 300,000,000,000 43,000 94,000 Table : 320h CIe SSD solution data stripe consideration 5

Flash Media Management Through RAIN By definition, RAIN is a redundant array of independent, and taking this definition literally is the most efficient way to discern what is occurring in the SSD. While most simple SSD designs are focused only on performance (parallelism) to the point of failure, Micron has developed a solution that ensures independence as well as parallelism, as shown in the following example. 0 2 3 4 5 Example of Failure Recovery With RAIN Sample : Rain Stripe This example walks through a failure and recovery of the 320h drive in Table using a : data stripe; it shows how the new data structure will be intact and can be carried forward through the life of the SSD. (Additional details and unique features of RAIN and the SSD firmware management are not discussed here due to the in-depth technical aspects and Micron-specific I. These details are available through more direct discussions with Micron.) In this example, each user data element number is shown as an individual data point. In actual drive applications, the level of data recovery can also be on either a smaller, individual page or a much larger failure recovery. 0 2 3 4 5 Stripe Failure While data is being moved around within the SSD through either host-initiated data writes or drive-level background operations, RAIN successfully recovers data, as shown in Figure.. As data being written to the drive goes through the process of being programmed, data is arranged so that parity can be created and stored in the eighth page of this : example stripe. Over time, as ECC corrects data, RAIN does not need to be applied for any form of data recovery. RAIN is focused on defects that occur due to unexpected issues; it is also focused on defects that occur as the device wears and ECC limits are exceeded. 2. As data is read from the drive, it is processed through the RAIN engine. In the event that any portion of the stripe is flagged as an unrecoverable error, the primary RAIN algorithm kicks in and recovers the data via the parity process of using an XOR algorithm on the remaining data. In the event that this data failure is a read process only, the newly recovered data is then stored in a new location on the drive, and other data is moved around in the drive as part of 0 3 4 Surviving Data (After Failure) Figure : RAIN data recovery during data moves the normal background operations. Instead of simply moving existing data and replacing it with new write data, the newly recovered data is instantaneously moved into normal drive management routines. 3. The existing location where the failure occurs is not instantaneously moved around within the drive. That type of recovery is suboptimal, and, although effective, it creates unnecessary wear and write amplification. Micron s RAIN has the ability to leave the existing data intact until it is absolutely necessary to reprogram the entire block. This has no impact on 5

Flash Media Management Through RAIN performance or drive data management, which is a key feature of the independent nature of the solution. 4. Data is then gathered by the controller to indicate that this recovery has occurred. It is managed by the internal processes of the drive. Conclusion Data integrity is a key attribute that enterprise customers require, and potential customers prefer the assurance that nothing will be lost in their data stores. Current solutions are focused heavily on data reliability. With HDD solutions, RAID is not optional, but required, to ensure this reliability. These solutions focus on ways to store more data in smaller, redundant locations. The continued use of tape-based storage design is another example of the need for secure data storage. I stands for inexpensive. Therefore, implementers need solutions that do not require them to create data sets that need redundant safety drives. Finding ways to solve this unusual situation can best be addressed by looking inside the SSD, rather than at ways to add layers on top of the use case. Micron s RAIN provides this internal solution to customers. SSDs are more robust than HDD solutions currently in use, but, as with all technologies, there is always a small chance to experience issues. RAIN is that next layer within the SSD that provides customers the peace of mind to implement solutions with fewer SSDs, yet still have existing HDD RAID protection levels. Technology will continue to advance both inside the SSD and in the end-storage solutions. Micron s goal is to ensure that features like RAIN are always one step ahead of customers growing needs for cost-effective data storage solutions with high reliability and endurance. SSDs have moved into enterprise data environments, and they no longer fit the definition of RAID where the micron.com roducts are warranted only to meet Micron s production data sheet specifications. roducts and specifications are subject to change without notice. 20 Micron Technology, Inc. Micron, and the Micron logo are trademarks of Micron Technology, Inc. All other trademarks are the property of their respective owners. All rights reserved. /3 :44