Chip-Level RAID with Flexible Stripe Size and Parity Placement for Enhanced SSD Reliability

Jaeho Kim (Postdoctoral researcher, Hongik Univ.)
Donghee Lee (Professor, Univ. of Seoul)
Sam H. Noh (Professor, Hongik Univ.)

Introduction & Motivation
- Flash SSD products with RAID-5-like data protection:
  - Fusion-io's ioMemory (Adaptive Flashback technology)
  - Baidu's Software-Defined Flash
  - Micron's P420m (Redundant Array of Independent NAND technology)
  - Shannon Systems' Direct-IO

Introduction & Motivation
- Applying RAID-5 inside the SSD raises three questions:
  - Is RAID-5 suitable for flash chips?
  - Is RAID-5 really beneficial for an SSD?
  - Can its effect on reliability and lifetime be analyzed quantitatively?
[Figure: I/O workloads (Exchange, MSN, Financial, etc.) drive an SSD whose controller (FTL & RAID) places data and parity pages across the flash chips; the analysis relates workload to error rate.]

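Applying RAID-5 into SSD
- Apply a RAID-5 configuration to the flash chips that make up the SSD device
- Each flash page stores data together with its ECC
- The controller (FTL & RAID) groups pages from different chips into a RAID-5 stripe of data pages plus one parity page
[Figure: an SSD with four flash chips; one page from each chip (data, data, parity, ...) forms a RAID-5 stripe.]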
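The chip-level stripe described above can be illustrated with a minimal sketch. It assumes byte-wise XOR parity over equally sized pages, as in conventional RAID-5; the helper name `xor_pages` and the 4-byte page size are illustrative choices, not taken from the slides.

```python
from functools import reduce

def xor_pages(pages):
    """Return the byte-wise XOR of equally sized pages (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))

# Three data pages on three chips; page size shortened to 4 bytes for illustration.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_pages(data)          # stored on the fourth chip

# If the page on the second chip becomes unreadable, XOR of the surviving
# data pages and the parity page rebuilds it.
recovered = xor_pages([data[0], data[2], parity])
assert recovered == data[1]
```

Because XOR is its own inverse, any single lost page in the stripe can be rebuilt this way, which is what lets parity cover errors that the per-page ECC cannot correct.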

Pros & Cons of RAID-5 in SSD
- Assumptions of the quantitative analysis:
  - Conventional SSD (denoted "ECC"): 64 GB MLC with a BCH code (4 bits per 512 bytes) for ECC
  - RAID-5 SSD (denoted "RAID-5"): 64 GB MLC applying RAID-5 without a parity cache
  - Workload: Financial
- Pro of RAID-5: improves the reliability of the SSD
- Con of RAID-5: decreases the lifetime of the SSD
[Figure (pros): UPER (error rate, log scale from 1.0E-09 down to 1.0E-39) versus total written bytes (1 TB to 200 TB) for ECC and RAID-5, with the error rate required by HDD vendors marked; RAID-5 shows improved reliability.]
[Figure (cons): program/erase (P/E) cycles (0 to 2.E+04) versus total written bytes (1 TB to 200 TB) for ECC and RAID-5, with the MLC P/E cycle limit marked; RAID-5 shows decreased lifetime.]

How can we improve reliability while prolonging the lifespan of the SSD?

Outline
- Introduction & Motivation
- Challenges of RAID-5
- Our Solution: eSAP-RAID
- Evaluations
- Analytic Models of RAID Schemes
- Conclusion

Applying RAID-5 into SSD: Challenges #1 and #2
- Out-of-place update property of flash memory
  - Parity writes increase write amplification (WA) in the SSD
- LBN (Logical Block Number) based striping of RAID-5
  - Parity update overhead (read-modify-write) for small write requests
  - Data is written to a specific chip determined by its LBN
[Figure: write requests repeatedly update D0 in the stripe D0, D1, D2, P0 spread over Chip0 to Chip3. For each update the controller reads for the new parity, then writes the updated D0 and a new P0 out of place, so every small update also produces a parity write.]
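The read-modify-write overhead named above follows from the standard RAID-5 parity update rule. The sketch below assumes XOR parity; the function names and 2-byte pages are hypothetical, chosen only for illustration.

```python
def xor(a, b):
    """Byte-wise XOR of two equally sized pages."""
    return bytes(x ^ y for x, y in zip(a, b))

def rmw_parity(old_parity, old_data, new_data):
    """New parity for a small write: P' = P xor D_old xor D_new."""
    return xor(xor(old_parity, old_data), new_data)

d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
p0 = xor(xor(d0, d1), d2)

new_d0 = b"\xfe\xdc"
# The old D0 and old P0 must be read before the new parity can be computed.
new_p0 = rmw_parity(p0, d0, new_d0)

# Same result as recomputing the parity over the whole updated stripe.
assert new_p0 == xor(xor(new_d0, d1), d2)
```

On flash, both the updated data page and the updated parity page go to fresh physical pages, so each small update invalidates two pages rather than one.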

Applying RAID-5 into SSD: Challenge #3
- RAID-5 opens a window of vulnerability for totally new data (data that does not update existing data)
  - Small writes must wait until the stripe fills up before parity is written
  - RAID-5 may lose an incomplete stripe because its parity does not exist yet
[Figure: a write request for new data D3 starts a new stripe on Chip0, below the complete stripe D0, D1, D2, P0; a failure before the new stripe fills leaves D3 unprotected.]

Summary of the Challenges
- Out-of-place update property of flash memory
  - Parity updates may decrease the lifespan of flash memory
- LBN-based striping of RAID-5
  - Old data or old parity must be read for the parity calculation
  - Data is written to a specific chip determined by its LBN
- Window of vulnerability
  - Small writes must wait until the stripe fills up before parity is calculated


Our Solution: eSAP-RAID
- eSAP: Elastic Striping & Anywhere Parity
- Problem in RAID-5: frequent parity updates decrease the lifespan of flash memory
  - Data must be read to compute the new parity
  - Skewed writes to a particular chip reduce lifespan
  - Solution in eSAP: dynamically construct a stripe based on the arrival order of write requests, regardless of LBN
- Problem in RAID-5: a window of vulnerability is opened
  - Solution in eSAP: the stripe size is flexible, using partial-stripe parity

Write Cost: RAID-5 vs. eSAP
- Assumptions
  - Stripes 0 and 1 are already constructed in the SSD
    - Stripe 0: D0, D1, D2, and P0
    - Stripe 1: D3, D4, D5, and P1
  - Updated data arrive separately, in the order D0, D0, D1
[Figure: write requests for updated D0, D0, and D1 reach the controller, which applies either the RAID-5 or the eSAP scheme. Across Chip0 to Chip3, Stripe 0 is laid out as D0, D1, D2, P0 and Stripe 1 as D3, D4, P1, D5.]

Write Cost: RAID-5 vs. eSAP
- RAID-5: skewed writes occur and 3 new parities are written
- eSAP: a stripe is constructed dynamically from only the updated data; pages sit at physically the same offset on each chip, so the chips are always evenly utilized
- I/O cost: eSAP reduces the number of reads and writes

  Operation                      RAID-5   eSAP
  Write updated data                3       3
  Read for parity calculation       4       0
  Write parity                      3       1

- Result: reduced parity overhead
[Figure: chip layouts after the three updates. Under RAID-5, the updated D0, D0, D1 and three new copies of P0 land on the chips dictated by their LBNs. Under eSAP, the updated D0, D0, D1 and a single new parity P2 form one new stripe across Chip0 to Chip3.]
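The write counts in the table can be reproduced with a toy placement model. This is a sketch under stated assumptions, not the authors' implementation: four chips, three data pages plus one parity per stripe, parity rotating as in the example layout (P0 on Chip3, P1 on Chip2), no parity cache, and hypothetical function names. It counts page writes only; reads are not modeled.

```python
from collections import Counter

NUM_CHIPS = 4                      # 3 data pages + 1 parity page per stripe
DATA_PER_STRIPE = NUM_CHIPS - 1

def raid5_writes(update_lbns):
    """Per-chip page writes when placement is fixed by LBN (no parity cache)."""
    writes = Counter()
    for lbn in update_lbns:
        stripe = lbn // DATA_PER_STRIPE
        parity_chip = (NUM_CHIPS - 1 - stripe) % NUM_CHIPS   # rotating parity
        data_chips = [c for c in range(NUM_CHIPS) if c != parity_chip]
        writes[data_chips[lbn % DATA_PER_STRIPE]] += 1       # updated data page
        writes[parity_chip] += 1                             # updated parity page
    return [writes[c] for c in range(NUM_CHIPS)]

def esap_writes(update_lbns):
    """Per-chip page writes when stripes are built in arrival order, ignoring LBN."""
    writes = Counter()
    next_chip, pending = 0, 0
    for _ in update_lbns:
        writes[next_chip] += 1                               # data goes to the next chip
        next_chip = (next_chip + 1) % NUM_CHIPS
        pending += 1
        if pending == DATA_PER_STRIPE:                       # stripe full: one parity
            writes[next_chip] += 1
            next_chip = (next_chip + 1) % NUM_CHIPS
            pending = 0
    if pending:                                              # partial-stripe parity
        writes[next_chip] += 1
    return [writes[c] for c in range(NUM_CHIPS)]

updates = [0, 0, 1]                # D0, D0, D1 arriving separately
print(raid5_writes(updates))       # [2, 1, 0, 3]: skewed, 3 parity writes
print(esap_writes(updates))        # [1, 1, 1, 1]: even, 1 parity write
```

Under these assumptions RAID-5 issues six page writes concentrated on Chip0 and Chip3, while arrival-order striping issues four writes spread evenly, matching the "Write updated data" and "Write parity" rows.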

Reliability: RAID-5 vs. eSAP
- Question: how can new data D8 (not an update of existing data) be protected before the parity write?
- RAID-5: window of vulnerability
  - D8 must wait until the stripe fills up before parity is calculated
  - RAID-5 may lose the incomplete stripe because it has no parity
- eSAP: partial-stripe parity (PP)
  - Parity can be calculated even for an incomplete stripe
  - eSAP protects the new data with partial-stripe parity (flexible stripe size)
[Figure: chip layouts for both schemes with existing stripes D0, D1, D2, P0 and D3, D4, P1, D5. Under RAID-5, D8 sits alone in a new stripe with no parity. Under eSAP, D8 is followed immediately by a partial-stripe parity PP.]
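Partial-stripe parity can be sketched as a striper that closes a stripe either when it is full or when it is explicitly flushed. The class below is a toy model with hypothetical names; when a flush is triggered (for example by a timeout) is an assumed policy that the slides do not specify.

```python
class ElasticStriper:
    """Toy model of arrival-order striping with partial-stripe parity."""

    def __init__(self, num_chips=4):
        self.data_per_stripe = num_chips - 1
        self.open_pages = []     # data pages written but not yet covered by parity
        self.stripes = []        # closed stripes as (data_pages, parity)

    def write(self, page):
        self.open_pages.append(page)
        if len(self.open_pages) == self.data_per_stripe:
            self.flush()         # full stripe: write the ordinary parity

    def flush(self):
        """Close the open stripe; if it is not full, the parity is a partial-stripe parity."""
        if not self.open_pages:
            return
        parity = bytes(len(self.open_pages[0]))          # all-zero page
        for page in self.open_pages:
            parity = bytes(a ^ b for a, b in zip(parity, page))
        self.stripes.append((list(self.open_pages), parity))
        self.open_pages.clear()

striper = ElasticStriper()
striper.write(b"D8..")           # new data arrives; the stripe is not full
striper.flush()                  # assumed trigger, e.g. a timeout
pages, pp = striper.stripes[0]
assert pages == [b"D8.."] and pp == b"D8.."   # PP over a single page equals that page
```

With one data page the partial parity degenerates to a copy of that page; with two or more it is the usual XOR, so the stripe length can vary from stripe to stripe.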


Evaluation Setup
- Simulator: DiskSim with the SSD extension
- 8 flash memory chips; a stripe consists of 16 pages
- Three configurations are evaluated:
  - ECC: no parity (similar to RAID-0)
  - RAID-5: conventional RAID-5 scheme
  - eSAP: Elastic Striping and Anywhere Parity RAID (proposed scheme)
- Characteristics of the I/O workloads:

  Workload     Total data requested (GB)   Write ratio
  Sequential            21.8                  1.0
  Random                30.2                  1.0
  Financial             35.7                  0.81
  Exchange             101.2                  0.46
  MSN                   29.7                  0.96

Response Time & Life Span
- eSAP reduces the response time relative to RAID-5
- eSAP prolongs the life span of the SSD relative to RAID-5
- RAID-5 performs worst, especially for the Financial workload, because small writes incur heavy parity overhead
[Figure (a) Response time: average response time (ms, 0 to 3.5) of ECC, RAID-5, and eSAP for the Sequential and Financial workloads.]
[Figure (b) Life span: P/E cycles (0 to 600000) of ECC, RAID-5, and eSAP for the Sequential and Financial workloads.]

Analysis of Parity Overhead
- Parity overhead of RAID-5:
  - Reads for parity calculation (PR)
  - Parity writes (PW)
- Parity overhead of eSAP:
  - Parity writes (PW)
  - Partial-stripe parity writes (PPW) for small write requests
[Figure: parity overhead (operation count, 0 to 1.6E+7) of RAID-5 and eSAP under the Financial workload, broken down into PR, PPW, and PW.]


Analytic Models of RAID Schemes
- Goals of the analytic models
  - Find the expected lifespan (P/E cycles) of an SSD under various I/O workloads
  - Project long-term reliability according to the lifespan of the SSD
- Two factors affecting the lifespan of an SSD
  - Write Amplification Factor (WAF): garbage collection cost
  - Parity Write Overhead (PWO): a term derived in this work (mathematical model)
[Figure: WAF and PWO feed the analytic model, which yields the lifespan of the flash SSD in P/E cycles.]

Parity Write Overhead (PWO)
- Parity write overhead is determined by:
  1) Size of the RAID stripe
  2) Size of the write request
  3) Starting position of the write request within a stripe
- From the PWO, we can estimate the number of page writes and erase operations
[Figure: three write requests submitted to flash memory. (a) A full-stripe write needs 1 parity. (b) A smaller write that stays within one stripe needs 1 parity. (c) A write whose starting position makes it span two stripes needs 2 parities.]
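The dependence on the three factors listed above can be sketched by counting how many stripes a request touches under fixed striping. This is one reading of the figure, not the paper's PWO formula; the function and parameter names are hypothetical, and the stripe is assumed to hold four data pages.

```python
def parity_writes(stripe_data_pages, request_pages, start):
    """Number of parity pages a write request touches under fixed striping.

    stripe_data_pages: data pages per stripe
    request_pages:     size of the write request in pages (at least 1)
    start:             offset of the first written page within its stripe
    """
    if request_pages < 1:
        raise ValueError("request must write at least one page")
    if not 0 <= start < stripe_data_pages:
        raise ValueError("start must lie inside the stripe")
    last = start + request_pages - 1           # offset of the last written page
    return last // stripe_data_pages + 1       # stripes spanned = parities written

assert parity_writes(4, 4, 0) == 1   # (a) full-stripe write
assert parity_writes(4, 3, 0) == 1   # (b) partial write inside one stripe
assert parity_writes(4, 3, 2) == 2   # (c) same size, but it crosses a stripe boundary
```

Cases (b) and (c) write the same number of data pages yet differ in parity cost, which is why the starting position has to be part of the model.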

Analytic Models of RAID-5 and eSAP
- Expected lifespan of an SSD with RAID-5
  - Parameters: characteristics of the I/O workload and the SSD
  - These parameters determine the PWO of RAID-5, which affects WAF
  - The analytic model of RAID-5 yields P/E cycles and UPER
- Expected lifespan of an SSD with eSAP
  - Parameters: characteristics of the I/O workload and the SSD
  - These parameters determine the PWO of eSAP, which affects WAF
  - The analytic model of eSAP yields P/E cycles and UPER
- Please refer to the paper for details

Projecting Long-term Reliability
- Procedure of projection
  1) Extract the characteristics of the I/O workloads and the parameters of the SSD
  2) Feed the extracted values into the analytic models to obtain the expected lifespan of the SSD
  3) Evaluate the reliability equations with the expected lifespan of the SSD
  4) Obtain the Uncorrectable Page Error Rate (UPER) from the calculation
[Figure: characteristics of I/O workloads and SSD parameters feed the analytic model of the RAID schemes, which gives the expected lifespan; the equations for quantifying reliability then give the reliability of the SSD (UPER).]
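Steps 3 and 4 can be illustrated with a generic binomial error model. This sketch is not the paper's set of reliability equations: it assumes independent bit errors at a given raw bit error rate (`rber`, which in practice would come from a wear model tied to P/E cycles), a 4-bit-correcting code per 512 bytes as in the analysis assumptions, 8 codewords per page, and 16-page stripes. All function names are hypothetical.

```python
from math import comb, expm1, log1p

def tail_prob(n, k, p, terms=20):
    """P(X > k) for X ~ Binomial(n, p), summed directly so tiny values stay accurate."""
    upper = min(n, k + terms)    # later terms are negligible for small p
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, upper + 1))

def page_error_rate(rber, codeword_bits=512 * 8, correctable=4, codewords_per_page=8):
    """Probability that at least one codeword in a page exceeds the ECC limit."""
    cw_fail = tail_prob(codeword_bits, correctable, rber)
    # 1 - (1 - cw_fail)^m, computed without cancellation for very small cw_fail
    return -expm1(codewords_per_page * log1p(-cw_fail))

def stripe_loss_prob(page_fail, pages_per_stripe=16):
    """Single-parity stripe loses data when two or more of its pages are uncorrectable."""
    return tail_prob(pages_per_stripe, 1, page_fail)

ecc_only = page_error_rate(1e-6)            # illustrative RBER, not a measured value
with_parity = stripe_loss_prob(ecc_only)
print(f"page failure with ECC only:  {ecc_only:.2e}")
print(f"stripe loss with one parity: {with_parity:.2e}")
```

The gap between the two printed values shows why adding parity lowers the error rate by many orders of magnitude, and why the extra wear from parity writes, which raises the RBER over time, has to be weighed against it.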

Analysis of Long-term Reliability
- Uncorrectable Page Error Rate (UPER) and life span of the SSD, for the Financial workload on a 64 GB MLC flash SSD
- eSAP has the lowest UPER among the three schemes
- eSAP requires far fewer P/E cycles than RAID-5
- eSAP can improve reliability while limiting the SSD's wear
[Figure: UPER (log scale, 1.E+00 down to 1.E-39) versus total written bytes for ECC, RAID-5, and eSAP, with the error rate required by HDD vendors marked.]
[Figure: expected life span (P/E cycles, 0 to 1.6E+04) versus total written bytes for RAID-5, eSAP, and ECC, with the MLC P/E cycle limit marked.]

Conclusion
- Reliability of flash-based storage is becoming more crucial
- One way to improve reliability is to apply a RAID configuration inside the SSD
  - Conventional RAID-5 is not suitable
- eSAP, a novel flash-aware RAID scheme, is proposed
- Analytical models of RAID schemes in SSDs are derived
  - Performance and lifespan models of RAID schemes in SSDs
  - Projection of the long-term reliability of SSDs

Thank you! Questions?
Please refer to the paper for details: "Chip-Level RAID with Flexible Stripe Size and Parity Placement for Enhanced SSD Reliability," IEEE Transactions on Computers, 2015.
Jaeho Kim (kjhnet@gmail.com)

Backup Slides

Accuracy of Model
- Accuracy ratio of the model relative to the experimentally obtained results
- In most cases, the difference between the model and the experimental results is within 10%

  Workload     RAID-5   eSAP
  Sequential    0.92    0.95
  Random        0.99    0.91
  Financial     0.99    0.93
  Exchange      0.93    0.99
  MSN           0.98    0.95

What Determines WAF & PWO?
- WAF and PWO are determined by the characteristics of the I/O workloads

  Workload     Scheme   # of write req.   Avg. write size   Avg. utilization (u) of GC victim blocks
  Sequential   RAID-5        368K               62K                        0
  Sequential   eSAP          184K              124K                        0
  Financial    RAID-5       3617K                9K                        0.66
  Financial    eSAP          416K               78K                        0.64

[Figure: characteristics of the I/O workload affect WAF and PWO, which feed the analytic model that yields the life span of the flash SSD.]
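The role of the victim-block utilization u in the table can be illustrated with a common first-order garbage collection model, in which reclaiming a block whose fraction u of pages is still valid costs 1 / (1 - u) page writes per page of new data. This is an assumed textbook approximation; the paper's own WAF expression may differ, and the function name is hypothetical.

```python
def gc_write_amplification(u):
    """Write amplification from garbage collection when victim blocks are a fraction u valid."""
    if not 0 <= u < 1:
        raise ValueError("u must be in [0, 1)")
    return 1.0 / (1.0 - u)

for label, u in [("Sequential", 0.0),
                 ("Financial, RAID-5", 0.66),
                 ("Financial, eSAP", 0.64)]:
    print(f"{label}: WAF from GC = {gc_write_amplification(u):.2f}")
# Sequential: WAF from GC = 1.00
# Financial, RAID-5: WAF from GC = 2.94
# Financial, eSAP: WAF from GC = 2.78
```

Under this model the two schemes see similar per-write garbage collection cost on the Financial workload, so most of the difference in wear would come from the number of writes issued, which is where parity overhead enters.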

ECC Bit Correction Requirements Trends
[Figure: bit correction requirements for BCH.]
Sources: 1) "ECC Options for Improving NAND Flash Memory Reliability," Micron, 2012. 2) "Signal processing and the evolution of NAND flash memory," Anobit, 2010.

RBER of MLC vs. TLC
[Figure: BER of MLC flash.]