Chip-Level RAID with Flexible Stripe Size and Parity Placement for Enhanced SSD Reliability Jaeho Kim (Postdoctoral researcher @ Hongik Univ.) Donghee Lee (Professor @ Univ. of Seoul) Sam H. Noh (Professor @ Hongik Univ.) 1
Introduction & Motivation Flash SSD products with RAID-5 like data protection Fusion-io s iomemory (Adaptive Flashback Tech.) Baidu s Software-Defined Flash Micron s P420m (Redundant Array Independent NAND Tech.) Shannon Systems s Direct-IO 2
Introduction & Motivation Applying RAID-5 into SSD internal Is RAID-5 suitable for flash chips? Is RAID-5 really beneficial for SSD? Exchange I/O workloads Quantitative analysis for reliability and lifetime? SSD MSN Financial Etc. Controller (FTL & RAID) Data Data Data Parity Data Data Parity Data Error Rate Workload 3
Applying RAID-5 into SSD Apply RAID-5 configuration to chips comprising the SSD device Flash Chip SSD page Data ECC Controller (FTL & RAID) Data Data Parity ECC ECC ECC Chip Chip Chip Chip RAID-5 stripe... 4
Pros & Cons of RAID-5 in SSD UPER (Error Rate: Assumptions of quantitative analysis Conventional SSD (denoted ECC ): MLC 64GB with BCH code (4bit/512bytes) for ECC RAID-5 SSD (denoted RAID-5 ): MLC 64GB applying RAID-5 without parity cache Workload: Financial Pros of RAID-5: Improving reliability of SSD Cons of RAID-5: Decreasing lifetime of SSD Required error Improved rate reliability by HDD vendors 1.0E-09 1.0E-14 1.0E-19 1.0E-24 1.0E-29 1.0E-34 1.0E-39 1TB ECC 25TB 50TB 75TB 100TB 125TB 150TB 175TB 200TB Total written bytes Pros RAID-5 Program/Erasure (P/E) cycles 2.E+04 1.E+04 5.E+03 0.E+00 1TB 25TB 50TB 75TB Decreased lifetime P/E cycle limit (MLC) 100TB 125TB 150TB Total written bytes Cons ECC RAID-5 175TB 200TB 5
How can we improve the reliability while prolong the lifespan of the SSD? 6
Outlines Introduction & Motivation Challenges of RAID-5 Our Solution: esap-raid Evaluations Analytic Models of RAID Schemes Conclusion 7
Applying RAID-5 into SSD: Challenges #1,2 Out-of-place update property of flash memory Parity writes increase write amplification (WA) in SSD LBN (Logical Block Number) based striping feature of RAID-5 Parity update overhead (Read-modify-write) for small write requests Data are written to specific chip depending on the LBN of data Updated data for D0 D0 D0 Write Req. Controller Read for new parity SSD Updated data for D0 Chip0 Chip1 Chip2 Chip3 D0 D1 D2 P0 D0 P0 D0 P0 Parity writes 8
Applying RAID-5 into SSD: Challenge #3 Open a window of vulnerability (for totally new data) Small writes must wait until stripe fills up to write parity Totally new data (Not updating existing data) D3 Write Req. RAID-5 may loss SSD incomplete stripe Controller due D3to nonexistent parity Chip0 Chip1 Chip2 Chip3 D0 D3 D1 failure D2 P0 Stripe 9
Summary of the Challenges Out-of-place update property of flash memory Parity update may decrease lifespan of flash memory LBN based striping feature of RAID-5 Must read old data or old parity for parity calculation Data is written to specific chip depending on the LBN of data Open a window of vulnerability Small writes must wait until stripe fills up to calculate parity 10
Outlines Introduction & Motivation Challenges of RAID-5 Our Solution: esap-raid Evaluations Analytic Models of RAID Schemes Conclusion 11
Our Solution: esap-raid RAID-5 esap Elastic Striping & Anywhere Parity (esap) Frequent parity update decrease lifespan of flash memory * Must read data for new parity * Skewed writes to particular chip lead to reduced lifespan Dynamically construct a stripe based on arrival order of write requests regardless of LBN Open a window of vulnerability Solve Stripe size can be flexible with partial stripe parity 12
Write Cost: RAID-5 vs. esap D1 D0 D0 Write Req. Controller RAID-5 / esap scheme employed here Updated data Chip0 Chip1 Chip2 Chip3 D0 D3 D1 D4 D2 P1 P0 D5 Stripe 0 Stripe 1 Assumptions Stripe 0 and 1 are already constructed in the SSD Stripe 0: D0, D1, D2, and P0 Stripe 1: D3, D4, D5, and P1 Updated data separately arrive in D0, D0, and D1 order 13
Write Cost: RAID-5 vs. esap RAID-5 esap Chip0 Chip1 Chip2 Chip3 Chip0 Chip1 Chip2 Chip3 D0 D1 D2 P0 D3 D4 D3 D4 P1 D5 P1 D0 D1 P0 D0 P0 P0 Stripe D0 D3 D4 D0 D1 D2 D3 P0 D4 D5 P1 D0 D1 P2 Skewed writes occur 3 new parities are written Dynamically construct a stripe with only updated data I/O cost: esap reduces the number of read & write RAID-5 esap Write updated data 3 3 Read for parity calculation 4 0 Write parity 3 1 Physically same offset, chips are always evenly utilized Reduced cost of parity overhead 14
RAID-5 Reliability: RAID-5 vs. esap New data (Not updated data) D8 esap Chip0 Chip1 Chip2 Chip3 Chip0 Chip1 Chip2 Chip3 D0 D3 D8 D1 D4 D2 P1 P0 D5 D0 D3 D8 D1 D4 PP D2 D5 P0 P1 Window of vulnerability: D8 must wait until stripe fills up to calculate parity PP: Partial-stripe Parity: Parity can be calculated even for incomplete stripe How can we protect new data D8 before parity write? RAID-5 may loss incomplete stripe due to without parity esap can protect the new data with partial stripe parity (flexible stripe size) 15
Outlines Introduction & Motivation Challenges of RAID-5 Flash-aware New RAID Architecture Evaluations Analytic Models of RAID Schemes Conclusion 16
Evaluation Setup SSD extension with the DiskSim, which is a simulator for SSD 8 flash memory chips, a stripe consists of 16 pages Evaluate three configurations ECC: No parity (similar to RAID-0) RAID-5: Conventional RAID-5 scheme esap: Elastic Striping and Anywhere Parity-RAID (Proposed scheme) Characteristics of I/O workloads Workload Total Data Req.(GB) Write Ratio Sequential 21.8 1.0 Random 30.2 1.0 Financial 35.7 0.81 Exchange 101.2 0.46 MSN 29.7 0.96 17
Response time & Life span ECC RAID-5 esap ECC RAID-5 esap Avg. response time (ms) 3.5 3 2.5 2 1.5 1 0.5 0 Sequential Financial Life span (P/E cycles) 600000 500000 400000 300000 200000 100000 0 Sequential Financial (a) Response time (b) Life span (P/E cycles) esap reduces the response time over RAID-5 esap prolongs the life span of SSD over RAID-5 RAID-5 performs worst, especially for the financial workloads Small writes incur heavy parity overhead 18
Parity overhead (count) 1.6E+7 1.4E+7 1.2E+7 1.0E+7 8.0E+6 6.0E+6 4.0E+6 2.0E+6 0.0E+0 Parity overhead of RAID-5 Analysis of Parity Overhead Reads for parity calculations (PR) Parity writes (PW) Parity overhead of esap Parity writes (PW) RAID-5 esap Partial stripe parity writes (PPW) for small write request Read for parity calculation (PR) Partial stripe parity write (PPW) Parity write (PW) Financial workload 19
Outlines Introduction & Motivation Challenges of RAID-5 Flash-aware New RAID Architecture Evaluations Analytic Models of RAID Schemes Conclusion 20
Analytic Models of RAID Schemes Goals of analytic models Find expected lifespan (P/E cycles) of SSD with various I/O workloads Project long-term reliability according to the lifespan of SSD Two factors affecting lifespan of SSD Write Amplification Factor (WAF) Garbage collection cost Parity Write Overhead (PWO) Term derived from this work (Mathematical model) Analytic model Lifespan of flash SSD (P/E cycles) Affecting WAF PWO 21
Parity Write Overhead (PWO) Parity write overhead are determined by 1) Size of RAID stripe 2) Size of write request 3) Starting position of write request within a stripe From the PWO, We can estimate the number of page writes and erase operations Write Req. Full Stripe Starting position Starting position D D D D P D D D P D D D P P 1 parity 1 parity 2 parities Flash Memory (a) Summit (b) (c) 22
Analytic Models of RAID-5 and esap Expected lifespan of SSD with RAID-5 Analytic model of RAID-5 P/E cycles & UPER Affecting WAF PWO of RAID-5 Parameters Please refer to our paper for details!! Characteristics of I/O workload & SSD Expected lifespan of SSD with esap Analytic model of esap P/E cycles & UPER Affecting WAF PWO of esap Parameters Characteristics of I/O workload & SSD 23
Projecting Long-term Reliability Procedure of projection 1) Extract characteristics of I/O workloads and parameters of SSD 2) Put extracted values into analytic models to expect lifespan of SSD 3) Calculate reliability equations with expected lifespan of SSD 4) Find Uncorrectable Page Error Rate (UPER) from the calculation Procedure of projection Characteristics of I/O workloads & SSD parameters Analytic model of RAID schemes Expected Lifespan Reliability of SSD (UPER) Equations for quantifying reliability 24
Analysis of Long-term Reliability UPER (Log scale) Uncorrectable Page Error Rate (UPER) and life span of SSD 1.E+00 1.E-03 1.E-06 1.E-09 1.E-12 1.E-15 1.E-18 1.E-21 1.E-24 1.E-27 1.E-30 1.E-33 1.E-36 1.E-39 Financial workload For 64GB MLC flash-ssd Required error rate by HDD vendor esap has the lowest UPER among three schemes 1.6E+04 esap can improve reliability 1.4E+04 while limiting SSD s 1.2E+04wear ECC RAID-5 esap Expected life span 1.0E+04 8.0E+03 6.0E+03 4.0E+03 2.0E+03 0.0E+00 esap requires far less P/E cycles compared to RAID-5 P/E cycle limit (MLC) RAID-5 esap ECC Total written bytes Total written bytes 25
Conclusion Reliability of flash based storage is getting more crucial A solution to improve reliability is to apply RAID configuration into SSD Conventional RAID-5 is not suitable esap: A novel flash-aware RAID scheme is proposed Derive the analytical model of RAID schemes in SSD Derive performance and lifespan models of RAID schemes in SSDs Project long-term reliability of SSDs 26
Thank you! & Questions? Please refer to the paper for details, Chip-Level RAID with Flexible Stripe Size and Parity Placement for Enhanced SSD Reliability, IEEE Transactions on Computers, 2015 Jaeho Kim (kjhnet@gmail.com) 27
Backup Slides 28
Accuracy of Model Accuracy ratio of the model compared to the experimentally obtained Most of the cases, the difference between the model and the experimental results are within 10% Workloads RAID-5 esap Sequential 0.92 0.95 Random 0.99 0.91 Financial 0.99 0.93 Exchange 0.93 0.99 MSN 0.98 0.95 29
What is Determination factor for WAF & PWO? WAF and PWO are determined by characteristics of the I/O workloads Workload Scheme # of Write req. Avg. size of Write Sequential Financial Avg. u of victim blocks for GC RAID-5 368K 62K 0 esap 184K 124K 0 RAID-5 3617K 9K 0.66 esap 416K 78K 0.64 utilization Analytic model Life span of flash-ssd Affecting WAF PWO Affecting Characteristics of I/O workload 30
ECC Bit Correction Requirements Trends Bit requirements for BCH From: 1) ECC Options for Improving NAND Flash Memory Reliability Micron, 2012 31 2) Signal processing and the evolution of NAND flash memory Anobit, 2010
RBER of MLC vs. TLC BER of MLC Flash 32