NAND Flash Architecture and Specification Trends
Michael Abraham (mabraham@micron.com)
NAND Solutions Group Architect
Micron Technology, Inc.
August 2012
Topics
- NAND Flash Architecture Trends
- The Cloud and Clients
- Enterprise Application Requirements
- ECC and SSD Topologies
NAND Process Migration: Shrinking Faster than Moore's Law
[Figure: process node (nm, 10 to 60) versus volume production date (Q1-07 to Q4-12) for Company A, Company B, Company C, and Company D, grouped into 50nm, 30nm, and 20nm classes.]
Data based on publicly available information.
Memory Organization Trends
- NAND block size is increasing.
- Larger page sizes and more planes increase sequential throughput.
- More pages per block reduce die size.
- As ECC requirements increase, the spare area per NAND page is increasing.
[Figure: block size (B), data bytes per operation, and pages per block, plotted on a logarithmic axis from 8 to 8,388,608.]
Consumer-grade NAND Flash: Endurance and ECC Trends
- ECC improves data retention and endurance.
- Process shrinks lead to fewer electrons per floating gate.
- To compensate for increasing raw bit error rates (RBERs), the required ECC is increasing exponentially to achieve equivalent uncorrectable bit error rates (UBERs).
- ECC algorithms are transitioning from BCH to LDPC, and codeword sizes are increasing.
[Figure: number of electrons (10 to 10,000, logarithmic) for SLC and MLC-2.]
[Figure: endurance (cycles, 1,000 to 100,000, logarithmic) and ECC (bits, 0 to 70) for SLC and MLC-2.]
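The link between RBER, ECC strength, and UBER can be sketched numerically. The following is a minimal illustration, not the method used to produce the slide: it assumes independent bit errors, a hypothetical codeword of 8192 bits (parity ignored), and the common approximation that UBER is the codeword failure probability divided by the codeword length. The function names are illustrative only.

```python
from math import exp, lgamma, log, log1p


def codeword_failure_probability(n_bits, t, rber):
    """Probability that more than t of n_bits bits are in error.

    Assumes independent bit errors with 0 < rber < 1 (binomial model).
    The tail is summed directly in log space to avoid cancellation.
    """
    total = 0.0
    for k in range(t + 1, n_bits + 1):
        log_pmf = (lgamma(n_bits + 1) - lgamma(k + 1) - lgamma(n_bits - k + 1)
                   + k * log(rber) + (n_bits - k) * log1p(-rber))
        total += exp(log_pmf)
    return total


def uber_estimate(n_bits, t, rber):
    """Approximate UBER as codeword failure probability per bit read."""
    return codeword_failure_probability(n_bits, t, rber) / n_bits


def required_t(n_bits, rber, target_uber, t_max=200):
    """Smallest correction strength t that meets target_uber, or None."""
    for t in range(t_max + 1):
        if uber_estimate(n_bits, t, rber) <= target_uber:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical RBER values; 1E-14 is used here only as an example target.
    for rber in (1e-5, 1e-4, 1e-3):
        print(rber, required_t(8192, rber, 1e-14))
```

Under these assumptions, a higher RBER needs a markedly larger t to hold the same UBER, which is the trend the slide describes.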
Larger Page Sizes Improve Sequential Write Performance
- For a fixed page size across process nodes, write throughput decreases as the NAND process shrinks.
- NAND vendors increase the page size to compensate for slowing array performance.
- Write throughput decreases with more bits per cell.
[Figure: sequential programming throughput (MB/s, 0 to 100) versus data bytes per operation (page size × number of planes) for SLC, MLC-2, and MLC-3.]
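The relationship between data bytes per operation and throughput can be approximated with a simple model. This is a sketch under stated assumptions, not the model behind the figure: each multi-plane program operation costs one data transfer plus one tprog, with no cache programming or die interleaving. The function name and the example values are hypothetical.

```python
def seq_program_throughput_mbps(page_bytes, planes, tprog_us, io_mbps):
    """Sequential programming throughput in MB/s (bytes per microsecond).

    Assumes transfer and program times are strictly serialized.
    """
    data_bytes = page_bytes * planes        # data bytes per operation
    transfer_us = data_bytes / io_mbps      # time to move data over the bus
    return data_bytes / (tprog_us + transfer_us)


if __name__ == "__main__":
    # Hypothetical example: doubling the planes at a fixed tprog.
    for planes in (1, 2, 4):
        print(planes, seq_program_throughput_mbps(8192, planes, 1300, 200))
```

In this model a larger page or more planes amortizes tprog over more bytes, while a longer tprog (smaller process node or more bits per cell) lowers throughput for a fixed page size.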
More Pages Per Block Affect Random Write Performance
- As block copy time increases, random performance decreases.
- Key factors that impact NAND Flash random write performance:
  1. Number of pages per block
  2. Increase of tprog
  3. Increase in I/O transfer time due to larger page sizes (effect not shown in the figure)
- Impact on system product random performance:
  - Some card interfaces have write timeout specs at 250ms.
  - To improve random performance, block management algorithms manage pages or partial blocks.
[Figure: block copy time (ms, 0 to 1000) for SLC, MLC-2, and MLC-3 versus pages per block / tprog (typ): 32 / 300, 64 / 250, 64 / 300, 128 / 250, 256 / 300, 128 / 600, 128 / 900, 256 / 900, 256 / 1200, 512 / 1500, 192 / 2000+, 384 / 2000+.]
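A rough estimate of block copy time follows from the factors listed above. The sketch below assumes tprog values are in microseconds and that every page in the block is read and reprogrammed one after another; read time and I/O transfer time are optional parameters and default to zero. The function name is hypothetical, and the figure's own calculation is not stated in the source.

```python
WRITE_TIMEOUT_MS = 250  # card interface write timeout cited on the slide


def block_copy_time_ms(pages_per_block, tprog_us, tr_us=0.0, transfer_us=0.0):
    """Estimated time to copy a full block, in milliseconds."""
    per_page_us = tr_us + tprog_us + transfer_us
    return pages_per_block * per_page_us / 1000.0


if __name__ == "__main__":
    # Pages per block / tprog (typ) pairs taken from the figure axis.
    points = [(32, 300), (64, 250), (64, 300), (128, 250), (256, 300),
              (128, 600), (128, 900), (256, 900), (256, 1200), (512, 1500)]
    for pages, tprog in points:
        copy_ms = block_copy_time_ms(pages, tprog)
        print(pages, tprog, copy_ms, copy_ms > WRITE_TIMEOUT_MS)
```

Even with read and transfer time ignored, a 256-page block at a tprog of 1200 exceeds the 250ms timeout in this estimate, which is why block management moves to page or partial-block granularity.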
Larger Monolithic NAND Densities Increase Random Read Latencies
- Most applications favor read operations over write operations.
- Most read operations are 4KB data sectors.
- As monolithic NAND density increases, fewer NAND die are used for a fixed system density.
- As tprog increases, the latency of random 4KB sector reads becomes more variable in mixed-operation environments, because the probability of needing to read from a busy die increases.
[Figure: NAND Flash TAM by Density (Units), 0% to 100%, 2011 to 2016, for densities 128Mb, 256Mb, 512Mb, 1Gb, 2Gb, 4Gb, 8Gb (1GB), 16Gb (2GB), 32Gb (4GB), 64Gb (8GB), 128Gb (16GB), and 256Gb (32GB). Source: iSuppli 2Q12.]
[Figure: 4KB Random Read Latency (µs, 0 to 2000), min and max latency versus tr / tprog: 25 / 250, 25 / 300, 50 / 600, 50 / 900, 50 / 1200, 65 / 1500.]
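The spread between minimum and maximum read latency can be illustrated with a simple bound. This is a sketch under assumptions not stated in the source: the best case is a read from an idle die (tr only), the worst case is a read that arrives just as a program operation starts on the same die and must wait the full tprog, there is no program suspend, and reads are spread uniformly across die. Function names are hypothetical.

```python
def read_latency_bounds_us(tr_us, tprog_us, transfer_us=0.0):
    """Return (best, worst) random read latency in microseconds."""
    best = tr_us + transfer_us
    worst = tprog_us + tr_us + transfer_us
    return best, worst


def busy_probability(programming_die, total_die):
    """Chance that a uniformly distributed read targets a die that is programming."""
    return min(1.0, programming_die / total_die)


if __name__ == "__main__":
    # tr / tprog pairs taken from the figure axis.
    for tr, tprog in [(25, 250), (25, 300), (50, 600), (50, 900), (50, 1200), (65, 1500)]:
        print(tr, tprog, read_latency_bounds_us(tr, tprog))
    # Same number of concurrent programs, fewer (larger) die: higher chance of a busy die.
    print(busy_probability(2, 16), busy_probability(2, 4))
```

With the same number of concurrent program operations, moving from 16 die to 4 die raises the chance of hitting a busy die from 12.5% to 50% in this model.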
NAND Interface Trends for High-Performance Applications
Applications
- Have transitioned to a 200MT/s interface
- Beginning the interface shift to 400MT/s
Packaging
- Typically BGA
- 2 channel widely available
- 4 channel being standardized
ONFI 3.0 compatible components are available.
[Figure: interface speed (MT/s, 0 to 900) for single-channel and dual-channel packages across ONFI 1.0 (NV-SDR), ONFI 2.x (NV-DDR), and ONFI 3.0 (NV-DDR2).]
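To relate transfer rate to bandwidth, the sketch below assumes an 8-bit data bus per channel, so that each transfer moves one byte and 1 MT/s corresponds to 1 MB/s. The function names and the 8KB page size are illustrative assumptions, not values from the slide.

```python
def channel_bandwidth_mbps(mt_per_s, bus_width_bits=8):
    """Peak channel bandwidth in MB/s for a given transfer rate."""
    return mt_per_s * bus_width_bits / 8


def package_bandwidth_mbps(mt_per_s, channels, bus_width_bits=8):
    """Peak aggregate bandwidth of a multi-channel package in MB/s."""
    return channels * channel_bandwidth_mbps(mt_per_s, bus_width_bits)


def page_transfer_time_us(page_bytes, mt_per_s, bus_width_bits=8):
    """Time to move one page across a single channel, in microseconds."""
    return page_bytes / channel_bandwidth_mbps(mt_per_s, bus_width_bits)


if __name__ == "__main__":
    for rate in (200, 400):
        print(rate, package_bandwidth_mbps(rate, 2), page_transfer_time_us(8192, rate))
```

Under this assumption a dual-channel package at 400MT/s peaks at 800 MB/s, and moving from 200MT/s to 400MT/s halves the transfer time of a page.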
The Cloud's Impact on NAND System Architectures
Cloud: Long-term Data
Client: Near-term Data
Storage Comparison
Client Storage
- Information stored locally, on the device
- Consumer or SSD grade NAND Flash
Cloud Storage
- Information stored in hosted server farms or data centers
- SLC or Enterprise grade NAND Flash
Comparison of NAND Flash by Application Requirement
Application Requirement | Client Storage / Consumer | Client Storage / SSD Grade | Cloud Storage / Enterprise Grade
NAND Cell | MLC-2, MLC-3 | MLC-2, MLC-3 | SLC, MLC-2
Endurance / Cycling | Up to 3K | Up to 3K | Up to 100K (SLC), up to 30K (MLC-2)
DPM | Consumer grade | Better | Best
I/O Channel Throughput | 40 to 200 MT/s | 40 to 400 MT/s | 133 MT/s to 400 MT/s
UBER | 1E-14 | Less | Less
Data retention at max cycling | 1 year | 1 year | Less
NAND Package Placements | Typically 1 to 4 | 4 to 16 | Up to 32
How Do Enterprise Applications Meet Enterprise Requirements?
Application requirement: Higher system density
- Controller: Ability to handle many NAND Flash die; some controllers handle up to 256 die
Application requirement: More throughput
- Controller: Page-based block management, DRAM cache, overprovisioning, more I/O channels, faster I/O channels, multiple ECC engines, simultaneous mixed operations
Application requirement: Low latency reads
- Controller: DRAM cache, use of smaller monolithic NAND densities
SSD/Enterprise-grade NAND Flash
- Lower DPM
- Faster I/O channel
How Do Enterprise Applications Meet Enterprise Requirements? (Part 2)
Application requirement: Higher endurance / reliability
- Controller: Higher ECC
Application requirement: More consistent use over time
- Controller: Balanced block management to reduce write amplification and provide even wear leveling, so NAND die and blocks wear evenly
Application requirement: Power within budget for parallel operations
- Controller: Block management throttles parallelism as needed
SSD/Enterprise-grade NAND Flash
- More ECC required, lower UBER, higher endurance
- Peak power reduction
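Two of the controller responsibilities above reduce to simple arithmetic. The sketch below uses the standard definition of write amplification (bytes written to NAND divided by bytes written by the host) and assumes perfectly even wear leveling for the lifetime estimate. The power throttle is a hypothetical illustration; the budget and per-die figures are made-up inputs, not values from the source.

```python
def write_amplification(nand_bytes_written, host_bytes_written):
    """Write amplification factor: NAND writes per host write."""
    return nand_bytes_written / host_bytes_written


def host_write_budget_bytes(capacity_bytes, endurance_cycles, wa):
    """Total host bytes writable before wear-out, assuming ideal wear leveling."""
    return capacity_bytes * endurance_cycles / wa


def max_parallel_die(power_budget_ma, per_die_peak_ma):
    """Number of die that may operate at once without exceeding the budget."""
    return int(power_budget_ma // per_die_peak_ma)


if __name__ == "__main__":
    wa = write_amplification(300, 100)
    print(wa, host_write_budget_bytes(100, 3000, wa), max_parallel_die(500, 60))
```

Lower write amplification directly extends the host write budget for a given endurance rating, which is why balanced block management appears alongside higher ECC.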
ECC and Algorithms
- Enterprise-grade NAND requires more ECC than consumer-grade NAND Flash to achieve higher endurance and lower UBER.
- Providing more ECC to a consumer-grade NAND Flash does not necessarily improve endurance, though it can improve data retention.
- ECC requirements will increase to the point that ECC occupies a significant amount of real estate on a multi-channel controller.
[Figure: logic gates (0 to 1,000,000) by ECC algorithm: BCH (t=16), BCH (t=29), BCH (t=60), RS/TCM, LDPC.]
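Besides logic gates, stronger BCH codes also consume more spare area. As a rough sketch, a binary BCH code over GF(2^m) that corrects t errors needs at most m·t parity bits, where m is the smallest value for which the codeword fits in 2^m − 1 bits. The 1KB (8192-bit) data size below is an assumption for illustration; the t values are the ones shown in the figure. The function name is hypothetical.

```python
def bch_parity_bits(data_bits, t):
    """Upper bound on parity bits for a t-error-correcting binary BCH code."""
    m = 1
    # Grow the field until data plus parity fits in a codeword of 2^m - 1 bits.
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m * t


if __name__ == "__main__":
    for t in (16, 29, 60):
        bits = bch_parity_bits(8192, t)
        print(t, bits, (bits + 7) // 8)  # parity bits and bytes per codeword
```

Parity grows linearly with t under this bound, which is consistent with the spare area per page increasing as ECC requirements rise.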
How to Handle Increasing ECC?
- ECC is NAND Flash technology dependent and is implemented in hardware.
- Block management and drivers are not technology dependent and can be updated in software/firmware.
ECC Free Solution
- Tightly couples ECC to the NAND technology
- Also covers NAND aggregation, reducing channel loading
- Block management performed in the processor can use a DRAM buffer and results in higher performance than a fully managed solution
Questions?
About Michael Abraham
- Architect in the NAND Solutions Group at Micron
- Covers advanced NAND and PCM interfaces and system solutions
- IEEE Senior Member
- BS degree in Computer Engineering from Brigham Young University
©2007-2012 Micron Technology, Inc. All rights reserved. Products are warranted only to meet Micron's production data sheet specifications. Information, products and/or specifications are subject to change without notice. All information is provided on an "AS IS" basis without warranties of any kind. Dates are estimates only. Drawings not to scale. Micron and the Micron logo are trademarks of Micron Technology, Inc. All other trademarks are the property of their respective owners.