Addressing Fatal Flash Flaws That Plague All Flash Storage Arrays
By Scott D. Lowe, vExpert, Co-Founder, ActualTech Media
February 2015
Table of Contents
Introduction: How Flash Storage Works
    Flash Storage Types
Fatal Flash Flaws and How to Overcome Them
    Wear: Finite Program/Erase Cycles
    Write Amplification
    Bit Error Rates
    Steady State Performance
    Write Cliff
    Read Disturb
Summary: Architecture Matters
About the Author
About Tegile

ActualTech Media 2015. All rights reserved. Under no circumstances should this document be sold, copied, or reproduced in any way except with written permission. The information contained within the document is given in good faith and is believed to be accurate, appropriate and reliable at the time it is given, but is provided without any warranty of accuracy, appropriateness or reliability. The author does not accept any liability or responsibility for any loss suffered from the reader's use of the advice, recommendation, information, assistance or service, to the extent available by law.
Introduction: How Flash Storage Works

To understand potential flaws in flash storage media, it's important to understand how flash works from a read/write perspective. Reading data is pretty straightforward. It's on write operations that flash-based storage really differs from spinning disk. A write operation on a solid state disk (SSD) is a multistep process:

Read. On flash storage, write operations begin with a read to make sure that the target block is empty. Blocks are composed of individual pages, and every page in a target block must be empty. Figure 1 shows a single storage block with multiple pages. One of the pages is marked in red to denote a non-empty page.
Move data. If necessary, data in non-empty pages are moved to other locations on the disk.
Erase. Once the block is empty, the full block is erased in preparation for writing new data.
Write. The new data are written to pages inside the block.

Figure 1: Storage Block

Flash Storage Types

There are different kinds of flash storage and each type features different characteristics, as shown in Table 1. The middle two columns, Cycles and Capacity, are the most critical.

Table 1
Flash Type | Cycles | Capacity | Cons
Single Level Cell (SLC) | 100,000 | 1 bit per cell (2 possible values); most reliable | Lowest density = highest $/GB
Enterprise Multi-Level Cell (eMLC) | 30,000 | 2 bits per cell (4 values); more reliable than MLC/cMLC (overcommit) | More write overhead
Multi-Level Cell (MLC) | 10,000 | 2 bits per cell (4 values) | More write overhead
Triple Level Cell (TLC) | Sub-10,000 (~3K to 5K) | 3 bits per cell (8 values); increases storage density | Very poor write performance
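As a rough illustration of the read/move/erase/write sequence described at the start of this section, the following Python sketch models a block as a short list of pages. The names Block and write_block, the four-page block size, and the use of a single spare block are assumptions made for illustration only; real flash controllers are far more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PAGES_PER_BLOCK = 4  # assumed size, chosen only to keep the example small


@dataclass
class Block:
    # None marks an empty page; any other value is stored data.
    pages: List[Optional[str]] = field(
        default_factory=lambda: [None] * PAGES_PER_BLOCK
    )
    erase_count: int = 0

    def is_empty(self) -> bool:
        return all(page is None for page in self.pages)

    def erase(self) -> None:
        # Erasure always applies to the whole block, never to a single page.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


def write_block(target: Block, spare: Block, new_data: List[str]) -> int:
    """Write new_data into target, returning how many pages had to be moved."""
    if len(new_data) > len(target.pages):
        raise ValueError("new_data does not fit in one block")
    moved = 0
    # 1. Read: check whether every page in the target block is empty.
    if not target.is_empty():
        # 2. Move data: copy non-empty pages into free pages of the spare block.
        for data in target.pages:
            if data is not None:
                free = spare.pages.index(None)  # ValueError if spare is full
                spare.pages[free] = data
                moved += 1
        # 3. Erase: clear the whole block before it can be rewritten.
        target.erase()
    # 4. Write: store the new data in the block's pages.
    for i, data in enumerate(new_data):
        target.pages[i] = data
    return moved
```

In this toy model, a pristine block skips straight to the write step, while a block holding even one page of data pays for a move and a full-block erase first.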
The Cycles column determines how long that particular medium type will last; in other words, how many erase/write cycles the media can withstand before it begins to fail. The Capacity column identifies the amount of information that can be stored in each cell: 1, 2, or 3 bits, depending on media type. In reality, this data density is just a reflection of how many discrete voltage levels can be read from an individual cell. With 1-bit media, SLC flash, the cell is either on or off; there are only two states. With MLC flash media, there are four different voltage states, enabling the storage of 2 bits of information. TLC storage cells can have eight different values, providing 3 bits of storage per cell. Figure 2 provides a visual representation of individual cell density for each kind of flash media.

Also understand that denser cell types require more fine-tuned voltage detection mechanisms, which reduce the speed of the storage. This is why SLC media is so much faster than TLC media.

Figure 2: Individual Cell Density
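The relationship between voltage levels and bits per cell is a base-2 logarithm: 2 levels give 1 bit, 4 levels give 2 bits, and 8 levels give 3 bits. A minimal Python sketch of that arithmetic follows; the function name bits_per_cell and the dictionary are illustrative, not taken from any vendor tool.

```python
# Voltage levels per cell for each media type, as described in the text.
VOLTAGE_LEVELS = {"SLC": 2, "MLC": 4, "TLC": 8}


def bits_per_cell(levels: int) -> int:
    """Bits stored by a cell that can hold `levels` distinct voltage states."""
    # A power of two has exactly one bit set, so levels & (levels - 1) is 0.
    if levels < 2 or levels & (levels - 1):
        raise ValueError("levels must be a power of two, at least 2")
    # For a power of two, bit_length() - 1 equals log2(levels).
    return levels.bit_length() - 1


for name, levels in VOLTAGE_LEVELS.items():
    print(f"{name}: {levels} levels -> {bits_per_cell(levels)} bit(s) per cell")
```

Each additional bit doubles the number of voltage levels the controller must tell apart, which is one way to see why denser media needs finer voltage detection.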
Fatal Flash Flaws and How to Overcome Them

Flash storage has become a staple of the data center. When used correctly, it accelerates workloads and helps IT better meet business application needs. However, before jumping into the flash waters, make sure that you fully understand the potential challenges inherent in flash technology, and learn how vendors overcome these challenges.

Wear: Finite Program/Erase Cycles

As shown previously in Table 1, different kinds of flash support different numbers of program/erase cycles, after which the media begins to degrade and ultimately fails. The better the flash, the more cycles it can support. This makes it critical, when buying storage, to pay attention to the kind of media that vendors use in their storage arrays. For example, MLC media (sometimes referred to as consumer-grade MLC [cMLC]) has the same density as eMLC. However, eMLC has additional cells that the media can use, which significantly extends the life of the storage investment.

Moreover, all modern flash storage controllers include software that distributes writes across the storage, a technique known as "wear leveling". Without wear leveling, individual cells could be erased and rewritten over and over while others sit idle, which could lead to premature failure of those cells.

Write Amplification

Back in Figure 1, there was a single red data page in the block. Now, imagine this scenario: The flash storage controller determines that new data needs to be written to that very block. Keep in mind that blocks must be empty before they can be written to. Therefore, the data in that block must be moved elsewhere before the new data can be written. Now, imagine that the target block for this moved data isn't empty. That means the data in that block must be moved as well. This process creates the potential for a domino-like effect of constant data moves and the need to continuously write data blocks in new locations.
This domino-like effect is known as "write amplification". As the disk fills up and fewer blocks remain empty, write amplification can become more pronounced. The more write amplification that takes place, the slower write operations become, in general.
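Write amplification is commonly expressed as a ratio: the amount of data physically written to flash divided by the amount of data the host asked to write. The short Python sketch below computes that ratio from two hypothetical counters. The function name and the sample numbers are made up for illustration and are not measurements from any real drive.

```python
def write_amplification(host_pages: int, relocated_pages: int) -> float:
    """Ratio of pages physically written to pages the host requested.

    host_pages: pages of new data the host asked to write.
    relocated_pages: extra pages the controller rewrote while moving
        existing data out of the way.
    """
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    if relocated_pages < 0:
        raise ValueError("relocated_pages cannot be negative")
    return (host_pages + relocated_pages) / host_pages


# Mostly empty drive (illustrative numbers): few moves are needed.
print(write_amplification(host_pages=100, relocated_pages=10))
# Nearly full drive (illustrative numbers): most writes force data to move.
print(write_amplification(host_pages=100, relocated_pages=250))
```

A ratio of 1.0 means no extra writes at all. Anything above that is additional wear and additional time spent on writes the host never requested.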
Media choice makes a big difference when it comes to addressing write amplification. Remember that eMLC media, for example, has some additional capacity (a larger spare area), which extends its overall life. This extra capacity also helps to address write amplification.

Two other techniques also help to address write amplification. The first is TRIM, and the second is garbage collection. TRIM is implemented at the operating system level (although the SSD must also support it), while garbage collection is a function of the flash controller.

TRIM is a command that allows the operating system to tell the flash disk which of the previously saved blocks of data are no longer needed. That way, when the SSD performs garbage collection, it knows that those blocks can be erased. Without TRIM support, the SSD may not know about the status of blocks for files that have been deleted.

Garbage collection is the process that the flash controller performs to relocate data pages to new locations so that blocks can be fully erased. Handling relocation proactively, before a write needs the space, reduces write amplification.

Bit Error Rates

All storage suffers from what are known as bit errors. Kingston defines Bit Error Rate as "the rate at which naturally occurring bit errors in NAND flash occur without the benefit of Error Correction Code (ECC) and which the [Flash Storage Processor] FSP corrects using on the fly advanced ECC without disrupting user or system access." As dies shrink and flash memory cells get ever smaller, ECC must continuously improve to keep pace. Make sure vendors are keeping current with technology and that their ECC mechanisms are sufficient to overcome bit errors, even as drives get larger in capacity but smaller in cell size.

Steady State Performance

Any all-flash storage array goes through a three-step process when it comes to performance, as shown in Figure 3:

First Out of Box (FOB).
When a brand new array is shipped, in theory, all of the flash blocks are erased, pristine, and ready to accept data. As such, when data is written to the array, the read/move/erase/write cycle can be skipped and the data just written. This means that initial performance is abnormally high.
Transition State. During the transition state, some blocks are being written to for the first time while others have to go through the full cycle described earlier in this paper.
Steady State. The steady state is the performance pattern that will characterize the array for the long haul. When buying storage, look for storage performance patterns in which the steady state remains high and stable.

Figure 3: Steady State Performance (source: Micron)

Write Cliff

The write cliff is closely related to steady state performance. After all cells in an array have been through an initial erase/write cycle, performance drops off sharply, as shown in Figure 4. Remember, in flash-based storage, write operations are far slower than read operations because of the constant need to move data and erase blocks.

Figure 4: Write Cliff (Source: Crestingwave/Velobit)

Figure 4 provides a look at what can happen to performance once an array hits the write cliff. The write cliff can't be avoided, but its impact can be minimized. Factors such as the flash media type and how well write amplification is minimized will help keep the write cliff under control.

Read Disturb

Flash storage is just a collection of cells, each holding a voltage that determines the value of that individual cell. Read and write operations can affect these voltage levels. In fact, as cells are continuously read, there is a small potential for disruption of adjacent cells. If disruption takes place, a cell value may flip, leading to data loss (Figure 5).

Figure 5: Read Disturb

Fortunately, this issue is easily mitigated. Enterprise flash storage systems simply track how many times individual cells have been read and move blocks as needed, effectively eliminating the potential for read disturb.
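The read-tracking idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not any vendor's implementation: the class name ReadDisturbTracker, the per-block (rather than per-cell) counting, and the tiny threshold are all hypothetical choices made to keep the example readable.

```python
from collections import defaultdict
from typing import DefaultDict


class ReadDisturbTracker:
    """Counts reads per block since its last erase and flags blocks to move."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.reads_since_erase: DefaultDict[int, int] = defaultdict(int)

    def record_read(self, block_id: int) -> bool:
        """Record one read; return True once the block reaches the threshold."""
        self.reads_since_erase[block_id] += 1
        return self.reads_since_erase[block_id] >= self.threshold

    def record_erase(self, block_id: int) -> None:
        """An erase (after the data has been moved) resets the read count."""
        self.reads_since_erase[block_id] = 0


tracker = ReadDisturbTracker(threshold=3)  # unrealistically low, for demonstration
for _ in range(3):
    needs_move = tracker.record_read(block_id=7)
print(needs_move)                # the third read reaches the threshold
tracker.record_erase(7)          # data relocated, block erased
print(tracker.record_read(7))    # the count starts over after the erase
```

When record_read returns True, a controller following this scheme would relocate the block's data and erase the block, which resets the counter well before neighboring cells are likely to be disturbed.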
Summary: Architecture Matters

When choosing and implementing an all-flash storage system, make sure the vendor has addressed the potentially fatal flaws that can plague flash storage. Here are some additional tips to help maximize the flash storage experience:

Align writes at appropriate boundaries to avoid fragmented I/O and unnecessary writes. On flash storage, fragmented I/O can significantly increase write amplification. Modern operating systems are much better about alignment than older ones, but it is worth monitoring this.
Make sure the selected system constantly tracks flash wear and relocates data to ensure uniform wear across flash pages.
Buy systems that actively reduce data. By performing deduplication and compression functions prior to writes, you actively minimize write operations, which extends the life of the storage and also saves on capacity.
Systems that implement DRAM caching can help minimize the need for reads, which in turn can help reduce the need for moving blocks to avoid read disturb issues. Besides the fact that DRAM is even faster than flash storage, being able to avoid read disturb improves performance and can extend the life of the array.
Even with other mitigating features, make sure the array enforces a per-block threshold on reads since the last erase cycle to prevent read disturb issues that may arise.

About the Author

Scott Lowe is co-founder of ActualTech Media and the Senior Editor of EnterpriseStorageGuide.com. Scott has been in the IT field for close to twenty years and spent ten of those years filling the CIO role for various organizations. Scott has written thousands of articles and blog postings and regularly contributes to such sites as TechRepublic, Wikibon, and virtualizationadmin.com.

About Tegile

Tegile Systems is pioneering a new generation of intelligent flash arrays that balance performance, capacity, features and price for virtual desktop and database applications. With Tegile's line of all-flash and hybrid storage arrays, the company is redefining the traditional approach to storage by providing a family of arrays that accelerate business critical enterprise applications and allow customers to significantly consolidate mixed workloads in virtualized environments.