NetApp FAS Hybrid Array Flash Efficiency
Silverton Consulting, Inc. StorInt Briefing
Introduction

Hybrid storage arrays (storage systems with both disk and flash capacity) have become commonplace over the last couple of years, largely because they promise the performance of NAND flash storage at the cost of disk. As a result, most major storage vendors offer hybrid storage today, including EMC, Hitachi, HP, IBM and NetApp.

NetApp has been offering hybrid arrays in its fabric-attached storage (FAS) solutions for more than five years now and has deployed sophisticated flash functionality to provide high-performance IO for both file and block storage. NetApp's hybrid arrays use either NAND flash residing in a storage controller or solid-state drive (SSD) storage within a storage aggregate/pool in conjunction with disk storage.

In contrast, most other storage vendors use only external SSD storage or flash storage modules in their hybrid array offerings. For these systems, the SSDs are mainly used as a separate storage tier and are typically located external to the storage controller.

NetApp FAS Hybrid Arrays

NetApp FAS hybrid storage systems use two different flash features: NetApp Flash Cache, which is NAND storage located within a storage controller that acts as an extension of dynamic random-access memory (DRAM) caching for controller read IO activity, and NetApp Flash Pool, which is SSD storage external to the storage controller that is dedicated to a specific aggregate and can act as both a read and write cache for aggregate IO activity.

Because Flash Cache and Flash Pool both act as extensions of the FAS storage system's memory or DRAM cache, they have an almost immediate impact on IO performance. For instance, as soon as a block is re-referenced, it can start benefiting from higher performance with Flash Cache or Flash Pool. Read data, once loaded into system memory, will be relocated to Flash Cache or Flash Pool as it is referenced less frequently than the data in system memory.
When read data is no longer read at all, it will be demoted out of NetApp's extended cache altogether. Data that is repeatedly overwritten can reside on Flash Pool SSD storage until the block overwrite activity has slowed, after which time the data will be de-staged to backend disk.

In contrast to NetApp's approach, other storage vendors' automatic storage tiering functionality analyzes IO activity over time, then optimizes IO performance by moving data across two or three tiers of storage. Such IO analysis can easily take hours or even a day or more, delaying any performance benefit.
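
The caching behavior described above can be illustrated with a toy model. The sketch below is not NetApp's implementation: the class name, the block counts and the least-recently-used (LRU) eviction policy are all assumptions chosen for illustration. It only shows the general idea of a flash tier that catches blocks aging out of DRAM, so that a re-referenced block benefits immediately, without waiting for a scheduled tiering analysis.

```python
from collections import OrderedDict


class TwoTierReadCache:
    """Toy model (hypothetical): a small DRAM LRU backed by a larger flash LRU."""

    def __init__(self, dram_blocks, flash_blocks):
        self.dram = OrderedDict()    # most recently used block is last
        self.flash = OrderedDict()   # blocks that aged out of DRAM
        self.dram_blocks = dram_blocks
        self.flash_blocks = flash_blocks

    def read(self, block):
        """Return the tier that served the read: 'dram', 'flash' or 'disk'."""
        if block in self.dram:
            self.dram.move_to_end(block)
            return "dram"
        if block in self.flash:
            del self.flash[block]    # hot again: move back into DRAM
            source = "flash"
        else:
            source = "disk"
        self._insert_dram(block)
        return source

    def _insert_dram(self, block):
        self.dram[block] = True
        if len(self.dram) > self.dram_blocks:
            victim, _ = self.dram.popitem(last=False)  # coldest DRAM block
            self.flash[victim] = True                  # relocate to flash
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)         # demote out of the cache


# Assumed sizes and an assumed read trace, for illustration only.
cache = TwoTierReadCache(dram_blocks=2, flash_blocks=4)
trace = ["a", "b", "c", "a", "d", "e", "f", "g", "h", "a"]
served = [cache.read(block) for block in trace]
print(served)
```

In this trace, block "a" is pushed out of DRAM by the third read and is served from flash on its very next reference. Block "b" is never read again and eventually falls out of the flash tier altogether, so a later read of it would go back to disk.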
NetApp Data ONTAP Optimizations for Flash

In addition, NetApp's Data ONTAP operating system offers a number of other innovations that lead to more effective flash storage use. For example, WAFL, NetApp's backend storage layout, uses a log-structured file system. This system never overwrites the same block of storage; instead, it gathers multiple blocks to be written out all at once in a single contiguous sequence. When this block sequence is written to SSD storage, the whole accumulation of blocks can be used to overwrite a full NAND page, reducing or eliminating flash write amplification, that is, the extra writes otherwise needed to preserve data in a page that is not being overwritten.

For both Flash Cache and Flash Pool, Data ONTAP applies flash performance acceleration only to random IO activity. Sequential IO activity typically performs faster when accessed directly from disk than it does when accessed on flash. Thus, Data ONTAP also saves flash capacity for IO activity that can benefit the most from its performance characteristics.

NetApp Hybrid Array Benefits

When compared with other vendors' auto-tiering solutions, NetApp hybrid arrays require less flash to achieve equivalent performance. Because NetApp Flash Cache and Flash Pool react so quickly, NetApp hybrid arrays can reuse the same flash blocks or NAND pages to host different frequently accessed blocks during the day. Other vendor systems use a flash location to host the same block of data until the next performance analysis is done.

As such, NetApp hybrid array features are like running a Formula One engine at 15K RPM versus running other vendors' engines at 6K RPM. While both engines have eight cylinders, the Formula One engine generates much more horsepower because its cylinders run at higher RPMs.

NetApp AutoSupport Data

The better flash efficiency of NetApp hybrid arrays isn't just a theory.
Indeed, NetApp FAS storage systems that are able to access an outside network have been sending monthly AutoSupport (ASUP) data to NetApp for as long as NetApp has been selling storage systems.¹ The vast majority of NetApp's FAS storage system install base reports ASUP data that includes log information, performance statistics and failure reports. Although ASUP data was originally intended to support FAS storage, it also supplies configuration information, such as the amount of Flash Cache capacity, Flash Pool SSD storage capacity and disk storage in a storage system, controller or aggregate.

¹ Some NetApp FAS storage customers don't allow outside network access for security reasons or don't have access to the Internet; as a result, these customers don't report ASUP data.

Silverton Consulting did not witness the execution of the NetApp ASUP database queries that generated the ASUP reports, and NetApp was not allowed to supply Silverton Consulting with copies of the actual reports. However, we did review the Flash Pool and Flash Cache ASUP report output data to verify that they were similar to comparable vendors' support data. Therefore, Silverton Consulting is confident that the ASUP data used in the Flash Pool and Flash Cache ASUP data analyses below is a true picture of NetApp's FAS install base at the time the reports were generated and that the calculations we generated are valid for the ASUP data available in the reports.

Flash Pool ASUP Data

Figure 1 shows a frequency diagram, or histogram, of Flash Pool aggregates' SSD capacity % for the three classes of subsystems (entry-level, mid-range and enterprise system aggregates), indicating the spread or variance in NetApp's install base of Flash Pool hybrid arrays. The ASUP data calculations shown in Figure 1 are for FAS hybrid arrays for all NetApp Flash Pool aggregates reporting in worldwide as of June 2014. Other calculations on the ASUP data reveal that the median Flash Pool % of SSD capacity is 2.0% of total aggregate storage capacity. The median % of SSD capacity for Flash Pool is slightly higher for enterprise-class systems at 2.4% and slightly lower for entry-level storage at 1.7%, with mid-range systems at 1.9%.

Figure 1 Histogram of Flash Pool aggregates' % SSD capacity by system type
As shown in Figure 1, the peak number of aggregates for entry-level and mid-range systems is somewhere between 0.5% and 1.0% SSD capacity, whereas the peak number of aggregates for enterprise Flash Pool storage appears to be somewhere between 1.0% and 2.5% SSD capacity. The relatively high proportion of enterprise-class storage systems with more than 5.5% SSD capacity is likely due to a relatively smaller proportion of Flash Pool aggregates for enterprise systems in the ASUP reports.

Flash Cache ASUP Data

Figure 2 shows a histogram of % of NAND capacity for the two classes of Flash Cache controllers (mid-range and enterprise), indicating the spread of NAND capacity for Flash Cache. Entry-level systems do not support Flash Cache. The ASUP calculations displayed in Figure 2 are for Flash Cache storage controllers for all NetApp FAS storage systems reporting in worldwide as of October 2014. Other calculations from the ASUP data show that the median % of flash capacity is 0.7% of total controller storage capacity for all Flash Cache controllers. The median % of Flash Cache capacity for mid-range systems is slightly higher at 0.8% and is slightly lower for enterprise controllers at 0.6%.

Figure 2 Histogram of Flash Cache controllers' % NAND capacity by system type

As shown in Figure 2, the peak number of mid-range Flash Cache controllers falls between 0.5% and 1.0% NAND capacity, and the peak for enterprise controllers is below 0.5%. The proportion of controllers with NAND capacity over 5.5% is very small for both classes of systems, indicating the relatively high number of NetApp FAS storage controllers that use Flash Cache.
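
The median and histogram calculations described in the two ASUP sections can be sketched as follows. This is a minimal illustration, not the queries NetApp ran: the capacity figures are invented placeholders rather than ASUP data, the function names are hypothetical, and the 0.5% bin width and the 5.5% overflow bin are assumptions inferred from the ranges quoted in the text.

```python
from collections import Counter
from statistics import median

# Hypothetical (flash_gb, total_gb) pairs, one per aggregate or controller.
# These are placeholder values for illustration, NOT ASUP data.
samples = [
    (400, 50_000),
    (800, 40_000),
    (1_200, 60_000),
    (2_000, 80_000),
    (3_000, 50_000),
]


def flash_percent(flash_gb, total_gb):
    """Flash capacity as a percentage of total storage capacity."""
    return 100.0 * flash_gb / total_gb


def histogram(percentages, bin_width=0.5, overflow=5.5):
    """Count values per bin; anything above `overflow` goes in one bin."""
    counts = Counter()
    for pct in percentages:
        if pct > overflow:
            label = f">{overflow}%"
        else:
            low = int(pct // bin_width) * bin_width
            label = f"{low:.1f}-{low + bin_width:.1f}%"
        counts[label] += 1
    return counts


percentages = [flash_percent(flash, total) for flash, total in samples]
median_pct = median(percentages)
bins = histogram(percentages)

print(f"median flash capacity: {median_pct:.1f}%")
for label, count in sorted(bins.items()):
    print(f"{label}: {count}")
```

The median is used here, as in the briefing, because a handful of systems with unusually high flash percentages would pull a mean upward without saying much about the typical configuration.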
Competitive Hybrid Storage Flash Requirements

The other hybrid flash storage systems on the market use a variety of algorithms and capabilities to take advantage of flash performance for disk storage. However, most of these solutions recommend the use of relatively more flash storage than NetApp's hybrid arrays. For example, other enterprise storage systems typically recommend that customers configure between 3% and 5% of system capacity as flash storage. Some of this inefficiently used, extra flash capacity may be due to the imprecision of the systems' auto-tiering solutions, and some may be due to the systems' need to fix large amounts of data in SSDs for a longer period of time.

With newer scale-out hybrid storage systems that use direct-attached storage (DAS) and SSDs/PCIe flash for shared storage, the inefficiencies are even more significant. These vendors' best-practice recommendations suggest their appliance configurations use roughly 9% of storage capacity in flash. Even the newer software-defined storage systems seem to require more flash, and some of these vendors even suggest using 10% of system capacity in flash storage for IO performance.

Flash at 3%-5% vs. 0.7%-2% of System Capacity

Buying 0.7% to 2% of storage capacity in flash versus 3% to 5% at the enterprise-class level should result in significant cost savings. Many enterprise-class storage systems have 500TB or more of storage, which could mean the difference between 3.5TB of flash (0.7% of 500TB) and 25TB of flash storage (5% of 500TB). With NetApp hybrid arrays, similar IO performance gains can be achieved by using flash more effectively, resulting in reduced capital expenditure.

NetApp Automated Workload Analyzer Software

Given NetApp's continuing focus on storage efficiency, it's not surprising that it has recently introduced an Automated Workload Analyzer (AWA) tool that examines FAS IO workloads to estimate how much flash capacity should be used to improve IO performance.
By running AWA, customers can predict the IO performance gains from various Flash Pool and Flash Cache configurations before they purchase and deploy hybrid storage.

NetApp FAS Hybrid Array Performance

Published NetApp benchmark reports and internal NetApp data both show that Flash Cache and Flash Pool can significantly increase IOPS per drive and reduce IO access latencies relative to similarly configured non-hybrid FAS storage systems. In most cases, the public benchmarks and internal reports show that customers can also increase storage density per footprint by using NetApp hybrid arrays with higher-density, slower-speed drives, without sacrificing IOPS performance or access times.
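
The capacity comparison in the "Flash at 3%-5% vs. 0.7%-2% of System Capacity" section can be reproduced with a few lines of arithmetic. The sketch below uses the 500TB system size and the flash percentages quoted in the text. The function name is hypothetical, and no prices are assumed, since the briefing gives none.

```python
def flash_capacity_tb(system_tb, flash_percent):
    """Flash capacity in TB for a given share of total system capacity."""
    return system_tb * flash_percent / 100.0


system_tb = 500  # enterprise-class system size used in the text

# Percentages quoted in the text: 0.7% and 2% (NetApp medians),
# 3% and 5% (other enterprise vendors' recommendations).
sizes = {pct: flash_capacity_tb(system_tb, pct) for pct in (0.7, 2.0, 3.0, 5.0)}

for pct, tb in sizes.items():
    print(f"{pct}% of {system_tb}TB = {tb:.1f}TB of flash")

# Ratio between the two extremes the text compares (25TB vs. 3.5TB).
ratio = sizes[5.0] / sizes[0.7]
print(f"flash capacity ratio: {ratio:.1f}x")
```

At the extremes the text compares, the higher recommendation calls for roughly seven times as much flash capacity for the same 500TB system.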
Summary

Using NetApp-supplied ASUP information, Silverton Consulting calculated the median % of flash capacity, or flash efficiency, for both NetApp Flash Cache controllers and Flash Pool aggregates, which come out to 0.7% and 2.0%, respectively. Although some variation exists in these values for the various classes of FAS storage systems, the flash efficiency percentages revealed in our analysis seem to represent the medians for the two NetApp hybrid array features.

While similar install base information is not available for other vendors, the vendors themselves recommend the use of relatively more flash/SSD capacity for their hybrid storage systems. For enterprise-class storage, they recommend the use of 3% to 5% of system capacity in flash/SSD storage, which is a much larger percentage than what NetApp uses. The logical conclusion is that NetApp hybrid arrays are more efficient in their use of flash capacity than other vendors' storage and system solutions. Moreover, better flash efficiency should result in substantial cost savings when purchasing comparable storage capacities.

Many hybrid storage solutions are on the market today, but the results presented herein suggest that NetApp FAS systems provide the highest levels of hybrid array flash efficiency. Other vendors' storage systems may be able to match NetApp's hybrid array IO performance acceleration, but only if more expensive flash capacity is configured for their hybrid storage systems.

Silverton Consulting, Inc., is a U.S.-based Storage, Strategy & Systems consulting firm offering products and services to the data storage community.