All-Flash Array Performance Testing Framework




TECHNOLOGY ASSESSMENT

All-Flash Array Performance Testing Framework

Dan Iacono

Global Headquarters: 5 Speen Street, Framingham, MA 01701 USA | P.508.872.8200 | F.508.935.4015 | www.idc.com

IDC OPINION

The rising tide of digital data and the torrent of applications moving to the cloud, virtualization, and big data and analytics have enterprises struggling to find the necessary performance with their traditional hard disk drive (HDD) infrastructure. According to research from IDC, the use of solid state storage (SSS) in conjunction with solid state drives (SSDs) will play an important role in transforming performance as well as use cases for enterprise application data. IDC's all-SSS array (or all-flash array; AFA for short) market forecast predicts $1.2 billion in revenue by 2015 with a 58.5% CAGR.

AFAs were first known for their extraordinary performance; however, they are increasingly known for their consistent performance. AFAs exhibit completely different behaviors from traditional HDD-only storage arrays, which requires a different approach to testing. The underlying flash media characteristics, combined with hardware and software developments, have created a plethora of offerings and differentiation. Highlighted below are a few areas where AFAs differ from traditional HDD storage systems:

- Preconditioning. Preconditioning is an essential first step in testing any flash media. Performance will vary significantly between fresh out of the box (FOB) and after the first write to every flash cell.

- Asymmetry attributes. Flash has unique attributes when compared with traditional HDD, such as the need to erase before rewrite and cell locks on writes, which can inhibit read operations. Because of these differences in flash behavior, a testing plan designed for flash, not HDD, is required.

- Consistent performance. Enterprise users require performance, but not at the expense of variability. Therefore the testing framework is designed to characterize AFA behavior under several different workloads to ensure production results will be consistent and AFA limits are understood.

- Test lab resources for flash performance. Flash is unequivocally faster than HDD and provides an order-of-magnitude performance improvement. Therefore AFA testing equipment should be able to generate enough I/O and data throughput to thoroughly saturate the AFA.

Filing Information: June 2013, IDC #241856, Volume: 1 (Storage Systems: Technology Assessment)

TABLE OF CONTENTS

In This Study
Situation Overview
    The All-Flash Array Market Perspective
    Why Testing Flash Is Different
    Solid State and Flash Storage Primer
    Solid State Standards Associations
Testing Framework
    Testing Overview and Scope
    Testing Tools Criteria
    Day 1: Preconditioning the AFA
    Day 2: Performance Testing and the Time Dilemma
    Day 3: Hazard Testing
    Day 4: Functional Testing
    Day 5: Optional Vendor Specific
Future Outlook
Essential Guidance
Learn More
    Related Research

LIST OF FIGURES

1  Example of Ideal AFA Performance Graph
2  Example of Real-World AFA Performance Graph
3  Example of Bad AFA Performance Graph

IN THIS STUDY

The objective of this study is to provide a framework for testing an all solid state storage system, commonly known as the all-flash array (AFA). IDC performed substantial research to understand the prevailing use cases, testing methodologies, and tools; worked with independent labs; conducted vendor interviews; and reexamined existing IDC research. That research produced an extremely large and time-consuming test plan for AFAs. In a perfect world, an end user or IT organization would run every test to gain the maximum amount of knowledge. However, real-world time and budget constraints usually require that a smaller subset of tests be performed. Therefore, we investigated what a reasonable time investment for testing an all-flash storage array looks like. The acceptable time period may vary by organization depending on requirements; however, the goal of this document is to provide a framework that requires about a week of end-user effort. We reevaluated which tests are important versus "nice to have" so that testing can be completed within a week. Throughout the document we discuss the decision points, the rationale, and where (in IDC's opinion) testing cycles can best be spent given limited resources. We would like to stress that this document is designed to be a flexible framework. Ultimately, the decision of what is and is not important to test will be made within each individual organization.

SITUATION OVERVIEW

The All-Flash Array Market Perspective

The rising tide of digital data and the torrent of applications moving to the cloud, virtualization, and big data and analytics have enterprises struggling to find the necessary performance with their traditional HDD infrastructure. The industry trend is toward consolidation of applications onto shared infrastructure, creating even more mixed workloads in the storage environment. The workloads of the future are predicted to overflow even large traditional controller caches and stress spinning disks to their limits. According to research from IDC, the use of solid state storage in conjunction with solid state drives will play an important role in transforming performance as well as use cases for enterprise application data. IDC's all-SSS array market forecast predicts $1.2 billion in revenue by 2015.

In an effort to optimize storage solutions for performance, IT organizations are placing their most frequently accessed application data, or hot data, on high-performing solid state storage and less frequently accessed data, or cooler data, on the most capacity-efficient HDDs. Organizations leveraging the right balance of solid state storage are able to reduce the average physical footprint while delivering more transactions (IOPS) than a similarly configured environment built exclusively on traditional storage media. Additionally, pricing declines in the underlying SSD media are lowering system dollar-per-gigabyte prices, making this high-performance storage technology more affordable and appealing to a broader storage market.

Although the historical metric of dollars per gigabyte ($/GB) remains an important factor in the purchasing process, price/performance metrics are beginning to carry more weight, such as:

- Dollar per IOP ($/IOP)
- Dollar per workload ($/workload)
- Dollar per application transaction processed ($/trans)
- Dollar per desktop ($/desktop) for VDI

Why Testing Flash Is Different

AFAs exhibit completely different behavior from traditional HDD-exclusive storage arrays, which requires a different approach to testing. With traditional HDD-exclusive arrays, the goal of testing was to understand how much load could be absorbed until the array cache or flash tier was saturated by incoming I/O. The performance of a traditional or hybrid storage array would diminish dramatically when the array cache was saturated because I/O would go directly to disk rather than cache (usually a form of RAM). Another key concept to understand regarding traditional storage systems (which include HDD and potentially SSD) is that they produce the same testing results from first power-on until decommissioning.

All solid state and flash storage arrays do not have the same performance characteristics from first power-on to decommissioning. Flash media in general has various states within its life cycle. One notable state is when the flash device or system is FOB, without any program erase (PE) cycles performed. The difference in flash performance from FOB to a normal operating steady state can be as much as 80%. This phenomenon of flash media is well documented, with end-user horror stories reporting extraordinary performance numbers during initial implementation and substantially less in continued production. An essential extra step in the testing process, called "preconditioning," exercises the entire flash capacity of the device so that it matches expected steady-state behavior during production operations. Preconditioning is discussed in detail later in this document.

Essential differences to consider between AFAs and traditional HDD storage systems:

- Preconditioning. Preconditioning is an essential first step in testing any flash media. Performance will vary significantly between FOB and after the first write to every flash cell. In comparison, traditional HDD has the same performance characteristics from FOB to decommissioning. Without spending the time and effort to precondition the AFA, the performance testing results will not reflect real-world results.

- Asymmetry attributes. Flash has unique attributes (described in more detail in the Solid State and Flash Storage Primer section) when compared with traditional HDD, such as the need to erase before rewrite and cell locks on writes. Traditional HDD can overwrite data in place and does not lock data on writes. The asymmetrical behavior of an AFA during testing will therefore differ from that of traditional HDD storage, requiring a test plan specific to the AFA's characteristics.

- Consumable. Flash is a consumable resource. After a defined amount of usage, rated as a number of PE cycles, flash becomes read-only and, eventually, inoperable. How the AFA manages PE cycles on the underlying media determines the rate of flash media consumption, flash durability, and AFA performance, and influences AFA system cost. Traditional HDD also has specifications, limits, and a resulting warranty; however, an HDD can survive for many years beyond the warranted period as long as it operates within specifications. Because of the consumable nature of flash media, AFAs try to avoid writes (where possible) and write amplification while wear leveling the media, which results in different storage array behaviors (exacerbated under load) during performance testing when compared with traditional HDD. Write amplification can have a multiplying effect on AFA testing: one host write can result in multiple back-end writes. Testing of an AFA should measure and account for write amplification.

- Load generation. It is well understood that flash media is many times faster than traditional HDD, and as a result AFAs regularly deliver IOPS in the hundreds of thousands. The increased performance of an AFA requires testing tools and equipment that can meet and exceed the AFA's performance limits.

- Data efficiency. Technologies such as deduplication and compression are being implemented in AFAs for primary storage, whereas with traditional HDD those technologies have mostly been relegated to backup solutions. Therefore AFA testing equipment must be able to perform at flash speed and be aware of data-efficiency technologies. Without data-efficiency-aware tools, the AFA will not be properly stressed, resulting in skewed performance testing results.

- Mixed workloads. Varied workloads are commonplace in storage environments; however, factors such as asymmetry and the consumable characteristics of flash produce different results when compared with traditional HDD. Testing different mixtures of workloads is imperative with an AFA because there is not necessarily a linear relationship between mixed workload results. Moving from 100% reads to 90% reads with 10% writes may not yield 90% of the read IOPS measured in the 100% read test.

Solid State and Flash Storage Primer

Storage arrays have traditionally been built with hard disk drives as the underlying media. Increased demand for performance-centric solutions and declining flash media prices are the underpinnings of the all-flash array. Solid state and flash storage technologies operate fundamentally differently from traditional HDD. It is important to understand the defining characteristics of an SSD and how the SSD differs from the memory components used to construct it. An SSD is built using semiconductor memory to store data, typically either NAND or DRAM. It contains more advanced features than raw NAND memory, which distinguish it in terms of performance, reliability, and interface. These advanced features make SSDs a compelling storage solution depending on your requirements.

Within IDC's taxonomy, we define an SSD as a semiconductor-based storage device that behaves as a virtual HDD and appears to the host device as an HDD. SSDs are constructed of either nonvolatile semiconductor memory or volatile semiconductor RAM with a built-in battery backup system. SSDs are classified as self-contained devices that consist of an interface to connect to the host device, an advanced device controller to provide increased performance and reliability (e.g., life-cycle management, built-in wear leveling, and error correction codes [ECCs]), and semiconductor memory components in a single device.

IDC also defines an all solid state or all-flash array as a storage configuration without any spinning media or traditional HDD. The all-flash storage array is not a new concept and has been on the market for over a decade; however, until recently, the all-flash array was relegated to deployments where maximum performance was required without consideration for cost. The relatively recent availability of multilevel cell (MLC) flash has driven AFA costs lower, while new array controller technologies, such as data efficiency (deduplication, compression, and thin provisioning), have helped narrow the effective price per logical gigabyte.

Commodity SSD or Custom Flash

There are two fundamental architectures within the AFA market regarding the underlying flash media. The first utilizes off-the-shelf commodity SSDs for flash storage, an approach taken by companies such as Pure Storage, WHIPTAIL, SolidFire, HP 3PAR, NetApp (EF Series), and EMC (XtremIO). The second uses custom flash modules designed to the company's specification and manufactured by a third party, an approach taken by AFA companies such as Violin Memory, IBM (FlashSystem), HDS, and Skyera. When utilizing custom flash modules, the AFA company must either design the flash controller itself or contract out the development of the controller (usually to the flash module supplier). The AFA market is nascent and it isn't clear which architecture will prevail; however, it's important to discuss with the AFA vendor its reasoning for choosing one architecture over the other. For the purposes of flash performance testing, we don't have a preference for either architecture. We want to understand how the entire AFA performs as a system serving data to applications, not as a single component.

Flash Behavior Attributes

- PE cycles. Flash is a consumable resource with a wear rating, similar to tread life on a tire or pages printed with an ink cartridge. Each time a flash cell is erased and programmed (written) again, the cell incurs a slight amount of damage known as wear. A single iteration of erasing and writing a flash cell is known as a program erase (PE) cycle. Flash media is rated by how many PE cycles it can sustain before the flash cell becomes inoperable, first becoming unwritable and ultimately unreadable.

- Asynchronous behavior. Flash does not have any moving mechanical parts, so the concept of "seek time" is eliminated. Read operations are therefore "penalty free" and can be done essentially instantaneously (sequential or random) unless the page is locked at that time (explained in the I/O Operation: Read Versus Write section). Writes can occur to a single page quickly if the page is free and available to be written. If the page is marked as available to write but has not been reset since the previous write, an erase operation must occur first. This erase-before-write requirement creates a significant performance penalty for writes relative to reads, making write operations significantly slower. Unlike HDD, NAND flash must erase in blocks, a specified group of pages (usually 64), known as erase blocks. The maintenance of erase blocks and the associated data movement is known as garbage collection (see the Garbage Collection bullet in this section).

- Common types of flash media:
  - Single-level cell (SLC). SLC is a flash nonvolatile memory cell that stores one physical bit per cell area and has a PE cycle rating of 100,000.
  - Multilevel cell (MLC). MLC electrically stores and accesses two bits of data in the same physical area and has a PE cycle rating of 3,000. MLC is also referred to as cMLC, for consumer or commodity MLC.
  - Enterprise or endurance multilevel cell (eMLC). eMLC is a derivative of MLC and is harvested from the same wafer as MLC; however, eMLC is programmed differently and must pass higher quality standards, yielding a PE cycle rating of up to 30,000.

- Garbage collection. Flash requires an erase operation before new data can be rewritten to a flash page, which impacts performance. When data is invalidated on the flash media, it isn't automatically erased, primarily for performance and media-optimization reasons. To minimize the impact of erase operations, systems periodically perform a process called "garbage collection" to proactively maintain and optimize the flash media. The implementation of garbage collection varies significantly among AFA storage vendors, ranging from garbage collection within flash components (flash modules and SSDs) to system-level garbage collection. Some vendors have implemented their own garbage collection schemes, while others rely on the SSD manufacturer's.

- Spare allocation. Each SSD or custom flash module has a set amount of capacity reserved for administrative activities such as reallocation of unrecoverable bits and garbage collection. The spare allocation space is not available to the end user and should not be counted as part of overall raw capacity. Each vendor may have a preferred spare allocation level, and it should not be changed during testing. The resulting system (if purchased) should also contain the same spare allocation level: it is possible to test with a higher spare allocation level to show better performance results and then quote a system with a lower spare allocation for aggressive pricing.

- Wear leveling. As previously described, flash cells have a specified wear rating, measured in PE cycles, before failing. If certain flash cells are used frequently (compared with the overall SSD), those cells may wear out significantly sooner than the rest of the SSD. Therefore flash devices use a technique called wear leveling to balance PE operations among flash cells and ensure even physical cell wear, which is accomplished either within the SSD itself or at the system level.

- Flash translation layer (FTL). The flash translation layer is software that provides transparent compatibility between traditional sector-based file systems and flash media. The FTL presents a logical block interface so that operating systems can access flash media the same way as traditional HDD. For example, traditional file systems have no concept of erase-before-write as required by flash, and the FTL provides that functionality transparently. The FTL also abstracts the logical location of data on the disk from its physical location within pages of flash cells. This is important for effective wear leveling and is an area where SSD manufacturers continually innovate.

- Write amplification. An intricacy of flash media is that erase blocks are usually larger than the chunks of data written, so writing may require moving the still-valid portions of an erase block to free up flash cell resources. The process of moving valid data to other erase blocks is called write amplification because the total number of writes performed by the flash media is higher than the original host request. For example, imagine the host writes 256KB into a 1MB block that contains valid data. The SSD will need to read the 1MB block, overlay the 256KB of modified data, and then write the full 1MB to a free block somewhere else on the physical media. This results in a 4:1 write amplification because the SSD writes 1MB instead of 256KB.
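As a quick illustration of the 4:1 example above, the sketch below (a hypothetical helper, not part of the original study) computes the write amplification factor for a read-modify-write of a host write into a larger erase block:

```python
def write_amplification(host_write_bytes: int, erase_block_bytes: int) -> float:
    """Write amplification for a single read-modify-write cycle:
    the flash must rewrite the whole erase block even though the
    host only changed part of it."""
    if host_write_bytes >= erase_block_bytes:
        return 1.0  # a full-block write needs no extra data movement
    return erase_block_bytes / host_write_bytes

# The 256KB-into-1MB example from the text: 1MB / 256KB = 4.0 (a 4:1 ratio)
print(write_amplification(256 * 1024, 1024 * 1024))
```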

Solid State Standards Associations

JEDEC Solid State Technology Association

The JEDEC Solid State Technology Association, formerly known as the Joint Electron Devices Engineering Council, is responsible for fostering and maintaining open standards for the microelectronics industry. Throughout IDC's research for this document, two JEDEC documents were consistently referenced with respect to component-level solid state storage testing:

- JESD218. Solid State Drive (SSD) Requirements and Endurance Test Method
- JESD219. Solid State Drive (SSD) Endurance Workloads

SNIA

The Storage Networking Industry Association (SNIA) is focused on developing, maintaining, educating on, and promoting data storage related activities and standards. One of the SNIA technical projects, the "Solid State Storage (SSS) Performance Test Specification (PTS) Enterprise," is chartered to create a common methodology for testing and comparing SSS for vendors and end users. For more information, go to www.snia.org/sites/default/files/sss_pts_enterprise_v1.0.pdf.

Other Associations of Interest

Both the Storage Performance Council (SPC) and the Transaction Processing Performance Council (TPC) provide a variety of benchmarks and testing material; however, none of their tests focuses specifically on solid state or flash storage.

Testing Framework

IDC has described in previous sections how flash differs at the underlying media level; however, the focus for the balance of the document is the all solid state or flash array as a system, not an individual component.

Many AFAs have some form of data efficiency technologies built into the system, such as deduplication (dedup) and compression. Using traditional synthetic test tools that are not aware of data efficiency will adversely affect results. For example, if the testing tool or database load contains only zeros, then dedup-aware AFAs will absorb the workload handily and the SSD drive lights may not even blink (showing active work). In the all-zeros case, the all-flash storage array will recognize the pattern, update its metadata, and never write the data to the underlying media. Therefore test cases and tools must be aware of data efficiency and deduplication to ensure representative real-world results.

AFAs do not worry about "seek time" in the traditional sense since there are no moving mechanical parts within the underlying media. However, the FTL and higher-level array software create an abstraction between physical and logical addressing. These layers may take varying amounts of time to respond, depending on factors like fullness and operation type. Therefore, it's important that all logical block addresses (LBAs) be repeatedly written and read to fully characterize performance during extended steady-state operations.

Testing Overview and Scope

Testing an AFA is a great way to gain a deeper understanding of new technology and gather data that could provide substantial benefits to your organization. Both you and the vendor are making an investment in time and resources, and we want to ensure both parties receive maximum value. The scope of the test framework discussed in this document should take a work week to complete. We outline roughly which activities should be performed during the five days. We understand that testing sometimes takes longer than expected, so it's okay if a test slips into the next day; there's enough slack allocated to make up the time later in the week. In detail:

- Day 1: Setup and preconditioning
- Day 2: Performance
- Day 3: Hazard
- Day 4: Functional
- Day 5: Site specific

Testing Tools Criteria

There are many testing tools that can be used to accomplish the testing presented in this document. IDC doesn't have a preference and doesn't want to be prescriptive about which tool to use; however, we outline some criteria for selecting a testing tool:

- Generate workloads

- Capture results for analysis:
  - Throughput
  - IOPS
  - Latency
- Tune to different mixtures:
  - Read/write
  - Random/sequential
  - Unique/random data
- Tune to specific settings:
  - I/O transfer block size
  - LBA dispersal
  - Outstanding I/Os (queue depth)
  - Repeatable randomness (user-specified seed)
  - Increment number of worker threads

Examples of Free Testing Tools Available

- FIO. A powerful open source Linux/Unix benchmarking tool that is extremely flexible and writes random data (100% non-dedup friendly) by default.
- IOmeter 2010. Open sourced by Intel in 2001, with a few releases since. More importantly, the 2010 release added options for pseudorandom and fully random data to account for deduplicating target devices.
- VDBench. The latest beta version has support for configurable dedup ratios.
- Btest. This open source Linux-based tool has controllable levels of block sizes, thread counts, read/write mix, and data deduplication ratios and is oriented toward generating high levels of I/O load in a repeatable fashion.

Examples of Commercial Testing Tools Available

- Swift Test
- JDSU Medusa Labs Test Tools
- Calypso RTP (SNIA PTS)
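To make the criteria above concrete, here is a minimal, hypothetical example of driving one of the free tools listed (FIO) from Python. The device path, run time, and tuning values are illustrative assumptions, not recommendations from this study, and the flag names assume a reasonably current fio build:

```python
import subprocess

# Hypothetical example: one 8KB, 65% read / 35% write, fully random test
# against a raw LUN, with a fixed random seed for repeatability.
fio_cmd = [
    "fio",
    "--name=afa_65r35w_8k",
    "--filename=/dev/mapper/afa_lun0",  # assumed multipath device
    "--ioengine=libaio", "--direct=1",   # bypass the host page cache
    "--rw=randrw", "--rwmixread=65",     # 65% reads / 35% writes, random
    "--bs=8k",                           # database-style block size
    "--iodepth=32", "--numjobs=8",       # outstanding I/Os and worker threads
    "--randseed=1234",                   # repeatable randomness
    "--time_based", "--runtime=3600",    # one-hour run, per the framework
    "--group_reporting",
]
subprocess.run(fio_cmd, check=True)
```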

Testing Equipment and Configuration

AFAs are known for their high performance, and when testing an AFA, the testing equipment must be able to handle that level of performance. If your testing equipment can only generate 50,000 IOPS of load, the AFAs will likely be taxed minimally and meaningful comparison will be difficult. IDC recommends testing equipment that can generate a minimum of 500,000 IOPS, and more may be required depending on the AFA configuration being tested. We don't want to be prescriptive about how many servers, CPUs/cores, PCIe cards, and so forth should be used; however, we want to ensure enough load can be generated to test the boundaries of the AFA's performance. Along with having sufficient performance capability, here are some testing best practices:

- Protocol. If your infrastructure has standardized on Fibre Channel (FC), then test with FC and, if time permits, test other protocols such as iSCSI or InfiniBand if they are appropriate for your infrastructure road map.

- SAN or direct connect. If the plan is to connect the AFA to a SAN in production, then test with a SAN switch in the I/O path with the same production settings (including line speed).

- High availability. Most environments operate in a highly available configuration to avoid any single point of failure. Testing should reflect the same established standards as your production environment. This includes connecting all AFA ports to a SAN switch, mapping LUNs over all paths, and utilizing multiple controllers.

- Queue depths. Consult with your vendor on recommended settings. AFAs often prescribe higher-than-typical queue depths to get more I/Os "in flight" and help produce more load on the array. Compare queue depth settings among tested vendors: excessively high queue depths may show high I/O rates, yet also high latency.

- Multipath. Test with vendor-recommended multipath settings and use the multipath software you expect to use in your production environment. Performance results can vary greatly based on multipath settings, as can path failover times.

- Features "on." AFA vendors have the ability to enable and disable different data services features, which may have a substantial impact on AFA performance results. IDC recommends that the choice to enable or disable data services features be consistent across all tests. For example, don't disable data efficiency technologies during testing and then include potential space reduction in the resulting ROI calculations.

Marketing or "Hero" Performance Results

Many vendors have made extraordinary claims regarding their AFA performance. Usually, these claims amount to hundreds of thousands (or millions!) of IOPS with 100% 4KB reads. What usually isn't published is the entire testing configuration, cost, load generator, and how performance was measured (i.e., from the server or the AFA).

There is a use case for 100% 4KB read scenarios; however, we believe the majority of real-world use cases will contain a mixture of reads and writes with varying block sizes depending on the application. Since flash media is a semiconductor product without any mechanical moving parts, it might seem that performance could simply be extrapolated or modeled from the "hero" or marketing performance numbers. For example, if the test were changed to 90/10 (reads/writes), one might expect some fixed portion of resources to be allocated to servicing writes, decrementing read performance in an orderly, predictable way. This would be similar to CPU testing, where the performance plot looks like a step function and performance can easily be modeled. This is not the case for AFAs, because their varied hardware and software implementations and architectures behave very differently under different testing workloads.

Day 1: Preconditioning the AFA

We previously mentioned that flash performance can vary significantly from FOB to normal operations (known as steady state). The first time flash cells are written is a unique situation: no data must be erased before the write completes, creating a one-time performance spike. This one-time spike can significantly skew performance results (both for a single AFA and when comparing an array that has been preconditioned with one that has not), especially if insufficient time has been set aside for thorough testing. More importantly, it will not be an accurate representation of real-world, long-term performance. To eliminate the one-time performance spike, every flash cell needs to be written at least once. The process of writing to each flash cell to prepare an AFA for testing is called preconditioning.

Writing to every flash cell in an AFA can be time consuming, especially if the AFA has a large amount of capacity. SNIA and a consortium of participants (primarily vendors) spent a great deal of time and effort debating and finally agreeing on the proper way to precondition an AFA. The SSS PTS describes three distinct phases: FOB, transitioning, and steady state. For our testing purposes, we need to be in the steady state range to ensure testing results will be consistent with real-world results. Our research revealed that the best way to precondition is to create a significant workload of 4KB unique (non-deduplicatable, incompressible) writes and let the test run for 24 hours (more or less, depending on your testing equipment throughput). As a general rule of thumb, the amount of data written to the AFA during preconditioning should be 2-3x the raw capacity of the AFA, which ensures each cell has been written to at least once.

Allocating 24 hours for preconditioning is general guidance; the duration can vary depending on how much data can be written consistently. If the AFA has 20TB of raw capacity, then to precondition the AFA within 24 hours the test equipment should be able to write 0.71GBps. In detail:

- 60TB = 3 x 20TB (three times raw capacity)
- 61,440GB = 60 x 1,024 (convert TB to GB)
- 86,400 sec/day = 60 x 60 x 24 (seconds per 24 hours)
- 0.71GBps = 61,440 / 86,400
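The same arithmetic can be generalized to any array size or test-rig throughput. The short sketch below is an illustrative helper (not part of the IDC framework itself) that computes either the write rate needed to finish preconditioning in a target window or the time a given rig will need:

```python
def precondition_rate_gbps(raw_capacity_tb: float,
                           passes: float = 3.0,
                           window_hours: float = 24.0) -> float:
    """Sustained write rate (GB/s) needed to write `passes` x raw capacity
    within `window_hours` (e.g. 3 x 20TB in 24h -> ~0.71GB/s)."""
    total_gb = raw_capacity_tb * passes * 1024           # TB -> GB
    return total_gb / (window_hours * 3600)               # seconds in window

def precondition_hours(raw_capacity_tb: float,
                       rig_gbps: float,
                       passes: float = 3.0) -> float:
    """Hours a test rig sustaining `rig_gbps` needs for the same job."""
    total_gb = raw_capacity_tb * passes * 1024
    return total_gb / rig_gbps / 3600

print(round(precondition_rate_gbps(20), 2))   # 0.71 for the 20TB example
print(round(precondition_hours(20, 0.5), 1))  # ~34.1h if the rig tops out at 0.5GB/s
```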

Therefore, the capacity of the AFA and the rate at which data can be written to it will determine the duration of preconditioning. As a best practice, IDC recommends discovering how much data the test equipment can write to the AFA and calculating the duration required for preconditioning from that. We would note that if, during the second half of preconditioning, performance noticeably drops and then remains consistent, it's possible preconditioning has completed. IDC recommends continuing preconditioning for the calculated duration even if the total data written is less than 3x the AFA's raw capacity; however, the data written should be greater than 2x the AFA's capacity. As a best practice, IDC recommends allocating 100% of the AFA's physical capacity and ensuring that at least 200% of raw capacity is written, touching each LBA at least once during the preconditioning phase.

In testing terminology, a testing "run" is short for runtime, the period while the test is executing. Determining the runtime for flash testing isn't as straightforward as it may seem. In every flash run there is a period of higher-than-normal performance followed by a lower, more consistent period of performance called steady state, especially if time has elapsed between tests with no load. The reason for the disparity is that an idle AFA has time to perform maintenance tasks, which can be mitigated by allowing no idle time between tests. Think about your production environment: if the array will be under constant load in production, it will not have downtime to perform internal maintenance tasks. The test environment should mimic this constant pressure on the array as closely as possible.

The steady state should show a variance of no more than 5-10% from the average during the steady state period. The goal of the run duration is to allow enough elapsed time during the test to capture the steady state. As a best practice, a minimum of 25% of the testing run should be within steady state. For our purposes, the default run duration for a test should be 1 hour. If the minimum steady state duration can't be achieved during a run, the test should be invalidated and either the run duration extended or additional I/O added (e.g., servers, worker threads, queue depth). There is one exception where the run duration should be fixed and steady state may never be achieved: hazard testing, where the goal is to explicitly cause chaos and measure how the array reacts.

Note: For best results, delete LUNs equal to 10% of the AFA capacity and begin generating testing load immediately on the remaining 90% of AFA capacity. Do not allow significant periods of time to elapse between deleting the LUNs and generating load, as this would allow the AFA to perform maintenance while idle and would not represent long-term performance under load.
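The 5-10% variance and 25%-of-run criteria above lend themselves to a simple automated check. The sketch below is an illustrative (hypothetical) post-processing helper that takes per-interval IOPS samples from a run and reports whether the tail of the run qualifies as steady state under those rules:

```python
def is_steady_state(iops_samples, tail_fraction=0.25, max_variance=0.10):
    """Treat the last `tail_fraction` of the samples as the candidate
    steady-state window and require every sample in it to stay within
    `max_variance` (e.g. 10%) of the window's average."""
    window = iops_samples[-max(1, int(len(iops_samples) * tail_fraction)):]
    avg = sum(window) / len(window)
    return all(abs(s - avg) / avg <= max_variance for s in window)

# Example: one-hour run sampled once a minute; the early spike is ignored,
# only the final 25% (15 samples) must be consistent.
run = [210_000] * 10 + [165_000] * 35 + [150_000, 152_000, 149_000] * 5
print(is_steady_state(run))  # True if the tail stays within +/-10% of its mean
```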

Day 2: Performance Testing and the Time Dilemma

The main area of focus for most storage array testing is performance. It would be great to test every block size, read/write mix, and random or sequential iteration. However, preconditioning the storage array and running each test for an extended period to ensure consistent outcomes consumes a substantial amount of time. During our research we outlined approximately 400+ possible performance tests across the various I/O operations, block sizes, data patterns, and I/O patterns. To ensure the arrays provide consistent results, each test should run for a minimum of one hour (defined as a run). Four hundred tests of approximately one hour each equals roughly 16 24-hour days of testing (or about 49 work days at 8 hours a day). IDC does not believe it is feasible or relevant to spend two to seven weeks testing one aspect of overall array fit. Therefore, we'll have to make some trade-offs in order to complete the testing within a week.

The following sections describe four variables used to characterize performance. Each variable is independent and has a multiplicative effect on the number of tests to be run. For example, 3 I/O operations and 3 block size variations will result in 9 tests (not 6).

Transitioning from Preconditioning

As part of preconditioning, all available capacity was mapped to several LUNs. Now unmap and delete LUNs equal to 10% of the AFA's total capacity. This process invalidates some capacity within the AFA and metadata in system memory, so some data written during the subsequent tests will be considered new by the AFA. With some capacity invalidated and new data incoming at a fast rate, the AFA will have to perform without significant clean (valid) free space, which begins to exercise garbage collection and flash's asymmetrical attributes.

Note: Do not delete the LUNs (10% of AFA total capacity) and then let a significant period of time elapse before generating load; this would allow the AFA to perform maintenance without a workload.

It is best practice to fill storage systems to 80%; however, in the real world, storage systems can reach 90-95% capacity. Now that 10% of the AFA's total capacity has been deleted, ensure the testing tool keeps the AFA under load. More free capacity means more room to perform maintenance tasks, which is an area where AFA architectures differentiate themselves.

I/O Operation: Read Versus Write

When all-flash storage vendors publish or announce their maximum capabilities, it is usually a 100% read test. Flash media performs extremely well for reads because there is no seek time or PE erase penalty, so extraordinary performance numbers can be achieved. Real-world storage environments have a mix of reads and writes, usually favoring reads. A secondary effect of flash asymmetry is the operation of data protection algorithms: under a write or erase condition, a certain portion of flash may be locked while an incoming read operation is requested against the same data protection group.

What happens? Each AFA handles this condition differently; the purpose of testing read and write mixes is to understand how the implementation reacts to varying workloads. To that end, IDC recommends the following read/write combinations:

- 10% read/90% write. Some AFAs are tuned to perform extraordinarily well in a 100% read or 100% write workload; however, with even a slight introduction of a mixed workload, AFAs begin to differentiate. We wouldn't expect to observe many 10/90 workloads in practice unless there is heavy content ingest or log writing; however, it is a better test for finding AFA performance limitations than an exclusive 100% workload scenario. As described in the Solid State and Flash Storage Primer section, writes can take the most time, especially if an erase operation must occur before the write completes. The goal of this test is to put the AFA under heavy write stress while still serving a small amount of reads, inducing garbage collection and any parity reconstructions and driving the AFA to saturation, since flash asymmetry is skewed toward reads over writes. We consider the AFA saturated when a 2-3ms average latency is observed or performance begins to "hockey stick" (latency spikes without a corresponding increase in IOPS).

- 35% read/65% write and 65% read/35% write. The goal of both of these read/write mixtures is to test database use cases. Our research revealed that IT environments have two types of database workloads, write intensive and read intensive, which results in two separate tests.

- 80% read/20% write. The goal of this workload is to be read intensive with a nontrivial amount of write load to induce any read contention (as described in the Solid State and Flash Storage Primer section under asymmetry).

Because of the limited time, we didn't find value in performing an exclusively read or exclusively write test. We did debate including an exclusive write test to induce garbage collection; however, we concluded that mixing reads and writes was still the best real-world scenario. The most popular use case for AFAs is database workloads, so we chose the 80/20 and 65/35 workloads as the best representation of the read/write mix. This does not preclude testing other mixtures; however, we would recommend testing both read- and write-favored database mixes. The 10/90 mix was chosen to represent a heavy write workload, such as a database redo log.

I/O Pattern: Random Versus Sequential

We mentioned previously that flash media does not have the concept of seek time because there are no moving mechanical parts. The majority of AFAs perform some form of write coalescing, which essentially combines random I/O transactions into a single efficient sequential I/O. The write activity, whether random or sequential, is transparent to the user. The main exception is when garbage collection has to be performed before the write can complete, which causes extended latencies with a corresponding reduction in IOPS. Some workloads have a significant random component, and AFAs are actively sought for performance-sensitive applications with heavy random I/O to avoid the seek penalty associated with spinning disks. Putting it all together with our testing time constraints, IDC recommends focusing testing on random I/O. If sequential performance is to be tested, compare the results with the same mix run randomly to understand how the array's behavior changes.

Metadata Caching Effect

Best practices for testing an AFA stress the array's capability to maintain performance when caching effects cannot be relied upon. Test LUN sizes should therefore be larger than can be effectively cached, and the test access pattern should randomly touch all LBAs within the test volumes. This methodology prevents caching of user data as well as of the array's metadata and reveals the true long-term performance potential of the array. For example, rather than using four 100GB LUNs on a 10TB array, use four 2.5TB LUNs (or eight 1.25TB LUNs, or a similar mix that spans all available capacity) with a fully random LBA access pattern across each LUN.

Block Sizes

There are many block sizes to choose from, 512B to 256KB, and the majority of AFAs are optimized for 4KB I/O. We would like to test every block size; however, testing constraints force us to choose the essential ones. The primary application for AFAs is database, so a small-block I/O size of 8KB (Oracle) or 4KB (MS SQL) should be chosen. Large-block I/O should be tested for media applications or even VDI (for files within the VDI session), using 64KB, 128KB, or 256KB. Since most all-flash storage arrays are optimized for 4KB (using advanced format), sub-4KB I/O (such as 512B) should be tested to understand the ramifications of emulating 512B sectors. If specific block sizes are known for your environment and are essential to production operations, add those block sizes to the test plan. IDC recommends vetting the block sizes thoroughly because each additional block size has an impact on overall testing time. In detail:

- 512B. It should be noted that the 512B block size is not widely deployed. Most AFAs are tuned to a 4KB block size, and we want to understand how an AFA performs and differentiates with smaller-than-optimal block sizes. If you have an application with a 1KB or 2KB standard block size, feel free to substitute 1KB or 2KB for 512B.

- 4KB or 8KB. The goal of this block size is to test database, transactional, and virtualized applications. If your environment has a mix of applications that regularly produce 4KB and 8KB I/O, we suggest testing with 8KB because it is the worst-case scenario and 4KB would only perform better. Since we are testing transactional activity in these scenarios, AFA saturation is considered achieved at greater than 2-3ms average latency or when latency hockey sticks.

- 64KB, 128KB, or 256KB. Streaming data, whether media, HPC, or big data, will use large blocks and tests the AFA's throughput and response-time (latency) capability.

Increasing Worker Threads to Find Maximum Performance

A worker thread is a single process, and most likely multiple threads (and servers) will be required to generate enough workload to saturate an AFA. Each test will require a different number of worker threads before the AFA reaches maximum performance and latency begins to rise rapidly (i.e., hockey sticking). The first test, 10% read/90% write, will probably take the longest because you will have to discover two critical data points: the starting and the maximum worker thread counts. First, start a single worker thread and observe the workload generated for 5-10 minutes. If the AFA should deliver 150,000 IOPS, yet a single thread yields only 5,000 IOPS, you may want to start the tests at 20-25 threads as a base to increment from, rather than 1 thread, to save time. This is known as the starting worker thread count. Since an AFA should provide consistent performance, we don't see value in incrementing all the way from 1 thread to the AFA saturation point. When incrementing worker threads, we also recommend choosing an interval of 1-3 minutes.

Performance Testing Summary

We outlined the testing dilemma in our introduction: to meet our constraint of a week of testing, we need to make some trade-offs. After substantial thought, consideration, and vetting, IDC believes we can consolidate from about 400 tests down to 9 foundational tests. We encourage you to add (or remove) tests as you see fit to increase relevance to your real-world environment. We do caution that adding variables increases the number of tests that must be performed; for example, adding one more block size will increase the number of test runs and hours required by 33%.

- I/O pattern: 100% random
- Data pattern: 50%+ unique
- I/O operation (read/write mix): 10/90, 35/65, 65/35, 80/20
- Block sizes: 512B; 4KB or 8KB (standard database block sizes); 64KB, 128KB, or 256KB
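To illustrate the multiplicative effect described above, the hypothetical sketch below enumerates a run list from chosen read/write mixes and block sizes and estimates the total test time at one hour per run. The specific lists are only an example; trim them to whatever subset your week allows:

```python
from itertools import product

# Example variable lists drawn from the framework above; every entry you add
# multiplies the total run count (and the hours required).
rw_mixes = ["10/90", "35/65", "65/35", "80/20"]   # read/write percentages
block_sizes = ["512B", "8KB", "128KB"]            # one choice per size class
run_hours = 1                                     # default run duration

runs = [(mix, bs) for mix, bs in product(rw_mixes, block_sizes)]
print(f"{len(runs)} runs x {run_hours}h = {len(runs) * run_hours}h of testing")

# Adding a fourth block size grows the matrix by a third (the 33% noted above).
runs_plus = list(product(rw_mixes, block_sizes + ["64KB"]))
print(f"with one extra block size: {len(runs_plus)} runs")
```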

Day 3: Hazard Testing

The goal of hazard testing is to understand AFA behavior under suboptimal conditions. When speaking with storage administrators, we hear that performance and functional testing provide great data points; however, administrators also want to understand the AFA's limits. Knowing those limits helps predict or trend toward possible failures. In detail:

- Hazard mixed workload. This test is designed to provide a simulated mixed I/O workload of database, virtual machine, and concurrent backup traffic. Set up the testing tool to simulate the four I/O profiles outlined below. After the first run is complete and steady state is achieved, begin adding worker threads at the determined intervals to each I/O profile until a steady-state average of up to 2ms write latency is observed or latency begins to grow exponentially. Keep the test running for 1 hour and then record the results. To save time, do not stop the test; keep the I/O running and begin the next hazard test. The four I/O profiles are (see the sketch following this section):
  - 4KB, 20/80, sequential, 50%+ unique
  - 8KB, 65/35, random, 50%+ unique
  - 64KB, 65/35, random, 50%+ unique
  - 128KB, 80/20, sequential reads, 50%+ unique

- Degradation/failure scenarios. Components fail, and it is critical to understand how an AFA responds in a degraded state. To begin, run the hazard mixed workload if it is not already running (until a run has elapsed or steady state has been achieved, whichever comes first) and start the first failure test. Take note of how long it takes the array to recover to a steady state (albeit probably at a higher latency level). After steady state has been established for 5-10 minutes, begin the next failure test. The "fail a drive" test may be an opportune time to also exercise the vendor's remote support ("phone home") capability, so remember to set up remote support before beginning the hazard testing. In detail:
  - Fail a path
  - Fail a drive
  - Reboot a controller
  - Fail a controller
  - Perform a firmware upgrade
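Below is a minimal sketch of how the four hazard profiles above could be expressed and launched concurrently with fio, assuming the same hypothetical device and flag set as the earlier example; the profile table mirrors the list above, while the thread counts and device path are placeholders to adjust for your rig:

```python
import subprocess

# (block size, read%, pattern) for the four hazard profiles listed above.
HAZARD_PROFILES = [
    ("4k",   20, "rw"),       # 4KB, 20/80, sequential
    ("8k",   65, "randrw"),   # 8KB, 65/35, random
    ("64k",  65, "randrw"),   # 64KB, 65/35, random
    ("128k", 80, "rw"),       # 128KB, 80/20, sequential-leaning reads
]

def launch_hazard_mix(device="/dev/mapper/afa_lun0", jobs_per_profile=4,
                      runtime_s=3600):
    """Start all four profiles at once so the array never sees idle time."""
    procs = []
    for bs, read_pct, pattern in HAZARD_PROFILES:
        cmd = ["fio", f"--name=hazard_{bs}_{read_pct}r",
               f"--filename={device}", "--ioengine=libaio", "--direct=1",
               f"--rw={pattern}", f"--rwmixread={read_pct}", f"--bs={bs}",
               "--iodepth=32", f"--numjobs={jobs_per_profile}",
               "--time_based", f"--runtime={runtime_s}", "--group_reporting"]
        procs.append(subprocess.Popen(cmd))
    for p in procs:
        p.wait()

# launch_hazard_mix()  # uncomment on a test host pointed at a scratch LUN
```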

Day 4: Functional Testing

The goal of functional testing is to understand the logical and feature limits of the AFA. The majority of AFAs have a partitioned metadata scheme in which a portion is maintained in system cache and the balance on the underlying media (SSD). The testing goals that follow are designed to understand and test the limits of metadata management.

Thin Provisioning Test

Create the largest available LUNs on the AFA and continue creating LUNs of that size until the array limits are reached. LUNs should be large, at a minimum 2-3x the size of the AFA's system memory (DRAM). Testing tools should:

- Keep the AFA at 90% capacity
- Randomly access each LUN's entire LBA range
- Run the hazard mixed workload (with enough workers to generate up to 2ms of latency or to observe exponential growth in latency) across all LUNs created, for a 1 hour run

Testing Usable Capacity

Some vendors integrate data efficiency features deeply into the array architecture, and the features are always on. Other AFAs' data efficiency features are optional and can be enabled or disabled based on licensing and application requirements. As a separate and optional test of data efficiency, IDC recommends filling the AFA with 10TB of data at varying deduplication or compression ratios (e.g., 10:1, 5:1, 3:1). After each test, measure the effective capacity consumption, purge the data, delete and recreate the LUNs, and begin the next test. It's important to record the following:

- Usable capacity before the test
- Usable capacity after the test
- Data efficiency ratios
- Metadata overhead

A challenge arises in this test because vendors calculate their space efficiency metrics differently. Do not directly compare data reduction ratios from one array with those from another. Be sure to consult with your vendor to understand exactly what is being reported by the array and how it is calculated.
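One way to keep those measurements comparable across your own test runs (not across vendors' own reporting) is to compute a single ratio from the raw numbers you record. The following is a hypothetical helper, assuming you can read usable capacity from the array before and after each fill:

```python
def data_reduction_ratio(logical_written_gb: float,
                         usable_before_gb: float,
                         usable_after_gb: float) -> float:
    """Ratio of logical data written to physical capacity actually consumed.
    Physical consumption is inferred from the drop in reported usable capacity,
    so metadata overhead is included in the result."""
    physical_consumed_gb = usable_before_gb - usable_after_gb
    if physical_consumed_gb <= 0:
        raise ValueError("array reported no capacity consumed")
    return logical_written_gb / physical_consumed_gb

# Example: a 10TB (10,240GB) fill that leaves 3,500GB less usable capacity
# behind works out to roughly a 2.9:1 effective reduction.
print(round(data_reduction_ratio(10_240, 18_000, 14_500), 1))
```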

Internal Copy Test

Under normal operating conditions, AFAs will have basic LUNs storing data and advanced LUNs holding snapshot or cloned data. The goal of this test is to understand whether there is any performance difference between basic and advanced LUNs in the testing mix. We outline three scenarios that mix basic LUNs with a percentage of snapshots and/or clones. Create a LUN mixture as outlined, then run the hazard mixed workload test for a run of 1 hour. Repeat the procedure for the remaining scenarios. The three scenarios are:

- 50% basic/50% snap
- 50% basic/50% clone
- 50% basic/25% snap/25% clone

Important considerations:

- Begin load immediately after snap and/or clone creation.
- LUNs should be large, at a minimum 2-3x the size of the AFA's system memory (DRAM).
- Testing tools should:
  - Keep the AFA at 90% capacity
  - Randomly access each LUN's entire LBA range

Note: During functional testing, it is important to record the elapsed time between creation and completion for snapshots and clones. For systems with data efficiency enabled, the resulting usable capacity should be recorded as well.

Day 5: Optional Vendor Specific

For the past four days, we focused on exercising the all-flash storage array with a variety of tests to create a holistic profile. On the final day of testing, IDC recommends reserving tests for your organization's specific use cases. For example, your environment may use a homegrown application that is core to your business. There are many vendor-specific and open testing tools that can be run, such as:

- Database. SQLIOSim from Microsoft can be used to determine the limits of a storage configuration with respect to Microsoft SQL Server. Industry-standard and vendor-neutral database testing can be performed with TPC-C, TPC-E, and TPC-H.

- SPC-1 and SPC-2. SPC-1 and SPC-2 are general-purpose I/O benchmarks and are industry accepted, with several storage system vendors publicly publishing results.

- Scaling performance and capacity. During testing, it's common to have a single configured AFA per vendor. IDC recommends speaking with the vendor about scaling capacity and performance for future needs.

- Virtualization-specific features. VMware's VAAI introduced the concept of offloading virtualization server tasks, such as zeroing LUNs and cloning operations, to the storage array. Each AFA implements VAAI differently to comply with the VMware VAAI standard, which may affect overall performance. Microsoft Hyper-V has developed a similar offload called ODX. For VDI use cases specific to VMware, we recommend RAWC to simulate workloads.

Day 5: Optional Vendor-Specific Tests

For the past four days, we focused on exercising the all-flash storage array with a variety of tests to create a holistic profile. On the final day of testing, IDC recommends reserving time for your organization's specific use cases. For example, your environment may use a homegrown application that is core to your business. There are many vendor-specific and open testing tools that can be run, such as:
- Database. SQLIOSim from Microsoft can be used to determine the limits of a storage configuration with respect to Microsoft SQL Server. Industry-standard and vendor-neutral database testing can be performed with TPC-C, TPC-E, and TPC-H.
- SPC-1 and SPC-2. SPC-1 and SPC-2 are industry-accepted, general-purpose I/O benchmarks, with several storage system vendors publicly publishing results.
- Scaling performance and capacity. During testing, it is common to have a single configured AFA per vendor. IDC recommends speaking with the vendor about scaling capacity and performance for future needs.
- Virtualization-specific features. VMware's VAAI introduced the concept of offloading server virtualization tasks, such as zeroing LUNs and cloning operations, to the storage array. Each AFA implements VAAI differently to comply with the VMware VAAI standards, which may have an impact on overall performance. Microsoft Hyper-V has developed a similar offload called ODX. For VDI use cases specific to VMware, we recommend RAWC to simulate workloads.

IDC strongly encourages you to test your homegrown applications as well as a variety of other industry benchmarks. The goal is to gather enough data to understand how the all-flash storage array will behave in your environment under your specific workloads.

Over the Weekend

Other than preconditioning, the majority of tests run for 1 hour. This is substantially longer than traditional HDD testing but is considered short for all-flash storage testing. We have balanced time elapsed per test against the number of tests performed in a given day. IDC strongly recommends running the hazard mixed workload test for a period of 24 hours or longer, and the weekend provides a great window to accomplish this. Simply start the test right before you leave work and let it run over the weekend. When you arrive back at work the following week, stop the test and record the results. Analyze the data to ensure performance was consistent and no performance anomalies were recorded.
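One way to automate the weekend run is sketched below, assuming fio as the load generator and a 70/30 random read/write mix as a stand-in for the hazard mixed workload defined earlier in this framework. The device path, block size, queue depth, and job count are illustrative assumptions; the per-interval logs are what make the consistency analysis possible afterward.

# Sketch: kick off a 24+ hour mixed random workload with fio and keep
# per-interval logs for later consistency analysis. Paths, block size,
# queue depth, and the 70/30 mix are illustrative, not the framework's
# exact hazard profile.
import subprocess

def weekend_run(device="/dev/mapper/afa_lun0", hours=60):
    cmd = [
        "fio",
        "--name=weekend_mix",
        f"--filename={device}",
        "--ioengine=libaio", "--direct=1",
        "--rw=randrw", "--rwmixread=70",
        "--bs=8k", "--iodepth=32", "--numjobs=8",
        "--time_based", f"--runtime={hours * 3600}",
        "--log_avg_msec=1000",            # 1-second averaged samples
        "--write_lat_log=weekend_lat",    # per-interval latency log files
        "--write_iops_log=weekend_iops",  # per-interval IOPS log files
        "--output-format=json",
        "--output=weekend_summary.json",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    weekend_run()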

Interpreting AFA Test Results

In a perfect world, test results would be linear in IOPS and latency: when graphed, the result should look similar to a straight line (see Figure 1). In IDC's research we found that there is a need for absolute performance; however, we believe the larger enterprise market wants consistent performance.

FIGURE 1: Example of Ideal AFA Performance Graph (Source: IDC, 2013)

The perfect world is a great concept; however, we are searching for real-world results. AFAs have a characteristic where, as workload increases, latency stays the same or increases slightly, and then at the performance saturation point latency increases exponentially. The difference between an AFA and a traditional HDD storage system is that the AFA should deliver consistent performance, within a narrow band of variance, until saturation. When graphing AFA performance, there will be a slight initial uptick, but increased workload should result in fairly flat latency followed by a dramatic upward turn at saturation (see Figure 2).

FIGURE 2: Example of Real-World AFA Performance Graph (Source: IDC, 2013)

What you don't want is what is observed in Figure 3: inconsistent latency. The graph looks like a snake winding back and forth. Why? As mentioned several times throughout this document, AFAs differ in how flash maintenance tasks are architected into the system and handled. The worst-case scenario (see Figure 3) shows garbage collection kicking in, performance dropping, and latency increasing significantly. At first, you might think this is the AFA's saturation point; however, as the test progresses, performance recovers significantly and may even reach high results. Even if the final results are good, the question that needs to be asked is: how did the journey to get there look?

FIGURE 3: Example of Bad AFA Performance Graph (Source: IDC, 2013)
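Distinguishing a true saturation knee (Figure 2) from maintenance-induced swings (Figure 3) is easier when you analyze the per-step results rather than only the end-of-run averages. The Python sketch below assumes a list of (offered IOPS, average latency in ms) samples taken at increasing load steps; it flags the first step where latency jumps sharply and reports how tight the latency band was before that point. The 2x growth threshold and the max/min spread metric are illustrative choices, not part of the IDC framework.

# Sketch: locate the latency "knee" and measure consistency before it.
# Input: samples = [(offered_iops, avg_latency_ms), ...] at increasing load.
# The 2x growth threshold and the spread metric are illustrative choices.

def find_knee(samples, growth_factor=2.0):
    """Return the index of the first step where latency jumps by growth_factor."""
    for i in range(1, len(samples)):
        prev_lat, cur_lat = samples[i - 1][1], samples[i][1]
        if prev_lat > 0 and cur_lat / prev_lat >= growth_factor:
            return i
    return None  # no knee observed within the tested load range

def latency_spread(samples, knee):
    """Spread (max/min) of latency across the flat region before the knee."""
    flat = [lat for _, lat in (samples[:knee] if knee else samples)]
    return max(flat) / min(flat)

if __name__ == "__main__":
    # Illustrative data: flat ~0.5ms latency, then sharp growth near 160K IOPS.
    samples = [(20_000, 0.45), (40_000, 0.48), (80_000, 0.52),
               (120_000, 0.60), (160_000, 1.9), (180_000, 7.5)]
    knee = find_knee(samples)
    print("saturation begins at step:", knee, "->", samples[knee] if knee is not None else "n/a")
    print("latency spread before saturation: %.2fx" % latency_spread(samples, knee))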

FUTURE OUTLOOK

According to IDC research, the use of solid state storage and solid state drives will play an important role in transforming performance as well as use cases for enterprise application data. IDC's all solid state storage (all-SSS) array market forecast predicts $1.2 billion in revenue by 2015, with a 58.5% CAGR. We believe AFAs have a robust future, and the following items will change to meet end users' requirements:
- Declining cost of flash. IDC has forecast system- and component-level pricing per capacity to decline over the next five years. Lower costs for flash-enabled systems will open the technology to a larger market of IT environments.
- Adapting tests for flash. Current independent storage testing bodies have not adapted their tests for flash and therefore treat the AFA as a fast traditional storage system. We believe the tests need to be, and will be, updated to reflect the behaviors and attributes of flash storage.
- Testing tools. Flash is a semiconductor product and has different behaviors and attributes than traditional mechanical HDDs. Testing tools will therefore need to adapt to properly test flash and deliver the intended results. In addition:
  - Data efficiency. Most testing tools have an "all or nothing" approach to the uniqueness of the data patterns used to test data efficiency. To fill the void, AFA vendors have developed their own testing tools; however, end users are skeptical of running tools whose source is the AFA vendor. For these tools to succeed, they must be controlled by an independent standards body, delivered as commercial software, or released as open source so that end users can trust the validity of the test results.
  - Preconditioning the AFA. Transitioning the AFA from FOB to steady state is a time-consuming process. Testing tools should be able to precondition an AFA with little or no interaction between the end user and the tool. Ideally, the testing tool and the AFA would communicate to minimize the overall time spent preconditioning the AFA.
  - Variance. AFAs need to be tested in the steady state range, and testing tools should automatically detect when variance is within the steady state without any intervention from the end user (a minimal sketch of such a check follows this list).
  - Cloud. Whether for private or public cloud, technological standards will be required, such as the ability to seamlessly scale out capacity and performance, a RESTful API, and virtualization and hypervisor awareness.
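As an illustration of the automated variance check described above, the sketch below applies a rolling-window test in the spirit of the SNIA SSS Performance Test Specification: a window of recent per-interval IOPS samples is treated as steady when its excursion and best-fit slope stay within tolerances of the window average. The 20% excursion and 10% slope limits are quoted as the commonly cited SNIA-style values and should be treated as assumptions; substitute whatever thresholds your test plan defines.

# Sketch: rolling steady-state detection over per-interval IOPS samples.
# Criteria (assumed, SNIA PTS-style): within the window, (max - min) must be
# <= 20% of the window average, and the best-fit slope across the window
# must change the value by <= 10% of the average.

def steady_state(samples, window=5, max_excursion=0.20, max_slope=0.10):
    """Return the index where the last `window` samples meet steady-state criteria."""
    for end in range(window, len(samples) + 1):
        win = samples[end - window:end]
        avg = sum(win) / window
        if avg == 0:
            continue
        excursion_ok = (max(win) - min(win)) <= max_excursion * avg
        # Least-squares slope of the window (x = 0..window-1).
        xs = range(window)
        x_mean = (window - 1) / 2
        slope = sum((x - x_mean) * (y - avg) for x, y in zip(xs, win)) / \
                sum((x - x_mean) ** 2 for x in xs)
        slope_ok = abs(slope) * (window - 1) <= max_slope * avg
        if excursion_ok and slope_ok:
            return end - 1  # steady state reached at this sample
    return None

if __name__ == "__main__":
    iops = [310_000, 270_000, 240_000, 225_000, 221_000, 219_000, 220_000, 218_000, 219_500]
    print("steady state reached at sample index:", steady_state(iops))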

ESSENTIAL GUIDANCE

Investing the time and resources to test an AFA is a significant undertaking. AFAs are different from traditional HDD storage systems and should be tested differently. We highlight a few items to remember while testing an AFA:
- Testing among AFA vendors must be consistent. If you are testing multiple AFAs, it is very important to run the same tests, configurations, and so forth. This helps when comparing the data and ensures no AFA has an unfair advantage. The best way to keep testing consistent is to create a test plan, follow it rigorously, and note any exceptions.
- Don't skip preconditioning. It might be tempting to cut time out of the testing because we are all under pressure to do more with less time and fewer resources, but performance between a FOB AFA and a preconditioned AFA varies drastically.
- Ensure spare capacity consistency. Because of the asymmetrical behavior of flash media and the maintenance tasks associated with it, spare capacity is vital to the normal operation of an AFA. A vendor can increase spare capacity to bolster test results and then sell an AFA with less spare capacity. Since flash media carries a premium price, a 10% change in spare capacity could have a material impact on both price and performance.
- Mirror infrastructure standards. Don't test any configuration that would not be compliant with established infrastructure standards, such as configurations with single points of failure.
- No hero tests required. It may be tempting to run an exclusive 100% read or 100% write test to show extraordinary results; however, when running any test, be sure to understand its use case and ensure it simulates production workloads.
- Find the maximum IOPS and throughput. It is better to understand the AFA's limits in a test lab than to "discover" the performance saturation point in production. Knowing the limits of the AFA will help with future capacity planning (a minimal load-ramp sketch follows this list).
- Personalize to your environment. This document is a framework for testing an AFA and should be customized to meet your environment's specific needs.
- Ensure mixed workloads. The majority of AFAs will have multiple workloads running on the system, and AFA performance may vary significantly depending on the combination of workload types in the mix.
- Ensure long-term load testing for 24 hours or more. Test results may appear acceptable over short periods, even several hours; however, today's production IT systems run 24 hours a day under significant load. It is important to understand how an AFA behaves under longer-term testing with constant workload pressure.
- Interpret test results. AFA performance should be fairly linear until the saturation point. Dramatic performance variations caused by system maintenance tasks should be discovered during testing.
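One practical way to find that limit, sketched below, is to step up the offered load and stop when average latency crosses a chosen ceiling; the 2ms ceiling echoes the latency bound used in the workload definitions above. As in the weekend sketch, fio is assumed as the load generator, the device path, block size, mix, and step sizes are illustrative, and the JSON field names assume a recent (3.x) fio that reports completion latency in nanoseconds.

# Sketch: ramp queue depth until average latency exceeds a ceiling, recording
# the highest IOPS seen below that ceiling. All parameters are illustrative.
import json
import subprocess

def run_step(device, iodepth, runtime=60):
    out = subprocess.run(
        ["fio", "--name=ramp", f"--filename={device}",
         "--ioengine=libaio", "--direct=1",
         "--rw=randrw", "--rwmixread=70", "--bs=8k",
         f"--iodepth={iodepth}", "--time_based", f"--runtime={runtime}",
         "--output-format=json"],
        capture_output=True, check=True, text=True)
    # Field names assume fio 3.x JSON output (clat_ns reported in nanoseconds).
    job = json.loads(out.stdout)["jobs"][0]
    iops = job["read"]["iops"] + job["write"]["iops"]
    lat_ms = (job["read"]["clat_ns"]["mean"] + job["write"]["clat_ns"]["mean"]) / 2 / 1e6
    return iops, lat_ms

def find_max_iops(device="/dev/mapper/afa_lun0", ceiling_ms=2.0):
    best = 0.0
    for qd in (1, 2, 4, 8, 16, 32, 64, 128):
        iops, lat = run_step(device, qd)
        print(f"qd={qd}: {iops:,.0f} IOPS at {lat:.2f}ms")
        if lat > ceiling_ms:
            break
        best = max(best, iops)
    return best

if __name__ == "__main__":
    print("max IOPS under the latency ceiling:", f"{find_max_iops():,.0f}")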

LEARN MORE

Related Research

Worldwide All Solid State Storage Array 2013-2016 Forecast (IDC #240424, April 2013)
Storage Array Quality of Service: Provisioning and Guaranteeing Storage Performance (IDC #240218, March 2013)
Worldwide Solid State Storage Quarterly Update: 4Q12 Summary (IDC #239936, March 2013)
New and Growing Channels for Storage Industry Terabyte Shipments (IDC #239953, March 2013)
Worldwide NAND Flash Demand and Supply 4Q12-4Q13 and 2012-2017 Update (IDC #240225, March 2013)
Primary Storage Data Efficiency with Solid State Storage (IDC #239312, February 2013)
Market Analysis Perspective: Worldwide Solid State Storage Technologies, 2012 (IDC #238639, December 2012)
Worldwide Solid State Storage 2012-2016 Forecast Update (IDC #238208, December 2012)
IDC's Worldwide SSD and Storage Tiering Taxonomy, 2012 (IDC #237759, November 2012)
The Economic Benefit of Storage Efficiency Technologies (IDC #236221, August 2012)

Taking Enterprise Storage to Another Level: A Look at Flash Adoption in the Enterprise (IDC #236366, August 2012)

Synopsis

This IDC study provides a framework for testing an all solid state storage system, commonly known as the all-flash array (AFA). IDC performed substantial research to understand the prevailing use cases, testing methodologies, and tools; worked with independent labs; conducted vendor interviews; and reexamined existing IDC research. The result is an extensive, and admittedly time-consuming, test plan for evaluating an AFA.

"Flash is a semiconductor product and has behaviors and attributes different from traditional mechanical HDD," says Dan Iacono, research director, IDC Storage Systems. "Therefore testing procedures, tools, and configurations must be adapted to achieve intended results."

Copyright Notice

This IDC research document was published as part of an IDC continuous intelligence service, providing written research, analyst interactions, telebriefings, and conferences. Visit www.idc.com to learn more about IDC subscription and consulting services. To view a list of IDC offices worldwide, visit www.idc.com/offices. Please contact the IDC Hotline at 800.343.4952, ext. 7988 (or +1.508.988.7988) or sales@idc.com for information on applying the price of this document toward the purchase of an IDC service or for information on additional copies or Web rights.

Copyright 2013 IDC. Reproduction is forbidden unless authorized. All rights reserved.