Enterprise Storage Performance: It's All About The Architecture




A Diablo White Paper, April 2014. diablo-technologies.com, Diablo_Tech

Your PCIe SSD has an ugly secret. If you've ever tested holistic SSD performance (both IOPS and latency under load), you may already know what that secret is. Though PCIe interface bandwidth is aggressively touted by SSD manufacturers, end-to-end enterprise storage performance is truly governed by the presence or absence of bottlenecks within the system. Until recently, the performance and scalability bottlenecks inherent to PCIe-based SSDs have been largely ignored. Those SSDs represented the highest-performing storage available, so customers had no alternatives to consider. However, with the introduction of Memory Channel Storage (MCS), a superior solution now exists. Memory Channel Storage leverages an architecture that solves the issues faced by PCIe SSDs, thereby unlocking the true potential of flash storage in the enterprise.

It's Not All About The Interface

With the advent of enterprise PCIe-based SSDs, interface speed moved to the forefront of most performance-related conversations. When compared to SATA and SAS SSDs, the bandwidth available to PCIe drives was clearly superior. As a result, PCIe SSD vendors have been vocal in promoting PCIe bandwidth as a key technology differentiator. In practice, however, there is a world of difference between what an interface can support and what a solution using that interface can deliver. The focus on theoretical PCIe bandwidth, while compelling, has served to obscure a critical shortcoming: a pervasive bottleneck that limits both the performance and scalability of PCIe-based storage devices.

Flash Management Overload

It is well known that sophisticated media management is required to make economical flash (i.e. commodity MLC) usable in enterprise applications. Wear-leveling, garbage collection, and error correction are among the activities that must be constantly managed for each flash IC on a given storage device. These tasks create significant computational overhead, which is multiplied as the amount of flash increases.

To optimize performance, solid state drives access multiple flash ICs in parallel. The highest-performing PCIe-based SSDs employ a "big ASIC" architecture, in which many parallel flash devices are managed by a single, monolithic flash controller (see Figure 1). However, due to the large amount of media management required, the big ASIC approach creates a bottleneck under heavy I/O load. This results in degraded latency, as the controller ASIC is unable to keep pace with the increased computational burden.

[Figure 1: Big ASIC SSD implementation]

A Telling Comparison

The effect of media management can be observed by examining the performance of leading PCIe solutions. For example, the performance of an MLC-based big ASIC solution will be dramatically worse than the performance of an analogous (same controller, same number of flash placements) SLC-based solution. This phenomenon is demonstrated in the figures below (plotting IOPS versus I/O latency).

In Figure 2, we compare a leading MLC-based PCIe SSD to its SLC-based counterpart (again, same controller, same number of flash ICs) in a 100% Random Read scenario. Though the MLC solution does exhibit reduced performance, both solutions are able to effectively leverage the PCIe bandwidth (>90% utilization) and, therefore, achieve comparably high throughput. In this case, the discrepancy between the SLC and MLC solutions is minimized because Read requests trigger much less media management activity than Write requests (e.g. no garbage collection or wear-leveling is required).

[Figure 2: 100% Read Comparison +. IOPS vs. Latency: 100% Read, 4K Random. Series: MLC-based PCIe SSD, SLC-based PCIe SSD.]

The latency profiles of the individual solutions are also worth noting. As Figure 2 shows, latency increases significantly for both products as the I/O load intensifies. This demonstrates another infrequently discussed reality concerning PCIe-based solutions: the low latencies quoted for those SSDs only apply under low I/O loads. When supporting peak performance, the PCIe SSD latencies increase dramatically.

Figure 3 demonstrates the performance comparison in a 100% Random Write scenario. Here, the results clearly show that the SLC-based solution can scale to support much higher throughput than the MLC-based version. In this case, write bandwidth is halved as the solution moves to MLC. Though SLC flash does have better read and write performance than MLC, that advantage does not account for the huge performance discrepancy shown in Figure 3. Instead, this discrepancy is caused by the increased media management overhead necessary for supporting MLC flash. Compared to SLC flash, MLC has inherently lower endurance and also requires more error correction. Therefore wear-leveling, garbage collection, and error correction all become more prevalent and computationally intensive. The resulting media management creates a bottleneck in the MLC-based solution.

[Figure 3: 100% Write Comparison +. IOPS vs. Latency: 100% Write, 4K Random. Series: MLC-based PCIe SSD, SLC-based PCIe SSD.]

Due to this bottleneck, references to high PCIe interface speeds can be misleading. Despite their access to a wide pipe, PCIe-based SSDs are not able to leverage those speeds under load. In practice, as demonstrated by Figure 3, less than 15% of the available PCIe bandwidth is being utilized. (Note: The MLC SSD depicted in Figures 2-4 utilizes a write cache, hence the flat performance at 1K for the 100% Write workload.)

In Figure 4, we compare the two solutions using a mixed-use workload consisting of 70% Read transactions and 30% Write transactions. This is a typical read/write mix in real-world Online Transaction Processing (OLTP) applications. Here again, the performance delta is dramatic due to the intensive media management required for the MLC drive. With Reads and Writes interleaved, the associated bottleneck affects all request submissions. PCIe interface capability becomes moot as, once again, only a small portion (less than 3%) of the PCIe bandwidth is actually being utilized.

[Figure 4: 70/30 [OLTP] Comparison +. IOPS vs. Latency: 70/30, 8K Random. Series: MLC-based PCIe SSD, SLC-based PCIe SSD.]
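The utilization percentages quoted for Figures 2 through 4 follow from simple arithmetic: delivered bytes per second (IOPS times block size) divided by the bandwidth the link can carry. The sketch below shows that calculation. The function name, the IOPS value, and the 4 GB/s link figure are illustrative assumptions, not measurements or specifications taken from this paper.

```python
def bandwidth_utilization(iops: float, block_bytes: int, link_bytes_per_s: float) -> float:
    """Return the fraction of link bandwidth consumed by a workload.

    iops             -- I/O operations completed per second
    block_bytes      -- size of each I/O in bytes (e.g. 4096 for 4K)
    link_bytes_per_s -- usable interface bandwidth in bytes per second
    """
    if link_bytes_per_s <= 0:
        raise ValueError("link bandwidth must be positive")
    return iops * block_bytes / link_bytes_per_s


# Hypothetical inputs chosen only to illustrate the arithmetic.
ASSUMED_LINK_BYTES_PER_S = 4e9   # assumption: a link with 4 GB/s of usable bandwidth
ASSUMED_IOPS = 100_000           # assumption: a device sustaining 100K 4K writes/s

utilization = bandwidth_utilization(ASSUMED_IOPS, 4096, ASSUMED_LINK_BYTES_PER_S)
print(f"link utilization: {utilization:.1%}")
```

With inputs like these, a device can be far from saturating its interface even while it is saturated internally, which is the distinction the comparison above draws.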

It's All About The Architecture

To leverage the full potential of flash in the enterprise, Diablo Technologies has pioneered a storage technology that bypasses the architectural bottlenecks faced by pre-existing solutions. Diablo's Memory Channel Storage (MCS) architecture achieves end-to-end parallelism (i.e. no bottleneck) by leveraging the server's natively parallel memory subsystem. Each MCS module plugs into a standard DDR3 DIMM slot and is directly available to the CPU's memory controller. Memory controllers were designed to effectively manage massively parallel, time-sensitive, high-speed data access (e.g. to DRAM). MCS DIMMs are populated on multiple memory controller channels, and media management tasks are dispersed across those channels in a distributed fashion. By dividing the media management overhead into manageable chunks and co-processing those chunks in parallel, Diablo employs a divide-and-conquer strategy similar to those popular in distributed computing architectures. The result is a high-performance, efficient persistence layer that can service heavy I/O loads with low, deterministic latency (see Figure 5).

[Figure 5: Memory Channel Storage Architecture]

In addition, by taking advantage of the distributed nature of MCS, solution performance can be efficiently scaled to match Quality of Service (QoS) requirements. MCS provides system designers with granular control over the desired performance and capacity. This enables customers to pay only for what they truly need.

Problem Solved

In Figures 6 through 8, we show how a Memory Channel Storage solution (composed of 8x SanDisk ULLtraDIMM devices) compares to the MLC-based PCIe SSD. Both solutions use commodity MLC flash and have similar total capacity (1.6TB vs 1.4TB).

[Figure 6: 100% Read Comparison [with MCS] +. IOPS vs. Latency: 100% Read, 4K Random. Series: MLC-based PCIe SSD, 8x ULLtraDIMMs.]

Figure 6 - The Read performance of an MCS solution offers a dramatic improvement over a PCIe-based SSD. Most real-world workloads have a significant Write mix, however, so this is less interesting than the next two comparisons that we will examine.
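The divide-and-conquer argument made for the MCS architecture can be illustrated with a toy queueing model. The sketch below is not Diablo's method and does not model real hardware: it assumes each controller behaves like an M/M/1 queue (mean time in system 1 / (mu - lambda)), that the load splits evenly across controllers, and that every controller has the same service rate. All names and numbers are hypothetical.

```python
import math


def mm1_latency(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (seconds) for an M/M/1 queue.

    Returns infinity when the offered load meets or exceeds the
    service rate, i.e. when the controller is saturated.
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def distributed_latency(arrival_rate: float, service_rate: float, controllers: int) -> float:
    """Mean latency when load is split evenly over independent controllers."""
    if controllers < 1:
        raise ValueError("need at least one controller")
    return mm1_latency(arrival_rate / controllers, service_rate)


# Hypothetical figures: each controller can complete 50,000 requests/s.
SERVICE_RATE = 50_000.0
for load in (10_000.0, 45_000.0, 60_000.0):
    single = mm1_latency(load, SERVICE_RATE)
    spread = distributed_latency(load, SERVICE_RATE, controllers=8)
    print(f"load={load:>8.0f}/s  one controller={single * 1e6:8.1f} us  "
          f"eight controllers={spread * 1e6:8.1f} us")
```

In this model the single controller's latency climbs steeply as load approaches its service rate, while the distributed case stays nearly flat. A real comparison would also need to account for differences in per-controller capability, which this sketch deliberately ignores.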

[Figure 7: 100% Write Comparison [with MCS] +. IOPS vs. Latency: 100% Write, 4K Random. Series: MLC-based PCIe SSD, 8x ULLtraDIMMs.]

Figure 7 - The MCS-based solution also offers vastly superior Write performance. This is a critical benefit in write-centric applications like high-frequency trading, and in cases where system memory must be persisted.

[Figure 8: 70/30 OLTP Comparison [with MCS] +. IOPS vs. Latency: 70/30, 8K Random. Series: MLC-based PCIe SSD, 8x ULLtraDIMMs.]

Figure 8 - The MCS-based solution also dominates a mixed-workload comparison. This 70% Read, 30% Write mix is highly common for popular applications like virtualization and database transaction processing.

+ The plotted data corresponds to IOPS and latency performance, measured as effective Queue Depth (QD) increases. Generally speaking, flash storage performance is maximized at high effective QDs.
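The relationship between queue depth, throughput, and latency noted in the footnote can be made concrete with Little's Law, which states that the average number of requests in flight equals throughput multiplied by average latency. The sketch below applies it to a device that has reached a throughput ceiling. The function name and the 200,000 IOPS ceiling are assumptions for illustration and are not values from the figures.

```python
def mean_latency_s(queue_depth: float, iops: float) -> float:
    """Average latency in seconds implied by Little's Law.

    Little's Law: requests_in_flight = throughput * latency,
    so latency = queue_depth / iops.
    """
    if iops <= 0:
        raise ValueError("iops must be positive")
    return queue_depth / iops


# Assumption: the device has saturated at a hypothetical 200,000 IOPS,
# so adding queue depth no longer adds throughput.
SATURATED_IOPS = 200_000.0
for qd in (1, 8, 32, 128):
    latency_us = mean_latency_s(qd, SATURATED_IOPS) * 1e6
    print(f"QD={qd:>3}  mean latency={latency_us:7.1f} us")
```

Once throughput stops growing, every additional outstanding request shows up as added latency, which is why curves of this kind bend sharply upward near a device's peak throughput.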

So What Have We Learned?

It's really not all about the interface. As we've explained and demonstrated, interface bandwidth is not equivalent to storage performance. Though the PCIe interface can support high bandwidth, architectural bottlenecks restrict PCIe SSD performance in practice, thereby making theoretical PCIe bandwidth irrelevant. However, by leveraging a uniquely distributed architecture, Memory Channel Storage avoids such bottlenecks and offers a superior solution for real-world applications. Diablo's approach has, for the first time, unlocked the true potential of flash and represents the next generation of performance storage in the enterprise.

© 2013. All Rights Reserved. The dt logo, Diablo Technologies, Memory Channel Storage and MCS are registered trademarks of Diablo Technologies, Incorporated. +1 (613) 569-9999 www.diablo-technologies.com Diablo_Tech