UFS 2.0 NAND Device Controller with SSD-Like Higher Read Performance

Similar documents
Caching Mechanisms for Mobile and IOT Devices

Crucial MX inch Internal Solid State Drive

Measuring Interface Latencies for SAS, Fibre Channel and iscsi

SSDs tend to be more rugged than hard drives with respect to shock and vibration because SSDs have no moving parts.

Native PCIe SSD Controllers A Next-Generation Enterprise Architecture For Scalable I/O Performace

NAND Flash Architecture and Specification Trends

Scaling from Datacenter to Client

SSDs tend to be more rugged than hard drives with respect to shock and vibration because SSDs have no moving parts.

Host Memory Buffer (HMB) based SSD System. Forum J-31: PCIe/NVMe Storage Jeroen Dorgelo Mike Chaowei Chen

Indexing on Solid State Drives based on Flash Memory

Technologies Supporting Evolution of SSDs

SALSA Flash-Optimized Software-Defined Storage

QuickSpecs. HP Solid State Drives (SSDs) for Workstations. Overview

FAWN - a Fast Array of Wimpy Nodes

The Transition to PCI Express* for Client SSDs

Nasir Memon Polytechnic Institute of NYU

N /150/151/160 RAID Controller. N MegaRAID CacheCade. Feature Overview

SSDs tend to be more rugged than hard drives with respect to shock and vibration because SSDs have no moving parts.

Intel Xeon Processor E5-2600

SATA SSD Series. InnoDisk. Customer. Approver. Approver. Customer: Customer. InnoDisk. Part Number: InnoDisk. Model Name: Date:

Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance. Alex Ho, Product Manager Innodisk Corporation

FIOS: A Fair, Efficient Flash I/O Scheduler. Stan Park and Kai Shen presented by Jason Gevargizian

MS Exchange Server Acceleration

HP 80 GB, 128 GB, and 160 GB Solid State Drives for HP Business Desktop PCs Overview. HP 128 GB Solid State Drive (SSD)

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011

SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

Samsung Solid State Drive RAPID mode

HP 128 GB Solid State Drive (SSD) *Not available in all regions.

OCZ s NVMe SSDs provide Lower Latency and Faster, more Consistent Performance

Marvell DragonFly Virtual Storage Accelerator Performance Benchmarks

Solid State Drive Architecture

Secure Cloud Storage and Computing Using Reconfigurable Hardware

HP 80 GB, 128 GB, and 160 GB Solid State Drives for HP Business Desktop PCs Overview. HP 128 GB Solid State Drive (SSD)

An Overview of Flash Storage for Databases

HP Z Turbo Drive PCIe SSD

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

PCIe In Industrial Application

Data Center Solutions

Universal Flash Storage: Mobilize Your Data

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Power Reduction Techniques in the SoC Clock Network. Clock Power

Virtualization of the MS Exchange Server Environment

Data Center Storage Solutions

Radeon HD 2900 and Geometry Generation. Michael Doggett

Express5800 Scalable Enterprise Server Reference Architecture. For NEC PCIe SSD Appliance for Microsoft SQL Server

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

Memory Channel Storage ( M C S ) Demystified. Jerome McFarland

Important Differences Between Consumer and Enterprise Flash Architectures

Price/performance Modern Memory Hierarchy

Measuring Cache and Memory Latency and CPU to Memory Bandwidth

Methods to achieve low latency and consistent performance

Outline. CS 245: Database System Principles. Notes 02: Hardware. Hardware DBMS Data Storage

Improving Microsoft Exchange Performance Using SanDisk Solid State Drives (SSDs)

Secure Portable Data Server. 25/06/2012 Alexei Troussov SMIS team INRIA Rocquencourt

Marvell DragonFly. TPC-C OLTP Database Benchmark: 20x Higher-performance using Marvell DragonFly NVCACHE with SanDisk X110 SSD 256GB

Memory Architecture and Management in a NoC Platform

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Can Flash help you ride the Big Data Wave? Steve Fingerhut Vice President, Marketing Enterprise Storage Solutions Corporation

enabling Ultra-High Bandwidth Scalable SSDs with HLnand

SSDs tend to be more rugged than hard drives with respect to shock and vibration because SSDs have no moving parts.

Certification Document macle GmbH Grafenthal-S1212M 24/02/2015. macle GmbH Grafenthal-S1212M Storage system

Flash Fabric Architecture

Data Center Solutions

Samsung emmc. FBGA QDP Package. Managed NAND Flash memory solution supports mobile applications BROCHURE

Speeding Up Cloud/Server Applications Using Flash Memory

Testing & Assuring Mobile End User Experience Before Production. Neotys

Managing the evolution of Flash : beyond memory to storage

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

Flash-Friendly File System (F2FS)

Speed and Persistence for Real-Time Transactions

Optimizing SQL Server AlwaysOn Implementations with OCZ s ZD-XL SQL Accelerator

1 / 25. CS 137: File Systems. Persistent Solid-State Storage

Fusionstor NAS Enterprise Server and Microsoft Windows Storage Server 2003 competitive performance comparison

The Why and How of SSD Performance Benchmarking. Esther Spanjer, SMART Modular Easen Ho, Calypso Systems

Optimizing Web Infrastructure on Intel Architecture

Introduction to Microprocessors

Delivering Accelerated SQL Server Performance with OCZ s ZD-XL SQL Accelerator

MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Data Center Storage Solutions

Embedded Operating Systems in a Point of Sale Environment. White Paper

Programming NAND devices

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

Fusion iomemory iodrive PCIe Application Accelerator Performance Testing

How SSDs Fit in Different Data Center Applications

ZD-XL SQL Accelerator 1.5

Boost Database Performance with the Cisco UCS Storage Accelerator

Transcription:

UFS 2.0 NAND Device Controller with SSD-Like Higher Read Performance Konosuke Watanabe Toshiba Santa Clara, CA 1

How did our embedded NAND storage device get SSD-like higher read performance? Santa Clara, CA 2

Outline Background Approaches Example results Santa Clara, CA 3

Embedded NAND Storage Device Host SoC Link NAND device controller NAND chips Single BGA package chip Connects to host SoC with standardized link Contains controller and NAND chips It must be small and lower power 10-20mm 1-2mm 10-20mm 1-2 W (active)...but now higher performance is also expected UFS 2.0: 1160MB/s (5.8Gbps x 2-lane) Santa Clara, CA 4 cf. SATA 3.0: 600MB/s

4KB random read (KIOPS) Performance of Current Embedded NAND Storage Device Products is Quite Low 100 50 DensBits DB3610 (128G) SanDisk inand Extreme (128G) Embedded SSD Samsung KLMCG8GEAC (64G) Samsung SSD 840 EVO (250G) SanDisk Ultra Plus SSD (256G) Crucial M500 (120G) Intel 335 (240G) SanDisk i100 (128G) SSD Intel 520 (240G) OCZ Vertex 3.20 (240G) OCZ Vertex 450 (256G) Corsair Force Series LX (256G) Intel 530 (240G) 0 100 200 300 400 500 600 700 Embedded Sequential read (MB/s) Read performance of embedded NAND storage device and SSD products Santa Clara, CA 5

Reason for Their Poor Performance and Our Challenge Embedded NAND storage device has strict limitation It must be small and lower power consumption Cannot take SSD s rich man s approach Improve performance by utilizing rich resources Massive NAND chips / channels Powerful CPU Large capacity RAM Goal: Achieve SSD-like higher performance without rich resources Focused on read performance of UFS 2.0 device Santa Clara, CA 6

Outline Background Approaches Example results Santa Clara, CA 7

Data Paths Strengthen (1) Conventional NAND Device Controller Embedded NAND storage device NAND device controller Logic + CPU Data paths NAND chips Host Host I/F NAND I/F Santa Clara, CA 8

Data Paths Strengthen (2) Strengthened Data Paths Minimally 5.8Gbps x 2-lane Embedded NAND storage device NAND device controller Logic + CPU 800MB/s 400MB/s x 2ch Host SoC M-PHY HOST I/F NAND I/F NAND chips Santa Clara, CA 9

Host NAND device controller NAND User data FTL data Random Read Latency Reduction (1) Latency of Random Read Random read is accompanied by a couple of NAND reads Target data (user data) read A couple of FTL data reads Special information for address translation NAND read latency is several 10s μsec Random read latency is larger than 100μsec Can be reduced by caching FTL data on large capacity RAM With increasing of package size and power consumption Santa Clara, CA 10

Host Memory FTL data cache Random Read Latency Reduction (2) Using Unified Memory Architecture (UMA) 1. Made NAND device controller available to access host memory Extended UFS standard 2. Cached FTL data on host memory 1. Reads host memory to get FTL data NAND device controller NAND User data FTL data Santa Clara, CA 2. Reads NAND to get user data Host memory read latency << NAND read latency Reduces random read latency by 50% in our evaluation environment With no increase in package size and chip power consumption 11

Random Read Throughput Boost (1) Parallel NAND Reading NAND chips can be read in parallel Multiple NAND chips (and channels) Multiple outstanding NAND read requests Parallelism is determined by number of NAND chips, channels and read requests SSD Our device NAND device controller NAND device controller Santa Clara, CA 12

Random Read Throughput Boost (2) Random Read Command Processor (RRCP) UM FTL data Pipeline1 Pipeline2 Pipeline3 NAND request manager Chip managers Channel arbiters Front-end Parallel out-of-order random read command processing Back-end Efficient NAND read access arbitration and management 7.0 NAND chips are activated in parallel on average in our evaluation environment (with 8 NAND chips) Santa Clara, CA 13

Outline Background Approaches Example results Santa Clara, CA 14

SSD-like Read Performance is Achieved 4KB random read (KIOPS) 100 50 0 Embedded SSD Our work 100 200 300 400 500 600 700 Sequential read (MB/s) 690MB/s 66.3KIOPS Our work (8 NAND chips) Santa Clara, CA 15

Package Size and Power Consumption can be Maintained Package size (with NAND chips) Package power Active (Seq. read) < 1.5 W Idle / Sleep 11.5mm x 13.0mm x 1.2mm < 1.45 mw / < 0.30 mw Depends on storage capacity cf. Typical level Size 10-20mm 1-2mm 10-20mm Power consumption 1-2 W (active) Santa Clara, CA 16

Conclusion Developed UFS 2.0 embedded NAND storage device Improved read performance with three approaches Strengthen data paths minimally Reduce random read latency by using unified memory architecture and FTL data caching Boost random read throughput by maximizing read command parallelism with Random Read Command Processor SSD-like read performance is achieved 4KB random read: 66.3KIOPS Sequential read: 690MB/s Package size and power consumption can be maintained Package size (with NAND chips) : 11.5mm x 13.0mm x 1.2mm Active (seq. read / write) : < 1.5 W Santa Clara, CA 17