A Study of Application Performance with Non-Volatile Main Memory



Similar documents
SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011

Computer Engineering and Systems Group Electrical and Computer Engineering SCMFS: A File System for Storage Class Memory

The Data Placement Challenge

Let s Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems

ebay Storage, From Good to Great

Boost SQL Server Performance Buffer Pool Extensions & Delayed Durability

Speeding Up Cloud/Server Applications Using Flash Memory

NV-DIMM: Fastest Tier in Your Storage Strategy

Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

Web Technologies: RAMCloud and Fiz. John Ousterhout Stanford University

Price/performance Modern Memory Hierarchy

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

NVRAM-aware Logging in Transaction Systems

Memory Channel Storage ( M C S ) Demystified. Jerome McFarland

Taking Linux File and Storage Systems into the Future. Ric Wheeler Director Kernel File and Storage Team Red Hat, Incorporated

NVRAM-aware Logging in Transaction Systems. Jian Huang

Configuring Apache Derby for Performance and Durability Olav Sandstå

Benchmarking Cassandra on Violin

Blurred Persistence in Transactional Persistent Memory

Scaling from Datacenter to Client

89 Fifth Avenue, 7th Floor. New York, NY White Paper. HP 3PAR Adaptive Flash Cache: A Competitive Comparison

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Flash s Role in Big Data, Past Present, and Future OBJECTIVE ANALYSIS. Jim Handy

1 Storage Devices Summary

The IntelliMagic White Paper: Storage Performance Analysis for an IBM Storwize V7000

IOmark- VDI. Nimbus Data Gemini Test Report: VDI a Test Report Date: 6, September

Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology

NV-Heaps: Making Persistent Objects Fast and Safe with Next-Generation, Non-Volatile Memories Joel Coburn

An Overview of Flash Storage for Databases

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

NAND Flash Architecture and Specification Trends

VDI Optimization Real World Learnings. Russ Fellows, Evaluator Group

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

Application-Focused Flash Acceleration

THE SUMMARY. ARKSERIES - pg. 3. ULTRASERIES - pg. 5. EXTREMESERIES - pg. 9

Benchmarking Hadoop & HBase on Violin

Storage Class Memory Support in the Windows Operating System Neal Christiansen Principal Development Lead Microsoft

Software Defined Microsoft. PRESENTATION TITLE GOES HERE Siddhartha Roy Cloud + Enterprise Division Microsoft Corporation

Ground up Introduction to In-Memory Data (Grids)

FlashSoft Software from SanDisk : Accelerating Virtual Infrastructures

Software-defined Storage Architecture for Analytics Computing

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Hadoop: Embracing future hardware

How To Scale Myroster With Flash Memory From Hgst On A Flash Flash Flash Memory On A Slave Server

Intel Solid- State Drive Data Center P3700 Series NVMe Hybrid Storage Performance

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

Can High-Performance Interconnects Benefit Memcached and Hadoop?

High Frequency Trading and NoSQL. Peter Lawrey CEO, Principal Consultant Higher Frequency Trading

AIX NFS Client Performance Improvements for Databases on NAS

nvm malloc: Memory Allocation for NVRAM

Promise of Low-Latency Stable Storage for Enterprise Solutions

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate. Nytro Flash Accelerator Cards

Data Center Solutions

Client-aware Cloud Storage

Boost Database Performance with the Cisco UCS Storage Accelerator

Sistemas Operativos: Input/Output Disks

Data Center and Cloud Computing Market Landscape and Challenges

Getting the Most Out of Flash Storage

CS 6290 I/O and Storage. Milos Prvulovic

How To Write A Database Program

StarWind Virtual SAN for Microsoft SOFS

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

Building a Flash Fabric

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers

Research Statement Yiying Zhang

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

ioscale: The Holy Grail for Hyperscale

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation

Implications of Storage Class Memories (SCM) on Software Architectures

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

PostgreSQL Performance Characteristics on Joyent and Amazon EC2

SMB Direct for SQL Server and Private Cloud

Fast Crash Recovery in RAMCloud

GraySort and MinuteSort at Yahoo on Hadoop 0.23

Transcription:

A Study of Application Performance with Non-Volatile Main Memory Yiying Zhang, Steven Swanson

2 Memory Storage Fast Slow Volatile In bytes Persistent In blocks Next-Generation Non-Volatile Memory (NVM) 2015

The Landscape of Memory and Storage Byte-Addressable Is Changing! DRAM $ Next-Gen NVM Phase Change Memory Memristor STT-RAM Persistent Flash Disk 1ns 10ns 100ns 1us 10us 100us 1ms 10ms Latency 3

NVM Read Latency Kosuke Suzuki and Steven Swanson "The Non-Volatile Memory Technology Database (NVMDB)", Department of Computer Science & Engineering, University of California, San Diego technical report CS2015-1011, May 2015. (http://nvmdb.ucsd.edu) 4

NVM Write Latency Kosuke Suzuki and Steven Swanson "The Non-Volatile Memory Technology Database (NVMDB)", Department of Computer Science & Engineering, University of California, San Diego technical report CS2015-1011, May 2015. (http://nvmdb.ucsd.edu) 5

NVM Density Kosuke Suzuki and Steven Swanson "The Non-Volatile Memory Technology Database (NVMDB)", Department of Computer Science & Engineering, University of California, San Diego technical report CS2015-1011, May 2015. (http://nvmdb.ucsd.edu) 6

7 Non-Volatile Next-Generation Main Non-Volatile Memory (NVMM) Memory Byte Addressable DRAM CPU CPU Cache NVMM Low Latency Persistence Capacity

8 Outline Introduction Application performance with NVMM NVMM in data centers Conclusion

9 Intel NVMM Emulator Use DRAM to emulate different NVMMs Delay read latency by increasing CPU stalls Read and write bandwidth throttling by limiting DDR transfer rate Emulate data persistence by flushing CPU caches and adding software delay

10 NVMM Data Persistence Flush CPU cache clflush: flush one cache line at a time clflushopt: reduces ordering guarantee clwb: does not force to throw away cache lines WOF: read followed by non-temporal write Ensure data durability in NVMM and ordering pcommit sfence, mfence

msync latency (msec) NVMM Data Persistence Cost Update 1 byte, followed by msync to a range of increasing size 300 250 default clflushopt 200 HDD 150 WOF Making cache flush fast reduces data persistence cost clwb Still costly 100 when syncing a lot of data SSD 50 0 1 10 100 msync size (MB) 11

12 Selective Persistent Flushing (SPF) Goal: only flush modified data Use page table entry dirty bit Only flush cache lines in dirty pages Sub-page data flushing Change system call granularity (e.g., msync) Alternative: aplication modification

msync latency (msec) 13 Reduce NVMM Data Persistence Cost 300 250 200 150 default clflushopt HDD WOF clwb 100 50 SSD SPF 0 1 10 100 msync size (MB)

14 Evaluation Configurations Start with DRAM configuration Adding PCM configurations one at a time Configuration Read Latency (ns) Throughput Ratio to DRAM Write Barrier Latency (μs) Cache Flush Mode PDRAM 150 1 0 clflush PM-Lat 300 1 0 clflush PM-BW 300 1/8 0 clflush NVMM-Raw 300 1/8 1 clflush NVMM-WOF 300 1/8 1 WOF NVMM-Opt 300 1/8 1 SPF

15 Evaluated Applications File system applications: FileBench Key-value store: MongoDB Relational database: MySQL Using NVMM as big memory: Memcache

Throughput Ratio to PDRAM 16 FileBench Varmail Performance 1.2 1 0.8 0.6 0.4 0.2 0

Throughput Ratio to PDRAM MongoDB Key-Value Store Performance MongoDB fsync_safe mode, YCSB workloads Read latency = 300ns, read&write bandwidth = 1/8 DRAM bandwidth 1000 100 10 1 50% read, 50% write 95% read, 5% write 0.1 PDRAM PM-Lat PM-BW NVMM-Raw NVMM-WOF NVMM-Opt SSD HDD DRAM 17

18 Summary of NVMM Performance Net-gen NVM can offer fast, persistent storage App performance with NVMM can be close to DRAM Making data persistent can be costly Techniques to reduce the cost of data persistence and the amount of data to make persistent

19 Outline Introduction Application performance with NVMM NVMM in data centers Conclusion

20 NVMM in Data Center NVMM Data Center Reliability Availability Storage Data Replication Low Latency Consistency Protocol Networking Stack Software Overhead

Existing data replication is too slow for NVMM! 21

Mojim: Replicated, Reliable, Highly-Available NVMM with Good Performance 22

23 Mojim Approach Replication Protocol: efficient 2-Tier OS: optimized, generic layer Networking: efficient RDMA Interface: memory-based Architecture: cache-flush optimization

24 Mojim Interface Good performance Applications load store open atomic persist Easy-to-use, generic interface Mojim NVMM Sync Point

25 Mojim Architecture Primary Node Applications Mirror Node Applications load store open atomic persist load open Mojim Master Mojim Mirror NVMM RDMA NVMM

26 Async Mode Primary Node Applications Mojim Master NVMM RDMA Weak Consistency Mirror Node Applications Mojim Mirror NVMM Good performance

27 Sync Mode Primary Node Applications Mojim Master NVMM RDMA Strong Consistency Mirror Node Applications Mojim Mirror NVMM Good performance

Sync-2tier Mode Primary Node Applications More Redundancy Mirror Node Applications Mojim Master NVMM RDMA Mojim Mirror 2 strongly-consistent copies + weakly-consistent copies Good performance NVMM RDMA/Ethernet Mojim Backup Flash/Disk 28

Latency (ms) 29 MongoDB Key-Value Pair Load Testbed: DRAM as NVMM proxy, 40Gbps Infiniband Workload: key-value pair insert 3 2.5 2 1.5 Replication: 2.8X Networking: 4.9X And more reliable 1 0.5 0 MongoDB Mojim Async Mojim Sync-2tier

Latency (usec) Data Persistence Latency Testbed: Hardware NVMM emulator, 40Gbps Infiniband Workload: Persist random 4KB regions in a 4GB mmap d file 12 10 8? 6 4 2 0 Un-replicated Async Sync Sync-disk 30

flush NVMM CPU Cache Unreplicated NVMM CPU Cache RDMA Sync NVMM 1 RDMA roundtrip < clflush the data Ensures cache coherence without clflush Strongly ordered, flush one cache line at a time 31

32 Summary of NVMM Replication Non-volatile memory in data center Efficient data replication Flexible modes Replication even faster than no replication

33 Conclusion NVMM can potentially fill the gap between memory and storage Making data persistent can be costly Mojim: first step of NVMM in data centers

34 Thank you! Questions? yiyingzhang@cs.ucsd.edu NVMDB: http://nvmdb.ucsd.edu