Persistent Memory and Linux: New Storage Technologies and Interfaces
Ric Wheeler, Senior Manager, Kernel File and Storage Team, Red Hat, Inc.


Overview
- What is Persistent Memory?
- Challenges with Existing SSD Devices & Linux
- Crawl, Walk, Run with Persistent Memory Support
- Getting New Devices into Linux
- Related Work

What is Persistent Memory?

Persistent Memory
A variety of new technologies are coming from multiple vendors. The critical features are that these new parts:
- Are byte addressable
- Do not lose state on power failure
The critical similarities are that they are roughly like DRAM:
- Same cost point
- Same density
- Same performance

Similarities to DRAM
If the parts have the same cost and capacity as DRAM:
- They will not reach the capacity of traditional, spinning hard drives
- Scaling up to a system with only persistent memory will be expensive
- This implies a need to look at caching and tiered storage techniques
If they have the same performance as DRAM:
- They will stress our IO stack, which already struggles to reach the performance levels of current PCI-e SSDs

Challenges with Existing SSD Devices & Linux


Early SSDs and Linux
The earliest SSDs looked like disks to the kernel:
- Fibre Channel attached high-end DRAM arrays (Texas Memory Systems, etc.)
- S-ATA and SAS attached flash drives
They plugged in seamlessly to the existing stack:
- Block based IO
- IOP rates could be sustained by a well tuned stack
- Used the full block layer
- Used a normal protocol (SCSI or ATA commands)

PCI-e SSD Devices
These devices push the boundaries of the Linux IO stack:
- Some devices emulated AHCI devices
- Many vendors created custom drivers to avoid the overhead of using the whole stack
Performance challenges:
- Linux block based IO has not been tuned as well as the network stack to support millions of IOPS
- IO scheduling was developed for high latency devices

Performance Limitations of the Stack
PCI-e devices are pushing us beyond our current IOP rate; we are looking at a target of 1 million IOPS per device. We are working through many lessons learned in the networking stack:
- Multiqueue support for devices
- IO scheduling (remove plugging)
- SMP/NUMA affinity for device specific requests
- Lock contention
Some fixes gain performance and lose features.

Tuning Linux for an SSD
Take advantage of the Linux /sys/block parameters:
- The rotational flag is key
- The alignment fields can be extremely useful: http://mkp.net/pubs/storage-topology.pdf
- It is almost always a good idea not to use the CFQ IO scheduler (see the sketch below)
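As a rough illustration, here is a minimal sketch of inspecting those /sys/block parameters from C. The device name "sda" is an assumption; substitute your own SSD's name.

```c
/* Minimal sketch: read the queue parameters mentioned above for one device.
 * The device name "sda" is an assumption. */
#include <stdio.h>
#include <string.h>

static int read_sysfs(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, len, f)) { fclose(f); return -1; }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* trim trailing newline */
    return 0;
}

int main(void)
{
    char value[256];

    /* 0 means the kernel believes the device is non-rotational (an SSD) */
    if (read_sysfs("/sys/block/sda/queue/rotational", value, sizeof(value)) == 0)
        printf("rotational:      %s\n", value);

    /* the active IO scheduler is shown in brackets, e.g. "noop deadline [cfq]" */
    if (read_sysfs("/sys/block/sda/queue/scheduler", value, sizeof(value)) == 0)
        printf("scheduler:       %s\n", value);

    /* one of the alignment/topology hints exported by the device */
    if (read_sysfs("/sys/block/sda/queue/optimal_io_size", value, sizeof(value)) == 0)
        printf("optimal_io_size: %s\n", value);

    /* Switching away from CFQ would be a root-only write of e.g. "deadline"
     * to .../queue/scheduler; the sketch stays read-only on purpose. */
    return 0;
}
```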

Device Driver Choice
Will one driver emerge for PCI-e cards?
- NVMe: http://www.nvmexpress.org
- SCSI over PCI-e: http://www.t10.org/members/w_sop-.htm
- Vendor specific drivers; most Linux vendors support a range of open drivers
Open vs closed source drivers:
- Linux vendors have a strong preference for open source drivers
- Drivers ship with the distribution, so there is no separate installation
- Enterprise distribution teams can fix code issues directly

Challenges Cross Multiple Groups
Developers focus on relatively narrow areas of the kernel:
- SCSI, S-ATA and vendor drivers are all handled by different teams
- Block layer expertise is a small community
- Each file system has its own team
Each community of developers spans multiple companies.

Crawl, Walk, Run

Persistent Memory & Byte Aligned Access
DRAM is used to cache all types of objects: file system metadata and user data. Moving away from this model is a challenge:
- IO is sent in multiples of the file system block size (see the sketch below)
- We rely on journal or btree based updates for consistency
- State must be resilient over crashes & reboots: the on-disk state is the master view, and the DRAM state differs from it
These new devices do not need block IO.
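To make the block-size constraint concrete, the sketch below uses O_DIRECT, where updating even a single byte means rewriting a whole aligned block. The file name "data.img" (assumed to exist and be at least 4 KiB) and the 4096-byte block size are assumptions.

```c
/* Illustration: with block IO, a one-byte logical update is a full
 * read-modify-write of an aligned block. */
#define _GNU_SOURCE   /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK 4096

int main(void)
{
    int fd = open("data.img", O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, BLOCK, BLOCK)) return 1;  /* O_DIRECT needs alignment */

    /* fetch the whole block just to flip one byte */
    if (pread(fd, buf, BLOCK, 0) != BLOCK) { perror("pread"); return 1; }
    ((char *)buf)[42] = 'X';                  /* the one byte we care about */
    if (pwrite(fd, buf, BLOCK, 0) != BLOCK)   /* unaligned or short writes fail */
        { perror("pwrite"); return 1; }

    /* on a byte addressable persistent memory part, the same update could
     * be a single store instruction, with no 4 KiB round trip */
    close(fd);
    free(buf);
    return 0;
}
```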

Crawl Phase
Application developers are slow to take advantage of new hardware:
- Most applications will continue to use read/write block oriented system calls for years to come
- Only a few high end applications will take advantage of the byte addressable capabilities
So we need to hide the persistent memory below our existing stack, and make it as fast and low latency as possible!

Crawling Along
Block level driver for persistent memory parts:
- Best is one driver that supports multiple types of parts
Enhance performance of the IO path:
- Leverage the work done to optimize the stack for PCI-e SSDs
- Target: millions of IOPS?
Build on top of the block driver:
- Block level caching
- File system or database journals?
- As a metadata device for device mapper, btrfs?

Crawl Status
Block drivers are being developed. Block caching mechanisms are upstream:
- bcache from Kent Overstreet in 3.10: http://bcache.evilpiepirate.org
- A new device mapper dm-cache target: its modular policy allows anyone to write their own policy, and it reuses the persistent-data library from thin provisioning
- PMDB block device from Intel Research for persistent memory: https://github.com/linux-pmfs

Walk Phase
Modify existing file systems or database applications:
- Can the journal mechanism be reworked to avoid barrier operations and flushes?
- Dynamically detect persistent memory and use it for metadata storage (inodes, allocation, etc.)?
Provide both block structured IO and byte aligned IO support (see the sketch below):
- Block IO for non-modified applications
- Memory mapped access for power applications
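A minimal sketch of those two access paths on one file follows. The file name "pmfile" (assumed to exist and be at least 4 KiB) is an assumption, and msync() stands in for whatever persistence primitive the device ultimately exposes.

```c
/* Two paths to the same file: block structured IO for unmodified
 * applications, memory mapped byte aligned access for power applications. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define LEN 4096

int main(void)
{
    int fd = open("pmfile", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* path 1: block structured IO, what unmodified applications keep using */
    char block[LEN] = "written through the block path";
    if (pwrite(fd, block, LEN, 0) != LEN) { perror("pwrite"); return 1; }

    /* path 2: byte aligned access through a shared mapping */
    char *pm = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pm == MAP_FAILED) { perror("mmap"); return 1; }

    pm[17] = '!';               /* a single-byte, byte aligned update */
    msync(pm, LEN, MS_SYNC);    /* push it to the persistent media */

    munmap(pm, LEN);
    close(fd);
    return 0;
}
```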

Walk Phase Status
- File systems like btrfs have looked at dynamically steering load to SSDs, though more based on latency than on persistence
- We are looking at the mmap interface to see what changes are needed
- We need to educate developers about the new parts & get parts into community hands

Run Phase
- Develop new APIs at the device level for consumption by the file and storage stack
- Develop new APIs and a programming model for applications (a sketch follows below)
- Create new file systems that consume the device APIs and provide the application API (see Nisha's talk on NVM interfaces, Wednesday at 3:00PM)
- Modify all performance critical applications to use the new APIs
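Purely as a hypothetical sketch of what such an application programming model might look like: the names pm_map() and pm_persist() below are invented for illustration, and are backed here by plain mmap()/msync(); a real implementation would map the device directly and use CPU cache flush instructions.

```c
/* Hypothetical application API for persistent memory: map a region,
 * update it in place, then make the update durable. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static void *pm_map(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping keeps the file alive underneath */
    return p == MAP_FAILED ? NULL : p;
}

static int pm_persist(void *addr, size_t len)
{
    /* round down to a page boundary, as msync() requires */
    uintptr_t page = (uintptr_t)addr & ~(uintptr_t)(sysconf(_SC_PAGESIZE) - 1);
    return msync((void *)page, len + ((uintptr_t)addr - page), MS_SYNC);
}

int main(void)
{
    uint64_t *counter = pm_map("pmfile", 4096);   /* file name is an assumption */
    if (!counter) { perror("pm_map"); return 1; }

    (*counter)++;                            /* update persistent state in place */
    pm_persist(counter, sizeof(*counter));   /* make the update durable */
    return 0;
}
```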

Run Status
Storage Networking Industry Association (SNIA) Working Group on NVM:
- Working on a programming model for this class of parts: http://snia.org/forums/sssi/nvmp
Several new file systems for Linux are being actively worked on:
- Intel Research's PMFS at https://github.com/linux-pmfs/pmfs.git
- FusionIO's directfs, based on atomic writes
This might be as painful as multi-threading, or as teaching applications to use fsync()! A correct fsync() pattern is sketched below.
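For context on why the fsync() comparison stings, here is a minimal sketch of making a simple file creation durable; it takes several easy-to-forget steps. The file and directory names are assumptions.

```c
/* Durable file creation: fsync the file *and* its parent directory. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("dir/newfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char msg[] = "must survive a crash";
    if (write(fd, msg, sizeof(msg)) != sizeof(msg)) { perror("write"); return 1; }

    /* step 1: flush the file's data and metadata */
    if (fsync(fd) != 0) { perror("fsync file"); return 1; }
    close(fd);

    /* step 2 (often forgotten): flush the directory entry, or the file
     * itself may be missing after a crash */
    int dirfd = open("dir", O_RDONLY);
    if (dirfd < 0) { perror("open dir"); return 1; }
    if (fsync(dirfd) != 0) { perror("fsync dir"); return 1; }
    close(dirfd);
    return 0;
}
```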

Getting New Devices into Linux

What is Linux?
A set of projects and companies:
- Various free and fee-based distributions
- Hardware vendors from handsets up to mainframes
- Many different development communities
It can be a long road to get a new bit of hardware enabled:
- Open source allows any party to write their own file system or driver
- Different vendors have different paths to full support
- No single party can get your feature into distributions

Not Just the Linux Kernel
Most features rely on user space components. Red Hat Enterprise Linux (RHEL) has hundreds of projects, each with:
- Its own development community (upstream)
- Its own rules and processes
- Its own choice of licenses
Enterprise Linux vendors:
- Work in the upstream projects
- Tune, test and configure
- Support the shipping versions

The Life Span of a Linux Enhancement
Origin of a feature:
- Driven through standards bodies like T10, SNIA or the IETF
- Pushed by a single vendor
- Created by a developer or at a research group
Proposed in the upstream community:
- Prototype patches posted
- Feedback and testing
- Advocacy for inclusion
Then it moves into a community distribution like Fedora, and is finally shipped and supported by an enterprise distribution.

The Linux Community is Huge
Most active contributors in the 3.8 kernel (by changesets):
- No affiliation: 12.8%
- Red Hat: 9.0%
- Intel: 8.7%
- Unknown: 7.4%
- LINBIT: 4.8%
- Linaro: 4.6%
- Texas Instruments: 4.0%
- Vision Engraving Systems: 3.5%
- Samsung: 3.3%
- SUSE: 2.5%
Statistics from: http://lwn.net/articles/536768

Linux Storage & File & MM Summit 2013

Resources & Questions
Resources:
- Linux Weekly News: http://lwn.net/
- Mailing lists like linux-scsi, linux-ide, linux-fsdevel, etc.
- SNIA NVM TWG: http://snia.org/forums/sssi/nvmp
Storage & file system focused events:
- LSF workshop
- Linux Foundation & Linux Plumbers events