Practical Applications of Lustre/ZFS Hybrid Systems LUG 2014 Miami FL



Practical Applications of Lustre/ZFS Hybrid Systems LUG 2014 Miami FL Q2-2014 Josh Judd, CTO

Agenda
- Brief review: Lustre over ZFS
- Brief overview: platforms used in example solutions
- Discuss three cases for SSD hybrids in HPC and HPBC shops
- Summarize findings
Slide 2

Brief Review of Lustre/ZFS
- ZFS replaces ldiskfs/ext4 as the backing FS
- It also provides a complete, feature-rich RAID solution
- ZFS supports SSD/HDD hybrid modes natively
- LLNL developed this approach and uses it successfully in systems like Sequoia
- Many other HPC shops have adopted the approach
- WARP provides commercial support and enhancements
- Historically: separate ZFS RAIDs connected to OSSs via RDMA
- Now: a fully integrated ZFS-on-Linux (ZoL) approach, using code from LLNL
- The current-model WARP system runs ZFS and Lustre on a single controller
  - No more need for legacy RAIDs
  - Reduces cost and rack footprint; improves performance and MTBF
Slide 3

Brief Review of Lustre/ZFS (cont.)
ZFS model: read-optimized SSDs provide a low-cost second location for read cache. If data is not in ARC, ZFS looks in L2ARC. If the L2ARC drives fail, the data is still safe on main storage.
Useful L2ARC-to-pool ratios:
- Small: cache just metadata (actually even useful for checkpoint FSs)
- Medium: general workloads
- Large (maybe 1:1): analytics workloads
Slide 4

Brief Review of Lustre/ZFS (cont.)
Write path through the cache tiers: ARC (RAM), L2ARC (SSD), ZIL (SSD), Storage Pool (Disk).
1. ARC becomes full
2. New blocks arrive to be written
3. ZFS may move Least Recently Used (LRU) or Least Recently Accessed (LRA) blocks to L2ARC, or discard them, since they already exist in the storage pool
Slide 5

Brief Review of Lustre/ZFS (cont.)
Read path through the cache tiers: ARC (RAM), L2ARC (SSD), ZIL (SSD), Storage Pool (Disk).
1. Random read from ARC
2. If the block is not in ARC, read it from L2ARC into ARC
3. If it is not in L2ARC either, read it from the storage pool into ARC
Slide 6
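The two data flows on the slides above (eviction on the write side, tiered lookup on the read side) can be sketched as a toy Python model. This is a deliberate simplification: the real ARC is an adaptive replacement cache (MRU/MFU), not plain LRU, and capacities here are block counts rather than bytes.

```python
# Toy model of the ZFS two-tier read cache: ARC (RAM) spills its
# least-recently-used blocks into L2ARC (SSD); reads probe ARC,
# then L2ARC, then the HDD storage pool.
from collections import OrderedDict

class HybridCache:
    def __init__(self, arc_size, l2arc_size):
        self.arc = OrderedDict()    # RAM tier, oldest entry first
        self.l2arc = OrderedDict()  # SSD tier, oldest entry first
        self.arc_size = arc_size
        self.l2arc_size = l2arc_size
        self.pool = {}              # backing HDD storage

    def write(self, block, data):
        # Every write lands in the pool; a hot copy lives in ARC.
        self.pool[block] = data
        self._arc_insert(block, data)

    def _arc_insert(self, block, data):
        self.arc[block] = data
        self.arc.move_to_end(block)
        while len(self.arc) > self.arc_size:
            # ARC is full: demote the LRU block to L2ARC rather
            # than discarding it outright.
            lru, val = self.arc.popitem(last=False)
            self.l2arc[lru] = val
            while len(self.l2arc) > self.l2arc_size:
                # Falls off the SSD tier; the pool copy remains.
                self.l2arc.popitem(last=False)

    def read(self, block):
        if block in self.arc:               # 1. hit in RAM
            self.arc.move_to_end(block)
            return self.arc[block], "ARC"
        if block in self.l2arc:             # 2. miss -> check SSD
            data = self.l2arc.pop(block)
            self._arc_insert(block, data)   # promote back into ARC
            return data, "L2ARC"
        data = self.pool[block]             # 3. miss -> read from HDD
        self._arc_insert(block, data)
        return data, "pool"
```

Even in this toy form it shows why the hybrid works: a block pushed out of RAM by memory pressure is still an SSD hit on re-read, and only blocks that age off both tiers pay the HDD price.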

WARP WDS-8460 primary building block
- 60x 4TB 6Gbps SAS drive modules per 4U enclosure (240TB)
- Moving to 6TB drives (360TB/enclosure; up to 3.6PB/rack)
- Optional SSDs and NVRAM to create a hybrid HDD/SSD system
- Dual I/O controllers for redundancy and performance
- Hot-pluggable drives, I/O controllers, and power/cooling modules
- Dual Sandy Bridge CPU-based controllers or SAS JBOD modules
- Upgradable to Ivy Bridge near term; dual Haswell mid term
Slide 7

WARP WDS-8260 SSD-optimized system
- 60x 6Gbps 2.5" SAS SSD or HDD modules per 2U enclosure
- Same dual I/O controllers as in the 4U enclosure
- Up to 120TB pure SSD or HDD; up to 100TB as hybrid
- Use as a turn-key HA MDS/MGS for Lustre
- Use as a pure SSD system for IOPS-intensive HPC or analytics workloads
Slide 8

Solution #1: Separate Checkpoint vs. General Data
Scenario:
- Parallel FS is being used for multiple tasks
- Read/write mix is 40/60 to 60/40
- Reads are not from checkpoint or other classic HPC data, and may be interfering with it
- Analysis shows a significant portion of the reads hit the same data multiple times
Solution:
- Provide an accelerated read mount point for users with such workloads
- Attach SSDs to the associated OSSs as L2ARC ZFS read cache
Benefit:
- ZFS selectively caches data onto the SSDs if there is reason to suspect it will be read again
- It doesn't just act like a write-through or write-back cache, or cache the most recent writes
- Allows using inexpensive, large, read-optimized SSDs without wear issues
Slide 9
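Attaching SSDs as L2ARC is a single pool operation per OSS. A sketch of the provisioning commands, where the OST pool name and device paths are placeholders, not names from the presentation:

```shell
# Attach two read-optimized SSDs as L2ARC cache devices to an
# existing OST pool (pool and device names are illustrative).
zpool add ost0pool cache /dev/disk/by-id/ssd-a /dev/disk/by-id/ssd-b

# Verify: the devices now appear under a "cache" section, and can
# be removed later without any risk to pool data.
zpool status ost0pool
```

Because cache vdevs hold only copies of pool data, they can be added, replaced, or removed on a live pool, which fits the "data is secure on main storage" point above.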

Solution #2: Analytics (Genomics / Life Sciences)
Scenario:
- Parallel FS is used for a single purpose that involves reading and re-reading files
- Read/write mix is heavily weighted toward reads... but after a time, data becomes inactive, yet must still be retained
- So there is a known(ish) working data set, with bulk storage acting more like an online archive
Solution:
- Add L2ARC slightly larger than the typical-case working data set (medium)
- Bump up the fill rate, such that virtually all writes are cached
Benefit:
- ZFS aggressively caches data onto the SSDs
- This is somewhat like a write-through or write-back cache... but its size equals the entire working data set
- Therefore typical-case analysis jobs get ~100% SSD cache hits
- Data gets archived when it is no longer active... without even needing to move it!
Slide 10
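On Linux, "bumping up the fill rate" maps to ZFS module parameters. A sketch of the tuning, with illustrative values only; the right numbers depend on SSD endurance and the site's ingest rate, and the slides do not specify them:

```
# /etc/modprobe.d/zfs.conf -- illustrative values
# Raise the per-interval L2ARC fill rate well above the 8 MB/s
# default so the working set warms onto SSD quickly.
options zfs l2arc_write_max=268435456
# Also feed sequentially prefetched (streaming) reads into L2ARC,
# which the default policy skips.
options zfs l2arc_noprefetch=0
```

A higher fill rate trades SSD write wear for faster cache warm-up, which is why it suits these read-dominated workloads where ingest is modest.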

Solution #3: Analytics - HPBC / Financial
Scenario:
- Parallel FS is used for a single purpose: analyzing market data for HFT
- Read/write mix is almost read-only, and all data is always active
- An RRDB-style process ages out data, so there are deletes but not changes
Solution:
- Add L2ARC equal to the entire data set (large)
- Bump the fill rate to the absolute max, such that all writes are cached
Benefit:
- ZFS caches all written data onto the SSDs, so analysis jobs get 100% cache hits
- No write-wear issue, because the ingest rate is low
- Faster than a pure SSD solution with no added cost: HDDs act as protection instead of extra SSDs being spent on parity
- Reading from L2ARC is faster than reading from an SSD RAID
Slide 11

Analytics - HPBC / Financial (cont.)
Interesting side note on this configuration re: SMB vs. Lustre:
- The customer initially had a layer of SMB gateways connected to Lustre
- This had performance and stability issues, specific to Samba, not Lustre (what I wouldn't give for a Windows Lustre driver!)
- Tried running Samba on the OSSs; tried running it on ZFS systems without Lustre
Solution:
- Create a CentOS hypervisor on bare metal and connect it to Lustre as a client
- A single Windows VM connects to the hypervisor's Samba, which shares Lustre
- Read-only, so don't bother with CTDB etc.
- Now Samba has a one-client scale instead of thousands
Slide 12
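A minimal Samba export for this one-client, read-only topology might look like the following; the share name and Lustre mount point are assumptions, not details from the talk:

```
# smb.conf fragment on the hypervisor (illustrative names)
# Lustre is mounted at /mnt/lustre on the CentOS host, which is a
# Lustre client; the single Windows VM is the only SMB consumer,
# so no CTDB or clustering options are needed.
[market-data]
    path = /mnt/lustre
    read only = yes
    browseable = yes
```

Serving one VM instead of thousands of clients is what removes the Samba scaling and stability problems described above.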

Findings
- Agree with ORNL: SSDs are not a be-all end-all for HPC
- If you can steer analytics workloads to separate FSs, then pure SSD may be fine
- But the killer app for [exascale / hyperscale] SSD systems seems to be hybrid
- Use SSDs to cache data which is expected to be read-mostly or read-only
- This allows using large commodity SSDs without burning them out
- ZFS L2ARC can be combined with intelligent FS layout to do this economically
Slide 13

Thanks! Q2-2014 Josh Judd, CTO