idedup Latency-aware inline deduplication for primary workloads Kiran Srinivasan, Tim Bisson Garth Goodson, Kaladhar Voruganti

Save this PDF as:
Size: px
Start display at page:

Download "idedup Latency-aware inline deduplication for primary workloads Kiran Srinivasan, Tim Bisson Garth Goodson, Kaladhar Voruganti"

Transcription

1 idedup Latency-aware inline deduplication for primary workloads Kiran Srinivasan, Tim Bisson Garth Goodson, Kaladhar Voruganti Advanced Technology Group NetApp 1

2 idedup overview/context Storage Clients Primary Storage Dedupe exploited effectively here => 90+% savings Secondary Storage NFS/CIFS/iSCSI NDMP/Other Primary Storage: idedup Performance & reliability are key features Inline/foreground dedupe technique for primary RPC-based protocols => latency sensitive Minimal impact on latency-sensitive workloads Only offline dedupe techniques developed 2

3 Dedupe techniques (offline vs inline) Offline dedupe First copy on stable storage is not deduped Dedupe is a post-processing/background activity Inline dedupe Dedupe before first copy on stable storage Primary => Latency should not be affected! Secondary => Dedupe at ingest rate (IOPS)! 3

4 Why inline dedupe for primary? Provisioning/Planning is easier Dedupe savings are seen right away Planning is simpler as capacity values are accurate No post-processing activities No scheduling of background processes No interference => front-end workloads are not affected Key for storage system users with limited maintenance windows Efficient use of resources Efficient use of I/O bandwidth (offline has both reads + writes) File system s buffer cache is more efficient (deduped) Performance challenges have been the key obstacle Overheads (CPU + I/Os) for both reads and writes hurt latency 4

5 idedup Key features Minimizes inline dedupe performance overheads Leverages workload characteristics Eliminates almost all extra I/Os due to dedupe processing CPU overheads are minimal Tunable tradeoff: dedupe savings vs performance Selective dedupe => some loss in dedupe capacity savings idedup can be combined with offline techniques Maintains same on-disk data structures as normal file system Offline dedupe can be run optionally 5

6 Related Work Workload/ Method Offline Inline Primary NetApp ASIS EMC Celerra IBM StorageTank idedup zfs* Secondary (No motivation for systems in this category) EMC DDFS, EMC Cluster DeepStore, NEC HydraStor, Venti, SiLo, Sparse Indexing, ChunkStash, Foundation, Symantec, EMC Centera 6

7 Outline Inline dedupe challenges Our Approach Design/Implementation details Evaluation results Summary 7

8 Inline Dedupe - Read Path challenges Inherently, dedupe causes disk-level fragmentation! Sequential reads turn random => more seeks => more latency RPC based protocols (CIFS/NFS/iSCI) are latency sensitive Dataset/workload property Primary workloads are typically read-intensive Usually the Read/Write ratio is ~70/30 Inline dedupe must not affect read performance! Fragmentation with random seeks 8

9 Inline Dedupe Write Path Challenges Extra CPU random overheads I/Os in in the the critical write path write due path to dedupe algorithm Dedupe metadata (Finger Print DB) lookups and updates Dedupe requires computing the Fingerprint (Hash) of each block Updating the reference counts of blocks on disk Dedupe algorithm requires extra cycles Dedupe metadata Client Write Write Logging Compute Hash Dedupe Algorithm Write Allocation Random Seeks!! Disk IO De-stage phase 9

10 Our Approach 10

11 idedup Intuition Is there a good tradeoff between capacity savings and latency performance? 11

12 idedup Solution to Read Path issues Insight 1: Dedupe only sequences of disk blocks Solves fragmentation => amortized seeks during reads Selective dedupe, leverages spatial locality Configurable minimum sequence length Fragmentation with random seeks Sequences, with amortized seeks 12

13 idedup Write Path issues How can we reduce dedupe metadata I/Os? Flash(?) - read I/Os are cheap, but frequent updates are expensive 13

14 idedup Solution to Write Path issues Insight 2: Keep a smaller dedupe metadata as an in-memory cache No extra IOs Leverages temporal locality characteristics in duplication Some loss in dedupe (a subset of blocks are used) Blocks Cached fingerprints Time 14

15 idedup - Viability Dedup Ratio Is this loss in dedupe savings Ok? Both spatial and temporal localities are dataset/workload properties! Viable for some important primary workloads Original Spatial Locality Spatial + Temporal Locality 15

16 Design and Implementation 16

17 idedup Architecture I/Os (Reads + Writes) NVRAM (Write log blocks) idedup Algorithm De-stage Dedupe metadata (FPDB) File system (WAFL) Disk 17

18 idedup - Two key tunable parameters Minimum sequence length Threshold Minimum number of sequential duplicate blocks on disk Dataset property => ideally set to expected fragmentation Different from larger block size variable vs fixed Knob between performance (fragmentation) and dedupe Dedupe metadata (Fingerprint DB) cache size Workload s working set property Increase in cache size => decrease in buffer cache Knob between performance (cache hit ratio) and dedupe 18

19 idedup Algorithm The idedup algorithm works in 4 phases for every file: Phase 1 (per file):identify blocks for idedup Only full, pure data blocks are processed Metadata blocks, special files ignored Phase 2 (per file) : Sequence processing Uses the dedupe metadata cache Keeps track of multiple sequences Phase 3 (per sequence): Sequence pruning Eliminate short sequences below threshold Pick among overlapping sequences via a heuristic Phase 4 (per sequence): Deduplication of sequence 19

20 Evaluation 20

21 Evaluation Setup NetApp FAS 3070, 8GB RAM, 3 disk RAID0 Evaluated by replaying real-world CIFS traces Corporate filer traces in NetApp DC (2007) Read data: 204GB (69%), Write data: 93GB Engineering filer traces in NetApp DC (2007) Read data: 192GB (67%), Write data: 92GB Comparison points Baseline: System with no idedup Threshold-1: System with full dedupe (1 block) Dedupe metadata cache: 0.25, 0.5 & 1GB 21

22 Results: Deduplication ratio vs Threshold Dedupe ratio vs Thresholds, Cache sizes (Corp) Deduplication ratio (%) GB.5 GB 1 GB Less than linear decrease in dedupe savings Spatial locality in dedupe Ideal Threshold = Biggest threshold with least decrease in dedupe savings Threshold Threshold-4 ~60% of max 22

23 Results: Disk Fragmentation (req sizes) CDF of block request sizes (Engg, 1GB) 100 Percentage of Total Requests Max fragmentation Baseline (Mean=15.8) Threshold-1 (Mean=12.5) Threshold-2 (Mean=14.8) Threshold-4 (Mean=14.9) Threshold-8 (Mean=15.4) Least fragmentation Fragmentation for other thresholds are between Baseline and Thresh-1 Tunable fragmentation Request Sequence Size (Blocks) 23

24 Results: CPU Utilization 100 CDF of CPU utilization samples (Corp, 1GB) Percentage of CPU Samples Baseline (Mean=13.2%) Threshold-1 (Mean=15.0%) Threshold-4 (Mean=16.6%) Threshold-8 (Mean=17.1%) Larger variance (long tail) compared to baseline But, mean difference is less than 4% CPU Utilization (%) 24

25 Results: Latency Impact Percentage of Requests CDF of client response time (Corp, 1GB) Affects >2ms Baseline Threshold-1 Threshold Latency impact for longer response times ( > 2ms) Thresh-1 mean latency affected by ~13% vs baseline Diff between Thresh-8 and baseline <4%! Response Time (ms) 25

26 Summary Inline dedupe has significant performance challenges Reads : Fragmentation, Writes: CPU + Extra I/Os idedup creates tradeoffs b/n savings and performance Leverage dedupe locality properties Avoid fragmentation dedupe only sequences Avoid extra I/Os keep dedupe metadata in memory Experiments for latency-sensitive primary workloads Low CPU impact < 5% on the average ~60% of max dedupe, ~4% impact on latency Future work: Dynamic threshold, more traces Our traces are available for research purposes 26

27 Acknowledgements NetApp WAFL Team Blake Lewis, Ling Zheng, Craig Johnston, Subbu PVS, Praveen K, Sriram Venketaraman NetApp ATG Team Scott Dawkins, Jeff Heller Shepherd John Bent Our Intern (from UCSC) Stephanie Jones 27

28 28

idedup: Latency-aware, inline data deduplication for primary storage

idedup: Latency-aware, inline data deduplication for primary storage idedup: Latency-aware, inline data deduplication for primary storage Kiran Srinivasan, Tim Bisson, Garth Goodson, Kaladhar Voruganti NetApp, Inc. {skiran, tbisson, goodson, kaladhar}@netapp.com Abstract

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta Microsoft Research, Redmond, WA, USA Contains work that is joint with B. Debnath (Univ. of Minnesota) and J. Li (Microsoft Research,

More information

Online De-duplication in a Log-Structured File System for Primary Storage

Online De-duplication in a Log-Structured File System for Primary Storage Online De-duplication in a Log-Structured File System for Primary Storage Technical Report UCSC-SSRC-11-03 May 2011 Stephanie N. Jones snjones@cs.ucsc.edu Storage Systems Research Center Baskin School

More information

Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash. June 2013 Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

More information

Mambo Running Analytics on Enterprise Storage

Mambo Running Analytics on Enterprise Storage Mambo Running Analytics on Enterprise Storage Jingxin Feng, Xing Lin 1, Gokul Soundararajan Advanced Technology Group 1 University of Utah Motivation No easy way to analyze data stored in enterprise storage

More information

Read Performance Enhancement In Data Deduplication For Secondary Storage

Read Performance Enhancement In Data Deduplication For Secondary Storage Read Performance Enhancement In Data Deduplication For Secondary Storage A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Pradeep Ganesan IN PARTIAL FULFILLMENT

More information

Benchmarking Hadoop & HBase on Violin

Benchmarking Hadoop & HBase on Violin Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages

More information

Delivering SDS simplicity and extreme performance

Delivering SDS simplicity and extreme performance Delivering SDS simplicity and extreme performance Real-World SDS implementation of getting most out of limited hardware Murat Karslioglu Director Storage Systems Nexenta Systems October 2013 1 Agenda Key

More information

Benchmarking Cassandra on Violin

Benchmarking Cassandra on Violin Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract

More information

AIX NFS Client Performance Improvements for Databases on NAS

AIX NFS Client Performance Improvements for Databases on NAS AIX NFS Client Performance Improvements for Databases on NAS October 20, 2005 Sanjay Gulabani Sr. Performance Engineer Network Appliance, Inc. gulabani@netapp.com Diane Flemming Advisory Software Engineer

More information

Data Backup and Archiving with Enterprise Storage Systems

Data Backup and Archiving with Enterprise Storage Systems Data Backup and Archiving with Enterprise Storage Systems Slavjan Ivanov 1, Igor Mishkovski 1 1 Faculty of Computer Science and Engineering Ss. Cyril and Methodius University Skopje, Macedonia slavjan_ivanov@yahoo.com,

More information

Methods to achieve low latency and consistent performance

Methods to achieve low latency and consistent performance Methods to achieve low latency and consistent performance Alan Wu, Architect, Memblaze zhongjie.wu@memblaze.com 2015/8/13 1 Software Defined Flash Storage System Memblaze provides software defined flash

More information

Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage

Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage Wei Zhang, Tao Yang, Gautham Narayanasamy, and Hong Tang University of California at Santa Barbara, Alibaba Inc. Abstract In a virtualized

More information

The What, Why and How of the Pure Storage Enterprise Flash Array

The What, Why and How of the Pure Storage Enterprise Flash Array The What, Why and How of the Pure Storage Enterprise Flash Array Ethan L. Miller (and a cast of dozens at Pure Storage) What is an enterprise storage array? Enterprise storage array: store data blocks

More information

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of

More information

Everything you need to know about flash storage performance

Everything you need to know about flash storage performance Everything you need to know about flash storage performance The unique characteristics of flash make performance validation testing immensely challenging and critically important; follow these best practices

More information

Data Compression and Deduplication. LOC 2010 2010 Cisco Systems, Inc. All rights reserved.

Data Compression and Deduplication. LOC 2010 2010 Cisco Systems, Inc. All rights reserved. Data Compression and Deduplication LOC 2010 2010 Systems, Inc. All rights reserved. 1 Data Redundancy Elimination Landscape VMWARE DeDE IBM DDE for Tank Solaris ZFS Hosts (Inline and Offline) MDS + Network

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_SSD_Cache_WP_ 20140512 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges...

More information

The IntelliMagic White Paper: Storage Performance Analysis for an IBM Storwize V7000

The IntelliMagic White Paper: Storage Performance Analysis for an IBM Storwize V7000 The IntelliMagic White Paper: Storage Performance Analysis for an IBM Storwize V7000 Summary: This document describes how to analyze performance on an IBM Storwize V7000. IntelliMagic 2012 Page 1 This

More information

EMC XTREMIO EXECUTIVE OVERVIEW

EMC XTREMIO EXECUTIVE OVERVIEW EMC XTREMIO EXECUTIVE OVERVIEW COMPANY BACKGROUND XtremIO develops enterprise data storage systems based completely on random access media such as flash solid-state drives (SSDs). By leveraging the underlying

More information

Using Synology SSD Technology to Enhance System Performance. Based on DSM 5.2

Using Synology SSD Technology to Enhance System Performance. Based on DSM 5.2 Using Synology SSD Technology to Enhance System Performance Based on DSM 5.2 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD Cache as Solution...

More information

Primary Data Deduplication Large Scale Study and System Design

Primary Data Deduplication Large Scale Study and System Design Primary Data Deduplication Large Scale Study and System Design Ahmed El-Shimi Ran Kalach Ankit Kumar Adi Oltean Jin Li Sudipta Sengupta Microsoft Corporation, Redmond, WA, USA Abstract We present a large

More information

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Agenda Introduction Database Architecture Direct NFS Client NFS Server

More information

Decentralized Deduplication in SAN Cluster File Systems

Decentralized Deduplication in SAN Cluster File Systems Decentralized Deduplication in SAN Cluster File Systems Austin T. Clements Irfan Ahmad Murali Vilayannur Jinyuan Li VMware, Inc. MIT CSAIL Storage Area Networks Storage Area Networks Storage Area Networks

More information

CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY

CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY Reference Architecture CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY An XtremIO Reference Architecture Abstract This Reference architecture examines the storage efficiencies

More information

Demystifying Deduplication for Backup with the Dell DR4000

Demystifying Deduplication for Backup with the Dell DR4000 Demystifying Deduplication for Backup with the Dell DR4000 This Dell Technical White Paper explains how deduplication with the DR4000 can help your organization save time, space, and money. John Bassett

More information

Flexible Storage Allocation

Flexible Storage Allocation Flexible Storage Allocation A. L. Narasimha Reddy Department of Electrical and Computer Engineering Texas A & M University Students: Sukwoo Kang (now at IBM Almaden) John Garrison Outline Big Picture Part

More information

Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets

Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets Young Jin Nam School of Computer and Information Technology Daegu University Gyeongsan, Gyeongbuk, KOREA 7-7 Email:

More information

Madis Pärn Senior System Engineer EMC madis.parn@emc.com. Copyright 2013 EMC Corporation. All rights reserved.

Madis Pärn Senior System Engineer EMC madis.parn@emc.com. Copyright 2013 EMC Corporation. All rights reserved. Madis Pärn Senior System Engineer EMC madis.parn@emc.com 1 History February 2000? What Intel CPU was in use? First 15K HDD 32-bit, 1C (4C/server), 1GHz, 4GB RAM What about today? 64-bit, 15C (120C/server),

More information

The IntelliMagic White Paper: Green Storage: Reduce Power not Performance. December 2010

The IntelliMagic White Paper: Green Storage: Reduce Power not Performance. December 2010 The IntelliMagic White Paper: Green Storage: Reduce Power not Performance December 2010 Summary: This white paper provides techniques to configure the disk drives in your storage system such that they

More information

Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology

Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology Evaluation report prepared under contract with NetApp Introduction As flash storage options proliferate and become accepted in the enterprise,

More information

HPe in Datacenter HPe3PAR Flash Technologies

HPe in Datacenter HPe3PAR Flash Technologies HPe in Datacenter HPe3PAR Flash Technologies Martynas Skripkauskas Hewlett Packard Enterprise Storage Department Flash Misnomers Common statements used when talking about flash Flash is about speed Flash

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_WP_ 20121112 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD

More information

Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition

Chapter 11: File System Implementation. Operating System Concepts with Java 8 th Edition Chapter 11: File System Implementation 11.1 Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation

More information

A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP

A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP Dilip N Simha (Stony Brook University, NY & ITRI, Taiwan) Maohua Lu (IBM Almaden Research Labs, CA) Tzi-cker Chiueh (Stony

More information

Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments

Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments Applied Technology Abstract This white paper introduces EMC s latest groundbreaking technologies,

More information

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE. Matt Kixmoeller, Pure Storage

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE. Matt Kixmoeller, Pure Storage LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE Matt Kixmoeller, Pure Storage SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies

More information

PERFORMANCE STUDY. NexentaConnect View Edition Branch Office Solution. Nexenta Office of the CTO Murat Karslioglu

PERFORMANCE STUDY. NexentaConnect View Edition Branch Office Solution. Nexenta Office of the CTO Murat Karslioglu PERFORMANCE STUDY NexentaConnect View Edition Branch Office Solution Nexenta Office of the CTO Murat Karslioglu Table of Contents Desktop Virtualization for Small and Medium Sized Office... 3 Cisco UCS

More information

Data Reduction: Deduplication and Compression. Danny Harnik IBM Haifa Research Labs

Data Reduction: Deduplication and Compression. Danny Harnik IBM Haifa Research Labs Data Reduction: Deduplication and Compression Danny Harnik IBM Haifa Research Labs Motivation Reducing the amount of data is a desirable goal Data reduction: an attempt to compress the huge amounts of

More information

HP StoreOnce D2D. Understanding the challenges associated with NetApp s deduplication. Business white paper

HP StoreOnce D2D. Understanding the challenges associated with NetApp s deduplication. Business white paper HP StoreOnce D2D Understanding the challenges associated with NetApp s deduplication Business white paper Table of contents Challenge #1: Primary deduplication: Understanding the tradeoffs...4 Not all

More information

Tradeoffs in Scalable Data Routing for Deduplication Clusters

Tradeoffs in Scalable Data Routing for Deduplication Clusters Tradeoffs in Scalable Data Routing for Deduplication Clusters Wei Dong Princeton University Fred Douglis EMC Kai Li Princeton University and EMC Hugo Patterson EMC Sazzala Reddy EMC Philip Shilane EMC

More information

Metadata Feedback and Utilization for Data Deduplication Across WAN

Metadata Feedback and Utilization for Data Deduplication Across WAN Zhou B, Wen JT. Metadata feedback and utilization for data deduplication across WAN. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 31(3): 604 623 May 2016. DOI 10.1007/s11390-016-1650-6 Metadata Feedback

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

Flash Accel, Flash Cache, Flash Pool, Flash Ray Was? Wann? Wie?

Flash Accel, Flash Cache, Flash Pool, Flash Ray Was? Wann? Wie? Flash Accel, Flash Cache, Flash Pool, Flash Ray Was? Wann? Wie? Systems Engineer Sven Willholz Performance Growth Performance Gap Performance Gap Challenge Server Huge gap between CPU and storage Relatively

More information

DEXT3: Block Level Inline Deduplication for EXT3 File System

DEXT3: Block Level Inline Deduplication for EXT3 File System DEXT3: Block Level Inline Deduplication for EXT3 File System Amar More M.A.E. Alandi, Pune, India ahmore@comp.maepune.ac.in Zishan Shaikh M.A.E. Alandi, Pune, India zishan366shaikh@gmail.com Vishal Salve

More information

VNX HYBRID FLASH BEST PRACTICES FOR PERFORMANCE

VNX HYBRID FLASH BEST PRACTICES FOR PERFORMANCE 1 VNX HYBRID FLASH BEST PRACTICES FOR PERFORMANCE JEFF MAYNARD, CORPORATE SYSTEMS ENGINEER 2 ROADMAP INFORMATION DISCLAIMER EMC makes no representation and undertakes no obligations with regard to product

More information

ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory

ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory Biplob Debnath Sudipta Sengupta Jin Li Microsoft Research, Redmond, WA, USA University of Minnesota, Twin Cities, USA Abstract Storage

More information

Application-Focused Flash Acceleration

Application-Focused Flash Acceleration IBM System Storage Application-Focused Flash Acceleration XIV SDD Caching Overview FLASH MEMORY SUMMIT 2012 Anthony Vattathil anthonyv@us.ibm.com 1 Overview Flash technology is an excellent choice to service

More information

Understanding Data Locality in VMware Virtual SAN

Understanding Data Locality in VMware Virtual SAN Understanding Data Locality in VMware Virtual SAN July 2014 Edition T E C H N I C A L M A R K E T I N G D O C U M E N T A T I O N Table of Contents Introduction... 2 Virtual SAN Design Goals... 3 Data

More information

Storage I/O Control: Proportional Allocation of Shared Storage Resources

Storage I/O Control: Proportional Allocation of Shared Storage Resources Storage I/O Control: Proportional Allocation of Shared Storage Resources Chethan Kumar Sr. Member of Technical Staff, R&D VMware, Inc. Outline The Problem Storage IO Control (SIOC) overview Technical Details

More information

SolidFire and NetApp All-Flash FAS Architectural Comparison

SolidFire and NetApp All-Flash FAS Architectural Comparison SolidFire and NetApp All-Flash FAS Architectural Comparison JUNE 2015 This document provides an overview of NetApp s All-Flash FAS architecture as it compares to SolidFire. Not intended to be exhaustive,

More information

FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS

FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS FLASH IMPLICATIONS IN ENTERPRISE STORAGE ARRAY DESIGNS ABSTRACT This white paper examines some common practices in enterprise storage array design and their resulting trade-offs and limitations. The goal

More information

Understanding the Economics of Flash Storage

Understanding the Economics of Flash Storage Understanding the Economics of Flash Storage By James Green, vexpert Virtualization Consultant and Scott D. Lowe, vexpert Co-Founder, ActualTech Media February, 2015 Table of Contents Table of Contents...

More information

Q & A From Hitachi Data Systems WebTech Presentation:

Q & A From Hitachi Data Systems WebTech Presentation: Q & A From Hitachi Data Systems WebTech Presentation: RAID Concepts 1. Is the chunk size the same for all Hitachi Data Systems storage systems, i.e., Adaptable Modular Systems, Network Storage Controller,

More information

BASIL: Automated IO Load Balancing Across Storage Devices

BASIL: Automated IO Load Balancing Across Storage Devices BASIL: Automated IO Load Balancing Across Storage Devices Ajay Gulati VMware, Inc. agulati@vmware.com Chethan Kumar VMware, Inc. ckumar@vmware.com Karan Kumar Carnegie Mellon University karank@andrew.cmu.edu

More information

Power Management in Storage Systems

Power Management in Storage Systems Tag line, tag line Power Management in Storage Systems Kaladhar Voruganti Technical Director CTO Office, Sunnyvale June 12, 2009 Outline Power Consumption Background in Data Centers and Storage Systems

More information

DEDUPLICATION NOW AND WHERE IT S HEADING. Lauren Whitehouse Senior Analyst, Enterprise Strategy Group

DEDUPLICATION NOW AND WHERE IT S HEADING. Lauren Whitehouse Senior Analyst, Enterprise Strategy Group DEDUPLICATION NOW AND WHERE IT S HEADING Lauren Whitehouse Senior Analyst, Enterprise Strategy Group Need Dedupe? Before/After Dedupe Deduplication Production Data Deduplication In Backup Process Backup

More information

Understanding Enterprise NAS

Understanding Enterprise NAS Anjan Dave, Principal Storage Engineer LSI Corporation Author: Anjan Dave, Principal Storage Engineer, LSI Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent

More information

89 Fifth Avenue, 7th Floor. New York, NY 10003. www.theedison.com 212.367.7400. White Paper. HP 3PAR Adaptive Flash Cache: A Competitive Comparison

89 Fifth Avenue, 7th Floor. New York, NY 10003. www.theedison.com 212.367.7400. White Paper. HP 3PAR Adaptive Flash Cache: A Competitive Comparison 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com 212.367.7400 White Paper HP 3PAR Adaptive Flash Cache: A Competitive Comparison Printed in the United States of America Copyright 2014 Edison

More information

Chapter 11: File System Implementation. Operating System Concepts 8 th Edition

Chapter 11: File System Implementation. Operating System Concepts 8 th Edition Chapter 11: File System Implementation Operating System Concepts 8 th Edition Silberschatz, Galvin and Gagne 2009 Chapter 11: File System Implementation File-System Structure File-System Implementation

More information

Array Performance 101 Part 4

Array Performance 101 Part 4 Array Performance 101 Part 4 How to get the most bang from your arrays Ray Lucchesi President, Silverton Consulting 15 February 2006 http://www.silvertonconsulting.com 1 2006 Silverton Consulting, Inc.

More information

MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL

MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL Dr. Allon Cohen Eli Ben Namer info@sanrad.com 1 EXECUTIVE SUMMARY SANRAD VXL provides enterprise class acceleration for virtualized

More information

#1 All-Flash Array In The Market: XtremIO Changes the Game

#1 All-Flash Array In The Market: XtremIO Changes the Game Conseils Intégration systèmes Gestion infrastructure 1/29 EMC Next Generation Flash Strategy #1 All-Flash Array In The Market: XtremIO Changes the Game Conseils Intégration systèmes Gestion infrastructure

More information

Trends in Enterprise Backup Deduplication

Trends in Enterprise Backup Deduplication Trends in Enterprise Backup Deduplication Shankar Balasubramanian Architect, EMC 1 Outline Protection Storage Deduplication Basics CPU-centric Deduplication: SISL (Stream-Informed Segment Layout) Data

More information

SOLUTION BRIEF. Resolving the VDI Storage Challenge

SOLUTION BRIEF. Resolving the VDI Storage Challenge CLOUDBYTE ELASTISTOR QOS GUARANTEE MEETS USER REQUIREMENTS WHILE REDUCING TCO The use of VDI (Virtual Desktop Infrastructure) enables enterprises to become more agile and flexible, in tune with the needs

More information

RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups

RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups Chun-Ho Ng and Patrick P. C. Lee Department of Computer Science and Engineering The Chinese University of Hong Kong,

More information

Enterprise-class Backup Performance with Dell DR6000 Date: May 2014 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Senior Lab Analyst

Enterprise-class Backup Performance with Dell DR6000 Date: May 2014 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Senior Lab Analyst ESG Lab Review Enterprise-class Backup Performance with Dell DR6000 Date: May 2014 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Senior Lab Analyst Abstract: This ESG Lab review documents hands-on

More information

NetApp FAS Hybrid Array Flash Efficiency. Silverton Consulting, Inc. StorInt Briefing

NetApp FAS Hybrid Array Flash Efficiency. Silverton Consulting, Inc. StorInt Briefing NetApp FAS Hybrid Array Flash Efficiency Silverton Consulting, Inc. StorInt Briefing PAGE 2 OF 7 Introduction Hybrid storage arrays (storage systems with both disk and flash capacity) have become commonplace

More information

White Paper. Educational. Measuring Storage Performance

White Paper. Educational. Measuring Storage Performance TABLE OF CONTENTS Introduction....... Storage Performance Metrics.... Factors Affecting Storage Performance....... Provisioning IOPS in Hardware-Defined Solutions....... Provisioning IOPS in Software-Defined

More information

EMC Backup and Recovery for Microsoft Exchange 2007 SP2

EMC Backup and Recovery for Microsoft Exchange 2007 SP2 EMC Backup and Recovery for Microsoft Exchange 2007 SP2 Enabled by EMC Celerra and Microsoft Windows 2008 Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the

More information

SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage

SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage Biplob Debnath,1 Sudipta Sengupta Jin Li Microsoft Research, Redmond, WA, USA EMC Corporation, Santa Clara, CA, USA ABSTRACT We present

More information

NetApp Data Compression and Deduplication Deployment and Implementation Guide

NetApp Data Compression and Deduplication Deployment and Implementation Guide Technical Report NetApp Data Compression and Deduplication Deployment and Implementation Guide Clustered Data ONTAP Sandra Moulton, NetApp April 2013 TR-3966 Abstract This technical report focuses on clustered

More information

MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services

MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services Jiansheng Wei, Hong Jiang, Ke Zhou, Dan Feng School of Computer, Huazhong University of Science and Technology,

More information

Maximizing SQL Server Virtualization Performance

Maximizing SQL Server Virtualization Performance Maximizing SQL Server Virtualization Performance Michael Otey Senior Technical Director Windows IT Pro SQL Server Pro 1 What this presentation covers Host configuration guidelines CPU, RAM, networking

More information

EMC SOLUTION FOR SPLUNK

EMC SOLUTION FOR SPLUNK EMC SOLUTION FOR SPLUNK Splunk validation using all-flash EMC XtremIO and EMC Isilon scale-out NAS ABSTRACT This white paper provides details on the validation of functionality and performance of Splunk

More information

A Deduplication File System & Course Review

A Deduplication File System & Course Review A Deduplication File System & Course Review Kai Li 12/13/12 Topics A Deduplication File System Review 12/13/12 2 Traditional Data Center Storage Hierarchy Clients Network Server SAN Storage Remote mirror

More information

The IntelliMagic White Paper on: Storage Performance Analysis for an IBM San Volume Controller (SVC) (IBM V7000)

The IntelliMagic White Paper on: Storage Performance Analysis for an IBM San Volume Controller (SVC) (IBM V7000) The IntelliMagic White Paper on: Storage Performance Analysis for an IBM San Volume Controller (SVC) (IBM V7000) IntelliMagic, Inc. 558 Silicon Drive Ste 101 Southlake, Texas 76092 USA Tel: 214-432-7920

More information

FAS6200 Cluster Delivers Exceptional Block I/O Performance with Low Latency

FAS6200 Cluster Delivers Exceptional Block I/O Performance with Low Latency FAS6200 Cluster Delivers Exceptional Block I/O Performance with Low Latency Dimitris Krekoukias Systems Engineer NetApp Data ONTAP 8 software operating in Cluster-Mode is the industry's only unified, scale-out

More information

Nutanix Tech Note. Configuration Best Practices for Nutanix Storage with VMware vsphere

Nutanix Tech Note. Configuration Best Practices for Nutanix Storage with VMware vsphere Nutanix Tech Note Configuration Best Practices for Nutanix Storage with VMware vsphere Nutanix Virtual Computing Platform is engineered from the ground up to provide enterprise-grade availability for critical

More information

ARCHITECTURE WHITE PAPER ARCHITECTURE WHITE PAPER

ARCHITECTURE WHITE PAPER ARCHITECTURE WHITE PAPER ARCHITECTURE WHITE PAPER ARCHITECTURE WHITE PAPER Table of Contents Data Value... 2 NexGen Storage Architecture... 3 Storage Quality of Service and Service Levels... 4 PCIe Flash-first Design... 9 Dynamic

More information

Deduplication, Compression and Pattern-Based Testing for All Flash Storage Arrays Peter Murray - Load DynamiX Leah Schoeb - Evaluator Group

Deduplication, Compression and Pattern-Based Testing for All Flash Storage Arrays Peter Murray - Load DynamiX Leah Schoeb - Evaluator Group Deduplication, Compression and Pattern-Based Testing for All Flash Storage Arrays Peter Murray - Load DynamiX Leah Schoeb - Evaluator Group Introduction Advanced AFAs are a Different Animal Flash behavior

More information

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression Philip Shilane, Mark Huang, Grant Wallace, and Windsor Hsu Backup Recovery Systems Division EMC Corporation Abstract

More information

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems 1 Any modern computer system will incorporate (at least) two levels of storage: primary storage: typical capacity cost per MB $3. typical access time burst transfer rate?? secondary storage: typical capacity

More information

ASKING THESE 20 SIMPLE QUESTIONS ABOUT ALL-FLASH ARRAYS CAN IMPACT THE SUCCESS OF YOUR DATA CENTER ROLL-OUT

ASKING THESE 20 SIMPLE QUESTIONS ABOUT ALL-FLASH ARRAYS CAN IMPACT THE SUCCESS OF YOUR DATA CENTER ROLL-OUT ASKING THESE 20 SIMPLE QUESTIONS ABOUT ALL-FLASH ARRAYS CAN IMPACT THE SUCCESS OF YOUR DATA CENTER ROLL-OUT 1 PURPOSE BUILT FLASH SPECIFIC DESIGN: Does the product leverage an advanced and cost effective

More information

A Data De-duplication Access Framework for Solid State Drives

A Data De-duplication Access Framework for Solid State Drives JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 941-954 (2012) A Data De-duplication Access Framework for Solid State Drives Department of Electronic Engineering National Taiwan University of Science

More information

Cloud De-duplication Cost Model THESIS

Cloud De-duplication Cost Model THESIS Cloud De-duplication Cost Model THESIS Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University By Christopher Scott Hocker

More information

Quanqing XU Quanqing.Xu@nicta.com.au. YuruBackup: A Highly Scalable and Space-Efficient Incremental Backup System in the Cloud

Quanqing XU Quanqing.Xu@nicta.com.au. YuruBackup: A Highly Scalable and Space-Efficient Incremental Backup System in the Cloud Quanqing XU Quanqing.Xu@nicta.com.au YuruBackup: A Highly Scalable and Space-Efficient Incremental Backup System in the Cloud Outline Motivation YuruBackup s Architecture Backup Client File Scan, Data

More information

Measuring Interface Latencies for SAS, Fibre Channel and iscsi

Measuring Interface Latencies for SAS, Fibre Channel and iscsi Measuring Interface Latencies for SAS, Fibre Channel and iscsi Dennis Martin Demartek President Santa Clara, CA 1 Demartek Company Overview Industry analysis with on-site test lab Lab includes servers,

More information

Effective Planning and Use of IBM Tivoli Storage Manager V6 and V7 Deduplication

Effective Planning and Use of IBM Tivoli Storage Manager V6 and V7 Deduplication Effective Planning and Use of IBM Tivoli Storage Manager V6 and V7 Deduplication 02/17/2015 2.1 Authors: Jason Basler Dan Wolfe Page 1 of 52 Document Location This is a snapshot of an on-line document.

More information

WHITE PAPER. Permabit Albireo Data Optimization Software. Benefits of Albireo for Virtual Servers. January 2012. Permabit Technology Corporation

WHITE PAPER. Permabit Albireo Data Optimization Software. Benefits of Albireo for Virtual Servers. January 2012. Permabit Technology Corporation WHITE PAPER Permabit Albireo Data Optimization Software Benefits of Albireo for Virtual Servers January 2012 Permabit Technology Corporation Ten Canal Park Cambridge, MA 02141 USA Phone: 617.252.9600 FAX:

More information

Decentralized Deduplication in SAN Cluster File Systems

Decentralized Deduplication in SAN Cluster File Systems Decentralized Deduplication in SAN Cluster File Systems Austin T. Clements Irfan Ahmad Murali Vilayannur Jinyuan Li VMware, Inc. MIT CSAIL Transcript of talk given on Wednesday, June 17 at USENIX ATEC

More information

All-Flash Arrays Weren t Built for Dynamic Environments. Here s Why... This whitepaper is based on content originally posted at www.frankdenneman.

All-Flash Arrays Weren t Built for Dynamic Environments. Here s Why... This whitepaper is based on content originally posted at www.frankdenneman. WHITE PAPER All-Flash Arrays Weren t Built for Dynamic Environments. Here s Why... This whitepaper is based on content originally posted at www.frankdenneman.nl 1 Monolithic shared storage architectures

More information

Data Deduplication in a Hybrid Architecture for Improving Write Performance

Data Deduplication in a Hybrid Architecture for Improving Write Performance Data Deduplication in a Hybrid Architecture for Improving Write Performance Data-intensive Salable Computing Laboratory Department of Computer Science Texas Tech University Lubbock, Texas June 10th, 2013

More information

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers White Paper rev. 2015-11-27 2015 FlashGrid Inc. 1 www.flashgrid.io Abstract Oracle Real Application Clusters (RAC)

More information

WHITE PAPER. SQL Server License Reduction with PernixData FVP Software

WHITE PAPER. SQL Server License Reduction with PernixData FVP Software WHITE PAPER SQL Server License Reduction with PernixData FVP Software 1 Beyond Database Acceleration Poor storage performance continues to be the largest pain point with enterprise Database Administrators

More information

Estimating Deduplication Ratios in Large Data Sets

Estimating Deduplication Ratios in Large Data Sets IBM Research labs - Haifa Estimating Deduplication Ratios in Large Data Sets Danny Harnik, Oded Margalit, Dalit Naor, Dmitry Sotnikov Gil Vernik Estimating dedupe and compression ratios some motivation

More information

The Data Placement Challenge

The Data Placement Challenge The Data Placement Challenge Entire Dataset Applications Active Data Lowest $/IOP Highest throughput Lowest latency 10-20% Right Place Right Cost Right Time 100% 2 2 What s Driving the AST Discussion?

More information

Avoiding the Disk Bottleneck in the Data Domain Deduplication File System

Avoiding the Disk Bottleneck in the Data Domain Deduplication File System Avoiding the Disk Bottleneck in the Data Domain Deduplication File System Benjamin Zhu Data Domain, Inc. Kai Li Data Domain, Inc. and Princeton University Hugo Patterson Data Domain, Inc. Abstract Disk-based

More information