FAST 11. Yongseok Oh University of Seoul. Mobile Embedded System Laboratory

Size: px
Start display at page:

Download "FAST 11. Yongseok Oh <ysoh@uos.ac.kr> University of Seoul. Mobile Embedded System Laboratory"

Transcription

1 CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of flash Memory based Solid State Drives FAST 11 Yongseok Oh <ysoh@uos.ac.kr> University of Seoul Mobile Embedded System Laboratory 1

2 Introduction The limited lifespan is the Achilles hill of Flash Memory based SSDs Not-so-well-known reasons As bit density increases, flash memory chips become less reliable and less durable Traditional redundancy solutions are considered less effective for SSDs Some prior research work has presented empirical and modeling based studies on the lifespan of flash memories The lifespan of SSDs is a function of three factors The amount of incoming write traffic The size of over-provisioned flash space The efficiency of garbage collection and wear-leveling mechanisms Mobile Embedded System Laboratory 2

3 Data Duplication is Common Kernel developers can have multiple kernel sources Word tools often automatically save a copy of documents Mobile Embedded System Laboratory 3

4 Contributions Study of data duplications for improving endurance of SSDs in file systems and various workloads Design of a content-aware FTL to extend the SSD lifespan by removing duplicate writes (up to 24.2%) and redundant data (up to 31.2%) with minimal overhead. Acceleration methods for in-line deduplication in SSD devices Implementation of CAFTL in the DiskSim simulator Mobile Embedded System Laboratory 4

5 Technical Challenges Limited resources CAFTL is designed for running in an SSD device with limited memory space and computing power, rather than running on a dedicated powerful enterprise server. Relatively lower redundancy CAFTL mostly handles regular file system workloads, which have an impressive but much lower duplication rate than that of backup streams with high redundancy (often 10 times or even higher). Lack of semantic hints CAFTL works at the device level and only sees a sequence of logical blocks without any semantic hints from host file systems. Low overhead requirement CAFTL must retain high data access performance for regular workloads, while this is a less stringent requirement in backup systems that can run during out-of-office hours. Mobile Embedded System Laboratory 5

6 The Design of CAFTL Tree critical objectives Reducing unnecessary write traffic Extending available flash space Retaining access performance Mobile Embedded System Laboratory 6

7 Hashing and Fingerprint Store A fixed-sized chunking approach (e.g. 4KB) The basic operation unit in flash is a page (Mapping policy) SHA1 (160-bit) Fingerprint store in memory Only 10-20% of the fingerprints are highly deduplicated (skewed) Most fingerprints are unique A complete search in the fingerprint store would incur high lookup latencies We should only store and search in the most likely-to-be-duplicated fingerprint in memory 15 Disks Mobile Embedded System Laboratory 7

8 Hashing and Fingerprint Store {fingerprint,(location, reference)} Fingerprint f Hash Function f mod N Highly Duplicated Fingerprints are kept in RAM Optimization techniques (1) Range Check (2) Hotness-based Reorganization (3) Bucket level binary Search Seg 0 Seg N-1 Bucket Memory Flash If a fingerprint cannot be found in Fingerprint Store, Out-of-line scanning can still identify these duplicates later Mobile Embedded System Laboratory 8

9 Indirect Mapping Two-level indirect mapping Mobile Embedded System Laboratory 9

10 Acceleration Methods Fingerprinting is the key bottleneck of the in-line deduplicaton in CAFTL Three acceleration methods have been proposed Sampling for hashing Light-weight pre-hashing Dynamic switches Mobile Embedded System Laboratory 10

11 Sampling for hashing We selectively pick only one page as a sample page for fingerprint We use this sample fingerprint to query the fingerprint store We propose to use Content-based Sampling We select the first four bytes, called sample bytes Mobile Embedded System Laboratory 11

12 Light-weight pre-hashing Producing a 32-bit CRC32 hash value is over 10 times faster than computing a 160-bit SHA-1 hash value Reducing the hash strength would not incur a significant increase of false positives for a typical SSD capacity Light-weight pre-hashing Compute a CRC32 and query the fingerprint store If a mach is found Generate SHA1 fingerprint and confirm it Otherwise, write data to flash We have also considered using a Bloom filter Multiple hashings Summary vector cannot be updated when a fingerprint is removed Mobile Embedded System Laboratory 12

13 Dynamic switches Incoming requests may wait for available buffer space to be released by previous requests Dynamic switch Set high watermark and low watermark to turn on/off inine deduplication If the percentage of the occupied cache space reaches a high watermark (95%) We disable in-line deduplication to flush writes quickly to flash Once the low watermark (50%) is hit, we re-enable the in-line deduplication. Mobile Embedded System Laboratory 13

14 Out-of-line Deduplication CAFTL does not pursue a perfect in-line deduplication An internal routine is periodically launched to perform out-ofline fingerprinting and out-of-line deduplication during the device idle Mobile Embedded System Laboratory 14

15 Performance Evaluation Experimental Systems SSD Simulator Microsoft Research SSD extension for the DiskSim Write buffering SSD Configurations Workloads and trace collection Desktop Hadoop TPC-H Transaction TPC-C Mobile Embedded System Laboratory 15

16 Effective of Deduplication Removing duplicate writes Mobile Embedded System Laboratory 16

17 Effective of Deduplication Extending flash space Mobile Embedded System Laboratory 17

18 Performance Impact Cache size Mobile Embedded System Laboratory 18

19 Performance Impact Hashing speed Mobile Embedded System Laboratory 19

20 Performance Impact Fingerprint searching Mobile Embedded System Laboratory 20

21 Acceleration Methods Sampling Because of the significantly reduced waiting time for the buffer Mobile Embedded System Laboratory 21

22 Acceleration Methods Light-weight pre-hashing This workload is write intensive and has a long waiting queue, which makes queuing effect particularly significant Mobile Embedded System Laboratory 22

23 Acceleration Methods Dynamic switch Mobile Embedded System Laboratory 23

24 Conclusions CAFTL can remove duplicate writes Enhance the lifespan of SSDs while retaining high data access performance If budget allows, we would suggest maintaining the fingerprint store fully in memory, which not only improves deduplication rate but also simplifies designs PCM into the SSDs to maintain the metadata Which can remove much design complexity On-line compression into SSDs Mobile Embedded System Laboratory 24

CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives

CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives Abstract Although Flash Memory based Solid State Drive (SSD) exhibits high performance and

More information

Offline Deduplication for Solid State Disk Using a Lightweight Hash Algorithm

Offline Deduplication for Solid State Disk Using a Lightweight Hash Algorithm JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.5, OCTOBER, 2015 ISSN(Print) 1598-1657 http://dx.doi.org/10.5573/jsts.2015.15.5.539 ISSN(Online) 2233-4866 Offline Deduplication for Solid State

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta Microsoft Research, Redmond, WA, USA Contains work that is joint with B. Debnath (Univ. of Minnesota) and J. Li (Microsoft Research,

More information

MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services

MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services Jiansheng Wei, Hong Jiang, Ke Zhou, Dan Feng School of Computer, Huazhong University of Science and Technology,

More information

ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory

ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory Biplob Debnath Sudipta Sengupta Jin Li Microsoft Research, Redmond, WA, USA University of Minnesota, Twin Cities, USA Abstract Storage

More information

Top Ten Questions. to Ask Your Primary Storage Provider About Their Data Efficiency. May 2014. Copyright 2014 Permabit Technology Corporation

Top Ten Questions. to Ask Your Primary Storage Provider About Their Data Efficiency. May 2014. Copyright 2014 Permabit Technology Corporation Top Ten Questions to Ask Your Primary Storage Provider About Their Data Efficiency May 2014 Copyright 2014 Permabit Technology Corporation Introduction The value of data efficiency technologies, namely

More information

A Data De-duplication Access Framework for Solid State Drives

A Data De-duplication Access Framework for Solid State Drives JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 941-954 (2012) A Data De-duplication Access Framework for Solid State Drives Department of Electronic Engineering National Taiwan University of Science

More information

Inline Deduplication

Inline Deduplication Inline Deduplication binarywarriors5@gmail.com 1.1 Inline Vs Post-process Deduplication In target based deduplication, the deduplication engine can either process data for duplicates in real time (i.e.

More information

ioscale: The Holy Grail for Hyperscale

ioscale: The Holy Grail for Hyperscale ioscale: The Holy Grail for Hyperscale The New World of Hyperscale Hyperscale describes new cloud computing deployments where hundreds or thousands of distributed servers support millions of remote, often

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_SSD_Cache_WP_ 20140512 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges...

More information

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card Version 1.0 April 2011 DB15-000761-00 Revision History Version and Date Version 1.0, April 2011 Initial

More information

SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs

SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs SOS: Software-Based Out-of-Order Scheduling for High-Performance NAND -Based SSDs Sangwook Shane Hahn, Sungjin Lee, and Jihong Kim Department of Computer Science and Engineering, Seoul National University,

More information

A Deduplication File System & Course Review

A Deduplication File System & Course Review A Deduplication File System & Course Review Kai Li 12/13/12 Topics A Deduplication File System Review 12/13/12 2 Traditional Data Center Storage Hierarchy Clients Network Server SAN Storage Remote mirror

More information

Tradeoffs in Scalable Data Routing for Deduplication Clusters

Tradeoffs in Scalable Data Routing for Deduplication Clusters Tradeoffs in Scalable Data Routing for Deduplication Clusters Wei Dong Princeton University Fred Douglis EMC Kai Li Princeton University and EMC Hugo Patterson EMC Sazzala Reddy EMC Philip Shilane EMC

More information

Solid State Drive (SSD) FAQ

Solid State Drive (SSD) FAQ Solid State Drive (SSD) FAQ Santosh Kumar Rajesh Vijayaraghavan O c t o b e r 2 0 1 1 List of Questions Why SSD? Why Dell SSD? What are the types of SSDs? What are the best Use cases & applications for

More information

SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage

SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage SkimpyStash: RAM Space Skimpy Key-Value Store on Flash-based Storage Biplob Debnath,1 Sudipta Sengupta Jin Li Microsoft Research, Redmond, WA, USA EMC Corporation, Santa Clara, CA, USA ABSTRACT We present

More information

Deduplication in SSDs: Model and Quantitative Analysis

Deduplication in SSDs: Model and Quantitative Analysis Deduplication in SSDs: Model and Quantitative Analysis Jonghwa Kim Dankook University, Korea zcbm4321@dankook.ac.kr Ikjoon Son Dankook University, Korea ikjoon@dankook.ac.kr Sooyong Kang Hanyang University,

More information

SSD Performance Tips: Avoid The Write Cliff

SSD Performance Tips: Avoid The Write Cliff ebook 100% KBs/sec 12% GBs Written SSD Performance Tips: Avoid The Write Cliff An Inexpensive and Highly Effective Method to Keep SSD Performance at 100% Through Content Locality Caching Share this ebook

More information

Theoretical Aspects of Storage Systems Autumn 2009

Theoretical Aspects of Storage Systems Autumn 2009 Theoretical Aspects of Storage Systems Autumn 2009 Chapter 3: Data Deduplication André Brinkmann News Outline Data Deduplication Compare-by-hash strategies Delta-encoding based strategies Measurements

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_WP_ 20121112 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD

More information

The Curious Case of Database Deduplication. PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle

The Curious Case of Database Deduplication. PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle The Curious Case of Database Deduplication PRESENTATION TITLE GOES HERE Gurmeet Goindi Oracle Agenda Introduction Deduplication Databases and Deduplication All Flash Arrays and Deduplication 2 Quick Show

More information

Block-level Inline Data Deduplication in ext3

Block-level Inline Data Deduplication in ext3 Block-level Inline Data Deduplication in ext3 Aaron Brown Kristopher Kosmatka University of Wisconsin - Madison Department of Computer Sciences December 23, 2010 Abstract Solid State Disk (SSD) media are

More information

Using Synology SSD Technology to Enhance System Performance. Based on DSM 5.2

Using Synology SSD Technology to Enhance System Performance. Based on DSM 5.2 Using Synology SSD Technology to Enhance System Performance Based on DSM 5.2 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD Cache as Solution...

More information

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of

More information

A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP

A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP Dilip N Simha (Stony Brook University, NY & ITRI, Taiwan) Maohua Lu (IBM Almaden Research Labs, CA) Tzi-cker Chiueh (Stony

More information

Key Components of WAN Optimization Controller Functionality

Key Components of WAN Optimization Controller Functionality Key Components of WAN Optimization Controller Functionality Introduction and Goals One of the key challenges facing IT organizations relative to application and service delivery is ensuring that the applications

More information

A Survey on Aware of Local-Global Cloud Backup Storage for Personal Purpose

A Survey on Aware of Local-Global Cloud Backup Storage for Personal Purpose A Survey on Aware of Local-Global Cloud Backup Storage for Personal Purpose Abhirupa Chatterjee 1, Divya. R. Krishnan 2, P. Kalamani 3 1,2 UG Scholar, Sri Sairam College Of Engineering, Bangalore. India

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

Indexing on Solid State Drives based on Flash Memory

Indexing on Solid State Drives based on Flash Memory Indexing on Solid State Drives based on Flash Memory Florian Keusch MASTER S THESIS Systems Group Department of Computer Science ETH Zurich http://www.systems.ethz.ch/ September 2008 - March 2009 Supervised

More information

Read Performance Enhancement In Data Deduplication For Secondary Storage

Read Performance Enhancement In Data Deduplication For Secondary Storage Read Performance Enhancement In Data Deduplication For Secondary Storage A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Pradeep Ganesan IN PARTIAL FULFILLMENT

More information

DEXT3: Block Level Inline Deduplication for EXT3 File System

DEXT3: Block Level Inline Deduplication for EXT3 File System DEXT3: Block Level Inline Deduplication for EXT3 File System Amar More M.A.E. Alandi, Pune, India ahmore@comp.maepune.ac.in Zishan Shaikh M.A.E. Alandi, Pune, India zishan366shaikh@gmail.com Vishal Salve

More information

Understanding endurance and performance characteristics of HP solid state drives

Understanding endurance and performance characteristics of HP solid state drives Understanding endurance and performance characteristics of HP solid state drives Technology brief Introduction... 2 SSD endurance... 2 An introduction to endurance... 2 NAND organization... 2 SLC versus

More information

Condusiv s V-locity Server Boosts Performance of SQL Server 2012 by 55%

Condusiv s V-locity Server Boosts Performance of SQL Server 2012 by 55% openbench Labs Executive Briefing: April 19, 2013 Condusiv s Server Boosts Performance of SQL Server 2012 by 55% Optimizing I/O for Increased Throughput and Reduced Latency on Physical Servers 01 Executive

More information

The What, Why and How of the Pure Storage Enterprise Flash Array

The What, Why and How of the Pure Storage Enterprise Flash Array The What, Why and How of the Pure Storage Enterprise Flash Array Ethan L. Miller (and a cast of dozens at Pure Storage) What is an enterprise storage array? Enterprise storage array: store data blocks

More information

Building a High Performance Deduplication System Fanglu Guo and Petros Efstathopoulos

Building a High Performance Deduplication System Fanglu Guo and Petros Efstathopoulos Building a High Performance Deduplication System Fanglu Guo and Petros Efstathopoulos Symantec Research Labs Symantec FY 2013 (4/1/2012 to 3/31/2013) Revenue: $ 6.9 billion Segment Revenue Example Business

More information

In-Block Level Redundancy Management for Flash Storage System

In-Block Level Redundancy Management for Flash Storage System , pp.309-318 http://dx.doi.org/10.14257/ijmue.2015.10.9.32 In-Block Level Redundancy Management for Flash Storage System Seung-Ho Lim Division of Computer and Electronic Systems Engineering Hankuk University

More information

Flash Memory Technology in Enterprise Storage

Flash Memory Technology in Enterprise Storage NETAPP WHITE PAPER Flash Memory Technology in Enterprise Storage Flexible Choices to Optimize Performance Mark Woods and Amit Shah, NetApp November 2008 WP-7061-1008 EXECUTIVE SUMMARY Solid state drives

More information

WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE

WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE WHITE PAPER BASICS OF DISK I/O PERFORMANCE WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE This technical documentation is aimed at the persons responsible for the disk I/O performance

More information

IS IN-MEMORY COMPUTING MAKING THE MOVE TO PRIME TIME?

IS IN-MEMORY COMPUTING MAKING THE MOVE TO PRIME TIME? IS IN-MEMORY COMPUTING MAKING THE MOVE TO PRIME TIME? EMC and Intel work with multiple in-memory solutions to make your databases fly Thanks to cheaper random access memory (RAM) and improved technology,

More information

Data Backup and Archiving with Enterprise Storage Systems

Data Backup and Archiving with Enterprise Storage Systems Data Backup and Archiving with Enterprise Storage Systems Slavjan Ivanov 1, Igor Mishkovski 1 1 Faculty of Computer Science and Engineering Ss. Cyril and Methodius University Skopje, Macedonia slavjan_ivanov@yahoo.com,

More information

An Exploration of Hybrid Hard Disk Designs Using an Extensible Simulator

An Exploration of Hybrid Hard Disk Designs Using an Extensible Simulator An Exploration of Hybrid Hard Disk Designs Using an Extensible Simulator Pavan Konanki Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment

More information

Databases Acceleration with Non Volatile Memory File System (NVMFS) PRESENTATION TITLE GOES HERE Saeed Raja SanDisk Inc.

Databases Acceleration with Non Volatile Memory File System (NVMFS) PRESENTATION TITLE GOES HERE Saeed Raja SanDisk Inc. bases Acceleration with Non Volatile Memory File System (NVMFS) PRESENTATION TITLE GOES HERE Saeed Raja SanDisk Inc. MySQL? Widely used Open Source Relational base Management System (RDBMS) Popular choice

More information

Hardware Configuration Guide

Hardware Configuration Guide Hardware Configuration Guide Contents Contents... 1 Annotation... 1 Factors to consider... 2 Machine Count... 2 Data Size... 2 Data Size Total... 2 Daily Backup Data Size... 2 Unique Data Percentage...

More information

Evaluation Report: Database Acceleration with HP 3PAR StoreServ 7450 All-flash Storage

Evaluation Report: Database Acceleration with HP 3PAR StoreServ 7450 All-flash Storage Evaluation Report: Database Acceleration with HP 3PAR StoreServ 7450 All-flash Storage Evaluation report prepared under contract with HP Executive Summary Solid state storage is transforming the entire

More information

Nutanix Tech Note. Configuration Best Practices for Nutanix Storage with VMware vsphere

Nutanix Tech Note. Configuration Best Practices for Nutanix Storage with VMware vsphere Nutanix Tech Note Configuration Best Practices for Nutanix Storage with VMware vsphere Nutanix Virtual Computing Platform is engineered from the ground up to provide enterprise-grade availability for critical

More information

Multi-level Metadata Management Scheme for Cloud Storage System

Multi-level Metadata Management Scheme for Cloud Storage System , pp.231-240 http://dx.doi.org/10.14257/ijmue.2014.9.1.22 Multi-level Metadata Management Scheme for Cloud Storage System Jin San Kong 1, Min Ja Kim 2, Wan Yeon Lee 3, Chuck Yoo 2 and Young Woong Ko 1

More information

SOLUTION BRIEF. Resolving the VDI Storage Challenge

SOLUTION BRIEF. Resolving the VDI Storage Challenge CLOUDBYTE ELASTISTOR QOS GUARANTEE MEETS USER REQUIREMENTS WHILE REDUCING TCO The use of VDI (Virtual Desktop Infrastructure) enables enterprises to become more agile and flexible, in tune with the needs

More information

Calsoft Webinar - Debunking QA myths for Flash- Based Arrays

Calsoft Webinar - Debunking QA myths for Flash- Based Arrays Most Trusted Names in Data Centre Products Rely on Calsoft! September 2015 Calsoft Webinar - Debunking QA myths for Flash- Based Arrays Agenda Introduction to Types of Flash-Based Arrays Challenges in

More information

Solid State Storage in Massive Data Environments Erik Eyberg

Solid State Storage in Massive Data Environments Erik Eyberg Solid State Storage in Massive Data Environments Erik Eyberg Senior Analyst Texas Memory Systems, Inc. Agenda Taxonomy Performance Considerations Reliability Considerations Q&A Solid State Storage Taxonomy

More information

Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays

Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays 1 Impact of Stripe Unit Size on Performance and Endurance of SSD-Based RAID Arrays Farzaneh Rajaei Salmasi Hossein Asadi Majid GhasemiGol rajaei@ce.sharif.edu asadi@sharif.edu ghasemigol@ce.sharif.edu

More information

09'Linux Plumbers Conference

09'Linux Plumbers Conference 09'Linux Plumbers Conference Data de duplication Mingming Cao IBM Linux Technology Center cmm@us.ibm.com 2009 09 25 Current storage challenges Our world is facing data explosion. Data is growing in a amazing

More information

Contents. WD Arkeia Page 2 of 14

Contents. WD Arkeia Page 2 of 14 Contents Contents...2 Executive Summary...3 What Is Data Deduplication?...4 Traditional Data Deduplication Strategies...5 Deduplication Challenges...5 Single-Instance Storage...5 Fixed-Block Deduplication...6

More information

FlashSoft Software from SanDisk : Accelerating Virtual Infrastructures

FlashSoft Software from SanDisk : Accelerating Virtual Infrastructures Technology Insight Paper FlashSoft Software from SanDisk : Accelerating Virtual Infrastructures By Leah Schoeb January 16, 2013 FlashSoft Software from SanDisk: Accelerating Virtual Infrastructures 1 FlashSoft

More information

Accelerating Server Storage Performance on Lenovo ThinkServer

Accelerating Server Storage Performance on Lenovo ThinkServer Accelerating Server Storage Performance on Lenovo ThinkServer Lenovo Enterprise Product Group April 214 Copyright Lenovo 214 LENOVO PROVIDES THIS PUBLICATION AS IS WITHOUT WARRANTY OF ANY KIND, EITHER

More information

FlashTier: a Lightweight, Consistent and Durable Storage Cache

FlashTier: a Lightweight, Consistent and Durable Storage Cache FlashTier: a Lightweight, Consistent and Durable Storage Cache Mohit Saxena, Michael M. Swift and Yiying Zhang University of Wisconsin-Madison {msaxena,swift,yyzhang}@cs.wisc.edu Abstract The availability

More information

CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY

CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY Reference Architecture CONSOLIDATING MICROSOFT SQL SERVER OLTP WORKLOADS ON THE EMC XtremIO ALL FLASH ARRAY An XtremIO Reference Architecture Abstract This Reference architecture examines the storage efficiencies

More information

SH-Sim: A Flexible Simulation Platform for Hybrid Storage Systems

SH-Sim: A Flexible Simulation Platform for Hybrid Storage Systems , pp.61-70 http://dx.doi.org/10.14257/ijgdc.2014.7.3.07 SH-Sim: A Flexible Simulation Platform for Hybrid Storage Systems Puyuan Yang 1, Peiquan Jin 1,2 and Lihua Yue 1,2 1 School of Computer Science and

More information

COS 318: Operating Systems

COS 318: Operating Systems COS 318: Operating Systems File Performance and Reliability Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Topics File buffer cache

More information

ABSTRACT 1 INTRODUCTION

ABSTRACT 1 INTRODUCTION DEDUPLICATION IN YAFFS Karthik Narayan {knarayan@cs.wisc.edu}, Pavithra Seshadri Vijayakrishnan{pavithra@cs.wisc.edu} Department of Computer Sciences, University of Wisconsin Madison ABSTRACT NAND flash

More information

Avoiding the Disk Bottleneck in the Data Domain Deduplication File System

Avoiding the Disk Bottleneck in the Data Domain Deduplication File System Avoiding the Disk Bottleneck in the Data Domain Deduplication File System Benjamin Zhu Data Domain, Inc. Kai Li Data Domain, Inc. and Princeton University Hugo Patterson Data Domain, Inc. Abstract Disk-based

More information

The Classical Architecture. Storage 1 / 36

The Classical Architecture. Storage 1 / 36 1 / 36 The Problem Application Data? Filesystem Logical Drive Physical Drive 2 / 36 Requirements There are different classes of requirements: Data Independence application is shielded from physical storage

More information

CURRENTLY, the enterprise data centers manage PB or

CURRENTLY, the enterprise data centers manage PB or IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. 61, NO. 11, JANUARY 21 1 : Distributed Deduplication for Big Storage in the Cloud Shengmei Luo, Guangyan Zhang, Chengwen Wu, Samee U. Khan, Senior Member, IEEE,

More information

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat Why Computers Are Getting Slower The traditional approach better performance Why computers are

More information

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014 VMware SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014 VMware SAN Backup Using VMware vsphere Table of Contents Introduction.... 3 vsphere Architectural Overview... 4 SAN Backup

More information

File System Management

File System Management Lecture 7: Storage Management File System Management Contents Non volatile memory Tape, HDD, SSD Files & File System Interface Directories & their Organization File System Implementation Disk Space Allocation

More information

Boost SQL Server Performance Buffer Pool Extensions & Delayed Durability

Boost SQL Server Performance Buffer Pool Extensions & Delayed Durability Boost SQL Server Performance Buffer Pool Extensions & Delayed Durability Manohar Punna President - SQLServerGeeks #509 Brisbane 2016 Agenda SQL Server Memory Buffer Pool Extensions Delayed Durability Analysis

More information

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes

More information

Deduplication Demystified: How to determine the right approach for your business

Deduplication Demystified: How to determine the right approach for your business Deduplication Demystified: How to determine the right approach for your business Presented by Charles Keiper Senior Product Manager, Data Protection Quest Software Session Objective: To answer burning

More information

IBM Systems and Technology Group May 2013 Thought Leadership White Paper. Faster Oracle performance with IBM FlashSystem

IBM Systems and Technology Group May 2013 Thought Leadership White Paper. Faster Oracle performance with IBM FlashSystem IBM Systems and Technology Group May 2013 Thought Leadership White Paper Faster Oracle performance with IBM FlashSystem 2 Faster Oracle performance with IBM FlashSystem Executive summary This whitepaper

More information

A Deduplication-based Data Archiving System

A Deduplication-based Data Archiving System 2012 International Conference on Image, Vision and Computing (ICIVC 2012) IPCSIT vol. 50 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V50.20 A Deduplication-based Data Archiving System

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

The assignment of chunk size according to the target data characteristics in deduplication backup system

The assignment of chunk size according to the target data characteristics in deduplication backup system The assignment of chunk size according to the target data characteristics in deduplication backup system Mikito Ogata Norihisa Komoda Hitachi Information and Telecommunication Engineering, Ltd. 781 Sakai,

More information

File-System Implementation

File-System Implementation File-System Implementation 11 CHAPTER In this chapter we discuss various methods for storing information on secondary storage. The basic issues are device directory, free space management, and space allocation

More information

VDI Optimization Real World Learnings. Russ Fellows, Evaluator Group

VDI Optimization Real World Learnings. Russ Fellows, Evaluator Group Russ Fellows, Evaluator Group SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material

More information

Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance. Alex Ho, Product Manager Innodisk Corporation

Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance. Alex Ho, Product Manager Innodisk Corporation Accelerating I/O- Intensive Applications in IT Infrastructure with Innodisk FlexiArray Flash Appliance Alex Ho, Product Manager Innodisk Corporation Outline Innodisk Introduction Industry Trend & Challenge

More information

WHITE PAPER 1 WWW.FUSIONIO.COM

WHITE PAPER 1 WWW.FUSIONIO.COM 1 WWW.FUSIONIO.COM WHITE PAPER WHITE PAPER Executive Summary Fusion iovdi is the first desktop- aware solution to virtual desktop infrastructure. Its software- defined approach uniquely combines the economics

More information

89 Fifth Avenue, 7th Floor. New York, NY 10003. www.theedison.com 212.367.7400. White Paper. HP 3PAR Thin Deduplication: A Competitive Comparison

89 Fifth Avenue, 7th Floor. New York, NY 10003. www.theedison.com 212.367.7400. White Paper. HP 3PAR Thin Deduplication: A Competitive Comparison 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com 212.367.7400 White Paper HP 3PAR Thin Deduplication: A Competitive Comparison Printed in the United States of America Copyright 2014 Edison

More information

A Context Aware Block Layer: The Case for Block Layer Deduplication

A Context Aware Block Layer: The Case for Block Layer Deduplication A Context Aware Block Layer: The Case for Block Layer Deduplication A Thesis Presented by Amar Mudrankit to The Graduate School in Partial Fulfillment of the Requirements for the Degree of Master of Science

More information

FUSION iocontrol HYBRID STORAGE ARCHITECTURE 1 WWW.FUSIONIO.COM

FUSION iocontrol HYBRID STORAGE ARCHITECTURE 1 WWW.FUSIONIO.COM 1 WWW.FUSIONIO.COM FUSION iocontrol HYBRID STORAGE ARCHITECTURE Contents Contents... 2 1 The Storage I/O and Management Gap... 3 2 Closing the Gap with Fusion-io... 4 2.1 Flash storage, the Right Way...

More information

A Comparison of Client and Enterprise SSD Data Path Protection

A Comparison of Client and Enterprise SSD Data Path Protection A Comparison of Client and Enterprise SSD Data Path Protection Doug Rollins, Senior Strategic Applications Engineer Micron Technology, Inc. Technical Marketing Brief Data Path Protection Overview This

More information

Design and Implementation of a Storage Repository Using Commonality Factoring. IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen

Design and Implementation of a Storage Repository Using Commonality Factoring. IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen Design and Implementation of a Storage Repository Using Commonality Factoring IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen Axion Overview Potentially infinite historic versioning for rollback and

More information

Boosting Database Batch workloads using Flash Memory SSDs

Boosting Database Batch workloads using Flash Memory SSDs Boosting Database Batch workloads using Flash Memory SSDs Won-Gill Oh and Sang-Won Lee School of Information and Communication Engineering SungKyunKwan University, 27334 2066, Seobu-Ro, Jangan-Gu, Suwon-Si,

More information

All-Flash Arrays: Not Just for the Top Tier Anymore

All-Flash Arrays: Not Just for the Top Tier Anymore All-Flash Arrays: Not Just for the Top Tier Anymore Falling prices, new technology make allflash arrays a fit for more financial, life sciences and healthcare applications EXECUTIVE SUMMARY Real-time financial

More information

DEDUPLICATION has become a key component in modern

DEDUPLICATION has become a key component in modern IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 27, NO. 3, MARCH 2016 855 Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History and Cache Knowledge Min

More information

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression Philip Shilane, Mark Huang, Grant Wallace, and Windsor Hsu Backup Recovery Systems Division EMC Corporation Abstract

More information

Implementation of Buffer Cache Simulator for Hybrid Main Memory and Flash Memory Storages

Implementation of Buffer Cache Simulator for Hybrid Main Memory and Flash Memory Storages Implementation of Buffer Cache Simulator for Hybrid Main Memory and Flash Memory Storages Soohyun Yang and Yeonseung Ryu Department of Computer Engineering, Myongji University Yongin, Gyeonggi-do, Korea

More information

A High-Throughput In-Memory Index, Durable on Flash-based SSD

A High-Throughput In-Memory Index, Durable on Flash-based SSD A High-Throughput In-Memory Index, Durable on Flash-based SSD Insights into the Winning Solution of the SIGMOD Programming Contest 2011 Thomas Kissinger, Benjamin Schlegel, Matthias Boehm, Dirk Habich,

More information

File Systems for Flash Memories. Marcela Zuluaga Sebastian Isaza Dante Rodriguez

File Systems for Flash Memories. Marcela Zuluaga Sebastian Isaza Dante Rodriguez File Systems for Flash Memories Marcela Zuluaga Sebastian Isaza Dante Rodriguez Outline Introduction to Flash Memories Introduction to File Systems File Systems for Flash Memories YAFFS (Yet Another Flash

More information

A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems*

A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* Junho Jang, Saeyoung Han, Sungyong Park, and Jihoon Yang Department of Computer Science and Interdisciplinary Program

More information

Byte-index Chunking Algorithm for Data Deduplication System

Byte-index Chunking Algorithm for Data Deduplication System , pp.415-424 http://dx.doi.org/10.14257/ijsia.2013.7.5.38 Byte-index Chunking Algorithm for Data Deduplication System Ider Lkhagvasuren 1, Jung Min So 1, Jeong Gun Lee 1, Chuck Yoo 2 and Young Woong Ko

More information

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer. Guest lecturer: David Hovemeyer November 15, 2004 The memory hierarchy Red = Level Access time Capacity Features Registers nanoseconds 100s of bytes fixed Cache nanoseconds 1-2 MB fixed RAM nanoseconds

More information

ACHIEVING STORAGE EFFICIENCY WITH DATA DEDUPLICATION

ACHIEVING STORAGE EFFICIENCY WITH DATA DEDUPLICATION ACHIEVING STORAGE EFFICIENCY WITH DATA DEDUPLICATION Dell NX4 Dell Inc. Visit dell.com/nx4 for more information and additional resources Copyright 2008 Dell Inc. THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES

More information

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent

More information

Data De-duplication Methodologies: Comparing ExaGrid s Byte-level Data De-duplication To Block Level Data De-duplication

Data De-duplication Methodologies: Comparing ExaGrid s Byte-level Data De-duplication To Block Level Data De-duplication Data De-duplication Methodologies: Comparing ExaGrid s Byte-level Data De-duplication To Block Level Data De-duplication Table of Contents Introduction... 3 Shortest Possible Backup Window... 3 Instant

More information

Benchmarking Cassandra on Violin

Benchmarking Cassandra on Violin Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract

More information

idedup: Latency-aware, inline data deduplication for primary storage

idedup: Latency-aware, inline data deduplication for primary storage idedup: Latency-aware, inline data deduplication for primary storage Kiran Srinivasan, Tim Bisson, Garth Goodson, Kaladhar Voruganti NetApp, Inc. {skiran, tbisson, goodson, kaladhar}@netapp.com Abstract

More information

WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression

WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression Sponsored by: Oracle Steven Scully May 2010 Benjamin Woo IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

EMC VNXe File Deduplication and Compression

EMC VNXe File Deduplication and Compression White Paper EMC VNXe File Deduplication and Compression Overview Abstract This white paper describes EMC VNXe File Deduplication and Compression, a VNXe system feature that increases the efficiency with

More information

SOLID STATE DRIVES AND PARALLEL STORAGE

SOLID STATE DRIVES AND PARALLEL STORAGE SOLID STATE DRIVES AND PARALLEL STORAGE White paper JANUARY 2013 1.888.PANASAS www.panasas.com Overview Solid State Drives (SSDs) have been touted for some time as a disruptive technology in the storage

More information