# Data Reduction: Deduplication and Compression. Danny Harnik IBM Haifa Research Labs

## Transcription

1 Data Reduction: Deduplication and Compression Danny Harnik IBM Haifa Research Labs

2 Motivation

- Reducing the amount of data is a desirable goal.
- Data reduction: an attempt to compress the huge amounts of data at hand.
- Is it possible?
  - Information-theoretically
  - Technically
- Is it financially worth it?
  - Storage is becoming cheaper all the time.
  - Data reduction requires resources and time.

3 Compression and Deduplication

- Compression: what is the most succinct representation of this file?
- Deduplication: hasn't this file appeared before?
- Different workloads give different results:
  - Some favor compression, some favor dedup.
  - Sometimes the combination is best.

4 Compression

5 Compression

- Zip runs an algorithm called DEFLATE, a combination of two techniques:
  - Lempel-Ziv [1977]
  - Huffman code [1952]
- Will show these two techniques, plus arithmetic encoding.

6 LZ77 Compression

- Go over a stream. At each point, search for the longest identical string that has already appeared in the past.
  - If none appeared, write the string.
  - If one appeared, save:
    - a pointer to the start of the string (how many bytes back)
    - the length of the current string
- Many variations:
  - How far to search back? Typically 32KB.
  - LZ78 holds a dictionary table.
- A good approximation of the entropy for some sources.
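The search-and-point step described on the LZ77 slide can be sketched as follows. This is a minimal illustration, not how DEFLATE implementations actually search: the function names, the token layout, and the `min_match` threshold are assumptions made for the example, and the brute-force scan is quadratic in the window size.

```python
def lz77_compress(data: bytes, window: int = 32 * 1024, min_match: int = 3):
    """Greedy LZ77 tokenizer: emits literals and (distance, length) matches."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Look back at most `window` bytes for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            # The match may run past position i (an overlapping copy).
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append(("match", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("literal", data[i]))
            i += 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    """Rebuild the original bytes from the token stream."""
    out = bytearray()
    for token in tokens:
        if token[0] == "literal":
            out.append(token[1])
        else:
            _, dist, length = token
            for _ in range(length):
                out.append(out[-dist])  # byte by byte, so overlapping copies work
    return bytes(out)
```

For the input `b"aaaaaaa"` the tokenizer emits one literal followed by a single match that points one byte back with length six, which shows how a pointer can cover data that is still being produced.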

7 Huffman Code

- An information-theoretic approach to compression:
  - The characters (or bytes) in a typical text of n characters are not uniformly distributed.
  - Use the skewed distribution to achieve a shorter representation.
  - The most popular byte or character gets the shortest representation.
- E.g. in a typical English text: use the shortest encoding for "e" and the longest for "q".
- Huffman code: a method of representing a text using nearly its Shannon entropy worth in bits.
- Optimal when considering just single characters.

8 Huffman Code

[Figure: Huffman tree for the sample text "this is an example of a huffman tree".]

Example taken from:
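As a rough sketch of how such a code can be built, the snippet below repeatedly merges the two least frequent subtrees using a heap. The function name `huffman_code` and the tie-breaking order are assumptions for illustration; different tie-breaks give different but equally short codes.

```python
import heapq
from collections import Counter


def huffman_code(text: str) -> dict:
    """Return a prefix-free code {symbol: bit string} for the symbols in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code below them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


text = "this is an example of a huffman tree"
code = huffman_code(text)
total_bits = sum(len(code[ch]) for ch in text)
```

For the sample sentence from the slide, `total_bits` comes to 135, compared with 288 bits at a fixed 8 bits per character. The space, the most frequent symbol, never gets a longer code than a symbol that occurs once, such as "x".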

9 Deduplication

10 Deduplication

- Similar to Lempel Ziv 78, but at a whole different scale.
  - The basic block is typically ~4KB, 8KB, 16KB, or a full file, rather than a byte or a string of bytes.
- An ongoing process: need to address a file after it is saved and closed.
- Two main approaches:
  - Inline dedup: process data as it arrives.
  - Offline dedup: a background process; first save the data, then dedup in spare time.

11 How to dedupe?

- Fingerprint each block using a hash function.
  - Common hashes used: SHA-1, SHA-256, others.
- Store an index of all the hashes already in the system.
- For a new block:
  - Compute the hash.
  - Look the hash up in the index table.
  - If new, add it to the index.
  - If the hash is known, store the block as a pointer to the existing data.
- If the hash is known, do you want to look at the actual data?
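The lookup loop above can be sketched with an in-memory index. The class name `DedupStore`, the fixed 4 KB block size, and the optional `verify` flag (which answers the last question by comparing the actual bytes on a hash hit) are illustrative assumptions, not a description of any particular product.

```python
import hashlib


class DedupStore:
    """Toy fixed-block deduplicating store backed by a dict."""

    def __init__(self, block_size: int = 4096, verify: bool = False):
        self.block_size = block_size
        self.verify = verify
        self.index = {}  # fingerprint -> stored block bytes

    def write(self, data: bytes) -> list:
        """Store data and return its recipe: the list of block fingerprints."""
        recipe = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            fp = hashlib.sha256(block).hexdigest()
            stored = self.index.get(fp)
            if stored is None:
                self.index[fp] = block  # new block: add to the index
            elif self.verify and stored != block:
                raise ValueError("fingerprint collision detected")
            recipe.append(fp)  # known block: keep only a pointer
        return recipe

    def read(self, recipe: list) -> bytes:
        """Reassemble the data from its recipe."""
        return b"".join(self.index[fp] for fp in recipe)


store = DedupStore(verify=True)
payload = b"A" * 4096 * 3 + b"B" * 4096
recipe = store.write(payload)
```

Here four logical blocks are written but only two distinct blocks end up in the index; the recipe still lists four fingerprints so the data can be rebuilt.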

12 Client-side deduplication

- A method to save bandwidth as well as storage.
- Also known as source-based dedupe or WAN deduplication.
- The client computes the hash and sends it to the server.
  - If the hash is new, the server asks the client for the data (upload data).
  - Otherwise (dedupe), skip the upload and add a new pointer to the data.

[Figure: the client sends the hash of "Let it be.mp3" (2fd4e1) to the server, which looks it up in its index.]
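A small simulation of this exchange is sketched below. The `Server` class, its `offer` and `upload` methods, and the `client_backup` helper are hypothetical names invented for the example; a real service would add authentication, chunking, and network transport.

```python
import hashlib


class Server:
    """Toy backup server that deduplicates whole files by SHA-1 fingerprint."""

    def __init__(self):
        self.blobs = {}          # fingerprint -> file data
        self.owners = {}         # fingerprint -> set of (user, filename)
        self.bytes_received = 0  # payload bytes that crossed the network

    def offer(self, user: str, name: str, fingerprint: str) -> bool:
        """Return True if the client must upload the data."""
        if fingerprint in self.blobs:
            self.owners[fingerprint].add((user, name))  # just add a pointer
            return False
        return True

    def upload(self, user: str, name: str, data: bytes) -> None:
        fingerprint = hashlib.sha1(data).hexdigest()
        self.bytes_received += len(data)
        self.blobs[fingerprint] = data
        self.owners.setdefault(fingerprint, set()).add((user, name))


def client_backup(server: Server, user: str, name: str, data: bytes) -> str:
    """Send the hash first; upload the data only if the server asks for it."""
    fingerprint = hashlib.sha1(data).hexdigest()
    if server.offer(user, name, fingerprint):
        server.upload(user, name, data)
        return "uploaded"
    return "deduped"


server = Server()
song = b"\x00" * 1_000_000  # stand-in for the contents of an mp3 file
first = client_backup(server, "alice", "Let it be.mp3", song)
second = client_backup(server, "bob", "Let it be.mp3", song)
```

The first backup uploads one megabyte; the second sends only the fingerprint, so `server.bytes_received` stays at the size of a single copy.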

13 Choice of hash function

- In most deduplication systems, fingerprinting is done with a cryptographic hash, usually SHA-1, which has a 160-bit output.
- Probability of a collision, where $n$ is the number of blocks and $b$ is the number of bits in the hash:

  $$p \le \frac{n(n-1)}{2} \cdot \frac{1}{2^{b}}$$

- The above is true for any random hash function.
- However, a malicious adversary may choose blocks specifically to create a collision. This is why a cryptographic hash is used.
- A cryptographic hash is typically more expensive to compute than a non-cryptographic, random-like hash function.
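To get a feel for the numbers, the sketch below evaluates the standard birthday bound, p ≤ n(n-1) / 2^(b+1), for n blocks and a b-bit random hash. The function name `collision_bound` and the example workload (1 PiB of data in 4 KiB blocks) are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def collision_bound(n: int, b: int) -> Fraction:
    """Upper bound on the chance that any two of n blocks share a b-bit hash."""
    return Fraction(n * (n - 1), 2 ** (b + 1))


blocks = 2 ** 50 // 2 ** 12           # 1 PiB of data in 4 KiB blocks = 2**38 blocks
bound = collision_bound(blocks, 160)  # SHA-1 has a 160-bit output
log2_bound = math.log2(bound)
```

For this workload the bound is roughly 2^-85, far below the failure rates of the underlying hardware. That holds only for non-adversarial data, which is the slide's reason for insisting on a cryptographic hash.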

14 Issues

- Smaller blocks = better dedup.
- But smaller blocks = more work:
  - More fingerprints
  - More searches
  - More metadata
- Bottom line: the choice of block size depends on the workload.
  - E.g. a file system with a 1KB page size.
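The metadata cost can be estimated with a little arithmetic. The sketch below assumes, purely for illustration, that each index entry holds a 20-byte SHA-1 fingerprint plus an 8-byte pointer and that every block is unique; real systems store more per entry.

```python
def index_size_bytes(capacity_bytes: int, block_size: int,
                     hash_bytes: int = 20, pointer_bytes: int = 8) -> int:
    """Index size if every block of the given capacity needs its own entry."""
    blocks = -(-capacity_bytes // block_size)  # ceiling division
    return blocks * (hash_bytes + pointer_bytes)


TIB = 2 ** 40
sizes = {kib: index_size_bytes(TIB, kib * 1024) for kib in (1, 4, 16)}
for kib, size in sizes.items():
    print(f"{kib:>2} KiB blocks: {size / 2**30:.2f} GiB of index per TiB")
```

Under these assumptions one TiB of unique data needs about 28 GiB of index at 1 KiB blocks, 7 GiB at 4 KiB, and 1.75 GiB at 16 KiB, which is the trade-off the slide points at.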

15 Alignment issues

- What if we insert 1 byte into an existing file?
  - The data is almost identical, yet fixed-size block dedup will fail miserably, because every block boundary after the insertion shifts.
- Solution: variable block size.
  - Rabin-Karp fingerprinting: compute a rolling hash.
  - Cut when the hash equals 0 mod p.
  - Average block size = p.
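A minimal sketch of content-defined chunking along these lines follows. It uses a simple polynomial rolling hash in the Rabin-Karp style rather than a true Rabin fingerprint over GF(2); the window size, base, modulus, the minimum chunk length, and the function names are all assumptions made for the example.

```python
def chunk_offsets(data: bytes, p: int = 4096, window: int = 48,
                  base: int = 257, modulus: int = (1 << 61) - 1) -> list:
    """Return the end offsets of chunks cut where the rolling hash is 0 mod p."""
    top = pow(base, window - 1, modulus)  # weight of the byte leaving the window
    h = 0
    cuts = []
    last = 0
    for i, byte in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % modulus  # drop the oldest byte
        h = (h * base + byte) % modulus                 # shift in the new byte
        # Require at least `window` bytes per chunk to avoid tiny chunks.
        if i + 1 - last >= window and h % p == 0:
            cuts.append(i + 1)
            last = i + 1
    if last < len(data):
        cuts.append(len(data))  # final partial chunk
    return cuts


def split_chunks(data: bytes, p: int = 4096, window: int = 48) -> list:
    """Split data into variable-size chunks at content-defined boundaries."""
    cuts = chunk_offsets(data, p=p, window=window)
    return [data[start:end] for start, end in zip([0] + cuts[:-1], cuts)]
```

Because each cut depends only on the last `window` bytes, inserting a byte near the start of a file changes the first chunk or two, after which the boundaries fall on the same content again and the remaining chunks deduplicate against the original.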

16 Existing data reduction solutions (A sample of solutions for storage systems)

17 Deduplication: some systems and applications

- Content Addressable Storage (CAS), mainly for archiving
  - Venti (Lucent), Centera (EMC), JumboStore (HP), Hydrastor (NEC)
- Backup
  - Virtual Tape Library (VTL) backup: Diligent (IBM), DataDomain (EMC), D2D (HP)
  - Backup with client-side dedup
    - Cloud backup services: Mozy (EMC), Dropbox, ...
    - Avamar (EMC), Ocarina (Dell), NetBackup (Symantec), Tivoli Storage Manager (IBM)
- Primary (mainly file systems), useful for VM images
  - NetApp Filer: 2-to-1 ratio guarantee on some VMware usage
  - ZFS (Sun open source file system)
  - Dell (planned for next year)

18 Compression in storage systems

- Real-time (inline): RTC (IBM), ZFS (Oracle), Nimble Storage
- Offline / mix:
  - EMC Data Compression
  - Dell (planned for next year: dedupe inline, compression offline)
  - NetApp: writes online, updates offline
- Backup

19 Dedup vs. Compression vs. both

[Figure: "Compression and Deduplication for Various Data Types". Y-axis: data reduction ratio (compressed size / original size). X-axis: data type (VM Images, Medical Images, Website Archive, Project Repository, DB2 TPC, Laptop1 (29.9GB)). Series: Compress (Gzip), DedupV (4K, var), DedupV+Compress, DedupF (4K, fix), DedupF+Compress, Compress+DedupV, Compress+DedupF.]

Data taken from C. Constantinescu, J. Glider, D. Chambliss: Mixing Deduplication and Compression on Active Data Sets. DCC 2011

20 Summary

- Data reduction is a useful concept, but not for all cases.
- Compression and deduplication: two similar concepts at the two ends of the same scale.
- The large scale in dedupe creates new challenges.
- Different challenges and use cases.
- No one solution fits all.
