Disk-Based Backup. The compute impact of data deduplication on disk-based backup




Data backup is becoming increasingly critical as data corruption events, natural disasters, systems failures, and other data-loss catastrophes grow more frequent. When data needs to be restored, organizations may need to roll back weeks or even months to find a readable copy, depending on when the corruption or deletion occurred. In addition, industry and government regulations (e.g., Sarbanes-Oxley, HIPAA, and GLBA) are becoming more stringent. Together, these imperatives drive the need to keep many weeks, months, and years of backup retention.

Using straight disk for backup storage becomes cost prohibitive very quickly. For example, keeping 12 weeklies as well as monthlies for 3 years equals 45 backup copies. Given these retention requirements, data deduplication is necessary. Not only does deduplication greatly reduce backup storage, it also reduces WAN replication to the disaster recovery (DR) site because only the changes from backup to backup are stored and replicated.

The deduplication ratio compares the amount of storage required without data deduplication to the amount required with it. If 20 copies of a 50TB backup are kept without data deduplication, 1PB of storage is required. The longer the retention period, the greater the storage required and, therefore, the greater the savings if data deduplication is used. For example, at 20 weeks of retention, a solution with data deduplication uses only 50TB to store 20 copies of a 50TB backup. The deduplication ratio is 1PB (without data deduplication) divided by 50TB (with data deduplication), or 20:1.

When implementing data deduplication, there are some inherent challenges to take into account. The first is that although data deduplication reduces storage and WAN bandwidth, not all data deduplication is created equal.
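
The ratio arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 50TB example; the function names are hypothetical, and the ratios in the loop are sample values, not measurements of any product.

```python
def raw_storage_tb(full_backup_tb: float, copies: int) -> float:
    """Storage needed to keep every copy in full, without deduplication."""
    return full_backup_tb * copies


def dedup_ratio(raw_tb: float, stored_tb: float) -> float:
    """Storage without deduplication divided by storage with it."""
    return raw_tb / stored_tb


def stored_tb_at_ratio(raw_tb: float, ratio: float) -> float:
    """Disk actually consumed when a given deduplication ratio is achieved."""
    return raw_tb / ratio


raw = raw_storage_tb(full_backup_tb=50, copies=20)  # 1000 TB, i.e. 1PB
ratio = dedup_ratio(raw, stored_tb=50)               # 20.0, i.e. 20:1

# The same 1PB of retained backups at a few sample ratios.
for r in (2, 4, 10, 20):
    print(f"{r}:1 -> {stored_tb_at_ratio(raw, r):.0f} TB on disk")
```

At 2:1 the same retention needs 500TB of disk, ten times what it needs at 20:1, which is why the achieved ratio matters as much as having deduplication at all.
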
Each vendor has its own algorithmic approach, and with the exact same data mix, backup size, and retention period, different solutions will achieve ratios of 2:1, 4:1, 6:1, 8:1, 10:1, 12:1, 20:1, and higher. Depending on the deduplication algorithm used, the amount of storage and bandwidth required can vary greatly. A strong deduplication solution can achieve a deduplication ratio of 10:1 to as high as 50:1, with an average of 20:1, depending on the data mix and retention period. Typically, the backup application's deduplication results in a lower ratio and uses more disk and bandwidth than dedicated target-side appliances, since appliances have dedicated high-speed compute and can use more aggressive algorithms to achieve higher deduplication ratios.

Another challenge is that most vendors simply added data deduplication as a feature to a scale-up storage platform or to a backup application. Adding deduplication as a feature, rather than creating an architecture specifically for data deduplication, slows down backups, restores, and VM boots, and as data grows, the backup window expands. Data deduplication is highly compute-intensive, and when this processing is performed in the data stream of the backups, backup speed suffers.

The impact of implementing an inferior solution may include:

- lower deduplication ratios, resulting in 1) additional storage and cost, and 2) increased bandwidth and cost due to replicating more data
- slow backups, resulting in long backup windows
- slow restores, offsite tape copies, and VM boots, impacting users and productivity
- a backup window that expands as data grows, requiring regular upgrades to a bigger and faster front-end controller or media servers (known as forklift upgrades, which are costly and disruptive)

When data is deduplicated during the backup window, the backups are slow, resulting in a longer backup window. In addition, only deduplicated data is stored, so in order to perform restores and VM boots, data needs to go through a time-consuming reassembly, or rehydration, process each time.

ExaGrid looked at these challenges differently than other vendors. They are not merely storage and replication challenges: data deduplication introduces a compute challenge that other approaches do not address.

ExaGrid's approach meets all 5 challenges by delivering the:

1. best deduplication ratio for the least amount of required storage
2. best deduplication ratio for the least amount of bandwidth used
3. fastest backups for the shortest backup window
4. fastest restores, offsite tape copies, and VM boots to improve user uptime
5. backup window that remains fixed in length even as data grows

ExaGrid deploys zone-level deduplication that compares zone stamps and then identifies only the bytes that have changed. ExaGrid then stores and replicates only the changed bytes. Depending on data mix and retention periods, ExaGrid achieves or exceeds the industry's best average deduplication ratio of 20:1, with ratios ranging from 10:1 into the hundreds to one. ExaGrid requires the least amount of storage for deduplicated data and the least bandwidth to replicate data offsite for disaster recovery.
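
The paper does not describe how zone stamps are computed, so the following Python sketch is only a toy illustration of the general idea (fingerprint large zones, then look for changed bytes only inside zones whose fingerprints differ). It is not ExaGrid's algorithm. The fixed 8-byte zone size, the use of SHA-256 as a "stamp", and all function names are assumptions made for readability, and the sketch handles in-place changes only, not insertions that shift data.

```python
import hashlib

ZONE_SIZE = 8  # bytes; hypothetical and tiny so the example is easy to follow


def zone_stamps(data: bytes, zone_size: int = ZONE_SIZE) -> list[str]:
    """Split data into fixed-size zones and fingerprint each one."""
    return [
        hashlib.sha256(data[i:i + zone_size]).hexdigest()
        for i in range(0, len(data), zone_size)
    ]


def changed_bytes(previous: bytes, current: bytes,
                  zone_size: int = ZONE_SIZE) -> dict[int, int]:
    """Return {offset: new_byte}, examining only zones whose stamps differ."""
    old_stamps = zone_stamps(previous, zone_size)
    new_stamps = zone_stamps(current, zone_size)
    delta: dict[int, int] = {}
    for z, stamp in enumerate(new_stamps):
        if z < len(old_stamps) and stamp == old_stamps[z]:
            continue  # identical zone: nothing to store or replicate
        start = z * zone_size
        for offset in range(start, min(start + zone_size, len(current))):
            if offset >= len(previous) or previous[offset] != current[offset]:
                delta[offset] = current[offset]
    return delta


def apply_delta(previous: bytes, delta: dict[int, int], new_length: int) -> bytes:
    """Rebuild the current backup from the previous one plus the stored delta."""
    rebuilt = bytearray(previous[:new_length].ljust(new_length, b"\0"))
    for offset, value in delta.items():
        rebuilt[offset] = value
    return bytes(rebuilt)


monday = b"AAAAAAAABBBBBBBBCCCCCCCC"
tuesday = b"AAAAAAAABBBxBBBBCCCCCCCC"
delta = changed_bytes(monday, tuesday)
print(delta)  # {11: 120}, a single changed byte out of 24
```

Only the one changed byte would need to be stored or replicated for the second backup, and `apply_delta(monday, delta, len(tuesday))` reproduces it in full.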

In order to avoid the compute-intensive deduplication tax during the backup window, ExaGrid writes straight to disk for the fastest possible backups, resulting in ingest rates that are the highest in the industry. In contrast, appliances that perform deduplication inline are slow because they deduplicate data through a single controller; therefore, they need to deploy software on backup media servers and database servers to offload some of the deduplication. Even with some of the processing done on the production servers, these solutions still cannot come close to ExaGrid's performance. In larger installations, ExaGrid is at least three times the speed of its closest competitor.

Writing direct to disk allows ExaGrid to keep the most recent backups in both their deduplicated and original non-deduplicated form. The most recent non-deduplicated backups are kept in a landing zone, quickly accessible for the fastest restores, offsite tape copies, and VM boots. ExaGrid's approach avoids the time-consuming data rehydration process and is five to ten times faster than solutions that only store deduplicated data. When booting a VM, backup software deduplication can take a few hours up to a full day, and inline scale-up appliances can take hours, but ExaGrid takes just seconds to minutes because the most recent backups are readily available in their non-deduplicated form in the landing zone. Behind that, ExaGrid keeps weeks, months, and years of deduplicated data for long-term retention storage.
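
The landing-zone idea can be illustrated with a small Python model. This is a sketch under stated assumptions, not a description of ExaGrid's implementation: the class name, the fixed 4-byte chunks, and the hash-keyed chunk store are all hypothetical. It shows only the ordering described above, where backups are written whole first, deduplicated afterward, and the newest backup stays available without rehydration.

```python
import hashlib


class LandingZoneStore:
    """Toy post-process deduplicating store (illustrative only)."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.landing_zone: dict[str, bytes] = {}  # recent backups, stored whole
        self.chunks: dict[str, bytes] = {}        # unique chunks, keyed by hash
        self.recipes: dict[str, list[str]] = {}   # backup name -> ordered hashes

    def ingest(self, name: str, data: bytes) -> None:
        """During the backup window: write straight to disk, no dedup work."""
        self.landing_zone[name] = data

    def deduplicate(self, keep_recent: int = 1) -> None:
        """After the backup window: dedupe all backups, keep the newest whole."""
        names = list(self.landing_zone)  # dicts preserve insertion order
        for name in names:
            if name in self.recipes:
                continue  # already deduplicated on an earlier pass
            data = self.landing_zone[name]
            recipe = []
            for i in range(0, len(data), self.chunk_size):
                chunk = data[i:i + self.chunk_size]
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)
                recipe.append(digest)
            self.recipes[name] = recipe
        # Older backups leave the landing zone; only their recipes remain.
        for name in (names[:-keep_recent] if keep_recent else names):
            del self.landing_zone[name]

    def restore(self, name: str) -> tuple[bytes, str]:
        """Return the data and the path that served it."""
        if name in self.landing_zone:
            return self.landing_zone[name], "landing zone"
        data = b"".join(self.chunks[d] for d in self.recipes[name])
        return data, "rehydrated"


store = LandingZoneStore()
store.ingest("week1", b"AAAABBBBCCCC")
store.ingest("week2", b"AAAABBBBDDDD")
store.deduplicate()
print(store.restore("week2")[1])  # landing zone
print(store.restore("week1")[1])  # rehydrated
```

In this model the most recent backup is returned directly, while the older one must be reassembled from four unique chunks, which is the step the landing zone is meant to avoid for recent data.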

It stands to reason that as the volume of backup data grows, so too does the amount of data to be deduplicated, and it follows that if additional compute is not added, the backup window will continue to grow indefinitely. However, ExaGrid is the only solution that uses a scale-out storage architecture, so instead of adding just storage behind a fixed resource controller, ExaGrid adds full appliances with all required resources (processor, memory, network ports, and storage) to a scale-out GRID. Therefore, when data doubles, triples, or quadruples, ExaGrid doubles, triples, or quadruples all necessary resources, not just storage capacity.

ExaGrid is the only solution that meets all 5 backup storage challenges with the:

1. highest storage efficiency
2. lowest bandwidth usage
3. fastest backups
4. fastest restores, offsite tape copies, and VM boots
5. fixed-length backup window despite data growth
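
The scale-out reasoning above can be expressed as a simple model. The Python sketch below assumes idealized linear scaling and uses hypothetical capacity and ingest figures chosen only to make the arithmetic visible; they are not specifications of any product.

```python
import math

# Hypothetical figures, for illustration only.
APPLIANCE_CAPACITY_TB = 50        # full-backup size one appliance can take in
APPLIANCE_INGEST_TB_PER_HOUR = 5  # ingest rate each appliance contributes
CONTROLLER_INGEST_TB_PER_HOUR = 5 # fixed ingest rate of a scale-up controller


def scale_up_window_hours(data_tb: float) -> float:
    """Fixed controller: storage is added, but ingest compute is not."""
    return data_tb / CONTROLLER_INGEST_TB_PER_HOUR


def scale_out_window_hours(data_tb: float) -> float:
    """Each added appliance brings its own compute, memory, and network."""
    appliances = math.ceil(data_tb / APPLIANCE_CAPACITY_TB)
    return data_tb / (appliances * APPLIANCE_INGEST_TB_PER_HOUR)


for data_tb in (50, 100, 200):
    print(
        f"{data_tb} TB: scale-up {scale_up_window_hours(data_tb):.0f} h, "
        f"scale-out {scale_out_window_hours(data_tb):.0f} h"
    )
```

Under these assumptions the scale-up window grows from 10 to 40 hours as data quadruples, while the scale-out window stays at 10 hours because ingest capacity grows with the data.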

United States: 2000 West Park Drive, Westborough, MA 01581, (800) 868-6985
United Kingdom: 200 Brook Drive, Green Park, Reading, Berkshire RG2 6UB, +44 (0) 1189 497 051
Singapore: 1 Raffles Place, #20-61, One Raffles Place Tower 2, 048616, +65 6285 0302

ExaGrid reserves the right to change specifications or other product information without notice. ExaGrid and the ExaGrid logo are trademarks of ExaGrid Systems, Inc. All other trademarks are the property of their respective holders. © 2016 ExaGrid Systems, Inc. All rights reserved.