Building a High Performance Deduplication System Fanglu Guo and Petros Efstathopoulos
|
|
|
- Johnathan Davis
- 10 years ago
- Views:
Transcription
1 Building a High Performance Deduplication System Fanglu Guo and Petros Efstathopoulos Symantec Research Labs
2 Symantec FY 2013 (4/1/2012 to 3/31/2013) Revenue: $ 6.9 billion Segment Revenue Example Business Consumer 30% Norton security, Norton Zone Security and compliance 30% Symantec Endpoint Protection, Mail/Web security, Server protection Storage and server management 36% Backup and recovery, Information availability Services 4% Managed security service Symantec Research Labs 2
3 Symantec Research Labs Leading experts in security, availability and systems doing innovative research across all of Symantec s businesses Our mission is to ensure Symantec s long-term leadership by fostering innovation, generating new ideas, and developing next-generation technologies across all of our businesses. A global organization: Mountain View, CA Culver City, CA Herndon, VA Waltham, MA Ongoing collaboration with other researchers, government agencies and universities such as: Sophia Antipolis, FR and numerous others Symantec Research Labs 3
4 Symantec Research Labs Charter Foster innovation and develop key technologies Operational Model Work closely with existing and emerging businesses Identify and work on relevant hard technical challenges Success Factors Transfer technology Develop new intellectual property Forge research collaborations and publish innovative work Symantec Research Labs 4
5 Symantec Research Labs (SRL) Product Development Advanced Development SRL Academic Research Proportion of time spent f (risk, time_horizon, speculation) Symantec Research Labs 5
6 Building a High Performance Deduplication System
7 Deduplication is Maturing Deduplication (dedupe) is becoming a standard feature of backup/archival systems, file systems, cloud storage, network traffic optimization, Many approaches, algorithms, techniques Inline/offline, file/block level, fixed/variable block size, global/local dedupe Near-optimal duplicate detection and usage of available raw storage However, in practice, scalability is still an issue Single node capacity < a few hundred TB Scalability achieved (mostly) using multi-node systems Capacity still relatively low (usually single-digit PB) High acquisition/maintenance cost Complex design, performance impact Symantec Research Labs 7
8 Hypothesis and Goal We advocate that improving single-node performance is critical Why? Better building blocks fewer nodes necessary Potential for single-node deployments for small/medium businesses Lower acquisition/management/energy cost Hosting/relocation flexibility How? Single-node performance = high scalability and throughput, good dedupe efficiency Making the best of a node s resources to address challenges Built and tested a complete deduplication system from scratch Including client, server and network components Symantec Research Labs 8
9 Core Deduplication Functionality Each file is identified by a file fingerprint FFP Dedupe metadata service C:\Documents and Settings\AAA\My Documents\ Budget_2011.xls 260 Kb 05/05/2011 A B C D Content E F G H 011 FFP C:\Documents and Settings\AAA\My Documents\ Budget_2011.xls 260 Kb 05/05/2011 C:\Documents and Settings\BBB\My Documents\ Budget_2011.xls 260 Kb 05/10/2011 FFP1 FFP1 C:\Documents and Settings\BBB\My Documents\ Budget_2011.xls 270 Kb 06/03/2011 FFP2 C:\Documents and Settings\BBB\My Documents\ Budget_2011.xls 260 Kb 05/10/2011 Content C:\Documents and Settings\BBB\My Documents\ Budget_2011.xls Kb 05/05/ /03/2011 A B C O Updated Content Content D F G P 1101 FFP Each file consists of data segments. FFP WAN For each segment a segment fingerprint SFP is calculated. (MD5, SHA-1/2 etc.) FFP1 Dedupe storage service A B C D E F G H O P FFP2 Symantec Research Labs 9
10 Challenges Addressed Original Data dedupe Duplicates Removed Have I seen segment X before? (and if yes, where is it stored?) Index: maps SFPs to disk location Segment indexing challenges System capacity bound by indexing capacity Increasing the segment size not a good solution System performance bound by index speed Reference management challenges Need to keep track of who is using what Capacity Segment size Item Scale Remarks Capacity bound by the ability to track references and manage/reclaim segments Reference update performance may also become a bottleneck Client and network performance challenges File 1 File 2 Symantec Research Labs 10 File 3 C = 1000 TB S = 4 KB Num of segments N = 250*10 9 N = C/S Metadata per segment E = 22 B Total segment metadata I = 5500 GB I = N*E Disk speed Z = 400 MB/sec Lookup speed goal 100 Kops/sec Z/S
11 Performance Goals Single-node scalability goals 100+ billion objects per node, with high throughput Single-node performance goals Near-raw-disk throughput for backup, restore and deletion (i.e., space reclamation) Reasonable deduplication efficiency Client-server interface capable of delivering desired throughput Assumptions: Opportunity for improving duplicate detection is becoming scarce Aiming for perfect duplicate detection may limit scalability Willing to trade some deduplication efficiency for high scalability Symantec Research Labs 11
12 Outline Architecture Overview Progressive Sampled Indexing Grouped Mark-and-Sweep Evaluation 12
13 Architecture Overview Design Client Server architecture Client component may reside on server Client Initiates backup creation/restore Performs client I/O Performs segmentation, fingerprint calculation Issues fingerprint lookups to server initiates data transfers only for new segments Server File Manager Hierarchical list of backup and file metadata Central concept: backup represents a list of files Organizes backups into (roughly equal-size) backup groups Server Segment Manager Manages storage units, called containers, and implements storage logic Hosts the fingerprint index responsible for duplicate detection Performs reference management operations Symantec Research Labs 13
14 Architecture Overview Main Concepts Containers represented by unique container ID (CID) Container contents: raw data segments + catalogue of segment SFPs File represented by a list of <SFP, CID> pairs Identified by its file fingerprint FFP = hash(sfp1, SFP2,, SFP4) FFP File = ( < SFP1, CID1 > < SFP2, CID1 > > < SFP3, CID2 > < SFP4, CID3 > ) Notice that file segment location stored inline Data segments directly locatable from file metadata No need for location services by the index e.g., at restore time Symantec Research Labs 14
15 Sampled Indexing Directly locatable objects + relaxed dedupe goals No requirement for a complete index freedom to do sampling Sampled Indexing: keep 1 out of T SFPs E.g., modulo T sampling Sampling rate R ( fullness level) = 1/T = = function (RAM, storage capacity, desired segment size) Symantec Research Labs 15
16 Progressive Sampled Indexing Estimate of sampling rate R = (M/E) / (C/S) M = memory GBs, E = bytes/entry M/E = total index entries C = storage TBs, S = segment size KBs C/S = total segments Example: 32 bytes/entry, 4KB segments and 32GB of RAM No sampling (R=1) 4 TB storage R ~= 1/100 i.e., keep 1 out of 100 SFPs 400 TB Progressive sampling: used storage VS available storage Sampling rate = function(used storage, available RAM) Start with no sampling (R=1) Progressively decrease R, down to R min = (M/E) / (C/S) Symantec Research Labs 16
17 Fingerprint Caching Straight sampling poor deduplication efficiency Only 1 out of T segments deduplicatable Solution: take advantage of spatial locality Index hit <SFP, CID> use CID to pre-fetch SFPs in container catalogue Part of index memory: SFP cache Lower bound for sampling rate: at least one sampled entry per container Symantec Research Labs 17
18 Reference Management Challenges and Goals Problem: Is it safe to delete a segment? Is anyone using it? Challenge: do it correctly and efficiently Mistakes can not be tolerated e.g., after a crash Potentially used in a distributed environment Performance requirements: > 100 Kops/sec Reference counting: efficient but hard to guarantee correctness Repeated and/or lost updates lead to irrecoverable errors Reference lists: repeated updates solved, lost updates still a problem Very inefficient (list maintenance, serialization to disk, etc.) Mark-and-Sweep (aka Garbage Collection) Workload proportional to system capacity E.g., 400 TB system, 4 KB segments, 22 byte SFP ~2.2 TB read from disk during mark phase x10 dedupe factor ~ 22 TB File 3 File 1 File 2 Symantec Research Labs 18
19 Grouped Mark-and-Sweep (GMS) Mark: divide and save Track changes to backup groups Only re-mark changed groups Mark results saved and reused Sweep: track affected containers Only sweep containers that may have deletions Result: workload = function(amount of change) Instead of system capacity Some files deleted Backup Group 1 Backup 1 Backup 2 No changes Backup Group 2 Backup 3 Backup 4 Some files added Backup Group 3 Backup 5 Mark results for Mark Containers Group 3 results for Group to sweep 2 Mark results for Group 1 G2 mark G1 mark G3 mark G1 mark G3 mark G2 mark G2 mark G2 mark Container 1 Container 2 Container 3 Container 4 Container 5 Symantec Research Labs 19
20 Deduplication Server Evaluation Evaluation metrics: single-node throughput, scalability and dedupe efficiency Implemenation in C++, using Pthreads Portable (Linux and Windows) Index implementation: pointer-free, three cache eviction policies 8-core server, 32 GB RAM, 24 TB disk array, 128 GB PCI-X SSD, 64-bit Linux Baseline disk array throughput: ~1,000 MB/sec (both read/write) Two types of data-sets: Synthetic data-set: multiple 3 GB files 100% unique segments Virtual machine image files: 4 versions of a golden image Evaluation configuration: R = 1/101, 25 GB index 200 TB per 10 GB of RAM (500 TB in our case) Symantec Research Labs 20
21 Backup Throughput: Synthetic data Multiple backup streams of synthetic data Concurrency levels: 1, 16, 32, 64, 128, 256 backup streams Symantec Research Labs 21
22 Backup Scalability Populate system at 95% capacity (480 out of 500 TB) All containers created, container catalogues, metadata stored, but no raw data Repeated throughput tests Indexing overhead Symantec Research Labs 22
23 Grouped Mark-and-Sweep Throughput and Scalability Reference update performance is critical happens daily! Tests: backup add and delete (reference addition and deletion) When system is empty and when near-full slope slope = = update update throughput ~= ~= GB/sec GB/sec Symantec Research Labs 23
24 Deduplication Efficiency Synthetic data-set: multiple backups, no less than 97% VM images a very common and important dedupe workload Data-set: VM0: Golden image, corporate WinXP SP2 + some utilites VM1: VM0 + all Microsoft Service Packs, Patches etc. VM2: VM1 + large anti-virus suite VM3: VM2 + more utilities (pdf readers, compression tools etc.) Calculated ground truth for all image segments (unique/duplicates) VM Unique Segments Globally Unique Ideal Disk Usage (MB) Actual Disk Usage (MB) Success Rate VM0 518, ,326 2,123 2,211 96% VM1 733, ,522 3,775 3,938 96% VM2 904,579 1,189,230 4,871 5,085 96% VM3 1,145,029 1,616,585 6,621 6,860 97% Symantec Research Labs 24
25 Conclusions We have built and tested a complete deduplication system Scale: hundreds of TBs per node Backup throughput: ~1,000 MB/sec (unique data), ~6,000 MB/sec (duplicates) Reference management throughput: more than 3 GB/sec Deduplication efficiency: ~ 97% of duplicates detected We introduced new mechanisms for scalability & performance Sampled (SSD) indexing, grouped mark-and-sweep, pipelined client architecture Our results demonstrate that high single-node deduplication server performance is possible Symantec Research Labs 25
26 Status Most of the technologies are in use in the product. Make impact! Symantec is increasing R&D investment. We are hiring! Fellowship Symantec Research Labs 26
27 Thank you! Questions? Copyright 2010 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice. Symantec Research Labs 27
Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations
Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations Technical Product Management Team Endpoint Security Copyright 2007 All Rights Reserved Revision 6 Introduction This
Technical White Paper for the Oceanspace VTL6000
Document No. Technical White Paper for the Oceanspace VTL6000 Issue V2.1 Date 2010-05-18 Huawei Symantec Technologies Co., Ltd. Copyright Huawei Symantec Technologies Co., Ltd. 2010. All rights reserved.
Understanding EMC Avamar with EMC Data Protection Advisor
Understanding EMC Avamar with EMC Data Protection Advisor Applied Technology Abstract EMC Data Protection Advisor provides a comprehensive set of features to reduce the complexity of managing data protection
Top Ten Questions. to Ask Your Primary Storage Provider About Their Data Efficiency. May 2014. Copyright 2014 Permabit Technology Corporation
Top Ten Questions to Ask Your Primary Storage Provider About Their Data Efficiency May 2014 Copyright 2014 Permabit Technology Corporation Introduction The value of data efficiency technologies, namely
Turnkey Deduplication Solution for the Enterprise
Symantec NetBackup 5000 Appliance Turnkey Deduplication Solution for the Enterprise Mayur Dewaikar Sr. Product Manager, Information Management Group White Paper: A Deduplication Appliance Solution for
A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP
A SCALABLE DEDUPLICATION AND GARBAGE COLLECTION ENGINE FOR INCREMENTAL BACKUP Dilip N Simha (Stony Brook University, NY & ITRI, Taiwan) Maohua Lu (IBM Almaden Research Labs, CA) Tzi-cker Chiueh (Stony
Understanding EMC Avamar with EMC Data Protection Advisor
Understanding EMC Avamar with EMC Data Protection Advisor Applied Technology Abstract EMC Data Protection Advisor provides a comprehensive set of features that reduce the complexity of managing data protection
Symantec NetBackup deduplication general deployment guidelines
TECHNICAL BRIEF: SYMANTEC NETBACKUP DEDUPLICATION GENERAL......... DEPLOYMENT............. GUIDELINES.................. Symantec NetBackup deduplication general deployment guidelines Who should read this
A Deduplication File System & Course Review
A Deduplication File System & Course Review Kai Li 12/13/12 Topics A Deduplication File System Review 12/13/12 2 Traditional Data Center Storage Hierarchy Clients Network Server SAN Storage Remote mirror
Symantec NetBackup Deduplication Guide
Symantec NetBackup Deduplication Guide UNIX, Windows, Linux Release 7.5 21220065 Symantec NetBackup Deduplication Guide The software described in this book is furnished under a license agreement and may
MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services
MAD2: A Scalable High-Throughput Exact Deduplication Approach for Network Backup Services Jiansheng Wei, Hong Jiang, Ke Zhou, Dan Feng School of Computer, Huazhong University of Science and Technology,
Data Backup and Archiving with Enterprise Storage Systems
Data Backup and Archiving with Enterprise Storage Systems Slavjan Ivanov 1, Igor Mishkovski 1 1 Faculty of Computer Science and Engineering Ss. Cyril and Methodius University Skopje, Macedonia [email protected],
Backup for branch offices and compartment backups. Måns Höiom & Rikard Lindkvist
Backup for branch offices and compartment backups Måns Höiom & Rikard Lindkvist Today s IT Challenges: Why Better Backup is needed? Accelerated Data Growth 62% Year Over Year This kind of data growth means
Data deduplication is more than just a BUZZ word
Data deduplication is more than just a BUZZ word Per Larsen Principal Systems Engineer Mr. Hansen DATA BUDGET RECOVERY & DATACENTER GROWTH PRESSURE DISCOVERY REVOLUTION More Storage Longer Backups Smaller
A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique
A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of
Enterprise-class Backup Performance with Dell DR6000 Date: May 2014 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Senior Lab Analyst
ESG Lab Review Enterprise-class Backup Performance with Dell DR6000 Date: May 2014 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Senior Lab Analyst Abstract: This ESG Lab review documents hands-on
Speeding Up Cloud/Server Applications Using Flash Memory
Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta Microsoft Research, Redmond, WA, USA Contains work that is joint with B. Debnath (Univ. of Minnesota) and J. Li (Microsoft Research,
Evaluation of Enterprise Data Protection using SEP Software
Test Validation Test Validation - SEP sesam Enterprise Backup Software Evaluation of Enterprise Data Protection using SEP Software Author:... Enabling you to make the best technology decisions Backup &
STORAGE. Buying Guide: TARGET DATA DEDUPLICATION BACKUP SYSTEMS. inside
Managing the information that drives the enterprise STORAGE Buying Guide: DEDUPLICATION inside What you need to know about target data deduplication Special factors to consider One key difference among
An Oracle White Paper July 2011. Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide
Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide An Oracle White Paper July 2011 1 Disclaimer The following is intended to outline our general product direction.
WHITE PAPER BRENT WELCH NOVEMBER
BACKUP WHITE PAPER BRENT WELCH NOVEMBER 2006 WHITE PAPER: BACKUP TABLE OF CONTENTS Backup Overview 3 Background on Backup Applications 3 Backup Illustration 4 Media Agents & Keeping Tape Drives Busy 5
Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team
Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture Dell Compellent Product Specialist Team THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL
Storage for VDI Environments
Storage for VDI Environments Planning and Considerations Evaluator Group Inc. Executive Editor: Russ Fellows Analysts: Randy Kerns, John Webster Storage Evaluation Guide Guide Series Series Version 2.0
Symantec NetBackup Deduplication Guide
Symantec NetBackup Deduplication Guide UNIX, Windows, Linux Release 7.1 21159706 Symantec NetBackup Deduplication Guide The software described in this book is furnished under a license agreement and may
Demystifying Deduplication for Backup with the Dell DR4000
Demystifying Deduplication for Backup with the Dell DR4000 This Dell Technical White Paper explains how deduplication with the DR4000 can help your organization save time, space, and money. John Bassett
Creating a Cloud Backup Service. Deon George
Creating a Cloud Backup Service Deon George Agenda TSM Cloud Service features Cloud Service Customer, providing a internal backup service Internal Backup Cloud Service Service Provider, providing a backup
Symantec NetBackup Appliances
Symantec NetBackup Appliances Simplifying Backup Operations Geoff Greenlaw Manager, Data Centre Appliances UK & Ireland January 2012 1 Simplifying Your Backups Reduce Costs Minimise Complexity Deliver
VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014
VMware SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014 VMware SAN Backup Using VMware vsphere Table of Contents Introduction.... 3 vsphere Architectural Overview... 4 SAN Backup
NetBackup Best Practice Using Tape Storage with Deduplicating Disk Storage
NetBackup Best Practice Using Tape Storage with Deduplicating Disk Storage This document looks at best practices around creating multiple copies of backups on a mixture of tape and deduplicating disk storage.
VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop
VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent
EMC DATA DOMAIN OPERATING SYSTEM
EMC DATA DOMAIN OPERATING SYSTEM Powering EMC Protection Storage ESSENTIALS High-Speed, Scalable Deduplication Up to 58.7 TB/hr performance Reduces requirements for backup storage by 10 to 30x and archive
EMC DATA DOMAIN OPERATING SYSTEM
ESSENTIALS HIGH-SPEED, SCALABLE DEDUPLICATION Up to 58.7 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability DATA INVULNERABILITY ARCHITECTURE Inline write/read
http://support.oracle.com/
Oracle Primavera Contract Management 14.0 Sizing Guide October 2012 Legal Notices Oracle Primavera Oracle Primavera Contract Management 14.0 Sizing Guide Copyright 1997, 2012, Oracle and/or its affiliates.
IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE
White Paper IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE Abstract This white paper focuses on recovery of an IBM Tivoli Storage Manager (TSM) server and explores
Contents Introduction... 5 Deployment Considerations... 9 Deployment Architectures... 11
Oracle Primavera Contract Management 14.1 Sizing Guide July 2014 Contents Introduction... 5 Contract Management Database Server... 5 Requirements of the Contract Management Web and Application Servers...
Benchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
Barracuda Backup Deduplication. White Paper
Barracuda Backup Deduplication White Paper Abstract Data protection technologies play a critical role in organizations of all sizes, but they present a number of challenges in optimizing their operation.
SOLUTION BRIEF. Resolving the VDI Storage Challenge
CLOUDBYTE ELASTISTOR QOS GUARANTEE MEETS USER REQUIREMENTS WHILE REDUCING TCO The use of VDI (Virtual Desktop Infrastructure) enables enterprises to become more agile and flexible, in tune with the needs
Cloud Optimize Your IT
Cloud Optimize Your IT Windows Server 2012 The information contained in this presentation relates to a pre-release product which may be substantially modified before it is commercially released. This pre-release
FAST 11. Yongseok Oh <[email protected]> University of Seoul. Mobile Embedded System Laboratory
CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of flash Memory based Solid State Drives FAST 11 Yongseok Oh University of Seoul Mobile Embedded System Laboratory
Moving Virtual Storage to the Cloud
Moving Virtual Storage to the Cloud White Paper Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage www.parallels.com Table of Contents Overview... 3 Understanding the Storage
EMC BACKUP-AS-A-SERVICE
Reference Architecture EMC BACKUP-AS-A-SERVICE EMC AVAMAR, EMC DATA PROTECTION ADVISOR, AND EMC HOMEBASE Deliver backup services for cloud and traditional hosted environments Reduce storage space and increase
WHITE PAPER. Permabit Albireo Data Optimization Software. Benefits of Albireo for Virtual Servers. January 2012. Permabit Technology Corporation
WHITE PAPER Permabit Albireo Data Optimization Software Benefits of Albireo for Virtual Servers January 2012 Permabit Technology Corporation Ten Canal Park Cambridge, MA 02141 USA Phone: 617.252.9600 FAX:
Direct virtual machine creation from backup with BMR
NETBACKUP 7.6 FEATURE BRIEFING DIRECT VIRTUAL MACHINE CREATION FROM BACKUP WITH BMR NetBackup 7.6 Feature Briefing Direct virtual machine creation from backup with BMR Version number: 1.0 Issue date: 5
8Gb Fibre Channel Adapter of Choice in Microsoft Hyper-V Environments
8Gb Fibre Channel Adapter of Choice in Microsoft Hyper-V Environments QLogic 8Gb Adapter Outperforms Emulex QLogic Offers Best Performance and Scalability in Hyper-V Environments Key Findings The QLogic
HP StoreOnce & Deduplication Solutions Zdenek Duchoň Pre-sales consultant
DISCOVER HP StoreOnce & Deduplication Solutions Zdenek Duchoň Pre-sales consultant HP StorageWorks Data Protection Solutions HP has it covered Near continuous data protection Disk Mirroring Advanced Backup
Symantec NetBackup Deduplication Guide
Symantec NetBackup Deduplication Guide UNIX, Windows, Linux Release 7.6 Symantec NetBackup Deduplication Guide The software described in this book is furnished under a license agreement and may be used
IOmark- VDI. Nimbus Data Gemini Test Report: VDI- 130906- a Test Report Date: 6, September 2013. www.iomark.org
IOmark- VDI Nimbus Data Gemini Test Report: VDI- 130906- a Test Copyright 2010-2013 Evaluator Group, Inc. All rights reserved. IOmark- VDI, IOmark- VDI, VDI- IOmark, and IOmark are trademarks of Evaluator
Riverbed Whitewater/Amazon Glacier ROI for Backup and Archiving
Riverbed Whitewater/Amazon Glacier ROI for Backup and Archiving November, 2013 Saqib Jang Abstract This white paper demonstrates how to increase profitability by reducing the operating costs of backup
EMC DATA DOMAIN PRODUCT OvERvIEW
EMC DATA DOMAIN PRODUCT OvERvIEW Deduplication storage for next-generation backup and archive Essentials Scalable Deduplication Fast, inline deduplication Provides up to 65 PBs of logical storage for long-term
09'Linux Plumbers Conference
09'Linux Plumbers Conference Data de duplication Mingming Cao IBM Linux Technology Center [email protected] 2009 09 25 Current storage challenges Our world is facing data explosion. Data is growing in a amazing
An Oracle White Paper November 2010. Oracle Real Application Clusters One Node: The Always On Single-Instance Database
An Oracle White Paper November 2010 Oracle Real Application Clusters One Node: The Always On Single-Instance Database Executive Summary... 1 Oracle Real Application Clusters One Node Overview... 1 Always
Milestone Solution Partner IT Infrastructure MTP Certification Report Scality RING Software-Defined Storage 11-16-2015
Milestone Solution Partner IT Infrastructure MTP Certification Report Scality RING Software-Defined Storage 11-16-2015 Table of Contents Introduction... 4 Certified Products... 4 Key Findings... 5 Solution
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide for Windows Release 7.5 Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide The software described in this
Deploying De-Duplication on Ext4 File System
Deploying De-Duplication on Ext4 File System Usha A. Joglekar 1, Bhushan M. Jagtap 2, Koninika B. Patil 3, 1. Asst. Prof., 2, 3 Students Department of Computer Engineering Smt. Kashibai Navale College
TEST REPORT SUMMARY MAY 2010 Symantec Backup Exec 2010: Source deduplication advantages in database server, file server, and mail server scenarios
TEST REPORT SUMMARY MAY 21 Symantec Backup Exec 21: Source deduplication advantages in database server, file server, and mail server scenarios Executive summary Symantec commissioned Principled Technologies
Avoiding the Disk Bottleneck in the Data Domain Deduplication File System
Avoiding the Disk Bottleneck in the Data Domain Deduplication File System Benjamin Zhu Data Domain, Inc. Kai Li Data Domain, Inc. and Princeton University Hugo Patterson Data Domain, Inc. Abstract Disk-based
Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1
Performance Study Performance Characteristics of and RDM VMware ESX Server 3.0.1 VMware ESX Server offers three choices for managing disk access in a virtual machine VMware Virtual Machine File System
Practical Performance Understanding the Performance of Your Application
Neil Masson IBM Java Service Technical Lead 25 th September 2012 Practical Performance Understanding the Performance of Your Application 1 WebSphere User Group: Practical Performance Understand the Performance
Backup Exec 2012 Agents and Options
Backup Exec 2012 Agents and Options Markku A Suistola Principal Presales Consultant Backup Exec 2012 Backup Exec 2012 Architecture Overview Understanding the technology workflow 2 Backup Exec 2012 Core
An Oracle White Paper September 2011. Oracle Exadata Database Machine - Backup & Recovery Sizing: Tape Backups
An Oracle White Paper September 2011 Oracle Exadata Database Machine - Backup & Recovery Sizing: Tape Backups Table of Contents Introduction... 3 Tape Backup Infrastructure Components... 4 Requirements...
WHY DO I NEED FALCONSTOR OPTIMIZED BACKUP & DEDUPLICATION?
WHAT IS FALCONSTOR? FalconStor Optimized Backup and Deduplication is the industry s market-leading virtual tape and LAN-based deduplication solution, unmatched in performance and scalability. With virtual
Best Practices for Deploying Citrix XenDesktop on NexentaStor Open Storage
Best Practices for Deploying Citrix XenDesktop on NexentaStor Open Storage White Paper July, 2011 Deploying Citrix XenDesktop on NexentaStor Open Storage Table of Contents The Challenges of VDI Storage
IMPLEMENTATION OF SOURCE DEDUPLICATION FOR CLOUD BACKUP SERVICES BY EXPLOITING APPLICATION AWARENESS
IMPLEMENTATION OF SOURCE DEDUPLICATION FOR CLOUD BACKUP SERVICES BY EXPLOITING APPLICATION AWARENESS Nehal Markandeya 1, Sandip Khillare 2, Rekha Bagate 3, Sayali Badave 4 Vaishali Barkade 5 12 3 4 5 (Department
Reducing Replication Bandwidth for Distributed Document Databases
Reducing Replication Bandwidth for Distributed Document Databases Lianghong Xu 1, Andy Pavlo 1, Sudipta Sengupta 2 Jin Li 2, Greg Ganger 1 Carnegie Mellon University 1, Microsoft Research 2 #1 You can
Backup and Recovery. Backup and Recovery. Introduction. DeltaV Product Data Sheet. Best-in-class offering. Easy-to-use Backup and Recovery solution
April 2013 Page 1 Protect your plant data with the solution. Best-in-class offering Easy-to-use solution Data protection and disaster recovery in a single solution Scalable architecture and functionality
Veeam Cloud Connect. Version 8.0. Administrator Guide
Veeam Cloud Connect Version 8.0 Administrator Guide April, 2015 2015 Veeam Software. All rights reserved. All trademarks are the property of their respective owners. No part of this publication may be
Performance and scalability of a large OLTP workload
Performance and scalability of a large OLTP workload ii Performance and scalability of a large OLTP workload Contents Performance and scalability of a large OLTP workload with DB2 9 for System z on Linux..............
Protect Data... in the Cloud
QUASICOM Private Cloud Backups with ExaGrid Deduplication Disk Arrays Martin Lui Senior Solution Consultant Quasicom Systems Limited Protect Data...... in the Cloud 1 Mobile Computing Users work with their
Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage
Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage Wei Zhang, Tao Yang, Gautham Narayanasamy, and Hong Tang University of California at Santa Barbara, Alibaba Inc. Abstract In a virtualized
VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS
VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS Successfully configure all solution components Use VMS at the required bandwidth for NAS storage Meet the bandwidth demands of a 2,200
Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III
White Paper Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III Performance of Microsoft SQL Server 2008 BI and D/W Solutions on Dell PowerEdge
Hardware Configuration Guide
Hardware Configuration Guide Contents Contents... 1 Annotation... 1 Factors to consider... 2 Machine Count... 2 Data Size... 2 Data Size Total... 2 Daily Backup Data Size... 2 Unique Data Percentage...
Windows Server 2012 2,500-user pooled VDI deployment guide
Windows Server 2012 2,500-user pooled VDI deployment guide Microsoft Corporation Published: August 2013 Abstract Microsoft Virtual Desktop Infrastructure (VDI) is a centralized desktop delivery solution
SAN Conceptual and Design Basics
TECHNICAL NOTE VMware Infrastructure 3 SAN Conceptual and Design Basics VMware ESX Server can be used in conjunction with a SAN (storage area network), a specialized high speed network that connects computer
Redefining Backup for VMware Environment. Copyright 2009 EMC Corporation. All rights reserved.
Redefining Backup for VMware Environment 1 Agenda VMware infrastructure backup and recovery challenges Introduction to EMC Avamar Avamar solutions for VMware infrastructure Key takeaways Copyright 2009
Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com
Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...
Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering
Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays Red Hat Performance Engineering Version 1.0 August 2013 1801 Varsity Drive Raleigh NC
Google File System. Web and scalability
Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might
IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report
COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters
COSC 6374 Parallel Computation Parallel I/O (I) I/O basics Spring 2008 Concept of a clusters Processor 1 local disks Compute node message passing network administrative network Memory Processor 2 Network
Data Deduplication and Tivoli Storage Manager
Data Deduplication and Tivoli Storage Manager Dave Cannon Tivoli Storage Manager rchitect Oxford University TSM Symposium September 2007 Disclaimer This presentation describes potential future enhancements
How To Make A Backup System More Efficient
Identifying the Hidden Risk of Data De-duplication: How the HYDRAstor Solution Proactively Solves the Problem October, 2006 Introduction Data de-duplication has recently gained significant industry attention,
Backup Exec 15: Deduplication Option
TECHNICAL BRIEF: BACKUP EXEC 15: DEDUPLICATION OPTION........................................ Backup Exec 15: Deduplication Option Who should read this paper Technical White Papers are designed to introduce
Benchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
WHITE PAPER. Dedupe-Centric Storage. Hugo Patterson, Chief Architect, Data Domain. Storage. Deduplication. September 2007
WHITE PAPER Dedupe-Centric Storage Hugo Patterson, Chief Architect, Data Domain Deduplication Storage September 2007 w w w. d a t a d o m a i n. c o m - 2 0 0 7 1 DATA DOMAIN I Contents INTRODUCTION................................
WHITE PAPER: customize. Best Practice for NDMP Backup Veritas NetBackup. Paul Cummings. January 2009. Confidence in a connected world.
WHITE PAPER: customize DATA PROTECTION Confidence in a connected world. Best Practice for NDMP Backup Veritas NetBackup Paul Cummings January 2009 Best Practice for NDMP Backup Veritas NetBackup Contents
Oracle Hyperion Financial Management Virtualization Whitepaper
Oracle Hyperion Financial Management Virtualization Whitepaper Oracle Hyperion Financial Management Virtualization Whitepaper TABLE OF CONTENTS Overview... 3 Benefits... 4 HFM Virtualization testing...
Backup Exec 2014: Deduplication Option
TECHNICAL BRIEF: BACKUP EXEC 2014: DEDUPLICATION OPTION........................................ Backup Exec 2014: Deduplication Option Who should read this paper Technical White Papers are designed to
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide
Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide for Windows Release 7.6 Symantec NetBackup for Microsoft SharePoint Server Administrator s Guide The software described in this
MANAGED SERVICE PROVIDERS SOLUTION BRIEF
MANAGED SERVICE PROVIDERS SOLUTION BRIEF The Assured Recovery Services Platform The data protection world has drastically changed in the past few years. Protection and recovery of data and systems has
Understanding Data Locality in VMware Virtual SAN
Understanding Data Locality in VMware Virtual SAN July 2014 Edition T E C H N I C A L M A R K E T I N G D O C U M E N T A T I O N Table of Contents Introduction... 2 Virtual SAN Design Goals... 3 Data
W H I T E P A P E R : T E C H N I C A L. Understanding and Configuring Symantec Endpoint Protection Group Update Providers
W H I T E P A P E R : T E C H N I C A L Understanding and Configuring Symantec Endpoint Protection Group Update Providers Martial Richard, Technical Field Enablement Manager Table of Contents Content Introduction...
WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression
WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression Philip Shilane, Mark Huang, Grant Wallace, and Windsor Hsu Backup Recovery Systems Division EMC Corporation Abstract
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
Symantec NetBackup 5000 Appliance Series
A turnkey, end-to-end, global deduplication solution for the enterprise. Data Sheet: Data Protection Overview Symantec NetBackup 5000 series offers your organization a content aware, end-to-end, and global
Protect Microsoft Exchange databases, achieve long-term data retention
Technical white paper Protect Microsoft Exchange databases, achieve long-term data retention HP StoreOnce Backup systems, HP StoreOnce Catalyst, and Symantec NetBackup OpenStorage Table of contents Introduction...
Quantum StorNext. Product Brief: Distributed LAN Client
Quantum StorNext Product Brief: Distributed LAN Client NOTICE This product brief may contain proprietary information protected by copyright. Information in this product brief is subject to change without
Symantec Desktop and Laptop Option 7.6
Automated protection for desktops and laptops Data Sheet: Backup and Disaster Recovery Overview With the majority of business-critical information residing outside the data centers or on off corporate
