EMC Backup/Restore and Data Deduplication



Transcription:

Kompetera SolutionsDay, Brøndby Stadion, September 2012
EMC Backup/Restore and Data Deduplication
Michael Hemmingsen, Technology Consultant, EMC Backup Recovery Systems (BRS) Division
Email: michael.hemmingsen@emc.com 1

Our Division
Approximately 3,000 employees; 10 R&D locations
Market leadership: #1 in deduplication; #1 in purpose-built backup appliances; #1 in combined software and storage
Rapid market adoption: Data Domain and Avamar exited 2011 at a revenue run rate over $2 billion
Global sales, support, and services; approximately 6,000 channel partners 2

EMC and BRS at a Glance
EMC is a global leader in enabling businesses and service providers to transform their operations and deliver IT as a service.
Revenues (2010): $17B; projected revenue (2011): $19.8B; Fortune 500 rank: 166; employees: 50,000; countries with EMC operations: 83; R&D investment (2010): $2B; founded: 1979
EMC Backup Recovery Systems (BRS) is a division of EMC focused on backup and recovery solutions. Division HQ: Santa Clara, CA; global sales, support, and services; 10 R&D locations, 2,000 employees, 6,000 channel partners 3

Our Journey to the Cloud/Big Data
EMC cumulative 8-year technology investment (R&D): $10.5B
[Timeline graphic, 2003 and before through 2011: Enterprise Storage, Virtualization, Journey to the Cloud, Cloud & Big Data] 4

IT Trends Impacting Backup and Recovery
Data deluge: 0.8 zettabytes in 2009, 35.2 zettabytes in 2020
Budget dilemma
Infrastructure shift
Transformation 5

Is Growing Infrastructure the Answer??? 18 Cabinets of IBM Tape 6

Deduplication Is Accelerating the Transition More efficient Reduced storage Less bandwidth 7

Deduplication Is Accelerating the Transition
Deduplication: 10 to 30 times less data stored versus fulls + incrementals with typical retention policies
[Chart: data stored vs. weeks in use (1 to 20), deduplication storage compared with traditional storage] 8

EMC Continues to Evolve Backup
Purpose-Built Backup Appliance (PBBA): software and storage designed to work together in an integrated architecture; unlocks the power of disk
Performance: up to 50 percent faster than traditional solutions
Simplicity: one user interface, common policy management
Predictability: proven best practices, more than 20,000 installations 9

EMC Data Domain: Leading the Deduplication Market
In-use rating three times that of the nearest competitor
Reduce: backup windows by up to 90 percent; data stored by up to 30 times*; replication bandwidth by up to 99 percent
*Versus full and incremental backups with typical retention
Source: Wave 15 Storage Study Q2 2011, published 5/16/11, large-enterprise sample, n=31, TheInfoPro (www.theinfopro.com) 10

Disk-Based Backup with Data Deduplication Replaces Tape
EMC Purpose-Built Backup Appliances: Data Domain, Avamar 11

Purpose-Built Backup Appliances
Fastest growing segment in the backup and recovery market
2010 total market: $1.69B; EMC share: 64.2%
[Pie chart of vendor shares: EMC, IBM, HP, Oracle, Quantum, Sepaton, FalconStor, Dell, Others]
Source: Worldwide Purpose-Built Backup Appliance 2011-2015 Forecast and 2010 Vendor Shares, IDC Doc #228091, May 2011 12

Traditional Backup Infrastructure 13

Next Generation Data Protection 14

Next Generation Data Protection 15

Backup Operations Hassle 20% slow or offline clients 20% post-processes failures 30% media-related failures 10% performance and job-queues 10% network interruptions 10% plugin and other Expand disk pools Inject additional media Library inventory, media-to-pool Help-desk tickets escalated Prepare test & dev environments Prep tape media for pickup Retrieve reports from consoles Retrieve reports scripts Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success/Failures Long-running Jobs Media Errors Restart failed jobs 16

Backup Operations Hassle 20% slow or offline clients 20% post-processes failures 30% media-related failures 10% performance and job-queues 10% network interruptions 10% plugin and other Expand disk pools Inject additional media Library inventory, media-to-pool Help-desk tickets escalated Prepare test & dev environments Prep tape media for pickup Retrieve reports from consoles Retrieve reports scripts Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success No tape End-to-end phone home support Dedupe speeds job completion 17

Backup Operations Hassle 20% slow or offline clients 20% post-processes failures 30% media-related failures 10% performance and job-queues 10% network interruptions 10% plugin and other Expand disk pools Inject additional media Library inventory, media-to-pool Help-desk tickets escalated Prepare test & dev environments No tape Centralized mgmt & reporting Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success No tape End-to-end phone home support Dedupe speeds job completion 18

Backup Operations Hassle 20% slow or offline clients 20% relieved -> No post processes 30% relieved -> No tape 10% relieved -> Performance & flexibility 10% network interruptions 10% plugin and other Expand disk pools Inject additional media Library inventory, media-to-pool Help-desk tickets escalated Prepare test & dev environments No tape Centralized mgmt & reporting Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success No tape End-to-end phone home support Dedupe speeds job completion 19

Backup Operations Hassle 20% slow or offline clients 20% relieved -> No post processes 30% relieved -> No tape 10% relieved -> Performance & flexibility 10% network interruptions 10% plugin and other Single pool of storage Help-desk tickets escalated Prepare test & dev environments No tape Centralized mgmt & reporting Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success No tape End-to-end phone home support Dedupe speeds job completion 20

Backup Operations Hassle 20% slow or offline clients 20% relieved -> No post processes 30% relieved -> No tape 10% relieved -> Performance & flexibility 10% network interruptions 10% plugin and other Single pool of storage Yes, spend more time doing this Application admins create test & dev No tape Centralized mgmt & reporting Client software/configuration Backup jobs, scheduling adjustments Media servers changes Media purchases SAN/Network changes Success No tape End-to-end phone home support Dedupe speeds job completion 21

Backup Operations Hassle 20% slow or offline clients 20% relieved -> No post processes 30% relieved -> No tape 10% relieved -> Performance & flexibility 10% network interruptions 10% plugin and other Single pool of storage Yes, spend more time doing this Application admins create test & dev No tape Centralized mgmt & reporting Fewer changes, less risk, better SLAs Success No tape End-to-end phone home support Dedupe speeds job completion 22

Failure to Restore Is Costly
Operational restore: OK-ish.
Disaster recovery situation: how long does it take to retrieve and ready tape clones?
Backup is the lifeline when other systems have failed!
[Image: backup administrator's desktop] 23

EMC DATA DOMAIN OVERVIEW 24

Deduplication Fundamentals 25

Data Deduplication: Technology Overview
Store more backups in a smaller footprint
Segments per backup:
  Friday full backup:         A B C D A E F G
  Monday incremental:         A B H
  Tuesday incremental:        C B I
  Wednesday incremental:      E G J
  Thursday incremental:       A C K
  Second Friday full backup:  B C D E F L G H
  Unique segments stored:     A B C D E F G H I J K L
Backup                  Logical data   Estimated reduction   Physical
Friday full             1 TB           2 to 4x               250 GB
Monday incremental      100 GB         7 to 10x              10 GB
Tuesday incremental     100 GB         7 to 10x              10 GB
Wednesday incremental   100 GB         7 to 10x              10 GB
Thursday incremental    100 GB         7 to 10x              10 GB
Second Friday full      1 TB           50 to 60x             18 GB
TOTAL                   2.4 TB         7.8x                  308 GB
26
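The totals on this slide can be checked with a few lines of arithmetic. The sketch below uses the per-backup figures from the slide and assumes decimal units (1 TB = 1,000 GB), which the slide does not state; the function name `totals` is illustrative.

```python
# Logical and physical sizes in GB, taken from the slide's per-backup figures.
# Assumption (not stated on the slide): decimal units, 1 TB = 1000 GB.
backups = [
    ("Friday full", 1000, 250),
    ("Monday incremental", 100, 10),
    ("Tuesday incremental", 100, 10),
    ("Wednesday incremental", 100, 10),
    ("Thursday incremental", 100, 10),
    ("Second Friday full", 1000, 18),
]


def totals(rows):
    """Return (logical_gb, physical_gb, reduction_ratio) for a list of backups."""
    logical = sum(logical_gb for _, logical_gb, _ in rows)
    physical = sum(physical_gb for _, _, physical_gb in rows)
    return logical, physical, logical / physical


logical_gb, physical_gb, ratio = totals(backups)
print(f"{logical_gb / 1000:.1f} TB logical, {physical_gb} GB physical, {ratio:.1f}x")
```

The overall reduction is the ratio of cumulative logical data to cumulative physical data, so it is dominated by the two full backups rather than being an average of the per-backup ratios.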

Retain: Store More for Longer with Less
Over one year of retention in 3U of Data Domain deduplication storage
Backup       Date       Cumulative logical data   Estimated reduction   Physical
First full              1 TB                      4x                    250 GB
Week 1       April 7    2.4 TB                    8x                    308 GB
Week 2       April 14   3.8 TB                    10x                   366 GB
Week 3       April 21   5.2 TB                    12x                   424 GB
Month 1      April 28   6.6 TB                    14x                   482 GB
Month 2      May 31     12.2 TB                   17x                   714 GB
Month 3      June 30    17.8 TB                   19x                   946 GB
Month 4      July 31    23.4 TB                   20x                   1,178 GB
TOTAL                   23.4 TB                   20x                   1,178 GB
27
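The retention figures on this slide follow a simple linear pattern: each weekly cycle adds 1.4 TB of logical data but only 58 GB of physical data, and each monthly row corresponds to four such cycles. The sketch below reproduces that pattern; the weekly increments are inferred from differences between the slide's rows, and the function name `cumulative` and the decimal units are assumptions.

```python
def cumulative(weeks, first_logical_gb=1000, first_physical_gb=250,
               weekly_logical_gb=1400, weekly_physical_gb=58):
    """Cumulative (logical_gb, physical_gb, reduction) after N weekly cycles.

    The defaults are inferred from the slide: a 1 TB first full stored in
    250 GB, then 1.4 TB logical and 58 GB physical added per week.
    """
    logical = first_logical_gb + weeks * weekly_logical_gb
    physical = first_physical_gb + weeks * weekly_physical_gb
    return logical, physical, logical / physical


for weeks in (0, 1, 4, 8, 12, 16):
    logical, physical, ratio = cumulative(weeks)
    print(f"week {weeks:2d}: {logical / 1000:5.1f} TB logical, "
          f"{physical:5d} GB physical, {ratio:.1f}x")
```

Because the physical increment per cycle is small relative to the logical increment, the cumulative reduction ratio keeps climbing toward the per-cycle ratio the longer backups are retained.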

Variable Length: Not Fixed Segment Size
Intelligent, self-tuning algorithms automatically adjust to data changes
  Call me Ishmael. Some years ago - never mind how long precisely - having little
Fixed size wouldn't handle inserts/deletes well:
  Call me Ish. Some years ago - never mind how long precisely - having little
  Call me Izzy. Some years ago - never mind how long precisely - having little
Variably sized segments maximize redundancy:
  Call me Izzy. Some years ago - never mind how long precisely - having little
Advantages: application independent; protocol independent; file pathname independent; block address independent
Key (color coding on the original slide): consumes new capacity; deduplicated 28
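The difference between fixed and variable segmentation can be illustrated with a small experiment. The sketch below is not Data Domain's algorithm, which the slides do not describe; it is a generic content-defined chunker that cuts a segment wherever a hash of the trailing bytes matches a pattern. The window size, mask, and use of CRC-32 are illustrative assumptions, and production systems typically use a true rolling hash plus minimum and maximum segment sizes.

```python
import random
import zlib

WINDOW = 16   # trailing bytes that decide a boundary (illustrative value)
MASK = 0x3F   # cut when the low 6 hash bits are zero, about 64 bytes on average


def fixed_chunks(data, size=64):
    """Split at fixed offsets; any insert shifts every later segment."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data):
    """Split where the content itself says so, independent of byte offset."""
    chunks, start = [], 0
    for i in range(len(data)):
        window = data[max(0, i - WINDOW + 1):i + 1]
        if zlib.crc32(window) & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_fraction(chunker, old, new):
    """Fraction of the old version's distinct segments found again in the new one."""
    old_set = set(chunker(old))
    return len(old_set & set(chunker(new))) / len(old_set)


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
edited = b"Izzy" + original  # a small insert at the front shifts every byte

print("fixed   :", shared_fraction(fixed_chunks, original, edited))
print("variable:", shared_fraction(variable_chunks, original, edited))
```

With fixed offsets, the four inserted bytes misalign every later segment, so almost nothing deduplicates. With content-defined boundaries, segmentation resynchronizes shortly after the insert and the remaining segments match the earlier version.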

Data Domain Basics 29

Data Domain Basics
Easy integration with existing environment
Control tier: backup and archive applications (EMC, Symantec, CommVault, IBM, HP, Veeam, Quest; NEW: Oracle RMAN)
Target tier: DD890 appliance, reached via CIFS, NFS, NDMP, or DD Boost over Ethernet, or via Virtual Tape Library (VTL) over Fibre Channel
Disaster recovery tier: replication to a second DD890 appliance
Hardware: 2U; 2 to 14 ports; 10 and 1 Gigabit Ethernet; 8 Gb/s Fibre Channel; RAID 6; up to 285 TB usable capacity with shelves; 2 TB or 1 TB 7.2K rpm SATA HDD in shelf; file system NVRAM; N+1 fans and redundant, hot-plug power supplies 30

Methodology: Inline vs. Post-Process Deduplication
INLINE (deduplication before storing): other activities unimpeded; predictable; simpler
POST-PROCESS (deduplication after storing): 3x disk accesses to shared store; the more processes, the more resource contention:
  Copy to tape: too slow to stream tape
  Recovery: service level agreement predictability
  Replication: poor time-to-disaster-recovery
  Deduplication: if interleaved with backup or restore
  More administration to fight these issues 31

Stream Informed Segment Layout
99% of duplicate segments identified in RAM
All related segments are stored in close proximity on disk for optimal reads
Process:
  A new write (i.e., a segment) comes in.
  A function generates offsets, which are compared against the Summary Vector (SV).
  The role of the SV is to determine whether the segment has been seen before. It cannot confirm redundancy on its own, because it can return false positives.
  If the segment is new: write it to a container; the container is compressed when full and written to disk.
  If the SV test says the segment may not be unique: verify again with a fingerprint lookup in the cache, and (rarely) load the FPC from disk.
Key results: system capacity is very scalable; no RAM constraint on index; good performance with few spindles; very few disk accesses during write 32
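The write path described on this slide can be sketched as a Bloom-filter-style check followed by an exact fingerprint lookup. The code below is a minimal illustration under stated assumptions, not Data Domain's implementation: the class names, the SHA-1 fingerprint, the bit-array size, and the in-memory dict standing in for the fingerprint cache and on-disk index are all hypothetical.

```python
import hashlib


class SummaryVector:
    """Bit array in RAM: a 'no' is definite, a 'yes' may be a false positive."""

    def __init__(self, bits=1 << 16, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits // 8)

    def _offsets(self, fingerprint):
        # Derive several bit offsets from the fingerprint.
        for k in range(self.hashes):
            digest = hashlib.sha1(bytes([k]) + fingerprint).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, fingerprint):
        for off in self._offsets(fingerprint):
            self.array[off // 8] |= 1 << (off % 8)

    def may_contain(self, fingerprint):
        return all(self.array[off // 8] & (1 << (off % 8))
                   for off in self._offsets(fingerprint))


class SegmentStore:
    def __init__(self):
        self.summary = SummaryVector()
        self.index = {}         # stand-in for the fingerprint cache / on-disk index
        self.containers = []    # stand-in for containers written to disk
        self.index_lookups = 0  # how often the slower exact lookup was needed

    def write(self, segment):
        """Store a segment; return True if it consumed new capacity."""
        fp = hashlib.sha1(segment).digest()
        if self.summary.may_contain(fp):
            # Possibly a duplicate: confirm with an exact fingerprint lookup.
            self.index_lookups += 1
            if fp in self.index:
                return False
        # Definitely new (or a false positive ruled out): store it.
        self.summary.add(fp)
        self.index[fp] = len(self.containers)
        self.containers.append(segment)
        return True


store = SegmentStore()
for seg in (b"A", b"B", b"A", b"C", b"B"):
    print(seg, "new" if store.write(seg) else "duplicate")
```

The design point is that new segments are usually recognized from RAM alone, so the exact lookup is reached mainly for segments that really are duplicates.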

SISL Scaling Architecture
Data Domain answer: SISL (Stream-Informed Segment Layout) includes:
  Summary Vector in RAM says if a segment is new
  Segment Localities minimize seeks if the answer is on disk
Check uniqueness with the Summary Vector
Check the in-memory fingerprint cache
Key results
See: http://www.usenix.org/events/fast08/tech/full_papers/zhu/zhu.pdf 33

Performance: CPU-Centric vs. Spindle-Bound
[Chart: throughput (MB/s, axis up to 6,000) vs. number of disk spindles (50 to 200); series labels: Data Domain; most deduplication vendors (Fibre Channel, SATA)] 34

Segment Localities
[Diagram: DDFS log structure; each locality holds metadata (abcd, efgh, ijkl, ... stuv) alongside segment data (A B C D, E F G H, I J K L, ... S T U V)]
Localities: stream-informed storage units
  Neighboring unique segments stored together
  Fingerprints and segments stored together with other metadata
  One seek can retrieve hundreds into RAM
  Fast caching for fingerprint lookup
  Fast reads during recovery or copy to tape 35

Data Integrity: Data Invulnerability Architecture
End-to-end data verification: checksum; deduplication, write to disk; verify
Self-healing file system: cleaning, expired data, defrag, verify
Other: RAID 6, NVRAM, snapshots
[Diagram: generate checksum -> file system -> deduplication -> local compression -> RAID -> verify data]
End-to-end data verification: verify the file system metadata integrity; verify user data integrity; verify stripe integrity 36
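The "generate checksum, write, verify" sequence on this slide can be illustrated with a small read-after-write check. This is a generic sketch, not the Data Invulnerability Architecture itself, whose internals the slide does not give; the function names, the SHA-256 checksum, and the use of a plain file are assumptions.

```python
import hashlib
import os


def write_verified(path, data):
    """Write data, then read it back and confirm it matches the ingest checksum."""
    checksum = hashlib.sha256(data).hexdigest()  # generated before any processing
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to stable storage
    # Read back through the normal read path. Note: on a general-purpose OS
    # this may be served from the page cache rather than the physical disk.
    with open(path, "rb") as f:
        stored = f.read()
    if hashlib.sha256(stored).hexdigest() != checksum:
        raise OSError(f"verification failed after write: {path}")
    return checksum


def read_verified(path, checksum):
    """Return the stored data only if it still matches its checksum."""
    with open(path, "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != checksum:
        raise ValueError(f"checksum mismatch on read: {path}")
    return data
```

Verifying immediately after the write catches errors while the source data is still available to be resent; verifying again on every read catches corruption that develops later.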

Fault Avoidance and Containment
Data Domain log-structured file system architecture (localities)
New data never overwrites good data: previous backups are not at risk; protects against software bugs
Fewer complex data structures mean fewer bugs: no bitmaps and link counts to corrupt
NVRAM for fast, safe restart
Snapshot protection option
DD-RAID does no partial-stripe writes: no RAID vulnerability to power loss or NVRAM failure 37

Continuous Fault Detection and Healing
DD-RAID 6 protection against: two disk failures; disk read errors during reconstruction; operator pulling the wrong disk
Verifies data integrity and stripe coherency after writes
On-the-fly error detection and correction: all on-disk structures covered by strong checksums; data correctness verified on every disk read; data errors corrected automatically from data redundancy
Continuous scrub: finds and fixes errors before they become a problem 38

DD Boost Software
DD Boost distributes parts of the deduplication process to the backup server or application clients
Licensable software works across the Data Domain portfolio
Supports the majority of the backup software market: EMC Avamar and NetWorker; Symantec NetBackup and Backup Exec
Speeds backups by up to 50 percent
Process more backups with existing resources: 20 to 40% less overall impact on the backup server; 80 to 99% less LAN bandwidth
Enables Data Domain replication management from the backup application 39

Data Domain Portfolio 40

New Data Domain DD160 Appliance For small enterprise data centers and remote offices Protect up to 4 TB user data Back up 4 TB in less than four hours Supports all Data Domain software options DD160 41

Industry's Most Scalable Inline Deduplication Systems
Data Domain software options: DD Boost, DD Encryption, DD Extended Retention, DD Replicator, DD Retention Lock, DD Virtual Tape Library
Market segments: Small Enterprise/ROBO, Midsize Enterprise, Large Enterprise
Model   Speed (DD Boost)   Speed (other)   Logical capacity                    Usable capacity
DD160   1.1 TB/hr          667 GB/hr       40 to 195 TB                        Up to 3.98 TB
DD620   2.4 TB/hr          1.1 TB/hr       83 to 415 TB                        Up to 8.3 TB
DD640   3.4 TB/hr          2.3 TB/hr       0.32 to 1.6 PB                      Up to 32.2 TB
DD670   5.4 TB/hr          3.6 TB/hr       0.6 to 2.7 PB                       Up to 55.9 TB
DD860   9.8 TB/hr          5.1 TB/hr       1.4 to 7.1 PB (5.7 to 28.5 PB [1])  Up to 142 TB (up to 570 TB [1])
DD890   14.7 TB/hr         8.1 TB/hr       2.9 to 14.2 PB                      Up to 285 TB
DD990   31.0 TB/hr         15.0 TB/hr      5.7 to 28.5 PB (13 to 65 PB [1])    Up to 570 TB (up to 1.3 PB [1])
[1] With DD Extended Retention software option
42

EMC Backup Assessments
[Chart: daily backup by server]
What's in an assessment? Data collection; analysis; recommendations
Learn about: metrics on backup capacity and the backup jobs; details about the data change rate; reliability of your backup infrastructure; and more 43

Technology Deployment Services: Operational Assurance Services
Post-implementation service to enable operational self-sufficiency
Provides practical and thorough ongoing operational best-practices guidance
Allows you to fully realize backup and recovery capabilities with best-practice knowledge transfer, operational best practices, performance/tuning optimization guidance, process recommendations for improvement, and training recommendations
QuickStart version for environments that require an easy-to-order fixed scope of work and price
Custom version for more complex environments outside the fixed scope and price
Professional Services: Excellence in Enabling Customer Success 44

With Data Domain Deduplication Storage Systems, You Can
Retain longer: keep backups onsite longer with less disk for fast, reliable restores, and eliminate the use of tape for operational recovery
Replicate smarter: move only deduplicated data over existing networks (WAN) with up to 99% bandwidth efficiency for cost-effective disaster recovery
Recover reliably: continuous fault detection and self-healing ensure data recoverability to meet service level agreements 45

Thank you
Michael Hemmingsen, Technology Consultant, EMC Backup Recovery Systems (BRS) Division
Email: michael.hemmingsen@emc.com 46