EMC DATA DOMAIN OVERVIEW ATEA Tromsö 16 November 2010 Peter Karlsson BRS Channel Manager EMEA North 1
EMC Backup Recovery Systems Division. Division HQ: Santa Clara, CA; 10 R&D locations; 1,800 employees. Solutions: Data Domain, EDL, Avamar, NetWorker, DPA. Data protection storage systems: > 50,000 systems installed; > 40,000 customers; > 13,000 petabytes under protection. Global sales, support and services: ~ 5,000 channel partners. Total revenue ($B): $0.6 (2006), $0.8 (2007), $1.1 (2008), $1.5 (2009). 2
EMC Data Domain: Leadership and Innovation. Deduplication storage systems: more than 12,000 systems installed; more than 4,300 customers; more than 2,600 PB under Data Domain protection worldwide. A history of industry firsts (timeline, 2003 to 2010): First Deduplication NAS; First Deduplication Volume Replication; First Deduplication Virtual Tape Library; Largest Deduplication Array; First Deduplication Directory Replication; First Deduplication Nearline Storage; Fastest Backup Controller; Cascaded Replication; First Deduplication Encryption; First Distributed Processing. 3
Backup Redesign Is Hot. What are your top storage initiatives? (Bar chart: % of respondents, 0% to 50%, for Waves 9 to 13.) Initiatives: Tiered Storage Build Out, Consolidation, Technology Refresh, Backup Redesign, Virtualization Adoption, Improving Performance, Archiving, Improving Forecasting, Disaster Recovery, Expanding Replication, Data Migration, Thin Provisioning, New Data Center, Securing Storage, Green Storage. Source: The InfoPro Wave 13 (12/09). 4
What Is Causing Companies to Redesign Their Backup & Recovery Environments? Data growth: Digital Information Created and Replicated Worldwide, 5-fold growth in 4 years. (Chart: exabytes, 0 to 2,500, for 2008 to 2012. Source: IDC Digital Universe white paper, sponsored by EMC, May 2009.) Server virtualization: Old paradigm, physical environment: low overall server utilization and plenty of bandwidth for backup; 20 percent resource utilization. (Chart: CPU utilization, 0% to 100%, for Server A, Server B, Server C.) New paradigm, virtual environment: high overall server utilization and little bandwidth for backup; 80 percent resource utilization. (Chart: CPU utilization, 0% to 100%, for Virtual Server A, Virtual Server B, Virtual Server C on ESX Server hardware with shared physical resources.) 5
Impact of VMware Adoption. (Chart: physical environment vs. virtual environment, 0% to 100%.) Phase 1, IT Production: Cost Efficiency (15%). Phase 2, Business Production: Quality of Service (30%, 50%, 70%). Phase 3, ITaaS: Business Agility (85%, 100%). Increasing backup & recovery needs: rapidly growing number of VMs; greater business continuity needs; greater need for improved SLAs; lowering the data protection cost structure. 8
A Proactive Approach Is Needed. (Diagram: where Backup Redesign falls across Phase 1, Phase 2 and Phase 3, shown once for a reactive approach and once for a proactive approach; Phase 2 is Business Production, Quality of Service, marked 30% and 70%.) 9
Deduplication Dramatically Reduces Storage Capacity Requirements. Deduplication: 10 to 30 times less data stored versus fulls + incrementals with typical retention policies. (Chart: data stored against weeks in use, 1 to 20 weeks, comparing deduplication storage with traditional storage.) 10
With Data Domain Deduplication Storage Systems, You Can: Retain longer: keep backups onsite longer with less disk for fast, reliable restores, and eliminate the use of tape for operational recovery. Replicate smarter: move only deduplicated data over existing networks (WAN) with up to 99% bandwidth efficiency for cost-effective disaster recovery. Recover reliably: continuous fault detection and self-healing ensure data recoverability to meet service level agreements. 11
Deduplication Fundamentals 12
Data Domain Basics: easy integration with existing environment. Control tier: backup and archive applications. Target tier: DD880 appliance, reached over Ethernet (CIFS, NFS, NDMP, DD Boost) or as a Virtual Tape Library (VTL) over Fibre Channel. Disaster recovery tier: DD880 appliance, fed by replication. DD880 appliance: 4U; 2 to 6 ports; 10 and 1 Gigabit Ethernet; 8 Gb/s Fibre Channel; RAID 6; 5.4 TB to 142.5 TB usable capacity with shelves; 2 TB or 1 TB 7.2k rpm SATA HDD in shelf; file system NVRAM; N+1 fans and redundant, hot-plug power supplies. 13
Data Domain Infrastructure and Ecosystem: it works with what you have. Primary storage (NAS, SAN, DAS): VMware, Microsoft, Microsoft SharePoint, Oracle, SAP. Backup and archive feed Data Domain Deduplication Storage. Midrange and mainframe partners: BusTech, LaserVault, Luminex. Backup applications: EMC, Symantec, CommVault, CA, HP, Vizioncore, IBM Tivoli, Atempo, BakBone. Archive applications: EMC, F5 Networks, Symantec, CommVault. Network: replication over WAN for disaster recovery. 14
Data Deduplication: Technology Overview
Store more backups in a smaller footprint

Backup contents (data segments):
Friday Full Backup: A B C D A E F G
Mon Incremental: A B H
Tues Incremental: C B I
Weds Incremental: E G J
Thurs Incremental: A C K
Second Friday Full Backup: B C D E F L G H A
Unique segments stored: A B C D E F G H I J K L

Backup Data | Logical | Estimated Reduction | Physical
Friday Full | 1 TB | 2-4x | 250 GB
Monday Incremental | 100 GB | 7-10x | 10 GB
Tuesday Incremental | 100 GB | 7-10x | 10 GB
Wednesday Incremental | 100 GB | 7-10x | 10 GB
Thursday Incremental | 100 GB | 7-10x | 10 GB
Second Friday Full | 1 TB | 50-60x | 18 GB
TOTAL | 2.4 TB | 7.8x | 308 GB
15
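The segment diagram on this slide can be replayed in a few lines of Python. This is only a minimal sketch, assuming each letter stands for one equal-sized data segment and that a segment is stored the first time it is seen and never again. The names `BACKUPS` and `replay` are illustrative and are not part of any Data Domain interface. The last lines recheck the table total, assuming decimal units (1 TB = 1000 GB).

```python
# Each letter is one data segment, copied from the slide's backup diagram.
BACKUPS = [
    ("Friday full", "ABCDAEFG"),
    ("Mon incremental", "ABH"),
    ("Tues incremental", "CBI"),
    ("Weds incremental", "EGJ"),
    ("Thurs incremental", "ACK"),
    ("Second Friday full", "BCDEFLGHA"),
]


def replay(backups):
    """Return (stored segments, per-backup report of newly stored segments)."""
    store = []   # unique segments, in first-seen order
    report = []
    for name, segments in backups:
        new = []
        for seg in segments:
            if seg not in store:   # only unseen segments reach the store
                store.append(seg)
                new.append(seg)
        report.append((name, len(segments), new))
    return store, report


store, report = replay(BACKUPS)
for name, received, new in report:
    print(f"{name}: {received} segments received, {len(new)} newly stored {new}")
print("Stored segments:", " ".join(store))

# Totals from the table, in GB (assumption: 1 TB = 1000 GB).
logical_gb = 1000 + 4 * 100 + 1000
physical_gb = 250 + 4 * 10 + 18
print(f"Overall reduction: {logical_gb / physical_gb:.1f}x")
```

The second Friday full contributes a single new segment (L), which is why its estimated reduction is so much higher than that of the first full.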
Retain: Store More for Longer with Less

Period | Backup Data | Cumulative Logical | Estimated Reduction | Physical
(start) | First Full | 1 TB | 4x | 250 GB
Week 1 | April 7 | 2.4 TB | 8x | 308 GB
Week 2 | April 14 | 3.8 TB | 10x | 366 GB
Week 3 | April 21 | 5.2 TB | 12x | 424 GB
Month 1 | April 28 | 6.6 TB | 14x | 482 GB
Month 2 | May 31 | 12.2 TB | 17x | 714 GB
Month 3 | June 30 | 17.8 TB | 19x | 946 GB
Month 4 | July 31 | 23.4 TB | 20x | 1,178 GB
TOTAL | | 23.4 TB | 20x | 1,178 GB
16
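The cumulative figures on this slide follow from simple arithmetic, sketched below. The sketch assumes what the numbers imply rather than what the slide states outright: after the first week (2.4 TB logical, 308 GB physical), every further week adds 1.4 TB logical and 58 GB physical, and each month after April counts as four weeks. The function name `cumulative` and the decimal units (1 TB = 1000 GB) are illustrative choices.

```python
FIRST_WEEK = (2400, 308)   # logical GB, physical GB after week 1
PER_WEEK = (1400, 58)      # assumed growth per additional week, in GB


def cumulative(weeks):
    """Return (logical GB, physical GB, reduction ratio) after `weeks` weeks."""
    logical = FIRST_WEEK[0] + (weeks - 1) * PER_WEEK[0]
    physical = FIRST_WEEK[1] + (weeks - 1) * PER_WEEK[1]
    return logical, physical, logical / physical


SCHEDULE = [
    ("April 7", 1), ("April 14", 2), ("April 21", 3), ("April 28", 4),
    ("May 31", 8), ("June 30", 12), ("July 31", 16),
]

for label, weeks in SCHEDULE:
    logical, physical, ratio = cumulative(weeks)
    print(f"{label}: {logical / 1000:.1f} TB logical, "
          f"{physical} GB physical, about {ratio:.0f}x")
```

Because the physical store grows by only 58 GB per week while the logical total grows by 1.4 TB, the ratio keeps rising the longer backups are retained.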
If you need a restore, it HAS to work... 17
Data Integrity: Data Invulnerability Architecture. Trust but verify; hope is not a strategy. Data verification: checksum; deduplication, write to disk; verify. Self-healing file system: cleaning; expired data; defrag; verify. Other: RAID 6; NVRAM; snapshots. (Diagram: Generate Checksum, File System, Global Compression, Local Compression, RAID, Verify Data.) Verify the file system metadata integrity. Verify user data integrity. Verify stripe integrity. 18
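The "checksum, write to disk, verify" sequence on this slide can be illustrated with a generic write-then-read-back check. This is a sketch of the general idea only and not Data Domain's implementation: the function names `write_verified` and `read_verified`, the use of SHA-256, and the single-file layout are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def write_verified(path, data):
    """Write data, read it back, and compare against the ingest checksum."""
    expected = hashlib.sha256(data).hexdigest()   # checksum generated on ingest
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())                      # ask the OS to persist the write
    with open(path, "rb") as f:                   # re-read what was written
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise IOError(f"verification failed after write: {path}")
    return expected


def read_verified(path, expected):
    """Return the file contents only if they still match the stored checksum."""
    with open(path, "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != expected:
        raise IOError(f"verification failed on read: {path}")
    return data


with tempfile.TemporaryDirectory() as tmp:
    segment_path = os.path.join(tmp, "segment.bin")
    checksum = write_verified(segment_path, b"backup segment")
    restored = read_verified(segment_path, checksum)
    print("restored", len(restored), "bytes, checksum", checksum[:12])
```

A read-back immediately after a write may be served from the operating system's cache, so a real system would need stronger guarantees than this sketch provides.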
Network-Efficient Replication for True Disaster Recovery: lowers WAN costs; improves service level agreements. Source (remote sites): DB, archive data and backup data on Data Domain systems, each sending 1-5% over the WAN. Flexible replication: one-to-many; many-to-one; bi-directional; system-to-system; cascaded. 95-99% cross-site bandwidth reduction. Destination (data center hub): Data Domain DDX Array with DD880s; supports hundreds of remote sites. 19
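To make the 1-5% figure concrete, the sketch below estimates how much data crosses the WAN and how long it takes. The backup size (1,000 GB) and the link speed (100 Mbit/s) are hypothetical values chosen for illustration; they do not come from the slide. The calculation also ignores protocol overhead and assumes decimal units.

```python
def wan_transfer(backup_gb, sent_fraction, link_mbps):
    """Return (GB sent over the WAN, transfer time in hours)."""
    sent_gb = backup_gb * sent_fraction
    seconds = sent_gb * 8000 / link_mbps   # 1 GB = 8000 megabits (decimal)
    return sent_gb, seconds / 3600


BACKUP_GB = 1000   # hypothetical nightly backup size
LINK_MBPS = 100    # hypothetical WAN link speed

for label, fraction in [("no deduplication", 1.0),
                        ("5% sent", 0.05),
                        ("1% sent", 0.01)]:
    sent, hours = wan_transfer(BACKUP_GB, fraction, LINK_MBPS)
    print(f"{label}: {sent:.0f} GB over the WAN, about {hours:.1f} h")
```

Under these assumptions the full copy would occupy the link for most of a day, while the deduplicated stream needs roughly an hour or less.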
Industry's Most Scalable Inline Deduplication Systems
Software options: DD Boost, DD Virtual Tape Library, DD Replicator, DD Retention Lock, and DD Encryption
Product lines: DD140 Remote Office Appliance; DD600 Appliance Series; DD880; Global Deduplication Array; DDX Array Series (up to 16 controllers)

Model | Speed (Other) | Speed (DD Boost) | Logical capacity | Raw capacity | Usable capacity
DD140 | 450 GB/hr | 490 GB/hr | 17-43 TB | 1.5 TB | 0.86 TB
DD610 | 675 GB/hr | 1.3 TB/hr | 75-195 TB | Up to 6 TB | Up to 3.98 TB
DD630 | 1.1 TB/hr | 2.1 TB/hr | 165-420 TB | Up to 12 TB | Up to 8.4 TB
DD670 | 3.6 TB/hr | 5.4 TB/hr | 1.1-2.7 PB | Up to 76 TB | Up to 55.9 TB
DD880 | 5.4 TB/hr | 8.8 TB/hr | 2.8-7.1 PB | Up to 192 TB | Up to 142.5 TB
Global Deduplication Array | (not listed) | 12.8 TB/hr | 5.7-14.2 PB | Up to 384 TB | Up to 285 TB
DDX Array | 86.4 TB/hr | 140.8 TB/hr | 45.6-114 PB | Up to 3.07 PB | Up to 2.28 PB
20
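The DDX Array figures on this slide appear to be the DD880 figures multiplied by the 16 controllers the slide mentions. The sketch below checks that reading. It is an inference from the numbers, not a statement from the slide, and the dictionary keys are illustrative names.

```python
# DD880 figures as quoted on the slide.
DD880 = {
    "speed_other_tb_hr": 5.4,
    "speed_boost_tb_hr": 8.8,
    "raw_tb": 192,
    "usable_tb": 142.5,
}
DDX_CONTROLLERS = 16   # "Up to 16 Controllers" on the slide

# Assumption: a fully populated DDX Array scales linearly with controllers.
ddx = {key: value * DDX_CONTROLLERS for key, value in DD880.items()}

print(f"Speed (Other):    {ddx['speed_other_tb_hr']:.1f} TB/hr")
print(f"Speed (DD Boost): {ddx['speed_boost_tb_hr']:.1f} TB/hr")
print(f"Raw capacity:     {ddx['raw_tb'] / 1000:.2f} PB")
print(f"Usable capacity:  {ddx['usable_tb'] / 1000:.2f} PB")
```

The results line up with the slide's 86.4 TB/hr, 140.8 TB/hr, 3.07 PB raw and 2.28 PB usable.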
Deduplication Storage Evaluation Criteria 22
Is Data Deduplication a Good Thing? Without de-duplication: no reduction in local backup storage; no reduction in replication time and bandwidth; no reduction in offsite storage. Leveraging de-duplication: reduced local backup storage; reduced replication time and bandwidth; reduced offsite storage. (Diagrams, each from primary site to remote site: offsite replication without de-duplication; backup de-duplication, then replicate after de-duplication.) 23
Methodology: Inline vs. Post-Process Deduplication. Inline (deduplication before storing: deduplication, then store): other activities unimpeded; predictable; simpler. Post-process (deduplication after storing: store, then deduplication): 3x disk accesses to shared store; the more processes, the more resource contention. Copy to tape: too slow to stream tape. Recovery: service level agreement predictability. Replication: poor time-to-disaster recovery. Deduplication: if interleaved with backup or restore. More administration to fight these issues. 24
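One way to see why post-process deduplication touches the disk more often is to count operations in a toy model. The sketch below is an interpretation, not something the slide spells out: it assumes post-process ingest writes every segment to a staging area, reads it back to deduplicate, and then writes the unique segments again, while inline ingest writes only unique segments and keeps its fingerprint index in memory. The function names and the SHA-256 fingerprint are illustrative.

```python
import hashlib


def fingerprint(segment):
    """Content fingerprint used to detect duplicate segments."""
    return hashlib.sha256(segment).hexdigest()


def inline_ingest(segments):
    """Deduplicate before storing: only unseen segments are written."""
    store, ops = {}, {"reads": 0, "writes": 0}
    for seg in segments:
        fp = fingerprint(seg)
        if fp not in store:          # index lookup assumed to be in memory
            store[fp] = seg
            ops["writes"] += 1
    return store, ops


def post_process_ingest(segments):
    """Store first, deduplicate later: stage, read back, rewrite."""
    ops = {"reads": 0, "writes": 0}
    staging = []
    for seg in segments:             # pass 1: land every segment on disk
        staging.append(seg)
        ops["writes"] += 1
    store = {}
    for seg in staging:              # pass 2: read everything back
        ops["reads"] += 1
        fp = fingerprint(seg)
        if fp not in store:          # pass 3: write the unique segments
            store[fp] = seg
            ops["writes"] += 1
    return store, ops


backup = [b"A", b"B", b"A", b"C"]
inline_store, inline_ops = inline_ingest(backup)
post_store, post_ops = post_process_ingest(backup)
print("inline:      ", inline_ops)
print("post-process:", post_ops)
```

Both paths end with the same stored segments; the difference is the extra disk traffic, which in a real system would compete with backup, restore, replication and copy-to-tape jobs.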
What about Disaster Recovery? Data is only safe when safely off-site. Inline dedupe/replication: WAN DR optimized. Data Domain (inline): replicate during backup; DR-ready. Post-process: store to cache, dedupe, replicate; DR restore point is usually obsolete. (Diagram: both approaches on a time axis.) 25
Post-Process Deduplication Bottleneck 26
Enterprise Recoverability: Readiness at Disaster Recovery Site. Data Domain inline deduplicated replication: replicate during backup; disaster recovery (DR)-ready. Adaptive post-process deduplicated replication: backup to cache (backup time 1.7 times longer than Data Domain); deduplicate and replicate (<50% ingest speed; two times longer if uncompressed at fixed bandwidth); DR-ready. Scheduled post-process deduplicated replication: backup to cache (backup time 1.1x longer than Data Domain); deduplicate and replicate (<50% ingest speed; two times longer if uncompressed at fixed bandwidth); DR-ready. VTL/Tape/Truck: backup to VTL; copy to tape; truck to storage; recall tapes; truck from storage? 27
Performance: CPU-Centric vs. Spindle-Bound. (Chart: throughput in MB/s, 50 to 1,500, against number of disk spindles, 50 to 200; series: Data Domain and most deduplication vendors; disk types labelled Fibre Channel and SATA.) 28
Scalability: Data Domain Systems Trajectory. Data Domain SISL Scaling Architecture: CPU-centric. (Chart: throughput in GB/s, 0.04 to 5, against addressable capacity in terabytes post-RAID (physical), 1.25 to 70 and > PB.) Points on the trajectory: DD200 (2004); DD880, July 2009, industry's fastest backup storage controller; distributed processing for single-controller systems; multi-controller systems with global deduplication; 2011 (est.). 29
Why Data Domain? Less disk to resource, less to manage: CPU-centric deduplication; inline; green. Simple, mature, and flexible: simple, mature appliance; nearline tier (any fabric, any software, backup or nearline applications); data center or remote office. Resilience and disaster recovery: storage of last resort; cross-site global compression; data center or remote office. 30
THANK YOU 31
EMC's Information Infrastructure Portfolio. Backup and Archive Platforms Storage Software Manage: Ionix ControlCenter Virtual Provisioning Automate: PowerPath Ionix for IT Operations Intelligence FAST Mobility: Virtual LUN SAN Copy VPLEX Avamar VM Avamar Data Domain Disk Library Backup/Recovery: NetWorker Avamar Data Protection Advisor Replication: Local Remote Multi-site CDP Security: RSA enVision Encryption IPv6 Virtualization and Connectivity VPLEX Access Anywhere File Management Appliance Connectrix Avamar Virtual Edition for VMware RecoverPoint Gen 2 Data Store Distributed Federation CDP and CRR SAN connectivity Storage Platforms Symmetrix CLARiiON CX4 Centera Celerra Atmos Iomega Flash Flash Flash Cloud optimized storage DMX-4 V-Max AX4 Gen 4 LP Node Consumer and SMB storage 32
EMC Global Services Information Infrastructure Store Protect +Intelligence Virtualize and Automate Services Sales Services Solutions Support Partner Ecosystem 14,000 professionals 33