Using Data De-duplication to Drastically Cut Costs
May 2008
Victor Nemechek, Product Marketing Manager, Hitachi Data Systems

What most Backup Administrators feel like!

Data Growth is the Fundamental Problem!
[Chart: percentage growth (0% to 10,000%) by calendar year, CY99 to CY06, for Amount of Data, Number of Storage Admins, and Storage Budget]
Source: Strategic Research, Gartner, CV estimates
Disk dramatically improves backup operations!
Performance:
- Backup and restore from tape is slow
- Disk enables quick recovery of data
Reliability:
- Tape is a major source of failed backup operations
- Disk adds high availability and reliability to backup operations
Security:
- Handling tapes creates additional failure opportunities
- Disk reduces potential losses associated with physical tapes

Fundamental Problem Still Not Addressed
Disk improves backup operations, but does not address the fundamental problem of data growth
[Figure: Disk Space]

What is Data De-duplication?
[Diagram: Full Backup 1, Incr. 1, Incr. 2, and Full Backup 2, each made up of blocks A through J, feed a Backup Repository; legend: Unique Data, Redundant Data, De-duplicated Data]
Most elements of data backed up today already exist in previous backups
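The idea in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses fixed-size chunks and SHA-256 digests, which is a generic textbook approach and not a description of how HyperFactor or ProtecTIER works. The names `dedupe_store` and `restore`, the chunk size, and the sample data are all hypothetical.

```python
import hashlib


def dedupe_store(backups, chunk_size=4):
    """Store each unique fixed-size chunk once.

    Returns (repository, recipes): the repository maps a chunk digest to the
    chunk bytes, and each recipe is the ordered list of digests needed to
    rebuild one backup. Fixed-size chunking is an assumption for brevity.
    """
    repository = {}
    recipes = []
    for data in backups:
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Redundant chunks are skipped; only new data consumes space.
            repository.setdefault(digest, chunk)
            recipe.append(digest)
        recipes.append(recipe)
    return repository, recipes


def restore(repository, recipe):
    """Rebuild a backup from its recipe."""
    return b"".join(repository[d] for d in recipe)


# Hypothetical data: two full backups that differ in a single block.
full_1 = b"AAAABBBBCCCCDDDD"
full_2 = b"AAAABBBBXXXXDDDD"

repo, recipes = dedupe_store([full_1, full_2])
nominal_bytes = len(full_1) + len(full_2)
physical_bytes = sum(len(chunk) for chunk in repo.values())
print(f"nominal={nominal_bytes} physical={physical_bytes}")
```

Only the changed block of the second backup adds to the repository, so the stored size grows far more slowly than the amount of data backed up.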
The Power of De-duplication!
HyperFactor reduces required capacity by 25 to 1 or more!

Production Customers Deployment Results
- Major Wireless Center: 13TB hosting 190TB, ~16:1 with 30 day retention
- Retail Distribution: 40TB hosting 880TB, ~22:1 with 40 day retention
- International Oil Company: 22TB hosting 726TB, ~33:1 with 30 day retention
- Financial Institution: 50TB hosting 850TB, ~17:1 with 30 day retention

5 Metrics to understand the cost and efficiency of Disk versus Tape
- Average Annual TCO (AA-TCO)
- Storage Efficiency Factor
- RTO/RPO
- 1st-Time Backup Reliability
- Tape Media Cost
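The ratios in the deployment results can be recomputed from the capacities quoted on the slide. The sketch below assumes that "X hosting Y" means X TB of physical disk holding Y TB of backup data, which the slide does not spell out; the short customer labels and the `dedup_ratio` helper are hypothetical.

```python
def dedup_ratio(physical_tb, nominal_tb):
    """Nominal backup data held per unit of physical disk."""
    return nominal_tb / physical_tb


# (label, physical TB, nominal TB), figures taken from the slide.
deployments = [
    ("Wireless", 13, 190),
    ("Retail", 40, 880),
    ("Oil", 22, 726),
    ("Financial", 50, 850),
]

ratios = {name: round(dedup_ratio(p, n), 1) for name, p, n in deployments}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}:1")
```

Three of the four results match the slide's figures. The first works out to about 14.6:1 against the quoted ~16:1, so the slide may be rounding or using capacities it does not show.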
Average Annual Total Cost of Ownership (AATCO)
- Measured in $ per TB per year
- Real cost of hardware, software, services, tape media, and operating costs (labor)

Average Annual TCO Savings of ProtecTIER

Starting Capacity    3 yr Capacity    AATCO Savings
10TB                 22.4TB           $24k/TB/yr
20TB                 45TB             $19k/TB/yr
30TB                 67.5TB           $15k/TB/yr
100TB                225TB            $11k/TB/yr
250TB                562TB            $9k/TB/yr

Storage Efficiency
- Utilization of a disk or tape repository when you have multiple versions of data stored
- Grandfather-Father-Son (GFS) retention plan with weekly fulls and daily incrementals
- Over the course of a year, every TB of primary disk sees 25 TB of tape generated to protect it under the rules of GFS
- De-duplication Storage Efficiency Factor is 0.8, about 30 times less than tape!
- This means the de-dupe capacity required is equivalent to only 80% of the primary disk it is protecting
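The figures on these two slides can be checked with simple arithmetic. The sketch below assumes the "3 yr Capacity" column reflects 50% annual growth compounded twice (the 50% rate is taken from the comparison table later in the deck, which cites 30TB growing 50% per year to 67TB), and that "30 times less than tape" is the ratio of the two storage efficiency factors. The function names are hypothetical.

```python
def capacity_after(start_tb, growth=0.50, years=3):
    """Capacity in year `years`, assuming growth compounds once per
    completed year (so two growth steps across a 3-year window)."""
    return start_tb * (1 + growth) ** (years - 1)


def efficiency_advantage(tape_factor=25.0, dedup_factor=0.8):
    """How many times less repository capacity de-dupe needs than tape."""
    return tape_factor / dedup_factor


for start in (10, 20, 30, 100, 250):
    print(f"{start}TB -> {capacity_after(start)}TB")

print(f"tape / de-dupe = {efficiency_advantage()}")
```

Under these assumptions the growth column lines up with the table to within rounding, and 25 / 0.8 gives 31.25, which the slide rounds to 30.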
Recovery Time and Recovery Point Objectives
- RTO is the amount of time the system's data is unavailable, preventing normal service
- RPO is the amount of data at risk of being lost
- Disk reduces both, resulting in less unplanned downtime during recovery events
- De-dupe enables more data to be protected on disk, more frequently, and recovered more easily

International Pharmaceuticals Manufacturer
Data Center Environment:
- 350 servers handling critical e-mail, databases, etc.
- IBM 3494 tape library with 10 IBM 3590 tape drives
- IBM TSM backing up 40TB to 300 tapes
Problems:
- Unacceptable restore times
- Tape management costs
Results:
- 500% improvement in restore times
- Saved 150 tapes per month
- Reduced tape management time by over 50%

First-time backup success rate
Tape backup is notoriously unreliable
- First-time backup success rate of only 70% to 80%
- Requires a high number of jobs to be rerun
- Increasing management burden
- Increasing RPOs & RTOs
Disk-based backups are highly reliable
- Success rate jumps to > 99%
Large Grocery Retailer & Distributor
Problem:
- 24 hour backup window
- Reliability of backups
- 20-30% backup failure rates
- Full-time employee needed to restart and manage failed backup jobs
"Our shop was exposed every day in some environments"
Results:
- Backups complete 30-50% faster
- Failed backups have entirely disappeared
- Critical data now fully protected

Tape Media Consumption
- Annual expenses for tape media to protect a multi-terabyte primary store run (even with high-capacity LTO) at >$3,500/TB/yr.
- Companies spend hundreds of thousands to millions of dollars per year on media, media handling, and offsite storage services.
- De-duplication reduces that number by 60% or more, depending on the retention period and tape archive rotation scheme

Large French Financial Institution
Problems:
- Floor space required for silos
- Reliability of tape drives
- Managing a huge tape inventory
Data Center Environment:
- 85,000 tapes
- 6 silos and 2 automatic tape libraries (ATLs)
Results:
- Down to 4,000 tapes, 7 robots retired
- Saved $5M in tape costs alone
- Reassigned several staff members
THE METRICS of PROTECTION
5 Metrics to understand Disk versus Tape

Avg. Annual TCO ($/TB/yr)
(3 yr analysis of protection of 30TB primary disk growing 50% per yr to 67TB)
- Tape Backup with Automation: $27,000/TB/yr
- Virtual Tape Libraries: $26,000/TB/yr
- ProtecTIER VT: $12,000/TB/yr, $3 million savings over 3 yrs

Storage Efficiency Factor
(13 week online repository size compared to primary disk)
- Tape Backup with Automation: 25, annual tape media with GFS tape rotation
- Virtual Tape Libraries: 5.0, disk as a tape-cache, using GFS rotation
- ProtecTIER VT: 0.8, less than primary disk

RTO/RPO
- Tape Backup with Automation: Hours to days
- Virtual Tape Libraries: Minutes to hours
- ProtecTIER VT: Minutes to hours

1st-time Backup Reliability
- Tape Backup with Automation: < 80%
- Virtual Tape Libraries: > 99%
- ProtecTIER VT: > 99%

Tape Media Cost (on 50TB primary disk)
- Tape Backup with Automation: $3.5k/TB/yr for tape media
- Virtual Tape Libraries: 30% tape media cost reduction
- ProtecTIER VT: 50% tape media cost reduction

De-duplicated disk is $15,000 per TB per year cheaper than tape to own and operate.
Source: The New Metrics of Disk-based Data Protection, by Michael Peterson

The Impact of Data De-duplication
Up to 25X the physical capacity
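The headline savings in the comparison above follow from the per-TB figures. The sketch below is one possible reading, not the analyst's stated method: it assumes the $3 million figure comes from applying the per-TB saving to the final 67TB capacity for all three years, and that the tape media reduction applies directly to $3.5k/TB/yr on 50TB. All variable names are hypothetical.

```python
# Figures taken from the comparison slide.
TAPE_TCO = 27_000          # $/TB/yr, tape backup with automation
DEDUP_TCO = 12_000         # $/TB/yr, de-duplicated disk
FINAL_CAPACITY_TB = 67     # capacity at the end of the 3-year analysis
YEARS = 3

savings_per_tb_year = TAPE_TCO - DEDUP_TCO
# Assumption: the saving is applied to the final capacity in every year.
three_year_savings = savings_per_tb_year * FINAL_CAPACITY_TB * YEARS

# Tape media: $3.5k/TB/yr on 50TB of primary disk, cut by 50% with de-dupe.
tape_media_per_year = 3_500 * 50
media_with_dedup = tape_media_per_year * (1 - 0.50)

print(f"saving per TB per year: ${savings_per_tb_year:,}")
print(f"3-year saving: ${three_year_savings:,}")
print(f"tape media: ${tape_media_per_year:,}/yr -> ${media_with_dedup:,.0f}/yr")
```

Under these assumptions the per-TB saving is the $15,000 quoted on the slide and the 3-year total lands close to $3 million.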
De-duplication Enables Remote Replication
[Diagram: Primary Site with Master Server replicating to Secondary Site]
Significantly less bandwidth capacity required

Disk dramatically improves backup operations
[Figure: Disk Space]
De-duplication makes disk less expensive than tape!

Hitachi Virtual Tape Library Appliances
[Figure: appliance tiers plotted by Performance and Functionality against Scalability. Enterprise: Highest Performance, Largest Capacity. Large: High Performance, Large Capacity. Medium: Good Performance, Low cost. Additional label: Highly Scalable]