Use Case
Maximizing Deduplication ROI in a NetBackup Environment
For many companies, backing up and restoring data is becoming a slow, complicated, expensive process. The volume of data to be protected is growing 30-60% annually, while the amount of time available to complete backups is staying the same or shrinking. At the same time, end users are more dependent than ever on continuous access to data for business-critical operations. Failed backups, which occur in 20% of tape backups, and delayed file restores drive down productivity and directly affect the bottom line. Deduplication technology is changing the economics of data protection by enabling companies to back up and restore petabytes of data without adding capacity, making the cost of backing up to a VTL comparable to physical tape. The following illustrates a typical use case scenario for enterprise deduplication.
Achieving Maximum Deduplication ROI in a NetBackup Environment
Enterprise Data Protection
Let's look at the case of an IT manager who sets out to improve his backup and recovery process.
As an IT manager for a leading financial services firm, Pete is responsible for protecting approximately 4 TB of data using Symantec NetBackup. About 10% of that data changes daily. He is currently using an LTO-3 tape library. He schedules backups for after hours so that the load on his application servers and his LAN doesn't bring the business to its knees. Pete runs incremental backups nightly and one full backup per week. However, because each full backup takes 14 hours, plus another ten minutes to label, file, mount, and unmount tapes, this practice is becoming unwieldy. He manages his tape media carefully to ensure that he doesn't back up to an overly worn tape, that tape utilization stays high, and that tapes are recycled efficiently as his archived and stored data expires. He also periodically makes a second full copy of his tapes in case of media failure. As a result, at any given time he is juggling about 50 physical tapes in secondary storage and archive.
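The 14-hour full backup is roughly what simple arithmetic predicts. The sketch below is illustrative only: it assumes decimal units (1 TB = 1,000,000 MB) and that the tape library sustains the 80 MB/sec maximum listed for tape in Table 1, which real backups rarely do. The function and constant names are hypothetical.

```python
def transfer_hours(data_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained rate (decimal units)."""
    megabytes = data_tb * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    return megabytes / rate_mb_per_s / 3600

FULL_TB = 4.0          # data protected, from the scenario
DAILY_CHANGE = 0.10    # about 10% of the data changes daily
TAPE_MB_PER_S = 80.0   # maximum tape throughput listed in Table 1

full_hours = transfer_hours(FULL_TB, TAPE_MB_PER_S)
incremental_hours = transfer_hours(FULL_TB * DAILY_CHANGE, TAPE_MB_PER_S)

print(f"full: {full_hours:.1f} h, incremental: {incremental_hours:.1f} h")
```

Under these assumptions a full backup takes about 13.9 hours and a nightly incremental about 1.4 hours, before any tape handling time is added.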
Managing a Restore Request
A senior stockbroker calls Pete and asks him to restore an email mailbox that was accidentally deleted four days after his last full backup. The mailbox contains information that is vital to an important deal, and every minute she waits for restoration is costing thousands of dollars.
Pete goes to the tape library and loads the first tape of the backup into the tape unit. He then has to load each of the incremental tapes he has created since. He used the NetBackup Multiplexing function to speed the backup process, but he is paying for that now: multiplexing slows his restore times because it combines data from several servers onto the same tape. Three hours later, he has restored the entire backup to a server he uses only for restores. One of the tapes has an error on it, so he cannot restore anything in the later part of the backup. He dreads the thought of telling the broker that she may have to do without her critical email. At least he didn't have to get the tape from his off-site archive.
He could have waited most of a day just to have the media delivered. Pete calculates that he loses data on restores about one time out of five. It happens often because he has been overworking his tape drive equipment and reusing his media to keep up with data growth. He has to admit that human error is also partly to blame. This is a very manual process.
Once he has restored the data, he exports the requested mailbox and imports it to the broker's laptop. Luckily, the data loss did not affect the broker's mailbox. Even so, the restore process has taken most of the day, and the broker has lost significant money on her deal.
Another issue is looming. His tape library is nearly full, and he doesn't have funds in the budget, or room in his data center, for a new one until the second half of the year at the earliest. By that time, the firm may have completed its plan to merge with a competitor, more than doubling its backup volume overnight.
Time to Improve the Backup and Restore Process
Pete is getting tired of working late, working every Friday night, and missing dinner every time he gets a restore request. He's also tired of having irate employees yell at him because their data is permanently lost. He also realizes that as the volume of his backups continues to grow, the complexity of managing it all will grow with it. That translates to more overtime cost, a higher probability of human error, and a significantly higher risk of data loss.
He reads articles, talks to some storage vendors, checks with his CFO about his IT budget, and decides that backing up to a VTL with deduplication would make the backup and restore process faster and more reliable.
Needs Analysis
Pete makes a list of his technical and business requirements as follows:
- Complete 4 TB full backups within an 8-hour backup window
- Restore data at wire speed without burdening his network
- Take full advantage of NetBackup's advanced features, such as Inline Copy, Advanced Client Option, and Multiplexing
- Deduplicate data to reduce his capacity requirements and footprint
- Back up designated servers without deduplication
- Scale to handle backups many times larger after the planned merger
- Cost less than tape to acquire, implement, and manage
- Reduce complexity in the data center (no training, additional manpower, or policy changes required)
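The first requirement can be turned into a minimum sustained throughput, which makes it easier to compare against vendor figures quoted in MB/sec or TB/hr. This is a rough sketch rather than a sizing tool: it assumes decimal units, a perfectly steady data stream, and, for the post-merger case, exactly double the current volume (the scenario only says "more than doubling"). All names are hypothetical.

```python
SECONDS_PER_HOUR = 3600
MB_PER_TB = 1_000_000  # assumption: decimal units

def required_mb_per_s(data_tb: float, window_hours: float) -> float:
    """Minimum sustained MB/sec needed to move data_tb within window_hours."""
    return data_tb * MB_PER_TB / (window_hours * SECONDS_PER_HOUR)

def mb_per_s_to_tb_per_hour(rate_mb_per_s: float) -> float:
    """Convert a throughput in MB/sec to TB/hr."""
    return rate_mb_per_s * SECONDS_PER_HOUR / MB_PER_TB

current = required_mb_per_s(4.0, 8.0)      # 4 TB in an 8-hour window
post_merger = required_mb_per_s(8.0, 8.0)  # assumption: volume exactly doubles

print(f"current: {current:.1f} MB/sec ({mb_per_s_to_tb_per_hour(current):.2f} TB/hr)")
print(f"post-merger: {post_merger:.1f} MB/sec")
```

With these inputs the window requires roughly 139 MB/sec today and about 278 MB/sec if the volume doubles. The same conversion reproduces the figure in Table 1: 2400 MB/sec works out to 8.64 TB/hr.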
Evaluating the Options
First, he evaluated a small VTL that offered built-in, inline deduplication software and the ability to cluster several systems for added capacity and performance. Although this option fit his budget, he ruled it out quickly for several reasons.
First, the deduplication was all or nothing. He has several servers containing data that he does not want to deduplicate for regulatory reasons. This means he would have to divide his backups between this option and tape or another backup system. That added complexity means significantly more time, cost, and risk.
Second, the so-called clustering capability advertised for scaling this option just means that its management GUI can be used for multiple individual systems. Each system, however, remains completely independent, requiring its own capacity allocation, performance tuning, hardware monitoring, and so on. Deduplication is performed individually as well. A single system wouldn't back up much faster than the tape system he already had. He would need at least four of these systems to start if he wanted to stay within an eight-hour window. A quick calculation revealed that after the company's merger, he would be managing eight of these systems. The thought of allocating capacity, managing upgrades, and just doing basic system management on eight individual systems was not appealing.
Third, the inline deduplication method used by these systems would degrade backup performance significantly over time and would never restore data fast enough to meet his needs.
Fourth, the systems were not compatible with his physical tape systems. He would have to do a rip and replace to adopt this technology.
Next, he considered a large VTL option with a deduplication capability. This option was significantly more expensive than the smaller VTLs, but offered faster performance and more capacity.
However, despite its advertised size and processing speed, the deduplication process could not scale past one node. That meant it would take more than 11 hours to complete the deduplication process and reclaim his capacity, effectively creating a bottleneck in his backup process. This option was also an all-or-nothing proposition. Restores would take significantly longer than he wanted. With every new backup, this system breaks his data into smaller and smaller pieces and scatters those pieces all over the disk array. To restore data, the system has to reconstitute the backup from the many pieces, a process that takes longer and longer over time.
His third option was a SEPATON S2100-ES2 VTL with DeltaStor deduplication software. This option cost the same as or less than physical tape and restored data ten times faster than the small VTL. It had six times the throughput of the large VTL. It would complete full backups in less than two hours. It was fully compatible with NetBackup features, including Inline Copy, Advanced Client Option, and Multiplexing, and would not require him to make trade-offs in restore performance. In the future, he could add enough performance and/or capacity to manage tens of petabytes of data on a single system. It also let him choose which data he wanted to deduplicate, so he could back up all of his application servers with it. He could store his data online for months without adding capacity or performance.
Cost Justification
The S2100-ES2 cost the same as the LTO-3 or LTO-4 based system he would need to buy anyway. However, unlike the physical tape systems, he could add performance and capacity to the VTL as he needed it. That means he wouldn't have the huge capital expense of buying a whole new physical tape library every 18-24 months that he currently faces. The increased reliability and complete automation would cut tens of thousands of dollars off his overtime and administrative labor costs. Best of all, the SEPATON VTL would cut the risk of data loss, regulatory non-compliance, and end-user productivity loss. The significant savings in floor space was an added benefit. Based on his analysis, he was given approval to purchase a SEPATON S2100-ES2 with DeltaStor deduplication.
SEPATON S2100-ES2 with DeltaStor Deduplication
He purchased a SEPATON S2100 VTL with 20 TB of storage, licensed for use with DeltaStor deduplication software, and with the processing power to deliver backup and restore at 17 TB/hour. The system arrived fully configured, was installed without disruption to his existing environment, and completely emulated his existing tape systems. That means no added complexity and no added cost. The SEPATON professional services team installed it for him and ensured that the deduplication software was tuned to deliver optimal deduplication results for each data type. It was also configured to back up designated data without deduplication to meet his business requirements.
As part of the configuration, he identified what tape unit and tape type the VTL was supposed to look like. He made each the same type as his physical tape unit, pointed his backup software to it, and was ready to back up.
Pete kicked off his first full backup job. Once it started, he left to go to a meeting. When he came back, the backup was already complete. He checked his watch: he had only been gone a little over an hour.
He used the system to conduct full backups nightly for a full week and then decided to check the restore speed. Again, a little over an hour later he had restored an entire backup. The system uses a feature called forward referencing that keeps a full set of data ready for instantaneous restores. The system also handled all of its back-end storage management (capacity allocation, load balancing, etc.) automatically. It would even email him about any system or backup issues that its built-in monitoring system detected. He could log into the system from his desktop to check on the status of any backup or restore procedure, look at system operating parameters, and check the efficiency of backup and deduplication. He could document this improvement. Management was going to love this.
Reliability
With nearly no moving parts, no physical media to break, and a diagnostic system that warned him before critical hardware failures (fans, disk drives, etc.) actually happened, failed backups were a thing of the past.
Looking Ahead
As Pete looked ahead to the imminent merger, he knew he could add capacity and performance as he needed to. The system's open architecture is fully compatible with tape systems, so he could move the new backups over to the VTL easily and with minimal risk. Overall, the SEPATON VTL with DeltaStor gave him the performance, scalability, capacity reduction, and ease of use he needed to optimize data protection and minimize the risk of data loss.
Table 1: Side-by-Side Comparison of Data Protection Technologies
| | S2100-ES2 | Small VTL | Large VTL | Tape |
|---|---|---|---|---|
| Backup Performance | 2400 MB/sec (8.6 TB/hr) | Maximum: 80 MB/sec | Maximum: 100 MB/sec | Maximum: 80 MB/sec |
| High Availability | Grid design for scalability and reliability | None; multiple separate and independent units | Limited; active cluster of two servers supported. Node failure cuts performance in half | None |
| Scalability of Performance and Capacity | Fast, cost-effective, easy; allows for scalability to 8 nodes for 4800 MB/sec | Costly, complex, and typically limited; requires multiple independent units | Time-consuming, costly, complex | Time-consuming, complex |
| Ease of Use | Easy, highly automated, web-enabled | Complex; must manage multiple separate systems | Complex, manual. Disk and VTL managed separately | Complex, manual |
| Total Cost of Ownership | Low | Medium | High | High |
| Disaster Recovery Capability | High | High | Low | Medium |
| Capacity Reduction | Up to 40:1 reduction ratios | 20:1 reduction ratios | 20:1 reduction ratios | None |
Copyright 2008. SEPATON, Inc. All rights reserved. SEPATON, S2100, SRE, and DeltaStor are registered trademarks and ContentAware and Site 2 are trademarks of SEPATON, Inc. Other product and company names mentioned herein are or may be trademarks and/or registered trademarks of their respective companies.
UseCase5.1- v20908