89 Fifth Avenue, 7th Floor
New York, NY 10003
www.theedison.com
212.367.7400

Protecting Information in a Smarter Data Center with the Performance of Flash
IBM FlashSystem and IBM ProtecTIER
Printed in the United States of America

Copyright 2014 Edison Group, Inc. New York. Edison Group offers no warranty either expressed or implied on the information contained herein and shall be held harmless for errors resulting from its use.

All products are trademarks of their respective owners.

First Publication: April, 2014

Produced by: Chris M. Evans, Senior Analyst; Manny Frishberg, Editor; Barry Cohen, Editor-in-Chief
Table of Contents

Executive Summary
ProtecTIER Overview
The Problem
The Solution with IBM FlashSystem
Proof Points
    Flash for Metadata and User Data
    Flash for Metadata
Conclusion
Executive Summary

The adoption of flash is powering data centers to new levels of storage and compute performance. With it, companies are achieving even greater business value by reducing the time-to-delivery of information and processing greater workloads. As businesses move towards 24/7 operations, slow backups are simply not an option. Data protection in the form of backup, restore, and replication plays a key role in ensuring applications are always available.

Products like IBM ProtecTIER have changed the landscape for data protection by implementing enterprise-class, scalable, and high-performance deduplicating backup solutions. ProtecTIER leverages IBM's patented HyperFactor data deduplication technology to deliver storage savings by a factor of 25 or more.

FlashSystem delivers faster performance at half the cost and an 18-times footprint reduction compared to disk.

Backup, restore, and replication using ProtecTIER and FlashSystem technology deliver an optimized solution that offers the performance and throughput of flash for half the price of traditional hard disk solutions. Customers can also reduce their storage footprint by up to 18 times when consolidating racks full of disk enclosures into a compact 2U FlashSystem enclosure.

Making the solution even more cost effective, FlashSystem can be implemented with ProtecTIER in a number of specific configurations, which allows flash performance to be targeted at the exact parts of the infrastructure requiring it.

In summary, IBM ProtecTIER and FlashSystem offer a unique combination of technologies that meet the requirements of the modern data center.

Edison: Protecting Information in a Smarter Data Center with the Performance of Flash Page 1
ProtecTIER Overview

ProtecTIER's Deduplication Gateway solution provides disk-to-disk backup by emulating a Virtual Tape Library (VTL), or by providing a network-attached storage (NAS) host interface. It is delivered as a hardware/software combination, using IBM System x servers and a mix of either flash or disk storage. In typical ProtecTIER deployments, spinning media (hard disk drives) are used to store both the backup data and the metadata for managing the backup contents.

IBM is a leader in deduplication. The ProtecTIER platform includes patented HyperFactor data deduplication technology, which can reduce the storage needed to store backups by a factor of 25 or better. This makes backing up to disk a more efficient and cost-effective solution than legacy technology such as tape.

Backup via a ProtecTIER system provides a number of benefits:

- Sustained backup throughput of 2700 MB/s or more
- Sustained recovery throughput of 3600 MB/s or more
- Scalability of up to 1 PB of physical storage, which translates to up to 25 PB or more of backup storage capacity depending on the deduplication ratio
- The ability to cluster ProtecTIER nodes for high availability
- The ability to replicate backup data between sites for resilience and enhanced recovery capability
- Support for backup software platform integration, including Symantec's NetBackup OpenStorage interface, and standards-based protocol support, including SMB and NFS
- High data reliability through non-hash-based deduplication technology
- High availability through geographically dispersed replication, providing disaster recovery capabilities that use data deduplication to optimize WAN traffic

IT departments are seeing the benefits of moving to solutions such as ProtecTIER for the majority of their backup requirements and using tape for more long-term archive and backup needs.
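The capacity benefit listed above (1 PB of physical storage holding roughly 25 PB of backups at a 25:1 ratio) is simple multiplication. The sketch below shows that arithmetic; the function names are illustrative only and are not part of any ProtecTIER tool, and the 10:1 ratio is a hypothetical comparison point.

```python
def effective_capacity_pb(physical_pb: float, dedup_ratio: float) -> float:
    """Backup (logical) capacity that fits in physical_pb at a given dedup ratio."""
    if physical_pb < 0 or dedup_ratio < 1:
        raise ValueError("physical_pb must be >= 0 and dedup_ratio must be >= 1")
    return physical_pb * dedup_ratio


def physical_needed_pb(logical_pb: float, dedup_ratio: float) -> float:
    """Physical capacity required to hold logical_pb of backups."""
    if logical_pb < 0 or dedup_ratio < 1:
        raise ValueError("logical_pb must be >= 0 and dedup_ratio must be >= 1")
    return logical_pb / dedup_ratio


if __name__ == "__main__":
    # 25:1 is the ratio quoted in the text; 10:1 is a hypothetical lower ratio.
    for ratio in (10, 25):
        print(f"1 PB physical at {ratio}:1 holds {effective_capacity_pb(1, ratio)} PB of backups")
```

Because the achievable ratio depends on the data being backed up, the 25 PB figure is best read as what the quoted ratio implies, not as a guaranteed capacity.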
The Problem

Data protection is a critical component of delivering reliable and available IT services. As more demands are put on businesses to deliver services to a 24-hour timetable, the time available for backup and maintenance is decreasing. In many cases, backups simply take too long and disk-based solutions have to be overprovisioned, wasting capacity while incurring space, power, and cooling expense.

Backup needs to meet the time demands in three ways:

- Time to Backup: There is always pressure on any environment to complete backup within the predefined window. As environments scale, the pressure on backup increases, especially where organizations are moving to 24-hour operation.
- Time to Restore: Recovering data in a timely fashion is critical to keeping businesses operating. Should data become lost or corrupted due to hardware or software failures, it is imperative to return systems to operation in the minimum amount of time.
- Time to Replicate Off Site: Ensuring one or more off-site copies of backup data are generated within the predefined window is a critical requirement for business continuity. In the event of a disaster, it is imperative to leverage off-site copies of backup data to continue and recover business operations in the minimum amount of time.

Backup solutions need to adapt to cope with the increase in performance required to back up and replicate applications. To achieve the high bandwidth required for high-volume ProtecTIER workloads, more disk drives (in some cases filling complete data center racks) may have to be deployed than are needed to store the data. This results in wasted capacity, plus additional environmental expenses for space, power, and cooling.

Delivering high-performance backup is all about achieving high throughput. In a ProtecTIER system the deduplication process operates in-line (or in real time) to achieve best performance, generating both read and write activity. When data is restored from ProtecTIER to the primary system, the contents of the backup are read from storage, which, due to the nature of deduplication, results in highly random I/O.
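To make the link between throughput and the backup window concrete, the following minimal sketch estimates how long a job takes at a sustained rate. It assumes decimal units (1 TB = 1,000,000 MB) and that the rate applies to the data as the backup application sends it. The 50 TB dataset and 8-hour window are hypothetical; 2700 MB/s and 3600 MB/s are the sustained ProtecTIER figures quoted in this paper.

```python
def transfer_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to move data_tb terabytes at throughput_mb_s (decimal units)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    megabytes = data_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return megabytes / throughput_mb_s / 3600


def fits_window(data_tb: float, throughput_mb_s: float, window_hours: float) -> bool:
    """True if the transfer completes within the available window."""
    return transfer_hours(data_tb, throughput_mb_s) <= window_hours


if __name__ == "__main__":
    dataset_tb = 50    # hypothetical nightly backup size
    window_hours = 8   # hypothetical backup window
    for label, rate in (("backup", 2700), ("restore", 3600)):
        hours = transfer_hours(dataset_tb, rate)
        print(f"{label}: {hours:.2f} h at {rate} MB/s, "
              f"fits {window_hours} h window: {fits_window(dataset_tb, rate, window_hours)}")
```

The same calculation works in reverse: dividing the dataset size by the window gives the minimum sustained throughput the storage behind the backup system has to deliver.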
The Solution with IBM FlashSystem

Businesses that have invested to optimize their backup performance need to ensure that backup is delivered through the most efficient and state-of-the-art technology available. FlashSystem arrays, which include MicroLatency, provide extreme performance with extremely low latency, making them particularly suitable for economical deployment with ProtecTIER.

FlashSystem storage can be used with ProtecTIER in a number of configurations to suit the requirements of the customer. These provide a balance between price and performance, depending on the capability required.

Flash for Metadata

FlashSystem can be deployed to host the metadata for one or more ProtecTIER systems. Metadata is constantly accessed to store new content and retrieve backup data for delivery to the host. The user data can remain on spinning disk if that provides sufficient throughput. In some scenarios, it may be more cost effective to share FlashSystem between two or more ProtecTIER clusters, enabling each to take advantage of metadata performance improvements.

Flash for Metadata and User Data

FlashSystem can be used to host the entire ProtecTIER environment, covering both system metadata and user data. This scenario would be beneficial where a high volume of single-stream backup and restore is required. ProtecTIER with FlashSystem reduces the wasted space and expense seen with deploying large numbers of hard disks just to meet the throughput requirements of high-performance ProtecTIER environments.

One other benefit of using FlashSystem for the data repository is the ability to encrypt data at rest on the array. If data is encrypted before being sent to ProtecTIER, then any deduplication benefits may not be realized. Encryption once the data has been deduplicated is therefore much more desirable.

The use of flash storage in the backup infrastructure can also be extended to the backup software itself. Using FlashSystem to manage the backup application database and other performance-critical components provides another boost to backup throughput and performance. Using FlashSystem for both metadata and user data leads to additional savings and will not degrade performance of FlashSystem arrays.
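The point above about encrypting data before it reaches the deduplication engine can be illustrated with a toy model. This sketch makes several assumptions: it uses fixed-size chunks and SHA-256 fingerprints, which is a generic textbook approach and not HyperFactor (this paper describes ProtecTIER's deduplication as non-hash-based), and `toy_encrypt` is a hypothetical stand-in for a real cipher that must never be used for actual encryption.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative chunk size, an assumption for this sketch


def toy_encrypt(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. Stand-in for a cipher; NOT secure."""
    pieces = []
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        block = data[start:start + 32]
        pieces.append(bytes(a ^ b for a, b in zip(block, pad)))
    return b"".join(pieces)


def dedup_ratio(streams) -> float:
    """Total chunks written divided by unique chunks stored, across all streams."""
    seen = set()
    total = 0
    for stream in streams:
        for offset in range(0, len(stream), CHUNK_SIZE):
            seen.add(hashlib.sha256(stream[offset:offset + CHUNK_SIZE]).digest())
            total += 1
    return total / len(seen)


def sample_backup() -> bytes:
    """64 KiB of deterministic pseudo-random data with no internally repeated chunks."""
    return b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2048))


def demo(copies: int = 5):
    """Compare repeated plaintext backups with backups encrypted under fresh nonces."""
    backup = sample_backup()
    key = b"hypothetical-key"
    plain = [backup] * copies
    encrypted = [toy_encrypt(backup, key, n.to_bytes(8, "big")) for n in range(copies)]
    return dedup_ratio(plain), dedup_ratio(encrypted)


if __name__ == "__main__":
    print(demo())  # (5.0, 1.0)
```

Five identical plaintext backups reduce to one stored copy, while the same backups encrypted with a different nonce each time share no chunks at all. Encrypting on the array after deduplication avoids this loss.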
Proof Points

IBM has created and certified a reference design for ProtecTIER using FlashSystem to demonstrate the cost savings when using flash to deliver storage performance.

Flash for Metadata and User Data

The solution design deploys both the metadata and user data onto IBM FlashSystem 840 and compares it with a disk system that has sufficient hard disks to meet the performance requirements.

[Figure 1 chart: "Metadata and User Data - Price/Performance, FlashSystem 840 vs Disk". FlashSystem 840 (36 TB usable): Back up $178, Restore $139. Disk (18 trays, 187 TB usable): Back up $359, Restore $281. Annotation: 50% Reduction.]

Figure 1: Metadata and User Data Price/Performance FlashSystem vs. Disk

Both the flash and disk-based solutions provide comparable levels of throughput performance (2500 MB/s backup and 3200 MB/s restore). The results show FlashSystem 840 provides a much lower price point (a 50% price per performance improvement) for both backup and restore compared to an all-disk solution.

FlashSystem also delivers significant savings in power consumption and data center footprint (18X reduction) over the disk-based configuration. Achieving this level of performance using spinning media requires 18 disk enclosures, totaling 432 disks (or 36U of space), compared to one FlashSystem enclosure (or 2U of space).
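The headline figures in this comparison can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this section. Pairing the lower dollar values with FlashSystem is a reading of the Figure 1 labels that agrees with the surrounding text, and the dictionary keys and function name are illustrative.

```python
# Values quoted in this section; dollar figures are the Figure 1 data labels.
DISK = {"enclosures": 18, "disks": 432, "rack_units": 36, "backup": 359, "restore": 281}
FLASH = {"enclosures": 1, "rack_units": 2, "backup": 178, "restore": 139}


def percent_reduction(before: float, after: float) -> float:
    """Percentage by which 'after' is lower than 'before'."""
    return (before - after) * 100 / before


footprint_ratio = DISK["rack_units"] / FLASH["rack_units"]   # 36U vs 2U
disks_per_enclosure = DISK["disks"] / DISK["enclosures"]     # 432 disks in 18 enclosures
backup_saving = percent_reduction(DISK["backup"], FLASH["backup"])
restore_saving = percent_reduction(DISK["restore"], FLASH["restore"])

if __name__ == "__main__":
    print(f"footprint reduction: {footprint_ratio:.0f}x")
    print(f"disks per enclosure: {disks_per_enclosure:.0f}")
    print(f"backup price/performance reduction: {backup_saving:.1f}%")
    print(f"restore price/performance reduction: {restore_saving:.1f}%")
```

Both reductions come out just above 50%, which matches the "50% Reduction" annotation, and 36U against 2U gives the 18X footprint figure.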
To view this data another way, the customer could deploy two FlashSystem 840 enclosures (4U, 72 TB usable capacity) and have comparable price/performance to 18 disk enclosures (36U, 187 TB usable capacity).

Flash for Metadata

The solution deploys the metadata on FlashSystem with the user data remaining on disk. Additionally, four Tivoli Storage Manager databases were moved from disk to FlashSystem. In this configuration, FlashSystem achieved higher performance than disk.

            FlashSystem (8 TB)    Disk (21 TB)
Back Up     2700 MB/s             2500 MB/s
Restore     3600 MB/s             3200 MB/s

Table 1: Metadata Database Deployment Performance FlashSystem and Disk

Faster performance combined with the economical price point of FlashSystem also resulted in a 10% price per performance improvement.
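The throughput difference in Table 1 and the two-enclosure comparison above can be expressed as percentages and ratios. This is a minimal sketch using only the quoted figures, with illustrative names. The 10% price per performance improvement cannot be re-derived here because the underlying prices are not given in this paper.

```python
def percent_gain(new: float, old: float) -> float:
    """Percentage by which 'new' exceeds 'old'."""
    return (new - old) * 100 / old


# Table 1: metadata on FlashSystem (8 TB) vs metadata on disk (21 TB), in MB/s.
backup_gain = percent_gain(2700, 2500)
restore_gain = percent_gain(3600, 3200)

# Two FlashSystem 840 enclosures vs 18 disk enclosures.
flash_rack_units = 2 * 2     # two 2U enclosures
flash_usable_tb = 2 * 36     # 36 TB usable per enclosure
footprint_ratio = 36 / flash_rack_units

if __name__ == "__main__":
    print(f"backup throughput gain: {backup_gain}%")
    print(f"restore throughput gain: {restore_gain}%")
    print(f"two enclosures: {flash_rack_units}U, {flash_usable_tb} TB usable, "
          f"{footprint_ratio:.0f}x smaller than 36U of disk")
```

With metadata on flash, backup throughput is 8% higher and restore throughput 12.5% higher than the disk configuration, while two flash enclosures still occupy one ninth of the rack space of the 18 disk enclosures.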
Conclusion

Backup is an essential part of an efficient data protection strategy. To meet tight service level agreements, deduplicating backup solutions such as ProtecTIER offer a significant operational advantage over legacy tape systems. As the amount of data requiring backup continues to grow, the performance demands on backup continue to increase. Deploying ProtecTIER with FlashSystem is a cost-effective way to deliver high-performance backup and restore capabilities, enabling customers to meet their service level objectives.