Backup Appliances: When to Choose Them, When to Avoid Them Whitepaper WD Business Storage Solutions
Contents
Executive Summary
The Evolution of Backup - Traditional Strategies and the Causes of Data Loss
    The Backup Imperative
    Causes of Data Loss
    Types of Backup and Considerations
    Business Requirements, Deployment Preferences
Network Backups: Reducing Cost, Simplifying Management
PBBA Overview
Alternative Backup Methods
Choosing the Right Backup Appliance
    (1) Levels of Technical Expertise
    (2) Speed of Provisioning
    (3) Physical Security and Malware Isolation
    (4) Integration of Additional Services - Seed and Feed Technology
    (5) Fully-Engineered Solutions
    (6) Flexible Configuration
    (7) Technical Support Throats to Choke
When are PBBAs Not the Best Solution?
The Bottom Line
Summary and Conclusion
WD Arkeia Page 2 of 17
Executive Summary

From a small office environment to the most expansive of enterprises, backing up data is important simply because data is important. This white paper reviews the evolution of data backup technology following the migration from direct-to-tape to tape-to-disk and, more recently, to disk-to-cloud, and defines the genesis of purpose-built backup appliances (PBBAs) among the alternative backup methods of software and virtual appliances. It also explores the various trends leading to the emergence of PBBAs, as well as the dimensions used to evaluate these trends when determining if a PBBA makes sense for a network environment.

When it comes to network backup, there is no one-size-fits-all solution. Even the most ideal program implemented today can be outgrown as a company's needs change over time. Considering the variety and complexity of available hardware and software, a best practice is to engage a knowledgeable solution provider, capable of serving as a trusted partner to support a company as it examines the many secure and scalable options available to address its network backup requirements.
The Evolution of Backup - Traditional Strategies and the Causes of Data Loss

The Backup Imperative

The single largest cause of data loss is hardware failure, which accounts for about 40% of data loss (source: Kroll Ontrack). A combination of other factors (e.g., human error, software failure, malware and natural disasters) nevertheless leads to more data being lost overall (see Figure 1). Some hardware failure is due to disk drive malfunction, which RAID technology can protect against, but more than 50% of the data lost to hardware failure and all other contributing factors cannot be protected by simple solutions such as RAID.

Figure 1 Causes of Data Loss

Causes of Data Loss

The practice of backing up data involves much more than just creating a copy of one's files and folders. True data backup is a solution that allows the clock to be rolled back by establishing restore points, which offer the ability to recover data as it was at a specific date and time in the past. Restore points are invaluable when lost data needs to be recovered. For most companies, as well as individual consumers, having multiple restore points to access and recover data is a wise practice.

Types of Backup and Considerations

The benefits of restore points in data backup are twofold:
- They allow data to be restored from a specific day and time.
- They protect data from the dangers of failure, error or disaster.

There are different types of data backup, some created more than 50 years ago. Historically, backups were done to tape. Initially they were made on magnetic, nine-track open-reel setups with separate housings for the tape and take-up reels; these required manual mounting and usually needed human operators to change reels (see Figure 2). Later, tape cassettes that conveniently housed both reels inside the same mechanical device were introduced.

Figure 2 Tape Backups

Managed by autoloaders, libraries of cassette tapes paved the way for automation (see Figure 3). Dozens to hundreds to thousands of cassettes could easily be accessed and returned by robotic arms, offering a very efficient method to perform large data backups to large volumes of tape without requiring human intervention.

Figure 3 Cassette Tape Libraries
What Is Data Deduplication?

Data deduplication is a kind of global compression that looks for commonality across multiple files. Because common content only has to be stored once, content shared across multiple files can be dramatically compressed to reduce storage requirements. Global compression is distinct from local compression and its more familiar algorithms. Local compression strategies target the compression of individual files, and use algorithms like DEFLATE (i.e., "zip"), JPEG, or MPEG. Learn more about deduplication and how it works: http://www.arkeia.com/products/datasheets.

Backup to tape is still widely used today by the majority of medium and enterprise-sized businesses as their final backup target. It was less than a decade ago that backups to disk media, via hard disk drives (HDDs) (Figure 4), originally introduced in 1956 by IBM, really began to make sense. Fueled by the prevalence of personal computers emerging in the 1980s, costs associated with hard disk storage have steadily continued to drop and, as a result, backup to disk media has rapidly gained in popularity. The growing interest in using dedicated solutions such as purpose-built backup appliances (PBBAs) for backup is further evidence of the shift from tape to disk.

Figure 4 Hard Disk Drive

When evaluating a backup solution, there are a number of things to consider. First, you must decide on the preferred backup target: tape or disk? It is important to remember that, whether backups are made to tape or to disk, the safest method is one that includes a data backup that is replicated for off-site storage. PBBAs can be used for either kind of deployment.

Business Requirements, Deployment Preferences

Next, you'll need to decide how frequently your data is backed up. What makes the most sense for time and budget constraints: backups that are done hourly, daily or weekly? The more frequently data backup is performed and restore points are established, the less data is at risk of being lost. The disadvantages of high-frequency restore points are that they consume system processing and network resources, as well as additional storage to maintain the restore points.

The third issue to consider is the use of full, differential or incremental backup strategies. These different types of backups were created more than 50 years ago to accommodate earlier tape technology and the generally high associated cost of storage. With incremental backups, only the files that have changed since the most recent restore point are backed up, which reduces the amount of data that must be written to tape to establish a restore point.
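The difference between the three strategies comes down to which cutoff time is used when selecting files. The following is a minimal sketch, not any vendor's implementation. The function name `select_files`, the use of modification times, and the sample data are assumptions made for illustration; the differential rule (everything changed since the last full backup) is the conventional definition rather than one given in this paper.

```python
from typing import Dict, List


def select_files(mtimes: Dict[str, float], strategy: str,
                 last_full: float, last_restore_point: float) -> List[str]:
    """Return the files a backup run would copy under the given strategy.

    mtimes maps a file path to its last-modified time (hypothetical input).
    """
    if strategy == "full":
        cutoff = float("-inf")          # copy everything
    elif strategy == "differential":
        cutoff = last_full              # changed since the last full backup
    elif strategy == "incremental":
        cutoff = last_restore_point     # changed since the most recent restore point
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


# Hypothetical example: full backup at t=10, latest restore point at t=20.
sample = {"a.doc": 5.0, "b.xls": 15.0, "c.txt": 25.0}
full_set = select_files(sample, "full", 10.0, 20.0)
differential_set = select_files(sample, "differential", 10.0, 20.0)
incremental_set = select_files(sample, "incremental", 10.0, 20.0)
```

In this example the incremental run copies only `c.txt`, while the differential run also re-copies `b.xls`, which is why incremental backups write the least data per restore point.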
Gartner Group Predictions

- Through 2015, disk-to-disk-to-tape (D2D2T) backup will remain the predominant strategy for large enterprises.
- By the end of 2016, 40% of large enterprises, up from 20% at year-end 2012, will have eliminated tape for operational recovery.
- Between 2012 and 2016, one-third of organizations will change backup vendors due to frustration over cost, complexity and/or capability.
- By 2016, at least 20% of large enterprises will abandon backup/recovery solutions using a traditional methodology and adopt those that employ only snapshot and replication techniques, up from less than 7% today.

Gartner Group, Strategic Planning Assumptions/Magic Quadrant paper, June 2013.

Incremental backups make it possible to keep many more restore points in the same volume of data than would otherwise be possible. The downside of performing incremental backups, in the case of tape, is that it can slow the recovery process. This occurs because, rather than restoring from tapes that hold only the data to be recovered, you may have to mount many different tapes, taking only a small amount of data off each tape in the data restoration process.

Today, when using backup to disk, the concept of full, differential or incremental backups is almost no longer relevant, since deduplication technologies make every backup effectively an incremental backup in terms of volume. When it comes to backup time, however, relying on incremental as a backup strategy still makes sense, because an incremental backup selects data at the file level, whereas deduplication is done at the block level.

The final point to consider when deploying a backup solution is where the backed-up data will physically be stored. Some businesses still do backups and backup storage on-site. This practice is risky. It cannot be stressed strongly enough that a backup cannot be depended upon unless it is stored off-site. If a workplace or business is destroyed due to natural disaster, theft or even simple human error, no one wants to lose the backup data at the same time as vital production data is lost.

Network Backups: Reducing Cost, Simplifying Management

Another technology that has significantly impacted data backup is networking. The industry is moving away from the original standard of using one tape device to back up a single computer. This one-to-one pairing only made sense when computers themselves were expensive. Over the years, as computers became cheaper and proliferated while mechanically complex tape drives did not, a new method was needed to reduce the cost of backing up the expanding number of machines. The answer was provided by network technology. A single backup server could now connect to hundreds of machines and access the data from all of the systems on the network during the backup process, storing it on a single target. The purpose of network backup is both to reduce the cost of performing backups and to simplify their management by limiting deployments to a single backup server per local area network (LAN).
The trends mentioned earlier have driven purpose-built backup appliance (PBBA) adoption, with the transition from tape to disk in particular being an essential element in their increasing favor among small and mid-sized businesses. Purpose-built backup appliances exist today because hard drives have become both efficient and effective targets for backups. All physical PBBAs have hard disk drives in them; some have both hard drives and tape drives inside.

Changing business requirements are also directly driving PBBA adoption. In the past, most companies had only a single data center, with all backups performed there by a dedicated, skilled IT staff in that one location. Today, most companies have multiple offices where data is created, and each of these offices requires its own backup infrastructure. This is one of the significant motivations for using PBBAs to deliver backup solutions in environments that do not have heavy technical expertise readily available.

Another trend relevant to network backup and to other product categories is the increasing preference within IT organizations for purchasing appliances from a single vendor rather than building their own solutions. Buying an appliance from a single vendor makes a solution faster to deploy as well as simpler to support. It is also a reflection of the "One Throat to Choke" preference for keeping things accountable to a single vendor when issues arise.

An extra point in favor of PBBAs is the potential for additional features provided by the PBBA vendors. Companies such as WD, for instance, deliver certain PBBAs with network boot for bare metal recovery and network link aggregation.
IDC Tracks Continued Growth of PBBA Market

According to the International Data Corporation (IDC) Worldwide Quarterly Purpose-Built Backup Appliance Tracker, worldwide purpose-built backup appliance (PBBA) factory revenues posted a 6.7% year-over-year increase, totaling $713.5 million, in the third quarter of 2013 (3Q13). In addition, the total PBBA open systems market grew 6.9% year over year in 3Q13, with revenues totaling $622.7 million. Total worldwide PBBA capacity shipped reached 383,153 terabytes, growing a scant 2.4% year over year.

"The total worldwide PBBA market continued to experience growth despite a challenging macroeconomic environment," said Robert Amatruda, Research Director, Data Protection and Recovery. "PBBA solutions are being embraced by customers of all sizes as a turnkey, simplified approach to data protection and recovery."

IDC defines a purpose-built backup appliance (PBBA) as a standalone disk-based solution that utilizes software, disk arrays, server engine(s), or nodes that are used for a target for backup data and specifically data coming from a backup application (e.g., NetWorker, NetBackup, TSM, and Backup Exec) or can be tightly integrated with the backup software to catalog, index, schedule, and perform data movement. The PBBA products are deployed in standalone configurations or as gateways. PBBA solutions deployed in a gateway configuration connect to and store backup data on general-purpose storage. Here, the gateway device is serving as the component that is purpose built solely for backup and not for supporting any other workload or application. Regardless of packaging (as an appliance or gateway), PBBAs can have multiple interfaces or protocols. Also, PBBAs often can provide and receive replication to or from remote sites and a secondary PBBA for the purpose of disaster recovery (DR).

(source: IDC press release, 19 Dec 2013)

PBBA Overview

The term purpose-built backup appliance, or PBBA, was coined several years ago by International Data Corporation (IDC) analyst Robert Amatruda, based on his observation that these unique types of backup appliances were not merely an anomaly but were becoming a product category in and of themselves. The first PBBA was delivered in 2006 by Avamar Technologies, which today is part of EMC. The second PBBA was delivered in 2007 by Arkeia, now owned by WD, and since then many more companies have started to deliver these types of network backup solutions.

Taking a closer look, what defines purpose-built backup appliances and how are they used? Purpose-built backup appliances (PBBAs) are stand-alone, disk-based backup servers that use disk arrays, software, and server engines/nodes as integrated storage targets for backup data and replicated backup data. Features include data compression, deduplication, encryption, remote replication and interfaces for support.

PBBAs are always connected to a network in order to access the machines they are to protect, and well-designed backup appliances preclude the running of other workloads or additional software. On one hand this can be seen as a limitation of PBBAs, but it is also the reason why they are such reliable backup servers.

PBBAs are significant because their hardware is fine-tuned expressly for one application: data backup. All of the components in a PBBA, from connectivity, to disk storage, to the CPU necessary for deduplication processing, to the tape drives, hard disk drives or SSDs, are engineered to deliver a complete and well-balanced backup server. It is important to note that, while they may appear to be the same, PBBAs are not all-purpose servers, and users are generally precluded from installing custom applications on them.
- PBBAs can be deployed as standalone, self-contained systems or function as gateways connecting to shared storage such as a storage area network (SAN).
- Some PBBAs, such as those produced by WD Arkeia, offer integrated tape drives that allow businesses to take the backed-up data off-site at the completion of the backup process.
- Some PBBAs incorporate SSDs to store the backup catalog, which enhances deduplication performance.
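Because deduplication is central to how disk-based appliances keep storage requirements down, a small illustration may help. The following is a minimal sketch of block-level deduplication using fixed-size blocks and SHA-256 fingerprints. It is an assumption-laden teaching example, not WD Arkeia's Progressive Deduplication or any other product's algorithm; the function names, the in-memory dictionaries and the tiny block size are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def dedup_store(files: Dict[str, bytes], block_size: int = 4096
                ) -> Tuple[Dict[str, bytes], Dict[str, List[str]]]:
    """Split each file into fixed-size blocks and store each unique block once.

    Returns (store, recipes): store maps a block fingerprint to the block
    bytes; recipes maps a file name to the ordered fingerprints needed to
    rebuild it.
    """
    store: Dict[str, bytes] = {}
    recipes: Dict[str, List[str]] = {}
    for name, data in files.items():
        recipe = []
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            store.setdefault(fingerprint, block)   # keep only the first copy
            recipe.append(fingerprint)
        recipes[name] = recipe
    return store, recipes


def restore(name: str, store: Dict[str, bytes],
            recipes: Dict[str, List[str]]) -> bytes:
    """Rebuild a file by concatenating its blocks in recipe order."""
    return b"".join(store[fingerprint] for fingerprint in recipes[name])


# Hypothetical example: two files that share their first two blocks.
sample_files = {"a": b"AAAABBBBCCCC", "b": b"AAAABBBBDDDD"}
store, recipes = dedup_store(sample_files, block_size=4)
logical_bytes = sum(len(data) for data in sample_files.values())
stored_bytes = sum(len(block) for block in store.values())
```

Here 24 bytes of file content are held in 16 bytes of unique blocks. The fingerprint lookup plays the role of the backup catalog, which is why placing the catalog on fast storage such as an SSD can help deduplication performance.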
Alternative Backup Methods

Other than PBBAs, there are two additional modes whereby backup servers can be effectively deployed in a network:
- Software Application (SW)
- Virtual Appliance (VA)

Because PBBAs have only existed since 2006, the most universal solution prior to the use of PBBAs was backup using a software application. With software backups, a company would typically license the backup software, buy the necessary hardware, provide and configure an operating system (OS) and then install the backup server before being able to initiate the backup process.

Subsequent to PBBAs, another viable option is the delivery of backup servers as virtual appliances. Virtual appliances take the firmware that would otherwise exist within a PBBA, along with a pre-configured operating system, and deploy it on a hypervisor such as Microsoft Hyper-V, VMware vSphere, or another brand of hypervisor.

Depending on the network backup requirements and preferences of a particular business, each deployment method offers its own mix of features, benefits and limitations.
Choosing the Right Backup Appliance

There are seven essential areas to consider when determining how each of the three backup solutions matches up with the dimensions important in the decision-making process:
1. Levels of Technical Expertise
2. Speed of Provisioning
3. Physical Security and Malware Isolation
4. Integration of Additional Services
5. Fully-engineered Solutions
6. Flexible Configuration
7. Technical Support Throats to Choke

(1) Levels of Technical Expertise

Looking at levels of technical expertise, PBBAs rank well, requiring little expertise to set up, deploy and maintain. One can effectively plug them in, set an IP address, install the necessary agents, and start a backup. PBBAs are the easiest of the three, whereas a software application by comparison requires the most technical expertise to deploy. In establishing a software-based backup server, one has to identify and acquire the hardware and the OS, install the software, configure the storage, and install the agents, all before the process of backing up data can commence. Software backup applications require a lot of work even when there are teams present to support the deployment process. In the middle is the virtual appliance, requiring a bit more expertise than PBBAs but not as much as software.

Table 1 Required Levels of Expertise to Deploy

PBBA              Software Application (SW)   Virtual Appliance (VA)
Low               High                        Moderate
Plug it In        Install Software            Deploy on Hypervisor
Set IP Address    Install OS                  Configure Storage
Start Backup      Install Backup Software     Set IP Address
                  Configure Storage           Start Backup
                  Start Backup
(2) Speed of Provisioning

In the evaluation of the best backup method for deployment, speed of provisioning is another important factor. PBBAs take a little time and require a few physical steps to deploy, which makes them slower to deploy than VAs. Managed Service Providers (MSPs) have broadly adopted VAs because they can be deployed effectively on demand. This makes sense for data centers that have multiple LANs and very large numbers of distinct customers that need to be protected. Software is the slowest of the three methods to provision.

Table 2 Speed of Provisioning

PBBA                                     Software Application (SW)   Virtual Appliance (VA)
Moderate                                 Slowest                     Fastest
Fast but Requires Some Physical Steps    Manual Process              Can be Scripted
                                         Roll Your Own               Remote Deployments Possible

(3) Physical Security and Malware Isolation

Physical security and malware isolation are another critical consideration in backup deployment options. PBBAs rank high since they have the advantage of being on dedicated hardware running a hardened operating system. The term "hardened OS" reflects the fact that, when the OS is tuned for specific use in a PBBA, all of the services not necessary for a backup server are turned off at startup. Because those other services, such as the processes that listen on ports, are turned off, and because the hardware is physically separate, the machine and its data are much better protected than they would otherwise be.

The least secure are software applications, because they are deployed on shared hardware and most often on standard or existing operating systems that perform other functions outside of data backups. This makes them vulnerable to malware. Virtual appliances are in the middle, carrying some, but not all, of the risk.
Table 3 Security Levels and Malware Isolation

PBBA                        Software Application (SW)   Virtual Appliance (VA)
High                        Low                         Moderate
Shared/Dedicated Hardware   Shared/Dedicated Hardware   Shared Hardware
Hardened OS                 Standard OS                 Hardened OS
(4) Integration of Additional Services - Seed and Feed Technology

Getting backup sets off-site has traditionally been accomplished by using tapes on trucks. Seed and Feed is a technology offered by WD Arkeia to move backup sets off-site using physical media in the form of hard drives. When backups are made to hard drives instead of tape, backup sets are generally moved off-site over the wide area network (WAN), where one concern is bandwidth, which is limited and can be very expensive.

Seed and Feed lets a backup set be exported from one backup server and shipped, either via sneaker-net (on foot) or by way of a common carrier like FedEx or UPS, to a remote location. On arrival, the backup set is imported onto the backup server at that remote location. Seed and Feed uses USB-connected portable media as an alternative to using a WAN. This practice is very cost effective since large volumes of data can be moved quickly: while the latency is high, the bandwidth is also very high. Nightly backups can be sent over the network while big backups, like the first backup or those for disaster recovery, are done by sending USB-connected hard drives.

PBBAs are an obvious choice to implement Seed and Feed technology since both software applications and virtual appliances require significantly more effort to secure off-site delivery of backup data.

Table 4 Integration of Additional Services

PBBA                                        Software Application (SW)                             Virtual Appliance (VA)
Seed and Feed (Ideally-Suited)              Some Effort to Implement                              Some Effort to Implement
USB Ports fully managed automatically       Based on the OS, without integration in the product
Network boot mode for Bare Metal Recovery
Network link aggregation
RAID management

(5) Fully-Engineered Solutions

Distinct from both software and virtual appliance backup applications, PBBAs offer an out-of-the-box backup solution that can be up and running in a matter of hours. In addition to having system components specifically tested and certified for reliable network backup, purpose-built backup appliances have the advantage of using a hardened operating system, meaning that only the elements needed to do specific backup and recovery functions are enabled. Both software applications and virtual appliances have to work with operating systems shared among a variety of components that perform other tasks. This can limit and sometimes adversely impact backup performance.
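The Seed and Feed argument in section (4), high latency but very high bandwidth, can be made concrete with some simple arithmetic. The figures below (a 2 TB backup set, a 10 Mbit/s WAN link and a 24-hour courier) are assumptions chosen for illustration, not measurements from this paper, and the sketch ignores the time needed to copy data onto and off the portable drive.

```python
def wan_transfer_hours(size_bytes: int, link_bits_per_second: float) -> float:
    """Hours needed to push a backup set over a WAN link at full line rate."""
    return size_bytes * 8 / link_bits_per_second / 3600


def shipped_throughput_mbps(size_bytes: int, transit_hours: float) -> float:
    """Effective throughput, in Mbit/s, of shipping the same data on a drive."""
    return size_bytes * 8 / (transit_hours * 3600) / 1e6


# Hypothetical inputs: 2 TB backup set, 10 Mbit/s uplink, 24-hour courier.
BACKUP_SET_BYTES = 2 * 10**12
WAN_BITS_PER_SECOND = 10 * 10**6
COURIER_HOURS = 24

wan_hours = wan_transfer_hours(BACKUP_SET_BYTES, WAN_BITS_PER_SECOND)
courier_mbps = shipped_throughput_mbps(BACKUP_SET_BYTES, COURIER_HOURS)

print(f"WAN transfer: {wan_hours:.1f} hours")
print(f"Courier equivalent: {courier_mbps:.0f} Mbit/s")
```

Under these assumptions the WAN transfer takes roughly 444 hours (about 18.5 days), while the shipped drive is equivalent to a link of about 185 Mbit/s. This is why large first backups are good candidates for physical shipment, while small nightly changes can still travel over the network.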
Table 5 Fully-Engineered Solutions

PBBA / Software Application (SW) / Virtual Appliance (VA): Yes / No / No
Hardened OS: Build Your Own / Build Your Own
Enterprise-grade Components: Yes / Build Your Own
SSD: Yes / Build Your Own
Redundant Everything: Yes / Build Your Own
Ready for Tape Device: Yes / Build Your Own
Tested and Certified System Components: Yes

(6) Flexible Configuration

Another important criterion when evaluating PBBAs is the flexibility of configuration. To truly customize a backup solution, using a software application is the best way to go since it offers the greatest amount of flexibility.

Table 6 Configuration Flexibility

PBBA                             Software Application (SW)              Virtual Appliance (VA)
Low                              High                                   Moderate
Pre-configured Appliances        Infinite Flexibility                   Less Flexible than Software
Limited Customization Possible                                          Greater Flexibility than PBBA
Licenses per PBBA                Different Hardware with Same License
(7) Technical Support Throats to Choke

The final consideration in selecting a PBBA is a business preference for the number of vendors to address when issues arise. Choosing PBBAs offers the simplest route since, with their deployment, a single vendor provides the total backup solution and is therefore the single vendor to contact if issues arise. Virtual appliances involve several vendors in the backup solution, so contact with a few different companies will be necessary if a company requires support. When using a software application for backup, many vendors play different roles to arrive at the backup solution. As a result, there can be a variety of vendor throats to choke when a company is experiencing problems and needs technical help.

Table 7 Tech Support Vendors to Access

PBBA                                 Software Application (SW)   Virtual Appliance (VA)
One (Hardware, OS and Application)   Many                        Several

Table 8 Backup Solution Summary

Criteria                             PBBA       Software Application   Virtual Appliance (VA)
Required Tech Expertise              Low        High                   Moderate
Speed of Provisioning                Moderate   Slow                   Fastest
Physical Security and Malware        High       Low                    Moderate
Integration of Additional Services   High       Low                    Moderate
Fully Engineered Solution            Yes        No                     No
Flexible Configuration               Low        High                   Moderate
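For readers who want to apply the summary in Table 8 to their own requirements, the table can be treated as a simple lookup. The following is a minimal sketch under that assumption: the ratings are copied from Table 8, while the `candidates` function and the example requirements are hypothetical and only illustrate one way to shortlist options.

```python
from typing import Dict, List, Set

# Ratings copied from Table 8 (Backup Solution Summary).
SUMMARY: Dict[str, Dict[str, str]] = {
    "Required Tech Expertise":            {"PBBA": "Low",      "SW": "High", "VA": "Moderate"},
    "Speed of Provisioning":              {"PBBA": "Moderate", "SW": "Slow", "VA": "Fastest"},
    "Physical Security and Malware":      {"PBBA": "High",     "SW": "Low",  "VA": "Moderate"},
    "Integration of Additional Services": {"PBBA": "High",     "SW": "Low",  "VA": "Moderate"},
    "Fully Engineered Solution":          {"PBBA": "Yes",      "SW": "No",   "VA": "No"},
    "Flexible Configuration":             {"PBBA": "Low",      "SW": "High", "VA": "Moderate"},
}


def candidates(requirements: Dict[str, Set[str]]) -> List[str]:
    """Return the options whose rating is acceptable for every listed criterion.

    requirements maps a Table 8 criterion to the set of acceptable ratings.
    """
    options = ["PBBA", "SW", "VA"]
    return [
        option for option in options
        if all(SUMMARY[criterion][option] in allowed
               for criterion, allowed in requirements.items())
    ]


# Hypothetical example: limited in-house expertise, but some flexibility needed.
shortlist = candidates({
    "Required Tech Expertise": {"Low", "Moderate"},
    "Flexible Configuration": {"Moderate", "High"},
})
```

With these example requirements the shortlist contains only the virtual appliance. A business that instead requires a fully engineered solution would be left with the PBBA alone.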
When are PBBAs Not the Best Solution?

Businesses requiring a very high level of customization might find that PBBAs are not the best method of network data backup deployment. For example, some larger deployments might want to integrate the backup solution within a more global storage solution. By using the command-line interface to build scripts, it is possible to embed a backup engine, such as the one available from WD Arkeia, into such a solution.

Companies with rapidly growing volumes of data that are unsure about the technologies they will rely on in the future for their IT infrastructure might prefer to consider a software deployment, which allows them the flexibility of moving backup licenses from one hardware platform to another.

Another consideration when deploying a PBBA is cost. Typically, PBBAs carry a much higher upfront cost, since they are both a hardware and a software solution. Budget-conscious buyers who already have available hardware may find a software deployment a much more cost-effective solution.
The Bottom Line

When evaluating different options for network backup, striking a balance between budget and capabilities is key. Using PBBAs can offer meaningful savings of both time and money, and the built-in feature sets streamline the backup process. This is demonstrated by the improved productivity of IT staff, who do not have to spend additional hours on data backup and recovery, and of end users, who no longer have to wait as long during data recovery. PBBAs offer secure, affordable network backup even for companies with limited technical expertise.
Summary and Conclusion

Rather than reinventing the wheel when it comes to network backup, purpose-built backup appliances are the best choice for companies that want solid backup functionality from a fully-engineered, out-of-the-box solution with Seed and Feed functionality that offers malware isolation and maximum security. This is especially true for companies that do not have advanced IT staff always on-site or accessible to handle issues that arise during backup, restore or disaster recovery situations.

When evaluating the plethora of available network backup options on the market today, it is important to think about current requirements balanced with what might be needed for backup in the future. For example, a small company taking advantage of a PBBA right now for its data backup might want to delve into virtualization in the near future, in which case moving to a virtual appliance model for backup could be advisable. Or, as a company grows and diversifies, it might desire a greater level of customization, and when this happens, moving to a software-based backup solution could be a better choice.

The key in determining the best backup solution is selecting a partner that excels in providing service and support in all three options: purpose-built backup appliances, software backup, and virtual appliance backup. Companies such as Western Digital, with the variety of options from WD Arkeia, offer backup solutions that range from PBBAs to servers that use software applications and an array of virtual appliance backup methods, which can be optimized to fit the backup deployment needs and budgetary requirements unique to each business today and in the future.

For service and literature:
http://www.arkeia.com/r/support
http://www.arkeia.com/r/wiki

Western Digital, WD, and the WD logo are registered trademarks of Western Digital Technologies, Inc. in the U.S. and other countries, and WD Arkeia and Progressive Deduplication are trademarks of Western Digital Technologies, Inc. in the U.S. and other countries. Other marks may be mentioned herein that belong to other companies. Product specifications subject to change without notice. © 2014 Western Digital Technologies, Inc. All rights reserved.

2579-800082-A00 May 2014