Fast and Simple Disaster Recovery with Syncsort ExpressDR White Paper
Introduction

Meltdown! One of your servers, a critical component of your business, has just failed. Permanently. Your files are lost. Your applications are history. Your data is gone. Your operating system is irreparably damaged. What do you do when the root volume or system drive of a server or workstation crashes?

Traditional approaches to disaster recovery include manual reinstallation, recovery through operating system utilities, and bare metal recovery from tape. This paper introduces a new end-to-end, enterprise-wide strategy for Windows and UNIX environments that:

- maximizes and simplifies continuity of business (COB) in the event of server failure or catastrophic site failure.
- optimizes data integrity and consistency through frequent backups and recoveries to past points in time.
- provides powerful and effective tools for maintenance and activation of disaster recovery (DR) sites.
- maintains scalability while meeting continuity of operations planning requirements for enterprises of all sizes.
- simultaneously supports routine backup and restore obligations.

Traditional Disaster Recovery Approaches

The importance of comprehensive data protection and disaster recovery planning is a given. Some of the pitfalls of inadequate preparation include unmanageable recoveries, misplaced software CDs and other source media, recovered files that are not current, omitted application patches and upgrades, prolonged business interruption, and a frustrated, over-extended labor force. An examination of the traditional disaster recovery approaches highlights many of these snares.

Manual Reinstallation

Reinstallation is the most commonly applied approach to server disaster recovery.
To accomplish this, you need to replace the failed device or volume, install an operating system, search for software sources, reinstall each application and patch, configure system settings, build partitions, add drivers, and finally restore your data and files. With manual reinstallation, the frustration factor is high and the productivity loss is significant. If your failure is catastrophic and you lose multiple servers, a complete restoration using this approach might prove to be unattainable.

Operating System Utilities

Another widespread approach is restoration via an operating system utility. This method exposes you to the risk of significant data loss due to the amount of time between the backup and the server failure. System backups through operating system utilities tend to be executed infrequently for three reasons:

1. These backups require full backups each time and are therefore enormous consumers of bandwidth and storage space.
2. These backups often conflict with your regular data backup schedules.
3. In many cases the operator needs to boot into a command tool or otherwise manually initiate the system backup. This requirement is bothersome and disruptive and may need to be repeated for each machine being backed up.

As a result, upon recovery of your most recent system backup, you may find yourself significantly removed from your Recovery Point Objective (RPO). The optimal RPO is recovery as close as possible to the data state at the moment of failure. The consequences of restoration via operating system utility may be outdated system and security configurations, missing or fragmented applications, and files that are obsolete.

Bare Metal Recovery from Tape

Enterprises that are better prepared have a bare metal recovery from tape solution in place. This approach to system recovery involves laboriously mounting tapes and incrementally restoring and reassembling the former contents of your volume, which is still a brain-intensive and time-consuming process.

The New Strategy: Disk-to-Disk Disaster Recovery

Disk-to-disk disaster recovery is a fresh approach to data protection that leverages state-of-the-art hardware and software innovations. It is simple to use, is lightning-fast, and contains none of the hazards of the traditional approaches.
Server recovery, including system recovery, is hassle-free, provides nearly uninterrupted business continuity, and results in restored files, applications, and configurations that are extremely close to the data state as of the moment of failure. This simple yet comprehensive strategy, developed by Syncsort Incorporated, contains the following unique and highly beneficial backup and recovery attributes:

For the backup process:

- frequent, scheduled node backups to disk that do not interfere with other operations.
- perpetual block-level incremental backups after the initial image-based base backup.
- disaster recovery backups that are integrated into your regular data backup scheme.

For the recovery process:

- an extremely simple interface for initiating the bare metal recovery.
- a very fast one-step recovery mechanism.
- the flexibility to initially recover only the most critical elements to save time.
- the ability to recover any past backup, which is a necessity in the event of server, operating system, application, or data corruption.
- the ability to continue working during almost the entire recovery process.

Key Tactical Components

Several key components contribute to this disk-to-disk disaster recovery strategy. Among them are Backup Express, Syncsort AdvancedClient technologies, high capacity disk storage, and ExpressDR. Straightforward integration of these components provides enterprise-wide COB solutions for even the most complex Windows, UNIX, and mixed environments.

Backup Express

Backup Express is Syncsort's high-performance enterprise data protection solution. It integrates heterogeneous snapshot, image, and rapid recovery options, and controls backup and restore operations for the entire enterprise environment with a single catalog and browser-based GUI.

AdvancedClient Technologies

Syncsort's AdvancedClient technologies enable block-level data transfer with file-level restore. The AdvancedClient agent accesses the source disk directly, bypassing the file system, and transfers data at nearly raw disk speed. More remarkably, after the initial base backup, AdvancedClient uses block-level incremental (BLI) backups, assuring that only the absolute minimum amount of data (i.e., the changed blocks of changed files) is transferred; block-level backups transfer up to 90% less data than conventional file-level backup. Over time, BLI results in significant storage savings when compared to traditional backup methods (see Figure 1).
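
The storage comparison described above can be explored with a rough model. The Python sketch below uses the assumptions printed in Figure 1 (12 TB starting size, 2% daily change, 4% monthly data growth) and compares a weekly-base, daily-incremental file-level scheme with block-level incrementals forever. Everything else is an assumption made for illustration: that every backup is retained for the whole period, that a file-level incremental copies the full daily change while a block-level incremental copies one tenth of it (the "up to 90% less" upper bound quoted above), and the function and parameter names. It is not Syncsort's model and is not expected to reproduce the figure's exact values.

```python
def data_size_tb(day, start_tb=12.0, monthly_growth=0.04):
    """Size of the protected data on a given day (day 1 = start_tb)."""
    return start_tb * (1.0 + monthly_growth) ** ((day - 1) / 30.0)


def file_level_total_tb(days, daily_change=0.02, start_tb=12.0, monthly_growth=0.04):
    """Cumulative storage for a weekly base plus daily file-level incrementals.

    Assumption: all backups are kept, and each incremental stores
    daily_change of the current data size.
    """
    total = 0.0
    for day in range(1, days + 1):
        size = data_size_tb(day, start_tb, monthly_growth)
        if (day - 1) % 7 == 0:
            total += size                 # weekly base (full) backup
        else:
            total += daily_change * size  # daily file-level incremental
    return total


def block_level_total_tb(days, daily_change=0.02, block_ratio=0.1,
                         start_tb=12.0, monthly_growth=0.04):
    """Cumulative storage for one base backup plus block-level incrementals forever.

    Assumption: block_ratio is the fraction of a file-level incremental
    that a block-level incremental actually stores (0.1 = "90% less").
    """
    total = data_size_tb(1, start_tb, monthly_growth)  # single base backup
    for day in range(2, days + 1):
        size = data_size_tb(day, start_tb, monthly_growth)
        total += daily_change * block_ratio * size
    return total


if __name__ == "__main__":
    for days in (1, 7, 30, 365):
        print(days,
              round(file_level_total_tb(days), 1),
              round(block_level_total_tb(days), 1))
```

Changing `block_ratio` or the retention assumption shifts the absolute numbers considerably, so the sketch is best read as a way to see why repeated base backups dominate storage growth rather than as a sizing tool.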

[Figure 1: Storage Savings - Block Level Incremental Versus Traditional Methods. Backup storage growth model comparing file-level backup (weekly base, daily incremental) with block-level incrementals forever. Model assumptions: 12 TB starting, 2% daily change, 4% monthly data growth. Horizontal axis: Day 1, Day 7, Day 30, Day 365. Labeled values: 907 TB and 42 TB.]

High Capacity Disk Storage

Economical high capacity disk-based storage provides, among other things, rapid disk-based access to reference data. With the Syncsort solution, you can choose storage disk arrays that are precisely tailored to your needs. Depending on the importance of the data or applications being backed up, you can choose high-quality, medium-grade, generic, or JBOD storage, or a combination. And you are not tied to any specific disk manufacturer.

To optimally match your organization's needs and infrastructure, two variations of the Backup Express solution are available: Express Recovery Server (XRS) and Advanced Protection Manager (APM). XRS is optimally deployed with Windows environments, UNIX environments, or mixed environments. It uses Windows 2003 x64 servers as its destination host, allowing any direct, SAN, or iSCSI-attached disk array to be used for storage. APM, currently deployed with Windows only, utilizes Network Appliance Data ONTAP systems as destination media.

ExpressDR

ExpressDR, available from Syncsort with Backup Express, is a high-performance bare metal recovery product which, when compared to conventional disaster recovery techniques, dramatically simplifies both the backup and recovery process. ExpressDR eliminates the need for tapes, for distinct disaster recovery backups, for system reboots at backup, and for manual reinstallation at recovery.

To take advantage of ExpressDR, you use Backup Express to perform regular AdvancedClient backups of your Windows or UNIX nodes to a high-capacity disk-based storage host. Then, if the need to perform an ExpressDR recovery arises, simply boot up an adequately sized "bare machine" by using the ExpressDR CD provided with Backup Express, provide minimal information about the backup job and storage location, and select the backup instance you wish to recover (see Figure 2).

1. Use Backup Express to perform frequent node-level AdvancedClient (XRS or APM) backups. (Routine block-level incremental backup of the entire node.)
2. To start system recovery, boot up a bare machine by inserting the ExpressDR CD.
3. Provide target and destination information in response to the ExpressDR recovery dialog. (Figure callouts: Backup Location; Restore Target; Backup Instance Selected for Recovery; Click Next for Complete Recovery.)

Figure 2: ExpressDR - How It Works in Three Easy Steps

ExpressDR recovers your operating system, system and security configurations, and complete point-in-time backed up files and data to the bare machine. Unlike other disaster recovery products, most requisite network, SCSI, and hardware drivers are installed automatically on the bare machine. In the final step of an ExpressDR recovery, ExpressDR applies configuration changes and reboots the machine. Once rebooted, the machine contains your recovered environment, operating system, applications, and data.

Critical Server Recovery: An Illustration

[Figure 3: ExpressDR - Fast Bare Metal Recovery. Timeline with columns for time and event, action, and disk storage (virtual volume image):
11:00 a.m.: 1. Block-level incremental backup of node (routine BLI snapshot).
12:00 p.m.: Node failure.
1:00 p.m. - 1:10 p.m.: New hardware availability; recovery begins. 2. Boot machine from ExpressDR CD.
1:10 p.m. - 1:30 p.m.: Recovery continues. 3. Recover data from virtual volume image (e.g., 50 GB @ 150 GB/hr).
1:30 p.m.: Recovery completed. Business as usual.]

To illustrate a critical server recovery, Figure 3 depicts a complete disaster recovery scenario for a single server using ExpressDR. The following is a description of the timeline:

- At 11:00 a.m., Backup Express performs a routine block-level incremental backup to disk-based storage.
- At noon, the source server fails.
- At 1:00 p.m., replacement hardware is obtained and at 1:10, recovery is initiated by booting up the bare machine with the ExpressDR CD. The ExpressDR recovery dialog prompts for minimal information and lists the snapshots on the disk storage available for selection.
- From 1:10 to 1:30, data including the operating system is transferred from the 11:00 a.m. virtual volume image on the disk storage.
- At 1:30, full recovery to the 11:00 a.m. state is completed. Business continues as usual.
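
The arithmetic behind this timeline is easy to check. The Python sketch below uses the figures from the illustration (50 GB restored at 150 GB per hour, last backup at 11:00 a.m., failure at noon, transfer starting at 1:10 p.m.). The function names and the calendar date are hypothetical, chosen only for the example, and the calculation assumes a constant transfer rate with no boot or reconfiguration overhead.

```python
from datetime import datetime, timedelta


def transfer_time(size_gb, rate_gb_per_hour):
    """Time needed to move size_gb at a constant rate (assumed, no overhead)."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return timedelta(minutes=size_gb * 60 / rate_gb_per_hour)


def recovery_complete_at(transfer_start, size_gb, rate_gb_per_hour):
    """Clock time at which the data transfer finishes."""
    return transfer_start + transfer_time(size_gb, rate_gb_per_hour)


def data_loss_window(last_backup, failure):
    """Work at risk: everything changed between the last backup and the failure."""
    return failure - last_backup


if __name__ == "__main__":
    day = dict(year=2005, month=1, day=3)  # arbitrary date for the example
    last_backup = datetime(hour=11, minute=0, **day)
    failure = datetime(hour=12, minute=0, **day)
    transfer_start = datetime(hour=13, minute=10, **day)

    print("Transfer time:", transfer_time(50, 150))
    print("Recovery completes:", recovery_complete_at(transfer_start, 50, 150).time())
    print("Data loss window:", data_loss_window(last_backup, failure))
```

With hourly backups, the data loss window in this scenario can never exceed one hour, which is the gap between the recovered state and the moment of failure that the RPO discussion earlier in the paper is concerned with.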

Meeting Enterprise Objectives by Integrating Components

When used with other key components of this disaster recovery strategy, ExpressDR can achieve much more than straightforward server recovery. Following are several objectives that can be met with this overall strategy.

Objective 1. Maximize and simplify continuity of business (COB) in the event of server failure or catastrophic site failure.

Syncsort's disk-to-disk disaster recovery strategy provides both the flexibility to initially recover only the most critical elements in order to save time and the ability to continue working during as much of the recovery process as possible.

Suppose you have just suffered a hit to your server. You immediately assess the trade-off between recovery time and recovery completeness. If you have the luxury of time, you'll rapidly begin to restore your entire server in the manner illustrated in Figure 3. But if your recovery time window is short, the system drive, boot drive, and operating system are your priorities. The applications and data can be restored later when the frenzy has dissipated. ExpressDR offers this flexibility. You accomplish it through simple selection in the ExpressDR recovery dialog. ExpressDR ensures that critical elements, such as the operating system, boot drive, system drive, and Backup Express catalog, are always recovered.

Now suppose you also need immediate full access to the data for critical production purposes. With this disk-to-disk solution, you can recover the minimum necessary elements for operation as described above by using ExpressDR, then simply map to your latest virtual volume image on an iSCSI-connected disk device server. You have instant access to your data without a data transfer. The Instant Availability feature of Backup Express provides this capability, rapidly enabling nearly seamless business continuation.

To complete the scenario, you later restore applications and data to an available functioning volume and, during a few moments of downtime, synchronize the day's changes with the restored data, unmap the backup snapshot, and then map the up-to-date reconstituted volume.

Despite critical server failure, this comprehensive solution allows you to first restore only the critical elements required for operation, have instant access to the backup snapshot on the destination device, and then, at a brief, convenient window of opportunity, resync all the data, including any data that has been modified since the initial ExpressDR recovery.

Objective 2. Optimize data integrity through frequent backups and recoveries to past points in time.

Successful server recovery hinges on complete and up-to-date backups. Critical servers and active workstations ought to be backed up many times each day at the node level. And your recovery application must have the ability to recover to past backups.

Due to space, bandwidth, and time limitations, multiple daily node backups are not generally performed with many backup products. The Syncsort solution overcomes this in a variety of ways:

- With the disk-to-disk approach, your node is backed up to a large disk-based unit. The complexities of tape storage are eliminated.
- Physical-level disk access at backup bypasses the file system for a significant increase in performance and minimal CPU impact.
- Syncsort XRS and APM use advanced snapshot technologies that enable all your applications to remain open and live during backup, eliminating business interruption and eradicating the need for a backup window.
- After the initial base backup of your node, incremental backups (which are both fast and small) are automatically performed. Except under exceptional circumstances, there is no need to ever do another base backup of the node.
- Syncsort XRS and APM back up only changed blocks (as opposed to changed files). The backup of just the small changes in a large file can be hundreds of times smaller and faster than traditional incremental backups.

For all these reasons, it is prudent and effortless to schedule Backup Express to back up your critical servers many times a day. In most cases, they can be backed up every hour. Then, at recovery, your reconstituted server is extremely current and up to date.

But suppose your server became corrupted before the last backup. With typical mirroring solutions, you could be in trouble because of the inability to recover to a previous point. Yet the Syncsort disk-to-disk disaster recovery solution allows you to easily revert to any past backup. This is accomplished without monopolizing space on the destination disk by exploiting groundbreaking technology. Through careful data management on the destination host, each block-level incremental node backup is stored and immediately virtualized with the previous node backups as a full volume image. You can select for recovery any of the incremental backups and it appears as a full base backup "snapshot" as of that point in time. The ExpressDR interface prompts you to choose from a list of backups, making the entire process effortless.

In contrast, many popular backup products waste precious minutes at restore time "reconstructing" the full volume image from the many incremental backups that were done. ExpressDR wastes no time reconstructing the full volume image.
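
The idea of presenting every incremental as a full volume image can be illustrated with a small data structure. The Python sketch below is a conceptual toy, not Syncsort's implementation: the class `VirtualImageStore`, its methods, and the use of an in-memory dictionary per backup are all assumptions made for the example. Each backup stores only its changed blocks, and a per-snapshot index built at backup time records which stored copy of each block belongs to that point in time, so no merge pass is needed at restore time.

```python
from typing import Dict, List


class VirtualImageStore:
    """Toy model: one base backup plus block-level incrementals,
    each exposed as a full point-in-time image through an index."""

    def __init__(self, base_blocks: Dict[int, bytes]) -> None:
        # Layer 0 is the base backup; later layers hold changed blocks only.
        self._layers: List[Dict[int, bytes]] = [dict(base_blocks)]
        # Index for snapshot N maps block number -> layer holding that block.
        self._indexes: List[Dict[int, int]] = [{n: 0 for n in base_blocks}]

    def add_incremental(self, changed_blocks: Dict[int, bytes]) -> int:
        """Store changed blocks and build the new snapshot's index immediately."""
        layer_no = len(self._layers)
        self._layers.append(dict(changed_blocks))
        index = dict(self._indexes[-1])      # start from the previous snapshot
        for block_no in changed_blocks:
            index[block_no] = layer_no       # point changed blocks at the new layer
        self._indexes.append(index)
        return layer_no                      # snapshot identifier

    def read_block(self, snapshot: int, block_no: int) -> bytes:
        """Read one block as it existed at the given snapshot."""
        layer_no = self._indexes[snapshot][block_no]
        return self._layers[layer_no][block_no]

    def full_image(self, snapshot: int) -> Dict[int, bytes]:
        """Materialize the complete volume image for a snapshot."""
        return {n: self.read_block(snapshot, n)
                for n in sorted(self._indexes[snapshot])}

    def stored_block_count(self) -> int:
        """Blocks physically stored across the base and all incrementals."""
        return sum(len(layer) for layer in self._layers)


if __name__ == "__main__":
    store = VirtualImageStore({0: b"boot", 1: b"app-v1", 2: b"data-v1"})
    first = store.add_incremental({2: b"data-v2"})
    second = store.add_incremental({1: b"app-v2"})
    print(store.full_image(first))
    print(store.full_image(second))
    print(store.stored_block_count())
```

In this toy, three point-in-time images of a three-block volume are available while only five blocks are stored, which is the space-saving effect the paragraph above describes.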

Further, Backup Express employs the industry's most sophisticated technologies to optimize data transfer. Transfer rates over a LAN generally approach or exceed 150 GB per hour.

Objective 3. Provide powerful and effective tools for maintenance and activation of disaster recovery sites.

Disasters are unpredictable. Your disaster recovery plan must take into account the possibility that your backup system crashes, or even that you need to move operations to a remote disaster recovery (DR) site. Redundancy of backed up instances is imperative. By replicating to an alternate disk device server at your company's DR site, you prepare for this eventuality. The unique Syncsort solution supports this capability in an extremely powerful way.

All well-equipped enterprises invest in DR sites. Unfortunately, in many cases, the DR site fails to be adequately maintained. For example, say that at your main data center at headquarters, your system administrators are busy solving the multitudes of day-to-day problems that arise. Occasionally, amidst the turmoil, they install software upgrades or security patches on your critical application servers. The likelihood that the same upgrades and patches are installed at your DR site in a timely fashion is scant.

The solution to this problem is embedded in the Backup Express strategy, and it works like this. After each locally scheduled node backup at headquarters, a data transfer from the backup destination device across the WAN to an "alternate destination device" at the DR site takes place. Only the changed blocks are transferred. The content of the alternate disk destination repository remains identical to the content of the destination repository at the main headquarters site (see Figure 4).

[Figure 4: Disaster Recovery Site Maintenance and Activation. At headquarters, scheduled BLI backups run from the source server to the destination repository (disk storage). Frequent data transfers (block-level incrementals) cross the WAN to the alternate destination repository (disk storage) at the disaster recovery site, from which bare metal recovery to a bare machine is performed.]

Now, suppose a catastrophe occurs at headquarters. Personnel at the DR site only need to perform simple ExpressDR recoveries from the alternate destination host to bare machines at the DR site. In a short time, the DR site is up and running with up-to-date operating systems, applications, files, and data.

Objective 4. Maintain scalability while meeting continuity of operations planning requirements for enterprises of all sizes.

Syncsort's disk-to-disk strategy leverages Backup Express and other key components for comprehensive protection of multiple servers in both local and remote locations. Backup Express is easily deployed on all servers, but has the unique property that it is controlled from a single "master server" with a single central catalog. Further, with permissioning, Backup Express can be accessed by administrators through a browser-based GUI from any Windows node on the network.

AdvancedClient backups can be performed over LAN, SAN, or WAN connections from many Windows and UNIX servers to a single destination host or to multiple destination hosts. Data from remote sites, such as branch offices, can be backed up to headquarters or backed up locally then transferred to headquarters by using replication technology such as NetApp's SnapMirror application. For compliance purposes, all data can be streamed to tape and/or transferred to alternate disk storage at the DR site. From this standpoint, the disk-to-disk disaster recovery strategy is extraordinarily scalable.

Objective 5. Simultaneously support routine backup and restore obligations.

To optimize your disaster recovery investment, your approach should not require distinct disaster recovery backups. With Backup Express, disaster recovery backups can be integrated into your regular data backup scheme. Your regularly scheduled incremental node backups and your disaster recovery backups are one and the same. From the same backup instance, you can restore individual files and folders or the entire node. However, if you already have a backup/restore implementation in place, Backup Express with ExpressDR can complement it for disaster recovery purposes without interfering with it.

The Bottom Line for Disaster Recovery

ExpressDR provides a unique disk-to-disk bare metal recovery data protection strategy for Windows, UNIX, or mixed environments that maximizes business continuity in every disaster scenario in a cost-effective manner. The fundamental advantages are simplicity of recovery, optimal data integrity and consistency, and minimal business disruption in the event of system failure for enterprises of all sizes. By exploiting ExpressDR and other Syncsort applications such as Instant Availability, APM, and XRS, you can be confident that your vital servers, workstations, and enterprises will be recovered quickly, easily, and with complete, reliable, up-to-date data.

50 Tice Boulevard, Woodcliff Lake, NJ 07677
www.syncsort.com
201-930-8200

Syncsort Incorporated, 2007. All rights reserved. Backup Express is a trademark of Syncsort Incorporated. Network Appliance is a trademark of Network Appliance, Inc. in the U.S. and other countries. All other company and product names used herein may be the trademarks of their respective companies.