Disk-to-Disk Backup & Restore Application Note

All trademark names are the property of their respective companies. This publication contains opinions of StoneFly, Inc., which are subject to change from time to time. This publication is copyright by StoneFly, Inc. and is intended for use only by recipients authorized by StoneFly, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise, to persons not authorized to receive it, without the express consent of StoneFly, Inc., is in violation of U.S. copyright law.
Information is the lifeblood of today's global economy, and backing up that information is critical. However, with data growing exponentially, shrinking backup windows are forcing companies to choose which data to back up and which to leave unprotected. Backing up each server individually is difficult to manage and requires multiple backup devices and backup licenses. Scheduling backups for multiple client servers to a backup server creates an inherent conflict between the time required to back up all clients and the time available for non-disruptive access to the network.

Since most backup software applications require client servers to be backed up one at a time, scheduling backups during nonpeak hours (for example, between 8:00 pm and 6:00 am) may not provide sufficient time to back up all data on all servers. Moreover, this scenario is not viable for enterprises that operate across multiple or international time zones.

Because the amount of data requiring backup exceeds what the allowable backup window can accommodate, organizations are faced with the dilemma of performing selective backups. However, this approach is less than ideal, since backup is all about restore: how fast can an organization recover from an inadvertent error to its last known good state? Moreover, the growing avalanche of information will soon make it physically impossible to back up all data on all servers in a 24-hour day using traditional methods.

The Solution

The revolutionary Storage Concentrator from StoneFly is a cost-effective way of offloading backup and restore operations from the company LAN onto a dedicated Ethernet IP SAN by working in conjunction with industry-standard backup packages. With this approach, the typical constraints imposed by the LAN are removed from the backup process, and the burden of backup traffic is removed from the LAN. The Storage Concentrator also resolves the conflict between data backups and shrinking backup windows.
Instead of being backed up one server at a time, data can now be backed up from multiple servers simultaneously onto a central repository, slashing backup times.
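
The difference between serial and simultaneous backups can be illustrated with a rough calculation. The following Python sketch is an idealized model, not a measurement: the server data sizes, the per-server throughput, and the function names are hypothetical and chosen only for illustration. Only the 10-hour window (8:00 pm to 6:00 am) comes from the scheduling example given earlier.

```python
WINDOW_HOURS = 10  # 8:00 pm to 6:00 am, the nonpeak window used as an example


def serial_backup_hours(sizes_gb, gb_per_hour):
    """Time to back up servers one at a time: the individual times add up."""
    return sum(size / gb_per_hour for size in sizes_gb)


def parallel_backup_hours(sizes_gb, gb_per_hour, aggregate_gb_per_hour=None):
    """Time to back up all servers at once, each to its own target.

    With no shared bottleneck, the slowest server sets the total time.
    If aggregate_gb_per_hour is given, it caps the combined throughput
    (for example, a shared link or storage array).
    """
    slowest = max(size / gb_per_hour for size in sizes_gb)
    if aggregate_gb_per_hour is None:
        return slowest
    return max(slowest, sum(sizes_gb) / aggregate_gb_per_hour)


# Hypothetical example values (not taken from StoneFly's tests).
sizes_gb = [200, 350, 500, 150]
gb_per_hour = 100

serial = serial_backup_hours(sizes_gb, gb_per_hour)
parallel = parallel_backup_hours(sizes_gb, gb_per_hour)
print(f"serial:   {serial:.1f} h, fits window: {serial <= WINDOW_HOURS}")
print(f"parallel: {parallel:.1f} h, fits window: {parallel <= WINDOW_HOURS}")
```

With these assumed figures, the serial schedule needs 12 hours and overruns the 10-hour window, while the simultaneous schedule is bounded by the largest server and finishes in 5 hours. Real results depend on the actual throughput of the servers, the network, and the storage behind the Storage Concentrator.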

Building an Affordable IP SAN

StoneFly's architecture allows for the construction of an Affordable IP SAN that enables organizations to consolidate, centrally manage, and share a pool of storage that the backup software can use to perform disk-to-disk data movement. Designed for quick and easy installation, the Affordable IP SAN consists of:

- Microsoft Windows 2000/2003, Microsoft Windows NT, or Linux client drivers and/or SNIC Ethernet adapters
- Gigabit Ethernet infrastructure
- A Storage Concentrator i3000
- A SCSI RAID or JBOD unit equipped with an appropriate amount of storage (or internal disk, in the case of an integrated Storage Concentrator)

Figure 1. A One Terabyte IP SAN

How Disk-to-Disk Backup Works

In an IP SAN configuration, the backup software instructs client servers to send backup data to logical volumes presented by the Storage Concentrator over a dedicated Gigabit Ethernet SAN. The Storage Concentrator makes these logical volumes appear as a local disk to each client server. In this way, the backup operation is a simple disk copy operation that is performed at disk-to-disk speeds. The entire operation is as fast and seamless as writing backup data to a local drive that actually resides in the client server.

Backup operations are further enhanced by Ethernet's inherent ability to support concurrent server data transfers on the network. This intrinsic feature enables multiple client servers to back up data to their own logical disks at the same time, without having to wait for other backup jobs to finish. Running multiple backup jobs simultaneously dramatically reduces the time required to back up data. In a laboratory test at StoneFly headquarters, backing up from internal storage on multiple servers to an external RAID array on an IP SAN was 63% faster than backing up to tape.

Once the client servers complete their data backups, the backup server can fetch the data directly from the Storage Concentrator logical volume and place it on tape. This transfer can take place without regard to the backup window, since all of the client servers have already completed their respective backups and returned to normal operation.

As an added bonus, the system administrator has complete control over configuring the size and number of logical volumes presented to each client server, and therefore over how much backup data remains accessible at disk speeds and for how long. For example, a system administrator could choose to keep as little as yesterday's backups, or as much as last month's daily backups, accessible via the Storage Concentrator.
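
Because each logical volume appears to its client server as a local disk, a backup job reduces to an ordinary directory copy, and several such copies can run at the same time. The following Python sketch illustrates the idea only. It is not StoneFly's or any backup package's implementation: the function names are hypothetical, temporary directories stand in for the server data and the mounted logical volumes, and threads on one machine stand in for independent client servers.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict


def backup_to_volume(source_dir: Path, volume_mount: Path) -> int:
    """Copy one server's data to its own volume; return the number of files copied."""
    target = volume_mount / source_dir.name
    shutil.copytree(source_dir, target, dirs_exist_ok=True)  # Python 3.8+
    return sum(1 for path in target.rglob("*") if path.is_file())


def run_parallel_backups(jobs: Dict[Path, Path]) -> Dict[str, int]:
    """Run every (source directory -> volume) copy concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {
            src.name: pool.submit(backup_to_volume, src, dst)
            for src, dst in jobs.items()
        }
        return {name: future.result() for name, future in futures.items()}


def demo() -> Dict[str, int]:
    """Build two stand-in servers and volumes in a temp directory and back them up."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        jobs = {}
        for n in (1, 2):
            server = root / f"server{n}"       # stand-in for a server's data
            volume = root / "volumes" / f"vol{n}"  # stand-in for its logical volume
            server.mkdir()
            volume.mkdir(parents=True)
            (server / "a.txt").write_text(f"data from server {n}")
            (server / "b.txt").write_text(f"more data from server {n}")
            jobs[server] = volume
        return run_parallel_backups(jobs)


results = demo()
print(results)
```

Each job writes only to its own volume, so no job waits for another to finish, which mirrors the one-logical-volume-per-server arrangement described above.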

How Disk-to-Disk Restore Works

In the IP SAN, client server data is backed up to logical volumes presented by the Storage Concentrator. In Figure 2, for example, servers 1 and 2 back up data Monday through Friday. To facilitate the quick recovery of recently backed-up data, the system administrator has chosen to keep five days' worth of backup data on each of those logical volumes. In the rare instance that a client server needs to restore this data, its backup software is pointed to the correct logical volume on the Storage Concentrator to copy the required data.
The entire transaction is conducted over a dedicated Gigabit Ethernet IP SAN using a simple copy operation. This restores the data at disk-to-disk speeds that are as fast and seamless as accessing data from local drives in the client server. If client backup data more than five days old is required, it can be restored from tape by the backup server and then sent over the LAN to the requesting server.

Ethernet's inherent ability to support concurrent server data transfers on the network enables a client server to restore data from its own logical volume without impacting other restore operations that may be occurring at the same time on other servers. As a company's storage requirements increase, the IP SAN allows client servers and their associated logical volumes to be added without increasing backup or restore times.

Figure 2. Client data is sent to the Storage Concentrator during the normal backup window. The Storage Concentrator sends older client data to tape as a background activity with no concern for the backup window.
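
The restore path described above amounts to a simple rule: recent backups come from disk over the IP SAN, and older ones come from tape over the LAN. A minimal Python sketch of that rule follows. The function name and return values are hypothetical, and counting age in calendar days is an assumption; the five-day retention figure is taken from the Figure 2 example.

```python
from datetime import date

DISK_RETENTION_DAYS = 5  # five days' worth of backups kept on disk (Figure 2 example)


def restore_source(backup_date: date, today: date,
                   retention_days: int = DISK_RETENTION_DAYS) -> str:
    """Return "disk" if the backup is still on the logical volume, else "tape".

    Assumption: age is counted in calendar days, and backups more than
    retention_days old have already been moved to tape.
    """
    age_days = (today - backup_date).days
    if age_days < 0:
        raise ValueError("backup_date is in the future")
    return "disk" if age_days <= retention_days else "tape"


today = date(2024, 1, 12)  # arbitrary example date
print(restore_source(date(2024, 1, 10), today))  # 2 days old
print(restore_source(date(2024, 1, 2), today))   # 10 days old
```

A longer retention period on disk shifts more restores onto the fast path, at the cost of larger logical volumes.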

Dramatic Cost Savings

The IP SAN not only enhances data backup operations, but also helps businesses reduce costs significantly by consolidating existing and new disk storage. The IP SAN simply plugs into a business's existing Ethernet infrastructure, without requiring proprietary network equipment or specialized personnel training. With backup operations now performed centrally, businesses can redeploy IT personnel and eliminate tape drives and libraries on individual servers and at remote locations. In addition, an affordable IP SAN allows businesses to use readily available IP network security technologies, such as firewalls, encryption, and authentication tools, to prevent unauthorized access to their storage.

Backup and Restore Performance of an IP SAN

To test the performance of a disk-to-disk backup solution, StoneFly performed the following tests in a laboratory environment:

1. Backup to tape
2. Disk-to-disk backup with a RAID array directly attached to the Master Backup Server
3. Disk-to-disk backup with a RAID array attached to the StoneFly Storage Concentrator, with volumes provisioned to the hosts as the backup targets

Test 1

The first test simulated a traditional IT environment where server data is backed up directly to tape. The Master Backup Server commands each server to back up its data across the network to a tape drive directly attached to the Master Backup Server. It is important to note that the traditional model queues multiple jobs and processes them one at a time. (Note that there are products on the market that are designed to handle multiple backup jobs at the same time.) With tape backup, jobs are processed serially, and tape initialization adds time to the overall backup job.

Test 2

Test 2 involved disk-to-disk backup of the data from the servers to a RAID array directly attached to the Master Backup Server. The servers were backed up in parallel. All data flows across the LAN and through the Master Backup Server.

Test 3

Test 3 incorporated an IP SAN to perform disk-to-disk backup. The target disk was provisioned storage presented by the Storage Concentrator. In this scenario, the Storage Concentrator is configured to present each server with a separate logical volume. The Master Backup Server commands each server to back up its direct-attached data to its provisioned logical volume, so the data does not have to pass through the Master Backup Server on its way to its destination. Intel Storage Network Interface Cards (SNICs) were used as the iSCSI initiators in the servers. Control commands and metadata are sent across the LAN, but server backup data is sent over the IP SAN. Backup jobs were performed in parallel.

Our results indicate significant disk-to-disk backup benefits. We were able to reduce backup windows dramatically compared with tape backup, and the IP SAN had a slight edge over disk-to-disk backup with a locally attached RAID array. Restores were also significantly faster than from tape and very similar to disk-to-disk restores with a locally attached RAID array.

The benefits of a disk-to-disk backup methodology include faster backups, faster restores, cost savings on tape media and tape hardware (no need to buy tapes), and, for disk-to-disk backup with an IP SAN, reduced congestion on the local area network.

It is highly recommended that some form of off-site backup be used in conjunction with a disk-to-disk backup solution. One option is to perform a tape backup of the data copy. Since the backup data now resides on the IP SAN, tape backups can take place during work hours without affecting the production servers or adding traffic to the LAN.