A CommVault White Paper:
Quick Recovery
Increased Application Availability for Multi-Platform SAN Environments

CommVault Corporate Headquarters
2 Crescent Place
Oceanport, New Jersey 07757-0900 USA
Telephone: 888.746.3849 or 732.870.4000

© 2007 CommVault Systems, Inc. All rights reserved. CommVault, CommVault and logo, the CV logo, CommVault Systems, Solving Forward, SIM, Singular Information Management, CommVault Galaxy, Unified Data Management, QiNetix, Quick Recovery, QR, QNet, GridStor, Vault Tracker, Quick Snap, QSnap, Recovery Director, CommServe, and CommCell are trademarks or registered trademarks of CommVault Systems, Inc. All other third party brands, products, service names, trademarks, or registered service marks are the property of and used to identify the products or services of their respective owners. All specifications are subject to change without notice.
CommVault Quick Recovery Product Overview

Table of Contents

Increased Application Availability for Multi-Platform SAN Environments
The Challenge
The Solution - CommVault Quick Recovery
Primary Use Cases
Increase Application Data Availability
Complement Backup/Restore
Quick Recovery Volumes
Application Smart
Quick Recovery Volume Creation
Incremental Update
Using QR Volumes for Migration
Auto Discovery Features
QR Policy Implementation
Snapshot Engine selection
Copy Manager Support
Hardware Copy Manager
Software Copy Manager
LAN Copy Manager
Multiple Recovery Points on a Single Recovery Volume
QR Administration Tools
Volume Scratch Pool definition
Snapshot or QR Volume Pruning support
In Use Flag
QR Volume Creation and Recovery History
Integration with Volume Explorer
Quick Recovery Volume Recovery
Automatic Application Recovery
Automatic Single Volume Recovery
Recovering back to Primary Disk Volumes
Benefits of Integration with CommVault Galaxy Backup & Recovery
Snapshot assisted Backup
Conclusion
Increased Application Availability for Multi-Platform SAN Environments

The Challenge

As both security concerns and corporate data stores grow, businesses are more careful than ever in protecting their critical data. As a result, they are spending more time and money on sophisticated data protection procedures.

One of the best ways to protect production data is to use snapshot technology in addition to mirroring and traditional backup. Many vendors offer excellent snapshot capabilities in their hardware devices and storage management applications. However, snapshot technology can be complex and hard to manage, especially for companies running more than one type of database, platform and disaster recovery application. Given these challenges, even sophisticated IT departments may shy away from using or optimizing their snapshot capabilities.

Snapshots provide key advantages, such as enabling IT departments to restore snapshot data significantly faster than data from a backup. Since snapshots can occur more frequently than backups, there is a smaller window of vulnerability.

There are many technologies designed to protect against data loss. Mirroring and clustering are necessary in case of primary server failure, but will not help in the event of data corruption. For example, a corrupted database can be restored from a snapshot taken before the corruption event, but not from a real-time replica, because replication carries the corruption from one copy to the other. IT can also use backups, but the more frequent snapshots will usually contain more recent transactions.

Snapshots can also enable remote image creation for offsite usage or storage. Remote image restores originate from a remote location and can either restore back to the primary production server at the local site or restore to a hot-site replacement server in case of switchovers or catastrophic failure at the local site.
By optimizing their snapshot technology, businesses can realize an excellent return on investment (ROI). And by integrating snapshot management with overall data and storage management, businesses can match the vulnerability window with the value of their data. This helps to significantly decrease the total cost of ownership (TCO).

Optimizing snapshot technology requires simplifying snapshot procedures, managing snapshots through a central interface and integrating snapshots with backups and mirrored data. Snapshots should embrace multiple applications and platforms, and must be able to handle a variety of storage devices and environments. By simplifying snapshot management and extending its benefits across multiple platforms and heterogeneous environments, businesses can leverage their snapshot technology to provide top levels of data protection and high availability.
The Solution - CommVault Quick Recovery

CommVault's Quick Recovery (QR) software helps protect businesses' critical applications from corruption, user errors, virus attacks and hardware failures. It simplifies and centralizes snapshot management across multiple platforms, applications, snapshot vendors and equipment. This enables IT administrators to fully leverage their snapshot technologies for increased data availability and fast recovery.

Quick Recovery software enables IT administrators to create point-in-time Recovery Volumes on any magnetic disk and manage the volumes in a single view. By automatically discovering all volumes associated with the application, QR enables administrators to protect an entire application or just a critical data set, allowing individual volumes or applications to be recovered in minutes instead of the hours that it would take with traditional backup and recovery techniques.

As a component of the CommVault Singular Information Management software suite, Quick Recovery leverages the same engine as CommVault Galaxy Backup and Recovery software. This allows IT administrators to view the QR volumes alongside backup data through a single console for easy management.

Key Benefits include:

- Extremely fast data recovery - Quick Recovery volumes and snaps allow customers to fully recover applications in minutes from any magnetic disk. Restores that would have taken days now take a fraction of the time and are not tied to specific snapshot applications or hardware systems.
- Keep non-disruptive images - Snapshot applications can create volume images without impacting the production host. This solves the problem of performance hits on the primary database servers during a backup routine.
- Multiple Recovery Points from a single Recovery Volume - The Quick Recovery software allows users to create and retain multiple read-only recovery points that let users meet fine-grained Recovery Point Objectives.
- Application smart - The Quick Recovery software discovers all volumes associated with a given application. This allows users to use snapshots on the entire application or limit them to critical data sets. It also eliminates the need to manually trace LUNs and logical connections in the SAN. This is especially helpful when prioritizing data stores in large and complex databases.
- Automatic snapshot operations - Most snapshot applications allow customers to schedule their snapshots, but Quick Recovery customers can use its scheduling engine to automate snapshots across applications and physical devices.
- Manages snapshots - Snapshot management creates detailed catalogs of snapshot contents. This allows users to track their snapshots by attributes including current mount point, last mount point, time of snap and volume size.
- Policy-based aging - Snapshots use disk space efficiently, but over time they can take up a considerable amount of space. Quick Recovery uses policies to automatically age snapshots, allowing customers to reclaim space for additional snapshot operations or other uses.
- Incremental updates - Once the customer creates an original Quick Recovery volume, the software applies successive updates to it incrementally at the block level.
- Snapshot software - Quick Recovery includes CommVault's QSnap snapshot software for the Solaris, AIX, Linux and Windows operating systems. It is also integrated with Microsoft Volume Shadow Copy Service (VSS) for Windows 2003.
- Snapshot hardware - Quick Recovery works seamlessly with controller-based snapshot approaches from vendors such as EMC, HDS, NetApp and others.
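To make the "Incremental updates" benefit concrete, the following is a minimal Python sketch of a block-level update. It is an illustration under stated assumptions, not CommVault's implementation: this paper does not describe how QR tracks changed blocks, so the sketch simply compares the source and the replica block by block. The names `BLOCK_SIZE` and `incremental_update` are hypothetical.

```python
BLOCK_SIZE = 4  # bytes per block; deliberately tiny so the example is easy to read


def incremental_update(source: bytes, replica: bytearray,
                       block_size: int = BLOCK_SIZE) -> list[int]:
    """Copy only the blocks that differ from source into replica.

    Returns the indexes of the blocks that were copied. Comparing blocks is
    an assumption made for illustration; a real product may track changes
    differently (for example with a change bitmap).
    """
    if len(source) != len(replica):
        raise ValueError("replica must be the same size as the source volume")

    copied = []
    for offset in range(0, len(source), block_size):
        src_block = source[offset:offset + block_size]
        if replica[offset:offset + block_size] != src_block:
            replica[offset:offset + block_size] = src_block  # move changed block only
            copied.append(offset // block_size)
    return copied


# Usage: only the middle block changed on the source since the last update.
source_volume = b"AAAABBBBCCCC"
qr_volume = bytearray(b"AAAAXXXXCCCC")
changed_blocks = incremental_update(source_volume, qr_volume)
```

After the call, the replica matches the source and only one of the three blocks was moved, which is the point of updating at the block level rather than recopying the whole volume.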
Primary Use Cases

Quick Recovery software can leverage your snapshot investments by simplifying and integrating snapshot management, whether you're using the CommVault Galaxy backup and recovery solution or another data protection technology. By deploying the Quick Recovery solution along with the rest of the CommVault suite, customers can integrate and simplify their data movement and management. This Singular Information Management approach allows customers to lower the cost of managing their databases, reduce backup times and keep data highly available.

Increase Application Data Availability

First and foremost, the QR product is designed to provide application data availability. QR volume creation and snapshot retention options enable administrators to create multiple point-in-time images, providing more views of data for faster recovery. When application data integrity is compromised by human, software or hardware error, QR allows application data to be recovered to a specific point in time in a matter of minutes. This is achieved by switching the primary volume being used by the application with the snap or QR Volume. Using the snapshot or QR volume for quick recovery significantly reduces the recovery point and recovery time objectives, getting you back to business quickly by eliminating the need to restore from a traditional backup copy.

Complement Backup/Restore

QR software does not replace traditional backup. Rather, QR fills the business resumption gap between traditional mirrored or replicated disk solutions and tape or disk based backup copies. While mirroring and replication are good for hardware failure protection, they cannot recover from data corruption, viruses, or other data-based failures. Tape and disk based backups are excellent when a complete system needs to be rebuilt and for protecting data in the case of environmental disasters. Restores from backup copies tend to be time consuming.
What snapshots and QR volumes provide is the ability to take more frequent images of important data and make these images readily available for quick recovery in the event of data-based failures. By storing the data in its native format, QR volumes can be remounted directly to the application, usually in only a matter of minutes. Restoring from a backup copy is much more time consuming, since backup copies are not stored in native formats and backup media handling adds to the time required to complete the restore.
Quick Recovery Volumes

Unlike backup, the QR software does not store copies of application data onto tape or magnetic disk media in a backup format. Instead, QR constructs online QR Volumes, which are exact block-level replicas of the original source volumes. QR Volumes can be mounted on the same application server as the original volumes, or on any other QR client that has access to the physical disk containing the QR Volume.

SAN-attached QR Volumes may be created by using the SCSI Extended Copy facilities built into some hardware products. If supported hardware is not available, the CommVault Software Copy Manager module may be used instead. The QR solution also offers the capability to create QR Volumes at a local or remote server by using the CommVault LAN Copy Manager instead of the Hardware or Software Copy Managers. The copy managers are described in detail later in this guide.

[Figure: Backup & Restore = Hours or Days (Application, Primary Copy, Back-up Copy, Restored Copy) compared with QR Recovery = Minutes (Application, Primary Copy, Snapshot Volume, QR Recovery Volume)]

Application Smart

Quick Recovery supports the latest releases of Microsoft Windows, Sun Solaris, Red Hat Linux, AIX and HP-UX file systems; Microsoft Cluster Service and Microsoft Virtual Server; and Microsoft SQL Server, Oracle on Windows or Solaris, and Microsoft Exchange applications. From the user interface, either individual file system volumes or applications can be selected. The QR product automatically discovers and correlates the Oracle and SQL database elements, or the Exchange Storage Group plus Information Stores, with the underlying physical volumes. From that point, users can simply use the QR software to drive their snapshot policies or extend their recovery options by creating a QR Volume for other recovery needs.
Whenever a snapshot or QR Volume is created or updated, the QR software automatically quiesces the application momentarily to ensure that all application data is safely flushed from the buffers before proceeding with the operation. This ensures that a consistent copy of the data is made. During snapshot or QR Volume creation, QR software automatically executes the application-specific data integrity check tool to ensure that the point-in-time copies of the application data have relational integrity, assuring the recoverability of the data.
Quick Recovery Volume Creation

Basic QR volume creation only copies the data blocks that are actually used. However, the QR volume will be exactly the same size as the primary volume, as this is necessary for the application to be able to directly mount the QR volume. Each QR point-in-time volume that is preserved requires the same amount of additional space. However, the QR software uses the destination disk pool intelligently: it automatically reuses the specified disk pool for all new QR volumes or updates to the original QR volume.

The following steps occur when QR is deployed in Snapshot Management Only mode:

1. Quiesce the application
2. Call the appropriate snapshot technology engine to make the point-in-time copy
3. Return the application to full production mode at the completion of the snapshot (usually with no interruption to the application users)

An additional step is added in the creation of a QR volume:

4. Utilize a Copy Manager to move the blocks from the source disk to the QR volume

[Figure: QR Recovery Volume creation. Step 1: Quiesce application. Step 2: Snap primary and re-initialize application. Step 3: Create QR Volume from the snap copy.]
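The four steps above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the QR software's actual interface: volumes are plain lists of blocks, the snapshot engine is a list copy, and `quiesced` and `create_qr_volume` are hypothetical names. The sketch shows the ordering that matters: the application is quiesced only for the snapshot, and the block copy runs after it has resumed.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(log: list):
    """Hold the (simulated) application quiesced for the duration of the block."""
    log.append("quiesce")        # step 1: flush buffers, pause writes
    try:
        yield
    finally:
        log.append("resume")     # step 3: return to full production mode


def create_qr_volume(primary: list, used_blocks: set, log: list) -> list:
    """Create a QR volume from a primary volume (both modeled as block lists)."""
    with quiesced(log):
        snapshot = list(primary)  # step 2: stand-in for the snapshot engine call
        log.append("snapshot")

    # Step 4: the copy manager moves blocks from the snapshot to the QR volume.
    # The QR volume is the same size as the primary, but only used blocks are copied.
    qr_volume = [None] * len(primary)
    for index in sorted(used_blocks):
        qr_volume[index] = snapshot[index]
    log.append("copy")
    return qr_volume


# Usage: a four-block primary volume where block 2 holds no data.
events = []
primary_volume = ["hdr", "data1", "", "data2"]
qr = create_qr_volume(primary_volume, used_blocks={0, 1, 3}, log=events)
```

In this model `events` records the steps in the order quiesce, snapshot, resume, copy, which mirrors the point that the application is back in production before the Copy Manager starts moving data.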
QR Volumes may be created on demand or on a custom schedule. The amount of time the snapshot takes is not directly proportional to the database size; rather, it is directly proportional to the amount of data the supported application has to flush from its cache to storage. For example, on the same Exchange system, this duration could range from tens of seconds to several minutes, depending on how busy the Exchange server is.

Multiple secondary QR Volumes may exist for a given source or primary volume. All point-in-time versions of a given volume may be browsed from the CommCell Console. Subsequently, QR software can update the QR Volume incrementally using the Incremental Update feature described in the section below.

To use QR, you do not have to write or customize any scripts. With application smarts built in, all actions are performed within the easy-to-use QR GUI. QR software uses policies, defined by the administrator, to automatically perform all the tasks. Snapshots and QR Volumes are created utilizing the underlying storage-based snapshot function(s) and one of the Copy Managers previously mentioned. These preferences are defined as part of the QR Policy.

Incremental Update

Once the initial QR Volume is created, the software updates the QR Volume by moving only the blocks that have changed since the last QR job. Incremental updates can be done on demand or run as a scheduled job. Once the Incremental Update is complete, the QR Volume is an image of the source volume as of the point-in-time of the update; the prior point-in-time has been overwritten. If the user wants to keep multiple points-in-time, then a combination of snaps and QR Volumes must be maintained to recover to different points-in-time.

Using QR Volumes for Migration

The QR solution is an easy-to-use, easy-to-implement tool for migrating application data from one disk subsystem to another.
The administrator simply creates volume(s) on the target storage that are the same size as or larger than the source volume(s). These newly created volume(s) are added to the Volume Scratch Pool in the QR Policy. Once the QR job is complete, the administrator simply recovers the application to these new QR volumes. The application data is now fully migrated to the new magnetic storage subsystem.

Auto Discovery Features

To ensure the consistency of application data and improve ease of use, the QR software uses built-in intelligence to automatically discover and correlate physical storage devices and descriptions with application objects, and to synchronize application input/output with QR operations. The QR software lets the user create and manage point-in-time replicas of application volumes created by various storage-based snapshot solutions.

The QR automated discovery process determines:

- Any supported application installed on the application server
- Any supported snapshot engine on the storage being utilized by the application server
- Any available hardware-based data mover in the SAN
- Every volume that is accessible (mounted or un-mounted) from the application server
- All volumes being used by the supported application
- A unique identifier for each of the discovered volumes
- The method to quiesce the application

Once the auto discovery is complete, the information is used to set up the QR Policies for use in creating and managing the snapshots and QR Volumes.

QR Policy Implementation

Each QR subclient may be associated with one QR policy. The QR policy controls the resource usage and general behavior of the QR volume creation job and provides any parameters required by the specific snapshot engine in use. Functions of the QR Policy include:

- Snapshot engine selection
- Copy Manager selection
- Automatic snapshot or QR Volume aging/pruning
- Volume scratch pool definition

Snapshot Engine selection

The QR software easily integrates with hardware-based snapshot technologies to deliver the best performance with the least impact on the production system. The following hardware snapshot technologies are certified with the Quick Recovery software:

- EMC SnapView and TimeFinder
- Network Appliance SnapVault (ONTAP and Open Systems SnapVault)

Hardware snapshot support is extended via certified hardware providers for Microsoft Volume Shadow Copy Service, so QR software also supports these array-based snapshot technologies:

- HDS Quick Shadow, ShadowImage and TrueCopy
- HP StorageWorks EVM and EVA

This approach delivers investment protection to users who may already have existing snapshot technologies. In the QR policy screen, a pull-down menu lists all supported snapshot technologies for each application based on where the application data resides. So, if one application is on a JBOD array and another application sits on a StorageWorks array, users will automatically be shown only the supported snapshot engine that's available for that array. In this manner, QR provides ease of use and management, and the same quick recovery functionality, across multiple types of snapshot technologies.
The QR software includes the built-in CommVault Software Snapshot (CSS), a copy-on-write software snapshot engine that supports the Windows, HP-UX, AIX, Solaris and Red Hat Linux operating systems.
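For readers unfamiliar with the term, the following Python sketch illustrates the general copy-on-write technique. It is a generic textbook model, not a description of CSS internals, which this paper does not document; the class `CowVolume` and its methods are hypothetical names. The idea is that taking a snapshot copies nothing up front: an original block is saved only the first time it is overwritten after the snapshot.

```python
class CowVolume:
    """A volume of blocks with copy-on-write snapshots (illustrative model)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # one dict per snapshot: block index -> preserved block

    def snapshot(self) -> int:
        """Take a snapshot; no data is copied at this point."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data) -> None:
        """Overwrite a block, first preserving the old contents where needed."""
        for preserved in self.snapshots:
            # Save the original only on the first overwrite since that snapshot.
            if index not in preserved:
                preserved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int):
        """Read a block as it was when the snapshot was taken."""
        preserved = self.snapshots[snap_id]
        return preserved[index] if index in preserved else self.blocks[index]


# Usage: snapshot, then overwrite one block on the live volume.
volume = CowVolume(["a", "b", "c"])
snap = volume.snapshot()
volume.write(1, "B")
snapshot_view = [volume.read_snapshot(snap, i) for i in range(3)]
```

Here the live volume sees the new block while `snapshot_view` still shows the original contents, and only the one overwritten block consumes extra space.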
Copy Manager Support

Once the application returns to normal production mode after completion of the snapshot, the QR software copies the blocks from the primary disk to the destination disk to create the QR volume. The movement of the blocks from the primary disk to the QR volume can be done using a number of different data movement options. The QR software lets customers fully utilize the capabilities of their SAN environment, including leveraging the SCSI Extended Copy command set for true serverless data movement. Options for moving data over the LAN or within SAN environments are also supported.

Hardware Copy Manager

A Hardware Copy Manager (HCM) uses SAN-based storage routers that support the SCSI Extended Copy command set. Using this data movement option provides true serverless data movement. The data blocks are copied directly from the primary disk subsystem to the target disk subsystem using the SAN router. Virtually no server CPU cycles are used to move the data. As a result, the QR software lets customers get more value from their current investment in SAN router technology.

Software Copy Manager

The QR solution provides a Software Copy Manager (SCM) to drive data movement in a SAN where SAN routers are not present. The SCM is a virtual copy manager device that emulates the functions of a Hardware Copy Manager. The SCM resides with the CommVault MediaAgent software module and directs the movement of the data blocks from primary to target disks. This configuration provides data movement across a SAN infrastructure: array to array, or within a single monolithic disk array. The SCM does use server CPU cycles, but the SCM software directing the movement of the data may reside on any server in the SAN, not necessarily the application server.

LAN Copy Manager

The LAN Copy Manager (LCM) is a data movement technology that copies QR volumes over LAN or WAN connections.
This option is used to create QR Volumes at a remote location or to create a QR copy within an array. This capability offers an excellent business continuance or disaster recovery strategy by replicating data to a secondary location, where the application can be rapidly recovered in case of a disaster. The LCM software is installed on the CommVault MediaAgent and takes full advantage of CommVault's patented Datapipe technology to maximize network data movement.

Multiple Recovery Points on a Single Recovery Volume

Reducing Recovery Point Objectives exposes the organization to less data loss. Quick Recovery software's ability to retain multiple recovery points on a single Recovery Volume gives users more flexible and timely options to restore lost data without the need to retain many individual snapshots. In addition, users can convert the recovery point image into a read/write volume. QR software's multiple recovery points on a single Recovery Volume are application aware, so they can be used to protect and recover relational databases like Oracle, SQL and others, plus mail applications like Exchange or Domino.
QR Administration Tools

Volume Scratch Pool definition

All QR volume operations performed under the QR policy use destination volumes that are allocated from an associated Volume Scratch Pool. The Volume Scratch Pool is integrated with Volume Explorer and enables the administrator to add, remove or manage the pool members.

Snapshot or QR Volume Pruning support

Each QR policy has an associated Retention Policy. The Retention Policy specifies a minimum period of time (days, seconds, or hours) for which the associated snapshots and QR volumes are retained. When the retention period expires, the software prunes the active snapshot or QR Volume, returning the space to the Volume Scratch Pool. This space is then free to be allocated for future snaps or QR volumes, or for other use. Before pruning, the QR software checks whether the snap or QR Volume is scheduled for update (via Incremental Update), is flagged for use, or is being actively used. If any of these conditions exists, the QR software will not prune the volume.

In Use Flag

Any QR Volume can be flagged as being in use. Flagging a QR Volume has three effects: first, it overrides the retention policy; second, it prevents the QR Volume from being pruned; third, it precludes the QR Volume space from being recycled into the Volume Scratch Pool. If a QR Volume is flagged as in use, it essentially has an infinite retention period.

QR Volume Creation and Recovery History

All QR volume creation and recovery histories can be viewed through the CommCell GUI as well as the Galaxy Report. Rich filter configurations enable the display of specific job types and history information.

Integration with Volume Explorer

QR uses the CommVault Volume Explorer to automatically detect and manage all mounted and un-mounted volumes from any server that has QR Agents or MediaAgent software installed.
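The pruning rules described under "Snapshot or QR Volume Pruning support" and "In Use Flag" can be summarized in a short Python sketch. This is an illustrative model only: the `RecoveryCopy` record, its field names and the `prune` function are hypothetical and do not reflect the product's data model. The logic follows the text: a copy is pruned only when its retention period has expired and it is neither scheduled for update, flagged as in use, nor actively used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryCopy:
    """A snapshot or QR Volume tracked by a QR policy (hypothetical record)."""
    name: str
    created: datetime
    update_scheduled: bool = False  # an Incremental Update is pending
    in_use_flag: bool = False       # administrator flagged the volume as in use
    actively_used: bool = False     # currently mounted or being read


def prune(copies, retention: timedelta, now: datetime, scratch_pool: list) -> list:
    """Return the copies that are kept; pruned names go back to the scratch pool."""
    kept = []
    for copy in copies:
        expired = now - copy.created >= retention
        protected = copy.update_scheduled or copy.in_use_flag or copy.actively_used
        if expired and not protected:
            scratch_pool.append(copy.name)  # space is reclaimed for future use
        else:
            kept.append(copy)
    return kept


# Usage: a 7-day retention period evaluated on a fixed date.
now = datetime(2007, 6, 15)
copies = [
    RecoveryCopy("old", created=datetime(2007, 6, 1)),
    RecoveryCopy("old-flagged", created=datetime(2007, 6, 1), in_use_flag=True),
    RecoveryCopy("recent", created=datetime(2007, 6, 14)),
]
pool = []
remaining = prune(copies, retention=timedelta(days=7), now=now, scratch_pool=pool)
```

Only the expired, unflagged copy is returned to the pool; the flagged copy survives regardless of age, which is what gives an in-use volume an effectively infinite retention period.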
Quick Recovery Volume Recovery

When the time comes to restore data from a traditional snapshot, the user faces enormous complexity. There is no easy way to find out when, and for which application, snapshots were taken, or how many snapshots are available, because snapshots are not automatically indexed by when they occurred, what application they were associated with, or the logical paths to the snapshot and the original application. With the tools available today, the user then has to figure out how and where to mount the volume, which snapshot to use, and which application is associated with the snap, so that the application can recover the data. All these manual steps lead to application downtime, which directly impacts the company's bottom line.

With CommVault Quick Recovery, these headaches and complexities are eliminated. The application smarts and indexing allow the QR software to easily track and manage the snaps and QR volumes. This ability to track, manage, display and choose which snaps or QR volumes to restore is especially important for enabling users to quickly resume operations.

Automatic Application Recovery

The application-based QR volumes can be browsed and recovered automatically from the application iDataAgent browse window. Automatic application recovery uses the following steps:

1. Quiesce the application
2. Detect the source volumes associated with the application objects
3. Unmount the source volumes
4. Remount the snap or QR volume to the application
5. Return the application to normal production mode

[Figure: QR Recovery = Minutes (Application reads from the QR Recovery Volume)]

Automatic Single Volume Recovery

Snaps or QR volumes can be created without intelligent application awareness. In case of device error or application damage, the volumes can be recovered using the QR volume browse window. This type of recovery is useful when QR volumes are used to continuously protect file systems.
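The five automatic application recovery steps listed earlier in this section amount to a volume swap performed while the application is quiesced. The Python sketch below models that swap under simplifying assumptions: mount points are entries in a dictionary, and `recover_application`, `mounts` and `replacements` are hypothetical names rather than part of any CommVault interface.

```python
def recover_application(mounts: dict, replacements: dict, log: list) -> dict:
    """Swap an application's source volumes for their snap or QR volumes.

    mounts:       mount point -> volume currently mounted there
    replacements: source volume -> snap or QR volume to recover from
    Returns a new mount table; the input is left unchanged.
    """
    log.append("quiesce")                                   # step 1
    try:
        # Step 2: detect which mounted volumes belong to the application.
        affected = {mp: vol for mp, vol in mounts.items() if vol in replacements}
        new_mounts = dict(mounts)
        for mount_point, source in affected.items():
            log.append(f"unmount {source}")                 # step 3
            new_mounts[mount_point] = replacements[source]  # step 4
            log.append(f"mount {replacements[source]}")
    finally:
        log.append("resume")                                # step 5
    return new_mounts


# Usage: recover a database whose data volume was corrupted.
steps = []
current = {"/db/data": "primary-1", "/var/tmp": "scratch-9"}
recovered = recover_application(current, {"primary-1": "qr-vol-1"}, steps)
```

Because no data is copied during the swap, the time taken depends on unmounting and mounting rather than on volume size, which is the basis for the "minutes instead of hours" comparison made in this paper.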
Recovering back to Primary Disk Volumes

After a recovery using a QR Volume has enabled the application to return to production mode on the QR volume, how does the former primary volume become the primary volume again? In many cases, the QR volumes are created on less expensive disk volumes that are not generally acceptable for long-term usage in heavy transaction-based application environments. There is a need to return the application to using the primary disk array as soon as possible. The QR software facilitates this swap back to the primary disk array. Once the primary disk has been repaired, the administrator can copy the QR volume back to the repaired primary disk. Once that is completed, the administrator can restore the application on the repaired primary disk array. Once this action is complete, the application is back to running on the primary disk array.

Benefits of Integration with CommVault Galaxy Backup & Recovery

If the Galaxy backup iDA for Exchange or SQL Server is installed on the application server, the user is able to view snapshots and QR Volumes as well as conventional backup information in a single unified view. To see all copies of data as of a specific point-in-time, simply select the Browse Data item on the pull-down menu for the corresponding application iDA. All copies of data, including Galaxy backups, snapshots, and QR volumes, appear in the browse window. The user can elect to recover the application data from a backup copy or from the QR Volume copy without having to know how to deal with the different formats of these data types, or where and how these copies of application data were created or stored. In fact, this is the recommended administrative approach, since it automatically invokes the application awareness built into the Quick Recovery product.

If the user decides to select the QR agent for browse and recovery operations, the recovery process is accomplished as a physical restore. Specifically, for transaction-based applications such as databases and messaging, the administrator will have to manually replay logs to finalize the recovery.

Snapshot assisted Backup

Once the QR Volume is created, you can perform Galaxy snapshot-assisted backup on the application recovery-ready QR Volume. This enables organizations to take advantage of Galaxy Backup and Restore features to make a full volume backup copy using the QR volume as a source.
This type of use also frees up application server CPU cycles by removing the backup I/O from the production server and using a backup server or serverless backup capabilities to make the backup copy. In this manner, the combination of CommVault's Singular Information Management components, QR and Galaxy software, enables the customer to have a complete data protection strategy that covers the spectrum from fast application recovery, to multiple point-in-time images, to local and remote removable backup copies for complete system restorations and offsite vaulting.
Conclusion

In today's highly competitive business environment, availability of file system and application data is crucial to success. Snapshot technology offers tremendous advantages for data protection, enabling faster restores than backup while minimizing potential data loss. CommVault Quick Recovery software helps businesses optimize their snapshot technology over multiple platforms and applications by simplifying the administration and management of snapshots, making restore procedures fast and powerful.

QR lets you fine-tune your uptime strategy by leveraging advanced snapshot technologies to restore data integrity when it is compromised by human, software or hardware error. QR enables you to leverage a wide range of array-embedded snapshot technologies, including those from HDS, NetApp, EMC, HP, Engenio, IBM, LeftHand, Xiotech and many more, for increased data protection. This provides comprehensive snapshot management without being locked in to a single hardware vendor.

As a component of CommVault's Singular Information Management suite, Quick Recovery offers the unique advantage of managing snapshots through a central interface and integrating snapshots with backups and mirrored data. This helps businesses simplify snapshot management and provide top levels of data protection and high availability.