Redefining Microsoft Exchange Data Management
FEBRUARY, 2013
Actifio PAS Specification
Table of Contents

Introduction
Background
Virtualizing Microsoft Exchange Data Management
Virtualizing the Copy Operation
    Discovery
    Application Consistency
    Physical Servers
    Virtual Servers
Virtualizing the Store Operation
    Globally Deduplicating and Compressing Changed Data
    Virtualizing the Storage Repository
Virtualizing the Move Operation
    Traditional Primary Storage Replication
    Backup Replication
    Dedup-Async Replication
Virtualizing the Restore / Recovery Operation
    Mount
    Clone
    Restore
    Object Restore
About Actifio

actifio.com Redefining Microsoft Exchange Data Management
Introduction

Actifio copy data storage introduces data virtualization technology into the management of copy data in both physical and virtual environments. The result is a single, radically simple solution that can dramatically reduce costs and efficiently manage all copies of production data for backup, disaster recovery, test, development or any other application that requires a copy of the data.

Actifio provides a comprehensive and efficient solution for protecting Microsoft Exchange environments, with the benefits shown in Table 1. This paper describes how Actifio's patented Virtual Data Pipeline technology is applied in Microsoft Exchange environments to deliver these benefits.

TABLE 1: ACTIFIO MICROSOFT EXCHANGE SOLUTIONS BENEFITS

BENEFIT: ACTIFIO DETAILS
No Production Impact: Using Actifio dedicated resources, storage of copy data and related processing for deduplication, compression, and replication is offloaded from the production Exchange servers. Actifio's turnkey solution does not require any intrusive agents or other resources from the customer's environment.
High Performance: Update the T0 backup image with the changes from T1 to create a new full image with SCN of T1. Take incremental snapshot of the new image, capturing changed blocks in Snapshot Pool. SLA may also cause these blocks to move into DeDup Pool.
Granular Recovery: Provides object-level access and recovery of Exchange information including individual emails, mailboxes, and calendar entries.
Scalable: Efficient solution for both standalone and database availability group (DAG) environments.

Background

Actifio pioneered the industry's first storage system optimized for managing virtual copies of production data, eliminating redundant silos of IT infrastructure and data management applications. While data storage technology is undergoing some fundamental shifts, such as production data storage moving from disk to SSD and copy data storage moving from tape to disk, very little has changed over the last two decades in the way data is managed. The common approach to data management has been to deploy complex, expensive infrastructure silos, built with point tools for each of protection, disaster recovery, business continuity, test and development, analytics, compliance or other applications, with each silo making and managing its own redundant copies of data.

Actifio copy data storage solutions introduce data management virtualization to deliver an SLA-driven solution that decouples the management of data from storage, network and server infrastructure. The result is a radically simple solution that can reduce costs by up to 90% and efficiently manage all copies of production data for backup, disaster recovery, test, development or any other application that requires a copy of production data.

ACTIFIO MS EXCHANGE SUPPORT AT A GLANCE
- Exchange 2003, 2007, 2010
- Database Availability Group (DAG) setups
- Physical and virtual Exchange servers
- Instant recovery of Exchange objects such as emails, mailboxes, and calendar entries

Virtualizing Microsoft Exchange Data Management

Actifio created the Virtual Data Pipeline (VDP) technology to virtualize production data copy management, eliminating redundancies and re-purposing the unique data for multiple data management applications. VDP efficiently captures a copy of changed data from the Microsoft Exchange servers, allowing for direct access to the data from the Actifio solution without any data movement.

The Virtual Data Pipeline is a distributed object file system that virtualizes the core primitives of data management: copy, store, move and restore. This technology allows the instant creation of virtual copies of point-in-time data from the collection of unique blocks of data.
A single solution can now be deployed to replace one or more of backup software, disaster recovery, business continuity or test and development tools, and can be used as a platform for search, compliance and analytics tools. The resulting simplicity of operations and reduction in infrastructure significantly drives down capital and operational costs.

FIGURE 1: VIRTUAL DATA PIPELINE ARCHITECTURE
1. COPY: Most efficient and scalable data capture
   - Block-level, incremental snapshot with change block tracking
2. STORE: Independent copy with multiple formats, any storage device
   - Raw format for instant restore, optimized format for longer retention
   - Storage virtualization for private, public or hybrid cloud storage
3. MOVE: Dedup-Async™ to drive down network usage
4. RESTORE: App-aware instant mount to any host
   - Maintains temporal and causal relationship between the objects

The following sections explain the details of virtualizing the copy, store, move and restore operations in this environment.

Virtualizing the Copy Operation

Discovery

Actifio performs a deep discovery of the Exchange environment by querying the Actifio Connector, a lightweight software component that is installed on the Exchange servers. The connector discovers all the Exchange databases present on each server, and the administrator can apply predefined SLA protection templates to these databases.

The lightweight Actifio Connector is used for discovery and for coordination purposes such as ensuring application-consistent backups and truncating logs. It is not involved in any Exchange backup data movement and therefore consumes virtually zero system resources.

Application Consistency

Actifio leverages the Microsoft Volume Shadow Copy Service (VSS) to capture application-consistent snapshots of the Exchange databases in both physical and virtual server environments. VSS snapshots work against open Exchange databases and provide an efficient means to quiesce the database and capture an application-consistent state. Microsoft has included VSS in all versions of Windows Server since Windows Server 2003, and it is readily available on Exchange servers by default.
Physical Servers

When Exchange is running on physical servers, Actifio can quickly capture views of the Exchange data by leveraging its built-in Copy On Write (COW) snapshot technology at the storage level. The Actifio Connector is installed on all Exchange servers, standalone or DAG, to provide tighter integration with Exchange and ensure application-consistent backups. It coordinates the Actifio storage-level snapshot with the Microsoft VSS snapshot on the server, and also facilitates log truncation as needed.

Actifio can natively compare the current and the previous snapshots at the block level to determine which blocks differ and need to be processed for deduplication, compression, and replication. For example, if Actifio collects changes on an hourly basis, then only the last hour of changed blocks is stored in the Actifio system for that data capture operation. This optimizes the amount of storage consumed and reduces the overall load on the system, as data is only processed one time. In traditional backup environments, data is processed again every time there is a full backup. The Actifio architecture combines the benefits of full system snapshots with the benefits of block-level, incremental-forever data movement.

Virtual Servers

For Exchange running on virtual servers, Actifio uses VMware snapshot technology to quickly capture application-consistent views of the blocks on the virtual disks attached to Virtual Machines (VMs). The Actifio Connector is only required in the VMs for log truncation, but can also facilitate deeper discovery of Exchange databases. Three VMware technologies aid in protecting Exchange on VMs: VMware Snapshots, VMware Tools, and Changed Block Tracking (CBT).

Actifio is fully integrated with VMware vSphere (version 4.0 or higher). vCenter is queried to find the location of the VM to be managed and to initiate the VM snapshot. When Actifio triggers the VMware snapshot, the guest Windows operating system of the VM is alerted to the snapshot through the VMware Tools software that is installed on the guest OS.
When copying the snapshot data, Actifio leverages a feature of the vStorage APIs for Data Protection called Changed Block Tracking (CBT) to capture changes from the production environment using a true incremental-forever architecture. Changed Block Tracking lets Actifio copy only the blocks that have changed since the last data collection. For example, if Actifio collects changes on an hourly basis, then only the last hour of changed blocks is copied from the production VM onto Actifio during any operation.

Upon the first backup of a VM, a staging disk is created within Actifio for each virtual disk of the VM. Storage consumption for this staging disk is minimized by using thin provisioning as well as by using VMware Changed Block Tracking to retrieve only the non-empty blocks.

Actifio supports two architectures for moving captured changed data blocks from the production VMs to Actifio PAS:

1. Ethernet Network (IP): Actifio can copy the changed blocks directly through VMware API calls. Data moves directly between the ESX servers and Actifio using the customer's IP network.
2. Storage Area Network (Fibre Channel): Actifio can copy the changed blocks over the storage network for VMs that use Storage Area Networks (SANs) to connect to their datastores, whether via iSCSI or Fibre Channel. Actifio will copy the changed blocks directly from the datastore(s) over Fibre Channel, bypassing the ESX servers for the data movement and offloading this data movement from the customer's IP network. Moving data from one storage system directly to another one is the most efficient way to move data in the data center.

In summary, Actifio uses the following workflow to copy data from the VMware environment:

1. Create a VMware snapshot of the VM through the vCenter API. Application consistency is achieved through VMware Tools.
2. Use VMware CBT to identify which blocks have changed since the last operation.
3. Copy only the changed blocks onto the Actifio system.
4. Delete the VMware snapshot through the vCenter API.

This approach leverages VMware snapshots as an efficient way to achieve application-consistent points in time while transferring them onto the Actifio system using only incremental data movements.

This design eliminates the performance issues that can be associated with large data transfers and with maintaining and deleting older VMware snapshots from ESX servers. Actifio copies data blocks incrementally to avoid a heavy load on the ESX environment and on the storage systems. In addition, Actifio only uses VMware snapshots for a very brief period of time; the snapshots are retained just long enough to copy the changed blocks and then they are deleted. This technique results in a minimal performance impact within VMware for snapshot deletion, since very few changed blocks need to be merged back to the original volumes.

BENEFITS OF ACTIFIO FOR EXCHANGE PROTECTION
- Minimal backup window: data is captured using snapshots and not backups
- Incremental forever: after the initial copy, all backups are incremental (no more "fulls")
- Store snapshots on any storage and reclaim expensive tier-1 production storage
- Provides object-level access including emails, mailboxes, and calendar entries
- Incremental backups deduped and compressed to optimize long-term storage
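The four-step VMware copy workflow described in this section (snapshot, identify changed blocks, copy only those blocks, delete the snapshot) can be illustrated with a small in-memory model. This is a minimal sketch, not Actifio or VMware code: `ProductionDisk`, `capture`, the 4-byte block size, and the dictionary used as a thin-provisioned staging disk are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # hypothetical block size, tiny so the example stays readable


@dataclass
class ProductionDisk:
    """Stand-in for a virtual disk with a changed-block tracker (assumption)."""
    blocks: list[bytes]
    changed: set[int] = field(default_factory=set)

    def write(self, index: int, block: bytes) -> None:
        self.blocks[index] = block
        self.changed.add(index)  # the tracker remembers which blocks were written

    def snapshot(self) -> tuple[list[bytes], set[int]]:
        """Freeze a point-in-time view and reset change tracking."""
        frozen, changed = list(self.blocks), set(self.changed)
        self.changed.clear()
        return frozen, changed


def capture(disk: ProductionDisk, staging: dict[int, bytes], first: bool = False) -> int:
    """Run one capture cycle and return the number of blocks copied."""
    frozen, changed = disk.snapshot()          # step 1: take the snapshot
    if first:
        # First backup: copy only non-empty blocks into the thin staging disk.
        indexes = [i for i, block in enumerate(frozen) if any(block)]
    else:
        indexes = sorted(changed)              # step 2: ask the tracker what changed
    for i in indexes:                          # step 3: copy only those blocks
        staging[i] = frozen[i]
    del frozen                                 # step 4: discard the snapshot
    return len(indexes)


if __name__ == "__main__":
    empty = bytes(BLOCK_SIZE)
    disk = ProductionDisk([b"AAAA", empty, b"CCCC", empty])
    staging: dict[int, bytes] = {}
    print("initial copy:", capture(disk, staging, first=True), "blocks")
    disk.write(1, b"BBBB")
    print("next cycle:", capture(disk, staging), "blocks")
```

After the initial copy, each cycle moves only what was written since the previous cycle, which is why the snapshot needs to exist only briefly.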
Virtualizing the Store Operation

Once application-consistent changed blocks are imported into Actifio, the next step is to globally deduplicate and compress the changes between the last two snapshots at a block level. Newly captured data is deduplicated across all physical and virtual servers managed by Actifio, and is then compressed prior to being written to disk. Global data deduplication and compression provide the benefits shown in Table 2.

TABLE 2: DATA DEDUPLICATION AND COMPRESSION BENEFITS

BENEFIT: ACTIFIO DETAILS
Storage Economics: Globally deduplicated and compressed data is economically stored on disk for longer retention.
WAN Optimization: Deduplication and compression optimize the data set for transport between sites.

Globally Deduplicating and Compressing Changed Data

Actifio's Virtual Data Pipeline (VDP) File System processes only the blocks that have changed between the last two snapshot-based data captures. This process executes within the Actifio system, offloading all data management processing from the user's production environment. VDP uses patented progressive data processing algorithms to perform this deduplication and optimize the use of storage capacity.

Virtualizing the Storage Repository

In addition to reducing the storage footprint, Actifio enables the use of any storage repository to store the reduced data footprint. Users can now have their SLAs dictate the type of storage they need to use, rather than having it dictated by the vendor or by the choice of production data storage device. Many users have re-purposed their existing storage for storing data copies or have opted for lower-cost storage devices. This capability further lowers the overall storage costs.

Virtualizing the Move Operation

Data movement, especially replication, is the single biggest inhibitor to efficient data management in a geographically distributed environment. Actifio delivers robust, efficient and scalable data movement technology that not only drives down the overall network usage but also eliminates the need for a dedicated WAN accelerator/optimizer.

In addition to offering traditional synchronous and asynchronous primary storage replication, Actifio leverages its global deduplication and compression to provide efficient data replication between sites. Actifio offers two types of deduplicated data replication: Backup Replication and Dedup-Async Storage Replication.

Traditional Primary Storage Replication

Actifio offers two types of traditional primary storage replication: synchronous and asynchronous. These are similar to the replication technologies typically offered by traditional storage arrays. Synchronous replication can be achieved up to a 300 km distance and guarantees that all data is synchronized between two sites. Asynchronous replication removes the distance limitation and sends data over the WAN as fast as the bandwidth allows. Both of these technologies are designed for real-time replication of data, resulting in every write being replicated between sites.

The advantages of using Actifio synchronous or asynchronous replication are:

1. Actifio replication does not require any storage array vendor licenses, as data is sent from one Actifio system to another.
2. Actifio replication is heterogeneous, from any supported array to any supported array: Tier 1 to Tier 2 and/or Vendor A to Vendor B.
3. Actifio replication preserves write order, even across multiple LUNs in a consistency group.
4. Actifio replication is fully integrated with VMware Site Recovery Manager (SRM).

Backup Replication

The backup replication process begins after the deduplication process ingests and stores the unique data captured by the Copy and Store processes. This workflow minimizes the bandwidth required to move data between locations.
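The ingest step that precedes backup replication, in which changed blocks are deduplicated globally and then compressed before being written, can be sketched as a content-addressed pool. This is an illustrative model only: the `DedupPool` class, the use of SHA-256 fingerprints, and zlib compression are assumptions for the example and say nothing about the patented algorithms VDP actually uses.

```python
import hashlib
import zlib


class DedupPool:
    """Hypothetical content-addressed pool: one compressed copy per unique block."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}  # fingerprint -> compressed block

    def ingest(self, changed_blocks: list[bytes]) -> tuple[list[str], int]:
        """Store changed blocks; return (fingerprints in order, newly stored count)."""
        recipe: list[str] = []
        newly_stored = 0
        for block in changed_blocks:
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self._blocks:       # global dedup check
                self._blocks[fingerprint] = zlib.compress(block)
                newly_stored += 1
            recipe.append(fingerprint)
        return recipe, newly_stored

    def read(self, fingerprint: str) -> bytes:
        """Return the original block for a fingerprint."""
        return zlib.decompress(self._blocks[fingerprint])

    def stored_bytes(self) -> int:
        """Bytes actually held on disk after dedup and compression."""
        return sum(len(data) for data in self._blocks.values())


if __name__ == "__main__":
    pool = DedupPool()
    patch = [b"patch-block-1" * 100, b"patch-block-2" * 100]
    _, new_from_server_a = pool.ingest(patch)  # first server: blocks are new
    _, new_from_server_b = pool.ingest(patch)  # second server: already in the pool
    print(new_from_server_a, new_from_server_b, pool.stored_bytes())
```

Because the pool is shared by every protected server, identical blocks captured from a second server add fingerprints to that server's recipe but no new stored data.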
A proprietary deduplication-aware replication protocol enables the transmission of only the globally unique blocks that are needed in the remote Actifio system, ensuring the most efficient movement possible. This architecture ensures maximum efficiency for WAN bandwidth utilization. For example, a 1 GB service pack that was applied to 10 Exchange servers would generate much less than 10 GB of replication traffic with Actifio technology. First, the 10 GB of changes would be single instanced down to the 1 GB of common new blocks that were written to each server. Next, a percentage of those blocks are probably already in the dedup repository from previous patches or from existing Windows binaries.
Because of this effect, the 1 GB of blocks from the service pack may deduplicate down to 600 MB, since the other 400 MB of the blocks are already in the repository. Finally, the 600 MB of new blocks may compress down to 300 MB before being sent over the wire. In this case, 300 MB of data would be sent over the WAN instead of 10 GB, due to Actifio's global deduplication technology. Furthermore, if the remote site is performing local data protection, it is possible that no data would be transmitted at all, as the needed blocks may already exist in the deduplication pool at that site.

FIGURE 2: DEDUP ASYNC REPLICATION ARCHITECTURE

Dedup-Async Replication

The third type of primary storage replication is an industry first, uniquely delivered by Actifio, called Dedup-Async. This is the industry's most efficient way of moving data, driving down bandwidth requirements while allowing RTOs comparable to traditional synchronous and asynchronous methods. Dedup-Async replication provides asynchronous data replication but uses globally deduplicated and compressed data movement over the network.

The Dedup-Async protocol uses the following workflow with an abbreviated cycle time:

1. Take a snapshot of the physical or virtual server.
2. Deduplicate and compress only the incremental block-level changes since the last cycle.
3. Replicate the new globally unique compressed blocks to the DR site.
4. Rehydrate the new unique blocks to ready the entire dataset for a failover condition.

This approach has all of the benefits of asynchronous storage replication but uses only a fraction of the bandwidth required with traditional replication technologies. It also provides application-consistent replicated copies at the DR site, shortening recovery time by avoiding the integrity checking that is typically performed with a crash-consistent copy.
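The Dedup-Async cycle listed above can be sketched with two in-memory sites. This is a simplified illustration under stated assumptions, not the proprietary protocol: the `Site` class, `dedup_async_cycle` function, SHA-256 fingerprints, and zlib compression are hypothetical, and step 1 (the snapshot) is assumed to have already produced the `changed` mapping of block index to block contents.

```python
import hashlib
import zlib


class Site:
    """Hypothetical system holding a dedup pool and per-dataset block recipes."""

    def __init__(self) -> None:
        self.pool: dict[str, bytes] = {}              # fingerprint -> compressed block
        self.recipes: dict[str, dict[int, str]] = {}  # dataset -> {block index: fingerprint}

    def missing(self, fingerprints) -> set[str]:
        """Fingerprints this site does not hold yet."""
        return {fp for fp in fingerprints if fp not in self.pool}

    def receive(self, name: str, recipe: dict[int, str], payload: dict[str, bytes]) -> None:
        self.pool.update(payload)
        self.recipes[name] = dict(recipe)

    def rehydrate(self, name: str) -> bytes:
        """Step 4: rebuild the full dataset so it is ready for failover."""
        recipe = self.recipes[name]
        return b"".join(zlib.decompress(self.pool[recipe[i]]) for i in sorted(recipe))


def dedup_async_cycle(name: str, changed: dict[int, bytes], local: Site, remote: Site) -> int:
    """Run steps 2 and 3 for one cycle; return compressed bytes sent over the WAN."""
    recipe = dict(local.recipes.get(name, {}))
    for index, block in changed.items():              # step 2: dedup and compress changes
        fp = hashlib.sha256(block).hexdigest()
        if fp not in local.pool:
            local.pool[fp] = zlib.compress(block)
        recipe[index] = fp
    local.recipes[name] = recipe
    needed = remote.missing(recipe.values())          # step 3: send only what DR lacks
    payload = {fp: local.pool[fp] for fp in needed}
    remote.receive(name, recipe, payload)
    return sum(len(data) for data in payload.values())


if __name__ == "__main__":
    primary, dr = Site(), Site()
    a, b, c = b"a" * 4096, b"b" * 4096, b"c" * 4096
    print("cycle 1 sent:", dedup_async_cycle("db1", {0: a, 1: b}, primary, dr))
    print("cycle 2 sent:", dedup_async_cycle("db1", {1: c}, primary, dr))
    print("db2 sent:", dedup_async_cycle("db2", {0: a}, primary, dr))
```

In the last call nothing crosses the wire, because the only block in the second dataset already exists in the remote pool; the remote site only needs the new recipe.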
In addition, the backup replication and Dedup-Async replication services utilize the same bit stream for even more efficiencies. With traditional technologies, a customer will require bandwidth for primary storage replication as well as backup replication. With Actifio PAS, these two services share the same bandwidth. For example, when the primary storage blocks are replicated by Dedup-Async replication, those blocks do not need to be resent when backup replication starts some time later, since those blocks are already at the DR site. The end result is that both primary storage and backups can be replicated sharing the same globally deduplicated and compressed data stream. The Dedup-Async architecture also allows Actifio to create virtual copies of the data for DR and failover testing without interrupting the flow of data between sites. With Dedup-Async you can perform DR tests of your Exchange servers (virtual servers or physical servers) at any time with no interruption to data replication and with no additional capacity requirements. A virtual copy of the data is created at the DR site and is immediately available for testing or analysis. Dedup-Async also provides sophisticated incremental failback capabilities to make the restoration of service at the primary site very simple in the event of a failover condition to the DR site.
Virtualizing the Restore / Recovery Operation

Actifio delivers the unique capability to enable applications to directly use point-in-time data without the need for a traditional restore operation. This is because Actifio is an intelligent copy data storage system that can instantly create a view of the application data from a past point in time and allow applications to access it efficiently via Fibre Channel or iSCSI, just as if accessing a traditional storage system. This technique of enabling applications to directly access historical data eliminates the traditional restore time and data movement, and significantly increases application availability at much lower cost.

FIGURE 3: EXCHANGE OBJECT LEVEL RECOVERY

Any application-consistent data point stored in the system can be accessed on any system connected to the Actifio solution. Common use cases are to recover a VM after a software issue, to retrieve an email or mailbox that was deleted accidentally, or to use a virtual copy of a production data set for test and development. There are three different methods that can be used to access the data stored on the Actifio system: Mount, Clone and Restore.

Mount

The mount function is the most frequently used data access method, as it directly leverages the virtual copies of data stored on Actifio. Since Actifio already has the data and can service I/O directly, there is no need to copy the data anywhere. Virtual copies of the data can be mounted instantly on any system in the environment using efficient block-level protocols including iSCSI and Fibre Channel.
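One way to picture why a mount needs no data movement is to model a point-in-time image as an ordered list of references into a shared block pool. The sketch below is a hypothetical illustration: `PointInTimeImage`, `MountedView`, `mount`, and `full_copy` are invented names, and the real system presents volumes over Fibre Channel or iSCSI rather than Python objects.

```python
class PointInTimeImage:
    """A captured image: ordered references into a shared pool of unique blocks."""

    def __init__(self, pool: dict[str, bytes], block_ids: list[str]) -> None:
        self.pool = pool
        self.block_ids = block_ids


class MountedView:
    """Virtual copy: resolves blocks on demand instead of copying them up front."""

    def __init__(self, image: PointInTimeImage) -> None:
        self._image = image
        self.blocks_copied = 0  # mounting moves no data

    def read_block(self, index: int) -> bytes:
        return self._image.pool[self._image.block_ids[index]]


def mount(image: PointInTimeImage) -> MountedView:
    """Constant work regardless of image size: only a view object is created."""
    return MountedView(image)


def full_copy(image: PointInTimeImage) -> list[bytes]:
    """Traditional restore for comparison: every block is copied before use."""
    return [image.pool[block_id] for block_id in image.block_ids]


if __name__ == "__main__":
    pool = {"x": b"X" * 8, "y": b"Y" * 8}
    image = PointInTimeImage(pool, ["x", "y", "x"] * 1000)
    view = mount(image)
    print(view.blocks_copied, view.read_block(2), len(full_copy(image)))
```

The work done by `mount` does not depend on the number of blocks in the image, while `full_copy` grows with it.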
By eliminating the data movement from the process, data sets of any size can be accessed instantly on any server in the environment, virtual or physical. A 2 TB data set can be accessed in the same amount of time as a 2 GB data set, in seconds, since the data is mounted instead of copied. Virtual copies of any data set can be used for file/folder-level recovery, for test and development, or for full VM instant recovery.

Actifio can mount one or more of the virtual disks within a point-in-time copy of an Exchange database to a physical host, to an existing VM, or as a new VM it creates on the fly. Actifio presents virtual copies of the desired data set to the chosen server via Fibre Channel or iSCSI and makes the data accessible in the environment through the Actifio Connector.

When mounting virtual disks to an existing physical host or VM, the user chooses the desired host. For a VM, the new disks are added as Virtual or Physical Mode Raw Device Mappings (vRDMs/pRDMs). For physical hosts, the connector assigns a drive letter to the volumes and makes the volumes accessible within Windows.

When creating a new VM, the user specifies a VM name, a vCenter server, an ESX server, and a datastore. Actifio creates the VM using virtual copies of the point-in-time images presented as vRDMs and marked as non-independent, so that this VM can be further protected using Actifio if desired. The VM is automatically registered with vCenter, but the virtual network interfaces are unplugged in order to avoid network IP conflicts. The datastore specified is only used to host the VM configuration and vRDM mapping files, which consume almost no space in the datastore.

Clone

The clone function is used to create an independent copy of a data set for any number of reasons: test/development, audit, compliance, data warehousing, forensic analysis, e-discovery, user acceptance testing, etc.
Virtual server or physical server data sets can be copied from any application-consistent point in the system to a separate storage location anywhere in the customer environment. For example, a VM running Exchange can be cloned to a new VM that appears exactly as it did 6 months ago for e-discovery. For VMs, creating a clone is similar to mounting to a new VM, except that the virtual disks are actually copied to newly created VMDK files in one or more of the ESX server's datastores, as specified by the user. For physical Exchange servers, the clone operation creates a bit-for-bit copy of the Exchange volume(s) on storage from an attached storage array.

Restore

The restore function effectively reverts the production data to look exactly as it did at the time of the data collection point. When restoring a VM, Actifio powers down the production VM, deletes all the virtual disks (all data will be lost), and creates new virtual disks with the same data from the time of the restore point. When restoring a physical server, the underlying disk is reverted to its state at the time of the original data capture.

Microsoft generally does not recommend this storage-level restore for Exchange environments. Rather, a new Exchange server is typically provisioned with Exchange installed in recovery mode, and then the Exchange databases are presented to this server from the point-in-time backup by mounting the point-in-time copy directly to the new server (physical or virtual).

Object Restore

In addition to restoring information at the volume level, Actifio allows users to recover individual Microsoft Exchange objects. Actifio provides the capability to open an Exchange database and copy objects from a backup image to PST files or back to the production Exchange instance. This capability is further enhanced by Actifio's ability to provide instant access to the entire Exchange database in just seconds. To recover an individual mailbox or email, the user selects the capture point of interest, mounts it to the server, and drags and drops the objects to the appropriate recovery destination.

Conclusion

The Actifio copy data storage solution is based on patented Virtual Data Pipeline (VDP) technology, which provides the most efficient way to manage data growth while solving your biggest IT challenges around information protection and availability. Using Actifio to manage and protect MS Exchange data delivers significant benefits to IT, including:

- Near-zero backup window: data is captured using snapshots and not backups
- Instant access: mount and clone operations with no data movement required
- Incremental forever: after the initial copy, all backups are incremental (no more "fulls")
- Extreme granularity: object-level access including emails, mailboxes, and calendar entries
- Storage and WAN optimized: incremental backups deduped and compressed to optimize long-term storage and replication
- Reclaimed storage: store snapshots on any storage and reclaim expensive tier-1 production storage
About Actifio

Actifio pioneered the industry's first storage system optimized for managing virtual copies of production data, eliminating redundant silos of IT infrastructure and data management applications. The Actifio Protection and Availability Storage (PAS) platform is based on patented Virtual Data Pipeline™ (VDP) technology, delivering dramatically enhanced business availability by eliminating backup and restore windows and by creating virtual point-in-time copies of data on demand, for use by any business application. Integrating data deduplication with network and processor utilization optimization, Actifio provides the most efficient way to manage data growth while solving your biggest IT challenges around information protection and availability.

For more information on how Actifio can radically simplify your data management, visit www.actifio.com, email Actifio at info@actifio.com, or call Actifio at 877.282.5373.