Traditional Disaster Recovery versus Cloud based DR

May 2014

Executive Summary

Many businesses want Disaster Recovery (DR) services to keep man-made or natural disasters from causing expensive service disruptions. Unfortunately, current DR services come either at very high cost or with weak guarantees about the amount of data lost and the time required to restart operation after a failure. With cloud computing and virtualization opening up new opportunities, however, enterprises are discovering that many applications can be consumed as services, and DR is no exception. This has given rise to the emerging model of Disaster Recovery as a Service (DRaaS). DRaaS is gaining popularity among enterprises mainly because of its pay-as-you-go pricing model, which can lower costs, and its use of automated virtual platforms, which can minimize the recovery time after a failure.

Virtualized cloud platforms are well suited to providing Disaster Recovery. Under normal operating conditions, a cloud-based DR service may need only a small share of resources to synchronize state from the primary site to the cloud. Because the virtualization platform is automated, additional resources can be brought online rapidly once a disaster is detected. This can dramatically reduce the recovery time after a failure, which is a key component in enabling business continuity.

Key Requirements for Effective DR Service

Some requirements for an effective DR service are based on business decisions, such as the monetary cost of system downtime or data loss, while others are tied directly to application performance and accuracy.

Exhibit 1: Requirements for DR Service

The level of data protection and the speed of recovery depend on the type of backup mechanism used and the nature of the resources available at the backup site. In general, DR services fall into one of the following categories:

Hot Backup Site: A hot backup site typically provides a set of mirrored standby servers that are always available to run the application once a disaster occurs, providing a minimal recovery time objective (RTO) and recovery point objective (RPO). Hot standbys typically use synchronous replication to prevent any data loss due to a disaster.

Warm Backup Site: A warm backup site may keep state up to date with either synchronous or asynchronous replication, depending on the required RPO. Standby servers to run the application after a failure are available, but they are kept only in a warm state, so it may take minutes to bring them online.

Cold Backup Site: In a cold backup site, data is often replicated only periodically, leading to an RPO of hours or days. In addition, servers to run the application after a failure are not readily available, and there may be a delay of hours or days while hardware is brought out of storage or repurposed from test and development systems, resulting in a high RTO. It can be difficult to support business continuity with cold backup sites, but they are a very low-cost option for applications that do not require strong protection or availability guarantees.

Cloud-based Disaster Recovery (DR)

The on-demand nature of cloud computing means that it provides the greatest cost benefit when peak resource demands are much higher than average demands. Cloud platforms therefore offer the greatest benefit to DR services that require warm standby replicas. In this case, the cloud can be used to maintain the state of an application cheaply, using low-cost resources under ordinary operating conditions. Only after a disaster occurs does a cloud-based DR service pay for the more powerful and expensive resources required to run the full application. These resources can be provisioned in a matter of seconds or minutes. In contrast, an enterprise using its own private resources for DR must always have servers available to meet the resource needs of the full disaster case, resulting in a much higher cost during normal operation.

Devising a Blueprint for Cloud-based DR

Just as with traditional DR, there isn't a single blueprint for cloud-based disaster recovery.
Every company is unique in the applications it runs, in the relevance of those applications to its business, and in the industry it operates in. A cloud disaster recovery plan (cloud DR blueprint) will therefore be specific to each organization. Triage is the overarching principle used to create traditional as well as cloud-based DR plans. The process of devising a DR plan starts with identifying and prioritizing applications, services and data, and determining for each one the amount of downtime that's acceptable before there's a significant business impact. Priority and the required recovery time objectives (RTOs) then determine the disaster recovery approach. Identifying critical resources and recovery methods is the most important part of this process, since an organization needs to ensure that all critical applications and data are included in the blueprint. With applications identified and prioritized, and RTOs defined, the organization can then determine the best and most cost-effective methods of achieving the RTOs (by application and service). A combination of cost and recovery objectives drives different levels of disaster recovery.

Exhibit 2: Traditional DR vs. DR as a (Cloud) Service

The choice of a cloud disaster recovery service will be governed purely by the business imperative. If an organization has critical applications that must be available again within minutes of an outage, it should consider cloud-based DR. Disaster Recovery as a Service (DRaaS) is gaining popularity among enterprises mainly because of its pay-as-you-go pricing model, which can lower costs, and its use of automated virtual platforms, which can minimize the recovery time after a failure.
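The cost argument for pay-as-you-go DR can be made concrete with a little arithmetic. The sketch below compares a privately owned standby, which pays for full capacity all year, with a cloud warm standby, which pays a small synchronization cost in normal operation and the full cost only for the hours spent in disaster mode. The function names and all prices are hypothetical, chosen for illustration and not taken from any provider.

```python
HOURS_PER_YEAR = 24 * 365


def private_dr_cost(full_rate: float) -> float:
    """Yearly cost of a private DR site that keeps full capacity available at all times."""
    return full_rate * HOURS_PER_YEAR


def cloud_dr_cost(sync_rate: float, full_rate: float, disaster_hours: float) -> float:
    """Yearly cost of a cloud warm standby.

    Pays the low sync_rate during normal operation and the full_rate only
    for the hours spent running the full application after a disaster.
    """
    normal_hours = HOURS_PER_YEAR - disaster_hours
    return sync_rate * normal_hours + full_rate * disaster_hours


# Hypothetical hourly prices: 0.50 to synchronize state, 5.00 to run the full application.
private = private_dr_cost(full_rate=5.0)
cloud = cloud_dr_cost(sync_rate=0.5, full_rate=5.0, disaster_hours=48)
print(f"private: {private:.2f}  cloud: {cloud:.2f}")
```

The gap between the two figures grows with the ratio of full-capacity cost to synchronization cost, which is why the saving is largest when peak demand is far above the everyday demand.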

What are Cloud-based Disaster Recovery Options?

An increasingly popular option is to put both the primary production and the disaster recovery instances into the cloud and have both handled by a managed service provider. By doing this, enterprises get all the benefits of cloud computing, from usage-based cost to the elimination of on-premises infrastructure. In this case, however, the choice of service provider and the negotiation of appropriate service level agreements (SLAs) are of utmost importance. Because it hands over control to the service provider, an enterprise needs to be certain that the provider can deliver uninterrupted service within the defined SLAs for both primary and DR instances.

Back Up to and Restore from the Cloud

In this approach, applications and data remain on-premises; data is backed up into the cloud and restored onto on-premises hardware when a disaster occurs. In other words, the backup in the cloud becomes a substitute for tape-based off-site backups.

Back Up to and Restore to the Cloud

In this approach, data is not restored back to on-premises infrastructure; instead it is restored to virtual machines in the cloud. This requires both cloud storage and cloud compute resources. The restore can be done when a disaster is declared or on a continuous basis (pre-staged). Pre-staging DR VMs and keeping them relatively up to date through scheduled restores is crucial where aggressive RTOs need to be met.

Replication to Virtual Machines in the Cloud

For applications that require aggressive recovery time objectives (RTOs) and recovery point objectives (RPOs), as well as application awareness, replication is the data movement option of choice. Replication to cloud virtual machines can be used to protect both cloud and on-premises production instances. In other words, replication is suitable for both cloud-VM-to-cloud-VM and on-premises-to-cloud-VM data protection.
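As a rough illustration of how triage and the approaches above fit together, the sketch below ranks applications by RTO and assigns each one an approach. The `App` type, the function names, the example applications, and every threshold are hypothetical; the mapping is one possible policy under those assumptions, and each organization would set its own cut-offs.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    rto_minutes: int  # acceptable downtime
    rpo_minutes: int  # acceptable window of data loss


# Hypothetical cut-offs, for illustration only.
AGGRESSIVE_RTO = 15
AGGRESSIVE_RPO = 15
MODERATE_RTO = 4 * 60


def choose_approach(app: App) -> str:
    """Map an application's recovery objectives to one of the approaches above."""
    if app.rto_minutes <= AGGRESSIVE_RTO and app.rpo_minutes <= AGGRESSIVE_RPO:
        return "replication to cloud VMs"
    if app.rto_minutes <= MODERATE_RTO:
        return "back up to and restore to the cloud (pre-staged)"
    return "back up to and restore from the cloud"


def triage(apps: list[App]) -> list[tuple[str, str]]:
    """Order applications by RTO, most urgent first, and pick an approach for each."""
    ranked = sorted(apps, key=lambda a: a.rto_minutes)
    return [(a.name, choose_approach(a)) for a in ranked]


apps = [
    App("archive", rto_minutes=48 * 60, rpo_minutes=24 * 60),
    App("payments", rto_minutes=5, rpo_minutes=1),
    App("intranet", rto_minutes=120, rpo_minutes=60),
]
for name, approach in triage(apps):
    print(f"{name}: {approach}")
```

Keeping the thresholds as named constants makes the policy easy to review against the business impact analysis, which is where the real numbers would come from.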

Exhibit 5: Cloud-based DR Approaches

An ideal cloud backup and DR service provides the following key elements:

A replica of all protected systems, frequently updated by incremental backups or snapshots at intervals set by the user for each system. The user determines the settings according to recovery point objectives (RPOs).
Full site, system, disk, and file recovery via a completely user-driven, self-service portal. This portal gives the user the flexibility to choose which file, disk, or system to recover.
Fast SLA-based data recovery. Recovery is, after all, what backup is all about, and there can be no compromise when choosing a cloud service for backup and DR. The SLA is negotiated up front, and the customer pays for the SLA required. No data, whether a file or a system disk, should take more than 30 minutes to recover.
WAN optimization between the customer site and the cloud that enables full data mobility at reduced bandwidth and storage utilization and cost.
Data validation. There must be an automated or user-initiated validation protocol that allows customers to check their data at any time to ensure the data's integrity.
DR rehearsal that demonstrates the viability of the DR plan.
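Two of the elements above lend themselves to a simple automated check. The sketch below assumes that, with periodic snapshots, up to one full interval of changes can be lost (ignoring the time a snapshot takes to complete or transfer), and it flags rehearsed recoveries that exceed the 30-minute limit. The function names and the rehearsal figures are hypothetical.

```python
from datetime import timedelta

RECOVERY_SLA = timedelta(minutes=30)  # the 30-minute limit stated above


def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    """With periodic snapshots, up to one interval of changes can be lost,
    so the interval must not exceed the RPO."""
    return snapshot_interval <= rpo


def rehearsal_failures(recovery_times: dict[str, timedelta]) -> list[str]:
    """Return, in sorted order, the systems whose rehearsed recovery exceeded the SLA."""
    return sorted(name for name, t in recovery_times.items() if t > RECOVERY_SLA)


# Hypothetical rehearsal results for three protected systems.
results = {
    "mail": timedelta(minutes=12),
    "erp": timedelta(minutes=41),
    "files": timedelta(minutes=30),
}
print(meets_rpo(timedelta(minutes=15), rpo=timedelta(hours=1)))
print(rehearsal_failures(results))
```

Running a check like this after every DR rehearsal turns the SLA from a contract clause into something that is measured regularly.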

The Benefits of DRaaS

The cloud can facilitate disaster recovery by significantly lowering costs:

The cloud's pay-as-you-go pricing model significantly lowers costs because different levels of resources are required before and during a disaster.
Cloud resources can be added quickly and with fine granularity, and their costs scale smoothly without requiring large upfront investments.
The cloud platform manages and maintains the DR servers and storage devices, lowering IT costs and reducing the impact of failures at the disaster site.

The benefits of virtualization, while not necessarily specific to cloud platforms, still provide important features for disaster recovery:

VM startup can be easily automated, lowering recovery times after a disaster.
Virtualization eliminates hardware dependencies, potentially lowering hardware requirements at the backup site.
Application-agnostic state replication software can be run outside of the VM, treating it as a black box.

These characteristics can simplify the replication and deployment of resources in a cloud DR site, and enable business continuity by reducing recovery times.

Parameters to agree with the DR service provider can include:

Lead time to allocate the minimum required resources, should DR be invoked
Lead time to scale up resources to the defined (or full) level
Duration for which such resources will be retained on a dedicated basis for the company
Additional fees for occupancy beyond the pre-defined period
Additional facilities such as conference rooms and video conferencing
Capability to provide additional hardware as and when needed

Parameters related to work area recovery can also be included if such services are used.

CloudHPT: your DRaaS provider in the Middle East

CloudHPT is a cloud provider based in the UAE that offers Disaster Recovery as a Service. CloudHPT is a cloud built on High Performance Technology (HPT) and offers all its services with or without a managed service. In addition, CloudHPT offers cloud Servers, Storage and Backup solutions.
If you would like to find out more, simply call +971 4 3789055.