Disaster Recovery with Amazon Web Services: A Technical Guide




An unplanned disruption to a business or data center can have devastating and far-reaching impact on productivity, customer service, supply chain, and revenue. As the marketplace becomes increasingly global and customers demand around-the-clock service, enterprises need to avoid business-disrupting events by implementing robust business continuity solutions.

Although cost-effective, multi-tenant, cloud-based disaster recovery (DR) solutions are available in the marketplace, many large enterprises have not yet adopted them. The limited adoption rate is primarily due to concerns about security, reliability, and performance.

Enterprises can reap the following benefits from cloud-based disaster recovery compared with traditional DR:

- Savings of up to 85% through reduced infrastructure capital expenditure for servers, storage, networking, and physical data center costs
- Lower ongoing recurring expenses such as co-location charges, power, and maintenance
- Increased efficiency in meeting recovery time and recovery point objectives
- Greater speed to market without restricting growth in the primary facility
- An entryway for organizations to gain familiarity and skills with AWS and cloud computing

In this paper, we will:

1. Define the challenges that enterprises face in adopting public cloud solutions for disaster recovery.
2. Describe the value that large enterprises can gain by adopting cloud-based DR with services such as Amazon Web Services (AWS) for disaster recovery.
3. Provide recommended disaster recovery architecture patterns.

ENTERPRISE CHALLENGES AND REQUIREMENTS

Enterprises face specific challenges in designing, deploying, and managing cloud-based DR solutions that satisfy their risk profile and business continuity requirements. Specifically:

Handling Large Oracle and SQL Databases: Deploying, replicating, and managing large databases over long distances presents a challenge in cloud computing. Enterprises must exercise care in designing their DR architecture and selecting database deployment locations, replication technologies, database designs optimized for cloud computing, and data deduplication strategies for minimizing database sizes.

Network Integration: One of the greatest challenges in cloud computing is the need to minimize latency between internal and cloud-based servers. Integration usually involves deployment of network optimization tools to monitor and manage both internal traffic and traffic between in-house infrastructure and data centers. External bottlenecks are particularly difficult to manage since they may be beyond the control of internal IT staff. IP addressing and network convergence are another challenge that enterprises face. These network woes have been eased by several AWS solutions such as Elastic IP addressing, Virtual Private Cloud (VPC), and Route 53, which have made it easier to manage internet addressing across data centers and the cloud.

Rapid Spin-up of Standby Machine Images: Minimizing business disruption in the event of a disaster requires an architecture that enables deployment of infrastructure resources reserved for rapid recovery. Selection and management of reliable technologies for this purpose is critical for business continuity and cost management.

Change Management: Maintaining systems, software, and services becomes a complex task for large enterprises. Moving infrastructure to the cloud does not remove this requirement, and change management must still be coordinated.

Lack of Cloud Computing Expertise: The rapid evolution of cloud technology outpaces the availability of in-house subject matter experts who can assist enterprises in solutions architecture, SLA negotiations, and deployment. In addition to acquiring and developing internal capabilities, enterprises often need external experts with much broader and deeper capabilities in cloud computing and data center optimization to develop and implement solutions.

Recovery Model: Businesses need to scrutinize all aspects of their disaster recovery approach, including testing and overseeing every part of the recovery scenario, because a fully proven and valid recovery event is difficult to achieve.

Increasing Business Continuity Expectations: Enterprises are increasingly expected to provide round-the-clock service. High availability demands are pushing enterprises to blur the lines between disaster recovery and uninterruptible business continuity, which demands more innovative and challenging solutions.

Facing these challenges, an enterprise-level DR system must have the following characteristics and capabilities:

1. Ability to handle large, complex enterprise-scale databases.
2. Robust network connectivity between DR and production infrastructure, as well as other critical data sources.
3. Ability to rapidly scale to facilitate testing and accommodate fluctuations in business activity.
4. Automated orchestration of systems to minimize human error and increase speed of deployment.
5. Capability to provide large, permanent storage for both data and server images in multiple zones to protect against the effects of regional outages.
6. Access to expertise in designing, deploying, and operating high-availability DR systems in the event of wide-scale regional disasters.

DISASTER RECOVERY ARCHITECTURES

Enterprises have different levels of tolerance for business interruptions and therefore a wide variety of disaster recovery preferences, ranging from solutions that tolerate a few hours of downtime to seamless failover. Accenture's deep expertise with infrastructure design and DR architectures, along with AWS's leading services, provides a cost-efficient and optimized DR capability. Smart DR offers three design methods (Cold, Pilot-Light, and Warm) that meet the recovery needs of most enterprises using combinations of AWS services. These solutions provide great versatility because they integrate into current environments by leveraging existing enterprise backup software. Several vendors, such as Commvault, Veritas, and Backup Exec, are developing and optimizing backup software integration with AWS. Environments that are already virtualized provide the best RTO with AWS due to ease of integration.

Enterprise-level disaster recovery is primarily measured in terms of Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO is a measure of the maximum amount of time within which operations are expected to be resumed after a disaster. RPO is a measure, in terms of time, of the maximum amount of data that can be lost as a result of a disaster.

Method/Pattern   RTO                                 Cost
Cold             Low: RTO >= 1 business day          Lowest cost
Pilot-Light      Moderate: RTO < 4 hours             Moderate cost
Warm             Aggressive: RTO < 1 hour            Highest cost

DISASTER RECOVERY COLD METHOD

The Cold Method follows the traditional backup model, in which data is replicated to non-volatile media such as tape and then typically stored at a secure offsite location. As demand for storage and the need for faster recovery increase, traditional backup and recovery methods may no longer meet business requirements. The Cold Method enables an enterprise to realize the cost savings of cloud-based DR while addressing the challenges of flexibility and recovery time. This DR method also lowers DR costs by eliminating the need for duplicate infrastructure.

The Cold Method enables an enterprise to use its backup software of choice to replicate data into AWS's cloud. In the event of a disaster, data can be restored from the cloud back onto in-house resources or made available to EC2 instances on AWS infrastructure. Additionally, backups of on-site virtual machine images can be restored for DR using the AWS Import function. This method suits enterprises that seek cost savings, have relatively relaxed RPO requirements, and can tolerate slower recovery time objectives.

Figure 1: DR Cold Method (diagram: on-premises infrastructure in the corporate data center replicating over the Internet, a VPN connection, AWS Import/Export, or AWS Storage Gateway into an S3 bucket with objects, with optional recovery on scaling EC2 recovery instances in the recovery/fail-over zones)
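The RTO and cost table above can be read as a simple selection rule: pick the lowest-cost pattern whose RTO band still fits the downtime the business can tolerate. The sketch below is illustrative only. The function name is hypothetical, and modeling "1 business day" as 24 hours is an assumption that is not stated in this paper.

```python
# Illustrative sketch: map tolerable downtime to the DR patterns in the table.
# Assumption: "1 business day" is modeled as 24 hours.
BUSINESS_DAY_HOURS = 24.0
PILOT_LIGHT_RTO_HOURS = 4.0  # Pilot-Light row: RTO < 4 hours


def choose_dr_pattern(tolerable_downtime_hours: float) -> str:
    """Return the lowest-cost pattern whose RTO band fits the tolerance."""
    if tolerable_downtime_hours <= 0:
        raise ValueError("tolerable downtime must be positive")
    if tolerable_downtime_hours >= BUSINESS_DAY_HOURS:
        # Cold recovery takes a business day or longer, at the lowest cost.
        return "Cold"
    if tolerable_downtime_hours >= PILOT_LIGHT_RTO_HOURS:
        # Pilot-Light targets recovery in under 4 hours, at moderate cost.
        return "Pilot-Light"
    # Anything tighter needs the most aggressive (and most costly) pattern.
    return "Warm"


if __name__ == "__main__":
    for hours in (48, 8, 0.5):
        print(hours, "->", choose_dr_pattern(hours))
```

In practice the choice also depends on RPO, database size, and budget, so a rule like this is a starting point for discussion rather than a decision procedure.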

Storage Layer: The Cold Method is supported by several replication methods, as shown in Figure 1. Replication via the cloud, a VPN connection, the AWS Import/Export feature, or AWS Storage Gateway each provide a way to replicate data from an on-premises system to AWS's S3 or EBS storage. The replicated data is stored as objects in S3 buckets that can be provisioned to an EC2 instance.

Application Layer: For the Cold Method, applications and critical data are stored in an S3 bucket. This repository provides a point of recovery for data to be restored after a failure. At the point of disaster, an environment is provisioned at AWS. With cloud-based DR, automated processes can use AMIs, CloudFormation, or scripts along with the Import function to automate and speed up Cold Method recovery.

Database Layer: With the expertise of Accenture resources, DR with AWS conducts initial assessments of tier 1 database systems and selects the resources that should be replicated into AWS. Database options are then decided, such as taking backups of on-premises database servers so that, in the event of a failure, snapshots can be restored to an EC2 instance in AWS. Applications would then be re-pointed to the database server hosted in AWS.

Network Layer: The Cold Method uses a combination of AWS services to address the challenges of network integration and maximize business continuity:

- Route 53 and Elastic IP are configured to enable high availability and redundancy by diverting network traffic to AWS during a triggered event (disaster, testing, or load balancing need).
- Round Robin is configured to provide high availability, load balancing, and network addressing with no impact on loads in the local environment.

Management Layer: Basic management tools provided by AWS, such as AWS CloudFormation, offer developers and system administrators an easy way to create a collection of related AWS resources and provision them in an orderly and predictable fashion. Third-party tools such as Puppet Labs take automation to the next level with holistic infrastructure orchestration.

The following AWS services, as shown in the figure above, can be leveraged in the Cold Method:

Snapshots: When a secure, optimized network connection is established between the enterprise and AWS infrastructure, snapshots of the on-premises systems can be made, replicated to an S3 bucket, and provisioned into EC2 instances when needed.

S3: Data that is replicated via the Storage Gateway is stored in an EBS volume or S3 bucket, from which it can be copied back to the local site for recovery or easily provisioned into an AWS virtual machine (EC2). In a go-live recovery scenario, the appropriate data objects would be copied from the S3 bucket into an EBS volume, to which the EC2 instance would attach.

Direct Connect: AWS Direct Connect provides a dedicated connection from the enterprise's LAN to the AWS edge network. Direct Connect increases network reliability, resilience, and security, thus partly addressing the challenge of optimizing network connectivity. Direct Connect may also be excluded to further reduce cost.
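The Management Layer above mentions AWS CloudFormation as a way to provision related resources predictably. As a rough illustration, the sketch below builds a minimal CloudFormation template, as a Python dictionary, that launches one recovery EC2 instance from an AMI. The function name, the logical resource name `RecoveryInstance`, the AMI ID, and the instance type are all placeholders chosen for this example and do not come from this paper.

```python
import json


def build_recovery_template(ami_id: str, instance_type: str) -> dict:
    """Build a minimal CloudFormation template for one recovery instance.

    The logical name "RecoveryInstance" is a hypothetical choice.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Cold Method recovery instance (illustrative sketch)",
        "Resources": {
            "RecoveryInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,              # AMI kept on hand for DR
                    "InstanceType": instance_type,  # sized at recovery time
                },
            }
        },
        "Outputs": {
            # Expose the instance ID so later automation steps can use it.
            "RecoveryInstanceId": {"Value": {"Ref": "RecoveryInstance"}}
        },
    }


if __name__ == "__main__":
    # Placeholder values; substitute a real AMI ID and instance type.
    template = build_recovery_template("ami-00000000", "m5.large")
    print(json.dumps(template, indent=2))
```

A real recovery stack would normally add networking, storage volumes, and security groups; the point here is only that the environment definition can be kept as a versioned artifact and created at the time of disaster.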

DISASTER RECOVERY PILOT-LIGHT

The Pilot Light method satisfies most enterprise environments that require comprehensive backup, relatively fast recovery, and redundancy. A Pilot Light solution consists of an up-to-date core infrastructure configured in AWS with the ability to quickly provision a full-scale environment during a recovery process. This method entails replication of tier 1 systems to AWS, as well as creation and maintenance of AMI templates. This solution addresses an enterprise's need for flexible DR capabilities, seamless network integration, rapid activation of standby AMIs, and minimal business interruption.

With the Pilot Light configuration, it is necessary to set up automation to bring up the full environment using scripts and well-defined processes. Accenture's expertise with these tools and systems helps to reduce the time to recovery and limits the amount of manual intervention. This solution fits businesses with moderate RPO requirements and slower recovery time objectives, since core data and services will be replicated actively.

Figure 2: DR Pilot-Light (diagram: the on-premises DNS/proxy/query server, application server, and database are mirrored to the AWS recovery/fail-over zones; the static DNS/proxy/query and application servers in AWS are not running but become active when triggered during a failover event, while the database and server are actively replicated)

The traditional enterprise model consists of several co-existing layers that function together to support an IT environment:

Storage Layer: DR with AWS storage is supported by AWS Storage Gateway. The Gateway enables replication of data from an on-premises system to AWS's S3 or EBS storage. The replicated data is stored as a data volume that can be provisioned to an EC2 instance. If replicated to EBS, the volume can easily be mounted to the EC2 instance. EBS volumes are best suited when performance matters. S3 provides low-cost, durable storage that can be mounted to a host when I/O performance is not critical.

Application Layer: A small application footprint is run in a warm configuration at AWS. This smaller footprint can maintain a minimum of business transactions. At the point of disaster, a larger environment is provisioned at AWS. With cloud-based DR, automated processes use AMIs and application virtual images along with the Import function to quickly provision the necessary infrastructure for the applications. By keeping dormant AMIs on hand, this method enables enterprises to address the challenge of spinning up server images quickly for DR purposes. The ability to quickly spin up AMIs also enables enterprises to test the DR system more often and more rigorously. The smaller warm environment also provides a cost savings compared to running a full live environment or a self-maintained data center.

Database Layer: The Amazon Web Services RDS service enables enterprises to operate and scale relational databases safely within the cloud. Customizations need to be made for Oracle systems exceeding 3 TB; for out-of-box AWS solutions exceeding 16 TB, storage traffic saturation will drive the performance targets and the ability to build past these limitations. With these capabilities, DR with AWS addresses the challenge of handling large databases within cloud environments. Active replication of core database systems also addresses RPO concerns and maintains a dynamic backup ready to be used during a triggered event.

Management Layer: In addition to tools provided by AWS, DR with AWS leverages a combination of tools to address the challenges of spinning up images and optimizing business continuity in cloud environments. Basic management tools provided by AWS, such as AWS CloudFormation, will typically suffice, but holistic automation and orchestration can be achieved with tools such as Puppet Labs and Chef. These services can customize a complete environment to any level of granularity, so that a triggered event would not need human intervention.
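Because RPO is defined earlier in this paper as a measure, in terms of time, of the data that can be lost, active replication can be monitored by comparing the age of the last successfully replicated change against the RPO. The following is a minimal sketch under that reading. The function names are hypothetical, and how the last replication timestamp is obtained depends on the replication tool in use.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rpo_exposure(last_replicated_at: datetime, now: datetime) -> timedelta:
    """Time window of changes that would be lost if a disaster struck now."""
    return now - last_replicated_at


def meets_rpo(last_replicated_at: datetime, rpo: timedelta,
              now: Optional[datetime] = None) -> bool:
    """True if the current replication lag is within the RPO."""
    if now is None:
        now = datetime.now(timezone.utc)
    return rpo_exposure(last_replicated_at, now) <= rpo


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last_sync = datetime(2024, 1, 1, 11, 50, tzinfo=timezone.utc)
    # Ten minutes of lag against a 15-minute RPO.
    print(meets_rpo(last_sync, timedelta(minutes=15), now=now))
```

A check like this could feed an alert so that replication problems are noticed before a disaster rather than during one.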

Network Layer: Multiple AWS services and features can address the challenges of network integration and maximize business continuity:

- Direct Connect is configured to provide a dedicated, secure connection from the on-premises environment to the AWS cloud, facilitating network convergence and bandwidth throughput.
- VPC can be used to further customize a network topology by enabling granular network modifications such as VLANs, subnets, ports, and IP addresses.
- Route 53 and Elastic IP are configured to enable high availability and redundancy by diverting network traffic to AWS during a triggered event (disaster, testing, or load balancing need).
- Round Robin can be configured to provide high availability, load balancing, and network addressing with no impact on the local environment.

After the core infrastructure is spun up and configured in a Pilot Light DR Method, all other systems are activated via templates, automation tools, or newly built machines. These systems will be ready during a failure event, in which servers are provisioned and spun up to take over from production systems. DR policy dictates the level of urgency with which each application and service needs to be activated. Less critical systems can be configured via installation packages if the DR scenario lasts for a substantial amount of time.

DISASTER RECOVERY WARM CONFIGURATION

Disaster recovery in a warm configuration gives customers a near no-downtime solution with a near-100% uptime SLA arrangement. The Warm Method extends beyond the Pilot Light implementation by replicating systems and keeping them up to date within AWS. With two mirrored environments, if the main site is interrupted in the event of a disaster, a network failover diverts traffic to the other location within a matter of seconds to provide near-perfect business continuity.

Warm DR utilizes tools from Accenture and AWS partners such as Puppet Labs or Chef to enable administrators to automate tasks, deploy applications, minimize human error, and manage infrastructure changes within the cloud and on-premises systems. The Warm Method can also be leveraged for availability purposes, as well as for load balancing if business activities require scalability across two environments.

Figure 3: DR Warm Method (diagram: on-premises production servers and database are mirrored to an active failover zone in AWS holding an EC2 copy of production and an actively replicated database and server; if failure is detected, Route 53 diverts traffic to AWS)
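The network failover described above relies on Route 53 to divert traffic to AWS when the main site fails. One way this is commonly expressed is a pair of failover routing records, a primary for the on-premises site and a secondary for the AWS site. The sketch below only builds the change batch as a Python dictionary; using failover routing here is an assumption, since this paper does not say which Route 53 routing policy is used. The function name, domain name, IP addresses, and health check ID are placeholders.

```python
def build_failover_change_batch(record_name: str, primary_ip: str,
                                secondary_ip: str, health_check_id: str,
                                ttl: int = 60) -> dict:
    """Build a Route 53 change batch with PRIMARY and SECONDARY A records."""

    def record(set_id: str, role: str, ip: str) -> dict:
        return {
            "Name": record_name,
            "Type": "A",
            "SetIdentifier": set_id,   # distinguishes records sharing a name
            "Failover": role,          # "PRIMARY" or "SECONDARY"
            "TTL": ttl,                # short TTL so clients switch quickly
            "ResourceRecords": [{"Value": ip}],
        }

    primary = record("on-premises", "PRIMARY", primary_ip)
    # The health check on the primary is what triggers the failover.
    primary["HealthCheckId"] = health_check_id
    secondary = record("aws-dr", "SECONDARY", secondary_ip)

    return {
        "Comment": "DR failover records (illustrative sketch)",
        "Changes": [
            {"Action": "UPSERT", "ResourceRecordSet": primary},
            {"Action": "UPSERT", "ResourceRecordSet": secondary},
        ],
    }


if __name__ == "__main__":
    batch = build_failover_change_batch(
        "app.example.com.", "203.0.113.10", "198.51.100.10",
        "00000000-0000-0000-0000-000000000000")
    print(batch["Changes"][1]["ResourceRecordSet"]["Failover"])
```

With the AWS SDK for Python, a dictionary of this shape is what would be passed as the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call, together with the hosted zone ID.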

Each layer builds upon the respective Pilot Light layer and adds certain features, as well as limitations, that exist when building a comprehensive DR environment.

Storage Layer: The warm configuration leverages the Storage Gateway to replicate all priority data. Although the warm configuration is not fully scaled to a production environment, it is fully functional and available on demand.

Application Layer: All applications have a minimal base footprint in the warm DR environment, enabling an easier recovery compared to starting from a few core applications, as in Pilot Light. Multiple application instances can be running and receiving regular updates, so that the warm standby site is available to take over immediately.

Database Layer: All tier 1 and tier 2 databases have an active copy in AWS. Tier 3 can be included as well, depending on the business's needs. Oracle systems could encounter limitations and may not fully support a cloud model, which would entail customizations not yet tested. Systems that prove incompatible (e.g., Oracle RAC) can be configured in a Direct Connect facility to provide cloud integration.

Management Layer: Although AWS has built many tools and management features for disaster recovery scenarios, a warm environment adds several layers of management and automation complexity that are addressed by using third-party tools such as Puppet Labs.

Network Layer: The warm configuration takes the Pilot Light design and incorporates an IP load balancer, Route 53, and Direct Connect to enhance and optimize network resiliency and availability. Routine upgrades and maintenance of the AWS system, along with a high-speed network running between sites for replication, add complexity and expense.

Since the Warm Method maintains an active copy of the production environment, RTO and RPO requirements are easily met, as the switchover takes a matter of seconds with no data loss. Although most of the critical systems are actively running, several services that have been deemed non-critical may be created in the DR environment as needed; this would simply be done by importing the virtual machine image or having an AMI on hand to spin up during a triggered event. The use of Reserved Instances is also highly recommended for critical applications to ensure sufficient capacity.

ENTERPRISE READINESS

A critical step toward successful adoption of cloud-based DR is getting the enterprise ready for disaster recovery in a multi-tenant cloud environment. Preparedness involves developing experience in virtualization technologies, optimizing both the internal network and external connections to cloud resources, optimizing database implementations for cloud computing, aligning cloud security with internal security policies and groups, and acquiring deep cloud computing expertise.

Successful adoption of cloud-based DR solutions starts with aligning the enterprise IT infrastructure and functions with the infrastructure services offered by AWS. Migrating to a virtualized computing environment prepares the enterprise for cloud computing and also enables realization of the cost-saving benefits of virtualization technology, which include flexibility, scalability, and more efficient utilization of computing capacity. Deep understanding of virtualization technology also enables better planning and decision-making in preparation for migrating to cloud-based DR. Furthermore, being able to utilize computing capacity more effectively through virtualization potentially drives centralization and efficient provisioning of computing resources as services to business units. Such a change usually requires careful planning and change management support.

Optimizing network connectivity includes procuring more robust connectivity both to the Internet and to internal network resources, minimizing latency-inducing bottlenecks, implementing dedicated connectivity to public exchanges, and using content delivery services. Using deduplication to minimize the volume of data that needs to be transmitted to remote DR resources is likely to result in cost savings, faster recovery speeds, fewer bandwidth bottlenecks, and a lower chance of data loss.
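To make the deduplication point concrete, the sketch below splits data into fixed-size chunks and stores each distinct chunk only once, keyed by its SHA-256 digest. It is a simplified illustration, not a description of how any product named in this paper works; commercial appliances typically use more sophisticated techniques such as variable-size chunking. The function names and the 4 KiB chunk size are assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(data: bytes,
                chunk_size: int = 4096) -> Tuple[Dict[str, bytes], List[str]]:
    """Split data into fixed-size chunks and keep one copy of each.

    Returns the store of unique chunks and the ordered list of digests
    (the "recipe") needed to rebuild the original data.
    """
    store: Dict[str, bytes] = {}
    recipe: List[str] = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only new chunks need transmitting
        recipe.append(digest)
    return store, recipe


def restore(store: Dict[str, bytes], recipe: List[str]) -> bytes:
    """Rebuild the original data from the chunk store and recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = deduplicate(data)
    sent = sum(len(chunk) for chunk in store.values())
    print(f"original {len(data)} bytes, transmitted {sent} bytes")
```

Only the unique chunks and the recipe have to cross the WAN link, which is where the bandwidth and recovery-time savings come from.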
Finally, enterprise readiness must include acquisition of cloud computing skills through training of internal staff, and acquisition of cloud computing experience through hiring and contracting of external expertise. In addition to technical expertise, the enterprise should develop or acquire strong capabilities in defining and negotiating enterprise-worthy SLAs for cloud-based services. AWS allows you to test a DR scenario more often and at a lower cost than the traditional model, as there is no physical hardware to purchase, configure, and maintain.

Leveraging Accenture's knowledge will enable enterprises to prepare their decision-makers, environment, support staff, and users for successful adoption of public cloud-based disaster recovery.

AWS PARTNER SOLUTIONS

While AWS offers the best suites of services for enterprise-scale DR in the public cloud, the market also offers many complementary solutions from AWS partners. A number of AWS partner solutions help enterprises migrate and maintain DR in the cloud. The following are some of the functionalities in the marketplace that address specific enterprise challenges:

Storage Domain

Riverbed Technology: Whitewater Cloud Storage Gateway is an appliance that caches and replicates data from an in-house database through an on-site backup management application to either a private or public cloud, thus eliminating tape backup systems, minimizing administrative burden, reducing enterprise risk, and improving DR readiness. The appliance is also compatible with most enterprise data backup management systems and major database products.

NetApp: Private Storage for AWS is an enterprise storage solution that optimizes data replication between on-site systems and AWS. It uses an on-site storage appliance to replicate via a dedicated connection to NetApp Private Storage within a Direct Connect location. NetApp is leading in this area, with similar products from competitors expected in the near future.

(Diagram: corporate data center connected via AWS Direct Connect to NetApp Private Storage in an AWS Direct Connect facility, which connects via AWS Direct Connect to AWS Simple Storage Service)

CA: ARCserve Replication and High Availability is an enterprise solution that enables disk-to-disk replication of data and complete systems to EC2 instances. ARCserve Replication provides automatic or manual switchover and switchback. System high availability is provided only for Windows systems, although data replication is still available.

(Diagram: corporate data center connected over a VPN connection to the AWS cloud, with CA ARCserve disk-to-disk replication, switchover, and switchback to EC2 instances in a VPC)

Zadara: Zadara has created a separate private storage cloud through AWS Direct Connect. It enables an enterprise to build a Virtual Private Cloud Array (VPCA) within the Zadara private cloud and connect it to the enterprise's AWS computing environment.

(Diagram: corporate data center replicating to a VPC Array (VPCA) in the Zadara VPC, connected to EC2 instances in the client VPC in the AWS cloud)

Application Domain
Technologies within this domain optimize application performance to maximize user experience.

Riverbed Technologies: The Stingray Traffic Manager is an example of cloud-based software designed to improve application performance. Stingray provides Web Content Optimization (WCO), load balancing, improved scalability through offloading TCP and SSL connection overhead, and built-in performance monitoring and scripting.

Database Domain
Safe and efficient replication of databases is a particular challenge for cloud-based implementations. Solutions in this domain attempt to simplify and speed up database backup and replication processes to minimize the risk of data corruption, network latency, and user experience degradation.

Riverbed Technologies: By caching and managing replication of databases from in-house to cloud-based resources, the Riverbed Technologies Whitewater Cloud Storage Gateway appliance attempts to mitigate these risks and ease adoption of cloud-based DR.

SAP: SAP and AWS have been working together to design solutions that can meet the demands of all businesses. New and existing SAP customers can deploy their SAP systems on AWS EC2 instances in production environments that have been fully vetted, tested, and certified by both vendors. Services such as SAP RDS take management to another level by automating deployment and making it much quicker.

Oracle RAC: Oracle RAC is not supported on AWS. Integrating a RAC system with AWS services would entail placing it in an AWS Direct Connect facility.

Oracle Cloud Backup: By integrating RMAN with AWS S3, the Oracle Cloud Backup module enables enterprises to stream database backups to AWS S3 using Oracle RMAN commands and programs. Compared to on-site tape backup, this solution is more reliable because it is based on disk instead of tape, more accessible for restore operations, and cheaper in terms of upfront capital costs.

Avnet: Avnet Cloud Backup for Oracle Databases uses Oracle Recovery Manager (RMAN) to enable database backup to AWS S3.

Zmanda: For MySQL databases, a reliable solution is provided by AWS partner Zmanda through Amanda Enterprise backup and recovery. This solution enables an enterprise to use AWS S3 as a backup target from on-premises backup infrastructure using a browser-based management console.

Network Domain
Solutions within this domain attempt to maximize access to, and efficient utilization of, network bandwidth in order to maximize application performance in the cloud environment.

Riverbed Technologies: The Steelhead WAN optimization solution uses a combination of virtualization, data deduplication, storage centralization, bandwidth optimization, application acceleration, and resource consolidation to address this challenge.

Management Domain
Technologies within this domain attempt to automate and simplify orchestration of cloud-based resource management to minimize human error and maximize speed.

RightScale: RightScale enables an enterprise to deliver applications in public and private clouds that are resilient to scheduled maintenance, unpredictable hardware failures, and occasional disasters, with the ability to clone entire environments and stage them in another data center with one click.

Puppet Labs and Chef: Both offer IT automation software that helps system administrators manage infrastructure throughout its lifecycle, from provisioning and configuration to patch management and compliance. These solutions can automate repetitive tasks, quickly deploy critical applications, and proactively manage change, scaling from tens of servers to thousands, on-premises, in the cloud, or both.

OpsWorks and CloudFormation: OpsWorks and CloudFormation automate the deployment of a VPC-based stack on AWS. OpsWorks is an application management solution with automation tools that enable modeling and control of applications and their supporting infrastructure. Both OpsWorks and CloudFormation integrate with Chef and Puppet.
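As a rough illustration of the kind of deployment automation CloudFormation provides for a VPC-based stack, the Python sketch below builds a minimal template and prints it as JSON. The helper name build_vpc_template, the logical names DrVpc and DrSubnet, and the CIDR ranges are hypothetical and do not come from this paper; a real DR stack would also define instances, routing, and security groups.

```python
import json


def build_vpc_template(vpc_cidr: str, subnet_cidr: str) -> dict:
    """Return a minimal CloudFormation template: one VPC with one subnet.

    All logical names and values here are illustrative assumptions.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative DR network stack (hypothetical values)",
        "Resources": {
            # The VPC that would host the recovery environment.
            "DrVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": vpc_cidr},
            },
            # A subnet inside that VPC; Ref ties it to the VPC above.
            "DrSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "DrVpc"},
                    "CidrBlock": subnet_cidr,
                },
            },
        },
    }


if __name__ == "__main__":
    template = build_vpc_template("10.0.0.0/16", "10.0.1.0/24")
    print(json.dumps(template, indent=2))
```

Keeping the template as data means it can be stored in version control and reused to rebuild the same network layout in a recovery region, which is the main appeal of this style of automation for DR.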

EXAMPLE
Company X has a combination of Tier 1, Tier 2, and Tier 3 business applications. They can choose from the following options:

For Tier 1 with RTO and RPO of <1 hour, the business would choose Warm DR and will have the following:
- EC2 instances for all services running at all times.
- In-house and cloud infrastructure load-balanced and configured for auto-failover, which is facilitated by AWS Route 53, Elastic IP addresses, and Elastic Load Balancing.
- Initial data synchronization using in-house backup software or file transfer protocol.
- Incremental data replicated/synchronized using storage gateway.
- Automation used for rapid failover and spin-up of environment using Puppet Labs software.

For Tier 2 with RTO and RPO of <4 hours, the business would choose Pilot Light DR and will have the following:
- Critical core elements of system already configured.
- EC2 instances running for critical services.
- Pre-configured AMIs for Tier 2 apps that can be quickly provisioned upon failure.
- Cloud infrastructure load-balanced and configured for automatic failover, which is facilitated by AWS Route 53, Elastic IP addresses, and Elastic Load Balancing.
- Initial data synchronization using in-house backup software or file transfer protocol.
- Incremental data replicated/synchronized using storage gateway.
- Automation used for rapid failover and spin-up of environment using Puppet Labs software.

For Tier 3 with RTO and RPO of <8 hours, the business would choose Cold DR and will have the following:
- All data replicated into S3 bucket.
- Initial data synchronization using in-house backup software or file transfer protocol via the web or AWS Import/Export feature.
- Pre-configured AMIs for Tier 1 and Tier 2 apps that can be quickly provisioned upon failure.
- Incremental data replicated/synchronized using storage gateway.
- EC2 instances are spun up from objects within the S3 buckets.
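The tier-to-strategy mapping in this example can be sketched as a small lookup. The Python below is only an illustration: the names DrTier, TIERS, and choose_tier are hypothetical, and treating the tighter of RTO and RPO as the deciding objective is an assumption, since the example gives both objectives the same bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrTier:
    tier: int
    strategy: str
    limit_hours: float  # RTO/RPO must be strictly below this bound


# Tiers and bounds as given in the Company X example.
TIERS = (
    DrTier(1, "Warm DR", 1.0),
    DrTier(2, "Pilot Light DR", 4.0),
    DrTier(3, "Cold DR", 8.0),
)


def choose_tier(rto_hours: float, rpo_hours: float) -> DrTier:
    """Return the first tier whose bound exceeds the tighter objective."""
    # Assumption: the tighter of the two objectives drives the choice.
    objective = min(rto_hours, rpo_hours)
    if objective <= 0:
        raise ValueError("objectives must be positive")
    for tier in TIERS:
        if objective < tier.limit_hours:
            return tier
    # The example does not describe applications beyond the 8-hour bound.
    raise ValueError("objective falls outside the three tiers in the example")


if __name__ == "__main__":
    for rto, rpo in [(0.5, 0.5), (3, 2), (6, 6)]:
        t = choose_tier(rto, rpo)
        print(f"RTO {rto}h / RPO {rpo}h -> Tier {t.tier}: {t.strategy}")
```

Because the bounds are strict, an objective of exactly one hour falls into Tier 2 in this sketch; an organization could just as reasonably treat such boundary cases as Tier 1.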
As enterprise needs for disaster recovery progress toward a need for complete business continuity, and while IT budgets for DR remain stagnant, enterprises can no longer avoid considering cost-effective, multi-tenant, cloud-based disaster recovery solutions like AWS. Enterprises must, however, appreciate and navigate the challenges presented in enterprise-level cloud computing. Managing risks by partnering with Accenture to implement Smart DR with AWS gives enterprises the opportunity to significantly improve disaster recovery while taking advantage of the potentially significant cost savings. Accenture can help enterprises transition to the cloud with our best practices and expertise along with market-leading AWS partners.

FURTHER INFORMATION
For further information or comments on this article, contact:

Joseph Beah
Emerging Technology & Innovation
joseph.h.beah@accenture.com

Written in collaboration with AWS solution architects and business development leads. Special thanks to:

Tom Laszewski
AWS Strategic Solutions Architect
tomlasz@amazon.com

Alejandro Flores
Emerging Technology & Innovation
alejandro.flores@accenture.com

Keith Linnenbringer
Application Modernization and Optimization
keith.linnenbringer@accenture.com

Chris Scott
Emerging Technology & Innovation
chris.scott@accenture.com

Footnotes
1. Timothy Wood, Emmanuel Cecchet, K.K. Ramakrishnan, Prashant Shenoy, Jacobus van der Merwe, Arun Venkataramani, "Disaster Recovery as a Cloud Service: Economic Benefits and Deployment Challenges," University of Massachusetts Amherst, AT&T Labs.
2. "Disaster Recovery," Amazon Web Services LLC, accessed November 27, 2012, http://aws.amazon.com/disaster-recovery
3. Glen Robinson, Ianni Vamvadelis, and Attila Narin, "Using Amazon Web Services for Disaster Recovery," Amazon Web Services LLC, accessed November 27, 2012, http://aws.amazon.com/disaster-recovery
4. Karen Dye, "Do Your Disaster Recovery Time Objectives Meet Your Business Requirements?", Sun Microsystems, accessed April 24, 2013, http://i.zdnet.com/whitepapers/bizcontwp.pdf?tag=mantle_skin;content
5. Jacob Gsoedl, "Disaster recovery in the cloud explained," Storage Magazine, Vol. 10 Num. July 5, 2011.
6. Timothy Wood, K.K. Ramakrishnan, Prashant Shenoy, Jacobus van der Merwe, "Enterprise-Ready Virtual Cloud Pools: Vision, Opportunities, and Challenges," George Washington University, AT&T Research Labs, University of Massachusetts Amherst.
7. T. Wood, A. Gerber, K. Ramakrishnan, J. Van der Merwe, and P. Shenoy, "The case for enterprise ready virtual private clouds," in Proceedings of the Usenix Workshop on Hot Topics in Cloud Computing (HotCloud), San Diego, CA, June 2009.
9. Rob Livingstone, "When Disaster Thunders Through the Cloud," accessed January 22, 2013, http://www3.cfo.com/article/2012/1/the-cloud_cost-of-disasterrecovery-in-cloud?currpage=1
10. Henrik Rosendahl, "Is Enterprise Cloud Backup at Economy Prices for real?", accessed January 22, 2013, http://www.wired.com/insights/2012/08/enterprisecloud-backup/
11. Brandon Butler, "Disaster recovery in the cloud: Vendors jump in; Enterprises wade," Network World, accessed January 22, 2013, http://www.networkworld.com/news/2012/092712-disasterrecovery-cloud-262818.html?page=3
12. Jim Cooke, "Cloud Readiness for the Enterprise," Cloud Computing Journal, accessed January 22, 2013, http://cloudcomputing.sys-con.com/node/2086147
13. "Announcing Amanda Enterprise 3.1, Radically simple, intelligent and robust network backup and recovery," Zmanda, accessed January 22, 2013, http://www.zmanda.com/backup-amazon-s3.html


ABOUT ACCENTURE
Accenture is a global management consulting, technology services and outsourcing company, with approximately 275,000 people serving clients in more than 120 countries. Combining unparalleled experience, comprehensive capabilities across all industries and business functions, and extensive research on the world's most successful companies, Accenture collaborates with clients to help them become high-performance businesses and governments. The company generated net revenues of US$27.9 billion for the fiscal year ended August 31, 2013. Its home page is www.accenture.com.

Copyright © 2013 Accenture. All rights reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.