IBM Global Technology Services
March 2008
Virtualization for disaster recovery:
Contents
Introduction
Understanding the virtualization approach
A properly constructed virtualization strategy can yield results
Key considerations for your virtualization approach for disaster recovery
Summary

Introduction

For years, many companies have faced challenges in providing backup and recovery for infrastructures, technologies and networks in order to protect their critical business processing capabilities. Much of this effort has been driven by the need for a recovery design that is cost-effective and leverages as much of the information technology (IT) investment as possible, while still providing adequate coverage to help protect the company from an adverse event that could compromise business results.

Years ago, when mainframes ruled, many companies opted for secondary data centers. Here, companies could offset key production workloads, oftentimes running test and development on this secondary capacity, to provide adequate capability to recover and fulfill their primary processing requirements.

Over time, this approach became more difficult to manage and coordinate as greater capacity was required to run the expanded workloads. Larger and more complex environments were developed to run the work, while ever-increasing amounts of data were generated daily to meet the demands of growing business requirements. Additionally, we have seen the evolution of numerous distributed technology platforms, a multitude of systems software installations at varying operating levels, and the advancement of networking technologies that enabled a "connect everything to everywhere, all of the time" approach. Many companies began to realize that the technology was becoming increasingly difficult to sustain and manage.
The result was not only the increased complexity of maintaining a recovery capability, but also the inability to manage a fully redundant secondary site, based upon a combination of financial, operational and technology concerns that needed to be considered.
To help resolve these issues, commercial vendors introduced the concept of a shared recovery facility to be used by multiple businesses. A comprehensive infrastructure would be enabled with the necessary technology and could be scaled to virtually any size and configuration required. This hot-site concept provided a pooled resource that was driven by discrete customer requirements, all managed by a third-party vendor in a remote location, separate from a company's primary processing location. This was the first evidence that a virtualization strategy for disaster recovery could be established.

Understanding the virtualization approach

Before moving ahead, it is important to understand the baseline concepts of using virtualization for disaster recovery. At a high level, the main focus of a virtualization approach is the benefit that can be realized by consolidation. The vast numbers of servers, the amounts of storage and the numerous networks can be combined in a managed pool of resources that is configured based upon need. From a disaster recovery perspective, this means that when a disaster event occurs, resources from the larger pool can be reconfigured to provide capacity and access for assuming the primary production environment. While on the surface this appears to be a very attractive strategy, there are many underlying factors that should be considered. We will discuss these factors in greater detail later in this paper.

Virtualization is an approach that has been used for many years for disaster recovery. On the high end are mainframe platforms. Based upon individual business recovery needs, vendors provide clients with a pool of resources that are available via contract.
To realize the optimum benefit of the hardware, virtual machine (VM) operating systems are utilized to enable multiple production partitions to be run on one physical machine for disaster recovery. This can provide the ability to leverage a single contracted machine for the recovery of multiple environments simultaneously. It also allows a company to contract only the technology that is required from the shared pool of resources.
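As a rough illustration of recovering multiple environments onto a single contracted machine, the following sketch checks whether the combined demand of several production partitions fits within one recovery host. The class names, partition names and sizes are hypothetical and chosen only for illustration; real sizing would also need to account for storage, I/O interfaces and throughput.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One production environment to be recovered (hypothetical sizes)."""
    name: str
    cpu_units: int
    memory_gb: int


@dataclass(frozen=True)
class RecoveryHost:
    """A single contracted physical machine at the recovery site (hypothetical)."""
    cpu_units: int
    memory_gb: int


def fits_on_host(partitions, host):
    """Return True if the summed CPU and memory demand fits within the host."""
    total_cpu = sum(p.cpu_units for p in partitions)
    total_mem = sum(p.memory_gb for p in partitions)
    return total_cpu <= host.cpu_units and total_mem <= host.memory_gb


# Assumed example workloads: 14 CPU units and 112 GB in total.
partitions = [
    Partition("billing", cpu_units=8, memory_gb=64),
    Partition("orders", cpu_units=4, memory_gb=32),
    Partition("reporting", cpu_units=2, memory_gb=16),
]
host = RecoveryHost(cpu_units=16, memory_gb=128)

print(fits_on_host(partitions, host))  # True: 14 <= 16 and 112 <= 128
```

A check like this only confirms that the aggregate demand fits; it says nothing about contention between partitions during the peak of a recovery.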
Over time, greater requirements for distributed processing recovery became evident and warranted action. Companies began to see techniques deployed that used software to encapsulate virtual machines, which could be restored on device-independent hardware at the recovery site. These new techniques made it easier to identify and script a recovery process that depended upon how the backups were defined, rather than adhering to very rigid hardware-specific requirements. Assuming capacity, storage and interfaces were adequate to provide equal or greater throughput for the individual production workloads, numerous virtual machines could be recovered onto a single physical footprint.

A properly constructed virtualization strategy can yield results

Many benefits can be realized using a properly constructed virtualization strategy for disaster recovery. Potential benefits include:

• The ability to create a virtual resource pool that provides multiple reuse scenarios, effectively producing a "pay once, use many times" scenario for asset utilization
• A cohesive technology platform that utilizes similar technologies for production, test and development, and recovery processing
• Standardization of processes and procedures for the entire resource pool, using a unified approach to monitoring and management across all operations
• Consistency in the technology investment, as refresh cycles are propagated across all environments at the same time to help maintain the integrity of the pooled resources
• Ease of maintenance, as the processes, schedules and level of effort can be coordinated in a more timely and efficient manner, given the affinity across the installed technology base.
Key considerations for your virtualization approach for disaster recovery

When utilizing a virtualization approach for disaster recovery, there are several key considerations that should be incorporated into the design:

1. Capacity for recovery
It is critical to allow for adequate capacity when designing for recovery. Frequently, it is assumed that less than 100-percent capacity will be tolerable during a recovery event. In reality, during the initial phases of the recovery, utilization is greater than normal production levels as workloads push the limits of the systems to fully recover. In addition, considerable catch-up work must be run to bring the systems back to their pre-event status, all the while handling the new workload that is part of the business resumption process.

2. Resources for integration
While processing capacity represents a large portion of recovery consideration, attention should be given to the various other components required to support the production environment. These components include processor resources (storage, device interfaces, etc.), disk resources (storage arrays, storage area networks [SANs], clusters, etc.), peripherals (control units, terminals, blades, etc.), infrastructure (external switches) and network connectivity (switches, bandwidth, etc.).

3. Isolation, network redundancy and scalability
Key to avoiding single points of failure is helping to ensure that the design of the virtualized resources is isolated from the primary production environment. Network redundancy is crucial to providing access for internal users as well as all external parties (customers, business partners, supply chains, etc.). The ability to scale is a requirement to handle peak workloads for both recovery and production processing.
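To make the capacity point in consideration 1 concrete, the sketch below estimates how long catch-up work takes while new work keeps arriving. It is a simplified model under stated assumptions: throughput and arrival rate are expressed as fractions of normal production (1.0 = 100 percent), both are constant, and the function name and figures are hypothetical.

```python
import math


def catch_up_hours(backlog_hours, recovery_capacity, new_work_rate=1.0):
    """Estimate the hours needed to clear a backlog during business resumption.

    backlog_hours: work accumulated during the outage, measured in hours of
        normal production processing.
    recovery_capacity: recovery-site throughput as a fraction of production.
    new_work_rate: arrival rate of new work as a fraction of production.
    """
    spare = recovery_capacity - new_work_rate
    if spare <= 0:
        # No headroom beyond the incoming workload: the backlog never clears.
        return math.inf
    return backlog_hours / spare


print(catch_up_hours(12, 1.5))  # 24.0 hours with 150 percent capacity
print(catch_up_hours(12, 0.8))  # inf: 80 percent capacity cannot catch up
```

Under this model, a recovery site sized at or below 100 percent of production never works off the backlog, which is why assuming that reduced capacity is tolerable can be risky.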
4. Recovery plan execution
A major consideration in the design of a virtualized recovery strategy is the ability to actively test the plan. This includes the capability to fully test at a system level, effectively repurposing all workloads residing on the virtual resources for an extended period of time. This allows for integrated business and infrastructure validation. While function and component testing can be easier to schedule, it may never reveal true results, which could compromise the recovery efforts.

5. Repurposed workload plan
A detailed plan should be put in place to manage the workloads that will be moved at the time of a recovery event, be that an exercise or an actual disaster. These plans should include a formal schedule for testing with senior executive commitment, an alternative work plan for the resources being offset at the time of the event, a daily backup process for the workload being offset, and a tested recovery plan for reestablishing this workload at an alternate site at the time of the event.

6. Disaster recovery posture retention
Consideration should be given to the risk profile of the business when implementing a virtualized recovery design. Geographic diversity should not be sacrificed in light of any technological distance limitations that may be inherent in a virtual design. Examples of such limitations include the ability to enable a processor failover scenario and the requirement for synchronous data transfer that can help minimize latency concerns. In the end, the site of the recovery should be in line with the business's tolerance for risk in accordance with its formal mitigation strategy, and should not be a result of satisfying a technical requirement.
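The distance limitation mentioned in consideration 6 can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that light travels through optical fiber at roughly 200 km per millisecond and counts propagation delay only; switching, protocol and storage overheads are ignored, so real latencies will be higher. The function names are hypothetical.

```python
# Assumption: light in optical fiber covers roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km):
    """Propagation delay for one synchronous write acknowledgement (out and back)."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_distance_km(latency_budget_ms):
    """Greatest site separation whose round-trip propagation fits the budget."""
    return latency_budget_ms * FIBER_KM_PER_MS / 2


print(round_trip_ms(100))    # 1.0 ms added to every synchronous write
print(round_trip_ms(1000))   # 10.0 ms
print(max_distance_km(2.0))  # 200.0 km for a 2 ms budget
```

A calculation like this shows why a synchronous design pulls the recovery site closer to production, and why that pull should be weighed against the business's need for geographic diversity rather than decide the site on its own.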
7. Clearly identified workloads
Prior to identifying the specific resources that will constitute the virtualization pool, it is very important to understand the workloads that will be recovered at the time of an event. Business prioritization and criticality should be identified, with a detailed mapping in place relative to process flows, application integration and dependencies, and underlying information technology components, to help enable recoverability within the virtualized environment.

8. Disciplines for maintaining integrity
Strict systems management disciplines that include problem, change, incident, configuration and asset management are a prerequisite that should be in place prior to engaging any new strategy for virtualized recovery. These are vital in preserving the integrity of the recovery environment and are critical for effectiveness in the ultimate operation, monitoring and maintenance of the virtualized resource pool.

9. Business and IT reporting
The ability to track progress, deliver status and report results is an important output of any recovery program. This may be important in justifying the significant capital investment being made in transforming the information technology function into a virtualized core utility for the business.
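The dependency mapping described in consideration 7 lends itself to a simple sketch: once each workload's dependencies are recorded, a recovery sequence can be derived so that nothing is restored before the components it relies on. The example below uses Python's standard `graphlib` module (Python 3.9 or later); the workload names and the mapping itself are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical mapping: each workload or component -> what it depends on.
dependencies = {
    "order-entry": {"database", "network"},
    "reporting": {"database"},
    "database": {"storage"},
    "storage": set(),
    "network": set(),
}


def recovery_order(deps):
    """Return a sequence in which every item appears after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = recovery_order(dependencies)
print(order)  # one valid sequence; storage always precedes database

try:
    # A circular dependency means the mapping itself needs correcting.
    recovery_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("circular dependency found in the workload mapping")
```

Several valid sequences usually exist, so business prioritization and criticality would still be needed to choose among workloads that are ready to recover at the same time.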
Summary

Virtualization, while popular in today's information technology discussions, has been around for many years and has been used for both production operations and disaster recovery scenarios. Developing a disaster recovery strategy using a virtualized approach is as intriguing as it is challenging to design and implement. Keys to developing a strategy include:

• Leveraging the IT investment for multiple purposes (production, development or disaster recovery)
• Understanding the true production requirements to help ensure that adequate processing capabilities are in place for business protection
• Defining a separate and isolated infrastructure and network that can scale to meet production specifications
• Developing adequate test plans and schedules for comprehensive exercising and validation of the recovery capability
• Incorporating strict disciplines that can not only manage the integrity of the two environments, but can also report the status of the efforts from a business and IT standpoint.

For more information

To learn more about virtualization for disaster recovery, contact:

Joseph E. Starzyk, PMP
Senior business development executive
IBM Business Continuity and Resiliency Services
Phone:
ibm.com/services/continuity

Copyright IBM Corporation 2008

IBM Global Services
Route 100
Somers, NY
U.S.A.

Produced in the United States of America
All Rights Reserved

IBM and the IBM logo are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. Other company, product and service names may be trademarks or service marks of others.

IBM assumes no responsibility regarding the accuracy of the information provided herein and use of such information is at the recipient's own risk. Information herein may be changed or updated without notice. IBM may also make improvements and/or changes in the products and/or the programs described herein at any time without notice.

References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates.

BUW03004-USEN-00