VMware Site Recovery Manager Overview Q2 2008
Executive Summary
  VMware virtual infrastructure transforms organizations' ability to protect their datacenters
  VMware Site Recovery Manager is a new product for disaster recovery that does the following:
    > Simplifies and automates key disaster recovery workflows: setup, testing, failover
    > Simplifies and centralizes management of recovery plans
    > Makes it easy for customers to leverage their investment in storage replication technologies with their virtual infrastructure
  This combination of disaster recovery automation with the capabilities of VMware Infrastructure reduces time to recovery, risk, complexity and cost
Agenda
  Disaster Recovery Challenges
  Introducing VMware Site Recovery Manager
  Summary
Requirements for Disaster Recovery
  Minimize Downtime
    > 93% of companies that lost their data center for ten days or more due to a disaster filed for bankruptcy within one year of the disaster. --National Archives and Records Administration
  Minimize Risk
    > 92% of users surveyed acknowledged that their companies would face serious consequences if they had to implement their disaster recovery plans. --Dynamic Markets Ltd.
  Control Cost
    > 73% of executives expressed concern with the costs associated with maintaining a secondary data centre. --Beacon Technology Partners
  Effective disaster recovery is a business imperative, but is very difficult to achieve
Challenges of Disaster Recovery
  Minimize downtime
    > Many manual processes for recovery
    > Multiple steps to overcome hardware differences
    > Incomplete or out-of-date runbooks
  Reduce risk
    > Testing requires additional hardware and infrastructure
    > Usually only data is regularly and cleanly updated
    > Frequent failures during recovery
  Control cost
    > Simplest recovery requires identical hardware
    > Idle recovery hardware is impossible to repurpose
    > Multiple third-party products necessary for recovery
VMware Vision for Disaster Recovery
  Rapid
    > Automate recovery process
    > Eliminate failures due to hardware dependencies
    > Integrate different components of recovery
  Reliable
    > Enable easier, more frequent testing
    > Turn manual, inconsistent processes into pre-programmed, repeatable processes
  Manageable
    > Centralize and simplify management of recovery plans
    > Make disaster recovery protection a property of virtual infrastructure
  Affordable
    > Eliminate idle recovery hardware
    > Eliminate dependencies on physical infrastructure
VMware Infrastructure Adds Value To DR Today
  Business Concern
    > Time to Recovery (RTO)
    > Reliability
    > Cost
  Associated DR Workflow
    > Failover
    > Testing
    > Planning And Acquisition
  Supporting VI3 Platform Features
    > Hardware Independence
    > Instant Repurposing
    > Snapshots
    > VLANs
    > DRS And Resource Pools
    > Encapsulation
    > Boot From Shared Storage
  Awards
    > Best Disaster Recovery Product of 2006 (TechTarget)
  Customers
    > 55% of customers using virtualization for BC/DR (#1 reason for virtualization behind consolidation/resource utilization)
Introducing VMware Site Recovery Manager
  Site Recovery Manager leverages VMware Infrastructure to deliver advanced disaster recovery management and automation
    > Simplifies and automates disaster recovery workflows: setup, testing, failover
    > Turns manual recovery runbooks into automated recovery plans
    > Provides central management of recovery plans from VirtualCenter
  Works with VMware Infrastructure to make disaster recovery rapid, reliable, manageable, affordable
Site Recovery Manager Core Capabilities
  [Diagram labels: Site Recovery Manager, VirtualCenter, Virtual Machines, ESX Server, Servers, Storage]
  Centralized management for DR
    > Create, test, update and execute recovery plans from a single point of management
    > Tight integration with VirtualCenter
  Disaster recovery automation
    > Build recovery process in advance
    > Automate testing of recovery plans
    > Automate execution of recovery process
  Simplified setup and integration
    > Allocate and manage recovery resources
    > Easy integration with leading vendors' storage replication systems
Key Components
  [Diagram: VMware Infrastructure 3 deployment. Labels: VirtualCenter, Virtual Machines, VMware Infrastructure, Servers, Storage]
Key Components
  [Diagram labels: VirtualCenter, Virtual Machines, VMware Infrastructure, Servers, Storage, Site Recovery Manager, Partner Replication]
  Site Recovery Manager
    > Manages and monitors recovery plans
    > Tightly integrated with VirtualCenter
  VMware Infrastructure
    > Requires ESX Server 3.0.2 or later
    > Requires VirtualCenter 2.5 or later
  Storage
    > iSCSI or Fibre Channel storage
  Partner Replication
    > Integrated via replication adapters created, certified and supported by the replication vendor
Key Components
  [Diagram: Production site and Disaster Recovery site. Labels at each site: Site Recovery Manager, VirtualCenter, Virtual Machines, VMware Infrastructure, Servers, Storage. Other labels: Protected virtual machines, Partner Replication]
Site Recovery Manager Management Interface
  Disaster recovery management is another view of your environment
    > Viewed from VirtualCenter management client
    > Central point of management for virtual infrastructure and disaster recovery
  [Diagram labels: VirtualCenter client, VirtualCenter, Site Recovery Manager]
Disaster Recovery Setup
  Create recovery plans
    > For virtual machines, applications, business units
  Integrate with replication
    > Identify which virtual machines are protected by replication configuration
  Map recovery resources
    > Server resources, network resources, management objects
  Specify recovery process
    > Convert manual runbook to preprogrammed response
    > Customizable with scripting and callouts
Setup: Mapping Recovery Resources
  Maps production resources to resources available at recovery site
    > Virtual machine hierarchy
    > Resource pools
    > Network connections
  Eliminates complexity of managing recovery site resources
  [Diagram labels: Production, Disaster Recovery]
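The mapping idea on this slide can be illustrated with a small sketch. It is not SRM's API: the class names, field names and the `map_placement` helper are hypothetical and exist only to show how a production placement (folder, resource pool, networks) might be translated to its recovery-site counterpart.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryMap:
    """Hypothetical mapping: protected-site name -> recovery-site name."""
    folders: dict = field(default_factory=dict)
    resource_pools: dict = field(default_factory=dict)
    networks: dict = field(default_factory=dict)


@dataclass
class VmPlacement:
    """Where a VM lives: folder, resource pool, and attached networks."""
    folder: str
    resource_pool: str
    networks: list


def map_placement(vm: VmPlacement, inv: InventoryMap) -> VmPlacement:
    """Translate a protected-site placement into recovery-site terms.

    Raises KeyError for any unmapped resource, so gaps in the mapping
    surface during setup rather than during a failover.
    """
    return VmPlacement(
        folder=inv.folders[vm.folder],
        resource_pool=inv.resource_pools[vm.resource_pool],
        networks=[inv.networks[n] for n in vm.networks],
    )
```

Because the mapping is defined once per resource type, each protected VM does not need its own recovery-site settings, which is the complexity the slide says is eliminated.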
Protected and Recovery Site Datacenters
  [Figure labels: SRM PROTECTED SITE, SRM RECOVERY SITE]
Protected Site VMware Topology Map
User Interface
  SRM UI Access
  Local and Paired Site
  Protection Setup
  Recovery Setup
Setup: Building Recovery Plans
  Turn manual runbook into automated process
    > Specify steps of recovery process in VirtualCenter
  Extensible framework
    > Scripts for specialized tasks
    > Checkpoints for manual steps
    > Enables integration with physical recovery
Setup Workflow Protection Site
  At the protected site the following setup activities are completed:
    > The user pairs the SRM servers at the protected and recovery sites
    > Security certificates are established between the SRM servers and the VC servers
Setup Workflow Protection Site - continued
  Array Managers Configuration
    > Select the correct manager type from the Manager Type drop-down box
  Storage Partner Participation
    > VMware provides the Storage Replication Adapter (SRA) specification
    > Storage partners create the SRA
    > Storage partners test the SRA
    > VMware reviews the SRA test results
    > SRA support with SRM is granted if all tests pass
Setup Workflow Protection Site - continued
  SRM identifies the available arrays at the protected and recovery sites, discovers the replicated datastores, and determines the datastore groups
    > Protection Side Array Discovery
    > Recovery Side Array Discovery
    > Replicated Datastores and Datastore Groups
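The slide does not say how datastore groups are determined. As an illustration only, the sketch below assumes one plausible rule: replicated datastores that hold files of the same VM must fail over together, so they land in the same group. The function name and input shapes are hypothetical.

```python
def datastore_groups(vm_datastores, replicated):
    """Group replicated datastores that are linked by a shared VM.

    vm_datastores: dict of VM name -> list of datastores the VM uses.
    replicated:    list of datastores covered by array replication.
    Assumed rule (not taken from the slides): datastores sharing a VM
    belong to one group. Implemented with a small union-find.
    """
    parent = {ds: ds for ds in replicated}

    def find(ds):
        # Follow parent links to the group representative (path halving).
        while parent[ds] != ds:
            parent[ds] = parent[parent[ds]]
            ds = parent[ds]
        return ds

    for datastores in vm_datastores.values():
        linked = [ds for ds in datastores if ds in parent]
        for other in linked[1:]:
            parent[find(other)] = find(linked[0])

    groups = {}
    for ds in replicated:
        groups.setdefault(find(ds), set()).add(ds)
    return sorted(sorted(group) for group in groups.values())
```

Datastores that are not replicated are ignored, since VMs on them cannot be protected by array replication in the first place.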
Setup Workflow Protection Site - continued
  Using the Inventory Preferences Mapper, the user maps resources in the protected site to their counterparts in the recovery site
Setup Workflow Protection Site - continued
  A protection group is a group of VMs that will be failed over together to the recovery site
  While working through the Protection Group wizard, the user selects a temporary location at the recovery site for the placeholder VM configuration files of the protected VMs
Setup Workflow Protection Site - continued
  While working through the Protection Group wizard, the user selects which VMs need to be protected and assigns them to a protection group
  Creating a protection group results in VC inventory updates at the recovery site
Setup Workflow Recovery Site
  At the recovery site the following setup activity is completed:
    > The user creates a recovery plan, which is associated with one or more protection groups
Recovery Plan
  VM Shutdown
  High Priority VM Shutdown
  Prepare Storage
  High Priority VM Recovery
  Normal Priority VM Recovery
Recovery Plan - continued
  Low Priority VM Recovery
  Post Test Cleanup
  Storage Reset
  SRM Recovery Plan Benefits:
    > Turn manual BC/DR runbooks into an automated process
    > Specify the steps of the recovery process in VirtualCenter
    > Provide a way to test your BC/DR plan in an isolated environment at the recovery site without impacting the protected VMs in the protected site
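The recovery steps listed on these two slides bring VMs back in priority order: high, then normal, then low. A minimal sketch of that ordering follows, assuming a plan covers one or more protection groups. `Priority` and `recovery_order` are hypothetical names, not SRM objects.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Recovery priority; lower values are recovered first."""
    HIGH = 0
    NORMAL = 1
    LOW = 2


def recovery_order(protection_groups):
    """Return VM names in power-on order for a recovery plan.

    protection_groups maps a group name to a list of (vm_name, Priority)
    pairs. All groups in the plan are merged, then sorted by priority.
    sorted() is stable, so VMs of equal priority keep their listed order.
    """
    vms = [vm for group in protection_groups.values() for vm in group]
    return [name for name, _ in sorted(vms, key=lambda vm: vm[1])]
```

For example, a plan covering two groups recovers every high priority VM from both groups before any normal priority VM starts.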
Testing
  Replication Management
    > Snapshot replicated LUNs before test
    > Delete snapshots of replicated LUNs after test
  Network Management
    > Change all virtual machines to a test port group before powering them on
  Customization/extensibility
    > Same breakpoints and callouts as failover sequence
    > Extra breakpoints and callouts around the test bubble
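The test sequence on this slide (snapshot the replicated LUNs, move VMs to a test port group, power them on, delete the snapshots) can be sketched as a simulation. Nothing here calls a real SRM or array API: the function, the `test-bubble` port group name and the optional `power_on` hook are assumptions for illustration, and actions are only recorded in a log.

```python
def run_test(luns, vms, log, test_port_group="test-bubble", power_on=None):
    """Simulate a recovery-plan test, recording each action in `log`.

    Snapshots are deleted in a `finally` block, so cleanup happens even
    when a VM fails to power on. `power_on` is an optional hypothetical
    hook that may raise to simulate such a failure.
    """
    snapshots = []
    try:
        # Replication management: snapshot replicated LUNs before the test.
        for lun in luns:
            log.append(f"snapshot {lun}")
            snapshots.append(lun)
        # Network management: isolate each VM before it is powered on.
        for vm in vms:
            log.append(f"connect {vm} to {test_port_group}")
            if power_on is not None:
                power_on(vm)
            log.append(f"power on {vm}")
    finally:
        # Delete the snapshots of the replicated LUNs after the test.
        for lun in reversed(snapshots):
            log.append(f"delete snapshot of {lun}")
    return log
```

Putting the cleanup in `finally` reflects the intent that a failed test should not leave snapshots behind. A real test would also pause for inspection before cleanup, which this sketch omits.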
Testing a Recovery Plan
  SRM enables you to test a recovery plan by simulating a failover with zero downtime to the protected VMs in the protected site
  Storage configuration during an SRM test failover from Site A to Site B for datastore shared-san-2
  Site A - Protected Site
    > Source LUN (shared-san-2): read/write enabled
    > Protected VMs (app_vm7 to app_vm12): protected VMs that will be recovered to Site B
  Site B - Recovery Site
    > Target LUN (shared-san-2): write disabled (read only)
    > Clone LUN (shared-san-2): read/write enabled
    > Protected VMs (app_vm7 to app_vm12): protected VMs powered on in Site B during the SRM test failover
  Data replication continues between the Source LUN and the Target LUN
  The data synchronization between the Target LUN and the Clone LUN is suspended
  Note: Datastore shared-san-1 will be in the same configuration state as shared-san-2
Testing a Recovery Plan - continued
  [Screenshot labels: Status, Recovery Only, Test Only, Success, Errors, Waiting for Input]
Recovery Plan Reports
  Accessible compliance
  Exportable recovery plan
  Exportable recovery results
  Maintained history
Failover Automation
  Detect site failures
    > Raise alert when heartbeat lost
  Initiate failover
    > User confirmation of outage
    > Granular failover initiation
  Manage replication failover
    > Break replication
    > Make replica visible to recovery hosts
  Execute recovery process
    > Use pre-programmed plan
    > Provide visibility into progress
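The first two items on this slide separate detection from initiation: a lost heartbeat only raises an alert, and failover starts once a user confirms the outage. A minimal sketch of that gate follows. The function names and the 60-second timeout are assumptions, not SRM settings.

```python
def site_alert(last_heartbeat, now, timeout_s=60.0):
    """Return True when the peer site's last heartbeat is older than the timeout.

    Times are seconds on any monotonic scale; timeout_s is an assumed value.
    """
    return (now - last_heartbeat) > timeout_s


def should_fail_over(alert_raised, user_confirmed):
    """Failover starts only when an alert exists AND a user confirms the outage."""
    return alert_raised and user_confirmed
```

Keeping a person in the loop means a network partition between sites, which also interrupts heartbeats, does not trigger a failover on its own.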
Executing an Actual Failover
  WARNING - Executing an actual failover with SRM will permanently alter virtual machines and infrastructure of both the protected and recovery sites
  Storage configuration after running a Recovery in SRM (actual failover) from Site A to Site B
  Site A - Protected Site
    > Source LUN (shared-san-2): write disabled (read only)
    > Protected VMs (app_vm7 to app_vm12): all powered off by SRM at start of SRM Recovery
  Site B - Recovery Site
    > Target LUN (shared-san-2): read/write enabled
    > Protected VMs (app_vm7 to app_vm12): all powered on by SRM during the SRM Recovery
  Data replication is suspended
  Note: A Clone LUN is not used during an actual failover in SRM
Executing an Actual Failover - continued
  WARNING - Executing an actual failover with SRM will permanently alter virtual machines and infrastructure of both the protected and recovery sites
  WARNING - Failback to the protected site is not an automated process in SRM 1.0
Datastore Re-signature During Failover
  SRM will automatically perform a re-signature on the datastores in the recovery site that were replicated from the SRM protected site
    > LVM.EnableResignature=1
  With a re-signature, datastore names change to snap-xxxxxxxx-datastorename, for example:
    > snap-00000002-shared-san-1
    > snap-00000002-shared-san-2
  WARNING - The re-signature of the target datastore has implications during a failback (resync) of data back to the SRM protected site
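Because re-signatured datastores are renamed, anything that matches datastores by name (for example, a failback checklist) has to account for the prefix. The sketch below recovers the original name from the example names shown on this slide. The helper is hypothetical, and the pattern assumes the prefix is always `snap-` followed by eight hexadecimal digits, which is inferred from the two examples only.

```python
import re

# Assumed pattern, inferred from the examples "snap-00000002-shared-san-1".
_RESIGNATURED = re.compile(r"^snap-([0-9a-fA-F]{8})-(.+)$")


def original_datastore_name(name):
    """Strip the re-signature prefix; return the name unchanged if absent."""
    match = _RESIGNATURED.match(name)
    return match.group(2) if match else name
```

A name that does not carry the prefix is returned as-is, so the helper can be applied to a mixed list of datastores.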
Failback
  Promote secondary site to primary
    > For one or more VMs
  Add VMs to DR profile
    > Protect them at another site
  Manual failover
    > Short downtime (minutes)
    > Likely to be individual VMs/applications
Site Recovery Manager Core Benefits
  Expand disaster recovery protection
    > Now any workload in a VM can be protected with minimal incremental effort and cost
  Reduce time to recovery
    > As soon as disaster is declared, a single button kicks off the recovery sequence for hundreds of VMs
  Increase reliability of recovery
    > Replication of system state ensures a VM has all it needs to start up
    > Hardware independence eliminates failures due to different hardware
    > Easier testing based on the actual failover sequence allows more frequent and more realistic tests
Summary
  Site Recovery Manager Leverages VMware Infrastructure to Make Disaster Recovery
  Rapid
    > Automate disaster recovery process
    > Eliminate complexities of traditional recovery
  Reliable
    > Ensure proper execution of recovery plan
    > Enable easier, more frequent tests
  Manageable
    > Centrally manage recovery plans
    > Make plans dynamic to match environment
  Affordable
    > Utilize recovery site infrastructure
    > Reduce management costs
Questions?