VMware Site Recovery Manager (SRM) Lessons Learned in the First 90 Days



Similar documents
VMware Site Recovery Manager Overview Q2 2008

VMware vcenter Site Recovery Manager 5 Technical

VMware Site Recovery Manager with EMC RecoverPoint

DR-to-the- Cloud Best Practices

Cookbook Disaster Recovery (DR)

Drobo How-To Guide. Topics Drobo and vcenter SRM Basics Configuring an SRM solution Testing and executing recovery plans

Site Recovery Manager Administration Guide

Site Recovery Manager Installation and Configuration

VMware Business Continuity & Disaster Recovery Solution VMware Inc. All rights reserved

vsphere Replication for Disaster Recovery to Cloud

What s New with VMware Virtual Infrastructure

Oracle E-Business Suite Disaster Recovery Solution with VMware Site Recovery Manager and EMC CLARiiON

Implementing disaster recovery solutions with IBM Storwize V7000 and VMware Site Recovery Manager

Site Recovery Manager Installation and Configuration

Disaster Recovery with EonStor DS Series &VMware Site Recovery Manager

Basic System Administration ESX Server 3.0 and VirtualCenter 2.0

VMware vcloud Air - Disaster Recovery User's Guide

Basic System Administration ESX Server and Virtual Center 2.0.1

How To Use Vcenter Site Recovery Manager 5 With Netapp Fas/Vfs Storage System On A Vcenter Vcenter 5 Vcenter 4.5 Vcenter (Vmware Vcenter) Vcenter 2.

PROTECTING THE VIRTUAL DATACENTER: DISASTER RECOVERY USING DELL EQUALLOGIC PS SERIES STORAGE AND VMWARE VCENTER SITE RECOVERY MANAGER

Downtime, whether planned or unplanned,

Taking the Disaster out of Disaster Recovery

Implementing HP 3PAR Remote Copy with VMware vcenter Site Recovery Manager

Tintri Deployment and Best Practices Guide for VMware vcenter Site Recovery Manager 5.8

Getting Started with Database Provisioning

Optimization, Business Continuity & Disaster Recovery in Virtual Environments. Darius Spaičys, Partner Business manager Baltic s

EMC Business Continuity for VMware View Enabled by EMC SRDF/S and VMware vcenter Site Recovery Manager

Virtual Server Agent v9 with VMware. March 2011

VMware Virtualization for Business Continuity and Disaster Recovery. Guy Bowers Senior Systems Engineer Q1 2010

Oracle 11g Disaster Recovery Solution with VMware Site Recovery Manager and EMC CLARiiON MirrorView/A

vsphere Replication for Disaster Recovery to Cloud

VMware vsphere-6.0 Administration Training

Site Recovery Manager Administration Guide

VMware/Hyper-V Backup Plug-in User Guide

13.1 Backup virtual machines running on VMware ESXi / ESX Server

Implementing a Holistic BC/DR Strategy with VMware

VMware Site Recovery Manager and Nimble Storage

Virtual DR: Disaster Recovery Planning for VMware Virtualized Environments

AUTOMATED DISASTER RECOVERY SOLUTION USING AZURE SITE RECOVERY FOR FILE SHARES HOSTED ON STORSIMPLE

Workflow Templates Library

TECHNICAL PAPER. Veeam Backup & Replication with Nimble Storage

HP Matrix Operating Environment 7.2 Recovery Management User Guide

CHEAP DISASTER RECOVERY

EXPRESSCLUSTER X for Windows Quick Start Guide for Microsoft SQL Server Version 1

Veeam Backup Enterprise Manager. Version 7.0

Virtual Infrastructure Implementation of High Availability and Business Continuation Solutions

Table of Contents. Online backup Manager User s Guide

Site Recovery Manager (SRM) Proof of Concept (POC) Guide

Using Emergency Restore to recover the vcenter Server has the following benefits as compared to the above methods:

How to Backup and Restore a VM using Veeam

Mastering Disaster Recovery: Business Continuity and Virtualization Best Practices W H I T E P A P E R

Microsoft SharePoint 2010 on VMware Availability and Recovery Options. Microsoft SharePoint 2010 on VMware Availability and Recovery Options

ArCycle vmbackup. for VMware/Hyper-V. User Guide

EMC Disaster Recovery with VMware SRM

Backup Exec Private Cloud Services. Planning and Deployment Guide

CA ARCserve Replication and High Availability Deployment Options for Hyper-V

SnapManager 2.0 for Virtual Infrastructure Best Practices

Core Protection for Virtual Machines 1

TGL VMware Presentation. Guangzhou Macau Hong Kong Shanghai Beijing

NetApp OnCommand Plug-in for VMware Backup and Recovery Administration Guide. For Use with Host Package 1.0

Khóa học dành cho các kỹ sư hệ thống, quản trị hệ thống, kỹ sư vận hành cho các hệ thống ảo hóa ESXi, ESX và vcenter Server

Drobo How-To Guide. Use a Drobo iscsi Array as a Target for Veeam Backups

VMware for Bosch VMS. en Software Manual

Virtualization & Covance Inc.

SAP Disaster Recovery Solution with VMware Site Recovery Manager and EMC CLARiiON

Continuous Data Protection for any Point-in-Time Recovery: Product Options for Protecting Virtual Machines or Storage Array LUNs

Bosch Video Management System High availability with VMware

Technology Comparison. A Comparison of Hypervisor-based Replication vs. Current and Legacy BC/DR Technologies

Direct Storage Access Using NetApp SnapDrive. Installation & Administration Guide

Clustering VirtualCenter 2.5 Using Microsoft Cluster Services

VMware View Design Guidelines. Russel Wilkinson, Enterprise Desktop Solutions Specialist, VMware

Architecting DR Solutions with VMware Site Recovery Manager

Acronis Backup & Recovery 10 Advanced Server Virtual Edition. Quick Start Guide

NetIQ Sentinel Quick Start Guide

VMware vsphere 5.0 Evaluation Guide

User Guide for VMware Adapter for SAP LVM VERSION 1.2

EMC ViPR Controller. Backup and Restore Guide. Version Darth 302-XXX-XXX 01

Enterprise Manager. Version 6.2. Administrator s Guide

Index C, D. Background Intelligent Transfer Service (BITS), 174, 191

JOB ORIENTED VMWARE TRAINING INSTITUTE IN CHENNAI

BEST PRACTICES GUIDE: VMware on Nimble Storage

VMware and VSS: Application Backup and Recovery

Running VirtualCenter in a Virtual Machine

Using EonStor FC-host Storage Systems in VMware Infrastructure 3 and vsphere 4

Introduction. Setup of Exchange in a VM. VMware Infrastructure

VirtualCenter Database Maintenance VirtualCenter 2.0.x and Microsoft SQL Server

SERVER VIRTUALIZATION AND STORAGE DISASTER RECOVERY. Ray Lucchesi, Silverton Consulting

StarWind iscsi SAN Software: Using StarWind with VMware ESX Server

Microsoft SMB File Sharing Best Practices Guide

Virtual Storage Console 4.0 for VMware vsphere Installation and Administration Guide

Kaseya 2. User Guide. Version 7.0. English

Deploying the BIG-IP System with VMware vcenter Site Recovery Manager

Installing and Configuring Windows Server Module Overview 14/05/2013. Lesson 1: Planning Windows Server 2008 Installation.

PassTest. Bessere Qualität, bessere Dienstleistungen!

Setup for Microsoft Cluster Service ESX Server and VirtualCenter 2.0.1

Directions for VMware Ready Testing for Application Software

Optimizing Disaster Recovery with VMware Site Recovery Manager and EMC Celerra Replicator V2

vcloud Air Disaster Recovery Technical Presentation

Setup for Failover Clustering and Microsoft Cluster Service

VMware vsphere Data Protection Evaluation Guide REVISED APRIL 2015

Transcription:

VMware Site Recovery Manager (SRM) Lessons Learned in the First 90 Days Bob van der Werf / Jeremy van Doorn Sr. Systems Engineer BvanderWerf@vmware.com +31 (0)6 1389 9144

Agenda SRM in 10 minutes Installation & Configuration FAQs, issues, pitfalls Storage Replication Adapters install, setup, gotchas SRM Administration concepts, FAQs Failback Workflows, Considerations Q & A

SRM in 10 Minutes Disaster Recovery Problem Set VMware s DR Vision Site Recovery Manager

Disaster Recovery Problem Set Management Simplify and automate setup, testing, and execution of recovery process Lower RTO and increase reliability Data Protection of all data configuration, OS, and application Infrastructure Infrastructure Reduce cost and complexity of providing infrastructure necessary to ensure successful recovery

VMware Vision for Disaster Recovery Rapid Reliable Automate recovery process Eliminate failures due to hardware dependencies Integrate different components of recovery Enable easier, more frequent testing Turn manual, inconsistent processes into pre-programmed, repeatable processes Manageable Centralize and simplify management of recovery plans Make disaster recovery protection a property of virtual infrastructure Affordable Eliminate idle recovery hardware Eliminate dependencies on physical infrastructure

Useful Properties of Virtualization for DR Partitioning Hardware Independence VMotion Run multiple virtual machines simultaneously on a single physical server Run a virtual machine on any server without modification Isolation Encapsulation Each virtual machine is isolated from other virtual machines on the same server Virtual machines encapsulate entire systems (hardware configuration, operating system, apps) in files Copyright 2006 VMware, Inc. All rights reserved.

Introducing VMware Site Recovery Manager Site Recovery Manager leverages VMware Infrastructure to deliver advanced disaster recovery management and automation Simplifies and automates disaster recovery workflows: Setup, testing, failover Turns manual recovery runbooks into automated recovery plans Provides central management of recovery plans from VirtualCenter Works with VMware Infrastructure to make disaster recovery rapid, reliable, manageable, affordable

Site Recovery Manager at a Glance Protected Site Site A Recovery Site Protected Site Site B Recovery Site VirtualCenter Site Recovery Manager Supports bidirectional site protection VirtualCenter Site Recovery Manager Protected VMs offline powered on Protected VMs online become in unavailable Protected Site Array Replication Datastore Groups Datastore Groups

Site 1 Server Side Components VC Server 1 Site 2 VC Server 2 VCMS 1 DB SRM Server 1 SRM Server 2 VCMS 2 DB SRM 1 DB Storage Replication Adapter Storage Replication Adapter SRM 2 DB Block Replication SW Array 1 Array 2 Block Replication SW

Installation & Configuration Issues from the field FAQs Common Pitfalls

Install & Config Issues from the field 1 of 3 Q. We had SRM installed and working but now the SRM service will not start? A. Most common root cause is SRM not licensed (check SRM logs) and defaults back to Eval mode. Solution is to install your full SRM license Q. We have installed SRM but would now like to rename our Sites can we do this? A. In this release of SRM the site names specified during the install process cannot be undone without reinstalling SRM. This feature request will be considered for a later SRM update Q. Can we install SRM and use only a single VirtualCenter instance? A. No. SRM requires two instances of VirtualCenter

Install & Config Issues from the field 2 of 3 Q. During the install process port 80 is defined as the communication port for VirtualCenter, can this be changed? A. Even though SRM uses SSL when it communicates to VirtualCenter, it does not use port 443. SRM establishes a TCP connection to port 80, then uses an HTTP CONNECT request to establish a tunnel to the VC server, then does an SSL handshake with VC over that tunnelled connection. The SRM installation enforces these semantics.

Install & Config Issues from the field 3 of 3 Q. We have installed SRM but cannot see any SRM screens inside VirtualCenter? A. To make the SRM icon and screens available you must download/install the SRM plugin via the VirtualCenter Manage Plugins menu. Your VirtualCenter userid will also need the appropriate privileges to be able to work with SRM.

Installation & Configuration FAQs Q. Where should we install SRM (and the SRA)? In a VM? On the same VM as VirtualCenter? Should the SRM database reside alongside the VirtualCenter db? Can the SRM database be of a different type? i.e Oracle? A. It can depend on.. Size of VI environment (# hosts/# VMs) Small #hosts & VMs customers deploying in same VM as VC Larger #hosts & VMs customers installing SRM components in separate VM Type of SRA being used can be a factor i.e does your SRA need access to admin LUN(s) to communicate with storage?

Installation & Configuration Common Pitfalls Q. We have created a protection group but it has a warning triangle icon against it and if we try to configure protection on the VM a Edit VM Properties wizard pops up? A. The root cause here is nearly always a missing inventory mapping relating to one of the virtual machine(s) at the protected site that the protection what does group this trying look to protect. If Solution we review is to the SRM Inventory log simply like? we map can Mapping see the this folder also screen to a folder we at can the see that the recovery folder site the Webstore VM resides in has not been mapped.

Storage Replication Adapters Installation Common Setup Issues Gotchas

Storage Replication Adapters Installation Download SRA from vmware.com install, no other checks needed, is that correct? NO, each vendor provides a readme, ensure you review this Each vendor also supplies a whitepaper / technote covering best practices for setting up their adapter (SRA), ensure you seek these out! List of currently available docs on the SRM forum: http://communities.vmware.com/thread/165859?tstart=0 All the SRA s communicate with their arrays in the same way? NO, again each vendors architecture is different for connectivity some require the installation of a client side remote command suite (provided by the storage vendor)

Storage Replication Adapters Common Setup Issues Launching the Configure Array Managers wizard generates errors? Errors returned during the simple input phase of the array manager configuration usually point to misconfiguration of the storage array client/server communication If you are seeing errors at this point review the storage vendors readme for installation of their adapter and ensure to review any best whitepapers. Pay particular attention to sections on configuration of any command line variables Common mistakes include adapters that depend on perl and then fail to set a PATH variable for the perl bin folder Array configurations that do not match pre-reqs defined in storage vendors documentation

Storage Replication Adapters Gotchas 1 of 2 Storage Replication Adapter is installed correctly, all seems ok however in the datastore groups screen no datastores appear? If the replicated VMFS datastores are empty i.e contain no VMs then the datastore will not appear. Add VMs into the datastore(s) and use the rescan arrays button to update the view

Storage Replication Adapters Gotchas 2 of 2 We have storage replication configured but still see errors running the recovery plans? Double check the storage configuration against the storage vendors pre-reqs. For Test mode recoveries storage vendors may not support all of their own available method for creating snapshot luns. Verify the method you are using matches their documentation If you have configured device/consistency/disk groups within your storage array verify the setup of this matches the SRA s recommended configuration. Some vendors have particular requirements for configuration of these data consistency objects when used with Site Recovery Manager

SRM Administration SRM Admin Concepts FAQs

SRM Admin Concepts Placeholder VMs When creating a protection group SRM prompts for a datastore location to house Placeholder VMs what are these used for? Used to identify a location of the recovered VM in within the recovery site VirtualCenter inventory SRM will replace the placeholder VM with the VM registered from the replicated storage during testing / failover Which datastore should be selected to hold the placeholder VMs? What to consider? Datastore needs to reside at the recovery site Datastore does not need to be replicated Sizing - datastore will only contain VM config files (*.vmx, *.vmxf, *.vmsd (typically 3 files < 1KB each)

SRM Admin Concepts Networking 1 of 2 If a recovery plan is run in Test mode or Run mode what are the differences in terms of networking for the VM s being recovered? If recovery plan is invoked in Test mode the VMs are by default assigned to the bubble network (designated Auto ). If recovery plan is invoked in Run mode the VMs are by default assigned to the network (portgroup) defined in the inventory mapping screen at the protected site.

SRM Admin Concepts Networking 2 of 2 Considerations for Test networking Reality is this feature is a safety valve to prevent VMs being tested on production portgroups by default In practice customers will create test VLANs and override bubble

SRM Admin Concepts VM Placement 1 of 2 How will the placement of VMs affect the flexibility of an SRM protection group / recovery plan design? Remember that the basic unit of replication is the datastore A protection group will consist of one or more datastores For a protection group to map to a business entity such as application group, department, IT service then VMs on datastore must map to that entity to provide required granularity For protection groups and recovery plans consisting of multiple datastores verify the replication schedules are sufficiently consistent with each other to meet your RPO s and ensure applications recover correctly. This can be enforced at the storage array level

SRM Admin Concepts VM Placement 2 of 2 Using information from previous slide lets look at a simple example. Goal have a protection group and a recovery plan that consists of only our RDBMs VMs First We Create note locate the that the recovery the RDBMS VMs datastore plan, and we now verify group see they consists the are only of flexibility more type than SRM on datastore one offers in (if datastore. testing others subsets exist We we will of can then svmotion verify infrastructure this to is another a valid config datastore). and the VM Create involved is protection one of our group RDBMS VMs

SRM Administration FAQs 1 of 3 Q. Which VirtualCenter object is SRM enabled on, Host, Cluster, Resource Pool? A. In SRM the basic unit of replication is the datastore. Recovered VMs can be placed on arbitrary hosts/clusters, as long as the hosts can access the replicated datastores Q. Do all VM s we are protecting need to be in a cluster? A. No. SRM only requires separate VirtualCenter instances. One managing protected site and other managing recovery site Q. For failover how does SRM guarantee resources at the recovery site? A. SRM can suspend local VM s at the recovery site as part of recovery plan. Best practice is to also use resource pools at the protected site and map these to resource pools at the recovery site using SRM Inventory Mapping

SRM Administration FAQs 2 of 3 Q. Occasionally when we login to the SRM screens we see the sites pairing status displayed as Low Resources on Paired Site what causes this? A. The Low Resources message can be generated if any of the following conditions are true on the server (VM) where SRM is installed: Remote site free disk space drops below 100 Mb (default) Remote site CPU usage goes above 70 % (default) Remote site available memory drops below 32 Mb (default) These are default values which can be configured by modifying the vmware-dr.xml file located in C:\Program Files\VMware\VMware Site Recovery Manager\config The fields to modify are mindiskspace, maxcpuusage, and minmemory.

SRM Administration FAQs 3 of 3 Q. During the beta SRM used to preserve the VMFS datastore names now they appear as Snap-xxxx? A. The GA version of SRM, by default, does not rename datastores back to the original VMFS name. Customers with ESX v3.0.x should leave the default behaviour in place. Customers with ESX v3.5.x can reverse the behaviour by setting the following variable to true in vmware-dr.xml, default is false <fixrecovereddatastorenames>true</fixrecovereddatastorenames> Note: vmware.dr.xml is located in C:\Program Files\VMware\VMware Site Recovery Manager\config

Failback Failback and SRM 1.0 Options with SRM 1.0 Abbreviations and Checklist Leveraging SRM Considerations and Summary

Failback Is Not Automated in SRM 1.0?

Why Is Failback Not Automated in SRM 1.0? Storage Replication Adapter (SRA) is ONLY responsible for: Array discovery Replicated LUN discovery SRM Test initiation (simulated failover in an isolated environment) SRM Failover initiation (actual failover of services to the recovery site) SRM Failback initiation is not part of the SRA 1.0 specification

Storage View During a SRM Recovery Storage configuration after running a Recovery in SRM (Actual Failover) from Site A to Site B Site A - Protected Site Site B - Recovery Site SRM Recovery Data Replication is suspended Write Disabled (read only) Read Write Enabled Protected VMs (app_vm7 to app_vm12) All powered off by SRM At start of SRM Recovery Source LUN (shared-san-2) Target LUN (shared-san-2) Protected VMs (app_vm7 to app_vm12) All powered on by SRM during the SRM Recovery Note: A Clone LUN is not used during an actual failover in SRM.

SRM 1.0 Failback Options You have two Failback Options Without SRM (no Recovery Plan, no Testing capabilities, no audit trail) Delete the protection groups in the Protected Site VC Unregister the protected virtual machines in the Protected Site VC Work with your storage team, reverse data replication VM re-inventory in Protected Site VC, restart and re-ip (manual or scripted) With SRM (Recovery Plan, Test before Recovery, built-in audit trail) Delete the protection groups in the Protected Site VC Unregister the protected virtual machines in the Protected Site VC Work with your storage team, reverse data replication Leverage SRM, complete SRM workflows in the reverse direction from Recovery Site back to the Protected Site Repeat the above steps from the Protected Site back to the Recovery Site to complete the re-protection of the virtual machines in the Protected Site

Failback Leveraging SRM Connect to the VC instance in Site A and delete protection group (PG 1)

Failback Leveraging SRM.. Connect to the VC instance in Site A and perform a remove from inventory operation on all the protected VMs in Site A

Failback Leveraging SRM.. Work with your Storage team to complete a storage configuration change personality swap FIRST ONE Source LUN is now associated with Site B and the Target LUN is associated with Site A

Failback Leveraging SRM.. Complete the Array Manager configuration wizard in Site B. Site B A Array Manager View

Failback Leveraging SRM.. Configure Inventory Preferences in Site B These inventory preferences will be applied to the protected VMs when they are restarted in Site A after the failback

Failback Leveraging SRM.. Connect to the VC instance in Site B and configure protection groups for the virtual machines that will be failed back to Site A

Failback Leveraging SRM.. Connect to the VC instance in Site A and create required recovery plan(s) in Site A

Failback Leveraging SRM Final Step You are now READY for FAILBACK First complete a SRM Test from Site B to Site A Confirm all protected VMs started up cleanly Review the history report which is your audit trail for the SRM Test Second complete a SRM Recovery from Site B to Site A Confirm all protected VMs started up cleanly Review the history report which is your audit trail for the SRM Recovery* Note: FAILBACK is completed via a SRM Recovery operation

Post Failback Rebuild Original Configuration Shutdown all of the protected VMs in Site A that were failed back from Site B Perform a cleanup of the directory in Site A that contained the VM configuration files created during protection group(s) creation in Site B Connect to the VC instance in Site B and delete protection group(s) that were created in Site B Connect to the VC instance in Site B and perform a remove from inventory operation on all the protected VMs in Site B that were recovered to Site A

Rebuilding Original SRM Configuration Work with your Storage team to complete a storage configuration change personality swap SECOND ONE Source LUN is now re-associated with Site A, the Target LUN is re-associated with Site B along with the Clone LUN

Rebuilding Original SRM Configuration.. Recreate protection group(s) in Site A for the protected VMs Note: The protection groups you create here should be identical to the protection groups that were originally associated with recovery plans contained at Site B ( RP 1) Re-associate protection group(s) in Site A with RP 1 in Site B

Rebuilding Original SRM Configuration..DONE! Complete a final Test a simulated failover against a recovery plan to ensure that Site A is RE-PROTECTED and ready for any event that may necessitate a Recovery via SRM to Site B should a disaster be declared SRM Test

Failback Considerations Site Pairing from Site B back to Site A via the SRM Connection wizard Not required as this is a bi-directional relationship and only needs to be completed once between Site A and Site B SRM License transfer between Site A and Site B The SRM failback steps did not discuss the transfer of SRM licenses between Site A and Site B and vice versa DNS Updates: SRM 1.0 does not provide a mechanism to update DNS Remember to update DNS as required during FAILBACK

Failback Summary Failback is not an automated procedure in SRM 1.0 It is a multistep process that will require planning on your part You will need to involve the storage team during failback Failback can be completed leveraging SRM, the benefits are: Recovery Plan Test before Recovery (FAILBACK) Built-in audit trail Practice Makes Perfect, Test and then Re-Test your FAILBACK procedures

VMware BCDR References Download VMbook on BCDR at: http://www.vmware.com/files/pdf/practical_guide_bcdr_vmb.pdf Visit the Site Recovery Manager site to learn more at: http://www.vmware.com/products/srm/ Download the Site Recovery Manager Compatibility Matrix, which is available at: http://www.vmware.com/pdf/srm_10_compat_matrix.pdf Download the Site Recovery Manager Evaluator's Guide, which is available at: http://www.vmware.com/pdf/srm_10_eval_guide.pdf

Q&A