Nutanix Solution Note




Nutanix Solution Note Version 1.0 April 2015

Copyright 2015 Nutanix, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. Nutanix is a trademark of Nutanix, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies.

1. Nutanix Virtual Computing Platform
2. Backup
Unlimited VM-centric Snapshots
Hybrid Cloud Deployments with Cloud Connect
3. Disaster Recovery
Remote Replication
Metro Availability
4. Ecosystem Integration
5. A RESTful Future
6. Conclusion

1. Nutanix Virtual Computing Platform

This Solution Note discusses the data protection and disaster recovery functionality in the Nutanix Virtual Computing Platform. We also recommend reading the Nutanix Tech Note on infrastructure resilience to learn more about the resiliency features of the Virtual Computing Platform, including how hardware and software failures are handled.

The Nutanix distributed software architecture runs a virtual storage controller (Controller VM or CVM) on each Nutanix node or host on the Virtual Computing Platform, forming a distributed system. All nodes actively work together to aggregate storage resources into a single global pool that can be leveraged by all. The storage resources are managed by the Nutanix Distributed File System (NDFS) to ensure that data and system integrity are preserved in the event of a node, disk, application or hypervisor software failure. NDFS also delivers data protection and high availability functionality that keeps critical data and VMs protected and applications running.

Figure 1: Nutanix solution for data protection and disaster recovery covers all aspects of availability

2. Backup

Unlimited VM-centric Snapshots

The foundation of the Nutanix data protection functionality is the concept of a VM-centric snapshot. To understand the advantage of Nutanix snapshot functionality, it is important to understand the different types of snapshots available today.

A snapshot is an evolution of the traditional backup process. It is created when the storage system creates a full or virtual copy of the metadata or the index of the stored data. This is different from traditional backup solutions, which create separate copies of the stored data. Because snapshots only need to copy the metadata or index at the time they are taken, they can be near instantaneous, have little performance impact and require little incremental space. IT organizations can take snapshot-based backups more frequently and improve their recovery point objectives. Backup vendors and analysts have acknowledged the shift to snapshots as a viable option for backup and recovery.

However, it is important to note that not all snapshot implementations are created equal. Each implementation has different storage requirements and poses different restrictions on its use. The preferred implementation of snapshots is redirect-on-write (ROW). In this method, any updates to existing protected data are redirected to a new location. None of the existing data in snapshots needs to be copied or moved. As a result, ROW snapshots do not suffer the performance impact of the alternative copy-on-write snapshot implementations. The performance impact of copy-on-write snapshots limits their applicability for primary data.

Another consideration when implementing snapshots is the granularity of data that can be protected. This determines the space overhead of the snapshots taken. Smaller block sizes result in increased sharing of data between snapshots and greater space efficiency.
With large blocks, a change to a small portion of a block would create a full new block with mostly duplicate data, causing the snapshot size to be much larger than the amount of data changed.

The last aspect that needs to be considered for snapshot design is the unit of data that can be protected and restored by the storage system. Traditional storage deployments typically operate at the storage object or volume/LUN level with little to no understanding of what is stored in those containers. In a virtualized environment, this results in a simultaneous snapshot of tens to hundreds of VMs, each with varying change rates. Consequently, it puts the burden on administrators to map the different VMs to storage objects such as LUNs or volumes. This results in additional steps and greater system complexity, especially when recovering individual VMs. In the traditional approach, snapshot schedules can only be set at a LUN or volume level, leading to practices such as creating one LUN per VM as a workaround in order to create individualized VM snapshot schedules.

An alternative to this method is taking a VM-centric approach to storage and data protection. In this scenario, storage understands and operates at the virtual disk or VM level. Snapshots are taken at the VM level, and administrators can set schedules and retention periods at the VM level to meet service levels. Recovery is simple, as administrators can restore individual VMs without dealing with the underlying storage objects.

This brings us to the snapshot implementation on the Virtual Computing Platform. Nutanix OS implements redirect-on-write, VM-granular snapshots. When a snapshot of a VM is initially taken on the Nutanix Virtual Computing Platform, the system creates a read-only, zero-space clone of the metadata (i.e. the index to the data) and makes the underlying VM data immutable or read-only; no VM data or virtual disks are actually copied or moved.
The system creates a read-only copy of the VM that can be accessed in the same way as its active counterpart. Nutanix snapshots take only a few seconds to create, eliminating application and VM backup windows.
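The redirect-on-write behavior described above can be illustrated with a minimal Python sketch. This is a toy model written for this note's explanation, not the NDFS implementation: the class RowDisk, its block map, and the snapshot names are all hypothetical. It shows the two properties the text relies on: taking a snapshot copies only metadata, and later writes are redirected so snapshot data is never overwritten.

```python
class RowDisk:
    """Toy redirect-on-write virtual disk (illustrative only, hypothetical names)."""

    def __init__(self):
        self.store = {}      # location id -> data; each location is written once
        self.active = {}     # logical block number -> location id (the metadata)
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self._next_loc = 0

    def write(self, block, data):
        # Redirect: place the data at a fresh location and repoint the active map.
        loc = self._next_loc
        self._next_loc += 1
        self.store[loc] = data
        self.active[block] = loc

    def snapshot(self, name):
        # Only the block map (metadata) is copied; no stored data is copied or moved.
        self.snapshots[name] = dict(self.active)

    def read(self, block, snapshot=None):
        block_map = self.active if snapshot is None else self.snapshots[snapshot]
        return self.store[block_map[block]]


disk = RowDisk()
disk.write(0, b"alpha")
disk.write(1, b"beta")
disk.snapshot("hourly-1")
disk.write(1, b"gamma")  # redirected; the snapshot still points at b"beta"
```

After the last write, reading block 1 from the active disk returns the new data while reading it through "hourly-1" returns the original, and block 0 is shared between the two because its location was never repointed.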

After a snapshot is taken and as the VM continues to run, any updates to existing data and any new writes are redirected. The original data in the snapshot remains unchanged, and the unchanged data is shared across the snapshots and the active VM. The Virtual Computing Platform handles this transparently, so there is no change to how applications and the virtualization stack access the VM.

From an efficiency standpoint, Nutanix snapshots can be taken with byte-level resolution. This byte-incremental implementation means that only the changed data is captured between successive snapshots. For even greater efficiency, all the data stored on the Virtual Computing Platform, including snapshots, can be compressed and deduplicated. Although savings vary with the specific workload, average deployments have seen a 25% to 75% reduction in the amount of space needed.

Figure 2: Nutanix snapshots are more efficient with byte-level granularity

The VM-granular snapshots can be set to be either crash-consistent or VM-consistent and can be scheduled on an hourly, daily, weekly or monthly basis depending on the Recovery Point Objectives (RPO) and retention needs. The choice between crash-consistent and VM-consistent snapshots should be based on recovery needs. Crash-consistent snapshots are instantaneous and are sufficient for workloads able to recover from operating system (OS) or VM crashes. Stateless applications such as web servers are best protected through crash-consistent snapshots. The alternative, VM-consistent snapshots, takes advantage of host frameworks and services such as Microsoft Volume Shadow Copy Service (VSS) to quiesce the VM and supported applications, rendering them into a known or consistent state. In the case of VMware running Microsoft Windows guests, VSS support is provided by VMware Tools running in the guest OS.
Using deep integration between the Nutanix Virtual Computing Platform and VMware vSphere, VMware Tools is called to quiesce the OS and supported applications such as Microsoft Exchange and SQL Server before the Virtual Computing Platform takes a VM-consistent snapshot of the VM.

Additionally, multiple VMs can be grouped together in a Nutanix protection domain, enabling them to be operated upon as a single entity with the same RPO. This is useful when trying to protect complex applications such as Microsoft SQL Server-based applications or Microsoft Exchange. The main advantage of grouping VMs in a protection domain, versus the traditional SAN approach of consolidating different VMs onto a single LUN, is VM portability. VMs can be moved between different protection domains on a Nutanix Virtual Computing Platform without the need for any data to be moved or copied. For traditional SANs, changing a VM's SLA will most likely require migrating the VM to another LUN or volume.

Because of the unique NDFS design, which leverages a shared-nothing distributed approach to metadata, there is no upper limit to the number of snapshots that can be taken with the Nutanix Virtual Computing Platform. This scalable approach eliminates the need for separate storage systems for backup and long term archiving, as the VM snapshots are stored across the entire cluster that makes up the Nutanix Virtual Computing Platform.

Keeping Data Optimized: The Nutanix Virtual Computing Platform runs a distributed data management service in the background. The MapReduce-based service, called Curator, is responsible for executing tasks such as metadata optimization, garbage collection of deleted VMs, data reduction, tiering, consistency checking, and rebalancing to optimize data across nodes and flash/disks with minimal impact to performance.

Nutanix snapshot technology forms the basis of a unique set of functionality and ecosystem for high availability and disaster recovery. The first feature that builds on the Nutanix snapshot capability is VM-granular cloning. Cloning can be used for a variety of reasons including deployment and recovery. Integration with the virtualization stack, through functionality such as VMware vStorage APIs for Array Integration (VAAI) and VMware View Composer API for Array Integration (VCAI), enables administrators to simplify VM deployment using integrated cloning. For the purpose of this document, the discussion will focus on recovering VMs.

The Virtual Computing Platform enables user-driven recovery of individual VMs from snapshots. This is done either by replacing the existing active VM with the snapshot copy or by creating a separate clone of a snapshot while preserving the active VM. Depending on the snapshot settings, the recovered VM will be either crash-consistent or VM-consistent upon recovery. If needed, administrators can create a clone of a Nutanix VM-granular snapshot for the purpose of recovering a single file without taking up additional space. Compared to a traditional LUN/volume-based approach, a VM-granular snapshot approach eliminates the need for first recovering the storage object (LUN/volume), then identifying and mounting the VM, and then recovering the file.

Hybrid Cloud Deployments with Cloud Connect

With Nutanix Cloud Connect, customers can now leverage public cloud services as a destination and seamlessly back up and recover their virtual machines as if the cloud were another site that they own. Cloud Connect reuses all the concepts discussed earlier and extends them to the public cloud. Depending on the workload and the associated SLAs, customers can tune the backup schedule and retention periods.
All management happens centrally from within Nutanix Prism, a single management console for storage, compute, backup and DR. From within Nutanix Prism, administrators can set up Cloud Connect, back up workloads to the public cloud or a remote site, look through protected items, perform quick recovery, and change protection schedules. When a VPC is used to connect to the public cloud, all of the nodes participate in replication, so replication does not impact the running workloads. Data that is sent across the WAN can be compressed, and the granularity of what is sent is at the byte level. If 32KB of data has changed, Nutanix sends 32KB of data. If only 4KB of data has changed, then only 4KB of data is sent.
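The byte-level arithmetic above can be checked with a small Python sketch. It is a simplified illustration, not the Nutanix change-tracking mechanism: the function names changed_ranges and bytes_to_send are hypothetical, and the sketch compares two in-memory images of equal length rather than snapshots on disk.

```python
def changed_ranges(old: bytes, new: bytes):
    """Return (offset, data) pairs covering the bytes of `new` that differ from `old`.

    Assumes both images have the same length (a simplification for this sketch).
    """
    if len(old) != len(new):
        raise ValueError("images must be the same length")
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(old, new)):
        if a != b:
            if start is None:
                start = i            # a changed run begins here
        elif start is not None:
            ranges.append((start, new[start:i]))  # the changed run ended at i
            start = None
    if start is not None:
        ranges.append((start, new[start:]))       # run extends to the end
    return ranges


def bytes_to_send(old: bytes, new: bytes) -> int:
    # Total payload if only the changed byte ranges cross the WAN.
    return sum(len(data) for _, data in changed_ranges(old, new))


KB = 1024
base = bytes(64 * KB)                 # 64KB image of zero bytes
changed = bytearray(base)
changed[8 * KB:12 * KB] = b"\x01" * (4 * KB)  # modify exactly 4KB
payload = bytes_to_send(base, bytes(changed))
```

Here a 4KB modification inside a 64KB image yields a 4KB payload, whereas a scheme that shipped whole large blocks would send more than was changed.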

Figure 3: Cloud Connect leverages public cloud resources worldwide

3. Disaster Recovery

Remote Replication

Nutanix VM-granular snapshots also make it possible to efficiently replicate individual virtual machines from a primary Virtual Computing Platform to one or more secondary Nutanix clusters across different sites. By supporting a fan-out and fan-in, or multi-way, model for replication, the Virtual Computing Platform can create a flexible multi-master virtualization environment for backup and disaster recovery. Deployments supporting numerous remote and branch offices can benefit from a flexible deployment model.

Figure 4: Multi-way protection domains make DR flexible

Since the software-defined replication functionality builds on VM-granular snapshots, policies for replication are also set at the individual protection domain level rather than at the LUN/volume level. Only byte-level changes between snapshots of individual VMs are sent over the network to the remote cluster. NDFS also allows a host other than the one serving I/O on the active virtual disk to calculate the changed blocks, eliminating performance bottlenecks for critical VMs and their corresponding hosts. As a result, all nodes in the cluster participate in replication.

Figure 5: Eliminate bottlenecks by using all cluster resources for replication (panels: host-based or storage-based replication; VM-granular replication with Nutanix)

To make the most of WAN connectivity, the data can be deduplicated and compressed before it is sent across the WAN. First, the fingerprints of changed blocks for individual VMs are sent from the primary system to the different destinations. Each destination system reports back with the unique blocks it needs, and the primary system then sends those blocks. Deduplicating data sent to remote sites can effectively cut the bandwidth required by as much as 75% versus host-based full-copy backup solutions.
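The fingerprint exchange described above can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the Nutanix wire protocol: the RemoteSite class and replicate function are hypothetical, and SHA-1 is used only as a stand-in because the note does not name the fingerprint function.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumption: SHA-1 stands in for whatever fingerprint the platform uses.
    return hashlib.sha1(block).hexdigest()


class RemoteSite:
    """Hypothetical destination cluster that stores blocks by fingerprint."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block data

    def missing(self, fingerprints):
        # Step 2: report which fingerprints this site does not already hold.
        return [fp for fp in fingerprints if fp not in self.blocks]

    def receive(self, blocks):
        # Step 3: store the blocks the primary sent.
        for block in blocks:
            self.blocks[fingerprint(block)] = block


def replicate(changed_blocks, site) -> int:
    """Replicate changed blocks to `site`; return how many blocks crossed the WAN."""
    by_fp = {fingerprint(b): b for b in changed_blocks}  # step 1: fingerprints only
    needed = site.missing(list(by_fp))
    site.receive([by_fp[fp] for fp in needed])
    return len(needed)


site = RemoteSite()
first = replicate([b"A" * 4096, b"B" * 4096, b"A" * 4096], site)
second = replicate([b"B" * 4096, b"C" * 4096], site)
```

In the first pass three changed blocks collapse to two unique blocks; in the second pass only the block the site has never seen is sent, which is where the bandwidth saving comes from.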

Nutanix VM-granular replication makes it possible to create an affordable disaster recovery solution. The converged compute and storage approach used by the Virtual Computing Platform, along with the VM-centric approach to replication, makes creating a disaster recovery solution very simple. Using the protection domain concept, groups of related VMs can be replicated together, and those VMs can be brought up on the secondary site with a single command if the primary site is down. Because the workloads are virtualized and replication is not hardware dependent, the secondary site can have different cluster sizes and configurations from the primary site's clusters. This is especially useful for deployments with multiple remote sites using a centralized backup and disaster recovery strategy.

Metro Availability

Metro Availability synchronously replicates data to another site, ensuring that a real-time copy of the data exists at a different location. In the event of a disaster or planned maintenance, virtual machines (VMs) can fail over from a primary site to a secondary site, guaranteeing near 100% uptime for applications.

Metro Availability is a continuous availability solution that provides a global file system namespace across a stretched container between Nutanix clusters. The stretched container is supported by synchronous storage replication between independent Nutanix clusters at different sites. Synchronous replication is enabled at the container level, and all virtual machines and files stored within that container are replicated synchronously to another Nutanix cluster. Containers have one of two roles while enabled for Metro Availability: Active or Standby. Active containers replicate data synchronously to Standby containers. The Active and Standby containers are mounted to their respective hypervisor hosts using the same datastore name, which effectively spans the datastore across both clusters and sites.
With a stretched datastore across both Nutanix clusters, a single hypervisor cluster can be created, and common clustering features such as VMware vMotion and VMware High Availability can be used to manage the environment.

Metro Availability is supported in conjunction with existing Nutanix data management features including compression, deduplication and tiering. Metro Availability also allows compression to be enabled for the synchronous replication traffic between the Nutanix clusters. Compression of replication traffic is enabled when creating the remote site configuration and helps reduce the total bandwidth required to maintain the synchronous relationship.

With Metro Availability, hypervisor-related high availability or clustering technologies typically used within datacenters can now be leveraged across datacenters. This type of configuration is commonly referred to as a stretched cluster and helps to minimize downtime during unplanned outages. Metro Availability also supports the migration of virtual machines across sites using technologies such as vMotion. This enables zero downtime while transitioning workloads between datacenters.

Setup and management are simple, intuitive and done from within the Prism UI. They can also be automated using REST APIs in larger environments. The simplicity and ease of management are unparalleled, and for the first time enterprises will have a modern consumer-grade management experience when it comes to disaster recovery and high availability.
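The synchronous relationship between Active and Standby containers can be pictured with a short Python sketch. It is a deliberately simplified model, not how NDFS orders or acknowledges I/O: the Container and StretchedContainer classes are hypothetical, and the sketch writes to the Standby side first only so that a failed write leaves both toy copies unchanged.

```python
class Container:
    """Hypothetical container on one cluster; `online` simulates site reachability."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True

    def persist(self, key, value):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


class StretchedContainer:
    """Toy model: a write is acknowledged only after both copies are persisted."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def write(self, key, value):
        # Ordering is a simplification for this sketch (see lead-in).
        self.standby.persist(key, value)
        self.active.persist(key, value)
        return True  # acknowledgement to the VM


active = Container("site-A")
standby = Container("site-B")
stretched = StretchedContainer(active, standby)
acked = stretched.write("vm1-disk0-block7", b"payload")

standby.online = False
try:
    stretched.write("vm1-disk0-block8", b"more")
    second_acked = True
except ConnectionError:
    second_acked = False  # no acknowledgement without a remote copy
```

Because every acknowledged write exists at both sites, a failover to the Standby site loses no acknowledged data, which is the sense in which synchronous replication gives a zero RPO.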

Figure 6: Nutanix Metro Availability delivers zero RPO

4. Ecosystem Integration

Nutanix integrates with popular offload capabilities, including VMware vStorage APIs for Array Integration (VAAI) and Microsoft Offloaded Data Transfer (ODX), to create clones in a matter of seconds with minimal overhead. Additionally, with support for vStorage APIs for Data Protection (VADP) and application-level consistent snapshots that leverage the Volume Shadow Copy Service (VSS), Nutanix backup and DR capabilities fully integrate with third-party tools such as Symantec NetBackup, CommVault Simpana, and Veeam.

5. A RESTful Future

The Nutanix Virtual Computing Platform provides an exhaustive list of REST APIs, accessed through the Nutanix Prism management framework, for various functions including data protection and disaster recovery. These APIs can be explored through the Nutanix Prism API explorer. The REST APIs are the foundation for the Nutanix Prism management interface and for failover runbook automation.

Figure 7: Nutanix Prism APIs and PowerShell cmdlets enable runbook automation for failover

Nutanix Prism APIs and PowerShell cmdlets can also be used, through scripting languages or workflow engines, to automate workflows that use snapshots and replication for backup and disaster recovery. The Prism APIs are also used to create an automated runbook for failover, automatically registering the VMs at the DR site in VMware vCenter and powering them on. For example, a custom script created using the Prism APIs can trigger a Virtual Computing Platform to take and replicate a snapshot of the group of critical VMs making up an order-entry system, based on the number of transactions being executed.
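The order-entry example above amounts to a simple threshold policy, sketched here in Python. The sketch is hypothetical throughout: run_policy, the protection domain name "order-entry", and the threshold are illustrative, and the snapshot action is passed in as a callable rather than calling any real Prism endpoint, since this note does not specify the API paths.

```python
from typing import Callable, Iterable, List


def run_policy(
    transaction_counts: Iterable[int],
    threshold: int,
    take_snapshot: Callable[[str], None],
    protection_domain: str,
) -> int:
    """Trigger a snapshot each time transactions since the last snapshot reach `threshold`.

    `take_snapshot` is a placeholder; in practice it would wrap a Prism REST call
    or PowerShell cmdlet that snapshots and replicates the protection domain.
    """
    pending = 0
    taken = 0
    for count in transaction_counts:
        pending += count
        if pending >= threshold:
            take_snapshot(protection_domain)
            pending = 0
            taken += 1
    return taken


calls: List[str] = []  # records which protection domains were snapshotted
taken = run_policy(
    [400, 700, 200, 900, 100],   # transactions observed per polling interval
    threshold=1000,
    take_snapshot=calls.append,
    protection_domain="order-entry",
)
```

Keeping the snapshot action as an injected callable lets the policy be tested without a cluster, and the same loop can later be pointed at whichever API client a deployment uses.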

6. Conclusion

Most enterprise workloads are either unprotected or under-protected. Significant budget requirements and deployment complexities have prevented enterprises from protecting their applications, resulting in downtime during software and hardware glitches and user errors. With the increasing use of virtualization for critical workloads, deploying data protection and disaster recovery is no longer optional. With VM-granular snapshots and recovery, policy-based protection domains, VM-granular site-to-site replication, Metro Availability, and centralized management using Prism, the Nutanix Virtual Computing Platform provides the functionality to back up critical data, protect applications, and survive disasters efficiently without specialized skills or investment.

Figure 8: Keep applications protected and available with Nutanix