High Availability (HA) Aidan Finn




About Aidan Finn
Technical Sales Lead at MicroWarehouse (Dublin)
Working in IT since 1996
MVP (Virtual Machine)
Experienced with Windows Server/Desktop, System Center, virtualisation, and IT infrastructure
@joe_elway
http://www.aidanfinn.com
http://www.petri.co.il/author/aidan-finn
Published author/contributor of several books

Books
System Center 2012 VMM
Windows Server 2012 Hyper-V


What is HA?

High Availability
From the Hyper-V perspective, HA is about infrastructure fault tolerance
Example:
  1. HostA is one of a number of hosts in a cluster
  2. Every host in the cluster stores VM files on shared storage, such as a SAN
  3. VM01 is running on HostA
  4. HostA stops running
  5. VM01 automatically fails over to another host in the cluster
  6. VM01 automatically boots up
There is some downtime for VM01, but it is minimized
The cluster has acted as a unit to protect against the failure of HostA

A Typical Hyper-V Cluster
Two or more hosts
Each host is connected to a set of networks with special roles
All hosts are connected to a shared, cluster-supported storage system
All HA VMs are stored on the shared storage

Managing Failure

The Heartbeat
Failover Clustering conducts health monitoring between nodes to detect when servers are no longer available
When servers are unresponsive, clustering takes recovery action
Unicast in nature and uses a request-reply type process for reliability and security
Not just a basic ping
Diagram: one node asks "You there?" and the other replies "Yes"

Failover Cluster Virtual Adapter
Failover Cluster Virtual Adapter (NetFT) is a virtual network adapter that builds fault-tolerant TCP connections across all available interfaces between nodes in the cluster
NetFT is the mechanism by which clusters use multiple cluster-enabled adapters to communicate
Seamless internode communication
NetFT will dynamically and seamlessly switch cluster communication to a different network (based on priority) when a network fails
Long story short: the cluster can use multiple enabled networks for cluster communications and is fault tolerant

Heartbeat Detection
Runs on port 3343 (heartbeats are sent over UDP)
WS2012 R2 Hyper-V clusters:
  Nodes exchange heartbeats every 1 second
  Will allow for failure for up to 10 seconds (5 on non-Hyper-V clusters) for nodes on the same subnet
  Will allow for failure for up to 20 seconds (5 on non-Hyper-V clusters) for nodes on different subnets
If there is no response after that threshold, the host is assumed to be offline and quorum must be obtained
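The intervals and tolerances above can be read back from a running cluster. The following is a minimal sketch, not taken from the slides: it assumes the FailoverClusters PowerShell module is available on a cluster node, and that the heartbeat settings are exposed as the cluster common properties SameSubnetDelay, SameSubnetThreshold, CrossSubnetDelay and CrossSubnetThreshold (delay in milliseconds, threshold in missed heartbeats).

```powershell
# Assumption: run on a cluster node with the FailoverClusters module installed.
Import-Module FailoverClusters

$cluster = Get-Cluster

# Delay = milliseconds between heartbeats; Threshold = missed heartbeats tolerated.
$cluster | Format-List Name, SameSubnetDelay, SameSubnetThreshold,
                       CrossSubnetDelay, CrossSubnetThreshold

# Approximate time (in seconds) a node may be silent before it is treated as offline.
$sameSubnetSeconds  = ($cluster.SameSubnetDelay  * $cluster.SameSubnetThreshold)  / 1000
$crossSubnetSeconds = ($cluster.CrossSubnetDelay * $cluster.CrossSubnetThreshold) / 1000

"Same subnet tolerance:  $sameSubnetSeconds s"
"Cross subnet tolerance: $crossSubnetSeconds s"
```

On a WS2012 R2 Hyper-V cluster with default settings, the two computed values should correspond to the 10-second and 20-second tolerances described above.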

Quorum

What is Quorum?
Quorum is when you have enough voters to come to an agreement
The primary function of a cluster is to keep mission-critical services online
It needs to accomplish this without causing corruption or confusion
This is why quorum is used

Explaining Quorum

Quorum Basics
Sticking with WS2012 R2 to keep this simple
Two types of vote breaker, or witness
Witness disk
  A 1 GB LUN that is created on the shared storage just for this purpose
  Configured as a witness disk in the cluster
  Owner of the disk is the vote breaker in case of a tied vote for quorum
File Share Witness
  Originally intended for multi-site clusters
  Now used for clusters that use SMB 3.0 for shared storage

Explaining Quorum

Other Quorum Concepts
Sequential host failure (WS2012)
  Scenario when one host after another after another drops offline
  Quorum can still be obtained, even if fewer than half the nodes are online
Dynamic quorum (WS2012 R2)
  When the cluster rigs the quorum voting process
  Intended to give the cluster more chance of staying online
  ALWAYS have a quorum witness
  Before WS2012 R2, a witness was typically only used with an even number of nodes
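How the cluster is currently adjusting votes can be inspected from PowerShell. This is a hedged sketch that assumes a WS2012 R2 cluster and the FailoverClusters module; the property names used here (DynamicQuorum, WitnessDynamicWeight, NodeWeight, DynamicWeight) do not appear in the slides.

```powershell
# Assumption: WS2012 R2 cluster, run on a cluster node.
Import-Module FailoverClusters

# 1 means the cluster is allowed to adjust node votes as nodes leave and join.
(Get-Cluster).DynamicQuorum

# 1 means the witness currently holds a vote, 0 means it does not.
(Get-Cluster).WitnessDynamicWeight

# NodeWeight is the vote assigned by the administrator;
# DynamicWeight is the vote the cluster is actually counting right now.
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize
```

Comparing NodeWeight with DynamicWeight after shutting down a node in a lab is a simple way to watch the voting being adjusted.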

Cluster Storage

Why Storage Is Needed
The Hyper-V hosts provide HA to VMs
Each host must have access to the VMs' storage
There is no replication from host to host inside a cluster
All VMs are stored on shared storage
Options include:
  SAS storage area network (SAN)
  iSCSI SAN
  Fibre Channel SAN
  Fibre Channel over Ethernet (FCoE) SAN
  PCI RAID (WS2012+)
  Storage Spaces (WS2012+)
  SMB 3.0 file shares (WS2012+)

Connectivity
Each node is connected to the shared storage
  Exact same connectivity
Dual-path connectivity
  Multipath IO (MPIO) for traditional storage
  SMB Multichannel for SMB 3.0 storage
All disks/LUNs/shares on the storage are assigned to all nodes
  Each host has equal access

Cluster-in-a-Box
Take the requirements of a cluster
Put them into a single enclosure
  2+ blade servers with their own power and networking
  JBOD or PCI RAID shared storage
  Hard-wired cluster networking
Designed for SME and branch office

Cluster Shared Volumes

Cluster Shared Volume (CSV)
Microsoft's cluster file system
Makes the volume on the disk active/active across all nodes
Store lots of VMs on a single volume
  All able to run on any node in the cluster
Every node connected to the disk can read/write to the volume
One node owns the volume and is responsible for metadata operations:
  Owner, AKA CSV Coordinator
No drive letter
  Drive is mounted as a folder under C:\ClusterStorage on each node
  You can (I usually do) rename that folder, e.g. CSV1

CSV Illustrated

Redirected IO
A process used by CSV (only)
Nodes in the cluster redirect storage IO to pass over the cluster network via the CSV coordinator
Done on a per-CSV basis, not per-cluster
Used by W2008 R2 CSV for backup
  Caused concern
Redirected IO NOT USED FOR BACKUP SINCE WS2012
Redirected IO is used by WS2012 for:
  Very brief metadata operations: permissions, file metadata, file create, file open, file extend
  Storage path fault tolerance: a node loses its direct connection to storage and redirects via the CSV owner to avoid an outage

Redirected IO Illustrated

Controlling Redirected IO
On W2008 R2:
  Redirected IO went across the cluster communications network
  Network with the lowest routing metric (could be manipulated)
On WS2012 and later:
  Uses SMB 3.0 and SMB Multichannel
  Can flood equal-speed networks between nodes if not controlled
  Use SMB Multichannel Constraints to select which networks are used to talk to other cluster nodes
  New-SmbMultichannelConstraint -ServerName Node2,Node3 -InterfaceAlias ClusterNet1,ClusterNet2
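The constraint command on the slide can be sketched as a small loop. This is an illustration under assumptions rather than the author's script: the node names (Node2, Node3) and interface aliases (ClusterNet1, ClusterNet2) are the slide's own examples, and the loop is used because, as far as I know, the -ServerName parameter of New-SmbMultichannelConstraint accepts a single server name per call.

```powershell
# Run on Node1; repeat on every other node, listing that node's peers.
$peerNodes       = 'Node2', 'Node3'               # example names from the slide
$clusterNetworks = 'ClusterNet1', 'ClusterNet2'   # example interface aliases from the slide

foreach ($node in $peerNodes) {
    # Restrict SMB traffic destined for this peer to the listed interfaces.
    New-SmbMultichannelConstraint -ServerName $node -InterfaceAlias $clusterNetworks
}

# Review the constraints now in place on this node.
Get-SmbMultichannelConstraint
```

Constraints are defined on the SMB client side, so each node needs its own set that names the other nodes.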

CSV Cache
A read cache for virtual hard disks stored on the CSV
Uses a percentage of each cluster node's RAM for the cache
Size of the cache is set once per cluster
Boosts read performance, e.g. VDI boot storm
  (Get-Cluster).SharedVolumeBlockCacheSizeInMB = 512
WS2012
  Up to 20% of a node's RAM could be assigned to the cache
  Enable each required CSV for CSV Cache
    Get-ClusterSharedVolume "Cluster Disk 1" | Set-ClusterParameter CsvEnableBlockCache 1
  Requires the CSV to be taken offline and brought back online to start caching
WS2012 R2
  Up to 80% of a node's RAM can be assigned to the cache
  CSV Cache is enabled by default for each CSV
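The property names shown above are the WS2012 ones. As a hedged sketch for WS2012 R2: to my knowledge the cluster property was renamed to BlockCacheSize and the per-CSV private property to EnableBlockCache in that release. Neither name appears in the slides, so verify them on your own cluster before relying on this.

```powershell
# Assumption: WS2012 R2 cluster, run on a cluster node.
Import-Module FailoverClusters

# Set the CSV cache size (in MB) once for the whole cluster.
(Get-Cluster).BlockCacheSize = 512

# Confirm the value that is now configured.
(Get-Cluster).BlockCacheSize

# Check whether caching is enabled on one CSV ("Cluster Disk 1" is an example name).
Get-ClusterSharedVolume "Cluster Disk 1" | Get-ClusterParameter EnableBlockCache
```

Because the cache is taken from each node's RAM, the size should be chosen with the memory needs of the VMs in mind.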

Other CSV 2.0 Improvements
WS2012:
  Uses mount point instead of junction point
  Single synchronised VSS snapshot for backup: no Redirected IO during backup
  Can enable BitLocker
  NTFS on CSV appears as CSVFS
  Supported for Hyper-V and Scale-Out File Server
WS2012 R2:
  Supports ReFS file system
    I still would not do it yet unless volumes are huge (no CHKDSK)
  CSV ownership is automatically load balanced across nodes

Networking

Converged Networks
In W2008 R2 we would have had 1 NIC or NIC team per required network
  Lots of NICs
  Very expensive to add 10 GbE or faster networking for peak usage
Converged networks concept:
  Aggregate fewer NICs into an accumulation of bandwidth
  Divide that bandwidth up using WS2012+ QoS into the required networks
  Makes adopting 10 GbE or faster more economic for medium/larger companies
Much bigger concept than I have time to talk about today. See my posts on aidanfinn.com and the Petri IT Knowledgebase

Non-Converged with iSCSI
With SAS/FC SAN: 4 NICs, 8 with NIC teaming
With iSCSI SAN: 6 NICs, 10 with NIC teaming
1-2 more with a dedicated backup network

Convergence Using Virtual NICs
Diagram labels:
  Management OS virtual NICs: Management (VLAN 101), Cluster (VLAN 102), Live Migration (VLAN 103), Backup (VLAN 104)
  Two further VLANs: VLAN 201, VLAN 202
  Virtual Switch on a NIC Team (Hyper-V Port or Dynamic)
  Team members pNIC1 (DVMQ) and pNIC2 (DVMQ), connected to trunk ports on the top-of-rack switches
  MPIO over SAS/FC/iSCSI storage adapters
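The design in this diagram can be sketched in PowerShell. This is a minimal illustration under assumptions, not the author's build script: the team and switch names and the bandwidth weights are hypothetical, while the virtual NIC names and VLAN IDs come from the diagram. It uses the NetLbfo and Hyper-V cmdlets that ship with WS2012 and later.

```powershell
# Team the two physical NICs (adapter names taken from the diagram labels).
New-NetLbfoTeam -Name 'ConvergedTeam' -TeamMembers 'pNIC1', 'pNIC2' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort -Confirm:$false

# Create the virtual switch with weight-based QoS and no default management vNIC.
New-VMSwitch -Name 'ConvergedSwitch' -NetAdapterName 'ConvergedTeam' `
    -MinimumBandwidthMode Weight -AllowManagementOS $false

# Hypothetical weights; the VLAN IDs are the ones shown in the diagram.
$vNics = @(
    @{ Name = 'Management';    Vlan = 101; Weight = 10 },
    @{ Name = 'Cluster';       Vlan = 102; Weight = 10 },
    @{ Name = 'LiveMigration'; Vlan = 103; Weight = 30 },
    @{ Name = 'Backup';        Vlan = 104; Weight = 20 }
)

foreach ($vNic in $vNics) {
    # One management OS virtual NIC per required network.
    Add-VMNetworkAdapter -ManagementOS -Name $vNic.Name -SwitchName 'ConvergedSwitch'
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName $vNic.Name -Access -VlanId $vNic.Vlan
    Set-VMNetworkAdapter -ManagementOS -Name $vNic.Name -MinimumBandwidthWeight $vNic.Weight
}
```

Weights are relative shares that only take effect under contention, so the remaining share is left for virtual machine traffic on the switch.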

Convergence with SMB 3.0 Storage
Diagram labels:
  Management OS traffic: SMB Direct (W=70), Backup (W=20), Cluster (W=10), Management (VLAN 101, W=50)
  Two iWARP rNICs (RSS, DCB) with an SMB Multichannel Constraint, connected to 10 Gbps Switch1 (VLAN 201, DCB) and 10 Gbps Switch1 (VLAN 202, DCB)
  Two 1 Gbps NICs in a NIC Team (Dynamic), connected to 1 Gbps Switch1 (Trunk Port) and 1 Gbps Switch2 (Trunk Port)

Building a Cluster

Creating a Cluster
Easier than ever
Get the pieces right first:
  Storage
  Networking
Process:
  1. Validate the cluster: fix until it passes
  2. Deploy the cluster
Get it right up front and it takes a few minutes
Possible to automate using PowerShell
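The two-step process above can be automated roughly as follows. This is a minimal sketch: the host names, cluster name and IP address are hypothetical placeholders, and it assumes the Failover Clustering feature and its PowerShell module are already installed on both hosts.

```powershell
Import-Module FailoverClusters

$nodes = 'Host1', 'Host2'   # hypothetical host names

# 1. Validate: review the generated report and fix any failures before continuing.
Test-Cluster -Node $nodes

# 2. Deploy: create the cluster with a name and a static management IP address
#    (both values are placeholders).
New-Cluster -Name 'HVC1' -Node $nodes -StaticAddress '10.0.1.20'
```

Re-running Test-Cluster after every fix keeps the validation report current, which matters when you later need vendor support.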

Finish the Cluster

Completing the Cluster
Run Windows Update
  To get updates published via Windows Update
Search for "Recommended update for Windows Server 2012 R2 Failover Clustering"
  To get the bug fixes that are usually not published via Windows Update

Configure Cluster Networks
Rename the networks in Failover Cluster Manager
  I name them after the NICs that are on the networks
Select your Live Migration network(s)
  Check multiple boxes if you elect to use SMB Live Migration

Configure Witness
The cluster wizard will automatically find a suitable Disk Witness if one is available
  Make sure you check this
You will have to add a File Share Witness if using SMB 3.0 storage
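Checking and setting the witness can also be done from PowerShell. The sketch below assumes the FailoverClusters module; the disk resource name and the share path are hypothetical examples, and only one of the two Set-ClusterQuorum commands should be run.

```powershell
Import-Module FailoverClusters

# Check what the cluster wizard configured.
Get-ClusterQuorum

# Option A: disk witness on traditional shared storage
# ('Cluster Disk 1' is an example resource name).
Set-ClusterQuorum -NodeAndDiskMajority 'Cluster Disk 1'

# Option B: File Share Witness for clusters using SMB 3.0 storage
# (the share path is a placeholder). Uncomment to use instead of Option A.
# Set-ClusterQuorum -NodeAndFileShareMajority '\\FS1\Witness-HVC1'
```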

Configure Storage
SMB 3.0 storage
  Create one share for the File Share Witness
  Create one or more shares for storing VMs
  Add all hosts to a security group
  Add all admins to a security group
  Grant full control to the shares
Disk storage
  Provision 1 * 1 GB disk for the disk witness
  Provision 1 or more LUNs per node in the cluster to store VMs
  Connect the disks to all nodes in the cluster
  Activate (GPT) and format the disks in Disk Manager on one node
  Add the disks to the cluster
  Convert storage disks to CSVs
    I rename the mount points to have consistent names
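The last three disk storage steps can be sketched in PowerShell. This assumes the disks are already connected, initialised and formatted; the resource name 'Cluster Disk 2' and the mount point name 'Volume1' are examples of the default names the cluster typically assigns, not values from the slides.

```powershell
Import-Module FailoverClusters

# Add every disk that is visible to all nodes and not yet in the cluster.
Get-ClusterAvailableDisk | Add-ClusterDisk

# Convert a storage disk to a CSV (the witness disk is left as a normal cluster disk).
Add-ClusterSharedVolume -Name 'Cluster Disk 2'

# Give the mount point a consistent name, e.g. CSV1.
Rename-Item -Path 'C:\ClusterStorage\Volume1' -NewName 'CSV1'
```

Renaming the mount point before any VMs are placed on the volume avoids having to fix VM file paths later.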

Patching a Cluster

Cluster Aware Updating
Simple orchestration of cluster node updates
Determines the updates needed, moves workloads off nodes for updates
  Uses Windows Update Agent, direct from Microsoft or from WSUS
  Identifies the node with the least load
  Puts the node in maintenance mode
  Verifies success, then moves to the next node
Maintains service availability without impacting cluster quorum
Can be:
  Scheduled
  Manually started from a remote Failover Cluster Manager console (Update Coordinator)

Enabling Cluster Self-Updating
Place all cluster nodes and the cluster computer account in an OU for the cluster
Delegate rights to the cluster CAP
  Create/manage computer objects in this OU
  This is used to create another CAP/computer object for self-updating CAU
Launch the CAU wizard in Failover Cluster Manager
Configure the Self-Updating job
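The wizard steps have PowerShell equivalents in the ClusterAwareUpdating module. The following is a minimal sketch under assumptions: the cluster name and the schedule (second Sunday of each month) are hypothetical, and the OU delegation described above is presumed to be in place already.

```powershell
Import-Module ClusterAwareUpdating

# Add the self-updating role with a schedule.
Add-CauClusterRole -ClusterName 'HVC1' -DaysOfWeek Sunday -WeeksOfMonth 2 `
    -EnableFirewallRules -Force

# Preview which updates would be applied to each node.
Invoke-CauScan -ClusterName 'HVC1'

# Alternatively, start an updating run manually from a remote machine
# that acts as the Update Coordinator.
Invoke-CauRun -ClusterName 'HVC1' -Force
```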

Managing VMs

Use FCM
All management of HA VMs is locked out in Hyper-V Manager
  Use Failover Cluster Manager
You can order the failover of VMs using Virtual Machine Priority (High/Medium/Low)
You can drain a node of VMs by pausing the host
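Both operations are also available from PowerShell. This is a hedged sketch with hypothetical VM and host names. It assumes the Priority values used by the FailoverClusters module, which to my knowledge are 3000 for High, 2000 for Medium and 1000 for Low.

```powershell
Import-Module FailoverClusters

# Set VM01 to High priority so it is started before lower-priority VMs.
(Get-ClusterGroup -Name 'VM01').Priority = 3000

# Pause Host1 and drain its VMs to other nodes before maintenance.
Suspend-ClusterNode -Name 'Host1' -Drain

# Resume the node afterwards and fail the VMs back straight away.
Resume-ClusterNode -Name 'Host1' -Failback Immediate
```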

Backup
There are products that support a WS2012 R2 Hyper-V cluster
  And then there are products that do it at least decently
Test and research
  Do not trust sales and marketing
Many have been stung, especially by:
  Companies known by 2 letters
  Companies that add support 12 months after a server release