Performance of VMware vCenter (VC) Operations in a ROBO Environment TECHNICAL WHITE PAPER




Introduction

Many VMware customers have virtualized their ROBO (Remote Office Branch Office) sites in order to reap the benefits provided by VMware Infrastructure (VI), such as hardware cost savings, business continuity and high availability, and lower maintenance costs. Because of ROI considerations and the desire to keep management of VMware ESX hosts and virtual machines centralized, many of these customers choose to keep a single VMware vCenter Server and configure it to manage ESX hosts over the WAN. This practice has also been used by large enterprises that have ESX servers distributed over large geographical distances. VMware vCenter 4.1 (VC) has improved bandwidth usage over previous VI releases: common VI operations, such as power operations on a virtual machine (VM), are faster and consume significantly less bandwidth than before. The measurements in this paper are from tests performed on VC 4.1.

Objective

The objective of this paper is to present a study of the performance of a set of commonly used vCenter operations in a low-bandwidth, high-latency environment.

Selecting Data Points

Five data points of bandwidth and latency were chosen for this study. The choice of data points was influenced by the bandwidth offered by Internet Service Providers and by the usage patterns of customers running VMware vSphere 4.1 ("vSphere") in a ROBO-like environment.

NETWORK PIPE    BANDWIDTH    LATENCY    PACKET ERROR RATE
Dial-up 1       64Kbps       250ms      0.05%
Dial-up 2       256Kbps      250ms      0.05%
DSL             512Kbps      100ms      0.05%
Satellite       1.5Mbps      500ms      0.1%
T1              1.5Mbps      100ms      0.05%

Table 1. Data Points Used for the Experiments.

Choice of Packet Error Rate (PER)

Most network pipes exhibit some form of packet corruption. To mimic such pipes more closely, a packet error rate was added to each data point. For satellite links, a brief survey of commonly offered links suggests bit error rates between 1 in 10^6 and 1 in 10^8. Taking a mean error rate of 1 in 10^7 bits, and assuming an average TCP packet size of 512 bytes, the PER for satellite links is estimated at 10^-7 * (512 * 8) ~ 0.05%. To err on the side of caution, the PER is set to 0.1% for satellite links and to half of that (0.05%) on every other data point.

Test Setup

This section describes the set of VC operations chosen for this test and the environment used to mimic the ROBO case.

VC Operations Studied

Broadly, this study focused on measuring the network bandwidth consumed and the time taken to complete a variety of VC operations:

Add Host operations: The operation of adding an ESX host to the vSphere server is measured.

Power operations: Powering on and powering off virtual machines on an ESX host is analyzed.

Virtual machine operations: Commonly used virtual machine operations, such as reconfiguring a VM and taking a snapshot of the VM, are measured.

Statistics: The ESX host periodically sends a variety of metrics to the VC server. We evaluate a static setup at different inventory sizes to estimate the statistics sent to VC, as a measure of the total amount of traffic sent to VC.

Linked VC operations: Running multiple vCenter Servers in linked mode is a feature of vSphere 4.0. Typically used to query across multiple VC instances, linked VC mode is also used to replicate user privileges and user roles, among other data, across multiple VCs.

VMRC responsiveness: The VMRC client, also known as the VMware Remote Console, is used to view the console of a virtual machine. This test measures the time taken to display the console in such constrained environments.

HA-DRS: To evaluate the network usage and latency when VC features such as VMware High Availability (VMware HA) and VMware Distributed Resource Scheduler (VMware DRS) are used, the Group Power On and Enter Maintenance Mode operations were studied.

Environment Setup

To mimic the ROBO environment across the data points shown in Table 1, we require a network pipe on which traffic can be shaped appropriately. To this end, we use the setup shown in Figure 1.

Figure 1. Test Setup. (The VC server runs in a VM with one interface on vSwitch1 and another on vSwitch0, which is backed by a physical NIC; a Vyatta VC5 virtual router connects vSwitch1, via its eth0 interface, to vSwitch2, via eth1; the ESXi hosts are attached to vSwitch2. Legend: vSwitch = virtual switch, Pnic = physical network interface card, VC = Virtual Center Server, VM = virtual machine.)
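The two private networks in Figure 1 are simply virtual switches with no physical uplink. Purely as an illustration (the paper does not show how they were created, and this sketch assumes the whole testbed, including the ESXi hosts, runs nested on one physical ESX host; the port group names are placeholders), such internal switches can be created from the ESX service console, or with the equivalent vicfg-vswitch vCLI command, roughly as follows:

    # Create two internal vSwitches with no physical uplink; traffic between the
    # VC VM, the Vyatta router, and the nested ESXi hosts stays inside the host.
    esxcfg-vswitch -a vSwitch1
    esxcfg-vswitch -a vSwitch2

    # Add a port group to each switch for the VM NICs to attach to
    # (the names "VC-Net" and "ESX-Net" are placeholders).
    esxcfg-vswitch -A "VC-Net" vSwitch1
    esxcfg-vswitch -A "ESX-Net" vSwitch2

    # vSwitch0 keeps its physical uplink so the VC VM remains reachable externally.
    esxcfg-vswitch -l    # list switches, port groups, and uplinks to verify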

We required a router that could connect VC and the ESX hosts across two private virtual networks so that the experiments could be performed without other network noise. Vyatta's VC5 VMware Virtual Appliance (http://www.vyatta.com/) was used as the router in this experiment. Figure 1 shows the ESXi boxes connected to vSwitch2, which has no physical network interface attached to it. The VC server runs in a Windows Server 2003 virtual machine ("VC in a VM" in Figure 1) located on a separate virtual switch, again with no physical NIC. The Vyatta VC5 appliance routes traffic between vSwitch1 and vSwitch2; it has two interfaces, eth0 in vSwitch1 and eth1 in vSwitch2. The virtual machine containing the VC server also has two interfaces, one in vSwitch1 and the other in a virtual switch (vSwitch0) backed by a physical NIC, so that the VC VM can be managed from an outside physical network. Using the VC5 virtual appliance, we create a DHCP server on vSwitch2 to provide addresses to the ESXi machines.

The Vyatta VC5 appliance is built on a Linux kernel. We use the Linux traffic shaper tc (http://linux.die.net/man/8/tc-tbf) to shape traffic between vSwitch1 and vSwitch2. To shape the network bandwidth between the two private networks we use the token bucket filter (tbf) provided by tc. To add latency and packet error rates we use a second queuing discipline, netem, in conjunction with tc. A sample script to simulate the satellite data point of Table 1 is shown below.

    sudo tc qdisc add dev eth0 root handle 1: tbf rate 1.5Mbit burst 1.5Mbit limit 1
    sudo tc qdisc add dev eth1 root handle 1: tbf rate 1.5Mbit burst 1.5Mbit limit 1
    sudo tc qdisc add dev eth1 parent 1: handle 2: netem loss 0.1% delay 500ms
    sudo tc qdisc add dev eth0 parent 1: handle 2: netem loss 0.1% delay 500ms

Note: Shaping is applied on both eth0 and eth1 of the VC5 Vyatta virtual appliance. To keep the interfaces from pumping in more than 1.5Mbps for the satellite data point, the burst size is made equal to the tc-tbf rate used for that channel.

To time a sequence of operations, the Cygwin shell's time command was used. To listen to the VC server's traffic, a VMware internal tool called vimshark 1.5 was used; it relies on Wireshark to capture packets at port 443 of the VC in a VM. The resulting output from vimshark is then parsed to obtain the sum of the TCP header lengths of the packets.

Comparison with vSphere 4.0

Many performance optimizations have made operations on VC 4.1 faster than in previous releases of VI. To indicate the change in performance, a test was run to measure the bandwidth consumed and the time taken for virtual machine power operations on a vSphere 4.0 setup for the T1 data point. These results were compared to a similar run on a VC 4.1 setup. Table 2 shows the results of running the power operations on the two infrastructures. A cycle is composed of one VM power-on and one VM power-off operation.

INFRASTRUCTURE    NETWORK USAGE PER CYCLE    TIME PER CYCLE
vSphere 4.0       189.9KB                    17.5 secs
VC 4.1            27.7KB                     14.1 secs

Table 2. vSphere 4.0 vs. VC 4.1 Comparison of VM Power Operations.
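Both rows of Table 2 were measured on the T1 data point. Moving the Vyatta appliance from one data point to another amounts to deleting the root queuing disciplines and re-applying the same tbf/netem pattern with the values from the corresponding row of Table 1. The commands below are a hedged sketch of that workflow for the T1 row, assuming the same eth0/eth1 interface names as the script above; the tbf queue limit of 10000 bytes is an illustrative value, not one taken from the paper.

    # Inspect the current shaping (with packet/byte counters), then remove it.
    sudo tc -s qdisc show dev eth0
    sudo tc -s qdisc show dev eth1
    sudo tc qdisc del dev eth0 root
    sudo tc qdisc del dev eth1 root

    # Re-apply the pattern with the T1 values from Table 1
    # (1.5Mbps bandwidth, 100ms one-way latency, 0.05% packet error rate).
    # The "limit" (tbf queue length in bytes) is an illustrative value.
    sudo tc qdisc add dev eth0 root handle 1: tbf rate 1.5Mbit burst 1.5Mbit limit 10000
    sudo tc qdisc add dev eth1 root handle 1: tbf rate 1.5Mbit burst 1.5Mbit limit 10000
    sudo tc qdisc add dev eth0 parent 1: handle 2: netem loss 0.05% delay 100ms
    sudo tc qdisc add dev eth1 parent 1: handle 2: netem loss 0.05% delay 100ms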

Experimental Data

This section lists the experimental data for the VC operations studied.

Add Host

The Add Host test was performed in a loop for 20 iterations. The host added had two VMs that were powered off. The bandwidth measurements indicate that adding a host to the VC consumed approximately 14.7MB of network bandwidth. This network usage is largely due to the size of the management bundle that VC sends to the host at the time of the host's addition to the VC inventory.

Figure 2. Time Taken for Adding One ESXi Host. (Time in seconds; measured values across the data points were 2,136, 743, 575, 627, and 174 seconds.)

Figure 2 shows that Add Host operations are limited by the network bandwidth. Dial-up 2 has four times as much bandwidth as Dial-up 1, which accounts for the difference between roughly 12 minutes and more than 35 minutes to add a host.

Power On/Off Operations

For this experiment, the setup involved serially powering on and off two VMs on a single host, 25 times per data point. The bandwidth requirement per cycle of combined power-on and power-off operations was about 28KB. Figure 3 shows that the time for a power-on and a power-off operation combined is limited by the network latency between VC and the ESX hosts. The satellite link, which has a round-trip time of one second, is the data point that takes the longest to complete a power operation on a VM.
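The power on/off measurement is essentially a timed loop around the two test VMs. The sketch below illustrates the shape of that loop only; power_on_vm and power_off_vm are hypothetical placeholders for whatever issues the operation through vCenter (the paper drove the operations against VC and timed them with the Cygwin shell's time command).

    #!/usr/bin/env bash
    # Sketch of the power-cycle timing loop (25 iterations, two test VMs).
    # power_on_vm / power_off_vm are hypothetical placeholders; replace them
    # with whatever issues the operation through vCenter in your environment.
    power_on_vm()  { echo "power on  $1 via vCenter"; }
    power_off_vm() { echo "power off $1 via vCenter"; }

    ITERATIONS=25
    VMS="vm1 vm2"
    total=0
    for i in $(seq 1 "$ITERATIONS"); do
        start=$(date +%s.%N)
        for vm in $VMS; do
            power_on_vm "$vm"
            power_off_vm "$vm"
        done
        end=$(date +%s.%N)
        cycle=$(echo "$end - $start" | bc)
        total=$(echo "$total + $cycle" | bc)
        echo "iteration $i: ${cycle}s for two power cycles"
    done
    echo "average time per iteration: $(echo "scale=3; $total / $ITERATIONS" | bc)s"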

Figure 3. Time Taken per Power Cycle Operation. (Time in seconds; measured values across the data points were 45.48, 25.248, 25.512, 14.436, and 14.12 seconds.)

Virtual Machine Operations

A series of operations commonly run on a virtual machine was evaluated by running it in a loop for 30 iterations. Figures 4 and 5 show the results of these tests. A mixed set of commonly used VM operations was part of each iteration. These operations were:

10 reconfigure operations
6 snapshot operations
2 create VM operations
2 register VM operations
2 power on/off operations
2 delete VM operations
1 register, reset, and suspend operation

Figure 4. Bandwidth Consumed per VM Operation. (KB of data per operation.)

Figure 5 shows that, in general, the time taken for VM operations is constrained by the latency in the network.

Figure 5. Time Taken for a VM Operation. (Time in seconds; measured values across the data points were 76.69, 45.79, 45.96, 29.241, and 26.724 seconds.)

Linked VC Mode Search Operations

Multiple instances of the VC server can be connected in linked mode. To simulate multiple VCs in linked mode, the experiment used two VCs, one in vSwitch1 and the other in vSwitch2. The VM that contained the VC server had two interfaces: one for the private network and the other used to talk to the domain controller. This experiment did not replicate the domain controller within the setup, and hence Table 3 and Figure 6 are representative only of the search and health-status traffic between the two VC instances.

To test the linked VC search network usage and time taken across the data points, two search queries were performed for 20 iterations. The search queries were to list all the VMs in the inventory whose names contain a particular string and to list all the hosts in the inventory with a given IP prefix. For the experiment that used 20 VMs, two hosts with five VMs each were made part of the inventory on each VC. Similarly, for the experiment involving 40 VMs, four hosts with five VMs each were placed on each VC.

The network usage consumed by linked VC mode for one instance of the VC server is shown in Table 3. The bandwidth measured at the other VC instance in the experiment is similar. Note that the bandwidth consumption shown is composed of both the search traffic and the traffic for the health monitoring system between the VC instances.

VC INVENTORY    NETWORK USAGE PER VC INSTANCE
20 VMs          2MB
40 VMs          3MB

Table 3. Bandwidth Usage in Linked VC Mode.

The time taken by the search test, shown in Figure 6, is once again dominated by network latency.

Figure 6. Total Time Taken for the Search in the Linked VC Operations. (Time in seconds; series: 20 VMs and 40 VMs.)

VMRC Responsiveness

The VMware Remote Console is a console viewer that comes in two forms: the stand-alone Windows VMRC client, and the VMRC client used to display the console tab of the vSphere Client. The test measured the time taken for the virtual machine console to become visible once the user was logged in to the stand-alone VMRC console viewer. The time taken to launch the VM console from the VI client was observed to be similar to the time taken to launch it from the stand-alone client. Figure 7 shows that the time to display the virtual machine console is dominated by the network latency.

Figure 7. Time to Display the Virtual Machine Console. (Time in seconds; measured values across the data points were 11, 7, 7.5, 3.5, and 3.5 seconds.)

Statistics

ESX hosts periodically send metric data to VC. The set of statistics collected can be configured in the vSphere Client according to the number of metrics requested, called the stats level: Level 4 denotes the full set of metrics sent by the ESX host to the VC server, and Level 1 represents the smallest set of stats sent to the server. Furthermore, the frequency of collection of these stats influences how often the stats are transmitted back to the server; this too can be configured in the vSphere Client by changing the stats duration.

In this experiment we evaluated the statistics generated for different stats levels and stats durations, for 20 VMs distributed across 4 hosts and for 40 VMs distributed across 8 hosts. These VMs were powered on but had no guest OS, and no other operations were running on VC.

Figure 8. Stats Network Usage in Kilobytes for Stats Duration=5. (KB of stats data; series: 20 VMs and 40 VMs; x-axis: stats Level 1 and Level 4.)

The amount of statistics sent is directly proportional to the size of the inventory. From Figure 8 we see that doubling the number of VMs at the same stats level roughly doubles the statistics traffic sent across. Similarly, the volume of statistics increases with the level of collection. In Figure 8 this increase is not significant because the stats duration is at the lowest possible setting. However, the difference is apparent in Figure 9, in which the same experiment was repeated with the highest possible stats duration of 1.

Note: The results in Figures 8 and 9 are independent of the data points used in the experiment. Although the bandwidth consumption for statistics was collected across all the data points, the data generated for the static setup was found to be independent of the varying latency and bandwidth.

Figure 9. Stats Network Usage in Kilobytes for Stats Duration=1. (KB of stats data; series: 20 VMs and 40 VMs; x-axis: stats Level 1 and Level 4.)
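Outside of the internal vimshark tool, one way to approximate this kind of idle-traffic measurement is to capture the VC server's port 443 traffic for a fixed window with Wireshark's tshark and report byte counts per interval. This is an illustrative sketch under assumptions (the interface name and the one-hour window are placeholders), not the method used for the figures above.

    # Capture one hour of traffic to/from the VC server on TCP port 443 and
    # report bytes per 300-second interval; with no operations running, this
    # approximates the periodic statistics traffic from the ESX hosts.
    tshark -i eth0 -f "tcp port 443" -a duration:3600 -q -z io,stat,300

    # Alternatively, write a capture file and summarize it afterwards.
    tshark -i eth0 -f "tcp port 443" -a duration:3600 -w stats-traffic.pcap
    capinfos stats-traffic.pcap    # reports total packets and data size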

VMware HA/VMware DRS

To evaluate the effect of having VMware HA and VMware DRS enabled on a cluster, this experiment measured the latency and network usage of powering on a group of VMs simultaneously, and also evaluated the Enter Maintenance Mode operation. For both experiments, VMs with 8MB disk and 1MB memory, running no guest OS, were used.

1) Group power on: During the group power-on operation, the VMware DRS component might decide to move VMs between hosts depending on resource utilization. The VMware HA component tests whether the failover level configured at the time of creating the cluster can still be satisfied after powering on the VMs. For this experiment, the setup distributed the 20 and 40 VMs across the hosts such that each host had 5 VMs. These hosts were placed in a cluster with VMware HA and VMware DRS enabled, retaining the default settings applied when such a cluster is created using the vSphere Client. The test was repeated for 5 iterations for both inventories. The average bandwidth consumed per iteration is shown in Table 4. Note that the network usage and latency measurements for this operation cannot be compared to the general VM power cycle data in the Power On/Off Operations section.

VMS IN CLUSTER    BANDWIDTH USAGE PER POWER CYCLE OPERATION
20 VMs            22KB
40 VMs            46KB

Table 4. Bandwidth Usage for Group Power On.

Figure 10 shows that the time taken to complete this operation is once again affected by the latency in the network pipe. However, the difference between the network pipes is not as marked. While the operation waits for a group of VMs to complete the power operation on the host, the simultaneous power-on of VMs fills the network pipe with data to the VC about the remaining powered on/off VMs without delay. Hence, the time taken for this group power-on operation masks, to an extent, the difference in latency across network pipes.

Figure 10. Time Taken for the HA/DRS Group Power On/Off Operation. (Time in seconds; series: 20 VMs and 40 VMs.)

2) Enter maintenance mode: For this operation, the setup was 20 VMs distributed across 4 hosts within a VMware HA and VMware DRS enabled cluster. All the hosts shared a common NFS datastore (hosted on a VM running Ubuntu as the guest OS in vSwitch2 in Figure 1) that held all the VMs for this experiment. The data in Figure 11 and Table 5 show that this operation is limited not by network bandwidth but by the latency in the network pipe.

Figure 11. Time Taken for the HA/DRS Enter Maintenance Mode Operation. (Time in minutes.)

NETWORK PIPE    NETWORK USAGE (IN MB)    NUMBER OF VMOTIONS
Dial-up 1       45.623                   270
Dial-up 2       45.549                   276
DSL             44.844                   270
Satellite       47.391                   278
T1              45.94                    270

Table 5. Bandwidth Usage for Enter Maintenance Mode.

It was observed that completing this operation with 40 VMs distributed over 8 hosts in the inventory took double the time and double the network usage of the values in Table 5.
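The shared NFS datastore mentioned above is what makes the vMotions counted in Table 5 possible. As a hedged illustration only (the paper gives no export or mount details, so the package, paths, subnet, server address, and datastore name below are all assumptions), an Ubuntu VM can export a directory and each ESX host can mount it along these lines:

    # On the Ubuntu VM acting as the NFS server (attached to vSwitch2):
    sudo apt-get install nfs-kernel-server
    sudo mkdir -p /export/vmstore
    echo "/export/vmstore 192.168.2.0/24(rw,no_root_squash,async)" | sudo tee -a /etc/exports
    sudo exportfs -ra

    # On each ESX host, mount the export as a datastore named "nfs-shared":
    esxcfg-nas -a -o 192.168.2.10 -s /export/vmstore nfs-shared
    esxcfg-nas -l    # list NFS datastores to verify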

Summary

This paper has shown that for most VC operations, network latency plays the dominant role in the time taken to execute the operation. For the Add Host operation, the bandwidth of the network pipe controls the time taken to complete the operation. High-latency links, such as satellite links, show an increase in operational time of anywhere from 1.5x to 4x, depending on the operation, compared to the latencies typically experienced on a LAN.

Acknowledgment

I would like to thank Jairam Ranganathan, Kinshuk Govil, and Kiran Kamath for suggesting the experiments and for their reviews of the results. Ravi Soundararajan and Victor White provided useful feedback. I would also like to thank Kiran Kamath for suggesting the test setup and for his guidance in the effort. Alper Mizrak and Jayesh Seshadri were very helpful in setting up the linked VC operations environment scenarios. Elisha Ziskind and Marc Sevigny helped in debugging some of the VMware HA/VMware DRS test issues.

VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 www.vmware.com Copyright 2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies. Item No: VMW_12Q4_WP_Performance_vCenter_ROBO_p12_A_R1