BridgeWays Management Pack for VMware ESX

Bridgeways White Paper: Management Pack for VMware ESX
Ensuring smooth virtual operations while maximizing your ROI.
Published: July 2009
For the latest information, please visit www.bridgeways.ca

Introduction
The Art of Capacity Planning
Providing the Canvas
Balancing Resources while Protecting the User Experience
Monitoring the Tipping Point
Processing Time
Memory Commitment
Network and Storage
Where to Go from Here

Introduction

Virtualization technology has improved dramatically over the last few years. As more and more companies look to virtualize their environments, additional vendors come to market, spurring competition and accelerating feature growth. Today's hypervisors open even more opportunities for server consolidation and create a new (some would argue a return to the old) operational paradigm in which resources are centralized and workloads are distributed to the hardware that can best accommodate their resource requirements.

Monitoring virtualized environments has become critical to optimizing hardware performance and achieving maximum ROI. This requires a consolidated view that covers both the hypervisor and the individual workloads, so that bottlenecks can be identified and problems corrected before they have a major impact on the performance of the virtual environment. Without such a consolidated view it is hard to place emerging issues in their appropriate context, which can lead to a frustrating and time-consuming trial-and-error search for root causes.

For example, if an application server is handling fewer requests, several possible causes need to be examined:

Is there an increased load on the server?
Have patches been applied to the operating system or application?
Have new VMs been added to the resource pool? How are they configured? Are they SMP virtual machines?
Is the resource pool marked as expandable?
Has the resource pool been making heavy demands on the parent pool to meet its own needs?
Have new pools been added which reduce the resources available from the parent pool?
Has the VM moved to a new host with different bandwidth characteristics?
Is there IO contention on the SAN?

Any of these issues can result in slower response times, making it imperative to quickly identify the underlying issue. Otherwise, unexpected system downtime, service degradation, or extended maintenance periods can result in lost revenues and reduced customer satisfaction.

Microsoft System Center Operations Manager is an excellent platform from which to perform in-depth monitoring of complex environments. The addition of BridgeWays Management Packs allows administrators to drill down into the hypervisor, operating system, and workloads to do deep root cause analysis and quickly resolve underlying problems.

The BridgeWays Management Pack for VMware ESX models the entire virtual environment from the data center, through the hosts, to the individual virtual machines. By modelling the entire ESX environment, it presents administrators with a detailed view of the loads being placed on the hypervisor, allowing them to see how the various components interact. A single misconfigured virtual machine can have a significant impact on its host, which, in turn, can reduce the efficiency of an entire resource pool or cluster.

Today, many administrators are still monitoring virtual machines through guest operating system metrics alone. This approach can send them off on tangents and cause them to solve problems that don't exist. For example, if a multi-processor VM using a non-SMP HAL stays in its idle loop for long periods of time, the guest operating system will report CPU usage as 0%, because it sees the idle loop and considers the CPU to be available. Monitoring via BridgeWays through the host will show that this VM is actually using 100% of its allocated CPU time as the idle loop executes. If the VM is not sending the proper halt instruction due to an incorrect HAL, the host will waste a large amount of resources supporting an idle VM. It is crucial to monitor both the ESX host and the guest OS in order to see the discrepancy in the metrics and correctly diagnose the system's health.

The Art of Capacity Planning

Capacity planning is the art of balancing what is requested against what can be delivered, and then creating a plan to provide the necessary resources. When expanding a virtual environment, administrators need to know how the current virtual capacity is allocated, and to what extent it is actively used. The current resources could be overcommitted, undercommitted, or perfectly balanced. Without proactive monitoring and historical reporting, however, it is impossible to determine which of these states applies.

Determining historical usage trends has typically been difficult for many IT organizations, because they use disparate tools to gather the metrics they need. A tool that gathers information from the hypervisor typically provides no information on the workloads running on guest operating systems. A tool that monitors the workload, or provides metrics from the operating system, is generally unaware of the hypervisor. Failure to weave the various strands of data into an overall context can lead to planning mistakes.

Providing the Canvas

The BridgeWays Management Pack for VMware ESX provides a comprehensive view of the virtual capacity to help ensure that the environment is operating at the levels for which it was designed. Operations Manager allows BridgeWays to pull together information from the hypervisor, operating systems, and various workloads into a single management console, where the administrator can see how changes to one workload's resource allocations impact the others. This enables administrators to take an iterative approach to capacity management in which they cautiously reduce the resources available to a system, monitor the impact, and then reallocate those resources to a new system coming online. In the absence of a comprehensive view, it is difficult to determine whether the system from which resources are taken was over-allocated, and therefore whether it will suffer degraded performance when the resources are transferred.

For example, when the heap memory usage and garbage collection characteristics of an application server are compared to the actual memory used and the amount being swapped out by the hypervisor, it is possible to see whether the application server is using the available memory efficiently. Consistently high levels of swapped memory indicate that a virtual machine may have too much memory allocated, because the hypervisor is able to swap part of it out to disk. If the hypervisor is not constantly swapping that memory into and out of physical memory, and if the heap and garbage collection metrics are healthy, the virtual machine has more memory allocated to it than its workload requires, and some of the memory can be safely taken away.

The consolidated view provided by BridgeWays enables administrators to do the same for CPU usage, network bandwidth, and storage capacity. By monitoring how much of the available resources the virtual machines are actually using, administrators can easily identify and reallocate idle resources. The holistic overview of available resources provided by the BridgeWays Management Pack for VMware ESX enables administrators to ensure that the capacity plan for the virtual environment is handling the load efficiently.

Capacity management is used to keep an environment balanced between over- and under-allocated states. By monitoring actual resource usage, it is possible to find the point at which the available resources are being overstretched and individual workloads start to suffer.

Monitoring the Tipping Point

Underutilization of hardware exists in both the virtual and the physical world. For example, a server with 32GB of memory, 1TB of drive space, and two quad-core CPUs may be running a database workload that uses only half of the physical memory, leverages a SAN for the data files, and uses around 5% of the available CPU time.
Servers like this represent a significantly underutilized investment and are good candidates for virtualization. The problem faced by many administrators is that when they try to virtualize a server like this, the application owners insist on having the same specs in the virtual environment as they had on the physical server. This is where an administrator must get creative in managing limited resources that are not being fully utilized. Hypervisor features such as memory and CPU limits can help keep control over resources by showing one value to the virtual machine, for example 32GB of memory, while limiting the physical memory actually provided to 16GB.

The key to using such advanced features lies in monitoring their effects on workloads and fine-tuning resource allocations based on the measurements obtained. For example, a database application could be tuned by allocating 32GB of memory at the start, and then monitoring memory usage over the course of a week. If it is found that the server is only using half of the memory allocated, the amount of physical memory allocated to the application could be reduced to 20GB, which still leaves headroom above the 16GB in use. It is only by proactively monitoring the environment that it is possible to make these kinds of changes while minimizing risk to the workloads.

Processing Time

The first place to look for underutilized processors is on virtual machines equipped with multiple processors. The problem with multiple processors is that when the hypervisor schedules CPU time for a virtual machine, it needs to synchronize the available cores to ensure that a core is available for each vCPU. The result is that the hypervisor locks multiple cores while waiting for enough to become available, and single-processor virtual machines are stuck in line behind the multi-processor virtual machine, waiting to use resources that are locked but idle.

To monitor this kind of scenario, administrators can use BridgeWays to look at a pair of metrics. First, they can monitor the CPU Ready Time % for the individual virtual machines. This metric shows how much of the time a virtual machine spends waiting for a core to become available so that it can execute instructions. If there are virtual machines with CPU Ready Time % above 5%, the administrator can then examine the Host CPU Usage % to see whether it is high or low. If the Host CPU Usage % is low, it is likely that several virtual machines with more than one vCPU are running on that host. These virtual machines are causing CPU contention and should be split across multiple hosts or, if possible, have their number of vCPUs reduced.

CPU Reservation is another possible cause of underutilization. By reserving a specific amount of CPU time, a virtual machine may be blocking another VM that is trying to use that time. To ensure that a virtual machine that is given reserved CPU time actually needs it, the administrator can monitor its CPU Guaranteed and CPU Usage metrics. If CPU Usage falls below CPU Guaranteed, CPU time that could be used productively by other VMs is being wasted.
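The two CPU checks described above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the Management Pack: the VmCpuSample record, the triage_cpu function, and the 50% cut-off for "low" host usage are assumptions made for the example. Only the 5% CPU Ready Time threshold and the comparison of CPU Usage against CPU Guaranteed come from the text.

```python
from dataclasses import dataclass
from typing import List

READY_PCT_THRESHOLD = 5.0   # from the text: investigate VMs above 5% CPU Ready Time
LOW_HOST_USAGE_PCT = 50.0   # assumption: what counts as "low" Host CPU Usage %


@dataclass
class VmCpuSample:
    """Hypothetical record holding one VM's averaged CPU metrics."""
    name: str
    vcpus: int
    ready_pct: float        # CPU Ready Time %
    usage_mhz: float        # CPU Usage
    guaranteed_mhz: float   # CPU Guaranteed (0 when nothing is reserved)


def triage_cpu(host_usage_pct: float, vms: List[VmCpuSample]) -> List[str]:
    """Return plain-text findings for the two checks described in the text."""
    findings = []
    for vm in vms:
        # Check 1: high ready time on a lightly loaded host suggests
        # co-scheduling contention caused by multi-vCPU virtual machines.
        if vm.ready_pct > READY_PCT_THRESHOLD and host_usage_pct < LOW_HOST_USAGE_PCT:
            findings.append(
                f"{vm.name}: ready {vm.ready_pct:.1f}% on a lightly loaded host; "
                "possible co-scheduling contention from multi-vCPU VMs"
            )
        # Check 2: a reservation that is larger than actual usage is wasted.
        if vm.guaranteed_mhz > 0 and vm.usage_mhz < vm.guaranteed_mhz:
            findings.append(
                f"{vm.name}: uses {vm.usage_mhz:.0f} MHz of a "
                f"{vm.guaranteed_mhz:.0f} MHz reservation; reservation may be too large"
            )
    return findings


if __name__ == "__main__":
    samples = [
        VmCpuSample("db01", vcpus=4, ready_pct=8.0, usage_mhz=500.0, guaranteed_mhz=2000.0),
        VmCpuSample("web01", vcpus=1, ready_pct=1.0, usage_mhz=800.0, guaranteed_mhz=0.0),
    ]
    for line in triage_cpu(host_usage_pct=20.0, vms=samples):
        print(line)
```

In practice the inputs would be averages over a monitoring window rather than single samples, so that a short spike in ready time does not trigger a finding.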
CPU Shares are the preferred way to provide some virtual machines with more processor time than others. Increasing the shares available to a virtual machine enables the hypervisor to schedule the extra time more intelligently. This sharing mechanism also provides a useful metric for locating spare CPU time in Resource Pools. By watching CPU Extra Time, it is possible to identify virtual machines that are being given more time on the processor than their share allocation reserves for them. This indicates the potential for adding a new virtual machine to the resource pool without having a significant impact on the existing virtual machines.

Memory Commitment

Memory overcommitment is a powerful feature of the Virtual Infrastructure architecture that allows capacity planners to allocate more memory than is physically available. As physical memory is required for one virtual machine, it is taken away from another. Which VM loses physical memory is determined by the current load, share ratios, limits, and reservations of the virtual machines running on the host providing the memory. Since most virtual machines do not run at 100% capacity, the hypervisor is able to reclaim memory from one VM and give it to another.

Through monitoring, an administrator can ensure that memory reclamation is not imposing unintended resource limits on the various workloads. By watching a Resource Pool's Memory Active (memory that is actively touched and used), Memory Consumed (how much memory has been allocated to virtual machines), and Memory Overhead (the amount of memory lost to the hypervisor to handle the scheduling of access to physical memory), it is possible to see how high the demand for physical memory is. By digging a little deeper and looking at how the individual hosts are reclaiming memory and how virtual machines are affected by the reclamation, administrators can find areas where capacity may be over- or underutilized. This information can be used to tune environments to better reflect the actual needs of each workload, allowing the hypervisor to do less work to balance those requirements.

There are several mechanisms available to the hypervisor to handle the reclamation of unused memory from virtual machines:

Shared Memory

Shared Memory applies when the host contains similar guest operating systems with a large number of common components. The hypervisor scans for identical memory pages among VMs and keeps a single read-only copy of each such page in physical memory. The duplicate pages are released and made available for other purposes. This is the preferred way to reclaim physical memory because it has the least impact on the virtual machines.

When monitoring Shared Memory, the administrator needs to watch the historical values. If Shared Memory usage suddenly drops, that may indicate that some of the virtual machines were patched while others were not.
This leads to fewer memory pages that can be shared, and reduces the overall capacity of the environment.

Balloon Driver

The balloon driver is installed along with VMware Tools on a guest operating system. It is controlled by the hypervisor and is used to pin memory pages on a guest, forcing the operating system to page out to disk. The hypervisor determines which VMs use the balloon driver based on the activity level of each virtual machine. This is the second best way to reclaim memory from virtual machines, because the guest operating system gets the chance to choose which memory pages to swap to disk.

When monitoring Memory Balloon Usage, administrators need to be conscious of the usage pattern that the driver is exhibiting. At the host level, if the memory reclaimed by the driver is constant, then either some virtual machines have a balloon that is always inflated, or the host has hit the limit of the amount of memory it can reclaim through the balloon driver. In the first instance, the balloon drivers on individual virtual machines must be monitored to find which ones are consistently inflated. A consistently inflated balloon is a sign that the virtual machine has more memory allocated to it than it needs. Reducing the committed memory to more reasonable levels will reduce the resource overhead lost to the hypervisor as it maintains the balloon driver. In the second instance, if the balloon driver has reached the limit of what it can reclaim, the Swapped Memory metrics can be examined to determine how swapped memory is being used.

Swapped Memory

Swapped Memory is physical memory that the hypervisor swaps to disk directly, rather than allowing the guest operating system to do its own paging. This is the least desirable choice for memory reclamation because, while the hypervisor can make a best guess of which pages to swap to disk based on calculations like the memory tax, the guest operating system will generally make better choices.

When monitoring Swapped Memory, it is not enough to watch the amount of memory swapped to disk. The swap rate must also be watched to see how actively the hypervisor is moving memory pages from disk back to physical memory because of hard faults. If the swap rate is high, the host is struggling to meet its memory commitment levels and action needs to be taken to increase the overall memory available.
If the swap rate is low and the amount swapped to disk is consistent, then there are virtual machines that are not making good use of their allocated memory. This is a good way to find workloads on which the allocated memory can be reduced.

Network and Storage

The network and storage backbone is often a limiting factor for virtual environments. Both bandwidth saturation and IO contention tend to cause ripple effects throughout the environment, reducing the overall effectiveness of virtualization. Capacity managers need to constantly monitor the load being placed on the network to ensure that it is not exceeding capacity. There are several ways in which monitoring the hypervisor can help with this task.

At the host level, it is possible to monitor Network Bandwidth Usage to ensure that it is not hitting the maximum capacity of the network interface cards installed on the system. As network traffic increases, administrators need to either add more NICs and team them for busy vSwitches, or investigate the possibility of moving complementary workloads to the same hosts. In addition to monitoring the bandwidth, it is important to monitor the connection states of those NICs. It is not uncommon for a switch to be taken offline and replaced temporarily, and the replacement switch may auto-negotiate to a lower connection rate than the production switch.

Hypervisor routing can help reduce overall network usage. In N-tier or client/server architectures, it is common for components to be on different virtual machines and for communication to occur across the network. If two virtual machines are on the same host, network communication between them is routed through the hypervisor rather than through the physical network. This has two advantages: reduced network load and increased transfer speeds. The data flow in this case occurs at the speed of the memory modules as opposed to the speed of the network.

When monitoring the underlying storage, administrators must be aware of the storage devices and how their load is being handled. The physical devices themselves can handle only a finite number of IOPS. The hypervisor handles spikes in IO by queuing up IO requests while the device is busy, and sending them once capacity is available. By monitoring both Device and Kernel Latencies, administrators are able to see where bottlenecks are forming. If the Device Latency is increasing, there could be a problem with the physical disks. If the Kernel Latency is increasing, there may be too much traffic being sent to the LUNs, in which case more capacity needs to be added.
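The latency rule of thumb above can be expressed as a small Python sketch. It is an illustration under stated assumptions, not the Management Pack's logic: the function names, the half-window comparison used to decide whether a series is "increasing", and the 2 ms minimum rise are all hypothetical choices made for the example.

```python
from statistics import mean
from typing import List, Sequence

MIN_RISE_MS = 2.0  # assumption: smallest rise in average latency worth flagging


def is_rising(samples_ms: Sequence[float], min_rise_ms: float = MIN_RISE_MS) -> bool:
    """True when the later half of the window averages at least
    min_rise_ms higher than the earlier half."""
    if len(samples_ms) < 4:
        return False  # too few samples to call it a trend
    half = len(samples_ms) // 2
    return mean(samples_ms[half:]) - mean(samples_ms[:half]) >= min_rise_ms


def diagnose_storage(device_ms: Sequence[float], kernel_ms: Sequence[float]) -> List[str]:
    """Map rising Device and Kernel Latency to the causes named in the text."""
    findings = []
    if is_rising(device_ms):
        findings.append("device latency rising: check the physical disks")
    if is_rising(kernel_ms):
        findings.append("kernel latency rising: too much traffic to the LUNs; consider adding capacity")
    return findings


if __name__ == "__main__":
    device = [5.0, 5.0, 6.0, 5.0, 9.0, 10.0, 11.0, 12.0]   # climbing
    kernel = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]      # flat
    for line in diagnose_storage(device, kernel):
        print(line)
```

Comparing the two halves of a window is a deliberately crude trend test. A production check would more likely use a longer history and a baseline per device.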
Where to Go from Here

Once administrators are able to use the BridgeWays Management Packs to monitor and measure available capacity, they can take capacity planning to its final phase by proactively projecting resource requirements three months, six months, or even years down the road. By analyzing performance views, historical data reports, and trending reports, administrators can forecast resource utilization as the organization grows and greater demands are placed on the virtual infrastructure. This allows administrators to implement just-in-time procurement to cut rack space, power consumption, cooling, and other costs associated with running high-end hardware.

In the past, in order to avoid adverse impact on systems already in place, administrators had to resort to disjointed tools and crystal balls when allocating resources to new virtual systems as they were brought online. Today, the BridgeWays Management Pack for VMware ESX provides both the high-level and detailed views that help take the guesswork out of tuning virtual environments. Proactive monitoring with the BridgeWays Management Pack for VMware ESX helps ensure smooth virtual operations while maximizing hardware ROI.

BridgeWays
301 Moodie Dr., Suite 200
Ottawa, Ontario K2H 9C4 Canada
tel: 1.613.842.3494
fax: 1.613.842-3499
www.bridgeways.ca