Capacity Analysis Techniques Applied to VMware VMs (aka When is a Server not really a Server?)




Capacity Analysis Techniques Applied to VMware VMs (aka When is a Server not really a Server?)
Debbie Sheetz, BMC Software
La Jolla, CA, November 5th, 2013

Presentation Overview
How to Approach Performance/Capacity Evaluation for a Virtualized Application
- How is this like evaluating performance of a non-virtualized application?
- What kinds of measurements are necessary?
Methodology
- Understanding Virtualization Layers
  Which layers matter
  How layers relate
  Where layer measurements come from
- Identifying Metric Clusters
  CPU: capacity utilization and performance stress
  Memory: capacity utilization, stress, shortage
- Application performance is the sum of all its parts
Case Studies
1. Right-sizing VMware Linux guest memory
2. VMware Linux guest memory health assessment
Conclusions
Copyright 1/22/2016 BMC Software, Inc

How to Approach Performance/Capacity Evaluation for a Virtualized Application
Computer performance analysis and prediction depend on having cause-and-effect relationships
- High CPU queue = poor response time
- Memory shortage = degraded response time
Need to identify groups of related metrics, i.e. metric clusters
- CPU: capacity utilization and queue length
- Memory: capacity utilization, pressure, shortage
So far, all of this applies to physical or virtual servers
Virtualization introduces layers
- Relationship of virtualization layers to the application
- Different layers have different measurements available
- The paper "Modeling/Sizing Techniques for Different Virtualization Strategies" from CMG 2008 outlines this approach

Performance Evaluation for a Virtualized Application: Layers
What's between the application and the hardware resources it uses?
[Figure labels: PHYSICAL SERVER, VIRTUALIZED APPLICATIONS]

Performance Evaluation for a Virtualized Application: ESX Layers
What are the most important ESX server components for our analysis?
- VM: the virtual machine
  Contains the operating system and the applications running on it
- Cluster: a set of physical hosts
  A VM is assigned to a cluster
  At any given moment the VM is running on one of the physical hosts owned by that cluster
  If VMotion is enabled, the VM can be automatically moved from one host to another to achieve balanced hosts
- Host: a physical host
  Owns hardware resources such as CPU, physical memory, disks, and network interfaces

Performance Evaluation for a Virtualized Application: ESX Layers
How are the layers for ESX Server related to each other?
1. The application
2. The operating system hosting the application
3. The virtual machine hosting the operating system
4. The physical host on which the virtual machine runs
5. The cluster, which owns a number of physical hosts and runs a group of virtual machines

Performance Evaluation for a Virtualized Application: Measuring ESX Layers
Where are performance metrics for each ESX layer obtained?
- Layers 1 and 2 are reported from the guest operating system (the operating system hosting the application)
- Layers 3, 4, and 5 are reported from ESX (vCenter)
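The layer-to-source rule above can be written down as a small lookup. This is only an illustrative sketch: the `Layer` enum and `metric_source` function are hypothetical names, not part of any VMware or BMC API.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The five layers from the slides, numbered as in the presentation."""
    APPLICATION = 1
    OPERATING_SYSTEM = 2
    VIRTUAL_MACHINE = 3
    PHYSICAL_HOST = 4
    CLUSTER = 5


def metric_source(layer: Layer) -> str:
    """Return where measurements for a layer come from."""
    # Layers 1 and 2 are measured inside the VM; layers 3 to 5 come from ESX.
    if layer <= Layer.OPERATING_SYSTEM:
        return "operating system inside the VM"
    return "ESX (vCenter)"


for layer in Layer:
    print(f"Layer {layer.value} ({layer.name}): {metric_source(layer)}")
```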

Performance Evaluation for a Virtualized Application: Metric Clusters/CPU
What affects the setting of a CPU capacity threshold?
- Why not set it at 100%?
  For interactive workloads, CPU queueing causes poor performance
  For non-interactive workloads, 100% can be perfect! (see the paper "Analytic Modeling Techniques for Predicting Batch Window Elapsed Time" from CMG 2009)
  Interactive workloads can be spiky: need headroom
  Workload forecasting can be inexact: need headroom, too
  Failover planning
- After taking into account non-performance constraints, need to observe the CPU queue length
  May need to further reduce the CPU capacity threshold
So the metric cluster is CPU CAPACITY UTILIZATION and CPU QUEUE LENGTH

Performance Evaluation for a Virtualized Application: Metric Clusters/Memory
What affects the setting of a memory capacity threshold?
- The philosophy is similar to CPU, but the metrics are not as simple
  CPU usage is in direct proportion to the workload; for memory that's not always true
  Need a combination of capacity usage and performance warning metrics
  Memory metrics differ by operating system
So the metric cluster is MEMORY CAPACITY UTILIZATION, MEMORY PRESSURE, and MEMORY SHORTAGE

Performance Evaluation for a Virtualized Application: Metric Clusters/CPU Metrics
CPU Capacity Utilization metrics
- For ESX, CPU utilization is MHz Used divided by MHz Available
  Cluster: sum of all hosts' MHz
  Host: number of physical processors * MHz per processor
  VM: number of virtual processors configured * MHz per processor
- For Linux and Windows, CPU utilization is CPU Seconds Used (User CPU + System CPU) divided by CPU Seconds Available (number of processors seen by the OS * seconds)
CPU Queue Length metrics
- For ESX, CPU Queue Length is CPU Ready time divided by elapsed seconds for each VM
  Cluster/Host: sum of all VMs' CPU queue lengths (see the paper "Virtualization Performance and Capacity Data Classification Schema" from CMG 2010)
- For Linux, Run Queue Depth is available (sampled metric)
- For Windows, Processor Queue Length is available (sampled metric)
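The CPU formulas above reduce to a few divisions. The sketch below assumes CPU Ready has already been converted to seconds for the sampling interval; the function names, the 20-second interval, and the sample values are all hypothetical.

```python
def esx_cpu_utilization(mhz_used: float, processors: int,
                        mhz_per_processor: float) -> float:
    """MHz Used divided by MHz Available (processors * MHz per processor)."""
    return mhz_used / (processors * mhz_per_processor)


def os_cpu_utilization(user_cpu_s: float, system_cpu_s: float,
                       processors: int, interval_s: float) -> float:
    """CPU Seconds Used divided by CPU Seconds Available for one interval."""
    return (user_cpu_s + system_cpu_s) / (processors * interval_s)


def esx_cpu_queue_length(cpu_ready_s: float, interval_s: float) -> float:
    """CPU Ready seconds accumulated per elapsed second for one VM."""
    return cpu_ready_s / interval_s


# Hypothetical CPU Ready totals (seconds) per VM over one 20-second interval.
vm_ready_s = {"vm-a": 2.0, "vm-b": 0.5, "vm-c": 1.5}

# Host (or cluster) queue length is the sum of the per-VM queue lengths.
host_queue_length = sum(esx_cpu_queue_length(r, 20.0) for r in vm_ready_s.values())
print(f"host CPU queue length: {host_queue_length:.3f}")
```

Dividing the host figure by the number of physical processors gives a value that can be compared across hosts of different sizes.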

Performance Evaluation for a Virtualized Application: Metric Clusters/Memory Metrics
Memory Capacity Utilization metrics
- For ESX, Memory utilization is either Consumed Memory divided by Configured Memory or Active Memory divided by Configured Memory
  Cluster: sum of all hosts' configured memory
  Host/VM: physical memory configured
- For Linux and Windows, Memory utilization is Memory Used divided by Memory Available
  Also require breakdown of physical memory by usage type: Free, Files Cache, Process, and System memory
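A minimal sketch of these memory ratios follows. The function names and all sample sizes are hypothetical; the point is only that the same VM can look nearly full by Consumed and nearly idle by Active.

```python
def memory_utilization(used_mb: float, configured_mb: float) -> float:
    """Used (Consumed or Active) memory divided by Configured memory."""
    return used_mb / configured_mb


def usage_breakdown(free_mb: float, files_cache_mb: float,
                    process_mb: float, system_mb: float) -> dict:
    """Share of physical memory by usage type, in percent."""
    parts = {"Free": free_mb, "Files Cache": files_cache_mb,
             "Process": process_mb, "System": system_mb}
    total = sum(parts.values())
    return {name: 100.0 * mb / total for name, mb in parts.items()}


# Hypothetical VM with 16 GB configured.
configured_mb = 16384.0
print("by Consumed:", memory_utilization(15360.0, configured_mb))
print("by Active:  ", memory_utilization(1024.0, configured_mb))
print(usage_breakdown(4096.0, 2048.0, 1536.0, 512.0))
```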

Performance Evaluation for a Virtualized Application: Metric Clusters/Memory Metrics
Memory Pressure metrics
- For ESX VMs, Hosts, and Clusters
  Balloon Memory is available
  Ratio of Active Memory to Consumed Memory can be calculated
- For Linux, Page Scans is available
- For Windows, no equivalent metric
Memory Shortage metrics
- For ESX VMs, Hosts, and Clusters
  Swapping (Paging) is available
  Memory Swapped is available
- For Linux, Paging (to disk) is available
- For Windows, Paging (to disk) is available, but includes File Cache support
- For Linux and Windows, Virtual Memory Utilization approaching 100%
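One way to turn the ESX pressure and shortage metrics into a single status is sketched below. The `EsxMemorySample` fields and the "any non-zero value counts" rule are assumptions made for illustration; a real screen would likely use site-specific thresholds.

```python
from dataclasses import dataclass


@dataclass
class EsxMemorySample:
    """Hypothetical per-interval ESX memory sample for a VM, host, or cluster."""
    balloon_mb: float      # Balloon Memory (pressure indicator)
    swap_rate_kbps: float  # Swapping (Paging) rate (shortage indicator)
    swapped_mb: float      # Memory Swapped (shortage indicator)


def classify(sample: EsxMemorySample) -> str:
    """Shortage outranks pressure; otherwise the sample is considered healthy."""
    if sample.swap_rate_kbps > 0 or sample.swapped_mb > 0:
        return "shortage"
    if sample.balloon_mb > 0:
        return "pressure"
    return "healthy"


print(classify(EsxMemorySample(balloon_mb=0.0, swap_rate_kbps=0.0, swapped_mb=0.0)))
print(classify(EsxMemorySample(balloon_mb=512.0, swap_rate_kbps=0.0, swapped_mb=0.0)))
print(classify(EsxMemorySample(balloon_mb=512.0, swap_rate_kbps=40.0, swapped_mb=128.0)))
```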

Performance Evaluation for a Virtualized Application: Application Performance is the Sum of Its Parts
Application Resource Demand is a function of
- Workload Volume
  How many transactions does the application need to support
  Time of day
  Day of the week
  Time of the year, etc.
- Workload Resource Profile
  CPU, Memory, I/O, and Network required per transaction
Application Performance = Resource Demand + Queueing
- Demand = Workload Volume * Workload Resource Profile
  This is called service time in an analytic model
- Queueing occurs when demand can't be met immediately by available hardware resources
  This is called wait time or queueing delay in an analytic model
RESPONSE TIME = SERVICE TIME + WAIT TIME
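The relationships above can be tried out numerically. The slides do not say which queueing model is used, so the wait-time formula below is an assumption: a single-server M/M/1 queue, chosen only because it is the simplest textbook case. All names and sample values are hypothetical.

```python
def resource_demand(volume_per_s: float, cpu_s_per_txn: float) -> float:
    """Demand = Workload Volume * Workload Resource Profile (CPU seconds per second)."""
    return volume_per_s * cpu_s_per_txn


def mm1_wait_time(service_time_s: float, utilization: float) -> float:
    """Queueing delay for an M/M/1 queue (an assumed model, not from the slides)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return service_time_s * utilization / (1.0 - utilization)


def response_time(service_time_s: float, wait_time_s: float) -> float:
    """RESPONSE TIME = SERVICE TIME + WAIT TIME."""
    return service_time_s + wait_time_s


# Hypothetical workload: 100 transactions/s, 5 ms of CPU each, one processor.
service_s = 0.005
utilization = resource_demand(100.0, service_s)  # fraction of one processor
wait_s = mm1_wait_time(service_s, utilization)
print(f"utilization={utilization:.2f} response={response_time(service_s, wait_s):.4f}s")
```

Under this model the wait time grows sharply as utilization approaches 100%, which is the usual argument for keeping headroom on interactive workloads.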

Capacity Evaluation Techniques for VMware VMs: Case Studies Overview
Case Studies
- Demonstrate selected aspects of the capacity analysis methodology
- Show VMware ESX-hosted Linux guests
  Methodology applies to any virtualized platform
- Two Case Studies
  1. Right-sizing VMware Linux guest memory
  2. VMware Linux guest memory health assessment

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
VMware provides two measurements of VM memory usage: Consumed Memory and Active Memory
- Which one should be used for capacity planning?
  Consumed Memory is often almost as large as Configured Memory
  Active Memory is usually much smaller than Consumed Memory, often by nearly a factor of 10
- So Consumed is quite conservative and Active much less so
[Chart: ESX Cluster, Layer 5]
Using Active, you can run about 750 and 250 more VMs on each cluster
Using Consumed, you can run about 100 and 15 more VMs on each cluster

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
For capacity planning, some recommend using Active Memory plus a buffer (such as 70% above Active)
- Much less conservative than Consumed, so more VMs could be run on the current hardware
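The effect of sizing by Active plus a buffer rather than by Consumed can be illustrated with simple arithmetic. The function names and every number below (free memory, per-VM Active and Consumed) are hypothetical and are not the figures behind the charts in the case study.

```python
def sized_memory_gb(active_gb: float, buffer_fraction: float = 0.70) -> float:
    """Active Memory plus a buffer, e.g. 70% above Active."""
    return active_gb * (1.0 + buffer_fraction)


def additional_vms(free_gb: float, per_vm_gb: float) -> int:
    """How many more VMs of a given memory size fit in the free capacity."""
    return int(free_gb // per_vm_gb)


free_gb = 512.0     # hypothetical unused memory in a cluster
active_gb = 1.0     # hypothetical typical Active Memory per VM
consumed_gb = 14.0  # hypothetical typical Consumed Memory per VM

by_active = additional_vms(free_gb, sized_memory_gb(active_gb))
by_consumed = additional_vms(free_gb, consumed_gb)
print(f"Active + 70% buffer: {by_active} more VMs; Consumed: {by_consumed} more VMs")
```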

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
SOLUTION: Choose a conservative or aggressive approach depending on corporate philosophy
- Consumed Memory: Configure each VM with the Consumed amount of memory; allow for memory over-commitment on the host/cluster
- Active Memory +: Configure each VM with less memory than the current Consumed (but more than Active) and carefully monitor for memory stress; if there is stress, increase the Configured Memory
[Chart: ESX VM, Layer 3]
Active Memory is rarely over 1 GB, and the original VM Configured Memory is 16 GB
It's decided to try 4 GB as the new Configured Memory

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
SOLUTION: Need to monitor from both ESX and VM perspectives
- ESX perspective: Memory utilization reduces overall; no paging, no ballooning, no swapping
[Charts: ESX VM, Layer 3; annotation: Restored to 16 GB]
Configured Memory is reduced to 4 GB, then is restored to 16 GB
Consumed Memory % is 100% of Configured, then reduces to < 50%

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
SOLUTION: Need to monitor from both ESX and VM perspectives
- Guest perspective: Crisis! Virtual memory runs out, memory utilization is 100%, paging occurs, no process memory is left, applications stop
[Charts: Linux, Layer 2; annotation: Restored to 16 GB]
Physical Memory utilization is 80-100% until the Configured Memory is restored
Paging rate spikes to 0.3 MB/sec until Configured Memory is restored, then is 0

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
VM and its applications are suffering badly
[Charts: Linux, Layer 2; annotation: Restored to 16 GB]
Virtual (swap) Memory utilization rises to 100%; when the system is rebooted to restore the Configured Memory, utilization is 0% again
Processes consume most of memory, but it's not enough (see the paging and swapping problems); when memory is restored, the good memory profile returns: plenty of free memory, more file system cache, and more process memory

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
EVEN BETTER SOLUTION: Be very careful when right-sizing
- Confirm application memory requirements before downsizing a VM
- Consider using VMware over-commitment instead of manually reconfiguring individual VMs
  Need to monitor ESX-measured paging, ballooning, swapping
[Charts: ESX Host, Layer 4; annotation: Reconfiguration]
All cluster hosts are under-committed
Density of 1 indicates physical = virtual; < 1 is under-committed; > 1 is over-committed
Specific host is well under the memory threshold of 80%
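The density figure can be sketched as a ratio of virtual to physical memory. The slide does not spell out how "virtual" is measured, so this example assumes it is the sum of the VMs' Configured Memory on the host; the function names and sample sizes are hypothetical.

```python
def memory_density(vm_configured_gb: list, host_physical_gb: float) -> float:
    """Virtual memory (assumed: sum of VM Configured Memory) over physical memory."""
    return sum(vm_configured_gb) / host_physical_gb


def commitment(density: float) -> str:
    """Interpret density: 1 means physical = virtual, below is under, above is over."""
    if density > 1.0:
        return "over-committed"
    if density < 1.0:
        return "under-committed"
    return "physical = virtual"


# Hypothetical 64 GB host before and after more VMs are placed on it.
print(commitment(memory_density([16.0, 8.0, 8.0], 64.0)))
print(commitment(memory_density([16.0] * 6, 64.0)))
```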

Case Study: Right-sizing VMware Virtual Machine (VM) Memory
EVEN BETTER SOLUTION: Be very careful when right-sizing
- If using manual reconfiguration, must screen for guest-measured
  Paging (and/or scanning) increase
  Physical memory utilization increase and/or changes in profile
  Virtual (swap) memory utilization increase
- Recommend screening all important guests for Memory or CPU stress
Virtualized guest measurements can't always be taken literally, but are absolutely necessary for capacity analysis!
More detail in the paper about this
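A guest-side screen after a manual reconfiguration could compare samples taken before and after the change. The sketch below is an assumption about how such a screen might look: the `GuestSample` fields, the 10-point rise threshold, and the "before" values are hypothetical, while the "after" values echo the case study (0.3 MB/sec paging, 100% physical and swap utilization).

```python
from dataclasses import dataclass


@dataclass
class GuestSample:
    """Hypothetical guest-measured memory metrics averaged over an interval."""
    paging_mb_per_s: float
    scan_rate: float
    physical_util: float  # fraction of physical memory in use
    swap_util: float      # fraction of virtual (swap) memory in use


def stress_flags(before: GuestSample, after: GuestSample,
                 util_rise: float = 0.10) -> list:
    """List the screening criteria that worsened after the reconfiguration."""
    flags = []
    if after.paging_mb_per_s > before.paging_mb_per_s:
        flags.append("paging")
    if after.scan_rate > before.scan_rate:
        flags.append("scanning")
    if after.physical_util - before.physical_util > util_rise:
        flags.append("physical memory")
    if after.swap_util - before.swap_util > util_rise:
        flags.append("swap")
    return flags


before = GuestSample(paging_mb_per_s=0.0, scan_rate=0.0, physical_util=0.45, swap_util=0.0)
after = GuestSample(paging_mb_per_s=0.3, scan_rate=0.0, physical_util=1.0, swap_util=1.0)
print(stress_flags(before, after))
```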

Case Study: VMware Virtual Machine (VM) Memory Health
Changes have been seen in vCenter VM Memory metrics. Are these changes impacting application performance? Is there an ESX capacity shortfall?
- Many memory metrics are available from ESX for a VM
[Chart: ESX VM, Layer 3]
VM Used (same as Active Memory and Memory Usage from vCenter) has a clear daily pattern
VM Configured Memory is steady at 7.8 GB
There are large shifts between Granted/Shared/Zero and Balloon Memory

Case Study: VMware Virtual Machine (VM) Memory Health
Additional drill-down on the ESX VM memory characteristics
- Specific memory pressure metrics
[Charts: ESX VM, Layer 3]
The ratio of Active to Consumed is changing; a higher ratio indicates memory pressure
Ballooning also indicates memory pressure
Granted and Balloon show their relationship: when there is memory pressure, Granted reduces

Case Study: VMware Virtual Machine (VM) Memory Health
Is this affecting the application's performance?
- CPU and Memory patterns for the application don't change despite the changes seen at the VM level
[Charts: Linux, Layer 1]
CPU usage by process shows a very consistent daily pattern of workload volume and workload profile
Memory usage of active processes shows a consistent daily pattern, correlated with CPU usage

Case Study: VMware Virtual Machine (VM) Memory Health
Is this affecting the application's performance?
- What about memory pressure or shortage metrics?
[Charts: Linux, Layer 2]
Paging and scanning are zero
Virtual memory utilization is very low
Memory usage breakdown shows Process memory increases as a percentage and Files Cache and Free decrease when ballooning (memory pressure) occurs
In the first case study, that was correlated with a memory shortage, but not here

Case Study: VMware Virtual Machine (VM) Memory Health
Application is OK, but why are these changes occurring in ESX?
- What about memory capacity utilization?
[Chart: ESX Host, Layer 4]
It's consistently around 93%
Definitely over the capacity threshold of 80%
Need to check for memory pressure and shortage metrics next

Case Study: VMware Virtual Machine (VM) Memory Health
Why are these changes occurring in ESX?
- What do the memory pressure and shortage metrics show?
[Charts: ESX Host, Layer 4]
Ballooning indicates pressure and it's occurring consistently
Swapped memory indicates a shortage and it's occurring consistently
Paging (Swapping) indicates a shortage and it's occurring pretty consistently

Case Study: VMware Virtual Machine (VM) Memory Health
So the ESX host is definitely experiencing a memory shortage
- The cluster containing this host is showing around 70% memory capacity utilization
  Even with DRS enabled, this host is considerably worse than average for both memory and CPU capacity utilization
Possible solutions
- Investigate why the cluster isn't better balanced
  Didn't have the data for the other hosts to do this analysis
- Investigate moving one or more VMs to less utilized clusters
- Upgrade memory on the cluster hosts

Case Study: VMware Virtual Machine (VM) Memory Health
Memory pressure metric: Ratio of Active to Consumed Memory compared across layers
- Individual VM experience is much different from the average experience
[Charts: ESX Cluster/Host/VM, Layers 5/4/3; ESX VM, Layer 3]
Cluster and Host ratio is rising, which shows memory pressure is increasing overall
Active VMs are the only VMs we need to know about
Our VM is much worse than average
Ratio is approaching 60% for some
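Comparing the Active-to-Consumed ratio across layers shows how an average can hide a stressed VM. In this sketch the host-level ratio is assumed to be total Active over total Consumed for the VMs on the host; the VM names and memory figures are hypothetical.

```python
def active_to_consumed(active_mb: float, consumed_mb: float) -> float:
    """Memory pressure indicator: a higher ratio indicates more pressure."""
    return active_mb / consumed_mb


# Hypothetical (Active MB, Consumed MB) per VM on one host.
vm_memory_mb = {
    "vm-a": (580.0, 1000.0),
    "vm-b": (200.0, 2000.0),
    "vm-c": (300.0, 3000.0),
}

# Host-level ratio, assumed here to be total Active over total Consumed.
total_active = sum(active for active, _ in vm_memory_mb.values())
total_consumed = sum(consumed for _, consumed in vm_memory_mb.values())
host_ratio = active_to_consumed(total_active, total_consumed)

vm_ratios = {name: active_to_consumed(a, c) for name, (a, c) in vm_memory_mb.items()}
worse_than_host = sorted(name for name, r in vm_ratios.items() if r > host_ratio)

print(f"host ratio: {host_ratio:.2f}")
print("VMs worse than the host average:", worse_than_host)
```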

Conclusions
Virtualization is simple for the application, not so easy for the capacity planner/performance analyst
- Must identify the relevant layers between the application and the hardware resources it uses
- Need appropriate measurements from every layer
  Often requires multiple data sources
  Apparently similar metrics can mean entirely different things
- Need to perform analysis on several layers at the same time to get a complete picture

Conclusions
Need to use the same techniques as for a physical server
- Set hardware capacity resource utilization thresholds according to both performance and other constraints
- Understand that high utilization of one resource can produce low utilization of another
- Identify workload demand patterns within servers/guests
- Metric clusters (capacity and performance) are needed for each resource

Conclusions
Use higher layer metrics carefully
- Averages (or other summarizations) can obscure exactly what you need to see
  Focus on application and VM layers for actual performance
  Focus on active VMs
- Apply thresholds to the hardware resource layer only
  Threshold interpretation for VMs requires multi-layer analysis
  Threshold interpretation for a cluster requires host analysis, too
- Use compatible metric comparison units such as percentage of total capacity or queue length per processor rather than MHz, GB, total queue length, etc.
- Higher layers provide essential overall capacity planning projections and trends
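The advice on compatible comparison units amounts to normalizing before comparing. A small sketch with hypothetical names and values shows how a raw total can point the wrong way:

```python
def percent_of_capacity(used: float, capacity: float) -> float:
    """Express usage (MHz, GB, ...) as a percentage of total capacity."""
    return 100.0 * used / capacity


def queue_per_processor(queue_length: float, processors: int) -> float:
    """Normalize a total queue length by the number of processors."""
    return queue_length / processors


# Hypothetical host and VM: the host has the larger total queue,
# but the VM is queueing more per processor.
host_queue = queue_per_processor(8.0, 32)  # 32 physical processors
vm_queue = queue_per_processor(1.0, 2)     # 2 virtual processors

print(f"host: {host_queue:.2f} per processor, VM: {vm_queue:.2f} per processor")
print(f"memory: host {percent_of_capacity(96.0, 256.0):.1f}%, "
      f"VM {percent_of_capacity(12.0, 16.0):.1f}%")
```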

Q&A

Learn more at www.bmc.com