Kernel Optimizations for KVM
Rik van Riel, Senior Software Engineer, Red Hat
June 25, 2010




Kernel Optimizations for KVM
- What is virtualization performance?
- Benefits of developing both guest and host
- KVM optimizations in RHEL 5 & RHEL 6
- Future optimizations
- Deployment ideas to maximize performance with KVM
- Conclusions

What is virtualization performance?
Different users have different goals:
- Throughput or latency of one virtual machine.
- Aggregate throughput of a set of virtual machines.
- Largest number of virtual machines on a system, with acceptable performance (density).
KVM in RHEL and RHEV has performance optimizations for all of these (sometimes contradictory) goals.

Benefits of developing both guest and host
- High performance virtualization is HARD:
  - Operating systems do strange things
  - Some things cannot be emulated efficiently
  - All virtualization solutions depend on special device drivers for performance
  - Squeezing out the last few (dozen) percent of performance takes more work...
- Some things are more easily done in the guest OS
- Some things are more easily done on the host

Large guest support
- The obvious way to increase guest performance
- Up to 64 VCPUs per guest
- Up to 1TB memory per guest
- x2APIC, a virtual interrupt controller
- Lockless RCU synchronization in KVM on the host

Tickless kernel & pvclock
- Timekeeping has always been a virtualization headache
- Tickless kernel
  - Eliminates timer ticks in idle guests, reducing CPU use
  - More CPU time left for running guests
- Pvclock
  - In the past, missed timer ticks caused time skew issues
  - Instead of keeping its own clock, a guest with pvclock asks the host what time it is
  - Eliminates time skew
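
A quick sanity check that a guest really is using the paravirtual clock (a sketch; the sysfs path is standard, and on a KVM guest with pvclock it should report kvm-clock):
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock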

Transparent huge pages
- Uses 2MB pages for KVM guests
  - The performance benefits of hugetlbfs, without the downsides
- Reduces memory address translation time
  - A virtual machine is like a process
  - The CPU does 2 address translations for processes in virtual machines
  - Fewer address translations needed
- Typical performance improvements from 2% to 10%
- Can also be used for non-KVM processes, e.g. JVMs

Transparent huge pages internals
- Using 2MB pages for processes
  - TLB covers more memory, fewer TLB misses
  - Fewer page table levels to walk, each TLB miss faster
- Can be [disabled], [always] enabled, or [madvise]
  - /sys/kernel/mm/redhat_transparent_hugepage/enable
  - always: enable transparent hugepages for any process
  - madvise: only use hugepages on specified VMAs
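
As a sketch of how this knob is used (path as given on the slide; upstream kernels use /sys/kernel/mm/transparent_hugepage/enabled instead):
# cat /sys/kernel/mm/redhat_transparent_hugepage/enable
# echo madvise > /sys/kernel/mm/redhat_transparent_hugepage/enable
How much anonymous memory currently sits in huge pages can be read from the AnonHugePages line in /proc/meminfo:
$ grep AnonHugePages /proc/meminfo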

Transparent huge pages internals #2
- Uses 2MB pages when available
  - Memory gets defragmented as required
  - Falls back to 4kB pages if needed
- 2MB pages get broken up into 4kB pages under memory pressure; only small pages get swapped
- Khugepaged folds small pages into large ones, defragmenting memory as needed

Guest swapping
- KVM guests are Linux processes
  - Rarely used memory can be swapped out to disk
- However, swapping things back in is slow
  - 10ms disk latency
  - 10ms equals 25,000,000 CPU cycles with a 2.5GHz CPU
  - Modern CPUs can execute multiple instructions per cycle
- Useful to increase the density of virtual machines per physical system
- Do not rely on swapping too much
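
Host-side swap activity caused by overcommitted guests can be watched with standard tools; for example, sustained non-zero si/so columns in vmstat mean guest memory is being swapped in and out:
$ vmstat 1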

Kernel Samepage Merging (KSM)
- Pack more guests onto a host by sharing memory
- Guest memory is scanned by ksmd
  - Identical pages are write protected and shared
  - If a guest writes to a page, it is unshared
- Both identical pages inside guests and between guests can be shared
- Especially useful if guests run the same program
  - Windows zeroes all free memory, which can all be shared

KSM page sharing in action
(Diagram: guests G1 and G2 sharing identical pages through the host kernel)

KSM performance considerations
- KSM is a tradeoff
  - Uses CPU time...
  - ...to avoid disk IO for swapping
- Ksmtuned controls the KSM scan rate
  - By default KSM only runs when free memory gets low
- If all guests fit in RAM, KSM may be a waste of CPU
  - # service ksm stop
- Transparent hugepages cannot be shared by KSM
  - They get split up and become sharable under memory pressure

KSM performance considerations #2
- KSM will be great for hosting providers
  - More virtual machines per system
  - More customers served with less hardware
- No one will notice the CPU time used on KSM
  - Clients only see inside their own guests
  - A constant tiny slowdown is not as visible as a sudden crawl-to-a-halt swap storm
- Ksmtuned can be configured using /etc/ksmtuned.conf
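
How much memory KSM is actually saving can be estimated from its sysfs counters (standard KSM interface; a quick sketch):
$ cat /sys/kernel/mm/ksm/pages_shared     # distinct shared pages in use
$ cat /sys/kernel/mm/ksm/pages_sharing    # how many additional mappings share them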

Memory ballooning & guest migration
- Sometimes KSM and swapping are not enough
  - Too much swap IO going on?
- Guests can be resized (ballooned) down
  - Does the guest have more memory than it needs right now?
  - E.g. lots of memory in the page cache
  - Do the guests need all their memory to run their programs?
- Live migrate a virtual machine to another host
  - Frees up memory and stops swap IO
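
With libvirt, both operations can be driven from the host command line; the guest name and sizes below are placeholders:
# virsh setmem myguest 1048576                          # balloon the guest down to 1GB (size in KiB)
# virsh migrate --live myguest qemu+ssh://otherhost/system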

IO scheduler considerations
- RHEL 6 has two main IO schedulers
- CFQ enforces fairness in access to a disk
  - Mostly for physical disks (high throughput, slow seeks)
  - Reads are latency sensitive (a program waits for the data)
  - Writes can be done in bulk
- Deadline scheduler
  - Makes sure no request stays in the queue too long
  - Can keep multiple disks (arrays) busy
  - Higher latency for reads in the presence of writes

IO scheduler recommendations
- When multiple guests share a disk
  - CFQ keeps read latencies reasonable...
  - ...even when another guest is doing large background writes
- When using a large storage array for the guests
  - Deadline keeps all the disks busy
  - May result in better throughput
  - Increased throughput may make up for reduced fairness
- Test which works best for your setup
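
The active scheduler can be checked and changed per block device at runtime; /dev/sda here is just an example device:
$ cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
# echo deadline > /sys/block/sda/queue/scheduler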

Virtualization vs. guest kernel spinlocks
- Spinlocks are the simplest kernel locks
  - One CPU holds the lock, does stuff, then unlocks it
  - Other CPUs that want the lock spin until the lock is free
- Virtualization throws a monkey wrench in this scheme
  - The CPU that holds the lock may not be running
  - The CPU that wants the lock may spin for a longer time
  - The CPU that wants the lock may prevent the other CPU from freeing the lock for an entire timeslice!
- Only an issue if #guest CPUs > #physical CPUs

Virtualization vs. ticket spinlocks
- Fair ticket spinlocks add an additional complication
  - Multiple CPUs may be spinning on a lock
  - But only the CPU that has been waiting longest may take the lock
  - Everybody else continues spinning...
  - ...even if the next-in-line CPU is not running right now
- The fix: by default, use unfair spinlocks in KVM guests
  - Provide a kernel command line option to switch: spinlock=ticket or spinlock=unfair
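
If such a command line option is available, it would be appended to the guest kernel's boot line, e.g. in the guest's /boot/grub/grub.conf (the spinlock= spelling follows this slide and is illustrative, not a guaranteed shipping parameter):
kernel /vmlinuz-<version> ro root=<rootdev> spinlock=ticket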

Hardware solutions to the spinlock problem
- When all else fails, throw hardware at the problem
- Pause Loop Exiting
  - Some very new Intel CPUs detect when a guest is spinning on a spinlock
  - PLE support scheduled to be in RHEL 6.1
- Hyperthreading
  - Can be used to reduce scheduling latencies, which reduces spinlock worst-case overhead

Hyperthreading to improve spinlock behaviour
- Hyperthreading on Intel CPUs: two threads of execution can share the same CPU core
  - Adds 5-30% aggregate throughput
- A 4 core CPU with hyperthreading
  - Performs like a 4-5 core CPU
  - Is fully utilized with 40-50% idle time in top, sar, etc.
- The extra threads allow spinlock-holding and spinlock-wanting guest CPUs to run simultaneously
  - Eliminates the worst of the bad behaviour
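
Whether hyperthreading is active on a host can be read from the CPU topology, for example with lscpu (output line shown is typical; numbers vary per machine):
$ lscpu | grep -i "thread(s) per core"
Thread(s) per core:    2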

Pause Loop Exiting
- Fancy feature on the very latest Intel CPUs
  - Detects when a guest CPU is spinning on a lock
  - Traps with EXIT_REASON_PAUSE_INSTRUCTION
- The host can schedule another guest CPU
  - Might be the guest CPU that holds the lock (and can free it)
  - Could be another guest CPU doing useful work
- The system spends less time spinning on locks, more time doing work
- PLE feature planned for RHEL 6.1
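
On hosts where kvm_intel supports PLE, its tuning knobs are visible as module parameters (parameter names as in the upstream kvm_intel module; a sketch):
$ cat /sys/module/kvm_intel/parameters/ple_gap
$ cat /sys/module/kvm_intel/parameters/ple_window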

Asynchronous page faults
- KVM guests are like processes
  - Guest memory can be swapped out and back in
  - The guest CPU is paused while waiting for swap-in IO
- However, we know there is an operating system inside
  - Usually, a process inside the guest does the access
  - The guest OS can suspend that process while waiting for the host to handle the page fault
- With these async page faults, the guest CPU can continue to handle timers, network traffic, etc.
- Async page fault feature planned for RHEL 6.1

Conclusions
- There are different performance goals
  - Guest performance, aggregate performance, density
  - Which optimizations you use depends on your needs
- RHEL 6 defaults are conservative
  - Experiment with transparent huge pages
  - Experiment with different IO schedulers
  - Enable HT, keep all the cores busy at 40% idle time
- Red Hat is working to implement more performance optimizations over time