Exploiting The Latest KVM Features For Optimized Virtualized Enterprise Storage Performance


Dr. Khoa Huynh (khoa@us.ibm.com), IBM Linux Technology Center

Agenda
- KVM I/O architecture
- Key performance challenges for storage I/O in virtualized environments
- Solving the key performance challenges: solutions and prototypes, performance results, performance recommendations
- Other recent performance features
- Recap

IBM, Linux, and Building a Smarter Planet

KVM I/O Architecture at 10,000 Feet

[Diagram: a KVM guest (applications, guest kernel with file system and block drivers, vcpu threads) runs on top of hardware emulation (QEMU) in the KVM host; QEMU submits I/Os through the host kernel's file system / block / NFS layers and physical drivers, with kvm.ko executing guest code.]

- Applications and the guest OS run inside the KVM guest just as they run on bare metal
- The guest interfaces with emulated hardware presented by QEMU
- QEMU submits I/Os to the host on behalf of the guest; the host kernel treats guest I/Os like those of any user-space application

Notes:
- Each guest CPU has a dedicated vcpu thread that uses the kvm.ko module to execute guest code
- A single I/O thread (iothread) runs a select(2) loop to handle events and generates I/O requests to the host on the guest's behalf
- Only one thread may be executing QEMU code at any given time

So What Are The Key Performance Challenges?

- Low throughput / high latencies (compared to bare metal)
  - Generally, less than 30% of bare metal indicates configuration issue(s)
  - Between 30% and 60% of bare metal indicates a need for performance tuning
  - With proper configuration plus performance tuning: 90% of bare metal or more
- Low I/O rates (IOPS)
  - Some enterprise workloads require hundreds of thousands of I/Os Per Second (IOPS)
  - The most difficult challenge for I/O virtualization!
  - Current KVM tops out at ~147,000 IOPS
  - VMware claimed vSphere 5.1 could achieve 1.06 million IOPS for a single guest

Issue: Low Throughput / High Latencies

I/O Virtualization Approaches

- Device assignment (pass-through)
  - Passes a physical device directly to the guest
  - High performance
  - No device sharing among multiple guests; difficult for live migration
  - PCI device limit (8) per guest
- Full virtualization (emulated IDE, SATA, SCSI)
  - Good guest compatibility
  - Bad performance (many trap-and-emulate operations); does not scale beyond 1 thread
  - Not recommended for enterprise storage
- Para-virtualization (virtio-blk, virtio-scsi)
  - Efficient guest-host communication through the virtio ring buffer (virtqueue)
  - Good performance
  - Retains virtualization benefits (e.g., device sharing among guests)
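As an illustration of the choice between full virtualization and para-virtualization, a libvirt disk definition selects the approach via the target bus. This is a minimal sketch, assuming a hypothetical image path:

```xml
<!-- Fully emulated IDE disk: broad guest compatibility, but slow (trap-and-emulate) -->
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/path/to/disk.img'/>
  <target dev='hda' bus='ide'/>
</disk>

<!-- Para-virtualized virtio-blk disk: requires virtio drivers in the guest -->
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source file='/path/to/disk.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
```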

Virtio-blk Storage Configurations

[Diagram: two host-server configurations, both using direct I/O with para-virtualized drivers in the guest.]

- Device-backed virtual storage: block devices (/dev/sda, /dev/sdb, ...) on an LVM volume over the physical storage are passed to the guest, where they appear as virtual devices (/dev/vda, /dev/vdb, ...)
- File-backed virtual storage: a RAW or QCOW2 file on an LVM volume over the physical storage backs the guest's virtual devices

Storage Performance Results

KVM block I/O performance: FFSB benchmark with direct I/O on an LVM volume
- KVM guest = 2 vcpus, 4GB; host = 16 CPUs, 12GB
- Physical storage = 8 x RAID10 disk arrays (192 disks), 4 x DS3400 controllers, 2 x FC host adapters

[Chart: throughput relative to bare metal (1.00) for device-backed, RAW-file-backed, and QCOW2-file-backed virtual storage across data streaming, random I/O, and overall results; reported ratios are 0.90, 0.92, 0.88, 0.89, 0.91, 0.69, 0.53, 0.78, and 0.69.]

Notes:
- FFSB = Flexible File System Benchmark (http://sourceforge.net/projects/ffsb/)
- Data streaming = sequential reads and sequential writes with block sizes of 8KB and 256KB
- Random mixed workloads = random reads, random writes, mail server, mixed DB2 workloads (8KB block size)

Storage Performance Recommendations

Proper configuration and performance tuning make a BIG difference!
- Use virtio drivers (IDE emulation does NOT scale beyond 1 thread)
- Virtual storage
  - If possible, pass whole partitions or block devices to the guest (device-backed virtual disks) instead of host files
  - PCI pass-through performs better than device-backed virtual disks, but cannot easily migrate
- Virtual image format
  - RAW offers the best performance, but QCOW2 performance has improved significantly since RHEL 6.1 and upstream
- Guest caching mode
  - cache=writethrough is the default and provides data integrity in all cases (disk write cache not exposed to guest)
  - cache=writethrough is great for read-intensive workloads (host's page cache enabled)
  - cache=none is great for write-intensive workloads or workloads involving NFS remote storage (host's page cache disabled, disk write cache enabled)
- Guest I/O model
  - Linux AIO support (aio=native) provides better performance in many cases, especially with multiple threads
- I/O scheduler
  - The deadline I/O scheduler yields the best performance for I/O-intensive workloads (much better than the default CFQ scheduler)
- Miscellaneous
  - Enable x2apic support for the guest: 2% to 5% performance gain for I/O-intensive workloads
  - Disable unnecessary features: delay accounting (nodelayacct kernel boot parameter), random entropy contribution from block devices (sysfs)
  - Avoid memory over-commit on the KVM host (much worse than CPU over-commit)
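Pulled together, the per-drive recommendations above translate into qemu options and sysfs settings along these lines. This is a sketch, not the deck's exact invocation; the image path and device names are hypothetical:

```shell
# Para-virtualized disk, no host page cache, native Linux AIO, RAW format, x2apic on
qemu-kvm -cpu qemu64,+x2apic \
  -drive if=none,id=drive0,file=/path/to/disk.img,format=raw,cache=none,aio=native \
  -device virtio-blk-pci,drive=drive0 \
  ...

# On the host (and in the guest): switch I/O-intensive block devices to deadline
echo deadline > /sys/block/sda/queue/scheduler
# Disable random entropy contribution from a block device
echo 0 > /sys/block/sda/queue/add_random
```

The sysfs writes require root and apply per device; the nodelayacct option, by contrast, goes on the kernel boot command line.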

Storage Performance Recommendations (cont'd)

KVM guest configuration:
- Use 2 or more virtual CPUs in the guest for I/O-intensive workloads
- Allocate enough memory in the guest to avoid memory over-commit
- Do not format virtual disks with ext4 on RHEL5 (RHEL5 does not have performance optimizations for ext4)
- Use the deadline I/O scheduler in the guest
- Use the Time Stamp Counter (TSC) as clocksource, if the host CPU supports TSC
- A 64-bit guest on a 64-bit host has the best I/O performance in most cases

NFS remote storage:
- If RHEL5.5 (or an earlier version) is used on the KVM host, kvm-83-164.el5_5.15 or later is needed
  - Reduces the number of lseek operations: up to 3X improvement in NFS throughput and a 40% reduction in the guest's virtual CPU usage
  - Patches are included in RHEL5.6 and later (and upstream QEMU)
- If RHEL6.0 or 6.1 is used on the KVM host, an errata package for RHEL6.1 is needed (http://rhn.redhat.com/errata/rhba-2011-1086.html)
  - RHEL6 QEMU uses vector I/O constructs for handling guest I/Os, but the NFS client (host) did not support vector I/O, so large guest I/Os were broken into 4KB requests
  - This issue was fixed in RHEL6.2 (the NFS client now supports vector I/O)
- Use the deadline I/O scheduler
- Ensure NFS read/write sizes are sufficiently large, especially for large I/Os
- Ensure NFS server(s) have enough NFS threads to handle multi-threaded workloads
- NFS exports = sync (default), w_nodelay
- For distributed parallel file systems with large block sizes used in cloud storage systems, the ability to avoid reading/writing multiple blocks for each I/O smaller than the block size is needed
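Concretely, the NFS sizing and export recommendations above might look like this on the server and host. This is a sketch: the server name and export path are hypothetical, and no_wdelay is the standard /etc/exports spelling of the write-delay option the slide abbreviates as w_nodelay:

```shell
# On the NFS server, /etc/exports entry (sync is the default; no_wdelay disables write gathering):
#   /export/images  *(rw,sync,no_wdelay)

# On the KVM host: mount with large read/write sizes for large guest I/Os
mount -t nfs -o rsize=1048576,wsize=1048576 nfsserver:/export/images /var/lib/libvirt/images
```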

Issue: Low I/O Rates (IOPS)

Some Background

- Many enterprise workloads (e.g., databases, ERP systems, low-latency financial trading applications) require very high I/O rates
  - Some demand well over 500,000 I/Os Per Second (IOPS)
  - KVM can typically handle only up to ~147,000 IOPS
- Very high I/O rates present a significant challenge to virtualization
  - The primary reason why many enterprise workloads have not been migrated to virtualized environments
- The challenge from other hypervisors:
  - (2011) VMware vSphere 5.0: 300,000 IOPS for a single VM; 1 million IOPS for 6 VMs @ 8KB I/O size
  - (2012) VMware vSphere 5.1: 1 million IOPS for a single VM @ 4KB I/O size
  - (2012) Microsoft Hyper-V: 700,000 IOPS for a single host @ 512-byte I/O size

Possible Paths To Very High I/O Rates (1 million IOPS)

- PCI pass-through (PCI device assignment)
  - Currently stands at ~800,000 IOPS @ 8KB I/O size, ~950,000 IOPS @ 4KB I/O size
  - Limit of 8 PCI devices per VM (guest)
  - No virtualization benefits; difficult for live migration
- Virtio-blk optimization
  - Virtio-blk can traditionally support up to ~147,000 IOPS
  - Performance profiling points to the Big QEMU Lock
    - Historically allows core QEMU components to ignore multi-threading
    - Creates scalability problems
  - Need to bypass or relieve the Big QEMU Lock as much as possible

Bypassing the Big QEMU Lock in virtio-blk

- Vhost-blk
  - Initially coded by Liu Yuan; new prototype by Asias He
  - Submits guest I/Os directly to the host via kernel threads (similar to vhost_net for networking)
  - Drawbacks: involves the kernel (ring-0 privilege, etc.); cannot take advantage of QEMU features (e.g., image formats); cannot support live migration
- Data-plane QEMU
  - Coded by Stefan Hajnoczi (~1500 LOC)
  - Submits guest I/Os directly to the host in user space (one user-space thread per virtual block device)
  - Will eventually become the default mode of operation
  - No kernel change is required
- Both approaches showed comparable performance in our testing!

Data-Plane

"Most exciting development in QEMU in the last 5 years!" - Anthony Liguori (QEMU maintainer)

- Virtio-blk-data-plane: accelerated data path for the para-virtualized block I/O driver
  - Uses per-device dedicated threads and Linux AIO support in the host kernel for I/O processing, without going through the QEMU block layer
  - No need to acquire the Big QEMU Lock
- Availability
  - Accepted upstream (qemu-1.4.0 or later)
  - Available as a Technology Preview in Red Hat Enterprise Linux 6.4 and SUSE Linux Enterprise Server 11 SP3

[Diagram: the KVM guest notifies the virtio-blk-data-plane thread(s) via ioeventfd; the threads submit I/Os through Linux AIO in the host kernel and signal completions back to the guest via irqfd, bypassing the QEMU event loop.]

How To Enable Virtio-blk-data-plane?

Libvirt domain XML:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  ...
  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native'/>
    <source file='path/to/disk.img'/>
    <target dev='vda' bus='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
  </disk>
  ...
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.virtio-disk0.scsi=off'/>
  </qemu:commandline>
  <!-- config-wce=off is not needed in RHEL 6.4 -->
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.virtio-disk0.config-wce=off'/>
  </qemu:commandline>
  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.virtio-disk0.x-data-plane=on'/>
  </qemu:commandline>
</domain>

QEMU command line:

qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=path/to/disk.img \
     -device virtio-blk,drive=drive0,scsi=off,config-wce=off,x-data-plane=on

BUT...

Current limitations:
- Only the raw image format is supported (other image formats depend on the QEMU block layer)
- Live migration is not supported
- Hot unplug and block jobs are not supported
- I/O throttling limits are ignored
- Only Linux hosts are supported (due to Linux AIO usage)

On-going work (for upcoming QEMU releases):
- Patches have recently been submitted upstream to convert virtio-net to use data-plane
- Reduce the scope of the Big QEMU Lock by moving to RCU (Read-Copy-Update)

Performance Results: Data-Plane

- Highest virtualized storage I/O rates ever reported for a single virtual machine (guest)
  - Red Hat Enterprise Linux 6.4: 1.20 million IOPS @ 8KB I/O size, 1.58 million IOPS @ 4KB I/O size, for a single guest
  - Upstream QEMU / SLES 11 SP3: 1.37 million IOPS @ 8KB I/O size, 1.61 million IOPS @ 4KB I/O size, for a single guest (more efficient memory-mapping infrastructure in QEMU)
  - Approaching the bare-metal limit of our storage setup
- 50% higher than the closest competing hypervisor (VMware vSphere 5.1)
- Low latencies and consistent throughput for storage I/O requests

First Things First

Needed a storage setup capable of delivering > 1 million IOPS:
- Host server: IBM System x3850 X5, 4 x E7-4870 sockets (40 cores @ 2.40 GHz), 256GB memory (total)
- Storage: 7 x 8-Gbps, dual-ported QLogic HBAs connected to 7 SCSI target servers
  - Each SCSI target server is backed by RAM disks and provides 2 PCI devices (ports) and 8 LUNs (56 LUNs total)
- Virtual machine (guest): 40 vcpus, 8GB memory, 42 LUNs, no host cache (cache=none)
- FIO workload: random reads + writes (50% split), 1 job per LUN, queue depth = 32, direct I/Os, engine = libaio
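The FIO workload just described can be expressed as a job file along these lines. This is a sketch: the guest device path is hypothetical, and one such job section would be repeated per LUN:

```ini
[global]
rw=randrw          ; random reads + writes
rwmixread=50       ; 50% read / 50% write split
direct=1           ; direct I/Os (bypass the guest page cache)
ioengine=libaio
iodepth=32         ; queue depth = 32
bs=4k              ; I/O size was varied from 512B to 8KB across runs

[lun0]
filename=/dev/vdb  ; one job per LUN
```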

Qemu-kvm Command Line For Data-Plane Testing qemu-kvm -M pc -mem-path /hugepages -m 7168 -cpu qemu64,+x2apic -smp 40 -name guest1 -uuid 1c047c62-c21a-1530-33bf-185bc15261d8 -boot c -drive if=none,id=root,file=/khoa/images/if1n1_sles11.img -device virtio-blk-pci,drive=root -drive if=none,id=drive-virtiodisk1_0,file=/dev/sdc,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=5.0,multifunction=on,drive=drive-virtio-disk1_0,id=virtio-disk1_0 -drive if=none,id=drive-virtiodisk1_1,file=/dev/sdd,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=5.1,multifunction=on,drive=drive-virtio-disk1_1,id=virtio-disk1_1 -drive if=none,id=drive-virtiodisk1_2,file=/dev/sde,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=5.2,multifunction=on,drive=drive-virtio-disk1_2,id=virtio-disk1_2 -drive if=none,id=drive-virtiodisk2_0,file=/dev/sdg,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=6.0,multifunction=on,drive=drive-virtio-disk2_0,id=virtio-disk2_0 -drive if=none,id=drive-virtiodisk2_1,file=/dev/sdh,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=6.1,multifunction=on,drive=drive-virtio-disk2_1,id=virtio-disk2_1 -drive if=none,id=drive-virtiodisk2_2,file=/dev/sdi,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=6.2,multifunction=on,drive=drive-virtio-disk2_2,id=virtio-disk2_2 -drive if=none,id=drive-virtiodisk3_0,file=/dev/sdk,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=7.0,multifunction=on,drive=drive-virtio-disk3_0,id=virtio-disk3_0 -drive if=none,id=drive-virtiodisk3_1,file=/dev/sdl,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=7.1,multifunction=on,drive=drive-virtio-disk3_1,id=virtio-disk3_1 -drive 
if=none,id=drive-virtiodisk3_2,file=/dev/sdm,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=7.2,multifunction=on,drive=drive-virtio-disk3_2,id=virtio-disk3_2 -drive if=none,id=drive-virtiodisk4_0,file=/dev/sdo,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=8.0,multifunction=on,drive=drive-virtio-disk4_0,id=virtio-disk4_0 -drive if=none,id=drive-virtiodisk4_1,file=/dev/sdp,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=8.1,multifunction=on,drive=drive-virtio-disk4_1,id=virtio-disk4_1 -drive if=none,id=drive-virtiodisk4_2,file=/dev/sdq,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=8.2,multifunction=on,drive=drive-virtio-disk4_2,id=virtio-disk4_2 -drive if=none,id=drive-virtiodisk5_0,file=/dev/sds,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=9.0,multifunction=on,drive=drive-virtio-disk5_0,id=virtio-disk5_0 -drive if=none,id=drive-virtiodisk5_1,file=/dev/sdt,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=9.1,multifunction=on,drive=drive-virtio-disk5_1,id=virtio-disk5_1 -drive if=none,id=drive-virtiodisk5_2,file=/dev/sdu,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=9.2,multifunction=on,drive=drive-virtio-disk5_2,id=virtio-disk5_2 -drive if=none,id=drive-virtiodisk6_0,file=/dev/sdw,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=10.0,multifunction=on,drive=drive-virtio-disk6_0,id=virtio-disk6_0 -drive if=none,id=drive-virtiodisk6_1,file=/dev/sdx,cache=none,aio=native -device virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=10.1,multifunction=on,drive=drive-virtio-disk6_1,id=virtio-disk6_1 -drive if=none,id=drive-virtiodisk6_2,file=/dev/sdy,cache=none,aio=native -device 
virtio-blk-pci,scsi=off,config-wce=off,x-dataplane=on,addr=10.2,multifunction=on,drive=drive-virtio-disk6_2,id=virtio-disk6_2 -drive if=none,id=drive-virtiodisk7_0,file=/dev/sdaa,cache=none,aio=native -device virtio-blk-pci,scsi 20 IBM, Linux, and Building a Smarter Planet

Single KVM Guest: RHEL 6.4 with virtio-blk-data-plane

FIO benchmark, direct random I/Os (50% reads, 50% writes)
KVM host = IBM x3850 X5 (Intel E7-8870 @ 2.4GHz, 40 cores, 256GB)

[Chart: IOPS (up to ~1.6 million) and average read/write latencies (ms, up to ~1.8) across I/O sizes of 512B, 1KB, 2KB, 4KB, and 8KB.]

KVM vs. Competing Hypervisors

Direct random I/Os at 4KB block size; host server = Intel E7-8870 @ 2.4GHz, 40 cores, 256GB (higher is better):
- Upstream KVM (with virtio-blk-data-plane, single guest): 1,613,288 IOPS (52% higher than vSphere 5.1)
- RHEL 6.4 KVM (with virtio-blk-data-plane, single guest): 1,577,684 IOPS
- VMware vSphere 5.1 (single guest): 1,059,304 IOPS
- Microsoft Hyper-V (single host, multiple guests): 400,000 IOPS
- KVM (normal I/O path, single guest): 147,365 IOPS

KVM vs. Competing Hypervisors: Direct Random I/Os Across Various Block Sizes

Host server = Intel E7-8870 @ 2.4GHz, 40 cores, 256GB (higher is better)

[Chart: IOPS across I/O sizes of 512B, 1KB, 2KB, 4KB, and 8KB for SLES 11 SP3 / upstream KVM with virtio-blk-data-plane (single guest), RHEL 6.4 KVM with virtio-blk-data-plane (single guest), RHEL 6.3 KVM with PCI pass-through (single guest), VMware vSphere 5.1 and 5.0 (single guest), and Microsoft Hyper-V (single host, multiple guests).]

Single Virtual Machine: Impact of Performance Tuning

Direct random I/Os at 4KB block size; host server = Intel E7-8870 @ 2.4GHz, 40 cores, 256GB:
- KVM w/ virtio-blk-data-plane (performance tuned): 1,601,753 IOPS
- KVM w/ virtio-blk-data-plane (no tuning): 939,199 IOPS
- KVM w/ PCI pass-through (performance tuned): 380,730 IOPS
- KVM w/ PCI pass-through (no tuning): 295,652 IOPS
- Existing KVM w/ virtio-blk: 147,365 IOPS
- Microsoft Hyper-V: 400,000 IOPS
- VMware vSphere 5.1: 1,059,304 IOPS

Performance tuning applied: QLogic interrupt coalescing, no delay accounting, no random entropy contribution from block devices, guest TSC clocksource

Other Recent Performance Features

Para-Virtualized End-Of-Interrupt (PV-EOI)

- Improves interrupt-processing overhead
  - Reduces the number of context switches between the KVM hypervisor and guests
  - Lower CPU utilization: up to 10%
- Ideal for workloads with high I/O rates
  - High storage I/O rates
  - High incoming network traffic
- Enabled by default in guest Linux operating systems
- Availability: Red Hat Enterprise Linux 6.4 (KVM guests)
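PV-EOI is negotiated between host and guest, and on qemu-kvm of this era it can be toggled per guest through a CPU feature flag. This is a sketch, assuming the kvm_pv_eoi flag spelling used by qemu-kvm 1.x; useful mainly for A/B-comparing interrupt overhead:

```shell
# Expose the para-virtualized EOI feature to the guest
qemu-kvm -cpu qemu64,+kvm_pv_eoi ...
# Or mask it off to measure the difference
qemu-kvm -cpu qemu64,-kvm_pv_eoi ...
```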

Bio-Based virtio-blk Driver

- Skips the I/O scheduler in the KVM guest by adding a bio-based path to virtio-blk (Asias He @ Red Hat)
  - Similar to the bio-based driver for ramdisk
  - Shorter I/O path, less global mutex contention
- Performance is better (throughput, latency, vcpu utilization) for fast drives (e.g., ramdisk, SSDs), but not for spinning disks
  - For spinning disks, the I/O scheduler's benefits (e.g., request merging) outweigh the bio path's advantage
- Availability: upstream kernels (since 3.7)
- How to enable (disabled by default):
  - Add virtio_blk.use_bio=1 to the guest kernel command line, or
  - modprobe virtio_blk use_bio=1

Virtio-scsi

- SCSI support for KVM guests
  - Rich SCSI feature set; true SCSI devices (seen as /dev/sd* in the KVM guest)
  - Virtio-scsi device = SCSI Host Bus Adapter (HBA)
  - Large number of disks per virtio-scsi device
  - Easier P2V migration: block devices appear as /dev/sd*
- How to enable virtio-scsi: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/sect-Managing_storage_controllers_in_a_guest.html
- Performance is OK (see next slide)
- Availability: Red Hat Enterprise Linux 6.4 and later; SUSE Linux Enterprise Server 11 SP3 and later
- More information: http://wiki.qemu.org/features/virtioscsi and the "Better Utilization of Storage Features from KVM Guest via virtio-scsi" presentation at 2013 LinuxCon/CloudOpen by Masaki Kimura (Hitachi)
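In libvirt terms, enabling virtio-scsi amounts to defining a virtio-scsi controller and attaching disks to it with bus='scsi'. A minimal sketch, assuming a hypothetical backing block device:

```xml
<!-- The virtio-scsi HBA; many disks can attach to one controller -->
<controller type='scsi' index='0' model='virtio-scsi'/>

<!-- A disk on that controller; appears as /dev/sda inside the guest -->
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/sdb'/>
  <target dev='sda' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
```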

Virtio-scsi vs. Virtio-blk Performance

Storage performance, virtio-blk vs. virtio-scsi: single KVM guest (cache=none), raw virtual image, direct I/Os, 6 physical disk arrays

[Chart: throughput (MB/sec, up to ~900) for virtio-blk and virtio-scsi at 1, 8, and 16 threads across seven workloads: large file creates (256KB block size), sequential reads (256KB), large file creates (8KB), sequential reads (8KB), random reads (8KB), random writes (8KB), and mixed I/O (70% reads, 30% writes).]

Recap

- Low throughput / high latencies
  - Proper configuration and performance tuning are critical
  - Storage virtualization type (para-virtualization vs. full emulation), device- vs. file-backed virtual storage, image format (raw vs. qcow2), guest caching mode (writethrough vs. none), and I/O scheduler (deadline) are all important
- Support for high I/O rates
  - KVM tops out at ~147,000 IOPS maximum per guest
  - The virtio-blk-data-plane feature is needed to scale better (1.6 million IOPS per guest!), at least 50% higher than competing hypervisors
  - Available as a Technology Preview in RHEL 6.4 and SLES 11 SP3
- Recent features
  - PV-EOI: reduces CPU utilization for workloads with high I/O and interrupt rates; needs new guest kernels
  - Bio-based virtio-blk: should help performance with fast storage (ramdisk, SSDs), but not slower spinning disks
  - Virtio-scsi: support for SCSI devices in KVM guests, easier P2V migration; performance mostly comparable to virtio-blk; available in RHEL 6.4 and SLES 11 SP3

Thank You