CFS-v: I/O Demand-driven VM Scheduler in KVM



Similar documents
Virtual Machine Scheduling for Parallel Soft Real-Time Applications

Virtual machine CPU monitoring with Kernel Tracing

Large-scale performance monitoring framework for cloud monitoring. Live Trace Reading and Processing

Performance Analysis of Network I/O Workloads in Virtualized Data Centers

Real-time KVM from the ground up

Enabling Technologies for Distributed and Cloud Computing

Small is Better: Avoiding Latency Traps in Virtualized DataCenters

Cloud Operating Systems for Servers

Monitoring Databases on VMware

Energy Efficiency and Server Virtualization in Data Centers: An Empirical Investigation

Enabling Technologies for Distributed Computing

IOmark-VM. DotHill AssuredSAN Pro Test Report: VM a Test Report Date: 16, August

Virtual Switching Without a Hypervisor for a More Secure Cloud

Monitoring and analysis of performance impact in virtualized environments

An Enhanced CPU Scheduler for XEN Hypervisor to Improve Performance in Virtualized Environment

Architecture of the Kernel-based Virtual Machine (KVM)

Multiprocessor Scheduling and Scheduling in Linux Kernel 2.6

Linux Process Scheduling. sched.c. schedule() scheduler_tick() hooks. try_to_wake_up() ... CFS CPU 0 CPU 1 CPU 2 CPU 3

Real- Time Mul,- Core Virtual Machine Scheduling in Xen

Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM

Exploiting The Latest KVM Features For Optimized Virtualized Enterprise Storage Performance

Assessing the Performance of Virtualization Technologies for NFV: a Preliminary Benchmarking

Real-Time Multi-Core Virtual Machine Scheduling in Xen

Big Data in the Background: Maximizing Productivity while Minimizing Virtual Machine Interference

Marvell DragonFly Virtual Storage Accelerator Performance Benchmarks

Scaling in a Hypervisor Environment

BLACKBOARD LEARN TM AND VIRTUALIZATION Anand Gopinath, Software Performance Engineer, Blackboard Inc. Nakisa Shafiee, Senior Software Performance

Deciding which process to run. (Deciding which thread to run) Deciding how long the chosen process can run

Scheduling in The Age of Virtualization

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

VIRTUALIZATION is widely deployed in large-scale

Efficient Consolidation- aware VCPU scheduling on Multicore Virtualization Platform

Virtualizing Performance Asymmetric Multi-core Systems

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

A Network Marketing Model For Different Host Machines

PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE

APerformanceComparisonBetweenEnlightenmentandEmulationinMicrosoftHyper-V

How To Stop A Malicious Process From Running On A Hypervisor

White paper. QNAP Turbo NAS with SSD Cache

IOmark- VDI. HP HP ConvergedSystem 242- HC StoreVirtual Test Report: VDI- HC b Test Report Date: 27, April

Virtualizing Performance-Critical Database Applications in VMware vsphere VMware vsphere 4.0 with ESX 4.0

A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing

IOS110. Virtualization 5/27/2014 1

IOmark- VDI. Nimbus Data Gemini Test Report: VDI a Test Report Date: 6, September

High-Performance Nested Virtualization With Hitachi Logical Partitioning Feature

KVM Virtualized I/O Performance

KVM in Embedded Requirements, Experiences, Open Challenges

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Performance Measurements and Analysis of Network I/O Applications in Virtualized Cloud

OPERATING SYSTEMS SCHEDULING

Understanding Linux on z/vm Steal Time

Deploying Microsoft Exchange Server 2007 mailbox roles on VMware Infrastructure 3 using HP ProLiant servers and HP StorageWorks

Effective Computing with SMP Linux

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Why Relative Share Does Not Work

ELI: Bare-Metal Performance for I/O Virtualization

Virtualization. Types of Interfaces

The QEMU/KVM Hypervisor

Basics in Energy Information (& Communication) Systems Virtualization / Virtual Machines

CIVSched: A Communication-aware Inter-VM Scheduling Technique for Decreased Network Latency between Co-located VMs

Project No. 2: Process Scheduling in Linux Submission due: April 28, 2014, 11:59pm

Application Performance Isolation in Virtualization

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

Cloud Computing through Virtualization and HPC technologies

PARALLELS CLOUD STORAGE

SCSI support on Xen. MATSUMOTO Hitoshi Fujitsu Ltd.

IO latency tracking. Daniel Lezcano (dlezcano) Linaro Power Management Team

Avoid Paying The Virtualization Tax: Deploying Virtualized BI 4.0 The Right Way. Ashish C. Morzaria, SAP

Virtualization. Pradipta De

International Journal of Computer & Organization Trends Volume20 Number1 May 2015

KVM: A Hypervisor for All Seasons. Avi Kivity avi@qumranet.com

Real-time Performance Control of Elastic Virtualized Network Functions

Using Linux as Hypervisor with KVM

Thomas Fahrig Senior Developer Hypervisor Team. Hypervisor Architecture Terminology Goals Basics Details

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop

WE view a virtualized cloud computing environment

You know us individually, but do you know Linchpin People?

Squeezing Top Performance from your Virtualized SQL Server

Scheduler Vulnerabilities and Coordinated Attacks in Cloud Computing

FUSION iocontrol HYBRID STORAGE ARCHITECTURE 1

Completely Fair Scheduler and its tuning 1

VIRTUALIZATION AND CPU WAIT TIMES IN A LINUX GUEST ENVIRONMENT

Full and Para Virtualization

DSS. Diskpool and cloud storage benchmarks used in IT-DSS. Data & Storage Services. Geoffray ADDE

Maximizing VMware ESX Performance Through Defragmentation of Guest Systems. Presented by

Linux O(1) CPU Scheduler. Amit Gud amit (dot) gud (at) veritas (dot) com

Delivering Quality in Software Performance and Scalability Testing

Virtualization in Linux KVM + QEMU

Benchmarking Hadoop & HBase on Violin

Storage Sizing Issue of VDI System

Zeus Traffic Manager VA Performance on vsphere 4

Efficiency Analysis of AWS Offering Vs. Private/Hybrid Implementations or Traditional Colo

STEPPING TOWARDS A NOISELESS LINUX ENVIRONMENT

The CPU Scheduler in VMware vsphere 5.1

Beyond the Hypervisor

Where IT perceptions are reality. Test Report. OCe14000 Performance. Featuring Emulex OCe14102 Network Adapters Emulex XE100 Offload Engine

Prioritizing Soft Real-Time Network Traffic in Virtualized Hosts Based on Xen

Data Centers and Cloud Computing

Linux scheduler history. We will be talking about the O(1) scheduler

Transcription:

CFS-v: Demand-driven VM Scheduler in KVM Hyotaek Shim and Sung-Min Lee (hyotaek.shim, sung.min.lee@samsung.- com) Software R&D Center, Samsung Electronics 2014. 10. 16

Problem in Server Consolidation 2/16 bound VMs are often co-located with computing bou nd VMs on the same physical machine To avoid significant performance interference from computing-an d-computing VMs or -and- VMs Comp. Comp. Comp. Comp. Guest Kernel Guest Kernel Guest Kernel Guest Kernel KVM KVM Physical Machine Physical Machine TRACON: Interference-Aware Scheduling for Data-Intensive Applications in Virtualized Environment (SC 11) Understanding Performance Interference of Workload in Virtualized Cloud Environments (CLOUD 10)

Problem in Server Consolidation 3/16 performance is still interfered with co-running computin g bound VMs Also when computing and tasks coexist on the same VM Performance Performance Degradation Degradation Performance Degradation Performance Degradation Comp. Comp. Comp. Comp. Guest Kernel Guest Kernel Guest Kernel Guest Kernel KVM KVM Physical Machine Physical Machine

IOPS (4KB Random Read) Performance Evaluation of CFS in KVM performance interference by computing tasks 60000 50000 40000 30000 20000 4/16 Mixed (4 VMs): four VMs, each of which has 8 tasks and 1~4 comp. tas ks Separate (2 VMs): VM1 (32 tasks), VM2 (16 Comp. tasks) No Interference without computing tasks No Degradation at all regardless of computing threads 10000 0

Performance Evaluation of CFS in KVM 5/16 CPU utilization of VMs(Mixed) and Native Linux Ratio (%) 100% 80% 60% 40% 20% Without C-Threads idle iowait guest+steal sys+irq+soft user+nice Ratio (%) 100% 80% 60% 40% 20% With 4 C-Threads idle iowait guest+steal sys+irq+soft user+nice 0% 0% Ratio (%) 100% 80% 60% 40% 20% With 8 C-Threads idle iowait guest+steal sys+irq+soft user+nice Ratio (%) 100% 80% 60% 40% 20% With 16 C-Threads idle iowait guest+steal sys+irq+soft user+nice 0% 0%

CFS in Native Linux 6/16 In native Linux CFS, -bound tasks could repeatedly pre empt computing-bound tasks while consuming a small ti me slice Wakeup & Preemption Wakeup & Preemption Wakeup & Preemption Wakeup & Preemption IO Comp. IO Comp. IO Comp. IO Comp. I O Comp. What if an bound task repeatedly fail to preempt the current task? Wakeup & Fail to preempt Wakeup & Fail to preempt I O Comp. I O Comp. Time Line

Path of Virtio-blk 7/16 Five different kinds of tasks are involved for handling a vir tio-blk request Each task may be woken up just before handling their job (totally, up to five wake-ups) Main Loop Thread Bdrv Request Worker Thread Vring Interrupt VCPU0 Thread VCPU1 Thread RESCHEDULE IPI Event Handling Handling Interrupt Handling (Intr.+Softirq) Idle Request PCPU0 PCPU1 PCPU2 PCPU3 Host Kernel Softirq Event (ioeventfd)

Problem of CFS on VM scheduling 8/16 In CFS group scheduling, VMs are deployed in different gr oups (runqueues) Group entity of woken-up related threads often fail to preempt the group entity of computing VCPUs Woken-up but fails to preempt Sched Entity CFS Runqueue Sched Entity CFS Runqueue CFS Runqueue Event Handling Handling Interrupt Handling Request Event Handling ComputingComputing Main Loop Thread Worker Thread VCPU0 VCPU1 Current VCPU0 VCPU1

Preemption Rule of CFS 9/16 Vruntime Compensation When a schedule entity is woken up, the entity s Vruntime is reset to minimum Vruntime in its runqueue - threshold(e.g., 12000000) The reset Vruntime is not small enough to preempt the current comp uting task Weight and Grain Schedule entity for a group of bound VM tasks has a relatively sm all Weight due to CPU consumption, which results in large Grain If woken-up sched entity s Vruntime is smaller than Current sched entity s Vruntime + Grain, then preempt the current task! Grain is calculated by Weight of the woken-up

CFS-v: Boosting Related Tasks 10/16 CFS-v detects bound VCPU tasks by using a 2-bit count If a woken-up task consumes CPU less than 500us, increases by 1 If it consumes time slice more than 1ms, the count decreases by 1 The count is larger than or equal to 2, the VCPU is related Main loop, worker, and softirq threads are considered related A woken-up related task can always preempt non related tas ks Next task Boost Queue CFS Runqueue CFS Runqueue Event Handling Interrupt Handling Handling Request Computing Computing ComputingComputing Main Loop Thread Worker Thread VCPU0 VCPU1 VCPU0 VCPU1 VCPU0 VCPU1

CFS-v: Partial Boosting 11/16 VCPU that has both and computing tasks is not consid ered as bound and thus cannot be boosted with the pr evious boosting algorithm What if CFS-v boosts such a VCPU? The VCPU will not return CPU shortly due to computing tasks bound Task Current task in VCPU0 CFS Runqueue Performance Degradation C Computing bound Task Runqueue Runqueue Wait (in the wait queue) Running Comp. Ready but not running CFS Scheduler Guest VM Kernel CFS Scheduler Guest VM Kernel VCPU0 VCPU1

CFS-v: Partial Boosting 12/16 Even for a non related VCPU, if it receives Vring Interrupt or RESCHEDULE IPI, it is partially boosted within 200us time slice After a timer interrupt, the partially-boosted VCPU returns CPU Timer (200us) CFS Runqueue Sched Entity Sched Entity Sched Entity Co m Co m Co m Co m Co m VCPU Task VCPU Task VCPU Task VCPU Task Boost Queue (Partial Boosting in Xen) Task-aware Virtual Machine Scheduling for IO Performance (VEE '09)

Performance Evaluation 13/16 Evaluation Environment Host: Intel(R) Core(TM) i7-2600 CPU 3.40GHz (8 Cores), 12GB DRA M, kernel-3.16.1 Guest VMs: kernel-3.16.1, 8 Cores, 512MB DRAM» Running on a QCOW2 image, sharing a hard disk» Storage benchmark on RAW, sharing a Samsung 256GB SSD 840 pro 2.1, Virtio-blk We modified the host kernel and to implement CFS-v Micro benchmark Storage benchmark 4KB O_DIRECT mode random read Host kernel s page cache is also disabled Computing benchmark

Performance Evaluation 14/16 Separate 2VMs: co-running 2 VMs that execute and compu ting workloads, respectively VM1: 32 random read threads VM2: 16 MD5 hashing threads CFS-v improves the storage performance of VM1 by 764%, while reduces the computing performance of VM2 by 30.1% CFS-v has also raised the CPU util. of VM1 from 46% to 273%

Performance Evaluation 15/16 Mixed 2VMs: co-running 2 VMs, each of which is running 16 random read threads and 8 MD5 hashing threads CFS-v improves the storage performance by 93% and reduces the computing performance by only 9%

Summary 16/16 CFS-v enhances the overall resource efficiency of server c onsolidation Enables KVM to detect related and system tasks and t o boost them ahead of computing tasks By using partial boosting, CFS-v can also timely and momentary b oost bound tasks inside a VCPU running mixed workloads CFS-v highly improves the performance of VMs in exchange f or little degradation on the computing performance