Takahiro Hirofuchi, Hidemoto Nakada, Satoshi Itoh, and Satoshi Sekiguchi


Transcription:

Takahiro Hirofuchi, Hidemoto Nakada, Satoshi Itoh, and Satoshi Sekiguchi. National Institute of Advanced Industrial Science and Technology (AIST), Japan. VTDC2011, Jun. 8th, 2011. 1

Outline: What is dynamic consolidation? Background and challenge. Why is postcopy live migration promising? Comparison between postcopy and precopy; our postcopy live migration implementation. Reactive consolidation system: overall design and packing algorithm. Evaluation. Conclusion. 2

Dynamic Consolidation: Dynamically optimize VM locations in response to VM load changes, in order to eliminate excessive power consumption and assure VM performance. 3

Dynamic Consolidation: When datacenter load becomes small, move running VMs onto fewer hosts and suspend idle physical hosts, reducing power consumption and eliminating excessive power use. 4

Dynamic Consolidation: When datacenter load becomes high (i.e., the CPU usage of physical machines rises), remove the overload by distributing running VMs to other hosts and powering up new physical hosts, assuring VM performance. 5

Challenge: Be transparent to IaaS customers. IaaS providers present performance criteria (e.g., an EC2 Small Instance: 1.0-1.2 GHz 2007 Opteron). Live migration incurs a long duration until completion and CPU overheads. Hide what's going on in the background as much as possible! 6

Contribution: Reactive consolidation by postcopy live migration. We developed postcopy live migration for Qemu/KVM (presented at CCGrid2010) and a reactive VM consolidation system (presented in this talk) that optimizes VM locations in response to load changes and exploits postcopy live migration for quick load balancing. It assures VM performance to a higher degree, reducing the performance loss of VM repacking (50% better in a randomly generated scenario). 7

Outline: What is dynamic consolidation? Background and challenge. Why is postcopy live migration promising? Comparison between postcopy and precopy; our postcopy live migration implementation. Reactive consolidation system: overall design and packing algorithm. Evaluation. Conclusion. 8

Precopy vs. Postcopy: Precopy live migration copies VM memory before switching the execution host and is widely used in VMMs. Postcopy live migration copies VM memory after switching the execution host; there was no publicly available implementation, but we developed one! 9

Precopy Live Migration (1): Copy VM memory before relocation. Example: a VM with 1 GB RAM takes at least 10 seconds over GbE, and may take much longer. 1. Copy all memory pages to the destination. 2. Copy again the memory pages updated during the previous copy. 3. Repeat step 2 until the remaining memory pages are small enough. 4. Stop the VM. 5. Copy CPU registers, device states, and the remaining memory pages. 6. Resume the VM at the destination. (Diagram: the VM and its RAM on Machine A are copied to Machine B.) 10
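
The iterative copy in steps 1-3 is the part that can drag on when the guest keeps dirtying memory. A minimal toy model of that loop, just to illustrate the convergence behaviour (hypothetical parameters, not the Qemu/KVM code):

```python
# Toy model of the precopy loop described above (illustrative only, not the
# Qemu/KVM implementation; all rates and thresholds are made-up numbers).
def precopy_rounds(pages=262144,        # 1 GB of 4 KiB pages
                   copy_rate_pps=30000, # pages copied per second (network)
                   dirty_rate_pps=5000, # pages dirtied per second by the VM
                   stop_threshold=2000, # "small enough" remainder to stop
                   max_rounds=30):
    to_copy = pages                      # step 1: first round copies all pages
    for rounds in range(1, max_rounds + 1):
        copy_time = to_copy / copy_rate_pps
        # steps 2-3: pages dirtied while copying must be sent again next round
        to_copy = min(pages, int(dirty_rate_pps * copy_time))
        if to_copy <= stop_threshold:
            break
    # steps 4-6: stop the VM, copy device state plus the remainder, resume
    return rounds, to_copy

if __name__ == "__main__":
    rounds, leftover = precopy_rounds()
    print(f"converged after {rounds} copy rounds; "
          f"{leftover} pages left for the stop-and-copy phase")
```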

Precopy Live Migration (2): Copy VM memory before relocation (same example and steps 1-6 as the previous slide). Migration time = RAM size / Network speed + alpha. 11

Precopy Live Migration (3): Copy VM memory before relocation (same example and steps 1-6 as the previous slide). Migration time = RAM size / Network speed + alpha. It takes a long time to switch the location, and it is difficult to estimate how long it will take. 12
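
As a quick sanity check of the formula above, the 1 GB / GbE example works out as a back-of-the-envelope lower bound (ignoring the re-copied dirty pages hidden in the alpha term):

```latex
% Lower bound on precopy migration time for the 1 GB RAM / GbE example,
% ignoring retransmission of dirtied pages (the "alpha" term).
T_{\text{migration}} \;\ge\; \frac{\text{RAM size}}{\text{network speed}}
  \;=\; \frac{8\,\text{Gbit}}{1\,\text{Gbit/s}} \;=\; 8\,\text{s}
```

which is consistent with the slide's "at least 10 seconds" once protocol overhead and the first re-copy rounds are added.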

Postcopy Live Migration (1): Copy VM memory after relocation. 1. Stop the VM. 2. Copy CPU and device states to the destination. 3. Resume the VM at the destination. 4. Copy memory pages. (Diagram: the VM and its RAM on Machine A; the VM is stopped before moving to Machine B.) 13

Postcopy Live Migration (2): Copy VM memory after relocation (same steps 1-4). The CPU and device states are only about 256 KB without VGA, so the relocation itself takes less than 1 second. 14

Postcopy Live Migration (3): Copy VM memory after relocation (same steps 1-4). The VM is resumed at the destination, Machine B. 15

Postcopy Live Migration (4): Copy VM memory after relocation (same steps 1-4). The VM is now running on Machine B while its memory pages are copied both on demand and in the background. 16
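
The on-demand and background transfer paths mentioned in step 4 can be sketched as follows (a toy model; the page-fault hook, the fetch function, and the batch size are illustrative assumptions, not Yabusame's actual mechanism):

```python
# Toy sketch of postcopy page transfer: the VM is already running at the
# destination, and missing pages arrive either on demand (when the guest
# touches them) or via a background prefetch sweep. Not the real Yabusame
# code; page counts and the fetch function are illustrative.
import random

PAGES = 262144  # 1 GB of 4 KiB pages

class PostcopyDestination:
    def __init__(self, total_pages=PAGES):
        self.present = [False] * total_pages  # pages already transferred
        self.next_prefetch = 0                # cursor for background copying

    def _fetch_from_source(self, page_no):
        # Placeholder for the network transfer of one page from the source.
        self.present[page_no] = True

    def on_guest_access(self, page_no):
        """On-demand path: called when the resumed VM touches a missing page."""
        if not self.present[page_no]:
            self._fetch_from_source(page_no)

    def background_step(self, batch=1024):
        """Background path: proactively pull the next batch of pages."""
        end = min(self.next_prefetch + batch, len(self.present))
        for page_no in range(self.next_prefetch, end):
            if not self.present[page_no]:
                self._fetch_from_source(page_no)
        self.next_prefetch = end
        return self.next_prefetch < len(self.present)  # more work left?

if __name__ == "__main__":
    dst = PostcopyDestination()
    # Interleave guest accesses (on-demand) with background prefetching.
    while dst.background_step():
        dst.on_guest_access(random.randrange(PAGES))
    print("all pages transferred; migration complete")
```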

Demo (Precopy vs. Postcopy): side-by-side demo of a VM migrating with precopy live migration and with postcopy live migration. 17

Problems in Prior Consolidation Studies: All prior studies are based on precopy live migration. To tackle the long migration time, they use load prediction techniques, with a repacking timescale of roughly an hour or a day (e.g., business hours). However, IaaS datacenters allow users to run any kind of workload at any time; precise prediction is difficult because workload-specific algorithms cannot be used, and nobody can predict sudden load changes. 18

Reactive Consolidation: Exploit postcopy live migration to reactively optimize VM locations in response to load changes, with a repacking timescale of about 10 seconds. (Timeline: overload detected; the execution host is switched in about 1 second; the migration completes after T = RAM size / bandwidth.) 19

Packing Algorithm (1): A realistic design, lightweight and near optimal. Define two server types: the Shared Server (with large RAM, expensive) consolidates many idle VMs and is always powered on; a Dedicated Server hosts an actively running VM and is suspended when unused. 20

Packing Algorithm (2): If the Shared Server is overloaded, an active VM pops out of the Shared Server. Pop-out threshold = over 90% CPU usage of the Shared Server. Unused Dedicated Servers are powered off. (Diagram: one always-on Shared Server node and four Dedicated Server nodes, each either powered on (resumed) or powered off (suspended).) 21

Packing Algorithm (3): If a VM becomes idle and the Shared Server has space for it, the VM returns to the Shared Server. Idle threshold = under 50% CPU usage of the Dedicated Server. Unused Dedicated Servers are powered off. (Diagram: same node layout as the previous slide.) 22
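
Slides 20-22 together define a simple two-threshold policy. A minimal runnable sketch of that policy follows (hypothetical code with stand-in Server/VM classes; only the 90% pop-out and 50% idle thresholds come from the slides):

```python
# Sketch of the two-threshold packing policy described above (illustrative,
# not the authors' implementation). Pop an active VM off the Shared Server
# above 90% CPU; pull a VM back when its Dedicated Server drops below 50%
# CPU and the Shared Server has room.
from dataclasses import dataclass, field
from typing import List

POP_OUT_THRESHOLD = 90.0   # % CPU on the Shared Server
IDLE_THRESHOLD = 50.0      # % CPU on a Dedicated Server

@dataclass
class VM:
    name: str
    cpu: float = 0.0       # current CPU demand in %

@dataclass
class Server:
    name: str
    capacity: float = 100.0
    powered_on: bool = True
    vms: List[VM] = field(default_factory=list)

    def cpu_usage(self) -> float:
        return sum(vm.cpu for vm in self.vms)

def migrate(vm: VM, src: Server, dst: Server) -> None:
    # Stand-in for a postcopy live migration (host switch in ~1 second).
    src.vms.remove(vm)
    dst.vms.append(vm)

def repack(shared: Server, dedicated: List[Server]) -> None:
    """One reactive packing step, run whenever new monitoring data arrives."""
    # Rule 1: Shared Server overloaded -> pop out the busiest VM.
    if shared.cpu_usage() > POP_OUT_THRESHOLD and shared.vms:
        vm = max(shared.vms, key=lambda v: v.cpu)
        target = next(s for s in dedicated if not s.vms)
        target.powered_on = True                 # resume the suspended host
        migrate(vm, shared, target)
    # Rule 2: a VM on a Dedicated Server fell idle -> bring it back.
    for server in dedicated:
        if server.vms and server.cpu_usage() < IDLE_THRESHOLD:
            vm = server.vms[0]
            if shared.cpu_usage() + vm.cpu <= shared.capacity:
                migrate(vm, server, shared)
                server.powered_on = False        # suspend when unused

if __name__ == "__main__":
    shared = Server("shared", vms=[VM(f"vm{i}", cpu=5.0) for i in range(6)])
    dedicated = [Server(f"dedicated{i}", powered_on=False) for i in range(4)]
    shared.vms[0].cpu = 80.0                     # VM0 becomes active
    repack(shared, dedicated)
    print([(s.name, [v.name for v in s.vms]) for s in [shared] + dedicated])
```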

Evaluation Experiments: Scenarios: a simple load change scenario (a pure CPU-intensive workload and a memory-intensive workload) and a compound load change scenario. A new benchmark program serves as the metric for performance assurance: it generates a target CPU load with a specified memory update intensity and measures the achieved operations per second. One operation consists of a busy loop (C times) plus memory touches (C * alpha times, where alpha is the memory update intensity), followed by a sleep so that the CPU is busy only up to the target CPU load and idle otherwise. Failed ops = Target ops - Achieved ops. 23
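
A rough re-implementation of the benchmark idea on this slide, just to make the metric concrete (hypothetical constants; only the busy-loop / memory-touch / sleep structure and the failed-ops definition come from the slide):

```python
# Rough sketch of the benchmark described above: each "operation" is a busy
# loop (C iterations) plus C * alpha memory touches, and the generator sleeps
# so that the CPU is busy only for the target load. Failed ops per second are
# target ops minus achieved ops. Constants are illustrative, not the paper's.
import time

C = 20000                               # busy-loop iterations per operation
BUFFER = bytearray(64 * 1024 * 1024)    # memory region to dirty

def one_operation(alpha: float) -> None:
    x = 0
    for i in range(C):                  # CPU part: busy loop, C times
        x += i
    touches = int(C * alpha)            # memory part: C * alpha touches
    for i in range(touches):
        BUFFER[(i * 4096) % len(BUFFER)] = (x + i) & 0xFF

def run_second(target_load: float, alpha: float, target_ops: int) -> int:
    """Run for ~1 s at target_load (0 < load <= 1); return failed ops."""
    achieved = 0
    start = time.monotonic()
    while time.monotonic() - start < 1.0 and achieved < target_ops:
        t0 = time.monotonic()
        one_operation(alpha)
        busy = time.monotonic() - t0
        if target_load < 1.0:
            # Sleep so the duty cycle matches the target CPU load.
            time.sleep(busy * (1.0 - target_load) / target_load)
        achieved += 1
    return max(0, target_ops - achieved)   # Failed ops = Target - Achieved

if __name__ == "__main__":
    print("failed ops this second:",
          run_second(0.5, alpha=0.6, target_ops=50))
```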

Simple Load Change Scenario: 6 VMs; VM0 and VM1 become active. 24

How it basically works: 1. Overloading is detected and locations are optimized: VM0 is moved to a Dedicated Server. 2. The idle state is detected and the location is optimized again: VM0 is moved back to the Shared Server. (Graphs: the Shared Server's and the Dedicated Server's CPU usage (%) over time.) 25

Using postcopy, the overload is removed in less than 10 seconds; using precopy, it is removed in more than 20 seconds. While the overload persists, VM0 cannot get enough CPU resources, which results in failed operations. (Graph: the Shared Server's CPU usage (%).) 26

Failed Operations per Second (red: VM0, green: VM1), using postcopy vs. using precopy. Postcopy reduces the number of failed operations: roughly 20,000 ops with precopy vs. 5,000 ops with postcopy. Note that about 4,000 ops of that is detection overhead (detection takes 5 seconds on average). With postcopy, the performance loss until the migration completes is small, i.e., about 1,000 ops, because this workload is purely CPU intensive. 27

Failed Operations per Second vs. Memory Update Intensity: Using precopy incurs a large performance penalty for memory-intensive workloads; using postcopy greatly reduces the packing overhead. (Graph annotations: detection overhead; 1 GB/s at 100% CPU.) 28

Compound Load Change Scenarios (1): One hour of randomly generated load changes, emulating race-to-halt workloads that are mostly idle but sometimes active. An active VM consumes a random CPU load between 80% and 100%; an idle VM consumes a random CPU load between 0% and 20%. A new state continues for a random duration between 60 and 300 seconds. The memory update intensity is 0.6 (600 MB/s at 100% CPU usage). 29
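
One way such a random load trace could be generated per VM (an illustrative sketch; the 80-100% / 0-20% load ranges, the 60-300 s durations, and the one-hour length come from the slide, the idle/active probability is assumed):

```python
# Sketch of the compound load scenario described above: each VM alternates
# between idle (0-20% CPU) and active (80-100% CPU) states, each lasting a
# random 60-300 seconds, for one hour. Illustrative, not the authors' tool.
import random

SCENARIO_SECONDS = 3600

def generate_trace(seed: int = 0):
    """Return a list of (duration_s, cpu_load_percent) state segments."""
    rng = random.Random(seed)
    trace, elapsed, active = [], 0.0, False
    while elapsed < SCENARIO_SECONDS:
        duration = rng.uniform(60, 300)            # state length in seconds
        load = rng.uniform(80, 100) if active else rng.uniform(0, 20)
        trace.append((min(duration, SCENARIO_SECONDS - elapsed), load))
        elapsed += duration
        # "Mostly idle, sometimes active": the 0.3 probability is assumed.
        active = rng.random() < 0.3
    return trace

if __name__ == "__main__":
    for duration, load in generate_trace(seed=42)[:5]:
        print(f"{duration:6.1f} s at {load:5.1f}% CPU")
```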

Compound Load Change Scenarios (2): By using postcopy migration, failed operations are reduced to about half the level seen with precopy. 30

Compound Load Change Scenarios (3): The number of live migrations during each scenario shows that the consolidation system using postcopy frequently optimizes VM locations. 31

Related Work: Postcopy live migration: SnowFlock (live VM cloning), EuroSys 2009; postcopy live migration for Xen, VEE 2009; postcopy live migration for Qemu/KVM, CCGrid 2010 (by the presenter). VM consolidation systems: black-box vs. gray-box, NSDI 2007 (needs workload-specific information for better prediction); genetic algorithm, DCAI 2009 (quickly finds near-optimal locations; by our group). 32

Conclusion: A reactive VM consolidation system that keeps the performance criteria of VMs as much as possible in IaaS datacenters. It does not rely on load prediction; instead it reactively optimizes VM locations in response to load changes and exploits postcopy live migration for quick load balancing. Evaluation shows better performance assurance than using precopy live migration, especially for memory-intensive workloads. 33

Yabusame (流鏑馬): Postcopy Live Migration for Qemu/KVM, KVM Forum 2011, http://grivon.apgrid.org/. Yabusame (流鏑馬) is a type of mounted archery in traditional Japanese archery: an archer on a running horse shoots three special turnip-headed arrows successively at three wooden targets (from Wikipedia). Photo: Yuki Shimazu, 2011, http://www.flickr.com/photos/shimazu/5631324478/ 34