Why VM Autoconfiguration?

Similar documents
VCONF: A Reinforcement Learning Approach to Virtual Machine Auto-configuration

URL: A Unified Reinforcement Learning Approach for Autonomic Cloud Management

Coordinated Self-configuration of Virtual Machines and Appliances using A Model-free Learning Approach

Self-Adaptive Provisioning of Virtualized Resources in Cloud Computing

A Distributed Self-learning Approach for Elastic Provisioning of Virtualized Cloud Resources

Figure 1: Cost and Speed of Access of different storage components. Page 30

Cloud Computing. Mother Nature in Tear. Sustainability: we care, but. IT Liability. Sustainable Computing in Cloud. Sustainability: Culture Issues

A Reinforcement Learning Approach to Online Web Systems Auto-configuration

Performance Management for Cloudbased STC 2012

Resource Provisioning of Web Applications in Heterogeneous Cloud. Jiang Dejun Supervisor: Guillaume Pierre

Big Data in the Background: Maximizing Productivity while Minimizing Virtual Machine Interference

Data Center Network Minerals Tutorial

On the Use of Fuzzy Modeling in Virtualized Data Center Management Jing Xu, Ming Zhao, José A. B. Fortes, Robert Carpenter*, Mazin Yousif*

Resource Provisioning of Web Applications in Heterogeneous Clouds

Autonomic resource management for the Xen Hypervisor

Performance In the Cloud. White paper

<Insert Picture Here> Introducing Oracle VM: Oracle s Virtualization Product Strategy

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

Comparison of the Three CPU Schedulers in Xen

Database Systems on Virtual Machines: How Much do You Lose?

IOS110. Virtualization 5/27/2014 1

Advanced Load Balancing Mechanism on Mixed Batch and Transactional Workloads

Performance Management for Cloud-based Applications STC 2012

Multiple Virtual Machines Resource Scheduling for Cloud Computing

This is an author-deposited version published in : Eprints ID : 12902

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

Figure 1. The cloud scales: Amazon EC2 growth [2].

USING VIRTUAL MACHINE REPLICATION FOR DYNAMIC CONFIGURATION OF MULTI-TIER INTERNET SERVICES

Data Centers and Cloud Computing

Performance Testing of a Cloud Service

Improving the performance of data servers on multicore architectures. Fabien Gaud

MS EXCHANGE SERVER ACCELERATION IN VMWARE ENVIRONMENTS WITH SANRAD VXL

PARALLELS CLOUD SERVER

Dynamic Resource allocation in Cloud

Solving the Hypervisor Network I/O Bottleneck Solarflare Virtualization Acceleration

Group Based Load Balancing Algorithm in Cloud Computing Virtualization

IMCM: A Flexible Fine-Grained Adaptive Framework for Parallel Mobile Hybrid Cloud Applications

Power and Performance Modeling in a Virtualized Server System

Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team

Express5800 Scalable Enterprise Server Reference Architecture. For NEC PCIe SSD Appliance for Microsoft SQL Server

Achieving a High-Performance Virtual Network Infrastructure with PLUMgrid IO Visor & Mellanox ConnectX -3 Pro

Virtualization of the MS Exchange Server Environment

Bernie Velivis President, Performax Inc

CloudSim: A Toolkit for Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms

Modeling Virtual Machine Performance: Challenges and Approaches

Avoid Paying The Virtualization Tax: Deploying Virtualized BI 4.0 The Right Way. Ashish C. Morzaria, SAP

SAFE HARBOR STATEMENT

Virtualization Performance Insights from TPC-VMS

Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM

Scaling in a Hypervisor Environment

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS

S 2 -RAID: A New RAID Architecture for Fast Data Recovery

BLACKBOARD LEARN TM AND VIRTUALIZATION Anand Gopinath, Software Performance Engineer, Blackboard Inc. Nakisa Shafiee, Senior Software Performance

Resource Availability Based Performance Benchmarking of Virtual Machine Migrations

Diablo and VMware TM powering SQL Server TM in Virtual SAN TM. A Diablo Technologies Whitepaper. May 2015

Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820

Network Infrastructure Services CS848 Project

Outdated Architectures Are Holding Back the Cloud

Oracle Database 12c: Performance Management and Tuning NEW

Towards Autonomic Grid Data Management with Virtualized Distributed File Systems

MS Exchange Server Acceleration

Informatica Data Director Performance

Delivering SDS simplicity and extreme performance

Evaluating and Comparing the Impact of Software Faults on Web Servers

Characterize Performance in Horizon 6

IaaS Multi Tier Applications - Problem Statement & Review

How To Manage Cloud Service Provisioning And Maintenance

Achieving QoS in Server Virtualization

Kronos Workforce Central on VMware Virtual Infrastructure

Dynamic Resource Allocation in Software Defined and Virtual Networks: A Comparative Analysis

Virtualization: Concepts, Applications, and Performance Modeling

Intelligent Management of Virtualized Resources for Database Systems in Cloud Environment

Method of Fault Detection in Cloud Computing Systems

INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD

Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures

The best platform for building cloud infrastructures. Ralf von Gunten Sr. Systems Engineer VMware

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Oracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011

Capacity Management for Oracle Database Machine Exadata v2

Cloud Computing through Virtualization and HPC technologies

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

Table of Contents. P a g e 2

Application of Predictive Analytics for Better Alignment of Business and IT

Network performance in virtual infrastructures

Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load

Tableau Server 7.0 scalability

Data center modeling, and energy efficient server management

GRIDCENTRIC VMS TECHNOLOGY VDI PERFORMANCE STUDY

Sizing guide for SAP and VMware ESX Server running on HP ProLiant x86-64 platforms

Capacity Planning for Green IT Green IT Saving Costs Locally, Natural Resources Globally

Microsoft Office SharePoint Server 2007 Performance on VMware vsphere 4.1

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

RESOURCE MANAGEMENT IN CLOUD COMPUTING ENVIRONMENT

Run-time Resource Management in SOA Virtualized Environments. Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang

7 Real Benefits of a Virtual Infrastructure

Newsletter 4/2013 Oktober

Cutting Costs with Red Hat Enterprise Virtualization. Chuck Dubuque Product Marketing Manager, Red Hat June 24, 2010

Solid State Architectures in the Modern Data Center

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 11 th, 2014 NEC Corporation

Transcription:

A Reinforcement Learning Approach to Virtual Machines Auto-configuration Jia Rao, Xiangping Bu, Cheng-Zhong Xu Le Yi Wang and George Yin Wayne State t University it Detroit, Michigan http://www.cic.eng.wayne.edu Why VM Autoconfiguration? Server consolidation is a primary usage of virtualization to reduce TCO In cloud systems, virtual machines need to be configured on-demand, in real time VMs need to be reconfigured dynamically Created from template Migrated to a new host Resource demands/supplies vary with 2 1

Challenges in Online Autoconfig A rich set of configurable parameters CPU, memory, I/O bandwidth, etc Heterogonous applications in the same physical platform Hungry for different types of resources Performance interference between VMs Centralized virtualization layer Delayed effect of reconfiguration Scale and real-time requirements make it even harder 3 Performance Interference Balanced configurations WLoad weight vcpu mem TPC-W 256 2 0.5GB TPCC SPEC web TPC-C 256 1 1.5GB SPECweb 512 2 0.5GB APP Xen 3.1 DB Intel server 2-socket, 4-core, 8GB 4 2

Performance Interference (cont ) No ormalized perform mance Balanced config-1 config-2 config-3 Config-1: move 1G mem from TPCC to ICAC'09 VM Autoconfiguration 5 Perf Interference (cont ) No ormalized perfor rmance Base workload-1 workload-2 workload-3 Load-1: Input of TPC-W from browsing-mix to ordering-mix 6 3

Delayed Effect of Reconfiguration Response Time (sec) Reconfig takes effect after certain delays TPCC benchmark on two VMs of same conf, except mem TPC-C TPC-C reconfig Time (min) stabilize 7 Problem Statement For VMs to be run on the same physical platform, automate the resource (re)configuration process: Optimize system-wide perf (utility func) under individual SLA constraints. SLA w. r.t. throughput or response time Multi-resources CPU time (weighted scheduling in Xen), #virtual CPUs, Memory Work-conserving in resource allocation 8 4

Related Work [Wildstrom08] Regression based value est for memory Single memory resource, supervised learning, not adaptive [Soror08] Greedy search based config enumeration for database workloads Single CPU resource, assumes independent calibration of different resources [Padala07, Padala09] Control theory based alloc of resources (CPU, i/o bw) Assumes no interference between VMs due to the use of non-work-conserving mode Delayed effect of memory config is not considered 9 Our Contribution A reinforcement learning approach for online auto-configuration of multiple resources (includes memory) Consideration of VM interferences in workconserving mode Consideration of delayed effect in resource allocation Prototyped in a VCONF framework and tested in real world applications 10 5

Reinforcement Learning Learning by interaction with env State: configuration of VMs (cpu, mem, time, etc) Action: reconfiguration (increase/decrease/nop of resrc) Immediate reward: w.r.t. response time or throughput Learning Objective For a given state, find an action policy that would maximize long-run return C. Xu @ Wayne State Autonomic Cloud Management 11 Reinforcement Learning (cont ) An optimal policy π * is to select the action a in each state s that maximizes m cumulative reward r Q π* (s t, a t )=r 0 +γr 1 +γ 2 r 2 + (0 γ<1) An RL solution is to obtain good estimations of Q(s t,a t ) based on interactions: (s t, a t, r t+1 ) Q(s t,a t ) of each state-action action pair is updated each time an interaction finishes: Q(s t, a t )=Q(s t, a t )+α[r t+1 +γ*q(s t+1, a t+1 )-Q(s t, a t )] 12 6

RL for Autoconfiguration State space: (mem 1, weight 1, vcpu 1,,mem n, weight n, vcpu n ) Action set: Inc, Dec, and Nop on each resource Rewards: summarized perf over hosting applications Score each VM based on normalized perf n n > = = w score if for all score reward i i * i i 0; 1 1 otherwise thrpt score = penalty ref _ thrpt 0, if resp SLA; penalty = resp, if resp > SLA. SLA VCONF Architecture VCONF resides in dom0 Observes current state, makes decision based on Q(s, a) table Monitors VM perf and calculate rewards Updates corresponding entry in Q(s, a) table Valid actions VCPU weight MEM VCONF score 1 score n VM 1. VM n Xen Actual Hardware 14 reconfigu uration 7

Adaptability and Scalability Trivial implementation would lead to poor adaptability and scalability Adaptability Revise existing policy when environment changes Poor adaptability due to slow start Scalability The size of the Q(s,a) table grows exponentially with the state variables Model-based RL and Function Approx Build env models from collected traces (s t, a t ) r t Batch update Q(s, a) using simulated interactions from the models Continuously update the models with new traces Model-based RL is more data efficient Model reuse when similar resource demands detected Replace look-up table based Q with neural network based function approximation 8

Experimental Results Settings SPECweb, TPC-C, C TPC-W as applications Xen vm ver3.1 on 2-socket quad-core CPU Controlled environment APP DB APP A single instance of TPC-W DB Two instances of TPC-W Only CPU related resource considered TPCC APP1 DB1 APP In-housr Cloud testbed Three heterogeneous applications SPEC APP2 DB2 web DB All the three resources were considered Single Application Auto-configuration Optimizing application throughput Workload changes Throughput TPC-W Time (mins) 18 9

System-Wide Perf Optimization Two TPC-W apps run concurrently Auto-configuration Browsing Browsing Browsing Ordering Browsing Shopping Manual tune-up in trial-and-error 19 VM Auto-configuration in Clouds Heterogeneous VMs Large problem size More VMs, more resources considered Workload-1 600 browsing TPC-W TPC-C SPECweb 50 warehouses, 10 terminals 800 banking Workload-2 600 50 warehouses, 10 800 banking ordering terminals Workload-3 600 browsing Workload-4 600 browsing 50 warehouses, 1 terminals 50 warehouses, 10 terminals 800 banking 200 banking 10

Coordinated Auto-Reconfiguration 80% to 100% of max perf 21 Conclusion VCONF shows the applicability of RL algorithms in VM auto-configuration fg RL-based agent is able to obtain optimal (near-optimal) policies in small scale problems In clouds with a large problem size, model-based RL shows better adaptability and scalability Future work Consider more resources, such as I/O, shared cache Integrate t migration as an additional dimension n in the RL framework Auto-configuration of appliances in clouds [see ICDCS 09] 22 11

Thank you! Questions? Cloud and Internet Computing Laboratory Wayne State University, Michigan http:/cic.eng.wayne.edu Cheng-Zhong Xu, czxu@wayne.edu 23 12