Flexible Distributed Capacity Allocation and Load Redirect Algorithms for Cloud Systems



Similar documents
A Game Theoretic Formulation of the Service Provisioning Problem in Cloud Systems

Dynamic request management algorithms for Web-based services in Cloud computing

Falloc: Fair Network Bandwidth Allocation in IaaS Datacenters via a Bargaining Game Approach

LPV model identification for power management of Web service systems Mara Tanelli, Danilo Ardagna, Marco Lovera

Run-time Resource Management in SOA Virtualized Environments. Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang

On the Interaction and Competition among Internet Service Providers

An Autonomic Auto-scaling Controller for Cloud Based Applications

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

An Energy-Aware Methodology for Live Placement of Virtual Machines with Variable Profiles in Large Data Centers

Comparison of PBRR Scheduling Algorithm with Round Robin and Heuristic Priority Scheduling Algorithm in Virtual Cloud Environment

Optimizing the Cost for Resource Subscription Policy in IaaS Cloud

Auto-Scaling Model for Cloud Computing System

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing

Multi-layer MPLS Network Design: the Impact of Statistical Multiplexing

Efficient and Robust Allocation Algorithms in Clouds under Memory Constraints

Energy-aware joint management of networks and cloud infrastructures

Group Based Load Balancing Algorithm in Cloud Computing Virtualization

Exploring Resource Provisioning Cost Models in Cloud Computing

OCRP Implementation to Optimize Resource Provisioning Cost in Cloud Computing

1. Simulation of load balancing in a cloud computing environment using OMNET

A Survey on Load Balancing Technique for Resource Scheduling In Cloud

Hierarchical Approach for Green Workload Management in Distributed Data Centers

Managing Overloaded Hosts for Dynamic Consolidation of Virtual Machines in Cloud Data Centers Under Quality of Service Constraints

Infrastructure as a Service (IaaS)

QoS-driven Web Services Selection in Autonomic Grid Environments. Danilo Ardagna Gabriele Giunta Nunzio Ingraffia Raffaela Mirandola Barbara Pernici

A Mathematical Programming Solution to the Mars Express Memory Dumping Problem

SLA-driven Dynamic Resource Provisioning for Service Provider in Cloud Computing

Cloud Computing An Introduction

Capping Server Cost and Energy Use with Hybrid Cloud Computing

International Journal of Advance Research in Computer Science and Management Studies

OPTIMIZED PERFORMANCE EVALUATIONS OF CLOUD COMPUTING SERVERS

Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.

Generalized Nash Equilibria for SaaS/PaaS Clouds

Game Theory Based Iaas Services Composition in Cloud Computing

An Approach to Load Balancing In Cloud Computing

Cost Effective Automated Scaling of Web Applications for Multi Cloud Services

DDS-Enabled Cloud Management Support for Fast Task Offloading

Beyond the Stars: Revisiting Virtual Cluster Embeddings

Cloud Management: Knowing is Half The Battle

Figure 1. The cloud scales: Amazon EC2 growth [2].

Monitoring Performances of Quality of Service in Cloud with System of Systems

Cost-effective Resource Provisioning for MapReduce in a Cloud

Advanced Load Balancing Mechanism on Mixed Batch and Transactional Workloads

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING

On Orchestrating Virtual Network Functions

The Theory And Practice of Testing Software Applications For Cloud Computing. Mark Grechanik University of Illinois at Chicago

Motivated by a problem faced by a large manufacturer of a consumer product, we

SLA-based Admission Control for a Software-as-a-Service Provider in Cloud Computing Environments

CLEVER: a CLoud-Enabled Virtual EnviRonment

Load Balancing in cloud computing

CLOUD computing has emerged as a new paradigm for

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems

Webpage: Volume 3, Issue XI, Nov ISSN

Optimized Scheduling in Real-Time Environments with Column Generation

Sistemi Operativi e Reti. Cloud Computing

Environments, Services and Network Management for Green Clouds

RESOURCE MANAGEMENT IN CLOUD COMPUTING ENVIRONMENT

An Energy-aware Multi-start Local Search Metaheuristic for Scheduling VMs within the OpenNebula Cloud Distribution

Dantzig-Wolfe bound and Dantzig-Wolfe cookbook

Black-box Performance Models for Virtualized Web. Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang

PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM

Green Cloud Computing 班 級 : 資 管 碩 一 組 員 : 黃 宗 緯 朱 雅 甜

A Novel Approach for Efficient Load Balancing in Cloud Computing Environment by Using Partitioning

Power Management in Cloud Computing using Green Algorithm. -Kushal Mehta COP 6087 University of Central Florida

Adaptive Load Control of Service Oriented Architecture Server

Optimization of Communication Systems Lecture 6: Internet TCP Congestion Control

Mobile and Cloud computing and SE

Dynamic Round Robin for Load Balancing in a Cloud Computing

Online Resource Management for Data Center with Energy Capping

Capacity Planning Fundamentals. Support Business Growth with a Better Approach to Scaling Your Data Center

SchedulAir. Airline planning & airline scheduling with Unified Optimization. decisal. Copyright 2014 Decisal Ltd. All rights reserved.

Experiments on cost/power and failure aware scheduling for clouds and grids

Transcription:

Flexible Distributed Capacity Allocation and Load Redirect Algorithms for Cloud Systems Danilo Ardagna 1, Sara Casolari 2, Barbara Panicucci 1 1 Politecnico di Milano,, Italy 2 Universita` di Modena e Reggio Emilia, Dipartimento di Ingegneria dell Informazione, Italy IEEE Cloud 2011, Washington DC, 6 th July 2011

Research scenario 2 Cloud Computing: u On-demand software/hardware delivering on a pay per use basis u End-users obtain the benefits of the infrastructure without the need to implement it directly u Cloud providers maximize the utilization of their physical resources, minimizing energy costs and obtaining economies of scale Major issues: u Development of efficient service provisioning policies u Modern Clouds live in an open world characterized by continuous changes which occur autonomously and unpredictably

Our contribution 3 Workload prediction-based capacity allocation techniques able to coordinate multiple distributed resource controllers Dynamic load redirection mechanism which determines the requests to be redirected during peak loads Requests distribution optimized according to the average response time

Problem statement 4 WS provider perspective offering multiple transactional WSs hosted at multiple sites of an IaaS provider SLA contract, associated with each WS class k specifies the QoS levels: R k R k WSs are deployed in VMs, each VM hosts a single Web service applica4on

Problem statement 5 Mul2ple (homogeneous) VMs implemen4ng the same WS class can run in parallel Services can be located on mul2ple sites IaaS provider charges the WS provider on a hourly basis

Problem statement 6 Capacity Allocation (CA): Determine the optimal number of VMs for each WS class in each IaaS site according to a prediction of the incoming workload (T 1 mid-long time scale) Load Redirection (LR): If a site resources are insufficient, incoming requests redirected to other sites (T 2 << T 1 short-term time scale)

Cloud System Reference Framework 7 IaaS Provider i=1 i=2 Local workload manager Virtualized Servers Flat VMs! "# $% "# & & & 23 4( 7( 23 4 ( 23 5( 7(!)(!)(!)(!"#$%&'()&*+",-().,"$.#( /&#01&#-( 23 6(!)( On demand VMs!!" #$! "" #$%#$! &" $$!!" #$! %" #$&#$! '" $$ Local WS arrival rates i=4! "# $%! &# $%'$%! (# %% i=3! "# $%! &# $%'$%! (# %% Execution rate of local arrivals! "# $%! &# $%'$%! (# %% Redirect rate of local arrivals Local workload manager! "# $%! &# $%'$%! (# %% Local CA and LR manager Virtualized Servers

Prediction model time scales 8

I K C i c i c i N i Problem formulation System parameters 9 Set of sites Set of WS classes VM instances capacity at site i Time unit cost for flat VMs at site i Time unit cost for on demand VMs at site i Number of flat VMs available at site i µ k Maximum service rate of a capacity 1 VM for executing WS class k requests d i,j,i= j Delay (s) for requests redirecting from site i to site j g i,j = 1 d,i= j Conductance of the communication link (i,j) i,j G i = g i,j,i= j Equivalent conductance seen from site i to the other sites j

Problem formulation Decision variables 10 Nk i Mk i x i k zk i Number of flat VMs allocated for class k request at site i Number of on demand VMs allocated for class k request at site i Execution rate of local arrivals for WS class k request at site i Redirect of WS class k request at site i toward other sites

Design assumptions 11 Each WS class hosted in a VM is modeled as an M/G/1-PS queue The fraction of workload redirected to other sites is inversely proportional to the network delay/directly proportional to the conductance g i,j = 1/d i,j The overall load at site i due to the redirect of other sites is given by: g j,i z j k G j j I,j=i The total rate of class k requests executed at site i is given by: x i k + j=i g j,i z j k G j

Prediction models 12 Exponential Smoothing (ES) models to predict the local arrival rate Λ i k Simple model motivated by the application context: Short-time predictions suitable for real-time autonomic decisions We consider a version of ES, where parameters are dynamically chosen: Λ i k(t + T 1 )=γk(t) i Λ i k(t)+(1 γk(t))λ i i k(t), t > T 1 Λ i k(t 1 )= 1 T 1 T 1 t=1 Λ i k(t)

Prediction models 13 Dynamic ES model by re-evaluating the smoothing factor γ i k (t) at each prediction sample t We use the Trigg and Leach procedure: γ i k (t) = Ai k (t) E i k (t) A i k(t) =φ i k(t)+(1 φ)a i k(t T 1 ) E i k(t) =φ i k(t) +(1 φ)e i k(t T 1 )

Capacity Allocation problem 14 The goal of CA problem is to: u Minimize the overall costs for flat and on demand VM instances of multiple distributed IaaS sites u Guaranteeing that the average response time of each class is lower than the SLA threshold u Solved every T 1 time instant WS requests average response time: R i k = 1 C i µ k Λ i k N i k +Mi k R k = i Λ i k Ri k Λ j k j

Capacity Allocation problem min N i k,m i k k c i Nk i + ci Mk i i 15 i Λ i k <C i µ k (Nk i + Mk) i Λ i k (N k i + M k i) C i µ k (Nk i + M k i) Λ R k Λ j i k k j Nk i N i, i I k K k K, i I k K

Load Redirect problem 16 The goal of LR problem is to: u Cooperatively minimize request average response times u Avoid episodic local congestions due to the variability of the incoming workload u Solved every T 2 time instant WS requests average response time: R i k = C i µ k 1 x i k + j=i g j,i z j k G j N i k +M i k R i k = R i k + j=i x i k + j=i z j k G j g j,i z j k G j

Load Redirect problem min x i k,zi k (N k i + M k i) x i k + j=i k i C i µ k (Nk i + M k i) (xi k + j=i g j,i z j k G j g j,i z j k G j ) + j=i 17 z j k G j x i k + z i k = Λ i k k K, i I x i k + j=i g j,i z j k G j <C i µ k (N i k + M i k) k K, i I x i k,z i k 0 k K, i I Distributed decomposable solution relying on Lagrangian techniques

Load Redirect problem decomposition min x i,y i,z i,w i i (N i + M i ) x i + y i C i µ (N i + M i ) (x i + y i ) + wi 18 x i + z i = Λ i x i + y i <C i µ (N i + M i ) y i = j=i g j,i z j G j i I i I i I w i = j=i z j G j i I x i,y i,z i,w i 0 i I

Duality Theory Primal problem D* Dual problem P* D* P* Lagrangian relaxation (LB) Any feasible solution P* (UB) Strong duality

Load Redirect problem Lagrangian relaxation min x i,y i,z i,w i i (N i + M i ) x i + y i C i µ (N i + M i ) (x i + y i ) + wi + +Θ i y i j=i g j,i z j G j +ηi w i j=i z j G j x i + z i = Λ i x i + y i <C i µ (N i + M i ) x i,y i,z i,w i 0 i I i I i I The relaxed problem further separates into I sub-problems

Dual decomposition For a given set of Θ i s and η i s defines the dual function L(Θ, η) and the dual problem is then given by: max Θ,η L(Θ, η) The dual problem can be solved by using a sub-gradient method: Θ i (t + 1) = Θ i (t)+α t y i g j,i z j G j j=i η i (t + 1) = η i (t)+β t w i j=i z j G j

Experimental results Scalability analysis 22 Large set of randomly generated instances: u I has been varied between 20 and 60 u K has been varied between 100 and 1000 Average execution time required to solve instances of maximum size is lower than 3 minutes and one minute for the CA and LR problems, respectively

Experimental results Comparison with alternative methods 23 Heuristic 1: u The CA is performed every 5 minutes and the number of VMs is determined according to utilization thresholds u The number of VMs is determined such that the utilization of the VMs is equal to a given threshold τ 1 u VM provisioning is further triggered if the prediction of the VMs utilization is higher than a second threshold τ 2 > τ 1 u Multiple analyses have been performed by adopting different thresholds: (τ 1, τ 2 ) = (40%, 50%), (50%, 60%), and (60%, 80%)

Experimental results Comparison with alternative methods 24 Heuristic 2: Same as Heuristic 1 but the number of VMs is determined by optimally solving our CA problem every 5 minutes Heuristic 3: Same as Heuristic 2 but with a 10 minutes time horizon

Experimental results Comparison with alternative methods 25 Local incoming workload has been obtained from the traces of a very large dynamic Web-based system: u Normal day scenario: It describes the baseline workload (bimodal requests profile) u Heavy day scenario: It exhibits a 40% increment in the number of the client requests with respect to the baseline u Noisy day scenario: It is characterized by the same request profile belonging to the heavy day scenario with an additional noise component

26 Experimental results Comparison with alternative methods

Experimental results Comparison with alternative methods 27

Experimental results Validation on Amazon EC2 28

Conclusions and future work 29 Prediction-based distributed CA and LR algorithms for IaaS cloud system minimizing the cost of the running VMs Experimental results shown that our solutions significantly improve other heuristics Future work will extend the validation of our solution considering a larger experimental setup

Thank you! 30 Questions? Danilo Ardagna Politecnico di Milano,, Italy ardagna@elet.polimi.it