Optimization of IaaS Cloud including Performance, Availability, Power Analysis Networking 2014 Trondheim, Norway


Optimization of IaaS Cloud including Performance, Availability, Power Analysis
Networking 2014, Trondheim, Norway, June 2, 2014
Prof. Kishor S. Trivedi
Duke High Availability Assurance Lab (DHAAL)
Department of Electrical and Computer Engineering
Duke University, Durham, NC 27708-0291
Phone: (919) 660-5269
E-mail: ktrivedi@duke.edu
URL: www.ee.duke.edu/~ktrivedi

Duke University
[Map: Research Triangle Park (RTP), North Carolina, USA; Duke, UNC-CH, NC State]

DHAAL Research Triangle
- Theory: stochastic modeling methods & numerical solution methods: large fault trees, stochastic Petri nets, large/stiff Markov & non-Markov models, fluid stochastic Petri nets, performability & Markov reward models, software aging and rejuvenation, attack countermeasure trees
- Books: Blue, Red, White
- Software packages: HARP (NASA), SAVE (IBM), IRAP (Boeing), SHARPE, SPNP, SREPT
- Applications: reliability/availability/performance of avionics, space, power systems, transportation systems, automobile systems, computer systems (hardware/software), telco systems, computer networks, virtualized data center, cloud computing

Books Authored by Trivedi
- Probability and Statistics with Reliability, Queuing, and Computer Science Applications, first edition, Prentice-Hall, 1982; second edition, John Wiley, 2001 (Bluebook)
- Performance and Reliability Analysis of Computer Systems: An Example-Based Approach Using the SHARPE Software Package, Kluwer, 1996 (Redbook)
- Queuing Networks and Markov Chains, John Wiley, first edition, 1996; second edition, 2006 (White book)

Talk outline
- Overview of Cloud Computing
- Cloud Capacity Planning
- Availability Model for IaaS Cloud
- Performance Model for IaaS Cloud
- Power Model for IaaS Cloud

An Overview of Cloud Computing

Key characteristics
- On-demand self-service: provisioning of computing capabilities without human intervention
- Resource pooling: shared physical and virtualized environment
- Rapid elasticity: through standardization and automation, quick scaling
- Metered service: pay-as-you-go model of computing
Many of these characteristics are borrowed from Cloud's predecessors!
Source: P. Mell and T. Grance, The NIST Definition of Cloud Computing, October 7, 2009

Evolution of cloud computing
Time line of evolution:
- Early 80s: Cluster computing
- Early 90s: Grid computing
- Around 2000: Utility computing
- Around 2005-06: Cloud computing
Source: http://seekingalpha.com/article/167764-tipping-point-gartner-annoints-cloud-computing-top-strategic-technology

Cloud Service models
- Infrastructure-as-a-Service (IaaS) Cloud. Examples: Amazon EC2
- Platform-as-a-Service (PaaS) Cloud. Examples: Microsoft Windows Azure, Google AppEngine
- Software-as-a-Service (SaaS) Cloud. Examples: Gmail, Google Docs

Deployment models
- Private Cloud: cloud infrastructure solely for an organization; managed by the organization or a third party; may exist on premise or off-premise
- Public Cloud: cloud infrastructure available for use by general users; owned by an organization providing cloud services
- Hybrid Cloud: composition of two or more clouds (private or public)

Stochastic Model Driven Capacity Planning for an IaaS Cloud, R. Ghosh, F. Longo, R. Xia, V. Naik, and K. Trivedi, IEEE Trans. on Services Computing, 2014 (to appear)

SLA driven capacity planning
- What is the optimal #PMs so that total cost is minimized?
- Large sized cloud, large # configurations to search

Capacity Planning Problem
Determine the number of physical machines that minimizes the overall cost

Duke/IBM project on cloud computing
Joint work with:
- Rahul Ghosh, Ruofan Xia and Dong Seong Kim (Duke)
- Francesco Longo (Univ. of Messina)
- Vijay Naik, Murthy Devarakonda and Daniel Dias (IBM T. J. Watson Research Center)

Cost components
Capital Expenditure (CapEx):
- Infrastructure cost
Operational Expenditure (OpEx):
- Penalty due to violation of different SLA metrics
- Cost of job rejection due to insufficient resources
- Cost of downtime
- Cost of carrying out repairs
- Power usage cost

Three Pools of Servers (PMs)
To reduce power usage costs, physical machines are divided into three pools [IBM Research Cloud]:
- Hot pool (high performance & high power usage)
- Warm pool (medium performance & power usage)
- Cold pool (lowest performance & power usage)

System Operation Details
- Failure/Repair (Availability): Servers may fail and get repaired. A minimum number of operational hot servers is required for the system to function. Servers in other pools may be temporarily assigned to the hot pool to maintain system operation (migration).
- Job Arrival/Service (Performance): New jobs may be rejected if the existing workload is heavy and all resources are occupied.
- Server operation consumes power depending on the server status (i.e., the pool it is in and the number of active VMs).

Optimization Problem
- Determine the number of PMs in each pool, (n_h, n_w, n_c), so as to minimize CapEx(n_h, n_w, n_c) + OpEx(n_h, n_w, n_c)
- n_h, n_w, n_c: number of servers in the hot, warm, cold pool
- CapEx function can be easily determined
- OpEx for each vector of PMs in each pool needs to be computed
- For a search-based optimization algorithm, this OpEx computation needs to be done many times
- We need an efficient algorithm for doing this
- Scalable models are developed for such a computation

High level view of developed models for OpEx
[Diagram: OpEx is composed of repair cost, downtime cost, power & cooling cost and job rejection cost, computed from the availability model and the performance model]
Model inputs:
- Mean time to failure/repair of servers in the three pools; unit downtime cost; unit repair cost
- Number of servers in the three pools; system operation period length
- External job arrival rate; mean job execution time; power consumption of a given server
- Unit rejection cost; unit power cost

Cost Components Part I
Infrastructure cost:
- C_f: cost of each server
- n_h, n_w, n_c: number of servers in the hot, warm, cold pool
- n_s: the number of servers on a chassis
- C_f: the cost of a server chassis
Rejection cost:
- ρ_reject: task rejection rate from the performance model
- C_t: cost of each task rejection
- L: length of the operation period

Cost Components Part II
Repair cost:
- r_h, r_w, r_c: mean number of repairs per time unit in the hot, warm and cold pool respectively
- C_r: cost of each repair
- L: length of time of operation
Downtime cost:
- C_d: revenue loss per time unit because of downtime
- DT: total steady state downtime in minutes for the Cloud during operational period L (hr) (from availability model)
- DT_t: threshold on downtime beyond which downtime cost is incurred

Cost Components Part III
Power and cooling cost:
- p_h(x), p_w(y), p_c(z): the probability that there are x, y, z servers in the hot, warm, cold pool respectively. Computed from the availability model.
- W(x, y, z): the power consumption when there are x, y, z servers running in the hot, warm, cold pool respectively. Computed from the performance model.
- C_p: cost per unit of power consumption
- L: length of the operation period
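The slide's formula did not survive transcription, so the following is only a sketch of how the quantities listed above could be combined, not the authors' expression. It assumes the three pool-size distributions are independent (so their probabilities multiply) and that the cost is the unit cost times the period length times the expected power; the function name, the toy power function and all numbers are hypothetical.

```python
from itertools import product


def expected_power_cost(p_hot, p_warm, p_cold, power, unit_cost, period):
    """Expected power and cooling cost over the operation period.

    p_hot[x], p_warm[y], p_cold[z]: probability of x, y, z servers being up
    in each pool (would come from the availability model).
    power(x, y, z): power drawn in that configuration (would come from the
    performance model).
    ASSUMPTION: the pool distributions are independent, so they multiply.
    """
    mean_power = sum(
        p_hot[x] * p_warm[y] * p_cold[z] * power(x, y, z)
        for x, y, z in product(range(len(p_hot)),
                               range(len(p_warm)),
                               range(len(p_cold)))
    )
    return unit_cost * period * mean_power


def toy_power(x, y, z):
    """Made-up linear power model, for illustration only."""
    return 300.0 * x + 150.0 * y + 50.0 * z


# Hypothetical inputs: up to 2 hot servers, 1 warm server, 1 cold server.
p_hot = [0.01, 0.09, 0.90]
p_warm = [0.05, 0.95]
p_cold = [0.02, 0.98]
cost = expected_power_cost(p_hot, p_warm, p_cold, toy_power,
                           unit_cost=0.0001, period=8760.0)
```

In a real study `power` would be replaced by the output of the performance model for each availability state, which is why an efficient performance solution matters.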

Power and cooling cost
Overall power consumption and cooling cost as expected steady state reward rates
[Diagram: each state (x, y, z) of the availability model is assigned the reward rate W(x, y, z) obtained from the performance model]

Optimization Problems and Solution Approach
- The problems are nonlinear and (in general) non-convex.
- We use Simulated Annealing, but other search algorithms can be used as well.
- For each vector of values of the number of servers in each pool, we need an efficient method of computing the job rejection probability, downtime and the power usage cost.
- Hence scalable availability, performance and power models are needed.
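As an illustration of the search described above, here is a minimal simulated annealing sketch over the vector (n_h, n_w, n_c). The cost function is a made-up stand-in for CapEx + OpEx, and the neighbourhood move, cooling schedule and all parameter values are assumptions rather than the settings used in the talk.

```python
import math
import random


def total_cost(n_hot, n_warm, n_cold):
    """Hypothetical stand-in for CapEx + OpEx; NOT the talk's cost model."""
    capex = 10.0 * (n_hot + n_warm + n_cold)
    capacity = n_hot + 0.8 * n_warm + 0.5 * n_cold
    opex = 2000.0 / (1.0 + capacity) + 3.0 * n_hot + 2.0 * n_warm + 1.0 * n_cold
    return capex + opex


def simulated_annealing(cost, start, max_pms=50, steps=5000,
                        t0=100.0, alpha=0.999, seed=0):
    rng = random.Random(seed)
    current, current_cost = start, cost(*start)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        # Neighbour: add or remove one PM in a randomly chosen pool.
        cand = list(current)
        pool = rng.randrange(3)
        cand[pool] = min(max_pms, max(0, cand[pool] + rng.choice((-1, 1))))
        cand = tuple(cand)
        cand_cost = cost(*cand)
        delta = cand_cost - current_cost
        # Accept improvements always; accept worse moves with
        # probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= alpha  # geometric cooling
    return best, best_cost


best, best_cost = simulated_annealing(total_cost, start=(10, 10, 10))
```

Each call to `cost` corresponds to one OpEx evaluation, so the number of annealing steps directly multiplies the model solution time.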

Sample Results
Optimal configurations in different problem instances

Comparison with intuition based approach
- Consider a case where OpEx involves only power consumption and cooling costs
- An example of intuition based capacity planning in such a scenario: more PMs in the hot pool means higher power cost; more PMs in the cold pool means lower power cost
- In our previous paper (DSN workshop DCDV 2011), we showed that such an intuition based approach does not always hold true
- When the workload arrival rate is high, PMs in the cold pool will act as PMs in the hot pool
- Cold pool power consumption will then be almost the same as that of the hot pool

How do we develop scalable performance, availability and power models to compute OpEx?

Our goals in the IBM Cloud project
Develop a comprehensive analytic modeling approach:
- High fidelity
- Scalable and tractable

Our approach
- Monolithic analytic (Markov) models will not work, as they suffer from largeness and hence are not scalable
- Our approach: the overall system model is decomposed into a set of sub-models
- Sub-model solutions are composed via an interacting Markov chain approach
- The resulting fixed-point problem is solved via successive substitution
- Scalable and tractable

Scalable Analytic Model for IaaS Cloud Availability and Downtime [paper in IEEE Trans. on Cloud Computing 2014]

Analytic model
- Markov model (CTMC) is too large to construct by hand: the number of PMs in each pool can be large, and PMs can migrate among pools
- We use a high level formalism of stochastic Petri net (the flavor known as stochastic reward net (SRN))
- SRN models can be automatically converted into the underlying Markov (reward) model and solved for the measures of interest such as DT (downtime)
- For a very large number of PMs even decomposed models are not enough; we resort to discrete-event simulation; the same SRN model can be simulated via our software package (SPNP)

Monolithic SRN Model

Monolithic Model
- Monolithic SRN model is automatically translated into a CTMC or Markov Reward Model
- However, the model is not scalable, as the state-space size of this model is extremely large

#PMs per pool   #states           #non-zero matrix entries
3               10,272            59,560
4               67,075            453,970
5               334,948           2,526,920
6               1,371,436         11,220,964
7               4,816,252         41,980,324
8               Memory overflow   Memory overflow
10              -                 -

Decompose into Interacting Sub-models
- SRN sub-model for cold pool
- SRN sub-model for warm pool
- SRN sub-model for hot pool

Import graph and model outputs
Model outputs:
- Mean number of PMs in each pool (E[#P_h], E[#P_w], and E[#P_c])
- Downtime in minutes per year

Many questions
- Existence of fixed point (easy)
- Uniqueness
- Rate of convergence
- Accuracy
- Scalability

Monolithic vs. interacting sub-models
#states, #non-zero entries

Monolithic vs. interacting sub-models
Downtime [minutes per year]
- k is the #PMs required in the hot pool to have the Cloud available
- Results differ only after the 8th significant figure (not reported in table)

Interacting sub-models vs. simulation
Mean number of non-failed PMs
- n is the initial #PMs in each pool
- The numeric solution of the interacting sub-models lies within the confidence interval (c.i.) of the simulation solution

Analytic-Numeric vs. simulative solutions
Downtime [minutes per year]

Analytic-Numeric vs. simulative solution
Solution times [seconds]

Availability Model Summary

Performance Modeling and Analysis for IaaS Cloud [paper in Proc. IEEE PRDC 2010; FGCS 2013]

System model
Current assumptions [will be relaxed soon]:
- Homogeneous requests
- All physical machines (PMs) are identical

Life-cycle of a job inside an IaaS cloud
[Diagram: Arrival, Queuing, Provisioning Decision (Resource Provisioning Decision Engine), Instantiation (VM deployment), Actual Service (Run-time Execution), Out; the provisioning response delay is marked; jobs can be rejected due to buffer full or due to insufficient capacity]
Provisioning and servicing steps: (i) resource provisioning decision, (ii) VM provisioning and (iii) run-time execution

Resource provisioning decision engine (RPDE)
[Diagram: job life-cycle from arrival and queuing through provisioning decision, VM deployment and run-time execution, with job rejection due to buffer full or insufficient capacity]

Resource provisioning decision engine (RPDE)
Flow-chart:

CTMC model for RPDE
State (i, s): i = number of jobs in queue, s = pool (hot, warm or cold)
[State-transition diagram: states (0,0) and (i,h), (i,w), (i,c) for i = 0, ..., N-1; transitions labelled with rates such as δ_h P_h, δ_h (1 - P_h), δ_w P_w, δ_w (1 - P_w), δ_c P_c and δ_c (1 - P_c)]

Generator Matrix of the RPDE model
[Generator matrix over the states (0,0), (0,h), (0,w), (0,c), ..., (N-1,h), (N-1,w), (N-1,c)]
The generator matrix possesses significant structure and may be solved through the matrix geometric method.

Closed form solution of RPDE sub-model
Let W, X, Y and Z be defined in terms of δ_h, δ_w, δ_c and (1 - P_h), (1 - P_w), (1 - P_c) [equations]
It can be shown: [equations]

Closed form solution of RPDE sub-model

Closed form solution of RPDE sub-model
Similarly, other state probabilities can be derived in terms of π(0,0), where: [equations]
Finally, normalization is provided by: [equation]

RPDE model: parameters & measures
Input parameters:
- Arrival rate: data collected from cloud
- 1/δ_h, 1/δ_w, 1/δ_c: mean search delays for the resource provisioning decision engine; from searching algorithms or measurements
- P_h, P_w, P_c: probability of being able to provision; computed from the VM provisioning model
- N: maximum # jobs in RPDE; from system/server specification
Output measures:
- Job rejection probability due to buffer full (P_block)
- Job rejection probability due to insufficient capacity (P_drop)
- The sum of the above two is the overall rejection probability (ρ_reject)
- Mean decision delay for an accepted job (E[T_decision])
- Mean queuing delay for an accepted job (E[T_q_dec])

VM provisioning
[Diagram: job life-cycle from arrival and queuing through provisioning decision, VM deployment and run-time execution, with job rejection due to buffer full or insufficient capacity]

VM provisioning model
[Diagram: the Resource Provisioning Decision Engine forwards accepted jobs to PMs in the hot pool, warm pool and cold pool; each PM shows running VMs and idle resources on the hot, warm or cold machine; completed jobs leave via service out]

VM provisioning model for each hot PM
State (i, j, k): i = number of jobs in the queue, j = number of VMs being provisioned, k = number of VMs running
L_h is the buffer size and m is the max. # VMs that can run simultaneously on a PM
[State-transition diagram: states (0,0,0), (0,1,0), ..., (L_h,1,m-1), (L_h,0,m); transitions labelled with rates β_h and µ, 2µ, ..., (m-1)µ, mµ]

Generator Matrix of the hot PM model
[Generator matrix over the states (i, j, k), with entries built from β_h and µ, 2µ, 3µ]
The generator matrix when L_h = 3 and m = 3. It possesses a block structure that facilitates a matrix geometric solution.

VM provisioning model (for each hot PM)
Input parameters:
- $\lambda_h = \lambda (1 - P_{block}) / n_h$, where $\lambda$ is the job arrival rate
- $1/\beta_h$: can be measured experimentally
- $1/\mu$: obtained from the lower level run-time model
- $P_{block}$: obtained from the resource provisioning decision model
Hot pool model is the set of independent hot PM models
Output measure:
- $P_h$ = prob. that a job is accepted in the hot pool = $1 - \left( \sum_{i=0}^{m-1} \varphi^{(h)}_{(L_h,1,i)} + \varphi^{(h)}_{(L_h,0,m)} \right)^{n_h}$
- Here $\sum_{i=0}^{m-1} \varphi^{(h)}_{(L_h,1,i)} + \varphi^{(h)}_{(L_h,0,m)}$ is the steady state probability that a PM cannot accept a job for provisioning, from the solution of the Markov model of a hot PM on the previous slide
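The output measure above can be sketched in a few lines: a job is refused by the pool only if every one of the independent PMs is in a "full" state. The function name and the example probabilities are hypothetical; in practice the state probabilities would come from solving the per-PM Markov model.

```python
def pool_acceptance_probability(full_state_probs, n_pms):
    """P = 1 - (probability that a single PM is full) ** n_pms.

    full_state_probs: steady-state probabilities of the states in which a PM
    cannot accept another job, i.e. (L,1,0), ..., (L,1,m-1) and (L,0,m).
    Relies on the PMs in the pool being independent and identical.
    """
    p_full = sum(full_state_probs)
    if not 0.0 <= p_full <= 1.0:
        raise ValueError("blocking probability must lie in [0, 1]")
    return 1.0 - p_full ** n_pms


# Hypothetical numbers: m = 3, so three (L,1,i) states plus (L,0,m).
p_hot = pool_acceptance_probability([0.05, 0.03, 0.02, 0.10], n_pms=4)
```

Because the per-PM blocking probability is raised to the power n_h, adding PMs to a pool drives the pool-level rejection towards zero geometrically.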

VM provisioning model for each warm PM
[State-transition diagram: states (i, j, k) from (0,0,0) to (L_w,0,m); transitions labelled with rates γ_w, β_w and µ, 2µ, ..., (m-1)µ, mµ]
Copyright 2014 by K.S. Trivedi

Generator Matrix for warm pool model
[Generator matrix over the warm PM states (i, j, k), with entries built from β_w, γ_w and µ, 2µ, 3µ]
L_w = 3 and m = 3. The matrix may be solved using the matrix-analytic method.

VM provisioning model for each cold PM
[State-transition diagram: states (i, j, k) from (0,0,0) to (L_c,0,m); transitions labelled with rates γ_c, β_c and µ, 2µ, ..., (m-1)µ, mµ]

VM provisioning model: Summary
Warm/cold PM model is similar to the hot PM model, except:
- Effective job arrival rate
- For the first job, a warm/cold PM requires additional start-up time
- Mean provisioning delay for a VM for the first job is longer
Outputs of hot, warm and cold pool models:
- Probabilities (P_h, P_w, P_c) that at least one PM in the hot/warm/cold pool can accept a job

Import graph for performance models
[Diagram: the RPDE model passes P_block to the hot, warm and cold pool models (VM provisioning models); these return P_h, P_w and P_c to the RPDE model; outputs are the job rejection probability and mean response delay]

Fixed-point iteration
- To solve the hot, warm and cold PM models, we need P_block from the resource provisioning decision model
- To solve the provisioning decision model, we need P_h, P_w, P_c from the hot, warm and cold pool models respectively
- This leads to a cyclic dependency among the resource provisioning decision model and the VM provisioning models (hot, warm, cold)
- We resolve this dependency via fixed-point iteration
- Observe that our fixed-point variable is P_block and the corresponding fixed-point equation is of the form: P_block = f(P_block)
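A minimal sketch of solving P_block = f(P_block) by successive substitution follows. The function `toy_f` is a hypothetical stand-in for "solve the pool models with the current P_block, then solve the RPDE model"; its coefficients are invented so that the map is a contraction, and the tolerance and iteration limit are likewise assumptions.

```python
def solve_fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Successive substitution: x_{k+1} = f(x_k) until |x_{k+1} - x_k| < tol."""
    x = x0
    for iteration in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, iteration
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_f(p_block):
    """Hypothetical stand-in for one pass through the import graph."""
    # Step 1 (pool models): more blocking upstream means a lighter load per
    # PM, hence a higher chance that the pools can accept a job.
    p_accept = 0.6 + 0.3 * p_block
    # Step 2 (RPDE model): higher acceptance shortens decisions, which
    # lowers the probability that the RPDE buffer is full.
    return 0.5 * (1.0 - p_accept)


p_block, iterations = solve_fixed_point(toy_f, x0=0.0)
```

Convergence here is guaranteed only because the toy map is a contraction; for the real models, uniqueness and rate of convergence are among the open questions the talk lists.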

Many questions
- Existence of fixed point (easy)
- Uniqueness
- Rate of convergence
- Accuracy
- Scalability

Performance measures comparison with monolithic model
1 PM per pool and 1 VM per PM

            Mean RPDE queue length      Rejection probability
Jobs/hr     IMC          Monolithic     IMC          Monolithic
1           9.0332e-07   9.2321e-07     9.8899e-06   1.1221e-03
5           4.1622e-05   4.3364e-05     4.2334e-02   8.0500e-02
10          2.3731e-04   2.4225e-04     2.3496e-01   2.6587e-01
15          6.3539e-04   6.4377e-04     3.9860e-01   4.1493e-01
20          1.2526e-03   1.2655e-03     5.1069e-01   5.1969e-01
25          2.0990e-03   2.1179e-03     5.8915e-01   5.9449e-01
30          3.1826e-03   3.2091e-01     6.4648e-01   6.4985e-01
35          4.5106e-03   4.5462e-03     6.8999e-01   6.9223e-01

The error is between e-03 and e-07 for all the results. The number of states in the monolithic model is 912 while in the ISP model it is 21.

Power Quantification for IaaS Cloud [paper in Proc. IEEE/IFIP DSN workshop DCDV 2011]

Power Consumption from Hot PM Model
[State-transition diagram: hot PM CTMC with states (i, j, k) from (0,0,0) to (L_h,0,m); transitions labelled with rates β_h and µ, 2µ, ..., (m-1)µ, mµ]
- Hot PM idle power consumption (no VM): l_h
- Additional power consumption of each running VM with average resource utilization: v_a
- For each state (i, j, k) of the CTMC, we assign a reward rate: r(i, j, k) = l_h + k v_a
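To make the reward assignment concrete, the sketch below computes the expected steady-state power of one hot PM as the probability-weighted sum of the reward rates r(i, j, k) = l_h + k v_a. The steady-state distribution and the wattages are hypothetical; in the talk's setting the distribution would come from solving the hot PM CTMC.

```python
def hot_pm_reward(state, idle_power, vm_power):
    """Reward rate r(i, j, k) = l_h + k * v_a for CTMC state (i, j, k)."""
    _queued, _provisioning, running = state
    return idle_power + running * vm_power


def expected_power(steady_state, idle_power, vm_power):
    """Expected steady-state reward rate: sum over states of pi(s) * r(s)."""
    return sum(prob * hot_pm_reward(state, idle_power, vm_power)
               for state, prob in steady_state.items())


# Hypothetical steady-state distribution over a few (i, j, k) states.
pi = {(0, 0, 0): 0.5, (0, 1, 0): 0.1, (0, 0, 1): 0.3, (0, 0, 2): 0.1}
mean_power = expected_power(pi, idle_power=200.0, vm_power=20.0)
```

Only the third state component affects the reward here, so queued or still-provisioning jobs add no power beyond the idle draw in this simple assignment.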

Power Consumption from Warm PM Model
[Diagram: warm PM CTMC states with their assigned reward rates]

Power Consumption from Cold PM Model
[Diagram: cold PM CTMC states with their assigned reward rates]
Net power consumption is the sum of the power consumptions in the hot, warm and cold pools

Power and cooling cost
Overall power consumption and cooling cost as expected steady state reward rates
[Diagram: each state (x, y, z) of the availability model is assigned the reward rate P(x, y, z) obtained from the performance model]

Conclusions

Conclusions
- Analytic models are powerful for the construction and numerical solution of various reliability, availability, performance, and power models
- For very complex systems such as clouds, hierarchical, fixed-point iterative and approximate solutions are needed
- Performance, availability and power consumption analysis can be done using such an approach
- Simulative and hybrid models/solutions should be used when absolutely necessary
- Models can then be used in capacity planning and in a feedback control setting for adapting to changes

Summary
Performance analysis:
- Developed scalable interacting stochastic sub-models for large Clouds
- Analysis of provisioning delay and impact of different factors, e.g., arrival rate, system capacity, resource holding time etc.
- R. Ghosh, F. Longo, V. K. Naik, and K. S. Trivedi, Modeling and Performance Analysis of Large Scale IaaS Clouds, Elsevier Future Generation Computer Systems, July 2013.

Summary (contd.)
Scalable availability analysis:
- Interacting stochastic sub-models for failure-repair analysis
- F. Longo, R. Ghosh, V. K. Naik, and K. S. Trivedi, A Scalable Availability Model for Infrastructure-as-a-Service Cloud, DSN, June 2011.
- R. Ghosh, F. Longo, F. Frattini, S. Russo and K. S. Trivedi, Scalable Analytics for IaaS Cloud Availability, IEEE Trans. on Cloud Computing, accepted Feb 2014.
Cost analysis, optimization and Cloud capacity planning:
- Developed and solved optimization problems to minimize the total cost without violating the SLAs
- R. Ghosh, F. Longo, R. Xia, V. K. Naik, and K. S. Trivedi, Stochastic Model Driven Capacity Planning for an Infrastructure-as-a-Service Cloud, IEEE Trans. on Services Computing, accepted August 2013.

Thanks!