Cloud Analytics for Capacity Planning and Instant VM Provisioning




Cloud Analytics for Capacity Planning and Instant VM Provisioning
Yexi Jiang, Florida International University
Advisor: Dr. Tao Li
Collaborators: Dr. Charles Perng, Dr. Rong Chang

Presentation Outline
Background
Cloud Capacity Prediction: predict provisioning resource demand; estimate de-provisioning requests; experimental evaluation results
Instant Cloud Provisioning: predict VM provisioning demand; experimental evaluation results
Yexi Jiang http://users.cis.fiu.edu/~yjian004/

Background
What is cloud analytics? Rapidly identifying cloud resource or application trouble spots so that the problem can be solved.
What is the object of cloud analytics? The cloud platform itself.
What can cloud analytics do? Workload analysis and system fault diagnostics.

Smart Cloud Enterprise trace data
5 months, 35k+ requests, 120+ image types, 20+ features per record.
Important features: Image Name, Owner, Start Time, End Time, ID.

Aggregating the Raw Data
Weekly: cannot reflect real capacity. Daily: just right. Hourly: too irregular.

Measurement                      Weekly   Daily    Hourly
Coefficient of variation (CV)    0.5606   0.7915   1.2249
Skewness                         0.3295   1.5644   5.4464
Kurtosis                         1.62     5.8848   52.4103
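The comparison above can be reproduced on any request trace. The following is a minimal sketch, assuming population moments and non-excess kurtosis (the slide does not say which definitions were used); the synthetic `hourly` series and the helper names `moments` and `aggregate` are illustrative, not part of the original trace or tooling.

```python
import math

def moments(series):
    """Return (CV, skewness, kurtosis) of a sequence of request counts."""
    n = len(series)
    mean = sum(series) / n
    # Population central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in series) / n
    m3 = sum((x - mean) ** 3 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    std = math.sqrt(m2)
    cv = std / mean
    skewness = m3 / std ** 3
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (assumed definition)
    return cv, skewness, kurtosis

def aggregate(counts, width):
    """Sum consecutive groups of `width` fine-grained counts."""
    return [sum(counts[i:i + width]) for i in range(0, len(counts), width)]

# Synthetic hourly request counts for 4 weeks (hypothetical data):
# requests arrive only during 8 working hours, scaled per day and per week.
day_scale = [3, 4, 5, 4, 3, 1, 1]
week_scale = [1, 2, 1, 3]
hourly = []
for w in week_scale:
    for s in day_scale:
        hourly += [0] * 9 + [s * w] * 8 + [0] * 7

daily = aggregate(hourly, 24)
weekly = aggregate(hourly, 24 * 7)

for name, series in [("weekly", weekly), ("daily", daily), ("hourly", hourly)]:
    cv, skew, kurt = moments(series)
    print(f"{name:7s} CV={cv:.4f} skewness={skew:.4f} kurtosis={kurt:.4f}")
```

On this synthetic series the CV grows as the granularity gets finer, which is the same direction as the measurements in the table.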


Cost of Data Centers (from James Hamilton's blog)
31% of the cost is related to power. As hardware prices continue to decrease, this proportion will increase further.
The US EPA estimates that energy usage at data centers is doubling every five years (7.4 billion in 2011).

Motivation
Reduce power cost via capacity prediction.
[Figure: cost of the cloud provider, comparing prepared resource, predicted resource, and real requirement.]

Candidate Time Series
Capacity time series: non-stationary, difficult to model directly.
Provisioning / de-provisioning time series: obvious temporal pattern, the better candidate.

Basic Idea
Capacity = (# existing VMs) + (# provisioning) - (# de-provisioning)
Predicted capacity = (existing VMs in cloud) + (predicted provisioning) - (predicted de-provisioning)
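The capacity identity above can be rolled forward over several periods. This is a small sketch under the assumption that each period's predicted capacity becomes the next period's existing VM count; the function name `roll_forward`, the clamp at zero, and the numbers are illustrative.

```python
def roll_forward(existing_vms, predicted_provisioning, predicted_deprovisioning):
    """Apply capacity = existing + provisioning - de-provisioning period by period."""
    capacities = []
    current = existing_vms
    for provisioned, deprovisioned in zip(predicted_provisioning,
                                          predicted_deprovisioning):
        # Clamp at zero (an added safeguard): capacity cannot be negative.
        current = max(current + provisioned - deprovisioned, 0)
        capacities.append(current)
    return capacities

# Hypothetical example: 100 running VMs and three days of predictions.
print(roll_forward(100, [10, 20, 5], [4, 30, 0]))  # [106, 96, 101]
```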

Predicting Provisioning
Ensemble method for time series prediction.
Individual prediction techniques used:
Moving Average: naïve predictor.
Auto Regression: linear predictor.
Neural Network: non-linear predictor.
Gene Expression Programming: genetic algorithm.
Support Vector Machine: linear predictor with non-linear kernel.
The individual predictions are merged by a dynamic weighted linear combination, followed by a weight update.
Notation:
$w_p(t)$: weight of predictor $p$
$v_p$: predicted value of individual predictor $p$
$c_p(t)$: cost of predictor $p$ at time $t$
$e(t)$: error of individual predictor $p$
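The exact combination and weight-update formulas are not preserved in this transcript, so the following is only a sketch of the general idea of a dynamically weighted linear ensemble. The exponential (multiplicative-weights style) update, the learning rate `eta`, the absolute-error cost, and the two stand-in predictors are all assumptions made for illustration, not the rule from the slides.

```python
import math

def moving_average(history, window=3):
    """Naive predictor: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def last_value(history):
    """Stand-in for a more capable predictor: repeat the last observation."""
    return history[-1]

class WeightedEnsemble:
    def __init__(self, predictors, eta=0.1):
        self.predictors = predictors
        self.eta = eta  # assumed learning rate
        self.weights = [1.0 / len(predictors)] * len(predictors)
        self.last_predictions = None

    def predict(self, history):
        # Weighted linear combination of the individual predictions v_p.
        self.last_predictions = [p(history) for p in self.predictors]
        return sum(w * v for w, v in zip(self.weights, self.last_predictions))

    def update(self, actual):
        # Cost c_p(t) is taken here as the absolute error (an assumption).
        costs = [abs(v - actual) for v in self.last_predictions]
        scale = max(costs) or 1.0  # avoid dividing by zero when all are exact
        raw = [w * math.exp(-self.eta * c / scale)
               for w, c in zip(self.weights, costs)]
        total = sum(raw)
        self.weights = [r / total for r in raw]  # renormalise to sum to 1

# Hypothetical steadily rising demand series.
demand = [10, 12, 14, 16, 18, 20, 22, 24]
ensemble = WeightedEnsemble([moving_average, last_value])
for t in range(3, len(demand)):
    forecast = ensemble.predict(demand[:t])
    ensemble.update(demand[t])
print(ensemble.weights)
```

On a rising series the last-value predictor lags less than the moving average, so its weight grows at every step, which is the behaviour a dynamic weighting scheme is meant to provide.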

Cloud Prediction Cost
$C = R(v(t), \tilde{v}(t)) + T(v(t), \tilde{v}(t))$
R function: cost of resource waste, incurred on over-prediction.
T function: cost of SLA penalty, incurred on under-prediction.
Property: both functions are non-negative and monotonic.
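The concrete forms of R and T are not preserved in the transcript. As a sketch only, the asymmetric cost can be illustrated with linear functions, which satisfy the non-negative and monotonic properties; the unit costs and the reading of $v(t)$ as actual demand and $\tilde{v}(t)$ as predicted demand are assumptions.

```python
def resource_waste_cost(actual, predicted, unit_cost=1.0):
    """R: cost of idle resources when the prediction exceeds real demand."""
    return unit_cost * max(predicted - actual, 0)

def sla_penalty_cost(actual, predicted, unit_penalty=4.0):
    """T: SLA penalty when the prediction falls short of real demand."""
    return unit_penalty * max(actual - predicted, 0)

def prediction_cost(actual, predicted):
    """C = R + T; at most one of the two terms is non-zero."""
    return (resource_waste_cost(actual, predicted)
            + sla_penalty_cost(actual, predicted))

print(prediction_cost(10, 13))  # over-prediction by 3
print(prediction_cost(10, 7))   # under-prediction by 3
```

With a larger penalty than waste cost, as in these hypothetical values, the same absolute error is more expensive when demand is under-predicted, which pushes a cost-aware predictor toward slight over-provisioning.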

Prediction Result
The ensemble has the best average performance.

Predicting De-provisioning
Use the life span CDF $F(x)$ of VMs to estimate the number of de-provisioning requests.
Estimation of the distribution: step-wise function, where $n_i$ is the number of VMs with life span $t$ ($t_1 < t < t_2$).
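One way to turn a life span CDF into a de-provisioning estimate is sketched below. It assumes an ordinary empirical (step-wise) CDF over observed life spans and a conditional-probability estimate per running VM; the slide's own estimator is not preserved, so the helper names, the handling of VMs older than any observed life span, and the sample data are illustrative.

```python
from bisect import bisect_right

def empirical_cdf(life_spans):
    """Return a step-wise F(x): fraction of observed life spans <= x."""
    ordered = sorted(life_spans)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf

def expected_deprovisions(ages, cdf, horizon=1):
    """Expected number of running VMs that end within the next `horizon`."""
    expected = 0.0
    for age in ages:
        survive = 1.0 - cdf(age)
        if survive <= 0.0:
            # Older than every observed life span: assume it ends soon.
            expected += 1.0
            continue
        # P(ends in (age, age + horizon] | still alive at age).
        expected += (cdf(age + horizon) - cdf(age)) / survive
    return expected

# Hypothetical observed life spans (days) and ages of currently running VMs.
F = empirical_cdf([1, 1, 2, 3, 5, 8, 13, 21])
print(expected_deprovisions([0, 2, 10], F))
```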

De-provisioning Evaluation
Test data: last 60 days.
Test methods:
1. No preparation at all (None)
2. Always prepare the maximum capacity (Maximum)
3. Time series prediction (Time Series)
4. Life span distribution regardless of image type, estimated from 60 days of data (Dist 60) or 90 days of data (Dist 90)
Result: the global distribution estimation method outperforms the time series prediction method.


Motivation
Problem: existing clouds are not instant, so they are not suitable for mid-job scaling or urgent tasks. VM preparation itself is fast, but patching, security assurance, manual steps, and other processes take time.
Known solutions:
Prepare an extremely large number of VMs of different types. This wastes resources.
Ask customers to provide a schedule. This is impractical.
Our idea: use customers' historical requests to infer future demand, and so reduce the average VM provisioning fulfillment time.

Core Idea
Model and predict demands -> prediction results -> pre-provision at a suitable time -> wait for requests -> assign VMs to customers
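The effect of pre-provisioning on waiting and waste can be illustrated with a toy simulation. This is a sketch under simplifying assumptions: one VM type, unused pre-provisioned VMs are discarded at the end of each period rather than carried over, and the function name and numbers are hypothetical.

```python
def simulate_pool(predicted, actual):
    """Pre-provision `predicted` VMs per period and serve `actual` requests.

    Returns (instant, waited, wasted): requests served from the pool,
    requests that had to wait for on-demand provisioning, and
    pre-provisioned VMs that nobody requested.
    """
    instant = waited = wasted = 0
    for pool_size, requests in zip(predicted, actual):
        served_from_pool = min(pool_size, requests)
        instant += served_from_pool
        waited += requests - served_from_pool
        wasted += pool_size - served_from_pool
    return instant, waited, wasted

# Hypothetical predictions and real requests for three periods.
print(simulate_pool([5, 3, 4], [4, 6, 4]))  # (11, 3, 1)
```

Over-prediction shows up as wasted VMs and under-prediction as waiting requests, which are the two quantities the evaluation later reports.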

Focus on Individual Types
No obvious temporal patterns for individual image types. An ensemble is still required.

Focus on Popular VM Types
1) About 10% (12) of the 124 VM types account for more than 80% of requests.
2) An inflection point divides the VM types into a popular group and a rare group.
3) Requests for rare image types appear randomly.
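Separating popular from rare image types can be sketched as a cumulative-share cut over the ranked request counts. The fixed `coverage` threshold below is a stand-in for the inflection point mentioned above, which the slides do not define numerically; the image names and counts are hypothetical.

```python
from collections import Counter

def split_popular(image_requests, coverage=0.8):
    """Return (popular, rare) image types by cumulative request share."""
    counts = Counter(image_requests)
    total = sum(counts.values())
    popular, covered = [], 0
    # Walk the types from most to least requested until the
    # popular group covers the requested share of all requests.
    for image, count in counts.most_common():
        if covered / total >= coverage:
            break
        popular.append(image)
        covered += count
    rare = [image for image in counts if image not in popular]
    return popular, rare

# Hypothetical request log: one entry per provisioning request.
requests = (["rhel"] * 50 + ["sles"] * 35 + ["win"] * 10
            + ["db2"] * 3 + ["was"] * 2)
popular, rare = split_popular(requests)
print(popular, rare)
```

Only the popular group would then be handed to the demand predictor, since requests for rare types appear randomly.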

Workflow Overview

Experimental Evaluation
The ensemble method has the best performance in reducing waiting time and resource waste.

Conclusion
Capacity prediction:
The demand for cloud capacity can be estimated by predicting provisioning and de-provisioning requests.
Use a time series ensemble method for provisioning prediction.
Use a VM life span model for de-provisioning prediction.
Instant cloud provisioning:
Pre-provision VMs before requests arrive.
Predict VM provisioning requests using the time series ensemble method.
The average provisioning fulfillment time can be reduced by 85%+.
Future work:
Improve prediction with user profiles.
Fine-grained adjustment with control theory.

Thank you!
Related papers:
Intelligent Cloud Capacity Management (NOMS 2012)
ASAP: A Self-Adaptive Prediction System for Instant Cloud Resource Demand Provisioning (ICDM 2011)
Patent: Cloud Provisioning Accelerator, Serial # 13306506, pending