DynamicCloudSim: Simulating Heterogeneity in Computational Clouds


Marc Bux, Ulf Leser ({bux, leser}@informatik.hu-berlin.de)
The 2nd International Workshop on Scalable Workflow Enactment Engines and Technologies (SWEET'13)

Meet Sandra

Meet Paul
- Small Instance: 1.7 GB RAM, 1 EC2 Compute Unit, 160 GB local storage
- Compute Unit: equivalent CPU capacity of a 1.0-1.2 GHz Opteron or Xeon
- No guarantees with respect to I/O throughput and network delay / bandwidth

Meet Paul
- Any one cloud instance is unlike another.

Heterogeneity in EC2 Cloud Instances
- Different CPUs on physical host systems [Jackson10, Schad10]:
  - Intel Xeon E5430 (2.66 GHz quad)
  - AMD Opteron 270 (2 GHz dual)
  - AMD Opteron 2218 HE (2.6 GHz dual)
- I/O throughput varies as well [Dejun10]
- No correlation between CPU and I/O performance
[Figures: Amazon EC2 performance; sources: Dejun10, Schad10]

Dynamic Changes of Performance
- Occasional CPU performance slumps and failures during task execution [Dejun10, Jackson10]
- Variance in I/O and network throughput [Zaharia08, Jackson10]
- Performance depends on hour of day and day of week [Schad10]
[Figures: EC2 disk performance vs. VM co-allocation [Zaharia08]; CPU performance slumps [Dejun10]]

Vision
- Adaptive scheduling of scientific workflows
  - Exploit heterogeneous resources
  - Exhibit robustness to instability

Vision
- The standard approach for evaluating schedulers is simulation [Braun01, Blythe05]
- Cloud simulation toolkits do not model instability

Agenda
1) Simulating Heterogeneity in Computational Clouds
2) Evaluating Established Workflow Schedulers
3) Summary and Outlook

CloudSim
- R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, R. Buyya (2011), CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms, Software - Practice and Experience 41(1):23-50.
- More than 250 citations in Google Scholar
- https://code.google.com/p/cloudsim/
[Diagram: Task, VM, Host, Datacenter]

DynamicCloudSim
- Extends CloudSim with models for:
  1. Heterogeneous computational resources (Het)
  2. Dynamic changes of performance at runtime (DCR)
  3. Straggler VMs and failed task executions (SaF)
- More fine-grained representation of computational resources
- https://code.google.com/p/dynamiccloudsim/
[Diagram: Error-prone Task, Dynamic VM, Heterogeneous Host, Datacenter]

Realism: can we ever get there?
- Simulation can never perfectly resemble reality
- We model inhomogeneity and dynamic changes by sampling from normal distributions
- Default mean and standard deviation (STD) / relative standard deviation (RSD) parameters are obtained from [Zaharia08, Dejun10, Jackson10, Schad10, Iosup11]
- Many performance characteristics in EC2 follow a normal distribution [Schad10]
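The sampling idea on this slide can be sketched in a few lines of Python. This is only an illustration, not DynamicCloudSim's actual code: the function names, the MIPS unit, the clamping floor, and the example values (mean 1000, RSD 0.25) are all assumptions made for the example.

```python
import random


def sample_performance(mean, rsd, rng):
    """Draw one performance value from a normal distribution.

    The standard deviation is derived from the relative standard
    deviation (RSD): std = rsd * mean. The result is clamped to a small
    positive floor (an assumption of this sketch) so that a VM never
    gets zero or negative capacity.
    """
    std = rsd * mean
    return max(rng.gauss(mean, std), 0.01 * mean)


def provision_vms(n, mean_mips, rsd, seed=0):
    """Return n sampled CPU capacities (hypothetical unit: MIPS)."""
    rng = random.Random(seed)
    return [sample_performance(mean_mips, rsd, rng) for _ in range(n)]


# Heterogeneous pool: every VM gets its own sampled capacity.
vms = provision_vms(8, mean_mips=1000.0, rsd=0.25)

# With an RSD of 0 the pool is homogeneous: every VM gets the mean.
uniform = provision_vms(8, mean_mips=1000.0, rsd=0.0)
```

Setting the RSD to 0 reproduces a homogeneous pool, which is why the RSD serves as a single knob for the degree of heterogeneity.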

Simulating VM Performance: DCS vs CS
[Figure: simulated VM performance in DynamicCloudSim (DCS) vs. CloudSim (CS) under (1) heterogeneous computational resources (Het), (2) dynamic changes of performance at runtime (DCR), (3) straggler VMs and failed task executions (SaF)]
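As a rough illustration of how the DCR and SaF factors could change a VM's performance over time, the following Python sketch generates a performance trace. It is not the toolkit's model: the function name, the per-step resampling, and the straggler slowdown factor of 0.5 are hypothetical choices for the example.

```python
import random


def performance_trace(baseline, steps, dcr_rsd, straggler_likelihood,
                      straggler_factor=0.5, seed=0):
    """Return a list of per-step performance values for a single VM.

    SaF (assumed form): with probability straggler_likelihood the VM is
    a straggler and runs at straggler_factor times its baseline.
    DCR (assumed form): at every step the performance is re-sampled
    from a normal distribution around the VM's level with relative
    standard deviation dcr_rsd.
    """
    rng = random.Random(seed)
    is_straggler = rng.random() < straggler_likelihood
    level = baseline * (straggler_factor if is_straggler else 1.0)
    return [max(rng.gauss(level, dcr_rsd * level), 0.0)
            for _ in range(steps)]


# No dynamic changes and no stragglers: constant performance.
constant = performance_trace(1000.0, 5, dcr_rsd=0.0, straggler_likelihood=0.0)

# Dynamic changes around the baseline.
dynamic = performance_trace(1000.0, 5, dcr_rsd=0.25, straggler_likelihood=0.0)

# A guaranteed straggler without dynamic changes: constant but slowed down.
slow = performance_trace(1000.0, 5, dcr_rsd=0.0, straggler_likelihood=1.0)
```

With both parameters at 0 the trace is flat; raising either one adds the corresponding kind of instability.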

Agenda
1) Simulating Heterogeneity in Computational Clouds
2) Evaluating Established Workflow Schedulers
   a) Scheduling Scientific Workflows
   b) Evaluation Workflows
   c) Evaluation Results
3) Summary and Outlook

Scheduling of Scientific Workflows
- Scheduling: mapping tasks to the available physical resources
  - Usual goal: minimize overall execution time
- Static scheduling:
  - Schedule is assembled prior to workflow execution
  - Schedule is strictly adhered to at runtime
- Adaptive scheduling:
  - Monitor computational infrastructure
  - Adjust workflow execution at runtime

Static Schedulers
- Baseline: Round Robin
  - Assign tasks to resources in turn
  - Equal number of tasks per resource
- Elaborate: HEFT (Heterogeneous Earliest Finish Time) [Topcuoglu02]
  - Implemented in the scientific workflow management system (SWfMS) Pegasus
  - Requires runtime estimates for each task on each resource
  - Assign the tasks with the longest time to finish to a fixed timeslot on a suitable (well-performing) resource
  - Exploits heterogeneity in computational infrastructure (Het)
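The difference between the two static schedulers can be illustrated with a small Python sketch. The HEFT-style function below is a deliberate simplification and not the algorithm from [Topcuoglu02]: it assumes independent tasks, with no workflow dependencies and no data transfer times. The function names and the runtime estimates are made up for the example.

```python
def round_robin(tasks, n_resources):
    """Assign task i to resource i mod n_resources."""
    return {task: i % n_resources for i, task in enumerate(tasks)}


def heft_like(runtime):
    """Simplified HEFT-style list scheduling for independent tasks.

    runtime[task][r] is the estimated runtime of task on resource r
    (runtime must not be empty). Tasks are ranked by mean estimated
    runtime, longest first, and each is placed on the resource where it
    would finish earliest.
    """
    n = len(next(iter(runtime.values())))
    ready = [0.0] * n  # time at which each resource becomes free
    schedule = {}
    order = sorted(runtime, key=lambda t: sum(runtime[t]) / n, reverse=True)
    for task in order:
        best = min(range(n), key=lambda r: ready[r] + runtime[task][r])
        schedule[task] = best
        ready[best] += runtime[task][best]
    return schedule, max(ready)


# Hypothetical estimates: resource 0 is twice as fast as resource 1.
estimates = {"a": [10, 20], "b": [4, 8], "c": [6, 12]}

rr_schedule = round_robin(list(estimates), 2)
heft_schedule, heft_makespan = heft_like(estimates)
```

In this example round robin puts "a" and "c" on the fast resource and "b" on the slow one (makespan 16), while the HEFT-style heuristic moves "c" to the slow resource and finishes at 14. The gain depends entirely on the runtime estimates being accurate.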

Adaptive Schedulers
- Baseline: Greedy Task Queue
  - Assign tasks to resources at runtime in a first-come-first-served manner
  - Adapts to changes of performance at runtime (DCR)
- Elaborate: LATE (Longest Approximate Time to End) [Zaharia08]
  - Developed for Hadoop to increase robustness to instability
  - 10% of tasks progressing at a below-average rate are replicated and speculatively executed
  - Exploits dynamic changes of performance
  - Robust to straggler VMs and failed task executions (SaF)
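Both adaptive strategies can be sketched in Python under simplifying assumptions. The greedy queue below hands the next task to whichever VM becomes idle first. The LATE-style function estimates each running task's time to end as (1 - progress) / progress rate and proposes the slowest tasks for speculative re-execution. The function names, the data layout, and the reading of the 10% figure as a cap on the share of running tasks are assumptions of this sketch, not the Hadoop implementation.

```python
import heapq


def greedy_queue(task_work, vm_speed):
    """First-come-first-served: each task goes to the VM that is idle first.

    task_work: work units per task, in queue order.
    vm_speed: work units per time unit for each VM.
    Returns (makespan, list of VM indices in task order).
    """
    idle = [(0.0, vm) for vm in range(len(vm_speed))]
    heapq.heapify(idle)
    assignment, makespan = [], 0.0
    for work in task_work:
        free_at, vm = heapq.heappop(idle)
        done = free_at + work / vm_speed[vm]
        assignment.append(vm)
        makespan = max(makespan, done)
        heapq.heappush(idle, (done, vm))
    return makespan, assignment


def late_candidates(running, now, cap_fraction=0.10):
    """Pick tasks for speculative execution, LATE style.

    running: {task: (start_time, progress)} with 0 < progress <= 1 and
    start_time < now. Only tasks with a below-average progress rate
    qualify; those with the longest estimated time to end come first.
    """
    rates = {t: p / (now - s) for t, (s, p) in running.items()}
    mean_rate = sum(rates.values()) / len(rates)
    time_left = {t: (1.0 - running[t][1]) / rates[t] for t in running}
    slow = [t for t in running if rates[t] < mean_rate]
    slow.sort(key=lambda t: time_left[t], reverse=True)
    cap = max(1, int(cap_fraction * len(running)))
    return slow[:cap]


makespan, assignment = greedy_queue([10, 10, 10, 10], [2.0, 1.0])
speculate = late_candidates(
    {"t1": (0.0, 0.8), "t2": (0.0, 0.2), "t3": (5.0, 0.5)}, now=10.0)
```

In the example the fast VM ends up with three of the four tasks without any runtime estimates being needed, and "t2" is selected for speculation because it is the only task progressing below the average rate.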

Evaluation Workflow: Montage [Berriman04]

Abstract Montage Workflow
- One task can have many task instances.

Concrete Montage Workflow
- 43,318 tasks reading and writing 534 GB of data
- 10 GB of input files, which have to be uploaded to the cloud
- Determine the average runtime over 100 simulations of workflow execution

Evaluation Workflow: Comparative Genomics

Concrete Genomics Workflow
- Align 10% of the reads produced in a sequencing experiment against the smallest of human chromosomes (chr22)
- Use about 0.2% of the available data
- 4,266 tasks reading and writing 436 GB of data (2.3 GB upload)
- Workflow steps:
  1. Upload to cloud
  2. Indexing (bowtie, SHRiMP, PerM)
  3. Alignment (bowtie, SHRiMP, PerM)
  4. Convert (samtools view)
  5. Sort (samtools sort)
  6. Merge (merge)
  7. Preprocess (samtools mpileup)
  8. Variant calling (VarScan)
  9. Sense-Making (VCFTools)
  10. Download from cloud

Runtime depending on Heterogeneity (Het)
[Figure: two bar charts of average runtime in minutes for the Static Round Robin, HEFT, Greedy Queue, and LATE schedulers. Series: RSD parameters for heterogeneous resources (Het) of 0, 0.125, 0.25, 0.375, and 0.5.]

Runtime depending on Dynamic Changes (DCR)
[Figure: two bar charts of average runtime in minutes for the Static Round Robin, HEFT, Greedy Queue, and LATE schedulers. Series: RSD parameters for dynamic changes at runtime (DCR) of 0, 0.125, 0.25, 0.375, and 0.5.]

Runtime with Stragglers and Failures (SaF)
[Figure: two bar charts of average runtime in minutes for the Static Round Robin, HEFT, Greedy Queue, and LATE schedulers. Series: likelihood of straggler VMs and failed tasks (SaF) of 0, 0.00625, 0.0125, 0.01875, and 0.025.]

That's all well and good, but ...
- Scheduling in SWfMS: Static or Greedy Task Queue
- HEFT and LATE have a computational overhead and require information not available in real scenarios:
  - HEFT: runtime estimates of each task on each machine
  - LATE: progress rate of each running task
- Untapped optimization potential: multiple resource scheduling
  - Find appropriate matches between tasks and machines

Summary and Outlook
- EC2: heterogeneity and instability in VM performance
- DynamicCloudSim introduces several factors of instability into CloudSim
- Simulation experiments reproduce known strengths and shortcomings of established schedulers
- Outlook: comparative evaluation on real hardware

Thanks for your attention!
https://code.google.com/p/dynamiccloudsim/

Questions

Literature
[Braun01] T. D. Braun, H. J. Siegel, N. Beck, L. L. Boloni, M. Maheswaran, A. I. Reuther, J. P. Robertson, M. D. Theys, B. Yao, D. Hensgen, R. F. Freund (2001), A Comparison Study of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems, Journal of Parallel and Distributed Computing 61:810-837.
[Blythe05] J. Blythe, S. Jain, E. Deelman, Y. Gil, K. Vahi, A. Mandal, K. Kennedy (2005), Task Scheduling Strategies for Workflow-based Applications in Grids, in: Proceedings of the 5th IEEE International Symposium on Cluster Computing and the Grid, volume 2, Cardiff, UK, pp. 759-767.

Literature (cont.)
[Jackson10] K. R. Jackson, et al. (2010), Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud, in: Proceedings of the 2nd International Conference on Cloud Computing Technology and Science, Indianapolis, USA, pp. 159-168.
[Dejun10] J. Dejun, et al. (2009), EC2 Performance Analysis for Resource Provisioning of Service-Oriented Applications, in: Proceedings of the 7th International Conference on Service Oriented Computing, Stockholm, Sweden, pp. 197-207.
[Zaharia08] M. Zaharia, et al. (2008), Improving MapReduce Performance in Heterogeneous Environments, in: Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation, San Diego, USA, pp. 29-42.

Literature (cont.)
[Schad10] J. Schad, J. Dittrich, J.-A. Quiané-Ruiz (2010), Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance, Proceedings of the VLDB Endowment 3(1):460-471.
[Iosup11] A. Iosup, N. Yigitbasi, D. Epema (2011), On the Performance Variability of Production Cloud Services, in: Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Newport Beach, California, USA, pp. 104-113.

Literature (cont.)
[Topcuoglu02] H. Topcuoglu, S. Hariri, M.-Y. Wu (2002), Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing, IEEE Transactions on Parallel and Distributed Systems 13(3):260-274.
[Berriman04] G. B. Berriman, et al. (2004), Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand, in: Proceedings of the SPIE Conference on Astronomical Telescopes and Instrumentation, volume 5493, Glasgow, Scotland, pp. 221-232.