How to Do/Evaluate Cloud Computing Research
Young Choon Lee
Cloud Computing
"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." (NIST definition)
Cloud Computing
Broadly falls into distributed computing. Relevant/similar concepts:
- Internet: largely information sharing
- Grid computing: primarily used as scientific platforms
- On-demand/elasticity + utility computing
Cloud Computing
Service models:
- IaaS (Infrastructure as a Service): AWS, Microsoft Azure, Google Compute Engine
- PaaS (Platform as a Service): Rackspace, Microsoft Azure
- SaaS (Software as a Service): Salesforce.com
Deployment models:
- Private cloud
- Public cloud
- Hybrid cloud
- Community cloud
Clouds
An (IaaS) cloud is basically a data centre. Major public clouds host hundreds of thousands of servers, or even millions. A single data centre often consists of thousands of servers or more, and they are virtualized.
Example application
Cycle Computing's cloud computing deployment: a 16,788-instance Amazon EC2 cluster with 156,314 cores across 8 regions on five continents, used for materials science experiments that collectively screened some 205,000 candidate molecules. This deployment delivered 2.3 million compute hours at a cost of only $33,000, on a cluster equivalent to $68 million worth of equipment.
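The slide's cost figures can be sanity-checked with simple arithmetic; a quick sketch (the per-hour and savings metrics below are derived by us, not quoted from Cycle Computing):

```python
# Back-of-envelope economics of the Cycle Computing deployment.
# All input figures are taken from the slide; the derived metrics are ours.
compute_hours = 2_300_000      # total compute hours delivered
cloud_cost = 33_000            # USD paid for the EC2 cluster
equipment_cost = 68_000_000    # USD of equivalent owned hardware

cost_per_hour = cloud_cost / compute_hours
print(f"Cloud cost per compute hour: ${cost_per_hour:.4f}")    # ≈ $0.0143

savings_factor = equipment_cost / cloud_cost
print(f"Cloud is ~{savings_factor:.0f}x cheaper than buying")  # ≈ 2061x
```

At roughly 1.4 cents per compute hour, renting was about three orders of magnitude cheaper than purchasing equivalent hardware for a one-off experiment.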
Problems
- Performance: high performance/throughput; performance isolation and resource contention due to multi-tenancy
- Scalability: load balancing
- Cost efficiency
- Resource failure: fault tolerance; failure recovery
Research Process
Ideas/Feedback → Design → Implementation → Experimental Evaluation → Analysis (feeding back into ideas)
Research Process: Ideas
Pick a promising/interesting domain, e.g.:
- Cloud computing
- Big data (in scientific discovery, the first three paradigms were experimental, theoretical and computational science (simulation); see The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research)
- Energy efficiency
Survey/review the literature:
- Read papers, blog posts and even newspaper articles
- Watch leading research groups and community/industry activities
Find a real(istic) problem
Research Process: Design
Algorithms, e.g.:
- Scheduling
- Resource allocation
- Load balancing
- Fault tolerance
Systems:
- Hardware: new systems architecture, e.g., FAWN
- Software: resource management systems, e.g., Mesos; workflow execution systems, e.g., DEWE
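As a concrete, deliberately simplistic illustration of a scheduling algorithm, here is a greedy list scheduler that assigns each task to the machine that would finish it earliest; real resource managers such as Mesos are far more sophisticated:

```python
# Minimal greedy list scheduler (illustrative sketch only):
# each task goes to the machine with the lowest current load.
def greedy_schedule(task_runtimes, num_machines):
    finish_times = [0.0] * num_machines      # current load per machine
    assignment = []                          # chosen machine per task
    for runtime in task_runtimes:
        m = min(range(num_machines), key=lambda i: finish_times[i])
        finish_times[m] += runtime
        assignment.append(m)
    return assignment, max(finish_times)     # plan and makespan

plan, makespan = greedy_schedule([4, 2, 7, 1, 3], num_machines=2)
print(plan, makespan)  # [0, 1, 1, 0, 0] 9.0
```

Even this toy heuristic exposes the research questions above: how does it behave under heterogeneous machines, contention, or failures?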
Research Process: Implementation
A realization of the design. Often involves:
- Prototyping, e.g., Hadoop schedulers
- Software development (programming): extensibility, portability, scalability, etc.
Research Process: Evaluation
Experimental evaluation:
- Simulations: only when you can WELL justify them, e.g., energy efficiency with realistic data such as the Google cluster data or SPEC benchmarks. Use existing simulators, e.g., NS-3, CloudSim, or write your own and make it open source
- Experiments on real systems: open-source systems, e.g., Hadoop, Xen, Docker; real platforms like Amazon EC2 (AWS Education Grants)
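If you do write your own simulator, it can start very small. A minimal sketch of a FIFO single-server queue simulator in Python (the arrival and service times here are hypothetical inputs, not real trace data):

```python
# Toy simulator for a FIFO single-server queue -- the kind of
# "write your own and open-source it" tool the slide mentions.
def simulate(arrivals, services):
    """Return the response time of each job (finish - arrival)."""
    server_free = 0                           # time the server next idles
    responses = []
    for arrive, service in zip(arrivals, services):
        start = max(arrive, server_free)      # wait if the server is busy
        server_free = start + service
        responses.append(server_free - arrive)
    return responses

print(simulate([0, 1, 2], [2, 2, 2]))  # [2, 3, 4]
```

Feeding such a simulator with a realistic workload trace, rather than invented inputs, is exactly the justification step the slide asks for.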
Research Process: Analysis
- Cleaning and organising experimental results
- Scrutinizing results
- Conducting comparison studies
- Describing results with effective use of figures and tables; colours, patterns and different chart types
Research Process: Analysis
[Figure: CPU utilization breakdown (user, system, iowait, idle) and execution times across 4m4r/6m6r/8m8r configurations under the SM and LRS schemes]
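When organising experimental results, report variability rather than a single number. A small Python sketch (the sample values below are made up for illustration, not taken from the slide's data):

```python
# Summarise repeated experiment runs as mean +/- standard deviation
# instead of cherry-picking one run. Sample values are invented.
import statistics

runs = {
    "baseline": [68.8, 58.6, 67.4],   # CPU utilization (%) per run
    "tuned":    [73.4, 76.9, 75.8],
}
for name, samples in runs.items():
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    print(f"{name}: {mean:.1f}% ± {stdev:.1f}%")
```

Reporting spread like this also feeds directly into the "no indication of significance of data" point raised later.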
Issues with Evaluation
The Seven Deadly Sins of Cloud Computing Research, HotCloud 2012:
"sin, n.: common simplification or shortcut employed by researchers; may present a threat to the scientific integrity and practical applicability of research"
Systems Benchmarking Crimes, Gernot Heiser, NICTA/UNSW:
"When reviewing systems papers (and sometimes even when reading published papers) I frequently come across highly misleading use of benchmarks. I call such cases benchmarking crimes. Not because you can go to jail for them (but probably should?) but because they undermine the integrity of the scientific process."
Issues with Evaluation: The Seven Deadly Sins of Cloud Computing Research
- Sin 1: Unnecessary distributed parallelism
- Sin 2: Assuming performance homogeneity
- Sin 3: Picking the low-hanging fruit
- Sin 4: Forcing the abstraction
- Sin 5: Unrepresentative workloads
- Sin 6: Assuming perfect elasticity
- Sin 7: Ignoring fault tolerance
Issues with Evaluation: Systems Benchmarking Crimes
- Pretending micro-benchmarks represent overall performance
- Benchmark sub-setting without strong justification
- Selective data sets hiding deficiencies
- "Throughput degraded by x%" reported as "overhead is x%" (an x% throughput degradation is actually a time overhead of x/(100 − x); e.g., a 6% degradation is a ~6.4% overhead)
- Same dataset for calibration and validation
- No indication of significance of data
- Benchmarking of a simplified simulated system
- Inappropriate and misleading benchmarks
- Relative numbers only
- No proper baseline
- Only evaluating against yourself
- Unfair benchmarking of competitors
- Arithmetic mean for averaging across benchmark scores
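To see why the arithmetic mean is listed as a crime, compare it with the geometric mean on normalised benchmark scores (the numbers below are invented for illustration):

```python
# Why the arithmetic mean is the wrong average for normalised benchmark
# scores: it depends on which system you normalise to, while the
# geometric mean does not. Scores below are made up.
import math

def geomean(xs):
    return math.prod(xs) ** (1 / len(xs))

# System A vs system B on two benchmarks: A is twice as fast on one,
# B is twice as fast on the other -- intuitively a tie.
ratios_a_over_b = [2.0, 0.5]
print(sum(ratios_a_over_b) / 2)   # arithmetic mean: 1.25 (A "wins")
print(geomean(ratios_a_over_b))   # geometric mean: 1.0 (a tie)
```

Normalising to A instead of B flips the arithmetic-mean verdict, which is exactly why benchmark suites such as SPEC aggregate ratios with the geometric mean.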