Cloud computing: The cloud as a pool of shared hardware and software resources


Towards SLA-oriented Cloud Computing
Sara Bouchenak, INSA Lyon, Sara.Bouchenak@insa-lyon.fr
3rd Franco-American Workshop on CyberSecurity, December 8-10, 2014, Lyon, France

Cloud computing

The cloud as a pool of shared hardware and software resources
[Figure: a cloud of machines, each stacking middleware layers (e.g. application servers); operating systems, virtual machines; cpu, memory, disk]

The cloud as a means for distributed applications to:
- pick up required resources
- access infinite remote resources
- access on-demand resources (pay-as-you-go)
- benefit from transparent resource management

QoS, SLO, SLA

QoS: many quality-of-service criteria
- Performance (e.g. service response time, service throughput)
- Dependability, Availability (e.g. service abandon rate), Reliability
- Security, etc.
- Costs: energetic costs, financial costs

SLO: Service Level Objective for a given QoS criterion/metric
Examples:
- a target service level, e.g. a minimum service throughput, a maximum service abandon rate
- a service level interval
- service level maximization/minimization

SLA (Service Level Agreement)
- A contract between the service provider and the service customer
- Ideally, a combination of a set of SLOs and cost constraints
- Example: 99% of service requests are processed within 1s with a minimal energetic cost

QoS and SLA in clouds?

Some initiatives: Amazon EC2, Rackspace, 3tera clouds
- Restricted to a single QoS criterion: service unavailability due to computer failures
- Other QoS aspects not tackled (performance, security, energy, financial costs, etc.)

Ad-hoc and incomplete approaches
- Is the SLA guaranteed/violated by the cloud?
- E.g. Amazon EC2 customers must provide proof of cloud service unavailability: capture the failure, document it, and send the "proof" to Amazon within 30 days
- This goes against one of the main motivations of Cloud Computing: "hide complexity of resource management and provide simple access to cloud services by the customer"
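The example SLA above (99% of service requests processed within 1s) suggests how a violation could be detected automatically rather than documented by hand. The following is a minimal sketch in Python, assuming that request latencies have already been measured and are available as a list of seconds. The names LatencySLO, compliance_ratio and is_violated are hypothetical and are not part of the talk.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySLO:
    """Hypothetical SLO: at least `target_ratio` of the requests
    must be processed within `max_latency_s` seconds."""
    max_latency_s: float = 1.0   # "within 1s" in the slide example
    target_ratio: float = 0.99   # "99% of service requests"


def compliance_ratio(latencies_s: Sequence[float], slo: LatencySLO) -> float:
    """Fraction of measured requests that met the latency bound."""
    if not latencies_s:
        return 1.0  # assumption: no traffic means nothing was violated
    within = sum(1 for latency in latencies_s if latency <= slo.max_latency_s)
    return within / len(latencies_s)


def is_violated(latencies_s: Sequence[float], slo: LatencySLO) -> bool:
    """True when fewer requests than required met the latency bound."""
    return compliance_ratio(latencies_s, slo) < slo.target_ratio
```

A provider-side monitor could evaluate is_violated over a sliding window of recent requests and notify the customer, which is the kind of automation the slide argues is missing today.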

Open Challenges & Perspectives
- Multicriteria SLA: dependability, performance, cost, etc.
- Towards scalable and distributed SLA control
- Consider different applications and Big Data services

Objective 1: Define a new cloud model, the SLAaaS (SLA aware Service)
- Orthogonal to other cloud models (IaaS, PaaS, SaaS)
- A cloud presents, along with its service interface, a non-functional SLA interface
- Allows a customer to compare different cloud service providers regarding the service levels they provide

From the cloud provider perspective
- Multi-objective SLA between cloud provider and cloud customer
- Fully elastic cloud via dynamic resource re-allocation, reconfiguration
- Handle cloud dynamics, workload variations

From the cloud customer point of view
- Automatically notify the customer about SLA violation, energy footprint, etc.
- Stress/evaluate dependability and scalability of Big Data cloud services, real workloads

Related projects: AMADEOS project, MyCloud project

Objective 1: Define a new cloud model, the SLAaaS

Challenges
- Towards a control-theoretic approach [ACM OSR 2013]
- ConSer: Control of server systems [IEEE Trans. Comp. 2011]
- MoKa: Control of multi-tier distributed web systems [IGI Global, 2011]
- MoMap: Control of MapReduce systems [ACM CCGrid 2013]
- MRBS: Benchmarking framework for Hadoop MapReduce [IEEE SRDS 2012]

Challenges in autonomic reconfiguration of cloud services

1) Complexity: multiple service level objectives (SLOs)
- performance, availability, dependability, security, etc.
- Trade-offs between antagonistic objectives, e.g. at least 99% of client requests are admitted and processed (availability) within 1s (performance), with a minimal financial cost (cost)

2) From SLO to resource allocation/configuration
Amazon EC2 cloud, bime cloud application:

  Resources              Price
  Small instance         $0.085 per hour
  Large instance         $0.34 per hour
  Extra large instance   $0.68 per hour
  Cost = number of instances x unit price

  SLO                    Target
  Availability level     99% of requests are processed
  Performance level      requests processed within 1s
  Cost constraint        minimal cost

- Nontrivial SLO-to-resource allocations

3) Time-varying and nonlinear behavior
- Workload amount (#concurrent client requests)
[Figure: workload amount of the soccer World Cup Web Site [Arlitt et al., HP 99]]

A control-theoretic approach

[Figure: feedback control loop. Exogenous inputs and control knobs enter the target system; measured service levels and service costs are fed back.]
- Exogenous inputs: workload amount, workload mix
- Control knobs (i.e. resource allocations): cache size, server admission control, server provisioning, content quality level
- System outputs: cache hit ratio, service QoS, resource utilization, service differentiation ratio

Followed approach

(1) Utility: state the objective and capture the trade-off
- Multicriteria utility function

(2) Model: describe system behavior
- Relationship between allocated resources and service levels and costs
[Figure: system model. Inputs: exogenous variables, resource allocations. Outputs: predicted service levels, service costs.]

(3) Control: solve the system
- Calculate the (optimal) resource allocation
- Maximize the utility function
- Based on the model

(4) Implement the solution
- Translate the theoretical optimal solution into a concrete implementation
- Not trivial: automatically (re)determine the model's parameters
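The mapping from an SLO to a resource allocation, and the "cost = number of instances x unit price" rule above, can be illustrated with a small search. This is only a sketch under stated assumptions, not the method of the talk. The hourly prices come from the slide, but the per-instance capacities in CAPACITY_RPS are invented for illustration, and an exhaustive search stands in for the model-based control step. The function names are hypothetical.

```python
from itertools import product
from typing import Dict, Optional, Tuple

# Hourly prices taken from the slide (Amazon EC2 instance types).
PRICE_PER_HOUR = {"small": 0.085, "large": 0.34, "xlarge": 0.68}

# ASSUMPTION (not from the talk): requests per second that one instance
# of each type can serve while keeping latency within the 1s objective.
CAPACITY_RPS = {"small": 50, "large": 220, "xlarge": 450}


def hourly_cost(allocation: Dict[str, int]) -> float:
    """Cost = number of instances x unit price, summed over instance types."""
    return sum(PRICE_PER_HOUR[kind] * count for kind, count in allocation.items())


def predicted_capacity(allocation: Dict[str, int]) -> int:
    """Toy linear 'model' of the service level an allocation can sustain."""
    return sum(CAPACITY_RPS[kind] * count for kind, count in allocation.items())


def cheapest_allocation(
    workload_rps: float, max_per_type: int = 4
) -> Optional[Tuple[float, Dict[str, int]]]:
    """Exhaustively search for the cheapest allocation whose predicted
    capacity covers the workload; returns None if none is large enough."""
    kinds = list(PRICE_PER_HOUR)
    best: Optional[Tuple[float, Dict[str, int]]] = None
    for counts in product(range(max_per_type + 1), repeat=len(kinds)):
        allocation = dict(zip(kinds, counts))
        if predicted_capacity(allocation) < workload_rps:
            continue  # performance objective would not be met
        cost = hourly_cost(allocation)
        if best is None or cost < best[0]:
            best = (cost, allocation)
    return best
```

Even in this toy setting the cheapest choice changes with the workload, which hints at why the slides call the mapping nontrivial once the real behavior is nonlinear and time-varying.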

Objective 1: Define a new cloud model, the SLAaaS

Challenges
- Towards a control-theoretic approach [ACM OSR 2013]
- ConSer: Control of server systems [IEEE Trans. Comp. 2011], PhD L. Malrait
- MoKa: Control of multi-tier distributed web systems [IGI Global, 2011], PhD J. Arnaud
- MoMap: Control of MapReduce systems [ACM CCGrid 2013], PhD M. Berekmeri
- MRBS: Benchmarking framework for Hadoop MapReduce [IEEE SRDS 2012], PhD A. Sangroya

Control of server systems

Server admission control
- Prevent server thrashing, denial-of-service
- Multi-Programming Level (MPL)
[Figure: clients send requests to the admission control in front of the server; excess requests are rejected]

Classical configuration parameter in server systems
- Apache Web server's MaxClients
- MySQL database server's max_connections

Trade-off between server performance and availability
- Performance (client request latency)
- Availability (client request abandon rate)
- How to configure the server's MPL, trading off performance and availability?
- Experiments conducted with PostgreSQL database server, running TPC-C benchmark

Related work

Server admission control
- Ad-hoc techniques, heuristics [Menascé et al., EC 01]: best-effort behavior
- Linear models [Diao et al., NOMS 02] [Parekh et al., RTS 02]: cannot render the whole nonlinear behavior of server systems
- Nonlinear models based on queueing theory [Robertsson et al., CDC 04] [Tipper et al., JSAC 90] [Wang et al., INFOCOM 96]: multiple model parameters, hard to calibrate

Limitations
- Do not tackle full dynamics (workload types)
- Restricted to a single QoS aspect, SLO
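Admission control with a multi-programming level, as described above, amounts to capping the number of requests a server processes concurrently and rejecting the rest. The following is a minimal sketch of that mechanism in Python. The AdmissionController class and its method names are hypothetical; this does not reproduce how Apache, MySQL or PostgreSQL implement their limits.

```python
import threading


class AdmissionController:
    """Admits at most `mpl` concurrent requests; extra requests are rejected."""

    def __init__(self, mpl: int) -> None:
        if mpl < 1:
            raise ValueError("mpl must be at least 1")
        self._slots = threading.BoundedSemaphore(mpl)
        self._lock = threading.Lock()
        self.admitted = 0
        self.rejected = 0

    def try_admit(self) -> bool:
        """Non-blocking admission test: True if a slot was free."""
        ok = self._slots.acquire(blocking=False)
        with self._lock:
            if ok:
                self.admitted += 1
            else:
                self.rejected += 1
        return ok

    def release(self) -> None:
        """Call once the admitted request has been fully processed."""
        self._slots.release()

    def abandon_rate(self) -> float:
        """Fraction of requests rejected so far (0.0 before any request)."""
        with self._lock:
            total = self.admitted + self.rejected
            return self.rejected / total if total else 0.0
```

A low MPL keeps latency down but raises the abandon rate, while a high MPL does the opposite, which is the performance versus availability trade-off named on the slide.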

ConSer*: Control of server systems

[Figure: feedback loop. Objective on latency and abandon rate: L <= L_max and α as small as possible. Inputs: workload amount N and workload mix M. AM-C controlled target server. Outputs: L (latency), α (abandon rate).]

(1) Utility: state the objective and capture the trade-off
AM-C (availability-maximizing objective)
- (P1) average client request latency does not exceed a given L_max
- (P2) and abandon rate is made as small as possible
Other objectives: PM-C, PA-AM-C, AA-PM-C

* L. Malrait, S. Bouchenak, N. Marchand. Experience with ConSer: A System for Server Control Through Fluid Modeling. IEEE Transactions on Computers, 60(7), 2011.
In collaboration with the NeCS INRIA research group on Control Theory

ConSer modeling

(2) Nonlinear fluid modeling
[Figure: server model. Labels: state variables, exogenous inputs, control input, outputs; workload amount N, workload mix M, incoming throughput T_i; admitted requests, throughput of processed requests, request abandon rate, request latency; outputs L (latency), α (abandon rate).]

ConSer control

(3) Control of the server's MPL
AM-C (availability-maximizing control)
- (P1) average client request latency does not exceed a given L_max
- (P2) and abandon rate is made as small as possible
- Control gain γ > 0
- If L > L_max => (P1) => a decreased value of N_e
- If L < L_max => (P1) & possibly (P2) => an increased value of N_e
- Efficient control: O(1)

ConSer AM-C control evaluation
- Performance improved by up to 30%
- Experiments conducted with PostgreSQL database server running TPC-C benchmark, AM-C control law, L_max = 8s
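The qualitative behavior of the AM-C control described above (lower the admission level when L > L_max, raise it when L < L_max, with a gain γ > 0 and O(1) work per step) can be sketched as a simple proportional update. The exact ConSer control law is not recoverable from this transcription, so the update rule below is an assumption chosen only to match that qualitative description. The function name amc_step and the bounds mpl_min and mpl_max are hypothetical.

```python
def amc_step(
    mpl: int,
    latency_s: float,
    l_max_s: float,
    gamma: float = 1.0,
    mpl_min: int = 1,
    mpl_max: int = 1000,
) -> int:
    """One O(1) control step on the admission level.

    ASSUMED update rule (not the published ConSer law):
        next = mpl - gamma * (latency - l_max)
    so the level drops when latency exceeds L_max (protecting P1) and
    grows when latency is below L_max (admitting more requests, P2).
    """
    if gamma <= 0:
        raise ValueError("gamma must be strictly positive")
    proposed = mpl - gamma * (latency_s - l_max_s)
    # Clamp to a valid range and return an integer admission level.
    return int(min(mpl_max, max(mpl_min, round(proposed))))
```

With L_max = 8s as in the experiments, a measured latency of 10s and γ = 2 would lower an admission level of 50 by 4, while a measured latency of 6s would raise it by 4.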

Big Data Systems - MapReduce

MapReduce
- A popular programming model
- A runtime environment on a cluster of commodity computers
- Automatic data partitioning, data replication, task scheduling, fault tolerance

Big Data applications
- A wide range of applications: log analysis, data mining, web search engines, scientific computing, business intelligence, etc.
- Big companies use it: Amazon, eBay, Facebook, LinkedIn, Twitter, Yahoo!, etc.

Motivation

Lots of work to improve MapReduce dependability and performance
- New fault-tolerance models [Costa, CloudCom 11]
- Replication and partitioning policies [Ananthanarayanan, EuroSys 11] [Eltabakh, VLDB 11]
- Scheduling policies [Zaharia, OSDI 08] [Isard, SOSP 09] [Zaharia, EuroSys 10]
- Cost-based optimization [Herodotou, VLDB 11]
- Resource provisioning [Verma, Middleware 11]

Most evaluations use micro-benchmarks
- Not representative of full distributed, concurrent applications
- Not representative of realistic workloads
- No dependability benchmarking

MRBS objectives

Empirical evaluation of dependability and performance of MapReduce
- Fault-tolerance
- Scalability

MRBS characteristics

Variety of application domains, workloads and dataloads
- Compute-oriented vs. data-oriented applications
- Batch applications vs. real-time applications

Variety of Big Data workloads and faultloads
- Various workloads, dataloads
- Different fault models
- Different fault rates

Portable and easy to use on a wide range of clouds
- Different cloud infrastructures

* A. Sangroya, D. Serrano, S. Bouchenak. Benchmarking Dependability of MapReduce Systems. The 31st IEEE Int. Symp. on Reliable Distributed Systems (SRDS 2012), Irvine, CA, Oct. 2012.

Use-case: Comparing two MapReduce frameworks w.r.t. performance & dependability

How does Hadoop 1.0 compare to Hadoop 0.20 w.r.t. performance?
- Response time with Hadoop 1.0: up to +40%
- Throughput with Hadoop 1.0: up to -42%
- Experiments conducted on a ten-node Hadoop cluster

How does Hadoop 1.0 compare to Hadoop 0.20 w.r.t. dependability?
- Fewer failed jobs with Hadoop 1.0
- Experiments conducted on a ten-node Hadoop cluster

Use-case: Comparing two MapReduce frameworks w.r.t. performance & dependability

How does Hadoop 1.0 compare to Hadoop 0.20 w.r.t. dependability?
- Fewer I/O failures with Hadoop 1.0
- Experiments conducted on a ten-node Hadoop cluster

Conclusion & Perspectives
- Multicriteria SLA by design
- Different applications


More information

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand

Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand P. Balaji, K. Vaidyanathan, S. Narravula, K. Savitha, H. W. Jin D. K. Panda Network Based

More information

Apache Hadoop. Alexandru Costan

Apache Hadoop. Alexandru Costan 1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open

More information

Cloud Computing using MapReduce, Hadoop, Spark

Cloud Computing using MapReduce, Hadoop, Spark Cloud Computing using MapReduce, Hadoop, Spark Benjamin Hindman benh@cs.berkeley.edu Why this talk? At some point, you ll have enough data to run your parallel algorithms on multiple computers SPMD (e.g.,

More information

Application Performance in the Cloud

Application Performance in the Cloud Application Performance in the Cloud Understanding and ensuring application performance in highly elastic environments Albert Mavashev, CTO Nastel Technologies, Inc. amavashev@nastel.com What is Cloud?

More information

A Survey of Cloud Computing Guanfeng Octides

A Survey of Cloud Computing Guanfeng Octides A Survey of Cloud Computing Guanfeng Nov 7, 2010 Abstract The principal service provided by cloud computing is that underlying infrastructure, which often consists of compute resources like storage, processors,

More information

This paper defines as "Classical"

This paper defines as Classical Principles of Transactional Approach in the Classical Web-based Systems and the Cloud Computing Systems - Comparative Analysis Vanya Lazarova * Summary: This article presents a comparative analysis of

More information

Cloud Design and Implementation. Cheng Li MPI-SWS Nov 9 th, 2010

Cloud Design and Implementation. Cheng Li MPI-SWS Nov 9 th, 2010 Cloud Design and Implementation Cheng Li MPI-SWS Nov 9 th, 2010 1 Modern Computing CPU, Mem, Disk Academic computation Chemistry, Biology Large Data Set Analysis Online service Shopping Website Collaborative

More information

Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks

Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks Praveenkumar Kondikoppa, Chui-Hui Chiu, Cheng Cui, Lin Xue and Seung-Jong Park Department of Computer Science,

More information

Elastic VM for Rapid and Optimum Virtualized

Elastic VM for Rapid and Optimum Virtualized Elastic VM for Rapid and Optimum Virtualized Resources Allocation Wesam Dawoud PhD. Student Hasso Plattner Institute Potsdam, Germany 5th International DMTF Academic Alliance Workshop on Systems and Virtualization

More information

4/6/2009 CLOUD COMPUTING : PART I WHY IS CLOUD COMPUTING DISTINCT? INTRODUCTION: CONTINUE A PERSPECTIVE STUDY

4/6/2009 CLOUD COMPUTING : PART I WHY IS CLOUD COMPUTING DISTINCT? INTRODUCTION: CONTINUE A PERSPECTIVE STUDY CLOUD COMPUTING : A PERSPECTIVE STUDY PART I BACKGROUND AND CONCEPTS Guannang Wang YingFeng Wang Qi Li INTRODUCTION: Coined in late of 2007 Currently emerges as a hot topic due to its abilities to offer

More information

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing Deep Mann ME (Software Engineering) Computer Science and Engineering Department Thapar University Patiala-147004

More information

MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT

MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT 1 SARIKA K B, 2 S SUBASREE 1 Department of Computer Science, Nehru College of Engineering and Research Centre, Thrissur, Kerala 2 Professor and Head,

More information

Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus

Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus International Symposium on Grid Computing 2009 (Taipei) Christian Baun The cooperation of and Universität Karlsruhe (TH) Agenda

More information

Scaling Database Performance in Azure

Scaling Database Performance in Azure Scaling Database Performance in Azure Results of Microsoft-funded Testing Q1 2015 2015 2014 ScaleArc. All Rights Reserved. 1 Test Goals and Background Info Test Goals and Setup Test goals Microsoft commissioned

More information

White Paper. Cloud Native Advantage: Multi-Tenant, Shared Container PaaS. http://wso2.com Version 1.1 (June 19, 2012)

White Paper. Cloud Native Advantage: Multi-Tenant, Shared Container PaaS. http://wso2.com Version 1.1 (June 19, 2012) Cloud Native Advantage: Multi-Tenant, Shared Container PaaS Version 1.1 (June 19, 2012) Table of Contents PaaS Container Partitioning Strategies... 03 Container Tenancy... 04 Multi-tenant Shared Container...

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.

More information

Evaluation Methodology of Converged Cloud Environments

Evaluation Methodology of Converged Cloud Environments Krzysztof Zieliński Marcin Jarząb Sławomir Zieliński Karol Grzegorczyk Maciej Malawski Mariusz Zyśk Evaluation Methodology of Converged Cloud Environments Cloud Computing Cloud Computing enables convenient,

More information

CLOUD COMPUTING. When It's smarter to rent than to buy

CLOUD COMPUTING. When It's smarter to rent than to buy CLOUD COMPUTING When It's smarter to rent than to buy Is it new concept? Nothing new In 1990 s, WWW itself Grid Technologies- Scientific applications Online banking websites More convenience Not to visit

More information

27 th March 2015 Istanbul, Turkey. Performance Testing Best Practice

27 th March 2015 Istanbul, Turkey. Performance Testing Best Practice 27 th March 2015 Istanbul, Turkey Performance Testing Best Practice Your Host.. Ian Molyneaux Leads the Intechnica performance team More years in IT than I care to remember Author of The Art of Application

More information

Fault-Tolerant Application Placement in Heterogeneous Cloud Environments. Bart Spinnewyn, prof. Steven Latré

Fault-Tolerant Application Placement in Heterogeneous Cloud Environments. Bart Spinnewyn, prof. Steven Latré Fault-Tolerant Application Placement in Heterogeneous Cloud Environments Bart Spinnewyn, prof. Steven Latré Cloud Application Placement Problem (CAPP) Application Placement admission control: decide on

More information

Evaluating HDFS I/O Performance on Virtualized Systems

Evaluating HDFS I/O Performance on Virtualized Systems Evaluating HDFS I/O Performance on Virtualized Systems Xin Tang xtang@cs.wisc.edu University of Wisconsin-Madison Department of Computer Sciences Abstract Hadoop as a Service (HaaS) has received increasing

More information

Infrastructure as a Service (IaaS)

Infrastructure as a Service (IaaS) Infrastructure as a Service (IaaS) (ENCS 691K Chapter 4) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ References 1. R. Moreno et al.,

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Cloud Computing: a Perspective Study Lizhe WANG, Gregor von LASZEWSKI, Younge ANDREW, Xi HE Service Oriented Cyberinfrastruture Lab, Rochester Inst. of Tech. Abstract The Cloud computing emerges as a new

More information

Grid Computing Vs. Cloud Computing

Grid Computing Vs. Cloud Computing International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 577-582 International Research Publications House http://www. irphouse.com /ijict.htm Grid

More information

Big Data Analytics - Accelerated. stream-horizon.com

Big Data Analytics - Accelerated. stream-horizon.com Big Data Analytics - Accelerated stream-horizon.com StreamHorizon & Big Data Integrates into your Data Processing Pipeline Seamlessly integrates at any point of your your data processing pipeline Implements

More information

MapReduce and Hadoop Distributed File System V I J A Y R A O

MapReduce and Hadoop Distributed File System V I J A Y R A O MapReduce and Hadoop Distributed File System 1 V I J A Y R A O The Context: Big-data Man on the moon with 32KB (1969); my laptop had 2GB RAM (2009) Google collects 270PB data in a month (2007), 20000PB

More information

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo High Availability for Database Systems in Cloud Computing Environments Ashraf Aboulnaga University of Waterloo Acknowledgments University of Waterloo Prof. Kenneth Salem Umar Farooq Minhas Rui Liu (post-doctoral

More information

Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds

Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds Variations in Performance and Scalability when Migrating n-tier Applications to Different Clouds Deepal Jayasinghe, Simon Malkowski, Qingyang Wang, Jack Li, Pengcheng Xiong, Calton Pu Outline Motivation

More information

Towards an understanding of oversubscription in cloud

Towards an understanding of oversubscription in cloud IBM Research Towards an understanding of oversubscription in cloud Salman A. Baset, Long Wang, Chunqiang Tang sabaset@us.ibm.com IBM T. J. Watson Research Center Hawthorne, NY Outline Oversubscription

More information

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture DATA MINING WITH HADOOP AND HIVE Introduction to Architecture Dr. Wlodek Zadrozny (Most slides come from Prof. Akella s class in 2014) 2015-2025. Reproduction or usage prohibited without permission of

More information

Cloud computing doesn t yet have a

Cloud computing doesn t yet have a The Case for Cloud Computing Robert L. Grossman University of Illinois at Chicago and Open Data Group To understand clouds and cloud computing, we must first understand the two different types of clouds.

More information

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing)

Web Email DNS Peer-to-peer systems (file sharing, CDNs, cycle sharing) 1 1 Distributed Systems What are distributed systems? How would you characterize them? Components of the system are located at networked computers Cooperate to provide some service No shared memory Communication

More information

Performance Prediction, Sizing and Capacity Planning for Distributed E-Commerce Applications

Performance Prediction, Sizing and Capacity Planning for Distributed E-Commerce Applications Performance Prediction, Sizing and Capacity Planning for Distributed E-Commerce Applications by Samuel D. Kounev (skounev@ito.tu-darmstadt.de) Information Technology Transfer Office Abstract Modern e-commerce

More information

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...

More information

The Hidden Extras. The Pricing Scheme of Cloud Computing. Stephane Rufer

The Hidden Extras. The Pricing Scheme of Cloud Computing. Stephane Rufer The Hidden Extras The Pricing Scheme of Cloud Computing Stephane Rufer Cloud Computing Hype Cycle Definition Types Architecture Deployment Pricing/Charging in IT Economics of Cloud Computing Pricing Schemes

More information

CHAPTER 8 CLOUD COMPUTING

CHAPTER 8 CLOUD COMPUTING CHAPTER 8 CLOUD COMPUTING SE 458 SERVICE ORIENTED ARCHITECTURE Assist. Prof. Dr. Volkan TUNALI Faculty of Engineering and Natural Sciences / Maltepe University Topics 2 Cloud Computing Essential Characteristics

More information

Duke University http://www.cs.duke.edu/starfish

Duke University http://www.cs.duke.edu/starfish Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University http://www.cs.duke.edu/starfish Practitioners of Big Data Analytics Google Yahoo! Facebook ebay Physicists Biologists Economists

More information

Extending Hadoop beyond MapReduce

Extending Hadoop beyond MapReduce Extending Hadoop beyond MapReduce Mahadev Konar Co-Founder @mahadevkonar (@hortonworks) Page 1 Bio Apache Hadoop since 2006 - committer and PMC member Developed and supported Map Reduce @Yahoo! - Core

More information