Cloud Federation to Elastically Increase MapReduce Processing Resources

1 Cloud Federation to Elastically Increase MapReduce Processing Resources. A. Panarello, A. Celesti, M. Villari, M. Fazio and A. Puliafito {apanarello,acelesti, mfazio, mvillari, DICIEAMA, University of Messina, Contrada di Dio, S. Agata, Messina, Italy. The second international FedICI'2014 workshop: Federative and interoperable cloud infrastructures

2 Outline: Cloud federation introduction; how Cloud federation can elastically increase providers' MapReduce resources; case study: a video transcoding service; system prototype (Hadoop, CLEVER, Amazon S3); main factors involved in job submission; conclusion and future work.

3 Toward Cloud Federation. Currently, only the major cloud providers (e.g., Amazon, Google, Rackspace) hold big datacenters, i.e., virtualization infrastructures. Small cloud providers cannot directly compete with these market leaders: they have to buy services from these mega-providers. The largest share of the business is in the hands of the mega-providers. Possible solution: Cloud Federation.

4 Evolution of the Cloud Ecosystem: independent clouds, cloud federation. Cloud federation: a mesh of cloud providers that are interconnected to provide a universal decentralized computing environment where everything is driven by constraints and agreements in a ubiquitous, multi-provider infrastructure. Different distributed services (e.g., IaaS, PaaS, SaaS). One of the main challenges: minimizing the barriers to delivering services across different administrative domains.

5 Why Federate Cloud Providers? Multiple reasons: clouds can benefit from a market in which they can buy/sell resources; a cloud has saturated its own resources and needs external assets; a cloud needs particular types of services or resources that it does not hold; a cloud wants to perform software consolidation in order to save energy costs; a cloud wants to move part of its processing to other providers (e.g., for security, performance, or the deployment of particular location-dependent services); and so on.

6 Motivation. MapReduce is a programming model for processing and generating large data sets with a parallel, distributed algorithm on a cluster. The major MapReduce frameworks are not cloud-like: they are often not resilient; they often do not scale up/down; they often require manual configuration. Objectives: make a MapReduce framework cloud-like; investigate the main concerns regarding job submission in a federated cloud environment.

7 MapReduce Distributed Processing in Cloud Federation: A Reference Scenario (1). Actors: multiple Cloud Providers (CPs), each one running a MapReduce system in its administrative domain; a public Cloud Storage Provider (CSP), offering storage services and supporting multi-part data download; clients, each one submitting a parallel processing request (job) to a particular CP (i.e., the home CP). The input data is stored in a CSP (e.g., Amazon S3, Dropbox, Drive) to minimize the transmission overhead between federated CPs.

8 MapReduce Distributed Processing in Cloud Federation: A Reference Scenario (2). The client contacts the home CP that offers a particular parallel processing service and submits a job (specifying where the data is stored and how to process it). The home CP establishes a federation with other foreign CPs and sends them sub-job instructions. Exploiting the multi-part download, each federated CP downloads chunks of data and processes them using its local MapReduce system. Each federated CP uploads the output to the CSP and sends a notification to the home CP. Finally, the client merges the processed chunks (if required) and reads the whole output.
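
The chunk assignment behind the multi-part download can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `split_byte_ranges` and `range_header` are hypothetical, and it assumes the home CP divides the input into contiguous byte ranges that each federated CP requests with an HTTP `Range` header. A real video splitter would also have to respect container and keyframe boundaries, which the slides do not detail.

```python
def split_byte_ranges(total_size, n_parts):
    """Split total_size bytes into n_parts contiguous (start, end) ranges.

    Both bounds are inclusive, matching HTTP Range semantics.
    Hypothetical helper: the slides do not say how chunks are cut.
    """
    if total_size <= 0 or n_parts <= 0:
        raise ValueError("total_size and n_parts must be positive")
    n_parts = min(n_parts, total_size)  # never create empty ranges
    base, extra = divmod(total_size, n_parts)
    ranges = []
    start = 0
    for i in range(n_parts):
        # The first `extra` parts get one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start, end):
    """Format one range as the value of an HTTP Range header."""
    return f"bytes={start}-{end}"
```

Each federated CP would then receive one range as part of its sub-job instructions and download only that part of the object from the CSP.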

9 A Video Transcoding Use Case. A user would like to watch a movie that is stored in a CSP using his/her mobile phone. Unfortunately, the movie is stored as an HD file and the user's device is not able to play it. Thus, the client submits a video transcoding job to a particular home CP to reduce the resolution of the movie. The job submission specifies where the input movie is stored and how to process it. The home CP establishes a federation with other foreign CPs, submitting a sub-job to each of them. Each foreign CP downloads a chunk of the file, processes it, uploads it to the CSP, and sends a notification to the home CP. Once the home CP has received all the notifications, it generates a SMIL file, i.e., an XML file that allows the video to be played without merging the chunks. The home CP uploads the SMIL file to the CSP. The client is then able to play the movie.
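
As a rough illustration of the SMIL step, the sketch below builds a minimal SMIL document that plays the transcoded chunks one after another. The slides do not show the actual file, so the structure here (a single `seq` of `video` elements), the function name `build_smil`, and the chunk URLs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_smil(chunk_urls):
    """Return a minimal SMIL document that plays the chunks in sequence.

    Hypothetical helper: the real file generated by the home CP may
    carry more metadata (layout, durations, and so on).
    """
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")  # children of <seq> play in order
    for url in chunk_urls:
        ET.SubElement(seq, "video", {"src": url})
    return ET.tostring(smil, encoding="unicode")


# Placeholder URLs, one per processed chunk uploaded to the CSP.
document = build_smil([
    "https://csp.example/movie/chunk-0.mp4",
    "https://csp.example/movie/chunk-1.mp4",
])
```

Because the player follows the sequence in the SMIL file, the client never has to concatenate the chunks itself.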

10 System Prototype (1). System components: Hadoop as the MapReduce framework; CLEVER as the middleware that makes Hadoop cloud-like, with federation capabilities in CPs; Amazon S3 as the public CSP. Hadoop master/slave architecture: it consists of a single master JobTracker node and several slave TaskTracker nodes. To speed up processing, it supports a distributed file system, i.e., HDFS, whose Name and Data nodes are typically deployed on the same nodes that run the JobTracker and the TaskTrackers, respectively.

11 System Prototype (2). CLEVER: the CLoud-Enabled Virtual EnviRonment (CLEVER) is a Message-Oriented Middleware for Cloud computing (MOM4C) that makes it possible to arrange federated cloud systems. A Cluster Manager (CM) acts as the interface with the client and manages several Host Managers (HMs). Inter-module communication takes place by means of Multi-User Chat (MUC) over XMPP. Pluggable architecture: agents can be added to control third-party components (sensor networks, virtualization, parallel processing, storage, etc.).

12 System Prototype (3). Advantages of integrating Hadoop in CLEVER: typically, Hadoop uses the TCP/IP layer for communication, so firewalls can block inter-domain communication. Solution: by integrating Hadoop in CLEVER, communication can be sent on port 80 thanks to XMPP. The system can automatically scale. The two main software agents, Hadoop Master Node (HMN) and Hadoop Slave Node (HSN), run in the CM and the HM, respectively. Two possible configurations: HM with HSN in physical hosts (PHs) or in VMs (more resilient).

13 Experiments (1). Objective: understanding the main concerns regarding job submission in the federated cloud environment. The processing time of a Hadoop cluster was out of the scope of this paper (many works are available in the literature). Testbed specification: 4 CLEVER/Hadoop administrative domains (i.e., A, B, C, and D) deployed on 4 servers; CPU: Intel(R) Core(TM)2 CPU 6300, 1.86GHz, 3GB RAM, running Linux Ubuntu x86_64 and VirtualBox; overall system deployed in 10 VMs (1 VM in domain A and 3 in each of domains B, C, and D); Amazon S3. Each experiment was repeated 50 times in order to consider mean values and confidence intervals.
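
The slides do not say how the confidence intervals were computed. As one possible approach, the sketch below uses a normal approximation, which is a common choice for around 50 repetitions; the function name `mean_confidence_interval` and the 95% level are assumptions, not details taken from the experiments.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation interval.

    Assumed method: the slides only state that mean values and
    confidence intervals were considered.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = statistics.fmean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev / math.sqrt(n)
    return mean, half_width
```

The interval is then `mean - half_width` to `mean + half_width`, computed separately for each measured phase of the timeline.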

14 Experiments (2). Timeline: T0, a client submits a video transcoding job to the home CP. T1, the home CP that receives the request decides to establish a federation with the other CPs, retrieving domain information. T2, the home CP performs a job assignment involving the whole federated environment: by means of the job tracker it creates the video transcoding job and assigns the sub-jobs to the other federated domains. T3, each involved federated CP downloads only particular video chunks from Amazon S3 using the multipart download mechanism. T4, each CP uploads the previously downloaded video chunks to the HDFS of its local domain for processing.

15 Experiments (3). The average time required to retrieve domain information (t1-t0) and to forward the request in parallel to the federated CPs (t2-t1) is roughly 5 seconds.

16 Experiments (3). Distribution histogram of the mean times required to download 20MB, 10MB, and 7MB blocks from Amazon S3 (t3-t2) in each CP, considering one administrative domain. The summary distribution histogram shows that, thanks to the federation, increasing the number of administrative domains reduces the download time, because each domain downloads a smaller chunk.

17 Experiments (4). The average upload time of chunks to the HDFS in each domain (t4-t3) changes according to the number of active Data Nodes and the video file size. Increasing the number of Hadoop Data Nodes increases the upload time too. This trend can be explained by recalling that Hadoop was configured with a redundancy parameter equal to 2. In fact, with a single active Data Node the upload time is very low, because the system does not need to replicate the file. Due to Hadoop's data replication mechanisms, increasing the number of Data Nodes, we can notice a linear increase of the upload
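
The difference between one and two active Data Nodes can be illustrated with a simple count of the bytes HDFS has to store. This is a simplified model and not a measurement: it assumes only that HDFS cannot place more replicas of a block than there are Data Nodes, and the function name `hdfs_bytes_written` is hypothetical. It accounts for the step from one node to two, not for any further growth with more nodes.

```python
def hdfs_bytes_written(file_size, replication, data_nodes):
    """Bytes stored across the cluster for one uploaded file.

    Simplified model: each block is stored on at most one replica per
    Data Node, so the effective replication is capped by the node count.
    """
    if file_size < 0 or replication < 1 or data_nodes < 1:
        raise ValueError("invalid arguments")
    effective_replication = min(replication, data_nodes)
    return file_size * effective_replication
```

With the redundancy parameter set to 2, a single Data Node stores the file once, while two or more Data Nodes store it twice, so the upload involves twice the data.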

18 Conclusion and Future Work. The main result has been understanding how a MapReduce parallel processing system can be deployed in a federated cloud environment. Experiments highlighted the overhead of the system in job submission. In future work we plan to integrate resource provisioning policies to make the establishment of federation relationships between CPs more flexible. For those interested in CLEVER, a guide on how to use the middleware and how to develop agents is available on the official web site.

19 Questions?


Geoff Raines Cloud Engineer Geoff Raines Cloud Engineer Approved for Public Release; Distribution Unlimited. 13-2170 2013 The MITRE Corporation. All rights reserved. Why are P & I important for DoD cloud services? Improves the end-to-end

More information

Dutch HPC Cloud: flexible HPC for high productivity in science & business

Dutch HPC Cloud: flexible HPC for high productivity in science & business Dutch HPC Cloud: flexible HPC for high productivity in science & business Dr. Axel Berg SARA national HPC & e-science Support Center, Amsterdam, NL April 17, 2012 4 th PRACE Executive Industrial Seminar,

More information

HSCLOUD: CLOUD ARCHITECTURE FOR SUPPORTING HOMELAND SECURITY

HSCLOUD: CLOUD ARCHITECTURE FOR SUPPORTING HOMELAND SECURITY INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS, VOL. 5, NO. 1, MARCH 2012 HSCLOUD: CLOUD ARCHITECTURE FOR SUPPORTING HOMELAND SECURITY M. Fazio, M. Paone, A. Puliafito and M. Villari Faculty

More information

Sriram Krishnan, Ph.D. sriram@sdsc.edu

Sriram Krishnan, Ph.D. sriram@sdsc.edu Sriram Krishnan, Ph.D. sriram@sdsc.edu (Re-)Introduction to cloud computing Introduction to the MapReduce and Hadoop Distributed File System Programming model Examples of MapReduce Where/how to run MapReduce

More information

Apache Hadoop new way for the company to store and analyze big data

Apache Hadoop new way for the company to store and analyze big data Apache Hadoop new way for the company to store and analyze big data Reyna Ulaque Software Engineer Agenda What is Big Data? What is Hadoop? Who uses Hadoop? Hadoop Architecture Hadoop Distributed File

More information

Clearing Away the Clouds: What is the Future of Cloud Computing? BEBO WHITE PEWE WORKSHOP BRATISLAVA APRIL 2010

Clearing Away the Clouds: What is the Future of Cloud Computing? BEBO WHITE PEWE WORKSHOP BRATISLAVA APRIL 2010 Clearing Away the Clouds: What is the Future of Cloud Computing? BEBO WHITE PEWE WORKSHOP BRATISLAVA APRIL 2010 The Top 10 Strategic Technologies for 2010 Gartner Report 1 Cloud Computing 2 Advanced Analytics

More information

Evaluating MapReduce and Hadoop for Science

Evaluating MapReduce and Hadoop for Science Evaluating MapReduce and Hadoop for Science Lavanya Ramakrishnan LRamakrishnan@lbl.gov Lawrence Berkeley National Lab Computation and Data are critical parts of the scientific process Three Pillars of

More information

Building Out Your Cloud-Ready Solutions. Clark D. Richey, Jr., Principal Technologist, DoD

Building Out Your Cloud-Ready Solutions. Clark D. Richey, Jr., Principal Technologist, DoD Building Out Your Cloud-Ready Solutions Clark D. Richey, Jr., Principal Technologist, DoD Slide 1 Agenda Define the problem Explore important aspects of Cloud deployments Wrap up and questions Slide 2

More information

The OpenNebula Standard-based Open -source Toolkit to Build Cloud Infrastructures

The OpenNebula Standard-based Open -source Toolkit to Build Cloud Infrastructures Jornadas Técnicas de RedIRIS 2009 Santiago de Compostela 27th November 2009 The OpenNebula Standard-based Open -source Toolkit to Build Cloud Infrastructures Distributed Systems Architecture Research Group

More information

Business applications:

Business applications: Consorzio COMETA - Progetto PI2S2 UNIONE EUROPEA Business applications: the COMETA approach Prof. Antonio Puliafito University of Messina Open Grid Forum (OGF25) Catania, 2-6.03.2009 www.consorzio-cometa.it

More information

Key Research Challenges in Cloud Computing

Key Research Challenges in Cloud Computing 3rd EU-Japan Symposium on Future Internet and New Generation Networks Tampere, Finland October 20th, 2010 Key Research Challenges in Cloud Computing Ignacio M. Llorente Head of DSA Research Group Universidad

More information

Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics

Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics Dharmendra Agawane 1, Rohit Pawar 2, Pavankumar Purohit 3, Gangadhar Agre 4 Guide: Prof. P B Jawade 2

More information

A.Prof. Dr. Markus Hagenbuchner markus@uow.edu.au. CSCI319 A Brief Introduction to Cloud Computing. CSCI319 Page: 1

A.Prof. Dr. Markus Hagenbuchner markus@uow.edu.au. CSCI319 A Brief Introduction to Cloud Computing. CSCI319 Page: 1 A.Prof. Dr. Markus Hagenbuchner markus@uow.edu.au CSCI319 A Brief Introduction to Cloud Computing CSCI319 Page: 1 Content and Objectives 1. Introduce to cloud computing 2. Develop and understanding to

More information

Introduction to OpenStack

Introduction to OpenStack Introduction to OpenStack Carlo Vallati PostDoc Reseracher Dpt. Information Engineering University of Pisa carlo.vallati@iet.unipi.it Cloud Computing - Definition Cloud Computing is a term coined to refer

More information

CS380 Final Project Evaluating the Scalability of Hadoop in a Real and Virtual Environment

CS380 Final Project Evaluating the Scalability of Hadoop in a Real and Virtual Environment CS380 Final Project Evaluating the Scalability of Hadoop in a Real and Virtual Environment James Devine December 15, 2008 Abstract Mapreduce has been a very successful computational technique that has

More information

Linux/Open Source and Cloud computing Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering

Linux/Open Source and Cloud computing Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering Linux/Open Source and Cloud computing Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering NIST Definition of Cloud Computing Cloud computing is a model for enabling convenient, on-demand

More information

Grid and Cloud Computing at LRZ Dr. Helmut Heller, Group Leader Distributed Resources Group

Grid and Cloud Computing at LRZ Dr. Helmut Heller, Group Leader Distributed Resources Group Grid and Cloud Computing at LRZ Dr. Helmut Heller, Group Leader Distributed Resources Group Overview Grid: http://www.grid.lrz.de What is Grid computing? Advantages of Grid computing (why you should use

More information

Unleash the IaaS Cloud About VMware vcloud Director and more VMUG.BE June 1 st 2012

Unleash the IaaS Cloud About VMware vcloud Director and more VMUG.BE June 1 st 2012 Unleash the IaaS Cloud About VMware vcloud Director and more VMUG.BE June 1 st 2012 2 Who? Viktor van den Berg Consultant @ PQR Former Dutch VMUG Leader Blogger at www.viktorious.nl Twitter @viktoriousss

More information

Session 3. the Cloud Stack, SaaS, PaaS, IaaS

Session 3. the Cloud Stack, SaaS, PaaS, IaaS Session 3. the Cloud Stack, SaaS, PaaS, IaaS The service models resemble a cascading architecture where services on a higher level, as identified by Weinhardt et.al. (2009); encapsulate functionality from

More information

Analysing Large Web Log Files in a Hadoop Distributed Cluster Environment

Analysing Large Web Log Files in a Hadoop Distributed Cluster Environment Analysing Large Files in a Hadoop Distributed Cluster Environment S Saravanan, B Uma Maheswari Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham,

More information

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014 White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page

More information

CLOUD STORAGE USING HADOOP AND PLAY

CLOUD STORAGE USING HADOOP AND PLAY 27 CLOUD STORAGE USING HADOOP AND PLAY Devateja G 1, Kashyap P V B 2, Suraj C 3, Harshavardhan C 4, Impana Appaji 5 1234 Computer Science & Engineering, Academy for Technical and Management Excellence

More information

Cloud Computing through Virtualization and HPC technologies

Cloud Computing through Virtualization and HPC technologies Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC

More information

Cloud Computing Technology

Cloud Computing Technology Cloud Computing Technology The Architecture Overview Danairat T. Certified Java Programmer, TOGAF Silver danairat@gmail.com, +66-81-559-1446 1 Agenda What is Cloud Computing? Case Study Service Model Architectures

More information

An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce.

An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce. An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce. Amrit Pal Stdt, Dept of Computer Engineering and Application, National Institute

More information

Performance Optimization of a Distributed Transcoding System based on Hadoop for Multimedia Streaming Services

Performance Optimization of a Distributed Transcoding System based on Hadoop for Multimedia Streaming Services RESEARCH ARTICLE Adv. Sci. Lett. 4, 400 407, 2011 Copyright 2011 American Scientific Publishers Advanced Science Letters All rights reserved Vol. 4, 400 407, 2011 Printed in the United States of America

More information

Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud

Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud Aditya Jadhav, Mahesh Kukreja E-mail: aditya.jadhav27@gmail.com & mr_mahesh_in@yahoo.co.in Abstract : In the information industry,

More information

24/11/14. During this course. Internet is everywhere. Frequency barrier hit. Management costs increase. Advanced Distributed Systems Cloud Computing

24/11/14. During this course. Internet is everywhere. Frequency barrier hit. Management costs increase. Advanced Distributed Systems Cloud Computing Advanced Distributed Systems Cristian Klein Department of Computing Science Umeå University During this course Treads in IT Towards a new data center What is Cloud computing? Types of Clouds Making applications

More information