Big Data in the Cloud: Computational Challenges and The Way Forward


1 Big Data in the Cloud: Computational Challenges and The Way Forward. Marcos Vaz Salles, Assistant Professor, University of Copenhagen (DIKU).

2 About the Speaker: Marcos Vaz Salles. Tenure-track Assistant Professor, University of Copenhagen (DIKU); Postdoc: Cornell University; PhD: ETH Zurich. Mission: Find creative ways to expand the reach of the 30+ years of top-level R&D invested in database technology, broadly defined. Examples: Database techniques for scientific simulations, games, search and integration, geospatial data.

3 Where does your most important data live?

4 Where does your most important data live? DATABASES!

5 Historical Justification for Databases

6 Historical Justification for Databases. Common applications: record maintenance, banking, government. Complex implementation: concurrency, integrity, durability, storage, representation, ... Enough abstraction: operating systems virtualize low-level hardware. Competing platforms: no virtualization of platform (IBM, DEC, ...). Layer stack, top to bottom: Data-Driven Applications; Data Sharing (DBMS); Virtualization (Operating Systems); Platforms (Hardware).

7 Historical Justification for Databases. Common applications: record maintenance, banking, government. Complex implementation: concurrency, integrity, durability, storage, representation, ... Enough abstraction: operating systems virtualize low-level hardware. Competing platforms: no virtualization of platform (IBM, DEC, ...). Layer stack, top to bottom: Data-Driven Applications; Data Sharing (DBMS); Virtualization (Operating Systems); Platforms (Hardware). But the Cloud today is completely different?!

8 The Cloud Today. Common applications: Big Data, data warehousing, SOA. Complex implementation: data consistency and management, distribution, scalability, fault tolerance. Enough abstraction: Cloud IaaS virtualizes enormous clusters of machines. Competing platforms: no virtualization of platform (Amazon, Microsoft, ...). Layer stack, top to bottom: Data-Driven Applications; Data Sharing (????); Virtualization (Cloud IaaS); Platforms (Cloud Datacenter).

9 The Cloud Today. Common applications: Big Data, data warehousing, SOA. Complex implementation: data consistency and management, distribution, scalability, fault tolerance. Enough abstraction: Cloud IaaS virtualizes enormous clusters of machines. Competing platforms: no virtualization of platform (Amazon, Microsoft, ...). Layer stack, top to bottom: Data-Driven Applications; Data Sharing (????); Virtualization (Cloud IaaS); Platforms (Cloud Datacenter). Challenge: What should be the new Data Sharing Abstraction in the Cloud?

10 From Databases to Dataclouds. While there were databases in the past, we will have dataclouds in the future. Databases: Database Management System (DBMS). Dataclouds: Datacloud Management System (DCMS). Emerging application systems already being built! But at high cost, constrained by skills gap, and with fewer features than desired. Roadblock in the way of Software as a Service (SaaS) and Big Data revolutions.

11 Challenges in Dataclouds and DCMS. Programming, programming, programming. Resources, resources, resources. Scale, scale, scale.

12 Challenges in Dataclouds and DCMS. Programming, programming, programming: Re-use or create new programming abstractions? How to incorporate data into software engineering? Resources, resources, resources: How to deal with virtualized environments and abstract cost? Scale, scale, scale: How to scale applications to petabytes automatically?

13 ClouDiA: A Cloud Deployment Advisor. Initial work on deployment of latency-sensitive data services in public clouds: simulation analytics (e.g., multiagent simulations), search engines, key-value stores. Acknowledgment: Joint work with Tao Zou, Ronan Le Bras, Alan Demers, and Johannes Gehrke at Cornell University. Best paper nominee at VLDB 2013. Talk includes slides by Tao Zou.

14 Instance Allocation in Public Clouds. Cloud Provider's View; Cloud Tenant's View.

15 Instance Allocation in Public Clouds. Cloud Provider's View. [Figure: data center network hierarchy with Core, Aggregation, Sub-Aggregation, and TOR switch layers]

16 Network Latencies in Public Clouds. [Figure: Mean Latency Heterogeneity in EC2] Challenge: How to avoid long links? [Figure: Mean Latency Stability in EC2] Challenge: Can this be done out-of-the-box (i.e., with no changes to the cloud or the application)?

17 Examples of Latency-Sensitive Applications. Longest Link: Scientific Simulation (time-to-solution), Key-value Store (response time). Longest Path: Search Aggregation (response time), Service Pipelines (response time). Opportunity: Communication graphs are not complete. A careful logical-to-physical mapping can help! Opportunity: Gamble with over-allocation.

18 Architecture of ClouDiA. Tenant inputs: Communication Graph, Objectives. ClouDiA steps against the Public Cloud: Allocate Instances (+ Extra Instances); Get Measurements; Search Mapping; produce Deployment Plan; Start Application; Terminate Extra Instances.

19 Measuring Network Distance. Goal: obtain a reliable estimate of link costs. Approximations that are easy to obtain (hop counts, length of common IP prefix) do not work. Accurate distance: pair-wise network latencies. Large number of measurements for each pair, to observe enough latency jitter. Interference at end points: heavily application and network dependent. Can't model exactly, so measure without interference.

20 Measuring Network Latencies Without Interference. Most efficient method: staged (Stage i, Stage i+1). No concurrent send/recv at end points. Parallelism with minimal coordination.
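The staged idea on this slide can be illustrated with a small scheduling sketch. It is only an illustration under assumptions: the slide does not say how stages are built, so the classic round-robin (circle) schedule is swapped in here, and the function name `staged_pairs` is hypothetical. Each stage contains disjoint pairs, so no instance sends to or receives from more than one peer within a stage.

```python
def staged_pairs(nodes):
    """Split all unordered node pairs into stages of disjoint pairs.

    Assumption: uses the round-robin (circle) method, which is one
    possible way to get "no concurrent send/recv at end points".
    """
    idle = object()              # placeholder partner when the node count is odd
    ring = list(nodes)
    if len(ring) % 2 == 1:
        ring.append(idle)
    n = len(ring)
    stages = []
    for _ in range(n - 1):
        stage = []
        for i in range(n // 2):
            a, b = ring[i], ring[n - 1 - i]
            if a is not idle and b is not idle:
                stage.append((a, b))   # a and b measure each other in this stage
        stages.append(stage)
        # Keep the first slot fixed and rotate the remaining slots by one.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(staged_pairs(["i0", "i1", "i2", "i3"])):
        print("stage", number, stage)
```

With an even number of instances every instance is busy in every stage, which is where the parallelism comes from; a coordinator only has to announce the start of each stage.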

21 Architecture of ClouDiA. Tenant inputs: Communication Graph, Objectives. ClouDiA steps against the Public Cloud: Allocate Instances (+ Extra Instances); Get Measurements; Search Mapping; produce Deployment Plan; Start Application; Terminate Extra Instances.

22 Search Mapping (Longest Link). NP-hard: to find a solution of cost = OPT; to find a solution of cost <= OPT + α; to find a solution of cost <= (1+ε)OPT. Hard to approximate. Goal: find a good solution within timeout. Two formulations: Mixed-Integer Programming, O(N^2) boolean variables; Constraint Programming (*), O(N) integer variables. (*) An objective cost has to be given a priori.
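To make the longest-link objective concrete, here is a minimal sketch that evaluates a mapping and finds the optimum by exhaustive search on a toy instance. It is not the MIP or CP formulation from the talk: brute force is only viable for a handful of nodes, which is consistent with the problem being NP-hard. The latency matrix, the chain-shaped communication graph, and the function names are all made up for illustration.

```python
from itertools import permutations


def longest_link_cost(edges, mapping, latency):
    """Cost of a deployment: the slowest link used by any communication edge."""
    return max(latency[mapping[u]][mapping[v]] for u, v in edges)


def brute_force_deployment(num_nodes, edges, latency):
    """Try every injective mapping of logical nodes to instances (toy sizes only)."""
    best_cost, best_mapping = None, None
    for mapping in permutations(range(len(latency)), num_nodes):
        cost = longest_link_cost(edges, mapping, latency)
        if best_cost is None or cost < best_cost:
            best_cost, best_mapping = cost, mapping
    return best_cost, best_mapping


# Hypothetical symmetric latencies between 4 allocated instances (arbitrary units).
LATENCY = [
    [0, 5, 1, 9],
    [5, 0, 2, 1],
    [1, 2, 0, 7],
    [9, 1, 7, 0],
]
# Hypothetical communication graph: a chain of 3 logical nodes, 0 - 1 - 2.
CHAIN = [(0, 1), (1, 2)]

if __name__ == "__main__":
    cost, mapping = brute_force_deployment(3, CHAIN, LATENCY)
    print("longest link:", cost, "mapping:", mapping)
```

Because there are four instances for three logical nodes, the search is free to leave the worst-connected instance unused, which is the effect over-allocation is meant to exploit.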

23 Experimental Settings. 100 to 150 m1.large instances in EC2 (aggregation workload only uses up to 50). IBM ILOG CPLEX Optimizer/CP Optimizer; multi-core, a single machine. Three workloads: behavioral simulation, synthetic aggregation query, key-value store.

24 Effect of Over-Allocation. 100 instances + 10%-50% over-allocation. Get Measurements + Search Deployment < 10 minutes.

25 Overall Improvement. With 10% over-allocation: 15%-55% reduction in time-to-solution or response time.

26 Wrap-up. Big Data in Dataclouds: Programming, programming, programming; Resources, resources, resources; Scale, scale, scale. ClouDiA: an initial step in resource optimization in public clouds. Next steps: Build a datacloud! Tons of research challenges open. Ongoing collaborations with Danish Geodata Agency (GST) and the HIPERFIT center. Collaborate with us too! Thank you!

27 Backup Slides

28 Searching using Constraint Programming. Give an objective c: 1. Remove all links with cost > c. 2. Find a subgraph isomorphism. [Figure: search over cost thresholds with timeouts; k=4] A lot of distinct latency values. Bi-section search? Finding no solution takes time. k-means clustering? Works well with proper k.
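The two steps on this slide can be sketched with a plain backtracking search. This is a stand-in for illustration only: the talk uses a constraint programming solver, whereas the hypothetical `find_embedding` below simply tries instances one by one and treats any link with cost above `c` as removed. The latency matrix and communication graph are made-up toy data.

```python
def find_embedding(num_nodes, edges, latency, c):
    """Map logical nodes 0..num_nodes-1 to distinct instances so that every
    communication edge uses a link with cost <= c. Returns a list or None."""
    neighbors = {u: [] for u in range(num_nodes)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    mapping = [None] * num_nodes
    used = [False] * len(latency)

    def place(u):
        if u == num_nodes:
            return True                       # every logical node is placed
        for inst in range(len(latency)):
            if used[inst]:
                continue
            # Links costlier than c count as removed (step 1 on the slide).
            if all(mapping[v] is None or latency[inst][mapping[v]] <= c
                   for v in neighbors[u]):
                mapping[u], used[inst] = inst, True
                if place(u + 1):
                    return True
                mapping[u], used[inst] = None, False   # backtrack
        return False

    return list(mapping) if place(0) else None


# Hypothetical symmetric latencies between 4 instances and a 3-node chain.
LATENCY = [
    [0, 5, 1, 9],
    [5, 0, 2, 1],
    [1, 2, 0, 7],
    [9, 1, 7, 0],
]
CHAIN = [(0, 1), (1, 2)]

if __name__ == "__main__":
    for c in (1, 2):
        print("c =", c, "->", find_embedding(3, CHAIN, LATENCY, c))
```

An infeasible threshold forces the search to exhaust all placements before it can report failure, which matches the slide's remark that finding no solution takes time and motivates running each threshold under a timeout.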

29 Summary of Node Deployment. Objectives: minimize cost of worst link; minimize cost of longest path. Optimization methods: akin to graph embedding problem, but with minimization goals; mixed-integer programming (MIP) formulation for both objectives; constraint programming (CP) formulation also for worst link; greedy easy to beat. Network measurements: staged message exchange to measure costs. More details in the paper!
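The second objective can be illustrated in the same spirit. The sketch below assumes that the cost of a path is the sum of the link latencies along it and that the communication graph is an aggregation tree given as a parent-to-children dictionary; neither detail is spelled out on the slide, and all names and numbers are hypothetical.

```python
def longest_path_cost(children, root, mapping, latency):
    """Largest summed link latency on any root-to-leaf path of the tree."""
    def below(u):
        # A leaf contributes 0; otherwise take the most expensive child branch.
        return max(
            (latency[mapping[u]][mapping[v]] + below(v)
             for v in children.get(u, [])),
            default=0,
        )
    return below(root)


# Hypothetical symmetric latencies between 4 instances (arbitrary units).
LATENCY = [
    [0, 5, 1, 9],
    [5, 0, 2, 1],
    [1, 2, 0, 7],
    [9, 1, 7, 0],
]
# Hypothetical aggregation tree: node 0 aggregates 1 and 2; node 1 aggregates 3.
TREE = {0: [1, 2], 1: [3]}

if __name__ == "__main__":
    for mapping in ([0, 1, 2, 3], [2, 1, 0, 3]):
        print(mapping, "->", longest_path_cost(TREE, 0, mapping, LATENCY))
```

Unlike the worst-link objective, a single slow link can be tolerated here if the rest of its path is fast, so the two objectives can prefer different deployments of the same instances.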

30 Experiments with ClouDiA on Amazon EC2: Workloads & Setup. Behavioral simulation: fish simulation by Couzin et al., Nature; 2D mesh; 100 Amazon EC2 large instances; Minimize Worst Link objective. Synthetic aggregation workload: models search engines, distributed text databases; multi-level aggregation tree; 50 Amazon EC2 large instances; Minimize Longest Path objective. Key-value store workload: bipartite graph of front-end servers and storage servers; 100 Amazon EC2 large instances; Minimize Worst Link objective used, but not a perfect fit.

Data Sharing in the Cloud: Scaling to the World, Unleashing Creativity, and Generating Value?

Data Sharing in the Cloud: Scaling to the World, Unleashing Creativity, and Generating Value? Data Sharing in the Cloud: Scaling to the World, Unleashing Creativity, and Generating Value? Marcos Vaz Salles Assistant Professor, University of Copenhagen (DIKU) About the Speaker Marcos Vaz Salles

More information

ClouDiA: A Deployment Advisor for Public Clouds

ClouDiA: A Deployment Advisor for Public Clouds ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles, Alan Demers, Johannes Gehrke Cornell University University of Copenhagen Ithaca, NY Copenhagen, Denmark {taozou,

More information

ClouDiA: a deployment advisor for public clouds. Tao Zou, Ronan Le Bras, Marcos Vaz Salles, Alan Demers & Johannes Gehrke

ClouDiA: a deployment advisor for public clouds. Tao Zou, Ronan Le Bras, Marcos Vaz Salles, Alan Demers & Johannes Gehrke ClouDiA: a deployment advisor for public clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles, Alan Demers & Johannes Gehrke The VLDB Journal The International Journal on Very Large Data Bases ISSN 66-8888

More information

OPTIMIZING RESPONSE TIME FOR DISTRIBUTED APPLICATIONS IN PUBLIC CLOUDS

OPTIMIZING RESPONSE TIME FOR DISTRIBUTED APPLICATIONS IN PUBLIC CLOUDS OPTIMIZING RESPONSE TIME FOR DISTRIBUTED APPLICATIONS IN PUBLIC CLOUDS A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for

More information

Data Management in the Cloud. Zhen Shi

Data Management in the Cloud. Zhen Shi Data Management in the Cloud Zhen Shi Overview Introduction 3 characteristics of cloud computing 2 types of cloud data management application 2 types of cloud data management architecture Conclusion Introduction

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Cloud Computing I (intro) 15 319, spring 2010 2 nd Lecture, Jan 14 th Majd F. Sakr Lecture Motivation General overview on cloud computing What is cloud computing Services

More information

Big Data Analytics. Chances and Challenges. Volker Markl

Big Data Analytics. Chances and Challenges. Volker Markl Volker Markl Professor and Chair Database Systems and Information Management (DIMA), Technische Universität Berlin www.dima.tu-berlin.de Big Data Analytics Chances and Challenges Volker Markl DIMA BDOD

More information

Webpage: www.ijaret.org Volume 3, Issue XI, Nov. 2015 ISSN 2320-6802

Webpage: www.ijaret.org Volume 3, Issue XI, Nov. 2015 ISSN 2320-6802 An Effective VM scheduling using Hybrid Throttled algorithm for handling resource starvation in Heterogeneous Cloud Environment Er. Navdeep Kaur 1 Er. Pooja Nagpal 2 Dr.Vinay Guatum 3 1 M.Tech Student,

More information

ABSTRACT: [Type text] Page 2109

ABSTRACT: [Type text] Page 2109 International Journal Of Scientific Research And Education Volume 2 Issue 10 Pages-2109-2115 October-2014 ISSN (e): 2321-7545 Website: http://ijsae.in ABSTRACT: Database Management System as a Cloud Computing

More information

Cloud Computing and Advanced Relationship Analytics

Cloud Computing and Advanced Relationship Analytics Cloud Computing and Advanced Relationship Analytics Using Objectivity/DB to Discover the Relationships in your Data By Brian Clark Vice President, Product Management Objectivity, Inc. 408 992 7136 brian.clark@objectivity.com

More information

Multilevel Communication Aware Approach for Load Balancing

Multilevel Communication Aware Approach for Load Balancing Multilevel Communication Aware Approach for Load Balancing 1 Dipti Patel, 2 Ashil Patel Department of Information Technology, L.D. College of Engineering, Gujarat Technological University, Ahmedabad 1

More information

Beyond the Stars: Revisiting Virtual Cluster Embeddings

Beyond the Stars: Revisiting Virtual Cluster Embeddings Beyond the Stars: Revisiting Virtual Cluster Embeddings Matthias Rost Technische Universität Berlin September 7th, 2015, Télécom-ParisTech Joint work with Carlo Fuerst, Stefan Schmid Published in ACM SIGCOMM

More information

INTRODUCTION TO CASSANDRA

INTRODUCTION TO CASSANDRA INTRODUCTION TO CASSANDRA This ebook provides a high level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open

More information

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Mining Large Datasets: Case of Mining Graph Data in the Cloud Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large

More information

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000 Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000 Alexandra Carpen-Amarie Diana Moise Bogdan Nicolae KerData Team, INRIA Outline

More information

Enterprise Application Integration (Middleware)

Enterprise Application Integration (Middleware) Enterprise Application Integration (Middleware) Gustavo Alonso Systems Group Computer Science Department - ETH Zurich alonso@inf.ethz.ch http://www.systems.inf.ethz.ch/ EAI Course Administration Lecture:

More information

PostgreSQL Performance Characteristics on Joyent and Amazon EC2

PostgreSQL Performance Characteristics on Joyent and Amazon EC2 OVERVIEW In today's big data world, high performance databases are not only required but are a major part of any critical business function. With the advent of mobile devices, users are consuming data

More information

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Introduction

More information

The Sierra Clustered Database Engine, the technology at the heart of

The Sierra Clustered Database Engine, the technology at the heart of A New Approach: Clustrix Sierra Database Engine The Sierra Clustered Database Engine, the technology at the heart of the Clustrix solution, is a shared-nothing environment that includes the Sierra Parallel

More information

2) Xen Hypervisor 3) UEC

2) Xen Hypervisor 3) UEC 5. Implementation Implementation of the trust model requires first preparing a test bed. It is a cloud computing environment that is required as the first step towards the implementation. Various tools

More information

PaaS Cloud Migration Migration Process, Architecture Problems and Solutions. Claus Pahl and Huanhuan Xiong

PaaS Cloud Migration Migration Process, Architecture Problems and Solutions. Claus Pahl and Huanhuan Xiong PaaS Cloud Migration Migration Process, Architecture Problems and Solutions Claus Pahl and Huanhuan Xiong Cloud Migration Motivation HOW TO MIGRATE TO CLOUD IaaS PaaS SaaS Cloud Migration Definition A

More information

A1 and FARM scalable graph database on top of a transactional memory layer

A1 and FARM scalable graph database on top of a transactional memory layer A1 and FARM scalable graph database on top of a transactional memory layer Miguel Castro, Aleksandar Dragojević, Dushyanth Narayanan, Ed Nightingale, Alex Shamis Richie Khanna, Matt Renzelmann Chiranjeeb

More information

From Spark to Ignition:

From Spark to Ignition: From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

Graph Database Proof of Concept Report

Graph Database Proof of Concept Report Objectivity, Inc. Graph Database Proof of Concept Report Managing The Internet of Things Table of Contents Executive Summary 3 Background 3 Proof of Concept 4 Dataset 4 Process 4 Query Catalog 4 Environment

More information

Cloud Based Distributed Databases: The Future Ahead

Cloud Based Distributed Databases: The Future Ahead Cloud Based Distributed Databases: The Future Ahead Arpita Mathur Mridul Mathur Pallavi Upadhyay Abstract Fault tolerant systems are necessary to be there for distributed databases for data centers or

More information

Task Scheduling in Hadoop

Task Scheduling in Hadoop Task Scheduling in Hadoop Sagar Mamdapure Munira Ginwala Neha Papat SAE,Kondhwa SAE,Kondhwa SAE,Kondhwa Abstract Hadoop is widely used for storing large datasets and processing them efficiently under distributed

More information

From Internet Data Centers to Data Centers in the Cloud

From Internet Data Centers to Data Centers in the Cloud From Internet Data Centers to Data Centers in the Cloud This case study is a short extract from a keynote address given to the Doctoral Symposium at Middleware 2009 by Lucy Cherkasova of HP Research Labs

More information

WOLKEN KOSTEN GELD GUSTAVO ALONSO SYSTEMS GROUP ETH ZURICH WWW.SYSTEMS.ETHZ.CH

WOLKEN KOSTEN GELD GUSTAVO ALONSO SYSTEMS GROUP ETH ZURICH WWW.SYSTEMS.ETHZ.CH WOLKEN KOSTEN GELD GUSTAVO ALONSO SYSTEMS GROUP ETH ZURICH WWW.SYSTEMS.ETHZ.CH ELCA Update June 16, 2010, Gustavo Alonso About the speaker Professor of Computer Science at ETH Zurich Areas of interest:

More information

Divy Agrawal and Amr El Abbadi Department of Computer Science University of California at Santa Barbara

Divy Agrawal and Amr El Abbadi Department of Computer Science University of California at Santa Barbara Divy Agrawal and Amr El Abbadi Department of Computer Science University of California at Santa Barbara Sudipto Das (Microsoft summer intern) Shyam Antony (Microsoft now) Aaron Elmore (Amazon summer intern)

More information

Report Data Management in the Cloud: Limitations and Opportunities

Report Data Management in the Cloud: Limitations and Opportunities Report Data Management in the Cloud: Limitations and Opportunities Article by Daniel J. Abadi [1] Report by Lukas Probst January 4, 2013 In this report I want to summarize Daniel J. Abadi's article [1]

More information

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1 CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level -ORACLE TIMESTEN 11gR1 CASE STUDY Oracle TimesTen In-Memory Database and Shared Disk HA Implementation

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Overview of Cloud Computing (ENCS 691K Chapter 1) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ Overview of Cloud Computing Towards a definition

More information

High Performance Applications over the Cloud: Gains and Losses

High Performance Applications over the Cloud: Gains and Losses High Performance Applications over the Cloud: Gains and Losses Dr. Leila Ismail Faculty of Information Technology United Arab Emirates University leila@uaeu.ac.ae http://citweb.uaeu.ac.ae/citweb/profile/leila

More information

Introduction to Database Systems CSE 444. Lecture 24: Databases as a Service

Introduction to Database Systems CSE 444. Lecture 24: Databases as a Service Introduction to Database Systems CSE 444 Lecture 24: Databases as a Service CSE 444 - Spring 2009 References Amazon SimpleDB Website Part of the Amazon Web services Google App Engine Datastore Website

More information

SDN and Data Center Networks

SDN and Data Center Networks SDN and Data Center Networks 10/9/2013 1 The Rise of SDN The Current Internet and Ethernet Network Technology is based on Autonomous Principle to form a Robust and Fault Tolerant Global Network (Distributed)

More information

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html Datacenters and Cloud Computing Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html What is Cloud Computing? A model for enabling ubiquitous, convenient, ondemand network

More information

BEDIFFERENT A C E 2 0 1 2 I N T E R N A T I O N A L

BEDIFFERENT A C E 2 0 1 2 I N T E R N A T I O N A L Copyright 2012 Aras. All Rights Reserved. BEDIFFERENT A C E 2 0 1 2 I N T E R N A T I O N A L Copyright 2012 Aras. All Rights Reserved. ACE 2012 I N TERNATIONAL Leveraging the Cloud Rob McAveney Director

More information

How to Do/Evaluate Cloud Computing Research. Young Choon Lee

How to Do/Evaluate Cloud Computing Research. Young Choon Lee How to Do/Evaluate Cloud Computing Research Young Choon Lee Cloud Computing Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

Scientific and Technical Applications as a Service in the Cloud

Scientific and Technical Applications as a Service in the Cloud Scientific and Technical Applications as a Service in the Cloud University of Bern, 28.11.2011 adapted version Wibke Sudholt CloudBroker GmbH Technoparkstrasse 1, CH-8005 Zurich, Switzerland Phone: +41

More information

White Paper on NETWORK VIRTUALIZATION

White Paper on NETWORK VIRTUALIZATION White Paper on NETWORK VIRTUALIZATION INDEX 1. Introduction 2. Key features of Network Virtualization 3. Benefits of Network Virtualization 4. Architecture of Network Virtualization 5. Implementation Examples

More information

Advanced Computer Networks. Scheduling

Advanced Computer Networks. Scheduling Oriana Riva, Department of Computer Science ETH Zürich Advanced Computer Networks 263-3501-00 Scheduling Patrick Stuedi, Qin Yin and Timothy Roscoe Spring Semester 2015 Outline Last time Load balancing

More information

EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications

EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications ECE6102 Dependable Distribute Systems, Fall2010 EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications Deepal Jayasinghe, Hyojun Kim, Mohammad M. Hossain, Ali Payani

More information

Cloud Computing Services and its Application

Cloud Computing Services and its Application Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 4, Number 1 (2014), pp. 107-112 Research India Publications http://www.ripublication.com/aeee.htm Cloud Computing Services and its

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Dr Markus Hagenbuchner markus@uow.edu.au CSCI319 Introduction to Cloud Computing CSCI319 Chapter 1 Page: 1 of 10 Content and Objectives 1. Introduce to cloud computing 2. Develop and understanding to how

More information

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction Vol. 3 Issue 1, January-2014, pp: (1-5), Impact Factor: 1.252, Available online at: www.erpublications.com Performance evaluation of cloud application with constant data center configuration and variable

More information

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica jcampbell@vertica.com Big

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

Emerging Technology for the Next Decade

Emerging Technology for the Next Decade Emerging Technology for the Next Decade Cloud Computing Keynote Presented by Charles Liang, President & CEO Super Micro Computer, Inc. What is Cloud Computing? Cloud computing is Internet-based computing,

More information

Upgrading to Microsoft SQL Server 2008 R2 from Microsoft SQL Server 2008, SQL Server 2005, and SQL Server 2000

Upgrading to Microsoft SQL Server 2008 R2 from Microsoft SQL Server 2008, SQL Server 2005, and SQL Server 2000 Upgrading to Microsoft SQL Server 2008 R2 from Microsoft SQL Server 2008, SQL Server 2005, and SQL Server 2000 Your Data, Any Place, Any Time Executive Summary: More than ever, organizations rely on data

More information

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3

More information

Lecture 26 Enterprise Internet Computing 1. Enterprise computing 2. Enterprise Internet computing 3. Natures of enterprise computing 4.

Lecture 26 Enterprise Internet Computing 1. Enterprise computing 2. Enterprise Internet computing 3. Natures of enterprise computing 4. Lecture 26 Enterprise Internet Computing 1. Enterprise computing 2. Enterprise Internet computing 3. Natures of enterprise computing 4. Platforms High end solutions Microsoft.Net Java technology 1 Enterprise

More information

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved. Object Storage: A Growing Opportunity for Service Providers Prepared for: White Paper 2012 Neovise, LLC. All Rights Reserved. Introduction For service providers, the rise of cloud computing is both a threat

More information

ECE6130 Grid and Cloud Computing

ECE6130 Grid and Cloud Computing ECE6130 Grid and Cloud Computing Howie Huang Department of Electrical and Computer Engineering School of Engineering and Applied Science Cloud Computing Hardware Software Outline Research Challenges 2

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING Journal homepage: http://www.journalijar.com INTERNATIONAL JOURNAL OF ADVANCED RESEARCH RESEARCH ARTICLE CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING R.Kohila

More information

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING A SURVEY ON MAPREDUCE IN CLOUD COMPUTING Dr.M.Newlin Rajkumar 1, S.Balachandar 2, Dr.V.Venkatesakumar 3, T.Mahadevan 4 1 Asst. Prof, Dept. of CSE,Anna University Regional Centre, Coimbatore, newlin_rajkumar@yahoo.co.in

More information

International Journal of Engineering Research & Management Technology

International Journal of Engineering Research & Management Technology International Journal of Engineering Research & Management Technology March- 2015 Volume 2, Issue-2 Survey paper on cloud computing with load balancing policy Anant Gaur, Kush Garg Department of CSE SRM

More information
