Expanding the Horizons of Cloud Computing Beyond the Data Center. Jon Weissman Department of CS&E University of Minnesota

Transcription:

Expanding the Horizons of Cloud Computing Beyond the Data Center Jon Weissman Department of CS&E University of Minnesota Open Cirrus October 2011

Key Trends
  Client technology: devices (smartphones, tablets, sensors)
  Big data: distributed and dynamic
  Privacy/trust: small circles
  Multiple DCs: global services

Minnesota Cloud Research
  Eye towards cloud evolution
  Projects: Nebula, Mobile cloud, DMapReduce, Proxy cloud, Active cloud storage, Virtualization

Big Data Trend
  Big data is distributed
    earth science: weather data, seismic data
    life science: GenBank, PubMed
    health science: GoogleEarth + CDC pandemic data
    web 2.0: user multimedia blogs
    everyone is a sensor
  Cost in moving data to the cloud

Big Data Trend: Nebulas
  Process data close by, fully and/or en route to the central cloud
    cost: save time and money
    privacy (think: patient records, a local Google Doc)
  Close by: network distance, trusted peers

Example: Dispersed-Data-Intensive Services
  Data is geographically distributed
  Costly and inefficient to move to a central location

Example Instance: Blog Analysis
  [Diagram labels: blog1, blog2, blog3]

Nebula
  Decentralized, less-managed cloud
  Dispersed storage/compute resources
  Low user cost

Nebula: A New Cloud Model
  Make the cloud more distributed
    exploit the rich collection of edge computers
    volunteers (P2P, @home), commercial (CDNs)
    enormous computing potential, network diversity
    lower latency: on demand, native client sandboxing
  [Diagram label: Nebula Central]

Example: Blog Analysis
  [Diagram labels: blog1, blog2, blog3]

Blog Results
  [Figure: time taken (sec) versus number of blogs (40 to 320), comparing the Amazon emulator with the Nebula testbed]

Current Status
  Prototype running on PlanetLab
  Chrome browser clients + native client
  Distributed data-store service
  Network dashboard service

Nebula Going Forward
  Organize Nebulas around trusted peers
    social groups
    communities of interest
    local resources
  Nebula + commercial cloud
    use the edge opportunistically

Nebula Going Forward
  Hybrid paradigms: early discard, fan-in

Mobility Trend: Mobile Cloud
  Mobile users/applications: smartphones, tablets
    resource limited: power, CPU, memory
    applications are becoming sophisticated
  Improve mobile user experience
    performance, reliability, fidelity
    tap into the cloud dynamically based on current resource state, preferences, interests, privacy

Cloud Mobile Opportunity
  Dynamic outsourcing: move computation and data to the cloud dynamically
  User context: exploit user behavior to pre-fetch, pre-compute, cache
  Multi-user sharing: discover implicit cloud sharing based on interests, social ties

Outsourcing
  Partitioning across (Mobile <-> Cloud)
    local data capture + cloud processing
    images/video, speech, digital design, augmented reality
  [Architecture diagram labels: cloud end (EC2): Server (x5), Dalvik, Code repository, Proxy; mobile end (Android): Outsourcing Client, Application Profiler, Outsourcing Controller]
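The outsourcing decision named on this slide can be illustrated with a small sketch. This is not the authors' Outsourcing Controller: the `TaskProfile` fields, the `should_outsource` function, and the rule itself (offload when network transfer plus cloud execution is estimated to beat local execution) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical profiler output for one task (all fields are assumptions)."""
    local_seconds: float   # estimated run time on the mobile device
    cloud_seconds: float   # estimated run time on a cloud server
    payload_bytes: int     # data that must be shipped to the cloud


def should_outsource(task: TaskProfile,
                     bandwidth_bytes_per_s: float,
                     rtt_seconds: float = 0.0) -> bool:
    """Return True when offloading is estimated to finish sooner than local execution."""
    transfer_seconds = task.payload_bytes / bandwidth_bytes_per_s
    remote_total = rtt_seconds + transfer_seconds + task.cloud_seconds
    return remote_total < task.local_seconds


# Example: a 2 MB image that takes 10 s locally and 1 s in the cloud.
sharpen = TaskProfile(local_seconds=10.0, cloud_seconds=1.0, payload_bytes=2_000_000)
on_fast_link = should_outsource(sharpen, bandwidth_bytes_per_s=1_000_000)
on_slow_link = should_outsource(sharpen, bandwidth_bytes_per_s=100_000)
```

Under these assumed numbers the task is offloaded on the faster link and kept local on the slower one. A real controller would also weigh power, privacy, and user preferences, which this sketch leaves out.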

Experimental Results: Image Sharpening
  Response time: both Wi-Fi and 3G, speedup of up to 27
  Power consumption: savings of up to 9 times
  [Charts: Avg. Time, Avg. Power]

Current Work
  Optimizations
    User patterns => speculation, aggregation, data reduction
  Cloud-side
    Cross-user patterns => sharing: VM provisioning, data placement, data re-use

Big Data Trend + Multi-DCs: DMapReduce
  [Diagram labels: Input Data, Data Push, Map, Reduce, Output Data]

Wide-Area MapReduce
  Big data is distributed
    earth science: weather data, seismic data
    life science: GenBank, PubMed
    health science: GoogleEarth, pandemic data
    web 2.0: user multimedia blogs
  Data in different data centers
  Run MapReduce across them
  Data flow spanning wide-area networks

Option 1: Local MapReduce
  [Diagram labels: Data Source (US), Data Source (EU), Data Center (US), Data Center (EU), Data Push (Fast), Data Push (Slow), MapReduce Job, Final Result]

Option 2: Global MapReduce
  [Diagram labels: Data Source (US), Data Source (EU), Data Center (US), Data Center (EU), Data Push (Fast) x2, Data Push (Slow) x2, MapReduce Job, Final Result]

Option 3: Distributed MapReduce
  [Diagram labels: Data Source (US), Data Source (EU), Data Center (US), Data Center (EU), Data Push (Fast) x2, MapReduce Job x2, Combine Results, Final Result]
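The Distributed MapReduce option (one job per data center, followed by a combine step) can be sketched with a toy word count. This is an in-memory illustration only, not the DMapReduce implementation; the function names `local_wordcount` and `combine`, and the sample documents, are hypothetical.

```python
from collections import Counter


def local_wordcount(documents):
    """Stand-in for a MapReduce job run inside one data center on its own data."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())  # map: emit words; reduce: sum per word
    return counts


def combine(partial_results):
    """Merge the per-data-center outputs into the final result."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


# Each data center only sees the data pushed from its nearby source.
us_docs = ["the cloud is big", "big data"]
eu_docs = ["the edge is close", "data is distributed"]

final = combine([local_wordcount(us_docs), local_wordcount(eu_docs)])
```

Only the per-data-center counts cross the wide-area link in this sketch, which is why this option benefits when the job aggregates its input into a much smaller output.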

PlanetLab: WordCount (text data)
  [Figure: time in seconds for Local MR, Global MR, and Distributed MR across the phases Push US, Push EU, Map, Reduce, Result Combine, and Total]
  Distributed MR works best in the presence of data aggregation

PlanetLab: WordCount (random data)
  [Figure: time in seconds for Local MR, Global MR, and Distributed MR across the phases Push US, Push EU, Map, Reduce, Result Combine, and Total]
  Local MR works best in the presence of data ballooning

EC2 Results
  [Figure, three panels: WordCount (Text), WordCount (Random), and Sort; each shows time in seconds for Local MR and Distributed MR across the phases Push US, Push EU, Map, Reduce, Result Combine, and Total]

Intelligent Data Placement for Global Services
  HDFS push: local node, same rack, random rack
  Resource topology: /DC i /rack A /node X
  Application characteristics: data expansion factors
    input -> intermediate: α
    intermediate -> output: β
  Data placement, scheduling
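To show how the expansion factors α and β might inform a placement choice, here is a deliberately simple sketch. The cost model is an assumption, not the authors' scheduler: it counts only the bytes that cross the wide-area link, comparing a Local MR plan (ship the remote raw input) with a Distributed MR plan (ship the remote job's output, taken as input × α × β). The function names are hypothetical.

```python
def wide_area_bytes(remote_input_bytes, alpha, beta):
    """Estimated bytes crossing the wide-area link under each plan (assumed model)."""
    return {
        # Local MR: the remote source pushes its raw input to the single data center.
        "local_mr": remote_input_bytes,
        # Distributed MR: the remote data center ships only its job output.
        "distributed_mr": remote_input_bytes * alpha * beta,
    }


def cheaper_plan(remote_input_bytes, alpha, beta):
    """Pick the plan with the least wide-area traffic; ties go to local_mr."""
    costs = wide_area_bytes(remote_input_bytes, alpha, beta)
    return min(costs, key=costs.get)


aggregating = cheaper_plan(1_000, alpha=0.5, beta=0.1)  # output smaller than input
ballooning = cheaper_plan(1_000, alpha=2.0, beta=1.5)   # output larger than input
```

With these assumed factors the aggregating job favors the distributed plan and the ballooning job favors the local plan, matching the direction of the PlanetLab observations in the slides. The model ignores compute time, link bandwidth differences, and the Global MR option.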

Experimental Results (Word count)
  [Figure: Smart HDFS versus HDFS]

Cloud Evolution Summary
  Cloud evolution: mobile users, big data, privacy/trust, global services
  Our vision of the cloud: locality of users, data, other clouds/data centers; user-centric behavior