What is Cloud Computing? Tackling the Challenges of Big Data. Tackling The Challenges of Big Data. Matei Zaharia. Matei Zaharia. Big Data Collection



Similar documents
Cloud Computing Trends

Daren Kinser Auditor, UCSD Jennifer McDonald Auditor, UCSD

Cloud Computing Backgrounder

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Part V Applications. What is cloud computing? SaaS has been around for awhile. Cloud Computing: General concepts

Cloud Computing: Making the right choices

Cloud Computing and Amazon Web Services

Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344

Cloud Computing; What is it, How long has it been here, and Where is it going?

What Is It? Business Architecture Research Challenges Bibliography. Cloud Computing. Research Challenges Overview. Carlos Eduardo Moreira dos Santos

Topics. Images courtesy of Majd F. Sakr or from Wikipedia unless otherwise noted.


Security Considerations for Public Mobile Cloud Computing

Case Studies: Protecting Sensitive Data in

Cloud Computing: The atmospheric jeopardy. Unique Approach Unique Solutions. Salmon Ltd 2014 Commercial in Confidence Page 1 of 5

Lecture 02a Cloud Computing I

Cloud Courses Description

Cloud Computing. Technologies and Types

How to Leverage Cloud to Quickly Build Scalable Applications

Using Cloud Services for Test Environments A case study of the use of Amazon EC2

Cloud Computing Benefits for Educational Institutions

Cloud Services Overview

Addressing Data Security Challenges in the Cloud

Cloud Computing: The Next Computing Paradigm

CLOUD COMPUTING An Overview

Security & Trust in the Cloud

Cloud Computing Flying High (or not) Ben Roper IT Director City of College Station

PCI COMPLIANCE ON AWS: HOW TREND MICRO CAN HELP

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS

Cloud computing - Architecting in the cloud

With Eversync s cloud data tiering, the customer can tier data protection as follows:

CLOUD COMPUTING USING HADOOP TECHNOLOGY

A Comparison of Clouds: Amazon Web Services, Windows Azure, Google Cloud Platform, VMWare and Others (Fall 2012)

ANDREW HERTENSTEIN Manager Microsoft Modern Datacenter and Azure Solutions En Pointe Technologies Phone

Certified Cloud Computing Professional VS-1067

Cloud Computing, and REST-based Architectures Reid Holmes

Outline. What is cloud computing? History Cloud service models Cloud deployment forms Advantages/disadvantages

OTM in the Cloud. Ryan Haney

Public Clouds. Krishnan Subramanian Analyst & Researcher Krishworld.com. A whitepaper sponsored by Trend Micro Inc.

A Very Brief Introduction To Cloud Computing. Jens Vöckler, Gideon Juve, Ewa Deelman, G. Bruce Berriman

Open Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud)

24/11/14. During this course. Internet is everywhere. Frequency barrier hit. Management costs increase. Advanced Distributed Systems Cloud Computing

What is Cloud Computing? Why call it Cloud Computing?

Ø Teaching Evaluations. q Open March 3 through 16. Ø Final Exam. q Thursday, March 19, 4-7PM. Ø 2 flavors: q Public Cloud, available to public

How To Run A Cloud Computer System

Cloud Security. Peter Jopling IBM UK Ltd Software Group Hursley Labs. peterjopling IBM Corporation

Cloud Courses Description

Elastic Data Warehousing in the Cloud Is the sky really the limit?

Keywords Cloud Computing, Platform as a Service, Software as a Service, Infrastructure as a Service, Architecture.

How To Understand Cloud Computing

Emerging Technology for the Next Decade

ADOPTING MICROSOFT AZURE

A Study on Analysis and Implementation of a Cloud Computing Framework for Multimedia Convergence Services

Cloud Computing. Bringing the Cloud into Focus

Running Oracle Applications on AWS

Oracle Applications and Cloud Computing - Future Direction

Backing up to the Cloud

Building Blocks of the Private Cloud

Enhancing Operational Capacities and Capabilities through Cloud Technologies

DLT Solutions and Amazon Web Services

CHECKLIST FOR THE CLOUD ADOPTION IN THE PUBLIC SECTOR

Sistemi Operativi e Reti. Cloud Computing

Hybrid Cloud Computing

A.Prof. Dr. Markus Hagenbuchner CSCI319 A Brief Introduction to Cloud Computing. CSCI319 Page: 1

WOLKEN KOSTEN GELD GUSTAVO ALONSO SYSTEMS GROUP ETH ZURICH

Table of contents. Cloud Computing Sourcing. August Key Takeaways

CLOUD COMPUTING. A Primer

Cloud IT, Privacy, and Security. June 13, 2013

Oracle Cloud Update November 2, Eric Frank Oracle Sales Consultant. Copyright 2014 Oracle and/or its affiliates. All rights reserved.

CLOUD COMPUTING SECURITY ISSUES

A Survey on Cloud Security Issues and Techniques

Above the Clouds: A Berkeley View of Cloud Computing & Outlook: Cloudy with a Chance of Security Challenges and. Presented by Nikhil Tripathi

How To Compare Cloud Computing To Cloud Platforms And Cloud Computing

Cloud Computing An Elephant In The Dark

Elevate your analytics with SAS in the cloud

The Private Cloud Your Controlled Access Infrastructure

The Hidden Extras. The Pricing Scheme of Cloud Computing. Stephane Rufer

Data Centers and Cloud Computing. Data Centers

Cloud computing. Examples

CLOUD COMPUTING SECURITY CONCERNS

Cloud Computing Security Issues

ISSN: (Online) Volume 2, Issue 5, May 2014 International Journal of Advance Research in Computer Science and Management Studies

Object Storage: A Growing Opportunity for Service Providers. White Paper. Prepared for: 2012 Neovise, LLC. All Rights Reserved.

Implementing & Developing Cloud Computing on Web Application

Cloud Computing Submitted By : Fahim Ilyas ( ) Submitted To : Martin Johnson Submitted On: 31 st May, 2009

Cloud Computing: Compliance and Client Expectations

Last time. Today. IaaS Providers. Amazon Web Services, overview

Développement logiciel pour le Cloud (TLC)

Integrating Active Directory Federation Services (ADFS) with Office 365 through IaaS

Future of Cloud Computing. Irena Bojanova, Ph.D. UMUC, NIST

INTRODUCTION TO CLOUD COMPUTING CEN483 PARALLEL AND DISTRIBUTED SYSTEMS

CSE 599c Scientific Data Management. Magdalena Balazinska and Bill Howe Spring 2010 Lecture 3 Science in the Cloud

Orchestrating the New Paradigm Cloud Assurance

Virginia Government Finance Officers Association Spring Conference May 28, Cloud Security 101

Deploying a Geospatial Cloud

Background on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros

Strategic Compliance & Securing the Cloud. Annalea Sharack-Ilg, CISSP, AMBCI Technical Director of Information Security

Transcription:

Introduction What is Cloud Computing? Cloud computing means computing resources available on demand Resources can include storage, compute cycles, or software built on top (e.g. database as a service) On demand means fast setup/teardown, pay-as-you-go For big data, clouds are attractive for several reasons Access to large infrastructure that is hard to operate Bursty workloads benefitting from pay-as-you-go Recent years have seen major growth of cloud computing in most software domains 1

Examples Low-level storage and computing Amazon S3 and EC2; Google Compute Engine; Windows Azure; Rackspace Hosted services Amazon Relational Database Service (MySQL/Oracle) Google BigQuery, Amazon Redshift (in-house systems) Vertical applications Salesforce, Splunk, Tableau Benefits for Users Fast deployment Cloud services can start in minutes, without long setup Outsourced management Provider handles administration, reliability, security Lower costs Benefit from economies of scale of provider; only pay for resources while in use Elasticity Easy to acquire lots of infrastructure for a short period Benefits for Providers Economies of scale Share expertise and resources across many customers Lower costs per user due to scale Fast deployment Compared to traditional software sales cycles, new features reach users directly Optimization across users 2

Clouds and Big Data Clouds have several benefits for big data use cases: Access to reliable distributed storage (hard to do alone) Elasticity for large computations (100 nodes for 1 hour) Data sharing across tenants (e.g. public datasets) At the same time, several challenges exist: Security and privacy guarantees Data import and export Lock-in This Lecture Cloud economics Types of services Challenges of the cloud model Introduction THANK YOU 3

Cloud Economics When Does the Cloud Make Economic Sense? Three cases compared to traditional on-site hosting: Variable utilization Economies of scale Cost associativity 4

Variable Utilization With on-site hosting, must provision for peak load time" Risk of over- or under-provisioning resources" resources" resources" time" time" Variable Utilization Clouds typically charge at a much finer granularity (e.g. 1 hour) resources" time" Even with higher hourly rates, can be worth it Why Can the Provider Do This Better? Statistical multiplexing Different variable workloads peak at different times, making the sum more predictable +! +! +! = Other uses for compute resources Amazon & Google can use idle resources for their own internal computations, thus not wasting them 5

Economies of Scale Small company hires 1 sysadmin for 100 servers $100K/year => $1000 per year per servers Amazon hires 1 sysadmin for 10K servers Only $10 per year per server Amazon s scale also lets it buy hardware, power, security, etc. at lower prices Flip side: cloud providers must also make margins! Cost Associativity Associativity: a b = b a! For the cloud: 100 servers for 1 hour cost the same as 1 server for 100 hours! 1 server-hour" servers" servers" time (h)" time (h)" Result: For parallel workloads, can get answer faster! Same CPU cycles/dollar, but more productivity/dollar" Summary Cloud provides most advantage when one of: Resource usage is variable In-house organization is small Parallelism improves productivity 6

Cloud Economics THANK YOU Types of Cloud Services 7

Levels of Abstraction Software as a Service (SaaS) Complete, user-facing applications E.g. Splunk Storm, Tableau Online Platform as a Service (PaaS) Developer-facing services and abstractions that are higher level than raw machines E.g. hosted databases (Amazon RDS), MapReduce Infrastructure as a Service (IaaS) Raw computing resources, e.g. virtual machines, disks [Peter Mell and Timothy Grance, The NIST definition of Cloud Computing]! Multitenancy Public Cloud Shared by multiple tenants from the general public Private Cloud Used by a single organization for internal workloads May be hosted either on or off premises [Peter Mell and Timothy Grance, The NIST definition of Cloud Computing]! Access Interfaces Open interfaces Standard across vendors and even on-premise E.g. x86 virtual machine, block devices for storage, MySQL database hosting, Hadoop MapReduce Proprietary Specific to vendor E.g. Amazon DynamoDB, Google BigQuery 8

Examples Service! Details! Level! Hosting! Interface! Amazon EC2" Virtual machine hosting" IaaS" Public" Standard" Rackspace Private Cloud" Virtual machine hosting" IaaS" Private" Standard" Amazon Relational Database Service" Hosted MySQL, Oracle, and others" PaaS" Public" Standard" Amazon DynamoDB" Key-value store" PaaS" Public" Proprietary" Tableau Online" Visualization & reporting software" SaaS" Public" Proprietary" Types of Cloud Services THANK YOU 9

Challenges & Responses 1. Security With outsourced computation / storage, security and confidentiality may be harder to guarantee Legal compliance (e.g. HIPAA, PCI DSS) Need to assure that provider also follows guidelines Provider may be in a different legal jurisdiction 1. Security: Responses Control over security properties Encryption of stored data Remote key rotation Access roles and user authentication Provider compliance Example: many providers are PCI DSS compliant Advances in cryptography Homomorphic encryption Enc(a+b) = Enc(a) + Enc(b) Order-preserving encryption a<b => Enc(a) < Enc(b) Searching on encrypted data 10

2. Availability Cloud gives responsibility for availability to 3rd party Making sure data is reliable, service is up, etc. Business continuity Responses: Location diversity within a provider (data replication, availability zones ) Multiple providers Scale may let providers do more than on-site hosting 3. Data Transfer Moving data over the Internet is slow! Transferring 10 TB over a T3 line (45 Mbps) = 20 days 10 TB of disks = $400 (5 disks) Responses: Data transfer into many providers is free Shipping physical disks (e.g. Amazon Import/Export) 4. Lock-In Interface lock-in Proprietary interfaces may make applications hard to move on-site or across providers" Data lock-in Data is expensive to move out!" Computation needs to be near data" Responses: Preference for open / standard APIs" Wrappers over provider interfaces (e.g. jclouds)" Physical import/export" 11

Conclusions While still relatively new, clouds are an exciting environment to manage and process big data Several challenges, both legal and technological, remain, but are actively worked on In 1900, large companies generated their own electricity; can computing also become a utility? Challenges & Responses THANK YOU 12