Cloud Computing and Datacenters Prof. Sasu Tarkoma University of Helsinki



Cloud Computing and Datacenters
Prof. Sasu Tarkoma, University of Helsinki
Faculty of Science (Matemaattis-luonnontieteellinen tiedekunta), 22.1.2010

Contents
- Introduction
- Views to Cloud Computing
- Eucalyptus Open Source Platform
- MapReduce and Hadoop
- Nexus/Mesos
- Network and Data Center Virtualization
- PSIRP

Focus
- Distributed applications and services on the cloud
  - Collaboration & P2P
- Massive scale
  - Millions and billions of users and data items
  - Significant growth is expected in the mobile sector (50 times increase in 2015?)
- Evaluation methods for wide-scale experiments
- Exploiting parallelism & locality
- Custom software & solutions
- Building blocks
  - Interfaces, policies, protocols, algorithms, runtimes, platforms
- State of the art examples: Amazon, Google, Yahoo, Microsoft

Cloud Computing
The services of cloud computing can be divided into three categories:
1. Software-as-a-Service (SaaS), in which a vendor supplies the hardware infrastructure and the software product, and interacts with the user through a portal (software on demand, pay-as-you-go).
2. Platform-as-a-Service (PaaS), in which a set of software and development tools is hosted by a provider on the provider's infrastructure, for example, Google's AppEngine.
3. Infrastructure-as-a-Service (IaaS), which involves virtual server instances with unique IP addresses and blocks of on-demand storage, for example, Amazon's Web services infrastructure.

Cloud Computing
- Private, community, public, hybrid clouds
- Software as a Service (SaaS)
- Platform as a Service (PaaS)
- Infrastructure as a Service (IaaS)
- On-demand service
- Information demand and supply (Open APIs)
- Location-independent resource pooling
- Ubiquitous network access
- Elasticity
- Virtualization
- Web application frameworks
- Datacenters and clusters
- Browser as a platform

Datacenter for Experiments
- UH has a new datacenter for experimenting with networking technology, services, and computational science
- 240 CPUs with 8 cores each, virtualization support (Ubuntu and KVM)
- Aim to support network virtualization with OpenFlow switches and NetFPGA software-defined routers
- Platform experiments with Eucalyptus
- Platform experiments with Nexus (joint work with ICSI and UCB)
- Pub/sub control and data planes for data centers

Toward New Deployments
[Figure: layered diagram. Layers shown: new deployments; services, applications; distribution middleware; networking solutions and basic processes; datacenter / cluster / testbed.]

Toward New Knowledge
[Figure: layered diagram. Layers shown: new knowledge on distributed systems; real-life experiments; wide-area simulation; formal models; evaluation methodology; datacenter / cluster / testbed.]

Eucalyptus
- Elastic Utility Computing Architecture Linking Your Programs To Useful Systems
- Originally a research project at UC Santa Barbara
- Now a private company (Eucalyptus Systems)
- Web services based implementation of elastic/utility/cloud computing infrastructure
  - Linux image hosting à la Amazon
- How do we know if it is a cloud?
  - Try to emulate an existing cloud: Amazon AWS
- Functions as a software overlay
  - Existing installation should not be violated (too much)
- Focus on installation and maintenance

Eucalyptus
- Supports KVM and Xen
- Open source platform with Amazon AWS features
- Support for Amazon AWS (Elastic Compute Cloud EC2, Simple Storage Service S3, and Elastic Block Storage EBS)
- Includes Walrus: an Amazon S3 interface-compatible storage manager
- Added support for elastic IP assignment
- Web-based interface for cloud configuration
- Image registration and image attribute manipulation
- Configurable scheduling policies and SLAs
- Support for multiple hypervisor technologies within the same cloud

Eucalyptus Usage
- Foster greater understanding and uptake of cloud computing
  - Provide a vehicle for extending what is known about the utility model of computing
- Experimentation vehicle prior to buying commercial services
  - Provide a development, debugging, and tech preview platform for public clouds
- Homogenize the local IT environment with public clouds
  - Having AWS functionality locally makes moving to Amazon AWS easier, cheaper, and more sustainable
- Provide a basic software development platform for the open source community

Architecture
[Figure: Eucalyptus architecture. Components shown: client-side interface (via network), client-side API translator, Cloud Controller, database, Walrus (S3), Cluster Controller, Node Controller.]

Components
- Node Controller controls the execution, inspection, and termination of VM instances on the host where it runs.
- Cluster Controller gathers information about and schedules VM execution on specific node controllers, and manages the virtual instance network.
- Storage Controller (Walrus) is a put/get storage service that implements Amazon's S3 interface, providing a mechanism for storing and accessing virtual machine images and user data.
- Cloud Controller is the entry point into the cloud for users and administrators. It queries node managers for information about resources, makes high-level scheduling decisions, and implements them by making requests to cluster controllers.
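The division of labour between the three controllers can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class and method names, the core-count resource model, and the "most free cores first" and first-fit policies are hypothetical and are not taken from the Eucalyptus code base or its API.

```python
class NodeController:
    """Hypothetical node controller: runs VM instances on one host."""
    def __init__(self, name, cores):
        self.name = name
        self.free_cores = cores
        self.instances = []

    def run_instance(self, instance_id, cores):
        self.free_cores -= cores
        self.instances.append(instance_id)


class ClusterController:
    """Hypothetical cluster controller: schedules instances onto its nodes."""
    def __init__(self, name, nodes):
        self.name = name
        self.nodes = nodes

    def free_cores(self):
        return sum(node.free_cores for node in self.nodes)

    def schedule(self, instance_id, cores):
        # First-fit placement (an assumed policy, chosen for simplicity).
        for node in self.nodes:
            if node.free_cores >= cores:
                node.run_instance(instance_id, cores)
                return node.name
        return None


class CloudController:
    """Hypothetical entry point: makes the high-level choice of cluster."""
    def __init__(self, clusters):
        self.clusters = clusters

    def run_instance(self, instance_id, cores):
        # Try clusters in order of free capacity (an assumed policy).
        ranked = sorted(self.clusters, key=lambda c: c.free_cores(), reverse=True)
        for cluster in ranked:
            node = cluster.schedule(instance_id, cores)
            if node is not None:
                return cluster.name, node
        return None


cloud = CloudController([
    ClusterController("A", [NodeController("a1", 2), NodeController("a2", 4)]),
    ClusterController("B", [NodeController("b1", 8)]),
])
first = cloud.run_instance("i-1", 4)
second = cloud.run_instance("i-2", 4)
third = cloud.run_instance("i-3", 8)
print(first, second, third)
```

The point of the sketch is the layering: the cloud-level object never touches a node directly, it only asks a cluster controller to place the instance.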

Cloud Controller
Web services in three categories:
- Resource Services perform system-wide arbitration of resource allocations, let users manipulate properties of the virtual machines and networks, and monitor both system components and virtual resources.
- Data Services govern persistent user and system data and provide for a configurable user environment for formulating resource allocation request properties.
- Interface Services present user-visible interfaces, handling authentication & protocol translation, and expose system management tools.

Virtual Interfaces
- Each VM instance is assigned a virtual interface that is connected to a software Ethernet bridge
- Cluster Controller configures VM traffic isolation and dynamic public IP assignment
[Figure: virtual networking on a physical resource. Labels: VM instance, virtual interface, software Ethernet bridge, VLAN-tagged virtual interface, physical interface, to physical Ethernet.]

Walrus
- Walrus is a data storage service
- Leverages standard web services technologies (Axis2, Mule)
- Interface compatible with Amazon's Simple Storage Service (S3)
- Walrus implements both the REST interface (via HTTP), sometimes termed the Query interface, and the SOAP interface
- Users that have access to Eucalyptus can use Walrus to stream data into and out of the cloud, as well as from instances that they have started on nodes
- Walrus acts as a storage service for VM images

What's it Made Out Of?
- Axis2 and Axis2c version 1.4.0
- Hibernate 3.2.2
- HSQLDB 1.8.0
- jetty 6.1.9
- JiBX (March 30th sourceforge)
- Mule 2.0.1
- Rampart version 1.3
- libvirt version 0.4.2
- socat-1.6.0
- VDE version 2.2.0-pre2

Cloud Speed
- Performance study using HPC applications and benchmarks
- Two questions:
  - What is the performance impact of virtualization?
  - What is the performance impact of cloud infrastructure?
- Tested Xen, Eucalyptus, and AWS (small SLA)
- Many answers:
  - Random-access disk I/O is slower with Xen
  - CPU-bound workloads can be faster with Xen, depending on configuration
  - Kernel version is far more important
  - Eucalyptus imposes no statistically detectable overhead
  - AWS small appears to throttle network bandwidth and (maybe) disk bandwidth

MapReduce and Hadoop
- MapReduce was developed by Google to process large datasets in clusters
  - Used for sorting, searching and indexing, counting, clustering, machine learning, etc.
  - Inspired by the Map and Reduce operations used in functional programming
  - Uses a filesystem or a database to store intermediate values and solutions
- Solve problems by splitting them into smaller problems, then combine the solutions
  - Done by a master node
- Hadoop is a Java framework inspired by the Google file system and MapReduce
  - Rack-aware processing of vast data sets
  - Batch-oriented workloads
  - Used by Facebook and Amazon

[Figure: MapReduce execution overview. (1) The user program forks a master and workers; (2) the master assigns map and reduce tasks; (3) map workers read input splits (split 0 to split 4); (4) they write intermediate files to local disks; (5) reduce workers read them remotely; (6) reduce workers write output file 0 and output file 1. Stages: input files, map phase, intermediate files (on local disks), reduce phase, output files.]
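The split, map, group, and reduce flow can be sketched in a few lines as a single-process word count. This is a minimal illustration, not Google's or Hadoop's implementation: the function names are hypothetical, the "splits" are plain strings, and the master/worker distribution and on-disk intermediate files are replaced by in-memory lists.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Group intermediate values by key, between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all counts emitted for one word."""
    return key, sum(values)


def word_count(splits):
    intermediate = []
    for split in splits:  # in a real cluster each split goes to a map worker
        intermediate.extend(map_phase(split))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


result = word_count(["the cloud scales", "the cluster runs the job"])
print(result)
```

Because each call to `map_phase` depends only on its own split, and each call to `reduce_phase` only on one key, both phases can be run in parallel on different machines.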

Nexus / Mesos
- Nexus is about running multiple frameworks in the same cluster
  - Resource manager
  - Can run systems such as Hadoop
  - Multiplexes resources across frameworks
- Nexus decouples job execution management from resource management by providing a simple resource management layer over which frameworks like Hadoop and Dryad can run
- Microkernel: make the reliable component as small as possible
- Exokernel: give maximal control to frameworks
- IP model: narrow waist over which diverse frameworks can run
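The separation between resource management and job execution management can be sketched as a two-level scheduler. The sketch assumes a resource-offer style interface (the mechanism Mesos is known for, although the slide does not spell it out); every class name, method name, and number below is hypothetical, and CPUs are the only resource modelled.

```python
class Framework:
    """Hypothetical framework scheduler (think of a Hadoop-like system)."""
    def __init__(self, name, cpus_per_task, pending_tasks):
        self.name = name
        self.cpus_per_task = cpus_per_task
        self.pending_tasks = pending_tasks

    def on_offer(self, free_cpus):
        """Job-level decision: how many tasks to launch on the offered CPUs."""
        launch = min(self.pending_tasks, free_cpus // self.cpus_per_task)
        self.pending_tasks -= launch
        return launch


class ResourceManager:
    """Thin layer that only tracks free CPUs and offers them to frameworks."""
    def __init__(self, node_cpus):
        self.free = dict(node_cpus)  # node name -> free CPUs
        self.frameworks = []

    def register(self, framework):
        self.frameworks.append(framework)

    def offer_round(self):
        placements = []
        for node in self.free:
            for framework in self.frameworks:
                launched = framework.on_offer(self.free[node])
                if launched:
                    self.free[node] -= launched * framework.cpus_per_task
                    placements.append((framework.name, node, launched))
        return placements


manager = ResourceManager({"n1": 4, "n2": 4})
manager.register(Framework("hadoop", cpus_per_task=2, pending_tasks=3))
manager.register(Framework("dryad", cpus_per_task=1, pending_tasks=2))
placements = manager.offer_round()
print(placements)
```

The resource manager never inspects jobs or tasks; it only knows how many CPUs are free. That is the "narrow waist" idea: all scheduling intelligence stays in the frameworks.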

Task Management

Task Management Operations

Intro to OpenFlow
- OpenFlow is an open protocol for router configuration and control
- Opportunities for convergence of packet and circuit switched and clean slate designs
- Slides from the Stanford Clean Slate program: http://cleanslate.stanford.edu
- OpenFlow Web site: http://www.openflowswitch.org

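OpenFlow: Enable Innovations within the Infrastructure
[Figure: a controller (PC, network services) connected through the OpenFlow API over an SSL secure channel to an OpenFlow switch; the switch software (sw) holds the secure channel and the hardware (hw) holds the flow table. Labeled functions: add/delete flow entries, encapsulated packets, controller discovery.]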
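The interaction between flow table and controller can be illustrated with a toy model: the switch looks up each packet in its flow table, and on a miss it hands the packet to the controller, which installs a flow entry. This is a sketch only. Real OpenFlow is a binary protocol over a secure channel with priorities, counters, and timeouts; the class names, the dictionary-based packets, the `ip_dst` field name, and the static port map below are all assumptions made for illustration.

```python
class FlowTable:
    """Toy flow table: an ordered list of (match fields, action) entries."""
    def __init__(self):
        self.entries = []

    def add_flow(self, match, action):
        self.entries.append((match, action))

    def delete_flow(self, match):
        self.entries = [(m, a) for m, a in self.entries if m != match]

    def lookup(self, packet):
        for match, action in self.entries:
            if all(packet.get(field) == value for field, value in match.items()):
                return action
        return None  # table miss


class Controller:
    """Hypothetical controller with a static destination-to-port policy."""
    def __init__(self, port_map):
        self.port_map = port_map
        self.packet_ins = 0

    def packet_in(self, switch, packet):
        self.packet_ins += 1
        port = self.port_map.get(packet["ip_dst"])
        action = f"output:{port}" if port is not None else "drop"
        # Install a flow entry so later packets stay on the switch fast path.
        switch.table.add_flow({"ip_dst": packet["ip_dst"]}, action)
        return action


class Switch:
    def __init__(self, controller):
        self.table = FlowTable()
        self.controller = controller

    def receive(self, packet):
        action = self.table.lookup(packet)
        if action is None:  # miss: send the packet to the controller
            action = self.controller.packet_in(self, packet)
        return action


controller = Controller({"10.0.0.2": 2})
switch = Switch(controller)
packet = {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2"}
first_action = switch.receive(packet)   # miss, handled by the controller
second_action = switch.receive(packet)  # hit in the flow table
print(first_action, second_action, controller.packet_ins)
```

Only the first packet of a flow reaches the controller; once the entry is installed, forwarding decisions are made locally by the switch.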

PSIRP
- Publish/Subscribe Internet Routing Paradigm (PSIRP)
- FP7 project coordinated by HIIT
- Creating a new clean-slate protocol suite based on publish/subscribe

PSIRP
- Rendezvous, routing, fragmentation
- Observations
  - No topological addresses, only labels
  - No explicit layering (blackboard pattern)
  - Security enhanced using self-certification
  - End-to-end reachability, control in the network
  - Natural support for multicast; it is the norm
  - Support for broadcast and all-optical label-switching technologies
  - Dynamic state is introduced into the network
- How do we make it scale?

Forwarding Design
- Fast path
  - In-packet Bloom filters
  - Line-rate forwarding
- Slow path (Rendezvous)
  - Content-centric functions
  - Policies
  - Caching configuration
  - Security

[Figure: PSIRP rendezvous and forwarding across autonomous systems (AS). Labels: AS: Rendezvous (two instances), AS: Topology (two instances), publish, subscribe, create delivery path, configure forwarding path, forwarding nodes, forwarding edge node, subscriber, publisher, data forwarding.]

Characteristics
- Links have identifiers (Link IDs)
- Source routing mechanism
- Install forwarding state on demand (traffic aggregation)
- Topology Manager
  - Network topology graph and its maintenance
  - Constructs Bloom filter-based forwarding identifiers

Efficient flat identifier based forwarding
- Current zfilter size: 256 bits
- Link IDs are added to the zfilter (OR operation)
- Verification requires one comparison (AND operation)
- Limitations
  - Possible false positives
  - Wrong forwarding path
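The OR and AND operations on the zfilter can be made concrete with Python integers used as 256-bit masks. This is a sketch under assumptions: the slide gives only the filter width, so the number of bits set per Link ID (`K = 5`) is an assumed value, and deriving Link IDs by hashing a link name is a stand-in chosen to keep the example deterministic. The function and link names are hypothetical.

```python
import hashlib

M = 256  # zfilter width in bits (from the slide)
K = 5    # bits set per Link ID (assumed value, not from the slide)


def link_id(name, m=M, k=K):
    """Derive an illustrative m-bit Link ID with at most k bits set."""
    mask = 0
    for i in range(k):
        digest = hashlib.sha256(f"{name}:{i}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % m)
    return mask


def build_zfilter(path_links):
    """Source side: OR together the Link IDs of every link on the path."""
    zfilter = 0
    for name in path_links:
        zfilter |= link_id(name)
    return zfilter


def forwards_on(zfilter, name):
    """Forwarding node: one AND and one comparison per outgoing link."""
    lid = link_id(name)
    return (zfilter & lid) == lid


path = ["A-B", "B-C", "C-D"]
zfilter = build_zfilter(path)
print([forwards_on(zfilter, name) for name in path])
```

Every link on the path always matches, so there are no false negatives. A link that is not on the path can still match if all of its bits happen to be set by other Link IDs; that is the false positive, and the resulting wrong forwarding path, listed under limitations.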

Overlay Networks Book
- Introduction: Overview; Overlay Technology; Applications; Properties of Data; Structure of the Book
- Network Technologies: Networking; Firewalls and NATs; Naming; Addressing; Routing; Multicast; Network Coordinates; Network Metrics
- Properties of Networks and Data: Data on the Internet; Zipf's Law; Scale-free Networks; Robustness; Small Worlds
- Unstructured Overlays: Overview; Early Systems; Locating Data; Napster; Gnutella; Skype; BitTorrent; Cross-ISP BitTorrent; Freenet; Comparison
- Foundations of Structured Overlays: Overview; Geometries; Consistent Hashing; Distributed Data Structures for Clusters
- Distributed Hash Tables: Overview; APIs; Plaxton's Algorithm; Chord; Pastry; Koorde; Tapestry; Kademlia; Content Addressable Network; Viceroy; Skip Graph; Comparison
- Probabilistic Algorithms: Overview of Bloom Filters; Bloom Filters; Bloom Filters in Distributed Computing; Gossip Algorithms
- Content-based Networking and Publish/Subscribe: Overview; DHT-based Data-centric Communications; Content-based Routing; Router Configurations; Siena and Routing Structures; Hermes; Formal Specification of Content-based Routing Systems; Pub/sub Mobility
- Security: Overview; Attacks and Threats; Securing Data; Security Issues in P2P Networks; Anonymous Routing; Security Issues in Pub/Sub Networks
- Applications: Amazon Dynamo; Overlay Video Delivery; SIP and P2PSIP; CDN Solutions
- Conclusions
- References
- Index

Conclusions and Future Work
- Excellent facilities and connections for doing high-impact research in cloud computing in Helsinki
- Current work with PSIRP, Future Internet SHOK, Cloud Software SHOK
- Not covered by current activities, seeds for a new project:
  - Software-defined networking
  - Dynamic controller for OpenFlow-enabled routers
  - PSIRP in data centers
  - Petabyte storage mechanisms