Open source software for building a private cloud



Similar documents
STeP-IN SUMMIT June 2014 at Bangalore, Hyderabad, Pune - INDIA. Performance testing Hadoop based big data analytics solutions

Week Overview. Installing Linux Linux on your Desktop Virtualization Basic Linux system administration

Open Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud)

Iron Chef: Bare Metal OpenStack

Cloud Computing and Big Data What Technical Writers Need to Know

2) Xen Hypervisor 3) UEC

An Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov

Cloud Courses Description

IOS110. Virtualization 5/27/2014 1

Cloud Computing with Red Hat Solutions. Sivaram Shunmugam Red Hat Asia Pacific Pte Ltd.

OpenStack Ecosystem and Xen Cloud Platform

Sistemi Operativi e Reti. Cloud Computing

CONDOR CLUSTERS ON EC2

Déployer son propre cloud avec OpenStack. GULL François Deppierraz

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager Product Marketing Manager

SURFnet Cloud Computing Solutions

Mobile Cloud Computing T Open Source IaaS

BIG DATA USING HADOOP

Hadoop on OpenStack Cloud. Dmitry Mescheryakov Software

Overview: Building Open Source Cloud Computing Environments

OpenNebula An Innovative Open Source Toolkit for Building Cloud Solutions

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Deploying Business Virtual Appliances on Open Source Cloud Computing

Cloud Computing. Adam Barker

Scalable Architecture on Amazon AWS Cloud

Cloud Courses Description

How To Compare Cloud Computing To Cloud Platforms And Cloud Computing

What We Can Do in the Cloud (1) -Tutorial for Cloud Computing Course- Mikael Fernandus Simalango WISE Research Lab Ajou University, South Korea

TUT5605: Deploying an elastic Hadoop cluster Alejandro Bonilla

Peers Techno log ies Pv t. L td. HADOOP

How To Install Eucalyptus (Cont'D) On A Cloud) On An Ubuntu Or Linux (Contd) Or A Windows 7 (Cont') (Cont'T) (Bsd) (Dll) (Amd)

Plug-and-play Virtual Appliance Clusters Running Hadoop. Dr. Renato Figueiredo ACIS Lab - University of Florida

JAVA IN THE CLOUD PAAS PLATFORM IN COMPARISON

Assignment # 1 (Cloud Computing Security)

Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP

Modern Web development and operations practices. Grig Gheorghiu VP Tech Operations Nasty Gal

Installing & Using KVM with Virtual Machine Manager COSC 495

Big Data Use Case. How Rackspace is using Private Cloud for Big Data. Bryan Thompson. May 8th, 2013

Talend Real-Time Big Data Sandbox. Big Data Insights Cookbook

An Experimental Study of Load Balancing of OpenNebula Open-Source Cloud Computing Platform

Cloud Computing. Chapter 1 Introducing Cloud Computing

Download Virtualization Software Download a Linux-based OS Creating a Virtual Machine using VirtualBox: VM name

SUSE Cloud 5 Private Cloud based on OpenStack

Public Cloud Offerings and Private Cloud Options. Week 2 Lecture 4. M. Ali Babar

Towards an Architecture for Monitoring Private Cloud

Upgrading to Ubuntu Server Edition LTS

Open Source for Cloud Infrastructure

Big Data Analytics - Accelerated. stream-horizon.com

The OpenNebula Standard-based Open -source Toolkit to Build Cloud Infrastructures

Introduction to Big Data Training

Cloud JPL Science Data Systems

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Cloud Computing. Chapter 1 Introducing Cloud Computing

Chapter 7. Using Hadoop Cluster and MapReduce

Qsoft Inc

Deploying Hadoop with Manager

Apache Bigtop: 100% Apache Bigdata management distribution. (and so much more!)

Microsoft Azure: Opção de Nuvem para Todo o Desenvolvedor. Danilo Bordini & Osvaldo Daibert

How Bigtop Leveraged Docker for Build Automation and One-Click Hadoop Provisioning

Apache Hadoop. Alexandru Costan

The Cloud as a Computing Platform: Options for the Enterprise

Using Cloud Services for Test Environments A case study of the use of Amazon EC2

Virtualization & Cloud Computing (2W-VnCC)

FREE AND OPEN SOURCE SOFTWARE FOR CLOUD COMPUTING SERENA SPINOSO FULVIO VALENZA

Open Source Technologies on Microsoft Azure

Quick Deployment Step-by-step instructions to deploy Oracle Big Data Lite Virtual Machine

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures

What Is It? Business Architecture Research Challenges Bibliography. Cloud Computing. Research Challenges Overview. Carlos Eduardo Moreira dos Santos

Tungsten Replicator, more open than ever!

Introduction to Cloud Computing

Solution for private cloud computing

Automated deployment of virtualization-based research models of distributed computer systems

NoSQL Data Base Basics

ASAM ODS, Peak ODS Server and openmdm as a company-wide information hub for test and simulation data. Peak Solution GmbH, Nuremberg

Cloud computing - Architecting in the cloud

Virtualization for Cloud Computing

Open source large scale distributed data management with Google s MapReduce and Bigtable

HO5604 Deploying MongoDB. A Scalable, Distributed Database with SUSE Cloud. Alejandro Bonilla. Sales Engineer abonilla@suse.com

IJREISS Volume 2, Issue 11 (November 2012) ISSN: SURVEY OF OPEN SOURCE IN THE CLOUD FOR FUTURE-THINKING ABSTRACT

Emerging Technology for the Next Decade

Private Clouds with Open Source

How To Understand Cloud Computing

Cloud Computing Architecture

OpenNebula Leading Innovation in Cloud Computing Management

Transcription:

Michael J Pan CEO & co-founder, nephosity COSCUP 15 August 2010

An introduction me 10+ years working on high performance (distributed, grid, cloud) computing at DreamWorks Animation, NASA JPL, NIH Center for Computational Biology, Compaq started nephosity in March 2010. nephosity develops cloud computing platform for enterprises showcased by STRUCTURE 2010 as one of 10 most promising cloud computing startups of 2010

What s behind the cloud?

Motivation Scenario You are (or your company is) developing a SaaS You require elastic compute resources So you want to deploy in the cloud, but... Public clouds do not satisfy your (security, performance, etc.) requirements You want to use open source components in your cloud What s available to you?

Why not Amazon EC2 (or some other public cloud?) EC2 (more specifically, dynamic provisioning 1 capabilities provided by EC2) is only one part of the equation Core is dynamic provisioning capabilities EC2 is not open source. You need a machine image to run on EC2 what software (OS + platform) to install on the image? What are the (open source) alternatives for dynamic provisioning? 1 the ability to start up and tear-down compute resources on-demand

What about Hadoop? Hadoop is also only part of the equation Hadoop-core provides map-reduce functionality HDFS provides data management functionality How do you control Hadoop jobs? What alternatives to Hadoop are there?

Cloud computing stack Infrastructure Hypervisor / machine image Dynamic provisioning Operating system Platform Data management Map-reduce Workflow management Messaging Cluster management Configuration Analytics

Disclaimer Will discuss only open source offerings that have been released Will present what s available, not how to adopt/implement them Lists may be incomplete You will see some badly hand drawn graphics

Infrastructure Hypervisor / Virtual machine Dynamic provisioning Operating system

Hypervisor / Virtual machine Hardware virtualization Allows multiple virtual machines to run on a single physical machine

Hypervisor / Virtual machine QEMU (virtualizer) KVM Xen VirtualBox (Desktop only)

Dynamic provisioning de/allocate compute resources on demand You get compute resources when you want them Compute resources are reclaimed when you release them

Dynamic provisioning Open source software Eucalyptus OpenNebula / Haizea Condor (via VM universe) TCloud Elaster (not yet released)

Operating system The interface between your software and the underlying hardware In cloud computing, operating systems are stored as machine images Images are distributed to local storage on-demand Loaded into memory and booted into the hypervisor by the dynamic provisioner

Operating system Various Linux distributions Ubuntu SUSE Fedora CentOS BSD (on VirtualBox)

Platform Data management Map reduce Workflow management Messaging

Data management Distribute your data across your network Replicate your data across your network Optimize retrieval to improve computation time Optimize storage requirements

Data management considerations SQL vs. NoSQL Replication degree small file vs. BLOB storage Consistency Centralized vs. decentralized Access patterns

Data management HDFS (Hadoop) SphereFS (UIC) DDFS (Nokia) Cassandra (Facebook / Apache) MongoDB CouchDB (Apache) MySQL (Oracle) PostgreSQL Ceph (DreamHost), release as part of Linux v2.6.34

Map reduce Split and parallelize a task into many parts Combine the results of the split tasks for a final result

Map reduce Open source offerings Hadoop (Yahoo) Sphere (UIC) Disco (Nokia)

Workflow management design specification coordinated execution of compute tasks

Workflow management Open source offerings Oozie (Yahoo) Pig (Hadoop / Apache) Cascading (Concurrent) Azkaban (LinkedIn) pomsets (nephosity)

Messaging Unified framework for your application and all components to communicate with each other Above the network hardware and network protocol layer Your application handles only discrete messages

Messaging Open source offerings qpid (Apache) RabbitMQ (SpringSource / VMWare) ZeroMQ (imatix)

Cluster management

Configuration management Configuration of your running cloud instances Software upgrades Dynamic configuration that cannot be stored onto OS images Relaxes storage constraints vs. using OS images

Configuration Open source offerings Chef (Opscode) Puppet StarCluster (MIT)

Analytics Collection and visualization of the status of your cloud Compute load Network usage Dynamic load balancing and scaling of your cloud Start new instances Tear down existing instances

Analytics Open source offerings Graphite (Orbitz) Scalr Nagios Ganglia

Questions? For more info: Michael Pan mjpan@nephosity.com