Infrastructure-as-a-Service Cloud Computing for Science



Similar documents
Nimbus: Cloud Computing with Science

How To Build A Cloud Computing System With Nimbus

Cloud Computing for Science

Cloud Computing with Nimbus

Science Clouds: Early Experiences in Cloud Computing for Scientific Applications Kate Keahey and Tim Freeman

Efficient Data Management Support for Virtualized Service Providers

Deploying Business Virtual Appliances on Open Source Cloud Computing

Building a Volunteer Cloud

Efficient Cloud Management for Parallel Data Processing In Private Cloud

Plug-and-play Virtual Appliance Clusters Running Hadoop. Dr. Renato Figueiredo ACIS Lab - University of Florida

Solution for private cloud computing

PoS(EGICF12-EMITC2)005

Proactively Secure Your Cloud Computing Platform

A Service for Data-Intensive Computations on Virtual Clusters

Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed

OpenNebula An Innovative Open Source Toolkit for Building Cloud Solutions

SERVER 101 COMPUTE MEMORY DISK NETWORK

Design and Building of IaaS Clouds

Comparison of Several Cloud Computing Platforms

Boas Betzler. Planet. Globally Distributed IaaS Platform Examples AWS and SoftLayer. November 9, IBM Corporation

Comparative Study of Eucalyptus, Open Stack and Nimbus

Experimental Study of Bidding Strategies for Scientific Workflows using AWS Spot Instances

The OpenNebula Standard-based Open -source Toolkit to Build Cloud Infrastructures

VM Management for Green Data Centres with the OpenNebula Virtual Infrastructure Engine

Enabling Technologies for Cloud Computing

Sky Computing: When Multiple Clouds Become One

Cloud Computing from an Institutional Perspective

Research computing in a distributed cloud environment

Shoal: IaaS Cloud Cache Publisher

Data intensive high energy physics analysis in a distributed cloud

Cloud Infrastructure Pattern

Amazon EC2 Product Details Page 1 of 5

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

Workflow Allocations and Scheduling on IaaS Platforms, from Theory to Practice

HPC performance applications on Virtual Clusters

Emerging Technology for the Next Decade

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

Evaluation of Nagios for Real-time Cloud Virtual Machine Monitoring

CONDOR CLUSTERS ON EC2

CHAPTER 8 CLOUD COMPUTING

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases

9/26/2011. What is Virtualization? What are the different types of virtualization.

Virtual Machine Management with OpenNebula in the RESERVOIR project

Amazon EC2 XenApp Scalability Analysis

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

An Introduction to Virtualization and Cloud Technologies to Support Grid Computing

Interoperating Cloud-based Virtual Farms

How To Understand Cloud Computing

Data Sharing Options for Scientific Workflows on Amazon EC2

Using WebSphere Application Server on Amazon EC2. Speaker(s): Ed McCabe, Arthur Meloy

Mesos: A Platform for Fine- Grained Resource Sharing in Data Centers (II)

An Experimental Study of Load Balancing of OpenNebula Open-Source Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform

How To Understand Cloud Computing

Scheduler in Cloud Computing using Open Source Technologies

SURFsara HPC Cloud Workshop

OpenNebula Leading Innovation in Cloud Computing Management

SURFsara HPC Cloud Workshop

Improving MapReduce Performance in Heterogeneous Environments

Cloud Computing and Amazon Web Services

Transcription:

Infrastructure-as-a-Service Cloud Computing for Science October 2009 Banff Centre, Banff, Canada Kate Keahey keahey@mcs.anl.gov Nimbus project lead University of Chicago Argonne National Laboratory

Cloud Computing for Science Environment Complexity Consistency Availability

Grid Computing Assumption: control over the manner in which resources are used stays with the site Site A Site B VO-A Site-specific environment and mode of access Site-driven prioritization But: site control -> rapid adoption

Cloud Computing Change of assumption: control over the resource is turned over to the user Site A Site B VO-A Enabling factors: virtualization and isolation Challenges our notion of a site But: slow adoption

Grids to Clouds: a Personal Perspective First STA production run on EC2 Xen released EC2 goes online Nimbus Cloud comes online 2003 2006 2009 A Case for Grid Computing on VMs In-Vigo, VIOLIN, DVEs, Dynamic accounts Policy-driven negotiation First WSF Workspace Service release EC2 gateway available Context Broker release Support for EC2 interfaces

A Very Quick Introduction to the Nimbus Toolkit: an Infrastructure-as-a-Service Toolkit

Nimbus: Cloud Computing Software Allow providers to build clouds Workspace Service: a service providing EC2-like functionality WSF and WS (EC2) interfaces Allow users to use cloud computing Do whatever it takes to enable scientists to use IaaS Context Broker: turnkey virtual clusters, Also: protocol adapters, account managers and scaling tools Allow developers to experiment with Nimbus For research or usability/performance improvements Open source, extensible software Community extensions and contributions: UVIC (monitoring), IU (EBS, research), Technical University of Vienna (privacy, research) Nimbus: http://workspace.globus.org

The Workspace Service VWS Service

The Workspace Service The workspace service publishes information about each workspace Users can find out information about their workspace (e.g. what IP the workspace was bound to) VWS Service Users can interact directly with their workspaces the same way the would with a physical machine.

Cloud Computing Ecosystem Appliance Providers Marketplaces, commercial providers, Virtual Organizations Appliance management software Deployment Orchestrator VMM/DataCenter/IaaS User Environments VMM/DataCenter/IaaS User Environments

Turnkey Virtual Clusters IP1 HK1 IP2 HK2 IP3 HK3 IP1 HK1 IP1 HK1 IP1 HK1 IP2 HK2 IP2 MPI HK2 IP2 HK2 IP3 HK3 IP3 HK3 IP3 HK3 Context Broker Turnkey, tightly-coupled cluster Shared trust/security context Shared configuration/context information

Scientific Cloud esources and Applications

Science Clouds Goals Enable experimentation with IaaS Evolve software in response to user needs Exploration of cloud interoperability issues Participants University of Chicago (since 03/08), University of Florida (05/08, access via VPN), Wispy @ Purdue (09/08) International collaborators Using EC2 for large runs Science Clouds Marketplace: OSG cluster, Hadoop, etc. 100s of users, many diverse projects ranging across science, CS research, build&test, education, etc. Come and run: http://workspace.globus.org/clouds

STA experiment Work by Jerome Lauret, Leve Hajdu, Lidia Didenko (BNL), Doug Olson (LBNL) STA: a nuclear physics experiment at Brookhaven National Laboratory Studies fundamental properties of nuclear matter Problems: Complexity Consistency Availability

STA Virtual Clusters Virtual resources A virtual OSG STA cluster: OSG head (gridmapfiles, host certificates, NFS, Torque), worker s: SL4 + STA One-click virtual cluster deployment via Nimbus Context Broker From Science Clouds to EC2 runs unning production codes since 2007 The Quark Matter run: producing just-in-time results for a conference: http://www.isgtw.org/?pid=1001735

STA Quark Matter un Gateway/ Context Broker Infrastructure-as-a-Service

Priceless? Compute costs: $ 5,630.30 300+ s over ~10 days, Instances, 32-bit, 1.7 GB memory: EC2 default: 1 EC2 CPU unit High-CPU Medium Instances: 5 EC2 CPU units (2 cores) ~36,000 compute hours total Data transfer costs: $ 136.38 Small I/O needs : moved <1TB of data over duration Storage costs: $ 4.69 Images only, all data transferred at run-time Producing the result before the deadline $ 5,771.37

Modeling the Progression of Epidemics Work by on Price and others, Public Health Informatics, University of Utah Can we use clouds to acquire on-demand resources for modeling the progression of epidemics? What is the efficiency of simulations in the cloud? Compare execution on: a physical machine 10 VMs on the cloud The Nimbus cloud only 2.5 hrs versus 17 minutes Speedup = 8.81 9 times faster

A Large Ion Collider Experiment (ALICE) Work by Artem Harutyunyan and Predrag Buncic, CEN Heavy ion simulations at CEN Problem: integrate elastic computing into current infrastructure Collaboration with CernVM project Elastically extend the ALICE testbed to accommodate more computing

Elastic Provisioning for ALICE HEP ALICE queue queue sensor AliEn Context Broker Infrastructure-as-a-Service

Elastically Provisioned esources CHEP09 paper, Harutyunyan et al. Elastic resource base: OOI, ATLAS, ElasticSite, and others

Sky Computing Change of assumption: we can now trust remote resources Site A Site B VO-A Enabling factors: cloud computing and virtual networks Instead of a bunch of disconnected domains, one domain overlapping the Internet Network leases for a fully controlled environment

Sky Computing Environment Work by A. Matsunaga, M. Tsugawa, University of Florida U of Chicago U of Florida ViNE router ViNE router ViNE router Purdue Creating a seamless environment in a distributed domain

Hadoop in the Science Clouds U of Chicago U of Florida Hadoop cloud Purdue Papers: CloudBLAST: Combining Mapeduce and Virtualization on Distributed esources for Bioinformatics Applications by A. Matsunaga, M. Tsugawa and J. Fortes. escience 2008. Sky Computing, by K. Keahey, A. Matsunaga, M. Tsugawa, J. Fortes, to appear in IEEE Internet Computing, September 2009

Canadian Involvement Two CANAIE Network Enabled Platforms project in Development using Nimbus High Energy Physics Legacy Data Project (started this month) Allow the preservation of analysis environments along with the data to ensure that the data will continue to be analyzable in the future. CANAFA: Canadian Astronomy Network for Astronomical esearch Allows astronomers to do their analysis in their own custom environments. Supports multiple telescopes. Come check it out at the NEP Exhibition Wednesday 18:00 during reception Ian Gable igable@uvic.ca

Parting Thoughts IaaS cloud computing is science-driven Scientific applications are successfully using the existing infrastructure for production runs Promising new model for the future We are just at the very beginning of the cloud revolution Cloud computing is not done Significant challenges in building ecosystem, security, usage, price-performance, etc. Lots of work to do!