Neptune. A Domain Specific Language for Deploying HPC Software on Cloud Platforms. Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams



Similar documents
HYBRID CLOUD SUPPORT FOR LARGE SCALE ANALYTICS AND WEB PROCESSING. Navraj Chohan, Anand Gupta, Chris Bunch, Kowshik Prakasam, and Chandra Krintz

appscale: open-source platform-level cloud computing

Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus

CLOUD COMPUTING. When It's smarter to rent than to buy

Final Project Proposal. CSCI.6500 Distributed Computing over the Internet

Performance measurement of a private Cloud in the OpenCirrus Testbed

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

Mobile Cloud Computing T Open Source IaaS

AppScale: Open-Source Platform-As-A-Service

2) Xen Hypervisor 3) UEC

A Cost-Evaluation of MapReduce Applications in the Cloud

Comparing Open Source Private Cloud (IaaS) Platforms

Putchong Uthayopas, Kasetsart University

Manjrasoft Market Oriented Cloud Computing Platform

wu.cloud: Insights Gained from Operating a Private Cloud System

Open Cirrus: Towards an Open Source Cloud Stack

MapReduce and Hadoop Distributed File System V I J A Y R A O

Cluster, Grid, Cloud Concepts

Open Cloud System. (Integration of Eucalyptus, Hadoop and AppScale into deployment of University Private Cloud)

Private Clouds with Open Source

Leveraging BlobSeer to boost up the deployment and execution of Hadoop applications in Nimbus cloud environments on Grid 5000

Networks and Services

Sriram Krishnan, Ph.D.

CloudCenter Full Lifecycle Management. An application-defined approach to deploying and managing applications in any datacenter or cloud environment

Dynamic Resource allocation in Cloud

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms

Efficient Cloud Management for Parallel Data Processing In Private Cloud

MapReduce and Hadoop Distributed File System

Shared Parallel File System

Mobile Cloud Computing for Data-Intensive Applications

Cloud-WIEN2k. A Scientific Cloud Computing Platform for Condensed Matter Physics

Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud

5 SCS Deployment Infrastructure in Use

Cloud computing - Architecting in the cloud

SECURE, ENTERPRISE FILE SYNC AND SHARE WITH EMC SYNCPLICITY UTILIZING EMC ISILON, EMC ATMOS, AND EMC VNX

LOGO Resource Management for Cloud Computing

Assignment # 1 (Cloud Computing Security)

Manual for using Super Computing Resources

Technology and Cost Considerations for Cloud Deployment: Amazon Elastic Compute Cloud (EC2) Case Study

Speeding up MATLAB and Simulink Applications

Comparing Ganeti to other Private Cloud Platforms. Lance Albertson

Kashif Iqbal - PhD Kashif.iqbal@ichec.ie

Accelerating Hadoop MapReduce Using an In-Memory Data Grid

Cloud Computing. Alex Crawford Ben Johnstone

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

Cloud Computing through Virtualization and HPC technologies

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud

opening the clouds qualitative overview of the state-of-the-art open source cloud management platforms. ACM, middleware 2009 conference

benchmarking Amazon EC2 for high-performance scientific computing

Cloud Computing for Control Systems CERN Openlab Summer Student Program 9/9/2011 ARSALAAN AHMED SHAIKH

Cloud Computing. Lecture 24 Cloud Platform Comparison

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Infrastructure as a Service (IaaS)

Open Source Cloud Computing: Characteristics and an Overview

Manjrasoft Market Oriented Cloud Computing Platform

1 Bull, 2011 Bull Extreme Computing

Solution for private cloud computing

Cloud Computing For Bioinformatics

Building Platform as a Service for Scientific Applications

Cloud Computing Technology

The Quest for Conformance Testing in the Cloud

Introduction to Cloud Computing

THE EUCALYPTUS OPEN-SOURCE PRIVATE CLOUD

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases

Working with HPC and HTC Apps. Abhinav Thota Research Technologies Indiana University

Amazon EC2 Product Details Page 1 of 5

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber

Deploying a Geospatial Cloud

What We Can Do in the Cloud (2) -Tutorial for Cloud Computing Course- Mikael Fernandus Simalango WISE Research Lab Ajou University, South Korea

Chapter 2 Cloud Computing

Performance Analysis of a Numerical Weather Prediction Application in Microsoft Azure

Virtualization & Cloud Computing (2W-VnCC)

Hybrid Development and Test USE CASE

Data Analysis with MATLAB The MathWorks, Inc. 1

Hadoop. Bioinformatics Big Data

Apache Hadoop. Alexandru Costan

Virtual Machine Instance Scheduling in IaaS Clouds

Cloud Computing Now and the Future Development of the IaaS

Building a Private Cloud with Eucalyptus

THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING. José Daniel García Sánchez ARCOS Group University Carlos III of Madrid

Intro to Virtualization

Dell Virtualization Solution for Microsoft SQL Server 2012 using PowerEdge R820

Data Centers and Cloud Computing

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Transcription:

Neptune A Domain Specific Language for Deploying HPC Software on Cloud Platforms Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams ScienceCloud 2011 @ San Jose, CA June 8, 2011

Cloud Computing Three tiers of abstraction: Infrastructure: Scalable hardware Platform: Scalable APIs Software: Scalable application

HPC in the Cloud Hard to automatically configure and deploy libraries Especially for the average scientist Even harder for those w/o grid experience Hard to get high performance on opaque cloud Wide range of APIs for similar services (e.g., compute, storage)

Introducing Neptune A Domain Specific Language that facilitates HPC configuration and deployment Serves High Throughput Computing (HTC) as well as Many Task Computing (MTC) Part language component, part runtime component

Design Language - Consists of metaprogramming extensions to Ruby Runtime - Operates at cloud platform layer Can control VMs as well as applications Can act standalone or as a library to existing Ruby programs

Syntax neptune :type => service-type, :option1 => setting-1, :option2 => setting-2

Semantics Options are unique per job Not all jobs are equal Uses Ruby s dynamic typing so option and setting types can be dynamic Extensible - add your own job types / options / settings

Semantics Neptune jobs return a hashmap :result is either :success or :failure Additional parameters are job-specific Customizable per job

Design Need something more than XML Need to control the execution Allows for integration with Rails Code is reusable across job types e.g., job input / output / ACLs

To Run a Job: Minimum set of requirements for jobs: Acquire resources Run job Get output Leverage AppScale cloud platform

AppScale Open source implementation of the Google App Engine APIs Can deploy to Amazon EC2 or Eucalyptus Standard three tier deployment model: Load balancer app server database

AppScale Tools Modified the command line tools Added support for placement strategies e.g., hot spares for HPC computation Added hybrid cloud support Can run in Amazon EC2 and Eucalyptus simultaneously

AppController A special daemon on all systems that configures necessary services Now acts as part of Neptune s runtime component Can receive Neptune jobs Acquires nodes from infrastructure, configures, and deploys as needed

Virtual Machine Reuse Machines are charged by the hour - so keep them for an hour in case users need them later Scheduling policies - hill climbing algorithm used based on execution time or total cost Relies on user to specify how many nodes the code can run over

Message Passing Interface (MPI) Many implementations exist - we use a commonly used version (C/C++) Data is shared via NFS User specifies how many nodes are needed, and processors are to be used, and where their code is located

An Example neptune :type => :mpi, :nodes_to_use => 64, :code => /code/nqueens, :output => /results/nqueens.txt, :storage => s3

X10 IBM s effort at simplifying parallel computing Java-like syntax, no pointers needed Can use Java backend or MPI backend via C++ Thus, same parameters as MPI jobs

Popularized by Google in 2004 Neptune supports both flavors : Regular: Write Java Mapper and Reducer, specify a single JAR for execution and main class Streaming: Write code in any language, specify location of Map and Reduce files

Stochastic Simulation Algorithms Computational scientists do large numbers of Monte Carlo simulations Embarassingly parallel Two types of simulations (DFSP and dwssa) supported by Neptune User specifies how many nodes and how many simulations to run

An Example path = /racelab/dfsp-run-#{rand} neptune :type => dfsp, :output => path, :simulations => 1_000_000 result = neptune :type => output, :output => path puts DFSP result is #{result[:stdout]}

Storage Backends Can store to any database AppScale uses Can also store to Amazon S3 Or anything that uses the same APIs e.g., Eucalyptus Walrus, Google Storage

Not Just for HPC Can be used to scale AppScale itself Specify which component to add and how many Can specify resources across clouds e.g., ten database nodes, five in each of two clouds

Limitations Not amenable to codes that require hardcoded IP addresses or other identifiers Not amenable for codes that require closed source software to run Is ok if the software is only needed for compilation or linking

Evaluation Physical hardware: Intel Xeon, 8 cores, 16GB of memory Virtual machines: 1 virtual core, 1GB of memory MapReduce Java WordCount: 2.5GB input file containing Shakespeare 500 times

Java WordCount

VM Reuse

Wrapping it Up Thanks to the AppScale team, especially colead Navraj Chohan and advisor Chandra Krintz Currently Neptune is at version 0.1.1, with added support for UPC, Erlang, Go and R gem install neptune Visit us at http://neptune-lang.org