Brian Connolly Systems Engineer, LabKey Software brian@labkey.com. LabKey Server in the Cloud





Agenda
- What is the Cloud?
- Why would I want to use the cloud?
- What will it cost?
- Using LabKey in the cloud
- How does LabKey use the cloud?
- Other scientific tools in the cloud

Introduction
- Who am I?
- What do I bring to the conversation?

What is the Cloud?

What is the cloud?
Wikipedia says: "Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility (like the electricity grid) over a network (typically the Internet)." - http://en.wikipedia.org/wiki/cloud_computing
What does this mean to me?
- Rent vs. buy
- Only pay for what you use
- It takes minutes to have a new computer, instead of days or weeks

Types of Clouds
- Datacenter as a Service*
- Platform as a Service*
- Software as a Service*
How non-IT folks see the cloud vendors.
* http://en.wikipedia.org/wiki/cloud_computing

Datacenter as a Service
From an API or GUI, you can provision and manage all the pieces of a datacenter (servers, network hardware, storage, databases, etc.).
What do you get?
- Full control over:
  - servers (install and configure them any way you want)
  - network hardware (firewall, load balancing)
- Pay as you use, self-service, and support for custom configs
- Access to all of the vendor's services: storage, database, message queues, CDN, load balancers, firewall, etc.
Major vendors: Amazon Web Services

Platform as a Service
"delivers a computing platform and/or solution stack as a service, often consuming cloud infrastructure and sustaining cloud applications." - Wikipedia (http://en.wikipedia.org/wiki/iaas#platform)
What do you get?
- Servers (or their equivalent)
- Storage service
- Database service, etc.
- No firewall, load balancing, etc.
- Failover, clustering, custom configs: maybe
- Pay as you use and self-service

Platform as a Service (cont)
There are two types of Platform as a Service:
1. You get a server (or servers)
  - essentially the old hosted-server model
  - existing hosting and VPS companies are in this space
  - major vendors: Rackspace, GoGrid, IBM, etc.
2. You write some code and hit the deploy button
  - you never interact with the servers directly
  - your application code is bundled with deployment descriptors and sent to the cloud via an API
  - major vendors: Microsoft, Google App Engine, Heroku

Software as a Service
"deliver software over the Internet, eliminating the need to install and run the application on the customer's own computers and simplifying maintenance and support." - Wikipedia (http://en.wikipedia.org/wiki/iaas#application)
What do you get?
- You are the end user of this software
- Personalization available
- Pay as you use
Major vendors: Salesforce, Spotify, Flickr, etc.

Why would I want to use the cloud?

Why would I want to use the cloud?
- To meet a deadline
  - A reviewer asked for the samples to be processed using a new method
  - I need to process a large number of samples for a grant application
- Prototyping: try a new processing method
  - Proteomics: use an updated FASTA file or an additional parameter
  - Genomics: the reference sequence has changed
- I have a new hypothesis and want to quickly re-process my data

Why would I want to use the cloud? (cont)
- I want to try out new software to see if it meets my needs
  - LabKey Online
  - Galaxy's free public server
  - UCSC Genome Browser
- I want to automate my pipeline
  - Cyclecomputing.com (push-button HPC in the cloud)
  - StarCluster
  - CloudBioLinux
- No need to wait 3 months for IT to purchase and set up servers

Why would I NOT want to use the cloud? (cont)
- Processing huge amounts of data
  - Data transfer time is too long
  - Small network pipe to the internet
  - transfer time + processing time in the cloud >= processing time on your laptop
- I have a long-running study (a year or more) and I need the computing available 24x7 (iffy)
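The transfer-time rule of thumb on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name, the decimal unit conversion (1 GB = 8000 megabits), and all of the example numbers are assumptions, not figures from the talk.

```python
def cloud_is_worthwhile(data_gb, uplink_mbps, cloud_hours, local_hours):
    """Apply the slide's rule: the cloud only pays off in time if
    transfer time + cloud processing time < local processing time.

    All parameter names are hypothetical; units are GB, megabits/s, hours.
    """
    # GB -> megabits (decimal units), then seconds -> hours.
    transfer_hours = (data_gb * 8000) / uplink_mbps / 3600
    total_cloud_hours = transfer_hours + cloud_hours
    return total_cloud_hours < local_hours, total_cloud_hours


# Made-up example: 500 GB over a 10 Mb/s uplink, 2 h in the cloud vs 48 h locally.
ok, total = cloud_is_worthwhile(data_gb=500, uplink_mbps=10,
                                cloud_hours=2, local_hours=48)
print(f"worthwhile={ok}, total cloud time={total:.1f} h")
```

In this made-up case the upload alone takes longer than running the job locally, which is the situation the slide warns about.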

What will it cost?

What will it cost?
- What will I be charged for?
- How will I be billed?
- How to estimate my costs

What will I get charged for?
When using the cloud, you are renting the computers you need.
- Most clouds bill by the hour
  - vendors: AWS, Rackspace, Windows Azure, Google App Engine, etc.
  - some do not (Heroku)
- SaaS usually bills by the month
  - vendors: Salesforce, etc.
- If you forget to turn it off, you will still be billed

How will I be billed?
- Billed monthly
- Billed to a credit card
- Large institutions or large companies use purchase orders
- Monitoring usage and cost during the month

How to estimate your costs
In general, the things you will get charged for are:
- Servers (instances)
- Network usage, i.e. what you send into and out of the cloud
- Storage, i.e. how much data you store in the cloud

Estimating Costs: Servers
What do I mean by servers?
- Called instances at AWS, Google App Engine and Windows Azure
- Called dynos at Heroku
- Usage is charged per hour
- Price goes up with the size of the server
How to estimate:
- How many servers will you need?
- What type of servers do you need? Windows or Linux?
- What will they be doing?
- How big a server do you need?
- Where should they be located?
- AWS: Spot Instances
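As a rough sketch of the per-hour server charge, the estimate is just rate x hours x number of servers. The $0.50/hour rate below is a hypothetical placeholder (real prices depend on vendor, server size, operating system and location), and rounding partial hours up to a full hour is an assumption rather than something stated in the talk.

```python
import math


def instance_cost(hourly_rate, hours_running, count=1):
    """Estimate the server charge for `count` identical instances.

    Assumption: a partial hour is billed as a full hour.
    """
    billed_hours = math.ceil(hours_running)
    return round(hourly_rate * billed_hours * count, 2)


HYPOTHETICAL_RATE = 0.50  # dollars per instance-hour, illustration only

# Four servers used for a 7.5-hour processing run.
job = instance_cost(HYPOTHETICAL_RATE, 7.5, count=4)

# One server that nobody turned off for a 30-day month.
forgotten = instance_cost(HYPOTHETICAL_RATE, 24 * 30)

print(f"processing run: ${job:.2f}, forgotten server: ${forgotten:.2f}")
```

The second figure illustrates why it matters to turn servers off: an idle instance is billed exactly like a busy one.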

Estimating Costs: Network
What do I mean by network?
- Bandwidth into and out of the cloud
- You are charged only for bandwidth out of the cloud
  - Bandwidth into the cloud is generally free
  - Bandwidth between servers is generally free
  - Bandwidth between datacenters is not free
For most scientific applications:
- This is usually small compared to servers
- 100 GB of traffic in a month = $15
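The slide's figure of $15 for 100 GB implies an outbound rate of about $0.15 per GB. A minimal sketch of the network estimate, assuming that implied rate and free inbound traffic (the function and constant names are made up for illustration, and traffic between datacenters is not modelled):

```python
# Rate implied by the slide: $15 for 100 GB of traffic in a month.
OUTBOUND_RATE_PER_GB = 15.00 / 100


def network_cost(gb_out, gb_in=0.0,
                 rate_out=OUTBOUND_RATE_PER_GB, rate_in=0.0):
    """Estimate the monthly network charge in dollars.

    Inbound traffic defaults to free, as the slide says is generally the case.
    """
    return round(gb_out * rate_out + gb_in * rate_in, 2)


# Upload 500 GB of raw data, download 20 GB of results.
print(network_cost(gb_out=20, gb_in=500))
```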

Estimating Costs: Storage
What do I mean by storage?
- Amount of data you have stored in the cloud
- Windows Azure
  - $0.15/GB per month, based on daily average
  - You are charged for the number of transactions
- AWS
  - $0.10/GB per month
  - You are charged for the number of I/O requests
For most scientific applications:
- This can be a significant cost
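A sketch of the capacity part of the storage bill, using the per-GB rates quoted on the slide. Averaging daily usage is stated on the slide only for Windows Azure; applying the same averaging to AWS is an assumption here. The transaction and I/O request charges are left out because the slide gives no rates for them, and the usage figures are invented.

```python
# Per-GB monthly rates as quoted on the slide (2011 prices).
RATE_PER_GB_MONTH = {"aws": 0.10, "azure": 0.15}


def storage_cost(daily_gb, vendor):
    """Estimate one month's capacity charge from a list of daily GB stored.

    Excludes per-transaction / per-I/O-request charges.
    """
    average_gb = sum(daily_gb) / len(daily_gb)
    return round(average_gb * RATE_PER_GB_MONTH[vendor], 2)


# Invented usage: 50 GB for the first 10 days, 100 GB for the next 20.
usage = [50] * 10 + [100] * 20

for vendor in ("aws", "azure"):
    print(vendor, storage_cost(usage, vendor))
```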

Using LabKey in the cloud

Using LabKey in the cloud
- Who is doing it
- Which clouds can run a LabKey Server
- Installing LabKey from scratch

Who is running LabKey in the cloud?
- LabKey
  - LabKey Online
  - Test servers
- A non-profit research institute
- A Seattle-based biotech company

Which clouds can run LabKey Server?
- Datacenter as a Service clouds
  - Amazon Web Services
- Some Platform as a Service clouds
  - Rackspace
  - GoGrid
  - IBM SmartCloud
- LabKey currently cannot be used on
  - Windows Azure
  - Google App Engine
  - Heroku

Installing LabKey in a cloud
1. Start a new instance at your cloud provider
2. Download the LabKey Server installer
  - Windows installer
  - Linux installer (coming in 11.3)
3. Install LabKey Server
  - Instructions at http://www.labkey.org
4. Start using your LabKey Server in the cloud

How does LabKey use the cloud?

How does LabKey do it?
- Use Amazon Web Services and Rackspace Cloud offerings
- Operating systems
  - Linux: Ubuntu 10.04 LTS
  - Windows: Windows Server 2008

How does LabKey do it? (cont)
Installation/Configuration
- Choose the latest Ubuntu AMI (http://uecimages.ubuntu.com/releases/10.04/release/)
- Use EBS-backed instances
- AWS: use CloudFormation to provision
  - Instances
  - Networks (firewalls)
  - Disks
- Use Chef (http://opscode.com/chef) to automate install/configuration
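To make the CloudFormation step more concrete, here is a minimal sketch of a template that declares the three kinds of resources listed above: an instance, a firewall (security group) and a disk (EBS volume). It is not LabKey's actual template. The resource names, the open ports, the address range, the device name and the `ami-PLACEHOLDER` image ID are all hypothetical; the instance type, volume size and availability zone are borrowed from the LabKey Online figures given elsewhere in the talk. The template is built as a Python dictionary and printed as JSON.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: one server, one firewall, one data disk",
    "Resources": {
        # Networks (firewalls): hypothetical rules, HTTPS from anywhere
        # and SSH from a single example address range.
        "ServerFirewall": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Allow HTTPS and restricted SSH",
                "SecurityGroupIngress": [
                    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                     "CidrIp": "0.0.0.0/0"},
                    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
                     "CidrIp": "203.0.113.0/24"},
                ],
            },
        },
        # Disks: an EBS volume in the same availability zone as the server.
        "DataVolume": {
            "Type": "AWS::EC2::Volume",
            "Properties": {"Size": 85, "AvailabilityZone": "us-east-1c"},
        },
        # Instances: ImageId is a placeholder, not a real AMI.
        "Server": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-PLACEHOLDER",
                "InstanceType": "m1.large",
                "AvailabilityZone": "us-east-1c",
                "SecurityGroups": [{"Ref": "ServerFirewall"}],
                "Volumes": [{"VolumeId": {"Ref": "DataVolume"},
                             "Device": "/dev/sdf"}],
            },
        },
    },
}

print(json.dumps(template, indent=2))
```

Keeping the whole environment in one declarative file means a test server can be rebuilt, or thrown away, in a repeatable way; software installation on the instance would then be handed to a tool such as Chef.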

How does LabKey do it? (cont)
- Data upload/download speeds
  - what do we see here at FHCRC?
  - the "ship us a hard drive" option
- Processor/memory combinations
  - test and measure
- Pipelines in the cloud
  - our experience working with Galaxy

What does it cost us?
Let's use LabKey Online as an example.
Server stats:
- Instance type: m1.large (2)
- EBS volumes: 85 GB total
- Operating system: Linux
- Datacenter: us-east-1c
Cost breakdown (average monthly price, July to October 2011):
Cost       Price     Percentage of total
Instance   $250.92   95.8%
Storage    $10.92    4.1%
Network    $0.04     0.1%
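The breakdown can be reproduced from the three monthly prices. This small sketch recomputes the total and each category's share; the variable names are illustrative, and the recomputed shares may differ slightly from the rounded percentages shown on the slide.

```python
# Average monthly prices from the LabKey Online example (July to October 2011).
costs = {"Instance": 250.92, "Storage": 10.92, "Network": 0.04}

total = sum(costs.values())
shares = {name: 100 * amount / total for name, amount in costs.items()}

for name, amount in costs.items():
    print(f"{name:<9} ${amount:>7.2f}  {shares[name]:5.1f}%")
print(f"{'Total':<9} ${total:>7.2f}")
```

The server charge dominates, which matches the earlier point that network cost is usually small compared to servers.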

Other scientific tools in the cloud

Other scientific tools in the cloud
- Galaxy
  - Both SaaS and install on your own instances in the cloud
- GenomeSpace
  - Cytoscape, Galaxy, GenePattern, Genomica, Integrative Genomics Viewer (IGV), and the UCSC Browser in the cloud
- The Gaggle
  - A framework for exchanging data between independently developed software tools and databases
- CloudBioLinux
- StarCluster

Key Messages
- LabKey has been run successfully in the cloud, both by LabKey and by a number of customers
- We would love to help you get started using LabKey in the cloud

Any questions? Brian Connolly brian@labkey.com 206-667-7521

If you use LabKey Server for your research, please reference one of these publications about the platform:
- General Use: Nelson EK, Piehler B, Eckels J, Rauch A, Bellew M, Hussey P, Ramsay S, Nathe C, Lum K, Krouse K, Stearns D, Connolly B, Skillman T, Igra M. LabKey Server: An open source platform for scientific data integration, analysis and collaboration. BMC Bioinformatics 2011 Mar 9; 12(1): 71.
- Proteomics: Rauch A, Bellew M, Eng J, Fitzgibbon M, Holzman T, Hussey P, Igra M, Maclean B, Lin CW, Detter A, Fang R, Faca V, Gafken P, Zhang H, Whitaker J, States D, Hanash S, Paulovich A, McIntosh MW. Computational Proteomics Analysis System (CPAS): An Extensible, Open-Source Analytic System for Evaluating and Publishing Proteomic Data and High Throughput Biological Experiments. Journal of Proteome Research 2006, 5:112-121.
- Flow Cytometry: Shulman N, Bellew M, Snelling G, Carter D, Huang Y, Li H, Self SG, McElrath MJ, De Rosa SC. Development of an automated analysis system for data from flow cytometric intracellular cytokine staining assays from clinical vaccine trials. Cytometry 2008, 73A:847-856.