Cloud Computing For Bioinformatics. EC2 and AMIs





Cloud Computing Quick-starting an EC2 instance (let's get our feet wet!)

Cloud Computing: EC2 instance Quick Start
On the EC2 console, we can click on Launch Instance.
This will let us get up and going quickly.

Cloud Computing: EC2 instance Availability Zone
First things first: choose the region and availability zone that you would like to work in.
Remember that usage prices vary by region.
Your AMIs and S3 buckets live in one region only, and each EBS volume lives in one availability zone only. They will not show up if you choose a different location for your instance than the one where you saved your data.
We will use US-West for this tutorial.

Cloud Computing: EC2 instance Choosing an AMI
Quick Start: AMIs pre-configured for popular operating systems & software packages.
My AMIs: where you would choose your custom-made images. You have access to any of your AMIs in the region you've chosen.

Cloud Computing: EC2 instance Community AMIs
AMIs created by users & made public. Use at your own risk.
Two types of root devices:
  EBS (OS on EBS)
  Instance-store (OS on ephemeral storage)
When possible, use EBS. Many community AMIs are instance-store, so check the root device type before you launch.
For this example, we'll get one of Canonical's official Ubuntu releases. For our purposes, we'll search for ami-d197c694.
Official Ubuntu 10.04 releases: http://uec-images.ubuntu.com/releases/10.04/release/

Cloud Computing: EC2 instance Instance Details
Choose the type of instance you want to run. Remember, different sizes have different prices!
We're using a 64-bit AMI, so the smallest choice available is Large.
You can request spot instances here, but we want instant gratification at this point.

Cloud Computing: EC2 instance Kernel ID & RAM Disk ID
Kernel ID & RAM Disk ID identify the kernel image & initial RAM disk your instance will boot with.
  Super-fiddly, and 99.99% of the time you'll go with the defaults.
CloudWatch monitoring is an extra pay-for service that monitors your instance's usage & provides metrics based on it.
  Costs extra.
  Useful if you're interested in the crunchy bits of what the instance is doing; otherwise you can skip it.

Cloud Computing: EC2 instance Key Pairs
Key pairs are part of the security protocol for AWS. Only a user with the appropriate key pair will be allowed to log onto an instance as root.
Key pairs are specific to machines. You'll create one for each machine you'd like to have root access to your instance.
Security is not perfect: if your machine has root access, anyone using it has root access.
Name your key pair, then download it. Remember where you save it; you may need it later.

Cloud Computing: EC2 instance Security Groups
Think of this as an instance-specific firewall setting.
The default group is useless, as it has no SSH access, so we're making our own.
Create a custom group:
  Click on New Security Group.
  Give it a name and description.
  From the dropdown at the bottom, choose SSH and add the rule.

Cloud Computing: EC2 instance Launch!
The instance will spin up, and you'll be ready to log in!
Once the status says running, clicking on the instance will show details about it.
Note that this is an EBS-backed instance. If your image is instance-store, you will not be able to create snapshots of your server!
Pay attention to the Public DNS; this is what you'll use to SSH into your machine.
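
The console choices above have a rough command-line counterpart in the EC2 API tools, the same tool family as the ec2-create-volume command used later in this talk. The sketch below only prints the command instead of running it. The key pair name mykey and the group name ssh-only are made-up placeholders, and m1.large and us-west-1 are assumed to be the API names for the Large type and the US-West region.

```shell
#!/bin/sh
# Print (do not run) an ec2-run-instances command for the choices made above.
# Arguments: AMI ID, instance type, key pair name, security group, [region].
launch_command() {
    ami="$1"; type="$2"; key="$3"; group="$4"; region="${5:-us-west-1}"
    echo "ec2-run-instances $ami -t $type -k $key -g $group --region $region"
}

# "mykey" and "ssh-only" are hypothetical names; use the ones you created.
launch_command ami-d197c694 m1.large mykey ssh-only
```

To actually launch, the printed command would have to be run on a machine where the EC2 API tools are installed and set up with your X.509 credentials.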

Cloud Computing Logging into your Instance

Cloud Computing: Logging in From Windows
This one takes a little more doing.
Download PuTTY and PuTTYgen: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
PuTTY doesn't support the .pem file Amazon provides, so we'll need to convert it to PuTTY's .ppk format:
  Launch PuTTYgen.
  Load the private key created during instance launch (told you to remember where you saved it!).
  Save the private key.

Cloud Computing: Logging in From Windows
Launch PuTTY.
Copy/paste the Public DNS from the AWS Management Console into Host Name (or IP address).
Go to Connection -> SSH -> Auth and load the key we created in PuTTYgen.
Go to Connection -> Data and put ubuntu in the Auto-login username (or root for other Unix instances).
Optionally save this configuration (to avoid doing all this in the future).
You're good to go!
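
From Linux or Mac OS X no key conversion is needed, since the OpenSSH client reads the .pem file directly. A minimal sketch, assuming the key was saved as mykey.pem (a placeholder name) and using a made-up Public DNS name:

```shell
#!/bin/sh
# Build the ssh command for logging in with the downloaded .pem key.
# The user defaults to ubuntu, as on the Ubuntu AMI used in this talk.
ssh_command() {
    key_file="$1"; public_dns="$2"; user="${3:-ubuntu}"
    echo "ssh -i $key_file $user@$public_dns"
}

# ssh refuses private keys that other users can read, so first run:
#   chmod 400 mykey.pem
# The host name below is a placeholder; copy yours from the console.
ssh_command mykey.pem ec2-203-0-113-10.us-west-1.compute.amazonaws.com
```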

Cloud Computing Now that we're on the cloud, let's take a few minutes and enjoy the view (I can see my house from here!)

[Diagram: You -> Your server]

Cloud Computing
Even though you need a computer to access your instance, you are running on a machine with great capabilities.
This computer runs independently of the computer you access it from (and also costs money as long as it's on).
Let's go over a few general topics & terms.

Cloud Computing Security and Credentials (a.k.a. the Labyrinth)

Cloud Computing: Credentials
Credentials serve to identify your machine to Amazon's services.
The credentials you use vary with the type of API you're accessing.
To get to the Credentials page, go to http://aws.amazon.com/account/ and click on Security Credentials (I forget where this is all the time).

Cloud Computing: Credentials
Three different types of credentials for accessing APIs:
  EC2 Key pairs: created on-the-fly at EC2 instance start, or from the EC2 console under Key Pairs
  Access keys: created when the account is created; can also be created from the Credentials page
  X.509 Certificates: created from the Credentials page
Other authentication methods for other services:
  Username/password
Account identifiers:
  AWS Account ID: if you want to let someone else use one of your AWS resources (like an Amazon EC2 AMI, Amazon EBS snapshot, Amazon SQS queue, etc.), you use the AWS Account ID to specify the account.
  Canonical User ID: if you want to share Amazon S3 resources (objects and buckets) with another AWS account, you use this ID to specify the account. The ID is a long string.
More info at: http://docs.amazonwebservices.com/awssecuritycredentials/1.0/aboutawscredentials.html

Cloud Computing: Credentials
If you want to... | Use this credential
Make a REST or Query API request to an AWS product | Access Keys
Make a SOAP API request to an AWS product | X.509 (except for S3 and Amazon Mechanical Turk, which require Access Keys)
Access secure pages on the AWS web site or AWS Management Console | Username/Password with optional Multi-Factor Authentication
Use the Amazon EC2 command line tools | X.509
Launch or connect to an Amazon EC2 instance | EC2 Key pairs
Bundle an Amazon EC2 AMI | Linux/UNIX AMIs: X.509 & Amazon Account ID to bundle the AMI, and Access Keys to upload to S3. Windows AMIs: Access Keys for both bundling and uploading.
Share an Amazon EC2 AMI or Amazon EBS snapshot | Amazon Account ID of the account you want to share with (without hyphens)
Create a signed URL to access Amazon CloudFront private content | CloudFront Key Pairs
Access AWS Discussion Forums or AWS Premium Support site | Username/Password

Cloud Computing: Credentials AWS Account Identifiers
Mainly used for permissions & sharing between accounts.
Multiple accounts can share data (personal vs lab-wide).
S3 uses different credentials for sharing.
Our ID is 3249-7882-2037; share with us!
Get your account identifier; we'll use it later.

Cloud Computing: Credentials X.509 Certificates
The same kind of certificates used for SSL.
Can be created either on your own or by Amazon.
  To create your own, use openssl & CA.pl, but this is beyond our scope.
You can only have two X.509 certificates at any time.
They should be rotated every 90 days.

Cloud Computing: Credentials
There are too many credentials, they make my head spin! Remember these three stooges:
  AWS user name and password: shopping at Amazon, logging into your AWS web console, and account management.
  SSH key pairs: logging into and using your cloud machine.
  X.509 certificate and key: getting the most out of your EC2 cloud machine (the EC2 command line tools need them).

Cloud Computing Security Groups

Cloud Computing: Security Groups
Security groups allow you to set up firewalls for each of your instances.
They let you save setups for various services, such as a web server or a mail server.
The default security group is rather useless (no SSH).
Some services are pre-defined for you. Learn these, as you may want to access your instance in ways you haven't thought of (FTP, Telnet, MySQL, etc.).

Cloud Computing: Security Groups Creating new Security Group
Let's create one that gives us SSH access.
Go to Console -> EC2 -> Security Groups and click Create Security Group.
Give it a name & description.
From the bottom pane, you can add connection methods:
  Protocol is either TCP (connection-oriented streams) or UDP (connectionless datagrams).
  Define a port range & source IP or group:
    0.0.0.0/0: any IP
    Localhost: only the instance
    Your computer's IP: only your machine can access the instance
For our example, let's add SSH and click save.
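
The same group can be sketched with the EC2 API tools. The helper below only prints the two commands. The group name ssh-only and the description text are illustrative, and the source defaults to 0.0.0.0/0 (any IP) unless a narrower address is given.

```shell
#!/bin/sh
# Print the EC2 API tool commands that create a group and open SSH (TCP 22).
# source_cidr defaults to 0.0.0.0/0 (any IP); pass a.b.c.d/32 for one machine.
ssh_group_commands() {
    group="$1"; source_cidr="${2:-0.0.0.0/0}"
    echo "ec2-add-group $group -d 'SSH access only'"
    echo "ec2-authorize $group -P tcp -p 22 -s $source_cidr"
}

# "ssh-only" is a hypothetical group name; 198.51.100.7 is a documentation IP.
ssh_group_commands ssh-only 198.51.100.7/32
```

Restricting the source to your own address with a /32 suffix corresponds to the "your computer's IP" choice on the slide.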

Cloud Computing And now we go back to your regularly scheduled program(ming)

Cloud Computing Installing Software onto your AMI

Cloud Computing: Software
We're starting with a bare instance, so we want to update the software:
    sudo apt-get update
    sudo apt-get upgrade
Let's install some bioinformatics packages:
    sudo apt-get install bioperl
    sudo apt-get install python-biopython
Biolinux anyone? First edit /etc/apt/sources.list and add biolinux to it:
    deb http://nebc.nerc.ac.uk/bio-linux/ unstable bio-linux
Then update, install the keyring, and update again:
    sudo apt-get update
    sudo apt-get install bio-linux-keyring
    sudo apt-get update

Cloud Computing: Software
Now we can install some biolinux goodness:
    sudo apt-get install blast2 bio-linux-blast+ bio-linux-blast ncbi-tools-bin bio-linux-big-blast libncbi6
Let's add R to our instance. Add to sources.list & get the GPG key:
    deb http://cran.cnr.berkeley.edu/bin/linux/ubuntu lucid/
    sudo gpg --keyserver subkeys.pgp.net --recv-key E2A11821
    sudo gpg -a --export E2A11821 | sudo apt-key add -
    sudo apt-get update
Now we can install R and some R packages:
    sudo apt-get install r-base
    sudo apt-get install r-cran-boot r-cran-class r-cran-cluster r-cran-codetools r-cran-foreign r-cran-kernsmooth r-cran-lattice r-cran-mass r-cran-matrix r-cran-mgcv r-cran-nlme r-cran-nnet r-cran-rpart r-cran-spatial r-cran-survival r-cran-vr r-cran-rodbc
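
Editing sources.list by hand is easy to repeat by accident when a setup is rerun. One possible way to script it is to append a repository line only when it is missing. The helper name add_apt_source is made up for this sketch, and the demonstration writes to a scratch file rather than the real /etc/apt/sources.list.

```shell
#!/bin/sh
# Append a repository line to a sources file only if it is not already there.
add_apt_source() {
    line="$1"; file="$2"
    grep -qxF -e "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Demonstrated on a scratch file; on the instance the target would be
# /etc/apt/sources.list and the function would need to run as root.
demo_file="$(mktemp)"
add_apt_source "deb http://nebc.nerc.ac.uk/bio-linux/ unstable bio-linux" "$demo_file"
add_apt_source "deb http://nebc.nerc.ac.uk/bio-linux/ unstable bio-linux" "$demo_file"
add_apt_source "deb http://cran.cnr.berkeley.edu/bin/linux/ubuntu lucid/" "$demo_file"
cat "$demo_file"
rm -f "$demo_file"
```

The second call is a no-op, so the scratch file ends up with one line per repository.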

Cloud Computing Saving your AMI

Cloud Computing: Saving & Sharing
We've done all this work, but if we were to terminate this instance, it's all gone.
To avoid this, we need to save our AMI for future use.
Once we have our own AMI updated and loaded with our tools, any number of users can launch an instance with our setup.
Think of it as saving your progress in a video game (or your kids' video game).

Cloud Computing: Saving & Sharing A Note: Stop vs Terminate
All instances have Terminate under the Instance Actions menu.
  This will kill the instance.
  This is how you lose all your changes.
All instances also have Reboot.
  Rebooting is data-safe: you'll keep your data on reboot.
EBS-backed instances have an additional option: Stop.
  This stops the instance so you won't get charged for running it, but keeps everything you've worked on in the instance's EBS volume.
  If you go on to Terminate the stopped instance, all work is still lost!
So now we know many ways to lose all our work. How do we actually save it?
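
The rules on this slide can be condensed into a small lookup. This is only a restatement of the slide as a shell function. The name data_after and the root type labels ebs and instance-store are chosen for the sketch.

```shell
#!/bin/sh
# Report what happens to the root disk for an action and root device type,
# following the rules on this slide. root_type is "ebs" or "instance-store".
data_after() {
    action="$1"; root_type="$2"
    case "$action" in
        reboot)    echo "kept" ;;
        terminate) echo "lost" ;;
        stop)
            if [ "$root_type" = "ebs" ]; then
                echo "kept"
            else
                echo "not available"   # only EBS-backed instances can stop
            fi ;;
        *) echo "unknown action" ;;
    esac
}

for action in reboot stop terminate; do
    echo "$action on ebs: $(data_after "$action" ebs)"
done
```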

Cloud Computing: Saving & Sharing Creating an Image
In the Console, click on your instance, then under Instance Actions, click Create Image (EBS AMI).
You'll probably notice you've been kicked out of your session. Amazon makes the instance unavailable while it takes the snapshot.
  Do not create an image while jobs are running, or you'll lose your work.
Once the AMI is created, you may terminate your current instance and launch a new instance from your AMI. All the work you've done will be there.

Cloud Computing: Saving & Sharing AMI Permissions
AMIs are, by default, private. This means only your account has access to them. But you can change this.
In the Console, go to AMIs, choose the AMI you just created, and click Permissions.
  You can share this AMI with other AWS account numbers (remember the Security and Credentials slides? They come into play here).
  You can also set this AMI to public, allowing anyone who knows the AMI ID to find it and run instances from it.
  They cannot modify your AMI, but can create snapshots from it.
We'll be sharing data later on.

Cloud Computing: Saving & Sharing Speaking of Sharing: Public Data Sets
Amazon hosts many public data sets that can be used in the cloud.
The list of data sets is at http://bit.ly/amazonpublicdata (http://developer.amazonwebservices.com/connect/kbcategory.jspa?categoryid=243).
Public data sets are hosted by Amazon, and are not billed to creators.
  Great way to stop your big data from costing you money.
  This is public as in public public; you give up access control rights to the data.
To make a data set public, fill out the form at: http://developer.amazonwebservices.com/connect/entrycreate!default.jspa?categoryid=244&entrytypeid=14 (Now you see why I like to shorten URLs).
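
Sharing with a specific account can also be sketched with the EC2 API tools. Account IDs are displayed with hyphens but, as the credentials table notes, are entered without them. The helper names below are made up, ami-12345678 is a placeholder AMI ID, and the command is printed rather than run.

```shell
#!/bin/sh
# Account IDs are shown with hyphens but must be given without them.
strip_hyphens() {
    echo "$1" | sed 's/-//g'
}

# Print the EC2 API tools command that adds a launch permission for one
# account. The AMI ID passed in below is a placeholder for your own.
share_command() {
    ami="$1"; account="$(strip_hyphens "$2")"
    echo "ec2-modify-image-attribute $ami -l -a $account"
}

share_command ami-12345678 3249-7882-2037
```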

Cloud Computing Loading Data into the Clouds (with rockets)

Cloud Computing: Loading Data
There are two services that hold file-type data: S3 and EBS.
Pros and cons, compared on durability of data, price per GB, disk I/O operations, file sizes, speed, and ease of use:
Criterion | S3 | EBS
Price per GB | More expensive per GB, but pay only for what you use | 1/3 cheaper, but pay for what is allocated, not just used
File sizes | Unlimited files, but each capped at 5 GB | No cap, but size limited to volume size on creation

Cloud Computing: Loading Data Why bioinformaticists should NOT use S3
Files are capped at 5 GB, even if you can make unlimited numbers of them.
Crappy support from uploaders & file managers:
  Web interface: the best interface
  S3Fox: Firefox add-on from a developer who vocally rejects updating the software
  S3browser: not bad, but Windows-only
Speed concerns (150 MB file = 4 minutes over S3, 20 seconds over mounted EBS).
Here's the kicker: most disk operations require transferring data to EBS regardless!
That said, EMR uses S3 exclusively, so sometimes you can't avoid the headaches.

Cloud Computing: Loading Data Loading Data the Easy Way
Since your instance is EBS-backed, you can refer to Creating an Image earlier in the talk, and do that. Your data is saved!
A few problems with this:
  Data is needlessly tied to the system it runs on.
  Sharing issues: sharing this image means sharing your file system AND data.
  If you need to restore your image to an earlier version, you revert your data as well.
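
The speed figures quoted above work out to a twelvefold difference (240 seconds versus 20 seconds for the same 150 MB file). A short worked calculation, using only the numbers from the slide:

```shell
#!/bin/sh
# Throughput in MB/s for a transfer of size_mb megabytes taking secs seconds.
throughput() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.3f\n", mb / s }'
}

echo "S3:  $(throughput 150 240) MB/s"   # 150 MB in 4 minutes
echo "EBS: $(throughput 150 20) MB/s"    # 150 MB in 20 seconds
```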

Cloud Computing Loading Data into the Clouds (Slightly Less Easy Way)

Cloud Computing: Loading Data Loading Data the Slightly Less Easy Way
To separate the concerns of data vs operating system, we'll need to create a separate EBS volume.
This volume will allow you to share ONLY your data, and update the data independently of the system you run it on.
It allows other systems to mount your data.
  Assumes the file system format is compatible.

Cloud Computing: Loading Data Loading Data the Slightly Less Easy Way
Find your instance ID:
  Go to the Console, click on your instance, then in the detail pane, copy the ID after EC2 Instance (should begin with i-).
Create an EBS volume (or get the volume ID of one already made). The volume must be in the same availability zone as your instance:
    ec2-create-volume --size 10 -z us-west-1a
  (This will print out the volume ID.)
Use the volume ID to attach the volume to your instance:
    ec2-attach-volume <volume ID> -i <instance ID> -d <device: /dev/sdd>
Log into your instance and check that the drive is available:
    sudo fdisk -l /dev/sdd
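
When scripting these two steps, the volume ID has to be carried from the first command to the second. The sketch below assumes that ec2-create-volume prints a line whose first field is VOLUME and whose second field is the ID. The sample line is illustrative rather than captured from a real run, and i-1234abcd is a placeholder instance ID.

```shell
#!/bin/sh
# Pull the volume ID out of ec2-create-volume output read from stdin.
# Assumes the tool prints a line starting with VOLUME whose second field
# is the ID; check the output of your tool version.
volume_id() {
    awk '$1 == "VOLUME" { print $2; exit }'
}

# Illustrative output line, not captured from a real run.
sample="VOLUME vol-1a2b3c4d 10 us-west-1a creating"
vol="$(echo "$sample" | volume_id)"

# i-1234abcd is a placeholder instance ID.
echo "ec2-attach-volume $vol -i i-1234abcd -d /dev/sdd"
```

In real use the sample line would be replaced by piping the output of ec2-create-volume into volume_id.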

Cloud Computing: Loading Data Loading Data the Slightly Less Easy Way
If your EBS volume is new, you'll have to create a file system on it:
    sudo mkfs.ext3 /dev/sdd
Mount and use the new file system:
    sudo mkdir <whatever directory you like: /data>
    sudo mount /dev/sdd /data
And now give yourself permissions on that new directory:
    sudo chown ubuntu:ubuntu /data
To unmount the drive on your instance:
    sudo umount /data
Then, to detach the volume, from your local computer:
    ec2-detach-volume <volume ID> -i <instance ID>
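
The steps on this slide can be collected into one script. This is a sketch, not a tested provisioning tool: it defaults to a dry run that only prints each command, and it formats the device only when NEW_VOLUME=1 is set, because mkfs erases whatever is on the volume. The function and variable names are chosen for this example.

```shell
#!/bin/sh
# run: print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Format (only when NEW_VOLUME=1), mount, and hand the directory to a user.
setup_data_volume() {
    device="$1"; mount_point="$2"; owner="${3:-ubuntu}"
    if [ "${NEW_VOLUME:-0}" = "1" ]; then
        run sudo mkfs.ext3 "$device"    # destroys existing data on the device
    fi
    run sudo mkdir -p "$mount_point"
    run sudo mount "$device" "$mount_point"
    run sudo chown "$owner:$owner" "$mount_point"
}

setup_data_volume /dev/sdd /data
```

Setting DRY_RUN=0 on the instance would execute the commands instead of printing them.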

Cloud Computing: Loading Data Loading Data the Slightly Less Easy Way
Best ways to get data:
  WGET: allows you to pull data from a URL
  SCP: secure file transfer over the SSH protocol; can transfer between two machines you have permissions on
  RSYNC: auto-resumable file transfer between two machines
  FTP: good old FTP. Emphasis on old.
Any of these is still faster than loading data over S3.
Don't forget public data sets! You can mount those EBS snapshots as well.
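
As a rough illustration of choosing between these tools, the helper below suggests wget for a URL and rsync over SSH for a local path. It only prints the command. The host ubuntu@HOST, the key mykey.pem, and the destination /data/ are placeholder defaults, and the example URL is made up.

```shell
#!/bin/sh
# Suggest a transfer command: wget for URLs, rsync over SSH for local paths.
# host, key and dest default to placeholder values.
transfer_command() {
    src="$1"; host="${2:-ubuntu@HOST}"; key="${3:-mykey.pem}"; dest="${4:-/data/}"
    case "$src" in
        http://*|https://*|ftp://*)
            # Run this one on the instance itself; -c resumes a partial file.
            echo "wget -c $src" ;;
        *)
            # Run this one on your own computer; --partial keeps
            # interrupted files so a rerun can resume.
            echo "rsync -avz --partial -e 'ssh -i $key' $src $host:$dest" ;;
    esac
}

transfer_command http://example.org/genome.fa.gz
transfer_command reads/
```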