ST 810, Advanced computing Eric B. Laber & Hua Zhou Department of Statistics, North Carolina State University January 30, 2013
"Supercomputers are expensive." Eric B. Laber, 2011, while browsing the internet.
Cloud computing
What is cloud computing?
  - Using a collection of remote servers for computation, data storage/manipulation, etc.
  - Pay for clock cycles rather than hardware
  - Computing when you need it
  - Scalable computing
  - Scalable storage
Cloud computing, cont'd
Why is cloud computing so popular?
  - Scalability! Adapt to fluctuating demand
      - Ex. Websites with fluctuating traffic
      - Ex. Large corporations use much more computing during business hours than outside them
  - Efficiency
      - Pay for what you need
      - No need for hardware maintenance
      - Less waiting for fixed compute-time jobs
Cloud computing, cont'd
Why we care about cloud computing
  - Our computational demands often fluctuate dramatically
  - Grant resource management
      - Old style: buy a computer with grant X money; the computer must be used ONLY for grant X research
      - New style: buy computing time for grant X research with grant X money
  - Massive computing when you need it
      - On the cloud you can run your job on 10,000 machines for one hour for the same price as running it on one machine for 10,000 hours!
      - Using EC2 spot instances, this can be done for as little as $700 an hour, and gets you supercomputer performance
      - (It may be more compelling to know that you can get 8 cores for around $0.27 an hour.)
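The price equivalence above is just machine-hours multiplied by an hourly rate. A minimal sketch, assuming a flat per-machine hourly rate (real spot prices fluctuate). The function name `job_cost` is hypothetical, and the 0.27 rate simply reuses the approximate figure quoted above for illustration:

```shell
# Cost of a job billed per machine-hour.
# Arguments: number of machines, hours per machine, price per machine-hour.
# Assumes a flat hourly rate; the rate passed in below is illustrative only.
job_cost() {
    awk -v m="$1" -v h="$2" -v r="$3" 'BEGIN { printf "%.2f\n", m * h * r }'
}

# Same machine-hours, same bill: wide-and-short vs. narrow-and-long.
wide=$(job_cost 10000 1 0.27)
narrow=$(job_cost 1 10000 0.27)
echo "10,000 machines x 1 hour:  \$$wide"
echo "1 machine x 10,000 hours:  \$$narrow"
```

The bill is the same either way; what changes is how long you wait for the results.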
Amazon Web Services (AWS)
  - Amazon's Elastic Compute Cloud (EC2)
  - Simple Storage Service (S3)
  - Relational Database Service (RDS)
  - ...
Using EC2: very basic workflow
  0. Set up an AWS account
  1. Launch Amazon Machine Image(s) (AMI)
  2. Configure the AMI(s) as needed
  3. Run your jobs
  4. Transfer results
  5. Terminate instances
Using EC2: Step 0: Set up AWS account
  - Go to aws.amazon.com/ec2/
  - Click "Sign up now"
  - Fill in the requisite info; you will need a credit card and a cell phone
Using EC2: Step 1: Launch AMI
  - Sign in to your account and go to the AWS management console
  - Click on EC2
  - AMIs are launched from the AWS management console
  - Click continue and select the Ubuntu Server 12.04
  - Leave the settings at their defaults and click continue
  - Click continue on each of the next two screens
  - Download and store your key pair (do not put it in a public folder)
  - SSH access is enabled by default; we will add HTTP
  - Select HTTP from the dropdown menu, then click "add rule"
  - Both SSH and HTTP are now available
  - Click launch!
Using EC2: Step 2: Configure AMI
  - The AMI we launched is like a fresh install of the operating system; it needs to be configured
  - As an example we will install R
  - Using Linux: log in to the instance using ssh:
      ssh -i merlin.pem ubuntu@ec2address.amazonaws.com
  - Using PuTTY is more complicated, see http://docs.aws.amazon.com/awsec2/latest/userguide/putty.html
  - The instance address is given on the EC2 console
  - (Screenshot: SSH'ing into the instance)
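One common stumbling block at the login step: ssh refuses to use a private key file that other users can read. A minimal sketch of the login preparation, assuming the key file `merlin.pem` and the placeholder host name from the slide; the helper names `secure_key` and `ssh_login_cmd` are hypothetical:

```shell
# Restrict the private key so that only its owner can read it.
# ssh rejects key files that are readable by other users.
secure_key() {
    chmod 400 "$1"
}

# Print the ssh command for logging in as the default "ubuntu" user.
# Arguments: key file, public DNS name of the instance.
ssh_login_cmd() {
    printf 'ssh -i %s ubuntu@%s\n' "$1" "$2"
}

# Typical use (not run here, since it needs a live instance):
#   secure_key merlin.pem
#   ssh -i merlin.pem ubuntu@ec2address.amazonaws.com
```

Replace the host name with the address shown for your instance on the EC2 console.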
Using EC2: Step 2: Configure AMI
  - After accessing the instance using SSH, run:
      1. sudo apt-get update
      2. sudo apt-get -y upgrade   # may take a while
      3. sudo apt-get install r-base-core
  - You now have R
  - Install whatever else you need...
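If you configure instances often, the install step can be wrapped so that it is safe to rerun. A minimal sketch, assuming an Ubuntu instance as above; the function names `have_cmd` and `install_r_if_missing` are hypothetical, and the `-y` flag (answer yes to prompts) is added here so the install does not wait for input:

```shell
# Return success if the named program is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Install R only when it is missing; meant to be run on the instance.
install_r_if_missing() {
    if have_cmd R; then
        echo "R already installed"
    else
        sudo apt-get update &&
        sudo apt-get install -y r-base-core
    fi
}

# Typical use on the instance (not run here):
#   install_r_if_missing
```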
Saving your AMI
  - Setting up your AMI can take a considerable amount of time
  - Save your AMI after configuration
      - Store it on the cloud (for pennies)
      - Launch an arbitrary number of pre-configured instances at will
  - You can use the micro instance to set up the AMI and then launch large or extra large instances for heavy computing jobs
Using EC2: Step 3: Run your jobs
  - Set up and run your jobs (use screen to detach)
  - You can check CPU usage on the EC2 console
  - Set up a notification for when CPU usage drops below a threshold (a simple way to know when your jobs finish)
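Detaching matters because a job started in a plain SSH session dies when the connection drops. A minimal sketch of launching a job in a detached screen session; the function name `run_detached` and the script name `sim.R` are hypothetical, and the nohup fallback is an addition for machines without screen, not something from the slides:

```shell
# Start a command in a detached screen session so it survives logout.
# Falls back to nohup when screen is not installed.
# Arguments: session name, then the command and its arguments.
run_detached() {
    name=$1
    shift
    if command -v screen >/dev/null 2>&1; then
        screen -dmS "$name" "$@"
    else
        nohup "$@" >"$name.log" 2>&1 &
    fi
}

# Typical use on the instance (not run here):
#   run_detached sim1 Rscript sim.R
#   screen -ls        # list running sessions
#   screen -r sim1    # reattach to check on the job
```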
Using EC2: Wrapping up
  - Transfer results using sftp (as usual)
  - Don't forget to terminate your instance!
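Transfers are simpler when the results are packed into a single archive first. A minimal sketch, assuming the output lives in a directory called `results` (a hypothetical name) and reusing the key file and placeholder host name from Step 2; the helper `bundle_results` is also hypothetical:

```shell
# Pack a results directory into one compressed archive so that only a
# single file needs to be transferred. Arguments: directory, archive name.
bundle_results() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# On the instance (not run here):
#   bundle_results results results.tar.gz
# On your own machine, fetch it with the key pair from Step 1:
#   sftp -i merlin.pem ubuntu@ec2address.amazonaws.com
#   sftp> get results.tar.gz
```

Check that the archive arrived intact before terminating the instance, since files stored on a terminated instance may no longer be recoverable.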
Go forth and use the cloud
  - Using EC2 is easy
  - Launching large cluster instances is easy (Amazon has nice video tutorials)
  - Heavily customize the AMIs (e.g., cplex, python, etc.)
  - Massive computing at your fingertips
  - Other cloud services
      - PiCloud: wrapper to EC2 for Python, C/C++, Java, and more
      - Google's forthcoming products...