Cloud Computing. AWS: a practical example. Hugo Pérez, UPC. May 2012




Index
- Introduction
- Infrastructure
- Development and Results
- Conclusions

Introduction
To learn more about AWS services, the MapReduce process, the public data available from Twitter, and the way to interact with them, I developed a small example using:
AWS infrastructure:
- Elastic Compute Cloud (EC2)
- Elastic Block Store (EBS)
- Elastic IP
- Simple Storage Service (S3)
AWS tools:
- Management Console
- CloudWatch
- Elastic MapReduce (EMR)
Twitter Search API


Creating AWS Account
- Go to http://aws.amazon.com
- Sign in as a new user.
- Enter name, email and password.
- Enter contact details.
- Enter payment data.
- Confirm a PIN by phone call.
- Wait a few minutes until the account is active (less than 10 minutes in this case).

Creating EC2
- Go to AWS Management Console -> EC2 Dashboard.
- Create a new instance.
- Choose the AMI (Amazon Machine Image) to install: Ubuntu Server 12.04.
- Define the number and type of instances, in this case 1 Micro instance. Characteristics: disk: 8 GB (EBS), RAM: 600 MB, CPU: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz.
- Define instance details, such as shutdown behavior and user data.
- Define tags: user-friendly names to manage the resources.
- Create a key pair to connect securely to the instance.
- Configure the firewall.
- Review.
- You can check the instance details from the Management Console.
- You can also monitor the instance, create alarms and configure detailed monitoring.
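The slides create the instance through the Management Console. As a rough sketch only, the same choices could be expressed as parameters for the `run_instances` call of boto3, a Python SDK that the slides do not use. The function names, the `t1.micro` type string standing for "Micro", the `stop` shutdown behavior and every argument value are assumptions made for illustration.

```python
def launch_parameters(ami_id, key_name, security_group, name_tag):
    """Build keyword arguments for EC2 run_instances (hypothetical helper)."""
    return {
        "ImageId": ami_id,                  # AMI to install, e.g. an Ubuntu Server 12.04 image
        "InstanceType": "t1.micro",         # assumed API name of the Micro type
        "MinCount": 1,                      # exactly one instance
        "MaxCount": 1,
        "KeyName": key_name,                # key pair used for ssh access
        "SecurityGroups": [security_group], # firewall rules, referenced by group name
        "InstanceInitiatedShutdownBehavior": "stop",  # assumed shutdown behavior
        "TagSpecifications": [
            {"ResourceType": "instance",
             "Tags": [{"Key": "Name", "Value": name_tag}]}
        ],
    }


def launch_instance(ami_id, key_name, security_group, name_tag, region="us-east-1"):
    """Launch the instance; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so launch_parameters works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = launch_parameters(ami_id, key_name, security_group, name_tag)
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]
```

Keeping the parameter dictionary separate from the API call makes it possible to review what would be launched before anything is billed.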

Creating Elastic IP
- You can now access the instance over ssh using its public DNS name: ec2-23-23-187119.compute-1.amazonaws.com
- To simplify this, you can create an Elastic IP address.
- Once the Elastic IP is created, associate it with the instance.

Creating S3
- Define the bucket name and region. The region should be the same as that of the EC2 instance, to reduce latency. AWS gives 5 GB free.
- Set permissions to grant Authenticated Users access to list the S3 bucket.

Creating Billing Alarm
- First you have to enable this function.
- Define the parameters: recipients and threshold.
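The alarm in the slides is defined in the console. A minimal sketch of the same two parameters (recipients and threshold) using boto3, which the slides do not use, might look as follows. The helper names, the alarm name, the six-hour period and the SNS topic ARN passed in by the caller are assumptions; the recipients are assumed to be subscribed to that SNS topic.

```python
def billing_alarm_parameters(threshold_usd, sns_topic_arn, name="billing-alarm"):
    """Build keyword arguments for CloudWatch put_metric_alarm (hypothetical helper)."""
    if threshold_usd <= 0:
        raise ValueError("threshold must be positive")
    return {
        "AlarmName": name,
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,                  # six hours, in seconds (assumed)
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic that notifies the recipients
    }


def create_billing_alarm(threshold_usd, sns_topic_arn):
    """Create the alarm; needs boto3, AWS credentials and billing alerts enabled."""
    import boto3  # imported lazily so the helper above works without boto3

    # Billing metrics are published in the us-east-1 region.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    cloudwatch.put_metric_alarm(**billing_alarm_parameters(threshold_usd, sns_topic_arn))
```

For an exercise of this size a threshold of a few dollars would already be far above the expected charges.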

CloudWatch
- Besides the alarm, you can check the estimated charges through CloudWatch.
- Through CloudWatch you can query different kinds of metrics.


Installing EMR CLI
Connect to the server:
$ ssh -i awskey.pem ubuntu@23.21.252.15
Install the Amazon Elastic MapReduce Ruby client:
$ mkdir elastic-mapreduce-cli
$ cd elastic-mapreduce-cli
$ wget http://elasticmapreduce.s3.amazonaws.com/elastic-mapreduce-ruby.zip
$ unzip elastic-mapreduce-ruby.zip
Configure the credentials:
$ vi credentials.json
{
  "access_id": "[Your AWS Access Key ID]",
  "private_key": "[Your AWS Secret Access Key]",
  "keypair": "[Your key pair name]",
  "key-pair-file": "[The path and name of your PEM file]",
  "log_uri": "[A path to a bucket you own on Amazon S3, such as, s3n://mylog-uri/]",
  "region": "[The Region of your job flow, either us-east-1, us-west-2, us-west-1, eu-west-1, ap-northeast-1, ap-southeast-1, or sa-east-1]"
}
You can get the AWS Access Key ID and the AWS Secret Access Key by signing in to your account at http://aws.amazon.com and opening the Access Credentials section.
It is recommended to create a new key pair for the exercise. I created it from the Management Console and copied it to the EC2 instance.
I saved all the parameters in the file:
ubuntu@ip-10-195-195-175:~/elastic-mapreduce-cli$ more credentials.json
{
  "access_id": "HPVAJFNULSZULY5NWHPV",
  "private_key": "65xBzYVzV7THPVYWW2LcYN0roVwK1I+nxJ+BNHPV",
  "keypair": "mapreduce",
  "key-pair-file": "/home/ubuntu/mapreduce.pem",
  "log_uri": "s3n://mylog-uri-hpv/",
  "region": "us-east-1"
}
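Before running the CLI it can help to check that no field of credentials.json was left as a template placeholder. The following is a minimal sketch, not part of the EMR client; the function names are hypothetical and the list of required keys is simply the set of fields shown in the template.

```python
import json

# Field names copied from the credentials.json template.
REQUIRED_KEYS = ("access_id", "private_key", "keypair",
                 "key-pair-file", "log_uri", "region")


def missing_keys(credentials):
    """Return the keys that are absent, empty or still a [placeholder]."""
    missing = []
    for key in REQUIRED_KEYS:
        value = str(credentials.get(key, "")).strip()
        if not value or (value.startswith("[") and value.endswith("]")):
            missing.append(key)
    return missing


def load_credentials(path):
    """Parse the JSON file and raise ValueError if a field is not filled in."""
    with open(path) as handle:
        credentials = json.load(handle)
    problems = missing_keys(credentials)
    if problems:
        raise ValueError("credentials.json is incomplete: " + ", ".join(problems))
    return credentials
```

Calling `load_credentials("credentials.json")` from the elastic-mapreduce-cli directory would then fail early, with the names of the fields that still need a value.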

Basics EMR CLI
Basic commands of the EMR CLI:
$ ./elastic-mapreduce --help
$ ./elastic-mapreduce --create
$ ./elastic-mapreduce --list
$ ./elastic-mapreduce --describe --jobflow [JobFlowID]
$ ./elastic-mapreduce -j JobFlowID --stream
$ ./elastic-mapreduce --terminate JobFlowID

Mapper
The mapper script, the classic word counter:
#!/usr/bin/python
import sys
import re


def main(argv):
    # A word starts with a letter, followed by letters or digits
    pattern = re.compile("[a-zA-Z][a-zA-Z0-9]*")
    for line in sys.stdin:
        for word in pattern.findall(line):
            # The "LongValueSum:" prefix asks the aggregate reducer to sum the values per word
            print("LongValueSum:" + word.lower() + "\t" + "1")


if __name__ == "__main__":
    main(sys.argv)
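To check the word-count logic before paying for a job flow, the mapper and the summing done by the `aggregate` reducer can be imitated locally. This is only a sketch: `map_lines` and `aggregate` are hypothetical names, the reducer here is a simplified stand-in that handles only `LongValueSum` records, and the sample sentences are made up.

```python
import re
from collections import defaultdict

WORD = re.compile("[a-zA-Z][a-zA-Z0-9]*")
PREFIX = "LongValueSum:"


def map_lines(lines):
    """Emit one 'LongValueSum:<word><TAB>1' record per word, like the mapper."""
    for line in lines:
        for word in WORD.findall(line):
            yield PREFIX + word.lower() + "\t" + "1"


def aggregate(records):
    """Sum the values of LongValueSum records per word (local stand-in)."""
    totals = defaultdict(int)
    for record in records:
        key, value = record.split("\t", 1)
        if key.startswith(PREFIX):
            totals[key[len(PREFIX):]] += int(value)
    return dict(totals)


if __name__ == "__main__":
    sample = ["Cloud computing on AWS", "cloud computing with EMR"]
    for word, count in sorted(aggregate(map_lines(sample)).items()):
        print(word + "\t" + str(count))
```

On the two sample lines, "cloud" and "computing" are each counted twice and every other word once, which is the shape of output to expect in the S3 output folder.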

Using Twitter API
To generate the input data, run a simple query against Twitter:
http://search.twitter.com/search.json?q=cloud%20computing&rpp=5&include_entities=true&result_type=mixed
Parameters:
- q: the search pattern, here "cloud computing"
- rpp: results per page, here 5
- include_entities: if true, the result includes URLs, media and hashtags
- result_type:
  - mixed: include both popular and real-time results in the response
  - recent: return only the most recent results in the response
  - popular: return only the most popular results in the response
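The query string can also be assembled programmatically, which avoids encoding mistakes in the search pattern. The sketch below only builds the URL and does not send a request; `build_search_url` is a hypothetical helper, and the endpoint is the 2012 one quoted in the slides.

```python
from urllib.parse import quote, urlencode

# Endpoint as given in the slides (Twitter Search API of 2012).
SEARCH_URL = "http://search.twitter.com/search.json"


def build_search_url(pattern, rpp=5, include_entities=True, result_type="mixed"):
    """Return the search URL for a pattern, with the parameters used in the slides."""
    if result_type not in ("mixed", "recent", "popular"):
        raise ValueError("result_type must be mixed, recent or popular")
    params = [
        ("q", pattern),
        ("rpp", rpp),
        ("include_entities", "true" if include_entities else "false"),
        ("result_type", result_type),
    ]
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_search_url("cloud computing"))
```

With the default arguments the helper reproduces the query used in this exercise; changing `result_type` to `recent` or `popular` selects the other two modes.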

Transfer the result to S3:
$ s3curl.pl --id=personal --put=cloudcomputing -- http://s3.amazonaws.com/mylog-uri-hpv/entradas/cloudcomputing

Exec EMR
$ ./elastic-mapreduce --create --stream --mapper s3://elasticmapreduce/samples/wordcount/wordsplitter.py --input s3://mylog-uri-hpv/entradas/cloudcomputing --output s3://mylog-uri-hpv/salidas/cloudcomputing --reducer aggregate
$ ./elastic-mapreduce --list --active
j-3ebj6mt4fbm80  STARTING  Development Job Flow
   PENDING  Example Streaming Step
$ ./elastic-mapreduce --list --active
j-3ebj6mt4fbm80  RUNNING  ec2-23-20-6-34.compute-1.amazonaws.com  Development Job Flow
   RUNNING  Example Streaming Step
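When the job is launched repeatedly with different inputs, the long command line is easy to mistype. A small sketch that assembles it from its parts is shown here; `build_create_stream_command` is a hypothetical helper that uses only the flags appearing in the command above, and the `s3://` check is a sanity test chosen for this sketch rather than a rule of the CLI.

```python
import shlex


def build_create_stream_command(mapper, input_uri, output_uri,
                                reducer="aggregate", cli="./elastic-mapreduce"):
    """Return the argument list that creates a streaming job flow."""
    for name, uri in (("mapper", mapper), ("input", input_uri), ("output", output_uri)):
        if not uri.startswith("s3://"):
            raise ValueError(name + " must be an s3:// URI, got " + repr(uri))
    return [
        cli, "--create", "--stream",
        "--mapper", mapper,
        "--input", input_uri,
        "--output", output_uri,
        "--reducer", reducer,
    ]


if __name__ == "__main__":
    argv = build_create_stream_command(
        "s3://elasticmapreduce/samples/wordcount/wordsplitter.py",
        "s3://mylog-uri-hpv/entradas/cloudcomputing",
        "s3://mylog-uri-hpv/salidas/cloudcomputing",
    )
    # Print the command for review; subprocess.run(argv) would launch it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning a list rather than a single string keeps each argument intact if it is later passed to `subprocess.run`.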

- Monitoring from the Management Console
- Provisioning on demand
- Monitoring graphs
Results
- EMR results on S3


Conclusions
- The software development model is completely new: the purchase process is eliminated, installation is becoming easier, the system administrator role (sysadmin, DBA, etc.) is disappearing, and the developer can focus on business logic. AWS provides not only the infrastructure but also the development platform.
- The Twitter API is well documented and easy to use.
- This model is available to a company of any size.
- The free usage tier covers all the hardware components used in this exercise (EC2, EBS, Elastic IP, S3), except for one small EC2 instance that is used on demand during the MapReduce process.
- Charges: the total for developing this exercise was USD 0.45.


Thanks