IBM Smart Cloud getting started guide




1. Overview

Access link: https://www-147.ibm.com/cloud/enterprise/dashboard

We are going to work in the IBM Smart Cloud Enterprise (SCE). The first thing to do is to sign in. The dashboard is composed of four parts:

Overview. Shows the main options to be managed and gives access to them.
Control panel. Add images, create instances and control the storage.
Account. Manage users inside instances and permissions.
Support. Help and guidance for using the IBM Cloud.

Overview panel
This window gives information about our instances and our recent activity in SCE.

Control panel
Here we can create, delete and reboot instances, among other operations. We can also set up predefined images and specific storage for large data sets.

Account
These tools allow us to manage fixed IPs and the keys used to connect to the machines.

Support

2. Create and access instances

You can create different kinds of instances, either from predefined images or from your own images. In this document we focus on preinstalled images and how to configure them.

First of all, we have to learn how to work with and manage instances from the command line. A link with a couple of examples:
http://www.ibm.com/developerworks/cloud/library/cl-commandlinelx/

These examples show how to create and describe instances. We can also delete them, or ask for information about everything shown in the dashboard described above.

When we create an instance, we have to choose which private key to use. I recommend using the same key for all the instances, so that the instances can connect to each other with the same key. This key also lets us connect to the instance with client software. For that, we have to download the key from the dashboard when we create a private key in the Account window.

If we want to connect by ssh (e.g. with PuTTY), we have to follow these steps: first, generate a .ppk file with the PuTTYgen tool. Click on the Load button, select the private key downloaded from the dashboard, and save it as a .ppk file. After that, write the instance's IP in PuTTY.

In PuTTY, enable the option "Allow attempted changes of username in SSH-2" and select the private key (*.ppk) previously created with PuTTYgen.
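If you are connecting from a Linux or Mac terminal instead of PuTTY, the downloaded private key can be used directly with the ssh client. A minimal sketch, assuming the key was saved as key.priv and the instance user is idcuser (the user used throughout this guide); the IP is a placeholder:

# restrict the key permissions, otherwise ssh refuses to use the key
chmod 600 key.priv
# connect to the instance with the private key downloaded from the dashboard
ssh -i key.priv idcuser@<instance-ip>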

3. Configure instances

3.1. Use MPI in the instances

http://www.ibm.com/developerworks/systems/library/es-hpcluster1/

The only thing we need to do (after installing MPI on one machine in the usual way) is to copy the installation folder to the other machines and set the PATH environment variable. The steps for each part are detailed in the scripts at the end of this document.

Linux Command Line

If we want to manage the cloud from the command line, we have to download the command-line tools and set the JAVA environment. The following link helps us:
https://www-147.ibm.com/cloud/enterprise/ram/assetdetail/generaldetails.faces?guid={f1466f46-a4ab-3879-D883-1A26A43BF046}&v=1.4.1&fid=&tid=&post=

To access your IBM account from the command line you have to create a file with a passphrase. The command to create it is:

ic-create-password -u | --username username -p | --password password -w | --passphrase passphrase -g | --file-path filepath [-h | --help]

See the documentation about the command-line tools needed to manage the IBM Cloud.
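For illustration, an invocation might look like the sketch below. The username, password, passphrase and output path are placeholders, and the exact script name is an assumption (it is assumed to follow the same naming as ./ic-create-instance.sh used later in this guide):

# create the passphrase file once on the Master; later commands reference it with -g
./ic-create-password.sh -u myuser@example.com -p myIBMpassword -w mypassphrase -g /home/idcuser/keys/pass.txt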

Pseudo code for managing instances using MPI:

1. Create Master instance:
   ./ic-create-instance.sh from the command line, or Add Instance in the dashboard web interface.

2. Configure Master instance:
   a. Copy the MPI directory (mpich) and the command-line tools directory.
   b. Set the PATH environment variable for MPI and set JAVA_HOME:
      export PATH=$PATH:/home/idcuser/mpich/bin
      export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk
      export PATH=$PATH:$JAVA_HOME/bin
   c. Copy the keys directory, containing key.priv and pass.txt for accessing IBM.
      key.priv allows the Master to connect to any worker the first time (by ssh). We already have this file (previously downloaded from the dashboard).
      pass.txt is the file generated by the command-line tool to access the IBM Cloud (see above for how to create it).
   d. Stop iptables: sudo /sbin/service iptables stop
   e. Generate a public key for ssh: ssh-keygen -t rsa. This gives us a file called id_rsa.pub, which we use to allow the Master to connect to the workers. Set the permissions needed to authorize the ssh connections:
      chmod 700 .ssh
      chmod 600 key.priv
      cat .ssh/id_rsa.pub > .ssh/authorized_keys2
      chmod 600 .ssh/*

3. Create worker instances:
   while nPE > 0 do
      ./ic-create-instance.sh -u user -w passphrase -g pass.txt -t instancetype (e.g. COP32.1/2048/60) -k imageid (see the dashboard information) -L region (e.g. 61 for Germany) -n instancename -d description > instance.txt
      ./getid instance.txt newid.txt
      (getid is a small tool that creates a file containing only the ID of the new instance; this file helps us later to configure each instance created from the Master.)
      nPE = nPE - 1
   end while

4. Configure workers from the Master instance:
   a. Copy the mpich directory and set the PATH environment variable.
   b. Copy the id_rsa.pub key into the file authorized_keys2 to allow the Master to connect to this machine. We can do this with scp and the -i parameter; the key is fetched from the Master machine (it is the key file downloaded from the dashboard). An example:
      scp -r -i key.priv <masterip>:/home/<user>/.ssh/id_rsa.pub /home/<user>/.ssh/authorized_keys2
      Don't forget to change the key.priv permissions: chmod 600 key.priv
   c. Change the permissions for ssh:
      chmod 700 .ssh
      chmod 600 .ssh/*
   d. Stop iptables: sudo /sbin/service iptables stop
   e. Create the directory for MrCirrus: software, data, etc. It is also necessary to create a little script called sendresult.sh, because when MrCirrus-Worker finishes it executes this script to send the partial result to the Master machine. If you use a previous version of MrCirrus, this step is not necessary.

   f. Create a directory with the same name as the IBM command-line tools directory. This is necessary because mpirun changes the execution directory (step e), so the worker machine has to have the same directories. Example: the Master has the following directories:
      - cmdtools, with the command-line tools and the software needed to prepare and send the configuration to the workers.
      - testmpi, with the MrCirrus software, the sequential application and the data.
      The worker machine has to have directories with the same names.
   g. Once you have configured the machines, you can test that everything is right. Try to connect via ssh from the Master machine (ssh IPworker) and check that the data and directories are correct.

3.2. Hadoop

In Hadoop (hadoop-streaming in this case) you have to create a script that downloads and uploads the files needed to run your application. There are command lines to do all these things; the script depends on the software. This document includes an example that prepares a Blast execution for Hadoop. This script has to be sent to all the Hadoop machines (slaves) before running the job.

Some of the commands used to manage hadoop-streaming are:

Manage files:
hadoop dfs -get <hadoop-directory/file> <local directory>
hadoop dfs -put <local directory> <hadoop-directory/file>
hadoop dfs -rm <hadoop-directory/file>
hadoop dfs -rmr <hadoop-directory>

Execute hadoop-streaming:
hadoop jar /mnt/biginsights/opt/ibm/biginsights/ihc/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper scriptcreated -input script-map -output <hadoop directory>

where scriptcreated is the script shown in this document and script-map is a file with one execution line per task.
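The exact format of the script-map lines depends on how the mapper parses its input. For the map_hadoop_blast.sh example at the end of this document, which reads each line as an application name followed by the blastall parameter values, a plausible script-map could look like the sketch below; the database and query file names are purely illustrative:

# hypothetical script-map for the Blast example: one blastall task per line
# assumed line format: <app> <algorithm> -d <db> -i <query> -o <output> -b <B> -v <V>
blastall64 blastn -d ecoli.nt -i query_000.fa -o out_000.blast -b 10 -v 10
blastall64 blastn -d ecoli.nt -i query_001.fa -o out_001.blast -b 10 -v 10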

4. Script examples

Create instances (createInstances.sh):

Example: ./createInstances.sh 4 testibm /home/<user>/passfile

nproc=$1
prefix=$2   # prefix for the names of the machines created
pass=$3     # path of the passfile created with the command-line tool
while [ $nproc -ne 0 ]
do
    # create instances according to the number of processors
    echo "creating instance..."
    ./ic-create-instance.sh -u ots@bioinf.jku.at -w bitlab -g $pass -t COP32.1/2048/60 -k 20032564 -c keyrhel -L 61 -n "$prefix$nproc" -d "Instance created by command line" > instances.txt
    nproc=`expr $nproc - 1`
    echo "instance created... $nproc remaining."
    cat instances.txt >> first.txt
    ./getids instances.txt IDsAux.txt
    cat IDsAux.txt >> newids.txt
done

Configure instances (scriptClient.sh):

First of all, this script is sent to each worker and executed via ssh from the Master node.

Example: ./scriptClient.sh <MrCirrus directory name> <Master IP>

# Copy the MPI directory from the Master
miip=$2
dirmpi=$1
scp -r -i key2.priv -o StrictHostKeyChecking=no $miip:/home/idcuser/mpich/ /home/idcuser/
chmod 777 mpich/bin/*
echo "export PATH=/home/idcuser/mpich/bin:$PATH" >> .bash_profile
scp -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miip:/home/idcuser/.ssh/id_rsa.pub /home/idcuser/.ssh/authorized_keys2
chmod 700 .ssh/
chmod 600 .ssh/*
sudo /sbin/service iptables stop
# once prepared, copy the data
mkdir cmdtools
mkdir $dirmpi
scp -r -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miip:/home/idcuser/$dirmpi/soft /home/idcuser/$dirmpi
scp -r -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miip:/home/idcuser/$dirmpi/data /home/idcuser/$dirmpi
# When the Worker finishes, MrCirrus executes a script called sendresult.sh. This script depends on the application; in this case AllFrag is the application used.
echo "scp -i /home/idcuser/key2.priv -o StrictHostKeyChecking=no /home/idcuser/$dirmpi/bin* $miip:/home/idcuser/$dirmpi/" > /home/idcuser/$dirmpi/sendresult.sh
echo "ls -l > ls.txt" >> /home/idcuser/$dirmpi/sendresult.sh
echo "rm bin*" >> /home/idcuser/$dirmpi/sendresult.sh
chmod 777 /home/idcuser/$dirmpi/sendresult.sh
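As described above, scriptClient.sh is copied to each worker and executed over ssh from the Master node. A minimal sketch of how this could be driven from the Master, assuming key2.priv is the private key downloaded from the dashboard (as in the scripts above) and testmpi is the MrCirrus directory; the IP values are placeholders taken from the dashboard or from the newids.txt bookkeeping:

workerip="WORKER_IP_HERE"   # placeholder: IP of the worker instance
masterip="MASTER_IP_HERE"   # placeholder: IP of the Master instance
# copy the key and the configuration script to the worker
scp -i key2.priv -o StrictHostKeyChecking=no key2.priv scriptClient.sh idcuser@$workerip:/home/idcuser/
# run it remotely: first argument is the MrCirrus directory name, second the Master IP
ssh -i key2.priv -o StrictHostKeyChecking=no idcuser@$workerip "./scriptClient.sh testmpi $masterip"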

Hadoop script (map_hadoop_blast.sh):

#!/usr/bin/env bash
# map_hadoop_blast.sh: running a C program using Hadoop Streaming
# Written by Oscar, 2011

read offset CMD
echo "offset = $offset, CMD = $CMD" >&2

#bucket=jlc-uma
folder=dirprueba
app=blastall64
APP_PATH=`pwd`

# Split CMD into individual words (white space delimiter)
set $CMD
# the first field of each input line gives the application binary name
app=$offset
algorithm=$1
db=$3
seq2=$5
output_file=$7
paramb=${9}
paramv=${11}
#S3_PREFIX='$bucket.s3.amazonaws.com'

# Check directory contents
ls >&2

if [ ! -f $app ]
then
    echo "soft get $app" >&2
    hadoop dfs -get $folder/$app $app
fi
if [ ! -f $db ]
then
    echo "hadoop dfs -get $folder/$db $db" >&2
    hadoop dfs -get $folder/$db $db
fi
if [ ! -f $db.nhr ]
then
    echo "hadoop dfs -get $folder/$db.nhr $db.nhr" >&2
    hadoop dfs -get $folder/$db.nhr $db.nhr
fi
if [ ! -f $db.nin ]
then
    echo "hadoop dfs -get $folder/$db.nin $db.nin" >&2
    hadoop dfs -get $folder/$db.nin $db.nin
fi
if [ ! -f $db.nsq ]
then
    echo "hadoop dfs -get $folder/$db.nsq $db.nsq" >&2
    hadoop dfs -get $folder/$db.nsq $db.nsq
fi
if [ ! -f $seq2 ]
then
    echo "hadoop dfs -get $folder/$seq2 $seq2" >&2
    hadoop dfs -get $folder/$seq2 $seq2
fi

echo "Launching: $app -p $algorithm -d $db -i $seq2 -o $output_file -b $paramb -v $paramv" >&2
chmod 777 $APP_PATH/$app
$APP_PATH/$app -p $algorithm -d $db -i $seq2 -o $output_file -b $paramb -v $paramv
ls >&2

echo "Uploading to IBM $output_file $folder" >&2
hadoop dfs -put $output_file $folder/$output_file

# Write something to standard output
echo "File $seq2 processed over the DB $db. Output file is $output_file!!"

# Script reports status
exit 0
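To tie the pieces together, the streaming job for this Blast example could be prepared and launched roughly as follows. This is a sketch under assumptions: the HDFS folder name (dirprueba, as hard-coded in the mapper above), the database and query file names, and the output directory are illustrative, and the mapper script is assumed to be pre-distributed to the slaves as described in section 3.2:

# create the HDFS folder the mapper reads from and upload the binary, database and query fragments
hadoop dfs -mkdir dirprueba
hadoop dfs -put blastall64 dirprueba/blastall64
hadoop dfs -put ecoli.nt dirprueba/ecoli.nt
hadoop dfs -put ecoli.nt.nhr dirprueba/ecoli.nt.nhr
hadoop dfs -put ecoli.nt.nin dirprueba/ecoli.nt.nin
hadoop dfs -put ecoli.nt.nsq dirprueba/ecoli.nt.nsq
hadoop dfs -put query_000.fa dirprueba/query_000.fa
# upload the script-map file (one execution line per task) to use as the job input
hadoop dfs -put script-map script-map
# launch the streaming job; map_hadoop_blast.sh must already be available on every slave (section 3.2)
hadoop jar /mnt/biginsights/opt/ibm/biginsights/ihc/contrib/streaming/hadoop-0.20.2-streaming.jar -mapper map_hadoop_blast.sh -input script-map -output blast-output
# the mapper uploads each partial result back to dirprueba; retrieve them, for example:
hadoop dfs -get dirprueba/out_000.blast out_000.blast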