1 Running Hadoop and Stratosphere jobs on TomPouce cluster 16 October 2013

2 TomPouce cluster TomPouce is a cluster of 20 calculation nodes (240 cores), located in the Inria Turing building (École Polytechnique) and used jointly by Inria teams. Jobs are run with the help of a scheduler: SGE (Sun Grid Engine).

3 TomPouce cluster SPECIFICATIONS: 20 nodes, each with two 6-core processors (240 cores total); 48 GB RAM per node; 400 GB of local disk space. Storage: Dell R510, /home, 19 TB, NFS; 2x Dell R710, /scratch, 37 TB, FHGFS (FraunhoferFS). Network: Dell 5548 switch; Mellanox InfiniScale IV QDR InfiniBand switch.

4 TomPouce cluster (diagram)

5 1. Copy your job from the local machine to the cluster front node
$ scp myjob.jar inria_username@<front-node>:~/
myjob.jar will be copied into the folder /home/leo/inria_username.
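If your job also needs a directory of input data on the front node, scp -r copies it recursively (a minimal sketch; the local folder dataset/ is a hypothetical example):
$ scp -r dataset/ inria_username@<front-node>:~/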

6 2. Connect via ssh to the front node
$ ssh inria_username@<front-node>
Welcome to Bright Cluster Manager 6.0
Based on Scientific Linux release 6
Cluster Manager ID: #…
Use the following commands to adjust your environment:
'module avail'            - show available modules
'module add <module>'     - adds a module to your environment for this session
'module initadd <module>' - configure module to be loaded at every login
IMPORTANT: to connect to the cluster, your ssh key should be stored in the Inria LDAP. If not, send an e-mail with your public ssh key to: helpmi-[email protected]
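If you do not have an ssh key pair yet, you can generate one and send the public part to the helpdesk (a minimal sketch; the key type and comment are just examples):
$ ssh-keygen -t rsa -C "inria_username"   # creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
$ cat ~/.ssh/id_rsa.pub                   # this is the public key to e-mail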

7 3. Log in as the clustervision superuser using your LDAP password
$ sudo su - clustervision
This account is used to execute Hadoop and Stratosphere jobs and to edit the configurations needed.
If you don't have enough permissions, ask for them at: helpmi-[email protected]
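You can confirm which account you are working under after switching (a minimal sketch):
$ whoami    # should print: clustervision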

8 4. Add the Hadoop/Stratosphere environment to your session
To add the Hadoop environment, type: $ module add hadoop/1.1.1
To add the Stratosphere environment, type: $ module add stratosphere/stratosphere
To add an environment automatically when you log in: $ module initadd hadoop/1.1.1
To check all the environments loaded: $ module list
Currently Loaded Modulefiles: 1) gcc/… 2) intel-cluster-checker/1.8 3) stratosphere/stratosphere 4) sge/… 5) openmpi/gcc/64/… 6) gromacs/openmpi/gcc/64/… 7) hadoop/1.1.1
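A quick way to check that the module really took effect is to ask the tools themselves (a minimal sketch; the exact output will vary):
$ module add hadoop/1.1.1
$ hadoop version    # should report Hadoop 1.1.1
$ which hadoop      # should point under /cm/shared/apps/hadoop/current/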

9 4. Add the Hadoop/Stratosphere environment to your session
Hadoop: /cm/shared/apps/hadoop/current/
Stratosphere: /cm/shared/apps/stratosphere/current/
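You can also inspect the installation directories directly (a minimal sketch; contents will vary):
$ ls /cm/shared/apps/hadoop/current/
$ ls /cm/shared/apps/stratosphere/current/bin/   # contains pact-client.sh, used in the Stratosphere scripts below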

10 5. Create an execution script (Hadoop)
#!/bin/bash
#$ -N hadoop_run
#$ -pe hadoop 12
#$ -j y
#$ -o output.$JOB_ID
#$ -l h_rt=00:10:00,hadoop=true,excl=true
#$ -cwd
#$ -q hadoop.q
# Copy the input files into the HDFS filesystem
hadoop --config /home/guests/clustervision/current/ dfs -copyFromLocal /home/guests/clustervision/tmp /input
# Run the Hadoop task(s) here, specifying the jar, class and run parameters
hadoop --config /home/guests/clustervision/current/ jar myjob.jar org.myorg.job /input /output
# Copy the output files from the HDFS filesystem
hadoop --config /home/guests/clustervision/current/ fs -get /output

12 SGE execution parameters: these should be written after #$ at the beginning of the script.
-N <job_name>: gives a name to the job to run.
-pe <environment> N: specifies the parallel environment; N is the number of cores (limited to 180).
-j y: merge errors and standard output into the same output file.

13 SGE execution parameters:
-o output.$JOB_ID: the standard output will be in a file named output.$JOB_ID, where $JOB_ID is the number SGE assigns automatically to our job.
-l name=value: used to request a resource. In this case:
h_rt=00:10:00 indicates that the job should be killed after 10 minutes
hadoop=true indicates that the job to run is a Hadoop job (it DOES NOT CHANGE for Stratosphere jobs)
excl=true indicates that the job runs exclusively on its nodes
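Putting these together, the directive block of a submission script looks like this (a minimal sketch; the job name, core count and time limit are arbitrary examples):
#!/bin/bash
#$ -N my_job
#$ -pe hadoop 24
#$ -j y
#$ -o output.$JOB_ID
#$ -l h_rt=00:30:00,hadoop=true,excl=true
#$ -cwd
#$ -q hadoop.q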

14 5. Create an execution script (Hadoop) HADOOP COMMANDS
Copy input files into HDFS:
hadoop --config /home/guests/clustervision/current/ dfs -copyFromLocal /home/guests/clustervision/tmp /input
Run Hadoop tasks:
hadoop --config /home/guests/clustervision/current/ jar /pathtojob/myjob.jar org.myorg.job /input /output
Copy output files from HDFS:
hadoop --config /home/guests/clustervision/current/ fs -get /output
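After the job finishes and fs -get has run, the results land in a local output/ directory; with the classic MapReduce output layout they can be inspected like this (a minimal sketch; part-00000 is the conventional name of the first reducer's output file and may differ for your job):
$ ls output/
$ cat output/part-00000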

16 5. Create an execution script (Stratosphere):
#!/bin/bash
#$ -N strato_run
#$ -pe stratosphere 24
#$ -j y
#$ -o output.$JOB_ID
#$ -l h_rt=00:10:00,hadoop=true,excl=true
#$ -cwd
#$ -q hadoop.q
export PATH=$PATH:'/cm/shared/apps/hadoop/current/conf/'
export STRATOSPHERE_HOME='/cm/shared/apps/stratosphere/current'
MASTER=`cat /home/guests/clustervision/current/masters`
hadoop --config /home/guests/clustervision/current/ dfs -copyFromLocal /home/guests/clustervision/tmp /var/hadoop/dfs.name.dir
$STRATOSPHERE_HOME/bin/pact-client.sh run -j myjob.jar -a 2 hdfs://$MASTER:50040/var/hadoop/dfs.name.dir/inputfile hdfs://$MASTER:50040/var/hadoop/dfs.name.dir/outputfile
hadoop --config /home/guests/clustervision/current/ fs -get /var/hadoop/dfs.name.dir/output

17 5. Create an execution script (Stratosphere):
#!/bin/bash
#$ -N strato_run
#$ -pe stratosphere 24
#$ -j y
#$ -o output.$JOB_ID
#$ -l h_rt=00:10:00,hadoop=true,excl=true
#$ -cwd
#$ -q hadoop.q
export PATH=$PATH:'/cm/shared/apps/hadoop/current/conf/'
export STRATOSPHERE_HOME='/cm/shared/apps/stratosphere/current'
MASTER=`cat /home/guests/clustervision/current/masters`
hadoop --config /home/guests/clustervision/current/ dfs -copyFromLocal /home/guests/clustervision/tmp /input
$STRATOSPHERE_HOME/bin/pact-client.sh run -j myjob.jar -a 2 hdfs://$MASTER:50040/input hdfs://$MASTER:50040/output
hadoop --config /home/guests/clustervision/current/ fs -get /output

19 5. Create an execution script (Stratosphere) STRATOSPHERE COMMANDS
Copy input files into HDFS:
hadoop --config /home/guests/clustervision/current/ dfs -copyFromLocal /home/guests/clustervision/tmp /input
Run Stratosphere tasks:
$STRATOSPHERE_HOME/bin/pact-client.sh run -j /pathtojob/myjob.jar -a 2 hdfs://$MASTER:50040/input hdfs://$MASTER:50040/output
Copy output files from HDFS:
hadoop --config /home/guests/clustervision/current/ fs -get /output
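Before fetching results it can help to confirm that the job actually wrote them; the standard HDFS listing command works here (a minimal sketch):
$ hadoop --config /home/guests/clustervision/current/ fs -ls /output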

20 6. Submission of a job To submit, execute: $ qsub script.qsub
After submission, you can see the state of execution with the command:
$ qstat
job-ID  prior  name        user          state  submit/start at      queue        slots  ja-task-id
…       …      strato_run  clustervisio  r      10/15/2013 …:17:59   hadoop.q@…   24
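To remove a job from the queue, SGE's standard qdel command takes the job-ID shown by qstat (not covered on the slide; 4242 is a made-up ID):
$ qdel 4242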

21 6. Submission of a job Or, if you want more detailed information: $ qstat -t

22 7. Logs
/home/guests/clustervision/output.$JOB_ID: output of the job in SGE.
/home/guests/clustervision/config.$JOB_ID/logs: logs of the Hadoop file system.
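While a job is running, its SGE output file can be followed live (a minimal sketch; 4242 stands for the real job number):
$ tail -f /home/guests/clustervision/output.4242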
