Introductory Tutorial to Parallel and Distributed Computing Tools of cluster.tigem.it
What is a cluster?
A computer cluster is a group of networked computers working together closely. The computers are called nodes.
Cluster node
Each cluster node contains one or more CPUs, memory, disks, network interfaces and a graphics adapter, like your desktop computer. It can execute programs without tying up your workstation.
Terminology (cluster)
The front-end node is where users log in and interact with the system; the computing nodes execute users' programs.
Users' home directories
The users' home directories are hosted on the front-end, which shares them with the computing nodes.
cluster.tigem.it node specifications
28 x Dell PowerEdge 1750 servers
CPU: 2 x Intel Xeon 3.06GHz
Memory: 2GB (node 18: 4GB, front-end: 8GB)
Disk: 147GB
OS: Linux, CentOS distribution (~ Red Hat Enterprise)
cluster.tigem.it specifications
Taken as one cluster:
CPU: 56 x Intel Xeon 3.06GHz
Memory: 64GB
Disk: 8700GB
OS: Linux, Rocks Cluster distribution
Access to the cluster
Login: ssh/PuTTY, text based (faster); VNC, graphical (fast); X11, graphical (slow)
File transfer: scp, text based; WinSCP/Cyberduck, graphical
Problem
How can we manage multi-user access to the cluster nodes? A users' agreement? Assigning a subset of nodes to each user? Neither is feasible or convenient. We can use a resource management system instead.
Terminology (resource management systems)
Batch, or batch processing, is the capability of running jobs outside of the interactive login session. A job, or batch job, is the basic execution object managed by the batch subsystem: a collection of related processes which is managed as a whole. A job can often be thought of as a shell script.
Terminology (resource management systems)
A queue is an ordered collection of jobs within the batch queuing system. Each queue has a set of associated attributes which determine what actions are performed upon each job within the queue. Typical attributes include queue name, queue priority, resource limits, destination(s) and job count limits. Selection and scheduling of jobs depend on a central job manager.
Using a resource management system
To execute our programs on the cluster we need to prepare a job: write a non-interactive shell script and enqueue it in the system. Non-interactive means that the input, output and error streams are files.
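As a minimal illustration of what "streams are files" means (the names job.o1 and job.e1 are hypothetical, chosen to mimic the batch system's naming), a non-interactive run simply redirects the streams:

```shell
# Minimal sketch of non-interactive execution: stdout and stderr are
# captured to files, as the batch system does for every job.
# (The file names job.o1 and job.e1 are illustrative.)
sh -c 'echo result; echo oops >&2' > job.o1 2> job.e1
cat job.o1
cat job.e1
```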
First simple script
Sleep 10 seconds and print the hostname:
[oliva@cluster ~]$ cat hostname.sh
#!/bin/sh
sleep 10
/bin/hostname
First simple submission
Submit the job to the serial queue:
[oliva@cluster ~]$ qsub -q serial hostname.sh
1447.cluster.tigem.it
The output of the qsub command is the JobID (Job IDentifier), a unique value that identifies your job inside the system.
First simple status query
Look at the job status with qstat:
[oliva@cluster ~]$ qstat
Job id                 Name         User    Time Use S Queue
---------------------- ------------ ------- -------- - -----
1447.cluster.tigem.it  hostname.sh  oliva   0        R serial
qstat displays job status sorted by JobID. R means Running, Q means Queued (see man qstat for the other values).
Job completion
When our simple job is completed we can find two files in our directory:
[oliva@cluster ~]$ ls hostname.sh.*
hostname.sh.e1447  hostname.sh.o1447
The ${JobName}.e${JobID} file contains the job's standard error stream, while ${JobName}.o${JobID} contains the job's standard output. Look inside them with cat!
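The naming convention above can be sketched as follows (the JobName and JobID values are illustrative, not read from the system):

```shell
# Sketch of Torque's output-file naming convention.
# JobName and JobID are sample values for illustration only.
JobName="hostname.sh"
JobID=1447
errfile="${JobName}.e${JobID}"   # standard error stream
outfile="${JobName}.o${JobID}"   # standard output stream
echo "$errfile $outfile"
```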
Status of the queues
The qstat command can also be used to check the queue status:
[oliva@cluster ~]$ qstat -q
Cancelling jobs
To cancel a running or queued job, use the qdel command. qdel accepts the JobID as its argument:
[oliva@cluster ~]$ qdel 1448.cluster.tigem.it
Interactive jobs
qsub allows you to execute interactive jobs via the -I option. If your program is controlled by a graphical user interface you can also export the display with the -X option (as with ssh). To run Matlab on a dedicated node:
[oliva@cluster ~]$ qsub -X -I -q serial
qsub: waiting for job 1449.cluster.tigem.it to start
qsub: job 1449.cluster.tigem.it ready
[oliva@compute-0-14 ~]$ /share/apps/...
Interactive jobs
The use of graphical user interfaces on cluster nodes is HIGHLY DISCOURAGED! You'd better use Matlab from the terminal:
[oliva@compute-0-14 ~]$ /share/apps/matlab/bin/matlab -nodisplay
Matlab>
Exclusive use of a cluster node
Every node of our cluster is equipped with 2 CPUs, so the job manager allocates up to 2 jobs on each node. Torque allows you to use a node exclusively, ensuring that only your job is executed on that node, by specifying the option -W x="naccesspolicy:singlejob"
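As a sketch, the same option can also be embedded in the job script as a #PBS directive, which Torque reads from the top of the script at submission time (the script body here is illustrative):

```shell
#!/bin/sh
# Illustrative job script requesting exclusive use of a node.
# Torque reads the #PBS directive lines when the script is submitted with qsub;
# they are equivalent to passing the same options on the qsub command line.
#PBS -q serial
#PBS -W x="naccesspolicy:singlejob"
/bin/hostname
```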
Batch Matlab jobs
To run your Matlab program as a non-interactive batch job you need to invoke Matlab with the -nodesktop option and redirect its standard input from the .m file:
[oliva@cluster ~]$ cat matlab.sh
#!/bin/sh
/usr/local/bin/matlab -nodesktop < /data/user/run1.m
Batch R jobs
To run your R program as a non-interactive batch job you need to invoke R with the CMD BATCH arguments: the name of the file containing the R code to be executed, options, and the name of the output file.
[oliva@cluster ~]$ cat R.sh
#!/bin/sh
/usr/bin/R CMD BATCH script.R script.Rout
Syntax: R CMD BATCH [options] infile [outfile]
Job array
Job arrays let you submit large numbers of jobs based on the same job script, rather than repeatedly calling qsub: they allow the creation of multiple jobs with one qsub command. A new job naming convention allows users to reference the entire set of jobs as a unit, or to reference one particular job from the set.
Job array
To submit a job array use the -t option with a range of integers, which can be combined in a comma-separated list. Examples: -t 1-100 or -t 1,10,50-100
[oliva@cluster ~]$ qsub -t 1-10 -q serial hostname.sh
1450.cluster.tigem.it
Job id          Name           User    Time Use S Queue
--------------- -------------- ------- -------- - -----
1450-1.cluster  hostname.sh-1  oliva   0        Q default
1450-2.cluster  hostname.sh-2  oliva   0        Q default
The numeric suffix after the dash (1, 2, ...) is the ArrayID.
PBS_ARRAYID
Each job in a job array gets a unique ArrayID. Use the ArrayID value in your script through the PBS_ARRAYID environment variable. Example: suppose you have 1000 jpg images named image-1.jpg, image-2.jpg, ... and want to convert them to the png format:
[oliva@cluster ~]$ cat image-processing.sh
#!/bin/bash
convert image-$PBS_ARRAYID.jpg image-$PBS_ARRAYID.png
[oliva@cluster ~]$ qsub -t 1-1000 image-processing.sh
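Outside of Torque the PBS_ARRAYID variable is unset, so a local sketch has to fix it to a sample value just to show how the per-task file names are derived:

```shell
# Sketch of PBS_ARRAYID usage. Under Torque each array task gets its own
# value automatically; here it is set to 7 purely for illustration.
PBS_ARRAYID=7
in="image-${PBS_ARRAYID}.jpg"
out="image-${PBS_ARRAYID}.png"
echo "$in -> $out"
```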
Matlab Parallel Computing Toolbox
Matlab PCT architecture
Parallel Computing Toolbox (PCT) allows you to offload work from one MATLAB session (the client) to other MATLAB sessions, called workers.
[Diagram: one Matlab client connected to several Matlab workers]
Matlab PCT
You can use multiple workers to take advantage of parallel processing, or use a worker to keep your MATLAB client session free for interactive work. The MATLAB Distributed Computing Server software allows you to run up to 54 workers on cluster.tigem.it.
Matlab PCT use cases
Parallel for-loops (parfor), large data sets, SPMD, pmode
Repetitive iterations
Many applications involve multiple segments of repetitive code (for-loops). Parameter sweep applications fall into two cases. Many iterations: a sweep might take a long time because it comprises many iterations; each iteration by itself might not take long to execute, but completing thousands or millions of iterations in serial could take a long time. Long iterations: a sweep might not have many iterations, but each iteration could take a long time to run.
parfor
A parfor-loop does the same job as the standard MATLAB for-loop: it executes a series of statements (the loop body) over a range of values. Part of the parfor body is executed on the MATLAB client (where the parfor is issued) and part is executed in parallel on MATLAB workers. Data is sent from the client to the workers, and the results are sent back to the client and pieced together.
Parfor execution steps
for and parfor code comparison:
for i=1:1024
  A(i) = sin(i*2*pi/1024);
end
plot(A)

matlabpool open local 3
parfor i=1:1024
  A(i) = sin(i*2*pi/1024);
end
plot(A)
matlabpool close
To interactively run code that contains a parallel loop, first open a MATLAB pool to reserve a collection of MATLAB workers.
Parfor limitations
You cannot use a parfor-loop when an iteration in your loop depends on the results of other iterations: each iteration must be independent of all others. Since there is a communication cost involved in a parfor-loop, there might be no advantage to using one when you have only a small number of simple calculations.
Single Program Multiple Data
The single program multiple data (spmd) language construct allows you to mix serial and parallel programming. The spmd statement lets you define a block of code to run simultaneously on multiple workers (called labs).
SPMD example
This code creates the same identity matrix of random size on all the labs, selects the same random row on each lab, then selects a different random row on each lab:
matlabpool 4
i=randi(10,1)
spmd
  R = eye(i);
end
j=randi(i,1)
spmd
  R(j,:)
  k=randi(i,1);
  R(k,:)
end
labindex variable
The labs used for an spmd statement each have a unique value for labindex. This lets you specify code to be run only on certain labs, or to customize execution, usually for the purpose of accessing unique data:
spmd
  labdata = load(['datafile_' num2str(labindex) '.ascii'])
  result = MyFunction(labdata)
end
Distributed arrays
You can create a distributed array in the MATLAB client, and its data is stored on the labs of the open MATLAB pool. A distributed array is distributed in one dimension, along the last nonsingleton dimension, as evenly as possible among the labs. You cannot control the details of distribution when creating a distributed array.
Distributed arrays example
This code distributes the identity matrix among the labs, multiplies the local rows by labindex, and reassembles the resulting distributed matrix T on the client:
W = eye(4);
W = distributed(W);
spmd
  T = labindex*W;
end
T
Codistributed arrays
You can create a codistributed array inside the labs. When creating a codistributed array, you can control all aspects of distribution, including dimensions and partitions.
Codistributed vs distributed
Codistributed arrays are partitioned among the labs from which you execute code to create or manipulate them. Distributed arrays are partitioned among the labs from the client with the open MATLAB pool. Both can be accessed and used in client code almost like regular arrays.
Create a codistributed array
Using a MATLAB constructor function: call a function like rand or zeros with a codistributor object argument. Partitioning a larger array: take an array that is replicated on all labs, and partition it so that the pieces are distributed across the labs. Building from smaller arrays: take arrays stored on each lab and combine them so that each array becomes a segment of a larger codistributed array.
Constructors
Valid constructors are: cell, colon, eye, false, Inf, NaN, ones, rand, randn, sparse, speye, sprand, sprandn, true, zeros. Check their syntax with: help codistributed.constructor
Create a codistributed random matrix of size 100 with:
spmd
  T = codistributed.rand(100)
end
Partitioning a larger array
When you have sufficient memory to store the initial replicated array, you can use the codistributed function to partition a large array among the labs:
spmd
  A = [11:18; 21:28; 31:38; 41:48];
  D = codistributed(A);
  getLocalPart(D)
end
Building from smaller arrays
To save memory, you can construct the smaller pieces (local parts) on each lab first, and then combine them into a single array that is distributed across the labs:
matlabpool 3
spmd
  A = (labindex-1) * 10 + [ 1:5 ; 6:10 ];
  R = codistributed.build(A, codistributor1d(1,[2 2 2],[6 5]))
  getLocalPart(R)
  C = codistributed.build(A, codistributor1d(2,[5 5 5],[2 15]))
  getLocalPart(C)
end
codistributor1d
Describes the distribution scheme. Matrix codistributed by the 1st dimension (rows): codistributor1d(1,[2 2 2],[6 5]) places 2 rows on the first lab, 2 on the second and 2 on the third, obtaining a 6x5 codistributed matrix.
codistributor1d
Describes the distribution scheme. Matrix codistributed by the 2nd dimension (columns): codistributor1d(2,[5 5 5],[2 15]) places 5 columns on the first lab, 5 on the second and 5 on the third, obtaining a 2x15 codistributed matrix.
pmode
Like spmd, pmode lets you work interactively with a parallel job running simultaneously on several labs. Commands you type at the pmode prompt in the Parallel Command Window are executed on all labs at the same time. In contrast to spmd, pmode provides a desktop with a display for each lab running the job, where you can enter commands, see results, access each lab's workspace, etc.
pmode
pmode gives a separate view of the situation on each of the labs.
dfeval
The dfeval function allows you to evaluate a function on a cluster of workers. You need to provide basic required information, such as the function to be evaluated, the number of tasks to divide the job into, and the variable into which the results are returned:
results = dfeval(@sum, {[1 1] [2 2] [3 3]}, 'Configuration', 'cluster.tigem.it')
dfeval
Suppose the function myfun accepts three input arguments and generates two output arguments. The number of elements of the input cell arrays determines the number of tasks in the job; all input cell arrays must have the same number of elements. In this example, there are four tasks:
[X, Y] = dfeval(@myfun, ...
    {a1 a2 a3 a4}, {b1 b2 b3 b4}, {c1 c2 c3 c4}, ...
    'Configuration', 'cluster.tigem.it', ...
    'FileDependencies', {'myfun.m'});
dfeval results
Results are stored this way:
X{1}, Y{1} <- myfun(a1, b1, c1)
X{2}, Y{2} <- myfun(a2, b2, c2)
X{3}, Y{3} <- myfun(a3, b3, c3)
X{4}, Y{4} <- myfun(a4, b4, c4)
as if you had executed:
[X{1}, Y{1}] = myfun(a1, b1, c1);
[X{2}, Y{2}] = myfun(a2, b2, c2);
[X{3}, Y{3}] = myfun(a3, b3, c3);
[X{4}, Y{4}] = myfun(a4, b4, c4);
Using Torque inside Matlab
Find a job manager. Create a job. Create tasks. Submit the job to the job queue. Retrieve the job's results.
Find a job manager
Use findResource to load the configuration:
jm = findResource('scheduler', 'Configuration', 'cluster.tigem.it')
jm =
PBS Scheduler Information
=========================
  Type : Torque
  ClusterSize : 27
  DataLocation : /home/oliva
  HasSharedFilesystem : true
- Assigned Jobs
  Number Pending : 1
  Number Queued : 0
  Number Running : 0
  Number Finished : 1
Create a job
Use createJob to create a Matlab job object that corresponds to a Torque job:
job1 = createJob(jm)
Job ID 10 Information
=====================
  UserName : oliva
  State : pending
  SubmitTime :
  StartTime :
  Running Duration :
- Data Dependencies
  FileDependencies : {}
  PathDependencies : {}
Create tasks
Add tasks to the job using createTask:
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
Submit the job to the job queue
Submit your job:
submit(job1)
Retrieve its output:
results = getAllOutputArguments(job1);
Delete the job's data:
destroy(job1);