Introductory Tutorial to Parallel and Distributed Computing Tools of cluster.tigem.it

A computer cluster is a group of networked computers working closely together. The computers are called nodes.

Cluster node Each cluster node contains one or more CPUs, memory, disks, network interfaces, a graphics adapter, etc., like your desktop computer, and can execute programs without tying up your workstation.

Terminology (cluster) The front-end node is where users log in and interact with the system; the computing nodes execute the users' programs.

Users' home directories The users' home directories are hosted on the front-end, which shares them with the computing nodes.

cluster.tigem.it node specifications 28 x Dell PowerEdge 1750 server nodes CPU: 2 x Intel Xeon 3.06GHz Memory: 2GB (node 18: 4GB, front-end: 8GB) Disk: 147GB OS: Linux Distribution: CentOS (~ Red Hat Enterprise)

cluster.tigem.it specifications One cluster CPU: 56 x Intel Xeon 3.06GHz Memory: 64GB Disk: 8700GB OS: Linux Distribution: Rocks Cluster

Access to the cluster Login: ssh/PuTTY (text based, faster), VNC (graphical, fast), X11 forwarding (graphical, slow). File transfer: scp (text based), WinSCP/Cyberduck (graphical).
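
For example, assuming your cluster username is oliva (substitute your own; the file name is only an illustration), a typical login and file transfer from a terminal look like this:
ssh oliva@cluster.tigem.it
scp results.tar.gz oliva@cluster.tigem.it:~/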

Problem How can we manage multi-user access to the cluster nodes? A users' agreement? Not feasible. Assigning a subset of nodes to each user? Not convenient. Instead, we can use a resource management system.

Terminology (resource management systems) Batch, or batch processing, is the capability of running jobs outside of the interactive login session. A job, or batch job, is the basic execution object managed by the batch subsystem: a collection of related processes managed as a whole. A job can often be thought of as a shell script.

Terminology (resource management systems) A queue is an ordered collection of jobs within the batch queuing system. Each queue has a set of associated attributes which determine what actions are performed upon each job within the queue; typical attributes include queue name, queue priority, resource limits, destination(s) and job count limits. Selection and scheduling of jobs depend on a central job manager.

Using a resource management system To execute our programs on the cluster we need to prepare a job: write a non-interactive shell script and enqueue it in the system. Non-interactive means that the input, output and error streams are files.

First simple script Sleep 10 seconds and print the hostname:
[oliva@cluster ~]$ cat hostname.sh
#!/bin/sh
sleep 10
/bin/hostname

First simple submission Submit the job to the serial queue:
[oliva@cluster ~]$ qsub -q serial hostname.sh
1447.cluster.tigem.it
The output of the qsub command is the JobID (Job IDentifier), a unique value that identifies your job inside the system.

First simple status query Look at the job status with qstat:
[oliva@cluster ~]$ qstat
Job id                 Name          User     Time Use S Queue
---------------------- ------------- -------- -------- - -----
1447.cluster.tigem.it  hostname.sh   oliva    0        R serial
qstat displays job status sorted by JobID. R means Running, Q means Queued (man qstat for other values).

Job Completion When our simple job is completed we can find two files in our directory:
[oliva@cluster ~]$ ls hostname.sh.*
hostname.sh.e1447 hostname.sh.o1447
${JobName}.e${JobID} contains the job's standard error stream, while ${JobName}.o${JobID} contains the job's standard output. Look inside them with cat!
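
For example, using the file names from this run (for our script the error file should be empty and the output file should contain the hostname of the compute node the job ran on):
[oliva@cluster ~]$ cat hostname.sh.o1447
[oliva@cluster ~]$ cat hostname.sh.e1447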

Status of the queues The qstat command can also be used to check the queue status:
[oliva@cluster ~]$ qstat -q

Cancelling Jobs To cancel a job that is running or queued, use the qdel command; it accepts the JobID as an argument:
[oliva@cluster ~]$ qdel 1448.cluster.tigem.it

Interactive Jobs qsub allows you to execute interactive jobs by using the -I option. If your program is controlled by a graphical user interface you can also export the display with the -X option (like ssh). To run MATLAB on a dedicated node:
[oliva@cluster ~]$ qsub -X -I -q serial
qsub: waiting for job 1449.cluster.tigem.it to start
qsub: job 1449.cluster.tigem.it ready
[oliva@compute-0-14 ~]$ /share/apps/...

Interactive Jobs The use of Graphical User Interfaces on cluster nodes is HIGHLY DISCOURAGED!!!! You'd better use MATLAB from the terminal:
[oliva@compute-0-14 ~]$ /share/apps/matlab/bin/matlab -nodisplay
Matlab>

Exclusive use of a cluster node Every node of our cluster is equipped with 2 CPUs, therefore the job manager allocates up to 2 jobs on each node. Torque allows you to use a node exclusively and ensure that only our job is executed on that node by specifying the option -W x="naccesspolicy:singlejob"
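
For example, the hostname.sh job from before could be submitted for exclusive execution as follows (just a sketch combining the option above with the serial queue used earlier):
[oliva@cluster ~]$ qsub -W x="naccesspolicy:singlejob" -q serial hostname.sh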

Batch Matlab Jobs To run your MATLAB program in a non-interactive batch job you need to invoke MATLAB with the -nodesktop option and redirect its standard input from the .m file:
[oliva@cluster ~]$ cat matlab.sh
#!/bin/sh
/usr/local/bin/matlab -nodesktop < /data/user/run1.m
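
The wrapper script is then enqueued like any other batch job, for example (the queue name here is only an assumption, use whichever queue is appropriate):
[oliva@cluster ~]$ qsub -q serial matlab.sh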

Batch R Jobs To run your R program in a non-interactive batch job you need to invoke R with the CMD BATCH arguments, the name of the file containing the R code to be executed, options, and the name of the output file:
[oliva@cluster ~]$ cat R.sh
#!/bin/sh
/usr/bin/R CMD BATCH script.R script.Rout
Syntax: R CMD BATCH [options] infile [outfile]

Job Array Job arrays let you submit large numbers of jobs based on the same job script without repeatedly calling qsub: they allow the creation of multiple jobs with a single qsub command, and introduce a new job naming convention that allows users to reference the entire set of jobs as a unit, or to reference one particular job from the set.

Job Array To submit a job array use the -t option with a range of integers that can be combined in a comma separated list. Examples: -t 1-100 or -t 1,10,50-100
[oliva@cluster ~]$ qsub -t 1-10 -q serial hostname.sh
1450.cluster.tigem.it
Job id          Name           User     Time Use S Queue
--------------- -------------- -------- -------- - -----
1450-1.cluster  hostname.sh-1  oliva    0        Q default
1450-2.cluster  hostname.sh-2  oliva    0        Q default
The numeric suffix after the dash (1, 2, ...) is the ArrayID.

PBS_ARRAYID Each job in a job array gets a unique ArrayID. Use the ArrayID value in your script through the PBS_ARRAYID environment variable. Example: suppose you have 1000 jpg images named image-1.jpg, image-2.jpg, ... and want to convert them to the png format:
[oliva@cluster ~]$ cat image-processing.sh
#!/bin/bash
convert image-$PBS_ARRAYID.jpg image-$PBS_ARRAYID.png
[oliva@cluster ~]$ qsub -t 1-1000 image-processing.sh

Matlab Parallel Computing Toolbox

Matlab PCT Architecture Parallel Computing Toolbox (PCT) allows you to offload work from one MATLAB session (the client) to other MATLAB sessions, called workers. (Diagram: one Matlab client connected to several Matlab workers.)

Matlab PCT You can use multiple workers to take advantage of parallel processing You can use a worker to keep your MATLAB client session free for interactive work MATLAB Distributed Computing Server software allows you to run up to 54 workers on cluster.tigem.it

Matlab PCT use cases Parallel for-loops (parfor), large data sets, SPMD, pmode.

Repetitive iterations Many applications involve multiple segments of repetitive code (for-loops), for example parameter sweep applications. Many iterations: a sweep might take a long time because it comprises many iterations; each iteration by itself might not take long to execute, but completing thousands or millions of iterations in serial could take a long time. Long iterations: a sweep might not have many iterations, but each iteration could take a long time to run.

parfor A parfor-loop does the same job as the standard MATLAB for-loop: it executes a series of statements (the loop body) over a range of values. Part of the parfor body is executed on the MATLAB client (where the parfor is issued) and part is executed in parallel on MATLAB workers. Data is sent from the client to the workers and the results are sent back to the client and pieced together.

Parfor execution for and parfor code comparison:
for i=1:1024
  A(i) = sin(i*2*pi/1024);
end
plot(A)

matlabpool open local 3
parfor i=1:1024
  A(i) = sin(i*2*pi/1024);
end
plot(A)
matlabpool close

Steps to interactively run code that contains a parallel loop: 1. open a MATLAB pool to reserve a collection of MATLAB workers

Parfor limitations You cannot use a parfor-loop when an iteration in your loop depends on the results of other iterations: each iteration must be independent of all others. Since there is a communication cost involved in a parfor-loop, there might be no advantage to using one when you have only a small number of simple calculations.
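
As an illustration (not from the original slides), the following loop carries a dependency between iterations, so it cannot be rewritten as a parfor-loop:
A = zeros(1,1024);
A(1) = 1;
for i = 2:1024
  A(i) = 2*A(i-1) + 1;  % each iteration needs the result of the previous one
end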

Single Program Multiple Data The single program multiple data (spmd) language construct allows you to combine serial and parallel programming: the spmd statement lets you define a block of code to run simultaneously on multiple workers (called Labs).

SPMD example This code creates an identity matrix of the same random size on all the Labs, selects the same random row on each Lab (j is computed once on the client), and then selects a different random row on each Lab (k is computed separately inside each Lab):
matlabpool 4
i = randi(10,1)
spmd
  R = eye(i);
end
j = randi(i,1)
spmd
  R(j,:);
  k = randi(i,1)
  R(k,:);
end

Labindex variable The Labs used for an spmd statement each have a unique value for labindex. This lets you specify code to be run on only certain Labs, or customize execution, usually for the purpose of accessing unique data:
spmd
  labdata = load(['datafile_' num2str(labindex) '.ascii'])
  result = MyFunction(labdata)
end

Distributed Arrays You can create a distributed array in the MATLAB client, and its data is stored on the Labs of the open MATLAB pool. A distributed array is distributed in one dimension, along the last nonsingleton dimension, and as evenly as possible along that dimension among the Labs. You cannot control the details of distribution when creating a distributed array.

Distributed Arrays Example This code distributes the identity matrix among the Labs, multiplies each Lab's portion by its labindex, and reassembles the resulting distributed matrix T on the client:
W = eye(4);
W = distributed(W);
spmd
  T = labindex*W;
end
T

Codistributed Arrays You can create a codistributed array inside the Labs. When creating a codistributed array, you can control all aspects of distribution, including dimensions and partitions.

Codistributed VS Distributed Codistributed arrays are partitioned among the Labs from which you execute code to create or manipulate them. Distributed arrays are partitioned among the Labs from the client with the open MATLAB pool. Both can be accessed and used in the client code almost like regular arrays.

Create a Codistributed Array Using a MATLAB constructor function: call a function like rand or zeros with a codistributor object argument. Partitioning a larger array: take an array that is replicated on all Labs and partition it so that the pieces are distributed across the Labs. Building from smaller arrays: construct smaller arrays stored on each Lab and combine them so that each array becomes a segment of a larger codistributed array.

Constructors Valid constructors are: cell, colon, eye, false, Inf, NaN, ones, rand, randn, sparse, speye, sprand, sprandn, true, zeros. Check their syntax with: help codistributed.constructor. Create a codistributed random matrix of size 100 with:
spmd
  T = codistributed.rand(100)
end

Partitioning a Larger Array When you have sufficient memory to store the initial replicated array, you can use the codistributed function to partition a large array among the Labs:
spmd
  A = [11:18; 21:28; 31:38; 41:48];
  D = codistributed(A);
  getLocalPart(D)
end

Building from Smaller Arrays To save on memory, you can construct the smaller pieces (local parts) on each Lab first, and then combine them into a single array that is distributed across the Labs:
matlabpool 3
spmd
  A = (labindex-1) * 10 + [ 1:5 ; 6:10 ];
  R = codistributed.build(A, codistributor1d(1,[2 2 2],[6 5]))
  getLocalPart(R)
  C = codistributed.build(A, codistributor1d(2,[5 5 5],[2 15]))
  getLocalPart(C)
end
...

Codistributor1d Describes the distribution scheme. Matrix codistributed by the 1st dimension (rows): codistributor1d(1,[2 2 2],[6 5]) takes 2 rows from the first Lab, 2 from the second and 2 from the third, obtaining a 6x5 codistributed matrix.

Codistributor1d Describes the distribution scheme. Matrix codistributed by the 2nd dimension (columns): codistributor1d(2,[5 5 5],[2 15]) takes 5 columns from the first Lab, 5 from the second and 5 from the third, obtaining a 2x15 codistributed matrix.

pmode Like spmd, pmode lets you work interactively with a parallel job running simultaneously on several Labs. Commands you type at the pmode prompt in the Parallel Command Window are executed on all Labs at the same time. In contrast to spmd, pmode provides a desktop with a display for each Lab running the job, where you can enter commands, see results, access each Lab's workspace, etc.

pmode Pmode gives a separate view of the situation on each of the Labs.
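
A minimal interactive session might look like this, assuming the local scheduler configuration with 4 Labs (both the configuration name and the Lab count are only examples):
pmode start local 4
% ... enter commands in the Parallel Command Window ...
pmode exit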

dfeval The dfeval function allows you to evaluate a function in a cluster of workers. You need to provide basic required information, such as the function to be evaluated, the number of tasks to divide the job into, and the variable into which the results are returned:
results = dfeval(@sum, {[1 1] [2 2] [3 3]}, 'Configuration', 'cluster.tigem.it')

dfeval Suppose the function myfun accepts three input arguments and generates two output arguments. The number of elements of the input argument cell arrays determines the number of tasks in the job; all input cell arrays must have the same number of elements. In this example, there are four tasks:
[X, Y] = dfeval(@myfun, {a1 a2 a3 a4}, {b1 b2 b3 b4}, {c1 c2 c3 c4}, 'Configuration', 'cluster.tigem.it', 'FileDependencies', {'myfun.m'});

dfeval results Results are stored this way:
X{1}, Y{1} <- myfun(a1, b1, c1)
X{2}, Y{2} <- myfun(a2, b2, c2)
X{3}, Y{3} <- myfun(a3, b3, c3)
X{4}, Y{4} <- myfun(a4, b4, c4)
just as if you had executed:
[X{1}, Y{1}] = myfun(a1, b1, c1);
[X{2}, Y{2}] = myfun(a2, b2, c2);
[X{3}, Y{3}] = myfun(a3, b3, c3);
[X{4}, Y{4}] = myfun(a4, b4, c4);

Using Torque inside Matlab Find a Job Manager, Create a Job, Create Tasks, Submit the Job to the Job Queue, Retrieve the Job's Results.

Find a Job Manager Use findResource to load the configuration:
jm = findResource('scheduler','configuration','cluster.tigem.it')
jm =
PBS Scheduler Information
=========================
Type : Torque
ClusterSize : 27
DataLocation : /home/oliva
HasSharedFilesystem : true
- Assigned Jobs
...
Number Pending : 1
Number Queued : 0
Number Running : 0
Number Finished : 1

Create Job Use createJob to create a Matlab Job object that corresponds to a Torque job:
job1 = createJob(jm)
Job ID 10 Information
=====================
UserName : oliva
State : pending
SubmitTime :
StartTime :
Running Duration :
- Data Dependencies
FileDependencies : {}
PathDependencies : {}

Create Tasks Add tasks to the job using createTask:
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});

Submit a Job to the Job Queue Submit your job:
submit(job1)
Retrieve its output:
results = getAllOutputArguments(job1);
Delete the job's data:
destroy(job1);
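
Putting the steps together, a minimal end-to-end sketch might look like this; the waitForState call is an addition not shown in the slides, used here to block until the job finishes so that the output arguments are actually available:
jm = findResource('scheduler', 'configuration', 'cluster.tigem.it');
job1 = createJob(jm);
createTask(job1, @rand, 1, {3,3});
submit(job1);
waitForState(job1, 'finished');  % assumption: wait for the Torque job to complete
results = getAllOutputArguments(job1);
destroy(job1);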