Until now:
- access the cluster
- copy data to/from the cluster
- create parallel software
- compile code and use optimized libraries

Now: how to run the software on the full cluster. tl;dr: submit a job to the scheduler.

What is a job?

What is a job scheduler?

Job scheduler/resource manager: a piece of software which manages and allocates resources, manages and schedules jobs, and sets up the environment for parallel and distributed computing. [Cartoon: two computers are available for 10h — "You go, then you go. You wait."]

Resources: CPU cores, memory, disk space, network, accelerators, software licenses.

Slurm: free and open-source, mature, very active community, many success stories. Runs 50% of the TOP10 systems, including the 1st. Also an intergalactic soft drink.

Other job schedulers: PBSpro, Torque/Maui, Oracle (ex-Sun) Grid Engine, Condor, ...

You will learn how to: create a job, monitor jobs, control your own jobs, and get job accounting information.

1. Make up your mind. Job parameters: the resources you need, e.g. 1 core and 2GB RAM for 1 hour. Job steps: the operations you need to perform, e.g. launch 'myprog'.

2. Write a submission script. It is a shell script (Bash). Bash sees the #SBATCH lines as regular comments, but Slurm takes them as commands; the rest of the script is regular Bash commands, including job step creation.
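A minimal submission script for the example request (1 core, 2GB RAM, 1 hour) might look like this sketch. 'myprog' stands for your own program; an echo stands in for the job step here so the script also runs outside Slurm.

```shell
#!/bin/bash
# Job parameters: Bash treats #SBATCH lines as comments,
# Slurm reads them as options.
#SBATCH --job-name=myjobname
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=2048
#SBATCH --time=1:00:00

# Job step: regular Bash from here on. Under Slurm you would
# typically launch the step with: srun ./myprog
msg="job running on $(hostname)"
echo "$msg"
```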

Other useful parameters (You want / You ask):
- set a job name: --job-name=myjobname
- attach a comment to the job: --comment="Some comment"
- get emails: --mail-type=BEGIN,END,FAIL --mail-user=my@mail.com
- set the name of the output files: --output=result-%j.txt --error=error-%j.txt
- delay the start of your job: --begin=16:00, --begin=now+1hour, --begin=2010-01-20T12:34:00
- specify an ordering of your jobs: --dependency=after(ok|notok|any):jobids, --dependency=singleton
- control failure options: --no-kill, --no-requeue, --requeue

Constraints and resources (You want / You ask):
- choose a specific feature (e.g. a processor type or a NIC type): --constraint
- use a specific resource (e.g. a GPU): --gres
- reserve a whole node for yourself: --exclusive
- choose a partition: --partition

3. Submit the script with 'sbatch'. Slurm answers with the JobID — one more job parameter you can use afterwards.

So you can play: download http://www.cism.ucl.ac.be/services/formations/slurm.tgz with wget and untar it on hmem. Compile the 'stress' program; you can use it to burn CPU time and memory: ./stress --cpu 1 --vm-bytes 128M --timeout 30s. Then: write a job script, submit it, see it running, cancel it, get it killed.

4. Monitor your job: squeue, sprio, sstat, sview.

A word about backfill. The rule: a job with a lower priority can start before a job with a higher priority if it does not delay that job's start time. [Chart: jobs with priorities 100, 80, 70, 60 and 10 placed on a resources-vs-time diagram.] A low-priority job with a short maximum run time and fewer requirements can start before a higher-priority job.

sview users guide: http://www.schedmd.com/slurmdocs/slurm_ug_2011/sview-users-guide.pdf

5. Control your job: scancel, scontrol, sview (see http://www.schedmd.com/slurmdocs/slurm_ug_2011/sview-users-guide.pdf).

6. Job accounting: sacct, sreport, sshare.

The rules of fairshare. A share is allocated to you: 1/nbusers. If your actual usage is above that share, your fairshare value is decreased towards 0; if your actual usage is below that share, your fairshare value is increased towards 1. The actual usage taken into account decays over time.

A word about fairshare. Assume 3 users on a 3-core cluster: Red uses 1 core for a certain period of time, Blue uses 2 cores for half that period, and Red uses 2 cores afterwards.

Getting cluster info: sinfo, sjstat.

Interactive work: salloc. Example: salloc --ntasks=4 --nodes=2

Summary. Explore the environment: get node features (sinfo --Node --long) and node usage (sinfo --summarize). Submit a job: define the resources you need, determine what the job script should do, and submit the script (sbatch). View the job status (squeue). Get accounting information (sacct).

You will learn how to: create a parallel job and request distributed resources.

Concurrent - Parallel - Distributed: master/slave vs. SPMD; synchronous vs. asynchronous; message passing vs. shared memory.

Typical resource request (You want / You ask):
- 16 independent processes (no communication): --ntasks=16
- 16 processes for MPI, and you do not care about where the cores are distributed: --ntasks=16
- 16 cores spread across distinct nodes: --ntasks=16 --nodes=16
- 16 cores spread across distinct nodes, and nobody else around: --ntasks=16 --nodes=16 --exclusive
- 16 processes spread across 8 nodes: --ntasks=16 --ntasks-per-node=2
- 16 processes on the same node: --ntasks=16 --ntasks-per-node=16
- one process that can multithread over 16 cores: --ntasks=1 --cpus-per-task=16
- 4 processes that can each use 4 cores: --ntasks=4 --cpus-per-task=4
- more constrained placement: --distribution=block|cyclic|arbitrary
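One row of the table as a script header — a sketch of the "4 processes that can each use 4 cores" request. The fallback values only serve to keep the sketch runnable outside a Slurm allocation, where the SLURM_* variables are unset.

```shell
#!/bin/bash
# Request: 4 processes that can each use 4 cores.
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4

# Inside an allocation Slurm exports these variables; outside,
# fall back to the requested values.
ntasks=${SLURM_NTASKS:-4}
cpus=${SLURM_CPUS_PER_TASK:-4}
echo "$ntasks tasks with $cpus cores each"
```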

Use case 1: Random sampling. Your program draws random numbers and processes them sequentially. Parallelism is obtained by launching the same program multiple times simultaneously. Every process does the same thing. No inter-process communication. Results appended to one common file.

Use case 1: Random sampling. You want 16 independent processes (no communication); you ask --ntasks=16; you use srun ./myprog

Use case 1: Random sampling. You want 16 independent processes (no communication); you ask --array=1-16 --output=res%a; you merge the results with cat res*
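A minimal job-array sketch for this use case. The sampler itself is hypothetical and is stood in by an echo; %a in the output name expands to the array task id, giving res1, res2, and so on.

```shell
#!/bin/bash
#SBATCH --array=1-16
#SBATCH --output=res%a

# Each array task receives its own SLURM_ARRAY_TASK_ID (1..16);
# outside Slurm, fall back to 1 so the sketch stays runnable.
task_id=${SLURM_ARRAY_TASK_ID:-1}
echo "drawing samples with seed $task_id"
```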

Use case 2: Multiple datafiles. Your program processes data from one datafile. Parallelism is obtained by launching the same program multiple times on distinct data files. Everybody does the same thing on distinct data stored in different files. No inter-process communication. Results appended to one common file.

Use case 2: Multiple datafiles. You want 16 independent processes (no communication); you ask --ntasks=16; you use srun ./myprog $SLURM_PROCID

Use case 2: Multiple datafiles. Useful commands: xargs and find/ls. Single node: ls data* | xargs -n1 -P $SLURM_NPROCS myprog. Multiple nodes: ls data* | xargs -n1 -P $SLURM_NTASKS srun -c1 myprog. Safer: find . -maxdepth 1 -name 'data*' -print0 | xargs -0 -n1 -P ...
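The find/xargs pattern above, made runnable as a sketch: a scratch directory with dummy data files and an echo standing in for 'myprog'. This assumes GNU xargs (for the -P parallelism flag); under Slurm you would pass -P "$SLURM_NPROCS" on a single node.

```shell
#!/bin/bash
# Create three dummy data files in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/data1" "$workdir/data2" "$workdir/data3"

# Process them up to 4 at a time, one file per invocation.
out=$(find "$workdir" -maxdepth 1 -name 'data*' -print0 \
      | xargs -0 -P "${SLURM_NPROCS:-4}" -I{} echo "processing {}")
echo "$out"
rm -r "$workdir"
```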

Use case 2: Multiple datafiles. You want 16 independent processes (no communication); you ask --array=1-16; you use $SLURM_ARRAY_TASK_ID

Use case 3: Parameter sweep. Your program tests something for one particular value of a parameter. Parallelism is obtained by launching the same program multiple times with a distinct identifier. Everybody does the same thing, except for a given parameter value based on the identifier. No inter-process communication. Results appended to one common file.

Use case 3: Parameter sweep. You want 16 independent processes (no communication); you ask --ntasks=16; you use srun ./myprog $SLURM_PROCID

Use case 3: Parameter sweep. You want 16 independent processes (no communication); you ask --array=1-16 --output=res%a; you use $SLURM_ARRAY_TASK_ID, then cat res* to merge.

Use case 3: Parameter sweep. Useful command: GNU Parallel. Single node: parallel -j $SLURM_NPROCS myprog ::: {1..5} ::: {A..D}. Multiple nodes: parallel -j $SLURM_NTASKS srun -c1 myprog ::: {1..5} ::: {A..D}. Useful: parallel --joblog runtask.log --resume for checkpointing; parallel echo data_{1}_{2}.dat ::: 1 2 3 ::: 1 2 3

Use case 4: Multithread. Your program uses OpenMP or TBB. Parallelism is obtained by launching one multithreaded program, which spawns its threads on a single node. Inter-process communication through shared memory. Results managed in the program, which outputs a summary.

Use case 4: Multithread. You want one process multithreading over 16 cores; you ask --ntasks=1 --cpus-per-task=16; you use OMP_NUM_THREADS=16 srun ./myprog
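The multithreaded request as a sketch. 'myprog' is hypothetical, so the srun line is left as a comment; the fallback value keeps the script runnable outside Slurm.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16

# Give OpenMP as many threads as cores allocated to the task;
# outside Slurm, fall back to the 16 requested above.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-16}
echo "running with $OMP_NUM_THREADS OpenMP threads"
# Under Slurm you would then launch: srun ./myprog
```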

Use case 5: Message passing. Your program uses MPI. Parallelism is obtained by launching a multi-process program, which spawns itself on several nodes. Inter-process communication through the network. Results managed in the program, which outputs a summary.

Use case 5: Message passing. You want 16 processes for use with MPI; you ask --ntasks=16; you use module load openmpi, then mpirun ./myprog

Use case 6: Master/slave. You have two types of programs: master and slave. Parallelism is obtained by launching several slaves, managed by the master; the master launches the slaves on distinct nodes. Inter-process communication through the network or the disk. Results managed in the master program, which outputs a summary.

Use case 6: Master/slave. You want 16 processes, each able to use 16 threads; you ask --ntasks=16 --cpus-per-task=16; you use srun --multi-prog with a configuration file.

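A sketch of the --multi-prog setup. 'master' and 'slave' are hypothetical executables, and the srun line is commented out so the sketch runs without an allocation; the configuration file maps task ranks to programs.

```shell
#!/bin/bash
#SBATCH --ntasks=4

# Rank 0 runs the master; ranks 1-3 run slaves.
cat > multi.conf <<'EOF'
0    ./master
1-3  ./slave
EOF
# srun --multi-prog multi.conf    # launch inside the allocation
echo "wrote multi.conf with $(wc -l < multi.conf) entries"
```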

Summary. Choose the number of processes: --ntasks. Choose the number of threads: --cpus-per-task. Launch processes with srun or mpirun. Set multithreading with OMP_NUM_THREADS. You can use $SLURM_PROCID and $SLURM_ARRAY_TASK_ID.

Try it: download the MPI "hello world" from Wikipedia, compile it, write a job script and submit it. Rewrite the 'Multiple datafiles' example using xargs. Rewrite the 'Parameter sweep' example using GNU parallel.