General Overview. Slurm Training15. Alfred Gil & Jordi Blasco (HPCNow!)


Slurm Training15

Agenda
1. About Slurm
2. Key Features of Slurm
3. Extending Slurm
4. Resource Management: daemons, job/step allocation, SMP, MPI and parametric jobs
5. Job monitoring: accounting, scheduling, administration

About Slurm

About Slurm
- Slurm was originally an acronym for Simple Linux Utility for Resource Management.
- Development started in 2002 at LLNL. It evolved into a capable job scheduler through the use of optional plugins.
- In 2010 the Slurm developers founded SchedMD.
- It has about 500,000 lines of C code, so it is not "Simple" anymore; it is now called the Slurm Workload Manager.
- It supports AIX, Linux, Solaris and other Unix variants, and is used on many of the world's largest computers.
- Official website: http://slurm.schedmd.com

About Slurm: Support
- Community support through the mailing list slurm-dev@schedmd.com
- We strongly recommend the third-level support provided by SchedMD.

Key Features Available in Slurm
- Full control over CPU usage
- Full control over memory usage
- Topology awareness
- Job preemption support
- Job requeue mechanism
- Scheduling techniques & performance
- Job array support
- Integration with MPI
- Interactive sessions support
- High availability
- Debugger friendly
- Suspension and resume capabilities
- Kernel-level checkpointing & restart
- Job migration
- Job profiling

Extending Slurm
Extending Slurm features:
- Write a plug-in
- C API (slurm.h)
- Perl API
- Python API
Contributing back to Slurm (see the sketch after this list):
1. Fork the SchedMD Slurm repository on GitHub.
2. Clone your fork.
3. Push your changes to your GitHub repository.
4. Create a pull request against the SchedMD repository on GitHub.
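
A minimal sketch of the contribution workflow above, assuming your GitHub account is <youruser> and your topic branch is called my-fix (both hypothetical names):

# clone your fork of https://github.com/SchedMD/slurm
git clone https://github.com/<youruser>/slurm.git
cd slurm
git checkout -b my-fix         # work on a topic branch
# ... edit, build, test, commit ...
git push origin my-fix         # upload the changes to your GitHub repo
# finally, open a pull request from <youruser>/slurm:my-fix
# against SchedMD/slurm through the GitHub web interface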

Resource Management
Manage resources within a single cluster:
- Nodes: state (up/down/idle/allocated/mix/drained) and access
- Jobs: queued and running, start/stop, suspended, resizing, completing
A few commands to inspect these resources are sketched below.
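
A brief sketch of how node and job state can be inspected from the command line (hsw001 and job ID 11109 reuse names from the examples in this document and are illustrative only):

sinfo -N -l                  # per-node view: state, CPUs, memory, partition
scontrol show node hsw001    # full details of one node, including its state
squeue -t PD,R               # pending and running jobs
scontrol show job 11109      # full details of one job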

Job States
Figure: Job States and job flow. Source: SchedMD

Definitions of Socket, Core & Thread
Figure: Definitions of Socket, Core & Thread. Source: SchedMD

Slurm daemons
- slurmctld: central management daemon (master/slave)
- slurmdbd (optional): accounting database daemon
- slurmd: runs on every compute node
- slurmstepd: started by slurmd for every job step
A sketch of how to check that these daemons are alive follows below.
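
A minimal sketch of how the daemons above can be checked, assuming a systemd-based installation (the node name hsw001 is taken from this document's examples and is illustrative):

scontrol ping                  # reports whether the primary/backup slurmctld respond
systemctl status slurmd        # on a compute node, check the slurmd service
scontrol show node hsw001      # a node stuck in a DOWN or DRAIN state often
                               # points to a slurmd that is not responding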

Job/step allocation
- sbatch: submits a batch script.
- salloc: creates an allocation.
- srun: runs a command across the allocated nodes.
- srun_cr: runs a command across nodes with checkpoint & restart support.
- sattach: connects stdin/stdout/stderr to an existing job or job step.
- scancel: cancels a running or pending job.
- sbcast: transfers a file to the compute nodes allocated to a job.
A short interactive session using these commands is sketched below.
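
A short illustrative session combining the commands above (the file name and the job ID are hypothetical; 11109 reuses the job ID from the sbatch example later in this document):

salloc --nodes=2 --time=00:30:00     # ask for an interactive allocation
srun hostname                        # run a command on every allocated node
sbcast input.dat /tmp/input.dat      # copy a file to all allocated nodes
sattach 11109.0                      # attach to step 0 of job 11109
scancel 11109                        # cancel the job when finished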

srun: a simple way to manage MPI, OpenMP, pthreads & serial jobs
- Slurm provides a single command line to launch all MPI flavors, OpenMP, pthreads and serial applications.
- Users don't need to worry about the flags and options of each MPI implementation (mpirun/mpiexec/mpiexec.hydra).
- The tool is called srun and its use should be mandatory for launching jobs in the cluster.
- It provides key features like memory affinity, CPU binding and cgroups integration.
A few srun invocations illustrating this are sketched below.
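
A minimal sketch of typical srun invocations inside a job script that requested --cpus-per-task (the binaries serial_app, omp_app and mpi_app are hypothetical placeholders):

srun ./serial_app                                        # serial job: one task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --cpus-per-task=$SLURM_CPUS_PER_TASK ./omp_app      # OpenMP job
srun --ntasks=96 ./mpi_app                               # MPI job: no mpirun needed
srun --cpu_bind=cores ./mpi_app                          # explicit CPU binding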

sbatch
~ $ vim testjob.sl
~ $ cat testjob.sl
#!/bin/bash
#SBATCH --nodes=10
srun echo "running on : $(hostname)"
srun echo "allocation : $SLURM_NODELIST"
~ $ sbatch testjob.sl
Submitted batch job 11109
~ $ cat slurm-11109.out
running on : hsw001
allocation : hsw[001-010]

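Once submitted, the job from the example above can be tracked with the monitoring commands introduced later in this training (11109 is the job ID from the example):

squeue -j 11109              # is the job pending or running?
scontrol show job 11109      # full job description: nodes, limits, reason
scancel 11109                # remove it if it is no longer needed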

Submitting a Job
Job Description Example: SMP

#!/bin/bash
#SBATCH -o blastn-%j.%n.out
#SBATCH -D /share/src/blast+/example
#SBATCH -J BLAST
#SBATCH --cpus-per-task=4
#SBATCH --mail-type=end
#SBATCH --mail-user=youremail@yourdomain.org
#SBATCH --time=08:00:00
ml blast+
export OMP_NUM_THREADS=4
srun blastn

Submitting a Job
Job Description Example: MPI

#!/bin/bash
#SBATCH -o migrate-%j.%n.out
#SBATCH -D /share/src/migrate/example
#SBATCH -J MIGRATE-MPI
#SBATCH --ntasks=96
#SBATCH --mail-type=end
#SBATCH --mail-user=youremail@yourdomain.org
#SBATCH --time=08:00:00
ml migrate/3.4.4-ompi
srun migrate-n-mpi parmfile.testml -nomenu

Submitting a Job
Parametric Job

Submitting a 10,000-element job array takes less than one second!

sbatch --array=1-10000 jobarray.sh

To identify the input files or parameters, you can use the array index environment variable SLURM_ARRAY_TASK_ID.

Submitting a Job
Parametric Job Example: jobarray.sh

#!/bin/bash
#SBATCH -J JobArray
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=1024
PARAMS=$(sed -n ${SLURM_ARRAY_TASK_ID}p params.dat)
srun your_binary $PARAMS

params.dat example:
-file your_input_file -x 1 -y 1 -z 1
-file your_input_file -x 1 -y 1 -z 2
-file your_input_file -x 1 -y 1 -z 3

Submitting a Job
Parametric Job

The management is really easy:

$ squeue
JOBID              PARTITION  PRIOR  NAME      USER   STATE  TIME  TIMEL
2142564_[20-1000]  high       24544  JobArray  jordi  PD     0:00  1:00:00
2142564_1          high       24544  JobArray  jordi  R      0:02  1:00:00
2142564_2          high       24544  JobArray  jordi  R      0:02  1:00:00

$ scancel 204258_[400-500]
$ scontrol hold 204258

Realistic example

#!/bin/bash
#SBATCH -J GPU_JOB
#SBATCH --time=01:00:00          # Walltime
#SBATCH -A uoa99999              # Project Account
#SBATCH --ntasks=4               # number of tasks
#SBATCH --ntasks-per-node=2      # number of tasks per node
#SBATCH --mem-per-cpu=8132       # memory/core (in MB)
#SBATCH --cpus-per-task=4        # 4 OpenMP Threads
#SBATCH --gres=gpu:2             # GPUs per node
#SBATCH -C kepler
ml GROMACS/4.6.5-goolfc-2.6.10-hybrid
srun mdrun_mpi

- squeue: shows the status of jobs.
- sinfo: provides information on partitions and nodes.
- sview: GUI to view job, node and partition information.
- smap: CLI to view job, node and partition information.
A few typical invocations are sketched below.
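
A minimal sketch of typical monitoring calls (the partition name "high" and the user jordi are taken from the examples in this document and are illustrative):

sinfo                      # summary of partitions and node states
sinfo -p high -N -l        # per-node detail for one partition
squeue -u jordi            # jobs of a single user
squeue --start             # expected start times of pending jobs
smap                       # curses-based view of jobs and nodes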

squeue
Show the job ID and allocated nodes for the jobs of user jordi:

[4925] ~$ squeue -u jordi
JOBID  PARTITION  NAME       USER   ST  TIME  NODES  NODELIST(REASON)
24258  high       Migrate-N  jordi  PD  0:00  4      (Resources)
24259  high       Migrate-N  jordi  PD  0:00  4      (Priority)
24257  high       Migrate-N  jordi  R   0:27  512    hsw[1-512]

sview

Slurm Commands: Accounting
- sacct: report accounting information by individual job and job step.
- sstat: report accounting information about currently running jobs and job steps (more detailed than sacct).
- sreport: report resource usage by cluster, partition, user, account, etc.
Typical invocations are sketched after this list.
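
A minimal sketch of the accounting commands above (job ID 11109 and the date range are illustrative):

sacct -j 11109 --format=JobID,JobName,Elapsed,MaxRSS,State   # finished or running job
sstat -j 11109.0 --format=JobID,AveCPU,MaxRSS                # a currently running step
sreport cluster Utilization Start=2015-01-01 End=2015-02-01  # cluster-wide usage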

Slurm Commands: Scheduling
- sacctmgr: database management tool. Add/delete clusters, accounts, users, etc.; get/set resource limits, fair-share allocations, etc.
- sprio: view the factors comprising a job's priority.
- sshare: view current hierarchical fair-share information.
- sdiag: view statistics about scheduling module operations (execution time, queue length, etc.).
A few examples are sketched after this list.
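
A brief sketch of these commands (the account bio_dept and the limit value are hypothetical examples; jordi is the example user from this document):

sacctmgr add account name=bio_dept Description="Biology department"   # new account
sacctmgr add user name=jordi account=bio_dept                         # add a user to it
sacctmgr modify user name=jordi set MaxJobs=10                        # per-user limit
sprio -l        # long listing of the priority factors of pending jobs
sshare -a       # fair-share tree for all users
sdiag           # scheduler statistics (cycle times, backfill depth, ...)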

Slurm Commands: Slurm Administration Tool
scontrol: administrator tool to view and/or update system, job, step, partition or reservation status. The most popular subcommands are: checkpoint, create, delete, hold, reconfigure, release, requeue, resume, suspend.
A few typical uses are sketched below.
More information: http://slurm.schedmd.com/scontrol.html
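
A minimal sketch of common scontrol invocations (job ID 11109 and node hsw001 reuse names from earlier examples and are illustrative):

scontrol show job 11109                                            # inspect a job
scontrol hold 11109                                                # keep a pending job from starting
scontrol release 11109                                             # let it run again
scontrol requeue 11109                                             # put a job back in the queue
scontrol update NodeName=hsw001 State=DRAIN Reason="maintenance"   # drain a node
scontrol reconfigure                                               # re-read slurm.conf on all daemons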