Job Scheduling with Moab Cluster Suite




Job Scheduling with Moab Cluster Suite
IBM High Performance Computing
Y. Joanna Wong, Ph.D. (yjw@us.ibm.com)
February 22, 2010

[Figure: Moab Workload Manager and Torque Resource Manager. Source: Adaptive Computing]

Some terminology

Resource Manager
- Manages a queue of jobs for a cluster of resources
- Launches jobs from a simple FIFO job queue
- The resource manager is Torque

Workload Manager
- A scheduler that integrates with one or more resource managers to schedule jobs across domains of resources (servers, storage, applications)
- Prioritizes jobs
- Provides status of running and queued jobs
- Implements fair-share mechanisms to achieve efficient utilization of resources
- Enforces established policy
- Collects and reports resource usage statistics
- The workload manager is Moab

[Diagram: the user submits a job request to the Moab Workload Manager, which controls jobs and queries jobs, resources, and policies through the Torque Resource Manager; Torque manages the iDataPlex dx360 M2 nodes (node001 - node168) and the x3950 M2 servers (smp001 - smp007).]

Moab / Torque scheduler

Moab Cluster Suite
- The commercial derivative of the Maui Cluster Scheduler, an open source job scheduler developed starting in the mid-90s for use on clusters and supercomputers, with contributions from several academic institutions and national research labs. Moab is maintained and supported by Adaptive Computing (formerly Cluster Resources, Inc).
- The Maui/Moab workload manager is NOT a resource manager: the scheduler tells the resource manager what to do, and when and where to run jobs.
- Can be integrated with several resource managers, including Torque
- Capable of supporting multiple scheduling policies, dynamic priorities, extensive reservations, and fair-share capabilities
- Users typically submit jobs and query the state of resources and jobs through the resource manager. Users submit their job scripts to the resource manager.

Torque Resource Manager

- An open source resource manager providing control over batch jobs and distributed compute nodes
- A community effort based on the original PBS project, with enhancements in scalability, fault tolerance, and feature extensions over standard OpenPBS
- Fault tolerance
  - Additional failure conditions checked/handled
  - Node health check script support
- Aggressive development with new capabilities: advanced diagnostics, job arrays, high-throughput support
- Scalability
  - Significantly improved server-to-MOM communication model
  - Ability to handle larger clusters (over 20,000 cores)
  - Ability to handle tens of thousands of jobs
  - Ability to support larger server messages

Documentation on Moab and the Torque resource manager is available from Adaptive Computing via links from the URL: http://www.clusterresources.com/resources/documentation.php

Tutorials and user guides are also available from computing centers, for example: https://computing.llnl.gov/tutorials/moab/

Moab commands

The majority of Moab commands are for use by scheduler administrators. For command details, access the links from: http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml

Moab end-user commands:

  checkjob   - Provides a detailed status report for the specified job
  mjobctl    - Controls and modifies a job; e.g. mjobctl -c JOBid cancels the job with ID JOBid
  showbf     - Shows resources available for jobs with specific resource requirements
  showq      - Displays all jobs in active, idle, and non-queued states (the flags to display extended details can only be used by level 1, 2, or 3 scheduler administrators)
  showstart  - Shows estimates of when a job can/will start
  showres    - Shows existing reservations
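
A minimal command-line session sketch using these end-user commands (the job ID 1234 is an illustrative assumption):

    showq                     # list jobs in active, idle, and blocked states
    checkjob 1234             # detailed status report for job 1234
    showstart 1234            # estimated start time for job 1234
    showres                   # list existing reservations
    mjobctl -c 1234           # cancel job 1234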

Submitting jobs

If Moab is configured to run as root, users can submit jobs to Moab directly using msub:

  msub [-a datetime] [-A account] [-c interval] [-C directive_prefix] [-d path] [-e path] [-h] [-I] [-j join] [-k keep] [-K] [-l resource_list] [-m mail_options] [-M user_list] [-N name] [-o path] [-p priority] [-q destination] [-r] [-S path_list] [-u user_list] [-v variable_list] [-V] [-W additional_attributes] [-z] [script]

- Jobs submitted with msub can run on resources belonging to any of the resource managers managed by Moab.
- Jobs submitted directly to a resource manager (e.g. with qsub for Torque) can only run on resources managed by that resource manager.
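
A minimal msub example (the job name, resource amounts, and script name job.script are illustrative assumptions):

    msub -N myjob -l nodes=2:ppn=4,walltime=1:00:00 job.script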

Building a Torque job script

Users build job scripts and submit them for scheduling with the qsub command:

  qsub [-a date_time] [-A account_string] [-b secs] [-c checkpoint_options] [-C directive_prefix] [-d path] [-D path] [-e path] [-h] [-I] [-j join] [-k keep] [-l resource_list] [-m mail_options] [-M user_list] [-N name] [-o path] [-p priority] [-q destination] [-r c] [-S path_list] [-t array_request] [-u user_list] [-v variable_list] [-V] [-W additional_attributes] [-X] [-z] [script]

  http://www.clusterresources.com/products/torque/docs/a.acommands.shtml

The job script is a plain text file that includes:
- Shell scripting and comment lines
- Commands and directives specific to the batch system
  - A directive is an alternative to a command line option for specifying job attributes
  - All directive lines must precede the shell script commands
  - Shell scripting is parsed at runtime

The job script may be specified as the qsub command line argument [script], or may be entered via STDIN or piped to qsub:

  cat job.script | qsub
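
A minimal job script sketch putting these pieces together (the job name, resource amounts, and commands are illustrative assumptions):

    #!/bin/bash
    #PBS -N demo_job                 # job name
    #PBS -l nodes=1:ppn=1            # one processor on one node
    #PBS -l walltime=0:10:00         # 10 minutes of wall-clock time
    #PBS -j oe                       # merge stderr into stdout

    # Shell commands below the directives are parsed at runtime
    echo "Running on $(hostname) at $(date)"

Submit it with:

    qsub job.script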

Torque job script (continued)

- The job script is executed from the user's home directory.
- For parallel jobs, the job script is staged to and executed on the first allocated compute node.
- The job script uses the default user environment variables (set in the shell startup script, e.g. .bashrc) unless the -V or -v flag is specified to include all current environment variables (-V) or selected environment variables (-v).
- qsub passes the values of the environment variables HOME, LANG, LOGNAME, PATH, MAIL, SHELL, and TZ to the job script; each is assigned to a new name prefixed with PBS_O_.
- qsub processes a line as a directive if the string of characters starting with the first non-whitespace character on the line, and of the same length as the directive prefix, matches the directive prefix. The directive prefix is determined in order of preference:
  1. The value of the command line option -C
  2. The value of the environment variable PBS_DPREFIX, if defined
  3. The string #PBS
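
A small sketch of overriding the directive prefix (the prefix string #MYPBS and the script name are illustrative assumptions):

    # job.script uses #MYPBS instead of #PBS for its directive lines
    qsub -C "#MYPBS" job.script

    # or equivalently, via the environment variable
    export PBS_DPREFIX="#MYPBS"
    qsub job.script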

Resources are requested at job submission:

- With the command line option -l for qsub. For example:
    -l walltime=1:00:00 -l nodes=4:ppn=4
- With directives in the job script. For example:
    #PBS -l walltime=1:00:00
    #PBS -l nodes=4:ppn=4

A few frequently requested resources:

- -l mem=<size> : maximum amount of physical memory used by the job (ignored on Linux if the number of nodes is not 1), where <size> is given as a number of bytes (suffix b) or words (suffix w). The multipliers are k=1,024, m=1,048,576, g=1,073,741,824, and t=1,099,511,627,776; e.g. -l mem=1gb
- -l vmem=<size> : maximum amount of virtual memory used by all concurrent processes in the job
- -l walltime=<seconds> or [[HH:]MM:]SS : maximum amount of real time during which the job is in the run state
- -l cput=<seconds> or [[HH:]MM:]SS : maximum amount of CPU time used by all processes in the job
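
A sketch combining several of these limits in one submission (the amounts and script name are illustrative assumptions):

    qsub -l nodes=1:ppn=8 -l walltime=2:00:00 -l mem=1gb -l vmem=4gb -l cput=16:00:00 job.script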

- -l nodes={<node_count>|<hostname>}[:ppn=<ppn>][:<property>][:<property>][+ ...] : the number and/or type of nodes to be reserved for use by the job.
  - The value is one or more node_specs joined with the + character: node_spec[+node_spec...].
  - Each node_spec is a number of nodes required of the type declared in the node_spec, plus a name or one or more properties desired for the nodes. The number, the name, and each property in the node_spec are separated by a colon (:). If no number is specified, one (1) is assumed.
  - The name of a node is its hostname.
  - The properties of nodes are:
    - ppn=# : the number of processors per node requested; defaults to 1
    - property : a string assigned by the system administrator specifying a node's features
  For example:
  - -l nodes=2:ppn=4+4:ppn=2 : requests 2 nodes with 4 cores per node and 4 nodes with 2 cores per node, a total of 6 nodes and 16 cores
  - -l nodes=node001+node003+node005 : requests 3 specific nodes by hostname
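
A short sketch combining a node count with an administrator-defined property (the property name bigmem is a hypothetical feature string; the script name is also an assumption):

    qsub -l nodes=4:ppn=8:bigmem -l walltime=1:00:00 job.script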

- -N name : declares a name for the job. If -N is not specified, the job name is the base name of the job script.

Running interactive jobs:
- Specify the -I option on the command line, include the -I directive in the script, or declare the job attribute interactive to be true: -W interactive=true
- During execution of an interactive job, input to and output from the job are passed through qsub.
- Useful for debugging while building and testing applications
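
A minimal sketch of starting an interactive session (the node count and walltime are illustrative assumptions):

    qsub -I -l nodes=1:ppn=4 -l walltime=0:30:00

When the job starts, qsub provides a shell on the allocated compute node; exiting the shell ends the job.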

Job Script Environment Variables

Exported batch environment variables that can be used in the job script:

  PBS_JOBNAME    - user-specified job name
  PBS_JOBID      - job identifier assigned by the batch system
  PBS_ARRAYID    - value of the job array index for this job
  PBS_NODEFILE   - name of the file containing the list of node(s) assigned to the job
  PBS_QUEUE      - name of the queue from which the job is executed
  PBS_O_HOST     - name of the host upon which the qsub command is running
  PBS_O_QUEUE    - name of the original queue to which the job was submitted
  PBS_O_WORKDIR  - absolute path of the current working directory of qsub
  PBS_O_LOGNAME  - name of the submitting user
  PBS_O_HOME     - home directory of the submitting user
  PBS_O_PATH     - PATH variable used to locate executables within the job script
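
A short sketch showing how a job script might use these variables (the MPI launch line is commented out because the actual launcher depends on the site's MPI installation; my_app is a placeholder):

    #!/bin/bash
    #PBS -N env-demo
    #PBS -l nodes=2:ppn=4
    #PBS -l walltime=0:30:00

    cd $PBS_O_WORKDIR                         # return to the directory qsub was run from
    echo "Job $PBS_JOBID ($PBS_JOBNAME) in queue $PBS_QUEUE"
    NPROCS=$(wc -l < $PBS_NODEFILE)           # count allocated processor slots
    echo "Running on $NPROCS processors:"
    cat $PBS_NODEFILE
    # mpirun -np $NPROCS -machinefile $PBS_NODEFILE ./my_app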