Miami University RedHawk Cluster
Working with Batch Jobs on the Cluster

The RedHawk cluster is a general-purpose research computing resource available to support the research community at Miami University. This document provides an overview of how to work with the batch scheduling system on the cluster.

Contents:
Conventions used in this document ... 1
Purpose of the Batch Scheduler ... 2
Writing Batch Job Scripts ... 2
Batch Scheduling (PBS) Commands ... 3
Requesting Specific Nodes ... 3
Batch Job Commands ... 4
Submitting Batch Jobs ... 4
Batch Job Status ... 4
Job Queues ... 5
Job Log Files ... 6
Interactive Batch Jobs ... 6

For information on connecting to the cluster, see the separate Connecting to the RedHawk Cluster document (with specific versions for Windows, Mac, and Linux) available on the Miami University Research Computing Group web site (http://www.muohio.edu/researchcomputing). If you have any questions about batch jobs that are not answered in this document, please e-mail the Miami University Research Computing Support group at rescomp@muohio.edu.

Conventions used in this document:
In this document, commands that are to be typed in Linux are shown enclosed in quotes in an alternate font and should be entered as shown, but without the quotation marks. For reference, in the alternate font the letter l (lower case L) is rendered distinctly from the number 1 to avoid confusion.

Last updated: December 9, 2011 Page 1
Purpose of the Batch Scheduler

Batch jobs are used to run programs without requiring any input from the user. Batch jobs run for hours, days, or weeks, executing a given set of commands and saving any output to a file for later review by the user. The batch scheduler is used to control access to the compute nodes on the cluster. Users submit batch jobs that specify what resources they need (number of CPUs and amount of time) and what commands should be run. The batch scheduler determines when the requested resources will be available and schedules the job. The batch system also takes care of collecting output from the job and can notify users via e-mail when their job starts or ends. Once a batch job starts running on a node (or nodes), the job has exclusive use of those nodes until the job is complete. While the scheduling system is usually thought of as a non-interactive system, it can be used to gain access to a compute node for interactive use. This is described in the Interactive Batch Jobs section.

The batch scheduling system currently running on the RedHawk cluster uses a package called Torque to manage the resources on the compute nodes and run the jobs. A separate package called Moab is used to schedule jobs. Torque is the current open-source version of the PBS (Portable Batch System) package.

Writing Batch Job Scripts

A batch job script is a Linux shell script with additional comment lines that are interpreted by the batch scheduling system. In the first part of the script, job parameters such as name, resource requests, etc. are specified. The second part of the script contains the commands that the job will execute. Here is a sample batch job script:

#!/bin/bash -l
#PBS -N test1
##PBS -N test    <- this PBS directive is commented out
#PBS -l nodes=1:ppn=1
#PBS -l walltime=10:0:0
#this is a comment - the commands to run follow
cd test1
./test1

The first line, "#!/bin/bash -l", specifies what Linux shell to use in evaluating the commands in the script.
If you don't know what this means, don't worry; just make sure all of your batch job scripts have this as the first line.
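Because #PBS lines are ordinary comments as far as the shell is concerned, a job script can be run directly with bash outside the scheduler as a quick sanity check of its commands. A minimal sketch (the script contents and echoed message are illustrative only, not from this document):

```shell
# Write a tiny job script to a temporary file; to bash, the #PBS
# lines below are plain comments and are simply skipped.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#!/bin/bash -l
#PBS -N demo
#PBS -l nodes=1:ppn=1
echo "hello from the job script"
EOF
out=$(bash "$tmp")   # run the script directly, bypassing the scheduler
rm -f "$tmp"
echo "$out"
```

Running a script this way executes its commands on the current machine, so it is only suitable for quick checks, not for real workloads.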
The lines beginning with # are treated as comments, but lines starting with #PBS are instructions to the batch scheduling system. Note that batch scheduling instructions can be commented out by using ##PBS. Any line that does not start with # is interpreted as a command to be executed by the batch job. In the example, the script changes into a sub-directory and executes the test1 command in that directory.

Batch Scheduling (PBS) Commands

All batch job scripts should include the following PBS commands:

#PBS -N jobname - Indicates the name of the job.
#PBS -l nodes=1:ppn=1 - Indicates the number of nodes and processors per node (ppn) requested. Allowed values for ppn are from 1 to 8.
#PBS -l walltime=1:00:00 - Indicates the requested wall clock time for the job. The format is hours:minutes:seconds - in the example, 1 hour is requested. Note that the job will not be allowed to exceed this limit.

Additional PBS commands include:

#PBS -m abe - Send e-mail when the job begins (b), ends (e), or aborts (a) - use any combination of these three letters. By default, e-mail is sent to uniqueid@muohio.edu.
#PBS -M nobody@example.com - Specify additional e-mail addresses for notification. Use a comma to separate multiple addresses.
#PBS -j oe - Join the standard output and standard error output streams in a single file. The default behavior is to have these in separate files.
#PBS -V - Declare that all environment variables in the current environment should be passed to the job when the job is submitted.
#PBS -q queuename - Specify which queue your job should run in. If this is not present, your job will automatically be routed to the serial or parallel queue based on the number of nodes requested. See the section below for more information about the available batch queues.

Requesting Specific Nodes

The cluster contains two types of compute nodes. The original 32 nodes purchased in 2009 have 2.26 GHz Intel Xeon E5520 CPUs, while the 4 nodes purchased in 2011 have 2.4 GHz Intel Xeon E5620 CPUs.
To request a specific type of node, add :nxx (where xx is 09 for the 2.26 GHz nodes and 11 for the 2.4 GHz nodes) to the node resource request. For example, to request all processors on two of the original nodes, use "#PBS -l nodes=2:ppn=8:n09".
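Putting the directives above together, a complete submission script might look like the following sketch. The job name, walltime value, and program name are placeholders chosen for illustration, not values from this document:

```shell
#!/bin/bash -l
#PBS -N myjob                # job name (placeholder)
#PBS -l nodes=2:ppn=8:n09    # all processors on two of the original 2009 nodes
#PBS -l walltime=4:00:00     # 4-hour wall clock limit (placeholder)
#PBS -m abe                  # e-mail when the job begins, aborts, or ends
#PBS -j oe                   # merge standard output and standard error

cd $PBS_O_WORKDIR            # change to the directory the job was submitted from
./myprogram                  # program to run (placeholder)
```

This is a scheduler submission script: it only does useful work when submitted with qsub on the cluster, where $PBS_O_WORKDIR is set by the batch system.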
Batch Job Commands

Any line in the batch job script that does not start with # will be treated as a command to be run on the compute node(s) assigned to the job. The commands included in the batch job script should be the same commands you would use to run the program on the head node of the cluster, except that some programs require different parameters when run in batch (or non-interactive) mode. Consult the documentation for your particular program. As an example, if you want to have a batch job run Matlab and execute the commands in a file named commands.m, you would use the command "matlab -nodisplay -nojvm -r commands".

Note that the batch job is executed as a new process and does not inherit any information from the process it was submitted from (unless the #PBS -V option is used). You should include commands to load any needed software modules (for example, "module load matlab"). You will also need to include commands to change into the directory where your data or command files are located. To help with this, when the job starts, the Linux environment variable $PBS_O_WORKDIR is set to the directory the job was submitted from, so you can include the command "cd $PBS_O_WORKDIR" in your script to navigate to this directory.

Submitting Batch Jobs

To submit a batch job contained in a script named test.job, execute the command "qsub test.job". You can override any of the PBS instructions in the script file when you submit the job. For example, to give the job a different name, execute the command "qsub -N newname test.job". See "man qsub" for more information about the available PBS commands. When you submit a batch job, a job identifier is returned:

$ qsub test.job
66286.torque.hpc.muohio.edu
$

This identifier can be used to get information about the status of the job, and the numeric portion will be used in naming the standard output and standard error files created by the job.
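In shell scripts that submit jobs, the numeric portion can be split off the full identifier with standard bash parameter expansion. A small sketch, using the identifier value from the example above:

```shell
jobid="66286.torque.hpc.muohio.edu"   # full identifier as printed by qsub
jobnum="${jobid%%.*}"                 # strip everything after the first dot
echo "$jobnum"
```

The resulting number is the value you pass to commands such as qstat, and the value that appears in the job's log file names.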
Batch Job Status

To see the current status of your batch job, use the qstat command along with the numeric portion of the job identifier:
$ qstat 66286
Job id         Name    User      Time Use  S  Queue
-------------  ------  --------  --------  -  -----
66286.torque   first   woodsdm2  00:00:00  R  serial
$

The column labeled Time Use shows the CPU time that has been used by the job. For parallel jobs, this will be the sum across all assigned CPUs. The column labeled S shows the current job status. Common status values are: R = running, Q = queued (waiting to run), and E = exiting. More detailed information about a job can be obtained using "qstat -f jobid".

Other useful commands for getting information about jobs or the batch scheduling system include:

"qstat" to see all jobs.
"qstat -u username" to see all jobs for a specific user.
"qstat -n jobid" to see a brief summary of a job, including the node(s) it is running on.
"pbsnodes -a" to see an overview of all nodes in the cluster.
"showstart" to see when the batch system thinks a queued job will start. This is an estimate based on the resource requests of queued and running jobs.
"showbf" to see currently available resources. This command will return information like "5 procs available with no timelimit" or "6 procs available for 6:20:00". In the second example, this shows that a job requesting a single CPU for fewer than 6 hours and 20 minutes will run immediately.
"qdel jobid" to delete a job. You can only delete your own jobs.

Job Queues

A number of different job queues are defined on the RedHawk system. The main queue that users should use is the batch queue, which routes jobs to the serial or parallel queue based on the number of nodes requested. Several queues such as stata, paup, comsol, and sas are set up for software packages where only a limited number of licenses are available. If you have questions about which batch queue to use, or feel that none of the standard queues meet your needs, please contact the Research Computing Support group at rescomp@muohio.edu.

Several tools are available to view the job queue definitions on the RedHawk cluster.
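When comparing a walltime request against showbf output like "6 procs available for 6:20:00", it can help to convert the hours:minutes:seconds format to a total number of seconds. A minimal bash sketch, using the showbf value from the example above:

```shell
walltime="6:20:00"                   # hours:minutes:seconds, as reported by showbf
IFS=: read -r h m s <<< "$walltime"  # split the string on the colons
# The 10# prefix forces base-10 so fields like "08" are not read as octal
total=$(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
echo "$total"
```

A job whose walltime request converts to fewer seconds than this total should be able to start immediately on the free processors.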
These tools can be used to see the maximum walltime limits for a queue. These commands are:

"qstat -q" to see an overview of the batch queue definitions.
"qstat -Q" to see an overview of the batch queue status.
"qstat -Qf queue-name" to see the detailed definition of a specific queue.
Job Log Files

All processes running on the cluster produce two output streams: one for standard output and a second for error messages. During interactive use, both of these output streams are displayed on the terminal. For batch jobs, the batch system captures these output streams and writes them to files. The names of these files are built using the job name specified by the #PBS -N jobname directive and the numeric job identifier. For example, if a job is named test and has a job identifier of 12345, the standard output will be written to a file named test.o12345 and the error output will be written to test.e12345. The files will be located in the directory the job was submitted from. If the #PBS -j oe directive is used, only one of these files will be written, but it will contain both output streams.

The qpeek command can be used to view the log files for running jobs. The command takes the numeric job identifier as an argument, so the command "qpeek 12345" would show the current contents of the standard output log. The qpeek command has additional options to view the error log, only the beginning or end of the file, etc. Details of these options can be found with the "qpeek -help" command.

Interactive Batch Jobs

To obtain interactive access to a compute node, execute the command "qsub -I -V". Once a node is allocated, your prompt will change to indicate that you are working on a compute node. This command will use the default resource requests of 1 node and 1 hour of CPU time. Additional resources can be requested - for example, "qsub -I -V -l walltime=2:00:00" - using the same -l resource request options that are used in batch job scripts. Alternately, the resource requests can be placed in a batch job script and submitted. For example, the command "qsub -I -V test.job" will start an interactive session using all of the PBS scheduling commands in the test.job file, but will not execute any Linux commands in the file.
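The log-file naming convention described in the Job Log Files section can be sketched in shell, using the example values from that section (job name test, job identifier 12345):

```shell
jobname="test"                       # value given with #PBS -N
jobnum=12345                         # numeric portion of the job identifier
stdout_log="${jobname}.o${jobnum}"   # standard output file name
stderr_log="${jobname}.e${jobnum}"   # standard error file name
echo "$stdout_log $stderr_log"
```

Constructing the names this way is handy in wrapper scripts that need to collect or examine a job's logs after it finishes.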