
LLNL-MI-401783
LAWRENCE LIVERMORE NATIONAL LABORATORY

Requesting Nodes, Processors, and Tasks in Moab

D. A. Lipari
March 29, 2012

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

Introduction

This document explains the differences between requesting nodes, tasks, and processors for jobs running on LC Linux clusters. The discussion does not pertain to jobs running on any of the BlueGene clusters; for those machines, see Running Jobs on BlueGene for a similar discussion.

In the discussion below, the terms nodes, processors, and tasks have special meaning for Moab. A node refers to the collection of processing resources associated with one IP address. Each node holds a collection of sockets, with each socket containing one or more cores. On certain architectures, each core can run one or more hardware threads. The term processor refers to one core, whether or not that core runs multiple threads. A task is an execution of a user's code, whether or not that code spawns multiple threads. Typically, the SLURM resource manager launches one or many tasks across one or more nodes. Note that these definitions differ slightly from SLURM's.

The following discussion pertains to submitting jobs to Moab using msub. A brief comparison of Moab and SLURM is included at the end of this document.

Submitting Jobs to Moab

Moab has always offered the option for a job to request a specific number of nodes (i.e., msub -l nodes=<n>) on LC parallel clusters. LC policy on parallel clusters is to allocate entire nodes to jobs for their exclusive use; that is, only one job can be running on any node at any time.

In contrast, LC policy constrains jobs running on its capacity clusters (Aztec and Inca) to just one node. Jobs submitted to those machines instead request tasks (i.e., msub -l ttc=<t>), and Moab allocates one processor per task requested. (ttc stands for total task count.) Hence, Moab could schedule 12 one-task jobs per Aztec or Inca node, or one 12-task job, or anywhere in between. As such, Aztec and Inca are the only machines that allow multiple jobs to run on one node.

LC's parallel clusters have differing numbers of processors per node. Users can submit jobs to a grid of Linux clusters and specify multiple clusters as potential targets. For example:

    msub -l partition=clustera:clusterb:clusterc

Moab will place the job in a queue and run it on the first available cluster from the list of candidates.

Past Problem Now Solved

Some parallel jobs launch a specific number of tasks (assuming a one-task-to-one-processor relationship for this example), and the number of nodes needed to provide those processors is irrelevant. Hence, users would like to submit jobs to Moab using the command form

    msub -l procs=<p>

and have Moab decide whether to allocate X nodes of cluster A or Y nodes of cluster B to provide those processors. Moab v5.3 was unable to provide this flexibility. Moab v6.1 offers the msub -l procs=<p> option without the need to specify a node count. If one cluster has 8-processor nodes and a second cluster has 16-processor nodes, Moab will allocate more or fewer nodes for the job depending on which cluster is selected to run it.
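To make the procs form concrete, a minimal batch script along the lines described above might look like the following sketch. The processor count and the executable name (my_app) are illustrative assumptions; the cluster names come from the partition example above.

    #!/bin/bash
    # Ask Moab for 32 processors and let it choose how many nodes to allocate
    # (e.g., 4 nodes of an 8-processor cluster or 2 nodes of a 16-processor cluster).
    #MSUB -l procs=32
    # Offer the job to several clusters; Moab runs it on the first one available.
    #MSUB -l partition=clustera:clusterb:clusterc

    # Launch one task per allocated processor.
    srun -n32 ./my_app

Such a script would be submitted with msub <script-name>, and the same srun line works unchanged on whichever cluster Moab selects.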

Recommendations

The following tables summarize how to allocate nodes and processors using Moab's msub command. The first table provides the simplest command forms, with Yes or No indicating whether that form is appropriate for a specific cluster. The second table displays accepted (but unnecessarily complicated) command forms and the simpler forms they correspond to on each type of cluster.

    Recommended Command Forms                  Parallel Clusters     Aztec/Inca
    msub -l ttc=<t>                            No                    Yes
    msub -l nodes=<n>                          Yes                   No
    msub -l procs=<p>                          Yes                   Yes

    Accepted Command Forms                     Parallel Clusters     Aztec/Inca
    msub -l nodes=1,procs=<p>                  msub -l procs=<p>     msub -l ttc=<p>
    msub -l nodes=<n>,procs=<p> where n > 1    msub -l procs=<p>     Invalid
    msub -l nodes=1,ttc=<t>                    msub -l nodes=1       msub -l ttc=<t>
    msub -l nodes=<n>,ttc=<t> where n > 1      msub -l nodes=<n>     Invalid

Moab and SLURM

Moab's msub command requests a node/processor resource allocation. After scheduling the resources, Moab dispatches the job request to SLURM, which runs the job across those allocated resources. There is a strict relationship between the resources Moab allocates and SLURM's launching of the user's executables across those resources. SLURM's srun command is used to launch the job's tasks, and it offers a set of node/task options that mirror the msub options described above. There are obvious constraints; for example, a job cannot request a one-node allocation from Moab and then use srun to launch tasks across multiple nodes.

The examples below show valid and invalid combinations, illustrating the required consistency between msub and srun options on the Aztec and Inca clusters.

    #MSUB -l ttc=8
    srun -n8 a.out
        Correct. Uses 8 processes on 8 processors.

    #MSUB -l ttc=4
    srun -n8 a.out
        Incorrect. Uses 8 processes on only 4 processors.

    #MSUB -l ttc=4
    srun -n8 -O a.out
        This command will work because the -O option permits SLURM to oversubscribe a node's processors. However, using more tasks than processors is not recommended and can degrade performance.

    #MSUB -l ttc=64
    srun -n64 a.out
        Incorrect if 64 is more than the number of physical processors on a node.

    # (no specification for -l ttc)
    srun -n2 a.out
        Incorrect. The #MSUB -l ttc= option isn't specified, so the default of 1 is less than the 2 tasks requested with srun -n2.

    #MSUB -l ttc=8
    srun -n2 a.out
        This is OK and might be used if each of the 2 tasks spawns 4 threads.
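The same consistency applies on the parallel clusters, where msub allocates whole nodes rather than a task count. As an illustrative sketch only (the 16-processor-per-node figure, the two-node request, and the executable name are assumptions, not LC specifics):

    #!/bin/bash
    # Request two whole nodes; LC parallel clusters allocate nodes exclusively.
    #MSUB -l nodes=2

    # Assuming 16 processors per node, 32 tasks keeps one task per processor.
    srun -n32 a.out
    # srun -n64 a.out would oversubscribe the 32 allocated processors and,
    # as with the ttc examples above, would require -O and is not recommended.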

Further Reading

After Moab allocates the node/processor resources to the job, you can use srun to alter the default one-to-one relationship between tasks and processors. Information on the following options is available in the srun man page:

    srun --cpus-per-task
    srun --ntasks-per-node
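For example, a job whose tasks each spawn several threads might request whole nodes from Moab and then use these options to place fewer tasks, each with several processors. This is a hedged sketch; the node size, task and processor counts, and the executable name are assumed for illustration.

    #!/bin/bash
    # Two exclusively allocated nodes, assumed here to have 16 processors each.
    #MSUB -l nodes=2

    # Place 4 tasks on each node and give each task 4 processors for its threads,
    # instead of the default of one task per processor.
    srun --ntasks-per-node=4 --cpus-per-task=4 ./my_threaded_app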