Miami University RedHawk Cluster Working with batch jobs on the Cluster


The RedHawk cluster is a general purpose research computing resource available to support the research community at Miami University. This document provides an overview of how to work with the batch scheduling system on the cluster.

Conventions used in this document
Purpose of the Batch Scheduler
Writing Batch Job Scripts
Batch Scheduling (PBS) Commands
Requesting Specific Nodes
Batch Job Commands
Submitting Batch Jobs
Batch Job Status
Job Queues
Job Log Files
Interactive Batch Jobs

For information on connecting to the cluster, see the separate Connecting to the RedHawk Cluster document (with specific versions for Windows, Mac, and Linux) available on the Miami University Research Computing Group web site. If you have any questions about batch jobs that are not answered in this document, please contact the Miami University Research Computing Support group at rescomp@muohio.edu.

Conventions used in this document

In this document, commands that are to be typed in Linux are shown enclosed in quotes in an alternate font and should be entered as shown, but without the quotation marks. For reference, in the alternate font, the letter l (lower case L) is distinct from the number 1 to avoid confusion.

Last updated: December 9, 2011

Purpose of the Batch Scheduler

Batch jobs are used to run programs without requiring any input from the user. Batch jobs can run for hours, days, or weeks, executing a given set of commands and saving any output to a file for later review by the user.

The batch scheduler is used to control access to the compute nodes on the cluster. Users submit batch jobs that specify what resources they need (number of CPUs and amount of time) and what commands should be run. The batch scheduler determines when the requested resources will be available and schedules the job. The batch system also takes care of collecting output from the job and can notify users via email when their job starts or ends. Once a batch job starts running on a node (or nodes), the job has exclusive use of those nodes until the job is complete.

While the scheduling system is usually thought of as a non-interactive system, it can be used to gain access to a compute node for interactive use. This is described in the Interactive Batch Jobs section.

The batch scheduling system currently running on the RedHawk cluster uses a package called Torque to manage the resources on the compute nodes and run the jobs. A separate package called Moab is used to schedule jobs. Torque is the current open-source version of the PBS (Portable Batch System) package.

Writing Batch Job Scripts

A batch job script is a Linux shell script with additional comment lines that are interpreted by the batch scheduling system. The first part of the script specifies job parameters such as the job name and resource requests. The second part of the script contains the commands that the job will execute. Here is a sample batch job script:

    #!/bin/bash -l
    #PBS -N test1
    ##PBS -N test <- this PBS directive is commented out.
    #PBS -l nodes=1:ppn=1
    #PBS -l walltime=10:0:0
    #this is a comment - the commands to run follow
    cd test1
    ./test1

The first line, #!/bin/bash -l, specifies what Linux shell to use in evaluating the commands in the script.
If you don't know what this means, don't worry; just make sure all of your batch job scripts have this as the first line.

The lines beginning with # are treated as comments, but lines starting with #PBS are instructions to the batch scheduling system. Note that batch scheduling instructions can be commented out by using ##PBS. Any line that does not start with # is interpreted as a command to be executed by the batch job. In the example, the script changes into a sub-directory and executes the test1 command in that directory.

Batch Scheduling (PBS) Commands

All batch job scripts should include the following PBS commands:

#PBS -N jobname - Indicate the name of the job.
#PBS -l nodes=1:ppn=1 - Indicate the number of nodes and processors per node (ppn) requested. Allowed values for ppn are from 1 to 8.
#PBS -l walltime=1:00:00 - Indicate the requested wall clock time for the job. The format is hours:minutes:seconds; in the example, 1 hour is requested. Note that the job will not be allowed to exceed this time.

Additional PBS commands include:

#PBS -m abe - Send email when the job begins (b), ends (e), or aborts (a). Use any combination of these three letters. By default, email is sent to uniqueid@muohio.edu.
#PBS -M nobody@example.com - Specify additional addresses for notification. Use a comma to separate multiple addresses.
#PBS -j oe - Join the standard output and standard error output streams in a single file. The default behavior is to have these in separate files.
#PBS -V - Declare that all environment variables in the current environment should be passed to the job when the job is submitted.
#PBS -q queuename - Specify which queue your job should run in. If this is not present, your job will automatically be routed to the serial or parallel queue based on the number of nodes requested. See the Job Queues section for more information about the available batch queues.

Requesting Specific Nodes

The cluster contains two types of compute nodes. The original 32 nodes purchased in 2009 have 2.26 GHz Intel Xeon E5520 CPUs, while the 4 nodes purchased in 2011 have 2.4 GHz Intel Xeon E5620 CPUs.
To request a specific type of node, add :nxx (where xx is 09 for the 2.26 GHz nodes and 11 for the 2.4 GHz nodes) to the node resource request. For example, to request all processors on two of the original nodes, use #PBS -l nodes=2:ppn=8:n09.
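The directives described above can be combined into a complete job script. The following is a minimal sketch rather than an official RedHawk tool: the function name make_job_script is hypothetical, and it assumes the program to run has the same name as the job and sits in the directory the job is submitted from.

```shell
#!/bin/bash
# Hypothetical helper: print a PBS job script to standard output.
# Usage: make_job_script NAME NODES PPN WALLTIME [NODETYPE]
make_job_script() {
    local name="$1"
    local nodes="$2"
    local ppn="$3"
    local walltime="$4"
    local nodetype="${5:-}"
    local resources="nodes=${nodes}:ppn=${ppn}"

    # Append the optional node-type property (n09 or n11) when one is given.
    if [ -n "$nodetype" ]; then
        resources="${resources}:${nodetype}"
    fi

    # \$PBS_O_WORKDIR is escaped so it is expanded when the job runs,
    # not when this script is generated.
    cat <<EOF
#!/bin/bash -l
#PBS -N ${name}
#PBS -l ${resources}
#PBS -l walltime=${walltime}
#PBS -j oe
cd \$PBS_O_WORKDIR
./${name}
EOF
}
```

For example, make_job_script test1 2 8 10:00:00 n09 > test1.job would write a script requesting all processors on two of the original nodes for ten hours.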

Batch Job Commands

Any line in the batch job script that does not start with # will be treated as a command to be run on the compute node assigned to the job. The commands included in the batch job script should be the same commands you would use to run the program on the head node of the cluster, except that some programs require different parameters when run in batch (or non-interactive) mode. Consult the documentation for your particular program. As an example, if you want to have a batch job run Matlab and execute the commands in a file named commands.m, you would use the command matlab -nodisplay -nojvm -r commands.

Note that the batch job is executed as a new process and does not inherit any information from the process it was submitted from (unless the #PBS -V option is used). You should include commands to load any needed software modules (for example, module load matlab). You will also need to include commands to change into the directory where your data or command files are located. To help with this, when the job starts, the Linux environment variable $PBS_O_WORKDIR is set to the directory where the job was submitted from, so you can include the command cd $PBS_O_WORKDIR in your script to navigate to this directory.

Submitting Batch Jobs

To submit a batch job contained in a script named test.job, execute the command qsub test.job. You can override any of the PBS instructions in the script file when you submit the job. For example, to give the job a different name, execute the command qsub -N newname test.job. See man qsub for more information about the available PBS commands.

When you submit a batch job, a job identifier is returned:

    $ qsub test.job
    torque.hpc.muohio.edu
    $

This identifier can be used to get information about the status of the job, and the numeric portion will be used in naming the standard output and standard error files created by the job.
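Because several commands take only the numeric portion of the job identifier, it can be convenient to extract it in a script. The sketch below assumes the identifier has the form number.hostname; the function name job_number and the job number 12345 are illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: print the numeric portion of a job identifier.
# Assumes the identifier looks like NUMBER.HOSTNAME.
job_number() {
    # Remove everything from the first dot onward.
    printf '%s\n' "${1%%.*}"
}

# Possible use on the cluster (commented out because it needs qsub):
#   jobid=$(qsub test.job)
#   qstat "$(job_number "$jobid")"
```

Calling job_number 12345.torque.hpc.muohio.edu prints 12345.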
Batch Job Status

To see the current status of your batch job, use the qstat command along with the numeric portion of the job identifier:

    $ qstat
    Job id     Name     User        Time Use  S  Queue
    torque     first    woodsdm2    00:00:00  R  serial
    $

The column labeled Time Use shows the CPU time that has been used by a job. For parallel jobs, this will be the sum across all assigned CPUs. The column labeled S shows the current job status. Common status values are: R = running, Q = queued (waiting to run), and E = exiting. More detailed information about a job can be obtained using qstat -f jobid.

Other useful commands for getting information about jobs or the batch scheduling system include:

qstat to see all jobs.
qstat -u username to see all jobs for a specific user.
qstat -n jobid to see a brief summary of a job, including the node(s) it is running on.
pbsnodes -a to see an overview of all nodes in the cluster.
showstart to see when the batch system thinks a queued job will start. This is an estimate based on the resource requests of queued and running jobs.
showbf to see currently available resources. This command will return information like 5 procs available with no timelimit or 6 procs available for 6:20:00. The second example shows that a job requesting a single CPU for fewer than 6 hours and 20 minutes will run immediately.
qdel jobid to delete a job. You can only delete your own jobs.

Job Queues

A number of different job queues are defined on the RedHawk system. The main queue that users should use is the batch queue, which routes jobs to the serial or parallel queue based on the number of nodes requested. Several queues, such as stata, paup, comsol, and sas, are set up for software packages where only a limited number of licenses are available. If you have questions about which batch queue to use, or feel that none of the standard queues meet your needs, please contact the Research Computing Support group at rescomp@muohio.edu.

Several tools are available to view job queue definitions on the RedHawk cluster. These tools can be used to see maximum walltime limits for a queue.
These commands are:

qstat -q to see an overview of the batch queue definitions.
qstat -Q to see an overview of the batch queue status.
qstat -Qf queue-name to see the detailed definition of a specific queue.
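The status letters in the S column of a qstat listing, described under Batch Job Status, can be summarized with a short script. This is a sketch under two assumptions: the listing begins with two header lines (column names and a separator), and the status letter is the fifth whitespace-separated field. The function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: translate a status letter into a description.
describe_state() {
    case "$1" in
        R) echo "running" ;;
        Q) echo "queued (waiting to run)" ;;
        E) echo "exiting" ;;
        *) echo "other ($1)" ;;
    esac
}

# Hypothetical helper: read a qstat-style listing on standard input and
# print one line per status letter with the number of jobs in that state.
count_by_state() {
    # Skip the two assumed header lines, then tally field 5 (the S column).
    awk 'NR > 2 && NF >= 6 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Possible use on the cluster (commented out because it needs qstat):
#   qstat -u "$USER" | count_by_state
```

Since the exact layout of the listing can vary with the qstat options used, check the field position against real output before relying on the counts.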

Job Log Files

All processes running on the cluster produce two output streams: one for standard output and a second for error messages. During interactive use, both of these output streams are displayed on the terminal. For batch jobs, the batch system captures these output streams and writes them to files. The names of these files are built using the job name specified by the #PBS -N jobname directive and the numeric job identifier. For example, if a job is named test and has a job identifier of 12345, the standard output will be written to a file named test.o12345 and error output will be written to test.e12345. The files will be located in the directory the job was submitted from. If the #PBS -j oe directive is used, only one of these files will be written, but it will contain both output streams.

The qpeek command can be used to view log files for running jobs. The command takes the numeric job identifier as an argument, so the command qpeek jobid would show the current contents of the standard output log. The qpeek command has additional options to view the error log, only the beginning or end of the file, etc. Details of these options can be found with the qpeek -help command.

Interactive Batch Jobs

To obtain interactive access to a compute node, execute the command qsub -IV. Once a node is allocated, your prompt will change to indicate that you are working on a compute node. This command will use the default resource requests of 1 node and 1 hour of CPU time. Additional resources can be requested using the same -l resource request options that are used in batch job scripts, for example qsub -IV -l walltime=2:00:00. Alternately, the resource requests can be placed in a batch job script and submitted. For example, the command qsub -IV test.job will start an interactive process using all of the PBS scheduling commands in the test.job file, but will not execute any Linux commands in the file.
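The log file naming rule described under Job Log Files can be written as a small function, which is handy when a script needs to locate the output of a finished job. This is only a sketch of the naming pattern given in this document; the function name log_file_names is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the standard output and standard error
# log file names for a job, one per line.
# Usage: log_file_names JOBNAME JOBNUMBER
log_file_names() {
    local name="$1"
    local number="$2"
    printf '%s.o%s\n' "$name" "$number"
    printf '%s.e%s\n' "$name" "$number"
}
```

For a job named test with identifier 12345, log_file_names test 12345 prints test.o12345 followed by test.e12345. When the #PBS -j oe directive is used, only one of these files is written.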

PBS Tutorial. Fangrui Ma Universit of Nebraska-Lincoln. October 26th, 2007

PBS Tutorial. Fangrui Ma Universit of Nebraska-Lincoln. October 26th, 2007 PBS Tutorial Fangrui Ma Universit of Nebraska-Lincoln October 26th, 2007 Abstract In this tutorial we gave a brief introduction to using PBS Pro. We gave examples on how to write control script, and submit

More information

Linux für bwgrid. Sabine Richling, Heinz Kredel. Universitätsrechenzentrum Heidelberg Rechenzentrum Universität Mannheim. 27.

Linux für bwgrid. Sabine Richling, Heinz Kredel. Universitätsrechenzentrum Heidelberg Rechenzentrum Universität Mannheim. 27. Linux für bwgrid Sabine Richling, Heinz Kredel Universitätsrechenzentrum Heidelberg Rechenzentrum Universität Mannheim 27. June 2011 Richling/Kredel (URZ/RUM) Linux für bwgrid FS 2011 1 / 33 Introduction

More information

Hodor and Bran - Job Scheduling and PBS Scripts

Hodor and Bran - Job Scheduling and PBS Scripts Hodor and Bran - Job Scheduling and PBS Scripts UND Computational Research Center Now that you have your program compiled and your input file ready for processing, it s time to run your job on the cluster.

More information

Ra - Batch Scripts. Timothy H. Kaiser, Ph.D. tkaiser@mines.edu

Ra - Batch Scripts. Timothy H. Kaiser, Ph.D. tkaiser@mines.edu Ra - Batch Scripts Timothy H. Kaiser, Ph.D. tkaiser@mines.edu Jobs on Ra are Run via a Batch System Ra is a shared resource Purpose: Give fair access to all users Have control over where jobs are run Set

More information

Quick Tutorial for Portable Batch System (PBS)

Quick Tutorial for Portable Batch System (PBS) Quick Tutorial for Portable Batch System (PBS) The Portable Batch System (PBS) system is designed to manage the distribution of batch jobs and interactive sessions across the available nodes in the cluster.

More information

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education www.accre.vanderbilt.

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education www.accre.vanderbilt. SLURM: Resource Management and Job Scheduling Software Advanced Computing Center for Research and Education www.accre.vanderbilt.edu Simple Linux Utility for Resource Management But it s also a job scheduler!

More information

Job Scheduling with Moab Cluster Suite

Job Scheduling with Moab Cluster Suite Job Scheduling with Moab Cluster Suite IBM High Performance Computing February 2010 Y. Joanna Wong, Ph.D. yjw@us.ibm.com 2/22/2010 Workload Manager Torque Source: Adaptive Computing 2 Some terminology..

More information

Grid Engine Basics. Table of Contents. Grid Engine Basics Version 1. (Formerly: Sun Grid Engine)

Grid Engine Basics. Table of Contents. Grid Engine Basics Version 1. (Formerly: Sun Grid Engine) Grid Engine Basics (Formerly: Sun Grid Engine) Table of Contents Table of Contents Document Text Style Associations Prerequisites Terminology What is the Grid Engine (SGE)? Loading the SGE Module on Turing

More information

Job scheduler details

Job scheduler details Job scheduler details Advanced Computing Center for Research & Education (ACCRE) Job scheduler details 1 / 25 Outline 1 Batch queue system overview 2 Torque and Moab 3 Submitting jobs (ACCRE) Job scheduler

More information

Introduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research

Introduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research Introduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St

More information

NYUAD HPC Center Running Jobs

NYUAD HPC Center Running Jobs NYUAD HPC Center Running Jobs 1 Overview... Error! Bookmark not defined. 1.1 General List... Error! Bookmark not defined. 1.2 Compilers... Error! Bookmark not defined. 2 Loading Software... Error! Bookmark

More information

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education www.accre.vanderbilt.

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education www.accre.vanderbilt. SLURM: Resource Management and Job Scheduling Software Advanced Computing Center for Research and Education www.accre.vanderbilt.edu Simple Linux Utility for Resource Management But it s also a job scheduler!

More information

Tutorial: Using WestGrid. Drew Leske Compute Canada/WestGrid Site Lead University of Victoria

Tutorial: Using WestGrid. Drew Leske Compute Canada/WestGrid Site Lead University of Victoria Tutorial: Using WestGrid Drew Leske Compute Canada/WestGrid Site Lead University of Victoria Fall 2013 Seminar Series Date Speaker Topic 23 September Lindsay Sill Introduction to WestGrid 9 October Drew

More information

Juropa. Batch Usage Introduction. May 2014 Chrysovalantis Paschoulas c.paschoulas@fz-juelich.de

Juropa. Batch Usage Introduction. May 2014 Chrysovalantis Paschoulas c.paschoulas@fz-juelich.de Juropa Batch Usage Introduction May 2014 Chrysovalantis Paschoulas c.paschoulas@fz-juelich.de Batch System Usage Model A Batch System: monitors and controls the resources on the system manages and schedules

More information

NEC HPC-Linux-Cluster

NEC HPC-Linux-Cluster NEC HPC-Linux-Cluster Hardware configuration: 4 Front-end servers: each with SandyBridge-EP processors: 16 cores per node 128 GB memory 134 compute nodes: 112 nodes with SandyBridge-EP processors (16 cores

More information

Using the Yale HPC Clusters

Using the Yale HPC Clusters Using the Yale HPC Clusters Stephen Weston Robert Bjornson Yale Center for Research Computing Yale University Oct 2015 To get help Send an email to: hpc@yale.edu Read documentation at: http://research.computing.yale.edu/hpc-support

More information

Batch Scripts for RA & Mio

Batch Scripts for RA & Mio Batch Scripts for RA & Mio Timothy H. Kaiser, Ph.D. tkaiser@mines.edu 1 Jobs are Run via a Batch System Ra and Mio are shared resources Purpose: Give fair access to all users Have control over where jobs

More information

Resource Management and Job Scheduling

Resource Management and Job Scheduling Resource Management and Job Scheduling Jenett Tillotson Senior Cluster System Administrator Indiana University May 18 18-22 May 2015 1 Resource Managers Keep track of resources Nodes: CPUs, disk, memory,

More information

Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015

Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015 Work Environment David Tur HPC Expert HPC Users Training September, 18th 2015 1. Atlas Cluster: Accessing and using resources 2. Software Overview 3. Job Scheduler 1. Accessing Resources DIPC technicians

More information

Streamline Computing Linux Cluster User Training. ( Nottingham University)

Streamline Computing Linux Cluster User Training. ( Nottingham University) 1 Streamline Computing Linux Cluster User Training ( Nottingham University) 3 User Training Agenda System Overview System Access Description of Cluster Environment Code Development Job Schedulers Running

More information

Martinos Center Compute Clusters

Martinos Center Compute Clusters Intro What are the compute clusters How to gain access Housekeeping Usage Log In Submitting Jobs Queues Request CPUs/vmem Email Status I/O Interactive Dependencies Daisy Chain Wrapper Script In Progress

More information

Running applications on the Cray XC30 4/12/2015

Running applications on the Cray XC30 4/12/2015 Running applications on the Cray XC30 4/12/2015 1 Running on compute nodes By default, users do not log in and run applications on the compute nodes directly. Instead they launch jobs on compute nodes

More information

SGE Roll: Users Guide. Version @VERSION@ Edition

SGE Roll: Users Guide. Version @VERSION@ Edition SGE Roll: Users Guide Version @VERSION@ Edition SGE Roll: Users Guide : Version @VERSION@ Edition Published Aug 2006 Copyright 2006 UC Regents, Scalable Systems Table of Contents Preface...i 1. Requirements...1

More information

Grid Engine Users Guide. 2011.11p1 Edition

Grid Engine Users Guide. 2011.11p1 Edition Grid Engine Users Guide 2011.11p1 Edition Grid Engine Users Guide : 2011.11p1 Edition Published Nov 01 2012 Copyright 2012 University of California and Scalable Systems This document is subject to the

More information

Using Parallel Computing to Run Multiple Jobs

Using Parallel Computing to Run Multiple Jobs Beowulf Training Using Parallel Computing to Run Multiple Jobs Jeff Linderoth August 5, 2003 August 5, 2003 Beowulf Training Running Multiple Jobs Slide 1 Outline Introduction to Scheduling Software The

More information

Using WestGrid. Patrick Mann, Manager, Technical Operations Jan.15, 2014

Using WestGrid. Patrick Mann, Manager, Technical Operations Jan.15, 2014 Using WestGrid Patrick Mann, Manager, Technical Operations Jan.15, 2014 Winter 2014 Seminar Series Date Speaker Topic 5 February Gino DiLabio Molecular Modelling Using HPC and Gaussian 26 February Jonathan

More information

Getting Started with HPC

Getting Started with HPC Getting Started with HPC An Introduction to the Minerva High Performance Computing Resource 17 Sep 2013 Outline of Topics Introduction HPC Accounts Logging onto the HPC Clusters Common Linux Commands Storage

More information

High-Performance Reservoir Risk Assessment (Jacta Cluster)

High-Performance Reservoir Risk Assessment (Jacta Cluster) High-Performance Reservoir Risk Assessment (Jacta Cluster) SKUA-GOCAD 2013.1 Paradigm 2011.3 With Epos 4.1 Data Management Configuration Guide 2008 2013 Paradigm Ltd. or its affiliates and subsidiaries.

More information

Beyond Windows: Using the Linux Servers and the Grid

Beyond Windows: Using the Linux Servers and the Grid Beyond Windows: Using the Linux Servers and the Grid Topics Linux Overview How to Login & Remote Access Passwords Staying Up-To-Date Network Drives Server List The Grid Useful Commands Linux Overview Linux

More information

Introduction to Sun Grid Engine (SGE)

Introduction to Sun Grid Engine (SGE) Introduction to Sun Grid Engine (SGE) What is SGE? Sun Grid Engine (SGE) is an open source community effort to facilitate the adoption of distributed computing solutions. Sponsored by Sun Microsystems

More information

The CNMS Computer Cluster

The CNMS Computer Cluster The CNMS Computer Cluster This page describes the CNMS Computational Cluster, how to access it, and how to use it. Introduction (2014) The latest block of the CNMS Cluster (2010) Previous blocks of the

More information

Using the Yale HPC Clusters

Using the Yale HPC Clusters Using the Yale HPC Clusters Stephen Weston Robert Bjornson Yale Center for Research Computing Yale University Dec 2015 To get help Send an email to: hpc@yale.edu Read documentation at: http://research.computing.yale.edu/hpc-support

More information

Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC

Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC Goals of the session Overview of parallel MATLAB Why parallel MATLAB? Multiprocessing in MATLAB Parallel MATLAB using the Parallel Computing

More information

SLURM Workload Manager

SLURM Workload Manager SLURM Workload Manager What is SLURM? SLURM (Simple Linux Utility for Resource Management) is the native scheduler software that runs on ASTI's HPC cluster. Free and open-source job scheduler for the Linux

More information

An Introduction to High Performance Computing in the Department

An Introduction to High Performance Computing in the Department An Introduction to High Performance Computing in the Department Ashley Ford & Chris Jewell Department of Statistics University of Warwick October 30, 2012 1 Some Background 2 How is Buster used? 3 Software

More information

Installing and running COMSOL on a Linux cluster

Installing and running COMSOL on a Linux cluster Installing and running COMSOL on a Linux cluster Introduction This quick guide explains how to install and operate COMSOL Multiphysics 5.0 on a Linux cluster. It is a complement to the COMSOL Installation

More information

Maxwell compute cluster

Maxwell compute cluster Maxwell compute cluster An introduction to the Maxwell compute cluster Part 1 1.1 Opening PuTTY and getting the course materials on to Maxwell 1.1.1 On the desktop, double click on the shortcut icon for

More information

Batch Systems. provide a mechanism for submitting, launching, and tracking jobs on a shared resource

Batch Systems. provide a mechanism for submitting, launching, and tracking jobs on a shared resource PBS INTERNALS PBS & TORQUE PBS (Portable Batch System)-software system for managing system resources on workstations, SMP systems, MPPs and vector computers. It was based on Network Queuing System (NQS)

More information

Using NeSI HPC Resources. NeSI Computational Science Team (support@nesi.org.nz)

Using NeSI HPC Resources. NeSI Computational Science Team (support@nesi.org.nz) NeSI Computational Science Team (support@nesi.org.nz) Outline 1 About Us About NeSI Our Facilities 2 Using the Cluster Suitable Work What to expect Parallel speedup Data Getting to the Login Node 3 Submitting

More information

The Moab Scheduler. Dan Mazur, McGill HPC daniel.mazur@mcgill.ca Aug 23, 2013

The Moab Scheduler. Dan Mazur, McGill HPC daniel.mazur@mcgill.ca Aug 23, 2013 The Moab Scheduler Dan Mazur, McGill HPC daniel.mazur@mcgill.ca Aug 23, 2013 1 Outline Fair Resource Sharing Fairness Priority Maximizing resource usage MAXPS fairness policy Minimizing queue times Should

More information

Job Scheduling Explained More than you ever want to know about how jobs get scheduled on WestGrid systems...

Job Scheduling Explained More than you ever want to know about how jobs get scheduled on WestGrid systems... Job Scheduling Explained More than you ever want to know about how jobs get scheduled on WestGrid systems... Martin Siegert, SFU Cluster Myths There are so many jobs in the queue - it will take ages until

More information

HPC at IU Overview. Abhinav Thota Research Technologies Indiana University

HPC at IU Overview. Abhinav Thota Research Technologies Indiana University HPC at IU Overview Abhinav Thota Research Technologies Indiana University What is HPC/cyberinfrastructure? Why should you care? Data sizes are growing Need to get to the solution faster Compute power is

More information

Cluster@WU User s Manual

Cluster@WU User s Manual Cluster@WU User s Manual Stefan Theußl Martin Pacala September 29, 2014 1 Introduction and scope At the WU Wirtschaftsuniversität Wien the Research Institute for Computational Methods (Forschungsinstitut

More information

Installing and Using No Machine to connect to the Redhawk Cluster. Mac version

Installing and Using No Machine to connect to the Redhawk Cluster. Mac version Installing and Using No Machine to connect to the Redhawk Cluster Mac version No Machine (also called NX) is a tool that can be used to connect to Miami s Redhawk cluster when a graphical interface is

More information

Running on Blue Gene/Q at Argonne Leadership Computing Facility (ALCF)

Running on Blue Gene/Q at Argonne Leadership Computing Facility (ALCF) Running on Blue Gene/Q at Argonne Leadership Computing Facility (ALCF) ALCF Resources: Machines & Storage Mira (Production) IBM Blue Gene/Q 49,152 nodes / 786,432 cores 768 TB of memory Peak flop rate:

More information

Parallel Processing using the LOTUS cluster

Parallel Processing using the LOTUS cluster Parallel Processing using the LOTUS cluster Alison Pamment / Cristina del Cano Novales JASMIN/CEMS Workshop February 2015 Overview Parallelising data analysis LOTUS HPC Cluster Job submission on LOTUS

More information

Introduction to HPC Workshop. Center for e-research (eresearch@nesi.org.nz)

Introduction to HPC Workshop. Center for e-research (eresearch@nesi.org.nz) Center for e-research (eresearch@nesi.org.nz) Outline 1 About Us About CER and NeSI The CS Team Our Facilities 2 Key Concepts What is a Cluster Parallel Programming Shared Memory Distributed Memory 3 Using

More information

MFCF Grad Session 2015

MFCF Grad Session 2015 MFCF Grad Session 2015 Agenda Introduction Help Centre and requests Dept. Grad reps Linux clusters using R with MPI Remote applications Future computing direction Technical question and answer period MFCF

More information

Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises

Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises Pierre-Yves Taunay Research Computing and Cyberinfrastructure 224A Computer Building The Pennsylvania State University University

More information

Parallel Debugging with DDT

Parallel Debugging with DDT Parallel Debugging with DDT Nate Woody 3/10/2009 www.cac.cornell.edu 1 Debugging Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece

More information

Introduction to the SGE/OGS batch-queuing system

Introduction to the SGE/OGS batch-queuing system Grid Computing Competence Center Introduction to the SGE/OGS batch-queuing system Riccardo Murri Grid Computing Competence Center, Organisch-Chemisches Institut, University of Zurich Oct. 6, 2011 The basic

More information

New High-performance computing cluster: PAULI. Sascha Frick Institute for Physical Chemistry

New High-performance computing cluster: PAULI. Sascha Frick Institute for Physical Chemistry New High-performance computing cluster: PAULI Sascha Frick Institute for Physical Chemistry 02/05/2012 Sascha Frick (PHC) HPC cluster pauli 02/05/2012 1 / 24 Outline 1 About this seminar 2 New Hardware

More information





