CERM Cluster User's guide


Abstract

These pages collect some information to help CERM users make the best use of our computational resources. You can download this manual in PDF format here [


Table of Contents

1. How to get an account
2. How to login
3. Login Info and Environment Setup
4. OpenPBS/Torque Usage
   Batch Processing
   PBS Options
   Maui commands
   PBS Environment Variables
   Job Script Template
   Job script examples
   Submitting a Job
   Monitoring a Job
5. Running serial codes
6. Running MPI parallel codes
   Running interactive MPI programs
   Job Script Template

List of Tables

4.1. PBS Options
4.2. Maui commands
4.3. PBS Environment Variables
4.4. PBS Environment Variables
4.5. Commands to monitor a job

Chapter 1. How to get an account

To ask for an account, send your request to morelli AT cerm.unifi.it, specifying the project that you will use for calculations.

To request a cluster account:
- You must have an active project
- You have to be the project's owner

Chapter 2. How to login

To ensure a secure login session, users must connect to machines using the secure shell (ssh) program. Telnet is not allowed because of the security vulnerabilities associated with it. The "r" commands rlogin, rsh, and rcp are also disabled on this machine for similar reasons. These commands are replaced by the more secure alternatives included in SSH: ssh and scp.

To submit, monitor, and delete jobs, you have to log in on the cluster server named athlon. On athlon it is also possible to make backups on CD or DVD.

Important
Please note that interactive login is only allowed on the cluster server (athlon). Computing nodes are accessed and used only through the queue system.
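A minimal session from a Linux terminal could look like the following; the username and file names are placeholders, and only the short host name athlon is given here, so you may need the fully qualified name used at your site:

$ ssh your_username@athlon
$ scp results.tar.gz your_username@athlon:
$ scp your_username@athlon:data/spectra.tar.gz .

The first scp copies a local file into your home directory on athlon; the second copies a remote file back to the current local directory.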

Chapter 3. Login Info and Environment Setup

The default shell is bash. To change it, use the chsh command. At login the /etc/motd file is displayed: please take care to read it, because information about the system is usually posted there.

A basic default environment is already set up by the system login configuration files; this includes variables and paths for all the compilers and their MPI wrappers, and for the OpenPBS/Torque batch queuing system with the Maui scheduler. Check your environment with the env command. You should be careful when modifying the shell customization files (.cshrc, .profile, .login, .bashrc), since they could overwrite the default values and alter the behaviour of the compilers and of the batch queuing system.
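As a quick sanity check after logging in, you can inspect the preconfigured environment; the grep pattern below is only an illustration of what to look for, not a guaranteed naming scheme for the variables:

$ echo $SHELL
$ env | grep -i -e pbs -e lam -e mpi
$ chsh -s /bin/tcsh

echo $SHELL confirms the login shell, the env line filters out batch-system and MPI related variables, and chsh -s changes the login shell (here to tcsh, as an example) should you really need to.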

Chapter 4. OpenPBS/Torque Usage

Batch Processing

The Portable Batch System (PBS) is a workload management system for Linux clusters. It supplies commands to submit, monitor, and delete jobs. It has the following components:

Job Server - also called pbs_server; provides the basic batch services such as receiving/creating a batch job, modifying the job, protecting the job against system crashes, and running the job.

Job Executor - a daemon (pbs_mom) that actually places the job into execution when it receives a copy of the job from the Job Server. Mom creates a new session as identical to a user login session as possible and returns the job's output to the user.

Job Scheduler - a daemon that contains the site's policy controlling which job is run, and where and when it is run. PBS allows each site to create its own scheduler. We are using the Maui Scheduler. The Maui Scheduler can communicate with the various Moms to learn about the state of the system's resources, and with the Server to learn about the availability of jobs to execute.

Below are the steps needed to run production code:

1. Create a job script containing the needed PBS options: request the resources that will be needed (e.g. number of processors, wall-clock time, etc.) and use commands to prepare for execution of the executable (e.g. cd to the working directory, etc.).
2. Submit the job script file to PBS.
3. Monitor the job.

PBS Options

Below are some of the commonly used PBS options in a job script file. The options start with "#PBS".

Table 4.1. PBS Options

Option                        Description
#PBS -N myjob                 Assigns a job name. The default is the name of the PBS job script.
#PBS -l nodes=4:ppn=2         The number of nodes and processors per node. Only for parallel jobs.
#PBS -l walltime=01:00:00     The maximum wall-clock time during which this job can run.
#PBS -o mypath/my.out         The path and file name for standard output.
#PBS -e mypath/my.err         The path and file name for standard error.
#PBS -j oe                    Join option that merges the standard error stream with the standard output stream of the job.
#PBS -k oe                    Defines which output of the batch job to retain on the execution host.
#PBS -W stagein=file_list     Copies the file onto the execution host before the job starts. (*)
#PBS -W stageout=file_list    Copies the file from the execution host after the job completes. (*)
#PBS -r n                     Indicates that a job should not rerun if it fails.
#PBS -V                       Exports all environment variables to the job.

Note
(*) File staging can specify which files should be copied onto the execution host before the job starts and which files should be copied off the execution host when it completes. The file_list, regardless of the direction of copy, has the following form, where local_file is the name of the file on the system where the job executes, and remote_file is the destination name on the host specified by hostname: local_file@hostname:remote_file. For example:

stagein=my.input@frontend-0:/home/login_name/my.input
stageout=my.output@frontend-0:/home/login_name/my.output

Maui commands

There are some quite useful Maui commands:

Table 4.2. Maui commands

Command            Description
showq              Shows a detailed list of submitted jobs.
showbf             Shows the free resources (time and processors) available at the moment.
checkjob job.id    Shows a detailed description of the job job.id.
showstart job.id   Gives an estimate of the expected start time of the job job.id.

PBS Environment Variables

There are a number of predefined environment variables. These include the following:

- Variables defined on the execution host;
- Variables exported from the submission host to the execution host; and
- Variables defined by PBS.

The following environment variables relate to the submission machine:

Table 4.3. PBS Environment Variables

Variable          Description
PBS_O_HOST        The host machine on which the qsub command was run.
PBS_O_LOGNAME     The login name on the machine on which qsub was run.
PBS_O_HOME        The home directory from which qsub was run.
PBS_O_WORKDIR     The working directory from which qsub was run.

The following variables relate to the environment where the job is executing:

Table 4.4. PBS Environment Variables

Variable          Description
PBS_ENVIRONMENT   Set to PBS_BATCH for batch jobs and to PBS_INTERACTIVE for interactive jobs.
PBS_O_QUEUE       The original queue to which the job was submitted.
PBS_JOBID         The identifier that PBS assigns to the job.
PBS_JOBNAME       The name of the job.
PBS_NODEFILE      The file containing the list of nodes assigned to a parallel job.

Job Script Template

The following job script template should be modified for the needs of the job. A job script may consist of PBS directives, comments and executable statements. A PBS directive provides a way of specifying job attributes in addition to the command line options. For example:

#PBS -N Job_name
#PBS -l walltime=10:30,mem=320kb
#
step1 arg1 arg2
step2 arg3 arg4

Job script examples

Dyana/Pseudyana/Paramagneticdyana

To run programs of the dyana family, having a RUN script like:

#!/bin/bash
/prog/pseudyana << EOF
./ANNEAL
exit
EOF

you can write a job script named run, for example, with the following content:

#!/bin/bash -f
#PBS -k oe
#PBS -m n
LAUNCH="./RUN"
cd ${PBS_O_WORKDIR}
${LAUNCH}
exit

Amber8

To run amber calculations you can write the following job script (replacing all the filename occurrences with real filenames and adding other options if you need them):

#!/bin/bash -f
#PBS -k oe
#PBS -m n
#PBS -V
LAUNCH="/prog/amber8/exe/sander -O -i filename -o filename -c filename -p filename -r filename"
cd ${PBS_O_WORKDIR}
${LAUNCH}
exit

Bash script

To run bash-script-based calculations you can write the following job script (remember to change the LAUNCH entry):

#!/bin/bash -f
#PBS -k oe
#PBS -m n
LAUNCH="/home_nXX/project/bash_script"
cd ${PBS_O_WORKDIR}
${LAUNCH}
exit
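One practical detail for the bash-script case: since ${LAUNCH} is executed directly, the script it points to must have execute permission. With the placeholder path used above this would be, for example:

$ chmod +x /home_nXX/project/bash_script

Submission of the job script itself is done with qsub, as described in the next section.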

Haddock 1.3

To run Haddock 1.3 calculations you can write the following job script (the WORKDIR entry points to the directory containing the user's haddock data, remember to change it):

#!/bin/bash
#PBS -j oe
#PBS -k oe
#PBS -V
HADDOCK="/prog/haddock1.3"
HADDOCKTOOLS="$HADDOCK/tools"
PYTHONPATH=$HADDOCK
NACCESS="/prog/naccess2.1.1/naccess"
PROFIT="/prog/profit/profit"
WORKDIR="/home_nXX/project/HADDOCK/run1"
LAUNCH="python $HADDOCK/Haddock/Runhaddock.py"
cd $WORKDIR
$LAUNCH

Submitting a Job

Use the qsub command to submit the job script (in this example the name of the job script is run).

$ qsub run

PBS assigns the job a unique job identifier once it is submitted (e.g. 123.athlon). After a job has been queued, it is selected for execution based on the time it has been in the queue, its wall-clock time limit, and the number of processors requested.

Monitoring a Job

Below are commands for monitoring a job:

Table 4.5. Commands to monitor a job

Command              Description
qstat -a             Check the status of jobs, queues, and the PBS server.
qstat -f             Get all the information about a job, e.g. resources requested, resource limits, owner, source, destination, queue, etc.
canceljob job.id     Delete a job from the queue.
qhold job.id         Hold a job if it is in the queue.
qrls job.id          Release a job from hold.
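Putting it together, a typical submit-and-monitor session might look like this (123.athlon is just the illustrative job identifier mentioned above):

$ qsub run
123.athlon
$ qstat -a
$ checkjob 123
$ showstart 123
$ canceljob 123

The last command is only needed if you decide to remove the job from the queue before it completes.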

Chapter 5. Running serial codes

Important
No production runs are allowed on the master node, and any serial program executing there for more than 5 minutes is automatically deleted.

Note
Serial codes are all non-parallel programs such as dyana, pseudyana, and cyana. Execution of serial applications on the computational nodes can only be done through the queuing system, even for interactive runs.
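For interactive serial work the same queue mechanism applies: request an interactive session with qsub -I, in the same spirit as the interactive MPI example in the next chapter. A minimal sketch, with a placeholder program name and an illustrative walltime:

$ qsub -l nodes=1,walltime=1:00:00 -I
$ cd ${PBS_O_WORKDIR}
$ ./my_serial_program

The session is opened on a compute node, so the 5-minute limit on the master node does not apply.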

Chapter 6. Running MPI parallel codes

Note
To run MPI parallel programs, users have to use the LAM environment.

Running interactive MPI programs

Suppose, for instance, you want to run a test program test.x interactively on four processors; then you could use the following command:

$ qsub -l nodes=2:ppn=2,walltime=0:30:00 -I

At this point (if there are free resources) you will enter the interactive batch session, and you can run your test with:

$ lamboot -v $PBS_NODEFILE
$ cd testdir
$ mpirun -n 4 -no-shmem test.x
$ mpirun -np 4

Example of an interactive execution:

$ qsub -l nodes=2:ppn=2,walltime=0:30:00 -I
$ cd testdir
$ mpirun -n 4 test.x

Job Script Template

The following job script template should be modified for the needs of the job.

#!/bin/bash -f
#PBS -l nodes=2:ppn=2
#PBS -k oe
LAMSTART="lamboot $PBS_NODEFILE"
LAMSTOP="lamhalt $PBS_NODEFILE"
HOME="/home_n01/guest"
LAUNCH="mpirun -np 4 cpmd.x"
WORKDIR="${HOME}/cp_test"
export PP_LIBRARY_PATH=${WORKDIR}
cd ${WORKDIR}
${LAMSTART}
${LAUNCH} au_surf_job1.in > au_surf_job1.out
${LAMSTOP}
#
exit

The following job scripts should be used for GROMACS parallel calculations. The first one is for the preminimization and the second one launches the dynamics calculation.

#!/bin/bash -f
#PBS -k oe
#PBS -m n
PBS_O_WORKDIR="/home_n11/hetdyn/GROMACS/SPI_1ns_cluster"
lamboot
LAUNCH="./SPI_MINI.csh"
cd ${PBS_O_WORKDIR}
${LAUNCH}
exit

#!/bin/bash -f
#PBS -k oe
#PBS -m n
PBS_O_WORKDIR="/home_n11/hetdyn/GROMACS/SPI_1ns_cluster"
lamboot
LAUNCH="./SPI_MD_5PR_1ns.csh"
cd ${PBS_O_WORKDIR}
${LAUNCH}
exit
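Going back to the interactive example above, it can be useful to verify what the queue actually assigned before launching a long calculation; lamnodes and lamhalt are standard LAM utilities, and test.x is the hypothetical executable used earlier:

$ cat $PBS_NODEFILE
$ lamboot -v $PBS_NODEFILE
$ lamnodes
$ mpirun -np 4 test.x
$ lamhalt

cat $PBS_NODEFILE lists the nodes assigned to the job (one entry per processor), lamnodes confirms which of them LAM has booted, and lamhalt shuts LAM down cleanly before you leave the session.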
