Introduction to the SGE/OGS batch-queuing system


1 Grid Computing Competence Center Introduction to the SGE/OGS batch-queuing system Riccardo Murri Grid Computing Competence Center, Organisch-Chemisches Institut, University of Zurich Oct. 6, 2011

2 The basic problem
Process a large set of data. Assumptions:
1. It cannot be done on a single computer because of space or time constraints.
2. The data can be subdivided into files, each of which can be processed independently.
3. (Processing each file may comprise several steps.)
4. (Accessing the files over a network has acceptable overhead.)
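A minimal local-execution sketch of this pattern, assuming the rank-int.i386 program and the M*.sms matrix files used later in this lab are in the current directory:

    # process every matrix file sequentially; keep each file's output and errors separate
    for f in M*.sms; do
        ./rank-int.i386 "$f" > "$f.out" 2> "$f.err"
    done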

3 Today's lab session
Two approaches:
- local execution of programs (e.g., on your laptop)
- batched execution of programs (on a cluster)
The goal of these initial lab sessions is to show what the difference is in practice, and what tools are available in each case. These slides are available for download from:

4 Login to the cluster ocikbpra.uzh.ch
Log in to the cluster (substitute your own account name):
    ssh <your-username>@ocikbpra.uzh.ch
You should be greeted by a shell prompt ending in ]$
Gather the sample application and test files into a directory lab2:
    mkdir lab2
    cp -av murri/lsci/rank-int.i386 lab2/
    cp -av murri/lsci/M0,6*.sms lab2/
    cd lab2

5 The cluster ocikbpra.uzh.ch
(Diagram: the front-end node ocikbpra.uzh.ch is reached via ssh over the internet. It exports the /home and /share/apps filesystems over a local 1 Gb/s Ethernet network to the compute nodes compute-0-0.local, compute-0-1.local, ..., compute-0-27.local, each of which also has a local scratch filesystem mounted at /state/partition1.)

6 Recap from Lab Session 1
Process control features offered by the GNU/Linux shell:
- run background processes with the & operator
- monitor process status with the ps command
- send signals to running processes with the kill command
Lab Session 1 slides are available for download from:
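Put together, the three features look like this on the command line (the sleep command stands in for a real computation, and <PID> is whatever process ID ps reports):

    $ sleep 600 &     # start a long-running command in the background
    $ ps              # list your processes and note the PID of the sleep command
    $ kill <PID>      # send the default TERM signal to that process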

7 Timing command execution, I
The command /usr/bin/time reports on the time spent by the system executing a command. A typical report includes:
- user time: CPU time spent executing user-level code.
- system time: CPU time spent executing kernel-level code.
- real/elapsed time: time from the start to the end of the program (as would be measured by an external clock).
Quiz: can the CPU time be greater than the real/elapsed time?

8 Timing command execution, II
Exercises:
1. Using man time, figure out how to determine the CPU and real time spent running the command rank-int.i386 M0,6-D5.sms.
2. Can time also report on memory usage? If so, how much memory does the above command use?

9 Timing command execution, III
    $ /usr/bin/time ./rank-int.i386 M0,6-D5.sms
    ./rank-int.i386 file:M0,6-D5.sms rows:3024 cols:49800 nonzero:...
    0.10user 0.04system 0:00.18elapsed 80%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+1971minor)pagefaults 0swaps

10 Timing command execution, III: the first line of the example above is the command line to run.

11 Timing command execution, III: the second line is the command output.

12 Timing command execution, III: the figures 0.10user 0.04system 0:00.18elapsed 80%CPU are the timing information.

13 Timing command execution, III: 0maxresident)k is the memory information (maximum resident set size).

14 Timing command execution, III: 0inputs+0outputs (0major+1971minor)pagefaults 0swaps are the I/O and paging information.

15 Resource limits, I Why impose limits on the utilization of system resources? What system resources would you want to limit in our case?

16 Resource limits, II
The command ulimit allows setting resource usage limits:
    $ ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    [...]
    file size               (blocks, -f) unlimited
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    [...]
    stack size              (kbytes, -s) ...
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) ...
    virtual memory          (kbytes, -v) unlimited
    [...]

17 Resource limits, III
Warning: the ulimit command is a shell built-in; it takes immediate effect on all subsequent commands. To restrict its scope to a single command, enclose the command and ulimit in parentheses:
    $ (ulimit -t 15; ./rank-int.i386 M0,6-D8.sms)
(Parentheses force the enclosed commands to be executed in a sub-shell.)
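A small extension of the same idea, assuming you also want to know whether the program was cut short: check the exit status of the sub-shell right after it finishes (a non-zero value typically indicates termination by a signal such as SIGXCPU):

    $ (ulimit -t 15; ./rank-int.i386 M0,6-D8.sms)
    $ echo $?     # 0 means the program completed; non-zero usually means it was killed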

18 Resource limits, IV
Exercises:
1. What does the following command do?
       $ (ulimit -t 15; ./rank-int.i386 M0,6-D8.sms)
   What happens if you leave out the ulimit part?
2. What are the options given by ulimit for limiting memory?
3. What should happen if you run the following command? What really happens?
       $ (ulimit -m ...; ./rank-int.i386 M0,6-D11.sms)
4. What should happen if you run the following command? What really happens?
       $ (ulimit -v ...; ./rank-int.i386 M0,6-D11.sms)

19 SGE/OGS
Sun Grid Engine (SGE) is a batch-queuing system originally produced by Sun Microsystems and made open-source in 2001. After the acquisition of Sun by Oracle, the product forked:
- Open Grid Scheduler (OGS), the open-source version;
- Univa Grid Engine, a commercial-only version, developed by the core SGE engineering team from Sun.
SGE is used on the UZH main HPC cluster Schroedinger.

20 SGE architecture, I
sge_qmaster
- runs on the master node ocikbpra.uzh.ch
- accepts client requests (job submission, job/host state inspection)
- schedules jobs on compute nodes (formerly a separate sge_schedd process)
Client programs: qhost, qsub, qstat
- run by the user on a submit node
- clients for sge_qmaster
- the master daemon has a list of authorized submit nodes

21 SGE architecture, II
sge_execd
- runs on every compute node
- accepts job start requests from sge_qmaster
- monitors node status (load average, free memory, etc.) and reports back to sge_qmaster
sge_shepherd
- spawned by sge_execd when starting a job
- monitors the execution of a single job

22 Job submission, I
The qsub command is used to submit a job to the batch system. The job consists of a shell script and its (optional) arguments. Example:
    qsub myscript.sh
If any arguments are given after the script name, they will be available to the script as $1, $2, etc.:
    # in myscript.sh, $1="hello" and $2="world"
    qsub myscript.sh hello world
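A minimal job script along these lines might look as follows (a sketch, not the exact script used in the lab):

    #!/bin/sh
    # myscript.sh -- echo the arguments passed on the qsub command line
    echo "first argument:  $1"
    echo "second argument: $2"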

23 Job submission, II
Upon successful submission, qsub prints a job ID to standard output:
    $ qsub -cwd myscript.sh
    Your job 76104 ("myscript.sh") has been submitted
This job ID must be used with all SGE commands that operate on jobs.
As soon as the job starts, two files are created, containing the script's standard output (jobname.o<jobid>) and standard error (jobname.e<jobid>):
    $ ls -l myscript.sh*
    -rwxrwxr-x 1 murri murri 30 Oct  6 14:23 myscript.sh
    -rw-r--r-- 1 murri murri  0 Oct  6 14:24 myscript.sh.e76104
    -rw-r--r-- 1 murri murri 14 Oct  6 14:24 myscript.sh.o76104

24 Commonly used options for qsub
-cwd  Execute the job in the current directory; if not given, the job script is run in the home directory.
-o    Path of the file where standard output will be stored.
-e    Path of the file where standard error will be stored.
-j    If -j y is given, merge standard error into standard output (as if they were both sent to the screen).
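For example, the options can be combined in a single submission (the file names here are illustrative):

    # run rank1.sh in the current directory, merging stderr into stdout,
    # and write everything to rank1.out
    $ qsub -cwd -j y -o rank1.out rank1.sh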

25 Monitoring jobs
The qstat command is used to monitor jobs submitted to the SGE system. Example:
    $ qstat
    job-ID  prior  name        user       state  submit/start at       queue
    ...     ...    mod_run     danielyli  dt     10/06/2011 ..:38:45   all.q@compute-0-13.local
    ...     ...    myscript.s  murri      r      10/06/2011 ..:40:35   all.q@compute-0-20.local
The state column is a combination of the following codes (see man qstat for a complete list):
    r   Job is running
    qw  Job is waiting in the queue
    qh  Job is being held back in the queue
    E   An error has occurred
    d   Job has been deleted by the user
    t   Job is being transferred to a compute node
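Two common refinements when monitoring, both using standard tools (qstat's -u option and the watch utility):

    $ qstat -u $USER       # show only your own jobs
    $ watch -n 10 qstat    # re-run qstat every 10 seconds; press Ctrl-C to stop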

26 Job submission, III
Exercises:
1. Write a script rank1.sh to run the command ./rank-int.i386 M0,6-D5.sms, then submit it with qsub. Does this job appear in the qstat output? Compare the output with what you would get when running locally: is there any significant change?
2. Write a script rank2.sh to run the command ./rank-int.i386 M0,6-D11.sms, then submit it. Does this job appear in the qstat output? When do the standard output and standard error files appear? What is their initial content?
3. How can you determine the amount of resources (CPU time, wall-clock time, etc.) used by a job?

27 Job resource utilization, I
The qstat -j <jobid> command reports information on a job while it is running. Example:
    $ qstat -j 76106
    ==============================================================
    job_number:        76106
    exec_file:         job_scripts/76106
    submission_time:   Thu Oct  6 14:51:.. 2011
    owner:             murri
    [...]
    cwd:               /home/murri/lsci
    [...]
    script_file:       myscript.sh
    usage    1:        cpu=00:01:30, mem=... GBs, io=..., vmem=...m, maxvmem=...
    scheduling info:   queue instance "all.q@compute-0-3.local" dropped because it is t...
    [...]
The usage line contains the current resource utilization.

28 Job resource utilization, II
The qacct command reports all information on a job, but only after it has completed. Example:
    $ qacct -j 76106
    ==============================================================
    qname         all.q
    hostname      compute-0-27.local
    group         murri
    [...]
    jobname       myscript.sh
    jobnumber     76106
    taskid        undefined
    [...]
    qsub_time     Thu Oct  6 14:51:.. 2011
    start_time    Thu Oct  6 14:51:.. 2011
    end_time      Thu Oct  6 14:54:.. 2011
    [...]
    exit_status   0
    ru_wallclock  159
    ru_utime      ...
    ru_stime      ...
    [...]
    cpu           ...
    mem           ...
    [...]
    maxvmem       ...M
    [...]
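Since qacct prints many fields, it can be handy to pick out just the interesting ones; a sketch using grep (replace 76106 with your own job ID):

    $ qacct -j 76106 | grep -E '^(exit_status|ru_wallclock|cpu|mem|maxvmem)'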

29 Resource utilization, I
The -l option to qsub allows specifying what resources will be needed by a job. The most common resource requirements are:
    s_rt      Total job runtime (wall-clock time), in seconds
    s_cpu     Total job CPU time, in seconds
    mem_free  Request at least this much free RAM; use the m or g suffix for MB or GB
    s_mem     Upper limit on RAM usage; use the m or g suffix
    s_vmem    Upper limit on virtual memory usage; use the m or g suffix
Example:
    # run the job with a time limit of 20 seconds
    $ qsub -l s_rt=20 myscript.sh
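Resource requests (and any other qsub option) can also be embedded in the job script itself, on lines starting with the #$ prefix, so they need not be retyped at every submission; a sketch with illustrative limits:

    #!/bin/sh
    # illustrative limits: 20 seconds of wall-clock time, 100 MB of virtual memory
    #$ -cwd
    #$ -l s_rt=20
    #$ -l s_vmem=100m
    ./rank-int.i386 M0,6-D5.sms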

30 Resource utilization, II
Exercises:
1. Is the following job limited to 20 seconds of runtime?
       $ qsub -l s_rt=20 rank2.sh
   What do you find in the job's stdout and stderr files? Compare with what happens in the ulimit case. What happens if you replace s_rt with s_cpu?
2. Run the same job, putting a 10 MB limit on mem_free, then s_rss, s_mem, and finally s_vmem. Compare the actual resource utilization (via qacct) with the requirement. In which cases does the job terminate correctly? What is the resource utilization in those cases?
3. Compile a table with runtime, CPU time, and memory utilization for each of the matrices M*.sms. Is there a correlation with the matrix file size? (A per-file submission sketch follows below.)
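Looking ahead to exercise 3 above, the "one job per input file" pattern from the start of the lab maps directly onto qsub; a sketch that assumes a hypothetical wrapper script rank.sh which runs ./rank-int.i386 on its first argument:

    # submit one batch job per matrix file; each job gets its own .o/.e files
    for f in M*.sms; do
        qsub -cwd rank.sh "$f"
    done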

31 References [1] setrlimit(2) manual page, en/man2/getrlimit.2.html
