Juropa Batch Usage Introduction
May 2014, Chrysovalantis Paschoulas
2 Batch System Usage Model

A Batch System:
- monitors and controls the resources on the system
- manages and schedules the jobs
- enforces limits and policies according to the Batch Model
- allocates the resources, sets up the environment and then runs the jobs

Juropa Cluster (3289 compute nodes):
- Juropa JSC partition: 3033 compute nodes
  - 64 nodes are dedicated to interactive jobs
  - a few nodes are dedicated to software or system tests
  - the rest of the compute nodes are used for normal batch jobs
- 256 compute nodes form a special partition
3 Batch System

On Juropa the batch system is the combination of Moab and Torque:
- Moab is the Workload Manager (the batch scheduler)
- Torque is the Resource Manager

The Moab/Torque batch system:
- manages policies, priorities and limits
- starts jobs and manages their output
- provides features for advanced reservations, backfilling etc.
- keeps job statistics and accounting
- offers user commands for job submission, job query, job canceling etc.
4 Batch System Configuration & Limits

To implement and manage the scheduling, the batch system uses the abstraction
of queues (classes), like jsc, inter_jsc etc. A class configuration defines
the allowed users, the list of nodes, maximum limits, default values etc.

Interactive jobs:
- 64 compute nodes available
- no node sharing
- max. number of nodes: 16 (default: 1)
- max. wall-clock time: 10 h (default: 30 min)
- max. running jobs: 20 per user (including batch jobs)
- accounting: (number of nodes) x (connect time)
5 Batch System Configuration & Limits

Batch jobs:
- 3033 compute nodes available
- no node sharing
- max. number of nodes per job: 1024
- max. wall-clock time: 24 h
- max. number of running jobs: 20 per user
- max. number of eligible jobs: 20 per user
- default values: 1 node, 30 min walltime, 8 tasks per node
- accounting: (number of nodes) x (wall-clock time)

Jobs requesting more nodes than these limits allow can be run on special request:
- they are not included in normal scheduling
- they will be run e.g. once per week, if needed, during non-prime time
- please contact sc@fz-juelich.de
6 Compiling Programs

To compile and execute parallel programs on Juropa, MPI wrappers for the
Intel compilers are available. These wrappers set up the MPI environment for
the compilation task. The current wrappers are: mpicc, mpicxx, mpif77, mpif90.

Some useful compiler options:
- -openmp: enables OpenMP
- -g: creates debugging information
- -L: path to libraries for the linker
- -O[0-3]: optimization levels

The wrappers (like many tools) are provided as modules, so users can easily
choose the compiler version they want:

    module load parastation/mpi2 intel

Compile example:

    mpicxx -O2 program.cpp -o mpi_program

Execute an MPI program:

    mpiexec <options> mpi_program
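Putting these steps together, a minimal compile-and-run session might look
like the following sketch (the source file name and task count are
illustrative, not from the slides):

    # Sketch: load the toolchain, compile, and run an MPI program
    module load parastation/mpi2 intel      # MPI wrappers + Intel compiler
    mpicxx -O2 program.cpp -o mpi_program   # compile with optimization
    mpiexec -np 16 ./mpi_program            # start 16 MPI tasks (inside a job)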
7 Job Submission

After the compilation, users can submit a job with the command:

    msub <options> <executable or jobscript>

For example:

    msub -l nodes=16:ppn=8 -l walltime=04:00:00 -e /lustre/jhome/zam/err -o /lustre/jhome5/zam/out myscript

Useful options of msub:

    -l nodes=<num>            # number of nodes
    -l ppn=<num>              # procs per node
    -l walltime=<hh:mm:ss>    # requested wall-clock time
    -j oe                     # combine stderr and stdout
    -M <address>              # send mail to this address
    -m eab                    # send mail on end, abort or begin
    -N <name>                 # name of the job
    -I                        # run an interactive job
    -v tpt=<num threads>      # number of OpenMP threads
    -q <queue name>           # destination queue (class)

Instead of passing all these options to msub on a single command line, users
can include all job submission options in the batch script itself.
8 Batch Script

To define the msub options in their batch scripts, users have to use #MSUB
directives. For parallel jobs the script has to include the MPI execution
command mpiexec.

Example 1: an MPI application that starts 64 MPI tasks on 8 nodes using
8 cores per node. Each MPI task runs on a single core.

    #!/bin/bash -x
    #MSUB -l nodes=8:ppn=8
    #MSUB -l walltime=04:00:00
    #MSUB -e /home/jhome3/test_user/my_error.txt
    #MSUB -o /home/jhome3/test_user/my_out.txt

    ### start of jobscript
    cd $PBS_O_WORKDIR
    echo "workdir: $PBS_O_WORKDIR"

    # NSLOTS = nodes * ppn = 8 * 8 = 64
    NSLOTS=64
    echo "running on $NSLOTS cpus"
    mpiexec -np $NSLOTS ./mpi_program
9 Batch Script

Example 2: the application is started on 32 nodes using 1 MPI task per node,
32 tasks in total; only one core is used per node.

    #!/bin/bash -x
    #MSUB -N MPI_32x1_job
    #MSUB -l nodes=32:ppn=1

    ### start of jobscript ###
    mpiexec -np 32 mpi_program >> $PBS_O_WORKDIR/out.$PBS_JOBID

Example 3: a hybrid job that uses MPI and OpenMP. The application is started
on 4 nodes; on each node 1 MPI task is created, 4 tasks in total, and each
task has 8 OpenMP threads.

    #!/bin/bash -x
    #MSUB -N hybrid_8x8_job
    #MSUB -l nodes=4:ppn=8
    #MSUB -v tpt=8

    ### start of jobscript ###
    export OMP_NUM_THREADS=8
    mpiexec -np 4 --exports=OMP_NUM_THREADS mpi_omp_program
10 Task Allocation & SMT

Task allocation:
- total number of MPI tasks = (number of nodes) x (procs per node)
- for hybrid MPI/OpenMP jobs (option tpt is used):
  total tasks = (number of nodes) x ((procs per node) / (threads per task))
- MPI tasks per node for hybrid jobs = (procs per node) / (threads per task)

SMT: the compute nodes on Juropa support SMT (Intel Xeon CPUs, Nehalem
architecture). To use SMT for their jobs, users have to set the msub option
ppn=16. For example:

    #!/bin/bash -x
    #MSUB -N SMT_hybrid_8x8_job
    #MSUB -l nodes=4:ppn=16
    #MSUB -v tpt=8

    ### start of jobscript ###
    export OMP_NUM_THREADS=8
    mpiexec -np 8 --exports=OMP_NUM_THREADS application.exe
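As a quick check of these formulas applied to the SMT example above (values
taken from the example; the snippet itself is only illustrative):

    # Illustrative arithmetic for the SMT example above
    NODES=4; PPN=16; TPT=8
    echo $(( NODES * PPN / TPT ))   # total MPI tasks: 4 * 16 / 8 = 8
    echo $(( PPN / TPT ))           # MPI tasks per node: 16 / 8 = 2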
11 More Options

Interactive jobs: to start an interactive job on Juropa, users have to submit
a job with the -I option of msub. If the resources are free, the users
automatically get access to the nodes and can start their application.

    msub -I -l nodes=2:ppn=8,walltime=00:15:00

Job dependencies & job chains: users can submit a job that depends on another:

    msub -W depend=afterok:<jobid> <jobscript>

or even submit whole job chains:

    #!/bin/bash
    NO_OF_JOBS=<no of jobs>
    JOB_SCRIPT=<jobscript>
    i=0
    JOBID=$(msub $JOB_SCRIPT 2>&1 | grep -v -e '^$' | sed -e 's/\s*//')
    while [ $i -le $NO_OF_JOBS ]; do
        JOBID=$(msub -W depend=afterok:$JOBID $JOB_SCRIPT 2>&1 | grep -v -e '^$' | sed -e 's/\s*//')
        let i=$i+1
    done
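Assuming the chain script above is saved as chain_jobs.sh with its
placeholders filled in (e.g. NO_OF_JOBS=3 and JOB_SCRIPT=my_job.sh, both
names illustrative), it could be launched like this:

    chmod +x chain_jobs.sh
    ./chain_jobs.sh   # each submitted job waits (afterok) for the previous one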
12 Batch System - Commands

msub
- submit a job; returns the job ID on success
- note: during times of heavy load Moab might run into a timeout

showq [-r|-i|-b] [-u <userid>]
- shows all, running, idle or blocked jobs of all or specified users

mjobctl -c <jobid>
- cancel a queued or running job

mjobctl -c -w USER=<userID>
- cancel all jobs of the specified user

checkjob [-v] <jobid>
- display detailed information on a specified job
13 Batch System - Commands

showstart <jobid>
- shows the estimated start time of the specified job; the estimate can
  change while jobs with higher priority get scheduled

mjobctl --help
- shows all options, e.g. how to set or release holds on jobs

showbf -c jsc
- shows available resources for immediate use

For more detailed information on the Moab commands please see:

llview
- graphical view of usage, jobs, distribution of jobs, etc.
- was developed by W. Frings, member of JSC
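A typical interaction with these commands might look like the following
sketch (the job ID shown is made up):

    msub jobscript.sh     # submit; prints a job ID, e.g. 123456
    showq -u $USER        # list my jobs and their states
    checkjob -v 123456    # detailed information on the job
    showstart 123456      # estimated start time of the job
    mjobctl -c 123456     # cancel the job if it is no longer needed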
14 Batch System - Job Scheduling

Some comments on job scheduling:
- jobs are scheduled by priority
- priority increases with the number of nodes
- priority increases while waiting (aging)
- no node sharing
- backfilling mode: please specify the requested wall time as accurately as possible
- jobs of users (groups) who ran out of CPU quota get very low priority
- please be aware that estimated start times change very often
15 Job Life-Cycle

(Diagram: login nodes, the Juropa master node running the Moab and Torque
servers, and the compute nodes: the Mother Superior, compute node 02, ...,
compute node N.)

1. Users develop their software (on the login nodes).
2. They compile it with the MPI wrappers.
3. They create the batch script.
4. They submit their jobs with msub.
5. msub uses the submission filter to put the job into the proper queue, and
   the job goes into the Moab queue.
6. When there are enough free resources, Moab converts the job script into
   PBS syntax and tells the Resource Manager to start the job on the set of
   associated nodes (calls qsub).
7. Torque starts the prologue on the compute nodes associated with the job.
   When all the required resources are available and the nodes are in healthy
   condition, it starts the jobscript on the Mother Superior.
8. The Mother Superior runs the jobscript and executes the mpiexec command,
   which communicates via the psid daemons to start the MPI tasks on all
   compute nodes. When the job is completed, the RM daemons run the epilogue
   on all nodes to clean up resources.
16 Batch System - CPU Quota & Accounting

Query the current status of the CPU quota:

    q_cpuquota <options>
    q_cpuquota -?    # shows all available options

Types of CPU quota:
- fixed: a fixed amount of CPU quota can be used during the allocation period
  (refers to small quota amounts)
- monthly: jobs are scheduled with normal priority until the current, previous
  and next monthly quota is exhausted; CPU quota not used in this time frame
  is lost

Charging mode:
- users are charged for the wall-clock time of their jobs on the set of nodes
- 3 states of contingent: normal, low-cont, no-cont (monthly quotas)
- CPU quotas are defined per group
- all members of a group are informed by mail if the group runs out of CPU
  quota or if new quota is assigned
- jobs get low priority (<0) and a reduced wall-time limit (6 hours) when the
  CPU quota is used up; they will only run if no other waiting jobs fit into
  the free resources
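For illustration, the charging rule above works out as follows for a made-up
job (the node count and runtime are invented):

    # Illustrative only: charge = (number of nodes) x (wall-clock time)
    NODES=64; HOURS=10
    echo "$(( NODES * HOURS )) node-hours charged"   # 64 * 10 = 640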
17 Further Information

- Regular preventive maintenance every second Thursday; see the message of
  the day at login.
- Get recent status updates by subscribing to the system high-messages, as
  described at the bottom of this page:
- Juropa on-line documentation:
- User support at FZJ, Phone: