The Asterope compute cluster

ÅA has a small cluster named asterope.abo.fi with 8 compute nodes. Each node has:
- 2 Intel Xeon X5650 processors (6-core) with a total of 24 GB of RAM
- 2 NVIDIA Tesla M2050 GPGPU cards, each with 3 GB of memory
The network is 4x QDR InfiniBand (plus 1 Gb Ethernet), and a disk server provides 24 TB of storage.

[Diagram: cluster layout showing the Login FE (scaleout), ClusterFE (vm), AdminFE (scaleout, vmhost) and GridFE (vm) front ends, the cluster nodes, and the disk server (DL360) with its disk pool, connected by the InfiniBand network and the cluster Ethernet]

Using the Asterope cluster

The cluster is part of FGI, the Finnish Grid Infrastructure, and can be used both locally and as part of the national grid resources. Local users access the cluster through the front-end node:
- log in with SSH to the front-end asterope.abo.fi
- SSH keys are used for authentication (not passwords)
The cluster uses a separate file system to store user files; it does not mount the normal home directory, so you have to copy files to the system explicitly. It uses the Environment Modules package to manage the software environment (see http://modules.sourceforge.net or man module on Asterope); you have to load all software modules that you will use (compilers, libraries, tools, ...). The nodes are named asg1, asg2, ..., asg8.

Setting up SSH keys

The Asterope cluster uses SSH keys for user authentication: a public-key encryption scheme instead of a password. You should access the cluster from ÅA's login server tuxedo.abo.fi:
- log in to Tuxedo from your local machine with your normal ÅA user name and password
- on Windows you can use PuTTY or some other terminal emulator
- on Linux, open a terminal window and give the command ssh -X username@tuxedo.abo.fi
- when you are logged in to Tuxedo, generate an SSH key with the command ssh-keygen (if you already have an SSH key you don't need to do this again)
- your public key will be stored in the file .ssh/id_rsa.pub; send it to Mats Aspnäs and ask for an account on the cluster

An example public key:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCsPDBbfOU/dS6K4ay2eBeXGtsGkEVhFrxRZoyksdXuTMgGp3pO7RBRkhaAfop0mkqrsdmsphtk+rldbgl+yil6dozaygafgzsliixzowumcvlqyw5whcf/osabvdbykoxdzvhbpeesibnlhzd1uapzf0aahn/zo7gyxqKsQ9e6HsdP3P123fZLiu3IzrU511IDI79zhkqJtxevHIn0c1bOVhINgkyoE5to7fl7iX+LYFkKY3J/eJJOHrRTipLLBdMYe4956yNcypv6+eemla3vommyyoqgtsvwh6+/w5gpo7wf+3kpacr0teel6us9+ozugm0ywscjxnbo4k7es3rd mats@tuxedo.abo.fi

Logging in to Asterope

First log in to tuxedo.abo.fi. Tuxedo is a login server, so it can be accessed from outside ÅA's network as well. On Tuxedo, log in to the front-end node of Asterope with ssh -X username@asterope.abo.fi. Your home directory is /home/username; this is not your normal ÅA home directory, so initially it will be empty. You can transfer files to/from the system with scp.

[Diagram: You --ssh--> Tuxedo --ssh--> Asterope]
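To illustrate the key-generation step, here is a non-interactive form of the same command; the file path and the empty passphrase (-N "") are assumptions made for this sketch only, since on Tuxedo you would normally just run ssh-keygen and accept the default ~/.ssh/id_rsa location:

```shell
# Remove any leftover demo keys so ssh-keygen does not prompt to overwrite.
rm -f /tmp/asterope_demo_key /tmp/asterope_demo_key.pub

# Generate an RSA key pair without prompting (demo path, not the real default).
ssh-keygen -t rsa -N "" -f /tmp/asterope_demo_key -q

# The public half, which is what you send to the administrator, is the .pub file:
cat /tmp/asterope_demo_key.pub
```

The private key (the file without the .pub suffix) never leaves the machine it was generated on; only the public key is shared.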

Setting up your environment

Load the software modules that you need, e.g. the GNU compiler and MPI:
module load PrgEnv-gnu
module load mvapich2
You can list all available modules with the command module avail, and list the modules that you currently have loaded with module list. The module system sets the environment variables that are needed by the programming tools, like PATH, LIBRARY_PATH, MANPATH, xxx_include and xxx_lib (where xxx is the name of a loaded module). This simplifies the design of Makefiles: there is no need to set up paths.

Compiling and running programs

Copy the example program hello.c to your home directory on Asterope. Compile the MPI program with
mpicc -O3 hello.c -o hello
Submit the program for execution on 4 cores with
srun -n 4 ./hello

% srun -n 4 ./hello
srun: job 28165 queued and waiting for resources
srun: job 28165 has been allocated resources
Hello World from process 0 running on asg7
Hello World from process 1 running on asg7
Hello World from process 2 running on asg7
Hello World from process 3 running on asg7
Ready!
%
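The slides refer to the example program hello.c without showing its source. Judging from the output above, it is presumably along these lines; this reconstruction is an assumption, not the course's actual file:

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank;
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;

    MPI_Init(&argc, &argv);               /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* rank of this process */
    MPI_Get_processor_name(name, &len);   /* node name, e.g. asg7 */

    printf("Hello World from process %d running on %s\n", rank, name);

    MPI_Barrier(MPI_COMM_WORLD);          /* wait until every process has printed */
    if (rank == 0)
        printf("Ready!\n");

    MPI_Finalize();
    return 0;
}
```

Compiling and running it requires an MPI installation (mpicc and srun or mpirun), so it can only be tried on the cluster itself after loading the modules above.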

Executing programs with SLURM

The cluster uses the SLURM resource manager to execute jobs on the nodes; see https://computing.llnl.gov/linux/slurm/quickstart.html. To execute a program on X cores on the cluster:
srun -n X ./myprogram
Useful SLURM commands:
- srun: run a parallel job on a cluster managed by SLURM
- squeue: view information about jobs in the SLURM scheduling queue
- sbatch: submit a batch script to SLURM
- sinfo: view information about SLURM nodes and partitions
- scancel: signal jobs that are under the control of SLURM, for instance to cancel submitted jobs

Partitions

The cluster is divided into a number of partitions; jobs submitted through SLURM are always allocated resources from some partition. The default partition is named users and contains 48 cores (4 nodes); the maximum run time is 30 minutes. If you don't specify which partition to use, your job goes to users. Use srun -p local to specify the local partition.

Partition  Nodes     Nr of nodes  Cores  Max time
users      asg[1-8]  4            48     30 min
local      asg[1-8]  8            96     5 days
grid       asg[1-6]  2            24     2 days

Please don't use the system for anything other than course work.
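sbatch is listed above but not demonstrated. A minimal batch script for the local partition might look like the following sketch; the job name, time limit and file names are hypothetical, while the partition name comes from the table above:

```shell
#!/bin/bash
#SBATCH --job-name=hello      # name shown by squeue (hypothetical)
#SBATCH --partition=local     # local partition: up to 96 cores, max 5 days
#SBATCH --ntasks=4            # request 4 cores
#SBATCH --time=00:10:00       # wall-clock limit for this job

srun ./hello                  # run the MPI program on the allocated cores
```

Saved as, say, hello.sh, it would be submitted with sbatch hello.sh; unlike a plain srun, the job then runs unattended and you can log out while it waits in the queue.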

Debugging parallel programs

There are some well-developed debuggers for parallel programs, supporting MPI, OpenMP and CUDA, like TotalView and Allinea DDT; however, these are commercial products. It is possible to attach gdb (the GNU debugger) or ddd (a graphical front end to gdb) to each MPI process: you get one debugger window for each MPI process and can set breakpoints in the code, step forward, inspect the values of variables, etc. Compile your program without optimization (no -O flag) and with the -g switch, then run it with
srun -n 2 --x11=all ddd hello
The flag --x11=all instructs srun to forward X-windows connections from all processes; ddd hello starts ddd (the Data Display Debugger) on the program hello.
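The slide does not spell out the debug compile line; following the earlier mpicc example, it would presumably be:

```shell
mpicc -g -O0 hello.c -o hello   # -g adds debug symbols, -O0 disables optimization
```

As with the rest of the build commands, this requires the MPI module to be loaded on the cluster, so it is shown here only as a sketch.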