Hands-on exercise: NPB-OMP / BT

VI-HPS Team

Tutorial exercise objectives

- Familiarise with usage of VI-HPS tools
  - complementary tools' capabilities & interoperability
- Prepare to apply tools productively to your application(s)
- Exercise is based on a small portable benchmark code
  - unlikely to have significant optimisation opportunities
- Optional (recommended) exercise extensions
  - analyse performance of alternative configurations
  - investigate effectiveness of system-specific compiler/MPI optimisations and/or placement/binding/affinity capabilities
  - investigate scalability and analyse scalability limiters
  - compare performance on different HPC platforms
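The scalability extension listed above comes down to comparing run times measured at different thread counts. A minimal sketch of that arithmetic follows, assuming a reference run (for example with 1 thread) and a run with N threads have both been timed. The function name `speedup` and the timings in the example call are hypothetical, not measured values.

```shell
#!/bin/sh
# Speedup and parallel efficiency from two measured run times.
# Usage: speedup <t_ref> <t_n> <threads>
#   t_ref   : run time in seconds of the reference (e.g. 1-thread) run
#   t_n     : run time in seconds of the run with <threads> threads
speedup() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
        s = t1 / tn                      # speedup relative to the reference run
        printf "speedup=%.2f efficiency=%.0f%%\n", s, 100 * s / n
    }'
}

# Hypothetical timings: 240 s with 1 thread, 30 s with 12 threads
speedup 240 30 12
```

An efficiency well below 100% at higher thread counts is a first hint that a scalability limiter is worth investigating with the tools.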

Local installation for workshop

- Log in to the cluster with X11 forwarding enabled
    % ssh -X igloo.calcul.ecp.fr
- Copy the tutorial sources to your working directory
    % cd <myworkdir>
    % cp -r /home/wylieb/tutorial/exercies/npb3.3-omp .
    % cd NPB3.3-OMP
- (When available, it is generally advisable to use a parallel filesystem such as $WORK)

NPB-OMP Suite

- The NAS Parallel Benchmark suite (OpenMP version)
  - Available from http://www.nas.nasa.gov/software/npb
  - 8 benchmarks in Fortran77, 2 in C
  - Configurable for various sizes & classes
- Move into the NPB3.3-OMP root directory
    % ls
    bin/    common/  jobscript/  Makefile  README.install   SP/
    BT/     config/  LU/         README    README.tutorial  sys/
- Subdirectories contain source code for each benchmark, plus additional configuration and common code
- The provided distribution has already been configured for the tutorial, so that it is ready to "make" one or more of the benchmarks and install them into a (tool-specific) "bin" subdirectory

Building an NPB-OMP Benchmark

- Type "make" for instructions
    % make
    ===========================================
    =      NAS PARALLEL BENCHMARKS 3.3        =
    =      OpenMP Versions                    =
    =      F77/C                              =
    ===========================================

    To make a NAS benchmark type

          make <benchmark-name> CLASS=<class>

    where <benchmark-name> is "bt", "cg", "ep", "ft", "is", "lu",
                              "mg", "sp", "ua", or "dc"
          <class>          is "S", "W", "A", "B", "C" or "D"
    [...]
    ***************************************************************
    * Custom build configuration is specified in config/make.def  *
    * Suggested tutorial exercise configuration for HPC systems:  *
    *       make bt CLASS=B                                       *
    ***************************************************************

Building an NPB-OMP Benchmark

- Specify the benchmark configuration
  - benchmark name: bt, cg, ep, ft, is, lu, mg, sp, ua
  - the benchmark class (S, W, A, B, C, D, E): CLASS=B

    % make bt CLASS=B
    cd BT; make CLASS=B VERSION=
    make: Entering directory 'BT'
    cd ../sys; cc -o setparams setparams.c
    ../sys/setparams bt B
    ifort -c -O -g -openmp bt.f
    [...]
    cd ../common; ifort -c -O -g -openmp timers.f
    ifort -O -g -openmp -o ../bin/bt_b \
    bt.o initialize.o exact_solution.o exact_rhs.o set_constants.o \
    adi.o rhs.o zone_setup.o x_solve.o y_solve.o exch_qbc.o \
    solve_subs.o z_solve.o add.o error.o verify.o mpi_setup.o \
    ../common/print_results.o ../common/timers.o
    Built executable ../bin/bt_b
    make: Leaving directory 'BT'
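To try alternative configurations, the same make invocation can be repeated with other class letters. The sketch below only prints the commands (a dry run), so it can be tried anywhere. The helper name `npb_make_cmd` is hypothetical, and the accepted class letters are the ones listed on the slide above.

```shell
#!/bin/sh
# Print the make command for one benchmark/class pair.
# Fails (non-zero exit status) for a class letter not listed on the slide.
npb_make_cmd() {
    bench=$1
    class=$2
    case "$class" in
        S|W|A|B|C|D|E) echo "make $bench CLASS=$class" ;;
        *) echo "unknown class: $class" >&2; return 1 ;;
    esac
}

# Dry run: list the commands for three classes of BT.
for c in W A B; do
    npb_make_cmd bt "$c"
done
```

To actually build, the printed command would be run from the NPB3.3-OMP root directory, for example with `npb_make_cmd bt B | sh`.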

NPB-OMP / BT (Block Tridiagonal Solver)

- What does it do?
  - Solves a discretized version of the unsteady, compressible Navier-Stokes equations in three spatial dimensions
  - Performs 200 time-steps on a regular 3-dimensional grid
- Implemented in 20 or so Fortran77 source modules
- Uses OpenMP
  - On ICE with 12 threads, class B takes around 30 seconds
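For the configuration quoted above (class B, 12 OpenMP threads), a batch job could be described roughly as follows. This is a sketch under assumptions: the jobscript shipped with the tutorial is not reproduced in these slides, the `#PBS` resource line uses Torque-style syntax that may differ on other PBS variants, and the helper name `write_jobscript`, the file name `run.sketch.pbs` and the walltime are hypothetical.

```shell
#!/bin/sh
# Write a hypothetical PBS job script for a single-node OpenMP run.
# Usage: write_jobscript <file> <executable> <threads>
write_jobscript() {
    cat > "$1" <<EOF
#!/bin/sh
#PBS -N run
#PBS -j oe
#PBS -l nodes=1:ppn=$3
#PBS -l walltime=00:10:00

cd "\$PBS_O_WORKDIR"
export OMP_NUM_THREADS=$3
./$2
EOF
}

# Generate and display a script for the class B executable with 12 threads.
write_jobscript run.sketch.pbs bt_b 12
cat run.sketch.pbs
```

Setting `OMP_NUM_THREADS` to the number of cores requested keeps one thread per core; other thread counts can be generated by changing the last argument.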

NPB-OMP / BT Reference Execution

- In the subdirectory with the executable, copy the provided jobscript and launch the OpenMP application
    % cd bin
    % cp ../jobscript/intel/run.pbs .
    % less run.pbs
    % qsub run.pbs
    % cat run.o<id>
     NAS Parallel Benchmarks (NPB3.3-OMP) - BT OpenMP Benchmark
     Size:  102x 102x 102
     Iterations:  200    dt:   0.000300
     Number of available threads:    12
     Time step    1
     Time step   20
     [...]
     Time step  180
     Time step  200
     Verification Successful
     BT Benchmark Completed.
     Time in seconds = 30.25
- Hint: save the benchmark output (or note the run time) to be able to refer to it later
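Following the hint above, the reference time can be pulled out of a saved output file so that later instrumented runs can be compared against it. The sketch below assumes the output format shown on this slide. The function name `bt_time` is hypothetical, and the sample file is a shortened copy of that output rather than a new measurement.

```shell
#!/bin/sh
# Print the run time reported in a BT output file,
# but only if the output also reports a successful verification.
# Usage: bt_time <output-file>
bt_time() {
    if ! grep -q "Verification Successful" "$1"; then
        echo "verification not reported in $1" >&2
        return 1
    fi
    # Split on "=" and strip blanks from the value field.
    awk -F= '/Time in seconds/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Demonstration on a shortened copy of the output shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
 Time step  200
 Verification Successful
 BT Benchmark Completed.
 Time in seconds = 30.25
EOF
bt_time "$sample"    # prints 30.25
rm -f "$sample"
```

Checking for the verification message first avoids recording a time from a run that produced incorrect results.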

NPB-OMP / BT: config/make.def (Intel version)

    # SITE- AND/OR PLATFORM-SPECIFIC DEFINITIONS
    # Items in this file may need to be changed for each platform.
    # Configured for Intel compilers
    OPENMP = -openmp
    # The Fortran compiler used for OpenMP programs
    F77 = ifort
    #F77 = scorep ifort
    # PREP is a generic preposition macro for instrumentation preparation
    #F77 = $(PREP) ifort
    # This links OMP Fortran programs; usually the same as ${F77}
    FLINK = $(F77)
    ...

- Default (no instrumentation): F77 = ifort
- Hint: uncomment one of the alternative compiler wrappers (the commented-out F77 lines) to perform instrumentation

NPB-OMP / BT: config/make.def (GCC version)

    # SITE- AND/OR PLATFORM-SPECIFIC DEFINITIONS
    # Items in this file may need to be changed for each platform.
    # Configured for GCC compilers
    OPENMP = -fopenmp
    # The Fortran compiler used for OpenMP programs
    F77 = gfortran
    #F77 = scorep gfortran
    # PREP is a generic preposition macro for instrumentation preparation
    #F77 = $(PREP) gfortran
    # This links OMP Fortran programs; usually the same as ${F77}
    FLINK = $(F77)
    ...

- Default (no instrumentation): F77 = gfortran
- Hint: uncomment one of the alternative compiler wrappers (the commented-out F77 lines) to perform instrumentation
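Switching from the default compiler to an instrumenting wrapper means commenting out the active F77 line and uncommenting the wrapper line. This is normally done by hand in an editor. The sketch below does the same with sed, assuming the F77 lines look as they do in the excerpts above. The function name `enable_wrapper` is hypothetical, and the demonstration runs on a temporary copy of the relevant lines, not on the real config/make.def.

```shell
#!/bin/sh
# Comment out the active "F77 = ..." line and uncomment "#F77 = <wrapper> ...".
# Usage: enable_wrapper <make.def> <wrapper>
enable_wrapper() {
    tmp=$(mktemp)
    sed -e 's/^F77[[:space:]]*=/#&/' \
        -e "s/^#F77[[:space:]]*=[[:space:]]*$2 /F77 = $2 /" "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a temporary file holding the lines from the Intel excerpt.
def=$(mktemp)
cat > "$def" <<'EOF'
F77 = ifort
#F77 = scorep ifort
#F77 = $(PREP) ifort
FLINK = $(F77)
EOF
enable_wrapper "$def" scorep
cat "$def"
rm -f "$def"
```

After such a change the benchmark has to be rebuilt from scratch (previously compiled object files removed, then `make bt CLASS=B` again) for the wrapper to take effect. Since FLINK is defined as $(F77), the link step picks up the wrapper as well.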