Session 2: MUST Correctness Checking


1 Center for Information Services and High Performance Computing (ZIH)
Session 2: MUST Correctness Checking
Dr. Matthias S. Müller (RWTH Aachen University)
Tobias Hilbrich (Technische Universität Dresden)
Joachim Protze (RWTH Aachen University, LLNL)

2 Content
- Motivation
- MPI usage errors
- Examples: common MPI usage errors, including MUST's error descriptions
- Correctness tools
- MUST usage

3 How many errors can you spot in this tiny example?

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INTEGER, &type);
        MPI_Recv (buf, 2, MPI_INT, size - rank, 123, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send (buf, 2, type, size - rank, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        return 0;
    }

At least 8 issues in this code example.


5 MPI usage errors
- MPI programming is error prone
- Bugs may manifest as:
  - Crashes
  - Hangs
  - Wrong results
  - Not at all! (sleeping bugs)
- Tools help to detect these issues

6 MPI usage errors (2)
- Complications in MPI usage:
  - Non-blocking communication
  - Persistent communication
  - Complex collectives (e.g. Alltoallw)
  - Derived datatypes
  - Non-contiguous buffers
- Error classes include:
  - Incorrect arguments
  - Resource errors
  - Buffer usage
  - Type matching
  - Deadlocks


8 Skipping some errors
- Missing MPI_Init: the current release doesn't start to work; implementation in progress
- Missing MPI_Finalize: the current release doesn't terminate all analyses; work in progress
- Src/dest rank out of range (size - rank): leads to a crash; use the crash-safe version of the tool

9 MUST: Tool design
[Architecture diagram: the application is intercepted via the PMPI interface through the PnMPI library; GTI provides event forwarding and the tool network; local and non-local analyses run on additional tool ranks; wrappers split and rewrite MPI_COMM_WORLD ("split-comm-world", "rewrite-comm-world", "force status wrapper") on top of the MPI library.]

10 [Diagram: ranks arranged in the GTI event-forwarding network; local analyses run close to the application ranks, non-local analyses on additional tool ranks.]

11 Fixed some errors:

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INTEGER, &type);
        MPI_Recv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

12 MUST detects deadlocks
Click for a graphical representation of the detected deadlock situation.

13 Graphical representation of deadlocks
Rank 0 waits for rank 1 and vice versa. Simple call stack for this example.

14 Fix1: use asynchronous receive

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INTEGER, &type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

Use an asynchronous receive (MPI_Irecv).

15 MUST detects errors in handling datatypes
Use of uncommitted datatype: type.

16 Fix2: use MPI_Type_commit

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INTEGER, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

Commit the datatype before usage.

17 MUST detects errors in transfer buffer sizes / types
Size of sent message larger than receive buffer.

18 Fix3: use same message size for send and receive

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INTEGER, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf, 1, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

Reduce the message size.

19 MUST detects use of wrong argument values
Use of a Fortran type in C; datatype mismatch between sender and receiver.

20 Fix4: use C-datatype constants in C-code

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INT, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf, 1, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

Use the integer datatype intended for usage in C.

21 MUST detects data races in asynchronous communication
Data race between send and asynchronous receive operation.

22 Graphical representation of the race condition
Graphical representation of the data race location.

23 Fix5: use independent memory regions

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INT, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Finalize ();
        return 0;
    }

Offset points to independent memory.

24 MUST detects leaks of user-defined objects
Leak of user-defined datatype object.
User-defined objects include:
- MPI_Comms (even by MPI_Comm_dup)
- MPI_Datatypes
- MPI_Groups

25 Fix6: Deallocate datatype object

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INT, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Type_free (&type);
        MPI_Finalize ();
        return 0;
    }

Deallocate the created datatype.

26 MUST detects unfinished asynchronous communication
Remaining unfinished asynchronous receive.

27 Fix8: use MPI_Wait to finish asynchronous communication

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char** argv)
    {
        int rank, size, buf[8];
        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);
        MPI_Datatype type;
        MPI_Type_contiguous (2, MPI_INT, &type);
        MPI_Type_commit (&type);
        MPI_Request request;
        MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request);
        MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD);
        MPI_Wait (&request, MPI_STATUS_IGNORE);
        printf ("Hello, I am rank %d of %d.\n", rank, size);
        MPI_Type_free (&type);
        MPI_Finalize ();
        return 0;
    }

Finish the asynchronous communication.

28 No further error detected
Hopefully this message applies to many applications.


30 Classes of Correctness Tools
- Debuggers:
  - Helpful to pinpoint any error
  - Finding the root cause may be very hard
  - Won't detect sleeping errors
  - E.g.: gdb, TotalView, Allinea DDT
- Static analysis:
  - Compilers and source analyzers
  - Typically: type and expression errors
  - E.g.: MPI-Check
- Model checking:
  - Requires a model of your application
  - State explosion possible
  - E.g.: MPI-Spin

31 Strategies of Correctness Tools
- Runtime error detection:
  - Inspect MPI calls at runtime
  - Limited to the timely interleaving that is observed
  - Causes overhead during the application run
  - E.g.: Intel Trace Analyzer, Umpire, Marmot, MUST
- Formal verification:
  - Extension of runtime error detection
  - Explores ALL possible timely interleavings
  - Can detect potential deadlocks or type mismatches that would otherwise not occur in the presence of a tool
  - For non-deterministic applications: exponential exploration space
  - E.g.: ISP


33 MUST Usage
1) Compile and link the application as usual; link against the shared version of the MPI library (usually the default)
2) Replace mpiexec with mustrun, e.g.: mustrun -np 4 myapp.exe input.txt output.txt
3) Inspect MUST_Output.html in the run directory; MUST_Output/MUST_Deadlock.dot exists in case of a deadlock; visualize with: dot -Tps MUST_Deadlock.dot -o deadlock.ps

The mustrun script will use an extra process for non-local checks (invisible to the application), i.e. mustrun -np 4 will issue an mpirun -np 5. Make sure to allocate the extra task in batch jobs.

34 MUST - Features
- Local checks:
  - Integer validation
  - Integrity checks (pointers valid, etc.)
  - Operation, request, communicator, datatype, group usage
  - Resource leak detection
  - Memory overlap checks
- Non-local checks:
  - Collective verification
  - Lost message detection
  - Type matching (for P2P and collectives)
  - Deadlock detection (with root cause visualization)

35 MUST - Features: Scalability
- Local checks largely scalable
- Non-local checks:
  - The current default uses a central process
    - This process is an MPI task taken from the application
    - Limited scalability: ~100 tasks (depending on the application)
  - Distributed analysis available (tested with 10k tasks)
    - Uses more extra tasks (10%-100%)
- Recommended: logging to an HTML file
- Uses a scalable tool infrastructure
- Tool configuration happens at execution time


Libmonitor: A Tool for First-Party Monitoring Libmonitor: A Tool for First-Party Monitoring Mark W. Krentel Dept. of Computer Science Rice University 6100 Main St., Houston, TX 77005 krentel@rice.edu ABSTRACT Libmonitor is a library that provides

More information

Using Trace Replayer Debugger and Managing Traces in IDA

Using Trace Replayer Debugger and Managing Traces in IDA Using Trace Replayer Debugger and Managing Traces in IDA Copyright 2014 Hex-Rays SA Table of contents Introduction...2 Quick Overview...2 Following this tutorial...2 Supplied files...2 Replaying and managing

More information

TN203. Porting a Program to Dynamic C. Introduction

TN203. Porting a Program to Dynamic C. Introduction TN203 Porting a Program to Dynamic C Introduction Dynamic C has a number of improvements and differences compared to many other C compiler systems. This application note gives instructions and suggestions

More information

Introduction to C Programming S Y STEMS

Introduction to C Programming S Y STEMS Introduction to C Programming CS 40: INTRODUCTION TO U NIX A ND L I NUX O P E R AT ING S Y STEMS Objectives Introduce C programming, including what it is and what it contains, which includes: Command line

More information

Basic Concepts in Parallelization

Basic Concepts in Parallelization 1 Basic Concepts in Parallelization Ruud van der Pas Senior Staff Engineer Oracle Solaris Studio Oracle Menlo Park, CA, USA IWOMP 2010 CCS, University of Tsukuba Tsukuba, Japan June 14-16, 2010 2 Outline

More information

Linux and High-Performance Computing

Linux and High-Performance Computing Linux and High-Performance Computing Outline Architectures & Performance Measurement Linux on High-Performance Computers Beowulf Clusters, ROCKS Kitten: A Linux-derived LWK Linux on I/O and Service Nodes

More information

From Source Code to Executable

From Source Code to Executable From Source Code to Executable Dr. Axel Kohlmeyer Scientific Computing Expert Information and Telecommunication Section The Abdus Salam International Centre for Theoretical Physics http://sites.google.com/site/akohlmey/

More information

Admin. Threads, CPU Scheduling. Yesterday s Lecture: Threads. Today s Lecture. ITS 225: Operating Systems. Lecture 4

Admin. Threads, CPU Scheduling. Yesterday s Lecture: Threads. Today s Lecture. ITS 225: Operating Systems. Lecture 4 ITS 225: Operating Systems Admin Lecture 4 Threads, CPU Scheduling Jan 23, 2004 Dr. Matthew Dailey Information Technology Program Sirindhorn International Institute of Technology Thammasat University Some

More information

Performance analysis with Periscope

Performance analysis with Periscope Performance analysis with Periscope M. Gerndt, V. Petkov, Y. Oleynik, S. Benedict Technische Universität München September 2010 Outline Motivation Periscope architecture Periscope performance analysis

More information

High Performance Computing in Aachen

High Performance Computing in Aachen High Performance Computing in Aachen Christian Iwainsky iwainsky@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Produktivitätstools unter Linux Sep 16, RWTH Aachen University

More information

Creating a Standard C++ Program in Visual Studio C January 2013

Creating a Standard C++ Program in Visual Studio C January 2013 Creating a Standard C++ Program in Visual Studio C++ 10.0 January 2013 ref: http://msdn.microsoft.com/en-us/library/ms235629(v=vs.100).aspx To create a project and add a source file 1 1. On the File menu,

More information

Debugging Heap Corruption in Visual C++ Using Microsoft Debugging Tools for Windows

Debugging Heap Corruption in Visual C++ Using Microsoft Debugging Tools for Windows Using Microsoft Debugging Tools for Windows Version 10.0 David Dahlbacka Contents Heap Corruption... 2 Causes of Heap Corruption... 2 Debugging Heap Corruption... 2 Debugging Tools...3 Specific WinDbg

More information

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber Introduction to grid technologies, parallel and cloud computing Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber OUTLINES Grid Computing Parallel programming technologies (MPI- Open MP-Cuda )

More information

Introduction. dnotify

Introduction. dnotify Introduction In a multi-user, multi-process operating system, files are continually being created, modified and deleted, often by apparently unrelated processes. This means that any software that needs

More information

Leak Check Version 2.1 for Linux TM

Leak Check Version 2.1 for Linux TM Leak Check Version 2.1 for Linux TM User s Guide Including Leak Analyzer For x86 Servers Document Number DLC20-L-021-1 Copyright 2003-2009 Dynamic Memory Solutions LLC www.dynamic-memory.com Notices Information

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Distributed Systems 15-319, spring 2010 11 th Lecture, Feb 16 th Majd F. Sakr Lecture Motivation Understand Distributed Systems Concepts Understand the concepts / ideas

More information

Common C Errors. Compiled by: Leela Kamalesh Yadlapalli

Common C Errors. Compiled by: Leela Kamalesh Yadlapalli Common C Errors Compiled by: Leela Kamalesh Yadlapalli This document shows some of the common errors and warnings that you may encounter during this class. Always remember to use the Wall option if you

More information

The Asterope compute cluster

The Asterope compute cluster The Asterope compute cluster ÅA has a small cluster named asterope.abo.fi with 8 compute nodes Each node has 2 Intel Xeon X5650 processors (6-core) with a total of 24 GB RAM 2 NVIDIA Tesla M2050 GPGPU

More information

- An Essential Building Block for Stable and Reliable Compute Clusters

- An Essential Building Block for Stable and Reliable Compute Clusters Ferdinand Geier ParTec Cluster Competence Center GmbH, V. 1.4, March 2005 Cluster Middleware - An Essential Building Block for Stable and Reliable Compute Clusters Contents: Compute Clusters a Real Alternative

More information

C A short introduction

C A short introduction About these lectures C A short introduction Stefan Johansson Department of Computing Science Umeå University Objectives Give a short introduction to C and the C programming environment in Linux/Unix Go

More information

Synchronization in Java

Synchronization in Java Synchronization in Java We often want threads to co-operate, typically in how they access shared data structures. Since thread execution is asynchronous, the details of how threads interact can be unpredictable.

More information

High-Performance Computing: Architecture and APIs

High-Performance Computing: Architecture and APIs High-Performance Computing: Architecture and APIs Douglas Fuller ASU Fulton High Performance Computing Why HPC? Capacity computing Do similar jobs, but lots of them. Capability computing Run programs we

More information

The V-model. Validation and Verification. Inspections [24.3] Testing overview [8, 15.2] - system testing. How much V&V is enough?

The V-model. Validation and Verification. Inspections [24.3] Testing overview [8, 15.2] - system testing. How much V&V is enough? Validation and Verification Inspections [24.3] Testing overview [8, 15.2] - system testing Requirements Design The V-model V & V Plans Implementation Unit tests System tests Integration tests Operation,

More information

MPI / ClusterTools Update and Plans

MPI / ClusterTools Update and Plans HPC Technical Training Seminar July 7, 2008 October 26, 2007 2 nd HLRS Parallel Tools Workshop Sun HPC ClusterTools 7+: A Binary Distribution of Open MPI MPI / ClusterTools Update and Plans Len Wisniewski

More information

Sources: On the Web: Slides will be available on:

Sources: On the Web: Slides will be available on: C programming Introduction The basics of algorithms Structure of a C code, compilation step Constant, variable type, variable scope Expression and operators: assignment, arithmetic operators, comparison,

More information

The RWTH Compute Cluster Environment

The RWTH Compute Cluster Environment The RWTH Compute Cluster Environment Tim Cramer 11.03.2013 Source: D. Both, Bull GmbH Rechen- und Kommunikationszentrum (RZ) How to login Frontends cluster.rz.rwth-aachen.de cluster-x.rz.rwth-aachen.de

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

Unified Performance Data Collection with Score-P

Unified Performance Data Collection with Score-P Unified Performance Data Collection with Score-P Bert Wesarg 1) With contributions from Andreas Knüpfer 1), Christian Rössel 2), and Felix Wolf 3) 1) ZIH TU Dresden, 2) FZ Jülich, 3) GRS-SIM Aachen Fragmentation

More information

The Easiest Way to Run Spark Jobs. How-To Guide

The Easiest Way to Run Spark Jobs. How-To Guide The Easiest Way to Run Spark Jobs How-To Guide The Easiest Way to Run Spark Jobs Recently, Databricks added a new feature, Jobs, to our cloud service. You can find a detailed overview of this feature in

More information

MPI on morar and ARCHER

MPI on morar and ARCHER MPI on morar and ARCHER Access morar available directly from CP-Lab machines external access to morar: gateway: ssh X user@ph-cplab.ph.ed.ac.uk then: ssh X cplabxxx (pick your favourite machine) external

More information

Debugging in Heterogeneous Environments with TotalView. ECMWF HPC Workshop 30 th October 2014

Debugging in Heterogeneous Environments with TotalView. ECMWF HPC Workshop 30 th October 2014 Debugging in Heterogeneous Environments with TotalView ECMWF HPC Workshop 30 th October 2014 Agenda Introduction Challenges TotalView overview Advanced features Current work and future plans 2014 Rogue

More information

Parallel Computing. Shared memory parallel programming with OpenMP

Parallel Computing. Shared memory parallel programming with OpenMP Parallel Computing Shared memory parallel programming with OpenMP Thorsten Grahs, 27.04.2015 Table of contents Introduction Directives Scope of data Synchronization 27.04.2015 Thorsten Grahs Parallel Computing

More information

INDEX. C programming Page 1 of 10. 5) Function. 1) Introduction to C Programming

INDEX. C programming Page 1 of 10. 5) Function. 1) Introduction to C Programming INDEX 1) Introduction to C Programming a. What is C? b. Getting started with C 2) Data Types, Variables, Constants a. Constants, Variables and Keywords b. Types of Variables c. C Keyword d. Types of C

More information

6.189 IAP 2007. Lecture 9. Debugging Parallel Programs. Dr. Rodric Rabbah, IBM. 1 6.189 IAP 2007 MIT

6.189 IAP 2007. Lecture 9. Debugging Parallel Programs. Dr. Rodric Rabbah, IBM. 1 6.189 IAP 2007 MIT 6.189 IAP 2007 Lecture 9 Debugging Parallel Programs 1 6.189 IAP 2007 MIT Debugging Parallel Programs is Hard-er Parallel programs are subject to the usual bugs Plus: new timing and synchronization errors

More information

ADVANCED MPI. Dr. David Cronk Innovative Computing Lab University of Tennessee

ADVANCED MPI. Dr. David Cronk Innovative Computing Lab University of Tennessee ADVANCED MPI Dr. David Cronk Innovative Computing Lab University of Tennessee Day 1 Morning - Lecture Course Outline Communicators/Groups Extended Collective Communication One-sided Communication Afternoon

More information

Introduction to Synoptic

Introduction to Synoptic Introduction to Synoptic 1 Introduction Synoptic is a tool that summarizes log files. More exactly, Synoptic takes a set of log files, and some rules that tell it how to interpret lines in those logs,

More information

Purify User s Guide. Version 4.1 support@rational.com http://www.rational.com

Purify User s Guide. Version 4.1 support@rational.com http://www.rational.com Purify User s Guide Version 4.1 support@rational.com http://www.rational.com IMPORTANT NOTICE DISCLAIMER OF WARRANTY Rational Software Corporation makes no representations or warranties, either express

More information

Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem. Dr Rosemary Francis, Ellexus

Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem. Dr Rosemary Francis, Ellexus Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem Dr Rosemary Francis, Ellexus For years instruction-level debuggers and profilers have improved in leaps and bounds. Similarly, system-level

More information