Session 2: MUST. Correctness Checking

Size: px
Start display at page:

Download "Session 2: MUST. Correctness Checking"

Transcription

1 Center for Information Services and High Performance Computing (ZIH) Session 2: MUST Correctness Checking Dr. Matthias S. Müller (RWTH Aachen University) Tobias Hilbrich (Technische Universität Dresden) Joachim Protze (RWTH Aachen University, LLNL)

2 Content! Motivation! MPI usage errors! Examples: Common MPI usage errors Ø Including MUST s error descriptions! Correctness tools! MUST usage Matthias Müller, Joachim Protze, Tobias Hilbrich 2

3 How many errors can you spot in this tiny example? #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Recv (buf, 2, MPI_INT, size - rank, 123, MPI_COMM_WORLD, MPI_STATUS_IGNORE); MPI_Send (buf, 2, type, size - rank, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); } return 0; At least 8 issues in this code example Matthias Müller, Joachim Protze, Tobias Hilbrich 3

4 Content! Motivation! MPI usage errors! Examples: Common MPI usage errors Ø Including MUST s error descriptions! Correctness tools! MUST usage Matthias Müller, Joachim Protze, Tobias Hilbrich 4

5 MPI usage errors! MPI programming is error prone! Bugs may manifest as: Ø Crashes Ø Hangs Ø Wrong results Ø Not at all! (Sleeping bugs)! Tools help to detect these issues Matthias Müller, Joachim Protze, Tobias Hilbrich 5

6 MPI usage errors (2)! Complications in MPI usage: Ø Non-blocking communication Ø Persistent communication Ø Complex collectives (e.g. Alltoallw) Ø Derived datatypes Ø Non-contiguous buffers! Error Classes include: Ø Incorrect arguments Ø Resource errors Ø Buffer usage Ø Type matching Ø Deadlocks Matthias Müller, Joachim Protze, Tobias Hilbrich 6

7 Content! Motivation! MPI usage errors! Examples: Common MPI usage errors Ø Including MUST s error descriptions! Correctness tools! MUST usage Matthias Müller, Joachim Protze, Tobias Hilbrich 7

8 Skipping some errors! Missing MPI_Init:! Current release doesn t start to work, implementation in progress! Missing MPI_Finalize:! Current release doesn t terminate all analyses, work in progress! Src/dest rank out of range (size-rank): leads to crash, use crash save version of tool Matthias Müller, Joachim Protze, Tobias Hilbrich 8

9 MUST: Tool design PMPI Interface P^nMPI Library Application Additional tool ranks MPI Library Split-comm-world Force status wrapper GTI: event forwarding, network Local Analyses Non-Local Analyses Rewrite-comm-world MPI Library Matthias Müller, Joachim Protze, Tobias Hilbrich 9

10 0 4 1 GTI: event forwarding, network 2 Local Analyses 6 Non-Local 5 Analyses 3 Matthias Müller, Joachim Protze, Tobias Hilbrich 10

11 Fixed some errors: #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Recv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, MPI_STATUS_IGNORE); MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 11

12 Must detects deadlocks Click for graphical representa2on of the detected deadlock situa2on. Matthias Müller, Joachim Protze, Tobias Hilbrich 12

13 Graphical representation of deadlocks Rank 0 waits for rank 1 and vv. Simple call stack for this example. Matthias Müller, Joachim Protze, Tobias Hilbrich 13

14 Fix1: use asynchronous receive #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); Use asynchronous receive: (MPI_Irecv) } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 14

15 MUST detects errors in handling datatypes Use of uncommited datatype: type Matthias Müller, Joachim Protze, Tobias Hilbrich 15

16 Fix2: use MPI_Type_commit #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf, 2, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); Commit the datatype before usage } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 16

17 MUST detects errors in transfer buffer sizes / types Size of sent message larger than receive buffer Matthias Müller, Joachim Protze, Tobias Hilbrich 17

18 Fix3: use same message size for send and receive #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf, 1, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); Reduce the message size } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 18

19 MUST detects use of wrong argument values Use of Fortran type in C, datatype mismatch between sender and receiver Matthias Müller, Joachim Protze, Tobias Hilbrich 19

20 Fix4: use C-datatype constants in C-code #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INT, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf, 1, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); Use the integer datatype intended for usage in C } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 20

21 MUST detects data races in asynchronous communication Data race between send and ascynchronous receive opera2on Matthias Müller, Joachim Protze, Tobias Hilbrich 21

22 Graphical representation of the race condition Graphical representa2on of the data race loca2on Matthias Müller, Joachim Protze, Tobias Hilbrich 22

23 Fix5: use independent memory regions #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INTEGER, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Finalize (); Offset points to independent memory } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 23

24 MUST detects leaks of user defined objects Leak of user defined datatype object! User defined objects include! MPI_Comms (even by MPI_Comm_dup)! MPI_Datatypes! MPI_Groups Matthias Müller, Joachim Protze, Tobias Hilbrich 24

25 Fix6: Deallocate datatype object #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INT, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD); } printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Type_free (&type); MPI_Finalize (); return 0; Deallocate the created datatype Matthias Müller, Joachim Protze, Tobias Hilbrich 25

26 MUST detects unfinished asynchronous communication Remaining unfinished asynchronous receive Matthias Müller, Joachim Protze, Tobias Hilbrich 26

27 Fix8: use MPI_Wait to finish asynchronous communication #include <mpi.h> #include <stdio.h> int main (int argc, char** argv) { int rank, size, buf[8]; MPI_Init (&argc, &argv); MPI_Comm_rank (MPI_COMM_WORLD, &rank); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Datatype type; MPI_Type_contiguous (2, MPI_INT, &type); MPI_Type_commit (&type); MPI_Request request; MPI_Irecv (buf, 2, MPI_INT, size - rank - 1, 123, MPI_COMM_WORLD, &request); MPI_Send (buf + 4, 1, type, size - rank - 1, 123, MPI_COMM_WORLD); MPI_Wait (&request, MPI_STATUS_IGNORE); printf ("Hello, I am rank %d of %d.\n", rank, size); MPI_Type_free (&type); Finish the asynchronous communica2on MPI_Finalize (); } return 0; Matthias Müller, Joachim Protze, Tobias Hilbrich 27

28 No further error detected Hopefully this message applies to many applica2ons Matthias Müller, Joachim Protze, Tobias Hilbrich 28

29 Content! Motivation! MPI usage errors! Examples: Common MPI usage errors Ø Including MUST s error descriptions! Correctness tools! MUST usage Matthias Müller, Joachim Protze, Tobias Hilbrich 29

30 Classes of Correctness Tools! Debuggers: Ø Helpful to pinpoint any error Ø Finding the root cause may be very hard Ø Won t detect sleeping errors Ø E.g.: gdb, TotalView, Alinea DDT! Static Analysis: Ø Compilers and Source analyzers Ø Typically: type and expression errors Ø E.g.: MPI-Check! Model checking: Ø Requires a model of your applications Ø State explosion possible Ø E.g.: MPI-Spin Matthias Müller, Joachim Protze, Tobias Hilbrich 30

31 Strategies of Correctness Tools! Runtime error detection: Ø Inspect MPI calls at runtime Ø Limited to the timely interleaving that is observed Ø Causes overhead during application run Ø E.g.: Intel Trace Analyzer, Umpire, Marmot, MUST! Formal verification: Ø Extension of runtime error detection Ø Explores ALL possible timely interleavings Ø Can detect potential deadlocks or type missmatches that would otherwise not occur in the presence of a tool Ø For non-deterministic applications exponential exploration space Ø E.g.: ISP Matthias Müller, Joachim Protze, Tobias Hilbrich 31

32 Content! Motivation! MPI usage errors! Examples: Common MPI usage errors Ø Including MUST s error descriptions! Correctness tools! MUST usage Matthias Müller, Joachim Protze, Tobias Hilbrich 32

33 MUST Usage 1) Compile and link application as usual Link against the shared version of the MPI lib (Usually default) 2) Replace mpiexec with mustrun E.g.: mustrun np 4 myapp.exe input.txt output.txt 3) Inspect MUST_Output.html in run directory MUST_Output/MUST_Deadlock.dot exists in case of deadlock Visualize with: dot Tps MUST_Deadlock.dot o deadlock.ps The mustrun script will use an extra process for nonlocal checks (Invisible to application) I.e.: mustrun np 4 will issue a mpirun np 5 Make sure to allocate the extra task in batch jobs Matthias Müller, Joachim Protze, Tobias Hilbrich 33

34 MUST - Features! Local checks: Ø Integer validation Ø Integrity checks (pointers valid, etc.) Ø Operation, Request, Communicator, Datatype, Group usage Ø Resource leak detection Ø Memory overlap checks! Non-local checks: Ø Collective verification Ø Lost message detection Ø Type matching (For P2P and collectives) Ø Deadlock detection (with root cause visualization) Matthias Müller, Joachim Protze, Tobias Hilbrich 34

35 MUST - Features: Scalability! Local checks largely scalable! Non-local checks:! Current default uses a central process Ø This process is an MPI task taken from the application Ø Limited scalability ~100 tasks (Depending on application)! Distributed analysis available (tested with 10k tasks) Ø Uses more extra tasks (10%-100%)! Recommended: Logging to an HTML file! Uses a scalable tool infrastructure! Tool configuration happens at execution time Matthias Müller, Joachim Protze, Tobias Hilbrich 35

MPI Runtime Error Detection with MUST For the 13th VI-HPS Tuning Workshop

MPI Runtime Error Detection with MUST For the 13th VI-HPS Tuning Workshop MPI Runtime Error Detection with MUST For the 13th VI-HPS Tuning Workshop Joachim Protze and Felix Münchhalfen IT Center RWTH Aachen University February 2014 Content MPI Usage Errors Error Classes Avoiding

More information

Parallelization: Binary Tree Traversal

Parallelization: Binary Tree Traversal By Aaron Weeden and Patrick Royal Shodor Education Foundation, Inc. August 2012 Introduction: According to Moore s law, the number of transistors on a computer chip doubles roughly every two years. First

More information

MPI Application Development Using the Analysis Tool MARMOT

MPI Application Development Using the Analysis Tool MARMOT MPI Application Development Using the Analysis Tool MARMOT HLRS High Performance Computing Center Stuttgart Allmandring 30 D-70550 Stuttgart http://www.hlrs.de 24.02.2005 1 Höchstleistungsrechenzentrum

More information

High performance computing systems. Lab 1

High performance computing systems. Lab 1 High performance computing systems Lab 1 Dept. of Computer Architecture Faculty of ETI Gdansk University of Technology Paweł Czarnul For this exercise, study basic MPI functions such as: 1. for MPI management:

More information

MPI-Checker Static Analysis for MPI

MPI-Checker Static Analysis for MPI MPI-Checker Static Analysis for MPI Alexander Droste, Michael Kuhn, Thomas Ludwig November 15, 2015 Motivation 2 / 39 Why is runtime analysis in HPC challenging? Large amount of resources are used State

More information

LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS 2015. Hermann Härtig

LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS 2015. Hermann Härtig LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS 2015 Hermann Härtig ISSUES starting points independent Unix processes and block synchronous execution who does it load migration mechanism

More information

Lecture 6: Introduction to MPI programming. Lecture 6: Introduction to MPI programming p. 1

Lecture 6: Introduction to MPI programming. Lecture 6: Introduction to MPI programming p. 1 Lecture 6: Introduction to MPI programming Lecture 6: Introduction to MPI programming p. 1 MPI (message passing interface) MPI is a library standard for programming distributed memory MPI implementation(s)

More information

Debugging with TotalView

Debugging with TotalView Tim Cramer 17.03.2015 IT Center der RWTH Aachen University Why to use a Debugger? If your program goes haywire, you may... ( wand (... buy a magic... read the source code again and again and...... enrich

More information

To connect to the cluster, simply use a SSH or SFTP client to connect to:

To connect to the cluster, simply use a SSH or SFTP client to connect to: RIT Computer Engineering Cluster The RIT Computer Engineering cluster contains 12 computers for parallel programming using MPI. One computer, cluster-head.ce.rit.edu, serves as the master controller or

More information

Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005

Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005 Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005 Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005... 1

More information

HP-MPI User s Guide. 11th Edition. Manufacturing Part Number : B6060-96024 September 2007

HP-MPI User s Guide. 11th Edition. Manufacturing Part Number : B6060-96024 September 2007 HP-MPI User s Guide 11th Edition Manufacturing Part Number : B6060-96024 September 2007 Copyright 1979-2007 Hewlett-Packard Development Company, L.P. Table 1 Revision history Edition MPN Description Eleventh

More information

HPCC - Hrothgar Getting Started User Guide MPI Programming

HPCC - Hrothgar Getting Started User Guide MPI Programming HPCC - Hrothgar Getting Started User Guide MPI Programming High Performance Computing Center Texas Tech University HPCC - Hrothgar 2 Table of Contents 1. Introduction... 3 2. Setting up the environment...

More information

Lightning Introduction to MPI Programming

Lightning Introduction to MPI Programming Lightning Introduction to MPI Programming May, 2015 What is MPI? Message Passing Interface A standard, not a product First published 1994, MPI-2 published 1997 De facto standard for distributed-memory

More information

Introduction to Hybrid Programming

Introduction to Hybrid Programming Introduction to Hybrid Programming Hristo Iliev Rechen- und Kommunikationszentrum aixcelerate 2012 / Aachen 10. Oktober 2012 Version: 1.1 Rechen- und Kommunikationszentrum (RZ) Motivation for hybrid programming

More information

Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ)

Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ) Advanced MPI Hybrid programming, profiling and debugging of MPI applications Hristo Iliev RZ Rechen- und Kommunikationszentrum (RZ) Agenda Halos (ghost cells) Hybrid programming Profiling of MPI applications

More information

RA MPI Compilers Debuggers Profiling. March 25, 2009

RA MPI Compilers Debuggers Profiling. March 25, 2009 RA MPI Compilers Debuggers Profiling March 25, 2009 Examples and Slides To download examples on RA 1. mkdir class 2. cd class 3. wget http://geco.mines.edu/workshop/class2/examples/examples.tgz 4. tar

More information

MARMOT- MPI Analysis and Checking Tool Demo with blood flow simulation. Bettina Krammer, Matthias Müller krammer@hlrs.de, mueller@hlrs.

MARMOT- MPI Analysis and Checking Tool Demo with blood flow simulation. Bettina Krammer, Matthias Müller krammer@hlrs.de, mueller@hlrs. MARMOT- MPI Analysis and Checking Tool Demo with blood flow simulation Bettina Krammer, Matthias Müller krammer@hlrs.de, mueller@hlrs.de HLRS High Performance Computing Center Stuttgart Allmandring 30

More information

PREDICTIVE ANALYSIS OF MESSAGE PASSING APPLICATIONS

PREDICTIVE ANALYSIS OF MESSAGE PASSING APPLICATIONS PREDICTIVE ANALYSIS OF MESSAGE PASSING APPLICATIONS by Subodh Sharma A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor

More information

Asynchronous Dynamic Load Balancing (ADLB)

Asynchronous Dynamic Load Balancing (ADLB) Asynchronous Dynamic Load Balancing (ADLB) A high-level, non-general-purpose, but easy-to-use programming model and portable library for task parallelism Rusty Lusk Mathema.cs and Computer Science Division

More information

Parallel Programming with MPI on the Odyssey Cluster

Parallel Programming with MPI on the Odyssey Cluster Parallel Programming with MPI on the Odyssey Cluster Plamen Krastev Office: Oxford 38, Room 204 Email: plamenkrastev@fas.harvard.edu FAS Research Computing Harvard University Objectives: To introduce you

More information

Message Passing with MPI

Message Passing with MPI Message Passing with MPI Hristo Iliev (Христо Илиев) PPCES 2012, 2013 Christian Iwainsky PPCES 2011 Rechen- und Kommunikationszentrum (RZ) Agenda Motivation MPI Part 1 Concepts Point-to-point communication

More information

Parallel Debugging with DDT

Parallel Debugging with DDT Parallel Debugging with DDT Nate Woody 3/10/2009 www.cac.cornell.edu 1 Debugging Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece

More information

Dynamic Software Testing of MPI Applications with Umpire

Dynamic Software Testing of MPI Applications with Umpire Dynamic Software Testing of MPI Applications with Umpire Jeffrey S. Vetter Bronis R. de Supinski Center for Applied Scientific Computing Lawrence Livermore National Laboratory Livermore, California, USA

More information

Eliminate Memory Errors and Improve Program Stability

Eliminate Memory Errors and Improve Program Stability Eliminate Memory Errors and Improve Program Stability with Intel Parallel Studio XE Can running one simple tool make a difference? Yes, in many cases. You can find errors that cause complex, intermittent

More information

Load Balancing. computing a file with grayscales. granularity considerations static work load assignment with MPI

Load Balancing. computing a file with grayscales. granularity considerations static work load assignment with MPI Load Balancing 1 the Mandelbrot set computing a file with grayscales 2 Static Work Load Assignment granularity considerations static work load assignment with MPI 3 Dynamic Work Load Balancing scheduling

More information

Application Performance Tools on Discover

Application Performance Tools on Discover Application Performance Tools on Discover Tyler Simon 21 May 2009 Overview 1. ftnchek - static Fortran code analysis 2. Cachegrind - source annotation for cache use 3. Ompp - OpenMP profiling 4. IPM MPI

More information

Parallel Computing. Parallel shared memory computing with OpenMP

Parallel Computing. Parallel shared memory computing with OpenMP Parallel Computing Parallel shared memory computing with OpenMP Thorsten Grahs, 14.07.2014 Table of contents Introduction Directives Scope of data Synchronization OpenMP vs. MPI OpenMP & MPI 14.07.2014

More information

Debugging and Profiling Lab. Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma carlos@tacc.utexas.edu

Debugging and Profiling Lab. Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma carlos@tacc.utexas.edu Debugging and Profiling Lab Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma carlos@tacc.utexas.edu Setup Login to Ranger: - ssh -X username@ranger.tacc.utexas.edu Make sure you can export graphics

More information

The Double-layer Master-Slave Model : A Hybrid Approach to Parallel Programming for Multicore Clusters

The Double-layer Master-Slave Model : A Hybrid Approach to Parallel Programming for Multicore Clusters The Double-layer Master-Slave Model : A Hybrid Approach to Parallel Programming for Multicore Clusters User s Manual for the HPCVL DMSM Library Gang Liu and Hartmut L. Schmider High Performance Computing

More information

Memory Debugging with TotalView on AIX and Linux/Power

Memory Debugging with TotalView on AIX and Linux/Power S cico m P Austin Aug 2004 Memory Debugging with TotalView on AIX and Linux/Power Chris Gottbrath Memory Debugging in AIX and Linux-Power Clusters Intro: Define the problem and terms What are Memory bugs?

More information

Introduction to MPI Programming!

Introduction to MPI Programming! Introduction to MPI Programming! Rocks-A-Palooza II! Lab Session! 2006 UC Regents! 1! Modes of Parallel Computing! SIMD - Single Instruction Multiple Data!!processors are lock-stepped : each processor

More information

Helping you avoid stack overflow crashes!

Helping you avoid stack overflow crashes! Helping you avoid stack overflow crashes! One of the toughest (and unfortunately common) problems in embedded systems is stack overflows and the collateral corruption that it can cause. As a result, we

More information

High Performance Computing in Aachen

High Performance Computing in Aachen High Performance Computing in Aachen Christian Iwainsky iwainsky@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Produktivitätstools unter Linux Sep 16, RWTH Aachen University

More information

Using the Intel Inspector XE

Using the Intel Inspector XE Using the Dirk Schmidl schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) Race Condition Data Race: the typical OpenMP programming error, when: two or more threads access the same memory

More information

Chao Chen 1 Michael Lang 2 Yong Chen 1. IEEE BigData, 2013. Department of Computer Science Texas Tech University

Chao Chen 1 Michael Lang 2 Yong Chen 1. IEEE BigData, 2013. Department of Computer Science Texas Tech University Chao Chen 1 Michael Lang 2 1 1 Data-Intensive Scalable Laboratory Department of Computer Science Texas Tech University 2 Los Alamos National Laboratory IEEE BigData, 2013 Outline 1 2 3 4 Outline 1 2 3

More information

MUSIC Multi-Simulation Coordinator. Users Manual. Örjan Ekeberg and Mikael Djurfeldt

MUSIC Multi-Simulation Coordinator. Users Manual. Örjan Ekeberg and Mikael Djurfeldt MUSIC Multi-Simulation Coordinator Users Manual Örjan Ekeberg and Mikael Djurfeldt March 3, 2009 Abstract MUSIC is an API allowing large scale neuron simulators using MPI internally to exchange data during

More information

Message Passing Interface (MPI)

Message Passing Interface (MPI) Message Passing Interface (MPI) Jalel Chergui Isabelle Dupays Denis Girou Pierre-François Lavallée Dimitri Lecas Philippe Wautelet MPI Plan I 1 Introduction... 7 1.1 Availability and updating... 7 1.2

More information

Message Passing Interface (MPI)

Message Passing Interface (MPI) Message Passing Interface (MPI) Jalel Chergui Isabelle Dupays Denis Girou Pierre-François Lavallée Dimitri Lecas Philippe Wautelet MPI Plan I 1 Introduction... 7 1.1 Availability and updating... 8 1.2

More information

Introduction. Reading. Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF. CMSC 818Z - S99 (lect 5)

Introduction. Reading. Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF. CMSC 818Z - S99 (lect 5) Introduction Reading Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF 1 Programming Assignment Notes Assume that memory is limited don t replicate the board on all nodes Need to provide load

More information

Parallelizing Heavyweight Debugging Tools with MPIecho

Parallelizing Heavyweight Debugging Tools with MPIecho Parallelizing Heavyweight Debugging Tools with MPIecho Barry Rountree rountree@llnl.gov Martin Schulz schulzm@llnl.gov Guy Cobb University of Colorado guy.cobb@gmail.com Bronis R. de Supinski bronis@llnl.gov

More information

Parallel and Distributed Computing Programming Assignment 1

Parallel and Distributed Computing Programming Assignment 1 Parallel and Distributed Computing Programming Assignment 1 Due Monday, February 7 For programming assignment 1, you should write two C programs. One should provide an estimate of the performance of ping-pong

More information

Libmonitor: A Tool for First-Party Monitoring

Libmonitor: A Tool for First-Party Monitoring Libmonitor: A Tool for First-Party Monitoring Mark W. Krentel Dept. of Computer Science Rice University 6100 Main St., Houston, TX 77005 krentel@rice.edu ABSTRACT Libmonitor is a library that provides

More information

Basic Concepts in Parallelization

Basic Concepts in Parallelization 1 Basic Concepts in Parallelization Ruud van der Pas Senior Staff Engineer Oracle Solaris Studio Oracle Menlo Park, CA, USA IWOMP 2010 CCS, University of Tsukuba Tsukuba, Japan June 14-16, 2010 2 Outline

More information

How To Port A Program To Dynamic C (C) (C-Based) (Program) (For A Non Portable Program) (Un Portable) (Permanent) (Non Portable) C-Based (Programs) (Powerpoint)

How To Port A Program To Dynamic C (C) (C-Based) (Program) (For A Non Portable Program) (Un Portable) (Permanent) (Non Portable) C-Based (Programs) (Powerpoint) TN203 Porting a Program to Dynamic C Introduction Dynamic C has a number of improvements and differences compared to many other C compiler systems. This application note gives instructions and suggestions

More information

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming Overview Lecture 1: an introduction to CUDA Mike Giles mike.giles@maths.ox.ac.uk hardware view software view Oxford University Mathematical Institute Oxford e-research Centre Lecture 1 p. 1 Lecture 1 p.

More information

High Performance Computing in Aachen

High Performance Computing in Aachen High Performance Computing in Aachen Christian Iwainsky iwainsky@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Produktivitätstools unter Linux Sep 16, RWTH Aachen University

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Distributed Systems 15-319, spring 2010 11 th Lecture, Feb 16 th Majd F. Sakr Lecture Motivation Understand Distributed Systems Concepts Understand the concepts / ideas

More information

Introduction. dnotify

Introduction. dnotify Introduction In a multi-user, multi-process operating system, files are continually being created, modified and deleted, often by apparently unrelated processes. This means that any software that needs

More information

Performance analysis with Periscope

Performance analysis with Periscope Performance analysis with Periscope M. Gerndt, V. Petkov, Y. Oleynik, S. Benedict Technische Universität München September 2010 Outline Motivation Periscope architecture Periscope performance analysis

More information

Oracle Solaris Studio Code Analyzer

Oracle Solaris Studio Code Analyzer Oracle Solaris Studio Code Analyzer The Oracle Solaris Studio Code Analyzer ensures application reliability and security by detecting application vulnerabilities, including memory leaks and memory access

More information

High-Performance Computing: Architecture and APIs

High-Performance Computing: Architecture and APIs High-Performance Computing: Architecture and APIs Douglas Fuller ASU Fulton High Performance Computing Why HPC? Capacity computing Do similar jobs, but lots of them. Capability computing Run programs we

More information

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber Introduction to grid technologies, parallel and cloud computing Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber OUTLINES Grid Computing Parallel programming technologies (MPI- Open MP-Cuda )

More information

GPU Tools Sandra Wienke

GPU Tools Sandra Wienke Sandra Wienke Center for Computing and Communication, RWTH Aachen University MATSE HPC Battle 2012/13 Rechen- und Kommunikationszentrum (RZ) Agenda IDE Eclipse Debugging (CUDA) TotalView Profiling (CUDA

More information

MPI / ClusterTools Update and Plans

MPI / ClusterTools Update and Plans HPC Technical Training Seminar July 7, 2008 October 26, 2007 2 nd HLRS Parallel Tools Workshop Sun HPC ClusterTools 7+: A Binary Distribution of Open MPI MPI / ClusterTools Update and Plans Len Wisniewski

More information

Sources: On the Web: Slides will be available on:

Sources: On the Web: Slides will be available on: C programming Introduction The basics of algorithms Structure of a C code, compilation step Constant, variable type, variable scope Expression and operators: assignment, arithmetic operators, comparison,

More information

The Asterope compute cluster

The Asterope compute cluster The Asterope compute cluster ÅA has a small cluster named asterope.abo.fi with 8 compute nodes Each node has 2 Intel Xeon X5650 processors (6-core) with a total of 24 GB RAM 2 NVIDIA Tesla M2050 GPGPU

More information

Debugging in Heterogeneous Environments with TotalView. ECMWF HPC Workshop 30 th October 2014

Debugging in Heterogeneous Environments with TotalView. ECMWF HPC Workshop 30 th October 2014 Debugging in Heterogeneous Environments with TotalView ECMWF HPC Workshop 30 th October 2014 Agenda Introduction Challenges TotalView overview Advanced features Current work and future plans 2014 Rogue

More information

Unified Performance Data Collection with Score-P

Unified Performance Data Collection with Score-P Unified Performance Data Collection with Score-P Bert Wesarg 1) With contributions from Andreas Knüpfer 1), Christian Rössel 2), and Felix Wolf 3) 1) ZIH TU Dresden, 2) FZ Jülich, 3) GRS-SIM Aachen Fragmentation

More information

ADVANCED MPI. Dr. David Cronk Innovative Computing Lab University of Tennessee

ADVANCED MPI. Dr. David Cronk Innovative Computing Lab University of Tennessee ADVANCED MPI Dr. David Cronk Innovative Computing Lab University of Tennessee Day 1 Morning - Lecture Course Outline Communicators/Groups Extended Collective Communication One-sided Communication Afternoon

More information

- An Essential Building Block for Stable and Reliable Compute Clusters

- An Essential Building Block for Stable and Reliable Compute Clusters Ferdinand Geier ParTec Cluster Competence Center GmbH, V. 1.4, March 2005 Cluster Middleware - An Essential Building Block for Stable and Reliable Compute Clusters Contents: Compute Clusters a Real Alternative

More information

Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003

Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003 Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003 Josef Pelikán Charles University in Prague, KSVI Department, Josef.Pelikan@mff.cuni.cz Abstract 1 Interconnect quality

More information

Leak Check Version 2.1 for Linux TM

Leak Check Version 2.1 for Linux TM Leak Check Version 2.1 for Linux TM User s Guide Including Leak Analyzer For x86 Servers Document Number DLC20-L-021-1 Copyright 2003-2009 Dynamic Memory Solutions LLC www.dynamic-memory.com Notices Information

More information

Introduction to Synoptic

Introduction to Synoptic Introduction to Synoptic 1 Introduction Synoptic is a tool that summarizes log files. More exactly, Synoptic takes a set of log files, and some rules that tell it how to interpret lines in those logs,

More information

University of Amsterdam - SURFsara. High Performance Computing and Big Data Course

University of Amsterdam - SURFsara. High Performance Computing and Big Data Course University of Amsterdam - SURFsara High Performance Computing and Big Data Course Workshop 7: OpenMP and MPI Assignments Clemens Grelck C.Grelck@uva.nl Roy Bakker R.Bakker@uva.nl Adam Belloum A.S.Z.Belloum@uva.nl

More information

The V-model. Validation and Verification. Inspections [24.3] Testing overview [8, 15.2] - system testing. How much V&V is enough?

The V-model. Validation and Verification. Inspections [24.3] Testing overview [8, 15.2] - system testing. How much V&V is enough? Validation and Verification Inspections [24.3] Testing overview [8, 15.2] - system testing Requirements Design The V-model V & V Plans Implementation Unit tests System tests Integration tests Operation,

More information

Andreas Burghart 6 October 2014 v1.0

Andreas Burghart 6 October 2014 v1.0 Yocto Qt Application Development Andreas Burghart 6 October 2014 Contents 1.0 Introduction... 3 1.1 Qt for Embedded Linux... 3 1.2 Outline... 4 1.3 Assumptions... 5 1.4 Corrections... 5 1.5 Version...

More information

How To Visualize Performance Data In A Computer Program

How To Visualize Performance Data In A Computer Program Performance Visualization Tools 1 Performance Visualization Tools Lecture Outline : Following Topics will be discussed Characteristics of Performance Visualization technique Commercial and Public Domain

More information

System Calls and Standard I/O

System Calls and Standard I/O System Calls and Standard I/O Professor Jennifer Rexford http://www.cs.princeton.edu/~jrex 1 Goals of Today s Class System calls o How a user process contacts the Operating System o For advanced services

More information

Parallel Computing. Shared memory parallel programming with OpenMP

Parallel Computing. Shared memory parallel programming with OpenMP Parallel Computing Shared memory parallel programming with OpenMP Thorsten Grahs, 27.04.2015 Table of contents Introduction Directives Scope of data Synchronization 27.04.2015 Thorsten Grahs Parallel Computing

More information

The RWTH Compute Cluster Environment

The RWTH Compute Cluster Environment The RWTH Compute Cluster Environment Tim Cramer 11.03.2013 Source: D. Both, Bull GmbH Rechen- und Kommunikationszentrum (RZ) How to login Frontends cluster.rz.rwth-aachen.de cluster-x.rz.rwth-aachen.de

More information

An Incomplete C++ Primer. University of Wyoming MA 5310

An Incomplete C++ Primer. University of Wyoming MA 5310 An Incomplete C++ Primer University of Wyoming MA 5310 Professor Craig C. Douglas http://www.mgnet.org/~douglas/classes/na-sc/notes/c++primer.pdf C++ is a legacy programming language, as is other languages

More information

Purify User s Guide. Version 4.1 support@rational.com http://www.rational.com

Purify User s Guide. Version 4.1 support@rational.com http://www.rational.com Purify User s Guide Version 4.1 support@rational.com http://www.rational.com IMPORTANT NOTICE DISCLAIMER OF WARRANTY Rational Software Corporation makes no representations or warranties, either express

More information

Allinea Performance Reports User Guide. Version 6.0.6

Allinea Performance Reports User Guide. Version 6.0.6 Allinea Performance Reports User Guide Version 6.0.6 Contents Contents 1 1 Introduction 4 1.1 Online Resources...................................... 4 2 Installation 5 2.1 Linux/Unix Installation...................................

More information

Java Troubleshooting and Performance

Java Troubleshooting and Performance Java Troubleshooting and Performance Margus Pala Java Fundamentals 08.12.2014 Agenda Debugger Thread dumps Memory dumps Crash dumps Tools/profilers Rules of (performance) optimization 1. Don't optimize

More information

Building Embedded Systems

Building Embedded Systems All Rights Reserved. The contents of this document cannot be reproduced without prior permission of the authors. Building Embedded Systems Chapter 5: Maintenance and Debugging Andreas Knirsch andreas.knirsch@h-da.de

More information

Agenda. Using HPC Wales 2

Agenda. Using HPC Wales 2 Using HPC Wales Agenda Infrastructure : An Overview of our Infrastructure Logging in : Command Line Interface and File Transfer Linux Basics : Commands and Text Editors Using Modules : Managing Software

More information

Load Balancing MPI Algorithm for High Throughput Applications

Load Balancing MPI Algorithm for High Throughput Applications Load Balancing MPI Algorithm for High Throughput Applications Igor Grudenić, Stjepan Groš, Nikola Bogunović Faculty of Electrical Engineering and, University of Zagreb Unska 3, 10000 Zagreb, Croatia {igor.grudenic,

More information

OMPT and OMPD: OpenMP Tools Application Programming Interfaces for Performance Analysis and Debugging

OMPT and OMPD: OpenMP Tools Application Programming Interfaces for Performance Analysis and Debugging OMPT and OMPD: OpenMP Tools Application Programming Interfaces for Performance Analysis and Debugging Alexandre Eichenberger, John Mellor-Crummey, Martin Schulz, Nawal Copty, John DelSignore, Robert Dietrich,

More information

Frysk The Systems Monitoring and Debugging Tool. Andrew Cagney

Frysk The Systems Monitoring and Debugging Tool. Andrew Cagney Frysk The Systems Monitoring and Debugging Tool Andrew Cagney Agenda Two Use Cases Motivation Comparison with Existing Free Technologies The Frysk Architecture and GUI Command Line Utilities Current Status

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 3, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

Secure Programming with Static Analysis. Jacob West jacob@fortify.com

Secure Programming with Static Analysis. Jacob West jacob@fortify.com Secure Programming with Static Analysis Jacob West jacob@fortify.com Software Systems that are Ubiquitous Connected Dependable Complexity U Unforeseen Consequences Software Security Today The line between

More information

The Easiest Way to Run Spark Jobs. How-To Guide

The Easiest Way to Run Spark Jobs. How-To Guide The Easiest Way to Run Spark Jobs How-To Guide The Easiest Way to Run Spark Jobs Recently, Databricks added a new feature, Jobs, to our cloud service. You can find a detailed overview of this feature in

More information

Programming the Intel Xeon Phi Coprocessor

Programming the Intel Xeon Phi Coprocessor Programming the Intel Xeon Phi Coprocessor Tim Cramer cramer@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) Agenda Motivation Many Integrated Core (MIC) Architecture Programming Models Native

More information

System Calls Related to File Manipulation

System Calls Related to File Manipulation KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS Information and Computer Science Department ICS 431 Operating Systems Lab # 12 System Calls Related to File Manipulation Objective: In this lab we will be

More information

Tools for Performance Debugging HPC Applications. David Skinner deskinner@lbl.gov

Tools for Performance Debugging HPC Applications. David Skinner deskinner@lbl.gov Tools for Performance Debugging HPC Applications David Skinner deskinner@lbl.gov Tools for Performance Debugging Practice Where to find tools Specifics to NERSC and Hopper Principles Topics in performance

More information

Get the Better of Memory Leaks with Valgrind Whitepaper

Get the Better of Memory Leaks with Valgrind Whitepaper WHITE PAPER Get the Better of Memory Leaks with Valgrind Whitepaper Memory leaks can cause problems and bugs in software which can be hard to detect. In this article we will discuss techniques and tools

More information

Integrating SNiFF+ with the Data Display Debugger (DDD)

Integrating SNiFF+ with the Data Display Debugger (DDD) 1.1 1 of 5 Integrating SNiFF+ with the Data Display Debugger (DDD) 1. Introduction In this paper we will describe the integration of SNiFF+ with the Data Display Debugger (DDD). First we will start with

More information

Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem. Dr Rosemary Francis, Ellexus

Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem. Dr Rosemary Francis, Ellexus Application-Level Debugging and Profiling: Gaps in the Tool Ecosystem Dr Rosemary Francis, Ellexus For years instruction-level debuggers and profilers have improved in leaps and bounds. Similarly, system-level

More information

Developing Parallel Applications with the Eclipse Parallel Tools Platform

Developing Parallel Applications with the Eclipse Parallel Tools Platform Developing Parallel Applications with the Eclipse Parallel Tools Platform Greg Watson IBM STG grw@us.ibm.com Parallel Tools Platform Enabling Parallel Application Development Best practice tools for experienced

More information

Parallel I/O on Mira Venkat Vishwanath and Kevin Harms

Parallel I/O on Mira Venkat Vishwanath and Kevin Harms Parallel I/O on Mira Venkat Vishwanath and Kevin Harms Argonne Na*onal Laboratory venkat@anl.gov ALCF-2 I/O Infrastructure Mira BG/Q Compute Resource Tukey Analysis Cluster 48K Nodes 768K Cores 10 PFlops

More information

Improve Fortran Code Quality with Static Analysis

Improve Fortran Code Quality with Static Analysis Improve Fortran Code Quality with Static Analysis This document is an introductory tutorial describing how to use static analysis on Fortran code to improve software quality, either by eliminating bugs

More information

GDB Tutorial. A Walkthrough with Examples. CMSC 212 - Spring 2009. Last modified March 22, 2009. GDB Tutorial

GDB Tutorial. A Walkthrough with Examples. CMSC 212 - Spring 2009. Last modified March 22, 2009. GDB Tutorial A Walkthrough with Examples CMSC 212 - Spring 2009 Last modified March 22, 2009 What is gdb? GNU Debugger A debugger for several languages, including C and C++ It allows you to inspect what the program

More information

How to Design and Manage a Multi- Migger for parallel Programs

How to Design and Manage a Multi- Migger for parallel Programs Programming and Computer Software, Vol. 31, No. 1, 2005, pp. 20 28. Translated from Programmirovanie, Vol. 31, No. 1, 2005. Original Russian Text Copyright 2005 by Kalinov, Karganov, Khorenko. An Approach

More information

DiskBoss. File & Disk Manager. Version 2.0. Dec 2011. Flexense Ltd. www.flexense.com info@flexense.com. File Integrity Monitor

DiskBoss. File & Disk Manager. Version 2.0. Dec 2011. Flexense Ltd. www.flexense.com info@flexense.com. File Integrity Monitor DiskBoss File & Disk Manager File Integrity Monitor Version 2.0 Dec 2011 www.flexense.com info@flexense.com 1 Product Overview DiskBoss is an automated, rule-based file and disk manager allowing one to

More information

Optimization tools. 1) Improving Overall I/O

Optimization tools. 1) Improving Overall I/O Optimization tools After your code is compiled, debugged, and capable of running to completion or planned termination, you can begin looking for ways in which to improve execution speed. In general, the

More information

ASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters

ASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters The char Type ASCII Encoding The C char type stores small integers. It is usually 8 bits. char variables guaranteed to be able to hold integers 0.. +127. char variables mostly used to store characters

More information

BG/Q Performance Tools. Sco$ Parker BG/Q Early Science Workshop: March 19-21, 2012 Argonne Leadership CompuGng Facility

BG/Q Performance Tools. Sco$ Parker BG/Q Early Science Workshop: March 19-21, 2012 Argonne Leadership CompuGng Facility BG/Q Performance Tools Sco$ Parker BG/Q Early Science Workshop: March 19-21, 2012 BG/Q Performance Tool Development In conjuncgon with the Early Science program an Early SoMware efforts was inigated to

More information

Lazy OpenCV installation and use with Visual Studio

Lazy OpenCV installation and use with Visual Studio Lazy OpenCV installation and use with Visual Studio Overview This tutorial will walk you through: How to install OpenCV on Windows, both: The pre-built version (useful if you won t be modifying the OpenCV

More information

Developing, Deploying, and Debugging Applications on Windows Embedded Standard 7

Developing, Deploying, and Debugging Applications on Windows Embedded Standard 7 Developing, Deploying, and Debugging Applications on Windows Embedded Standard 7 Contents Overview... 1 The application... 2 Motivation... 2 Code and Environment... 2 Preparing the Windows Embedded Standard

More information