
1 Lecture 6: Introduction to MPI programming

2 MPI (message passing interface)
- MPI is a library standard for programming distributed-memory machines
- MPI implementations are available on almost every major parallel platform (also on shared-memory machines)
- Portability, good performance & functionality
- Collaborative computing by a group of individual processes
- Each process has its own local memory
- Explicit message passing enables information exchange and collaboration between processes
- More info:

3 MPI basics
- The MPI specification is a combination of MPI-1 and MPI-2
- MPI-1 defines a collection of 120+ commands
- MPI-2 is an extension of MPI-1 to handle "difficult" issues
- MPI has language bindings for F77, C and C++
- There also exist, e.g., several MPI modules in Python (more user-friendly)
- Knowledge of the entire MPI standard is not necessary

4 MPI language bindings
C binding:
  #include <mpi.h>
  rc = MPI_Xxxxx(parameter, ... );
Fortran binding:
  include 'mpif.h'
  CALL MPI_XXXXX(parameter, ..., ierr)

5 MPI communicator
- An MPI communicator: a "communication universe" for a group of processes
- MPI_COMM_WORLD: the name of the default MPI communicator, i.e., the collection of all processes
- Each process in a communicator is identified by its rank
- Almost every MPI command needs to provide a communicator as an input argument

6 MPI process rank
- Each process has a unique rank, i.e. an integer identifier, within a communicator
- The rank value is between 0 and #procs-1
- The rank value is used to distinguish one process from another
- The commands MPI_Comm_size & MPI_Comm_rank are very useful
Example:
  int size, my_rank;
  MPI_Comm_size (MPI_COMM_WORLD, &size);
  MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
  if (my_rank==0) {
    ...
  }

7 The 6 most important MPI commands
- MPI_Init - initiate an MPI computation
- MPI_Finalize - terminate the MPI computation and clean up
- MPI_Comm_size - how many processes participate in a given MPI communicator?
- MPI_Comm_rank - which one am I? (A number between 0 and size-1.)
- MPI_Send - send a message to a particular process within an MPI communicator
- MPI_Recv - receive a message from a particular process within an MPI communicator

8 MPI "Hello-world" example #include <stdio.h> #include <mpi.h> int main (int nargs, char** args) { int size, my_rank; MPI_Init (&nargs, &args); MPI_Comm_size (MPI_COMM_WORLD, &size); MPI_Comm_rank (MPI_COMM_WORLD, &my_rank); printf("hello world, I ve rank %d out of %d procs.\n", my_rank,size); MPI_Finalize (); return 0; } Lecture 6: Introduction to MPI programming p. 8

9 MPI "Hello-world" example (cont d) Compilation example: mpicc hello.c Parallel execution example: mpirun -np 4 a.out Order of output from the processes is not determined, may vary from execution to execution Hello world, I ve rank 2 out of 4 procs. Hello world, I ve rank 1 out of 4 procs. Hello world, I ve rank 3 out of 4 procs. Hello world, I ve rank 0 out of 4 procs. Lecture 6: Introduction to MPI programming p. 9

10 The mental picture of parallel execution
- The same MPI program is executed concurrently on each process (Process 0, Process 1, ..., Process P-1).

11 Synchronization
- Many parallel algorithms require that no process proceeds before all the processes have reached the same state at certain points of a program.
- Explicit synchronization:
  int MPI_Barrier (MPI_Comm comm)
- Implicit synchronization, e.g. through pairs of MPI_Send and MPI_Recv.
- Ask yourself the following question: If Process 1 progresses 100 times faster than Process 2, will the final result still be correct?

12 Example: ordered output
  #include <stdio.h>
  #include <mpi.h>
  int main (int nargs, char** args)
  {
    int size, my_rank, i;
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &size);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    for (i=0; i<size; i++) {
      MPI_Barrier (MPI_COMM_WORLD);
      if (i==my_rank) {
        printf("Hello world, I've rank %d out of %d procs.\n", my_rank, size);
        fflush (stdout);
      }
    }
    MPI_Finalize ();
    return 0;
  }

13 Example: ordered output (cont'd)
- The same program runs on Process 0, Process 1, ..., Process P-1.
- The processes synchronize between themselves P times.
- Parallel execution result:
  Hello world, I've rank 0 out of 4 procs.
  Hello world, I've rank 1 out of 4 procs.
  Hello world, I've rank 2 out of 4 procs.
  Hello world, I've rank 3 out of 4 procs.

14 MPI point-to-point communication
- Participation of two different processes
- Several different types of send and receive commands
- Blocking/non-blocking send
- Blocking/non-blocking receive
- Four modes of send operations
- Combined send/receive
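The non-blocking calls are not spelled out on these slides, so the following is only a minimal sketch (an assumption, not part of the lecture) of how a non-blocking MPI_Isend/MPI_Irecv pair, completed with MPI_Wait, might look; it assumes at least two processes:

  #include <stdio.h>
  #include <mpi.h>
  int main (int nargs, char** args)
  {
    int my_rank, value = 42;
    MPI_Request request;
    MPI_Status status;
    MPI_Init (&nargs, &args);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    if (my_rank==0) {
      MPI_Isend (&value, 1, MPI_INT, 1, 10, MPI_COMM_WORLD, &request);
      /* useful computation may overlap with the communication here */
      MPI_Wait (&request, &status);   /* only now may the buffer be reused */
    }
    else if (my_rank==1) {
      MPI_Irecv (&value, 1, MPI_INT, 0, 10, MPI_COMM_WORLD, &request);
      MPI_Wait (&request, &status);   /* only now has the message arrived */
      printf("Process 1 received %d\n", value);
    }
    MPI_Finalize ();
    return 0;
  }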

15 The simplest MPI send command
  int MPI_Send(void *buf, int count, MPI_Datatype datatype,
               int dest, int tag, MPI_Comm comm);
- This blocking send function returns when the data has been delivered to the system and the buffer can be reused.
- The message may not have been received by the destination process.

16 The simplest MPI receive command
  int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
               int source, int tag, MPI_Comm comm, MPI_Status *status);
- This blocking receive function waits until a matching message is received from the system, so that the buffer contains the incoming message.
- Matching involves the data type, the source process (or MPI_ANY_SOURCE) and the message tag (or MPI_ANY_TAG).
- Receiving fewer datatype elements than count is ok, but receiving more is an error.

17 MPI message
- An MPI message is an array of data elements "inside an envelope"
- Data: start address of the message buffer, counter of elements in the buffer, data type
- Envelope: source/destination process, message tag, communicator
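To make the data/envelope distinction concrete, here is a minimal sketch (not taken from the slides) of a matching send/receive pair with the arguments annotated; it assumes at least two processes and a buffer of 100 doubles:

  double buffer[100];                      /* data: start address           */
  MPI_Status status;
  if (my_rank==0)
    MPI_Send (buffer, 100, MPI_DOUBLE,     /* data: address, count, type    */
              1, 7, MPI_COMM_WORLD);       /* envelope: dest, tag, comm     */
  else if (my_rank==1)
    MPI_Recv (buffer, 100, MPI_DOUBLE,     /* data: address, count, type    */
              0, 7, MPI_COMM_WORLD,        /* envelope: source, tag, comm   */
              &status);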

18 MPI Status
- The source or tag of a received message may not be known if wildcard values were used in the receive function.
- In C, MPI_Status is a structure that contains further information. It can be queried as follows:
  status.MPI_SOURCE
  status.MPI_TAG
  MPI_Get_count (MPI_Status *status, MPI_Datatype datatype, int *count);
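As an illustration (a sketch, not from the slides), a receive with wildcard source and tag followed by a query of the status object and of the actual element count might look like this:

  int incoming[50], nreceived;
  MPI_Status status;
  MPI_Recv (incoming, 50, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
            MPI_COMM_WORLD, &status);
  printf("got a message from process %d with tag %d\n",
         status.MPI_SOURCE, status.MPI_TAG);
  MPI_Get_count (&status, MPI_INT, &nreceived);   /* actual number of ints */
  printf("the message contained %d integers\n", nreceived);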

19 Example of MPI_Send and MPI_Recv
  #include <stdio.h>
  #include <mpi.h>
  int main (int nargs, char** args)
  {
    int size, my_rank, flag;
    MPI_Status status;
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &size);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    if (my_rank>0)
      MPI_Recv (&flag, 1, MPI_INT, my_rank-1, 100, MPI_COMM_WORLD, &status);
    printf("Hello world, I've rank %d out of %d procs.\n", my_rank, size);
    if (my_rank<size-1)
      MPI_Send (&my_rank, 1, MPI_INT, my_rank+1, 100, MPI_COMM_WORLD);
    MPI_Finalize ();
    return 0;
  }

20 Example of MPI_Send/MPI_Recv (cont'd)
- The same program runs on Process 0, Process 1, ..., Process P-1.
- Enforcement of ordered output by passing around a "semaphore", using MPI_Send and MPI_Recv
- A successful message handover requires a matching pair of MPI_Send and MPI_Recv

21 MPI collective communication
- A collective operation involves all the processes in a communicator:
  (1) synchronization
  (2) data movement
  (3) collective computation
- Data-movement examples (figure): one-to-all broadcast (MPI_BCAST), all-to-one gather (MPI_GATHER), one-to-all scatter (MPI_SCATTER)
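As a minimal sketch (not part of the slides) of the three data-movement operations above, assume every process owns a segment of 4 doubles and process 0 is the root; the receive buffer on the root is sized here for up to 128 processes:

  int    n = 4;                      /* elements per process                  */
  double a[4], all[4*128];           /* 'all' only significant on the root    */
  double scale = 1.0;
  /* one-to-all broadcast: root 0 sends 'scale' to every process */
  MPI_Bcast (&scale, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  /* one-to-all scatter: root 0 hands each process its own segment of 'all' */
  MPI_Scatter (all, n, MPI_DOUBLE, a, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  /* all-to-one gather: root 0 collects the segments back into 'all' */
  MPI_Gather (a, n, MPI_DOUBLE, all, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

Note that the count argument in MPI_Scatter and MPI_Gather is the number of elements per process, not the total length of the array on the root.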

22 Collective communication (cont'd)
- Collective computation examples (figure): MPI_REDUCE with MPI_MIN and root = 0, MPI_ALLREDUCE with MPI_MIN, MPI_REDUCE with MPI_SUM and root = 1
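A corresponding code sketch (again an illustration, not from the slides) of the reductions listed above, assuming each process holds one double value:

  double my_value, min_on_root, min_everywhere, sum_on_p1;
  /* MPI_Reduce with MPI_MIN: only the root (process 0) receives the minimum */
  MPI_Reduce (&my_value, &min_on_root, 1, MPI_DOUBLE, MPI_MIN,
              0, MPI_COMM_WORLD);
  /* MPI_Allreduce with MPI_MIN: every process receives the minimum */
  MPI_Allreduce (&my_value, &min_everywhere, 1, MPI_DOUBLE, MPI_MIN,
                 MPI_COMM_WORLD);
  /* MPI_Reduce with MPI_SUM: only process 1 receives the sum */
  MPI_Reduce (&my_value, &sum_on_p1, 1, MPI_DOUBLE, MPI_SUM,
              1, MPI_COMM_WORLD);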

23 Computing an inner product in parallel
- Let us write an MPI program that calculates the inner product of two vectors u = (u_1, u_2, ..., u_M) and v = (v_1, v_2, ..., v_M):
  c := sum_{i=1}^{M} u_i v_i
- Partition u and v into P segments, each of sub-length m = M/P
- Each process first concurrently computes its local result
- Then the global result can be computed

24 Making use of collective communication
  MPI_Comm_size (MPI_COMM_WORLD, &num_procs);
  MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);

  /* assumes M is divisible by num_procs */
  my_start = M/num_procs*my_rank;
  my_stop  = M/num_procs*(my_rank+1);

  my_c = 0.;
  for (i=my_start; i<my_stop; i++)
    my_c = my_c + (u[i] * v[i]);

  MPI_Allreduce (&my_c, &c, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
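One possible way to embed the fragment above in a complete program (a sketch; the vectors are simply filled with test values, M is assumed divisible by the number of processes, and every process allocates the full vectors only so that the global indexing of the fragment can stay unchanged):

  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>
  int main (int nargs, char** args)
  {
    const int M = 1000000;
    int num_procs, my_rank, my_start, my_stop, i;
    double *u, *v, my_c, c;
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &num_procs);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    u = malloc(M*sizeof(double));
    v = malloc(M*sizeof(double));
    for (i=0; i<M; i++) { u[i] = 1.0; v[i] = 2.0; }   /* test data */
    my_start = M/num_procs*my_rank;
    my_stop  = M/num_procs*(my_rank+1);
    my_c = 0.;
    for (i=my_start; i<my_stop; i++)
      my_c = my_c + (u[i] * v[i]);
    MPI_Allreduce (&my_c, &c, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    if (my_rank==0) printf("inner product c = %g\n", c);
    free(u); free(v);
    MPI_Finalize ();
    return 0;
  }

In practice each process would allocate and fill only its own segment of length M/P.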

25 Reminder: steps of parallel programming
- Decide a "breakup" of the global problem:
  - functional decomposition: a set of concurrent tasks
  - data parallelism: sub-arrays, sub-loops, sub-domains
- Choose a parallel algorithm (e.g. based on modifying a serial algorithm)
- Design local data structures, if needed
- Standard serial programming plus insertion of MPI calls

26 Calculation of π
- We want to numerically approximate the value of π
- Area of a circle: A = πR^2
- Area of the largest circle that fits into the unit square: π/4, because R = 1/2
- An estimate of the area of this circle gives an estimate of π
- How? Throw a number of random points into the unit square
- Count the percentage of points that lie in the circle, i.e. satisfy (x - 1/2)^2 + (y - 1/2)^2 <= 1/4
- This percentage is an estimate of the area of the circle, i.e. of π/4

27 Parallel calculation of π
  num = npoints/p;
  my_circle_pts = 0;
  for (j=1; j<=num; j++) {
    generate random 0<=x,y<=1
    if (x,y) inside circle
      my_circle_pts += 1
  }
  MPI_Allreduce (&my_circle_pts, &total_count, 1, MPI_INT, MPI_SUM,
                 MPI_COMM_WORLD);
  pi = 4.0*total_count/npoints;
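A complete version of the pseudocode above (a sketch; the per-process random seed and the use of rand() are assumptions, not prescribed by the slides, and npoints is assumed divisible by the number of processes) could look like this:

  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>
  int main (int nargs, char** args)
  {
    const int npoints = 1000000;
    int p, my_rank, num, my_circle_pts = 0, total_count, j;
    double x, y, pi;
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &p);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    srand (my_rank + 1);                 /* a different seed on each process */
    num = npoints/p;
    for (j=1; j<=num; j++) {
      x = (double)rand()/RAND_MAX;       /* random point in the unit square  */
      y = (double)rand()/RAND_MAX;
      if ((x-0.5)*(x-0.5) + (y-0.5)*(y-0.5) <= 0.25)   /* inside the circle? */
        my_circle_pts += 1;
    }
    MPI_Allreduce (&my_circle_pts, &total_count, 1, MPI_INT, MPI_SUM,
                   MPI_COMM_WORLD);
    pi = 4.0*total_count/npoints;
    if (my_rank==0) printf("estimated pi = %g\n", pi);
    MPI_Finalize ();
    return 0;
  }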

28 The issue of load balancing
- What if npoints is not divisible by P?
- Simple solution of load balancing:
  num = npoints/p;
  if (my_rank < (npoints%p))
    num += 1;
- Load balancing is very important for performance
- Homogeneous processes should have as even a distribution of the work load as possible
- (Dynamic) load balancing is nontrivial for real-world parallel computation
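The same idea can also be used to assign index ranges (as in the inner-product example): a small sketch, with the hypothetical variable names rest and my_start, of how the adjusted num translates into a start index so that the first npoints%p processes get one extra item:

  int rest = npoints % p;
  num = npoints / p;
  if (my_rank < rest) {
    num += 1;
    my_start = my_rank * num;
  } else {
    my_start = my_rank * num + rest;
  }
  /* this process handles items my_start, ..., my_start+num-1 */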

29 Exercises
- Write a new Hello World program, where all the processes first generate a text message using sprintf and then send it to Process 0 (you may use strlen(message)+1 to find out the length of the message). Afterwards, Process 0 is responsible for writing out all the messages on the standard output.
- Write three simple parallel programs for adding up a number of random numbers. Each process should first generate and sum up locally an assigned number of random numbers. To find the total sum among all the processes, there are three options:
  - Option 1: let one process be the master and let each process use MPI_Send to send its local sum to the master.
  - Option 2: let one process be the master and let it collect from all the other processes with the MPI_Gather command.
  - Option 3: let one process be the master and make use of the MPI_Reduce command.
- Exercise 4.8 of the textbook.
- Exercise 4.11 of the textbook.

More information