High performance computing systems. Lab 1
|
|
|
- Vernon Todd
- 10 years ago
- Views:
Transcription
1 High performance computing systems Lab 1 Dept. of Computer Architecture Faculty of ETI Gdansk University of Technology Paweł Czarnul For this exercise, study basic MPI functions such as: 1. for MPI management: MPI_Init(...), MPI_Finalize(), Each MPI program should start with MPI_Init(...) and finish with MPI_Finalize(). Each process can fetch the number of processes in the default communicator MPI_COMM_WORLD (the application) by calling MPI_Comm_size (see the example below). Processes in an MPI application are identified by so-called ranks ranging from 0 to n-1 where n is the number of processes returned by MPI_Comm_size(). Based on the rank, each process can perform a part of all required computations so that all processes contribute to the final goal and process all required data. 2. for point-to-point communication: MPI_Send(...), MPI_Recv(...), int MPI_Send(void *buf, int count, MPI_Datatype dtype, int dest, int tag, MPI_Comm comm) MPI_Send sends data pointed by buf to process with rank dest. There should be count elements of data type dtype. For instance, when sending 5 doubles, count should be 5 and dtype should be MPI_DOUBLE. tag can be any number which additionally describes the message and comm can be MPI_COMM_WORLD for the default communicator. int MPI_Recv(void *buf, int count, MPI_Datatype dtype, int src, int tag, MPI_Comm comm, MPI_Status *stat) MPI_Recv is a blocking receive which waits for a message with tag tag from process with rank src in communicator comm. Dtype and count denote the type and the number of elements which are to be received and stored in buf. Stat holds information about the received message. 3. for collective communication: MPI_Barrier(...), MPI_Gather(...), MPI_Scatter(...), MPI_Allgather(...).
2 As an example, int MPI_Reduce(void *sbuf, void* rbuf, int count, MPI_Datatype dtype, MPI_Op op, int root, MPI_Comm comm) reduces all values given by processes in communicator comm to a single value in process with rank root. See the code below for adding numbers given by all processes to a single value in process 0. Study the following tutorial on MPI: The following example computes pi in parallel using an old method from the 17 th century: Pi/4=1/1 1/3 + 1/5 1/7 + 1/9. (1) Note that the program works for any number of processes requested. Successive elements of (1) are assigned to successive processes with ranks from 0 to (proccount-1). For 2 processes: Pi/4 = 1/1 1/3 + 1/5 1/7 + 1/9. process For 3 processes: Pi/4 = 1/1 1/3 + 1/5 1/7 + 1/9 1/11. process etc. This is a simple load balancing technique. For example, checking if successive numbers are prime numbers might involve more time for larger numbers. This strategy balances the execution time among processes quite well. Note that in reality we only consider a predefined number of elements in (1). In general, we should make sure that the data types used for adding the numbers can store resulting subsums. #include <stdio.h> #include <mpi.h>
3 int main(int argc, char **argv) { double precision= ; int myrank,proccount; double pi,pi_final; int mine,sign; int i; // Initialize MPI MPI_Init(&argc, &argv); // find out my rank MPI_Comm_rank(MPI_COMM_WORLD, &myrank); // find out the number of processes in MPI_COMM_WORLD MPI_Comm_size(MPI_COMM_WORLD, &proccount); // now distribute the required precision if (precision<proccount) { printf("precision smaller than the number of processes - try again."); MPI_Finalize(); return -1; } // each process performs computations on its part pi=0; mine=myrank*2+1; sign=(((mine-1)/2)%2)?-1:1; for (;mine<precision;) { // printf("\nprocess %d %d %d", myrank,sign,mine); // fflush(stdout); pi+=sign/(double)mine; mine+=2*proccount; sign=(((mine-1)/2)%2)?-1:1; } // now merge the numbers to rank 0 MPI_Reduce(&pi,&pi_final,1, MPI_DOUBLE,MPI_SUM,0, MPI_COMM_WORLD); if (!myrank) {
4 } pi_final*=4; printf("pi=%f",pi_final); // Shut down MPI MPI_Finalize(); return 0; } Assuming the code was saved in file program.c, we have to: 1. compile the code: mpicc program.c 2. run it 1 process: [klaster@n01 1]$ time mpirun -np 1./a.out real 0m9.286s user 0m9.244s sys 0m0.037s 2 processes: [klaster@n01 1]$ time mpirun -np 2./a.out real 0m4.706s user 0m9.286s sys 0m0.063s 4 processes: [klaster@n01 1]$ time mpirun -np 4./a.out real 0m2.420s user 0m9.380s sys 0m0.118s Note smaller execution times for larger numbers of processes used for computations. Lab 527: For this lab, you can use the default MPI implementation on desxx computers in the lab (XX range from 01 to 18) Open MPI.
5 Compile the code: mpicc program.c create a configuration for the virtual machine in this case just 2 nodes (des01 and des02): student@des01:~> cat > machinefile des01 des02 then invoke the application for 1 process (running on des01): student@des01:~> mpirun -machinefile./machinefile -np 1 time./a.out 9.25user 0.01system 0:09.27elapsed 99%CPU (0avgtext+0avgdata 13008maxresident)k 0inputs+0outputs (0major+1009minor)pagefaults 0swaps and 2 processes (running on des01 and des02): student@des01:~> mpirun -machinefile./machinefile -np 2 time./a.out 4.63user 0.01system 0:04.65elapsed 99%CPU (0avgtext+0avgdata 13072maxresident)k 0inputs+0outputs (0major+1013minor)pagefaults 0swaps 4.63user 0.01system 0:04.67elapsed 99%CPU (0avgtext+0avgdata 13312maxresident)k 0inputs+0outputs (0major+1023minor)pagefaults 0swaps You can create a larger virtual machine and test the scalability of the application. Lab 527: You can also use mpich on desxx: student@des01:~> /opt/mpich/ch-p4/bin/mpicc program.c program.c: In function main : program.c:12:7: warning: unused variable i student@des01:~> scp a.out des02:~ a.out 100% 1427KB 1.4MB/s 00:00 student@des01:~> scp a.out des03:~ a.out 100% 1427KB 1.4MB/s 00:00 student@des01:~> scp a.out des04:~ a.out
6 now run the code: 1 process student@des01:~> /opt/mpich/ch-p4/bin/mpirun -np 1 -machinefile./machinefile./a.out student@des01:~> 2 processes student@des01:~> /opt/mpich/ch-p4/bin/mpirun -np 2 -machinefile./machinefile./a.out student@des01:~> 4 processes student@des01:~> /opt/mpich/ch-p4/bin/mpirun -np 4 -machinefile./machinefile./a.out cluster KASK: reach the cluster by ssh [email protected] X is a number from 1 to 18 The following MPI implementations are available on cluster KASK (use a full path for running mpicc and mpirun): 1. MPICH executables such as mpicc and mpirun available in /opt/mpich2/gnu/bin/ 2. Open MPI - executables in /opt/sun-ct/bin/ 3. MVAPICH executables in /usr/mpi/gcc/mvapich-1.2.0/bin/ Note: the following nodes are available on the cluster: n01 access node compute-0-0 compute-0-1 compute-0-8 Bibliography MPI Docs
Lecture 6: Introduction to MPI programming. Lecture 6: Introduction to MPI programming p. 1
Lecture 6: Introduction to MPI programming Lecture 6: Introduction to MPI programming p. 1 MPI (message passing interface) MPI is a library standard for programming distributed memory MPI implementation(s)
Lightning Introduction to MPI Programming
Lightning Introduction to MPI Programming May, 2015 What is MPI? Message Passing Interface A standard, not a product First published 1994, MPI-2 published 1997 De facto standard for distributed-memory
LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS 2015. Hermann Härtig
LOAD BALANCING DISTRIBUTED OPERATING SYSTEMS, SCALABILITY, SS 2015 Hermann Härtig ISSUES starting points independent Unix processes and block synchronous execution who does it load migration mechanism
To connect to the cluster, simply use a SSH or SFTP client to connect to:
RIT Computer Engineering Cluster The RIT Computer Engineering cluster contains 12 computers for parallel programming using MPI. One computer, cluster-head.ce.rit.edu, serves as the master controller or
Parallelization: Binary Tree Traversal
By Aaron Weeden and Patrick Royal Shodor Education Foundation, Inc. August 2012 Introduction: According to Moore s law, the number of transistors on a computer chip doubles roughly every two years. First
HPCC - Hrothgar Getting Started User Guide MPI Programming
HPCC - Hrothgar Getting Started User Guide MPI Programming High Performance Computing Center Texas Tech University HPCC - Hrothgar 2 Table of Contents 1. Introduction... 3 2. Setting up the environment...
Session 2: MUST. Correctness Checking
Center for Information Services and High Performance Computing (ZIH) Session 2: MUST Correctness Checking Dr. Matthias S. Müller (RWTH Aachen University) Tobias Hilbrich (Technische Universität Dresden)
HP-MPI User s Guide. 11th Edition. Manufacturing Part Number : B6060-96024 September 2007
HP-MPI User s Guide 11th Edition Manufacturing Part Number : B6060-96024 September 2007 Copyright 1979-2007 Hewlett-Packard Development Company, L.P. Table 1 Revision history Edition MPN Description Eleventh
MPI Application Development Using the Analysis Tool MARMOT
MPI Application Development Using the Analysis Tool MARMOT HLRS High Performance Computing Center Stuttgart Allmandring 30 D-70550 Stuttgart http://www.hlrs.de 24.02.2005 1 Höchstleistungsrechenzentrum
Parallel Programming with MPI on the Odyssey Cluster
Parallel Programming with MPI on the Odyssey Cluster Plamen Krastev Office: Oxford 38, Room 204 Email: [email protected] FAS Research Computing Harvard University Objectives: To introduce you
Message Passing with MPI
Message Passing with MPI Hristo Iliev (Христо Илиев) PPCES 2012, 2013 Christian Iwainsky PPCES 2011 Rechen- und Kommunikationszentrum (RZ) Agenda Motivation MPI Part 1 Concepts Point-to-point communication
Load Balancing. computing a file with grayscales. granularity considerations static work load assignment with MPI
Load Balancing 1 the Mandelbrot set computing a file with grayscales 2 Static Work Load Assignment granularity considerations static work load assignment with MPI 3 Dynamic Work Load Balancing scheduling
Introduction to Hybrid Programming
Introduction to Hybrid Programming Hristo Iliev Rechen- und Kommunikationszentrum aixcelerate 2012 / Aachen 10. Oktober 2012 Version: 1.1 Rechen- und Kommunikationszentrum (RZ) Motivation for hybrid programming
MPI-Checker Static Analysis for MPI
MPI-Checker Static Analysis for MPI Alexander Droste, Michael Kuhn, Thomas Ludwig November 15, 2015 Motivation 2 / 39 Why is runtime analysis in HPC challenging? Large amount of resources are used State
Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005
Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005 Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005... 1
Why Choose C/C++ as the programming language? Parallel Programming in C/C++ - OpenMP versus MPI
Parallel Programming (Multi/cross-platform) Why Choose C/C++ as the programming language? Compiling C/C++ on Windows (for free) Compiling C/C++ on other platforms for free is not an issue Parallel Programming
MPI Runtime Error Detection with MUST For the 13th VI-HPS Tuning Workshop
MPI Runtime Error Detection with MUST For the 13th VI-HPS Tuning Workshop Joachim Protze and Felix Münchhalfen IT Center RWTH Aachen University February 2014 Content MPI Usage Errors Error Classes Avoiding
Introduction to MPI Programming!
Introduction to MPI Programming! Rocks-A-Palooza II! Lab Session! 2006 UC Regents! 1! Modes of Parallel Computing! SIMD - Single Instruction Multiple Data!!processors are lock-stepped : each processor
WinBioinfTools: Bioinformatics Tools for Windows Cluster. Done By: Hisham Adel Mohamed
WinBioinfTools: Bioinformatics Tools for Windows Cluster Done By: Hisham Adel Mohamed Objective Implement and Modify Bioinformatics Tools To run under Windows Cluster Project : Research Project between
Parallel Computing. Parallel shared memory computing with OpenMP
Parallel Computing Parallel shared memory computing with OpenMP Thorsten Grahs, 14.07.2014 Table of contents Introduction Directives Scope of data Synchronization OpenMP vs. MPI OpenMP & MPI 14.07.2014
Introduction. Reading. Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF. CMSC 818Z - S99 (lect 5)
Introduction Reading Today MPI & OpenMP papers Tuesday Commutativity Analysis & HPF 1 Programming Assignment Notes Assume that memory is limited don t replicate the board on all nodes Need to provide load
RA MPI Compilers Debuggers Profiling. March 25, 2009
RA MPI Compilers Debuggers Profiling March 25, 2009 Examples and Slides To download examples on RA 1. mkdir class 2. cd class 3. wget http://geco.mines.edu/workshop/class2/examples/examples.tgz 4. tar
Introduction to Cloud Computing
Introduction to Cloud Computing Distributed Systems 15-319, spring 2010 11 th Lecture, Feb 16 th Majd F. Sakr Lecture Motivation Understand Distributed Systems Concepts Understand the concepts / ideas
MARMOT- MPI Analysis and Checking Tool Demo with blood flow simulation. Bettina Krammer, Matthias Müller [email protected], mueller@hlrs.
MARMOT- MPI Analysis and Checking Tool Demo with blood flow simulation Bettina Krammer, Matthias Müller [email protected], [email protected] HLRS High Performance Computing Center Stuttgart Allmandring 30
Debugging with TotalView
Tim Cramer 17.03.2015 IT Center der RWTH Aachen University Why to use a Debugger? If your program goes haywire, you may... ( wand (... buy a magic... read the source code again and again and...... enrich
16 node Linux cluster at SCFBio
Clustering Tutorial What is Clustering? Clustering is the use of multiple computers, typically PCs or UNIX workstations, multiple storage devices, and redundant interconnections, to form what appears to
LS-DYNA Scalability on Cray Supercomputers. Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp.
LS-DYNA Scalability on Cray Supercomputers Tin-Ting Zhu, Cray Inc. Jason Wang, Livermore Software Technology Corp. WP-LS-DYNA-12213 www.cray.com Table of Contents Abstract... 3 Introduction... 3 Scalability
Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ)
Advanced MPI Hybrid programming, profiling and debugging of MPI applications Hristo Iliev RZ Rechen- und Kommunikationszentrum (RZ) Agenda Halos (ghost cells) Hybrid programming Profiling of MPI applications
COMP/CS 605: Introduction to Parallel Computing Lecture 21: Shared Memory Programming with OpenMP
COMP/CS 605: Introduction to Parallel Computing Lecture 21: Shared Memory Programming with OpenMP Mary Thomas Department of Computer Science Computational Science Research Center (CSRC) San Diego State
Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber
Introduction to grid technologies, parallel and cloud computing Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber OUTLINES Grid Computing Parallel programming technologies (MPI- Open MP-Cuda )
University of Amsterdam - SURFsara. High Performance Computing and Big Data Course
University of Amsterdam - SURFsara High Performance Computing and Big Data Course Workshop 7: OpenMP and MPI Assignments Clemens Grelck [email protected] Roy Bakker [email protected] Adam Belloum [email protected]
INF-110. GPFS Installation
INF-110 GPFS Installation Overview Plan the installation Before installing any software, it is important to plan the GPFS installation by choosing the hardware, deciding which kind of disk connectivity
The Double-layer Master-Slave Model : A Hybrid Approach to Parallel Programming for Multicore Clusters
The Double-layer Master-Slave Model : A Hybrid Approach to Parallel Programming for Multicore Clusters User s Manual for the HPCVL DMSM Library Gang Liu and Hartmut L. Schmider High Performance Computing
Parallel and Distributed Computing Programming Assignment 1
Parallel and Distributed Computing Programming Assignment 1 Due Monday, February 7 For programming assignment 1, you should write two C programs. One should provide an estimate of the performance of ping-pong
Automated Testing of Installed Software
Automated Testing of Installed Software or so far, How to validate MPI stacks of an HPC cluster? Xavier Besseron HPC and Computational Science @ FOSDEM 2014 February 1, 2014 Automated Testing of Installed
Parallel I/O on Mira Venkat Vishwanath and Kevin Harms
Parallel I/O on Mira Venkat Vishwanath and Kevin Harms Argonne Na*onal Laboratory [email protected] ALCF-2 I/O Infrastructure Mira BG/Q Compute Resource Tukey Analysis Cluster 48K Nodes 768K Cores 10 PFlops
Analysis and Implementation of Cluster Computing Using Linux Operating System
IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661 Volume 2, Issue 3 (July-Aug. 2012), PP 06-11 Analysis and Implementation of Cluster Computing Using Linux Operating System Zinnia Sultana
Allinea Performance Reports User Guide. Version 6.0.6
Allinea Performance Reports User Guide Version 6.0.6 Contents Contents 1 1 Introduction 4 1.1 Online Resources...................................... 4 2 Installation 5 2.1 Linux/Unix Installation...................................
Message Passing Interface (MPI)
Message Passing Interface (MPI) Jalel Chergui Isabelle Dupays Denis Girou Pierre-François Lavallée Dimitri Lecas Philippe Wautelet MPI Plan I 1 Introduction... 7 1.1 Availability and updating... 8 1.2
Performance and scalability of MPI on PC clusters
CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 24; 16:79 17 (DOI: 1.12/cpe.749) Performance Performance and scalability of MPI on PC clusters Glenn R. Luecke,,
OpenMP & MPI CISC 879. Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware
OpenMP & MPI CISC 879 Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware 1 Lecture Overview Introduction OpenMP MPI Model Language extension: directives-based
Hybrid Programming with MPI and OpenMP
Hybrid Programming with and OpenMP Ricardo Rocha and Fernando Silva Computer Science Department Faculty of Sciences University of Porto Parallel Computing 2015/2016 R. Rocha and F. Silva (DCC-FCUP) Programming
How to Run Parallel Jobs Efficiently
How to Run Parallel Jobs Efficiently Shao-Ching Huang High Performance Computing Group UCLA Institute for Digital Research and Education May 9, 2013 1 The big picture: running parallel jobs on Hoffman2
Charm++, what s that?!
Charm++, what s that?! Les Mardis du dev François Tessier - Runtime team October 15, 2013 François Tessier Charm++ 1 / 25 Outline 1 Introduction 2 Charm++ 3 Basic examples 4 Load Balancing 5 Conclusion
Parallel Astronomical Data Processing or How to Build a Beowulf Class Cluster for High Performance Computing?
Parallel Astronomical Data Processing or How to Build a Beowulf Class Cluster for High Performance Computing? [email protected] Version 1.0 Centre for Astronomy, School of Physics National University
MUSIC Multi-Simulation Coordinator. Users Manual. Örjan Ekeberg and Mikael Djurfeldt
MUSIC Multi-Simulation Coordinator Users Manual Örjan Ekeberg and Mikael Djurfeldt March 3, 2009 Abstract MUSIC is an API allowing large scale neuron simulators using MPI internally to exchange data during
The CNMS Computer Cluster
The CNMS Computer Cluster This page describes the CNMS Computational Cluster, how to access it, and how to use it. Introduction (2014) The latest block of the CNMS Cluster (2010) Previous blocks of the
Grid Engine Basics. Table of Contents. Grid Engine Basics Version 1. (Formerly: Sun Grid Engine)
Grid Engine Basics (Formerly: Sun Grid Engine) Table of Contents Table of Contents Document Text Style Associations Prerequisites Terminology What is the Grid Engine (SGE)? Loading the SGE Module on Turing
GRID Computing: CAS Style
CS4CC3 Advanced Operating Systems Architectures Laboratory 7 GRID Computing: CAS Style campus trunk C.I.S. router "birkhoff" server The CAS Grid Computer 100BT ethernet node 1 "gigabyte" Ethernet switch
SQLITE C/C++ TUTORIAL
http://www.tutorialspoint.com/sqlite/sqlite_c_cpp.htm SQLITE C/C++ TUTORIAL Copyright tutorialspoint.com Installation Before we start using SQLite in our C/C++ programs, we need to make sure that we have
Kommunikation in HPC-Clustern
Kommunikation in HPC-Clustern Communication/Computation Overlap in MPI W. Rehm and T. Höfler Department of Computer Science TU Chemnitz http://www.tu-chemnitz.de/informatik/ra 11.11.2005 Outline 1 2 Optimize
SGE Roll: Users Guide. Version @VERSION@ Edition
SGE Roll: Users Guide Version @VERSION@ Edition SGE Roll: Users Guide : Version @VERSION@ Edition Published Aug 2006 Copyright 2006 UC Regents, Scalable Systems Table of Contents Preface...i 1. Requirements...1
Asynchronous Dynamic Load Balancing (ADLB)
Asynchronous Dynamic Load Balancing (ADLB) A high-level, non-general-purpose, but easy-to-use programming model and portable library for task parallelism Rusty Lusk Mathema.cs and Computer Science Division
CSC230 Getting Starting in C. Tyler Bletsch
CSC230 Getting Starting in C Tyler Bletsch What is C? The language of UNIX Procedural language (no classes) Low-level access to memory Easy to map to machine language Not much run-time stuff needed Surprisingly
ADVANCED MPI. Dr. David Cronk Innovative Computing Lab University of Tennessee
ADVANCED MPI Dr. David Cronk Innovative Computing Lab University of Tennessee Day 1 Morning - Lecture Course Outline Communicators/Groups Extended Collective Communication One-sided Communication Afternoon
Agenda. Using HPC Wales 2
Using HPC Wales Agenda Infrastructure : An Overview of our Infrastructure Logging in : Command Line Interface and File Transfer Linux Basics : Commands and Text Editors Using Modules : Managing Software
Bright Cluster Manager 5.2. User Manual. Revision: 3324. Date: Fri, 30 Nov 2012
Bright Cluster Manager 5.2 User Manual Revision: 3324 Date: Fri, 30 Nov 2012 Table of Contents Table of Contents........................... i 1 Introduction 1 1.1 What Is A Beowulf Cluster?..................
Linux Cluster Computing An Administrator s Perspective
Linux Cluster Computing An Administrator s Perspective Robert Whitinger Traques LLC and High Performance Computing Center East Tennessee State University : http://lxer.com/pub/self2015_clusters.pdf 2015-Jun-14
1.0. User Manual For HPC Cluster at GIKI. Volume. Ghulam Ishaq Khan Institute of Engineering Sciences & Technology
Volume 1.0 FACULTY OF CUMPUTER SCIENCE & ENGINEERING Ghulam Ishaq Khan Institute of Engineering Sciences & Technology User Manual For HPC Cluster at GIKI Designed and prepared by Faculty of Computer Science
Grid 101. Grid 101. Josh Hegie. [email protected] http://hpc.unr.edu
Grid 101 Josh Hegie [email protected] http://hpc.unr.edu Accessing the Grid Outline 1 Accessing the Grid 2 Working on the Grid 3 Submitting Jobs with SGE 4 Compiling 5 MPI 6 Questions? Accessing the Grid Logging
Static Approximation of MPI Communication Graphs for Optimized Process Placement
Static Approximation of MPI Communication Graphs for Optimized Process Placement Andrew J. McPherson 1, Vijay Nagarajan 1, and Marcelo Cintra 2 1 School of Informatics, University of Edinburgh 2 Intel
Libmonitor: A Tool for First-Party Monitoring
Libmonitor: A Tool for First-Party Monitoring Mark W. Krentel Dept. of Computer Science Rice University 6100 Main St., Houston, TX 77005 [email protected] ABSTRACT Libmonitor is a library that provides
System Software for High Performance Computing. Joe Izraelevitz
System Software for High Performance Computing Joe Izraelevitz Agenda Overview of Supercomputers Blue Gene/Q System LoadLeveler Job Scheduler General Parallel File System HPC at UR What is a Supercomputer?
Parallel Computing. Shared memory parallel programming with OpenMP
Parallel Computing Shared memory parallel programming with OpenMP Thorsten Grahs, 27.04.2015 Table of contents Introduction Directives Scope of data Synchronization 27.04.2015 Thorsten Grahs Parallel Computing
Retargeting PLAPACK to Clusters with Hardware Accelerators
Retargeting PLAPACK to Clusters with Hardware Accelerators Manuel Fogué 1 Francisco Igual 1 Enrique S. Quintana-Ortí 1 Robert van de Geijn 2 1 Departamento de Ingeniería y Ciencia de los Computadores.
How To Visualize Performance Data In A Computer Program
Performance Visualization Tools 1 Performance Visualization Tools Lecture Outline : Following Topics will be discussed Characteristics of Performance Visualization technique Commercial and Public Domain
P1 P2 P3. Home (p) 1. Diff (p) 2. Invalidation (p) 3. Page Request (p) 4. Page Response (p)
ËÓØÛÖ ØÖÙØ ËÖ ÅÑÓÖÝ ÓÚÖ ÎÖØÙÐ ÁÒØÖ ÖØØÙÖ ÁÑÔÐÑÒØØÓÒ Ò ÈÖÓÖÑÒ ÅÙÖÐÖÒ ÊÒÖÒ Ò ÄÚÙ ÁØÓ ÔÖØÑÒØ Ó ÓÑÔÙØÖ ËÒ ÊÙØÖ ÍÒÚÖ ØÝ È ØÛÝ ÆÂ ¼¹¼½ ÑÙÖÐÖ ØÓ ºÖÙØÖ ºÙ ØÖØ ÁÒ Ø ÔÔÖ Û Ö Ò ÑÔÐÑÒØØÓÒ Ó ÓØÛÖ ØÖÙØ ËÖ ÅÑÓÖÝ Ëŵ
Network Performance Studies in High Performance Computing Environments
Network Performance Studies in High Performance Computing Environments by Ben Huang Supervised by Dr. Michael Bauer and Dr. Michael Katchabaw Graduate Program in Computer Science A thesis submitted in
Cloud Computing through Virtualization and HPC technologies
Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC
Streamline Computing Linux Cluster User Training. ( Nottingham University)
1 Streamline Computing Linux Cluster User Training ( Nottingham University) 3 User Training Agenda System Overview System Access Description of Cluster Environment Code Development Job Schedulers Running
GPI Global Address Space Programming Interface
GPI Global Address Space Programming Interface SEPARS Meeting Stuttgart, December 2nd 2010 Dr. Mirko Rahn Fraunhofer ITWM Competence Center for HPC and Visualization 1 GPI Global address space programming
Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003
Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003 Josef Pelikán Charles University in Prague, KSVI Department, [email protected] Abstract 1 Interconnect quality
Experiences with HPC on Windows
Experiences with on Christian Terboven [email protected] aachen.de Center for Computing and Communication RWTH Aachen University Server Computing Summit 2008 April 7 11, HPI/Potsdam Experiences with on
The Asterope compute cluster
The Asterope compute cluster ÅA has a small cluster named asterope.abo.fi with 8 compute nodes Each node has 2 Intel Xeon X5650 processors (6-core) with a total of 24 GB RAM 2 NVIDIA Tesla M2050 GPGPU
How to build a Beowulf Cluster Applied to Computational Electromagnetic
How to build a Beowulf Cluster Applied to Computational Electromagnetic Carlos Henrique da Silva Santos Leonardo André Ambrosio Hugo Enrique Hernández Figueroa Department of Microwaves and Optics (DMO)
R and High-Performance Computing
R and High-Performance Computing A (Somewhat Brief and Personal) Overview Dirk Eddelbuettel ISM HPCCON 2015 & ISM HPC on R Workshop The Institute of Statistical Mathematics, Tokyo, Japan October 9-12,
Debugging and Profiling Lab. Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma [email protected]
Debugging and Profiling Lab Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma [email protected] Setup Login to Ranger: - ssh -X [email protected] Make sure you can export graphics
Manual for using Super Computing Resources
Manual for using Super Computing Resources Super Computing Research and Education Centre at Research Centre for Modeling and Simulation National University of Science and Technology H-12 Campus, Islamabad
Notes on the SNOW/Rmpi R packages with OpenMPI and Sun Grid Engine
Notes on the SNOW/Rmpi R packages with OpenMPI and Sun Grid Engine Last updated: 6/2/2008 4:43PM EDT We informally discuss the basic set up of the R Rmpi and SNOW packages with OpenMPI and the Sun Grid
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St Buffalo, NY 14203 Phone: 716-881-8959
HPC at IU Overview. Abhinav Thota Research Technologies Indiana University
HPC at IU Overview Abhinav Thota Research Technologies Indiana University What is HPC/cyberinfrastructure? Why should you care? Data sizes are growing Need to get to the solution faster Compute power is
Hodor and Bran - Job Scheduling and PBS Scripts
Hodor and Bran - Job Scheduling and PBS Scripts UND Computational Research Center Now that you have your program compiled and your input file ready for processing, it s time to run your job on the cluster.
Informatica e Sistemi in Tempo Reale
Informatica e Sistemi in Tempo Reale Introduction to C programming Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa October 25, 2010 G. Lipari (Scuola Superiore Sant Anna)
MapReduce Evaluator: User Guide
University of A Coruña Computer Architecture Group MapReduce Evaluator: User Guide Authors: Jorge Veiga, Roberto R. Expósito, Guillermo L. Taboada and Juan Touriño December 9, 2014 Contents 1 Overview
HPC Applications Scalability. [email protected]
HPC Applications Scalability [email protected] Applications Best Practices LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator D. E. Shaw Research Desmond NWChem MPQC - Massively
Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming
Overview Lecture 1: an introduction to CUDA Mike Giles [email protected] hardware view software view Oxford University Mathematical Institute Oxford e-research Centre Lecture 1 p. 1 Lecture 1 p.
Cloud-based OpenMP Parallelization Using a MapReduce Runtime. Rodolfo Wottrich, Rodolfo Azevedo and Guido Araujo University of Campinas
Cloud-based OpenMP Parallelization Using a MapReduce Runtime Rodolfo Wottrich, Rodolfo Azevedo and Guido Araujo University of Campinas 1 MPI_Init(NULL, NULL); MPI_Comm_size(MPI_COMM_WORLD, &comm_sz); MPI_Comm_rank(MPI_COMM_WORLD,
High Performance Computing. MPI and PETSc
High Performance Computing. MPI and PETSc Mario Storti Centro de Investigación de Métodos Computacionales - CIMEC (CONICET-UNL), Santa Fe, Argentina http://www.cimec.org.ar/mstorti,
Parallel Debugging with DDT
Parallel Debugging with DDT Nate Woody 3/10/2009 www.cac.cornell.edu 1 Debugging Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece
