https://sharepoint.campus.rwth-aachen.de/units/rz/hpc/public/lists/pub...


HPC Team Site of the Center for Computing and Communication (RZ) of the RWTH Aachen University > Publications

Publications with involvement of the HPC team.

Columns: year, title, first author, authors (unsorted), event, status, abstract, slides, paper

2013 Assessing the Performance of OpenMP Programs on the Intel Xeon Phi ; Wienke, Sandra; Cramer, Tim; ; ; Matthias S. Müller Euro-Par
An HPC Application Deployment Model on Azure Cloud for SMEs Ding, Fan Ding, Fan; ; Wienke, Sandra; Zhang, Ruisheng; Li, Lian CLOSER
Towards a Performance Engineering Workflow for OpenMP 4.0 ; ; ; Bischof, ; Matthias S. Müller Performance Engineering MS at ParCo 2013, Munich in press
2013 Accelerators for Technical Computing: Is It Worth the Pain? A TCO Perspective Wienke, Sandra Wienke, Sandra; ; Müller, Matthias S. ISC

2013 Performance Characteristics of Large SMP Machines ; an Mey, ; Matthias S. Müller IWOMP 2013 Slides
Suitability of Performance Tools for OpenMP Task-parallel Programs ; ; Matthias S. Müller Parallel Tools Workshop 2013 unsubmitted
2013 Accelerators, Quo Vadis? Performance vs. Productivity Wienke, Sandra Wienke, Sandra; ; ; Müller, Matthias S. HPCS
Trajectory-Search on ScaleMP's vSMP Architecture Berr, Nicolas ; an Mey, ; Berr, Nicolas; Göbbert, Jens Henrik; Lankes, Stefan ParCo /Content/View.aspx?piid=30397 pdf (internal ac

2012 The Design of OpenMP Thread Affinity Alexandre E. Eichenberger ; ; Alexandre E. Eichenberger, Michael Wong IWOMP
Performance Analysis Techniques for Task-Based OpenMP Applications ; an Mey, ; Peter Philippen, Daniel Lorenz, Rössel, Markus Geimer, Bernd Mohr, Felix Wolf IWOMP 2012 pdf
2012 Assessing OpenMP Tasking Implementations on NUMA Architectures ; ; ; Cramer, Tim IWOMP 2012 pdf

2012 OpenACC - First Experiences with Real-World Applications Wienke, Sandra Wienke, Sandra; ; ; Springer, Paul Martin Euro-Par 2012 Springerlink
2012 Task-parallel Programming on NUMA Architectures ; Cramer, Tim; ; Euro-Par 2012 pdf
2012 Profiling of OpenMP tasks with Score-P D. Lorenz D. Lorenz, P. Philippen, D. F. Wolf Third International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2012) accepted
OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison Cramer, Tim Cramer, Tim; ; Klemm, Michael; MARC
Node-based Memory Management for scalable NUMA Architectures ; Lankes, S.; Bemmerl, T.; Roehl, T. Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers (ROSS)
2012 Score-P -- A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Knüpfer, Andreas Andreas Knüpfer, Rössel, Scott Biersdorff, Kai Diethelm, Dominic Eschweiler, Markus Geimer, Michael Gerndt, Daniel Lorenz, Allen D. Malony, Wolfgang E. Nagel, Yury Oleynik, Peter Philippen, Pavel Saviankou, Sameer S. In Proc. of 5th Parallel Tools Workshop, 2011, Dresden, Germany
2012 MPI Caveats Kapinos, Paul Kapinos, Paul; Paul Kapinos Hans Rüdiger Hammer, Alexander Aures Second International Serpent User Group Meeting presented /mtg/2012_madrid /Hans_Hammer2.pdf
2011 Parallelising Computational Microstructure Simulations for Metallic Materials with OpenMP Altenfeld, Ralph ; Altenfeld, Ralph; Apel, Markus; Boettger, Bernd; Benke, Stefan; Bischof, IWOMP 2011 accepted pdf(internal acc
2011 Simulation of Bevel Gear Cutting with GPGPUs - Performance and Productivity Wienke, Sandra Wienke, Sandra; Plotnikov, Dmytro; ; Bischof, ; Hardjosuwito, Ario; Gorgels, Christof; Brecher, International Supercomputing Conference (ISC) 2011, Hamburg, Germany Springer pdf (1,2 MB)

2011 Towards NUMA Support with Distance Information ; ; International Workshop on OpenMP (IWOMP) /plurk1v42k
Mesh decomposition for efficient parallel computing of electrical machines by means of FEM accounting for motion Böhmer, Stefan Cramer, Tim; Böhmer, Stefan; Lange, Enno; Hafner, Martin; Bischof, ; Hameyer, Kay 18th International Conference on the Computation of Electromagnetic Fields (COMPUMAG 2011) pdf
2011 Numerical simulation of electrical machines by means of a hybrid parallelization using MPI and OpenMP for FEM Böhmer, Stefan Bischof, ; Cramer, Tim; Böhmer, Stefan; Hafner, Martin; Lange, Enno; Hameyer, Kay; Eighth International Conference on Computation in Electromagnetics (CEM 2011) pdf
2011 Brainware for Green HPC Bischof, ; ; Bischof, ENA-HPC International Conference on Energy-Aware High Performance Computing pdf
2011 Enhancing Brainware Productivity through a Performance Tuning Workflow ; ; Bischof, ; Altenfeld, Ralph PROPER Workshop at EuroPar 2011 accepted pdf (internal ac
2011 Score-P -- A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Knüpfer, Andreas A. Knüpfer, C. Rössel, D. S. Biersdorf, K. Diethelm, D. Eschweiler, M. Gerndt, D. Lorenz, A. Malony, W. E. Nagel, Y. Oleynik, P. Philippen, P. Saviankou, D. S. Shende, R. Tschüter, M. Wagner, B. Wesarg, F. Wolf 5th Parallel Tools Workshop (internal use on
2010 Binding Nested OpenMP Programs on Hierarchical Memory Architectures ; ; ; Bücker, Martin IWOMP 2010, Tsukuba, Japan pdf (277 KB)
2010 How to reconcile event-based performance analysis with tasking in OpenMP Lorenz, Daniel ; Lorenz, Daniel; Mohr, Bernd; Rössel, ; Wolf, Felix IWOMP 2010, Tsukuba, Japan pdf (214 KB)
2010 How to scale Nested OpenMP Applications on the ScaleMP vSMP Architecture ; ; ; Wolf, Andreas; Bischof, IEEE Cluster 2010, Heraklion, Greece /xpls/abs_all.jsp
2010 Productivity and Performance Portability of the OpenMP 3.0 Tasking Concept When Applied to an Engineering Code Written in Fortran 95 Kapinos, Paul ; Kapinos, Paul International Journal of Parallel Programming (741,9 KB)
2010 An approach to visualize remote socket traffic on the Intel Nehalem-EX ; ; ; Reichstein, Thomas; Dahnken, Christopher; Semin, Andrey; Bischof PROPER 2010, EuroPar 2010, Ischia, Italy accepted
2010 Score-P -- A Unified Performance Measurement System for Petascale Applications ; S. Biersdorf, C. Bischof, K. Diethelm, D. Eschweiler, M. Gerndt, A. Knüpfer, D. Lorenz, A. D. Malony, W. E. Nagel, Y. Oleynik, C. Rössel, P. Saviankou, D. S. Shende, M. Wagner, B. Wesarg, F. Wolf CiHPC: Competence in High Performance Computing, HPC Status Konferenz der Gauß-Allianz e.V. pdf (internal ac
2009 Exploiting Object-Oriented Abstractions to parallelize Sparse Linear Algebra Codes ; ; Kapinos, Paul; C. Schleiden; I. Merkulow International Conference on Parallel Computing (ParCo 2009), Lyon, France accepted IOS Press Books
2009 SHEMAT-Suite: Nested OpenMP Parallelization on Innovative Shared-Memory Architectures for Geothermal Simulations ; ; ; A. Wolf International Supercomputing Conference (ISC 2009), Hamburg, Germany poster
2009 Object-Oriented OpenMP Programming with C++ and Fortran ; ; Kapinos, Paul; C. Schleiden; I. Merkulow High Performance Computing Symposium (HPCS 2009), Kingston, ON, Canada
2009 Towards better C++ binding for OpenMP ; M. Wong 5th International Workshop on OpenMP, Dresden, Germany poster

2009 Comparing Programmability and Scalability of Multicore Parallelization Paradigms with C++ ; ; C. Schleiden Second Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG-2), Fourth International Conference on High-Performance Embedded Architectures and Compilers (HiPEAC), Paphos, Cyprus pdf
2009 Parallel Simulation of Bevel Gear Cutting Processes with OpenMP Tasks Kapinos, Paul ; Kapinos, Paul International Workshop on OpenMP (IWOMP 2009) Springer (376,9 KB)
2009 Leveraging Multicore Cluster Nodes by adding OpenMP to Flow Solvers parallelized with MPI ; ; Sarholz, Samuel; Altenfeld, Ralph High Performance Computing Symposium (HPCS 2009), Kingston, ON, Canada
2009 Simulation of Primary Breakup for Diesel Spray with Phase Transition Zeng, Peng ; Sarholz, Samuel; Zeng, Peng; Binninger, Bernd; Peters, Norbert; Herrmann, Marcus EuroPVM/MPI
Data and Thread Affinity in OpenMP Programs ; ; ; H. Jin; M. Wagner Workshop "Memory Access on future Processors: A solved problem?", ACM International Conference on Computing Frontiers, Ischia, Italy /citation.cfm?id
2008 First Experiences with Intel Cluster OpenMP ; ; ; M. Wagner IWOMP 2008, West Lafayette, IN, USA; Lecture Notes in Computer Science, Springer, Vol 5004, p Springer pdf (301 KB)

Performance Evaluation of a Multi-Zone Application in Different OpenMP Approaches H. Jin ; H. Jin; B. Chapman; L. Huang; T. Reichstein International Journal of Parallel Programming, Vol 36, Number 3 / June 2008, p Springer (684 KB)
2008 Adding New Dimensions to Performance Analysis through User Defined Objects G. Jost ; G. Jost, O. Mazurov OpenMP Shared Memory Parallel Programming, Lecture Notes in Computer Science, Springer, Vol 4315, p Springer (650 KB)
2008 C++ and OpenMP ; OpenMP Shared Memory Parallel Programming, Lecture Notes in Computer Science, Springer, Vol 4315, p Springer (276 KB)
2008 Experiences with the OpenMP Parallelization of DROPS, a Navier-Stokes Solver written in C++ ; ; A. Spiegel; S. Gross; V. Reichelt OpenMP Shared Memory Parallel Programming, Lecture Notes in Computer Science, Springer, Vol 4315, p Springer (296 KB)
2008 Comparing the Usability of Performance Analysis Tools ; PROPER 2008, EuroPar, Las Palmas de Gran Canaria, Spain
2007 Petaflops Basics - Performance from SMP Building Blocks C. Bischof ; ; Sarholz, Samuel; C. Bischof Petascale Computing: Algorithms and Applications, Chapman & Hall/CRC Computational Science Series, editor D. Bader CRC Press pdf (internal ac

2007 UltraSPARC T2 ("Niagara 2") for HPC ; SunHPC Consortium Meeting, Reno
2007 First Experiences with Intel Cluster OpenMP ; ; ; M. Wagner 3. Tagung über die Kommunikation in Clusterrechnern und Clusterverbundsystemen (KiCC), p 61-76, Aachen RWTH Bibliothek p
Nested Parallelization with OpenMP ; ; Sarholz, Samuel International Journal of Parallel Programming, Volume 35, Number 5, p Springer (337 KB)
2007 Comparing Intel Thread Checker and Sun Thread Analyzer Minisymposium on "Scalability and Usability of HPC Programming Tools", PARCO2007, Aachen/Jülich VI-HPS
2007 Affinity Matters! ; Minisymposium on "The Future of OpenMP in the Multi-Core Era", PARCO2007, Aachen/Jülich
2007 Parallel computers everywhere C. Bischof ; ; Sarholz, Samuel; Bischof, 16th International Conference on the Computation of Electromagnetic Fields (Compumag 2007), Aachen, Germany keynote
2007 OpenMP on Multicore Architectures ; ; Sarholz, Samuel International Workshop on OpenMP (IWOMP 2007), page electronically, Beijing, China Springer (227 KB)
2007 Exploiting Multicore Architectures for Physically Based Simulation of Deformable Objects in Virtual Environments L. Jerabkova ; Sarholz, Samuel; L. Jerabkova; T. Kuhlen; C. Bischof Virtuelle und Erweiterte Realität, 4. Workshop der GI-Fachgruppe VR/AR Shaker Verlag
2007 C++ and OpenMP ParCo

2006 Shared-Memory Parallelisierung von C++ Programmen Master's thesis, RWTH Aachen University
2006 Nested Parallelization of the Flow Solver TFS using the ParaWise Parallelization Environment S. Johnson ; S. Johnson; C. Ierotheou; A. Spiegel; I. Hörschler IWOMP 2006, Reims, LNCS Springer (374 KB)
2006 Adding New Dimensions to Performance Analysis through User Defined Objects G. Jost ; G. Jost; O. Mazurov IWOMP 2006, Reims, Lecture Notes in Computer Science, Springer Springer (650 KB)
2006 C++ and OpenMP ; IWOMP 2006, Reims, LNCS
D Critical Points Computed by Nested OpenMP A. Gerndt ; Sarholz, Samuel; A. Gerndt; M. Wolter; T. Kuhlen; C. Bischof Alan Heirich, Bruno Raffin, Luis P. dos Santos (eds.), Short Papers Proceedings, EGPGV'06, Braga, Portugal, Eurographics / ACM SIGGRAPH, pp IEEE Xplore (995 KB)
D Critical Points Computed by Nested OpenMP A. Gerndt ; Sarholz, Samuel; A. Gerndt; M. Wolter; T. Kuhlen; C. Bischof SC'06, Nov 2006, Tampa, USA
2006 Shared-Memory Parallelization for Content-based Image Retrieval ; T. Deselaers; C. Bischof; H. Ney Workshop 2006 on Computation Intensive Methods for Computer Vision Homepage
2006 Efficient Task Scheduling in the Parallel Result-Verifying Solution of Nonlinear Systems T. Beelitz B. Lang; C. Bischof Reliable Computing 12(2):
Experiences with the OpenMP Parallelization of DROPS, a Navier-Stokes Solver written in C++ ; A. Spiegel; S. Gross; V. Reichelt First International Workshop on OpenMP (IWOMP 2005), Oregon, USA Springer (296 KB)
2005 Parallelization of the C++ Navier-Stokes Solver DROPS with OpenMP ; A. Spiegel; S. Gross; V. Reichelt G. R. Joubert, W. E. Nagel, F. J. Peters, O. Plata, P. Tirado, and E. Zapata, editors, Parallel Computing (ParCo 2005): Current & Future Issues of High-End Computing, volume 33 of NIC Series, pages , Malaga, Spain
2005 Parallelization of the C++ Navier-Stokes Solver DROPS with OpenMP ; A. Spiegel; S. Gross; V. Reichelt Proc. ParCo 2005, Malaga, Spain; Vol 33 in the NIC book series, Research Centre Jülich, Germany
2005 Experiences with the OpenMP Parallelization of DROPS, a Navier-Stokes Solver written in C++ ; ; A. Spiegel; S. Gross; V. Reichelt IWOMP 2005, Eugene, USA pdf
2005 Parallel calculation of accurate path lines in virtual environments through exploitation of multi-block CFD data set topology A. Gerndt M. Schirski; T. Kuhlen; C. Bischof J. of Mathematical Modeling and Algorithms, 4(1):
Hybrid Parallelization of CFD Applications with Dynamic Thread Balancing A. Spiegel ; A. Spiegel; C. Bischof Proc. PARA04 Workshop, Lyngby, Denmark, June 2004; Lecture Notes in Computer Science 3732, J. Dongarra, K. Madsen, and J. Wasniewski, Eds., pp , Springer Verlag,
Automatic scoping of variables in parallel regions of an OpenMP program Y. Lin ; ; N. Copty Workshop on OpenMP Applications and Tools (WOMPAT 2004), page electronically, Houston, USA
2004 Hybrid Parallelization with Dynamic Thread Balancing on a ccNUMA System A. Spiegel ; A. Spiegel EWOMP'04, Stockholm, Sweden
2004 Hybrid Parallelization of CFD Applications with Dynamic Thread Balancing A. Spiegel ; A. Spiegel; C. Bischof PARA04, Copenhagen, Denmark
2004 Efficient two-trait-locus linkage analysis through program optimization and parallelization: application to hypercholesterolemia Johannes Dietter ; Alexander Spiegel, Hans-Joachim Pflug, Hussam Al-Kateb, Katrin Hoffmann, Thomas F Wienker, Konstantin Strauch European Journal of Human Genetics 12:
Die Zukunft ist parallel - Perspektiven des Hochleistungsrechnens Talk and discussion at IMBIE, University of Bonn, and at the RZ of RWTH Aachen
2003 Two OpenMP Programming Patterns EWOMP'03, Aachen, Germany
2003 Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster G. Jost ; G. Jost; H. Jin, F. F. Hatay EWOMP'03, Aachen, Germany
2003 Comparing the OpenMP, MPI, and Hybrid Programming Paradigms on an SMP Cluster G. Jost ; G. Jost; H. Jin, F. F. Hatay NASA Advanced Supercomputing Division Technical Report NAS NASA,
TFLOPS in Production Mode Sun HPC Consortium Meeting, Glasgow, UK
2002 Explicit Loop Scheduling in OpenMP for Parallel Automatic Differentiation H. M. Bücker ; H. Martin Bücker; ; B. Lang, A. Rasch, C. Bischof J. N. Almhana and V. C. Bhavsar (Eds.), Proc. 16th Annual Int. Symp. on High Performance Computing Systems and Applications, Moncton, NB, Canada, June 16-19, 2002, pp ; IEEE Computer Soc. Press
2002 Pushing Loop-Level Parallelization to the Limit ; T. Haarmann EWOMP'02, Rome, Italy
2002 Computational Engineering & Science an der RWTH Aachen C. Bischof ; Bischof, ; K. Brühl, Th. Eifert PIK 25 (2002), pp , K.G. Saur Verlag GmbH
2001 Bringing together automatic differentiation and OpenMP H. Martin Bücker ; H. Martin Bücker; Bruno Lang, H. Bischof ICS 2001: ACM (159 KB)
2000 HP's Software Development Environment - Practice and Experiment HPCUG 2000, San Jose, USA
2000 Vom Vektorrechner zum SMP-Cluster - Hybride Parallelisierung des CFD-Codes PANTA ZKI AK Supercomputing, Zeuthen
2000 HP's Software Development Environment - Practice and Experiment ; Stephan Schmidt Hiper 2000, Barcelona, Spain
2000 From a Vector Computer to an SMP-Cluster - Hybrid Parallelization of the CFD Code PANTA ; Stephan Schmidt EWOMP 2000, Edinburgh
1999 An HP V-Class Server in a Supercomputer Environment and First Experiences with OpenMP ; Stephan Schmidt HiPer '99, Tromso, Norway
1999 Sind PC-Cluster reif für Supercomputing? Talk at the colloquium "PC-Technologie als Basis für paralleles Hochleistungsrechnen" on the occasion of the cooperation between RWTH Aachen and Siemens AG, Product Center HPC, at RWTH Aachen
1999 Are PC-Clusters ready for Supercomputing? SPEC Workshop "Benchmarking Parallel and High-Performance Computing Systems", Wuppertal
1999 Der HP V-Class Server im Rechenzentrum der RWTH Aachen, Erste Erfahrungen mit parallelen Anwendungen Talk at the colloquium on the occasion of the introduction of the HP V-Class computer at RWTH Aachen
1997 The energetically preferred orientation of the hydroxyl group in cyclohexanol - Ab initio and force field calculations C. Jansen ; C. Jansen, G. Raabe, J. Fleischhauer Theochem (1997)

Performance Characteristics of Large SMP Machines

Performance Characteristics of Large SMP Machines Performance Characteristics of Large SMP Machines Dirk Schmidl, Dieter an Mey, Matthias S. Müller schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) Agenda Investigated Hardware Kernel Benchmark

More information

Unified Performance Data Collection with Score-P

Unified Performance Data Collection with Score-P Unified Performance Data Collection with Score-P Bert Wesarg 1) With contributions from Andreas Knüpfer 1), Christian Rössel 2), and Felix Wolf 3) 1) ZIH TU Dresden, 2) FZ Jülich, 3) GRS-SIM Aachen Fragmentation

More information

Case Study on Productivity and Performance of GPGPUs

Case Study on Productivity and Performance of GPGPUs Case Study on Productivity and Performance of GPGPUs Sandra Wienke wienke@rz.rwth-aachen.de ZKI Arbeitskreis Supercomputing April 2012 Rechen- und Kommunikationszentrum (RZ) RWTH GPU-Cluster 56 Nvidia

More information

Jens Doleschal (TUD) Jens Doleschal (TUD) Jens Doleschal (TUD) 1.0 26/09/2012 Final version of the deliverable Jens Doleschal (TUD)

Jens Doleschal (TUD) Jens Doleschal (TUD) Jens Doleschal (TUD) 1.0 26/09/2012 Final version of the deliverable Jens Doleschal (TUD) Version Date Comments, Changes, Status Authors, contributors, reviewers 0.1 24/08/2012 First full version of the deliverable Jens Doleschal (TUD) 0.1 03/09/2012 Review Ben Hall (UCL) 0.1 13/09/2012 Review

More information

Parallel Programming Survey

Parallel Programming Survey Christian Terboven 02.09.2014 / Aachen, Germany Stand: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory

More information

Application Performance Analysis Tools and Techniques

Application Performance Analysis Tools and Techniques Mitglied der Helmholtz-Gemeinschaft Application Performance Analysis Tools and Techniques 2012-06-27 Christian Rössel Jülich Supercomputing Centre c.roessel@fz-juelich.de EU-US HPC Summer School Dublin

More information

Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster Gabriele Jost and Haoqiang Jin NAS Division, NASA Ames Research Center, Moffett Field, CA 94035-1000 {gjost,hjin}@nas.nasa.gov

More information

for High Performance Computing

for High Performance Computing Technische Universität München Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation Automatic Performance Engineering Workflows for High Performance Computing Ventsislav Petkov

More information

Search Strategies for Automatic Performance Analysis Tools

Search Strategies for Automatic Performance Analysis Tools Search Strategies for Automatic Performance Analysis Tools Michael Gerndt and Edmond Kereku Technische Universität München, Fakultät für Informatik I10, Boltzmannstr.3, 85748 Garching, Germany gerndt@in.tum.de

More information

Tools for Analysis of Performance Dynamics of Parallel Applications

Tools for Analysis of Performance Dynamics of Parallel Applications Tools for Analysis of Performance Dynamics of Parallel Applications Yury Oleynik Fourth International Workshop on Parallel Software Tools and Tool Infrastructures Technische Universität München Yury Oleynik,

More information

Distributed communication-aware load balancing with TreeMatch in Charm++

Distributed communication-aware load balancing with TreeMatch in Charm++ Distributed communication-aware load balancing with TreeMatch in Charm++ The 9th Scheduling for Large Scale Systems Workshop, Lyon, France Emmanuel Jeannot Guillaume Mercier Francois Tessier In collaboration

More information

High Performance Computing in Aachen

High Performance Computing in Aachen High Performance Computing in Aachen Samuel Sarholz sarholz@rz.rwth aachen.de Center for Computing and Communication RWTH Aachen University HPC unter Linux Sep 15, RWTH Aachen Agenda o Hardware o Development

More information

Multicore Parallel Computing with OpenMP

Multicore Parallel Computing with OpenMP Multicore Parallel Computing with OpenMP Tan Chee Chiang (SVU/Academic Computing, Computer Centre) 1. OpenMP Programming The death of OpenMP was anticipated when cluster systems rapidly replaced large

More information

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Assessing the Performance of OpenMP Programs on the Intel Xeon Phi Dirk Schmidl, Tim Cramer, Sandra Wienke, Christian Terboven, and Matthias S. Müller schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum

More information

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

More information

Software Distributed Shared Memory Scalability and New Applications

Software Distributed Shared Memory Scalability and New Applications Software Distributed Shared Memory Scalability and New Applications Mats Brorsson Department of Information Technology, Lund University P.O. Box 118, S-221 00 LUND, Sweden email: Mats.Brorsson@it.lth.se

More information

An HPC Application Deployment Model on Azure Cloud for SMEs

An HPC Application Deployment Model on Azure Cloud for SMEs An HPC Application Deployment Model on Azure Cloud for SMEs Fan Ding CLOSER 2013, Aachen, Germany, May 9th,2013 Rechen- und Kommunikationszentrum (RZ) Agenda Motivation Windows Azure Relevant Technology

More information

A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs

A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs Edmond Kereku, Tianchao Li, Michael Gerndt, and Josef Weidendorfer Institut für Informatik, Technische Universität München,

More information

OpenMP Programming on ScaleMP

OpenMP Programming on ScaleMP OpenMP Programming on ScaleMP Dirk Schmidl schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) MPI vs. OpenMP MPI distributed address space explicit message passing typically code redesign

More information

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France

More information

Performance Analysis of a Hybrid MPI/OpenMP Application on Multi-core Clusters

Performance Analysis of a Hybrid MPI/OpenMP Application on Multi-core Clusters Performance Analysis of a Hybrid MPI/OpenMP Application on Multi-core Clusters Martin J. Chorley a, David W. Walker a a School of Computer Science and Informatics, Cardiff University, Cardiff, UK Abstract

More information

FD4: A Framework for Highly Scalable Dynamic Load Balancing and Model Coupling

FD4: A Framework for Highly Scalable Dynamic Load Balancing and Model Coupling Center for Information Services and High Performance Computing (ZIH) FD4: A Framework for Highly Scalable Dynamic Load Balancing and Model Coupling Symposium on HPC and Data-Intensive Applications in Earth

More information

Combining Instrumentation and Sampling for Trace-based Application Performance Analysis

Combining Instrumentation and Sampling for Trace-based Application Performance Analysis Combining Instrumentation and Sampling for Trace-based Application Performance Analysis Thomas Ilsche, Joseph Schuchart, Robert Schöne, and Daniel Hackenberg Abstract Performance analysis is vital for

More information

Score-P A Unified Performance Measurement System for Petascale Applications

Score-P A Unified Performance Measurement System for Petascale Applications Score-P A Unified Performance Measurement System for Petascale Applications Dieter an Mey(d), Scott Biersdorf(h), Christian Bischof(d), Kai Diethelm(c), Dominic Eschweiler(a), Michael Gerndt(g), Andreas

More information

Towards an Implementation of the OpenMP Collector API

Towards an Implementation of the OpenMP Collector API John von Neumann Institute for Computing Towards an Implementation of the OpenMP Collector API Van Bui, Oscar Hernandez, Barbara Chapman, Rick Kufrin, Danesh Tafti, Pradeep Gopalkrishnan published in Parallel

More information

Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München

Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München SuperMUC: 3 Petaflops (3*10 15 =quadrillion), 3 MW 2 TOP 500 List TOTAL #1 #500

More information

Online Performance Observation of Large-Scale Parallel Applications

Online Performance Observation of Large-Scale Parallel Applications 1 Online Observation of Large-Scale Parallel Applications Allen D. Malony and Sameer Shende and Robert Bell {malony,sameer,bertie}@cs.uoregon.edu Department of Computer and Information Science University

More information

Improving Time to Solution with Automated Performance Analysis

Improving Time to Solution with Automated Performance Analysis Improving Time to Solution with Automated Performance Analysis Shirley Moore, Felix Wolf, and Jack Dongarra Innovative Computing Laboratory University of Tennessee {shirley,fwolf,dongarra}@cs.utk.edu Bernd

More information

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,

More information

GASPI A PGAS API for Scalable and Fault Tolerant Computing

GASPI A PGAS API for Scalable and Fault Tolerant Computing GASPI A PGAS API for Scalable and Fault Tolerant Computing Specification of a general purpose API for one-sided and asynchronous communication and provision of libraries, tools, examples and best practices

More information

HOPSA Project. Technical Report. A Workflow for Holistic Performance System Analysis

HOPSA Project. Technical Report. A Workflow for Holistic Performance System Analysis HOPSA Project Technical Report A Workflow for Holistic Performance System Analysis September 2012 Felix Wolf 1,2,3, Markus Geimer 2, Judit Gimenez 4, Juan Gonzalez 4, Erik Hagersten 5, Thomas Ilsche 6,

More information

MPI and Hybrid Programming Models. William Gropp www.cs.illinois.edu/~wgropp

MPI and Hybrid Programming Models. William Gropp www.cs.illinois.edu/~wgropp MPI and Hybrid Programming Models William Gropp www.cs.illinois.edu/~wgropp 2 What is a Hybrid Model? Combination of several parallel programming models in the same program May be mixed in the same source

More information

Load Balancing Support for Grid-enabled Applications

Load Balancing Support for Grid-enabled Applications John von Neumann Institute for Computing Load Balancing Support for Grid-enabled Applications S. Rips published in Parallel Computing: Current & Future Issues of High-End Computing, Proceedings of the

More information

Sino-German Workshop on Cloud-based High Performance Computing. September 26-October 1, 2011, Shanghai, China

Sino-German Workshop on Cloud-based High Performance Computing. September 26-October 1, 2011, Shanghai, China Sino-German Workshop on Cloud-based High Performance Computing September 26-October 1, 2011, Shanghai, China The workshop consists of an opening ceremony, 10 sessions, a mini-workshop for PhD students

More information

HPC enabling of OpenFOAM R for CFD applications

HPC enabling of OpenFOAM R for CFD applications HPC enabling of OpenFOAM R for CFD applications Towards the exascale: OpenFOAM perspective Ivan Spisso 25-27 March 2015, Casalecchio di Reno, BOLOGNA. SuperComputing Applications and Innovation Department,

More information

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM Pablo Reble, Stefan Lankes, Carsten Clauss, Thomas Bemmerl Chair for Operating Systems, RWTH Aachen University Kopernikusstr. 16,

More information

Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer

Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Stan Posey, MSc and Bill Loewe, PhD Panasas Inc., Fremont, CA, USA Paul Calleja, PhD University of Cambridge,

More information

Introduction to the TAU Performance System

Introduction to the TAU Performance System Introduction to the TAU Performance System Leap to Petascale Workshop 2012 at Argonne National Laboratory, ALCF, Bldg. 240,# 1416, May 22-25, 2012, Argonne, IL Sameer Shende, U. Oregon sameer@cs.uoregon.edu

More information

Analyzing Overheads and Scalability Characteristics of OpenMP Applications

Analyzing Overheads and Scalability Characteristics of OpenMP Applications Analyzing Overheads and Scalability Characteristics of OpenMP Applications Karl Fürlinger and Michael Gerndt Technische Universität München Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation

More information

Load Balancing MPI Algorithm for High Throughput Applications

Load Balancing MPI Algorithm for High Throughput Applications Load Balancing MPI Algorithm for High Throughput Applications Igor Grudenić, Stjepan Groš, Nikola Bogunović Faculty of Electrical Engineering and, University of Zagreb Unska 3, 10000 Zagreb, Croatia {igor.grudenic,

A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering

A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering Gaby Aguilera, Patricia J. Teller, Michela Taufer, and

Scheduling Task Parallelism on Multi-Socket Multicore Systems

Scheduling Task Parallelism on Multi-Socket Multicore Systems Scheduling Task Parallelism on Multi-Socket Multicore Systems Stephen Olivier, UNC Chapel Hill Allan Porterfield, RENCI Kyle Wheeler, Sandia National Labs Jan Prins, UNC Chapel Hill Outline Introduction

Scalability evaluation of barrier algorithms for OpenMP

Scalability evaluation of barrier algorithms for OpenMP Scalability evaluation of barrier algorithms for OpenMP Ramachandra Nanjegowda, Oscar Hernandez, Barbara Chapman and Haoqiang H. Jin High Performance Computing and Tools Group (HPCTools) Computer Science

ALI JANNESARI Head of Multicore Programming Group Department of Computer Science, Technical University Darmstadt, Germany

ALI JANNESARI Head of Multicore Programming Group Department of Computer Science, Technical University Darmstadt, Germany ALI JANNESARI Head of Multicore Programming Group Department of Computer Science, Technical University Darmstadt, Germany Contact Web Education jannesari@cs.tu-darmstadt.de http://www.parallel.informatik.tu-darmstadt.de/multicore-group/

Unleashing the Performance Potential of GPUs for Atmospheric Dynamic Solvers

Unleashing the Performance Potential of GPUs for Atmospheric Dynamic Solvers Unleashing the Performance Potential of GPUs for Atmospheric Dynamic Solvers Haohuan Fu haohuan@tsinghua.edu.cn High Performance Geo-Computing (HPGC) Group Center for Earth System Science Tsinghua University

Big Data Management in the Clouds and HPC Systems

Big Data Management in the Clouds and HPC Systems Big Data Management in the Clouds and HPC Systems Hemera Final Evaluation Paris 17th December 2014 Shadi Ibrahim Shadi.ibrahim@inria.fr Era of Big Data! Source: CNRS Magazine 2013 2 Era of Big Data! Source:

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC HPC Architecture End to End Alexandre Chauvin Agenda HPC Software Stack Visualization National Scientific Center 2 Agenda HPC Software Stack Alexandre Chauvin Typical HPC Software Stack Externes LAN Typical

Data Centric Systems (DCS)

Data Centric Systems (DCS) Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems

Part I Courses Syllabus

Part I Courses Syllabus Part I Courses Syllabus This document provides detailed information about the basic courses of the MHPC first part activities. The list of courses is the following 1.1 Scientific Programming Environment

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM

Petascale Software Challenges. William Gropp www.cs.illinois.edu/~wgropp

Petascale Software Challenges. William Gropp www.cs.illinois.edu/~wgropp Petascale Software Challenges William Gropp www.cs.illinois.edu/~wgropp Petascale Software Challenges Why should you care? What are they? Which are different from non-petascale? What has changed since

Analysis of Parallel Software Development using the

Analysis of Parallel Software Development using the CTWatch Quarterly November 2006 46 Analysis of Parallel Software Development using the Relative Development Time Productivity Metric Introduction As the need for ever greater computing power begins to

HPC Wales Skills Academy Course Catalogue 2015

HPC Wales Skills Academy Course Catalogue 2015 HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses

Experiences with HPC on Windows

Experiences with HPC on Windows Experiences with HPC on Windows Christian Terboven terboven@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Server Computing Summit 2008 April 7-11, HPI/Potsdam Experiences with HPC on Windows

Profiling of Task-based Applications on Shared Memory Machines: Scalability and Bottlenecks

Profiling of Task-based Applications on Shared Memory Machines: Scalability and Bottlenecks Profiling of Task-based Applications on Shared Memory Machines: Scalability and Bottlenecks Ralf Hoffmann and Thomas Rauber Department for Mathematics, Physics and Computer Science University of Bayreuth,

High Performance Computing in the Multi-core Area

High Performance Computing in the Multi-core Area High Performance Computing in the Multi-core Area Arndt Bode Technische Universität München Technology Trends for Petascale Computing Architectures: Multicore Accelerators Special Purpose Reconfigurable

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC2013 - Denver

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC2013 - Denver 1 The PHI solution Fujitsu Industry Ready Intel XEON-PHI based solution SC2013 - Denver Industrial Application Challenges Most of existing scientific and technical applications Are written for legacy execution

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*

Parallel Computing. Introduction

Parallel Computing. Introduction Parallel Computing Introduction Thorsten Grahs, 14. April 2014 Administration Lecturer Dr. Thorsten Grahs (that's me) t.grahs@tu-bs.de Institute of Scientific Computing Room RZ 120 Lecture Monday 11:30-13:00

Combining Scalability and Efficiency for SPMD Applications on Multicore Clusters*

Combining Scalability and Efficiency for SPMD Applications on Multicore Clusters* Combining Scalability and Efficiency for SPMD Applications on Multicore Clusters* Ronal Muresano, Dolores Rexachs and Emilio Luque Computer Architecture and Operating System Department (CAOS) Universitat

The Fastest Way to Parallel Programming for Multicore, Clusters, Supercomputers and the Cloud.

The Fastest Way to Parallel Programming for Multicore, Clusters, Supercomputers and the Cloud. White Paper 021313-3 Page 1 : A Software Framework for Parallel Programming* The Fastest Way to Parallel Programming for Multicore, Clusters, Supercomputers and the Cloud. ABSTRACT Programming for Multicore,

End-user Tools for Application Performance Analysis Using Hardware Counters

End-user Tools for Application Performance Analysis Using Hardware Counters 1 End-user Tools for Application Performance Analysis Using Hardware Counters K. London, J. Dongarra, S. Moore, P. Mucci, K. Seymour, T. Spencer Abstract One purpose of the end-user tools described in

Integrated Communication Systems

Integrated Communication Systems Integrated Communication Systems Courses, Research, and Thesis Topics Prof. Paul Müller University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de

A Pattern-Based Approach to Automated Application Performance Analysis

A Pattern-Based Approach to Automated Application Performance Analysis A Pattern-Based Approach to Automated Application Performance Analysis Nikhil Bhatia, Shirley Moore, Felix Wolf, and Jack Dongarra Innovative Computing Laboratory University of Tennessee (bhatia, shirley,

Big Data Visualization on the MIC

Big Data Visualization on the MIC Big Data Visualization on the MIC Tim Dykes School of Creative Technologies University of Portsmouth timothy.dykes@port.ac.uk Many-Core Seminar Series 26/02/14 Splotch Team Tim Dykes, University of Portsmouth

The Value of High-Performance Computing for Simulation

The Value of High-Performance Computing for Simulation White Paper The Value of High-Performance Computing for Simulation High-performance computing (HPC) is an enormous part of the present and future of engineering simulation. HPC allows best-in-class companies

HPC Deployment of OpenFOAM in an Industrial Setting

HPC Deployment of OpenFOAM in an Industrial Setting HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak h.jasak@wikki.co.uk Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment

A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications

A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications Kai Li, Allen D. Malony, Robert Bell, and Sameer Shende University of Oregon, Eugene, OR 97403 USA, {likai,malony,bertie,sameer}@cs.uoregon.edu

A Pattern-Based Comparison of OpenACC & OpenMP for Accelerators

A Pattern-Based Comparison of OpenACC & OpenMP for Accelerators A Pattern-Based Comparison of OpenACC & OpenMP for Accelerators Sandra Wienke 1,2, Christian Terboven 1,2, James C. Beyer 3, Matthias S. Müller 1,2 1 IT Center, RWTH Aachen University 2 JARA-HPC, Aachen

Using the Windows Cluster

Using the Windows Cluster Using the Windows Cluster Christian Terboven terboven@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Windows HPC 2008 (II) September 17, RWTH Aachen Agenda o Windows Cluster

High Productivity Computing With Windows

High Productivity Computing With Windows High Productivity Computing With Windows Windows HPC Server 2008 Justin Alderson 16-April-2009 Agenda The purpose of computing is... The purpose of computing is insight not numbers. Richard Hamming Why

Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment

Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment Wyatt Spear, Allen Malony, Alan Morris, Sameer Shende {wspear, malony, amorris, sameer}@cs.uoregon.edu

A Case Study - Scaling Legacy Code on Next Generation Platforms

A Case Study - Scaling Legacy Code on Next Generation Platforms Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 00 (2015) 000 000 www.elsevier.com/locate/procedia 24th International Meshing Roundtable (IMR24) A Case Study - Scaling Legacy

OpenACC Parallelization and Optimization of NAS Parallel Benchmarks

OpenACC Parallelization and Optimization of NAS Parallel Benchmarks OpenACC Parallelization and Optimization of NAS Parallel Benchmarks Presented by Rengan Xu GTC 2014, S4340 03/26/2014 Rengan Xu, Xiaonan Tian, Sunita Chandrasekaran, Yonghong Yan, Barbara Chapman HPC Tools

A Scalable Approach to MPI Application Performance Analysis

A Scalable Approach to MPI Application Performance Analysis A Scalable Approach to MPI Application Performance Analysis Shirley Moore 1, Felix Wolf 1, Jack Dongarra 1, Sameer Shende 2, Allen Malony 2, and Bernd Mohr 3 1 Innovative Computing Laboratory, University

An Implementation of the POMP Performance Monitoring Interface for OpenMP Based on Dynamic Probes

An Implementation of the POMP Performance Monitoring Interface for OpenMP Based on Dynamic Probes An Implementation of the POMP Performance Monitoring Interface for OpenMP Based on Dynamic Probes Luiz DeRose Bernd Mohr Seetharami Seelam IBM Research Forschungszentrum Jülich University of Texas ACTC

MPICH FOR SCI-CONNECTED CLUSTERS

MPICH FOR SCI-CONNECTED CLUSTERS Autumn Meeting 99 of AK Scientific Computing MPICH FOR SCI-CONNECTED CLUSTERS Joachim Worringen AGENDA Introduction, Related Work & Motivation Implementation Performance Work in Progress Summary MESSAGE-PASSING

Parallelization of video compressing with FFmpeg and OpenMP in supercomputing environment

Parallelization of video compressing with FFmpeg and OpenMP in supercomputing environment Proceedings of the 9th International Conference on Applied Informatics Eger, Hungary, January 29 - February 1, 2014. Vol. 1. pp. 231-237 doi: 10.14794/ICAI.9.2014.1.231 Parallelization of video compressing

OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis

OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis OMPT: OpenMP Tools Application Programming Interfaces for Performance Analysis Alexandre Eichenberger, John Mellor-Crummey, Martin Schulz, Michael Wong, Nawal Copty, John DelSignore, Robert Dietrich, Xu

MAQAO Performance Analysis and Optimization Tool

MAQAO Performance Analysis and Optimization Tool MAQAO Performance Analysis and Optimization Tool Andres S. CHARIF-RUBIAL andres.charif@uvsq.fr Performance Evaluation Team, University of Versailles S-Q-Y http://www.maqao.org VI-HPS 18th Grenoble 18/22

High Performance Computing. Course Notes 2007-2008. HPC Fundamentals

High Performance Computing. Course Notes 2007-2008. HPC Fundamentals High Performance Computing Course Notes 2007-2008 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it's a moving target. Later 1980s, a supercomputer performs

High Performance Computing: A Review of Parallel Computing with ANSYS solutions. Efficient and Smart Solutions for Large Models

High Performance Computing: A Review of Parallel Computing with ANSYS solutions. Efficient and Smart Solutions for Large Models High Performance Computing: A Review of Parallel Computing with ANSYS solutions Efficient and Smart Solutions for Large Models 1 Use ANSYS HPC solutions to perform efficient design variations of large

Welcome to the Jülich Supercomputing Centre. D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich

Welcome to the Jülich Supercomputing Centre. D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich Member of the Helmholtz Association Welcome to the Jülich Supercomputing Centre D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich Schedule: Monday, May 19 13:00-13:30 Welcome

OpenMP Tools API (OMPT) and HPCToolkit

OpenMP Tools API (OMPT) and HPCToolkit OpenMP Tools API (OMPT) and HPCToolkit John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu SC13 OpenMP Birds of a Feather Session, November 19, 2013 OpenMP Tools Subcommittee

OpenSPARC Program. David Weaver Principal Engineer, UltraSPARC Architecture Principal OpenSPARC Evangelist Sun Microsystems, Inc. www.opensparc.

OpenSPARC Program. David Weaver Principal Engineer, UltraSPARC Architecture Principal OpenSPARC Evangelist Sun Microsystems, Inc. www.opensparc. OpenSPARC Program David Weaver Principal Engineer, UltraSPARC Architecture Principal OpenSPARC Evangelist Sun Microsystems, Inc. 1 Agenda What is OpenSPARC? OpenSPARC University Program OpenSPARC Resources

Automating Big Data Benchmarking for Different Architectures with ALOJA

Automating Big Data Benchmarking for Different Architectures with ALOJA www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.

Performance Tools for System Monitoring

Performance Tools for System Monitoring Center for Information Services and High Performance Computing (ZIH) 01069 Dresden Performance Tools for System Monitoring 1st CHANGES Workshop, Jülich Zellescher Weg 12 Tel. +49 351-463 35450 September

Min Si. Argonne National Laboratory Mathematics and Computer Science Division

Min Si. Argonne National Laboratory Mathematics and Computer Science Division Min Si Contact Information Address 9700 South Cass Avenue, Bldg. 240, Lemont, IL 60439, USA Office +1 630-252-4249 Mobile +1 630-880-4388 E-mail msi@anl.gov Homepage http://www.mcs.anl.gov/~minsi/ Current

Integration and Synthesis for Automated Performance Tuning: the SYNAPT Project

Integration and Synthesis for Automated Performance Tuning: the SYNAPT Project Integration and Synthesis for Automated Performance Tuning: the SYNAPT Project Nicholas Chaimov, Boyana Norris, and Allen D. Malony {nchaimov,norris,malony}@cs.uoregon.edu Department of Computer and Information

Operating System Multilevel Load Balancing

Operating System Multilevel Load Balancing Operating System Multilevel Load Balancing M. Corrêa, A. Zorzo Faculty of Informatics - PUCRS Porto Alegre, Brazil {mcorrea, zorzo}@inf.pucrs.br R. Scheer HP Brazil R&D Porto Alegre, Brazil roque.scheer@hp.com

Debugging with TotalView

Debugging with TotalView Tim Cramer 17.03.2015 IT Center der RWTH Aachen University Why to use a Debugger? If your program goes haywire, you may ... buy a magic wand, ... read the source code again and again and ..., ... enrich

Performance Analysis and Optimization Tool

Performance Analysis and Optimization Tool Performance Analysis and Optimization Tool Andres S. CHARIF-RUBIAL andres.charif@uvsq.fr Performance Analysis Team, University of Versailles http://www.maqao.org Introduction Performance Analysis Develop

Vers des mécanismes génériques de communication et une meilleure maîtrise des affinités dans les grappes de calculateurs hiérarchiques.

Vers des mécanismes génériques de communication et une meilleure maîtrise des affinités dans les grappes de calculateurs hiérarchiques. Vers des mécanismes génériques de communication et une meilleure maîtrise des affinités dans les grappes de calculateurs hiérarchiques Brice Goglin 15 April 2014 Towards generic Communication Mechanisms

Kriterien für ein PetaFlop System

Kriterien für ein PetaFlop System Kriterien für ein PetaFlop System Rainer Keller, HLRS Context: Organizational HLRS is one of the three national supercomputing centers in Germany. The national supercomputing centers are working

Multilevel Load Balancing in NUMA Computers

Multilevel Load Balancing in NUMA Computers FACULDADE DE INFORMÁTICA PUCRS - Brazil http://www.pucrs.br/inf/pos/ Multilevel Load Balancing in NUMA Computers M. Corrêa, R. Chanin, A. Sales, R. Scheer, A. Zorzo Technical Report Series Number 049 July,

Clusters: Mainstream Technology for CAE

Clusters: Mainstream Technology for CAE Clusters: Mainstream Technology for CAE Alanna Dwyer HPC Division, HP Linux and Clusters Sparked a Revolution in High Performance Computing! Supercomputing performance now affordable and accessible Linux

OpenMP and Performance

OpenMP and Performance Dirk Schmidl IT Center, RWTH Aachen University Member of the HPC Group schmidl@itc.rwth-aachen.de IT Center der RWTH Aachen University Tuning Cycle Performance Tuning aims to improve the runtime of an

The Design and Implementation of Scalable Parallel Haskell

The Design and Implementation of Scalable Parallel Haskell The Design and Implementation of Scalable Parallel Haskell Malak Aljabri, Phil Trinder, and Hans-Wolfgang Loidl MMnet 13: Language and Runtime Support for Concurrent Systems Heriot Watt University May 8,

Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 Architecture

Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 Architecture 1 KONWIHR Project: Centre of Excellence for High Performance Computing Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 Architecture F. Deserno, G. Hager, F. Brechtefeld, G.
