Editorial (Publishers, Editor, Design)
Contents: News, Applications, Systems, Projects, Centres, Activities, Courses
News: PRACE: Results of the 7th Regular Call; PRACE Projects meet in Varna
News: Interview with Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Michael M. Resch
News: Prof. Michael M. Resch has been Chairman of the Board of Directors of GCS since May. He was a co-founder of GCS. Prof. Resch is currently the director of the High Performance Computing Center Stuttgart (HLRS), the director of the Information Service Center (IZUS) of the University of Stuttgart, and the director of the Institute for High Performance Computing (IHR) of the University of Stuttgart.
Applications: Simulating the Life Cycle of Molecular Clouds (Numerical Method)
Applications: The Typical Milky Way Disk
Applications: Disk Galaxies at Different Gas Surface Densities: From Low to High Redshift (Conclusions and Outlook; References)
Applications: Gadget3: Numerical Simulation of Structure Formation in the Universe (References)
Applications: Numerical Simulation of Correlated Electron Systems (The Numerical Challenge; The Kane-Mele-Hubbard Model; Results: Impact of U for λ = 0 and for λ > 0; Outlook; Acknowledgments; References)
Applications: Highly Resolved Numerical Simulations of Bed-Load Transport in a Turbulent Open Channel Flow (Numerical Method; Computational Setup; Results; Conclusions; References)
Applications: How to Fit the Local Universe into a Supercomputer? Recovering and Simulating Structures of the Local Universe (Acknowledgements; References)
Applications: A Scalable Hybrid DFT/PMM-MD Approach for Accurately Simulating Biomolecules on SuperMUC (References)
Applications: Aircraft Wake Vortex Evolution during Approach and Landing with and without Plate Lines (References; Links)
Projects: Factories of the Future Resources, Technology, Infrastructure and Services for Simulation and Modelling (FORTISSIMO) (Core Project Partners; References)
Projects: Revisiting Dynamic Scheduling Techniques for HPC Infrastructures: The Approach of the DreamCloud Project (Project Partners; References)
Projects: SkaSim: Scalable HPC Codes for Molecular Simulation in the Chemical Industry
Projects: POLCA: Programming Large Scale Heterogeneous Infrastructures (Programming with POLCA; The POLCA Approach; What POLCA Will Provide; Who Is POLCA?)
Projects: A Flexible Framework for Energy and Performance Analysis of Highly Parallel Applications in a Supercomputing Centre (Background; Software; Focus of the Project; Implementation; Summary; References)
Projects: SIMOPEK: Simulation and Optimization of Data Center Energy Flows from Cooling Networks Taking into Account HPC Operation Scenarios (Facts and Figures; References)
[Figure: data center model for reducing total cost of ownership under external influences/constraints (neighboring buildings, utility providers), built on four pillars: Pillar 1, building infrastructure (advanced heat reuse technologies; SIMOPEK advanced absorption cooling); Pillar 2, HPC system hardware (SIMOPEK power consumption modeling, simulation & optimization using MYNTS); Pillar 3, HPC system software (infrastructure-aware resource management & scheduling; system scheduler; SIMOPEK data collection using PowerDAM V2.0); Pillar 4, HPC applications (FEPA performance and energy modeling & optimization; FEPA data collection); cross-cutting layers for building management & infrastructure, hardware management, system management software, and performance analysis tools, with infrastructure, system hardware, software, and performance monitoring.]
Projects: The Catwalk Project: A Quick Development Path for Performance Models (References)
Projects: GROMEX: Unified Long-Range Electrostatics and Flexible Ionization (Background of the Project; Usability & Scalability; Towards Realistic Simulations; Project Partners; References)
Projects: HOPSA: A Big Jump Forward in HPC System and Application Monitoring (Integration among the HOPSA Performance Analysis Tools; The HOPSA Performance Tool Workflow; Integration of System Data and Performance Analysis Tools; Conclusion; EU Project Partners (HOPSA-EU); Russian Project Partners (HOPSA-RU); References)
Systems: End of the HPC-FF Era (References)
Systems: JUROPA-3: A Prototype for the Next-Generation HPC Cluster (System Specifications)
Systems: First Experiences with the Intel MIC Architecture at LRZ (Intel MIC Architecture at LRZ; Architectural Overview; Programming Models; Benchmarks; Acknowledgements; References)
[Table: Intel MIC specifications: number of cores, frequency of cores, GDDR5 memory size, number of hardware threads, SIMD vector registers, flops/cycle, theoretical peak performance, L2 cache per core]
47 Systems The Extension of SuperMUC: Phase 2 SuperMUC Phase 1 Phase 2 Innovative Water Cooling Users from 25 European Countries Systems Financing References (1) (2) (3)
Centres: Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (Leibniz-Rechenzentrum, LRZ) provides comprehensive services to scientific and academic communities. Research in HPC is carried out in collaboration with the distributed, statewide Competence Network for Technical and Scientific High Performance Computing in Bavaria (KONWIHR). Compute servers currently operated by LRZ are given in the following table. [Table: System, Size, Peak Performance (TFlop/s), Purpose, User Community]
Contact: Leibniz Supercomputing Centre, Prof. Dr. Arndt Bode, Boltzmannstr., Garching near Munich, Germany
Centres: First German National Center. Based on a long tradition in supercomputing at the University of Stuttgart, HLRS (Höchstleistungsrechenzentrum Stuttgart) was founded in 1995 as the first German federal centre for high performance computing. HLRS serves researchers at universities and research laboratories in Europe and Germany and their external and industrial partners with high-end computing power for engineering and scientific applications.
Service for Industry. Service provisioning for industry is done together with T-Systems, T-Systems sfr, and Porsche in the public-private joint venture hww (Höchstleistungsrechner für Wissenschaft und Wirtschaft). Through this co-operation, industry always has access to the most recent HPC technology.
Bundling Competencies. In order to bundle service resources in the state of Baden-Württemberg, HLRS has teamed up with the Steinbuch Center for Computing of the Karlsruhe Institute of Technology. This collaboration has been implemented in the non-profit organization SICOS BW GmbH.
World Class Research. As one of the largest research centers for HPC, HLRS takes a leading role in research. Participation in the German national initiative of excellence makes HLRS an outstanding place in the field.
Compute servers currently operated by HLRS:
- Cray XE6 "Hermit" (Q4 2011): 3,552 dual-socket nodes with 113,664 AMD Interlagos cores and 23 TB memory; peak performance 1,045 TFlop/s; purpose: capability computing; user community: European and German research organizations and industry.
- NEC Cluster (Laki, Laki2), a heterogeneous computing platform of 2 independent clusters: 9,988 cores in 911 nodes; peak performance 170 TFlop/s (Laki: 120.5 TFlop/s, Laki2: 47.2 TFlop/s); purpose: capability computing; user community: German universities, research institutes and industry.
Contact: Höchstleistungsrechenzentrum Stuttgart (HLRS), Universität Stuttgart, Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Michael M. Resch, Nobelstraße, Stuttgart, Germany
Centres: The Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich enables scientists and engineers to solve grand challenge problems of high complexity in science and engineering in collaborative infrastructures by means of supercomputing and Grid technologies.
- Provision of supercomputer resources of the highest performance class for projects in science, research and industry in the fields of modeling and computer simulation including their methods. The selection of the projects is performed by an international peer-review procedure implemented by the John von Neumann Institute for Computing (NIC), a joint foundation of Forschungszentrum Jülich, Deutsches Elektronen-Synchrotron DESY, and GSI Helmholtzzentrum für Schwerionenforschung.
- Supercomputer-oriented research and development in selected fields of physics and other natural sciences by research groups of competence in supercomputing applications.
- Implementation of strategic support infrastructures including community-oriented simulation laboratories and cross-sectional groups on mathematical methods and algorithms and parallel performance tools, enabling the effective usage of the supercomputer resources.
- Higher education for master and doctoral students in cooperation e.g. with the German Research School for Simulation Sciences.
Compute servers currently operated by JSC: [Table: System, Size, Peak Performance (TFlop/s), Purpose, User Community]
Contact: Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, Prof. Dr. Dr. Thomas Lippert, Jülich, Germany
Activities: CECAM Tutorials at JSC; CHANGES Workshop
Activities: Laboratory Experiments on Crowd Dynamics
Activities: JSC Guest Student Programme on Scientific Computing 2013
Activities: High-Q Club: The Highest-Scaling Codes on JUQUEEN (Terra-Neo; Gysela; waLBerla; PEPC; PMG+PFASST; dynQCD; Reference)
Activities: Jülich Supercomputing Centre Contributes to Visionary Human Brain Project (References)
Activities: Traffic and Granular Flow Conference Celebrates 10th Edition by Returning to Jülich; UNICORE Summit 2013 (References)
Activities: 3D Show at the Pharma Forum: Simulation and Visualization of the Airflow in Cleanrooms (Links)
Activities: The 17th HLRS-NEC Workshop on Sustained Simulation Performance
Activities: ls1 mardyn: A Massively Parallel Molecular Simulation Code (Scalability; Neighbour Search; Dynamic Load Balancing; Partners)
Activities: GCS at ISC'13: Review (GCS Booth Highlights; Two GCS HPC Systems amongst Top Ten of TOP500; ISC'13 Gauss Award Winner)
Activities: Extreme Scaling Workshop at LRZ, July 9-11, 2013: Running Real-World Applications on More than 130,000 Cores on SuperMUC
Activities: LRZ Extreme Scale Benchmark and Optimization Suite (Performance Results)
[Table: performance results for compiler/MPI combinations (IBM MPI; Intel MPI; icc 12.1) on benchmark cases: aquaporin (PME, 2 fs time step), 2 M atom ribosome (PME, 4 fs), and 12 M atom peptides (PME, 2 fs)]
Activities: HLRS Scientific Tutorials and Workshop Report and Outlook
- OpenACC Programming for Parallel Accelerated Supercomputers: an alternative to CUDA from the Cray perspective
- Cray XE6/XC30 Optimization Workshops (PRACE Advanced Training Centre)
- Parallel Programming Workshop
- Iterative Solvers and Parallelization
2014 Workshop Announcements. Scientific Conferences and Workshops at HLRS:
- 12th HLRS/hww Workshop on Scalable Global Parallel File Systems (March/April 2014)
- 8th ZIH+HLRS Parallel Tools Workshop (date and location not yet fixed)
- High Performance Computing in Science and Engineering: The 17th Results and Review Workshop of the HPC Center Stuttgart (October 2014)
- IDC International HPC User Forum (October 2014)
Parallel Programming Workshops: Training in Parallel Programming and CFD.
ISC and SC Tutorials:
- Georg Hager, Gabriele Jost, Rolf Rabenseifner: Hybrid Parallel Programming with MPI & OpenMP. Tutorial 9 at the International Supercomputing Conference, ISC'13, Leipzig, June.
- Georg Hager, Jan Treibig, Gerhard Wellein: Node-Level Performance Engineering. Tutorial 2 at the International Supercomputing Conference, ISC'13, Leipzig, June.
- Rolf Rabenseifner, Georg Hager, Gabriele Jost: Hybrid MPI and OpenMP Parallel Programming. Half-day tutorial at Supercomputing 2013, SC13, Denver, Colorado, USA, November 17-22.
Parallel programming and CFD courses:
- Parallel Programming and Parallel Tools (TU Dresden, ZIH, February 24-27)
- Introduction to Computational Fluid Dynamics (HLRS, March 31 - April 4)
- Iterative Linear Solvers and Parallelization (HLRS, March 24-28)
- Cray XE6/XC30 Optimization Workshops (HLRS, March 17-20) (PATC)
- GPU Programming using CUDA (HLRS, April 7-9)
- OpenACC Programming for Parallel Accelerated Supercomputers (HLRS, April 10-11) (PATC)
- Unified Parallel C (UPC) and Co-Array Fortran (CAF) (HLRS, April 14-15) (PATC)
- Scientific Visualisation (HLRS, April 16-17)
- Parallel Programming with MPI & OpenMP (TU Hamburg-Harburg, July 28-30)
- Iterative Linear Solvers and Parallelization (LRZ, Garching, September 15-19)
- Introduction to Computational Fluid Dynamics (ZIMT Siegen, September/October)
- Message Passing Interface (MPI) for Beginners (HLRS, October 6-7) (PATC)
- Shared Memory Parallelization with OpenMP (HLRS, October 8) (PATC)
- Advanced Topics in Parallel Programming (HLRS, October 9-10) (PATC)
- Parallel Programming with MPI & OpenMP (FZ Jülich, JSC, December 1-3)
Training in Programming Languages at HLRS:
- Fortran for Scientific Computing (Dec 2-6, 2013 and Mar 10-14, 2014) (PATC)
(PATC): This is a PRACE PATC course.
GCS High Performance Computing Courses and Tutorials

Parallel Programming with MPI, OpenMP and PETSc. November 25-27, 2013, JSC, Forschungszentrum Jülich. The focus is on the programming models MPI, OpenMP, and PETSc. Hands-on sessions (in C and Fortran) will allow users to immediately test and understand the basic constructs of the Message Passing Interface (MPI) and the shared memory directives of OpenMP. This course is organized by JSC in collaboration with HLRS and presented by Dr. Rolf Rabenseifner, HLRS.

Parallel Programming with MPI, OpenMP, and Tools. February 24-27, 2014, ZIH, Dresden. The focus is on the programming models MPI, OpenMP, and PETSc. Hands-on sessions (in C and Fortran) will allow users to immediately test and understand the basic constructs of MPI and the shared memory directives of OpenMP. The last day is dedicated to tools for debugging and performance analysis of parallel applications. This course is organized by ZIH in collaboration with HLRS.

Second JUQUEEN Porting and Tuning Workshop (PATC course). February 03-05, 2014, JSC, Forschungszentrum Jülich. The Blue Gene/Q petaflop supercomputer JUQUEEN marks another quantum leap in supercomputer performance at JSC. In order to use this tool efficiently, special efforts by the users are necessary, though. The aim of this hands-on workshop is to support current users of JUQUEEN in porting their software, in analyzing its performance, and in improving its efficiency. This course is a PATC course (PRACE Advanced Training Centres).

Node-Level Performance Engineering (PATC course). December 03-04, 2013, LRZ Building, University Campus Garching, near Munich, Boltzmannstr. 1. This course teaches performance engineering approaches on the compute node level. Performance engineering as we define it is more than employing tools to identify hotspots and bottlenecks: it is about developing a thorough understanding of the interactions between software and hardware. This process must start at the core, socket, and node level, where the code gets executed that does the actual computational work. Once the architectural requirements of a code are understood and correlated with performance measurements, the potential benefit of optimizations can often be predicted. We introduce a holistic node-level performance engineering strategy and apply it to different algorithms: the 3D Jacobi solver, the Lattice-Boltzmann method, sparse matrix-vector multiplication, and a backprojection algorithm for CT reconstruction. The course also shows how an awareness of the performance features of an application may lead to notable reductions in power consumption. Between each module, there is time for questions and answers.

Fortran for Scientific Computing (PATC course). December 02-06, 2013 and March 10-14, 2014, Stuttgart, HLRS. This course is targeted at scientists with little or no knowledge of the Fortran programming language, but needing it for participation in projects using a Fortran code base, for development of their own codes, and for getting acquainted with additional tools like debugger and syntax checker as well as handling of compilers and libraries. The language is for the most part treated at the level of the Fortran 95 standard; features from Fortran 2003 are limited to improvements on the elementary level. Advanced Fortran features like object-oriented programming or coarrays will be covered in a follow-on course in autumn. To consolidate the lecture material, each day's approximately 4 hours of lecture are complemented by 3 hours of hands-on sessions.

Introduction to the Programming and Usage of the Supercomputer Resources at Jülich. November 28-29, 2013, JSC, Forschungszentrum Jülich. This course gives an overview of the supercomputers JUROPA and JUQUEEN. Especially new users will learn how to program and use these systems efficiently. Topics discussed are: system architecture, usage model, compilers, tools, monitoring, MPI, OpenMP, performance optimization, mathematical software, and application software.

Parallel Programming of High Performance Systems (PATC course). March 10-14, 2014, RRZE building, University campus Erlangen, Martensstr. 1; via video conference at LRZ if there is sufficient interest. This course, a collaboration of Erlangen Regional Computing Centre (RRZE) and LRZ, is targeted at students and scientists with interest in programming modern HPC hardware, specifically the large scale parallel computing systems available in Munich, Jülich and Stuttgart. Each day is comprised of approximately 4 hours of lectures and 3 hours of hands-on sessions. Prerequisites: course participants should have basic UNIX/Linux knowledge (login with secure shell, shell commands, basic programming, vi or emacs editors).

Advanced Topics in High Performance Computing (PATC course). March 31 - April 03, 2014, LRZ Building, University campus Garching near Munich. In this add-on course to the parallel programming course, special topics are treated in more depth, in particular performance analysis, I/O and PGAS concepts. It is provided in collaboration of Erlangen Regional Computing Centre (RRZE) and LRZ within KONWIHR. Each day is comprised of approximately 5 hours of lectures and 2 hours of hands-on sessions. Day 1: Intel tools: MPI tracing and checking; OpenMP performance and correctness. Day 2: Parallel I/O with MPI-IO; performance analysis with Scalasca. Day 3: Tuning I/O on LRZ's HPC systems; portability of I/O: binary files, NetCDF, HDF5. Day 4: PGAS programming with Coarray Fortran and Unified Parallel C; PGAS hands-on session.

Eclipse: C/C++/Fortran Programming. March 25, 2014, LRZ Building, University campus Garching near Munich. This course is targeted at scientists who wish to be introduced to programming C/C++/Fortran with the Eclipse C/C++ Development Tools (CDT) or the Photran plugin. Prerequisites: participants should have basic knowledge of the C and/or C++/Fortran programming languages.

Cray XE6/XC30 Optimization Workshop (PATC course). March 17-20, 2014, Stuttgart, HLRS. HLRS installed Hermit, a Cray XE6 system with AMD Interlagos processors and a performance of 1 PFlop/s. We strongly encourage you to port your applications to the new architecture as early as possible. To support such effort we invite current and future users to participate in special Cray XE6/XC30 Optimization Workshops. With this course, we will give all necessary information to move applications from the current NEC SX-9, the Nehalem cluster, or other systems to Hermit. Hermit provides our users with a new level of performance, and to harvest this potential will require all our efforts. From Monday to Wednesday, specialists from Cray will support you in your effort porting and optimizing your application on our Cray XE6. On Thursday, Georg Hager and Jan Treibig from RRZE will present detailed information on optimizing codes on the multicore AMD Interlagos processor. Users may also bring their own codes to discuss with Cray specialists or begin porting. Course language is English (if required).

Introduction to Computational Fluid Dynamics. March 31 - April 04, 2014, Stuttgart, HLRS. Numerical methods to solve the equations of fluid dynamics are presented. The main focus is on explicit finite volume schemes for the compressible Euler equations. Hands-on sessions will manifest the content of the lectures. Participants will learn to implement the algorithms, but also to apply existing software and to interpret the solutions correctly. Methods and problems of parallelization are discussed. This course is based on a lecture and practical awarded with the "Landeslehrpreis Baden-Württemberg 2003" and organized by HLRS, IAG, and the University of Kassel.

Iterative Linear Solvers and Parallelization. March 24-28, 2014, Stuttgart, HLRS, and September 15-19, 2014, Garching, LRZ. The focus is on iterative and parallel solvers, the parallel programming models MPI and OpenMP, and the parallel middleware PETSc. Thereby, different modern Krylov subspace methods (CG, GMRES, BiCGSTAB, ...) as well as highly efficient preconditioning techniques are presented in the context of real life applications. Hands-on sessions (in C and Fortran) will allow users to immediately test and understand the basic constructs of iterative solvers, the Message Passing Interface (MPI) and the shared memory directives of OpenMP. This course is organized by the University of Kassel, HLRS, and IAG.

GPU Programming using CUDA. April 07-09, 2014, Stuttgart, HLRS. The course provides an introduction to the programming language CUDA, which is used to write fast numeric algorithms for NVIDIA graphics processors (GPUs). Focus is on the basic usage of the language, the exploitation of the most important features of the device (massive parallel computation, shared memory, texture memory) and efficient usage of the hardware to maximize performance. An overview of the available development tools and the advanced features of the language is given.

GPU Programming (PATC course). April 07-09, 2014, JSC, Forschungszentrum Jülich. Many-core programming is a very dynamic research area. Many scientific applications have been ported to GPU architectures during the past four years. We will give an introduction to CUDA, OpenCL, and multi-GPU programming using examples of increasing complexity. After introducing the basics, the focus will be on optimization and tuning of scientific applications.

OpenACC Programming for Parallel Accelerated Supercomputers / Cray XK7 (PATC course). April 10-11, 2014, Stuttgart, HLRS. This workshop will cover the programming environment of the Cray XK7 hybrid supercomputer, which combines multicore CPUs with GPU accelerators. Attendees will learn about the directive-based OpenACC programming model, whose multi-vendor support allows users to portably develop applications for parallel accelerated supercomputers. The workshop will also demonstrate how to use the Cray Programming Environment tools to identify CPU application bottlenecks, facilitate the OpenACC porting, provide accelerated performance feedback and tune the ported applications. The Cray scientific libraries for accelerators will be presented, and interoperability of OpenACC directives with these and with CUDA will be demonstrated. Through application case studies and tutorials, users will gain direct experience of using OpenACC directives in realistic applications.

Unified Parallel C (UPC) and Co-Array Fortran (CAF) (PATC course). April 14-15, 2014, Stuttgart, HLRS. Partitioned Global Address Space (PGAS) is a new model for parallel programming. Unified Parallel C (UPC) and Co-Array Fortran (CAF) are PGAS extensions to C and Fortran. PGAS languages allow any processor to directly address memory/data on any other processor, and parallelism can be expressed more easily compared to library-based approaches such as MPI. Hands-on sessions (in UPC and/or CAF) will allow users to immediately test and understand the basic constructs of PGAS languages.

Scientific Visualization. April 16-17, 2014, Stuttgart, HLRS. This two-day course is targeted at researchers with basic knowledge in numerical simulation who would like to learn how to visualize their simulation results on the desktop but also in Augmented Reality and Virtual Environments. It will start with a short overview of scientific visualization, followed by a hands-on introduction to 3D desktop visualization with COVISE. On the second day, we will discuss how to build interactive 3D models for Virtual Environments and how to set up an Augmented Reality visualization.

Intel MIC&GPU Programming Workshop (PATC course). April 28-30, 2014, LRZ Building, University campus Garching, near Munich. With the rapidly growing demand for computing power, new accelerator-based architectures have entered the world of high performance computing for around five years now. Particularly GPGPUs have recently become very popular; however, programming GPGPUs using programming languages like CUDA or OpenCL is cumbersome and error-prone. Beyond introducing the basics of GPGPU programming, we mainly present OpenACC as an easier way to program GPUs using OpenMP-like pragmas. Recently, Intel developed their own Many Integrated Core (MIC) architecture, which can be programmed using standard parallel programming techniques like OpenMP and MPI. In the beginning of 2013, the first production-level cards, named Intel Xeon Phi, came on the market. The course discusses various programming techniques for Intel Xeon Phi and includes hands-on sessions for both MIC and GPU programming. The course is developed in collaboration with the Erlangen Regional Computing Centre (RRZE) within KONWIHR. Prerequisites: good working knowledge of at least one of the standard HPC languages (Fortran 95, C or C++); basic OpenMP and MPI knowledge useful.

Advanced GPU Programming. May 05-06, 2014, JSC, Forschungszentrum Jülich. Today's computers are commonly equipped with multicore processors and graphics processing units. To make efficient use of these massively parallel compute resources, advanced knowledge of architecture and programming models is indispensable. This course focuses on finding and eliminating bottlenecks using profiling and advanced programming techniques, optimal usage of CPUs and GPUs on a single node, and multi-GPU programming across multiple nodes.

Parallel I/O and Portable Data Formats (PATC course). May 21-23, 2014, JSC, Forschungszentrum Jülich. This course will introduce MPI parallel I/O and portable, self-describing data formats, such as HDF5 and NetCDF. Participants should have experience in parallel programming in general, and either C/C++ or Fortran in particular. This course is a PATC course (PRACE Advanced Training Centres).

Introduction to the Programming and Usage of the Supercomputer Resources at Jülich. May 19-20, 2014, JSC, Forschungszentrum Jülich. This course gives an overview of the supercomputers JUROPA and JUQUEEN. Especially new users will learn how to program and use these systems efficiently. Topics discussed are: system architecture, usage model, compilers, tools, monitoring, MPI, OpenMP, performance optimization, mathematical software, and application software.