Extreme Scale Computing at LRZ



Extreme Scale Computing at LRZ
Dieter Kranzlmüller
Munich Network Management Team, Ludwig-Maximilians-Universität München (LMU) & Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities
- Research areas: networks, grid computing, high performance computing, cloud computing, service management, virtualization, IT security
- 156 employees plus 38 extra staff, serving more than 90,000 students and more than 30,000 university employees, including 8,500 scientists
- Campus: Visualisation Centre, Institute Building, lecture halls, and a cuboid containing the computing systems (72 x 36 x 36 meters)

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities
- Computer Centre for all Munich Universities
- IT service provider: Munich Scientific Network (MWN), web servers, e-Learning, e-mail, groupware
- Special equipment: Virtual Reality Laboratory, video conferencing, scanners for slides and large documents, large-scale plotters
- IT competence centre: hotline and support; consulting (security, networking, scientific computing, ...); courses (text editing, image processing, UNIX, Linux, HPC, ...)

The Munich Scientific Network (MWN)

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities
- Regional Computer Centre for all Bavarian Universities
- Computer Centre for all Munich Universities

Virtual Reality & Visualization Centre (LRZ)

Examples from the V2C

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities
- National Supercomputing Centre
- Regional Computer Centre for all Bavarian Universities
- Computer Centre for all Munich Universities

Gauss Centre for Supercomputing (GCS)
- Combination of the three German national supercomputing centres: John von Neumann Institute for Computing (NIC), Jülich; High Performance Computing Center Stuttgart (HLRS); Leibniz Supercomputing Centre (LRZ), Garching near Munich
- Founded on 13 April 2007
- Hosting member of PRACE (Partnership for Advanced Computing in Europe)

National HPC Strategy (tier pyramid)
- Tier 0: European high-performance computing centres
- Tier 1: National high-performance computing centres (3): Gauss Centre for Supercomputing (GCS) (Garching, Stuttgart, Jülich)
- Tier 2: Thematic HPC centres and centres with regional tasks (approx. 10): Aachen, Berlin, DKRZ, Dresden, DWD, Karlsruhe, Hannover, MPG/RZG, and the like
- Tier 3: HPC servers (approx. 100): universities/institutes

PRACE Research Infrastructure Created
- Establishment of the legal framework: PRACE AISBL (Association Internationale Sans But Lucratif) created with its seat in Brussels in April; 20 members representing 20 European countries; inauguration in Barcelona on June 9
- Funding secured for 2010-2015: 400 million euro from France, Germany, Italy and Spain, provided as Tier-0 services on a TCO basis; a funding decision for 100 million euro in the Netherlands is expected soon; 70+ million euro from EC FP7 for the preparatory and implementation grants INFSO-RI-211528 and 261557; complemented by roughly 60 million euro from PRACE members

PRACE Tier-0 Systems
- Curie @ GENCI: Bull cluster, 1.7 PFlop/s
- FERMI @ CINECA: IBM BG/Q, 2.1 PFlop/s
- Hermit @ HLRS: Cray XE6, 1 PFlop/s
- JUQUEEN @ FZJ: IBM Blue Gene/Q, 5.9 PFlop/s
- MareNostrum @ BSC: IBM System x iDataPlex, 1 PFlop/s
- SuperMUC @ LRZ: IBM System x iDataPlex, 3.2 PFlop/s

PRACE Tier-0 Access
- Single pan-European peer review
- http://www.prace-project.eu/call-Announcements?lang=en
- Early Access Call in May 2010: 68 proposals asked for 1,870 million core hours; 10 projects were granted a total of 328 million core hours; principal investigators from D (5), UK (2), NL (1), I (1), PT (1); involves researchers from 31 institutions in 12 countries
- Further calls are scheduled every 6 months: call opens in February -> access in September of the same year; call opens in September -> access in March of the following year
- Example from the 8th Regular Call, closed on 15 October 2013: "Spatially adaptive radiation-hydrodynamical simulations of reionization"; project leader: Dr Andreas Pawlik, Max Planck Society, Germany; research field: Universe Sciences; resource awarded: 33,800,000 core hours on SuperMUC, Germany

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities
- European Supercomputing Centre
- National Supercomputing Centre
- Regional Computer Centre for all Bavarian Universities
- Computer Centre for all Munich Universities
- Systems: SuperMUC, SGI UV, SGI Altix, Linux clusters, Linux hosting and housing

SuperMUC @ LRZ
Video: SuperMUC rendered on SuperMUC by LRZ, http://youtu.be/olas6iiqwrq

Top 500 Supercomputer List (June 2012), www.top500.org

LRZ supercomputers next to come (2014): SuperMUC Phase II, 6.4 PFlop/s

SuperMUC Phase 1 + 2

SuperMUC and its predecessors (figure with building dimensions: 10 m)

SuperMUC and its predecessors (figure with building dimensions: 10 m, 11 m, 22 m)

SuperMUC and its predecessors (figure with building dimensions: 10 m, 11 m, 22 m)

LRZ Building Extension
Picture: Horst-Dieter Steinhöfer; figure: Herzog+Partner for StBAM2 (staatl. Hochbauamt München 2); picture: Ernst A. Graf

Increasing numbers

  Date  System              Flop/s             Cores
  2000  HLRB-I              2 TFlop/s          1,512
  2006  HLRB-II             62 TFlop/s         9,728
  2012  SuperMUC            3,200 TFlop/s      155,656
  2014  SuperMUC Phase II   3.2 + 3.2 PFlop/s  229,960

SuperMUC Architecture
- Internet; archive and backup ~30 PB; snapshots/replica 1.5 PB (separate fire section); disaster recovery site
- $HOME: 1.5 PB NAS at 10 GB/s; 80 Gbit/s connectivity
- Interconnect: pruned tree (4:1) between islands, non-blocking within an island
- Thin nodes (SB-EP): 16 cores/node, 2 GB/core; fat nodes (WM-EX): 40 cores/node, 6.4 GB/core
- GPFS for $WORK and $SCRATCH: 10 PB, 200 GB/s
- Compute nodes: 18 thin-node islands (each >8,000 cores) plus 1 fat-node island (8,200 cores, SuperMIG); dedicated I/O nodes
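A quick way to read the table of system generations above is to look at how peak performance per core has grown. The short Python sketch below is purely illustrative and uses the rounded figures from the table (Phase II is taken as the combined 6.4 PFlop/s).

    # Illustrative calculation using the rounded figures from the table above.
    systems = {
        "HLRB-I (2000)":             (2e12,    1_512),    # peak flop/s, cores
        "HLRB-II (2006)":            (62e12,   9_728),
        "SuperMUC (2012)":           (3200e12, 155_656),
        "SuperMUC Phase II (2014)":  (6.4e15,  229_960),  # 3.2 + 3.2 PFlop/s combined
    }

    for name, (flops, cores) in systems.items():
        print(f"{name:26s} {flops / cores / 1e9:6.1f} GFlop/s per core")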

Power Consumption at LRZ (chart: annual electricity consumption in MWh, axis from 0 to 35,000)

Cooling SuperMUC

IBM System x iDataPlex Direct Water Cooled Rack
- iDataPlex DWC rack with water-cooled nodes (front view)
- iDataPlex DWC rack with water-cooled nodes (rear view of the water manifolds)
Source: Torsten Bloth, IBM Lab Services, IBM Corporation

IBM iDataPlex dx360 M4
Source: Torsten Bloth, IBM Lab Services, IBM Corporation

Cooling Infrastructure
Photos: StBAM2 (staatl. Hochbauamt München 2)

SuperMUC HPL Energy Consumption
(Chart: power in kW and cumulative energy consumption in kWh over the duration of an HPL run, with curves for machine-room power, PDU power, infrastructure power, machine-room energy and integrated PDU power, plus markers for HPL start and end.)

Energy efficiency (single number in GFlop/s per Watt):
- 0.9380 (PDU, 10-minute resolution, whole run)
- 0.9359 (PDU, 1-minute resolution, whole run, without cooling)
- 0.9359 (PDU, 1-minute resolution, whole run, cooling included)
- 0.8871 (machine-room measurement, whole run)
- 0.7296 (infrastructure measurement, whole run)
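The metric itself is simply the sustained HPL performance divided by the average power drawn over the run. A minimal sketch of that calculation follows; the input values (roughly 2.9 PFlop/s sustained HPL and about 3.1 MW average PDU power) are illustrative assumptions, not numbers taken from the slide.

    # Minimal sketch: GFlop/s per Watt = sustained HPL performance / average power.
    # The input values below are illustrative assumptions, not from the slide.
    hpl_performance_gflops = 2.9e6   # ~2.9 PFlop/s sustained HPL, expressed in GFlop/s
    average_power_watt = 3.1e6       # ~3.1 MW average PDU power during the run

    efficiency = hpl_performance_gflops / average_power_watt
    print(f"Energy efficiency: {efficiency:.3f} GFlop/s per Watt")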

SuperMUC @ LRZ

LRZ Application Mix
- Computational fluid dynamics: optimisation of turbines and wings, noise reduction, air conditioning in trains
- Fusion: plasma in a future fusion reactor (ITER)
- Astrophysics: origin and evolution of stars and galaxies
- Solid state physics: superconductivity, surface properties
- Geophysics: earthquake scenarios
- Material science: semiconductors
- Chemistry: catalytic reactions
- Medicine and medical engineering: blood flow, aneurysms, air conditioning of operating theatres
- Biophysics: properties of viruses, genome analysis
- Climate research: currents in oceans

1st LRZ Extreme Scale Workshop (July 2013)
- Participants: 15 international projects
- Prerequisite: a successful run on 4 islands (32,768 cores)
- Participating groups (software packages): LAMMPS, VERTEX, GADGET, waLBerla, BQCD, GROMACS, APES, SeisSol, CIAO
- Successful results (> 64,000 cores): invitation to the ParCo Conference (Sept. 2013), including a publication of their approach

1st LRZ Extreme Scale Workshop
- Regular SuperMUC operation: at most 4 islands per job, batch scheduling system
- For the challenge the entire SuperMUC was reserved for 2.5 days: 0.5 days for testing, 2 days for executing, 16 (of 19) islands available
- Consumed computing time across all groups: 1 hour of runtime corresponds to about 130,000 CPU hours (see the sketch below); 1 year in total
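The 130,000 CPU hours per wall-clock hour follow directly from the number of cores in use. The back-of-the-envelope sketch below assumes 16 thin-node islands of roughly 8,192 cores each (the slides only state ">8,000 cores" per island).

    # Back-of-the-envelope check of the accounting figure quoted above.
    # Assumption: 16 thin-node islands with roughly 8,192 cores each were in use.
    islands_in_use = 16
    cores_per_island = 8192            # slides state ">8000 cores" per island
    cores_in_use = islands_in_use * cores_per_island

    # With every core busy, one wall-clock hour consumes one core hour per core.
    print(f"{cores_in_use:,} cores -> ~{cores_in_use:,} CPU hours per hour of runtime")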

Results (sustained TFlop/s on 128,000 cores)

  Name      MPI         # cores   Description          TFlop/s/island  TFlop/s max
  Linpack   IBM         128,000   TOP500               161             2,560
  VERTEX    IBM         128,000   Plasma physics       15              245
  GROMACS   IBM, Intel   64,000   Molecular modelling  40              110
  SeisSol   IBM          64,000   Geophysics           31              95
  waLBerla  IBM         128,000   Lattice Boltzmann    5.6             90
  LAMMPS    IBM         128,000   Molecular modelling  5.6             90
  APES      IBM          64,000   CFD                  6               47
  BQCD      Intel       128,000   Quantum physics      10              27

Results
- 5 software packages ran on a maximum of 16 islands: LAMMPS, VERTEX, GADGET, waLBerla, BQCD
- VERTEX reached 245 TFlop/s on 16 islands (A. Marek)
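One way to read the table is to compare each code's full-run throughput with its single-island figure. The sketch below is illustrative only and simply reuses the table values to print that scaling factor.

    # Illustrative reading of the results table: ratio of maximum throughput
    # to single-island throughput for each code.
    results = {                      # name: (TFlop/s per island, TFlop/s max)
        "Linpack":  (161, 2560),
        "VERTEX":   (15, 245),
        "GROMACS":  (40, 110),
        "SeisSol":  (31, 95),
        "waLBerla": (5.6, 90),
        "LAMMPS":   (5.6, 90),
        "APES":     (6, 47),
        "BQCD":     (10, 27),
    }

    for name, (per_island, maximum) in results.items():
        print(f"{name:9s} scaled by a factor of {maximum / per_island:4.1f} over one island")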

Lessons learned: technical perspective
- Hybrid (MPI+OpenMP) codes on SuperMUC are still slower than pure MPI (e.g. GROMACS), but hybrid applications scale to larger core counts (e.g. VERTEX)
- Core pinning needs a lot of experience on the part of the programmer (see the affinity-check sketch below, after the next-steps list)
- Parallel I/O remains a challenge for many applications, both with regard to stability and speed
- Several stability issues with GPFS were observed for very large jobs due to writing thousands of files into a single directory; this will be improved in upcoming versions of the application codes

Next steps
- The LRZ Extreme Scale Benchmark Suite (LESS) will be available in two versions: public and internal
- All teams will have the opportunity to run performance benchmarks after upcoming SuperMUC maintenances
- 2nd LRZ Extreme Scaling Workshop, 2-5 June 2014: full-system production runs on 18 islands with sustained PFlop/s (4 h SeisSol, 7 h GADGET); 4 existing plus 6 additional full-system applications; high I/O bandwidth in user space possible (66 GB/s of 200 GB/s max); important goal: minimize energy * runtime (3-15 W/core)
- Initiation of the LRZ Partnership Initiative πCS
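On the core-pinning point: a quick way to see what the runtime and batch system have actually done with process and thread placement is to query the affinity mask from inside the job. The snippet below is a minimal, Linux-only sketch (not an LRZ tool); run one copy per MPI rank to see which cores each rank and its threads are allowed to use.

    import os
    import threading

    def report_affinity(label: str) -> None:
        """Print the set of CPU cores the current thread may run on (Linux only)."""
        cores = sorted(os.sched_getaffinity(0))
        print(f"{label}: pid={os.getpid()} allowed cores={cores}")

    # Main thread of this process (one MPI rank, if launched under mpiexec/srun).
    report_affinity("rank main thread")

    # Worker threads inherit the mask unless the OpenMP runtime or the batch
    # system pins them explicitly (e.g. via OMP_PLACES / OMP_PROC_BIND).
    t = threading.Thread(target=report_affinity, args=("worker thread",))
    t.start()
    t.join()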

Partnership Initiative Computational Sciences πCS
- Individualized services for selected scientific groups in a flagship role: dedicated point of contact; individual support, guidance and targeted training and education; planning dependability for use-case-specific optimized IT infrastructures; early access to the latest IT infrastructure developments (hardware and software) and specification of future requirements; access to the IT competence network and expertise at the Computer Science and Mathematics departments
- Partner contribution: embedding IT experts in user groups; joint research projects (including funding); scientific partnership and joint publications
- LRZ benefits: understanding the (current and future) needs and requirements of the respective scientific domain; developing future services for all user groups

Partnership Initiative Computational Sciences πCS: goals for LRZ
- Thematic focus on Environmental Computing
- Strengthening science through innovative, high-performance IT technologies, modern IT infrastructures and IT services
- Interdisciplinary integration (technical and personnel) of scientists and (international) research groups
- Novel requirements and research results at the interface of scientific computing and computer-based sciences
- Increased prospects for attracting research funding through established IT expertise contributed to application projects
- Outreach and exploitation

Astrophysics: world's largest simulations of supersonic, compressible turbulence
Figure: slices through the three-dimensional gas density (top panels) and vorticity (bottom panels) for fully developed, highly compressible, supersonic turbulence, generated by solenoidal driving (left-hand column) and compressive driving (right-hand column), at a grid resolution of 4096^3 cells. Federrath C., MNRAS 2013, mnras.stt1644. (c) 2013 The Author. Published by Oxford University Press on behalf of the Royal Astronomical Society.

SeisSol: numerical simulation of seismic wave phenomena
Dr. Christian Pelties, Department of Earth and Environmental Sciences (LMU); Prof. Michael Bader, Department of Informatics (TUM)
1.42 PFlop/s on 147,456 cores of SuperMUC (44.5% of peak performance)
http://www.uni-muenchen.de/informationen_fuer/presse/presseinformationen/2014/pelties_seisol.html
Picture: Alex Breuer (TUM) / Christian Pelties (LMU)
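The quoted peak fraction can be sanity-checked against the machine's nominal peak; a minimal sketch using only the two numbers from the slide:

    # Check of the SeisSol figures quoted above: sustained rate vs. fraction of peak.
    sustained_pflops = 1.42
    fraction_of_peak = 0.445

    implied_peak_pflops = sustained_pflops / fraction_of_peak
    print(f"Implied peak of the partition used: {implied_peak_pflops:.2f} PFlop/s")
    # ~3.19 PFlop/s, consistent with SuperMUC Phase 1's roughly 3.2 PFlop/s peak.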

Extreme Scale Computing at the Leibniz Supercomputing Centre (LRZ)
Dieter Kranzlmüller
kranzlmueller@lrz.de