GASPI - A PGAS API for Scalable and Fault Tolerant Computing



GASPI - A PGAS API for Scalable and Fault Tolerant Computing
Specification of a general-purpose API for one-sided and asynchronous communication, and provision of libraries, tools, examples and best practices
Release 1.0 in November 2012
Reported by J.-P. Weiß, Facing the Multicore-Challenge III, Sep 21, 2012

GASPI - Overview
Funding Programme: ICT 2020 - Research for Innovation
Funding Focus: HPC Software for Scalable Parallel Computers
Funding Code: 01IH11007A
Funding Volume: 2 million euros
Duration: June 1, 2011 - May 31, 2014
Coordinator: Dr. Christian Simmendinger, T-Systems SfR, Pfaffenwaldring 38-40, 70569 Stuttgart, christian.simmendinger@t-systems.com

Background and Motivation
Current parallel software is mainly MPI-based.
Adaptation to current hardware has highlighted significant weaknesses that preclude scalability on heterogeneous multi-core systems.
MPI is huge and, due to backward compatibility, not very flexible.
New demands on programming models:
Flexible threading models
Support for data locality
Asynchronous communication
Management of storage subsystems with varying bandwidth and latency
This "Multicore-Challenge" stimulates the development of new programming models and programming languages and leads to new challenges for mathematical modeling and algorithms.

GASPI - Motivation I
GASPI targets extreme scalability in the exascale age and aims to overcome the limitations of MPI (are there any?!).
GASPI aims to initiate a paradigm shift: from bulk-synchronous two-sided communication patterns towards an asynchronous communication and execution model.
GASPI challenges algorithms, implementations and applications: rethink your communication patterns and reformulate them towards an asynchronous data flow model.
GASPI provides multiple memory segments per GASPI process.
GASPI addresses heterogeneous machines.

GASPI - Motivation II
GASPI is not a new language (not like X10, UPC or Chapel).
GASPI is not a language extension (not like Co-Array Fortran).
GASPI complements existing languages with a PGAS API, very much like MPI.
GASPI supports multiple memory models (not like OpenShmem or Global Arrays).
GASPI can be combined with any threading model.
GASPI is not fixed to an SPMD or MPMD style of execution.

GASPI - Motivation III
GASPI is fault tolerant:
It provides time-out mechanisms for all non-local procedures.
Failure detection can handle node failures and delayed responses.
The sanity of communication partners can be checked via state vectors.
GASPI can be adapted to shrinking or growing node sets.
GASPI leverages one-sided, RDMA-driven communication:
It is implemented on top of the IB verbs layer and the OFED stack.
Communication is handled by the network infrastructure, with no involvement of CPU cores.
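
To make the timeout and state-vector mechanisms concrete, here is a minimal sketch in C. It assumes a GASPI-1.0-style C API as provided by the open-source GPI-2 implementation (header GASPI.h); names such as gaspi_state_vec_get and GASPI_STATE_HEALTHY follow the specification but should be treated as assumptions, not project code.

```c
#include <GASPI.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: a non-local operation with a finite timeout, plus a
 * state-vector check of the communication partners. */
int main(void)
{
  if (gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS)
    return EXIT_FAILURE;

  gaspi_rank_t nprocs;
  gaspi_proc_num(&nprocs);

  /* Wait for outstanding requests on queue 0, but give up after
   * 1000 ms instead of blocking forever. */
  gaspi_return_t ret = gaspi_wait(0, 1000);

  if (ret == GASPI_TIMEOUT)
  {
    /* The partner may have failed or may just be slow: inspect the
     * state vector (one health entry per rank) to decide. */
    gaspi_state_vector_t state = malloc(nprocs);
    if (state != NULL && gaspi_state_vec_get(state) == GASPI_SUCCESS)
    {
      for (gaspi_rank_t r = 0; r < nprocs; ++r)
        if (state[r] != GASPI_STATE_HEALTHY)
          fprintf(stderr, "rank %u looks unhealthy\n", (unsigned)r);
    }
    free(state);
    /* An application could now shrink its node set and continue. */
  }

  gaspi_proc_term(GASPI_BLOCK);
  return EXIT_SUCCESS;
}
```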

GASPI Features I
Processes, groups, ranks
Multiple PGAS memory segments per process
Dynamic support for heterogeneous systems (GPUs, MICs, ...)
One-sided communication primitives:
Asynchronous communication by remote read and write
Handled by local queues, no copy operations into buffers
Notification mechanisms for communication partners
Passive communication:
Two-sided semantics for cases where the sender may be unknown
Fair distributed updates of globally shared parts of data
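
As an illustration of the one-sided primitives and the notification mechanism, the following C sketch performs a remote write combined with a notification, and the matching wait on the receiving side. The segment id, offsets, queue id and notification values are arbitrary illustration choices; the calls (gaspi_write_notify, gaspi_notify_waitsome, gaspi_notify_reset) follow the GASPI specification but are not taken from project code.

```c
#include <GASPI.h>

/* Sketch of asynchronous one-sided communication with notification,
 * assuming both ranks have already created segment 0 of sufficient
 * size. Segment ids, offsets, queue and notification ids are
 * illustrative. */
void exchange(gaspi_rank_t partner, gaspi_size_t nbytes)
{
  gaspi_rank_t myrank;
  gaspi_proc_rank(&myrank);

  if (myrank == 0)
  {
    /* Write nbytes from local segment 0 (offset 0) into the partner's
     * segment 0 (offset 0) and piggyback notification id 1 with value
     * 42. The call only enqueues the request; the network hardware
     * performs the transfer. */
    gaspi_write_notify(0, 0, partner,
                       0, 0, nbytes,
                       1, 42,
                       0, GASPI_BLOCK);

    /* Local completion: wait until queue 0 has drained, i.e. the
     * local buffer may be reused. */
    gaspi_wait(0, GASPI_BLOCK);
  }
  else
  {
    /* Receiver: wait for notification id 1 on segment 0, then reset
     * it so it can be reused in the next iteration. */
    gaspi_notification_id_t first;
    gaspi_notification_t value;
    gaspi_notify_waitsome(0, 1, 1, &first, GASPI_BLOCK);
    gaspi_notify_reset(0, first, &value);
    /* Data written by rank 0 is now visible in the local segment. */
  }
}
```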

GASPI Features II
Weak synchronization primitives
Global atomics: fetch_and_add, compare_and_swap
Counters as globally shared variables or for synchronization
Collective communication:
Allreduce, broadcast, barrier with group support
User-defined global collectives
Asynchronous versions provided
Time-out mechanisms for non-blocking routines enable fault tolerance
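
The following C sketch illustrates the global atomics and the group-based collectives. It assumes a counter stored at offset 0 of segment 0 on rank 0; the segment id, offset and the use of GASPI_GROUP_ALL are illustrative assumptions, and the calls follow the GASPI specification rather than any particular project code.

```c
#include <GASPI.h>

/* Sketch of the global atomics and group collectives. Assumes segment 0
 * exists on rank 0 and holds a gaspi_atomic_value_t counter at offset 0;
 * all ids and offsets are illustrative. */
double atomics_and_collectives(double local_residual)
{
  /* Globally shared counter: atomically add 1 to the value stored on
   * rank 0 and obtain the previous value, e.g. to claim a work item. */
  gaspi_atomic_value_t old_value;
  gaspi_atomic_fetch_add(0, 0, 0, 1, &old_value, GASPI_BLOCK);
  (void)old_value; /* old_value would index the claimed work item */

  /* Collective with group support: sum the local residuals over all
   * ranks of GASPI_GROUP_ALL. */
  double global_residual = 0.0;
  gaspi_allreduce(&local_residual, &global_residual, 1,
                  GASPI_OP_SUM, GASPI_TYPE_DOUBLE,
                  GASPI_GROUP_ALL, GASPI_BLOCK);

  /* A barrier on the same group; like all non-local calls it takes a
   * timeout (here: block until completion). */
  gaspi_barrier(GASPI_GROUP_ALL, GASPI_BLOCK);

  return global_residual;
}
```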

Project Activities I
Specification of the GASPI standard for a PGAS API:
Ensure interoperability with MPI
Take into account the requirements of applications
Provision of an open-source GASPI implementation: a portable, high-performance library for one-sided and asynchronous communication
Adaptation and further development of the Vampir performance analysis suite for the GASPI standard

Project Activities II
Development of efficient numerical libraries based on GASPI core functions: sparse and dense linear algebra routines, high-level solvers, FEM code
Verification through porting of complex, industry-oriented applications
Evaluation, benchmarking and performance analysis
Outreach to the HPC and scientific computing community through information dissemination, formation of user groups, trainings and workshops

Key Objectives
In a Partitioned Global Address Space, every thread can read and write the entire global memory of an application.
Scalability: from bulk-synchronous two-sided communication patterns to asynchronous one-sided communication
Fault tolerance: timeouts in non-local operations, dynamic node sets
Flexibility: support for multiple memory models, multiple segments, configurable hardware resources
Versatility: a PGAS API beyond the message-passing model of MPI
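
As a minimal illustration of the PGAS idea, the C sketch below lets every process create a segment that becomes remotely readable and writable by all other ranks, while the owning process accesses its partition through an ordinary local pointer. Segment id 0 and the 64 MiB size are arbitrary illustration values; the calls follow the GASPI specification (as implemented, for example, by GPI-2).

```c
#include <GASPI.h>
#include <stdlib.h>

/* Minimal sketch of setting up a PGAS segment. Segment id 0 and the
 * 64 MiB size are illustrative choices, not values from the project. */
int main(void)
{
  if (gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS)
    return EXIT_FAILURE;

  gaspi_rank_t rank, nprocs;
  gaspi_proc_rank(&rank);
  gaspi_proc_num(&nprocs);

  /* Create segment 0 on every rank of GASPI_GROUP_ALL. The memory is
   * registered for RDMA and becomes remotely readable and writable. */
  gaspi_segment_create(0, 64UL << 20,
                       GASPI_GROUP_ALL, GASPI_BLOCK,
                       GASPI_MEM_INITIALIZED);

  /* The owning process accesses its partition through a plain pointer. */
  gaspi_pointer_t ptr;
  gaspi_segment_ptr(0, &ptr);
  ((double *)ptr)[0] = (double)rank;

  gaspi_barrier(GASPI_GROUP_ALL, GASPI_BLOCK);
  /* From here on, any rank could access this segment with one-sided
   * reads and writes. */

  gaspi_proc_term(GASPI_BLOCK);
  return EXIT_SUCCESS;
}
```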

Project Partners
Fraunhofer-Gesellschaft e.V. (Fraunhofer ITWM, Fraunhofer SCAI)
T-Systems Solutions for Research GmbH
Forschungszentrum Jülich
Karlsruhe Institute of Technology
Deutsches Zentrum für Luft- und Raumfahrt e.V. (Institute of Aerodynamics and Flow Technology, Institute of Propulsion Technology)
Technische Universität Dresden, Center for Information Services and HPC
Deutscher Wetterdienst
scapos AG

Contributors
Thomas Alrutz (1), Jan Backhaus (2), Thomas Brandes (3), Vanessa End (1), Thomas Gerhold (4), Alfred Geiger (1), Daniel Grünewald (5), Vincent Heuveline (6), Jens Jägersküpper (4), Andreas Knüpfer (7), Olaf Krzikalla (7), Edmund Kügeler (2), Carsten Lojewski (5), Guy Lonsdale (8), Ralph Müller-Pfefferkorn (7), Wolfgang Nagel (7), Lena Oden (5), Franz-Josef Pfreundt (5), Mirko Rahn (5), Michael Sattler (1), Mareike Schmidtobreick (6), Annika Schiller (9), Christian Simmendinger (1), Thomas Soddemann (3), Godehard Sutmann (9), Henning Weber (10), Jan-Philipp Weiß (2)
1 T-Systems SfR, Stuttgart & Göttingen
2 DLR, Institut für Antriebstechnik, Köln
3 Fraunhofer SCAI, Sankt Augustin
4 DLR, Institut für Aerodynamik und Strömungstechnik, Braunschweig & Göttingen
5 Fraunhofer ITWM, Kaiserslautern
6 Engineering Mathematics and Computing Lab (EMCL), KIT Karlsruhe
7 Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH), TU Dresden
8 scapos AG, Sankt Augustin
9 Forschungszentrum Jülich
10 Deutscher Wetterdienst (DWD), Offenbach