GASPI: A PGAS API for Scalable and Fault Tolerant Computing
1 GASPI: A PGAS API for Scalable and Fault Tolerant Computing
Specification of a general-purpose API for one-sided and asynchronous communication, and provision of libraries, tools, examples and best practices.
Release 1.0 in November 2012.
Reported by J.-P. Weiß, Facing the Multicore-Challenge III, Sep 21, 2012.
2 GASPI - Overview
Funding Programme: ICT Research for Innovation
Funding Focus: HPC Software for Scalable Parallel Computers
Funding Code: 01IH11007A
Funding Volume: 2 million euro
Duration: June 1, 2011 to May 31, 2014
Coordinator: Dr. Christian Simmendinger, T-Systems SfR, Pfaffenwaldring 38-40, 70569 Stuttgart, [email protected]
3 Background and Motivation
Current parallel software is mainly MPI-based.
Adaptation to current hardware has highlighted significant weaknesses that preclude scalability on heterogeneous multi-core systems.
MPI is huge and, due to backward compatibility, not very flexible.
New demands on programming models: flexible threading models, support for data locality, asynchronous communication, and management of storage subsystems with varying bandwidth and latency.
This "Multicore-Challenge" stimulates the development of new programming models and programming languages, and leads to new challenges for mathematical modeling and algorithms.
4 GASPI - Motivation I
GASPI targets extreme scalability in the exascale age.
Overcome the limitations of MPI (are there any?!).
GASPI aims to initiate a paradigm shift: from bulk-synchronous, two-sided communication patterns towards an asynchronous communication and execution model.
GASPI challenges algorithms, implementations and applications: rethink your communication patterns and reformulate them towards an asynchronous data-flow model.
GASPI provides multiple memory segments per GASPI process.
GASPI addresses heterogeneous machines.
5 GASPI - Motivation II
GASPI is not a new language (not like X10, UPC, Chapel).
GASPI is not a language extension (not like Co-Array Fortran).
GASPI complements existing languages with a PGAS API, very much like MPI.
GASPI supports multiple memory models (not like OpenSHMEM or Global Arrays).
GASPI can be combined with any threading model.
GASPI is not fixed to an SPMD or MPMD style of execution.
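To illustrate the library-API character of GASPI (as opposed to a new language), the following minimal skeleton shows a GASPI process being started, queried for its rank, synchronized, and shut down. This is a hedged sketch using the C bindings of the GASPI specification as implemented in the GPI-2 reference library; the function names are taken from that API, not from the slides themselves.

```c
/* Minimal GASPI program skeleton (sketch; assumes the GASPI C API as
 * provided by the GPI-2 reference implementation). */
#include <GASPI.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    (void)argc; (void)argv;

    /* Initialize the GASPI process; GASPI_BLOCK waits without a timeout. */
    if (gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS) {
        fprintf(stderr, "gaspi_proc_init failed\n");
        return EXIT_FAILURE;
    }

    gaspi_rank_t rank, num;
    gaspi_proc_rank(&rank);   /* this process's rank   */
    gaspi_proc_num(&num);     /* total number of ranks */
    printf("GASPI process %d of %d\n", rank, num);

    /* Collective synchronization over the predefined group of all ranks. */
    gaspi_barrier(GASPI_GROUP_ALL, GASPI_BLOCK);

    gaspi_proc_term(GASPI_BLOCK);
    return EXIT_SUCCESS;
}
```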
6 GASPI - Motivation III
GASPI is fault tolerant: it provides time-out mechanisms for all non-local procedures.
Failure detection can handle node failures and delayed responses; the sanity of communication partners can be checked via state vectors.
GASPI can be adapted to shrinking or growing node sets.
GASPI leverages one-sided, RDMA-driven communication, implemented on top of the InfiniBand verbs layer and the OFED stack.
Communication is handled by the network infrastructure, with no involvement of CPU cores.
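The sketch below shows how the time-out and state-vector mechanisms mentioned above fit together: a non-local call is given a finite timeout instead of GASPI_BLOCK, and if the call times out, the health of all communication partners is queried. The names (gaspi_wait, gaspi_state_vec_get, GASPI_STATE_HEALTHY) follow the GASPI specification as implemented in GPI-2 and are assumptions for illustration, not content of the slides.

```c
/* Sketch of timeout-based failure detection (assumed GASPI/GPI-2 API). */
#include <GASPI.h>
#include <stdio.h>
#include <stdlib.h>

static void check_partners(void)
{
    gaspi_rank_t num;
    gaspi_proc_num(&num);

    /* One state entry per rank: GASPI_STATE_HEALTHY or GASPI_STATE_CORRUPT. */
    gaspi_state_vector_t states = malloc(num);
    if (gaspi_state_vec_get(states) == GASPI_SUCCESS) {
        for (gaspi_rank_t r = 0; r < num; ++r)
            if (states[r] != GASPI_STATE_HEALTHY)
                fprintf(stderr, "rank %d reported unhealthy\n", r);
    }
    free(states);
}

void wait_with_timeout(gaspi_queue_id_t queue)
{
    /* Wait at most 1000 ms for all requests posted on this queue. */
    gaspi_return_t ret = gaspi_wait(queue, 1000);
    if (ret == GASPI_TIMEOUT)
        check_partners();          /* delayed response or node failure? */
    else if (ret != GASPI_SUCCESS)
        fprintf(stderr, "gaspi_wait returned an error\n");
}
```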
7 GASPI Features I
Processes, groups, ranks.
Multiple PGAS memory segments per process; dynamic support for heterogeneous systems (GPUs, MICs, etc.).
One-sided communication primitives: asynchronous communication by remote read and write, handled by local queues, with no copy operations into buffers; notification mechanisms for communication partners.
Passive communication: two-sided semantics for the case when the sender may be unknown; fair distributed updates of globally shared parts of data.
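A hedged sketch of the asynchronous one-sided pattern listed above: each rank contributes a segment to the global address space, writes data into a neighbour's segment together with a notification, and waits on the incoming notification instead of posting a matching receive. Function names follow the GASPI specification / GPI-2; segment id, offsets and sizes are illustrative assumptions.

```c
/* One-sided write with notification over a ring of ranks (sketch). */
#include <GASPI.h>
#include <string.h>

#define SEG_ID   0
#define SEG_SIZE (1 << 20)          /* 1 MiB segment (illustrative size) */
#define NOTIF_ID 0

void ring_exchange(gaspi_rank_t rank, gaspi_rank_t num)
{
    /* Every rank contributes one segment to the partitioned global space. */
    gaspi_segment_create(SEG_ID, SEG_SIZE, GASPI_GROUP_ALL,
                         GASPI_BLOCK, GASPI_MEM_UNINITIALIZED);

    gaspi_pointer_t ptr;
    gaspi_segment_ptr(SEG_ID, &ptr);          /* local access to the segment */
    memset(ptr, rank & 0xff, 256);            /* payload to send             */

    gaspi_rank_t right = (rank + 1) % num;

    /* One-sided write of 256 bytes into the right neighbour's segment,
     * followed by a notification the neighbour can wait on (no recv call). */
    gaspi_write_notify(SEG_ID, 0, right,
                       SEG_ID, 512, 256,
                       NOTIF_ID, 1, 0 /* queue */, GASPI_BLOCK);

    /* Wait until the left neighbour's notification arrives, then reset it. */
    gaspi_notification_id_t got;
    gaspi_notification_t    val;
    gaspi_notify_waitsome(SEG_ID, NOTIF_ID, 1, &got, GASPI_BLOCK);
    gaspi_notify_reset(SEG_ID, got, &val);

    /* Flush the queue: the local send buffer may be reused afterwards. */
    gaspi_wait(0, GASPI_BLOCK);
}
```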
8 GASPI Features II
Weak synchronization primitives.
Global atomics (fetch_and_add, compare_and_swap); counters as globally shared variables or for synchronization.
Collective communication: allreduce, broadcast and barrier with group support; user-defined global collectives; asynchronous versions provided.
Time-out mechanisms for non-blocking routines enable fault tolerance.
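The following sketch pairs a sum allreduce over the group of all ranks with a fetch_and_add on a globally shared counter, the two primitives highlighted above. API names follow the GASPI specification / GPI-2; the counter's location (offset 0 of segment 0 on rank 0) is an assumption made for illustration.

```c
/* Collective and atomic primitives (sketch; assumed GASPI/GPI-2 API). */
#include <GASPI.h>

double global_residual(double local_residual)
{
    double global = 0.0;

    /* Allreduce (sum) over all ranks; GASPI_BLOCK could be replaced by a
     * finite timeout to obtain the time-out semantics mentioned above. */
    gaspi_allreduce(&local_residual, &global, 1,
                    GASPI_OP_SUM, GASPI_TYPE_DOUBLE,
                    GASPI_GROUP_ALL, GASPI_BLOCK);
    return global;
}

gaspi_atomic_value_t next_work_item(void)
{
    /* Globally shared counter: fetch its old value and add 1 atomically.
     * Assumes the counter lives at offset 0 of segment 0 on rank 0. */
    gaspi_atomic_value_t old;
    gaspi_atomic_fetch_add(0 /* segment */, 0 /* offset */, 0 /* rank */,
                           1, &old, GASPI_BLOCK);
    return old;
}
```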
9 Project Activities I
Specification of the GASPI standard for a PGAS API: ensure interoperability with MPI and take into account the requirements of applications.
Provision of an open-source GASPI implementation: a portable, high-performance library for one-sided and asynchronous communication.
Adaptation and further development of the Vampir performance analysis suite for the GASPI standard.
10 Project Activities II
Development of efficient numerical libraries based on GASPI core functions: sparse and dense linear algebra routines, high-level solvers, FEM code.
Verification through porting of complex, industry-oriented applications.
Evaluation, benchmarking and performance analysis.
Outreach to the HPC and scientific computing community through information dissemination, formation of user groups, trainings and workshops.
11 Key Objectives
In a Partitioned Global Address Space, every thread can read and write the entire global memory of an application.
Scalability: from bulk-synchronous, two-sided communication patterns to asynchronous one-sided communication.
Fault tolerance: timeouts in non-local operations, dynamic node sets.
Flexibility: support for multiple memory models, multiple segments, configurable hardware resources.
Versatility: a PGAS API that goes beyond the message-passing model of MPI.
12 Project Partners
Fraunhofer-Gesellschaft e.V.: Fraunhofer ITWM, Fraunhofer SCAI
T-Systems Solutions for Research GmbH
Forschungszentrum Jülich
Karlsruhe Institute of Technology
Deutsches Zentrum für Luft- und Raumfahrt e.V.: Institute of Aerodynamics and Flow Technology, Institute of Propulsion Technology
Technische Universität Dresden: Center for Information Services and HPC
Deutscher Wetterdienst
scapos AG
13 Contributors
Thomas Alrutz (1), Jan Backhaus (2), Thomas Brandes (3), Vanessa End (1), Thomas Gerhold (4), Alfred Geiger (1), Daniel Grünewald (5), Vincent Heuveline (6), Jens Jägersküpper (4), Andreas Knüpfer (7), Olaf Krzikalla (7), Edmund Kügeler (2), Carsten Lojewski (5), Guy Lonsdale (8), Ralph Müller-Pfefferkorn (7), Wolfgang Nagel (7), Lena Oden (5), Franz-Josef Pfreundt (5), Mirko Rahn (5), Michael Sattler (1), Mareike Schmidtobreick (6), Annika Schiller (9), Christian Simmendinger (1), Thomas Soddemann (3), Godehard Sutmann (9), Henning Weber (10), Jan-Philipp Weiß (2)
(1) T-Systems SfR, Stuttgart & Göttingen
(2) DLR, Institut für Antriebstechnik, Köln
(3) Fraunhofer SCAI, Sankt Augustin
(4) DLR, Institut für Aerodynamik und Strömungstechnik, Braunschweig & Göttingen
(5) Fraunhofer ITWM, Kaiserslautern
(6) Engineering Mathematics and Computing Lab (EMCL), KIT Karlsruhe
(7) Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH), TU Dresden
(8) scapos AG, Sankt Augustin
(9) Forschungszentrum Jülich
(10) Deutscher Wetterdienst (DWD), Offenbach