NVIDIA HPC Update. Carl Ponder, PhD; NVIDIA, Austin, TX, USA - Sr. Applications Engineer, Developer Technology Group


1 NVIDIA HPC Update
Carl Ponder, PhD; cponder@nvidia.com; NVIDIA, Austin, TX, USA - Sr. Applications Engineer, Developer Technology Group
Stan Posey; sposey@nvidia.com; NVIDIA, Santa Clara, CA, USA - HPC Program Manager, ESM and CFD Solutions

2 IBM and NVIDIA CWO Review
NVIDIA HPC Strategy for CWO
GPU Model Developments
- Global: ECMWF ESCAPE Project, NOAA NGGPS
- Meso-scale: COSMO, WRF

3 GPU Motivation (I): Performance Trends
[Charts: peak double-precision FLOPS (axis to 8000 GFLOPS) and peak memory bandwidth (axis to 1400 GB/s) for NVIDIA GPUs - M1060, M2050, M2090, K20, K40, K80, Pascal, Volta - versus x86 CPUs, with ~5x GPU-over-CPU and ~15x generational-gain annotations]

4 GPU Motivation (II): Power Limiting Trends
Challenges of Getting ECMWF's Weather Forecast Model (IFS) to the Exascale - G. Mozdzynski, ECMWF 16th HPC Workshop
Conclusion: Current mathematics, algorithms and hardware will not be sufficient (by far)

5 GPU Motivation (III): ES Model Trends
Higher grid resolution with manageable compute and energy costs
- Global atmosphere models from 10 km today to cloud-resolving scales of 1-3 km
[Chart: IFS resolution progression 128 km -> 16 km -> 10 km -> 3 km; Source: Project Athena]
Increase in ensemble use and ensemble members to manage uncertainty
- Number of jobs grows 10x - 50x
Fewer model approximations (NH: non-hydrostatic), more features (physics, chemistry, etc.)
Accelerator technology identified as a cost-effective and practical approach to future NWP requirements

6 Tesla GPU Progression During Recent Years

                    2012 (Fermi)  (Kepler)   2014 (Kepler)  (Kepler)            Kepler / Fermi
                    M2075         K20X       K40            K80
Peak SP             1.03 TF       3.93 TF    4.29 TF        8.74 TF             4x
Peak SGEMM          --            2.95 TF    3.22 TF        --
Peak DP             0.515 TF      1.31 TF    1.43 TF        2.90 TF             3x
Peak DGEMM          --            1.22 TF    1.33 TF        --
Memory size         6 GB          6 GB       12 GB          24 GB (12 each)     2x
Mem BW (ECC off)    150 GB/s      250 GB/s   288 GB/s       480 GB/s (240 each) 2x
Memory Clock        --            2.6 GHz    3.0 GHz        3.0 GHz
PCIe Gen            Gen 2         Gen 2      Gen 3          Gen 3               2x
# of Cores          448           2688       2880           4992 (2496 each)    5x
Board Power         235W          235W       235W           300W                +28%

Note: Tesla K80 specifications are shown as aggregate of two GPUs on a single board

7 Features of Pascal GPU Architecture 2016
- NVLink interconnect at 80 GB/s (speed of CPU memory)
- Stacked memory: 4x higher bandwidth (~1 TB/s), 3x capacity, 4x more efficient
- Unified memory: lower development effort (available today in CUDA 6; sketch below)
[Diagram: Tesla GPU with stacked HBM at 1 TB/s, NVLink at 80 GB/s to the CPU (POWER, ?), CPU with DDR memory; unified virtual memory spans both]
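The unified-memory model can be illustrated with a short CUDA Fortran sketch - a minimal example assuming the PGI CUDA Fortran `managed` attribute, with illustrative names and sizes, not production code. A single managed allocation is visible to both host and device, so no explicit copies are needed:

```fortran
! Minimal CUDA Fortran sketch of unified (managed) memory.
module kernels
contains
  attributes(global) subroutine scale(a, s, n)
    real, device :: a(*)
    real, value :: s
    integer, value :: n
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = s * a(i)
  end subroutine scale
end module kernels

program managed_demo
  use cudafor
  use kernels
  implicit none
  integer, parameter :: n = 1024*1024
  real, managed, allocatable :: a(:)            ! one allocation, visible to CPU and GPU
  integer :: istat
  allocate(a(n))
  a = 1.0                                       ! initialized on the host
  call scale<<<(n + 255)/256, 256>>>(a, 2.0, n) ! scaled on the GPU
  istat = cudaDeviceSynchronize()
  print *, 'a(1) =', a(1)                       ! read back on the host, no cudaMemcpy
end program managed_demo
```

The runtime migrates pages between CPU and GPU memory on demand, which is the "lower development effort" the slide refers to; stacked memory and NVLink aim to make that migration fast enough to be routine.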

8 NVLink Interconnect Alternative to PCIe
- High speed interconnect for CPU-to-GPU and GPU-to-GPU
- Performance advantage over PCIe Gen-3: 5x to 12x faster (PCIe Gen-3 x16 delivers roughly 16 GB/s per direction, versus 80 GB/s for NVLink)
- GPUs and CPUs share data structures at CPU memory speeds
[Diagram: same Tesla GPU / HBM / NVLink / CPU DDR memory hierarchy as the previous slide]

9 NVLink Interconnect Alternative to PCIe
GPU density increasing - HPC vendors & NVLink 2016: all will support GPU-to-GPU, several to support CPU-to-GPU
- Cray CS-Storm: 8 x K80
- Dell C4130: 4 x K80
- HP SL270: 8 x K80

10 GPU and CPU Platforms and Configurations
[Diagram: 2014 - Kepler GPU attached over PCIe to x86, ARM64, or POWER CPUs; 2016 - Pascal GPU attached over NVLink to the POWER CPU, with PCIe still serving x86 and ARM64]
- 2014: More CPU choices for GPU acceleration
- 2016: NVLink interconnect as an alternative to PCIe

11 IBM Power + NVIDIA GPU Accelerated HPC
Next-gen IBM supercomputers and enterprise servers: POWER CPU + Tesla GPU
OpenPOWER Foundation: open ecosystem built on the Power Architecture, with long-term roadmap integration [member logos: 30+ organizations]
First GPU-accelerated POWER-based systems now available

12 DOE CORAL Based on Power + GPU 2017
US DOE CORAL systems Summit (ORNL) and Sierra (LLNL), installation 2017 at ~150 PF each
- Nodes of POWER9 CPUs + Tesla Volta GPUs
- NVLink interconnect for CPUs + GPUs
ORNL Summit system:
- Approximately 3,400 total nodes, each node 40+ TF peak performance (3,400 x 40+ TF/node gives roughly 150 PF)
- About 1/5 the nodes of #2-ranked Titan (18K+ nodes), at the same energy use as Titan (27 PF)
CORAL Summit system: 5-10x faster than Titan, 1/5th the nodes, same energy use as Titan

13 Feature Comparison of Summit vs. Titan
[Table: Summit vs. Titan comparison factors of ~5-10x (performance) and ~1/5x (node count), plus ~30x, ~15x, ~10-20x, and ~0x for other features]

14 CORAL Systems for US DOE Climate Model ACME
- ACME: Accelerated Climate Model for Energy
- DOE Labs: Argonne, LANL, LBL, LLNL, ORNL, PNNL, Sandia
- Co-design project using US DOE LCF systems
- Branch of CESM using CAM-SE (atm) and MPAS-O (ocn)
ACME Project Roadmap (page 2 of the ACME Strategy Report, 11 Jul 2014):
- CORAL: Summit, Sierra, Aurora
- Trinity / NERSC8 (Cori)
- Titan OLCF [AMD + GPU]
- Mira ALCF [Blue Gene/Q]

15 GPU Programming Models for CPU Platforms
Availability for x86 today; POWER and ARM under development
- Libraries (e.g., AmgX, cuBLAS)
- Compiler directives (e.g., OpenACC; a sketch follows)
- Programming languages (e.g., CUDA)
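As a flavor of the directives approach, here is a minimal OpenACC Fortran sketch - an illustrative SAXPY loop, not taken from any of the models discussed. The loop itself is unchanged; the directive asks the compiler to generate the GPU kernel and handle the data movement:

```fortran
! Minimal OpenACC sketch: annotate the loop, let the compiler build the kernel.
program saxpy_acc
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i
  allocate(x(n), y(n))
  a = 2.0
  x = 1.0
  y = 0.0
  !$acc parallel loop copyin(x) copy(y)   ! offload the loop; move x in, y in and out
  do i = 1, n
     y(i) = y(i) + a * x(i)
  end do
  print *, 'y(1) =', y(1)
end program saxpy_acc
```

The same source still compiles and runs on a CPU-only system when the directives are ignored, which is what makes this approach attractive for large Fortran weather codes.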

16 PGI Compilers for OpenPOWER + GPU
- Feature parity with PGI compilers on Linux/x86 + Tesla
- CUDA Fortran, OpenACC, OpenMP, CUDA C/C++ host compiler
- Integrated with IBM's optimized LLVM / Power code generator
- Limited access in 2015, beta 1H 2016, production in 2016
The goal is that existing x86 GPU code simply recompiles for OpenPOWER

17 NVIDIA GPU Focus for Atmosphere Models
Recent developments:
- Global: ECMWF ESCAPE Project, NOAA NGGPS
- Regional: COSMO, WRF

18 ECMWF Project for Exascale Weather Prediction
Expectations towards Exascale: Weather and Climate Prediction - P. Bauer, ECMWF, EESI-2015
ESCAPE HPC goals:
- Standardized, highly optimized kernels on specialized hardware
- Overlap of communication and computation (sketched after this list)
- Compilers/standards supporting portability
Project approach: co-design between domain scientists, computer scientists, and HPC vendors
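One of these goals - overlapping communication and computation - can be sketched in OpenACC Fortran. This is a schematic pattern with hypothetical array names, not ESCAPE code; a real model would post MPI halo exchanges where the host-side comment sits:

```fortran
! Schematic overlap: the interior update runs asynchronously on the GPU
! while the host handles boundaries (in practice, MPI halo exchanges).
subroutine step_overlap(f, g, n)
  implicit none
  integer, intent(in) :: n
  real, intent(inout) :: f(n), g(n)
  integer :: i
  !$acc data copy(f, g)
  !$acc parallel loop async(1)        ! non-blocking GPU kernel on queue 1
  do i = 2, n - 1
     g(i) = 0.5 * (f(i-1) + f(i+1))
  end do
  g(1) = f(1)                         ! host does boundary/communication work
  g(n) = f(n)                         ! concurrently with the GPU kernel
  !$acc update device(g(1:1)) async(1)
  !$acc update device(g(n:n)) async(1)
  !$acc wait(1)                       ! synchronize before leaving the region
  !$acc end data
end subroutine step_overlap
```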

19 ECMWF Initial IFS Investigation on Titan GPUs
Challenges of Getting ECMWF's Weather Forecast Model (IFS) to the Exascale - G. Mozdzynski, ECMWF 16th HPC Workshop
[Chart: Legendre transform kernels, >3x to >4x speedup observed for Titan K20X vs. Ivy Bridge CPU cores on a Cray XC]

20 NOAA NGGPS: Background on Program
From: Next Generation HPC and Forecast Model Application Readiness at NCEP - by John Michalakes, NOAA NCEP; AMS, Phoenix, AZ, Jan 2015

21 NOAA NGGPS: NH Model Dycore Candidates (5)
- Dycores FV3 & MPAS selected for Level-II evaluation
- NVIDIA, NOAA investigating potential for GPUs
- NOAA evaluations under HPC program SENA: Software Engineering for Novel Architectures (2015)
From: Next Generation HPC and Forecast Model Application Readiness at NCEP - by John Michalakes, NOAA NCEP; AMS, Phoenix, AZ, Jan 2015

22 COSMO Regional Model on GPUs
- COSMO model fully implemented on GPUs by MeteoSwiss
- COSMO consortium approved GPU code for standard distribution (5.3)
- MeteoSwiss to deploy first-ever operational NWP on GPUs
- Successful daily test runs on GPUs since Dec 2014, operational in 2016
[Images: COSMO-2 (2.2 km) and COSMO-1 (1.1 km) forecasts versus reality - increased resolution = quality forecast]

23 Operational NWP at MeteoSwiss Before GPUs
Delivering performance in scientific simulations: present and future role of GPUs in supercomputers - by Dr. Thomas Schulthess, Director CSCS; NVIDIA GTC, March 2014

24 New Operational NWP at MeteoSwiss With GPUs
Climate Simulation and Numerical Weather Prediction Using GPUs - by Dr. Xavier Lapillonne, MeteoSwiss; EGU Assembly, Apr 2015
- MeteoSwiss will deploy the first-ever operational NWP with GPUs (2016)
- Two identical systems purchased (Cray CS + K80 GPUs) to be installed during 2015
- Successful daily run of pre-operational COSMO on GPUs since Dec 2014 at CSCS
New operational NWP configurations - higher resolution and ensemble prediction only possible with the performance-per-energy gains from GPUs:
- Deterministic: 2.2 km increases to 1.1 km - short-range rapid update (grid = 1158x774x80, 1.1 km), forecast of 24 hrs generated every 3 h (8 per day)
- Ensemble (EPS): 6.6 km increases to 2.2 km - medium-range ensemble (grid = 582x390x60, 2.2 km), forecast of 5+ days generated every 12 h (2 per day)

25 MeteoSwiss GPU-Driven Weather Prediction
MeteoSwiss COSMO NWP configurations since 2008 (before GPUs):
- IFS from ECMWF: 2 per day, 10 day forecast
- COSMO 7 (6.6 km): 3 per day, 3 day forecast
- COSMO 2 (2.2 km): 8 per day, 24 hr forecast
MeteoSwiss COSMO NWP configurations during 2016 (with GPUs):
- IFS from ECMWF: 2 per day, 10 day forecast
- COSMO E (2.2 km): 2 per day, 5 day forecast
- COSMO 1 (1.1 km): 8 per day, 24 hr forecast
New configurations of higher resolution and ensemble predictions possible owing to the performance-per-energy gains from GPUs
X. Lapillonne, MeteoSwiss; EGU Assembly, Apr 2015

26 COSMO Model Performance for K20X Oct 2014
Are OpenACC Directives the Easy Way to Port NWP Applications to GPUs? - by Dr. Xavier Lapillonne, MeteoSwiss; ECMWF 16th HPC Workshop, Oct 2014
Details of benchmark and CPU + GPU configurations:
- COSMO-2 (2 km) configuration, 24 h run, grid = 520x350x60
- System Todi, Cray XK7 at CSCS; all results for 9 compute nodes with 9 x AMD CPUs and 9 x K20X GPUs
- All results socket-to-socket: CPU runs use the full 16 cores, GPU runs use 1 CPU core
Results:
- Physics with OpenACC optimizations: 3.4x speedup
- Assimilation (OpenACC on 1 CPU core): 3.3x slower, but < 15% of the profile
- Total GPU speedup: 2.7x
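The structure behind such numbers can be sketched as follows - an illustrative OpenACC time-stepping loop, not COSMO source (the field names and the update are stand-ins; only the grid dimensions mirror the benchmark above). The key point is the enclosing data region, which keeps fields resident on the GPU across time steps so the per-step kernels avoid repeated host-device transfers:

```fortran
! Illustrative sketch: fields stay on the GPU for the whole run.
program timestep_sketch
  implicit none
  integer, parameter :: nx = 520, ny = 350, nz = 60, nsteps = 10
  real, allocatable :: t(:,:,:), qv(:,:,:)
  integer :: step, i, j, k
  allocate(t(nx,ny,nz), qv(nx,ny,nz))
  t = 280.0
  qv = 0.01
  !$acc data copy(t, qv)              ! one transfer in, one transfer out
  do step = 1, nsteps
     !$acc parallel loop collapse(3)  ! per-step kernel, no host round trip
     do k = 1, nz
        do j = 1, ny
           do i = 1, nx
              t(i,j,k)  = t(i,j,k) + 0.001 * qv(i,j,k)   ! stand-in physics update
              qv(i,j,k) = 0.999 * qv(i,j,k)
           end do
        end do
     end do
  end do
  !$acc end data
  print *, 't(1,1,1) =', t(1,1,1)
end program timestep_sketch
```

Components left on one CPU core (like the assimilation above) run slower, but remain tolerable while they account for under 15% of the profile.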

27 COSMO Model Performance for K20X Apr 2015
Climate Simulation and Numerical Weather Prediction Using GPUs - by Dr. Xavier Lapillonne, MeteoSwiss; EGU Assembly, Apr 2015
Details of benchmark and CPU + GPU configurations:
- COSMO-1 (1 km) configuration, 24 h run, grid = 1158x774x80
- System Piz Daint, Cray XC30 at CSCS; all results for 70 compute nodes with 1 x Sandy Bridge CPU and 1 x K20X per node
- All results socket-to-socket: CPU runs use the full 8 cores, GPU runs use 1 CPU core
Results:
- Total GPU speedup: 2.2x
- More substantial gains in energy per solution, but not reported here

28 COSMO Development on GPUs by MeteoSwiss
Climate Simulation and Numerical Weather Prediction Using GPUs - by Dr. Xavier Lapillonne, MeteoSwiss; EGU Assembly, Apr 2015
[Diagrams: porting approach and implementation]

29 References: COSMO Model on NVIDIA GPUs
- Towards GPU-accelerated Operational Weather Forecasting - Oliver Fuhrer (MeteoSwiss), NVIDIA GTC 2013, Mar 2013
- Delivering Performance in Scientific Simulations: Present and Future Role of GPUs in Supercomputers - Thomas Schulthess (CSCS), NVIDIA GTC 2014, Mar 2014
- Implementation of COSMO on Accelerators - Oliver Fuhrer (MeteoSwiss), ECMWF Scalability Workshop, Apr 2014
- Implementation of COSMO on Accelerators - Xavier Lapillonne (MeteoSwiss), ECMWF 16th HPC Workshop, Oct 2014

30 WRF Progress Update
Independent development projects of (I) CUDA and (II) OpenACC
I. CUDA through funded collaboration with SSEC
- SSEC lead Dr. Bormin Huang, NVIDIA CUDA Fellow: research.nvidia.com/users/bormin-huang
- Objective: hybrid WRF benchmark capability starting 2H 2015
II. OpenACC directives through collaboration with NCAR MMM
- Objective: NCAR GPU interest in a Fortran version to host on the distribution site

31 Project Status (I): NVIDIA-SSEC CUDA WRF
Objective to provide Q3 benchmark capability of hybrid CPU+GPU WRF
- SSEC ported 20+ modules to CUDA across 5 physics types in WRF: Cloud MP = 11; Radiation = 4; Surface Layer = 3; Planetary BL = 1; Cumulus = 1
Project to integrate CUDA modules into the latest full-model WRF
- SSEC development began 2010, based on WRF 3.3/3.4 vs. the latest release
- SSEC's focus was publications, so module development was isolated from the WRF model
- Start with existing modules most common across target customers' input
- Objective: speedups measured on individual modules within a full model run
NVIDIA-SSEC project success:
- Target modules achieve published GPU performance within a full WRF model run
- Final clean-up and performance studies underway, real benchmarks in 2H 2015
- TQI/SSEC to fund/port all modules to CUDA towards a full WRF model on GPUs
- Focus on additional physics modules, and ongoing WRF-ARW dycore development

32 SSEC Speedups of WRF Physics Modules
(* modules for initial NVIDIA-funded integration project)

WRF Module             GPU Speedup (with / without I/O)   Technical Paper Publication
Kessler MP             70x / 816x                         J. Comp. & GeoSci., 52, 2012
Purdue-Lin MP          156x / 692x                        SPIE
WSM 3-class MP         150x / 331x
WSM 5-class MP *       202x / 350x                        JSTARS, 5, 2012
Eta MP                 37x / 272x                         SPIE
WSM 6-class MP *       165x / 216x                        Submitted to J. Comp. & GeoSci.
Goddard GCE MP         348x / 361x                        Accepted for publication in JSTARS
Thompson MP *          76x / 153x
SBU 5-class MP         213x / 896x                        JSTARS, 5, 2012
WDM 5-class MP         147x / 206x
WDM 6-class MP         150x / 206x                        J. Atmo. Ocean. Tech., 30, 2896, 2013
RRTMG LW *             123x / 127x                        JSTARS, 7, 2014
RRTMG SW *             202x / 207x                        Submitted to J. Atmos. Ocean. Tech.
Goddard SW             92x / 134x                         JSTARS, 5, 2012
Dudhia SW *            19x / 409x
MYNN SL                6x / 113x
TEMF SL                5x / 214x
Thermal Diffusion LS   10x / 311x                         Submitted to JSTARS
YSU PBL *              34x / 193x                         Submitted to GMD

Hybrid WRF customer benchmark capability starting in 2H 2015
Hardware and benchmark case - CPU: Xeon Core-i7 3930K, 1 core used; Benchmark: CONUS 12 km for 24 hours (Oct 24); grid 433 x 308, 35 levels

33 NVIDIA-SSEC Phase-I Priority on 6 Modules
[Table: module usage across customer configurations NCEP HRRR, NCEP NWS-1, NCAR, KISTI, and KAUST; cloud microphysics ~20% and radiation ~6% of the runtime profile]
- WSM 5-class MP (Cloud MP)
- WSM 6-class MP (Cloud MP)
- Thompson MP (Cloud MP)
- Dudhia Radiation
- RRTMG Radiation
- YSU Planetary BL
NOTE: New NCAR global model MPAS uses WRF physics: WSM6, RRTMG, and YSU

34 Project Status (II): NVIDIA-NCAR OpenACC WRF
Objective to provide benchmark capability of hybrid CPU+GPU WRF
- NVIDIA ported 30+ routines to OpenACC for both dynamics and physics (list provided on the next page)
Project to integrate OpenACC modules with the full-model WRF release
- Open source project with plans for NCAR hosting on the WRF trunk distribution site
- Objective to minimize code refactoring, for scientists' readability and acceptance
NVIDIA-NCAR project success:
- Several performance-sensitive routines of dynamics and physics are completed, including advection routines for dynamics and expensive microphysics schemes
- Hybrid WRF performance using a single-socket CPU + GPU is faster than CPU-only
- Incremental improvements as more OpenACC code is developed for execution on the GPU

35 WRF Modules Available in OpenACC
Project to integrate OpenACC modules into the latest full-model WRF (a directive sketch follows this list)
- Several dynamics routines including all of advection
- Several physics schemes:
  - Planetary boundary layer: YSU, GWDO
  - Cumulus: KFETA
  - Microphysics: Kessler, Morrison, Thompson
  - Radiation: RRTM
  - Surface physics: Noah
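To make the minimal-refactoring point concrete, here is an illustrative OpenACC sketch of a first-order upwind advection step - a hypothetical routine, not WRF source. The existing loop body is untouched; only the directive is added:

```fortran
! Illustrative upwind advection step (assumes constant wind u > 0).
subroutine advect_upwind(q, qnew, u, dx, dt, n)
  implicit none
  integer, intent(in) :: n
  real, intent(in)    :: q(n), u, dx, dt
  real, intent(out)   :: qnew(n)
  integer :: i
  !$acc parallel loop copyin(q) copyout(qnew)
  do i = 2, n
     qnew(i) = q(i) - u * dt / dx * (q(i) - q(i-1))  ! unchanged numerics
  end do
  qnew(1) = q(1)   ! simple inflow boundary, set on the host
end subroutine advect_upwind
```

A scientist can still read and modify the numerics exactly as before, which is the acceptance argument the project makes.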

36 Summary Highlights
NVIDIA observes strong NWP community interest in GPU acceleration
- Appeal of new technologies: Pascal, NVLink, more CPU platform choices
- NVIDIA applications engineering collaboration in several model projects
- Investments in OpenACC: PGI 15.x releases; continued Cray collaborations
GPU progress for several models - we examined a few of these
- MeteoSwiss developments towards COSMO operational NWP on GPUs
- ECMWF developments of IFS and launch of the H2020 ESCAPE project
NVIDIA HPC technology behind the next-gen pre-exascale systems
- New GPU technologies for the DOE CORAL program and ACME climate model

37 Thank You
Carl Ponder, PhD; cponder@nvidia.com; NVIDIA, Austin, TX, USA
