Future Exascale Supercomputers


1 Future Exascale Supercomputers
Prof. Mateo Valero, Director

Top10 (November 2011)
 1. RIKEN Advanced Institute for Computational Science (AICS) - Fujitsu K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect
 2. Tianjin, China - Xeon X5670 + NVIDIA
 3. Oak Ridge Nat. Lab. - Cray XT5, 6 cores
 4. Shenzhen, China - Xeon X5670 + NVIDIA
 5. GSIC Center, Tokyo - Xeon X5670 + NVIDIA
 6. DOE/NNSA/LANL/SNL - Cray XE6, 8-core 2.4 GHz
 7. NASA/Ames Research Center/NAS - SGI Altix ICE 8200EX/8400EX, Xeon HT QC 3.0/Xeon 5570/5670 2.93 GHz, Infiniband
 8. DOE/SC/LBNL/NERSC - Cray XE6, 12 cores
 9. Commissariat a l'Energie Atomique (CEA) - Bull bullx super-node S6010/S6030
10. DOE/NNSA/LANL - QS22/LS21 Cluster, PowerXCell 8i / Opteron, Infiniband

2 Parallel Systems

Interconnect (Myrinet, IB, GE, 3D torus, tree, ...) connecting SMP multicore nodes, each with its own memory.
Node types:
- Homogeneous multicore (e.g. the BlueGene/Q chip)
- Heterogeneous multicore: general-purpose cores plus an accelerator (e.g. Cell), GPU, FPGA or ASIC (e.g. Anton for MD)
Network-on-chip (bus, ring, direct, ...)

3 RIKEN's Fujitsu K with SPARC64 VIIIfx

Homogeneous architecture. Compute node:
- One SPARC64 VIIIfx processor, 2 GHz, 8 cores per chip, 128 Gigaflops per chip
- 16 GB memory per node
Number of nodes and cores: 864 cabinets * 102 compute nodes/cabinet * (1 socket * 8 CPU cores) = 705,024 cores
Footprint: 50 by 60 meters
Peak performance (DP): 705,024 cores * 16 GFLOPS per core = 11.28 PFLOPS
Linpack: 10.51 PFLOPS, 93% efficiency; matrix with more than ten million rows; run time 29 hours and 28 minutes
Power consumption: 12.6 MWatt, 0.8 Gigaflops/W
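To make the arithmetic above explicit, here is a minimal C sketch, using only the cabinet, node, core and per-core figures quoted on the slide, that reproduces the aggregate core count and the peak number:

    #include <stdio.h>

    /* Reproduces the K computer aggregate figures quoted above:
     * 864 cabinets, 102 nodes/cabinet, 8 cores/node, 16 GFLOPS/core. */
    int main(void)
    {
        long cabinets = 864, nodes_per_cabinet = 102, cores_per_node = 8;
        double gflops_per_core = 16.0;

        long cores = cabinets * nodes_per_cabinet * cores_per_node;  /* 705,024 cores */
        double peak_pflops = cores * gflops_per_core / 1e6;          /* ~11.28 PFLOPS */

        printf("cores = %ld, peak = %.2f PFLOPS\n", cores, peak_pflops);
        return 0;
    }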

3 Looking at the Gordon Bell Prize

1 GFlop/s; 1988; Cray Y-MP; 8 processors
  Static finite element analysis
1 TFlop/s; 1998; Cray T3E; 1024 processors
  Modeling of metallic magnet atoms, using a variation of the locally self-consistent multiple scattering method
1 PFlop/s; 2008; Cray XT5; 1.5x10^5 processors
  Superconductive materials
1 EFlop/s; ~2018; ?; 1x10^8 processors?? (10^9 threads)

Jack Dongarra

4 Nvidia GPU instruction execution

(Figure: multiprocessors MP 1 to MP 4 issuing instruction 1, instruction 2, a long-latency instruction 3 and instruction 4.)

Potential System Architecture for Exascale Supercomputers

System attribute           | 2 Pflop/s system | Intermediate      | Exascale             | Difference
System peak                | 2 Pflop/s        | 200 Pflop/s       | 1 Eflop/s            | O(1000)
Power                      | 6 MW             | 15 MW             | ~20 MW               |
System memory              | 0.3 PB           | 5 PB              | - PB                 | O(100)
Node performance           | 125 GF           | 0.5 TF / 7 TF     | 1 TF / 10 TF         | O(10)-O(100)
Node memory BW             | 25 GB/s          | - TB/s            | 0.4 TB/s / 4 TB/s    | O(100)
Node concurrency           | 12               | O(100) / O(1,000) | O(1,000) / O(10,000) | O(100)-O(1000)
Total concurrency          | 225,000          | O(10^8)           | O(10^9)              | O(10,000)
Total node interconnect BW | 1.5 GB/s         | 20 GB/s           | 200 GB/s             | O(100)
MTTI                       | days             | O(1 day)          | O(1 day)             | -O(10)

5 2. Towards faster airplane design

Boeing: number of wing prototypes prepared for wind-tunnel testing, by date and airplane program (B757/B767, B777, B787).
The plateau is due to RANS limitations; a further decrease is expected from LES at the ExaFlop scale.

Airbus design

6 2. Towards faster airplane design

Airbus: "More simulation, less tests"
From the A380 to the A350:
- 40% fewer wind-tunnel days
- 25% saving in aerodynamics development time
- 20% saving on wind-tunnel test cost
thanks to HPC-enabled CFD runs, especially in the high-speed regime, providing an even better representation of aerodynamic phenomena that is turned into better design choices.
Acknowledgements: E. CHAPUT (AIRBUS)

Oil industry

7 ITER design
TOKAMAK (JET)

Fundamental Sciences

8 Materials: a new path to competitiveness

On-demand materials for effective commercial use:
- Conductivity: energy-loss reduction
- Lifetime: corrosion protection, e.g. chrome
- Fissures: safety assurance from molecular design
- Optimisation of materials / lubricants: less friction, longer lifetime, fewer energy losses
Industrial need to speed up simulation from months to days.
From all-atom to multi-scale: exascale enables simulation of larger and more realistic systems and devices.

Life Sciences and Health
Scales: Population - Organ - Tissue - Cell - Macromolecule - Small Molecule - Atom

9 Supercomputing, theory and experimentation
(Courtesy of IBM)

10 Holistic approach towards exaflop

Layers: Applications - Job Scheduling - Programming Model - Run time - Interconnection - Processor/node architecture
Issues: load balancing, computational complexity, moldability, resource awareness, user satisfaction, address space, asynchronous algorithms, dependencies, work generation, locality optimization, concurrency extraction, topology and routing, external contention, NIC design, run-time support, memory subsystem, hardware counters, core structure

10+ Pflop/s systems planned

Fujitsu K: 80,000 8-core SPARC64 VIIIfx processors, 2 GHz (16 Gflops/core, 58 watts, 3.2 Gflops/watt), 16 GB/node, 1 PB memory, 6D mesh-torus, 10 Pflop/s
Cray's Titan at DOE, Oak Ridge National Laboratory: hybrid system with Nvidia GPUs, 1 Pflop/s in 2011, 20 Pflop/s in 2012, late-2011 prototype, $100 million

11 10+ Pflop/s systems planned

IBM Blue Waters at Illinois: 40,000 8-core Power7, 1 PB memory, 18 PB disk, 500 PB archival storage, 10 Pflop/s, 2012, $200 million
IBM Blue Gene/Q systems:
- Mira at DOE, Argonne National Lab: 49,000 nodes, 16-core Power A2 processor (1.6 GHz), 750 K cores, 750 TB memory, 70 PB disk, 5D torus, 10 Pflop/s
- Sequoia at Lawrence Livermore National Lab: 96 racks of 16-core A2 nodes, 1.6 M cores (1 GB/core), 1.6 Petabytes memory, 6 MWatt, 3 Gflops/watt, 20 Pflop/s

Japan Plan for Exascale
Heterogeneous, distributed-memory, GigaHz x KiloCore x MegaNode system
K Machine (10 PF) - 10K Machine (100 PF) - 100K Machine (ExaFlops)
Feasibility Study - Exascale Project - Post-Petascale Projects

12 Thanks to S. Borkar, Intel

13 Nvidia: Chip for the Exaflop Computer (thanks to Bill Dally)
Nvidia: Node for the Exaflop Computer (thanks to Bill Dally)

14 Exascale Supercomputer (thanks to Bill Dally)

BSC-CNS: International Initiatives (IESP)
- Improve the world's simulation and modeling capability by improving the coordination and development of the HPC software environment
- Build an international plan for developing the next generation of open-source software for scientific high-performance computing

15 Back to Babel?

Book of Genesis: "Now the whole earth had one language and the same words ... Come, let us make bricks, and burn them thoroughly. ... Come, let us build ourselves a city, and a tower with its top in the heavens, and let us make a name for ourselves ..." And the LORD said, "Look, they are one people, and they have all one language; and this is only the beginning of what they will do; nothing that they propose to do will now be impossible for them. Come, let us go down, and confuse their language there, so that they will not understand one another's speech."

The computer age: Fortran & MPI, and now C++, Fortress, Cilk++, X10, CUDA, Sisal, HPF, StarSs, RapidMind, Sequoia, CAF, OpenMP, UPC, ALF, SDK, Chapel, MPI, ...
Thanks to Jesus Labarta

"You will see: 400 years from now people will go crazy."
New generation of programmers, multicore/manycore architectures, parallel programming, new usage models.
Source: Picasso, Don Quixote; Dr. Avi Mendelson (Microsoft), keynote at ISC

16 Different models of computation

The dream of automatically parallelizing compilers has not come true, so the programmer needs to express opportunities for parallel execution in the application:
- SPMD
- OpenMP 2.5: nested fork-join
- OpenMP 3.0: DAG data flow
And asynchrony (MPI and OpenMP are too synchronous): collectives/barriers multiply the effects of microscopic load imbalance, OS noise, ... Huge lookahead & reuse. Latency/EBW/Scheduling.

StarSs: generates the task graph at run time

#pragma css task input(A, B) output(C)
void vadd3 (float A[BS], float B[BS], float C[BS]);
#pragma css task input(sum, A) output(B)
void scale_add (float sum, float A[BS], float B[BS]);
#pragma css task input(A) inout(sum)
void accum (float A[BS], float *sum);

Task Graph Generation

for (i = 0; i < N; i += BS)   // C = A + B
    vadd3 (&A[i], &B[i], &C[i]);
...
for (i = 0; i < N; i += BS)   // sum(C[i])
    accum (&C[i], &sum);
...
for (i = 0; i < N; i += BS)   // B = sum * E
    scale_add (sum, &E[i], &B[i]);
...
for (i = 0; i < N; i += BS)   // A = C + D
    vadd3 (&C[i], &D[i], &A[i]);
...
for (i = 0; i < N; i += BS)   // E = C + F
    vadd3 (&C[i], &F[i], &E[i]);
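For readers more familiar with OpenMP, the core of the same dataflow (the first three loops) can be sketched with OpenMP 4.0 task dependences: the runtime builds the task graph from the depend clauses much as StarSs builds it from the pragmas above. This is only an illustrative sketch under assumed sizes (N, BS) and trivial kernel bodies, not the StarSs implementation itself:

    #include <stdio.h>

    #define N  1024
    #define BS 128

    /* Toy kernels standing in for the StarSs tasks above. */
    void vadd3(float *A, float *B, float *C)    { for (int j = 0; j < BS; j++) C[j] = A[j] + B[j]; }
    void scale_add(float s, float *A, float *B) { for (int j = 0; j < BS; j++) B[j] = s * A[j]; }
    void accum(float *A, float *sum)            { for (int j = 0; j < BS; j++) *sum += A[j]; }

    float A[N], B[N], C[N], E[N], sum;

    int main(void)
    {
        #pragma omp parallel
        #pragma omp single
        {
            for (int i = 0; i < N; i += BS) {            /* C = A + B */
                #pragma omp task depend(in: A[i], B[i]) depend(out: C[i])
                vadd3(&A[i], &B[i], &C[i]);
            }
            for (int i = 0; i < N; i += BS) {            /* sum += C[i] */
                #pragma omp task depend(in: C[i]) depend(inout: sum)
                accum(&C[i], &sum);
            }
            for (int i = 0; i < N; i += BS) {            /* B = sum * E */
                #pragma omp task depend(in: sum, E[i]) depend(out: B[i])
                scale_add(sum, &E[i], &B[i]);
            }
            #pragma omp taskwait
        }
        printf("sum = %f\n", sum);
        return 0;
    }

The first element of each block acts as the dependence sentinel, mirroring the "address used as sentinels" note later in this talk; each accum task chains after the vadd3 task that produced its C block, and the scale_add tasks wait for the final value of sum.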

17 StarSs: ... and executes as efficiently as possible

Task Graph Execution: the task pragmas and the annotated loops are the same as on the previous slide; at run time the generated task graph is executed, respecting the data dependences.

StarSs: benefiting from data access information

- Flat global address space seen by the programmer
- Flexibility to dynamically traverse the dataflow graph, optimizing:
  - concurrency (critical path)
  - memory accesses (opportunities for prefetch and reuse)
  - elimination of antidependences (renaming)
  - replication management
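As a rough illustration of the renaming point above (the idea only, not StarSs internals): when a task wants to overwrite a block that earlier, still-pending tasks must read, a runtime can hand the writer a fresh instance of the block instead of stalling, turning the write-after-read antidependence into independent work. The block_t descriptor and the implied translation table are hypothetical:

    #include <stdlib.h>

    /* Hypothetical per-block descriptor kept by a task runtime. */
    typedef struct { float *data; size_t n; int readers_in_flight; } block_t;

    /* Called when a new task declares output(block) while older tasks still
     * read the current version: allocate a fresh instance for the writer and
     * leave the old one alive until its readers finish.  The runtime would
     * also update its address-to-version translation table here. */
    block_t *rename_for_write(const block_t *old_version)
    {
        block_t *fresh = malloc(sizeof *fresh);
        fresh->n = old_version->n;
        fresh->data = malloc(fresh->n * sizeof *fresh->data);  /* no copy: output-only access */
        fresh->readers_in_flight = 0;
        return fresh;   /* old_version is freed once its readers_in_flight drops to 0 */
    }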

18 StarSs: an enabler for exascale

- Can exploit very unstructured parallelism: not just loop/data parallelism; easy to change structure
- Supports large amounts of lookahead: no stalling for dependence satisfaction; allows locality optimizations to tolerate latency; overlap of data transfers, prefetch, reuse
- Nicely hybridizes into MPI/StarSs: propagates the node-level dataflow characteristics to large scale; overlaps communication and computation; a chance against Amdahl's law (see the sketch after this list)
- Support for heterogeneity: any number and combination of CPUs and GPUs, including autotuning
- Malleability: decouples the program from resources, allowing dynamic resource allocation and load balance
- Tolerates noise: data flow, asynchrony
- The potential is there; one can blame the runtime
- Compatible with proprietary low-level technologies

StarSs: history / strategy / versions

Basic SMPSs:
- Must provide directionality arguments; contiguous, non-partially-overlapped regions
- Renaming; several schedulers (priority, locality, ...); no nesting
- C/Fortran; MPI/SMPSs optimizations

SMPSs regions:
- C, no Fortran; must provide directionality arguments; overlapping & strided regions
- Reshaping of strided accesses; priority- and locality-aware scheduling

OmpSs:
- C/C++, Fortran under development; OpenMP compatibility (~)
- Dependences based only on arguments with directionality; contiguous arguments (address used as sentinels)
- Separate dependences/transfers; inlined/outlined pragmas; nesting
- SMP/GPU/Cluster; no renaming; several schedulers: simple locality-aware scheduling, ...
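A minimal sketch of the "hybridize with MPI and overlap communication with computation" idea above, written with plain nonblocking MPI plus OpenMP-style tasks rather than the actual MPI/SMPSs machinery; the halo buffers and the compute_interior/compute_boundary kernels are assumptions for illustration, and MPI would need to be initialized with a suitable threading level:

    #include <mpi.h>

    #define NBLOCKS 64
    #define BS      1024

    extern void compute_interior(double *block);   /* assumed local kernels */
    extern void compute_boundary(double *halo);

    /* Sketch: taskified computation so interior work overlaps the halo
     * exchange, in the spirit of hybrid MPI/StarSs described above. */
    void step(double blocks[NBLOCKS][BS], double *halo_send, double *halo_recv,
              int left, int right)
    {
        MPI_Request req[2];
        MPI_Irecv(halo_recv, BS, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(halo_send, BS, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[1]);

        #pragma omp parallel
        #pragma omp single
        {
            /* Interior blocks do not need the halo: they run while messages fly. */
            for (int b = 1; b < NBLOCKS - 1; b++) {
                #pragma omp task firstprivate(b)
                compute_interior(blocks[b]);
            }
            /* Only the boundary task waits for the communication. */
            #pragma omp task
            {
                MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
                compute_boundary(halo_recv);
            }
            #pragma omp taskwait
        }
    }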

19 Multidisciplinary top-down approach

Performance analysis and prediction tools, applications and algorithms, programming models, load balancing, interconnect, power, processor and node architecture: investigate solutions to these and other problems.

Computer Center Power Projections
(Chart: projected annual power cost, split between cooling and computers, rising from about $3M through $9M, $17M and $23M to $31M over the years shown.)

20 Green500 / Top500, November 2011

MFLOPS/W | Power (kW) | Site | Computer
2026.48 | 85.12  | IBM - Rochester | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom
2026.48 | 85.12  | IBM Thomas J. Watson Research Center | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom
2026.09 | 170.25 | IBM - Rochester | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom
-       | 340.5  | DOE/NNSA/LLNL | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom
-       | 38.67  | IBM Thomas J. Watson Research Center | NNSA/SC Blue Gene/Q Prototype 2
-       | 47.05  | Nagasaki University | DEGIMA Cluster, Intel i5, ATI Radeon GPU, Infiniband QDR
-       | 81.5   | Barcelona Supercomputing Center | Bullx B505, Xeon E5649 6C 2.53 GHz, Infiniband QDR, NVIDIA
-       | 108.8  | TGCC / GENCI | Curie Hybrid Nodes - Bullx B505, Xeon, Infiniband QDR
-       | 515.2  | Institute of Process Engineering, Chinese Academy of Sciences | Mole-8.5 Cluster, Xeon X5520 4C 2.27 GHz, Infiniband QDR, NVIDIA 2050
-       | -      | GSIC Center, Tokyo Institute of Technology | HP ProLiant SL390s G7, Xeon 6C X5670, Nvidia GPU, Linux/Windows
-       | 126.27 | Virginia Tech | SuperServer 2026GT-TRF, Xeon E5645 6C 2.40 GHz, Infiniband QDR, NVIDIA 2050
-       | 117.91 | Georgia Institute of Technology | HP ProLiant SL390s G7, Xeon 6C X5670, nVidia Fermi, Infiniband QDR
-       | -      | CINECA / SCS - SuperComputing Solution | iDataPlex DX360M3, Xeon E5645 6C 2.40 GHz, Infiniband QDR, NVIDIA 2070
-       | 76.25  | Forschungszentrum Juelich (FZJ) | iDataPlex DX360M3, Xeon X5650 6C 2.66 GHz, Infiniband QDR, NVIDIA 2070
-       | 198.72 | Sandia National Laboratories | Xtreme-X GreenBlade GB512X, Xeon E5 (Sandy Bridge-EP) 8C 2.60 GHz, Infiniband QDR
-       | -      | RIKEN Advanced Institute for Computational Science (AICS) | K computer, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect
-       | -      | National Supercomputing Center in Tianjin | NUDT YH MPP, Xeon X5670 6C 2.93 GHz, NVIDIA
-       | -      | DOE/SC/Oak Ridge National Laboratory | Cray XT5-HE, Opteron 6-core 2.6 GHz
-       | -      | National Supercomputing Centre in Shenzhen (NSCS) | Dawning TC3600 Blade System, Xeon X5650 6C 2.66 GHz, Infiniband QDR, NVIDIA 2050

Green500 / Top500, November 2011 (chart)
Energy efficiency of selected systems - IBM and NNSA Blue Gene/Q (about 2026 MFLOPS/W, i.e. > 1 GF/W), Nagasaki U. (Intel i5, ATI Radeon GPU), BSC (Xeon 6C, NVIDIA 2090 GPU) - plotted against Top500 rank, together with the implied MWatts per Exaflop.
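To connect the two metrics in the chart (MFLOPS/W and MWatts per Exaflop), a small worked example in C; the 2,026 MFLOPS/W value is the Blue Gene/Q figure from the table above and 20 MW is the exascale power budget quoted earlier in this talk:

    #include <stdio.h>

    /* Convert an efficiency in MFLOPS/W into the megawatts needed for 1 EFLOP/s,
     * and vice versa.  1 EFLOP/s = 1e12 MFLOPS = 1e9 GFLOPS. */
    int main(void)
    {
        double mflops_per_watt = 2026.0;                   /* Blue Gene/Q, Nov 2011 */
        double mw_per_exaflop  = 1e12 / mflops_per_watt / 1e6;
        printf("At %.0f MFLOPS/W an exaflop needs %.0f MW\n",
               mflops_per_watt, mw_per_exaflop);           /* ~494 MW */

        double target_mw = 20.0;                           /* exascale power budget */
        double needed_gflops_per_watt = 1e9 / (target_mw * 1e6);
        printf("A %.0f MW exaflop system needs %.0f GFLOPS/W\n",
               target_mw, needed_gflops_per_watt);         /* 50 GFLOPS/W */
        return 0;
    }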
