HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK
- Oscar Farmer
- 8 years ago
Transcription
1 HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK. Steve Oberlin, CTO, Accelerated Computing
2 US to Build Two Flagship Supercomputers: SUMMIT and SIERRA. Partnership for Science. PFLOPS Peak Performance. 10x in Scientific Applications. 2017. Major Step Forward on the Path to Exascale.
3 Just 4 nodes in Summit would make the Top500 list of supercomputers today. Similar Power as Titan; 5-10x Faster; 1/5th the Size. 150 PF = 3M Laptops: One Laptop for Every Resident in the State of Mississippi.
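The laptop comparison on slide 3 implies a per-laptop throughput that the slide does not state. The short sketch below simply divides the two figures from the slide (150 PF and 3M laptops); the function name is illustrative, and the result is only what the slide's own numbers imply, not a measured laptop figure.

```python
# Figures taken from slide 3.
SYSTEM_PEAK_FLOPS = 150e15   # 150 PF
LAPTOP_COUNT = 3_000_000     # 3M laptops


def implied_laptop_gflops(system_flops: float, laptops: int) -> float:
    """Per-laptop throughput in GFLOPS implied by the slide's comparison."""
    if laptops <= 0:
        raise ValueError("laptops must be positive")
    return system_flops / laptops / 1e9


per_laptop = implied_laptop_gflops(SYSTEM_PEAK_FLOPS, LAPTOP_COUNT)
print(f"Implied per-laptop throughput: {per_laptop:.0f} GFLOPS")
```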
4 Optimizing Serial/Parallel Execution. Application Code = Parallel Work (Majority of Ops) on the GPU + Serial Work (System and Sequential Ops) on the CPU.
5 NVLink-Enabled Heterogeneous Node. 5x Higher Energy Efficiency. GB/s. IBM POWER CPU: Most Powerful Serial Processor. NVIDIA NVLink: Fastest CPU-GPU Interconnect. NVIDIA Volta GPU: Most Powerful Parallel Processor.
6 NVLink: Logical Node Integration. TESLA GPU, NVLink 80 GB/s, POWER or ARM CPU. 5x PCIe bandwidth; move data at CPU memory speed; 3x lower energy/bit. HBM: 1 Terabyte/s, Stacked Memory, Throughput Optimized. DDR: GB/s, DDR Memory, Latency Optimized.
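To make the bandwidth figures on slide 6 concrete, the sketch below derives the PCIe bandwidth implied by "NVLink 80 GB/s" and "5x PCIe bandwidth", then estimates how long a buffer would take to move over each link. The 12 GB buffer size is a hypothetical value chosen for illustration, and the model ignores latency and protocol overhead, so the times are idealized lower bounds rather than measurements.

```python
# Figures taken from slide 6.
NVLINK_GB_PER_S = 80.0        # "NVLink 80 GB/s"
NVLINK_OVER_PCIE = 5.0        # "5x PCIe bandwidth"

# Implied by the two figures above, not stated on the slide.
PCIE_GB_PER_S = NVLINK_GB_PER_S / NVLINK_OVER_PCIE


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized transfer time: size / bandwidth, with no latency or overhead."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


BUFFER_GB = 12.0  # hypothetical buffer size, for illustration only

print(f"Implied PCIe bandwidth: {PCIE_GB_PER_S:.0f} GB/s")
print(f"NVLink: {transfer_seconds(BUFFER_GB, NVLINK_GB_PER_S):.2f} s")
print(f"PCIe:   {transfer_seconds(BUFFER_GB, PCIE_GB_PER_S):.2f} s")
```

Under these assumptions the ratio of the two transfer times is the same 5x as the bandwidth ratio, whatever buffer size is chosen.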
7 NVLink: High-speed GPU Interconnect [diagram labels: KEPLER GPU, PASCAL GPU, NVLink, PCIe, X86 / ARM64 / POWER CPU, 2014]
8 NVLink Unleashes Multi-GPU Performance. GPUs Interconnected with NVLink: 5x Faster than PCIe Gen3. Over 2x Application Performance Speedup When Next-Gen GPUs Connect via NVLink Versus PCIe. [Chart: speedup vs. PCIe-based server, axis 1.25x to 2.25x, for ANSYS Fluent, Multi-GPU Sort, LQCD QUDA, AMBER, 3D FFT; diagram labels: CPU, PCIe Switch, TESLA GPU, TESLA GPU.] 3D FFT, ANSYS: 2 GPU configuration; all other apps comparing 4 GPU configuration. AMBER Cellulose (256x128x128), FFT problem size (256^3).
9 Two Computing Models for Accelerators. Many-Weak-Cores (MWC) Model: Single CPU Core for Both Serial & Parallel Work; Xeon Phi (and Others); Many Weak Serial Cores. Heterogeneous Computing Model: Complementary Processors Work Together; CPU Optimized for Serial Tasks, GPU Accelerator Optimized for Parallel Tasks.
10-19 Amdahl's Law Analysis [ten charts, one each for 98%, 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, and 10% parallel work: run time in minutes, split into serial and parallel work, for 1x CPU, 1 GPU + 1 CPU, and 2 x MWC (.25x CPU)]
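The Amdahl's Law slides compare run times as the parallel fraction shrinks, but the transcript does not preserve the chart values. The sketch below is a minimal model of that comparison, not a reconstruction of the slides' numbers. It reads ".25x CPU" as an MWC core running serial code at 0.25x the speed of a CPU core, which is an interpretation of the slide label. Both parallel speedups (10x for the GPU and 10x for the MWC pair) are hypothetical placeholders chosen only so the two models differ in serial speed alone.

```python
# Parallel fractions shown on slides 10-19.
PARALLEL_FRACTIONS = [0.98, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

MWC_SERIAL_SPEED = 0.25      # interpretation of the slide label "(.25x CPU)"
GPU_PARALLEL_SPEEDUP = 10.0  # hypothetical, not from the slides
MWC_PARALLEL_SPEEDUP = 10.0  # hypothetical, not from the slides


def run_time(parallel_fraction: float, serial_speed: float, parallel_speed: float) -> float:
    """Amdahl's Law run time, normalized so that one CPU alone takes 1.0.

    Serial work runs at serial_speed and parallel work at parallel_speed,
    both expressed relative to a single CPU.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction / serial_speed + parallel_fraction / parallel_speed


def compare(parallel_fraction: float) -> tuple[float, float]:
    """Return (heterogeneous, MWC) normalized run times for one parallel fraction."""
    hetero = run_time(parallel_fraction, 1.0, GPU_PARALLEL_SPEEDUP)
    mwc = run_time(parallel_fraction, MWC_SERIAL_SPEED, MWC_PARALLEL_SPEEDUP)
    return hetero, mwc


if __name__ == "__main__":
    print("parallel  1 GPU + 1 CPU  2 x MWC   (1x CPU = 1.00)")
    for p in PARALLEL_FRACTIONS:
        hetero, mwc = compare(p)
        print(f"{p:8.0%}  {hetero:13.2f}  {mwc:7.2f}")
```

With equal parallel speedups, the gap between the two models comes entirely from the serial term, and it widens as the parallel fraction falls. Under these assumptions the MWC model becomes slower than the plain CPU once the serial share is large enough.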
20 TESLA K80: WORLD'S FASTEST ACCELERATOR FOR DATA ANALYTICS AND SCIENTIFIC COMPUTING. Dual-GPU Accelerator for Max Throughput: 2x Faster; 2.9 TF, 4992 Cores, 480 GB/s. Double the Memory, Designed for Big Data Apps: 24GB (K40: 12GB). Maximum Performance: Dynamically Maximize Performance for Every Application (Oil & Gas, HPC, Viz, Data Analytics). [Chart: Deep Learning: Caffe, axis 0x to 25x, for CPU, Tesla K40, Tesla K80.] Caffe Benchmark: AlexNet training throughput based on 20 iterations. CPU: 2.70GHz, 64GB System Memory, CentOS
21 Performance Lead Continues to Grow [two charts, NVIDIA GPU vs. x86 CPU: Peak Double Precision FLOPS (GFLOPS) and Peak Memory Bandwidth (GB/s); GPUs include M1060, M2090, K20; CPUs: Westmere, Sandy Bridge, Ivy Bridge, Haswell]
22 10x Faster than CPU on Applications [chart: K80 vs. CPU, axis 0x to 15x, for Benchmarks, Molecular Dynamics, Quantum Chemistry, Physics]. CPU: 12 cores, 2.70GHz, 64GB System Memory, CentOS 6.2. GPU: Single Tesla K80, Boost enabled.
23 Tesla Platform Enables Optimization: Scalable Nodes, ISA Choice. NVLink, x86.
24 Tesla Platform Enables Optimization: Ecosystem. Industry Standard CPUs and Interconnects: ARM64, POWER, x86; Ethernet, Cray, InfiniBand, Others; NVIDIA GPU. Industry-Driven Solutions.
25 CORAL Scalable Heterogeneous Node: NVLink in Practice. Approximately 3,400 nodes, each with:
- IBM POWER9 CPUs and multiple NVIDIA Tesla Volta GPUs
- CPUs and GPUs integrated on-node with high-speed NVLink
- Large coherent memory: over 512 GB (HBM + DDR4), all directly addressable from the CPUs and GPUs
- An additional 800 GB of NVRAM, usable as burst buffer or as extended memory
- Over 40 TF peak performance/node(!)
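The per-node figures on slide 25 can be multiplied out to rough system-level totals. The sketch below does that arithmetic using only the slide's numbers. Because the slide says "approximately 3,400 nodes" and "over" 40 TF and 512 GB, the results are approximate floors under those stated figures, not official system specifications. Decimal units (1 PB = 1,000,000 GB) are an assumption made here.

```python
# Figures taken from slide 25.
NODE_COUNT = 3_400            # "approximately 3,400 nodes"
PEAK_TF_PER_NODE = 40.0       # "over 40 TF peak performance/node"
COHERENT_GB_PER_NODE = 512.0  # "over 512 GB (HBM + DDR4)"
NVRAM_GB_PER_NODE = 800.0     # "an additional 800 GB of NVRAM"


def system_peak_pf(nodes: int, tf_per_node: float) -> float:
    """Aggregate peak in PFLOPS (1 PF = 1,000 TF)."""
    return nodes * tf_per_node / 1_000.0


def system_capacity_pb(nodes: int, gb_per_node: float) -> float:
    """Aggregate capacity in PB, assuming decimal units (1 PB = 1,000,000 GB)."""
    return nodes * gb_per_node / 1_000_000.0


print(f"Aggregate peak:   {system_peak_pf(NODE_COUNT, PEAK_TF_PER_NODE):.0f} PF")
print(f"Coherent memory:  {system_capacity_pb(NODE_COUNT, COHERENT_GB_PER_NODE):.2f} PB")
print(f"NVRAM:            {system_capacity_pb(NODE_COUNT, NVRAM_GB_PER_NODE):.2f} PB")
```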
26 Optimized Heterogeneous Node: CORAL Application Performance Projections
27 YOUR PLATFORM FOR DISCOVERY: Tesla Accelerated Computing