Performance Evaluation and Energy Efficiency of HPC Platforms
1 Performance Evaluation and Energy Efficiency of HPC Platforms Based on Intel, AMD and ARM Processors
M. Jarus, S. Varrette, A. Oleksiak and P. Bouvry
Poznań Supercomputing and Networking Center
CSC, University of Luxembourg, Luxembourg
1 / 26
2 Summary
1 Introduction
2 Context & Motivations
3 Experimental Setup
4 Experiments Results
5 Conclusion
3 Introduction
4 Introduction
Why High Performance Computing?
"The country that out-computes will be the one that out-competes." (Council on Competitiveness)
- Accelerate research by accelerating computations: 14.4 GFlops (dual-core i7 1.8GHz) vs. TFlops (291 computing nodes, 2944 cores)
- Increase storage capacity: 2TB (1 disk) vs. 1042TB raw (444 disks)
- Communicate faster: 1 GbE (1 Gb/s) vs. Infiniband QDR (40 Gb/s)
5 Introduction
HPC at the Heart of our Daily Life
Today: research, industry, local collectivities...
Tomorrow: applied research, digital health, nano/bio technology
6 Introduction
HPC Evolution towards Exascale
Major investments since 2012 to build an Exascale platform by 2019: > 1.5 G$ for each leading country (US, EU, Russia, etc.)

                   Today       Exascale projections
Power              6 MW        15 MW       20 MW
#Nodes             18,700      50, ,000
Node concurrency   12          1,000       10,000
Interconnect BW    1.5 GB/s    1 TB/s      2 TB/s
MTTI               Day         Day         Day

=> Max power consumption: 0.1 W per core
7 Introduction
Current Leading Processor Technologies
Top500 Count   Model Example                  max. TDP
   %           Intel Xeon X5650 6C 2.66GHz    85W   14.1W/core
   %           Intel Xeon E C 2.7GHz          130W  16.25W/core
   %           AMD Opteron C Interlagos       115W  7.2W/core
   %           IBM Power BQC 16C 1.6GHz       65W   4.1W/core
Alternative low-power processor architectures are required:
1 GPGPU accelerators (Nvidia Tesla cards / IBM PowerXCell 8i)
2 mobile and embedded devices market (ARM, Intel Atom)
=> Can low-power processors really suit HPC?
8 Context & Motivations
9 Context & Motivations
The Mont Blanc Project
EU project, start: October 2011
Objective: develop an ARM-based Exascale HPC using 15 to 30x less energy
Current status: Tibidabo cluster
- based on NVidia Tegra2 SoC (1 ARM Cortex-A9 @ 1 GHz)
- 8 Q7 boards (1 GbE) per blade; total: 128 nodes (38U)
- interconnect: minimalistic tree network based on 1 GbE switches
- measured performance: 120 MFlops/W
Other state-of-the-art projects: EuroCloud project (Energy-conscious 3D Server-on-Chip for Green Cloud)
10 Context & Motivations
[Low-Power] HPC @ PSNC & UL
Name      Location  Size  #cpus  #RAM   Processor                       max TDP/proc
i7        PSNC      1U           GB     Intel Core i7-3615QE@2.3GHz 8C  45W   5.63W/c
atom64    PSNC      1U    18     36GB   Intel Atom N2600@1.6GHz 2C      3.5W  1.75W/c
amdf      PSNC      1U    18     72GB   AMD Fusion G-T40N@1GHz 2C       9W    4.5W/c
bull-bcs  UL        8U    16     1TB    Intel Xeon E7-4850@2GHz 10C     130W  13W/c
viridis   UL        2U           GB     ARM Cortex-A9 1.1GHz 4C         1.9W  0.48W/c
Objective: compare the performance of cutting-edge high-density HPC platforms: low-power platforms (atom64, amdf and viridis) vs. pure computing-efficient platforms (i7 and bull-bcs)
11 Experimental Setup
12 Experimental Setup
Considered Benchmarks
- Phoronix Test Suite, stressing system-wide components (disk, RAM or CPU): C-ray, Hmmer, Pybench
- CPU performance (single-threaded): Coremark, Fhourstones, Whetstone, Linpack
- MPI performance: OSU Micro-Benchmarks (osu_get_latency & osu_get_bw only)
- HPC performance: High-Performance Linpack (HPL)
  - solves a linear system of order N: A x = b
  - Gaussian elimination with partial pivoting
  - two-dimensional P x Q grid of processes
  - N by N+1 coefficient matrix split in NB x NB blocks
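The GFlops figures reported later follow from HPL's standard operation count. A minimal sketch (illustrative function names, not the HPL source): the nominal flop count for an order-n LU factorization with partial pivoting is 2/3 n^3 + 2 n^2, which is divided by the wall time to give the sustained rate.

```python
# Illustrative sketch (not HPL code): the standard operation count
# HPL uses to convert a measured wall time into a GFlops figure.

def hpl_flops(n):
    """Nominal flop count for solving an order-n dense system A x = b
    via LU factorization with partial pivoting: 2/3 n^3 + 2 n^2."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2

def hpl_gflops(n, wall_seconds):
    """Sustained GFlops credited to an HPL run of problem size n."""
    return hpl_flops(n) / wall_seconds / 1e9
```

For instance, a run at N = 40000 finishing in 1000 s would be credited roughly (2/3)(4e4)^3 / 1e3 / 1e9, about 42.7 GFlops; the 2 N^2 term is negligible at these sizes.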
13 Experimental Setup
Performance Measurements
On a given platform: 100 runs for each benchmark, each with the following operations:
1 t0: [fix the CPU frequency] & start system/power monitoring
2 t0 + s: start the selected benchmark
3 t1: benchmark finishes execution
4 t1 + s: end of monitoring
PpMHz (Performance per MHz): impact of CPU frequency on the final benchmark results
- i7: 1.2GHz to 2.3GHz, 2.31GHz (Turbo Mode)
- atom64: 0.6GHz to 1.6GHz
- amdf: 0.8GHz to 1GHz
- bull-bcs: 1.064GHz to 1.995GHz, 1.996GHz (Turbo Mode)
- viridis: 1.1GHz
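The four-step protocol above can be sketched as follows. This is a hypothetical illustration: the monitor object, the benchmark callable and the settling delay `settle` (the "s" in the slide) are all assumed names, not the authors' tooling.

```python
import time

def measured_run(run_benchmark, monitor, settle=5.0):
    """One benchmark iteration under the monitoring protocol:
    start monitoring at t0, launch the benchmark at t0 + s, and
    keep monitoring for s seconds after it finishes at t1."""
    monitor.start()                    # t0: begin system/power monitoring
    time.sleep(settle)                 # wait until t0 + s
    t_start = time.time()
    result = run_benchmark()           # selected benchmark executes
    elapsed = time.time() - t_start    # t1: benchmark finished
    time.sleep(settle)                 # keep monitoring until t1 + s
    monitor.stop()                     # end of monitoring
    return result, elapsed
```

The leading and trailing delays let the power trace capture the idle baseline on both sides of the run, so the benchmark's own consumption can be isolated.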
14 Experimental Setup
Performance Measurements
PpW (Performance per Watt): raw benchmark result divided by
- (official) the average power draw (W), or
- (better) the energy consumed (J)
Different results are achieved with different CPU frequency values; PpW metrics are presented for the highest frequency value.
Technical details:
- viridis: power measures available only by groups of 4 nodes
- bull-bcs: high latency between measures (slow IPMI), 40s minimum
- atom64: strange sensor readings
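The two PpW variants above reduce to one-line computations; a sketch with illustrative names:

```python
def ppw_avg_power(score, avg_watts):
    """Official variant: raw benchmark score over average power draw (W)."""
    return score / avg_watts

def ppw_energy(score, avg_watts, wall_seconds):
    """Better variant: score per Joule (energy = average power x time).
    Unlike the average-power variant, this also penalises slow runs."""
    return score / (avg_watts * wall_seconds)
```

The distinction matters: a platform drawing half the power but taking three times as long improves on the first metric while worsening on the second, since it consumes 1.5x the energy overall.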
15 Experiments Results
16 Experiments Results
CPU Performances (single-threaded)
[Figure: raw benchmark results (log scale) on CoreMark, Fhourstones, Whetstones and Linpack for Intel Core i7, AMD G-T40N, Atom N2600, Intel Xeon E7 and ARM Cortex-A9]
- Best results are obtained by the Intel Core i7, followed by the Intel Xeon E7
- AMD, ARM and Atom achieve comparable results
17 Experiments Results
OSU MPI Benchmark 3.8 Results
[Figure: OSU One Sided MPI Get Latency Test v3.8: latency (µs, log scale, the LOWER the better) vs. packet size (bits, log scale) for CoolEmAll Atom64, CoolEmAll AMDF, CoolEmAll i7, Viridis ARM and BullX BCS]
18 Experiments Results
OSU MPI Benchmark 3.8 Results
[Figure: OSU One Sided MPI Get Bandwidth Test v3.8: bandwidth (MB/s, log scale, the HIGHER the better) vs. packet size (bits, log scale) for BullX BCS, Viridis ARM, CoolEmAll AMDF, CoolEmAll i7 and CoolEmAll Atom64]
19 Experiments Results
OSU MPI Power Measures (OSU Micro-Benchmark 3.8, 2 nodes per platform)
Platform   Latency test      Bandwidth test
i7         4442 J (111 s)    3093 J (75 s)
amdf       3642 J (102 s)    2816 J (77 s)
atom64     7555 J (275 s)    4806 J (172 s)
viridis    499 J (45 s)      526 J (45 s)
20 Experiments Results
HPL 2.1 Benchmark Results, Single Node Runs
[Figures per platform: performance (GFlops) vs. problem size N for several NB values, and power usage (W) over time for the best run]
- i7: NB in {48, 96, 128, 160}, PxQ = 2x4; best run at N = 41185
- amdf: NB in {96, 128, 160}, PxQ = 1x2; best run at N = 19496, energy 57912 J
- atom64: NB in {48, 64, 96, 112, 128}, PxQ = 2x2; best run at N = 12891
- viridis: NB in {64, 96, 112, 128}, PxQ = 2x2; best run at N = 20711
24 Experiments Results
HPL Power Measures, Full Platform Runs
[Figures per platform: power usage (W) over time during the full-platform HPL run]
- i7: 18 nodes, N = 174733, NB = 96, PxQ = 12x12
- amdf: 16 nodes, N = 77984, NB = 160, PxQ = 4x8
- atom64: 18 nodes, N = 54692, NB = 112, PxQ = 8x9
- bull-bcs: 1 node, N = 87920, NB = 112, PxQ = 10x16
26 Experiments Results
HPL Benchmark Results
Best HPL results, single node (columns: Name, #cpu, R_peak, N, NB, P, Q, Time [s], GFlops, Effic., Energy [J]) for i7, amdf, atom64, bcs and viridis [most numeric cells lost in transcription; viridis energy: 9983 J; bcs efficiency: n/a]
Full platform runs (columns: Name, #nodes, R_peak, N, NB, P, Q, Time [s], GFlops, Effic., Energy [J]) for i7, amdf, atom64, bcs and viridis (12 nodes) [most numeric cells lost in transcription; viridis efficiency: n/a]
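The derived columns in these tables follow from two simple relations: efficiency is the sustained GFlops divided by the theoretical peak R_peak, and energy is average power integrated over the run. A sketch under the simplifying assumption of roughly constant power (names are illustrative):

```python
def hpl_efficiency(sustained_gflops, rpeak_gflops):
    """Fraction of the theoretical peak R_peak achieved by an HPL run."""
    return sustained_gflops / rpeak_gflops

def energy_to_solution(avg_watts, wall_seconds):
    """Energy [J] consumed by a run at roughly constant average power."""
    return avg_watts * wall_seconds
```

In practice the power traces of the previous slides are not constant, so the reported energies integrate the measured samples rather than a single average; the product above is the first-order approximation.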
27 Experiments Results
Performance per MHz
[Figure: PpMHz (log scale) on OSU Lat., OSU Bw., HPL, HPL Full, CoreMark, Fhourstones, Whetstones and Linpack for Intel Core i7, AMD G-T40N, Atom N2600, Intel Xeon E7 and ARM Cortex-A9]
- PpMHz values remain quite constant under varying CPU frequencies
- bull-bcs outperforms in all HPC-oriented tests
28 Experiments Results
Energy-efficiency
[Figure: energy consumed (J, log scale) on OSU Lat., OSU Bw., HPL, HPL Full, C-ray, Hmmer and Pybench for Intel Core i7, AMD G-T40N, Atom N2600, Intel Xeon E7 and ARM Cortex-A9]
- The ARM Cortex-A9 is almost always the most energy-efficient CPU
- The Intel Xeon E7 requires much more energy to execute the same application
29 Conclusion
30 Conclusion
- The path to Exascale requires alternative low-power processor architectures
  - most promising direction: mobile and embedded devices
  - ARM-based HPC cluster prototypes in the Mont Blanc project; Tibidabo cluster: 128 nodes, 38U, 120 MFlops/W
- Here: performance of cutting-edge high-density HPC platforms: CoolEmAll RECS @ PSNC, Boston Viridis & Bull @ UL
- Room for improvement, yet this hardware definitively suits HPC environments
Best obtained results:
Name      Processor          MFlops/W  Green500 Rank*
viridis   ARM Cortex-A9
i7        Intel Core i7
bull-bcs  Intel Xeon E7
atom64    Intel Atom N2600
amdf      AMD Fusion G-T40N
* Based on the November 2012 list
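The Green500 ranking referenced in the table divides sustained HPL performance by average power draw; a minimal sketch of the metric (illustrative names):

```python
def mflops_per_watt(sustained_gflops, avg_watts):
    """Green500 metric: sustained HPL MFlops per Watt of average power."""
    return sustained_gflops * 1000.0 / avg_watts
```

For example, a node sustaining 2 GFlops at 10 W of average draw scores 200 MFlops/W on this metric.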
31 Conclusion
Thank you for your attention
32 Conclusion
CoolEmAll RECS @ PSNC
35 kW, 1U, up to 18 nodes in a single enclosure; 3 enclosure units (3U) at PSNC:
- i7: Intel Core i7 2.3GHz 4C HT, TB, 45W TDP
- atom64: Intel Atom 1.6GHz 2C HT, 3.5W TDP
- amdf: AMD 1.0GHz 2C HT, 9W TDP
33 Conclusion
Boston Viridis @ UL
300W, 2U, 10GbE interconnect, 48 ultra low-power SoCs
ARM Cortex-A9 1.1GHz 4C HT, 1.9W TDP
34 Conclusion
BullX BCS (4x S6030) @ UL
8U, aggregation of 4 BullX S6030 in a single SMP node
4 x 4 Intel Xeon 2GHz 10C HT, TB, 130W TDP
Mississippi State University High Performance Computing Collaboratory Brief Overview Trey Breckenridge Director, HPC Mississippi State University Public university (Land Grant) founded in 1878 Traditional
More informationSR-IOV: Performance Benefits for Virtualized Interconnects!
SR-IOV: Performance Benefits for Virtualized Interconnects! Glenn K. Lockwood! Mahidhar Tatineni! Rick Wagner!! July 15, XSEDE14, Atlanta! Background! High Performance Computing (HPC) reaching beyond traditional
More informationHPC with Multicore and GPUs
HPC with Multicore and GPUs Stan Tomov Electrical Engineering and Computer Science Department University of Tennessee, Knoxville CS 594 Lecture Notes March 4, 2015 1/18 Outline! Introduction - Hardware
More informationIntel Labs at ISSCC 2012. Copyright Intel Corporation 2012
Intel Labs at ISSCC 2012 Copyright Intel Corporation 2012 Intel Labs ISSCC 2012 Highlights 1. Efficient Computing Research: Making the most of every milliwatt to make computing greener and more scalable
More informationEVALUATING NEW ARCHITECTURAL FEATURES OF THE INTEL(R) XEON(R) 7500 PROCESSOR FOR HPC WORKLOADS
Computer Science Vol. 12 2011 Paweł Gepner, David L. Fraser, Michał F. Kowalik, Kazimierz Waćkowski EVALUATING NEW ARCHITECTURAL FEATURES OF THE INTEL(R) XEON(R) 7500 PROCESSOR FOR HPC WORKLOADS In this
More informationRDMA over Ethernet - A Preliminary Study
RDMA over Ethernet - A Preliminary Study Hari Subramoni, Miao Luo, Ping Lai and Dhabaleswar. K. Panda Computer Science & Engineering Department The Ohio State University Outline Introduction Problem Statement
More informationBuilding an energy dashboard. Energy measurement and visualization in current HPC systems
Building an energy dashboard Energy measurement and visualization in current HPC systems Thomas Geenen 1/58 thomas.geenen@surfsara.nl SURFsara The Dutch national HPC center 2H 2014 > 1PFlop GPGPU accelerators
More informationInterconnect Your Future Enabling the Best Datacenter Return on Investment. TOP500 Supercomputers, November 2015
Interconnect Your Future Enabling the Best Datacenter Return on Investment TOP500 Supercomputers, November 2015 InfiniBand FDR and EDR Continue Growth and Leadership The Most Used Interconnect On The TOP500
More informationWhere is Ireland in the Global HPC Arena? and what are we doing there?
Where is Ireland in the Global HPC Arena? and what are we doing there? Dr. Brett Becker Irish Supercomputer List College of Computing Technology Dublin, Ireland Outline The Irish Supercomputer List Ireland
More informationSR-IOV In High Performance Computing
SR-IOV In High Performance Computing Hoot Thompson & Dan Duffy NASA Goddard Space Flight Center Greenbelt, MD 20771 hoot@ptpnow.com daniel.q.duffy@nasa.gov www.nccs.nasa.gov Focus on the research side
More informationKashif Iqbal - PhD Kashif.iqbal@ichec.ie
HPC/HTC vs. Cloud Benchmarking An empirical evalua.on of the performance and cost implica.ons Kashif Iqbal - PhD Kashif.iqbal@ichec.ie ICHEC, NUI Galway, Ireland With acknowledgment to Michele MicheloDo
More informationLS DYNA Performance Benchmarks and Profiling. January 2009
LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The
More informationImproved LS-DYNA Performance on Sun Servers
8 th International LS-DYNA Users Conference Computing / Code Tech (2) Improved LS-DYNA Performance on Sun Servers Youn-Seo Roh, Ph.D. And Henry H. Fong Sun Microsystems, Inc. Abstract Current Sun platforms
More informationApplication and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster
Application and Micro-benchmark Performance using MVAPICH2-X on SDSC Gordon Cluster Mahidhar Tatineni (mahidhar@sdsc.edu) MVAPICH User Group Meeting August 27, 2014 NSF grants: OCI #0910847 Gordon: A Data
More informationScaling from Workstation to Cluster for Compute-Intensive Applications
Cluster Transition Guide: Scaling from Workstation to Cluster for Compute-Intensive Applications IN THIS GUIDE: The Why: Proven Performance Gains On Cluster Vs. Workstation The What: Recommended Reference
More informationPerformance Evaluation of Amazon EC2 for NASA HPC Applications!
National Aeronautics and Space Administration Performance Evaluation of Amazon EC2 for NASA HPC Applications! Piyush Mehrotra!! J. Djomehri, S. Heistand, R. Hood, H. Jin, A. Lazanoff,! S. Saini, R. Biswas!
More informationTransforming your IT Infrastructure for Improved ROI. October 2013
1 Transforming your IT Infrastructure for Improved ROI October 2013 Legal Notices This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Software
More informationA Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department
More informationBenchmarks and Comparisons of Performance for Data Intensive Research
Benchmarks and Comparisons of Performance for Data Intensive Research Saad A. Alowayyed August 23, 2012 MSc in High Performance Computing The University of Edinburgh Year of Presentation: 2012 Abstract
More informationNetworking Virtualization Using FPGAs
Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Massachusetts,
More information