Introduction to High Performance Computing

Introduction to High Performance Computing. Advanced Research Computing, September 9, 2015

Outline
- What constitutes high performance computing (HPC)?
- When to consider HPC resources
- What kind of problems are typically solved?
- What are the components of HPC?
- What resources are available?
- Overview of HPC Resources at Virginia Tech

Should I Pursue HPC?
Are local resources insufficient to meet your needs?
- Very large jobs
- Very many jobs
- Large data
Do you have national collaborators?
- Shared projects between different entities
- Convenient mechanisms for data sharing

Who Uses HPC?
[Pie chart: allocations by research domain] Physics (91) 19%, Molecular Biosciences (271) 17%, Astronomical Sciences (115) 13%, Atmospheric Sciences (72) 11%, Materials Research (131) 9%, Chemical and Thermal Systems (89) 8%, Chemistry (161) 7%, Scientific Computing (60) 2%, Earth Sciences (29) 2%, Training (51) 2%
More than 2 billion core-hours allocated; 1400 allocations; 350 institutions; 32 research domains

Learning Curve
- Linux: command-line interface
- Scheduler: shares resources among multiple users
- Parallel computing: you need to parallelize code to take advantage of a supercomputer's resources; third-party programs or libraries make this easier

Popular Software Packages
- Molecular Dynamics: Gromacs, LAMMPS
- CFD: OpenFOAM, Ansys
- Finite Elements: deal.II, Abaqus
- Chemistry: VASP, Gaussian
- Climate: CESM
- Bioinformatics: Mothur, QIIME, mpiBLAST
- Numerical Computing/Statistics: R, MATLAB
- Visualization: ParaView, VisIt, EnSight

What is Parallel Computing?

Parallel Computing 101
Parallel computing: the use of multiple processors or computers working together on a common task.
- Each processor works on its section of the problem
- Processors can exchange information
[Diagram: a 2-D grid of the problem to be solved, split into four areas; CPU #1 through CPU #4 each work on one area and exchange data along shared boundaries]
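The decomposition in the diagram can be mimicked on a single machine with Python's standard multiprocessing module. A minimal sketch (the problem, worker count, and work function are illustrative, not from the slides):

```python
# Each worker handles its own section of the problem independently,
# and the partial results are combined at the end -- the same idea as
# CPU #1..#4 each owning one area of the grid.
from multiprocessing import Pool

def work_on_section(section):
    # The "work": here, just a sum of squares over this section.
    return sum(x * x for x in section)

if __name__ == "__main__":
    grid = list(range(100_000))            # the full problem
    n_workers = 4
    chunk = len(grid) // n_workers
    sections = [grid[i * chunk:(i + 1) * chunk] for i in range(n_workers)]

    with Pool(n_workers) as pool:
        partial_results = pool.map(work_on_section, sections)

    # Combining partial results plays the role of the "exchange" step.
    total = sum(partial_results)
    print(total == sum(x * x for x in grid))  # True
```

Real simulations also exchange boundary data *during* the computation, which is where message-passing libraries such as MPI come in.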

Why Do Parallel Computing?
Limits of single-CPU computing:
- performance
- available memory
- I/O rates
Parallel computing allows one to:
- solve problems that don't fit on a single CPU
- solve problems that can't be solved in a reasonable time
We can solve larger problems, faster, and run more cases.
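The "reasonable time" limit is often quantified with Amdahl's law: if a fraction p of a program parallelizes perfectly, n processors give a speedup of at most 1 / ((1 - p) + p/n). A quick sketch:

```python
def amdahl_speedup(p, n):
    """Upper bound on speedup when a fraction p of the work runs
    perfectly in parallel on n processors (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelized, 1024 cores give at most ~20x,
# because the serial 5% dominates:
print(round(amdahl_speedup(0.95, 1024), 1))  # 19.6
```

This is why "solve larger problems" matters as much as "solve them faster": growing the problem size usually shrinks the serial fraction.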

Parallelism is the New Moore's Law
- Power and energy efficiency impose a key constraint on the design of microarchitectures
- Clock speeds have plateaued
- Hardware parallelism is increasing rapidly to make up the difference

What does a modern supercomputer look like?

Essential Components of HPC
- Supercomputing resources
- Storage
- Visualization
- Data management
- Network infrastructure
- Support

Terminology
- Core: a computational unit
- Socket: a single CPU ("processor"); includes roughly 4-15 cores
- Node: a single computer; includes roughly 2-8 sockets
- Cluster: a single supercomputer consisting of many nodes
- GPU: graphics processing unit, attached to some nodes; general-purpose GPUs (GPGPUs) can be used to speed up certain kinds of codes
- Xeon Phi: Intel's product name for its GPU competitor; also called MIC

Shared vs. Distributed Memory
Shared memory: all processors have access to a pool of shared memory; access times vary from CPU to CPU in NUMA systems. Examples: SGI UV, CPUs on the same node.
Distributed memory: memory is local to each processor; data exchange happens by message passing over a network. Example: clusters with single-socket blades.
[Diagram: several processors P attached to one shared memory pool, vs. each processor paired with its own memory M and linked by a network]
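The distributed-memory model can be imitated on one machine: each process owns its data and shares it only by sending explicit messages (on a real cluster, MPI's MPI_Send/MPI_Recv play this role). A minimal sketch using Python's multiprocessing pipes (the data and process count are illustrative):

```python
from multiprocessing import Process, Pipe

def worker(conn, my_data):
    # This process "owns" my_data; no other process can read it directly.
    local_sum = sum(my_data)
    conn.send(local_sum)   # explicit message, like MPI_Send
    conn.close()

if __name__ == "__main__":
    data_halves = [[1, 2, 3], [4, 5, 6]]   # each process gets its own half
    parents, procs = [], []
    for half in data_halves:
        parent, child = Pipe()
        p = Process(target=worker, args=(child, half))
        p.start()
        parents.append(parent)
        procs.append(p)

    # Receiving and summing the partial results is a manual "reduce".
    total = sum(conn.recv() for conn in parents)
    for p in procs:
        p.join()
    print(total)  # 21
```

In the shared-memory model the workers would instead read and write one common data structure, which is faster but requires careful synchronization.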

Multi-core Systems
Current processors place multiple processor cores on a die. Communication details are increasingly complex:
- cache access
- main memory access
- QuickPath / HyperTransport socket connections
- node-to-node connection via network

Accelerator-based Systems
- Calculations are made in both CPUs and GPUs
- No longer limited to single-precision calculations
- Load balancing is critical for performance
- Requires specific libraries and compilers (CUDA, OpenCL)
- Co-processor from Intel: MIC (Many Integrated Core)

HPC Trends

Architecture    Code
Single core     Serial
Multicore       OpenMP, Pthreads
GPU             CUDA, OpenACC
Cluster         MPI

How are accelerators different?

                  Intel Xeon E5-2670 (CPU)   Intel Xeon Phi 5110P (MIC)   Nvidia Tesla K20X (GPU)
Cores             8                          60                           14 SMX
Logical cores     16                         240                          2,688 CUDA cores
Frequency         2.60 GHz                   1.05 GHz                     0.74 GHz
GFLOPs (double)   333                        1,010                        1,317
Memory            64 GB                      8 GB                         6 GB
Memory B/W        51.2 GB/s                  320 GB/s                     250 GB/s

Batch Submission Process
[Diagram: the user connects to a login node via ssh and submits a job with qsub; the queue on the master node dispatches the job to compute nodes C1, C2, C3, where it runs via mpirun -np # ./a.out]
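On a PBS/Torque-style scheduler like the one in the diagram, the file passed to qsub is a shell script whose #PBS lines request resources. A hypothetical sketch, as a config fragment (node counts, queue name, job name, and module name are placeholders; consult your site's documentation for real values):

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=16        # request 2 nodes with 16 cores each
#PBS -l walltime=01:00:00     # 1 hour wall-clock limit
#PBS -q normal_q              # queue name (placeholder)
#PBS -N my_first_job          # job name (placeholder)

cd $PBS_O_WORKDIR             # start where qsub was invoked
module load mpi               # load an MPI environment (site-specific)

# Launch 32 MPI ranks, matching the 2 x 16 cores requested above.
mpirun -np 32 ./a.out
```

The script would be submitted with `qsub job.sh`, and job status checked with `qstat`.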

ARC Overview

Advanced Research Computing
- Unit within the Office of the Vice President of Information Technology
- Provides centralized resources for research computing and visualization
- Staff to assist users
- Website: http://www.arc.vt.edu

Goals
- Advance the use of computing and visualization in VT research
- Centralize resource acquisition, maintenance, and support for the research community
- Provide support to facilitate usage of resources and minimize barriers to entry
- Enable and participate in research collaborations between departments

Personnel
- Associate VP for Research Computing: Terry Herdman
- Director, HPC: Vijay Agarwala
- Director, Visualization: Nicholas Polys
- Computational Scientists: Justin Krometis, James McClure, Brian Marshall, Srinivas Yarlanki, Srijith Rajamohan

Personnel (Continued)
- System Administrators: Tim Rhodes, Chris Snapp, Brandon Sawyers
- Business Manager: Alana Romanella
- User Support GRAs: Umar Kalim, Saeed Izadi, Sangeetha Srinivasa

Compute Resources

System      Usage                         Nodes   Node Description                          Special Features
Ithaca      Beginners, MATLAB             79      8 cores, 24 GB (2 Intel Nehalem)          10 double-memory nodes
HokieOne    Shared, Large Memory          82      6 cores, 32 GB (Intel Westmere)           2.6 TB shared memory
HokieSpeed  GPGPU                         201     12 cores, 24 GB (2 Intel Westmere)        402 Tesla C2050 GPU
BlueRidge   Large-scale CPU, MIC          408     16 cores, 64 GB (2 Intel Sandy Bridge)    260 Intel Xeon Phi; 4 K40 GPU; 18 128 GB nodes
NewRiver    Large-scale, Data Intensive   134     24 cores, 128 GB (2 Intel Haswell)        8 K80 GPGPU; 16 big data nodes; 24 512 GB nodes; 2 3 TB nodes

Computational Resources

Name                        NewRiver                       BlueRidge                              HokieSpeed             HokieOne        Ithaca
Key Features, Uses          Scalable CPU, Data Intensive   Scalable CPU or MIC                    GPU                    Shared Memory   Beginners, MATLAB
Available                   August 2015                    March 2013                             Sept 2012              Apr 2012        Fall 2009
Theoretical Peak (TFlop/s)  152.6                          398.7                                  238.2                  5.4             6.1
Nodes                       134                            408                                    201                    N/A             79
Cores                       3,288                          6,528                                  2,412                  492             632
Cores/Node                  24                             16                                     12                     N/A*            8
Accelerators/Coprocessors   8 Nvidia K80 GPU               260 Intel Xeon Phi, 8 Nvidia K40 GPU   408 Nvidia C2050 GPU   N/A             N/A
Memory Size                 34.4 TB                        27.3 TB                                5.0 TB                 2.62 TB         2 TB
Memory/Core                 5.3 GB*                        4 GB*                                  2 GB                   5.3 GB          3 GB*
Memory/Node                 128 GB*                        64 GB*                                 24 GB                  N/A*            24 GB*

Visualization Resources
- VisCube: 3D immersion environment with three 10' by 10' walls and a floor of 1920x1920 stereo projection screens
- DeepSix: six tiled monitors with a combined resolution of 7680x3200
- ROVR Stereo Wall
- AISB Stereo Wall

Getting Started with ARC
- Review ARC's system specifications, including specialty software, and choose the right system(s) for you
- Apply for an account online at the Advanced Research Computing website
- When your account is ready, you will receive confirmation from ARC's system administrators

Resources
- ARC Website: http://www.arc.vt.edu
- ARC Compute Resources & Documentation: http://www.arc.vt.edu/hpc
- New Users Guide: http://www.arc.vt.edu/newusers
- Frequently Asked Questions: http://www.arc.vt.edu/faq
- Linux Introduction: http://www.arc.vt.edu/unix

Thank you! Questions?