Bull Extreme Computing (Bull, 2011)

Table of Contents
- HPC Overview
- Cluster Overview
- FLOPS

HPC Overview. Gerardo Ares, HPC Team.

HPC concepts

HPC: High Performance Computing.

Motivation:
- Advances in technology have made it possible to tackle bigger and more complex problems.
- In recent years, high performance infrastructures have been developed for scientific computing.
- Over the past 10 years, the share of cluster architectures in the Top500 list has grown from 2.2% to 84.8% (source: http://www.top500.com).

Architecture     June 2000   June 2010
Constellations   18.6        0.4
Clusters         2.2         84.8
MPP              51.4        14.8
SMP              27.8        0.0

(Shares in percent of the Top500 list.)

HPC concepts

Satisfy the growing requirements for computing power:
- Complex problems.
- Complex models.
- Huge data sets.
- Time-limited responses.

Parallel processing:
- Many processes work together to solve a common problem.
- Domain decomposition or functional parallelism is used to reduce the time to solution.

HPC concepts

Domain decomposition:
- Many simulations in science and engineering work with a simplified picture of reality in which a computational domain, e.g., some volume of a fluid, is represented as a grid that defines discrete positions for the physical quantities under consideration.
- The goal of the simulation is usually the computation of observables on this grid.
- A straightforward way to distribute the work involved across workers, i.e. processors, is to assign a part of the grid to each worker.
- This is called domain decomposition (see the sketch below).
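A minimal sketch of domain decomposition in C with MPI. The 1-D grid, its size, and the summed observable are illustrative assumptions, not taken from the slides: each rank claims a contiguous slice of the grid, computes a partial result, and rank 0 combines them.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int N = 1000000;                         /* total grid points (assumed) */
    int chunk = N / size;                          /* points per worker */
    int lo = rank * chunk;                         /* this rank's subdomain */
    int hi = (rank == size - 1) ? N : lo + chunk;  /* last rank takes the remainder */

    /* Each worker computes an observable (here: a plain sum) over its own subdomain. */
    double local = 0.0;
    for (int i = lo; i < hi; i++)
        local += (double)i;

    /* Combine the partial results on rank 0. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum over %d points = %.0f\n", N, global);

    MPI_Finalize();
    return 0;
}
```

In a real grid code the subdomains would also exchange boundary (ghost) cells with their neighbours between steps; the reduction above only shows the final combination of results.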

HPC concepts

Functional decomposition:
- Sometimes a complete problem can be split into more or less disjoint subtasks that may have to be executed in some specific order, each one potentially using results of a previous one as input or being completely unrelated up to some point.
- The tasks can be worked on in parallel, using appropriate amounts of resources so that load imbalance is kept under control (see the sketch below).
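A minimal functional-decomposition sketch in C with OpenMP sections; the two subtasks themselves are illustrative assumptions, not from the slides. Each section is an independent piece of the problem handed to a different thread.

```c
#include <omp.h>
#include <stdio.h>

/* Two unrelated subtasks of a larger problem (both assumed for illustration). */
static double task_a(void)
{
    double s = 0.0;
    for (int i = 1; i <= 1000000; i++)
        s += 1.0 / i;                  /* e.g. a harmonic sum */
    return s;
}

static double task_b(void)
{
    double p = 1.0;
    for (int i = 0; i < 30; i++)
        p *= 1.01;                     /* e.g. compound growth */
    return p;
}

int main(void)
{
    double a = 0.0, b = 0.0;

    /* Each section runs concurrently on its own thread. */
    #pragma omp parallel sections
    {
        #pragma omp section
        a = task_a();

        #pragma omp section
        b = task_b();
    }

    printf("a = %f, b = %f\n", a, b);
    return 0;
}
```

If the subtasks take very different amounts of time, the load imbalance mentioned above shows up directly: the parallel region only finishes when the slowest section does.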

Cluster concepts

Parallel programming relies on:
- Processing power (SMP systems).
- Networks (data communications).
- Development libraries and programming APIs.

Shared-memory processing:
- Multithreaded programming (see the OpenMP sketch below).

Distributed processing:
- Independent machines interconnected by a high performance network.
- Synchronization is provided by a message-passing mechanism.
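As a shared-memory counterpart to the MPI sketch above, here is a minimal multithreaded example in C with OpenMP; the array size and the reduction are illustrative assumptions. All threads see the same memory, so no explicit messages are needed:

```c
#include <omp.h>
#include <stdio.h>

int main(void)
{
    static double x[1000000];          /* one array, shared by all threads */
    const int N = 1000000;
    double sum = 0.0;

    /* The loop iterations are divided among the threads automatically;
       the reduction clause combines the per-thread partial sums safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        x[i] = (double)i;
        sum += x[i];
    }

    printf("sum = %.0f (using up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}
```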

Cluster Overview. Gerardo Ares, HPC Team.

Cluster overview

A cluster is composed of:
- Compute nodes.
- Administration nodes: login, administration.
- I/O nodes.
- Networks:
  - Application network.
  - Administration network.
  - I/O network.

Software components:
- Operating system.
- Compilers.
- Scientific and parallel libraries.
- Management software.
- Resource manager.

Cluster overview: [architecture diagram]

UFMG Cluster

UFMG cluster:
- Compute nodes: veredas[2-107]: 106 nodes.
  - 2 x Intel Xeon X5355 2.66 GHz (4 cores each).
  - 16 GB RAM.
- Administration node: veredas0.
  - Login node.
  - Cluster administration.
- I/O nodes: veredas[0-1].
  - NFS.
- Networks:
  - Application network: InfiniBand.
  - Administration network: 1 Gb Ethernet.
  - I/O network: 1 Gb Ethernet.

UFMG Cluster

Software components:
- Operating system: Red Hat Enterprise Linux 5.3.
- Compilers:
  - Intel C/C++ & Fortran.
  - GNU gcc & g77.
- Scientific libraries:
  - BLACS, LAPACK, ScaLAPACK, FFTW, NetCDF, HDF5, etc.
- Parallel libraries:
  - Bull MPI 2.
  - Intel MPI.
- Management software:
  - Bull XBAS 5v3.1u1.
- Resource manager:
  - SLURM.

[Demo]

FLOPS. Gerardo Ares, HPC Team.

Flops

One of the most widely accepted metrics for evaluating a cluster's compute power is the number of FLOPS it can reach. FLOPS is an acronym: FLoating-point OPerations per Second. A cluster has a theoretical peak number of double-precision FLOPS. Processors in the Intel 5000 family have 128-bit SSE registers, and each core can issue two SSE floating-point instructions per clock tick. Since a 128-bit register holds two double-precision values, each core can perform 2 x 2 = 4 double-precision floating-point operations per tick. An Intel Xeon X5355 quad-core processor can therefore perform 16 double-precision floating-point operations per clock tick.

UFMG Flops

The peak number of floating-point operations per second for one Intel Xeon X5355 processor:

GFlops per processor = 4 (operations per tick per core) * clock in GHz * number of cores
GFlops per processor = 4 * 2.66 * 4 = 42.56

The UFMG cluster has 106 compute nodes with 2 Intel X5355 processors each, so the theoretical peak of the cluster is:

Theoretical peak = 106 * 2 * 42.56 GFlops = 9022.7 GFlops = 9.0 TFlops.

The same arithmetic as a small C program is shown below.
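A minimal sketch of the peak calculation as a self-contained C program; all constants are the ones given in the slides above.

```c
#include <stdio.h>

int main(void)
{
    const double flops_per_tick = 4.0;  /* DP operations per core per clock tick (128-bit SSE) */
    const double clock_ghz = 2.66;      /* Intel Xeon X5355 clock */
    const int cores = 4;                /* cores per processor */
    const int sockets = 2;              /* processors per node */
    const int nodes = 106;              /* compute nodes in the UFMG cluster */

    double gflops_per_cpu = flops_per_tick * clock_ghz * cores;
    double peak_gflops = gflops_per_cpu * sockets * nodes;

    printf("per processor: %.2f GFlops\n", gflops_per_cpu);
    printf("cluster peak : %.2f GFlops (%.1f TFlops)\n", peak_gflops, peak_gflops / 1000.0);
    return 0;
}
```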
