Introduction: Physics at CSC. Tomasz Malkiewicz, Jan Åström




CSC Autumn School in Computational Physics 2013

Monday, November 25
9.00-10.15: Course intro, physics@csc (T. Malkiewicz, J. Åström)
10.15-10.45: Coffee break
10.45-12.00: Massively parallel computations (K. Rummukainen)
12.00-13.00: Lunch
13.00-14.30: Advanced unix for physicists (J. Lento); Debugging and code optimization (S. Ilvonen / J. Enkovaara)
14.30-14.45: Coffee break
14.45-15.15: On the diversity of particle-based methods (J. Åström)
15.15-16.30: Archive and IO + demo on FGI and Cloud (K. Mattila, R. Laurikainen)

Tuesday, November 26
9.00-10.15: Round robin: how CSC can help your research (T. Malkiewicz)
10.15-10.45: Coffee break
10.45-12.00: Computational physics with Xeon Phi and GPU (F. Robertsén)
12.00-13.00: Lunch
12.40: Guided tour of the supercomputers
13.00-14.30: Advanced unix for physicists (J. Lento); Introduction to glaciology and numerical modelling of glacier dynamics, example: Vestfonna ice-cap, Svalbard (M. Schäfer)
14.30-14.45: Coffee break
14.45-15.15: Continuum models and assumptions (T. Zwinger)
15.15-16.30: Scientific visualization, focus on geophysics (J. Hokkanen)

Aims
- Lecture-oriented rather than conference-oriented presentations
- Slides/abstracts available in advance
- Try to make potentially difficult things look relatively easy to learn and understand
- Skip items that have less significance in the everyday work of physicists
- Hands-on sessions, included in most lecture sessions, allow you to practice the subjects just learned

Physics at CSC
- Physics on supercomputers
- Resources available for physicists
- What's new
- Future
- Why and when to use supercomputers
- Courses of interest for physicists
- Physics people at CSC
- Q/A

Physics on supercomputers
Physics is a branch of science concerned with the nature, structure and properties of matter, ranging from the smallest scale of atoms and sub-atomic particles to the Universe as a whole. Physics includes experiment and theory and involves both fundamental research driven by curiosity and applied research linked to technology. (EPS report, 2013)
A supercomputer is a computer at the frontline of contemporary processing capacity, particularly speed of calculation. The fastest supercomputer at the moment is China's Tianhe-2, with 33.86 petaflop/s (quadrillions of calculations per second) on the LINPACK benchmark.

Usage of processor time by discipline, 1H/2013 (pie chart). Total: 84.5 million billing units. Disciplines shown: Physics, Nanoscience, Chemistry, Astrophysics, Computational fluid dynamics, Biosciences, Grid usage, Materials sciences, Computational drug design, Other.

Application software usage (software maintained by CSC) by processor time, 1H/2013 (pie chart). Total: 22.3 million core hours. Applications shown: GPAW, Gromacs, CP2K, Gaussian, Molpro, NAMD, ADF, VASP, Matlab, Turbomole, Other.

New projects by discipline, 1H/2013 (pie chart). Total: 195 new projects. Disciplines shown: Biosciences, Computer science, Language research, Physics, Grid usage, Chemistry, Structural analysis, Social sciences, Medical sciences, Computational fluid dynamics, Other.

Users of computing servers by organization, 2012 (pie chart). Total: 1463 users. Organizations shown: University of Helsinki, Aalto University, University of Jyväskylä, University of Turku, University of Oulu, University of Eastern Finland, Tampere University of Technology, CSC (PRACE), University of Tampere, CSC (Projects), Other.

Foreign user accounts in CSC's server environment, 1H/2013 (pie chart). Total: 1121 users from 69 countries. Countries shown: Germany, France, U.K., Italy, India, Poland, China, Russia, USA, Spain, The Netherlands, Other (58 countries).

Currently available computing resources
- Sisu (Cray XC30) for massive computational challenges: more than 10,000 cores, more than 23 TB of memory, theoretical peak performance over 240 Tflop/s
- Taito (HP cluster, plus Vuori until 1/2014) for small and medium-sized tasks: theoretical peak performance 180 Tflop/s (Vuori 40)
- Hippu application servers: interactive use without a job scheduler; postprocessing, e.g. visualization
- FGI (Finnish Grid Infrastructure)
- CSC cloud services

Power distribution (FinGrid): the last site-level blackout was in the early 1980s. CSC started ITI curve monitoring in early February 2012.

Photo slides: Sisu now; Sisu rear view; Taito (HP) hosted in SGI Ice Cube R80; SGI Ice Cube R80; Taito.

Cray Dragonfly topology
- All-to-all network between groups
- Two-dimensional all-to-all network within a group
- Optical uplinks to the inter-group network
Source: Robert Alverson, Cray, Hot Interconnects 2012 keynote

Performance of numerical libraries (chart): single-core DGEMM, 1000x1000 matrices, in GFlop/s, for ATLAS 3.8, ATLAS 3.10, ACML 5.2, Ifort 12.1 matmul (RedHat 6.2 RPM), MKL 12.1, LibSci, ACML 4.4.0 and MKL 11, on Sandy Bridge at 2.7 GHz (peak 2.7 GHz x 8 Flop/Hz; turbo peak 3.5 GHz x 8 Flop/Hz when only one core is used) and on Opteron Barcelona at 2.3 GHz (Louhi; peak 2.3 GHz x 4 Flop/Hz). MKL is the best choice on Sandy Bridge for now; on Cray, LibSci is a good alternative.
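This kind of single-core DGEMM measurement is easy to reproduce yourself; the snippet below is a minimal sketch (not from the slides) that times a 1000x1000 double-precision matrix product through NumPy, which dispatches to whichever BLAS library (MKL, LibSci, OpenBLAS, ...) the installation is linked against, and reports GFlop/s.

```python
# Minimal sketch (not from the slides): measure DGEMM throughput in GFlop/s
# with whatever BLAS your NumPy build is linked against (MKL, LibSci, OpenBLAS, ...).
import time
import numpy as np

n = 1000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

a @ b                                  # warm-up, excluded from timing
reps = 20
t0 = time.perf_counter()
for _ in range(reps):
    c = a @ b                          # calls BLAS dgemm for float64 arrays
elapsed = (time.perf_counter() - t0) / reps

flops = 2.0 * n**3                     # ~2*n^3 floating-point operations per matmul
print(f"DGEMM {n}x{n}: {flops / elapsed / 1e9:.1f} GFlop/s")
```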

Sisu & Taito vs. Louhi & Vuori vs. FGI vs. local cluster (Merope)
- Sisu & Taito (Phase 1): available; Intel Sandy Bridge Xeon E5-2670, 2 x 8 cores, 2.6 GHz; Aries / FDR InfiniBand interconnect; 11776 / 9216 cores; 2 / 4 GB RAM per core, plus 16 nodes with 256 GB each; 244 / 180 Tflop/s; GPU nodes in Phase 2; 2.4 PB of disk space
- Vuori: available (by 1/2014); 2.6 GHz AMD Opteron and Intel Xeon; QDR InfiniBand; 3648 cores; 1 / 2 / 8 GB RAM per core; 33 Tflop/s; 8 GPU nodes; 145 TB of disk space
- FGI: available; Intel Xeon X5650, 2 x 6 cores, 2.7 GHz; QDR InfiniBand; 7308 cores; 2 / 4 / 8 GB RAM per core; 95 Tflop/s; 88 GPU nodes; over 1 PB of disk space
- Merope (local cluster): available; 748 cores; 4 / 8 GB RAM per core; 8 Tflop/s; 6 GPU nodes; 100 TB of disk space
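As a rough cross-check of those numbers, dividing the quoted peak Tflop/s by the core count gives the per-core peak; the short sketch below (not part of the slides) does that and reproduces the "about twice Vuori's per-core performance" rule of thumb quoted later in the deck.

```python
# Rough per-core peak comparison from the table above (numbers are the
# theoretical peaks quoted on the slide, not measured application performance).
systems = {
    "Sisu":   {"cores": 11776, "tflops": 244},
    "Taito":  {"cores": 9216,  "tflops": 180},
    "Vuori":  {"cores": 3648,  "tflops": 33},
    "FGI":    {"cores": 7308,  "tflops": 95},
    "Merope": {"cores": 748,   "tflops": 8},
}

for name, s in systems.items():
    gflops_per_core = s["tflops"] * 1e3 / s["cores"]
    print(f"{name:7s} {gflops_per_core:5.1f} Gflop/s per core (peak)")
# Sisu and Taito come out at roughly 20 Gflop/s per core versus ~9 for Vuori,
# which is where the "about 2x per-core performance" figure comes from.
```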

What's new

Future
- Phase 1 (deployment done): Cray and HP
  - CPU: Intel Sandy Bridge, 16 cores @ 2.6 GHz
  - Interconnect: Aries (Cray) / FDR InfiniBand, 56 Gbps (HP)
  - Cores: 11776 (Cray) / 9216 (HP)
  - Tflop/s: 244 (Cray) / 180 (HP, 5x Vuori); 424 in total
- Phase 2 (probably 2014): Cray and HP
  - CPU: next-generation processors
  - Interconnect: Aries (Cray) / EDR InfiniBand, 100 Gbps (HP)
  - Cores: ~40000 (Cray) / ~17000 (HP)
  - Tflop/s: 1700 (Cray) / 515 (HP, 15x Vuori); 2215 in total

CSC computing capacity 1989-2012 (chart of standardized processor capacity: max. capacity (80%) and capacity used). The succession of systems runs from the Cray X-MP/416 and Convex C220 (1989) through the Convex 3840, SGI R4400, IBM SP1 and SP2, Cray C94, Cray T3E (expanded in stages up to 512 processors, decommissioned 12/2002), SGI Origin 2000, IBM SP Power3, Compaq Alpha cluster (Clux) and the two Compaq Alpha servers Lempo and Hiisi (Clux and Hiisi decommissioned 2/2005), IBM eServer Cluster 1600, Sun Fire 25K, HP DL145 Proliant, HP CP4000 BL Proliant (465c DC AMD, later 6C AMD; Murska decommissioned 6/2012), Cray XT4 (DC and QC) and Cray XT5, up to the HP Proliant SL230s and Cray XC30. A second chart shows the Top500 ratings (the Top500 lists, http://www.top500.org/, started in 1993) of CSC systems from 1993 to 2012, including the Cray X-MP, Convex C3840, SGI Power Challenge, IBM SP1/SP2, Cray C94, Digital AlphaServer, Cray T3E, SGI Origin 2000, IBM SP Power3, IBM eServer p690, Cray XT4/XT5, HP Proliant 465c and SL230s, and Cray XC30.

IT summary
- Cray XC30 supercomputer (Sisu): the fastest computer in Finland
  - Phase 1: 385 kW, 244 Tflop/s, 16 cores with 2 GB of memory each per compute node, 4 login nodes with 256 GB each
  - Phase 2: ~1700 Tflop/s
  - Very high density, large racks
- PRACE prototype (coming in late 2013 and 2014): Intel Xeon Phi coprocessors and next-generation NVIDIA GPUs
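The quoted peak follows directly from core count, clock rate and the 8 double-precision flops per cycle a Sandy Bridge core can retire with AVX (the same 8 Flop/Hz figure used in the numerical-library chart above); a minimal back-of-the-envelope sketch:

```python
# Back-of-the-envelope peak for Sisu Phase 1 (numbers from the slides;
# 8 double-precision flops/cycle per Sandy Bridge core with AVX).
cores = 11776
clock_hz = 2.6e9
flops_per_cycle = 8

peak_tflops = cores * clock_hz * flops_per_cycle / 1e12
print(f"Sisu Phase 1 theoretical peak: {peak_tflops:.0f} Tflop/s")  # ~245, matching the quoted 244
```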

IT summary, continued
- HP cluster (Taito): 1152 Intel CPUs; 16 cores with 4 GB of memory each per node; 16 fat nodes with 16 cores and 16 GB per core; 6 login nodes with 64 GB each; 180 Tflop/s; 30 kW, 47 U racks
- HPC storage: 1 + 1.4 + 1.4 PB of fast parallel storage, serving both the Cray and HP systems

Why and when to use HPC? (chart): ns/day for a lipid MD simulation (120k atoms, PME, Gromacs) as a function of core count, up to about 600 cores, on Sisu, Taito, Louhi and Vuori; an inset zooms in on the 0-32 core range for Louhi and Vuori. The newer systems reach higher ns/day and keep scaling to larger core counts before levelling off.
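A practical way to read such scaling curves is to convert ns/day into speedup and parallel efficiency relative to a small baseline run; the sketch below uses made-up numbers (not the measured Gromacs data from the chart) to show the bookkeeping.

```python
# Sketch (illustrative numbers, not the measured Gromacs data from the chart):
# given ns/day at different core counts, compute speedup and parallel efficiency
# to decide where adding cores stops paying off.
baseline_cores = 16
measurements = {16: 12.0, 32: 23.0, 64: 42.0, 128: 70.0, 256: 95.0}  # cores -> ns/day (hypothetical)

base = measurements[baseline_cores]
for cores, ns_per_day in sorted(measurements.items()):
    speedup = ns_per_day / base
    efficiency = speedup / (cores / baseline_cores)
    print(f"{cores:4d} cores: {ns_per_day:6.1f} ns/day, "
          f"speedup {speedup:4.1f}x, efficiency {efficiency:5.1%}")
# A common rule of thumb is to stay above roughly 50-70% efficiency when
# choosing how many cores to request in a batch job.
```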

Courses at CSC
- CSC courses: http://www.csc.fi/courses
- CSC HPC Summer School
- Sisu (Cray) workshops
- Taito (HP) workshops
- December 2013: Intel Xeon Phi programming

Physics people at CSC
- Particle-based methods: Jan Åström
- Geophysics/glaciology: Thomas Zwinger
- Nanoscience/semiconductors: Jura Tarus
- Nuclear/particle physics: Tomasz Malkiewicz
- Partial differential equations/Elmer: Peter Råback
- A few with a background in DFT: Juha Lento
- Quantum chemistry: Nino Runeberg
- A few with a numerical mathematics background
- Several with advanced code optimisation skills
- Everything related to HPC in general

Q/A: I need disk space
- 3.8 PB on DDN storage
- $HOME, $USERAPPL: 20 GB
- $WRKDIR (not backed up): soft quota 5 TB
- HPC archive: 2 TB per user, common to the Cray and HP systems
- /tmp (around 1.8 TB): to be used for compiling codes
- More disk space available through IDA

Disks at Kajaani (diagram): the sisu.csc.fi and taito.csc.fi login and compute nodes each have a local $TMPDIR and share $WRKDIR, $HOME and $USERAPPL; the new tape $ARCHIVE in Espoo is reached through an iRODS interface with a disk cache, using the i-commands (icp, iput, ils, irm); your own workstation reaches the same storage through an iRODS client or SUI.
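Since the i-commands are ordinary command-line tools, the archive workflow is easy to script; the following is a minimal sketch that assumes the icommands are installed and an iRODS session has already been opened with iinit, and the file and collection names are purely hypothetical.

```python
# Minimal sketch: drive the iRODS i-commands from Python to archive a file.
# Assumes the icommands are installed and you have run `iinit` to authenticate.
import subprocess

local_file = "results/run42.tar.gz"      # hypothetical file name
archive_dir = "archive_runs"             # hypothetical collection in $ARCHIVE

subprocess.run(["imkdir", "-p", archive_dir], check=True)             # create collection if missing
subprocess.run(["iput", "-K", local_file, archive_dir], check=True)   # upload with checksum verification
subprocess.run(["ils", "-l", archive_dir], check=True)                # list to confirm the upload
```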

Datasets served by TTA
- Projects funded by the Academy of Finland (akatemiahankkeet, huippuyksiköt, tutkimusohjelmat and tutkimusinfrastruktuurit): 1 PB capacity
- Universities and polytechnics: 1 PB capacity
- ESFRI projects (e.g. BBMRI, CLARIN), FSD, pilots and additional shares: 1 PB capacity
- Other important research projects via a special application process

Q/A: Is there a single place to look for information about the supercomputers?
- User manuals: http://research.csc.fi/guides
- Support: helpdesk@csc.fi

Q/A: I need large capacity -> Grand Challenges
- Normal GC call (every half year or year): new CSC resources available for a year; no lower limit on the number of cores, up to 50%
- Special GC call (mainly for the Cray, arranged when needed): possibility of short runs (a day or less) with the whole Cray
- Remember also PRACE/DECI: http://www.csc.fi/english/csc/news/news/pracecalls

Q/A: Is cloud computing something for me? Example: Taito
The Taito cluster has two types of nodes: HPC nodes and cloud nodes. HPC nodes run the host OS (RHEL) directly; cloud nodes host virtual machines whose guest OS can be, for example, Ubuntu or Windows.

Q/A: How fast is the I/O?
- InfiniBand interconnect: 56 Gbit/s, tested to give 20 GB/s (peak, on DDN storage)
- i-commands: about 100 MB/s = 1 Gbit/s (10-16 threads; transfers larger than 32 MB are spread over threads, scheduled by the kernel)
- SUI: about 11 MB/s, i.e. roughly 1 minute per GB
- For comparison, a fast laptop reaches about 120 MB/s over the network and about 40 MB/s disk write; 10 Gbit/s Ethernet = 1.2 GB/s
- Metadata operations on Lustre take long, so it is not good to have many small files
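To get a feel for what those rates mean in practice, the sketch below (illustrative, with a hypothetical 500 GB dataset) converts them into transfer times.

```python
# Rough transfer-time estimates for a dataset at the rates quoted above.
dataset_gb = 500                      # hypothetical dataset size in GB

rates_mb_per_s = {
    "SUI": 11,
    "i-commands": 100,
    "10 GbE": 1200,
    "Lustre (DDN peak)": 20000,
}

for name, mb_s in rates_mb_per_s.items():
    seconds = dataset_gb * 1000 / mb_s
    print(f"{name:18s} {seconds / 3600:6.2f} h")
# The gap between 11 MB/s and 20 GB/s is why the transfer method (and avoiding
# lots of small files on Lustre) matters far more than the raw link speed.
```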

Q/A: What is the fastest way to connect? Use the NoMachine NX server for remote access.

Q/A: How do I get access to CSC supercomputers? Sign up at sui.csc.fi (HAKA authentication).

Quick summary
- Per-core performance about 2x that of Vuori
- Better interconnects enhance scaling
- Larger memory
- Smarter collective communications
- The most powerful computer(s) in Finland
- A big investment
Performance comparison (chart): Gromacs performance in ns/day versus core count, up to about 80 cores, on Taito, Sisu, FGI, Vuori and Louhi.

Round robin

Round robin
- What are your research interests?
- What are your needs in terms of computing?
- How can CSC help?
- Any comments for CSC?