Experiences of numerical simulations on a PC cluster
Antti Vanne
December 11, 2002




Introduction
- The Beowulf concept: using commodity off-the-shelf hardware to build a massively parallel computer
- Nodes run in a dedicated network; only one node (the master node) is connected to the LAN (Figure)
- Nodes run open-source software: Linux, FreeBSD, PVM, MPI
- The first Beowulf, a 16-node 486 DX4 cluster, was built in 1994 at the Center of Excellence in Space Data and Information Sciences (CESDIS)

Beowulf network configuration

[Figure: slave nodes connected through a hub; the master node bridges the dedicated cluster Ethernet and the LAN]

Hardware
- Fujitsu-Siemens Primergy servers with one Pentium processor per node
- Memory: SDRAM (master and slave nodes)
- Storage: SCSI disk in the master node
- Network: Intel TX PCI copper Ethernet NICs; switch: HP ProCurve with two Gigabit Ethernet modules

Software
- OS: Linux (2.4-series pre-release kernel); the distribution is a Red Hat derivative
- Slave nodes are diskless: startup is done via PXE and the OS is on NFS; easy to maintain and upgrade
- MPICH: a Message Passing Interface (MPI) implementation, callable from Fortran and C
- Job scheduling: GNU Queue
- Cluster monitoring: Ganglia Cluster Toolkit
- GCC and Absoft compilers
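A diskless PXE + NFS-root setup of this kind typically combines a DHCP entry on the master that hands each slave a boot loader over TFTP, plus an NFS export of the node root filesystem. A minimal sketch follows; the addresses, MAC, and paths are hypothetical, not taken from the talk:

```
# /etc/dhcpd.conf on the master (hypothetical addresses and MAC):
host node01 {
    hardware ethernet 00:30:05:aa:bb:cc;
    fixed-address 192.168.1.101;
    filename "pxelinux.0";          # PXE boot loader, served via TFTP
}

# /etc/exports on the master: the slaves' root filesystem, shared read-only
/tftpboot/node-root 192.168.1.0/255.255.255.0(ro,no_root_squash)
```

Because every slave mounts the same exported tree, upgrading the OS means changing one directory on the master, which is the maintainability point made above.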

Ganglia Cluster Toolkit

[Figure: Ganglia cluster view]

Mathematical software
- ATLAS, BLAS, LAPACK: sequential linear algebra libraries
- ScaLAPACK and PBLAS: parallel versions of subsets of BLAS and LAPACK; dense and band matrices supported
- PETSc: a PDE solver toolkit that includes basic matrix algebra operations and linear and nonlinear equation solvers; supports both sparse and dense matrices; also features interfaces to several other packages (SuperLU, Matlab)
- Both PETSc and ScaLAPACK use the MPI library for communications

Writing parallel code
- More or less complicated than sequential code, depending on the library used
- MPI: write everything from scratch
- High-level libraries (PETSc, ScaLAPACK): the libraries take care of the data distribution and communication
- Trade-off between development time and execution time
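The trade-off above can be illustrated with a small sketch of what "from scratch" message-passing code must manage for even one distributed dot product: explicit data partitioning plus an explicit reduction. This is a serial Python stand-in (the function names are hypothetical); with real MPI each chunk would live on a different node and the partial sums would be combined with MPI_Allreduce, while a library such as PETSc does all of this inside a single VecDot call.

```python
def partition(n, nprocs):
    """Split indices 0..n-1 into nprocs contiguous chunks, like a
    block distribution of a vector across nodes."""
    base, rem = divmod(n, nprocs)
    chunks, start = [], 0
    for p in range(nprocs):
        size = base + (1 if p < rem else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def distributed_dot(x, y, nprocs):
    """Each 'node' computes a partial dot product over its own chunk;
    the final sum stands in for the MPI_Allreduce reduction step."""
    partials = [sum(x[i] * y[i] for i in chunk)
                for chunk in partition(len(x), nprocs)]
    return sum(partials)  # the communication/reduction step
```

With a high-level library the entire routine collapses to one call, which is exactly the development-time side of the trade-off; the execution-time side is whether the library's generic distribution fits the problem.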

Code example: Matlab versus PETSc (called from C)

Matlab:

    for i = 1:nt
        T(:,i+1) = M*T(:,i) + dt*T_a*M*u(:,i).^2;
        T(:,i+1) = U \ (L \ T(:,i+1));
    end

PETSc:

    for (i = 0; i < nt; i++) {
        MatMult(A, pu[i], tmpn);               /* u_i           */
        VecPointwiseMult(tmpn, tmpn, tmpn);    /* u_i^2         */
        MatMult(M, tmpn, tmpn);                /* M u_i^2       */
        VecAXPY(&dt_Ta, tmpn, tmpn);           /* dt T_a M u_i^2 */
        MatMultAdd(M, pt[i], tmpn, pt[i]);     /* T_i: M T + dt T_a M u_i^2 */
        SLESSolve(slesL, pt[i], tmpn, &its);   /* L solve       */
        SLESSolve(slesU, tmpn, pt[i], &its);   /* U solve       */
    }

Performance
- Depends heavily on the application
- Imbalances in the data distribution among the nodes result in surprising calculation times
- Parallel versions of three different numerical simulations: the bioheat transfer equation using FEM, aerosol size distribution estimation using a SIR filter, and ultrasound wavefield simulation using the ultra weak variational formulation (UWVF)
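The imbalance effect mentioned above can be quantified with a simple model (an illustration, not a measurement from the talk): a parallel step finishes only when the most loaded node finishes, so a mild skew in the data distribution erodes speedup noticeably.

```python
def parallel_time(work_per_node, rate=1.0):
    """Runtime of one parallel step: the slowest (most loaded) node
    determines when everyone can proceed."""
    return max(work_per_node) / rate

def speedup(total_work, work_per_node, rate=1.0):
    """Speedup relative to a single node doing all the work."""
    return (total_work / rate) / parallel_time(work_per_node, rate)

# Balanced: 4 nodes with 25 units each give speedup 4.
# Skewed: if one node gets 40 of the 100 units, speedup drops to 2.5,
# which is the kind of "surprising calculation time" noted above.
```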

Bioheat equation solver using FEM

[Figure: computation domain with subdomains Ω I - Ω IV and thermal dose; axes x (m), y (m)]

Bioheat equation solver using FEM: calculation times

[Table: calculation time t (s) versus number of processors and number of nodes, for several problem sizes N]

Aerosol size estimation (SIR filter): calculation times

[Table: calculation time t (s) versus number of processors and number of particles in the SIR filter]

Helmholtz UWVF solver

[Figure: domain consisting of tetrahedra, frequency f in kHz; UWVF calculation time t (s) versus number of processors]

Conclusions
- Beowulf clusters are a cost-effective alternative to traditional parallel computers for memory-bound problems
- Network latency is THE problem for matrix calculations
- Special NICs (Myrinet, SCI) have low latencies compared to Gb Ethernet, but typically cost more per node
- A cheaper option for a low-bandwidth network could be IEEE 1394b FireWire, for small or middle-sized clusters
- For easily parallelizable problems, clusters of non-dedicated desktop computers can be used
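The latency point in the conclusions can be made concrete with the standard linear message-cost model t = alpha + m/beta (alpha = per-message latency, beta = bandwidth). The parameter values below are round illustrative figures, not measurements from this cluster: for the small messages typical of distributed matrix kernels, the fixed latency term dominates on Gigabit Ethernet, and that is precisely what low-latency interconnects such as Myrinet or SCI buy back.

```python
def message_time(nbytes, latency_s, bandwidth_bps):
    """Linear cost model for sending one message: t = alpha + m/beta."""
    return latency_s + nbytes / bandwidth_bps

def latency_fraction(nbytes, latency_s, bandwidth_bps):
    """Share of the total transfer time spent in fixed per-message latency."""
    return latency_s / message_time(nbytes, latency_s, bandwidth_bps)

# Illustrative round numbers (assumptions, not measured values):
# Gigabit Ethernet: ~50 us latency, ~125e6 bytes/s.
# For a 1 kB message, well over 80% of the time is latency, so halving
# the latency helps far more than doubling the bandwidth.
```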