MPI / ClusterTools Update and Plans


Sun HPC ClusterTools 7+: A Binary Distribution of Open MPI
MPI / ClusterTools Update and Plans

HPC Technical Training Seminar, July 7, 2008
2nd HLRS Parallel Tools Workshop, October 26, 2007

Len Wisniewski
Engineering Manager, ClusterTools team
Sun Microsystems Software

Overview
- Sun HPC ClusterTools history
- Open MPI
- Sun support for Open MPI
  - Solaris
  - Sun Studio
  - Sun Grid Engine
- MPI profiling
- MPI Testing Tool (MTT)
- Future releases

Sun HPC ClusterTools history
- 1996: Sun acquired the GlobalWorks team from Thinking Machines Corporation
- 1997-2006: Sun released six versions of Sun HPC ClusterTools based on the proprietary GlobalWorks source:
  - MPI libraries (including MPI I/O)
  - Cluster run-time environment (CRE)
  - Sun Scalable Scientific Library (S3L)
  - Sun Parallel File System
  - High Performance Fortran compiler
  - Prism parallel debugger
  - Installation software
- ClusterTools 6 (2006) was the last proprietary release
  - Included just MPI, CRE, and the installation software
  - Other components were EOL'd or integrated into other products

Sun joins Open MPI
- Currently 15 members, 9 contributors, and 2 partners, plus individual contributors

Open MPI goals
- Create a free, open-source, peer-reviewed, production-quality, complete MPI-2 implementation
- Provide extremely high, competitive performance (latency, bandwidth... pick your favorite metric)
- Directly involve the HPC community with external development and feedback (vendors, 3rd-party researchers, users, etc.)
- Provide a stable platform for 3rd-party research and commercial development
- Help prevent the "forking problem" common to other MPI projects
- Support a wide variety of HPC platforms and environments

www.open-mpi.org

Open MPI architecture
- MPI API layer
- PML = Point-to-point Messaging Layer
- BML = BTL Management Layer
- BTL = Byte Transfer Layer
  - TCP BTL
  - Shared memory BTL
  - OpenIB (user verbs) BTL
  - uDAPL BTL
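Because the BTLs are pluggable, a specific transport can be selected at run time through Open MPI's MCA parameter system. A minimal sketch (the application name is a placeholder, not from the slides):

```shell
# Restrict Open MPI to the shared-memory and TCP BTLs; "self" handles a
# rank sending to itself. Forcing the BTL list like this is a standard
# way to rule an interconnect in or out when debugging.
# "./my_mpi_app" is a hypothetical program name.
mpirun --mca btl self,sm,tcp -np 4 ./my_mpi_app
```

With no `--mca btl` argument, Open MPI probes the available BTLs and picks the fastest usable one (e.g., the uDAPL BTL on the Solaris IB stack).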

Sun HPC ClusterTools 7+
- Starting with Sun HPC ClusterTools 7 (CT 7), ClusterTools becomes a binary distribution of Open MPI
  - CT 7 based on Open MPI 1.2
  - CT 7.1 based on Open MPI 1.2.4
  - CT 8 to be based on Open MPI 1.3
- CT 7.1 is available: www.sun.com/clustertools
- Second early access version of CT 8 (CT8 EA2) is available now!
  - First release to support both Solaris and Linux
  - http://www.sun.com/software/products/clustertools/early_access.xml

Open MPI / CT release timeline (chart)

Sun support for Open MPI
- Resource managers: Sun Grid Engine, PBS Pro / Torque / OpenPBS, rsh / ssh, SLURM, LoadLeveler, Xgrid, Yod
- Operating systems: Solaris, Linux, Mac OS X, Windows
- Compilers: Sun Studio, gcc (on Linux only), PGI, Intel, PathScale
- Interconnects: TCP / Ethernet, shared memory, InfiniBand, Myrinet (GM and MX), Portals, loopback

Installation / Sun packages
- Installation tools
  - ctinstall: install on multiple nodes
  - ctact: switch default links between CT versions
- Four Sun packages
  - SUNWompi: Open MPI files
  - SUNWompiat: CT installer utilities
  - SUNWompimn: Open MPI man pages
  - SUNWomsc: miscellaneous support files

Documentation
- Contributed MPI man pages to Open MPI
- Ship four user documents:
  - Migration Guide (from CT 6 to CT 7)
  - User's Guide
  - Installation Guide
  - Release Notes
- Future user documents in progress:
  - MPI Programming Guide
  - Performance Guide
- Plan to collaborate on and/or contribute versions of the above docs in an adaptable format to the community

http://docs.sun.com/app/docs/coll/hpc-clustertools7.1

Solaris support
- Unique Solaris 10 features
  - Zones
  - DTrace
  - uDAPL support (for the Solaris IB stack)
- OpenSolaris 2008.05 packages
- Support for various tools on Solaris
  - Sun Studio
  - Sun Grid Engine
  - 3rd-party tools
    - Parallel debuggers: TotalView (TotalView Technologies), DDT (Allinea)
    - Resource managers: PBS Pro (Altair)

uDAPL Byte Transfer Layer (BTL)
- Solaris 10 has its own IB stack; OFED IB verbs are not supported on Solaris
- The udapl BTL is layered on the uDAPL library
  - uDAPL is a standard library for accessing underlying interconnects
- Solaris will support OFED IB verbs in the future
  - Sun will join the common effort to develop the openib BTL

Sun Studio support
- Support user codes compiled with Sun Studio 10, 11, and 12
  - Fortran, C, and C++
- Provide compiler wrappers
  - Collaboration with the community
  - Enable easy linking of user apps with the appropriate libraries when using Studio; configurable for other compilers
- Address unique attributes of the compiler, OS, or chip type
  - SPARC alignment
  - C++ and Fortran compilers
- Sun Studio supports both Solaris and Linux!
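The wrappers follow the usual Open MPI pattern: the user invokes mpicc / mpif90 / mpiCC and the wrapper calls the underlying compiler with the right include and library flags. A sketch (source file name hypothetical):

```shell
# Compile and link an MPI program through the wrapper; the underlying
# compiler (Sun Studio cc here, gcc on Linux) is invoked for you with
# the MPI include and library paths added.
mpicc -o ring ring.c

# --showme prints the expanded compiler command without running it --
# a quick way to check which compiler and flags the wrapper uses.
mpicc --showme ring.c
```

`--showme` is a standard Open MPI wrapper option; the configurability mentioned above means the wrapped compiler can be switched without changing user build scripts.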

Sun Grid Engine support
- Sun Grid Engine (SGE)
  - Free and open-source resource manager
  - Allows integration with MPI implementations
- Created an SGE plug-in for Open MPI
  - Enables invocation of an MPI job within SGE
    - Via a batch script invoked with the qsub command
    - Via an interactive shell:

      # Submit an SGE and Open MPI job and mpirun in one line
      shell$ qrsh -V -pe orte 4 mpirun -np 4 mpijob

- SGE enables exclusive use of a set of nodes
- SGE supports job accounting
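The batch-script route mentioned above might look like the following sketch ("orte" is the parallel environment name from the qrsh example; the script contents are otherwise hypothetical):

```shell
#!/bin/sh
# Hypothetical SGE batch script: request 4 slots in the "orte" parallel
# environment and run from the submission directory. Under the SGE
# plug-in, mpirun discovers the granted slots from the SGE environment.
#$ -pe orte 4
#$ -cwd
mpirun -np 4 mpijob
```

Submit with `qsub script.sh`; SGE's job accounting then records the run like any other batch job.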

MPI profiling
CT8 EA2 supports the following profiling methods:
- DTrace
- Sun Studio Analyzer
- MPI PERUSE
- VampirTrace

DTrace providers
- DTrace: a comprehensive dynamic tracing facility
  - Examines the behavior of user and kernel code
- CT 7 includes DTrace examples for MPI programs
- CT8 EA2 includes DTrace providers for MPI
  - Piggybacks on the MPI PERUSE framework in Open MPI
  - Unobtrusive tracing of programs
  - netstat-like example

DTrace providers: netstat-like example
- Run an MPI program using the CT MPI libraries
- Invoke the providers via a DTrace script, for example:

    /* Collect removal of send requests */
    mpiperuse$target:::peruse_comm_req_notify
    /args[3]->mcs_op == "send"/
    {
        --sends_act;
    }

- Run the DTrace script while the MPI program runs:

    % dtrace -q -p 13371 -s /opt/SUNWhpc/HPC8.0/examples/dtrace/mpistat.d

- The count decremented above appears as part of the netstat-like output the script prints to stdout

Integration with Sun Studio Analyzer
- Sun Studio Performance (and Thread) Analyzer
  - Takes measurements, computes metrics from them, and attributes the metrics to a dynamic call graph
  - Does not require special compilation
  - GUI support
- MPI support
  - Shows an MPI timeline
  - Defines MPI charts
  - MPI state profiling
- Works on mixed MPI and OpenMP programs
  - Thread Analyzer helps detect and fix races

ClusterTools and 3rd-party products
- Parallel debugger support
  - TotalView (TotalView Technologies)
  - DDT (Allinea)
  - Including message-queue viewing
- Resource manager support
  - PBS Professional (Altair)
- Interconnect support
  - InfiniBand (multiple vendors)
  - Myrinet MX (Myricom)

MPI Testing Tool (MTT)
MTT principles:
- Be freely available to minimize deployment cost
- Incorporate existing tests
- Support simultaneous distributed testing across sites
- Support specialized on-demand and e-mail reporting
- Support parallel tests and various resource managers

https://svn.open-mpi.org/trac/mtt

MPI Testing Tool
"An Extensible Framework for Distributed Testing of MPI Implementations." Joshua Hursey (Indiana U), Ethan Mallove (Sun), Jeffrey M. Squyres (Cisco), and Andrew Lumsdaine (Indiana U). 14th PVM / MPI Conference, Paris, France, October 2007.

MTT results summary example (screenshot)
http://www.open-

Upcoming release: Open MPI 1.3
- Improved job-startup scalability
- Improved MPI_THREAD_MULTIPLE support
- Checkpoint/restart support
- Support for Platform LSF
- OpenIB BTL improvements
  - iWARP support
  - XRC / ConnectX support
- Processor affinity improvements
- Message logging
- VampirTrace support
- Many more new features and improvements

https://svn.open-mpi.org/trac/ompi/wiki (see the release document for the 1.3 series)

Upcoming release: Sun HPC ClusterTools 8
- Based on Open MPI 1.3
- Features
  - Linux support
  - Mellanox ConnectX support
  - Increased scalability
    - Support for 1024 nodes / 4096 processes, and more... TACC-sized clusters
  - Profiling support
    - DTrace providers
    - Tight integration with Sun Studio Analyzer
    - MPI PERUSE
    - VampirTrace
  - InfiniBand multi-rail support on Solaris

Priorities for the future
- Performance (of course!)
  - Job-startup scalability: TACC-sized clusters
  - Collectives: leverage the best algorithms, from the community and from CT 6
  - Latency / bandwidth on various interconnects
- Ease of use
  - Open Run-Time Environment utilities to monitor jobs
  - Automation of parameter selection
  - Analysis of regressions across the community (MTT)
- Fault tolerance

Links for more information
- Sun HPC ClusterTools: http://www.sun.com/clustertools
- Open MPI papers: http://www.open-mpi.org/papers/
- Open MPI: http://www.open-mpi.org
- Sun Studio: http://developers.sun.com/sunstudio/
- Sun Grid Engine: http://www.sun.com/software/gridware/
- Grid Engine: http://gridengine.sunsource.net/
- DTrace: http://www.sun.com/bigadmin/content/dtrace/
- MTT: https://svn.open-mpi.org/trac/mtt
- OpenSolaris: http://www.opensolaris.org

Len Wisniewski
leonard.wisniewski@sun.com