Parallel Debugging with DDT



Similar documents
Debugging and Profiling Lab. Carlos Rosales, Kent Milfeld and Yaakoub Y. El Kharma

Streamline Computing Linux Cluster User Training. ( Nottingham University)

Grid 101. Grid 101. Josh Hegie.

High Performance Computing Facility Specifications, Policies and Usage. Supercomputer Project. Bibliotheca Alexandrina

Debugging with TotalView

RA MPI Compilers Debuggers Profiling. March 25, 2009

Introduction to Sun Grid Engine (SGE)

The Asterope compute cluster

The RWTH Compute Cluster Environment

Installing and running COMSOL on a Linux cluster

Grid Engine Basics. Table of Contents. Grid Engine Basics Version 1. (Formerly: Sun Grid Engine)

Hodor and Bran - Job Scheduling and PBS Scripts

Introduction to Running Hadoop on the High Performance Clusters at the Center for Computational Research

Linux für bwgrid. Sabine Richling, Heinz Kredel. Universitätsrechenzentrum Heidelberg Rechenzentrum Universität Mannheim. 27.

Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC

Grid Engine Users Guide p1 Edition

Running applications on the Cray XC30 4/12/2015

How to Run Parallel Jobs Efficiently

Installing IBM Websphere Application Server 7 and 8 on OS4 Enterprise Linux

Allinea Forge User Guide. Version 6.0.1

MPI / ClusterTools Update and Plans

NEC HPC-Linux-Cluster

Matlab on a Supercomputer

SLURM Workload Manager

Introduction to HPC Workshop. Center for e-research

Introduction to Linux and Cluster Basics for the CCR General Computing Cluster

Using WestGrid. Patrick Mann, Manager, Technical Operations Jan.15, 2014

Miami University RedHawk Cluster Working with batch jobs on the Cluster

Developing Parallel Applications with the Eclipse Parallel Tools Platform

Getting Started with HPC

Compute Cluster Server Lab 3: Debugging the parallel MPI programs in Microsoft Visual Studio 2005

Allinea DDT and MAP User Guide. Version 4.2

GRID Computing: CAS Style

Using NeSI HPC Resources. NeSI Computational Science Team

Work Environment. David Tur HPC Expert. HPC Users Training September, 18th 2015

Using Parallel Computing to Run Multiple Jobs

HPC Wales Skills Academy Course Catalogue 2015

Batch Scripts for RA & Mio

Parallel Processing using the LOTUS cluster

Tutorial: Using WestGrid. Drew Leske Compute Canada/WestGrid Site Lead University of Victoria

HPC at IU Overview. Abhinav Thota Research Technologies Indiana University

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education

Introduction to Running Computations on the High Performance Clusters at the Center for Computational Research

OpenMP & MPI CISC 879. Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware

icer Bioinformatics Support Fall 2011

OLCF Best Practices. Bill Renaud OLCF User Assistance Group

Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises

Linux command line. An introduction to the Linux command line for genomics. Susan Fairley

Debugging in Heterogeneous Environments with TotalView. ECMWF HPC Workshop 30 th October 2014

SLURM: Resource Management and Job Scheduling Software. Advanced Computing Center for Research and Education

Quick Tutorial for Portable Batch System (PBS)

High Performance Computing

1.0. User Manual For HPC Cluster at GIKI. Volume. Ghulam Ishaq Khan Institute of Engineering Sciences & Technology

RWTH GPU Cluster. Sandra Wienke November Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

User s Manual

bwgrid Treff MA/HD Sabine Richling, Heinz Kredel Universitätsrechenzentrum Heidelberg Rechenzentrum Universität Mannheim 29.

MONITORING PERFORMANCE IN WINDOWS 7

Running on Blue Gene/Q at Argonne Leadership Computing Facility (ALCF)

Introduction to the SGE/OGS batch-queuing system

SGE Roll: Users Guide. Version Edition

IT Infrastructure Management

Manual for using Super Computing Resources

Introduction to Supercomputing with Janus

High Performance Computing Cluster Quick Reference User Guide

An Introduction to High Performance Computing in the Department

Cluster Computing With R

Caltech Center for Advanced Computing Research System Guide: MRI2 Cluster (zwicky) January 2014

Management Utilities Configuration for UAC Environments

Improve Fortran Code Quality with Static Analysis

The CNMS Computer Cluster

10 STEPS TO YOUR FIRST QNX PROGRAM. QUICKSTART GUIDE Second Edition

Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ)

Lab 2-2: Exploring Threads

Introduction to SDSC systems and data analytics software packages "

Using the Windows Cluster

ABAQUS High Performance Computing Environment at Nokia

Integrating SNiFF+ with the Data Display Debugger (DDD)

Q N X S O F T W A R E D E V E L O P M E N T P L A T F O R M v Steps to Developing a QNX Program Quickstart Guide

Notes on the SNOW/Rmpi R packages with OpenMPI and Sun Grid Engine

Intel Do-It-Yourself Challenge Lab 2: Intel Galileo s Linux side Nicolas Vailliet

Leak Check Version 2.1 for Linux TM

Microsoft Windows Compute Cluster Server 2003 Evaluation

Session 2: MUST. Correctness Checking

External Device Management - Using SNMP - Enabling the Next Wave of Connectivity

Server Manager Performance Monitor. Server Manager Diagnostics Page. . Information. . Audit Success. . Audit Failure

Grid Engine 6. Troubleshooting. BioTeam Inc.

Xeon Phi Application Development on Windows OS

High-Performance Reservoir Risk Assessment (Jacta Cluster)

ABAP Debugging Tips and Tricks

How To Run A Tompouce Cluster On An Ipra (Inria) (Sun) 2 (Sun Geserade) (Sun-Ge) 2/5.2 (

Getting Started with HC Exchange Module

HPCC - Hrothgar Getting Started User Guide MPI Programming

RTOS Debugger for ecos

Transcription:

Parallel Debugging with DDT Nate Woody 3/10/2009 www.cac.cornell.edu 1

Debugging Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware thus making it behave as expected. Debugging tends to be harder when various subsystems are tightly coupled, as changes in one may cause bugs to emerge in another. A debugger is a computer program that is used to test and debug other programs. This can be hard enough with a single local process and but get s many times more complicated with many remote processes executing asynchronously. This is why Parallel Debuggers exist. 3/10/2009 www.cac.cornell.edu 2

Debugging Requirements In general, while debugging you may need to: Step through code Set/Run to breakpoints Examine variable values at different points during execution Examine the memory profile/usage Provide source-level information after a crash For MPI and OpenMP Code we have additional requirements All of the above for remote processes Examine MPI message status Step individual processes independent of the rest 3/10/2009 www.cac.cornell.edu 3

DDT DDT Distributed Debugging Tool (www.allinea.com) A graphical debugger for scalar, multi-threaded and parallel applications for C, C++ and Fortran DDT s provides graphical process grouping functionality. DDT makes it really easy to assign arbitrary processes into groups which can be acted on separatly. Provides memory debugging features as well, things like checking pointers, array bounds, etc. Provides functionality to interact reasonable with STL components (ie you can see what a map actually contains) and create views for your own objects. Allows viewing of MPI message queues for running processes 3/10/2009 www.cac.cornell.edu 4

Integrating with SGE DDT works by submitting a job to the Ranger development queue and attaches the process once it s started on a compute node. This works by a job template. The template then takes arguments provided by the DDT GUI to generate a batch script that executes your job with the appropriate arguments. The default configuration of DDT (which you get when you start DDT the first time) provides a default template that prepares an SGE job template. The template allows you enough flexibility to provide most arguments that you would need to set (walltime, account number, number of processors, etc). login4% cat $DDTROOT/templates/sge.qtf #!/bin/bash 3/10/2009 www.cac.cornell.edu 5

DDT Job Template ($DDTROOT/template/sge.qpt) #!/bin/bash # QUEUE_TAG: {type=text,label="queue",default=development} # WALL_CLOCK_LIMIT_TAG: {type=text,label="wall Clock Limit",default="0:30:00",mask="9:09: 09"} # PROJECT_TAG: {type=text,label="project"} Pulls settings from GUI #$ -N DDTJOB #$ -q QUEUE_TAG #$ -o ddt.o$job_id Parallel Environment setting is fixed! #$ -cwd #$ -j y #$ -V #$ -l h_rt=wall_clock_limit_tag #$ -pe 16way NUM_PROCS_TAG #$ -A PROJECT_TAG # Run the program with the debugger ibrun DDTPATH_TAG/bin/ddt-debugger DDT_DEBUGGER_ARGUMENTS_TAG PROGRAM_ARGUMENTS_TAG 3/10/2009 www.cac.cornell.edu 6 Executes your job inside the debugger via ibrun

DDT Demo By far the best way to show what DDT can do is to start it up and look at it and show some things with it. Once we do this, we ll have everybody log in and make sure they can DDT started. We ll talk about: Creating and altering groups Stepping groups and processes Show Cross-group comparison Show Memory Usage/Profiling Show MPI Queues Show multi-dimensional array viewer 3/10/2009 www.cac.cornell.edu 7

Starting DDT Login to ranger with an X tunnel $ ssh X ranger.tacc.utexas.edu We need a binary compiled with debugging flags. If you don t have a binary already on ranger, you can get one from the train00 directory login3% mkdir ~/ddt login3$ cp ~train00/ddt_debug/debug_code.f. Ensure you have your preferred compiler loaded login3% module list login3% module unload mvapich login3% module swap pgi intel login3% module load mvapich 3/10/2009 www.cac.cornell.edu 8

Starting DDT Compile with debugging flags login3% cd ~/ddt login3% mpif90 g O0 debug_code.f o ddt_app Load the DDT module login3% module list login3% module load ddt login3% module list login3% echo $DDTROOT Start DDT login3% ddt ddt_app 3/10/2009 www.cac.cornell.edu 9

Starting DDT Click! 3/10/2009 www.cac.cornell.edu 10

Running a job Add any arguments Ranger default Sets number of nodes Click when ready to submit job 3/10/2009 www.cac.cornell.edu 11

Account Name Provide allocation id (qsub -A value) then click OK 3/10/2009 www.cac.cornell.edu 12

Waiting for job to start 3/10/2009 www.cac.cornell.edu 13

Job starting, connecting to all remote processes 3/10/2009 www.cac.cornell.edu 14

Session started! Root process is selected Source locations of processes Local variables STDOUT Watched Values, Expressions 3/10/2009 www.cac.cornell.edu 15

DDT At this point, DDT should be up and running for you and you only need to load the DDT module and any configuration changes you made (ie Account name) will be saved for the next time you use it. It should feel very much like an IDE debugger, just with the added capabilities of viewing remote processes and MPI information. It wasn t shown, but this can be used just as well to debug OpenMP programs, though you may need to be careful when stepping through non-threaded sections. Check out the User Guide for any questions you have or request help through the TeraGrid help desk. UserGuide: http://www.allinea.com/downloads/userguide.pdf Or press F1 while running DDT to call up the help. 3/10/2009 www.cac.cornell.edu 16