GPU Tools Sandra Wienke

1 Sandra Wienke
  Center for Computing and Communication, RWTH Aachen University
  MATSE HPC Battle 2012/13
  Rechen- und Kommunikationszentrum (RZ)

2 Agenda
  - IDE
      - Eclipse
  - Debugging (CUDA)
      - TotalView
  - Profiling (CUDA & OpenACC)
      - NVIDIA Visual Profiler
  - Appendix
      - Debugging host code with TotalView

3 IDE - Eclipse
  Eclipse + Parallel Nsight: IDE for GPU programming
  - CUDA syntax highlighting, CUDA debugging, CUDA profiling
  - OpenACC programming, OpenACC profiling
  Using Nsight in the RWTH Cluster environment:
      module load cuda
      nsight
  1. Choose a workspace
  2. File > New > Makefile Project with Existing Code
  3. Choose the file directory of the CG Solver
  4. Toolchain: CUDA Toolkit
  5. Create Makefile targets (Makefile tab in the right pane), e.g. cuda, clean
  6. Double-click on a Makefile target to execute it
  Or download Parallel Nsight.

4 IDE - Eclipse
  Debugging (CUDA)
  - Use the debug configuration of the Makefile to create the executable
  - Press the debug button (green bug)
  - Proceed (even if errors are reported)
  - Switch to the debug perspective (the application suspends in the main function; at this point no GPU code is running)
  - Add a breakpoint in the device code
  - Resume the application
  Profiling (CUDA, OpenACC)
  - Internally uses NVIDIA's Visual Profiler (see later)
  - Use the release target of the Makefile to create the executable
  - Press the profile button (watch)
  - Proceed (even if errors are reported)
  - Switch perspective
  - For output and interpretation, see the chapter on the NVIDIA Visual Profiler

6 Debugging (CUDA)
  - Debugging host code works as usual
  - Debugging GPU kernels requires special tools
      - CUDA debuggers are available
      - OpenACC debuggers are not available
  - Compiling CUDA applications:
        nvcc [-arch=sm_20] mykernel.cu
    RWTH Cluster environment: module load cuda
  - Debugging flags: -g -G
        nvcc -g -G [-arch=sm_20] mykernel.cu
    (see the Makefile debug target)
  - CUDA command line tools
      - Debugger: cuda-gdb
      - Detecting memory access errors: cuda-memcheck
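
To have something concrete to compile with the debugging flags, the following is a minimal sketch of what a file like mykernel.cu could contain. The kernel `scale` and the helper `blocks_for` are hypothetical names, not taken from the slides. The sketch assumes a reasonably recent CUDA toolkit; such toolkits no longer accept `-arch=sm_20`, so that option would have to be dropped or replaced by the architecture of the installed GPU.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Number of blocks needed to cover n elements with block_size threads each.
constexpr int blocks_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Multiplies every element by a factor. The bounds check keeps the surplus
// threads of the last block from writing past the end of the array.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;  // a convenient line for a device breakpoint
    }
}

int main() {
    const int n = 1000;
    const int block_size = 256;
    std::vector<float> host(n, 1.0f);

    float *dev = nullptr;
    if (cudaMalloc((void **)&dev, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<blocks_for(n, block_size), block_size>>>(dev, n, 2.0f);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    int wrong = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f) ++wrong;
    }
    std::printf("%d wrong elements\n", wrong);
    return wrong == 0 ? 0 : 1;
}
```

Built with `nvcc -g -G mykernel.cu`, the executable can be loaded into cuda-gdb or run under cuda-memcheck. Removing the `if (i < n)` check is one way to provoke an out-of-bounds write for the memory checker to report.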

7 Debugging (CUDA)
  CUDA GUI-based debugger: TotalView
  - Debugging host and device code in the same session
  - Thread navigation by logical or physical coordinates
  - Displaying hierarchical memory
  - General information on debugging with TotalView can be found in the appendix
  CUDA GUI-based debugger: Eclipse (see above)
  RWTH Cluster environment:
      module load totalview
      totalview
  If you get an error concerning the CUDA version, try compiling your application with CUDA 4.1:
      module switch cuda cuda/41

8 Debugging (CUDA) - TotalView
  Setting breakpoints in CUDA kernels
  - Start debugging (e.g. "Go")
  - A message box appears when the kernel is loaded
  - Set kernel breakpoints as in host code

9 Debugging (CUDA) - TotalView
  Debugger thread IDs in a Linux CUDA process
  - Host thread: positive number
  - CUDA thread: negative number
  GPU thread navigation
  - Logical coordinates: blocks (3 dimensions), threads (3 dimensions)
  - Physical coordinates: device, SM, warp, core/lane
  - Only valid selections are permitted
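
The relation between logical thread coordinates and the warp and lane a thread ends up in can be sketched as follows. The function names are hypothetical. The sketch relies on the linearization described in the CUDA programming guide (x varies fastest, consecutive groups of 32 linear IDs form a warp). The device, the SM and the hardware warp slot are assigned at runtime and cannot be computed this way, which is why a debugger view is needed for them.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;  // warp size of NVIDIA GPUs

// Linear index of a thread inside its block: x varies fastest, then y, then z.
__host__ __device__ constexpr int linear_thread_id(int tx, int ty, int tz,
                                                   int dim_x, int dim_y) {
    return tx + dim_x * (ty + dim_y * tz);
}

// Warp number inside the block and lane number inside that warp.
__host__ __device__ constexpr int warp_in_block(int linear_id) {
    return linear_id / kWarpSize;
}
__host__ __device__ constexpr int lane_in_warp(int linear_id) {
    return linear_id % kWarpSize;
}

// The first lane of every warp prints its logical coordinates.
__global__ void report_coordinates() {
    int tx = threadIdx.x, ty = threadIdx.y, tz = threadIdx.z;
    int id = linear_thread_id(tx, ty, tz, blockDim.x, blockDim.y);
    if (lane_in_warp(id) == 0) {
        printf("block (%d,%d,%d) thread (%d,%d,%d) -> warp %d, lane %d\n",
               (int)blockIdx.x, (int)blockIdx.y, (int)blockIdx.z,
               tx, ty, tz, warp_in_block(id), lane_in_warp(id));
    }
}

int main() {
    dim3 grid(2, 1, 1);
    dim3 block(16, 4, 1);  // 64 threads per block, i.e. two warps
    report_coordinates<<<grid, block>>>();
    cudaDeviceSynchronize();  // wait so the device printf output is flushed
    return 0;
}
```

With a 16 x 4 block, thread (0,2,0) has linear ID 32 and is therefore lane 0 of the second warp of its block.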

10 Debugging (CUDA) - TotalView
  Warp: group of 32 threads
  - Share one PC
  - Advance synchronously
  - Problem: diverging threads
        if (threadIdx.x > 2) {...} else {...}
  Single stepping
  - Advances all GPU hardware threads within the same warp
  - Stepping over a __syncthreads() call advances all threads within the block
  Advancing more than just one warp
  - Run To a selected line number in the source pane
  - Set a breakpoint and Continue the process
  Halt
  - Stops all the host and device threads
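
A small kernel built around the branch from this slide makes the divergence visible when single-stepping. This is a sketch only: the names `takes_if_branch`, `if_branch_mask` and `diverge` are hypothetical, and a single block of 32 threads (one warp) is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Branch condition from the slide, usable on host and device.
__host__ __device__ constexpr bool takes_if_branch(int tx) {
    return tx > 2;
}

// Bit mask of the lanes of the first warp that take the if-branch
// (bit i set means lane i takes it). Written recursively so that it is a
// valid C++11 constexpr function.
constexpr unsigned if_branch_mask(int lane = 0, unsigned mask = 0u) {
    return lane == 32
               ? mask
               : if_branch_mask(lane + 1,
                                takes_if_branch(lane) ? (mask | (1u << lane))
                                                      : mask);
}

// Lanes 0..2 and lanes 3..31 of the same warp follow different paths.
__global__ void diverge(int *out) {
    int tx = threadIdx.x;
    if (takes_if_branch(tx)) {
        out[tx] = 1;  // if-branch
    } else {
        out[tx] = 2;  // else-branch
    }
}

int main() {
    const int n = 32;  // exactly one warp
    int host[n];
    int *dev = nullptr;
    if (cudaMalloc((void **)&dev, n * sizeof(int)) != cudaSuccess) return 1;

    diverge<<<1, n>>>(dev);
    cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    for (int i = 0; i < n; ++i) {
        std::printf("lane %2d took the %s branch\n", i,
                    host[i] == 1 ? "if" : "else");
    }
    std::printf("if-branch lane mask: 0x%08X\n", if_branch_mask());
    return 0;
}
```

Setting a breakpoint on each of the two assignments and stepping through the kernel shows which lanes of the warp are active on each path.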

11 Debugging (CUDA) - TotalView
  Displaying CUDA device properties
  - Tools > CUDA Devices
  - Helps mapping between logical and physical coordinates
  - Shows PCs across SMs, warps, lanes
  GPU thread divergence?
  - Different PCs within a warp indicate diverging threads

12 Debugging (CUDA) - TotalView
  Displaying GPU data
  - Dive into a variable or watch
  - Type in the Expression List
  Device memory notation: the meaning of an address depends on the storage type
  - Offset within global storage
  - Offset within shared storage
  - Offset within local storage
  - PTX register name
  - Offset within generic address space (e.g. pointer to global, local or shared memory)
  - Offset within constant storage
  - Offset within texture storage
  - Offset within parameter storage

13 Debugging (CUDA) - TotalView
  Checking GPU memory
  - Enable CUDA memory checking during startup or in the Debug menu
  - Detects global memory addressing violations and misaligned global memory accesses
  Further features
  - Multi-device support
  - Host-pinned memory support
  - MPI-CUDA applications

14 Debugging (CUDA) - Tips
  Check CUDA API calls
  - Nearly all CUDA runtime API routines return an error code (cudaError_t)
  - Alternatively, cudaGetLastError() returns the last error from a CUDA runtime call
  - cudaGetErrorString(cudaError_t) returns the corresponding message
  1. Write a macro to check CUDA API return codes, or use the SafeCall and CheckError macros from cutil.h (NVIDIA GPU Computing SDK)
  2. Use TotalView to examine the return code
      - Evaluate the CUDA API call in the expression list
      - If needed, dive on the error value and typecast it to a cudaError_t type
      - You can also wrap the API call in cudaGetErrorString() in the expression field and typecast it to char[xx]*
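
One possible shape for the macro from tip 1 is sketched below. `CUDA_CHECK`, `check_cuda`, `cuda_failed` and the kernel `fill` are hypothetical names chosen for this illustration; they are not the cutil.h macros. Only cudaGetLastError(), cudaGetErrorString() and cudaDeviceSynchronize() from the CUDA runtime API are assumed.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// True if a CUDA runtime call reported an error.
constexpr bool cuda_failed(cudaError_t err) {
    return err != cudaSuccess;
}

// Prints the failing expression with its source location and aborts.
inline void check_cuda(cudaError_t err, const char *expr,
                       const char *file, int line) {
    if (cuda_failed(err)) {
        std::fprintf(stderr, "%s:%d: %s failed: %s\n",
                     file, line, expr, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Wraps any expression that yields a cudaError_t.
#define CUDA_CHECK(call) check_cuda((call), #call, __FILE__, __LINE__)

__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

int main() {
    const int n = 256;
    int *dev = nullptr;
    CUDA_CHECK(cudaMalloc((void **)&dev, n * sizeof(int)));

    // Kernel launches return no error code, so query the error state.
    fill<<<1, n>>>(dev, n, 7);
    CUDA_CHECK(cudaGetLastError());       // errors detected at launch
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    int first = 0;
    CUDA_CHECK(cudaMemcpy(&first, dev, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));

    std::printf("first element: %d\n", first);
    return 0;
}
```

Aborting on the first error is a design choice that suits debugging sessions; production code may prefer to propagate the error code to the caller instead.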

15 Debugging (CUDA) - Tips
  Check and use available hardware features
  - printf statements are possible within kernels (since Fermi)
  - Use double precision floating point operations (since GT200)
  - Enable ECC and check whether single- or double-bit errors occurred using nvidia-smi -q (since Fermi)
  Check final numerical results on the host
  - While porting, it is recommended to compare all computed GPU results with host results
  1. Compute checksums of the GPU and host array values
  2. If that is not sufficient, compare the arrays element-wise
  - Comparative debugging approach, e.g. statistics view
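
The two-stage comparison of GPU and host results can be sketched as follows. All names (`nearly_equal`, `checksum`, `first_mismatch`, the kernel `square`) and the absolute tolerance of 1e-12 are assumptions made for this illustration; a suitable tolerance depends on the application.

```cuda
#include <cstddef>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// True if a and b differ by at most tol (absolute tolerance).
constexpr bool nearly_equal(double a, double b, double tol) {
    return (a > b ? a - b : b - a) <= tol;
}

__global__ void square(const double *in, double *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}

// Stage 1: a cheap checksum over all values.
double checksum(const std::vector<double> &v) {
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum;
}

// Stage 2: index of the first mismatching element, or -1 if all agree.
int first_mismatch(const std::vector<double> &a,
                   const std::vector<double> &b, double tol) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (!nearly_equal(a[i], b[i], tol)) return static_cast<int>(i);
    }
    return -1;
}

int main() {
    const int n = 1 << 10;
    std::vector<double> in(n), gpu(n), cpu(n);
    for (int i = 0; i < n; ++i) {
        in[i] = 0.001 * i;
        cpu[i] = in[i] * in[i];  // host reference result
    }

    double *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc((void **)&d_in, n * sizeof(double)) != cudaSuccess) return 2;
    if (cudaMalloc((void **)&d_out, n * sizeof(double)) != cudaSuccess) return 2;
    cudaMemcpy(d_in, in.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    square<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    cudaMemcpy(gpu.data(), d_out, n * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    std::printf("checksum host %.12e, GPU %.12e\n", checksum(cpu), checksum(gpu));
    int bad = first_mismatch(cpu, gpu, 1e-12);
    if (bad >= 0) {
        std::printf("first mismatch at index %d: host %.12e, GPU %.12e\n",
                    bad, cpu[bad], gpu[bad]);
    } else {
        std::printf("all elements agree within the tolerance\n");
    }
    return bad >= 0 ? 1 : 0;
}
```

Matching checksums do not prove that the arrays match, because errors in different elements can cancel. That is the reason for the element-wise second stage.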

16 Debugging (CUDA) - Tips
  Check intermediate results
  - If results are stored directly in global memory: dive on the result array
  - If results are stored in on-chip memory (e.g. registers): tedious debugging
  - TotalView: a view of variables across CUDA threads is not possible yet
  1. Create an additional array on the host for intermediate results with size #threads * #results * sizeof(result)
      - Use the array on the GPU: each thread stores its result at a unique index
      - Transfer the array back to the host and examine the results
  2. If the number of thread blocks is limited: create an additional array in shared memory within the kernel function: __shared__ myarray[size]
      - Use defines to exchange access to the on-chip variable with array access
      - Examine results by diving on the array and switching between blocks
  - Use filter, array statistics, freeze, duplicate, last values and watchpoints
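
Approach 1 can be sketched as follows: every thread copies its register-held intermediate values into its own slots of an extra global array, which is then transferred back and inspected on the host. The kernel `compute`, the helper `debug_slot` and the choice of two intermediate values per thread are hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kResultsPerThread = 2;  // assumed number of intermediates

// Unique slot in the debug array for intermediate result r of thread tid.
__host__ __device__ constexpr int debug_slot(int tid, int r) {
    return tid * kResultsPerThread + r;
}

__global__ void compute(const float *in, float *out, float *debug, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    float doubled = 2.0f * in[tid];  // intermediate value held in a register
    float shifted = doubled + 1.0f;  // second intermediate value

    // Copy the on-chip values to global memory for inspection.
    debug[debug_slot(tid, 0)] = doubled;
    debug[debug_slot(tid, 1)] = shifted;

    out[tid] = shifted * shifted;
}

int main() {
    const int n = 8;  // number of threads
    std::vector<float> in(n), out(n), debug(n * kResultsPerThread);
    for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr, *d_debug = nullptr;
    // Debug array size: #threads * #results * sizeof(result)
    const size_t debug_bytes = n * kResultsPerThread * sizeof(float);
    if (cudaMalloc((void **)&d_in, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMalloc((void **)&d_out, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMalloc((void **)&d_debug, debug_bytes) != cudaSuccess) return 1;

    cudaMemcpy(d_in, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    compute<<<1, n>>>(d_in, d_out, d_debug, n);
    cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(debug.data(), d_debug, debug_bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_debug);

    for (int tid = 0; tid < n; ++tid) {
        std::printf("thread %d: doubled=%g shifted=%g out=%g\n", tid,
                    debug[debug_slot(tid, 0)], debug[debug_slot(tid, 1)],
                    out[tid]);
    }
    return 0;
}
```

In a debugger session the host-side `debug` vector can be examined after the copy instead of printing it.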

18 Profiling (CUDA & OpenACC)
  Profiling = analyzing the behavior of an application at runtime, e.g. runtime of functions, memory throughput
  NVIDIA Visual Profiler for CUDA & OpenACC codes
  - Profiles only GPU data movement and computation (not host code)
  RWTH Cluster environment:
      module load cuda
      nvvp
  1. Compile your program and start the profiler (nvvp)
  2. Select File > New Session
  3. Choose your executable as file
      - Specify arguments, e.g. the matrix file
      - Specify environment variables, e.g. CG_MAX_ITER
  4. If you want to shorten the execution time, set a timeout limit
  5. Finish the session configuration and wait for the results

20 Profiling (CUDA & OpenACC)
  Session tab: timeline
  - Long memory copy from host to device
  - Short memory copy from device to host
  - Compute time for the first kernel
  - Collapse to see summarized info only
  - Is data only transferred when needed?
  - Which kernel needs the most time?

21 Profiling (CUDA & OpenACC)
  Analysis tab
  - Gives hints for optimization (not always useful)
  Details tab
  - Switch from the Analysis tab to the Details tab
  - Shows runtime and grid dimensions
  - On the right-hand side, activate the summary view
  - Kernel name: <func>_<line>_gpu

23 Appendix - Debugging host code
  Start TotalView and select your program to debug

24 Appendix - Debugging host code
  Process window of TotalView
  - Toolbar
  - Process and thread status
  - Stack Trace pane
  - Stack Frame pane
  - Source pane
  - Tabbed pane

25 Appendix - Debugging host code
  Breakpoints
  - Interrupt execution when a specific code line is reached
  - Conditional breakpoints are possible
  - Set by clicking in the source pane
  - Temporary disabling is possible
  Watchpoints
  - Interrupt when a specific memory location changes
  - Conditional watchpoints are possible (e.g. only stop if the sign of the value changes or a specified threshold is reached)

26 Appendix - Debugging host code
  Setting a breakpoint

27 Appendix - Debugging host code
  Inspecting an array in C/C++
  - Double-click on the array name
  - Typecast necessary

28 Appendix - Debugging host code
  Data visualizations are helpful for big data arrays

29 Appendix - Debugging host code
  Create a watchpoint for a[29]

30 Appendix - Debugging host code
  Execution will be interrupted as soon as a[29] changes


More information

ABAP Debugging Tips and Tricks

ABAP Debugging Tips and Tricks Applies to: This article applies to all SAP ABAP based products; however the examples and screen shots are derived from ECC 6.0 system. For more information, visit the ABAP homepage. Summary This article

More information

Running a Program on an AVD

Running a Program on an AVD Running a Program on an AVD Now that you have a project that builds an application, and an AVD with a system image compatible with the application s build target and API level requirements, you can run

More information

DS-5 ARM. Using the Debugger. Version 5.16. Copyright 2010-2013 ARM. All rights reserved. ARM DUI0446P

DS-5 ARM. Using the Debugger. Version 5.16. Copyright 2010-2013 ARM. All rights reserved. ARM DUI0446P ARM DS-5 Version 5.16 Using the Debugger Copyright 2010-2013 ARM. All rights reserved. ARM DUI0446P ARM DS-5 ARM DS-5 Using the Debugger Copyright 2010-2013 ARM. All rights reserved. Release Information

More information

Hands On CUDA Tools and Performance-Optimization

Hands On CUDA Tools and Performance-Optimization Mitglied der Helmholtz-Gemeinschaft Hands On CUDA Tools and Performance-Optimization JSC GPU Programming Course 26. März 2011 Dominic Eschweiler Outline of This Talk Introduction Setup CUDA-GDB Profiling

More information

Debugging in AVR32 Studio

Debugging in AVR32 Studio Debugging in AVR32 Studio Debugging is a very powerful tool if you want to have a deeper look into your program. You can look at both variables and register values and check they are correct. In AVR32

More information

CodeWarrior for Power Architecture 10.1.1 Errata

CodeWarrior for Power Architecture 10.1.1 Errata CodeWarrior for Power Architecture 10.1.1 Errata MTWX39813 MTWX46941 MTWX47555 MTWX48310 MTWX48571 Build Tools P4080 runtime does not support building ROM applications Getting incorrect results when optimizing

More information

Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary

Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary OpenCL Optimization Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary 2 Overall Optimization Strategies Maximize parallel

More information

Configuring Security for FTP Traffic

Configuring Security for FTP Traffic 2 Configuring Security for FTP Traffic Securing FTP traffic Creating a security profile for FTP traffic Configuring a local traffic FTP profile Assigning an FTP security profile to a local traffic FTP

More information

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS Performance 1(6) HIGH PERFORMANCE CONSULTING COURSE OFFERINGS LEARN TO TAKE ADVANTAGE OF POWERFUL GPU BASED ACCELERATOR TECHNOLOGY TODAY 2006 2013 Nvidia GPUs Intel CPUs CONTENTS Acronyms and Terminology...

More information

How to test and debug an ASP.NET application

How to test and debug an ASP.NET application Chapter 4 How to test and debug an ASP.NET application 113 4 How to test and debug an ASP.NET application If you ve done much programming, you know that testing and debugging are often the most difficult

More information

Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ)

Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ) Advanced MPI Hybrid programming, profiling and debugging of MPI applications Hristo Iliev RZ Rechen- und Kommunikationszentrum (RZ) Agenda Halos (ghost cells) Hybrid programming Profiling of MPI applications

More information

Backup Server DOC-OEMSPP-S/6-BUS-EN-21062011

Backup Server DOC-OEMSPP-S/6-BUS-EN-21062011 Backup Server DOC-OEMSPP-S/6-BUS-EN-21062011 The information contained in this guide is not of a contractual nature and may be subject to change without prior notice. The software described in this guide

More information

Learn CUDA in an Afternoon: Hands-on Practical Exercises

Learn CUDA in an Afternoon: Hands-on Practical Exercises Learn CUDA in an Afternoon: Hands-on Practical Exercises Alan Gray and James Perry, EPCC, The University of Edinburgh Introduction This document forms the hands-on practical component of the Learn CUDA

More information

Fahim Uddin http://fahim.cooperativecorner.com email@fahim.cooperativecorner.com. 1. Java SDK

Fahim Uddin http://fahim.cooperativecorner.com email@fahim.cooperativecorner.com. 1. Java SDK PREPARING YOUR MACHINES WITH NECESSARY TOOLS FOR ANDROID DEVELOPMENT SEPTEMBER, 2012 Fahim Uddin http://fahim.cooperativecorner.com email@fahim.cooperativecorner.com Android SDK makes use of the Java SE

More information

DiskBoss. File & Disk Manager. Version 2.0. Dec 2011. Flexense Ltd. www.flexense.com info@flexense.com. File Integrity Monitor

DiskBoss. File & Disk Manager. Version 2.0. Dec 2011. Flexense Ltd. www.flexense.com info@flexense.com. File Integrity Monitor DiskBoss File & Disk Manager File Integrity Monitor Version 2.0 Dec 2011 www.flexense.com info@flexense.com 1 Product Overview DiskBoss is an automated, rule-based file and disk manager allowing one to

More information

serious tools for serious apps

serious tools for serious apps 524028-2 Label.indd 1 serious tools for serious apps Real-Time Debugging Real-Time Linux Debugging and Analysis Tools Deterministic multi-core debugging, monitoring, tracing and scheduling Ideal for time-critical

More information

Visual Basic. murach's TRAINING & REFERENCE

Visual Basic. murach's TRAINING & REFERENCE TRAINING & REFERENCE murach's Visual Basic 2008 Anne Boehm lbm Mike Murach & Associates, Inc. H 1-800-221-5528 (559) 440-9071 Fax: (559) 440-0963 murachbooks@murach.com www.murach.com Contents Introduction

More information

Phone Inventory 1.0 (1000) Installation and Administration Guide

Phone Inventory 1.0 (1000) Installation and Administration Guide Phone Inventory 1.0 (1000) Installation and Administration Guide 2010 VoIP Integration June 23, 2010 Table of Contents Product Overview... 3 Requirements... 3 Application Requirements... 3 Call Manager...

More information

INTEL PARALLEL STUDIO XE EVALUATION GUIDE

INTEL PARALLEL STUDIO XE EVALUATION GUIDE Introduction This guide will illustrate how you use Intel Parallel Studio XE to find the hotspots (areas that are taking a lot of time) in your application and then recompiling those parts to improve overall

More information

Eclipse Integrated Virtual Debugger User s Manual Workstation 6.0

Eclipse Integrated Virtual Debugger User s Manual Workstation 6.0 Eclipse Integrated Virtual Debugger User s Manual Workstation 6.0 Eclipse Integrated Virtual Debugger User s Manual Eclipse Integrated Virtual Debugger User s Manual Revision: 20070426 Item: WS-ENG-Q107-297

More information

CodeWarrior Development Studio for Freescale S12(X) Microcontrollers Quick Start

CodeWarrior Development Studio for Freescale S12(X) Microcontrollers Quick Start CodeWarrior Development Studio for Freescale S12(X) Microcontrollers Quick Start SYSTEM REQUIREMENTS Hardware Operating System Disk Space PC with 1 GHz Intel Pentum -compatible processor 512 MB of RAM

More information

DsPIC HOW-TO GUIDE Creating & Debugging a Project in MPLAB

DsPIC HOW-TO GUIDE Creating & Debugging a Project in MPLAB DsPIC HOW-TO GUIDE Creating & Debugging a Project in MPLAB Contents at a Glance 1. Introduction of MPLAB... 4 2. Development Tools... 5 3. Getting Started... 6 3.1. Create a Project... 8 3.2. Start MPLAB...

More information

AMD CodeXL 1.7 GA Release Notes

AMD CodeXL 1.7 GA Release Notes AMD CodeXL 1.7 GA Release Notes Thank you for using CodeXL. We appreciate any feedback you have! Please use the CodeXL Forum to provide your feedback. You can also check out the Getting Started guide on

More information

XDB Intel System Debugger 2015 Overview Training. Robert Mueller-Albrecht, TCE, SSG DPD ECDL

XDB Intel System Debugger 2015 Overview Training. Robert Mueller-Albrecht, TCE, SSG DPD ECDL XDB Intel System Debugger 2015 Overview Training Robert Mueller-Albrecht, TCE, SSG DPD ECDL Agenda 1) What is XDB? 2) Debugger startup and device/platform support 3) Debugger usage (Android* an Linux*)

More information

DAVE version 4 Quick Start Simple LED Blinky via a Generated PWM Signal. XMC Microcontrollers February 2016

DAVE version 4 Quick Start Simple LED Blinky via a Generated PWM Signal. XMC Microcontrollers February 2016 DAVE version 4 Quick Start Simple LED Blinky via a Generated PWM Signal XMC Microcontrollers February 2016 Learning Outcome Learn the basic principles of DAVE TM version 4: Installation Required XMC kit

More information

Advanced OpenMP Course

Advanced OpenMP Course 1 / 7 Advanced OpenMP Course: Exercises and Handout Advanced OpenMP Course Christian Terboven, Dirk Schmidl IT Center, RWTH Aachen University Seffenter Weg 23, 52074 Aachen, Germany {terboven, schmidl}@itc.rwth-aachen.de

More information

Integrated Virtual Debugger for Visual Studio Developer s Guide VMware Workstation 8.0

Integrated Virtual Debugger for Visual Studio Developer s Guide VMware Workstation 8.0 Integrated Virtual Debugger for Visual Studio Developer s Guide VMware Workstation 8.0 This document supports the version of each product listed and supports all subsequent versions until the document

More information

The RWTH Compute Cluster Environment

The RWTH Compute Cluster Environment The RWTH Compute Cluster Environment Tim Cramer 11.03.2013 Source: D. Both, Bull GmbH Rechen- und Kommunikationszentrum (RZ) How to login Frontends cluster.rz.rwth-aachen.de cluster-x.rz.rwth-aachen.de

More information

High Performance Computing in Aachen

High Performance Computing in Aachen High Performance Computing in Aachen Christian Iwainsky iwainsky@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Produktivitätstools unter Linux Sep 16, RWTH Aachen University

More information