Dr. Anne C. Elster. Assoc. Prof., HPC-Lab, Dept. Computer and Info. Science Norwegian University of Science & Technology Trondheim, Norway.


1 The Power of Medical Imaging on GPUs Dr. Anne C. Elster Assoc. Prof., HPC-Lab, Dept. Computer and Info. Science Norwegian University of Science & Technology Trondheim, Norway and Visiting Scientist, ECE, University of Texas at Austin, USA

2 Thanks to: My collaborators, post docs and graduate students! Dr. Frank Lindseth (SINTEF Med Tech) & Prof. Bjørn Angelsen, NTNU Med School, Dept. of Circulation & Medical Imaging

3 Thanks to: My post docs and graduate students! (06/07: Spring 2007; 07/08: @ SC 07; 08/09: Spring 2009; 09/10: Spring 2010; 10/11: @ SC 10; 11/12: Spring 2012) http://research.idi.ntnu.no/hpc-lab A.C. Elster: The Power of Medical Imaging on GPU

4 NTNU Gløshaugen (formerly Norwegian Institute of Technology) U of Texas at Austin

5 Trondheim, Norway on the world map

6 Outline Motivation and brief intro to HPC-Lab at NTNU 3D Ultrasound Reconstruction 3D Surface Extraction GPU-Based Airway Segmentation and Centerline Extraction for Image Guided Bronchoscopy Current related projects at HPC-Lab at NTNU Summary

7 MOTIVATION The Power of Medical Imaging : Use Ultrasound, MRI, PET etc Imaging for: medical diagnostics (avoid exploratory surgery) image-guided surgery ++

8 The Power of Medical Imaging: Use Ultrasound, MRI, PET scan imaging to: diagnose (avoid exploratory surgery), image-guided surgery ++ By harnessing the compute power of GPUs!

9 Motivation GPU Computing: Many advances in processor designs are driven by the billion-dollar gaming market! Modern GPUs (Graphics Processing Units) offer lots of FLOPS per watt ... and lots of parallelism! NVIDIA Tesla 2050/2070 (Fermi): 448 CUDA cores. Kepler: GTX 690 and Tesla K10 cards have 3072 (2x1536) cores!

10 Heterogeneous supercomputing
China's Tianhe-1A: No. 1 supercomputer (SC 10), NUDT/NSCC/Tianjin
- NUDT 6-core Intel X5670 2.93 GHz + NVIDIA Tesla M2050 GPU
- Custom interconnect, 183,368 cores
- Rmax @ 2.57 PFlop/s
China's Nebulae: No. 2 (ISC 10) / No. 3 (SC 10), at National Supercomputing Centre in Shenzhen, China
- Dawning TC3600 blades w/ Intel X5650 2.67 GHz + (4640) NVIDIA Tesla C2050 GPUs
- Theoretical peak performance at 2.98 PFlop/s
- Linpack performance of 1.271 PFlop/s

11 NTNU GPU Activities
- Elster's HPC-Lab has graduated 25+ Master's students (diplom) in GPU computing (2007-2012)
- Currently supervising 8+ PhD students & 9 Master's students
- NTNU designated NVIDIA CUDA Teaching Center (summer 2011)
- PhD seminar course (Spring 2013: 7 students)
- Master's level course (Fall 2012: 14 students)
- Senior Parallel Computing class (Fall 2010: 43 taking exam; Fall 2012: 57 students)
- NVIDIA CUDA Research Center (2012)

12 HPC-Lab History (last 8 yrs):
- Fall 2006: First 2 student projects with GPU programming (Cg)
- Christian Larsen (MS Fall Project, December 2006): Utilizing GPUs on Cluster Computers (joint with Schlumberger)
- Erik Axel Nielsen asks for FX 4800 card for project with GE Healthcare
- Elster, as head of the Computational Science & Visualization program, helped NTNU acquire a new IBM supercomputer (Njord, 7+ TFLOPS, proprietary switch)

13 HPC-Lab History (contin.): 2007:
- Erik Axel Nielsen (Masters thesis, June 2007): Real-time Wavelet Filtering on the GPU -- joint project with GE Healthcare. A 40 times GPU speedup of the algorithm led to our implementation being adopted the same fall in their high-end cardiovascular ultrasound scanner.
- Christian Larsen (Masters thesis, June 2007), Tore Fevang, Schlumberger (co-advisor): "Framework for Polygonial Structures Computations on Clusters" (incl. GPU parallelization)
- Idar Borlaug (Masters thesis, June 2007): Seismic Processing Using Parallel 3D FMM
- Thibault Collet (Masters thesis, summer 2007): "Massively Online Games with Food Chains"
- Knut Imar Hagen (Masters thesis, June 2007): Fault-tolerance for MPI Codes on Computation Clusters (joint project with Statoil)
- Nils Magnus Larsgård (Masters thesis, summer 2007): Framework for Converting MPI Codes to Hybrid OpenMP/MPI Codes

14 HPC-Lab History (contin.): 2008: Quad-core supercomputer at UiTø (Stallo), ca. 70 TF. HPC-LAB at IDI/NTNU opens in Oct. with several NVIDIA donations. Several quad-core machines (1-2 donated by Schlumberger)

15 HPC-Lab History (contin.): 2008: HPC-LAB at IDI/NTNU opens in Oct. with several NVIDIA donations. Several quad-core machines (1-2 donated by Schlumberger)
- Rune Hovland (Masters project, Dec 2008): "Latency and Bandwidth Impact on GPU Systems" (ParCo 2009 w/ Elster)
- Daniele Giuseppe Spampinato (Masters project, December 2008): "Linear Optimizations with CUDA" (IPDPS MTAAP 2009 w/ Elster)
- Atle Rudshaug (Masters thesis, June 2008): Optimizing & Parallelizing a Large Commercial Code for Modeling Oil-well Networks -- joint project with Yggdrasil
- Andreas Bach (Masters thesis, September 2008): Profiling and Optimizing a Seismic Application on Modern Architectures -- joint project with Statoil

16 HPC-Lab History (contin.): 2009: NVIDIA Tesla S1070 (4 GPUs, 960 cores @ 1.44 GHz, 4 TF). Two NVIDIA Quadro FX 5800 cards (Jan 09), NVIDIA Ion (Jun 09). Two AMD/ATI Radeon 5870 (1600 cores @ 850 MHz, 2.72 TF) (one donated by AMD). Note: Memory vs. processor clocks, e.g. NVIDIA S1070(-500): 792 MHz vs. 1.44 GHz
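The peak figures quoted on this slide can be reproduced with simple arithmetic: cores times clock times floating-point operations per core per cycle. The sketch below is illustrative only; the per-cycle factors (3 for the Tesla S1070 generation, 2 for the Radeon 5870) are assumptions not stated on the slide, and `peak_tflops` is a hypothetical helper name.

```python
def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical single-precision peak in TFLOP/s.

    cores * GHz gives giga-cycles per second summed over all cores;
    multiplying by flops per cycle gives GFLOP/s, divided by 1000 for TFLOP/s.
    """
    return cores * clock_ghz * flops_per_cycle / 1000.0


# Core counts and clocks are taken from the slide.
# The flops-per-cycle factors are ASSUMPTIONS (not on the slide):
#   3 = one multiply-add plus one multiply per cycle,
#   2 = one multiply-add per cycle.
s1070 = peak_tflops(cores=960, clock_ghz=1.44, flops_per_cycle=3)
hd5870 = peak_tflops(cores=1600, clock_ghz=0.85, flops_per_cycle=2)

print(f"Tesla S1070 : {s1070:.2f} TFLOP/s")
print(f"Radeon 5870 : {hd5870:.2f} TFLOP/s")
```

Under these assumptions the results land close to the rounded "4 TF" and "2.72 TF" on the slide. These are theoretical peaks; the slide's note on memory versus processor clocks is a reminder that memory-bound kernels will see far less.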

19 Selected Master theses and Master reports supervised by Dr. Elster in 2009
1) Robin Eidissen (Masters thesis, January 2009): "Utilizing GPUs for Real-Time Visualization of Snow" (demoed @ SC 08-SC 10)
   Eirik Aksnes and Henrik Hesland (MS Project, Jan 2009): "GPU Techniques for Porous Rock Visualization"
2) Rune Erlend Jensen (Masters thesis, May 2009, currently PhD student at HPC-Lab): "Techniques and Tools for Optimizing Codes on Modern Architectures: A Low-Level Approach" (NR MS Thesis Award!)
3) Rune Johan Hovland (Masters thesis, June 2009), Dr. Magnus Lie Hetland (co-advisor): "Throughput Computing on Future GPUs"
4) Henrik Hesland (Masters thesis, June 2009), Thorvald Natvig (co-advisor): "GPU-Enabled Interactive Pore Detection for 3D Rock Visualization"
5) Eirik Ola Aksnes (Masters thesis, July 2009), Ståle Fjeldstand & Atle Rudshaug, Numerical Rocks (co-advisors): "Simulation of Fluid Flow Through Porous Rocks on Modern GPUs" (ParCo 2009)
6) Daniel Haugen (Masters thesis, July 2009), Tore Fevang, Schlumberger (co-advisor): "Seismic Data Compression and GPU Memory Latency"
7) Åsmund Herikstad (Masters thesis, July 2009), Svein-Erik Måsøy, MedTek, NTNU (co-advisor): "Parallel Techniques for Estimation and Correction of Aberration in Medical Ultrasound Imaging"
8) Owe Johansen (Masters thesis, July 2009), John Hybertsen & Jon André Haugen, Statoil (co-advisors): "Seismic Shot Processing on GPU"
9) Daniele Giuseppe Spampinato (Masters thesis, July 2009; currently PhD student @ ETH): "Modeling Communication on Multi-GPU Systems" (ParCo 2009)

20 HPC-Lab History (contin.): 2010:
- NVIDIA Fermi-based cards (470, C2050, C2070 (fall))
- More on OpenCL
- Ahmed A. Aqwari (Masters thesis, June 2010): Effects of Compression on Data Intensive Algorithms
- Aleksander Gjermundsen (Masters thesis, July 2010): Audio Processing on GPU
- Andreas Hysing (Masters thesis, Aug 2010): Parallel Inversion code (w/ Statoil)
- Øystein Krog (Masters thesis, June 2010): GPU-based Real-Time Snow Avalanche Simulations
- Holger Ludvigsen (Masters thesis, June 2010), Dr. Frank Lindseth (co-advisor): Real-Time GPU-Based 3D Ultrasound Reconstruction and Visualization
- Thorvald Natvig (PhD Dec 2010): Automatic Run-Time Communication and I/O

21 HPC-Lab Masters Theses Spring 2011
- Fredrik Fossum, MTech 2011: Real-Time Rigid Body Interactions (on GPU)
- Yngve S. Lindal, MTech 2011, MS Proj @ CERN w/ Sverre Jarp (CTO, CERN), co-advisor: Optimizing a High-Energy Physics (HEP) Toolkit on Heterogeneous Architectures
- Bent Ove Stinessen, MTech 2011, Dr. Alf Birger Rustad (Statoil Research, co-advisor): Profiling, Optimization and Parallelization of Seismic Inversion Code
- Jarle Steinsland, MTech 2011: Auto-tunable GPU BLAS
- Thor Kristian Valderhaug, MTech 2011: The Lattice Boltzmann Simulation on Multi-GPU Systems
- Erik Smistad, Integrated MTech/PhD, main advisor: Frank Lindseth (Elster is co-advisor): Medical imaging on GPUs
- Hallgeir Lien (Master fall project), co-supervised with Dr. Jo Skjermo, Vegvesenet: Road Generation Using A* Algorithm (continued this fall by another student)

22 Master students that finished Summer 2012:
- Kjetil Babington, MTech 2012: Terrain Rendering Techniques for the HPC-Lab Snow Simulator
- Thomas Falch, MTech (Elster main advisor; Dag Breiby (Physics) co-advisor): 3D Visualization of X-ray Diffraction Data
- Geir Josten Lien, MScience: Auto-tunable GPU BLAS
- Jan Rovde: Real-Time Granular Flow Simulations Using the PCISPH Method on GPGPU Devices using CUDA
- Frederik MJ Vestre: Enhancing and Porting the HPC-Lab Snow Simulator to OpenCL on Mobile Platforms
- Johannes Kvam, MTech Cybernetics (Elster is co-advisor; main advisor: Prof. Angelsen): Medical Image Processing w/ GPUs
Supervised ca. 50 master students, of which ca. 25 on GPU topics

23 Anne C. Elster, Lab Director
PhD Students:
- Rune E. Jensen
- Erik Smistad (Elster co-advisor; Lindseth main advisor)
- Johannes Kvam (Elster is co-advisor; main advisor: Prof. Angelsen)
- Thomas Falch
- Mehdi Bozorgi (Elster co-advisor; Lindseth main advisor)
- Ivar Ursin Nikolaisen (co-advisor: Alf B. Rustad, Statoil)
- Lane Holloway (Univ. of TX Austin, USA; Elster de facto co-advisor/committee member; Don Fussell, UT Comp. Sci., main advisor)
- Samira Pakdel
- + NN & NN
Post Doc?
Master Students: Lars Melhus, Henrik Knutsen, Lars Espen Nordhus, Stian Pedersen, Magnus Mikalsen, Andreas Skomedal, Andreas Nordahl + Lars Martin Petersen & Elisabeth Solheim
Recent PhDs: Jan Christian Meyer (PhD 2012)
Selected Affiliates/Visitors:
- Drs. Frank Lindseth (SINTEF Med Tech) & Prof. Bjørn Angelsen, NTNU Med School, Dept. of Circulation & Medical Imaging
- Ruben Spaans (PhD applicant)
- Grant Strong (PhD stud., Canada)
- Miguel Amor (PhD stud., Spain)
- Thorvald Natvig (PhD 2010)
GTC S3061, March 2013

25 3D Ultrasound Reconstruction (w/ Dr. Frank Lindseth (SINTEF MedTek and NTNU) and MS students Holger Ludvigsen (CUDA) and Thor K. Valderhaug (OpenCL on AMD))

26 Ultrasound 3D Reconstruction Challenges:
- Calculate 64 million voxels from ca. 400 B-scans
- Used during surgery, so real-time reconstruction is very important
- Keep costs down

27 Ultrasound 3D Reconstruction Solution: GPU acceleration!
VNN Algorithm:
- Fill plane points
- Transform plane points
- Fill plane equation
- For each voxel: find closest plane; project into plane; find 2D coord of projection on plane; fill voxel
Achieved reconstruction time of 1.29 sec vs. 29.61 sec on CPU!
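The per-voxel part of the VNN steps listed on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the lab's CUDA/OpenCL implementation: VNN is read here as the nearest-neighbor scheme the steps describe, each B-scan is assumed to be described by a corner point and two orthonormal in-plane axes with one pixel per unit length, and the names `make_plane`, `vnn_voxel` and `reconstruct` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def make_plane(origin, u, v, image):
    """'Fill plane equation': store corner, in-plane unit axes and normal.

    ASSUMPTION: u and v are orthonormal and one pixel spans one unit.
    image is a list of rows; image[row][col] is a pixel intensity.
    """
    return {"o": origin, "u": u, "v": v, "n": cross(u, v), "img": image}

def vnn_voxel(p, planes):
    """Value of the voxel centred at p, taken from the nearest B-scan."""
    # Find closest plane: smallest absolute point-to-plane distance.
    best = min(planes, key=lambda pl: abs(dot(sub(p, pl["o"]), pl["n"])))
    rel = sub(p, best["o"])
    # Project into plane and find 2D coordinates: the components of rel
    # along u and v are the coordinates of the orthogonal projection.
    col = round(dot(rel, best["u"]))
    row = round(dot(rel, best["v"]))
    img = best["img"]
    # Fill voxel (0 if the projection falls outside the scan).
    if 0 <= row < len(img) and 0 <= col < len(img[0]):
        return img[row][col]
    return 0

def reconstruct(dims, planes):
    """Every voxel is independent, which is what makes a GPU mapping natural."""
    nx, ny, nz = dims
    return [[[vnn_voxel((x, y, z), planes) for x in range(nx)]
             for y in range(ny)] for z in range(nz)]

# Two tiny axis-aligned "B-scans" at z = 0 and z = 3 (made-up pixel values).
scan_a = make_plane((0, 0, 0), (1, 0, 0), (0, 1, 0), [[1, 2], [3, 4]])
scan_b = make_plane((0, 0, 3), (1, 0, 0), (0, 1, 0), [[5, 6], [7, 8]])
volume = reconstruct((2, 2, 4), [scan_a, scan_b])  # indexed volume[z][y][x]
print(volume[1][1][0], volume[2][0][1])
```

On a GPU the body of `vnn_voxel` would typically become one thread per voxel, with the loop over planes as the dominant cost.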

28 Real-time Ultrasound 3D reconstruction multiple views

30 3D Surface Extraction (w/ Dr. Frank Lindseth (SINTEF MedTek and NTNU) and PhD student Erik Smistad)

31 3D Surface Extraction on GPUs
- Use the Marching Cubes algorithm for extracting a 3D surface from a set of sampled scalars
- Algorithm used extensively for visualizing and analyzing medical data (X-ray, MR) and the result of 3D segmentation
- Completely data parallel
- Challenge: How to store the result of each cube in parallel on GPU

32 3D Surface Extraction -- Histogram data
Challenge: How to store the result of each cube in parallel on GPU? In a serial implementation this is simple: just use a stack and add the vertex data to the stack.
GPU Solution: Histogram Pyramids [1]. A data structure that:
- Filters out cubes that have no triangles (stream reduction)
- Returns total sum of triangles
- Provides each cube with an index for memory storage
- Can be efficiently used by means of textures, yielding large speed-ups
[1] G. Ziegler et al.: On-the-fly Point Clouds through Histogram Pyramids; Vision, Modeling, and Visualization 2006

33 3D Surface Extraction -- Histogram Pyramids: Construction & Traversal HP Construction HP Traversal
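Construction and traversal of a histogram pyramid can be illustrated with a deliberately simplified sketch. The version below is one-dimensional with pairwise reduction, which is an assumption made for brevity (GPU implementations reduce 2D or 3D textures), and the function names are hypothetical. It shows the three properties listed for the data structure: empty cells are skipped, the top of the pyramid holds the total, and every output element gets a unique storage index.

```python
def build_pyramid(counts):
    """Construction: sum neighbouring pairs until one value remains.

    counts[i] is the number of output elements (e.g. triangles) that
    cell i produces. ASSUMPTION: len(counts) is a power of two.
    Returns all levels, base first; levels[-1][0] is the total.
    """
    levels = [list(counts)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels

def traverse(levels, k):
    """Traversal: find which base cell produces output element number k.

    Walks from the top down. At each level, compare k with the count in
    the left child; go left if k falls inside it, otherwise subtract that
    count and go right. Returns (cell index, local index within the cell).
    """
    idx = 0
    for level in reversed(levels[:-1]):
        idx *= 2
        left = level[idx]
        if k >= left:
            k -= left
            idx += 1
    return idx, k

counts = [0, 2, 0, 0, 1, 0, 3, 0]      # made-up per-cell triangle counts
levels = build_pyramid(counts)
total = levels[-1][0]
# Each output slot can be resolved independently (one GPU thread per slot).
slots = [traverse(levels, k) for k in range(total)]
print(total, slots)
```

Because each call to `traverse` depends only on the read-only pyramid, all output slots can be filled in parallel without a shared stack or atomic counter.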

34 3D Surface Extraction -- Results: HPMC (Dyken et al.) vs. our OpenCL implementation

HPMC (Dyken et al.):
Size     Exec. time   FPS (avg)   Memory
512^3    3324 ms      0.3         490 MB
256^3    5 ms         223         122 MB
128^3    3 ms         394         44 MB
64^3     2 ms         519         22 MB

Our OpenCL implementation:
Size     Exec. time   FPS (avg)   Memory
512^3    34 ms        0.3         121 MB
256^3    10 ms        105         40 MB
128^3    4 ms         233         26 MB
64^3     3 ms         319         22 MB

Our test system: Intel i5 750, 4 GB RAM, ATI Radeon 5870 (1 GB RAM), AMD Catalyst 11.2 graphics driver, APP SDK 2.3 w/ OpenCL 1.1
Note: OpenCL-OpenGL sync measured to be 2-20 ms, i.e. 70-90% for smallest datasets

36 Outline Motivation and brief intro to HPC-Lab at NTNU 3D Ultrasound Reconstruction 3D Surface Extraction GPU-Based Airway Segmentation and Centerline Extraction for Image Guided Bronchoscopy Related & Current projects at HPC-Lab at NTNU Summary

37 GPU-Based Airway Tree Segmentation and Centerline Extraction. Erik Smistad, PhD Student. http://commons.wikimedia.org/w/index.php?title=file%3aright_bronchial_tree.ogg

39 GPU Accelerated Segmentation and Centerline Extraction of Tubular Structures from Medical Images

Dataset     GPU Runtime   CPU Runtime
Patient 1   46 secs       12 min 52 secs
Patient 2   49 secs       14 min 43 secs
Patient 3   49 secs       10 min 44 secs
Patient 4   45 secs       14 min 4 secs
Patient 5   33 secs       10 min 5 secs
Patient 6   60 secs       17 min 25 secs

NVIDIA Tesla C2070 GPU vs. one Intel i7 720 CPU with 4 cores.
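The runtimes above translate directly into per-patient speedups. The short script below only does that arithmetic on the numbers from the slide; the variable names are illustrative.

```python
# (GPU seconds, CPU minutes, CPU seconds) per patient, copied from the slide.
runtimes = {
    "Patient 1": (46, 12, 52),
    "Patient 2": (49, 14, 43),
    "Patient 3": (49, 10, 44),
    "Patient 4": (45, 14, 4),
    "Patient 5": (33, 10, 5),
    "Patient 6": (60, 17, 25),
}

speedups = {}
for name, (gpu_s, cpu_min, cpu_s) in runtimes.items():
    cpu_total = cpu_min * 60 + cpu_s          # CPU runtime in seconds
    speedups[name] = cpu_total / gpu_s        # how many times faster the GPU run is
    print(f"{name}: {cpu_total:4d} s on CPU / {gpu_s} s on GPU = {speedups[name]:.1f}x")
```

For these six datasets the GPU runs come out roughly 13 to 19 times faster than the CPU runs.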

41 Surf tech and drug delivery. Bjørn Angelsen, Johannes Kvam ++ Advanced ultrasound signal processing techniques. Complex calculations requiring real-time capabilities: introduction of multiple GPGPUs in scanners for computational horsepower

42 Seismic Filtering -- motivation for compression: In our previous work on seismic filtering, transfer time was originally 2% of overall time. After off-loading filtering to the GPU, transfer time is now 90% of overall time! Seismic filtering: 1) transfer data, 2) actual filtering
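The shift from 2% to 90% transfer time has a simple arithmetic reading. The sketch below assumes that the absolute transfer time stayed the same and only the filtering step got faster; that assumption is not stated on the slide, and `offload_effect` is a hypothetical helper.

```python
def offload_effect(transfer_frac_before, transfer_frac_after):
    """Infer speedups from the transfer-time share before and after offload.

    ASSUMPTION: the transfer itself takes the same absolute time in both
    cases. Total time before offload is normalised to 1.0.
    Returns (speedup of the filtering step, overall speedup).
    """
    transfer = transfer_frac_before                # e.g. 0.02 of the old total
    compute_before = 1.0 - transfer
    total_after = transfer / transfer_frac_after   # transfer is now 90% of this
    compute_after = total_after - transfer
    return compute_before / compute_after, 1.0 / total_after


filter_speedup, overall_speedup = offload_effect(0.02, 0.90)
print(f"filtering step: ~{filter_speedup:.0f}x faster")
print(f"end to end    : ~{overall_speedup:.0f}x faster")

# With transfer at 90% of the new total, even infinitely fast filtering
# could only remove the remaining 10%.
print(f"remaining headroom from compute alone: {1 / 0.90:.2f}x")
```

Under this assumption any further gain has to come from shrinking the transfer itself, which is the motivation for compression.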

43 Motivation
- Locality & I/O challenge for data intensive algorithms
- Look at techniques for reducing memory bandwidth. Hardware: HDD, SSD. Compression: JPEG, MPEG, MP3...
- Explore GPU compression capabilities
- Seismic filtering process
- Transform coding works well for signal data*
* [H.S.Malvar 1992], [L.C.Duval 2000], [C.Larsen 2006], [D.Haugen 2009]

44 Seismic Data 3D A collection of floats SGY format Traces Statistical variance Constructed datasets for testing

45 Results GPU acceleration

46 Results I/O Speedup

47 Visual Results

48 Results - compression
When optimizing for I/O, we need an efficient compression rate AND a fast compression algorithm.
Compression can give up to:
- 6.2x I/O speedup on HDD (70 MB/s)
- 3.9x I/O speedup on SSD (140 MB/s)
Achieved through:
- Transform coding
- CPU & GPU co-op
- Asynch I/O
Predictive model accurate within 5%
Seismic compression library
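Why a slower disk benefits more from compression can be seen in a toy throughput model. This is not the predictive model mentioned on the slide, whose form is not given; it is a simplified sketch that assumes reading and decompression overlap perfectly (as asynchronous I/O would allow), and the compression ratio and decompression throughput below are made-up illustrative values.

```python
def io_speedup(disk_mb_s, ratio, decomp_mb_s):
    """Speedup of loading compressed data versus loading raw data.

    disk_mb_s   : sequential read bandwidth of the storage device
    ratio       : compression ratio (raw size / compressed size)
    decomp_mb_s : decompression throughput, in MB of raw output per second
    ASSUMPTION: read and decompress are fully pipelined, so the slower of
    the two stages sets the pace. All times are per MB of raw data.
    """
    raw_time = 1.0 / disk_mb_s
    read_time = 1.0 / (ratio * disk_mb_s)   # fewer bytes to read
    decomp_time = 1.0 / decomp_mb_s
    return raw_time / max(read_time, decomp_time)


# Disk bandwidths are from the slide; ratio and decompression throughput
# are HYPOTHETICAL values chosen only to illustrate the trend.
RATIO, DECOMP = 6.0, 500.0
hdd = io_speedup(70.0, RATIO, DECOMP)
ssd = io_speedup(140.0, RATIO, DECOMP)
print(f"HDD (70 MB/s):  {hdd:.2f}x")
print(f"SSD (140 MB/s): {ssd:.2f}x")
```

In this model the HDD case is limited by the disk, so the speedup equals the compression ratio, while the SSD case is limited by decompression throughput. That is why both an efficient compression rate and a fast algorithm matter.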

49 3D Physics Viz. Thomas Falch, PhD student. MTech thesis (Elster main advisor; Dag Breiby (Physics) co-advisor): 3D Visualization of X-ray Diffraction Data

50 Heterogeneous Framework for Medical Image Processing and Visualization

52 Heterogeneous Framework for Medical Image Processing and Visualization
Challenges:
- Portable: both code and performance
- Scheduling/distributing work to devices
- Reducing memory transfer overhead
- Programmability: make it easy to use for non-experts; allow experts to do hand tuning/optimization
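As one small illustration of the scheduling challenge, work can be split statically across devices in proportion to their measured throughput. This is a generic sketch, not the framework's scheduler; the function name, device labels and throughput numbers are all hypothetical.

```python
def split_work(n_items, throughputs):
    """Divide n_items among devices in proportion to their throughput.

    throughputs maps a device label to items processed per second
    (HYPOTHETICAL values; a real framework would measure them).
    Rounding leftovers go to the fastest device so nothing is dropped.
    """
    total = sum(throughputs.values())
    shares = {dev: int(n_items * t / total) for dev, t in throughputs.items()}
    leftover = n_items - sum(shares.values())
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += leftover
    return shares


# Example: 10 image slices, a GPU assumed to be twice as fast as the CPU.
plan = split_work(10, {"cpu": 1.0, "gpu": 2.0})
print(plan)
```

A static split like this ignores memory transfer overhead and load imbalance, which is why the slide lists those as separate challenges.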

53 TACC/Univ. of Texas at Austin's Stampede http://www.tacc.utexas.edu/stampede

54 Current Related EU Activity: EU COST Action IC0805: "Open European Network for High Performance Computing on Complex Environments" (2009-2013), www.complexhpc.org
