GPU Point List Generation through Histogram Pyramids




VMV 2006, GPU Programming. GPU Point List Generation through Histogram Pyramids. Gernot Ziegler, Art Tevs, Christian Theobalt, Hans-Peter Seidel

Agenda: Overall task; Problems; Solution principle; Algorithm: Discriminator; Algorithm: HistoPyramid Builder; Algorithm: PointList Builder; Applications; Future Directions, Conclusion. (Extra: Feature Detection, Geometry Generation, 3D Volume, QuadTree)

Overall task: Decompose a 3D model into a point cloud. Vertices don't suffice, because a minimum sampling density is required. Make the GPU render the 3D model into 3D volume slices. Problem: this yields 256^3 points, many of them unfilled, so the active ones must be found. This calls for a new real-time algorithm. (Pipeline: 3D model, volume conversion, on-GPU point cloud.)

Side problem: 3D to 2D mapping. NVIDIA GPUs cannot render into 3D volume textures. Solution: create a 2D mapping scheme for the 3D volume. (Figure: 3D volume lattice and its 2D mapping.) Side effect: the following issues become a 2D problem.
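A 2D mapping scheme of this kind can be sketched as follows. The slides do not spell out the layout, so this sketch assumes a common choice: each z slice is stored as a square tile, and tiles are placed row by row in one large 2D texture. The function names and the `tiles_per_row` parameter are hypothetical.

```python
def voxel_to_texel(x, y, z, size, tiles_per_row):
    """Map voxel (x, y, z) of a size^3 volume to a texel of a tiled 2D texture.

    Assumed layout (not confirmed by the slides): slice z is a size x size
    tile; tiles are placed left to right, then top to bottom.
    """
    tile_x = z % tiles_per_row
    tile_y = z // tiles_per_row
    return tile_x * size + x, tile_y * size + y


def texel_to_voxel(u, v, size, tiles_per_row):
    """Inverse mapping: recover the voxel coordinate from texel (u, v)."""
    tile_x, x = divmod(u, size)
    tile_y, y = divmod(v, size)
    return x, y, tile_y * tiles_per_row + tile_x


# A 256^3 volume stored as 16 x 16 tiles of 256 x 256 texels each.
u, v = voxel_to_texel(10, 20, 37, size=256, tiles_per_row=16)
print((u, v), texel_to_voxel(u, v, size=256, tiles_per_row=16))
```

The inverse function is the part that matters later on: a point list built over the 2D texture has to be converted back into 3D coordinates.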

Current problem: The 3D volume is on the GPU, and the active voxels must be found there. This requires a dynamically growing list of point coordinates. But bus transfers are expensive. (Diagram labels: CPU, 3D model, Volume slicer, GPU, Discriminator/List generator, Volume data (dozens of megabytes), Point list (kilobytes), Point cloud renderer.)

Our solution: A 2D/3D algorithm which runs on Shader Model 3.0 (SM 2.0 is possible, but cumbersome). Treat it as a data compaction problem (cell = pixel/voxel): empty cells are useless air, and only the interesting cells remain. Data compaction on stream processors is an active area of research (Horn et al., GPU Gems 2; Sengupta et al., Edge Workshop 2006). (Diagram labels: CPU, 3D model, Volume slicer, GPU, Discriminator/List generator, Volume data (dozens of megabytes), Point list (kilobytes), Point cloud renderer.)

Overview, data compaction

Algorithm: Discriminator. Tells active cells from inactive/empty ones. Easy criterion in our case: a test on the cell's alpha value. The data input is turned into the base level of the HistoPyramid. More complicated discriminators are described in the paper.
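A minimal CPU sketch of such a discriminator is shown below. The exact alpha criterion is not recoverable from this transcript, so the sketch assumes that a cell is active when its alpha exceeds a threshold; the function name and the `threshold` parameter are hypothetical.

```python
def discriminate(alpha, threshold=0.0):
    """Return the HistoPyramid base level: 1 for active cells, 0 for empty ones.

    Assumption: a cell counts as active when its alpha exceeds `threshold`.
    """
    return [[1 if a > threshold else 0 for a in row] for row in alpha]


alpha = [
    [0.0, 0.5, 0.0, 1.0],
    [0.0, 1.0, 0.25, 0.0],
]
base = discriminate(alpha)
print(base)
```

On the GPU this would be one fragment shader pass over the input texture; any per-cell predicate can take the place of the alpha test.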

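Algorithm: HistoPyramid Builder. Hierarchically sums up the active element count. Similar to mip-mapping, but without averaging: the cell contents are summed (a reduction operation). Example: L0 (base, 4x4) is reduced to L1 (2x2), whose four cells hold the partial sums 3, 2, 2 and 1; L1 is reduced to L2 (top, 1x1), which holds 8, the total count of active cells.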
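The reduction described above can be sketched on the CPU as follows. This is an illustrative sketch, not the authors' shader code; it assumes a square base level whose side is a power of two, and the example base is a hypothetical arrangement chosen so that the partial sums add up to the total of 8 given above.

```python
def build_histopyramid(base):
    """Return [L0, L1, ..., top]; each level sums 2x2 blocks of the level below.

    Assumes a square base whose side length is a power of two.
    """
    levels = [base]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
             + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]
             for x in range(half)]
            for y in range(half)
        ])
    return levels


# Hypothetical 4x4 base level with 8 active cells.
base = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
]
levels = build_histopyramid(base)
print(levels[1])  # partial sums per 2x2 block
print(levels[2])  # total count of active cells
```

Unlike mip-mapping, no division by four takes place, so the single top cell ends up holding the exact number of active base cells.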

Algorithm: HistoPyramid Builder's output. The top level holds the total count of cells! (Figure: HistoPyramid for the teapot.)

Algorithm: PointList Builder. Traverses the HistoPyramid top-down to generate the PointList (a 2D texture whose size is based on the total count of cells). Input: key indices 0 to 7 (total count: 8). Output: a 2D point list with one (x, y) coordinate per key index. (Figure: example traversal for key index 4, showing the traversal path through the HistoPyramid from the top level L2 with value 8, through L1, down to the base level L0.)
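The top-down traversal can be sketched as follows. The slides do not state in which order the four child cells are visited, so this sketch assumes the order top-left, top-right, bottom-left, bottom-right; the example base level and all function names are hypothetical.

```python
def build_histopyramid(base):
    """Sum 2x2 blocks repeatedly until a single top cell remains."""
    levels = [base]
    while len(levels[-1]) > 1:
        p, h = levels[-1], len(levels[-1]) // 2
        levels.append([[p[2*y][2*x] + p[2*y][2*x+1] + p[2*y+1][2*x] + p[2*y+1][2*x+1]
                        for x in range(h)] for y in range(h)])
    return levels


def build_pointlist(levels):
    """For every key index, walk from the top level down to one active base cell."""
    points = []
    for key in range(levels[-1][0][0]):
        x = y = 0
        k = key
        for level in reversed(levels[:-1]):   # from the level below the top to the base
            x, y = 2 * x, 2 * y
            # Assumed child order: top-left, top-right, bottom-left, bottom-right.
            for dx, dy in ((0, 0), (1, 0), (0, 1), (1, 1)):
                count = level[y + dy][x + dx]
                if k < count:                 # the key falls into this child
                    x, y = x + dx, y + dy
                    break
                k -= count                    # skip the cells counted by this child
        points.append((x, y))
    return points


base = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
]
points = build_pointlist(build_histopyramid(base))
print(points)
```

Each key index is processed independently of all others, which is what lets every output texel run its own traversal in parallel on the GPU.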

Algorithm: PointList Builder's output. A 2D texture listing the active cells' 3D point coordinates (the 2D to 3D mapping happens in the PointList Builder). (Figure: PointList for the teapot; false colors encode 3D positions.) Done! (Figure: example image from the point cloud renderer.)

Results

Result: Real-time particle cloud (Art Tevs)

Result: 3D Volume Analysis (Vector Field Contours, Tom Annen et al)

Future Directions: Feature detection on the GPU. Geometry generation (SM 3.0 marching cubes). Real-time compression (e.g. with wavelet analysis). Conversion to SM 4.0, and of course a showdown with SM 4.0 geometry shaders ;) Other uses for the HistoPyramid (complete: region quadtrees/octrees on the GPU). Your ideas? Discussions welcome!

Thank you!

HistoPyramids vs. Geometry Shaders: SM 4.0 GPUs now have one option to create lists: geometry shaders, which are used to delete and create geometry on the GPU. It is still unclear whether HistoPyramids become obsolete. Option 1: Route the whole data input through the geometry shader and let it weed out the empty cells. But would a vertex shader be applied to millions of points? Option 2: Let the geometry shader traverse the data input and generate geometry for all found cells. But can a geometry shader create more than 24 points?

Problems in GPU feature detection: Image analysis: the GPU can convolve images. Feature point isolation (e.g. thresholding) requires a dynamically growing list of point coordinates. The GPU as a stream processor cannot generate this list, so the image must be downloaded to the CPU. Bus transfers are expensive, hence there is a speed advantage only for complex convolutions. (Diagram labels: CPU, Input image, Image convolver, GPU, Discriminator/List generator, Convolved image (megabytes), Feature list (kilobytes).)

Speeding up GPU feature detection: We introduce a 2D/3D solution which runs on Shader Model 3.0. Make it a data compaction problem (cell = pixel/voxel): uninteresting cells are useless air, and only the interesting cells (feature points) remain. Data compaction on stream processors is an active area of research (Horn, GPU Gems 2). (Diagram labels: CPU, Input image, Feature list (kilobytes), GPU, Image convolver, Convolved image, Discriminator/PointList Builder.)

Sketch: 3D HistoPyramid. Each pyramid level is a 3D volume. As in mipmapped 3D volumes, the volumes are halved in size along each axis for every level, so 8 cells get reduced into one at every step. The PointList Builder accesses the 3D textures one by one. Implementation obstacle: 3D volume mipmaps cannot easily be generated on the NV40 architecture without internal copying (missing render-to-3D-texture support). The alternative, laying out several 3D volumes in one 2D texture, is very cumbersome and hurts cache behaviour.
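The 8-to-1 reduction can be sketched on the CPU as follows. This is an illustrative sketch that ignores the texture layout obstacle mentioned above; it assumes a cubic volume with a power-of-two side, indexed as `vol[z][y][x]`, and the function names are hypothetical.

```python
def reduce_volume(vol):
    """Sum each 2x2x2 block of a cubic volume (indexed vol[z][y][x]) into one cell."""
    half = len(vol) // 2
    return [[[sum(vol[2 * z + dz][2 * y + dy][2 * x + dx]
                  for dz in (0, 1) for dy in (0, 1) for dx in (0, 1))
              for x in range(half)]
             for y in range(half)]
            for z in range(half)]


def build_histopyramid_3d(base):
    """Return [L0, L1, ..., top] for a cubic base volume."""
    levels = [base]
    while len(levels[-1]) > 1:
        levels.append(reduce_volume(levels[-1]))
    return levels


# Hypothetical 4^3 volume whose active cells lie on the main diagonal.
n = 4
base = [[[1 if x == y == z else 0 for x in range(n)]
         for y in range(n)] for z in range(n)]
levels = build_histopyramid_3d(base)
print(levels[-1])  # total count of active voxels
```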

Sketch: Geometry generation. At base level, the discriminator calculates n(x,y), the number of intended vertices at each cell (x,y), from neighbor information (think: Marching Cubes from a 3D volume). The HistoPyramid then sums up the count of all intended vertices. (Figure: intended line geometry; the discriminator's vertex creation template; example values of n(x,y) in the first cells of the HistoPyramid base level.)

Sketch: Geometry generation. The GeometryBuilder, derived from the PointList Builder, will end up at cell (x,y) n(x,y) times in parallel during top-down traversal. But it knows this (as it knows the initial index for cell (x,y)), and can thus create the intended vertex for each of these n(x,y) visits from a lookup table. (Figure: the created HistoPyramid; the GeometryBuilder's vertex output list; the GeometryBuilder's vertex creation order, with vertices labelled a to h; the above list rendered as geometry.)
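The traversal with per-cell vertex counts can be sketched as follows. It is the same descent as in the PointList Builder, except that the index left over at the base cell says which of that cell's vertices to create. The child visiting order, the example counts and the function names are assumptions of this sketch, and the lookup table itself is left out.

```python
def build_histopyramid(base):
    """Sum 2x2 blocks repeatedly until a single top cell remains."""
    levels = [base]
    while len(levels[-1]) > 1:
        p, h = levels[-1], len(levels[-1]) // 2
        levels.append([[p[2*y][2*x] + p[2*y][2*x+1] + p[2*y+1][2*x] + p[2*y+1][2*x+1]
                        for x in range(h)] for y in range(h)])
    return levels


def build_vertex_list(levels):
    """Return one (x, y, local_index) triple per intended vertex."""
    out = []
    for key in range(levels[-1][0][0]):
        x = y = 0
        k = key
        for level in reversed(levels[:-1]):
            x, y = 2 * x, 2 * y
            for dx, dy in ((0, 0), (1, 0), (0, 1), (1, 1)):  # assumed child order
                count = level[y + dy][x + dx]
                if k < count:
                    x, y = x + dx, y + dy
                    break
                k -= count
        # k is now the local vertex index within cell (x, y); a real
        # GeometryBuilder would use it to pick a vertex from a lookup table.
        out.append((x, y, k))
    return out


# Hypothetical base level: n(x,y) vertices intended per cell.
counts = [
    [2, 0],
    [1, 3],
]
vertices = build_vertex_list(build_histopyramid(counts))
print(vertices)
```

Cell (1,1) is reached three times, with local indices 0, 1 and 2, which matches the idea that the builder arrives at a cell once per intended vertex.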

Real-time QuadTree analysis: Vision task: detect regions with common features. Common feature: low variance around the average color. This is usually very time-consuming, by far not real-time. Our answer: an extension of the GPU point list algorithm that only creates simple region geometry (region quadtrees). Typical analysis time: 64x48 in 5 ms. Useful for color grouping, motion vector clustering and acceleration of grid calculations.
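To illustrate what a region quadtree with a variance criterion produces, here is a plain top-down CPU recursion. This is not the GPU method from the slides, which extends the point list algorithm; it only shows the kind of output. It assumes a square grayscale image with a power-of-two side instead of color, and the `max_var` threshold and function names are hypothetical.

```python
def variance(img, x, y, size):
    """Variance of the pixel values in the size x size block at (x, y)."""
    vals = [img[j][i] for j in range(y, y + size) for i in range(x, x + size)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def quadtree_regions(img, x=0, y=0, size=None, max_var=1.0):
    """Split blocks until their variance is at most max_var; return (x, y, size) leaves."""
    if size is None:
        size = len(img)
    if size == 1 or variance(img, x, y, size) <= max_var:
        return [(x, y, size)]
    half = size // 2
    regions = []
    for dy in (0, half):
        for dx in (0, half):
            regions += quadtree_regions(img, x + dx, y + dy, half, max_var)
    return regions


img = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 10, 90],
    [10, 10, 10, 10],
]
regions = quadtree_regions(img)
print(regions)
```

Three of the four quadrants are uniform and stay whole, while the bottom-right quadrant contains an outlier and is split into single pixels.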

Results: ffkvadrat, real-time quadtree analysis (footage by GusGus, Call of The Wild). (Figure: video frame, edge analysis, several results.)