GPU Data Structures. Aaron Lefohn, Neoptica

Transcription:


Introduction
- Previous talk: GPU memory model
- This talk: GPU data structures

Properties of GPU Data Structures
- To be efficient, a GPU data structure must support:
  - Parallel read
  - Parallel write
  - Parallel iteration
- Generalized arrays fit these requirements:
  - Dense (complete) arrays
  - Sparse (incomplete) arrays
  - Adaptive arrays

Address Translation
- Design pattern for generalized arrays: Virtual Address -> Address Translator -> Physical Address
- Example:
  - CPU: 3D virtual memory, 1D physical memory
  - GPU: 3D virtual memory, 2D physical memory
- The address translator is the core of the data structure
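The address-translation pattern can be sketched on the CPU as a container that owns only physical memory and delegates every access to a pluggable translator. This is a minimal illustration, not code from the talk: the names `TranslatedArray` and `make_3d_to_1d` are hypothetical, and the row-major 3D-to-1D layout is an assumption chosen to mirror the CPU example on the slide.

```python
class TranslatedArray:
    """Generalized array: virtual addresses go through a translator
    before touching physical memory. (Hypothetical name.)"""

    def __init__(self, translate, physical_size, fill=0.0):
        self.translate = translate            # virtual address -> physical index
        self.memory = [fill] * physical_size  # 1D physical memory

    def __getitem__(self, vaddr):
        return self.memory[self.translate(vaddr)]

    def __setitem__(self, vaddr, value):
        self.memory[self.translate(vaddr)] = value


def make_3d_to_1d(width, height, depth):
    """Build a translator from (x, y, z) to a row-major 1D index (assumed layout)."""
    def translate(vaddr):
        x, y, z = vaddr
        if not (0 <= x < width and 0 <= y < height and 0 <= z < depth):
            raise IndexError("virtual address out of range")
        return (z * height + y) * width + x
    return translate


grid = TranslatedArray(make_3d_to_1d(4, 5, 6), 4 * 5 * 6)
grid[1, 2, 3] = 9.0
```

Swapping in a different translator changes the layout without touching the code that reads and writes the array, which is the sense in which the translator is the core of the structure.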

Dense (Complete) GPU Arrays: Large 1D Arrays
- Current GPUs limit 1D array sizes to 2048 or 4096 elements
- Pack into 2D memory
- 1D-to-2D address translation
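A sketch of the 1D-to-2D packing, assuming a simple row-by-row layout. The helper names and the default limit of 4096 are illustrative; the limit is taken from the figures quoted on the slide and varies by hardware.

```python
def texture_size_for(n, max_dim=4096):
    """Smallest (width, height) 2D buffer that holds n >= 1 elements, row by row."""
    if n < 1:
        raise ValueError("n must be at least 1")
    width = min(n, max_dim)
    height = -(-n // width)  # ceiling division
    if height > max_dim:
        raise ValueError("array too large for a single 2D buffer")
    return width, height


def addr_1d_to_2d(i, width):
    """Translate a 1D virtual index into a 2D (x, y) physical address."""
    return i % width, i // width


def addr_2d_to_1d(x, y, width):
    """Inverse translation, as an output pass would use to recover the virtual index."""
    return y * width + x
```

For example, a 10,000-element array fits in a 4096 x 3 buffer, and element 5000 lands at (904, 1).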

GPU Arrays: 2D Arrays
- Trivial, native implementation

GPU Arrays: 3D Arrays
- Problem: GPUs do not have 3D frame buffers
- Solutions:
  1. Multiple slices per 2D buffer ("flat 3D array")
  2. Stack of 2D slices
  3. Render-to-slice of 3D array

GPU Arrays: 3D Arrays
- DX10 memory model helps
  - Can render to a slice of a 3D texture
- Flat 3D array still has advantages
  - Render entire domain in a single parallel compute pass
  - More parallelism when writing (slower for reading)
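The flat 3D array can be sketched as z-slices tiled side by side in one 2D buffer. The tiling order (slices laid out left to right, then top to bottom) and the function names are assumptions for illustration; the talk does not specify a layout.

```python
def flat3d_to_2d(x, y, z, width, height, tiles_per_row):
    """Map voxel (x, y, z) to a texel in a 2D buffer of tiled z-slices."""
    tile_x = z % tiles_per_row
    tile_y = z // tiles_per_row
    return tile_x * width + x, tile_y * height + y


def flat2d_to_3d(u, v, width, height, tiles_per_row):
    """Inverse mapping: recover the voxel that a texel (u, v) belongs to."""
    tile_x, x = divmod(u, width)
    tile_y, y = divmod(v, height)
    return x, y, tile_y * tiles_per_row + tile_x
```

Because every voxel has a distinct texel in the same buffer, a single quad covering that buffer touches the whole volume in one pass. The cost is the extra divide and modulo on each read, which is consistent with the slide's note that reading is slower.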

GPU Arrays: Higher-Dimensional Arrays
- Pack into 2D buffers
- N-D to 2D address translation
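One way to sketch N-D to 2D translation is to linearize the N-D index with strides and then wrap the linear index into rows of a 2D buffer. The first-index-fastest ordering and the name `nd_to_2d` are assumptions made for this example.

```python
def nd_to_2d(index, shape, buffer_width):
    """Translate an N-D index into an (x, y) address in a 2D buffer.

    The first coordinate varies fastest (assumed convention)."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    linear = 0
    stride = 1
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        linear += i * stride
        stride *= n
    return linear % buffer_width, linear // buffer_width
```

For a 4D array of shape (2, 3, 4, 2) stored in a buffer 8 texels wide, index (1, 2, 3, 1) has linear offset 47 and maps to (7, 5).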

Sparse/Adaptive Data Structures
- Why?
  - Reduce memory pressure
  - Reduce computational workload
- Basic idea
  - Pack active data elements into GPU memory

Page Table Sparse/Adaptive Arrays
- Dynamic sparse/adaptive N-D array (Lefohn et al. 2003/2006)
- Block-based sparse N-D arrays, octrees, quadtrees
- [Figure: Virtual Domain -> Page Table -> Physical Memory]
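A single-level page table for a sparse 2D array might look like the following CPU sketch. It is not the implementation from the cited papers: the class name, the allocate-on-first-write policy, and the use of Python lists in place of textures are all assumptions for illustration.

```python
class PagedSparseArray2D:
    """Sparse 2D array: a dense page table maps virtual pages to packed physical pages."""

    UNMAPPED = -1

    def __init__(self, width, height, page_size, default=0.0):
        self.width, self.height = width, height
        self.page_size = page_size
        self.default = default
        self.pages_x = -(-width // page_size)   # ceiling division
        self.pages_y = -(-height // page_size)
        self.page_table = [self.UNMAPPED] * (self.pages_x * self.pages_y)
        self.physical = []                      # only active pages are stored

    def _split(self, x, y):
        """Split a virtual address into (page table entry, offset within page)."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("virtual address out of range")
        ps = self.page_size
        return (y // ps) * self.pages_x + (x // ps), (y % ps) * ps + (x % ps)

    def read(self, x, y):
        page, offset = self._split(x, y)
        slot = self.page_table[page]
        return self.default if slot == self.UNMAPPED else self.physical[slot][offset]

    def write(self, x, y, value):
        page, offset = self._split(x, y)
        slot = self.page_table[page]
        if slot == self.UNMAPPED:               # allocate a physical page on first write
            slot = len(self.physical)
            self.physical.append([self.default] * (self.page_size ** 2))
            self.page_table[page] = slot
        self.physical[slot][offset] = value


arr = PagedSparseArray2D(1024, 1024, 16)
arr.write(1000, 3, 7.5)
arr.write(1001, 3, 1.0)   # same page, no new allocation
arr.write(0, 500, 2.0)    # second active page
```

Here a 1024 x 1024 virtual domain with two active 16 x 16 pages stores only 512 elements of data plus a 64 x 64 page table.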

Page Table Sparse/Adaptive Arrays
- Hierarchical page table
- [Figure: Virtual Domain, Physical Domain]

Dynamic Adaptive Data Structure
- Photon map (kNN-grid) (Purcell et al. 2003)
- Image from "Implementing Efficient Parallel Data Structures on GPUs", Lefohn et al., GPU Gems II, ch. 33, 2005

GPU Perfect Hash Table
- Static, sparse N-D array (Lefebvre et al. 2006)
- Figure from Lefebvre, Hoppe, "Perfect Spatial Hashing", ACM SIGGRAPH, 2006
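The general idea of a perfect hash with an auxiliary offset table can be sketched as below. This is a heavily simplified 1D illustration under stated assumptions, not the algorithm from the cited paper: keys are plain integers, the slot is computed as `(key mod m + offsets[key mod r]) mod m`, and the offsets are chosen by a naive greedy search that can fail for unlucky table sizes. All names are hypothetical.

```python
def build_perfect_hash(items, m, r):
    """Build a collision-free table of m slots with an offset table of r entries.

    items maps integer keys to values. Returns (table, offsets)."""
    if m < len(items):
        raise ValueError("hash table needs at least one slot per key")
    buckets = {}
    for key in items:
        buckets.setdefault(key % r, []).append(key)
    table = [None] * m
    offsets = [0] * r
    # Place the fullest buckets first; they are the hardest to fit.
    for bucket_id, keys in sorted(buckets.items(), key=lambda kv: -len(kv[1])):
        for offset in range(m):
            slots = [(k % m + offset) % m for k in keys]
            if len(set(slots)) == len(slots) and all(table[s] is None for s in slots):
                for k, s in zip(keys, slots):
                    table[s] = (k, items[k])
                offsets[bucket_id] = offset
                break
        else:
            raise RuntimeError("greedy construction failed; try a different m or r")
    return table, offsets


def lookup(table, offsets, key):
    """Two table reads and no probing; returns None if the key was never stored."""
    m, r = len(table), len(offsets)
    entry = table[(key % m + offsets[key % r]) % m]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


items = {3: "a", 8: "b", 17: "c", 22: "d", 29: "e", 31: "f"}
table, offsets = build_perfect_hash(items, m=7, r=5)
```

The structure is static: construction happens once on the CPU, after which every lookup costs a fixed number of reads, which suits GPU random-access reads.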

GPU Iteration
- GPU is good at random-access reads
  - Often no need to define input iterators
- GPU (mostly) performs only streaming writes
  - Every GPGPU data structure must support an output iterator compatible with GPU rasterization
- GPU iterator defines a contiguous 2D range

GPU Iteration
- Traverse the physical data layout
- Examples:
  - Draw a single quad over all voxels in a flattened volume
  - Draw one quad per page over a page-table-based array
- [Figure: Virtual Domain, Physical Domain]
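The "one quad per page" iterator can be mimicked on the CPU by enumerating the physical rectangle of each active page and visiting every element inside it. The sketch assumes physical pages are tiled row by row in a 2D buffer and that the page table is a dictionary from virtual page coordinates to physical page slots; both choices and all names are illustrative.

```python
def page_quads(page_table, pages_per_row, page_size):
    """Return (virtual_origin, physical_origin) for each active page, in slot order."""
    quads = []
    for (vpx, vpy), slot in sorted(page_table.items(), key=lambda kv: kv[1]):
        virtual_origin = (vpx * page_size, vpy * page_size)
        physical_origin = ((slot % pages_per_row) * page_size,
                           (slot // pages_per_row) * page_size)
        quads.append((virtual_origin, physical_origin))
    return quads


def iterate(page_table, pages_per_row, page_size):
    """Visit every active element, yielding (virtual address, physical address)."""
    for (vx0, vy0), (px0, py0) in page_quads(page_table, pages_per_row, page_size):
        for dy in range(page_size):      # on a GPU the rasterizer fills the quad
            for dx in range(page_size):
                yield (vx0 + dx, vy0 + dy), (px0 + dx, py0 + dy)


active = {(3, 0): 0, (1, 2): 1, (0, 5): 2}
quads = page_quads(active, pages_per_row=2, page_size=4)
```

Only the three active pages are traversed, so the work is proportional to the packed physical data rather than to the full virtual domain.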

GPU Iteration
- GPUs can now build data-structure iterators
  - DX10 StreamOut from the geometry processor
- Now easier to build dynamic GPU data structures
- [Figure: Stream Out, Vertex Buffer, Vertex Processor, Geometry Processor]

GPU Iteration: Optimizing Input Iterators
- Galoppo et al. optimized input and output iterators for their matrix representation
  - Match 2D GPU memory layout
  - Match memory access pattern of algorithm
  - Difficult because memory layouts are unspecified and proprietary
- More on this for GPU sorting in the next talk
- Galoppo, Govindaraju, Henson, Manocha, "LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware", ACM/IEEE Supercomputing, 2005

GPU Data Structure Conclusions
- Fundamental GPU memory primitive is a fixed-size 2D array (when building atop graphics APIs)
- GPU data structures require parallel read/write
- GPGPU needs a more general memory model
  - Low-level interfaces such as ATI's CTM and NVIDIA's CUDA (!!)
- Iterator access patterns must match storage layout
  - Commonplace for CPU programmers, but
  - Need low-level information from hardware vendors

GPU Data Structure Conclusions
- Building complex data structures on the GPU is still hard
  - Can use data-parallel algorithms (sort, scan, etc.)
  - DX10 geometry processor StreamOut helps
  - Is less parallelism more efficient for building data structures?
- Next step is having the GPU build iterators for dynamic, complex data structures
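As an example of using a data-parallel primitive to build a data structure, the sketch below packs active elements into a dense list with an exclusive scan (stream compaction). It runs sequentially here; the function names are illustrative, and the point is only that each element's output position depends on the scan result alone, so the final scatter has no write conflicts.

```python
def exclusive_scan(values):
    """Return (prefix sums excluding the current element, total)."""
    out = []
    total = 0
    for v in values:
        out.append(total)
        total += v
    return out, total


def compact(items, keep):
    """Pack the items for which keep(item) is true into a dense list."""
    flags = [1 if keep(x) else 0 for x in items]   # independent per element
    offsets, count = exclusive_scan(flags)         # the data-parallel primitive
    packed = [None] * count
    for i, x in enumerate(items):                  # independent scatter per element
        if flags[i]:
            packed[offsets[i]] = x
    return packed
```

For instance, `compact([5, 0, 3, 0, 0, 8], lambda v: v != 0)` yields `[5, 3, 8]`, with scan offsets `[0, 1, 1, 2, 2, 2]`.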

More Information
- Overview (with code snippets)
  - Lefohn, Kniss, Owens, "Implementing Efficient Parallel Data Structures on GPUs", Chapter 33, GPU Gems II
- High-level GPU data structures
  - Lefohn, Kniss, Strzodka, Sengupta, Owens, "Glift: Generic, Efficient, Random-Access GPU Data Structures", ACM Transactions on Graphics, 2006
- GPGPU state-of-the-art report
  - Owens, Luebke, Govindaraju, Harris, Krüger, Lefohn, Purcell, "A Survey of General-Purpose Computation on Graphics Hardware", Eurographics STAR report, 2005