Real-time Visual Tracker by Stream Processing
|
|
|
- Lorraine Bell
- 10 years ago
- Views:
Transcription
1 Real-time Visual Tracker by Stream Processing Simultaneous and Fast 3D Tracking of Multiple Faces in Video Sequences by Using a Particle Filter Oscar Mateo Lozano & Kuzahiro Otsuka presented by Piotr Rudol
2
3 Outline Particle Filter Basics Visual Tracking of Faces - Problem Definition CUDA Implementation Results Conclusion
4 Particle Filtering Particle filtering is a state estimation technique based on Monte Carlo simulations. Particles are random values of a state-space variable which are checked against the current output measure to generate a weight value i.e. likeliness of a particle being the one that best describes the current state of the system. The collection of all particles and their weights in each instant is a numerical approximation of the probability density function of the system. The probabilistic approach of this method provides significant robustness, as several possible states of the system are tracked at any given moment.
5 Particle Filtering First, a population of M samples is created by sampling from the prior distribution at time 0, P(X0). Then the update cycle is repeated for each time step: Each sample is propagated forward by sampling the next state value xt+1 given the current value xt for the sample, using the transition model P(Xt+1 xt). Each sample is weighted by the likelihood it assigns to the new evidence P(et+1 xt+1). The population is resampled to generate a new population of M samples. Each new sample is selected from the current population; the probability that a particular sample is selected is proportional to its weight. This procedure puts in focus samples in high-probability regions of the state space.
6 Particle Filter-based Localization
7 Problem Definition We consider the state sequence of a target: xt = (Tx, Ty, S, Rx, Ry, Rz, α) Tx, Ty - translational coordinate S - scale Rx, Ry, Rz - rotations along all axis α - global illumination variable
8 Problem Definition The state-space vector is given in each discrete time step by: x t = ft(x t-1, v t-1 ) - state transition model ft is possibly non-linear function of state xt-1, {vt-1} is an independent and identically-distributed process noise sequence. The objective of tracking is to recursively estimate x t from measurements: z t = ht(x t, n t ) - observation model ht is possibly non-linear function, {nt-1} is an independent and identically-distributed measurement noise sequence. We seek filtered estimates of x t based on the set of all available measurements z 1:t up to time t, together with some degree of belief in them. Namely, we want to know the p.d.f. p(x t z 1:t ), assuming that the initial p.d.f. p( x 0 ) is known.
9 Particle Filter Solution is an approximation method that makes use of Monte Carlo simulations [2]. We now consider discrete particles forming the set: ( ) { X t 1, t 1 = ( x 0 t 1, π t 1 0 ),..., M ( x t 1, π t 1) } M (4) We consider discrete particles forming the set: Every particle contains information about one possible state every of the particle system x contains j information about j one possible state of the t 1 and its importance weight πt 1 system and its importance weight.. This sample set is propagated in the Condensation algorithm as follows: Select A set of M new particles ( X t, t ) is generated, by random selection from the previous particles by using a p.d.f. according to their weights t 1. Diffuse A Gaussian random noise is added to each parameter of each particle in X t. Measure weight The probability t is measured using template matching, based on the difference error between the template (the collection of all pixels forming the face in t = 0, or some carefully selected pixels of the template if we use a sparse template [16, 17] as we describe in Section 2.3) and the input image at each time step. j O. Mateo Lozano, K where R 2 3 denotes the 2 3 upper submatri rotation matrix R. The matching error betw template and input image is defined based on th sity of each feature point, J(q m ), m = 1,, the intensity of the corresponding projected I( p m ), modified by the global illumination fa More precisely, the matching error, defined average matching error of N feature points, written as This sample is propagated in the Condensation algorithm as follows: ε = 1 N N ρ (κ (J(q m ), I( p m ))) m=1 Select - A set of M new particles is generated by random selection from the previous particles by updating the p.d.f. according their weight. κ(j, I) = α I J J Diffuse - A Gaussian noise is added to each parameter of each particle ρ(κ) = κ2 κ Measure weight - The likelihood of each particle being the true state is calculated based on template matching using the input image at each step. where κ(j, I) denotes the relative residual b intensity values of the feature point q and the pr point p, and ρ(κ) represents a robust functio is used to increase the robustness against outlie [17]. From the matching error for each particle
10 The Algorithm
11 Initialization Every n-th frame is analyzed by a boosting algorithm to detect new faces giving a rectangular sub-image. a b c A fitting of a statistical deformable model is performed resulting in a series of 53 landmark points that correspond to known features of a human face. The landmarks include a pair of texture coordinates, a set of numbers that refer to a unique part of a texture resource stored in memory. This way two image buffers ( height- and normalmap ) are created by applying a 3D face model.
12 Initialization The feature points (position and gray level) that form the sparse template are selected using image processing techniques, and their depth and normal to the surface values are extracted from prior maps. Finally the template is formed by a stream of N feature points formed as follows: p = (Px, Py, Pz, Nx, Ny, Nz, J) Px, Py, Pz - coordinates of feature points Nx, Ny, Nz - form the vector normal to the face surface in the feature point J - gray level of the feature point
13 Initialization Why do feature points include normal vectors? If a normal vector points away from the camera it is likely occluded by another feature point. It s weight can be disregarded.
14 Tracking Tracking is performed by means of a Particle Filter. Out of selection, diffusion, and weighting the latter is the most computationally intensive. It requires performing the 3D transformation of each feature point as estimated by each of the particles, and then a comparison is made of the feature point gray level against the resulting point in the full image. The sum of all those comparisons for each feature point results in the weight of one particle. For example with M=1000 particles and N=230 feature points it s weight computation operations! Weight calculation of each particle is an independent process!
15 Display After every tracking step, the best particles are averaged to get an approximation of the system state. The resulting state-space values can displayed on the screen as rotated, scaled and translated 3D-mesh models.
16 CUDA implementation After initialization, feature point stream is copied to device memory. It doesn t change during tracking. At every step the current video frame and a buffer with particles is copied to device memory. The kernel is invoked and executed in M blocks, each with N threads i.e. one block per particle and one thread per feature point per block. Each thread computes the matching error of a particle-feature pair i.e. transforms the feature point position and normal vector, accesses the video frame pixel in that position (if normal vector does not mean occlusion) and compares the gray level. The result is placed by every thread in different position in the block s shared memory. When all threads in a block finish (sync), first thread of every block sums up matching errors and the result is placed in global device memory. The host retrieves stream of weights of every particle.
17 Hardware Intel Core II 2.66GHz host system with 2 GB RAM NVIDIA GeForce 8800GTX GPU: 128 processing units, 16 multiprocessors, 350GFLOPS under CUDA. For comparison purposes, a software-only version of the weighting algorithm was also developed (single-threaded, no SIMD extensions used) for execution on the host CPU alone.
18 Results Results of the simultaneous tracking of four people. The frame sequence is taken from a synthetic video, the sparse templates are composed of approximately 230 feature points, and each template is tracked using 1000 particles.
19 ks, cle ch rtisime nd ult of all he in ry m bal this GPGPU (as opposed to SP) technique. Brook also suffers from the excessive overhead associated with Time per frame (ms) NRT RT GPU Software Results Particle weighting speed Number of particles Speed of the particle weighting stage, comparing Stream processing in the GPU version to a serial CPU-only version. Video and 217 feature points ut Figure 6 Speed of the particle weighting stage, comparing Stream processing in the GPU version to a serial CPU-only
20 Results Real-time visual tracker by stream processing Frames per second RT NRT Full application speed Faces GPU Software Speed of the full application, comparing stream processing in the GPU version to a serial CPU-only version. Video 1024x768, 230 feature points per face, and 1000 particles per face. Figure 7 Speed of the full application, comparing stream processing in the GPU version to a serial CPU-only version. BE, or many (usuall mind) implem One effort PFs so hardwa proble sents a step, s CPU ( in [25] pling c
21 Summary Particle Filter on GPU-based visual tracking of human faces was presented. Thanks to easy parallelization of weight computation operations a significant performance increase was achieved. GPU based filter runs easily in real-time with 10k particles comparing to approx. 4.5k for CPU only version. GPU based filter can track up to 6 faces in a video stream in real time comparing to 0 for CPU only version. The GPU-based particle filter can easily by used in different domains.
Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software
GPU Computing Numerical Simulation - from Models to Software Andreas Barthels JASS 2009, Course 2, St. Petersburg, Russia Prof. Dr. Sergey Y. Slavyanov St. Petersburg State University Prof. Dr. Thomas
Computer Graphics Hardware An Overview
Computer Graphics Hardware An Overview Graphics System Monitor Input devices CPU/Memory GPU Raster Graphics System Raster: An array of picture elements Based on raster-scan TV technology The screen (and
LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR
LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR Frédéric Kuznik, frederic.kuznik@insa lyon.fr 1 Framework Introduction Hardware architecture CUDA overview Implementation details A simple case:
Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011
Graphics Cards and Graphics Processing Units Ben Johnstone Russ Martin November 15, 2011 Contents Graphics Processing Units (GPUs) Graphics Pipeline Architectures 8800-GTX200 Fermi Cayman Performance Analysis
Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization
Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization Wolfram Burgard, Maren Bennewitz, Diego Tipaldi, Luciano Spinello 1 Motivation Recall: Discrete filter Discretize
Texture Cache Approximation on GPUs
Texture Cache Approximation on GPUs Mark Sutherland Joshua San Miguel Natalie Enright Jerger {suther68,enright}@ece.utoronto.ca, [email protected] 1 Our Contribution GPU Core Cache Cache
ultra fast SOM using CUDA
ultra fast SOM using CUDA SOM (Self-Organizing Map) is one of the most popular artificial neural network algorithms in the unsupervised learning category. Sijo Mathew Preetha Joy Sibi Rajendra Manoj A
GeoImaging Accelerator Pansharp Test Results
GeoImaging Accelerator Pansharp Test Results Executive Summary After demonstrating the exceptional performance improvement in the orthorectification module (approximately fourteen-fold see GXL Ortho Performance
A Robust Multiple Object Tracking for Sport Applications 1) Thomas Mauthner, Horst Bischof
A Robust Multiple Object Tracking for Sport Applications 1) Thomas Mauthner, Horst Bischof Institute for Computer Graphics and Vision Graz University of Technology, Austria {mauthner,bischof}@icg.tu-graz.ac.at
Real-Time Camera Tracking Using a Particle Filter
Real-Time Camera Tracking Using a Particle Filter Mark Pupilli and Andrew Calway Department of Computer Science University of Bristol, UK {pupilli,andrew}@cs.bris.ac.uk Abstract We describe a particle
Introduction to GPGPU. Tiziano Diamanti [email protected]
[email protected] Agenda From GPUs to GPGPUs GPGPU architecture CUDA programming model Perspective projection Vectors that connect the vanishing point to every point of the 3D model will intersecate
Interactive Level-Set Deformation On the GPU
Interactive Level-Set Deformation On the GPU Institute for Data Analysis and Visualization University of California, Davis Problem Statement Goal Interactive system for deformable surface manipulation
A Prototype For Eye-Gaze Corrected
A Prototype For Eye-Gaze Corrected Video Chat on Graphics Hardware Maarten Dumont, Steven Maesen, Sammy Rogmans and Philippe Bekaert Introduction Traditional webcam video chat: No eye contact. No extensive
GPGPU Computing. Yong Cao
GPGPU Computing Yong Cao Why Graphics Card? It s powerful! A quiet trend Copyright 2009 by Yong Cao Why Graphics Card? It s powerful! Processor Processing Units FLOPs per Unit Clock Speed Processing Power
Introduction to GPU Programming Languages
CSC 391/691: GPU Programming Fall 2011 Introduction to GPU Programming Languages Copyright 2011 Samuel S. Cho http://www.umiacs.umd.edu/ research/gpu/facilities.html Maryland CPU/GPU Cluster Infrastructure
SIMULTANEOUS LOCALIZATION AND MAPPING FOR EVENT-BASED VISION SYSTEMS
SIMULTANEOUS LOCALIZATION AND MAPPING FOR EVENT-BASED VISION SYSTEMS David Weikersdorfer, Raoul Hoffmann, Jörg Conradt 9-th International Conference on Computer Vision Systems 18-th July 2013, St. Petersburg,
GPU Computing with CUDA Lecture 2 - CUDA Memories. Christopher Cooper Boston University August, 2011 UTFSM, Valparaíso, Chile
GPU Computing with CUDA Lecture 2 - CUDA Memories Christopher Cooper Boston University August, 2011 UTFSM, Valparaíso, Chile 1 Outline of lecture Recap of Lecture 1 Warp scheduling CUDA Memory hierarchy
Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data
Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:
GPU Accelerated Monte Carlo Simulations and Time Series Analysis
GPU Accelerated Monte Carlo Simulations and Time Series Analysis Institute of Physics, Johannes Gutenberg-University of Mainz Center for Polymer Studies, Department of Physics, Boston University Artemis
Interactive Level-Set Segmentation on the GPU
Interactive Level-Set Segmentation on the GPU Problem Statement Goal Interactive system for deformable surface manipulation Level-sets Challenges Deformation is slow Deformation is hard to control Solution
The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA
The Evolution of Computer Graphics Tony Tamasi SVP, Content & Technology, NVIDIA Graphics Make great images intricate shapes complex optical effects seamless motion Make them fast invent clever techniques
Tracking in flussi video 3D. Ing. Samuele Salti
Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking
Home Exam 3: Distributed Video Encoding using Dolphin PCI Express Networks. October 20 th 2015
INF5063: Programming heterogeneous multi-core processors because the OS-course is just to easy! Home Exam 3: Distributed Video Encoding using Dolphin PCI Express Networks October 20 th 2015 Håkon Kvale
Intro to GPU computing. Spring 2015 Mark Silberstein, 048661, Technion 1
Intro to GPU computing Spring 2015 Mark Silberstein, 048661, Technion 1 Serial vs. parallel program One instruction at a time Multiple instructions in parallel Spring 2015 Mark Silberstein, 048661, Technion
Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez
Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis artínez, Gerardo Fernández-Escribano, José. Claver and José Luis Sánchez 1. Introduction 2. Technical Background 3. Proposed DVC to H.264/AVC
CSE168 Computer Graphics II, Rendering. Spring 2006 Matthias Zwicker
CSE168 Computer Graphics II, Rendering Spring 2006 Matthias Zwicker Last time Global illumination Light transport notation Path tracing Sampling patterns Reflection vs. rendering equation Reflection equation
CS231M Project Report - Automated Real-Time Face Tracking and Blending
CS231M Project Report - Automated Real-Time Face Tracking and Blending Steven Lee, [email protected] June 6, 2015 1 Introduction Summary statement: The goal of this project is to create an Android
GPGPU accelerated Computational Fluid Dynamics
t e c h n i s c h e u n i v e r s i t ä t b r a u n s c h w e i g Carl-Friedrich Gauß Faculty GPGPU accelerated Computational Fluid Dynamics 5th GACM Colloquium on Computational Mechanics Hamburg Institute
Mean-Shift Tracking with Random Sampling
1 Mean-Shift Tracking with Random Sampling Alex Po Leung, Shaogang Gong Department of Computer Science Queen Mary, University of London, London, E1 4NS Abstract In this work, boosting the efficiency of
Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1
Introduction to GP-GPUs Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 GPU Architectures: How do we reach here? NVIDIA Fermi, 512 Processing Elements (PEs) 2 What Can It Do?
GPUs for Scientific Computing
GPUs for Scientific Computing p. 1/16 GPUs for Scientific Computing Mike Giles [email protected] Oxford-Man Institute of Quantitative Finance Oxford University Mathematical Institute Oxford e-research
The Future Of Animation Is Games
The Future Of Animation Is Games 王 銓 彰 Next Media Animation, Media Lab, Director [email protected] The Graphics Hardware Revolution ( 繪 圖 硬 體 革 命 ) : GPU-based Graphics Hardware Multi-core (20 Cores
GPGPU acceleration in OpenFOAM
Carl-Friedrich Gauß Faculty GPGPU acceleration in OpenFOAM Northern germany OpenFoam User meeting Braunschweig Institute of Technology Thorsten Grahs Institute of Scientific Computing/move-csc 2nd October
Speed Performance Improvement of Vehicle Blob Tracking System
Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA [email protected], [email protected] Abstract. A speed
15-418 Final Project Report. Trading Platform Server
15-418 Final Project Report Yinghao Wang [email protected] May 8, 214 Trading Platform Server Executive Summary The final project will implement a trading platform server that provides back-end support
NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist
NVIDIA CUDA Software and GPU Parallel Computing Architecture David B. Kirk, Chief Scientist Outline Applications of GPU Computing CUDA Programming Model Overview Programming in CUDA The Basics How to Get
Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected].
Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute [email protected] Outline Motivation why do we need GPUs? Past - how was GPU programming
MONTE-CARLO SIMULATION OF AMERICAN OPTIONS WITH GPUS. Julien Demouth, NVIDIA
MONTE-CARLO SIMULATION OF AMERICAN OPTIONS WITH GPUS Julien Demouth, NVIDIA STAC-A2 BENCHMARK STAC-A2 Benchmark Developed by banks Macro and micro, performance and accuracy Pricing and Greeks for American
Introduction to GPU Computing
Matthis Hauschild Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme December 4, 2014 M. Hauschild - 1 Table of Contents 1. Architecture
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Introduction to GPU hardware and to CUDA
Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 37 Course outline Introduction to GPU hardware
Deterministic Sampling-based Switching Kalman Filtering for Vehicle Tracking
Proceedings of the IEEE ITSC 2006 2006 IEEE Intelligent Transportation Systems Conference Toronto, Canada, September 17-20, 2006 WA4.1 Deterministic Sampling-based Switching Kalman Filtering for Vehicle
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS
VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James
Towards Large-Scale Molecular Dynamics Simulations on Graphics Processors
Towards Large-Scale Molecular Dynamics Simulations on Graphics Processors Joe Davis, Sandeep Patel, and Michela Taufer University of Delaware Outline Introduction Introduction to GPU programming Why MD
Head and Facial Animation Tracking using Appearance-Adaptive Models and Particle Filters
Head and Facial Animation Tracking using Appearance-Adaptive Models and Particle Filters F. Dornaika and F. Davoine CNRS HEUDIASYC Compiègne University of Technology 625 Compiègne Cedex, FRANCE {dornaika,
GPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui
Hardware-Aware Analysis and Optimization of Stable Fluids Presentation Date: Sep 15 th 2009 Chrissie C. Cui Outline Introduction Highlights Flop and Bandwidth Analysis Mehrstellen Schemes Advection Caching
Stream Processing on GPUs Using Distributed Multimedia Middleware
Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research
VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS
VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS Perhaad Mistry, Yash Ukidave, Dana Schaa, David Kaeli Department of Electrical and Computer Engineering Northeastern University,
Tracking Groups of Pedestrians in Video Sequences
Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal
Fast Implementations of AES on Various Platforms
Fast Implementations of AES on Various Platforms Joppe W. Bos 1 Dag Arne Osvik 1 Deian Stefan 2 1 EPFL IC IIF LACAL, Station 14, CH-1015 Lausanne, Switzerland {joppe.bos, dagarne.osvik}@epfl.ch 2 Dept.
CUDA programming on NVIDIA GPUs
p. 1/21 on NVIDIA GPUs Mike Giles [email protected] Oxford University Mathematical Institute Oxford-Man Institute for Quantitative Finance Oxford eresearch Centre p. 2/21 Overview hardware view
Dynamic Resolution Rendering
Dynamic Resolution Rendering Doug Binks Introduction The resolution selection screen has been one of the defining aspects of PC gaming since the birth of games. In this whitepaper and the accompanying
GPU-BASED TUNING OF QUANTUM-INSPIRED GENETIC ALGORITHM FOR A COMBINATORIAL OPTIMIZATION PROBLEM
GPU-BASED TUNING OF QUANTUM-INSPIRED GENETIC ALGORITHM FOR A COMBINATORIAL OPTIMIZATION PROBLEM Robert Nowotniak, Jacek Kucharski Computer Engineering Department The Faculty of Electrical, Electronic,
The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System
The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System Qingyu Meng, Alan Humphrey, Martin Berzins Thanks to: John Schmidt and J. Davison de St. Germain, SCI Institute Justin Luitjens
Parallel Image Processing with CUDA A case study with the Canny Edge Detection Filter
Parallel Image Processing with CUDA A case study with the Canny Edge Detection Filter Daniel Weingaertner Informatics Department Federal University of Paraná - Brazil Hochschule Regensburg 02.05.2011 Daniel
Clustering Billions of Data Points Using GPUs
Clustering Billions of Data Points Using GPUs Ren Wu [email protected] Bin Zhang [email protected] Meichun Hsu [email protected] ABSTRACT In this paper, we report our research on using GPUs to accelerate
NVIDIA GeForce GTX 580 GPU Datasheet
NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet 3D Graphics Full Microsoft DirectX 11 Shader Model 5.0 support: o NVIDIA PolyMorph Engine with distributed HW tessellation engines
Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data
Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:
GPGPU for Real-Time Data Analytics: Introduction. Nanyang Technological University, Singapore 2
GPGPU for Real-Time Data Analytics: Introduction Bingsheng He 1, Huynh Phung Huynh 2, Rick Siow Mong Goh 2 1 Nanyang Technological University, Singapore 2 A*STAR Institute of High Performance Computing,
Master s thesis tutorial: part III
for the Autonomous Compliant Research group Tinne De Laet, Wilm Decré, Diederik Verscheure Katholieke Universiteit Leuven, Department of Mechanical Engineering, PMA Division 30 oktober 2006 Outline General
Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.
Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture. Chirag Gupta,Sumod Mohan K [email protected], [email protected] Abstract In this project we propose a method to improve
GPU Parallel Computing Architecture and CUDA Programming Model
GPU Parallel Computing Architecture and CUDA Programming Model John Nickolls Outline Why GPU Computing? GPU Computing Architecture Multithreading and Arrays Data Parallel Problem Decomposition Parallel
ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU
Computer Science 14 (2) 2013 http://dx.doi.org/10.7494/csci.2013.14.2.243 Marcin Pietroń Pawe l Russek Kazimierz Wiatr ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU Abstract This paper presents
Parallel Prefix Sum (Scan) with CUDA. Mark Harris [email protected]
Parallel Prefix Sum (Scan) with CUDA Mark Harris [email protected] April 2007 Document Change History Version Date Responsible Reason for Change February 14, 2007 Mark Harris Initial release April 2007
GPGPU: General-Purpose Computation on GPUs
GPGPU: General-Purpose Computation on GPUs Randy Fernando NVIDIA Developer Technology Group (Original Slides Courtesy of Mark Harris) Why GPGPU? The GPU has evolved into an extremely flexible and powerful
multimodality image processing workstation Visualizing your SPECT-CT-PET-MRI images
multimodality image processing workstation Visualizing your SPECT-CT-PET-MRI images InterView FUSION InterView FUSION is a new visualization and evaluation software developed by Mediso built on state of
A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms
A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms H. Kandil and A. Atwan Information Technology Department, Faculty of Computer and Information Sciences, Mansoura University,El-Gomhoria
Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61
F# Applications to Computational Financial and GPU Computing May 16th Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 Today! Why care about F#? Just another fashion?! Three success stories! How Alea.cuBase
WAKING up without the sound of an alarm clock is a
EINDHOVEN UNIVERSITY OF TECHNOLOGY, MARCH 00 Adaptive Alarm Clock Using Movement Detection to Differentiate Sleep Phases Jasper Kuijsten Eindhoven University of Technology P.O. Box 53, 5600MB, Eindhoven,
OpenCL Optimization. San Jose 10/2/2009 Peng Wang, NVIDIA
OpenCL Optimization San Jose 10/2/2009 Peng Wang, NVIDIA Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary Overall Optimization
A Movement Tracking Management Model with Kalman Filtering Global Optimization Techniques and Mahalanobis Distance
Loutraki, 21 26 October 2005 A Movement Tracking Management Model with ing Global Optimization Techniques and Raquel Ramos Pinho, João Manuel R. S. Tavares, Miguel Velhote Correia Laboratório de Óptica
Probability and Random Variables. Generation of random variables (r.v.)
Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly
Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka
Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka René Widera1, Erik Zenker1,2, Guido Juckeland1, Benjamin Worpitz1,2, Axel Huebl1,2, Andreas Knüpfer2, Wolfgang E. Nagel2,
ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop. Emily Apsey Performance Engineer
ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop Emily Apsey Performance Engineer Presentation Overview What it takes to successfully virtualize ArcGIS Pro in Citrix XenApp and XenDesktop - Shareable
PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUMBER OF REFERENCE SYMBOLS
PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUM OF REFERENCE SYMBOLS Benjamin R. Wiederholt The MITRE Corporation Bedford, MA and Mario A. Blanco The MITRE
Silverlight for Windows Embedded Graphics and Rendering Pipeline 1
Silverlight for Windows Embedded Graphics and Rendering Pipeline 1 Silverlight for Windows Embedded Graphics and Rendering Pipeline Windows Embedded Compact 7 Technical Article Writers: David Franklin,
Optimizing a 3D-FWT code in a cluster of CPUs+GPUs
Optimizing a 3D-FWT code in a cluster of CPUs+GPUs Gregorio Bernabé Javier Cuenca Domingo Giménez Universidad de Murcia Scientific Computing and Parallel Programming Group XXIX Simposium Nacional de la
High Performance GPU-based Preprocessing for Time-of-Flight Imaging in Medical Applications
High Performance GPU-based Preprocessing for Time-of-Flight Imaging in Medical Applications Jakob Wasza 1, Sebastian Bauer 1, Joachim Hornegger 1,2 1 Pattern Recognition Lab, Friedrich-Alexander University
Introduction to GPU Architecture
Introduction to GPU Architecture Ofer Rosenberg, PMTS SW, OpenCL Dev. Team AMD Based on From Shader Code to a Teraflop: How GPU Shader Cores Work, By Kayvon Fatahalian, Stanford University Content 1. Three
Image Processing & Video Algorithms with CUDA
Image Processing & Video Algorithms with CUDA Eric Young & Frank Jargstorff 8 NVIDIA Corporation. introduction Image processing is a natural fit for data parallel processing Pixels can be mapped directly
Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries
Performance Evaluations of Graph Database using CUDA and OpenMP Compatible Libraries Shin Morishima 1 and Hiroki Matsutani 1,2,3 1Keio University, 3 14 1 Hiyoshi, Kohoku ku, Yokohama, Japan 2National Institute
On-line Tracking Groups of Pedestrians with Bayesian Networks
On-line Tracking Groups of Pedestrians with Bayesian Networks Pedro M. Jorge ISEL / ISR [email protected] Jorge S. Marques IST / ISR [email protected] Arnaldo J. Abrantes ISEL [email protected] Abstract A
QCD as a Video Game?
QCD as a Video Game? Sándor D. Katz Eötvös University Budapest in collaboration with Győző Egri, Zoltán Fodor, Christian Hoelbling Dániel Nógrádi, Kálmán Szabó Outline 1. Introduction 2. GPU architecture
Visualization and Feature Extraction, FLOW Spring School 2016 Prof. Dr. Tino Weinkauf. Flow Visualization. Image-Based Methods (integration-based)
Visualization and Feature Extraction, FLOW Spring School 2016 Prof. Dr. Tino Weinkauf Flow Visualization Image-Based Methods (integration-based) Spot Noise (Jarke van Wijk, Siggraph 1991) Flow Visualization:
Comparing Improved Versions of K-Means and Subtractive Clustering in a Tracking Application
Comparing Improved Versions of K-Means and Subtractive Clustering in a Tracking Application Marta Marrón Romera, Miguel Angel Sotelo Vázquez, and Juan Carlos García García Electronics Department, University
HPC with Multicore and GPUs
HPC with Multicore and GPUs Stan Tomov Electrical Engineering and Computer Science Department University of Tennessee, Knoxville CS 594 Lecture Notes March 4, 2015 1/18 Outline! Introduction - Hardware
Amazing renderings of 3D data... in minutes.
Amazing renderings of 3D data... in minutes. What is it? KeyShot is a standalone 3D rendering and animation application for anyone with a need to quickly and easily create photo-realistic images of 3D
Le langage OCaml et la programmation des GPU
Le langage OCaml et la programmation des GPU GPU programming with OCaml Mathias Bourgoin - Emmanuel Chailloux - Jean-Luc Lamotte Le projet OpenGPU : un an plus tard Ecole Polytechnique - 8 juin 2011 Outline
Building an Advanced Invariant Real-Time Human Tracking System
UDC 004.41 Building an Advanced Invariant Real-Time Human Tracking System Fayez Idris 1, Mazen Abu_Zaher 2, Rashad J. Rasras 3, and Ibrahiem M. M. El Emary 4 1 School of Informatics and Computing, German-Jordanian
Parallel 3D Image Segmentation of Large Data Sets on a GPU Cluster
Parallel 3D Image Segmentation of Large Data Sets on a GPU Cluster Aaron Hagan and Ye Zhao Kent State University Abstract. In this paper, we propose an inherent parallel scheme for 3D image segmentation
E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices
E6895 Advanced Big Data Analytics Lecture 14: NVIDIA GPU Examples and GPU on ios devices Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist,
