Go Faster - Preprocessing Using FPGA, CPU, GPU. Dipl.-Ing. (FH) Bjoern Rudde Image Acquisition Development STEMMER IMAGING

Similar documents
Industrial Vision Days 2012 Making Cameras Smarter: FPGA Based Image Pre-processing Unleashed

Introduction to GPU Programming Languages

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

GIGE VISION IN PRACTICE

WHITEPAPER. Image processing in the age of 10 Gigabit Ethernet

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

GPGPU Computing. Yong Cao

Basler. Line Scan Cameras

Scope of operation and highlight of the microenable GigE Vision frame grabber family

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

MVTec Software GmbH.

GPUs for Scientific Computing

Trigger-to-Image Reliability (T2IR)

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Introduction to GPU Computing

Overview. Proven Image Quality and Easy to Use Without a Frame Grabber. Your benefits include:

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute

CMSC 611: Advanced Computer Architecture

Next Generation GPU Architecture Code-named Fermi

Best Practises for LabVIEW FPGA Design Flow. uk.ni.com ireland.ni.com

ni.com/vision NI Vision

By Andrew Wilson, Editor

ENHANCEMENT OF TEGRA TABLET'S COMPUTATIONAL PERFORMANCE BY GEFORCE DESKTOP AND WIFI

The Future Of Animation Is Games

A Computer Vision System on a Chip: a case study from the automotive domain

Computer Graphics Hardware An Overview

Basler scout AREA SCAN CAMERAS

GPU Architecture. Michael Doggett ATI

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

Basler. Area Scan Cameras

L20: GPU Architecture and Models

Video-Rate Stereo Vision on a Reconfigurable Hardware. Ahmad Darabiha Department of Electrical and Computer Engineering University of Toronto

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Introduction to GPU hardware and to CUDA

Basler beat AREA SCAN CAMERAS. High-resolution 12 MP cameras with global shutter

Equalizer. Parallel OpenGL Application Framework. Stefan Eilemann, Eyescale Software GmbH

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With

Basler pilot AREA SCAN CAMERAS

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS

GPU-based Decompression for Medical Imaging Applications

SAPPHIRE TOXIC R9 270X 2GB GDDR5 WITH BOOST

Baumer Industrial Cameras Innovative Vision Components for Automated Image Processing

Embedded Software development Process and Tools: Lesson-4 Linking and Locating Software

HARNESS project: Managing Heterogeneous Compute Resources for a Cloud Platform

Radeon HD 2900 and Geometry Generation. Michael Doggett

Parallel Image Processing with CUDA A case study with the Canny Edge Detection Filter

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Basler dart AREA SCAN CAMERAS. Board level cameras with bare board, S- and CS-mount options

Stream Processing on GPUs Using Distributed Multimedia Middleware

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

Introduction to GPGPU. Tiziano Diamanti

Evaluation of CUDA Fortran for the CFD code Strukti

Four Keys to Successful Multicore Optimization for Machine Vision. White Paper

Nutaq. PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET. nutaq.com MONTREAL QUEBEC

High-Density Network Flow Monitoring

White Paper. Intel Sandy Bridge Brings Many Benefits to the PC/104 Form Factor

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

Why You Need the EVGA e-geforce 6800 GS

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS

Float to Fix conversion

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.

APPLICATION NOTE. Basler racer Migration Guide. Mechanics. Flexible Mount Concept. Housing

Parallel Programming Survey

High speed pattern streaming system based on AXIe s PCIe connectivity and synchronization mechanism

GPU for Scientific Computing. -Ali Saleh

Introduction to Cloud Computing

high-performance computing so you can move your enterprise forward

Systolic Computing. Fundamentals

Basler racer. Line Scan Cameras. Next generation CMOS sensors with 2k to 12k resolution and up to 80 khz line rate

Understanding Line Scan Camera Applications

HP Workstations graphics card options

QCD as a Video Game?

A General Framework for Tracking Objects in a Multi-Camera Environment

High-speed image processing algorithms using MMX hardware

GPGPU for Real-Time Data Analytics: Introduction. Nanyang Technological University, Singapore 2

Real-time Visual Tracker by Stream Processing

Introduction to MATLAB Gergely Somlay Application Engineer

GOLD20TH-GTX980-P-4GD5

Awards News. GDDR5 memory provides twice the bandwidth per pin of GDDR3 memory, delivering more speed and higher bandwidth.

Lecture Notes, CEng 477

FPGA-based MapReduce Framework for Machine Learning

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25

Analecta Vol. 8, No. 2 ISSN

Lecture 2 Parallel Programming Platforms

Bringing Big Data Modelling into the Hands of Domain Experts

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

ARM Processors for Computer-On-Modules. Christian Eder Marketing Manager congatec AG

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

Advanced Rendering for Engineering & Styling

ultra fast SOM using CUDA

Chapter 2 Parallel Architecture, Software And Performance

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

xiq USB3 Vision Cameras easy on you easy on your application easy on your customer easy on your budget

A bachelor of science degree in electrical engineering with a cumulative undergraduate GPA of at least 3.0 on a 4.0 scale

How To Program With Adaptive Vision Studio

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule

Development of a high-resolution, high-speed vision system using CMOS image sensor technology enhanced by intelligent pixel selection technique

Transcription:

Go Faster - Preprocessing Using FPGA, CPU, GPU Dipl.-Ing. (FH) Bjoern Rudde Image Acquisition Development STEMMER IMAGING

WHO ARE STEMMER IMAGING? STEMMER IMAGING is: Europe's leading independent provider of Core vision technology Solutions Services Our mission: To provide the users and developers of imaging technology with competitive advantage by adding value in the supply of quality components, expertise and support. VISION 2011 2

OUR PRESENCE Puchheim near Munich, Germany Pfäffikon SZ near Zurich, Switzerland Tongham near London, UK Suresnes near Paris, France»To provide the users and developers of imaging technology with competitive advantage by adding value in the supply of quality components, expertise and support.«vision 2011 3

THE MARKETS WE SERVE Factory Automation Automotive Electronics, Semiconductor & Solar Print & Packaging Food & Beverage Pharmaceutical Test & Measurement Medical Imaging Traffic, Rail & Transport Scientific Research Defense, Security & Aerospace Sports, Entertainment & Broadcast VISION 2011 4

INCREASING DATA RATES GO FASTER Preprocessing using FPGA, CPU, GPU In all markets there are applications that demand high data rates. Image resolution Frame rate Pixel depth Number of operations per pixel Each parameter increases computing efforts. VISION 2011 5

PROCESSING UNIT DECISION FOR AN ARCHITECTURE Vision applications have widely varying requirements and no single approach is ideal for each and every application. Therefore different architectures have their place. FPGA CPU GPU VISION 2011 6

CLASSIFICATION OF ARCHITECTURES CPU Component found in each computer system and most machine vision libraries are based on this architecture. Sequence of single instruction on single data Extended by vector processing instructions MMX ISSE 3DNow! AltiVec SSE2, SSE3, SSSE3, SSE4, SSE5 (SISD) (SIMD) Multiple cores Operations / s defined by tick frequency One to several ticks needed per instruction VISION 2011 7

CLASSIFICATION OF ARCHITECTURES GPU In the field of machine vision a graphics processing units and its highly parallel structure can be used for uniform calculations on large image blocks. Single instruction on multiple data (SIMD) Memory intensive operations Massive floating-point computational power Fast development due to gaming market In most cases 1 tick / operation / proc. unit General purpose GPU (GPGPU) VISION 2011 8

CLASSIFICATION OF ARCHITECTURES FPGA Field programmable gate arrays can be found in many imaging devices, but are not an inherent part of a computer system. This architecture is a reconfigurable integrated circuit. Multiple instructions on single data (MISD) High data throughput Massive parallelism Flexibility in spatial and temporal parallelism and tick frequency Direct hardware access (Trigger I/O, CC) VISION 2011 9

FEASIBILITY ANALYSIS DEFINING PRE-PROCESSING STEPS The individual processing steps are defined/identified during a feasibility study. First approach is based on software tools (CPU) Effort estimation of each step If computational power of a CPU-System is sufficient yes : select CPU no : profiling of processing steps Evaluate FPGA or GPU solution VISION 2011 10

PROCESSING STEPS OPERATOR TYPES Nearly all standard pre-processing functions fit into 5 general groups: Pixel operations: (in)homogeneous Histogram or LUT based functions Neighbour operations Random access operations Geometrical transformations VISION 2011 11

ARCHITECTURE AND OPERATION PERFORMANCE Architectures versus their general suitability for the different kinds of operators: Operation CPU GPU FPGA Pixel + ++ ++ Histogram / LUT ++ - ++ Neighbour / Kernel - ++ ++ Random Access ++ - - Geometrical Transform + ++ - Architecture SISD SIMD SIMD MISD VISION 2011 12

SYSTEM SETUP ARCHITECTURES IN ACQUISITION CHANNEL Each architecture for pre-processing may be used in different positions of the acquisition channel: Source / Camera FPGA Output / Interface FPGA Input / Grabber FPGA Host system CPU / GPU VISION 2011 13

POSSIBLE IMPROVEMENTS PERFORMANCE CONSIDERATIONS Different performance characteristics need to be considered with caution: Unit CPU GPU FPGA GFLOPS 107 ~5000 t.b.a. GOPS 153 ~5000 ~125000 RAM GB/s cache / pipeline 26 111 / - 320 - / - 17 - / 4000 RAM GB > 16 4 1 DMA GB/s n.a. 4 3,6 GPU CPU FPGA AMD Radeon HD 6990 (end user product) Intel Core i7 X 980 (4611MHz) SiliconSoftware microenable 5, Xilinx Virtex 6 series ~250MHz VISION 2011 14

THE WAY TO GO FASTER Evaluate FPGA or GPU solution If today s CPUs computational power is insufficient for the required processing steps. Architectures adaptable Pre-Processing steps convertible Flexibility and delay satisfactory Development tools available Time to market Combined processing option VISION 2011 15

CONVERTING POSSIBLE PROCESSING FUNCTIONS Operation GPU FPGA Pixel Homogenous Binarization X X Color transform X X Inhomogenous Shading, Dynamic Thresholding X X Histogram / LUT Histogram X Filters Sobel, Median, FIR 14 x 14 32 x 32 Segmentation X X Erode, Dilate X X Transformation DCT, DFT, FFT, Wavelet X Geometrical X Hough X X Compression jpeg-encoding X X H.264 X Direct GPIO & Trigger control Integration into image source / acquisition interface possible possible VISION 2011 16

INTERFACING FLEXIBILITY AND DELAY SATISFACTORY FPGA pre-processing capabilities can be found in frame grabbers and cameras. Different interfaces may require porting of the processing. FPGA CameraLink, GigE, CoaXpress, CL HS Low power consumption Expandable / scalable Practically no delay Long term availability GPU No acquisition device Adaptable by DMA Multi-GPU (SLI, Xfire, ) Minimal delay caused by DMA transfer Portable implementation possible VISION 2011 17

TIME TO MARKET DEVELOPMENT TOOLS AVAILABLE Converting processing steps to a different hardware is currently not possible by just compiling it for a different architecture. Imaging libraries supporting conversion Common Vision Blox 2011, etc. GPU CVB + DirectX, OpenCL CUDA, StreamingSDK FPGA Silicon Software / Visual Applets & eva Teledyne DALSA / Sapera APF NI / LabView MATLAB / Simulink VISION 2011 18

FPGA / CPU / GPU COMBINED PROCESSING OPTIONS Most acquisition streams can utilize several architectures. Using these in combination for implementing image processing is an applicable and reasonable way. Camera (FPGA) Acquisition device (FPGA) Computer system (CPU & GPU) architectures are currently merging CPU + GPU FPGA + CPU Bright prospects VISION 2011 19

IN PRACTICE EXAMPLES CAN BE FOUND IN DIFFERENT FIELDS All of the following examples are realized projects. FPGA for filtering & segmentation, CPU for classification in linescan FPGA for synchronizing several linescan cameras and merging frames GPU for filtering images 7 grabbers using FPGA processing in parallel FPGA for jpeg compression in high speed recording GPU for perspective correction & de-bayering, CPU for compression See 2 of our demos including FPGA (pre-)processing, hall 4, booth C51 VISION 2011 20

OUR VALUE ADD IN RELIABLE PARTNERSHIP WITH YOU Intelligently integrated components and perfect service add value to your business. Optimised components Specific development Partner programs Advanced integration VISION 2011 21

OUR PRODUCTS Illumination Optics Cameras Cabling Acquisition Software Systems Accessories VISION 2011 22

Thank you for your attention! STEMMER IMAGING GmbH Gutenbergstraße 9 13 82178 Puchheim, Germany Phone: +49 89 80902-294 Fax: +49 89 80902-116 info@stemmer-imaging.de www.stemmer-imaging.de Dipl.-Ing. (FH) Bjoern Rudde Image Acquisition Development b.rudde@stemmer-imaging.de