Introduction to GPU Computing



Similar documents
Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Introduction to GPGPU. Tiziano Diamanti

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

NVIDIA GeForce GTX 580 GPU Datasheet

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Introduction to GPU Programming Languages

Next Generation GPU Architecture Code-named Fermi

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

Parallel Programming Survey

HP Workstations graphics card options

RWTH GPU Cluster. Sandra Wienke November Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

L20: GPU Architecture and Models

Introduction to GPU hardware and to CUDA

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With

~ Greetings from WSU CAPPLab ~

GPU Architecture. Michael Doggett ATI

QCD as a Video Game?

Programming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga

HP Workstations graphics card options

NVIDIA GeForce Experience

The High Performance Internet of Things: using GVirtuS for gluing cloud computing and ubiquitous connected devices

Introduction to GPU Architecture

GPGPU Computing. Yong Cao

HP Z Workstations graphics card options

An OpenCL Candidate Slicing Frequent Pattern Mining Algorithm on Graphic Processing Units*

SAPPHIRE TOXIC R9 270X 2GB GDDR5 WITH BOOST

ACCELERATING SELECT WHERE AND SELECT JOIN QUERIES ON A GPU

Appendix L. General-purpose GPU Radiative Solver. Andrea Tosetto Marco Giardino Matteo Gorlani (Blue Engineering & Design, Italy)

GPGPU for Real-Time Data Analytics: Introduction. Nanyang Technological University, Singapore 2

Transcend the Vision. Embedded Graphic Solutions that Lead to New Territory. Embedded Graphic Solutions.

Evaluation of CUDA Fortran for the CFD code Strukti

Clustering Billions of Data Points Using GPUs

SAPPHIRE VAPOR-X R9 270X 2GB GDDR5 OC WITH BOOST

Case Study on Productivity and Performance of GPGPUs

Video Conferencing System Requirements

IP Video Rendering Basics

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

Parallel Image Processing with CUDA A case study with the Canny Edge Detection Filter

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

QuickSpecs. NVIDIA Quadro M GB Graphics INTRODUCTION. NVIDIA Quadro M GB Graphics. Overview

LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR

¹ Autodesk Showcase 2016 and Autodesk ReCap 2016 are not supported in 32-Bit.

System requirements for Autodesk Building Design Suite 2017

Applying Parallel and Distributed Computing for Image Reconstruction in 3D Electrical Capacitance Tomography

ultra fast SOM using CUDA

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

GPU for Scientific Computing. -Ali Saleh

AMD EMBEDDED PCIe ADD-IN BOARD Comparison

Computer Graphics Hardware An Overview

ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop. Emily Apsey Performance Engineer

High Performance GPGPU Computer for Embedded Systems

System Requirements. Autodesk Building Design Suite Standard 2013

Introduction to Numerical General Purpose GPU Computing with NVIDIA CUDA. Part 1: Hardware design and programming model

GPU-BASED TUNING OF QUANTUM-INSPIRED GENETIC ALGORITHM FOR A COMBINATORIAL OPTIMIZATION PROBLEM

Parallel Firewalls on General-Purpose Graphics Processing Units

For designers and engineers, Autodesk Product Design Suite Standard provides a foundational 3D design and drafting solution.

Turbomachinery CFD on many-core platforms experiences and strategies

Accelerating Intensity Layer Based Pencil Filter Algorithm using CUDA

GPGPU accelerated Computational Fluid Dynamics

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez

Benchmark Hadoop and Mars: MapReduce on cluster versus on GPU

Accelerating CFD using OpenFOAM with GPUs

Choosing a Computer for Running SLX, P3D, and P5

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university

The Future Of Animation Is Games

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

Go Faster - Preprocessing Using FPGA, CPU, GPU. Dipl.-Ing. (FH) Bjoern Rudde Image Acquisition Development STEMMER IMAGING

GPU-based Decompression for Medical Imaging Applications

PLANNING FOR DENSITY AND PERFORMANCE IN VDI WITH NVIDIA GRID JASON SOUTHERN SENIOR SOLUTIONS ARCHITECT FOR NVIDIA GRID

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

Speeding Up RSA Encryption Using GPU Parallelization

Optimizing a 3D-FWT code in a cluster of CPUs+GPUs

Guided Performance Analysis with the NVIDIA Visual Profiler

ST810 Advanced Computing

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

QuickSpecs. NVIDIA Quadro K1200 4GB Graphics INTRODUCTION PERFORMANCE AND FEATURES. Overview

GPUs for Scientific Computing

Trends in High-Performance Computing for Power Grid Applications

GPU File System Encryption Kartik Kulkarni and Eugene Linkov

SAPPHIRE HD GB GDDR5 PCIE.

NVIDIA Quadro K5200 Amazing 3D Application Performance and Capability

Revit products will use multiple cores for many tasks, using up to 16 cores for nearphotorealistic

TESLA K10 GPU ACCELERATOR

Autodesk Building Design Suite 2012 Standard Edition System Requirements... 2

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

Boundless Security Systems, Inc.

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Overview. NVIDIA Quadro K5200 8GB Graphics J3G90AA

White Paper OpenCL : The Future of Accelerated Application Performance Is Now. Table of Contents

AMD GPU Architecture. OpenCL Tutorial, PPAM Dominik Behr September 13th, 2009

GPU Accelerated Monte Carlo Simulations and Time Series Analysis

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU

QuickSpecs. NVIDIA Quadro K2200 4GB Graphics INTRODUCTION. NVIDIA Quadro K2200 4GB Graphics. Technical Specifications

Transcription:

Matthis Hauschild Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme December 4, 2014 M. Hauschild - 1

Table of Contents 1. Architecture of a GPU 2. General-purpose computing on GPUs 3. Applications of GPGPU 4. Performance evaluation examples M. Hauschild - 2

Architecture of a GPU What is a GPU Graphics processing unit Main GPU manufacturers 1. Intel 2. AMD 3. Nvidia Performance characteristics: 1 GPU architecture: 28 nm GPU speed: 1 GHz Memory amount: 8 GiB GDDR5 Memory bandwidth: 640 GiB/s 1 based on the AMD Radeon R9 series (cf.[1]) M. Hauschild - 3

Architecture of a GPU Difference between GPU and CPU[3] CPU optimized for single thread execution GPU optimized for multiple data execution M. Hauschild - 4

Architecture of a GPU Architecture of a GPU[4] based on the Nvidia Fermi architecture: M. Hauschild - 5

Architecture of a GPU Architecture of a GPU[4] M. Hauschild - 6

Architecture of a GPU Architecture of a GPU[4] Summary of the Nvidia Fermi architecture: 16 Streaming Multiprocessors (SM) 32 CUDA cores per SM = 512 CUDA cores 512 FMA op/clock it is great for generating graphics, but what else could be done with it? M. Hauschild - 7

General-purpose computing on GPUs What is GPGPU[5] General-purpose computing on graphics processing units Using GPU for non-graphical computations Good for data parallelism Bad for instruction parallelism First use in LU factorization Became popular at 2001 with matrix multiplication Started using DirectX and OpenGL M. Hauschild - 8

General-purpose computing on GPUs GPGPU Frameworks Brook One of the earliest GPU frameworks by Stanford University CUDA Proprietary Nvidia-only framework OpenCL Open source general framework by Khronos Group C++ AMP Open C++ extension by Microsoft OpenACC C, C++ and Fortran extension ArrayFire Wrapper for CUDA, OpenCL, etc. M. Hauschild - 9

Applications of GPGPU General applications of GPGPU Again, GPGPU can only be superior to CPU computing, if the same algorithm is applied to a lot of data (data parallelism) For example: k-nearest neighbor Fast Fourier Transform Segmentation Audio Processing CT reconstruction Weather forecasting Cryptography Database operations M. Hauschild - 10

Applications of GPGPU Applications of GPGPU in Robotics[2] For example: Generally many image processing tasks Frame transformation Inverse kinematic calculation 3D pose estimation Point-set registration M. Hauschild - 11

Universität Hamburg Performance evaluation examples Performance evaluation examples Test 1 Sobel operator on a real image using OpenCL Measurement of the possible frames per second On GPU and CPU Test 2 Matrix multiplication of two squared matrices using OpenCL Measurement of time needed for calculation On GPU and CPU M. Hauschild - 12

Performance evaluation examples Performance evaluation examples - System characteristics My CPU: Model: AMD Phenom II X4 965 Clock speed: 3400 MHz Misc: 4 Cores, SSE3 My GPU: Model: AMD Radeon HD 6950, Memory: 2048 MB Core clock: 800 MHz Memory clock: 1250 MHz Memory bandwidth: 160 GB/s My RAM: 8 GB M. Hauschild - 13

Performance evaluation examples Performance evaluation examples - Test 1 The Sobel operator: 3. s = dx 2 + dy 2 M. Hauschild - 14

Performance evaluation examples M. Hauschild - 15

Performance evaluation examples Performance evaluation examples - Test 1 M. Hauschild - 16

Performance evaluation examples Performance evaluation examples - Test 2 Matrix Multiplication 2 : 2 from http://www.mathematrix.de/wp-content/uploads/matrixmul2.png M. Hauschild - 17

Performance evaluation examples Performance evaluation examples - Test 2 M. Hauschild - 18

Performance evaluation examples Thank you for your attention! Matthis Hauschild 3hauschi@informatik.uni-hamburg.de Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme M. Hauschild - 19

Performance evaluation examples Bibliography [1] AMD. AMD Radeon TM R9 Grafikkartenserie, 2014. http://www.amd.com/de-de/products/graphics/desktop/r9#. [2] J. Bedkowski and A. Maslowski. GPGPU computation in mobile robot applications. Warsaw University of Technology, 2012. [3] Nvidia. CUDA C Programming Guide, 2014. http://docs.nvidia.com/cuda/pdf/cuda_c_programming_guide.pdf. [4] Nvidia. NVIDIA s Next Generation CUDA Compute Architecture: Fermi, 2014. http://www.nvidia.de/content/pdf/fermi_white_papers/ NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf. [5] Wikipedia. General-purpose computing on graphics processing units, 2014. http://en.wikipedia.org/wiki/general-purpose_computing_on_ graphics_processing_units. M. Hauschild - 20