GPGPU Success and Failure Stories

1 T Seminar on GPGPU Programming 4th February 2010

2 Outline
  1 Overview
  2 Success Stories
    - H1N1 flu virus
    - Military computation
    - Video postprocessing
  3 Failure Stories
    - Intel's complaints
    - Acceleware cuts staff in half
    - End of the GPU roadmap
    - Crypto breakthrough on FPGA
  4 References

3 Application areas I: GPGPU application areas
- Mathematics: linear algebra, equation solving, Fourier transform
- Medicine: Magnetic Resonance Imaging (MRI), protein folding
- Simulation
- Visualization

4 Application areas II: GPGPU application areas
- Military: signal processing (radars, etc.)
- Imaging: image restoration and filtering, rendering, object detection, encoding/decoding, ray tracing [1]

5 Success Stories: Areas where GPGPUs are used successfully
- Medicine
- Military computation
- Image processing

6 H1N1 flu virus
- GPGPU used to accelerate cell structure visualization [10, 2, 8]
- Klaus Schulten and John Stone from the University of Illinois
- NAMD and VMD, the research software used to simulate and visualize cell structures, were parallelized for GPGPU
- NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
- VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems.
- The aim is to predict how H1N1 will evolve, in order to know what to do in future pandemic outbreaks

7 H1N1 flu virus: NAMD & VMD examples
[Figures: NAMD, VMD]

8 H1N1 flu virus: Was the attempt a success?
- Used NVIDIA GPUs; Schulten mentions Tesla in his interview [2].
- They report times faster computation on the GPU
- So the attempt was a success
- However, 24 hours of non-stop running on a dual-GPU GeForce GTX 295 yields a protein folding simulation of (only) 1400 nanoseconds [10].
- In the article on parallelizing NAMD, Phillips et al. speak of up to 7-fold speedups [8], not hundredfold.
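To put the 1400 ns figure in perspective, the following minimal sketch converts it into wall-clock time for longer simulated durations. It assumes that throughput stays constant at the rate quoted in [10] and that the simulated system does not change; the function name and the target durations are illustrative and do not come from the slides.

```python
# Simulated time achieved per 24 hours of running, as quoted on the slide [10].
NS_PER_DAY = 1400.0


def days_to_simulate(target_ns: float, ns_per_day: float = NS_PER_DAY) -> float:
    """Wall-clock days needed to simulate target_ns nanoseconds,
    assuming a constant simulation rate (an assumption, not a measurement)."""
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    return target_ns / ns_per_day


if __name__ == "__main__":
    # Hypothetical targets: one microsecond and one millisecond of simulated time.
    for label, target_ns in [("1 microsecond", 1e3), ("1 millisecond", 1e6)]:
        print(f"{label}: {days_to_simulate(target_ns):.1f} days")
```

At this rate a single simulated microsecond takes most of a day, and a simulated millisecond would take roughly two years of continuous running, which is why the slide calls the result "(only)" 1400 nanoseconds.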

9 Military computation: Defense applications
- Analyzing data from radars, infrared sensors and video [11]
- GPU VSIPL [5] implements VSIPL on GPGPU (currently uses CUDA)
- VSIPL: Vector, Signal, and Image Processing Library
- Georgia Tech Research Institute and Georgia Tech School of Electrical and Computer Engineering
- Military programmers do not need to know GPGPU programming; they just use the GPU VSIPL library
- GPU VSIPL is used to process radar, infrared sensor and video data
- VSIPL functions operate times faster than on the CPU, with much lower cost and power
- Used at least a GeForce GTX 280 [11]
- Looks like a success story
- The GPU VSIPL website promotes a 75x speedup for a range-Doppler map [5].

10 Video postprocessing: Moon-walk video recovery
- Recovering the original Apollo 11 video [12]
- NASA accidentally recorded over the moon-walk video
- John Lowry of Lowry Digital set out to recover the original recording from the video tapes
- Lowry Digital is a company that offers movie restoration, video image enhancement, and upscaling services, among others.
- They have worked, for example, on the processing of the movie Avatar.

11 Video postprocessing: Moon-walk video recovery
- Recovering the original Apollo 11 video [12]
- Four sources: the overwritten video, rescanned versions of the video and a cam-recording of the Mission Control TV monitor playing the video
- Temporal image processing to remove transient noise, corrections, etc.
- Use of Tesla GPUs to speed up image processing by two orders of magnitude (100x?), reducing frame processing time to one minute (NVIDIA's estimate)
- Possibly another success story; no final report for the project
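The slide does not say which temporal filter was used. As an illustration only, a per-pixel temporal median is one common way to suppress transient noise across aligned frames; the sketch below is not Lowry Digital's method, and the frame representation (nested lists of grey levels) and function name are assumptions made for the example.

```python
from statistics import median

# A frame is a list of rows; each row is a list of grey-level values.
Frame = list[list[float]]


def temporal_median(frames: list[Frame]) -> Frame:
    """Return the per-pixel median over a stack of aligned, same-sized frames.

    A transient speck that appears in only a minority of the frames is
    replaced by the value that pixel has in the other frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    if any(len(f) != height or any(len(row) != width for row in f) for f in frames):
        raise ValueError("all frames must have the same dimensions")
    return [
        [median(f[y][x] for f in frames) for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # Three tiny 1x3 frames; the middle frame has a noise spike at x = 1.
    clean = [[10, 10, 10]]
    noisy = [[10, 255, 10]]
    print(temporal_median([clean, noisy, clean]))
```

Every output pixel depends only on the same pixel position in the input frames, so the work is independent per pixel. That kind of independence is what makes such filters a natural fit for GPUs.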

12 Failure Stories: Areas where GPGPUs have failed
- General software parallelization
- More realistic rendering
- Specialized computation (cryptography)

13 Intel's complaints: Intel problem reports
- Intel's report [6]: Intel complaining about the GPGPU architecture
  - Maximal performance gains are hard to reach outside a few application areas
  - Restricted architecture
  - Programming model
- Larrabee delayed [7]; would be closer to CPU programming (see footnote 1)
  - First version just not cool enough
  - Memory bandwidth a problem
  - Better luck on the next version? Or the 3rd, 4th?
Footnote 1: Westmere processors are going to have an integrated GPU [13]

14 Acceleware cuts staff in half
- Problem report [3]
- Acceleware produces software that allows vendors to utilize parallel processing without modifying their software (Wikipedia)
- Libraries offered as SDKs/APIs or plug-ins.
- In 2008, staff is cut in half, starting with CEO Sean Krakiwsky [3]
  - January: Acceleware starts work on the seismic migration market
  - March: Acceleware enters the image reconstruction market
  - July: staff cut in half
- Wikipedia cites poor market conditions and lack of funding
- Today, the focus is on electromagnetics, seismic, and engineering simulation markets

15 End of the GPU roadmap: Future Graphics Adapter
[Figure: Rest is just computation]

16 End of the GPU roadmap: GPU coming close to the CPU
- Tim Sweeney (Epic Games) [9] (see footnote 2)
- GPU programming too limited
  - Shader programs, antialiasing, texture sampling, frame buffer
  - Return to software rendering, bypass OpenGL/DirectX
  - Ray tracing
- GPU programming too hard
  - If the single-threaded version costs X, the multithreaded version costs 2X and the GPGPU version costs 10X or more
  - Over 2X cost is uneconomical
  - Productivity more important than performance
- Future hardware: unified architecture
  - Scalar code and vector code
  - Westmere processors (a few slides ago)
Footnote 2: Keynote speech at the ACM Conference on High Performance Graphics

17 Crypto breakthrough on FPGA: Crypto success on parallel algorithms
- FPGA much better than GPUs [4]
- 56-bit DES decryption, throughput of 280+ billion keys per second (KPS)
- Single hardware-accelerated server (Pico FPGA cluster, each FPGA processing 1.6 billion KPS)
- CPUs process 16M KPS, GPUs (GTX 295) 250M KPS
- Key recovery takes years on GPUs; less than three days with FPGAs.
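The "years versus days" claim can be checked with simple arithmetic from the quoted throughputs. The sketch below assumes a worst-case exhaustive search over all 2^56 DES keys on a single device of each kind (on average a key would be found in about half that time); the constant and function names are illustrative.

```python
# Number of possible 56-bit DES keys.
KEYSPACE = 2 ** 56

# Throughputs in keys per second (KPS), as quoted on the slide [4].
RATES_KPS = {
    "CPU": 16e6,
    "GPU (GTX 295)": 250e6,
    "FPGA cluster": 280e9,
}

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def exhaustive_search_seconds(rate_kps: float) -> float:
    """Worst-case time in seconds to try every key at the given rate."""
    if rate_kps <= 0:
        raise ValueError("rate_kps must be positive")
    return KEYSPACE / rate_kps


if __name__ == "__main__":
    for name, rate in RATES_KPS.items():
        seconds = exhaustive_search_seconds(rate)
        print(f"{name}: {seconds / SECONDS_PER_DAY:,.1f} days "
              f"({seconds / SECONDS_PER_YEAR:,.2f} years)")
```

Under these assumptions the FPGA cluster needs just under three days for the full key space and a single GTX 295 roughly nine years, which is consistent with the figures on the slide.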

18 References I
[1] T. Aila and S. Laine. Understanding the efficiency of ray traversal on GPUs. In Proceedings of the ACM Conference on High Performance Graphics, pages ,
[2] L. Barney. Studying the H1N1 virus using NVIDIA GPUs, Nov Referenced on 28th January
[3] C. Demerjian. Acceleware cuts half its staff, Jul Referenced on 28th January
[4] Dr. Dobbs. Parallel algorithm leads to crypto breakthrough, Jan Referenced on 29th January
[5] Georgia Tech Research Institute. GPU vector, signal, and image processing library, Referenced on 28th January
[6] A. Ghuloum. The problem(s) with GPGPU, Oct Referenced on 28th January 2010.

19 References II
[7] G. Pfister. The problem with Larrabee, Jan Referenced on 29th January
[8] J. C. Phillips, J. E. Stone, and K. Schulten. Adapting a message-driven parallel application to GPU-accelerated clusters. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, pages 1-9,
[9] T. Sweeney. The end of the GPU roadmap. Keynote speech at the ACM Conference on High Performance Graphics, Jun Referenced on 29th January
[10] T. Valich. Researchers battle H1N1 flu virus using GPGPU technology, Nov researches-battle-h1n1-flu-virus-using-gpgpu-technology.aspx. Referenced on 28th January
[11] A. Vogel. Inexpensive parallel processing: Programming tools facilitate use of video game processors for defense needs, Jun Referenced on 28th January 2010.

20 References III
[12] R. Wilson. DSP brings you a high-definition moon walk, Sep Referenced on 28th January
[13] M. Yam. Intel to introduce GPGPU functions into Westmere, Dec Referenced on 29th January 2010.


More information

NVIDIA GeForce GTX 750 Ti

NVIDIA GeForce GTX 750 Ti Whitepaper NVIDIA GeForce GTX 750 Ti Featuring First-Generation Maxwell GPU Technology, Designed for Extreme Performance per Watt V1.1 Table of Contents Table of Contents... 1 Introduction... 3 The Soul

More information

Interactive Level-Set Segmentation on the GPU

Interactive Level-Set Segmentation on the GPU Interactive Level-Set Segmentation on the GPU Problem Statement Goal Interactive system for deformable surface manipulation Level-sets Challenges Deformation is slow Deformation is hard to control Solution

More information

Introduction to Parallel and Heterogeneous Computing. Benedict R. Gaster October, 2010

Introduction to Parallel and Heterogeneous Computing. Benedict R. Gaster October, 2010 Introduction to Parallel and Heterogeneous Computing Benedict R. Gaster October, 2010 Agenda Motivation A little terminology Hardware in a heterogeneous world Software in a heterogeneous world 2 Introduction

More information

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui Hardware-Aware Analysis and Optimization of Stable Fluids Presentation Date: Sep 15 th 2009 Chrissie C. Cui Outline Introduction Highlights Flop and Bandwidth Analysis Mehrstellen Schemes Advection Caching

More information

Realtime 3D Computer Graphics Virtual Reality. Graphics

Realtime 3D Computer Graphics Virtual Reality. Graphics Realtime 3D Computer Graphics Virtual Reality Graphics Computer graphics 3D-Computer graphics (3D-CG) currently used for Simulators, VR, Games (real-time) Design (CAD) Entertainment (Movies), Art Education

More information

Das Ising-Modell auf Grafikkarten

Das Ising-Modell auf Grafikkarten Das Ising-Modell auf Grafikkarten Institute of Physics, Johannes Gutenberg-University of Mainz Center for Polymer Studies, Department of Physics, Boston University Artemis Capital Asset Management GmbH

More information

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis artínez, Gerardo Fernández-Escribano, José. Claver and José Luis Sánchez 1. Introduction 2. Technical Background 3. Proposed DVC to H.264/AVC

More information

GPU Parallel Computing Architecture and CUDA Programming Model

GPU Parallel Computing Architecture and CUDA Programming Model GPU Parallel Computing Architecture and CUDA Programming Model John Nickolls Outline Why GPU Computing? GPU Computing Architecture Multithreading and Arrays Data Parallel Problem Decomposition Parallel

More information

Writing Applications for the GPU Using the RapidMind Development Platform

Writing Applications for the GPU Using the RapidMind Development Platform Writing Applications for the GPU Using the RapidMind Development Platform Contents Introduction... 1 Graphics Processing Units... 1 RapidMind Development Platform... 2 Writing RapidMind Enabled Applications...

More information

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/2015. 2015 CAE Associates High Performance Computing (HPC) CAEA elearning Series Jonathan G. Dudley, Ph.D. 06/09/2015 2015 CAE Associates Agenda Introduction HPC Background Why HPC SMP vs. DMP Licensing HPC Terminology Types of

More information

GeoImaging Accelerator Pansharp Test Results

GeoImaging Accelerator Pansharp Test Results GeoImaging Accelerator Pansharp Test Results Executive Summary After demonstrating the exceptional performance improvement in the orthorectification module (approximately fourteen-fold see GXL Ortho Performance

More information

Parallel Computing with MATLAB

Parallel Computing with MATLAB Parallel Computing with MATLAB Scott Benway Senior Account Manager Jiro Doke, Ph.D. Senior Application Engineer 2013 The MathWorks, Inc. 1 Acceleration Strategies Applied in MATLAB Approach Options Best

More information

Introduction to GPU hardware and to CUDA

Introduction to GPU hardware and to CUDA Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 37 Course outline Introduction to GPU hardware

More information

GPU Accelerated Monte Carlo Simulations and Time Series Analysis

GPU Accelerated Monte Carlo Simulations and Time Series Analysis GPU Accelerated Monte Carlo Simulations and Time Series Analysis Institute of Physics, Johannes Gutenberg-University of Mainz Center for Polymer Studies, Department of Physics, Boston University Artemis

More information

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*

More information

Choosing a Computer for Running SLX, P3D, and P5

Choosing a Computer for Running SLX, P3D, and P5 Choosing a Computer for Running SLX, P3D, and P5 This paper is based on my experience purchasing a new laptop in January, 2010. I ll lead you through my selection criteria and point you to some on-line

More information

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK Steve Oberlin CTO, Accelerated Computing US to Build Two Flagship Supercomputers SUMMIT SIERRA Partnership for Science 100-300 PFLOPS Peak Performance

More information

THE PROGRAMMER S GUIDE TO THE APU GALAXY. Phil Rogers, Corporate Fellow AMD

THE PROGRAMMER S GUIDE TO THE APU GALAXY. Phil Rogers, Corporate Fellow AMD THE PROGRAMMER S GUIDE TO THE APU GALAXY Phil Rogers, Corporate Fellow AMD THE OPPORTUNITY WE ARE SEIZING Make the unprecedented processing capability of the APU as accessible to programmers as the CPU

More information

Optimizing a 3D-FWT code in a cluster of CPUs+GPUs

Optimizing a 3D-FWT code in a cluster of CPUs+GPUs Optimizing a 3D-FWT code in a cluster of CPUs+GPUs Gregorio Bernabé Javier Cuenca Domingo Giménez Universidad de Murcia Scientific Computing and Parallel Programming Group XXIX Simposium Nacional de la

More information

Comparing CPU and GPU in OLAP Cube Creation

Comparing CPU and GPU in OLAP Cube Creation Comparing CPU and GPU in OLAP Cube Creation SOFSEM 2011 Krzysztof Kaczmarski Faculty of Mathematics and Information Science Warsaw University of Technology 22-28 January 2011 Outline 1 Introduction Introduction

More information

High Performance Computing in CST STUDIO SUITE

High Performance Computing in CST STUDIO SUITE High Performance Computing in CST STUDIO SUITE Felix Wolfheimer GPU Computing Performance Speedup 18 16 14 12 10 8 6 4 2 0 Promo offer for EUC participants: 25% discount for K40 cards Speedup of Solver

More information

Overview of High Performance Computing

Overview of High Performance Computing Overview of High Performance Computing Timothy H. Kaiser, PH.D. tkaiser@mines.edu http://geco.mines.edu/workshop 1 This tutorial will cover all three time slots. In the first session we will discuss the

More information

Interactive Level-Set Deformation On the GPU

Interactive Level-Set Deformation On the GPU Interactive Level-Set Deformation On the GPU Institute for Data Analysis and Visualization University of California, Davis Problem Statement Goal Interactive system for deformable surface manipulation

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:

More information