Introduction to GPGPU. Tiziano Diamanti
- Samantha Johnston
Transcription
1
2 Agenda: From GPUs to GPGPUs; GPGPU architecture; CUDA programming model
3 Perspective projection Vectors that connect the vanishing point to every point of the 3D model intersect the XY plane. The points of intersection form the projected object.
4 An example of perspective projection $P_x = \frac{Z_x Q_z - Q_x Z_z}{Q_z - Z_z}$, $P_y = \frac{Z_y Q_z - Q_y Z_z}{Q_z - Z_z}$ Where: P $(P_x, P_y)$ is the pixel on the screen; Q $(Q_x, Q_y, Q_z)$ is the starting point in 3D coordinates; Z $(Z_x, Z_y, Z_z)$ is the vanishing point. The projection is on the XY plane for simplicity, so z = 0.
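The projection can be read as the intersection of the line through Z and Q with the plane z = 0. The following is a minimal Python sketch of that calculation; the function name `project` and the tuple layout are assumptions made for illustration, not part of the original slides.

```python
def project(q, z):
    """Project the 3D point q onto the plane z = 0, as seen from the vanishing point z.

    q and z are (x, y, z) tuples. Returns the 2D point (px, py).
    """
    qx, qy, qz = q
    zx, zy, zz = z
    denom = qz - zz
    if denom == 0:
        # The line through Z and Q is parallel to the XY plane: no intersection.
        raise ValueError("Q and Z have the same z coordinate")
    px = (zx * qz - qx * zz) / denom
    py = (zy * qz - qy * zz) / denom
    return px, py


# Vanishing point one unit behind the plane, model point one unit in front:
# the line crosses z = 0 exactly halfway between the two.
print(project((2.0, 4.0, 1.0), (0.0, 0.0, -1.0)))  # (1.0, 2.0)
```

A point that already lies on the XY plane projects onto itself, which is a quick sanity check for the formula.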
5 Hidden line removal Many algorithms for solving this problem can be found in the literature, such as the painter's algorithm.
6 Rasterization Graphic primitives are transformed into pixels of the frame buffer.
7 Z-Buffer When using filled polygons instead of lines, there is a method that is easily implemented in hardware to solve the depth problem: the Z-buffer. This buffer has the same size as the viewport and stores, for each pixel that has been drawn, its depth value, where depth is the distance to the observer. The color of a pixel is changed if and only if the new depth value is less than the stored one. In this way the polygons closer to the observer cover the more distant ones, because their pixels overwrite those of the polygons that are further away.
8 Z-Buffer [Figure: top view of the eye looking at two polygons at Z = -0.5 and Z = -0.3, and the final image]
9 The Z-Buffer algorithm Step 1: Initialization/enabling of the depth buffer
10 The Z-Buffer algorithm Step 2: OpenGL stores the z coordinates of the polygons as they are rendered on the screen. [Figure: eye looking at polygons at Z = -0.5 and Z = -0.3]
11 The Z-Buffer algorithm Step 3: draw the polygons according to their z position. [Figure: eye looking at polygons at Z = -0.5 and Z = -0.3]
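The three steps can be sketched in software as follows. This is a minimal Python illustration, not the hardware or OpenGL implementation: the names `render` and `rect` are hypothetical, fragments are assumed to arrive as `(x, y, depth, color)` tuples, and depth is taken as a positive distance to the observer (so the polygons at Z = -0.5 and Z = -0.3 become depths 0.5 and 0.3).

```python
import math


def render(width, height, fragments, background=" "):
    """Resolve visibility with a depth buffer and return the color buffer."""
    # Step 1: initialize the depth buffer to "infinitely far away".
    depth = [[math.inf] * width for _ in range(height)]
    color = [[background] * width for _ in range(height)]
    for x, y, d, c in fragments:
        # Steps 2 and 3: keep a fragment only if it is closer than what is stored.
        if d < depth[y][x]:
            depth[y][x] = d
            color[y][x] = c
    return color


def rect(x0, y0, x1, y1, d, c):
    """Fragments of an axis-aligned rectangle at constant depth d."""
    return [(x, y, d, c) for y in range(y0, y1) for x in range(x0, x1)]


far = rect(0, 0, 4, 1, 0.5, "A")   # polygon further from the eye
near = rect(2, 0, 6, 1, 0.3, "B")  # polygon closer to the eye
print("".join(render(6, 1, far + near)[0]))  # AABBBB
print("".join(render(6, 1, near + far)[0]))  # AABBBB
```

Both draw orders give the same image, which is the point of the technique: unlike the painter's algorithm, no sorting of the polygons is required.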
12 Texture mapping Texture mapping applies a bitmap image to a two-dimensional polygon.
13 Rendering Pipeline Stages: Application, Vertex Processor, Rasterizer, Fragment Processor, Buffer. Data passed between stages: vertices, transformed vertices, fragments, textured fragments. [Figure labels: vertex connections, fragment position, wooden texture]
14 The first graphics computers The first graphics supercomputers were typically SGI machines; they had hardware acceleration and were used in areas such as military or aviation simulation.
15 3D accelerators for PC PC graphics accelerators have been available since 1997. Progress is very fast (about one new generation every year).
16 1999 Nvidia Riva TNT: 128-bit bus and graphics engine; 180 million pixels/s fill rate; 6 million triangles/s peak; 16 MB frame buffer
17 The AGP bus Accelerated Graphics Port, introduced by Intel in 1997. The PCI bus was a 32-bit bus running at 33 MHz, so its bandwidth was 33 MHz * 4 bytes = 133 MB/s. The AGP bus (1x) had a frequency of 66 MHz and a width of 32 bits, so its bandwidth was 266 MB/s. AGP 2x offered 533 MB/s, AGP 4x doubled it again to 1066 MB/s, and AGP 8x offered about 2 GB/s.
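The bandwidth figures follow from clock frequency times bus width times the number of transfers per clock. A short Python check of that arithmetic is given below; the helper name `bus_bandwidth_mb_s` is an assumption for illustration, and the clocks are taken at their nominal values of 33.33 MHz and 66.66 MHz rather than the rounded 33 and 66.

```python
def bus_bandwidth_mb_s(clock_mhz, width_bits, transfers_per_clock=1):
    """Peak bandwidth in MB/s: clock (MHz) * bytes per transfer * transfers per clock."""
    return clock_mhz * (width_bits / 8) * transfers_per_clock


PCI_CLOCK_MHZ = 100 / 3  # nominal 33.33 MHz
AGP_CLOCK_MHZ = 200 / 3  # nominal 66.66 MHz

print(int(bus_bandwidth_mb_s(PCI_CLOCK_MHZ, 32)))  # 133
for multiplier in (1, 2, 4, 8):
    # AGP 2x/4x/8x keep the 66 MHz clock and transfer several times per cycle.
    print(multiplier, int(bus_bandwidth_mb_s(AGP_CLOCK_MHZ, 32, multiplier)))
```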
18 2000 Nvidia GeForce: Transform & Lighting (for the first time, perspective and illumination are calculated on the GPU); 256-bit bus and graphics engine; 480 million pixels/s; 15 million triangles/s; 32 MB frame buffer
19 2001 Nvidia GeForce 3: 57 million transistors; first 3D chip with vertex and pixel shaders; 2 textures per pixel
20 2002 Nvidia GeForce 4 Ti: 63 million transistors; [...] million triangles/s; 128 MB frame buffer; vertex shader units were doubled
21 2002 Nvidia GeForce FX: 130 million transistors; 315 million triangles/s; 128/256 MB frame buffer; DirectX 9 vertex and pixel shaders
22 The PCI-Express bus Introduced by Intel in 2004. PCI-Express 16x offers 4 GB/s in each direction (to and from the GPU); this is increasingly important for reading back the results of calculations done on the GPU (GPGPU).
23 2004 Nvidia GeForce [...]: [...] million transistors; 128/256/512 MB frame buffer; 16 graphics pipelines for pixel shaders; 6 vertex shader units; DirectX 9.0c
24 Two graphics cards in a PC: Nvidia SLI, ATI CrossFire
25 2005: Nvidia GeForce [...]: [...] million transistors; 24 pixel pipelines; 8 vertex shader units; available only for PCI-Express; 15.6 billion pixels/s; 1400 million vertices/s
26 2006: Nvidia GeForce 8: Shader Model 4.0, geometry shader (DirectX 10); up to 768 MB of on-board memory; 36.8 billion pixels/s; 681 million transistors; 128 unified graphics pipelines; [...] million vertices/s
27 PCI Express 2.0 PCI Express 2.0 doubles the bus clock frequency of version 1.1, doubling the available bandwidth. It is backward compatible with the PCI Express 1.1 specification.
28 2008: Nvidia GeForce 9: Shader Model 4.0, geometry shader (DirectX 10); up to 1 GB of on-board memory; 43.2 billion pixels per second; support for PCI Express [...]; [...] million transistors; 128 graphics pipelines; 65 nm transistors
29 2008: GTX 200: Shader Model 4.0, geometry shader (DirectX 10); 240 streaming processors; 55 nm transistors; 51.8 billion pixels per second
30 New generation: Fermi The new generation of Nvidia graphics chips has been dubbed Fermi and is marketed as the GTX 400/500 series. The original design included 512 CUDA cores and up to 6 GB of GDDR5 memory. It is manufactured on TSMC's 40 nm process (Nvidia has always been fabless) _platform/b010101_40nm.htm
31 2010: Fermi: 512 CUDA cores, up to 6 GB of GDDR5 memory; TSMC 40 nm transistors. GeForce GTX 580: 512 CUDA cores, 1536 MB GDDR5. GeForce GTX 570: 480 CUDA cores, 1280 MB GDDR5.
32 New generation: Fermi
33 Fermi The GPU is organized in 4 Graphics Processing Clusters (GPC). Each GPC has 4 sub-units (streaming multiprocessors, SM), each with 32 streaming processors that execute the same instruction in parallel (in comparison, the GTX 200 chip had 8). Each SM has an L1 cache and shared memory. Each SM has 2 dispatch units.
34 Fermi introduces cache
35 Shared memory A sort of explicit cache. It resides on the chip, so it is much faster than the onboard memory. Size is 16 KB (48 KB on Fermi).
36 Fermi (3) NVIDIA introduces the GigaThread Engine, which allows concurrent kernel execution: threads belonging to different kernels can run simultaneously, which was not possible on previous-generation GPUs.
37 GF104 The GF104 chip, introduced with the GTX 460 graphics card, brings some hardware differences: each SM has 48 CUDA cores instead of 32, for a total of 384 cores. The GTX 460 has one SM disabled, for a total of 336 cores. The GTX 560 has the full 384 cores enabled.
38 GF104 To balance the increase in cores per SM, the dispatch units have been doubled from 2 to 4.
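The core counts quoted for the Fermi chips are products of the number of sub-units and the cores per sub-unit. The Python sketch below checks that arithmetic; the helper name `total_cores` is hypothetical, and the SM counts (16 for the full Fermi chip, 8 for GF104) are derived here from the figures in the slides rather than stated in them.

```python
def total_cores(sm_count, cores_per_sm, disabled_sms=0):
    """CUDA cores available when some sub-units (SMs) are disabled."""
    return (sm_count - disabled_sms) * cores_per_sm


# Full Fermi chip: 4 GPCs x 4 sub-units, 32 cores each.
fermi_sms = 4 * 4
print(total_cores(fermi_sms, 32))     # 512, as on the GTX 580
print(total_cores(fermi_sms, 32, 1))  # 480, matching the GTX 570

# GF104: 48 cores per sub-unit, 384 cores in total, hence 384 / 48 = 8 sub-units.
gf104_sms = 384 // 48
print(total_cores(gf104_sms, 48))     # 384, GTX 560
print(total_cores(gf104_sms, 48, 1))  # 336, GTX 460 with one sub-unit disabled
```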
39 Nvidia naming Mainstream & laptops: GeForce. Target: video games and multimedia. Workstation: Quadro. Target: graphics professionals who use CAD and 3D modeling applications; the price premium is due to more memory and especially to the specific drivers for accelerating applications. GPGPU: Tesla. Target: High Performance Computing.
40 Fermi: real products Mainstream: GeForce GTX 580: 512 CUDA cores, 1536 MB GDDR5; GeForce GTX 570: 480 CUDA cores, 1280 MB GDDR5. Computing (memory can be configured to be ECC): Tesla C2050: 448 CUDA cores, 3 GB GDDR5; Tesla C2070: 448 CUDA cores, 6 GB GDDR5. * Note: With ECC on, 12.5% of the GPU memory is used for ECC bits. For example, 3 GB total memory yields 2.625 GB of user-available memory with ECC on.
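The ECC note is a simple proportion: 12.5% of the memory is reserved, so 87.5% remains for the user. A one-function Python check, with the hypothetical helper name `usable_memory_gb`:

```python
def usable_memory_gb(total_gb, ecc_fraction=0.125):
    """Memory left for user data when a fraction of it stores ECC bits."""
    return total_gb * (1 - ecc_fraction)


print(usable_memory_gb(3))  # 2.625 (Tesla C2050)
print(usable_memory_gb(6))  # 5.25  (Tesla C2070)
```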
41 Tesla C2050 Double-precision floating-point performance (peak): 515 Gflops. Single-precision floating-point performance (peak): 1.03 Tflops. They were 78 and 933 Gflops for the previous generation.
42 Rendering Pipeline Stages: Application, Vertex Processor, Rasterizer, Fragment Processor, Buffer. Data passed between stages: vertices, transformed vertices, fragments, textured fragments. [Figure labels: vertex connections, fragment position, wooden texture]
43 Shading languages HLSL (Microsoft, 2002); Cg (Nvidia, 2002); GLSL (ARB, 2003); ASM shading languages (2001). Graphics APIs they build on: Direct3D (Microsoft, 1995); OpenGL (ARB, 1992).
44 GLSL: example /* Vertex shader */ void main() { gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex; } /* Fragment shader */ void main() { gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); }
45 High-level languages C-like syntax. Data types: vectors (from 1 to 4 floating-point, integer, or boolean components), matrices (2x2, 3x3, 4x4), arrays and textures. Conditions, loops, functions. Matrix and vector algebra. Special instructions: trigonometry, exponentials, geometry, interpolation.
46 GPGPU (General-Purpose computation using GPUs) Non-graphics use of the programmable shaders.
47 Future trends Power dissipation cannot be increased further: we are already at the limits of air cooling. Power consumption does not grow linearly with the clock: $P = C f V^2$, and since V is proportional to f, the relation is cubic. High clock rates therefore lead to very low efficiency. Multi-core processors can be beneficial: reducing the clock by 20% leads to a power saving of about 50%. It is more efficient to spend transistors on more cores than to turn up the clock of a single processor.
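The 20% / 50% figure follows directly from the cubic relation. The Python sketch below works it through; the function name `relative_power` is an assumption, and the two-core comparison additionally assumes that the workload scales perfectly across cores, which real programs rarely do.

```python
def relative_power(clock_ratio):
    """Power relative to the baseline when P = C * f * V^2 and V is proportional to f."""
    return clock_ratio ** 3


# One core slowed down by 20%: about half the power.
print(round(relative_power(0.8), 3))  # 0.512

# Two cores at 80% clock versus one core at full clock
# (assumes the work parallelizes perfectly across both cores).
throughput = 2 * 0.8
power = 2 * relative_power(0.8)
print(round(throughput, 3), round(power, 3))  # 1.6 1.024
```

Under these assumptions, two slower cores deliver 60% more throughput for roughly the same power as one fast core.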
48 Architecture of a GPU Nvidia GTX 580: bandwidth: [...] GB/s; estimated [...] Gflops. Intel Core i7-980X: max memory bandwidth: 25.6 GB/s; estimated 107 Gflops.
49 AMD's architecture: VLIW5 (Very Long Instruction Word 5) Designed to process a 4-component dot product (e.g. w, x, y, z) and a scalar component (e.g. lighting) at the same time. Found on the 6800 series and earlier models.
50 AMD's architecture: VLIW4 In games, VLIW5 reached an average efficiency of 3.4. Starting with the 6900 series, AMD introduced VLIW4. The space previously allocated to the t-unit can now be used to have more SIMDs. Drivers and compilers are more complicated on this architecture than on Nvidia's, because they need to exploit not only the parallelism across SIMDs but also the vectorization inside each SIMD.
51 Nvidia vs AMD Nvidia's SIMDs are simpler (one instruction per clock cycle), but they run at double the clock of the rest of the chip. For example, these are the specs of the GeForce GTX 580: CUDA cores: 512; graphics clock: 772 MHz; processor clock: 1544 MHz. AMD Radeon 6970 specs: stream processors: 1536 (384 * 4); clock: 880 MHz. AMD Radeon 6870 specs: stream processors: 1120 (224 * 5); clock: 900 MHz.
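One way to compare these specifications is a rough theoretical peak: number of processors times clock times floating-point operations per clock. The Python sketch below does this under a stated assumption that is not in the slides, namely that each processor completes one multiply-add (counted as 2 operations) per clock; the helper name `peak_gflops` is hypothetical. Peak numbers of this kind say nothing about the utilization actually achieved.

```python
def peak_gflops(processors, clock_mhz, flops_per_clock=2):
    """Theoretical single-precision peak in Gflops.

    flops_per_clock=2 assumes one multiply-add per processor per clock.
    """
    return processors * clock_mhz * flops_per_clock / 1000


print(peak_gflops(512, 1544))  # GeForce GTX 580, using the processor clock
print(peak_gflops(1536, 880))  # Radeon 6970
print(peak_gflops(1120, 900))  # Radeon 6870
```

The AMD parts reach a higher theoretical peak with more, slower processors, while the Nvidia design relies on fewer processors at double the clock.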
More informationExperiences on using GPU accelerators for data analysis in ROOT/RooFit
Experiences on using GPU accelerators for data analysis in ROOT/RooFit Sverre Jarp, Alfio Lazzaro, Julien Leduc, Yngve Sneen Lindal, Andrzej Nowak European Organization for Nuclear Research (CERN), Geneva,
More informationBoundless Security Systems, Inc.
Boundless Security Systems, Inc. sharper images with better access and easier installation Product Overview Product Summary Data Sheet Control Panel client live and recorded viewing, and search software
More informationTowards Large-Scale Molecular Dynamics Simulations on Graphics Processors
Towards Large-Scale Molecular Dynamics Simulations on Graphics Processors Joe Davis, Sandeep Patel, and Michela Taufer University of Delaware Outline Introduction Introduction to GPU programming Why MD
More informationImage Processing and Computer Graphics. Rendering Pipeline. Matthias Teschner. Computer Science Department University of Freiburg
Image Processing and Computer Graphics Rendering Pipeline Matthias Teschner Computer Science Department University of Freiburg Outline introduction rendering pipeline vertex processing primitive processing
More informationOptimizing AAA Games for Mobile Platforms
Optimizing AAA Games for Mobile Platforms Niklas Smedberg Senior Engine Programmer, Epic Games Who Am I A.k.a. Smedis Epic Games, Unreal Engine 15 years in the industry 30 years of programming C64 demo
More informationNVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist
NVIDIA CUDA Software and GPU Parallel Computing Architecture David B. Kirk, Chief Scientist Outline Applications of GPU Computing CUDA Programming Model Overview Programming in CUDA The Basics How to Get
More informationgpus1 Ubuntu 10.04 Available via ssh
gpus1 Ubuntu 10.04 Available via ssh root@gpus1:[~]#lspci -v grep VGA 01:04.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a) 03:00.0 VGA compatible controller: nvidia Corporation
More informationLatency and Bandwidth Impact on GPU-systems
NTNU Norwegian University of Science and Technology Faculty of Information Technology, Mathematics and Electrical Engineering Department of Computer and Information Science TDT4590 Complex Computer Systems,
More informationHow to choose a suitable computer
How to choose a suitable computer This document provides more specific information on how to choose a computer that will be suitable for scanning and post-processing your data with Artec Studio. While
More informationSUBJECT: SOLIDWORKS HARDWARE RECOMMENDATIONS - 2013 UPDATE
SUBJECT: SOLIDWORKS RECOMMENDATIONS - 2013 UPDATE KEYWORDS:, CORE, PROCESSOR, GRAPHICS, DRIVER, RAM, STORAGE SOLIDWORKS RECOMMENDATIONS - 2013 UPDATE Below is a summary of key components of an ideal SolidWorks
More informationGPU File System Encryption Kartik Kulkarni and Eugene Linkov
GPU File System Encryption Kartik Kulkarni and Eugene Linkov 5/10/2012 SUMMARY. We implemented a file system that encrypts and decrypts files. The implementation uses the AES algorithm computed through
More informationStream Processing on GPUs Using Distributed Multimedia Middleware
Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research
More informationHPC with Multicore and GPUs
HPC with Multicore and GPUs Stan Tomov Electrical Engineering and Computer Science Department University of Tennessee, Knoxville CS 594 Lecture Notes March 4, 2015 1/18 Outline! Introduction - Hardware
More informationGeneral Purpose Computation on Graphics Processors (GPGPU) Mike Houston, Stanford University
General Purpose Computation on Graphics Processors (GPGPU) Mike Houston, Stanford University A little about me http://graphics.stanford.edu/~mhouston Education: UC San Diego, Computer Science BS Stanford
More informationBinary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
More informationDynamic Resolution Rendering
Dynamic Resolution Rendering Doug Binks Introduction The resolution selection screen has been one of the defining aspects of PC gaming since the birth of games. In this whitepaper and the accompanying
More information