GPU Architecture. Michael Doggett ATI

Similar documents

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

Radeon HD 2900 and Geometry Generation. Michael Doggett

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

Computer Graphics Hardware An Overview

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Introduction to GPGPU. Tiziano Diamanti

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

GPGPU Computing. Yong Cao

SAPPHIRE TOXIC R9 270X 2GB GDDR5 WITH BOOST

GPU(Graphics Processing Unit) with a Focus on Nvidia GeForce 6 Series. By: Binesh Tuladhar Clay Smith

SAPPHIRE VAPOR-X R9 270X 2GB GDDR5 OC WITH BOOST

NVIDIA GeForce GTX 580 GPU Datasheet

SAPPHIRE HD GB GDDR5 PCIE.

L20: GPU Architecture and Models

Optimizing AAA Games for Mobile Platforms

Awards News. GDDR5 memory provides twice the bandwidth per pin of GDDR3 memory, delivering more speed and higher bandwidth.

Shader Model 3.0. Ashu Rege. NVIDIA Developer Technology Group

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Midgard GPU Architecture. October 2014

AMD GPU Architecture. OpenCL Tutorial, PPAM Dominik Behr September 13th, 2009

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

OpenGL Performance Tuning

Dynamic Resolution Rendering

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Touchstone -A Fresh Approach to Multimedia for the PC

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

Image Processing and Computer Graphics. Rendering Pipeline. Matthias Teschner. Computer Science Department University of Freiburg

How To Understand The Power Of Unity 3D (Pro) And The Power Behind It (Pro/Pro)

GPU Architecture. An OpenCL Programmer s Introduction. Lee Howes November 3, 2010

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

AMD Radeon HD 8000M Series GPU Specifications AMD Radeon HD 8870M Series GPU Feature Summary

Optimizing Unity Games for Mobile Platforms. Angelo Theodorou Software Engineer Unite 2013, 28 th -30 th August

AMD EMBEDDED PCIe ADD-IN BOARD Comparison

QCD as a Video Game?

Low power GPUs a view from the industry. Edvard Sørgård

Msystems Ltd. SAPPHIRE HD GB GDDR5 PCIE. Specification

Introduction to GPU Architecture

Image Synthesis. Transparency. computer graphics & visualization

Configuring Memory on the HP Business Desktop dx5150

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Why You Need the EVGA e-geforce 6800 GS

DATA VISUALIZATION OF THE GRAPHICS PIPELINE: TRACKING STATE WITH THE STATEVIEWER

General Purpose Computation on Graphics Processors (GPGPU) Mike Houston, Stanford University

NVIDIA workstation 3D graphics card upgrade options deliver productivity improvements and superior image quality

A Crash Course on Programmable Graphics Hardware

Introduction to GPU Computing

Introduction to Computer Graphics

3D Computer Games History and Technology

Web Based 3D Visualization for COMSOL Multiphysics

HP Workstations graphics card options

Getting Started with RemoteFX in Windows Embedded Compact 7

A Survey on ARM Cortex A Processors. Wei Wang Tanima Dey

White Paper AMD GRAPHICS CORES NEXT (GCN) ARCHITECTURE

Performance Optimization and Debug Tools for mobile games with PlayCanvas

How To Teach Computer Graphics

2: Introducing image synthesis. Some orientation how did we get here? Graphics system architecture Overview of OpenGL / GLU / GLUT

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

Intel Graphics Media Accelerator 900

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1

The Future Of Animation Is Games

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu

Interactive Level-Set Deformation On the GPU

Next Generation GPU Architecture Code-named Fermi

Data Parallel Computing on Graphics Hardware. Ian Buck Stanford University

IP Video Rendering Basics

Advanced Rendering for Engineering & Styling

Beginning Android 4. Games Development. Mario Zechner. Robert Green

An Introduction to Modern GPU Architecture Ashu Rege Director of Developer Technology

Advanced Graphics and Animations for ios Apps

Writing Applications for the GPU Using the RapidMind Development Platform

GPUs Under the Hood. Prof. Aaron Lanterman School of Electrical and Computer Engineering Georgia Institute of Technology

Catalyst Software Suite Version 9.2 Release Notes

BEAGLEBONE BLACK ARCHITECTURE MADELEINE DAIGNEAU MICHELLE ADVENA

Advanced Visual Effects with Direct3D

Unreal Engine 4: Mobile Graphics on ARM CPU and GPU Architecture

¹ Autodesk Showcase 2016 and Autodesk ReCap 2016 are not supported in 32-Bit.

Boundless Security Systems, Inc.

Maximize Application Performance On the Go and In the Cloud with OpenCL* on Intel Architecture

Choosing a Computer for Running SLX, P3D, and P5

GPGPU accelerated Computational Fluid Dynamics

Workstation Applications for Windows. NVIDIA MAXtreme User s Guide

Optimization for DirectX9 Graphics. Ashu Rege

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

Water Flow in. Alex Vlachos, Valve July 28, 2010

~ Greetings from WSU CAPPLab ~

In the early 1990s, ubiquitous

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

OctaVis: A Simple and Efficient Multi-View Rendering System

QuickSpecs. NVIDIA Quadro K4200 4GB Graphics INTRODUCTION. NVIDIA Quadro K4200 4GB Graphics. Overview

Hardware design for ray tracing

Transcription:

GPU Architecture Michael Doggett ATI

GPU Architecture RADEON X1800/X1900 Microsoft s XBOX360 Xenos GPU GPU research areas

ATI - Driving the Visual Experience Everywhere Products from cell phones to super computers Gaming Console Integrated Embedded Display Gaming Notebook Color Phone Display Digital TV Multimedia Workstation Multi Monitor Display

RADEON X1800/X1900

RADEON X1800 November 05 DirectX 9.0 Shader Model 3.0 Core clock 625MHz Memory clock 750MHz 512MB of graphics memory 321 Million transistors, 288mm 2, 90nm 8 Vertex Shaders 16 Pixel Shaders 32bit IEEE Single Precision float 4 quad based, threaded units 512 threads of 16 pixels allow efficient dynamic branching

X1800 top level architecture

Vertex Shader

Pixel Shader ALU 1 1 Vec3 ADD + Input Modifier 1 Scalar ADD + Input Modifier ALU 2 1 Vec3 ADD/MULL/MADD 1 Scalar ADD/MULL/MADD Branch Execution Unit 1 Flow Control Instruction

Memory Controller

RADEON X1900 XTX 48 shader pipes Core clock 650MHz Memory clock 775MHz 512MB of graphics memory 384 Million transistors, 352mm 2, 90nm

Xenos: XBOX360 GPU

System architecture CPU 2x 10.8 GB/s Southbridge 2x PCIE 500MB/s GPU Northbridge 22.4GB/s 700MHz 128bit GDDR3 UNIFIED MEMORY 32GB/s DAUGHTER DIE

Rendering performance GPU to Daughter Die interface 8 pixels/clk 32BPP color 4 samples Z - Lossless compression 16 pixels/clk Double Z 4 samples Z - Lossless compression GPU 32GB/s DAUGHTER DIE

Rendering performance Alpha and Z logic to EDRAM interface 256GB/s Color and Z - 32 samples 32bit color, 24bit Z, 8bit stencil Double Z - 64 samples 24bit Z, 8bit stencil DAUGHTER DIE 8pix/clk, 4x MSAA, Stencil and Z test, Alpha blending 256GB/s 10MB EDRAM

GPU architecture GPU Primitive Setup Clipper Rasterizer Hierarchical Z/S Index Stream Generator Tessellator Vertex Pipeline Pixel Pipeline Display Pixels Unified Shader Texture/Vertex Fetch Output Buffer Memory Export UNIFIED MEMORY DAUGHTER DIE

Unified Shader A revolutionary step in Graphics Hardware One hardware design that performs both Vertex and Pixel shaders Vertex processing power Vertices Vertices Pixels Vertex Shader Unified Shader Pixels Pixel Shader

Unified Shader GPU based vertex and pixel load balancing Better vertex and pixel resource usage Union of features E.g. Control flow, indexable constant, DX9 Shader Model 3.0+ Vertices Pixels 3 SIMDs Unified Shader 48 ALUs Vec4 Scalar

Memory Export Shader output to a computed address Virtualize shader resources - multipass Shader debug Randomly update data structures from Vertex or Pixel Shader Scatter write GPU Memory Export UNIFIED MEMORY EDRAM

Texture/Vertex Fetch Shader fetch can be either: Texture fetch (16 units) LOD computation Linear, Bi-linear, Tri-linear Filtering Uses cache optimized for 2D, 3D texture data with varying pixel sizes Unified texture cache Vertex fetch (16 units) Uses cache optimized for vertex-style data

GPU research areas

GPU research areas Graphics can always use increased processing power Higher screen resolution More complex shaders Compute in shader and save on memory bandwidth Product differentiator GPU measured on bang for buck Performance/mm 2 Rapid feature set change DX10 coming with Microsoft Windows Vista, end 06 Less optional features

GPU research areas How to use more replicated tiles with only local interconnect? Saves on chip level interconnect Improves floorplanning A RAW-style architecture for graphics Multi-GPU Current solutions split workload between multiple GPUs By frame Single frame tiling Combinations What alternative architectures can single GPUs scale up to?

General Purpose computing on the GPU (GPGPU)

General Purpose computing on the GPU (GPGPU) Which applications run well on the GPU? What modifications to GPU architecture enable more General Purpose computing? Full IEEE754 compliance 64bit floats Scatter, Xenos Memory Export

Rendering + Physics

GPU research areas GPU/CPU convergence Higher Order Surfaces DX10 geometry shader allows access to triangle s neighboring vertices Mesh modification on GPU