3D graphic acceleration history and architecture



Similar documents
CSE 564: Visualization. GPU Programming (First Steps) GPU Generations. Klaus Mueller. Computer Science Department Stony Brook University

Introduction to GPGPU. Tiziano Diamanti

Image Processing and Computer Graphics. Rendering Pipeline. Matthias Teschner. Computer Science Department University of Freiburg

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Shader Model 3.0. Ashu Rege. NVIDIA Developer Technology Group

GPU(Graphics Processing Unit) with a Focus on Nvidia GeForce 6 Series. By: Binesh Tuladhar Clay Smith

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

Radeon HD 2900 and Geometry Generation. Michael Doggett

Computer Graphics Hardware An Overview

GPGPU Computing. Yong Cao

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university

GPUs Under the Hood. Prof. Aaron Lanterman School of Electrical and Computer Engineering Georgia Institute of Technology

GPU Architecture. Michael Doggett ATI

OpenGL Performance Tuning

Computer Graphics. Anders Hast

3D Computer Games History and Technology

OpenGL & Delphi. Max Kleiner. 1/22

Writing Applications for the GPU Using the RapidMind Development Platform

Lecture Notes, CEng 477

Introduction to Computer Graphics

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Sample Exam Questions 2007

2: Introducing image synthesis. Some orientation how did we get here? Graphics system architecture Overview of OpenGL / GLU / GLUT

Performance Optimization and Debug Tools for mobile games with PlayCanvas

L20: GPU Architecture and Models

GPU Shading and Rendering: Introduction & Graphics Hardware

QCD as a Video Game?

Optimization for DirectX9 Graphics. Ashu Rege

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

Developer Tools. Tim Purcell NVIDIA

NVIDIA GeForce GTX 580 GPU Datasheet

Optimizing Unity Games for Mobile Platforms. Angelo Theodorou Software Engineer Unite 2013, 28 th -30 th August

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

Touchstone -A Fresh Approach to Multimedia for the PC

Optimizing AAA Games for Mobile Platforms

Aston University. School of Engineering & Applied Science

A Crash Course on Programmable Graphics Hardware

The Future Of Animation Is Games

AMD GPU Architecture. OpenCL Tutorial, PPAM Dominik Behr September 13th, 2009

Computer Graphics on Mobile Devices VL SS ECTS

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1

The Design of the OpenGL Graphics Interface

Data Visualization Using Hardware Accelerated Spline Interpolation

Real-Time Graphics Architecture

Introduction to Graphics Software Development for OMAP 2/3

DATA VISUALIZATION OF THE GRAPHICS PIPELINE: TRACKING STATE WITH THE STATEVIEWER

How To Understand The Power Of Unity 3D (Pro) And The Power Behind It (Pro/Pro)

NVIDIA workstation 3D graphics card upgrade options deliver productivity improvements and superior image quality

How To Use An Amd Ramfire R7 With A 4Gb Memory Card With A 2Gb Memory Chip With A 3D Graphics Card With An 8Gb Card With 2Gb Graphics Card (With 2D) And A 2D Video Card With

Introduction to Computer Graphics

INTRODUCTION TO RENDERING TECHNIQUES

The OpenGL R Graphics System: A Specification (Version 3.3 (Core Profile) - March 11, 2010)

GPGPU: General-Purpose Computation on GPUs

Computer Graphics. Computer graphics deals with all aspects of creating images with a computer

Advanced Visual Effects with Direct3D

Low power GPUs a view from the industry. Edvard Sørgård

Data Parallel Computing on Graphics Hardware. Ian Buck Stanford University

Comp 410/510. Computer Graphics Spring Introduction to Graphics Systems

GPU Usage. Requirements

GUI GRAPHICS AND USER INTERFACES. Welcome to GUI! Mechanics. Mihail Gaianu 26/02/2014 1

Introduction to GPU Programming Languages

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

Accelerating Intensity Layer Based Pencil Filter Algorithm using CUDA

Computer Applications in Textile Engineering. Computer Applications in Textile Engineering

OctaVis: A Simple and Efficient Multi-View Rendering System

Web Based 3D Visualization for COMSOL Multiphysics

NVFX : A NEW SCENE AND MATERIAL EFFECT FRAMEWORK FOR OPENGL AND DIRECTX. TRISTAN LORACH Senior Devtech Engineer SIGGRAPH 2013

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

Introduction to GPU Architecture

AMD RenderMonkey IDE Version 1.71

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

Interactive Computer Graphics

Volume Rendering on Mobile Devices. Mika Pesonen

How To Teach Computer Graphics

Computer Graphics CS 543 Lecture 12 (Part 1) Curves. Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)

BRINGING UNREAL ENGINE 4 TO OPENGL Nick Penwarden Epic Games Mathias Schott, Evan Hart NVIDIA

SAPPHIRE TOXIC R9 270X 2GB GDDR5 WITH BOOST

1. INTRODUCTION Graphics 2

System requirements for Autodesk Building Design Suite 2017

Graphics and Computing GPUs

Procedural Shaders: A Feature Animation Perspective

Graphics Input Primitives. 5. Input Devices Introduction to OpenGL. String Choice/Selection Valuator

Image Synthesis. Transparency. computer graphics & visualization

An Introduction to Modern GPU Architecture Ashu Rege Director of Developer Technology

Introduction to GPU Computing

Dynamic Resolution Rendering

NVPRO-PIPELINE A RESEARCH RENDERING PIPELINE MARKUS TAVENRATH MATAVENRATH@NVIDIA.COM SENIOR DEVELOPER TECHNOLOGY ENGINEER, NVIDIA

GPU Christmas Tree Rendering. Evan Hart

Workstation Applications for Windows. NVIDIA MAXtreme User s Guide

Transcription:

3D graphic acceleration history and architecture 2003-2016 Josef Pelikán, Jan Horáček CGG MFF UK Praha pepca@cgg.mff.cuni.cz http://cgg.mff.cuni.cz/~pepca/ GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 1 / 43

Advances in computer hardware 3D acceleration common in the consumer sector games, multimedia appearance: presentation quality, ~ photorealistic sophisticated texturing and shading techniques, multipass methods,.. very high performance recent VLSI technology (NVIDIA Pascal.. 16 nm, AMD.. 1024-bit HBM memory,..) extreme memory performance, massive parallelism very fast CPU-GPU buses GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 2 / 43

Advances in GPU software two main APIs for 3D graphics OpenGL (SGI, open standard, Khronos) Direct3D (Microsoft) parameter setup + efficient data transfer sharing of data arrays ( buffers ) programmable rendering pipeline revolution in realtime 3D graphics (~2000) vertex-shader: vertex processing tesselation and geometry shaders: geometry processing on the GPU (new primitives on the fly ) fragment-shader (pixel-shader): pixel appearance GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 3 / 43

Development tools for developers and artists high level shader programming [Cg (NVIDIA)], HLSL (DirectX), GLSL (OpenGL) Cg is almost equal to HLSL effect composition compact definition of the effect (GPU programs, data references, parameters..) in one source file/script DirectX.FX format, NVIDIA CgFX format tools: Effect Browser (Microsoft), FX Composer (N- VIDIA), RenderMonkey (ATI) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 4 / 43

History I: Silicon Graphics first commercial graphical workstations hardware-accelerated graphical subsystem (depth-buffer) SW tools, libraries (Iris GL, Inventor), graphics = SGI selected workstations (http://www.g-lenerz.de/) 1983: Iris 1000 graphical terminal to VAX (Motorola 68k @ 8MHz, Geometry Engine,..) 1984: Iris 2000 (graphical station, Clark GE), Iris 3000 1986: Professional Iris 4D (first MIPS processors) 1988: Power series, GTX, Personal Iris 4D GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 5 / 43

History II: Silicon Graphics, SGI other workstations and graphical systems: 1991: Iris Indigo (most popular SGI workstation) 1992: Iris Crimson 1992: Indigo R4000 (64bit), RealityEngine 1993: Iris Indy (cheap, 3D graphics acceleration option) 1993: Onyx (server with RE2) 1993: Iris Indigo 2 (Extreme graphics) 1996: O2 (cheap workstation), InfiniteReality engine 1997: Octane (2 CPUs) 2000: Octane2 (Vpro graphics) 2002: Fuel, 2003: Ezro, 2008: Virtu (x86, NVIDIA) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 6 / 43

History III: Consumer sector 1 st graphic accelerator for home PC 1996: 3Dfx Voodoo 1 graphic coprocessor ( pass-through ) Glide API 1 st SLI card (Scan Line Interleave) 1998: 3Dfx Voodoo 2 NVIDIA 1997: NVIDIA Riva 128 1998: NVIDIA Riva TNT ( TwiNTexel ) 1 st HW T&L ( transform & lighting ) 1999: NVIDIA GeForce 256 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 7 / 43

History IV: Consumer sector 2000 NVIDIA GeForce2 ATI Radeon 2001: GPU programming DirectX 8.0 (vertex shaders, fragment shaders, 1.0, 1.1) NVIDIA GeForce3, GeForce3 Titanium DirectX 8.1 (PS 1.2, 1.3, 1.4) ATI Radeon 8500 (TruForm) 2002: advanced GPU programming DirectX 9.0 VS, PS 2.0 NVIDIA GeForce4 Titanium ATI Radeon 9000, 9700 [Pro] GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 8 / 43

History V: Consumer sector 2003: affordable DX9 cheap DirectX 9.0 compatible cards (VS, PS 2.0) NVIDIA GeForce FX 5200-5800 ATI Radeon 9800 2004: extended shader programming DirectX 9.0c (VS, PS 3.0), OpenGL 2.0 (at last!) NVIDIA GeForce 6800, 6200, 6600 ATI Radeon X800 2005: HW advances PCI-Express bus twin GPU systems NVIDIA: SLI, ATI: CrossFire NVIDIA GeForce 7800 ATI Radeon X550, X850 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 9 / 43

History VI: Consumer sector 2006 DirectX 10 (Windows Vista).. geometry shaders NVIDIA GeForce 7600, 7900 ATI Radeon X1800, X1900 2007 CUDA (NVIDIA) GPGPU programming in C NVIDIA GeForce 8600, 8800 ATI Radeon R600 (HD 2400, 3850) 2009 OpenGL 3.2, DirectX 11 GPU tesselation NVIDIA Fermi GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 10 / 43

History VII: Consumer sector 2010 OpenGL 4 OpenCL: general computing on GPU, multiplatform 2011 - computing servers using many GPU cards (NVIDIA Tesla architecture) OpenGL 4.6 DirectX 12 (Windows 10, Xbox One) OpenGL ES for mobile platforms (GLES 3.2) other manufacturers Matrox, 3DLabs, S3, PowerVR (Kyro), SiS Intel: very good integrated GPUs GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 11 / 43

3D graphics pipeline I Application Geometry Rasterization Application 3D data representation (virtualization, disk + memory), parametrization, templates,.. object behavior: physical simulation, AI interaction: collisions, deformations,.. GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 12 / 43

3D graphics pipeline II Application Geometry Rasterization Geometry ( HW T&L ) modeling transforms (application support) projection transforms (perspective), clipping tesselation (creating primitives on the fly ) lighting (at least pre-computing of data for lighting) very long pipeline GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 13 / 43

3D graphics pipeline III Application Geometry Rasterization Rasterization (raster rendering) primitives are converted into fragments, attribute interpolation visibility ( depth-buffer ), texture mapping, shading effects, transparency, fog,.. parallelism (independent processing of fragments) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 14 / 43

OpenGL (FFP scheme) OpenGL Fixed Functionality Pipeline: Geom. Per-Vertex Operations Vertices Primitive Assembly Clip Project Viewport Cull Geom. Rasterize Fragment Processing Fragments Per-Fragment Operations App Memory Pixel Unpack Pixel Transfer Pixels Textures Texture Memory Frame Buffer Operations Pixels Pixel Pack Pixel Groups Read Control Frame Buffer GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 15 / 43

OpenGL: geometric primitives types of geometric primitives: point, line, polyline, closed polyline polygon, triangle, triangle strip, triangle fan, quadrangle, quad strip immediate vertex processing mode glvertex, glcolor, glnormal, gltexcoord,... not efficient (many gl*() calls) vertex arrays gldrawarrays, glmultidrawarrays, gldrawelements,... glcolorpointer, glvertexpointer,... or interleaving GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 16 / 43

Geometric primitives I GL_POINTS GL_LINES GL_LINE_STRIP GL_LINE_LOOP V 0 V 1 V 0 V 1 V 0 V 1 V 0 V 1 V 2 V 2 V 2 V 2 V 3 V 3 V 3 V 3 V 4 V 4 V 4 V 4 V 5 V 5 V 5 V 5 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 17 / 43

Geometric primitives II GL_TRIANGLES GL_TRIANGLE_STRIP GL_TRIANGLE_FAN V 0 V 1 V 0 V 1 V 6 V 5 V 4 V 2 V3 V 2 V3 V 0 V3 V 4 V 5 V 2 V 5 V 4 V 1 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 18 / 43

Geometric primitives III GL_QUADS GL_QUAD_STRIP GL_POLYGON V 0 V 3 V 1 V 0 V 6 V 5 V 4 V 1 V 2 V 3 V 2 V 3 V 4 V 7 V 5 V 4 V 0 V 2 V 6 V 6 V 5 V 7 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 19 / 43 V 1

OpenGL macros (Display Lists) DISPLAY_LIST_MODE instead of IMMEDIATE_MODE sequence of OpenGL commands stored in memory glnewlist, glendlist a list can be stored on the server side (GPU) idea: display-list = macro replay glcalllist, glcalllists could be much more efficient (sequence caching/optimization..) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 20 / 43

Geometric data on the server since OpenGL 1.5 VBO buffer, currently: VAO buffer server-side buffers for geometric data buffer management: glcreatebuffers, glbindbuffer data submit: glbufferdata, glbuffersubdata buffer mapping: glmapbuffer, glunmap.. working with client memory or VBO buffer glcolorpointer, glnormalpointer, glvertexpointer,... GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 21 / 43

Vertex Buffer Objects glbindbuffer( GL_ARRAY_BUFFER, 0 ); glvertexpointer( ); glnormalpointer( ); 0 1 [x,y,z,w] [N x,n y,n z ] [s,t] Vertex buffer [0,1,2] [1,2,3] [3,0,12] id=0 [x,y,z,w] [N x,n y,n z ] [s,t] VBO buffer objects Index buffer id=1 GPU memory glbindbuffer( GL_ELEMENT_ARRAY_BUFFER, 1 ); gldrawelements( GL_TRIANGLES, ); GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 22 / 43

Vertex processing (GL < 3.x) matrix transforms (model, view, projection matrices) glmatrixmode glloadidentity, glloadmatrix, glmultmatrix glrotate, glscale, gltranslate,... lighting attributes gllight, gllightmodel, glmaterial GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 23 / 43

Primitive assembly primitive assembly how many vertices the primitive needs assembly and dispatch of the data primitive processing clipping projection into a frustum, division by w projection and clipping into a 2D window ( viewport ) optional back-face culling - single- vs. double-sided triangles GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 24 / 43

Rasterization, fragments rasterization decomposition of a primitive into a set of fragments objects: points, line segments, triangles, bitmaps fragment raster element, potentially contributing to a pixel color size: equal or smaller than a pixel (anti-aliasing) data packet passing through a raster part of a GPU: - input/output: x, y, z (only depth is mutable) - optional texture coordinates t 0 to t n - specular & diffuse color, fog coefficient, user data,... - output: RGB color and opacity α (further frame-buffer op.) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 25 / 43

Fragment data interpolation fragment attributes are interpolated from values in vertices: depth (z or w) texture coordinates colors (specular, diffuse color) user attributes,... fast HW interpolators perspective correct interpolation only [ x, y ] is changing linearly other values need one division per fragment GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 26 / 43

Fragment processing texturing operations very good optimization data fetch from texture memory texel interpolation - mip-mapping, anisotropic filtering,... multi-texturing (many operations for combining colors) special effects (bump-mapping, environment mapping) fog computation depends on z value primary & secondary color combination (diff., spec.) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 27 / 43

Fragment utilization ( per-fragm. op. ) user area clipping (glscissor) transparency rejection test (glalphafunc) stencil test (glstencilop) * write depth test (gldepthfunc) * color blending (transparency) * srgb conversion dithering (shallow frame-buffers) logical operation (gllogicop) * GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 28 / 43

Global frame-buffer operations frame-buffers front, back, left, right,... (double-buffering, stereo) set current rendering buffer (gldrawbuffer) buffer initialization (glclear) glclearcolor, glcleardepth, glclearstencil, glclearaccum graphic server control operations glflush: flush all intermediate buffers glfinish: finish all current-context rendering GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 29 / 43

Raster images in OpenGL Geom. Per-Vertex Operations Vertices Primitive Assembly Clip Project Viewport Cull Geom. Rasterize Fragment Processing Fragments Per-Fragment Operations App Memory Pixel Unpack Pixel Transfer Pixels Textures Texture Memory Frame Buffer Operations Pixels Pixel Pack Pixel Groups Read Control Frame Buffer GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 30 / 43

Raster image transfer application memory frame-buffer by rasterization (convert bitmap to fragments) gldrawpixels, glbitmap application memory texture memory only using unpacking and pixel transfer glteximage, gltexsubimage transfer inside GPU glcopypixels: for frame-buffer[s] glcopyteximage, glcopytexsubimage: target = texture frame-buffer application memory glreadpixels: used to be very slow operation ( AGP) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 31 / 43

Raster conversions and other op. pixel unpacking conversion from app format to OpenGL format ( coherent stream of pixels, group of pixels ) { RGB[α] depth stencil } [][] source format, scanline length (stride), offsets,... setting: glpixelstore pixel packing OpenGL format app format pixel transfer change: scale, intensity, general LuT operation setting: glpixeltransfer, glpixelmap extensions: convolution, other filters, histograms,... GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 32 / 43

OpenGL (Programmable Pipeline) Texture Memory Geom. Vertex Processor Vertices Primitive Assembly Clip Project Viewport Cull Geom. Rasterize Fragment Processor Per-Fragment Operations App Memory Pixel Unpack Pixel Transfer Pixels Textures Fragments Frame Buffer Operations Pixels Pixel Pack Pixel Groups Read Control Frame Buffer GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 33 / 43

Vertex processor replaces the vertex processing unit in FFP vertex coordinate transform normal vector transform and normalization computing/transformation of texturing coordinates lighting vectors setup of material attributes cannot modify number of vertices - partial solution: primitive degeneration type / topology of geometric primitives GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 34 / 43

Fragment processor replaces fragment processing unit in FFP arbitrary arithmetic on fragment attributes texture data fetch and application (color, etc.) fog computation output fragment color synthesis fragment depth can be modified cannot modify number of fragments (except for the discard operation) fragment position within the viewport [x,y] GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 35 / 43

Recent innovations (2009-2010) two more geometry processing steps on a GPU geometry shader (OpenGL 3.2) tesselation shaders (OpenGL 4.0) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 36 / 43

Geometric primitives IV GL_LINES_ADJACENCY GL_LINE_STRIP_ADJACENCY V 0 V 3 V 7 V 6 V 5 V 1 V 2 V 4 V 4 V 0 V 6 V 3 V5 V 1 V 2 V 7 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 37 / 43

Geometric primitives V GL_TRIANGLES_ADJACENCY GL_TRIANGLE_STRIP_ADJACENCY V 1 V 1 V 0 V 0 V 2 V 3 V 2 V 5 V 6 V 3 V 7 V 4 V 9 V 4 V 5 V 8 V 10 V 11 GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 38 / 43

Geometric processors Tesselation shaders new in OpenGL 4.0 HW-supported surface division, subdivision (spline patches..) new shaders: tesselation control and tesselation evaluation the former defines topology, the later actually computes new geometry (coefficients) Geometry shader since OpenGL 3.2 just before primitive assembly & rasterizer capability to create/discard vertices and primitives more general than TS but less efficient (unadvisable for simple [sub]division schemes) GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 39 / 43

GPU programming Vertex shader, Fragment shader, custom machine code executed in a vertex-, fragment-, processor application programmer is able to deploy his own code HW independent * programming languages microcode for GPU is compiled at run-time and can be effectively optimized (various profiles/versions,..) low-level instructions (assembler-like code) or high-level languages Cg, HLSL, GLSL (similar) NVIDIA Microsoft OpenGL GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 40 / 43

OpenGL 3.x removed: FFP and immediate mode (glbegin/glend) new: contextual profiles for future removal of obsolete functionality new: floating-point textures new: HW instancing new: geometry shaders (creating/discarding geometric primitives on the GPU) GLSL 1.3-1.5 native bitwise operations, integer arithmetic,... GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 41 / 43

OpenGL 4.x new: HW tesselation Tesselation Control Shaders, Tesselation Evaluation Shaders GL_PATCHES, GL_TRIANGLES_ADJACENCY primitives new: direct connection to external computing API (OpenCL) w/o CPU intermediation new: binary storage of compiled shaders GLSL 4.0 64-bit floating point computing support shader subroutines separable shader programs GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 42 / 43

Sources Tomas Akenine-Möller, Eric Haines: Real-time rendering, 3 rd edition, A K Peters, 2008, ISBN: 9781568814247 OpenGL Architecture Review Board: OpenGL Programming Guide: The Official Guide to Learning OpenGL, Addison-Wesley, latest edition (8 th edition for the OpenGL 4.1) Christophe Riccino: OpenGL reviews, http://www.g-truc.net/post-opengl-review.html [Randima Fernando, Mark J. Kilgard: The Cg Tutorial, Addison-Wesley, 2003, ISBN: 0321194969] GPU architecture 2016 Josef Pelikán, http://cgg.mff.cuni.cz/~pepca 43 / 43