1.4 Pixel Shaders. Jason L. Mitchell. 3D Application Research Group Lead JasonM@ati.com. 1.4 Pixel Shaders - ATI Technologies 1





Outline
- Pixel Shader Overview
  - 1.1 Shader Review
  - 1.4 Pixel Shaders
    - Unified instruction set
    - Flexible dependent texture read
- Image Processing
- 3D volume visualizations
  - Dynamic transfer functions
- Effects on 3D Surfaces
  - Per-pixel lighting
    - Diffuse
    - Specular
    - Dealing with halfangle denormalization
    - Per-pixel Fresnel term
  - Bumpy environment mapping
    - Bumped cube mapping
    - Projected dudv
  - Per-pixel anisotropic lighting
- ShadeLab Tool
  - Good for playing around with shaders in real time
  - Generates vertex shader code on the fly to feed the pixel shader

The Road to ps.2.0 in DX9
- ps.1.0 through ps.1.3: 4 textures, fixed address ops, 8 color ops
- ps.1.4: 6 textures (12 fetches), unified instruction set, 16 instructions
- ps.2.0: 8 textures (16 fetches), expanded unified instruction set, 64 instructions

What is a Pixel Shader?
- A pixel shader is a set of microcode that you download to the GPU
- These little programs execute on the GPU and operate on pixels and texels, like the legacy multitexture pipeline
- Much more flexible than the legacy multitexture pipeline
- Multitexturing is still available, though not at the same time as pixel shading
  - Set the current pixel shader to 0 to use traditional multitexture

Pixel Shader API
- Assemble and create the shader:
    D3DXAssembleShader( strOpcodes, lstrlen(strOpcodes), 0, NULL, &m_pD3DXBufShader, &pBuffer );
    m_pd3dDevice->CreatePixelShader( (DWORD*)m_pD3DXBufShader->GetBufferPointer(), &m_hPixelShader );
- Set the current pixel shader:
    m_pd3dDevice->SetPixelShader( m_hPixelShader );
- Clean up on the way out:
    m_pd3dDevice->SetPixelShader( 0 );
    m_pd3dDevice->DeletePixelShader( m_hPixelShader );

Pixel Shader Ins and Outs
[Diagram: texture coordinates, diffuse and specular colors, and constants feed the Direct3D pixel shader, which works on a temporary register file (rn); its output passes through fog and alpha blending to the frame buffer.]
- Inputs are texture coordinates, constants, diffuse and specular
- Several read-write temps
- Output color and alpha in r0.rgb and r0.a
- Output depth is in r5.r if you use texdepth (ps.1.4)
- No separate specular add when using a pixel shader
  - You have to code it up yourself in the shader
- Fixed-function fog is still there
- Followed by alpha blending

Constants
- Eight read-only constants (c0..c7)
- Range -1 to +1
  - If you pass in anything outside of this range, it just gets clamped
- A given co-issue (rgb and a) instruction may only reference up to two constants
- Example constant definition syntax:
    def c0, 1.0f, 0.5f, -0.3f, 1.0f

Interpolated Quantities
- Diffuse and specular (v0 and v1)
  - Low precision and unsigned
  - In ps.1.1 through ps.1.3, available only in the color shader
  - In ps.1.4, not available before the phase marker
- Texture coordinates
  - High-precision signed interpolators
  - Can be used as extra colors, signed vectors, matrix rows, etc.

1.1 Model
- Four textures
- Color shader
  - Low range and precision
  - 8 instructions
- Preceded by address shader
  - Fixed set of modes like bumped cubic environment mapping, a pair of dp3s, etc.
  - In fact, you can write them all down

The 1.1 Address Shaders
Each entry is a complete address shader that begins with the ps.1.1 version line; instructions are separated by semicolons.
- One instruction
  - One texture: tex t0
  - Texcoord as color: texcoord t0
  - Mimic a clip plane: texkill t0
- Two instructions
  - Two textures: tex t0; tex t1
  - One texbeml: tex t0; texbeml t1, t0
  - Two texcoords as colors: texcoord t0; texcoord t1
  - Color AR remapping: tex t0; texreg2ar t1, t0
  - Mimic two clip planes: texkill t0; texkill t1
  - Color GB remapping: tex t0; texreg2gb t1, t0
  - One texbem: tex t0; texbem t1, t0
  - Sample and texcoord: tex t0; texcoord t1
- Three instructions
  - 3 textures: tex t0; tex t1; tex t2
  - Mimic 3 clip planes: texkill t0; texkill t1; texkill t2
  - 2 samples and a texcoord: tex t0; tex t1; texcoord t2
  - One texbeml and a texcoord: tex t0; texbeml t1, t0; texcoord t2
  - One texbem and a sample: tex t0; texbem t1, t0; tex t2
  - One texbem and a texcoord: tex t0; texbem t1, t0; texcoord t2
  - One texbeml and a sample: tex t0; texbeml t1, t0; tex t2
  - Sample and 2 texcoords: tex t0; texcoord t1; texcoord t2
  - Three texcoords as colors: texcoord t0; texcoord t1; texcoord t2
  - 3x2 matrix multiply: tex t0; texm3x2pad t1, t0; texm3x2tex t2, t0

The 1.1 Address Shaders: Four-Instruction Shaders
Each entry is a complete address shader that begins with the ps.1.1 version line; instructions are separated by semicolons.
- 4 textures: tex t0; tex t1; tex t2; tex t3
- Mimic 4 clip planes: texkill t0; texkill t1; texkill t2; texkill t3
- 3 samples and one texcoord: tex t0; tex t1; tex t2; texcoord t3
- 2 samples and 2 texcoords: tex t0; tex t1; texcoord t2; texcoord t3
- 1 sample and 3 texcoords: tex t0; texcoord t1; texcoord t2; texcoord t3
- 4 texcoords: texcoord t0; texcoord t1; texcoord t2; texcoord t3
- 1 texbem and 2 samples: tex t0; texbem t1, t0; tex t2; tex t3
- Two texbems: tex t0; texbem t1, t0; tex t2; texbem t3, t2
- 1 texbem, a sample and a texcoord: tex t0; texbem t1, t0; tex t2; texcoord t3
- 1 texbem and 2 texcoords: tex t0; texbem t1, t0; texcoord t2; texcoord t3
- Two texbemls: tex t0; texbeml t1, t0; tex t2; texbeml t3, t2
- One texbeml and two samples: tex t0; texbeml t1, t0; tex t2; tex t3
- 1 texbeml, a sample and a texcoord: tex t0; texbeml t1, t0; tex t2; texcoord t3
- 1 texbeml and 2 texcoords: tex t0; texbeml t1, t0; texcoord t2; texcoord t3
- 3x2 multiply and texcoord: tex t0; texm3x2pad t1, t0; texm3x2tex t2, t0; texcoord t3
- 3x2 multiply and sample: tex t0; texm3x2pad t1, t0; texm3x2tex t2, t0; tex t3
- 3x3 multiply: tex t0; texm3x3pad t1, t0; texm3x3pad t2, t0; texm3x3tex t3, t0
- 3x3 matrix multiply and reflect constant vector: tex t0; texm3x3pad t1, t0; texm3x3pad t2, t0; texm3x3spec t3, t0
- 3x3 matrix multiply and reflect interpolated vector: tex t0; texm3x3pad t1, t0; texm3x3pad t2, t0; texm3x3vspec t3, t0

1.2 Shaders
- Still CISC, like 1.1
- Same instruction count
- 4 new tex instructions: texreg2rgb, texdp3tex, texdp3, texm3x3
- 1 new argument modifier: .b (replicate blue)
  - Valid only in an alpha op
- 2 new arithmetic instructions
  - cmp: conditionally chooses between s1 and s2 based on s0 compared with zero
  - dp4: 4-element dot product

1.3 Shaders
- One additional tex instruction: texm3x2depth t2, t0
  - Performs two dot products to obtain z and w
  - The depth for the current pixel is set to z/w
  - If w == 0, the result is 1.0
  - If z > w, the result is clamped to 1.0
- Example:
    tex t0
    texm3x2pad t1, t0
    texm3x2depth t2, t0
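The depth rule on this slide can be checked on the CPU. The Python sketch below is an illustration only, not the hardware's implementation: the function name and the `row_z`, `row_w`, and `sampled` parameters are hypothetical stand-ins for the two interpolated texture-coordinate rows and the value fetched by `tex t0`, and it applies the slide's two special cases literally.

```python
def dot3(a, b):
    """3-component dot product, as each texm3x2* row computes."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def texm3x2depth(row_z, row_w, sampled):
    """Return the replacement depth following the rules on the slide.

    row_z, row_w: hypothetical names for the two interpolated rows.
    sampled: the vector fetched by the preceding `tex t0`.
    """
    z = dot3(row_z, sampled)
    w = dot3(row_w, sampled)
    if w == 0 or z > w:
        return 1.0  # slide: w == 0 gives 1.0; z > w is clamped to 1.0
    return z / w
```

For example, rows `(1, 0, 0)` and `(0, 1, 0)` with a sampled vector `(0.25, 0.5, 0)` give z = 0.25 and w = 0.5, so the pixel's depth becomes 0.5.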

1.4 Model
- Flexible, unified instruction set
  - Think up your own math and just do it rather than try to wedge your ideas into a fixed set of modes
- Flexible dependent texture fetching
- More textures
- More instructions
- High precision
- Range of at least -8 to +8
- Well along the road to DX9
[Diagram: version timeline 1.0, 1.1, 1.2, 1.3, 1.4, 2.0]

1.4 Pixel Shader Structure
[Diagram: a texture register file (t0 to t5) feeds two phases separated by the phase marker.]
- Phase 1: optional sampling, then address shader (up to 8 instructions)
    texld t4, t5
    dp3 t0.r, t0, t4
    dp3 t0.g, t1, t4
    dp3 t0.b, t2, t4
    dp3_x2 t2.rgb, t0, t3
    mul t2.rgb, t0, t2
    dp3 t1.rgb, t0, t0
    mad t1.rgb, -t3, t1, t2
    phase
- Phase 2: optional sampling (can be dependent reads), then color shader (up to 8 instructions)
    texld t0, t0
    texld t1, t1
    texld t2, t5
    mul t0, t0, t2
    mad t0, t0, t2.a, t1

1.4 Texture Instructions
Mostly just data routing, not ALU operations per se.
- texld
  - Samples data from a map into a register
- texcrd
  - Moves high-precision signed data into a temp register (rn)
  - Higher precision than v0 or v1
- texkill
  - Kills pixels based on sign of register components
  - Fallback for parts that don't have clip planes
- texdepth
  - Substitute value for this pixel's z!

texld vs tex
- Both cause a register to be filled with sampled data from a map
- tex
  - Unary op
- texld
  - Binary op
  - Context is associated with destination register
    - That means texture handle, filtering modes, etc.
  - Explicitly specifies dependent reads at the top of phase 2

texcrd vs texcoord
- texcoord clamps input to 0..1 range
  - Basically behaves like a color
  - You have to scale and bias into 0..1 in your vertex shader
  - Very annoying if you're also using the texm3x2 instructions, as you have already found
- texcrd does not clamp to 0..1
  - Takes same input range as texm3x2-type instructions
  - Retains the pixel pipeline's native precision, which is higher than colors

texkill
- Another way to kill pixels
- If you're just doing a clip plane, use a clip plane
- As a fallback, use texkill for chips that don't support user clip planes
- Pixels are killed based on the sign of the components of registers

texdepth
- Substitute a register value for z
- Image-based rendering
- Depth sprites

1.4 Pixel Shader ALU Instructions
    add d, s0, s1        // sum
    sub d, s0, s1        // difference
    mul d, s0, s1        // modulate
    mad d, s0, s1, s2    // s0 * s1 + s2
    lrp d, s0, s1, s2    // s2 + s0*(s1-s2)
    mov d, s0            // d = s0
    cnd d, s0, s1, s2    // d = (s0 > 0.5) ? s1 : s2
    cmp d, s0, s1, s2    // d = (s0 >= 0) ? s1 : s2
    dp3 d, s0, s1        // s0·s1 replicated to d.rgba
    dp4 d, s0, s1        // s0·s1 replicated to d.rgba
    bem d, s0, s1, s2    // Macro similar to texbem

Argument Modifiers
- Negate: -rn
- Invert: 1-rn
  - Unsigned value in source is required
- Bias (_bias)
  - Shifts value down by 1/2
- Scale by 2 (_x2)
  - Scales argument by 2
- Signed scaling (_bx2)
  - _bias followed by _x2
  - Shifts value down and scales data by 2, like the implicit behavior of D3DTOP_DOTPRODUCT3 in SetTSS()
- Channel replication: rn.r, rn.g, rn.b or rn.a
  - Useful for extracting scalars out of registers
  - Not just in alpha instructions like the .b in ps.1.2

Instruction Modifiers
- _x2: multiply result by 2
- _x4: multiply result by 4
- _x8: multiply result by 8
- _d2: divide result by 2
- _d4: divide result by 4
- _d8: divide result by 8
- _sat: saturate result to 0..1
- _sat may be used alone or combined with one of the other modifiers, e.g. mad_d8_sat
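The argument and instruction modifiers compose in a fixed order: argument modifiers act on the sources, then the instruction's scale is applied to the result, then saturation. The Python sketch below models that order for scalar values. It is a CPU illustration only; the function names and the `SCALES` table are hypothetical, and the hardware's limited range and precision are not modeled.

```python
def bias(x):
    """_bias argument modifier: shift the value down by 1/2."""
    return x - 0.5


def bx2(x):
    """_bx2 argument modifier: _bias followed by _x2 (maps 0..1 to -1..1)."""
    return (x - 0.5) * 2.0


def invert(x):
    """1-rn argument modifier; expects an unsigned source."""
    return 1.0 - x


def saturate(x):
    """_sat instruction modifier: clamp the result to 0..1."""
    return max(0.0, min(1.0, x))


# Result scale factors for the instruction modifiers listed on the slide.
SCALES = {"": 1.0, "x2": 2.0, "x4": 4.0, "x8": 8.0,
          "d2": 0.5, "d4": 0.25, "d8": 0.125}


def mad(s0, s1, s2, scale="", sat=False):
    """Model of mad with an optional scale modifier and optional _sat."""
    result = (s0 * s1 + s2) * SCALES[scale]
    return saturate(result) if sat else result
```

With these definitions, `mad(2.0, 4.0, 8.0, "d8", True)` plays the role of `mad_d8_sat`: the raw result 16 is divided by 8 and then saturated to 1.0.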

Write Masks
- Any channels of the destination register may be masked during the write of the result
- Useful for computing different components of a texture coordinate for a dependent read
- Example:
    dp3 r0.r, t0, t4
    mov r0.g, t0.a
- We'll show more examples of this

Range and Precision
- ps.1.4 range is at least -8 to +8
  - Determine with MaxPixelShaderValue
- Pay attention to precision when doing operations that may cause errors to build up
- Conversely, use your range when you need it rather than scale down and lose precision
  - Filter kernel intermediate results are one case
- Your texture coordinate interpolators are your high-precision data sources. Use them.
- Sampling an 8-bit-per-channel texture (to normalize a vector, for example) gives you back a low-precision result

cnd and cmp
- cnd d, s0, s1, s2
  - d = (s0 > 0.5) ? s1 : s2
  - Conditionally chooses between s1 and s2
  - In DirectX 8.1, cnd can now perform a component-wise comparison of s0 to 0.5 in order to select the corresponding component of s1 or s2
  - For s0 = [0.4, 0.5, 0.51, -5.6], d = [s2.r, s2.g, s1.b, s2.a]
- cmp d, s0, s1, s2
  - d = (s0 >= 0) ? s1 : s2
  - Conditionally chooses between s1 and s2
  - For s0 = [-0.4, 0.0, 5.0, -6.3], d = [s2.r, s1.g, s1.b, s2.a]
  - New instruction in DirectX 8.1
  - Useful for absolute value: cmp d, s0, s0, -s0
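The two selection rules are easy to mix up, so a small CPU model helps. This Python sketch mirrors the component-wise behavior described on the slide; the function names are chosen to match the instructions, and the list-based register representation is an assumption made for illustration.

```python
def cnd(s0, s1, s2):
    """Per component: pick s1 where s0 > 0.5, otherwise s2."""
    return [b if a > 0.5 else c for a, b, c in zip(s0, s1, s2)]


def cmp(s0, s1, s2):
    """Per component: pick s1 where s0 >= 0, otherwise s2."""
    return [b if a >= 0.0 else c for a, b, c in zip(s0, s1, s2)]


def absolute(s0):
    """The slide's idiom `cmp d, s0, s0, -s0`."""
    return cmp(s0, s0, [-a for a in s0])
```

Feeding the slide's example vectors through these functions reproduces the listed selections, and `absolute([-0.4, 0.0, 5.0, -6.3])` yields the component-wise magnitudes.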

Examples: Image Filters
- Use on 2D images in general
- Use as post-processing pass over 3D scenes rendered into textures
  - Luminance filter for black-and-white effect
  - Edge filters for non-photorealistic rendering
  - Glare filters for soft look (see Fiat Lux by Debevec)
  - Opportunity for you to customize your look
- Rendering to textures is fundamental. You need to get over your reluctance to render into textures.
- Becomes especially interesting when we get to high dynamic range

Luminance Filter
- Different RGB recipes give different looks
  - Black-and-white TV
  - Black-and-white film
  - Sepia
  - Run through arbitrary transfer function using a dependent read for heat signature
- A common recipe is Lum = .3r + .59g + .11b
    ps.1.4
    def c0, 0.30f, 0.59f, 0.11f, 1.0f
    texld r0, t0
    dp3 r0, r0, c0
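As a CPU reference for the shader above, the following Python sketch applies the same weighted sum to every pixel. It is only an illustration: the image representation (rows of `(r, g, b)` tuples in 0..1) and the function names are assumptions, and the result is replicated to all three channels the way `dp3` replicates its result.

```python
LUMA_WEIGHTS = (0.30, 0.59, 0.11)  # same values as c0 in the shader


def luminance(rgb):
    """Weighted sum of r, g, b, equivalent to dp3 with c0."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, rgb))


def luminance_filter(image):
    """Return a gray image; each pixel holds its luminance in r, g and b."""
    return [[(luminance(p),) * 3 for p in row] for row in image]
```

Swapping `LUMA_WEIGHTS` for a different recipe changes the look, which is the point of the first bullet on the slide.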

Luminance Filter
[Figure: original image and luminance image]

Multitap Filters
- Effectively code filter kernels right into the pixel shader
- Pre-offset taps with texture coordinates
- For traditional image processing, offsets are a function of image/texture dimensions and point sampling is used
- Or compose complex filter kernels from multiple bilinear kernels

Edge Detection Filter
- Roberts Cross gradient filters: [1 0; 0 -1] and [0 -1; 1 0]
- Tap layout: t0 at the center, t1 down and right, t2 down and left
    ps.1.4
    texld r0, t0          // Center Tap
    texld r1, t1          // Down & Right
    texld r2, t2          // Down & Left
    add r1, r0, -r1
    add r2, r0, -r2
    cmp r1, r1, r1, -r1
    cmp r2, r2, r2, -r2
    add_x8 r0, r1, r2
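The shader's arithmetic can be replayed on the CPU to see what it produces. This Python sketch follows the same steps (two differences against the center tap, two absolute values, a sum scaled by 8) on a grayscale image stored as a list of rows of floats. The function names, the border clamping, and the final clamp to 0..1 (standing in for the frame buffer write) are assumptions made for illustration.

```python
def clamp_index(i, n):
    """Clamp a tap coordinate to the image, like clamp addressing."""
    return max(0, min(n - 1, i))


def gradient_magnitude(image, gain=8.0):
    """8 x (|center - down_right| + |center - down_left|), clamped to 0..1."""
    height, width = len(image), len(image[0])

    def tap(x, y):
        return image[clamp_index(y, height)][clamp_index(x, width)]

    out = []
    for y in range(height):
        row = []
        for x in range(width):
            center = tap(x, y)
            d1 = abs(center - tap(x + 1, y + 1))  # add + cmp on r1
            d2 = abs(center - tap(x - 1, y + 1))  # add + cmp on r2
            row.append(min(1.0, gain * (d1 + d2)))  # add_x8
        out.append(row)
    return out
```

A flat image produces zeros everywhere, while pixels next to a vertical step edge saturate to 1.0 because of the x8 gain.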

Gradient Filter
[Figure: original image and 8 x gradient magnitude]

Gradient from Color
    ps.1.4
    def c0, 0.30f, 0.59f, 0.11f, 1.0f
    texld r0, t0          // Center Tap
    texld r1, t1          // Down & Right
    texld r2, t2          // Down & Left
    dp3 r3, r0, c0        // Compute Luminances
    dp3 r1, r1, c0
    dp3 r2, r2, c0
    add r1, r3, -r1       // Compute Differences
    add r2, r3, -r2
    cmp r1, r1, r1, -r1   // Absolute Values
    cmp r2, r2, r2, -r2
    add_x8 r1, r1, r2
    phase
    mul r0.rgb, r0, 1-r1  // Composite with original rgb
    +mov r0.a, c0.a

Gradient from Color
[Figure: original image and original x inverted gradient magnitude]

Five Tap Blur Filter
- Tap layout: t0 at the center, t3 up and left, t4 up and right, t2 down and left, t1 down and right
    ps.1.4
    def c0, 0.2f, 0.2f, 0.2f, 1.0f
    texld r0, t0          // Center Tap
    texld r1, t1          // Down & Right
    texld r2, t2          // Down & Left
    texld r3, t3          // Up & Left
    texld r4, t4          // Up & Right
    add r0, r0, r1
    add r2, r2, r3
    add r0, r0, r2
    add r0, r0, r4
    mul r0, r0, c0

Five Tap Blur Filter
[Figure: original image and blurred image]

Ping-Pong Blur Filter
[Diagram: the original scene is rendered to the Ping texture, blurred into the Pong texture, and blurred back and forth between the two to produce the blurred image.]
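A rough CPU picture of the ping-pong scheme, under stated assumptions: the Python sketch below uses a five-tap average (center plus four diagonal neighbors, each weighted 0.2) as the blur pass and alternates between two buffers for a chosen number of passes. The function names, the grayscale list-of-rows image format, and the border clamping are hypothetical choices for illustration; on the GPU the two buffers would be render-target textures.

```python
TAP_OFFSETS = [(0, 0), (1, 1), (-1, 1), (-1, -1), (1, -1)]


def five_tap_blur(image):
    """One blur pass: average of the center and four diagonal taps."""
    height, width = len(image), len(image[0])

    def tap(x, y):
        y = max(0, min(height - 1, y))
        x = max(0, min(width - 1, x))
        return image[y][x]

    return [[0.2 * sum(tap(x + dx, y + dy) for dx, dy in TAP_OFFSETS)
             for x in range(width)]
            for y in range(height)]


def ping_pong_blur(scene, passes):
    """Blur repeatedly, reading from one buffer and writing the other."""
    ping = scene
    for _ in range(passes):
        pong = five_tap_blur(ping)  # read ping, write pong
        ping = pong                 # swap roles for the next pass
    return ping
```

Each extra pass widens the effective kernel, so a small per-pass filter can build up a large blur at the cost of more render-to-texture passes.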

Glare Filter
- Composite thresholded blur image back on to original scene
[Figure: image from Rendering with Natural Light by Debevec]

Glare Filter
- Composite thresholded image back on to original scene
[Figure: image from Fiat Lux by Debevec]

Sepia Transfer Function
    ps.1.4
    def c0, 0.30f, 0.59f, 0.11f, 1.0f
    texld r0, t0
    dp3 r0, r0, c0        // Convert to Luminance
    phase
    texld r5, r0          // Dependent read
    mov r0, r5
[Figure: 1D luminance-to-sepia map]
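The dependent read in the second phase amounts to using the computed luminance as a coordinate into a 1D map. The Python sketch below shows that idea on the CPU with point sampling. Everything named here is an assumption for illustration: the function names, the tuple-based image format, and any map contents a caller supplies (a real sepia map would be an authored texture).

```python
def luminance(rgb):
    """Same weighted sum the shader computes with dp3 and c0."""
    return 0.30 * rgb[0] + 0.59 * rgb[1] + 0.11 * rgb[2]


def sample_1d(transfer_map, u):
    """Point-sample a 1D map (a list of rgb tuples) at u in 0..1."""
    u = max(0.0, min(1.0, u))
    index = min(len(transfer_map) - 1, int(u * len(transfer_map)))
    return transfer_map[index]


def apply_transfer_function(image, transfer_map):
    """Luminance first, then a dependent lookup into the 1D map."""
    return [[sample_1d(transfer_map, luminance(p)) for p in row]
            for row in image]
```

Because only the 1D map encodes the look, replacing it (for instance with a heat-signature ramp) changes the output without touching the lookup code.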

Sepia Transfer Function
[Figure: original image and sepia tone image]

Heat Signature
[Figure: 1D heat signature map]

Heat Transfer Function
[Figure: heat input image, transfer function, and heat signature image]

Volume Visualization
- The visualization community starts with data that is inherently volumetric and often scalar, as it is acquired from some 3D medical imaging modality
- As such, no polygonal representation exists and there is a need to reconstruct projections of the data through direct volume rendering
- Last year, we demoed this on RADEON using volume textures on DirectX 8.0
- One major area of activity in the visualization community is coming up with methods for using transfer functions, ideally dynamic ones, that map the (often scalar) data to some curve through color space
- The 1D sepia and heat signature maps just shown are examples of transfer functions

Volume Visualization
- On consumer cards like RADEON, volume rendering is done by compositing a series of shells in camera space which intersect the volume to be visualized
- A texture matrix is used to texture map the shells as they slice through a volume texture
[Figure: view frustum and VolViz shells]

Dynamic Transfer Functions
- With 1.4 pixel shaders, it is very natural to sample data from a 3D texture map and apply a transfer function via a dependent read
- Transfer functions are usually 1D and are very cheap to update interactively

Dynamic Transfer Functions
[Figure: scalar data and the same data with the transfer function applied]

More Examples: Lighting
- Four diffuse per-pixel lights in one pass

Per-pixel N·L and Attenuation

Per-pixel N·L and Attenuation
    texld r1, t0                    // Normal
    texld r2, t1                    // Cubic Normalized Tangent Space Light Direction
    texcrd r3.rgb, t2               // World Space Light Direction. Unit length is the light's range.
    dp3_sat r1.rgb, r1_bx2, r2_bx2  // N·L
    dp3 r3.rgb, r3, r3              // (World Space Light Distance)^2
    phase
    texld r0, t0                    // Base
    texld r3, r3                    // Light Falloff Function (dependent read)
    mul_x2 r4.rgb, r1, r3           // falloff * (N·L)
    add r4.rgb, r4, c7              // += ambient
    mul r0.rgb, r0, r4              // base * (ambient + (falloff * N·L))
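To make the data flow of this shader concrete, here is a CPU sketch in Python that follows the same sequence: expand the two sampled vectors with `_bx2`, take a saturated dot product, square the light distance, look up a falloff value, then combine with ambient and the base color. The `falloff` callable stands in for the dependent texture read and is purely hypothetical, as are the function names; the example falloff in the usage note stores half-range values so that the shader's `mul_x2` restores full intensity, which is an assumption about how such a map might be authored.

```python
def dot3(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def bx2(v):
    """_bx2 on a vector: map each 0..1 component to -1..1."""
    return [(c - 0.5) * 2.0 for c in v]


def saturate(x):
    return max(0.0, min(1.0, x))


def shade(base, normal_texel, light_texel, light_vec, falloff, ambient):
    """CPU model of the per-pixel N.L with attenuation shader.

    falloff: hypothetical callable taking squared distance; it replaces
    the dependent read of the falloff map.
    """
    n_dot_l = saturate(dot3(bx2(normal_texel), bx2(light_texel)))  # dp3_sat
    dist_sq = dot3(light_vec, light_vec)                           # dp3
    atten = falloff(dist_sq)                                       # texld r3, r3
    lighting = [2.0 * n_dot_l * atten + a for a in ambient]        # mul_x2, add
    return [saturate(b * l) for b, l in zip(base, lighting)]       # mul
```

A usage example with a made-up falloff: `shade((0.5, 0.25, 1.0), (0.5, 0.5, 1.0), (0.5, 0.5, 1.0), (0.0, 0.0, 0.0), lambda d2: 0.5 * max(0.0, 1.0 - d2), (0.0, 0.0, 0.0))` returns the base color unchanged, since the light sits at zero distance and faces the normal directly.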

Variable Specular Power
[Figure: constant specular power compared with variable specular power]

Variable Specular Power
- Per-pixel (N·H)^k with per-pixel variation of k
- Base map with albedo in RGB and gloss in alpha
- Normal map with xyz in RGB and k in alpha
- N·H × k map
- Should also be able to apply a scale and a bias to the map and in the pixel shader to make better use of the resolution
[Figure: N·H × k map, with N·H from 0.0 to 1.0 on one axis and k from 10.0 to 120.0 on the other]

Maps for per-pixel variation of k shader
[Figure: base map with albedo in RGB and gloss in alpha; normal map with normals in RGB and k in alpha; N·H × k map running from k = 10 to k = 120]

Variable Specular Power
    ps.1.4
    texld r1, t0                    // Normal
    texld r2, t1                    // Normalized Tangent Space L vector
    texcrd r3.rgb, t2               // Tangent Space Halfangle vector
    dp3_sat r5.xyz, r1_bx2, r2_bx2  // N·L
    dp3_sat r2.xyz, r1_bx2, r3      // N·H
    mov r2.y, r1.a                  // K = Specular Exponent
    phase
    texld r0, t0                    // Base
    texld r3, r2                    // Specular N·H × K map (dependent read)
    add r4.rgb, r5, c7              // += ambient
    mul r0.rgb, r0, r4              // base * (ambient + N·L)
    +mul_x2 r0.a, r0.a, r3.a        // Gloss map * specular
    add r0.rgb, r0, r0.a            // (base * (ambient + N·L)) + (Gloss * Highlight)
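The N·H × k map sampled by the dependent read can be generated offline. The Python sketch below builds such a table under stated assumptions: columns span N·H from 0 to 1, rows span k linearly from 10 to 120 (the range shown on the earlier slide), and lookups use point sampling. The function names, table layout, and resolution parameters are hypothetical.

```python
def build_specular_map(width, height, k_min=10.0, k_max=120.0):
    """Table of (N.H)^k; column picks N.H, row picks the exponent k.

    width and height must each be at least 2.
    """
    table = []
    for row in range(height):
        k = k_min + (row / (height - 1)) * (k_max - k_min)
        table.append([(col / (width - 1)) ** k for col in range(width)])
    return table


def lookup(table, n_dot_h, k_coord):
    """Point-sample the table with both coordinates in 0..1."""
    height, width = len(table), len(table[0])
    col = min(width - 1, int(max(0.0, min(1.0, n_dot_h)) * width))
    row = min(height - 1, int(max(0.0, min(1.0, k_coord)) * height))
    return table[row][col]
```

Higher rows correspond to larger exponents and therefore tighter highlights, which is what lets the alpha channel of the normal map vary shininess per pixel.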

Anisotropic lighting
- We know how to light lines and anisotropic materials by doing two dot products and using the results to look up the non-linear parts in a 2D texture/function (Banks, Zöckler, Heidrich)
- This was done per-vertex using the texture matrix
- With per-pixel dot products and dependent texture reads, we can now do this math per-pixel and specify the direction of anisotropy in a map
[Figures: Zöckler et al. and Heidrich et al.]

Per-pixel anisotropic lighting
- This technique involves computing the following for diffuse and specular illumination:
  - Diffuse: $\sqrt{1 - (L \cdot T)^2}$
  - Specular: $\sqrt{1 - (L \cdot T)^2}\,\sqrt{1 - (V \cdot T)^2} - (L \cdot T)(V \cdot T)$
- The two dot products, L·T and V·T, can be computed per-pixel with the texm3x2* instructions or just two dp3s in 1.4
- Use this 2D tex coord to index into a special map to evaluate the above functions
- At GDC 2001, we showed this limited to per-pixel tangents in the plane of the polygon
- Here, we orthogonalize the tangents with respect to the per-pixel normal inside the pixel shader
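The "special map" is just these two functions tabulated over L·T and V·T. The Python sketch below evaluates the diffuse and specular terms and bakes them into a square table. It is an illustration under assumptions: the function names and the table layout (row selects V·T, column selects L·T) are hypothetical, table coordinates in 0..1 are mapped back to dot products in -1..1, and the specular term is clamped to 0..1 for storage.

```python
import math


def aniso_terms(l_dot_t, v_dot_t):
    """Return (diffuse, specular) for the two tangent dot products."""
    sin_l = math.sqrt(max(0.0, 1.0 - l_dot_t ** 2))
    sin_v = math.sqrt(max(0.0, 1.0 - v_dot_t ** 2))
    return sin_l, sin_l * sin_v - l_dot_t * v_dot_t


def build_aniso_map(size):
    """size x size table of (diffuse, clamped specular); size >= 2."""
    table = []
    for row in range(size):
        v_dot_t = 2.0 * row / (size - 1) - 1.0
        cells = []
        for col in range(size):
            l_dot_t = 2.0 * col / (size - 1) - 1.0
            diffuse, specular = aniso_terms(l_dot_t, v_dot_t)
            cells.append((diffuse, max(0.0, min(1.0, specular))))
        table.append(cells)
    return table
```

When both vectors are perpendicular to the tangent (both dot products zero), diffuse and specular are at their maximum of 1, which lands at the center of the table.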

Per-pixel anisotropic lighting
- Use traditional normal map, whose normals are in tangent space
- Use tangent map
  - Or use an interpolated tangent and orthogonalize it per-pixel
- Interpolate V and L in tangent space and compute coordinates into function lookup table per pixel

Per-pixel anisotropic lighting
[Figure: the 2D lookup map, with diffuse in RGB and specular in alpha; both panels have axes L·T and V·T, each running from -1.0 to +1.0]

Anisotropic Lighting Example: Brushed Metal

Bumped Anisotropic Lighting
    ps.1.4
    def c0, 0.5f, 0.5f, 0.0f, 1.0f
    texld r0, t0                     // Contains direction of anisotropy in tangent space
    texcrd r2.rgb, t1                // light vector
    texcrd r3.rgb, t2                // view vector
    texld r4, t0                     // normal map
    // Perturb anisotropy lighting direction by normal
    dp3 r1.xyz, r0_bx2, r4_bx2       // Aniso.Normal
    mad r0.xyz, -r4_bx2, r1, r0_bx2  // Aniso - N(Aniso.Normal)
    // Calculate A.View and A.Light for looking up into function map
    dp3 r5.x, r2, r0                 // Perform first row of matrix multiply
    dp3 r5.yz, r3, r0                // Perform second row of matrix multiply to get a
                                     // vector with which to sample texture 2, which is
                                     // a look-up table for aniso lighting
    mad r5.rg, r5, c0, c0            // Scale and bias for lookup
    // Diffuse Light Term
    dp3_sat r4.rgb, r4_bx2, r2       // N.L
    phase
    texld r2, r5                     // Anisotropic lighting function lookup
    texld r3, t0                     // gloss map
    mul r4.rgb, r3, r4.b             // basemap * N.L
    mad r0.rgb, r3, r2.a, r4         // += glossmap * specular
    mad r0.rgb, r3, c7, r0           // += ambient * basemap

Anisotropic Lighting Example: Human Hair
- Highlights computed in pixel shader
- Direction of anisotropy map is used to light the hair

Bumpy Environment Mapping
- Several flavors of this
  - DX6-style EMBM
    - Must work with projective texturing to be useful
  - Could do DX6-style but with interpolated 2x2 matrix
  - But the really cool one is per-pixel, doing a 3x3 multiply to transform the fetched normal into cube map space
- All still useful and valid in different circumstances
- Can now do superposition of the perturbation maps for constructive / destructive interference of waveforms
- Really, the distinctions become irrelevant, as this all just degenerates into dependent texture reads and the app makes the tradeoffs between what it determines is correct for a given effect

Traditional EMBM
- The 2D case is still valuable and not going away
- The fact that the 2x2 matrix is no longer required to be state unlocks this even further
- Works great with dynamic projective reflection maps for floors, walls, lakes, etc.
- Good for refraction (heat waves, water effects, etc.)

Bumped Cubic Environment Mapping
- Interpolate a 3x3 matrix which represents a transformation from tangent space to cube map space
- Sample normal and transform it by 3x3 matrix
- Sample diffuse map with transformed normal
- Reflect the eye vector through the normal and sample a specular and/or env map
- Do both
- Blend with a per-pixel Fresnel term!

Bumpy Environment Mapping

Bumpy Environment Mapping
texld r0, t0                  ; Look up normal map
texld r1, t4                  ; Eye vector through normalizer cube map
texcrd r4.rgb, t1             ; 1st row of environment matrix
texcrd r2.rgb, t2             ; 2nd row of environment matrix
texcrd r3.rgb, t3             ; 3rd row of environment matrix
texcrd r5.rgb, t5             ; World space L (Unit length is light's range)
dp3 r4.r, r4, r0_bx2          ; 1st row of matrix multiply
dp3 r4.g, r2, r0_bx2          ; 2nd row of matrix multiply
dp3 r4.b, r3, r0_bx2          ; 3rd row of matrix multiply
dp3_x2 r3.rgb, r4, r1_bx2     ; 2(N.Eye)
mul r3.rgb, r4, r3            ; 2N(N.Eye)
dp3 r2.rgb, r4, r4            ; N.N
mad r2.rgb, -r1_bx2, r2, r3   ; 2N(N.Eye) - Eye(N.N)
phase
; Dependent Reads
texld r2, r2                  ; Sample cubic reflection map
texld r3, t0                  ; Sample base map
texld r4, r4                  ; Sample cubic diffuse map
texld r5, t0                  ; Sample gloss map
mul r1.rgb, r5, r2            ; Specular = Gloss * Reflection
mad r0.rgb, r3, r4_x2, r1     ; Base * Diffuse + Specular
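The reflection in this shader is computed as 2N(N.Eye) - Eye(N.N) rather than the usual 2N(N.Eye) - Eye. The extra N.N factor means the transformed normal does not need to be normalized: for N = s * n (n unit length), the result is s^2 times the ordinary reflection vector, and a cube map lookup depends only on direction. A small Python check of that identity follows; the helper names and test vectors are assumptions for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)


def reflect_unnormalized(n, e):
    """2N(N.E) - E(N.N): parallel to the reflection of e about n for nonzero n."""
    n_dot_e = dot(n, e)
    n_dot_n = dot(n, n)
    return tuple(2.0 * nc * n_dot_e - ec * n_dot_n for nc, ec in zip(n, e))


def reflect_unit(n, e):
    """Textbook reflection 2n(n.e) - e using a normalized copy of n."""
    n = normalize(n)
    k = 2.0 * dot(n, e)
    return tuple(k * nc - ec for nc, ec in zip(n, e))


# Hypothetical inputs: a normal of length 2 and a unit eye vector.
n = (0.0, 0.0, 2.0)
eye = (0.6, 0.0, 0.8)
r_fast = reflect_unnormalized(n, eye)
r_ref = reflect_unit(n, eye)
```

Normalizing `r_fast` gives the same direction as `r_ref`, so both would select the same cube map texel.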

Per-Pixel Fresnel
[Figure panels: Per-Pixel Diffuse, Per-Pixel Bumped Environment map, Per-Pixel Fresnel, Result, joined by + and = signs]

Ghost/Glow Shader
Per-pixel N.Eye is used to index into a glow map (which gets multiplied by {0.1, 0.1, 0.5, 0.0, 1.0} to tint it ghostly green).
[Figure: Glow Map]

Ghost Shader
dp3 r4.r, r4, r0_bx2   ; 1st row of matrix multiply
dp3 r4.g, r2, r0_bx2   ; 2nd row of matrix multiply
dp3 r4.b, r3, r0_bx2   ; 3rd row of matrix multiply
dp3 r5, r4, r1         ; (N.Eye)
phase
texld r2, r5
mov r0.rgb, r2
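The dependent read in the second phase uses N.Eye as a texture coordinate. A CPU-side sketch of that lookup is shown below, assuming a 1D glow ramp, clamped addressing, and point sampling; none of these details are stated on the slide, and the ramp values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def glow_lookup(glow_map, n, eye):
    """Index a 1D glow ramp with N.Eye clamped to [0, 1] (point sampled)."""
    t = min(max(dot(n, eye), 0.0), 1.0)
    index = min(int(t * len(glow_map)), len(glow_map) - 1)
    return glow_map[index]


# Hypothetical ramp: strongest glow where the surface is seen edge-on.
glow_map = [1.0, 0.5, 0.25, 0.0]
facing = glow_lookup(glow_map, (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
grazing = glow_lookup(glow_map, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
```

With a ramp like this, silhouette edges glow while surfaces facing the viewer stay dark.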

Multi-light Shaders
Four Diffuse Per-Pixel Lights in one Pass

4-light Shader
dp3_sat r2.rgb, r1_bx2, r2_bx2   ; *= (N.L1)
mul_x2 r2.rgb, r2, c0            ; *= Light Color
dp3_sat r3.rgb, r1_bx2, r3_bx2   ; Light 2
mul_x2 r3.rgb, r3, c1
dp3_sat r4.rgb, r1_bx2, r4_bx2   ; Light 3
mul_x2 r4.rgb, r4, c2
phase
texld r0, t0
texld r5, t4
dp3_sat r5.rgb, r1_bx2, r5_bx2   ; Light 4
mul_x2 r5.rgb, r5, c3
mul r1.rgb, r2, v0.x             ; Attenuate light 1
mad r1.rgb, r3, v0.y, r1         ; Attenuate light 2
mad r1.rgb, r4, v0.z, r1         ; Attenuate light 3
mad r1.rgb, r5, v0.w, r1         ; Attenuate light 4
add r1.rgb, r1, c7               ; += Ambient
mul r0.rgb, r1, r0               ; Modulate by base map
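The arithmetic of the shader above can be summarized as: for each light, saturate N.L, multiply by twice the light color (`mul_x2`), and scale by a per-light attenuation; then add ambient and modulate by the base map. The Python reference below is a sketch of that sum only, with hypothetical input values; it does not model register precision or clamping.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def saturate(x):
    """Clamp to [0, 1], like the _sat instruction modifier."""
    return min(max(x, 0.0), 1.0)


def shade(normal, light_dirs, light_colors, attenuation, ambient, base):
    """Sum the diffuse lights, add ambient, and modulate by the base color."""
    total = list(ambient)
    for l, color, att in zip(light_dirs, light_colors, attenuation):
        n_dot_l = saturate(dot(normal, l))
        for i in range(3):
            total[i] += 2.0 * n_dot_l * color[i] * att
    return tuple(t * b for t, b in zip(total, base))


# Hypothetical scene: two lights face the surface, one is behind it, one is edge-on.
normal = (0.0, 0.0, 1.0)
light_dirs = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
light_colors = [(0.25, 0.25, 0.25)] * 4
attenuation = (1.0, 1.0, 1.0, 0.5)
result = shade(normal, light_dirs, light_colors, attenuation,
               ambient=(0.125, 0.125, 0.125), base=(1.0, 0.5, 0.0))
```

In the shader, the four attenuation factors arrive packed into the components of the interpolated color v0.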

Reflection and Refraction Shader
Normal used to compute reflection and refraction rays in one pass

Reflection and Refraction
dp3 r4.r, r4, r0_bx2          ; 1st row of matrix multiply
dp3 r4.g, r2, r0_bx2          ; 2nd row of matrix multiply
dp3 r4.b, r3, r0_bx2          ; 3rd row of matrix multiply
mul r5.rgb, c0.g, -r1_bx2     ; Refract by c0 = index of refraction fudge factor
mad r2.rgb, c0.r, -r4, r5     ; Refract by c0 = index of refraction fudge factor
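The last two instructions build the refraction lookup vector as a weighted sum of the negated eye vector and the negated normal, with the weights taken from c0. This is an approximation controlled by a fudge factor rather than a physically exact refraction. A sketch of the same arithmetic follows; the constant values are hypothetical, since the slide does not give them.

```python
def fudge_refract(n, eye, c0_r, c0_g):
    """Approximate refraction vector: c0.r * (-N) + c0.g * (-Eye)."""
    return tuple(-c0_r * nc - c0_g * ec for nc, ec in zip(n, eye))


# Hypothetical constants and vectors for illustration.
n = (0.0, 0.0, 1.0)
eye = (0.5, 0.0, 0.5)
refr = fudge_refract(n, eye, c0_r=0.5, c0_g=1.0)
```

Setting c0.r to zero leaves the view ray undeflected; larger values bend it further toward the inward normal.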

Developing Pixel Shaders
The good news is that since shaders are a programming language, they're easier to edit and debug on the fly than complex multitexturing setups.
Use tools like ShadeLab to play around with them interactively.
As pixel shaders grow in complexity comparable to today's vertex shaders, we'll need to address the language issues like we are at the vertex shader level today. See Evan Hart's talk right after lunch for more on this.

ShadeLab

The Road to DX9
1.4 is a good preparation for how to think about DX9 pixel shaders:
  Unified instruction set
  Higher precision
  Vectors, not colors
  Flexible dependent texture reads

Summary
Pixel Shader Overview
1.4 Pixel Shaders
  Unified instruction set
  Flexible dependent texture read
Image Processing
3D volume visualizations
Effects on 3D Surfaces
  Per-pixel lighting
  Per-pixel Fresnel term
  Bumpy environment mapping
  Per-pixel anisotropic lighting
ShadeLab Tool
  Good for playing around with shaders in real time
  Generates vertex shader code on the fly to feed the pixel shader

Call To Action
Use 1.4 pixel shaders in your games when they are present.
Abstract your shader usage to the point that shaders and shader versions are just part of the dataset.
Start thinking about pixel shaders according to this model: it's where we're going in DX9.
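One way to treat shader versions as data is to store each effect as a list of variants tagged with the version they require and let the engine pick the best one the hardware reports. The sketch below assumes that supporting a version implies supporting all lower ones; the dictionary layout and file names are hypothetical.

```python
def pick_shader(variants, supported_version):
    """Return the highest-version variant the hardware supports, or None."""
    usable = [v for v in variants if v["version"] <= supported_version]
    return max(usable, key=lambda v: v["version"], default=None)


# Hypothetical effect description loaded from the game's data files.
water_effect = [
    {"version": (1, 1), "file": "water_ps11.psh"},
    {"version": (1, 4), "file": "water_ps14.psh"},
]

best = pick_shader(water_effect, (1, 4))
fallback = pick_shader(water_effect, (1, 3))
unsupported = pick_shader(water_effect, (1, 0))
```

Adding a variant for a newer shader version then becomes a data change rather than a code change.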

Acknowledgements
Chris Brennan, Chris Oat, John Isidoro, and Dan Baker for insights and demos

Questions