Optimizing AAA Games for Mobile Platforms



Similar documents
Image Synthesis. Transparency. computer graphics & visualization

Low power GPUs a view from the industry. Edvard Sørgård

Computer Graphics Hardware An Overview

GPU Architecture. Michael Doggett ATI

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

How To Understand The Power Of Unity 3D (Pro) And The Power Behind It (Pro/Pro)

Optimizing Unity Games for Mobile Platforms. Angelo Theodorou Software Engineer Unite 2013, 28 th -30 th August

Shader Model 3.0. Ashu Rege. NVIDIA Developer Technology Group

Performance Optimization and Debug Tools for mobile games with PlayCanvas

OpenGL Performance Tuning

Image Processing and Computer Graphics. Rendering Pipeline. Matthias Teschner. Computer Science Department University of Freiburg

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1

Midgard GPU Architecture. October 2014

Advanced Graphics and Animations for ios Apps

Radeon HD 2900 and Geometry Generation. Michael Doggett

How To Teach Computer Graphics

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

Unreal Engine 4: Mobile Graphics on ARM CPU and GPU Architecture

Introduction to Computer Graphics

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

Outline. srgb DX9, DX10, XBox 360. Tone Mapping. Motion Blur

2: Introducing image synthesis. Some orientation how did we get here? Graphics system architecture Overview of OpenGL / GLU / GLUT

How To Build An Engine 4 Mobile Graphics On Anarm V8-A (A64)

CSE 564: Visualization. GPU Programming (First Steps) GPU Generations. Klaus Mueller. Computer Science Department Stony Brook University

NVFX : A NEW SCENE AND MATERIAL EFFECT FRAMEWORK FOR OPENGL AND DIRECTX. TRISTAN LORACH Senior Devtech Engineer SIGGRAPH 2013

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

DATA VISUALIZATION OF THE GRAPHICS PIPELINE: TRACKING STATE WITH THE STATEVIEWER

GPU(Graphics Processing Unit) with a Focus on Nvidia GeForce 6 Series. By: Binesh Tuladhar Clay Smith

Vulkan on NVIDIA GPUs. Piers Daniell, Driver Software Engineer, OpenGL and Vulkan

iz3d Stereo Driver Description (DirectX Realization)

3D Computer Games History and Technology

What is GPUOpen? Currently, we have divided console & PC development Black box libraries go against the philosophy of game development Game

Touchstone -A Fresh Approach to Multimedia for the PC

Deferred Shading & Screen Space Effects

How To Make A Texture Map Work Better On A Computer Graphics Card (Or Mac)

Introduction to GPGPU. Tiziano Diamanti

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

NVPRO-PIPELINE A RESEARCH RENDERING PIPELINE MARKUS TAVENRATH MATAVENRATH@NVIDIA.COM SENIOR DEVELOPER TECHNOLOGY ENGINEER, NVIDIA

QCD as a Video Game?

Introduction to Computer Graphics. Jürgen P. Schulze, Ph.D. University of California, San Diego Fall Quarter 2012

GUI GRAPHICS AND USER INTERFACES. Welcome to GUI! Mechanics. Mihail Gaianu 26/02/2014 1

BRINGING UNREAL ENGINE 4 TO OPENGL Nick Penwarden Epic Games Mathias Schott, Evan Hart NVIDIA

NVIDIA GeForce GTX 580 GPU Datasheet

ANDROID DEVELOPER TOOLS TRAINING GTC Sébastien Dominé, NVIDIA

GPU Christmas Tree Rendering. Evan Hart

INTRODUCTION TO RENDERING TECHNIQUES

nvfx: A New Shader-Effect Framework for OpenGL, DirectX and Even Compute Tristan Lorach ( tlorach@nvidia.com )

Dynamic Resolution Rendering

Rendering Techniques in Gears of War 2

Practical Volume Rendering in mobile devices

Web Based 3D Visualization for COMSOL Multiphysics

OpenGL Insights. Edited by. Patrick Cozzi and Christophe Riccio

Real-Time Realistic Rendering. Michael Doggett Docent Department of Computer Science Lund university

GPU Profiling with AMD CodeXL

The Future Of Animation Is Games

Introduction to Game Programming. Steven Osman

Making Dreams Come True: Global Illumination with Enlighten. Graham Hazel Senior Product Manager Sam Bugden Technical Artist

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

Water Flow in. Alex Vlachos, Valve July 28, 2010

GPGPU Computing. Yong Cao

Real-Time BC6H Compression on GPU. Krzysztof Narkowicz Lead Engine Programmer Flying Wild Hog

Writing Applications for the GPU Using the RapidMind Development Platform

A Prototype For Eye-Gaze Corrected

XBOX Performances XNA. Part of the slide set taken from: Understanding XNA Framework Performance Shawn Hargreaves GameFest 2007

High Dynamic Range and other Fun Shader Tricks. Simon Green

Lecture Notes, CEng 477

Advanced Rendering for Engineering & Styling

Application. EDIUS and Intel s Sandy Bridge Technology

Petascale Visualization: Approaches and Initial Results

A Crash Course on Programmable Graphics Hardware

Game Development in Android Disgruntled Rats LLC. Sean Godinez Brian Morgan Michael Boldischar

A Short Introduction to Computer Graphics

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

Volume Rendering on Mobile Devices. Mika Pesonen

Advanced Visual Effects with Direct3D

Developing with Oculus: Mastering the Oculus SDK. Volga Aksoy Software Engineer, Oculus

Cloud Gaming & Application Delivery with NVIDIA GRID Technologies. Franck DIARD, Ph.D. GRID Architect, NVIDIA

INSTALLATION GUIDE ENTERPRISE DYNAMICS 9.0

GPU-Driven Rendering Pipelines

Modern Graphics Engine Design. Sim Dietrich NVIDIA Corporation

CS 378: Computer Game Technology

3D Stereoscopic Game Development. How to Make Your Game Look

SkillsUSA 2014 Contest Projects 3-D Visualization and Animation

Getting Started with CodeXL

GearVRf Developer Guide

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc.

OpenEXR Image Viewing Software

Course Overview. CSCI 480 Computer Graphics Lecture 1. Administrative Issues Modeling Animation Rendering OpenGL Programming [Angel Ch.

Computer Graphics on Mobile Devices VL SS ECTS

Trends in HTML5. Matt Spencer UI & Browser Marketing Manager

Lezione 4: Grafica 3D*(II)

How To Develop For A Powergen 2.2 (Tegra) With Nsight) And Gbd (Gbd) On A Quadriplegic (Powergen) Powergen Powergen 3

VA (Video Acceleration) API. Jonathan Bian 2009 Linux Plumbers Conference

L20: GPU Architecture and Models

AMD GPU Architecture. OpenCL Tutorial, PPAM Dominik Behr September 13th, 2009

Software Evaluation Guide for Autodesk 3ds Max 2009* and Enemy Territory: Quake Wars* Render a 3D character while playing a game

1. INTRODUCTION Graphics 2

Developer Tools. Tim Purcell NVIDIA

Transcription:

Optimizing AAA Games for Mobile Platforms Niklas Smedberg Senior Engine Programmer, Epic Games

Who Am I A.k.a. Smedis Epic Games, Unreal Engine 15 years in the industry 30 years of programming C64 demo scene

Mobile GPU Architecture Traditional GPU NVIDIA Tegra Tile-based GPU Very different from desktop or consoles Common on smartphones and tablets ARM Mali GPUs fall into this category

Tile-Based Mobile GPU TL;DR Summary Splits the screen into tiles E.g. 16x16 pixels The whole tile fits in GPU registers Process all drawcalls for one tile Repeat for each tile to fill the screen Each tile is written to RAM as it finishes (For illustration purposes only)

ARM Mali Rendering Process Software Command Buffer Vertex Processing Tiling Parameter Buffer Pixel Processing Frame Buffer

Render Target On Chip No bandwidth cost for drawing or alpha-blend Cheap depth/stencil testing MSAA is cheap Have seen 0-5 ms cost for MSAA Be wary of buffer restores (color or depth) Note: Depth textures doesn t work with MSAA

Mali 400 vs 600 Mali 400 precision 16-bit floating point in pixelshader (one size fits all) 24-bit interpolators Shader units Mali 400: Separate vertex and pixel hardware Mali 6xx: Unified shader unit Alpha-blending Mali 400: Separate specialized hardware Mali 6xx: Can be done in shader

Architecture Caveats Extremely expensive rendertarget switches No free hidden surface removal Not like on ImgTec-based devices Sort drawcalls front-to-back Big occluders first, skybox last No texture compression for RGBA textures Use uncompressed RGBA or two compressed RGB textures ASTC will save the day OpenGL ES 3.0 not available on Android and ios

Rendering Techniques How to take advantage of this GPU?

Render Buffer Management (1 of 2) Each render target is a whole new scene Avoid switching render target back and forth! Can cause a full restore: Copies full color/depth/stencil from RAM into Tile Memory at the beginning of the scene Can cause a full resolve: Copies full color/depth/stencil from Tile Memory into RAM at the end of the scene

Render Buffer Management (2 of 2) Avoid buffer restore Clear everything! Color/depth/stencil A clear just sets some dirty bits in a register Avoid buffer resolve Use discard extension (GL_EXT_discard_framebuffer) See usage case for shadows Avoid unnecessarily different FBO combos Don t let the driver think it needs to start resolving and restoring any buffers!

Mobile Materials Full Unreal Engine materials are too complicated

Mobile Materials Initial idea: Pre-render into a single texture

Mobile Materials Current solution: Pre-render components into separate textures Add mobile-specific settings Feature support driven by artists

Mobile Shaders One hand-written ubershader Lots of #ifdef for all features Exposed as fixed settings in the artist UI Checkboxes, lists, values, etc

Material Example: Rim Lighting

Material Example: Vertex Animation

Shader Offline Processing Run C pre-processor offline Reduces in-game compile time Eliminates duplicates at off-line time

Shader Compiling Compile all shaders at startup Avoids hitching at run-time Compile on the GL thread, while loading on Game thread Compiling is not enough Must issue dummy drawcalls! Some states affect shaders! (But not as much on ARM GPUs) May need experimenting to avoid shader patching E.g. alpha-blend states, color write masks

God Rays

God Rays Ported from Xbox 360 to OpenGL ES 2.0 Moved all math to vertex shader Pass down data through interpolators But, now we re out of interpolators Split radial filter into 4 draw calls 4 x 8 = 32 texture lookups total (equiv. 256) Went from 30 ms to 5 ms

Original Shader

Mobile Shader

God Rays Original Scene No God Rays

1st Pass Downsample Scene Identify pixels RGB: Scene color A: Occlusion factor Resolve to texture: Unblurred source

2nd Pass Average 8 lookups From Unblurred source 1st quarter vector Uses 8.xy interpolators Opaque draw call

3rd Pass Average +8 lookups From Unblurred source 2nd quarter vector Uses 8.xy interpolators Additive draw call Resolve to texture: Blurred source

4th Pass Average 8 lookups From Blurred source 1st half vector Uses 8.xy interpolators Opaque draw call

5th Pass Average +8 lookups From Blurred source 2nd half vector Uses 8.xy interpolators Additive draw call Resolve final result

6th Pass Clear the final buffer Avoids buffer restore Opaque fullscreen Screenblend apply Blend in pixelshader

Character Shadows

Character Shadows Ported one type of shadows from Xbox: Projected, modulated dynamic shadows Fairly standard method Generate shadow depth buffer Stencil potential pixels Compare shadow depth and scene depth Darken affected pixels

Character Shadows 1. Project character depth from light view

Character Shadows 2. Reproject into camera view

Character Shadows 3. Compare with SceneDepth and modulate

Character Shadows 4. Draw character on top (no self-shadow)

Shadow Optimizations (1 of 2) Shadow depth first in the frame Avoids a rendertarget switch (resolve & restore!) Resolve SceneDepth just before shadows* Write out tile depth to RAM to read as texture Keep rendering in the same tile Unfortunately no API for this in OpenGL ES

Shadow Optimizations (2 of 2) Optimize color buffer usage for shadow We only need the depth buffer! Unnecessary buffer, but required in OpenGL ES Clear (avoid restore) and disable color writes Use gldiscardframebuffer() to avoid resolve Could encode depth in F16 / RGBA8 color instead Draw screen-space quad instead of cube Avoids a dependent texture lookup

Questions?