AGENDA. Overview GPU Video Encoding NVIDIA Video Encoding Capabilities. Software API Performance & Quality. Kepler vs Maxwell GPU capabilities Roadmap



Similar documents
HIGH-PERFORMANCE GPU VIDEO ENCODING ABHIJIT PATAIT SR. MANAGER, NVIDIA

HIGH PERFORMANCE VIDEO ENCODING WITH NVIDIA GPUS

NVIDIA VIDEO ENCODER 5.0

GRID SDK FAQ. PG January Frequently Asked Questions

Cloud Gaming & Application Delivery with NVIDIA GRID Technologies. Franck DIARD, Ph.D. GRID Architect, NVIDIA

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

REMOTE HIGH FIDELITY VISUALIZATION. May 2015 Jeremy Main, Sr. Solution Architect GRID

NVIDIA GRID OVERVIEW SERVER POWERED BY NVIDIA GRID. WHY GPUs FOR VIRTUAL DESKTOPS AND APPLICATIONS? WHAT IS A VIRTUAL DESKTOP?

MPEG-4 AVC/H.264 Video Codecs Comparison

NVIDIA GeForce GTX 580 GPU Datasheet

Scalability in the Cloud HPC Convergence with Big Data in Design, Engineering, Manufacturing

PNY Professional Solutions NVIDIA GRID - GPU Acceleration for the Cloud

Get the Best out of NVIDIA GPUs for 3D Design and Engineering in the Cloud

Amazon EC2 Product Details Page 1 of 5

PLANNING FOR DENSITY AND PERFORMANCE IN VDI WITH NVIDIA GRID JASON SOUTHERN SENIOR SOLUTIONS ARCHITECT FOR NVIDIA GRID

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

S Hands on Tutorial: Deploying GRID in Citrix and VMware Virtual Desktop Environments

Vmware Horizon View with Rich Media, Unified Communications and 3D Graphics

NVIDIA Quadro M4000 Sync PNY Part Number: VCQM4000SYNC-PB. User Guide

QuickSpecs. NVIDIA Quadro M GB Graphics INTRODUCTION. NVIDIA Quadro M GB Graphics. Overview

VMware and NVIDIA: Bringing Workstations to the cloud

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming

S4726 VIRTUALIZATION AN INTRO TO VIRTUALIZATION. Luke Wignall & Jared Cowart Senior Solution Architects GRID

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

MPEG-4 AVC/H.264 Video Codecs Comparison

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

The GPU Accelerated Data Center. Marc Hamilton, August 27, 2015

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Overview. NVIDIA Quadro K5200 8GB Graphics J3G90AA

QuickSpecs. NVIDIA Quadro K4200 4GB Graphics INTRODUCTION. NVIDIA Quadro K4200 4GB Graphics. Overview

NVIDIA GeForce GTX 750 Ti

GTC EXPRESS WEBINAR: GETTING THE MOST OUT OF VMWARE HORIZON VIEW VDGA WITH NVIDIA GRID. Steve Harpster - Solution Architect NALA

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Media Cloud Based on Intel Graphics Virtualization Technology (Intel GVT-g) and OpenStack *

Boundless Security Systems, Inc.

HP Workstations graphics card options

GPU Accelerated XenDesktop 3D Graphics beyond Designers and Engineers

GPU Computing - CUDA

ANDROID DEVELOPER TOOLS TRAINING GTC Sébastien Dominé, NVIDIA

XID ERRORS. vr352 May XID Errors

TEGRA X1 DEVELOPER TOOLS SEBASTIEN DOMINE, SR. DIRECTOR SW ENGINEERING

VMware Horizon View 3D Graphics

Alberto Corrales-García, Rafael Rodríguez-Sánchez, José Luis Martínez, Gerardo Fernández-Escribano, José M. Claver and José Luis Sánchez

Technical Brief. Introducing Hybrid SLI Technology. March 11, 2008 TB _v01

HP Z Workstations graphics card options

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices

Technical Brief. NVIDIA nview Display Management Software. May 2009 TB _v02

Introduction to GPU hardware and to CUDA

This letter contains latest information about the above mentioned software version.

Using Mobile Processors for Cost Effective Live Video Streaming to the Internet

QuickSpecs. NVIDIA Quadro K2200 4GB Graphics INTRODUCTION. NVIDIA Quadro K2200 4GB Graphics. Technical Specifications

PassMark - G3D Mark High End Videocards - Updated 12th of November 2015

Performance Testing in Virtualized Environments. Emily Apsey Product Engineer

Stream Processing on GPUs Using Distributed Multimedia Middleware

Programming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga

This letter contains latest information about the above mentioned software version.

Cluster Monitoring and Management Tools RAJAT PHULL, NVIDIA SOFTWARE ENGINEER ROB TODD, NVIDIA SOFTWARE ENGINEER

VidyoPanorama SOLUTION BRIEF VIDYO

QuickSpecs. NVIDIA Quadro K1200 4GB Graphics INTRODUCTION PERFORMANCE AND FEATURES. Overview

STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING

NVIDIA Tesla Compute Cluster Driver for Windows

Chapter 19 Cloud Computing for Multimedia Services

Wirecast Release Notes

Low power GPUs a view from the industry. Edvard Sørgård

GPU Parallel Computing Architecture and CUDA Programming Model

Application. EDIUS and Intel s Sandy Bridge Technology

Getting Started with the ZED 2 Introduction... 2

NVIDIA Quadro K5200 Amazing 3D Application Performance and Capability

ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop. Emily Apsey Performance Engineer

HP Workstations graphics card options

ADVANTAGES OF AV OVER IP. EMCORE Corporation

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual

Lecture 1: an introduction to CUDA

Delivering 3D Graphics from the Private or Public Cloud

DRM Driver Development For Embedded Systems

Remoting Your 3D. Ian Williams Manager, PSG Applied Engineering NVIDIA Corporation.

VIRTU Universal MVP Installation Guide

Parallel Computing: Strategies and Implications. Dori Exterman CTO IncrediBuild.

Virtual Desktop VMware View Horizon

Remote Graphical Visualization of Large Interactive Spatial Data

NVIDIA CUDA GETTING STARTED GUIDE FOR MICROSOFT WINDOWS

HP Z Workstations graphics card options

Bosch Video Management System

Technical Brief. Quadro FX 5600 SDI and Quadro FX 4600 SDI Graphics to SDI Video Output. April 2008 TB _v01

Catalyst Software Suite Version 9.3 Release Notes

HPC Cluster Decisions and ANSYS Configuration Best Practices. Diana Collier Lead Systems Support Specialist Houston UGM May 2014

NVIDIA GeForce Experience

NVIDIA IndeX Enabling Interactive and Scalable Visualization for Large Data Marc Nienhaus, NVIDIA IndeX Engineering Manager and Chief Architect

Video Conferencing System Requirements

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

NVIDIA GRID DASSAULT CATIA V5/V6 SCALABILITY GUIDE. NVIDIA Performance Engineering Labs PerfEngDoc-SG-DSC01v1 March 2016

Release 310 Quadro & NVS Professional Drivers for Windows - Version

High Performance GPGPU Computer for Embedded Systems

Transcription:

HIGH PERFORMANCE VIDEO ENCODING USING NVIDIA GPUS Abhijit Patait Sr. Manager, GPU Multimedia SW

AGENDA Overview GPU Video Encoding NVIDIA Video Encoding Capabilities Kepler vs Maxwell GPU capabilities Roadmap Software API Performance & Quality

WHY GPU VIDEO ENCODING?

BENEFITS OF ENCODING ON GPU Low power Fixed function hardware Reduced memory transfers Low latency High performance Higher density Scalability Ease of Programming Linux, Windows, C/C++, Application portability

NVIDIA GPU VIDEO ENCODING CAPABILITIES

NVIDIA GPU ENCODING CAPABILITIES Feature Benefits H.264 base, main, high profiles Wide range of use-cases High performance (Up to 16x HD) YUV 4:2:0 and 4:4:4 support QP maps MVC Up to 4096 4096 in HW API - NV Encode SDK & GRID SDK Independent of CUDA Blazing-speed encoding High quality encoding without chroma subsampling Customizable quality, region of interest encoding Full resolution stereo encode High resolution encode Flexible, Win/Linux, DirectX/CUDA Use CUDA and encode simultaneously

VIDEO ENCODING KEPLER VS. MAXWELL Kepler (GK104, GK107, GK106, GK110, GK208) Planar 4:4:4 Maxwell (GM107) Standard 4:4:4 and H.264 lossless encoding ~240 fps 2-pass encoding @ 720p ~500 fps 2-pass encoding @ 720p GRID K340/K520, K1/K2, Quadro, Tesla K10/K20 GeForce 2 full-speed encode sessions/gpu Current and future Maxwell GPU-boards GeForce 2 full-speed encode sessions/gpu NV Encode SDK 1.0, 2.0, 3.0 (Now) NV Encode SDK 4.0+ (May 2014) GRID SDK 1.x, 2.2, 2.3 (Now) GRID SDK 3.0+ (June 2014)

NVIDIA VIDEO ENCODING ROADMAP Performance improvements Quality improvements 4:4:4 & lossless encoding Rate control enhancements Adaptive quantization ROI, ME-only mode New video standards

NVENC SOFTWARE APIS

USING NVENC Direct Encode Capture + Encode NVENC SDK No capture Transcoding Archiving Video editing CUDA pre-process + encoding Granular encoder settings D3D, CUDA interop GRID SDK Capture + encode Optimized for lowlatency apps Capture + CUDA preprocess + encoding Encoder settings optimized for streaming D3D, CUDA interop

DIRECT ENCODE (NVENC SDK) Encoded bitstream Client application NVENC API Initialize, Configure, Encode CUDA Driver NVENC Driver Configure HW HW Encode NVENC firmware + hardware DirectX Driver

CAPTURE AND ENCODE (GRID SDK) Encoded Bitstream Client application DX/OGL Present NvFBC/NvIFR Encode NVENC Driver YUV DirectX/OGL Driver Capture NVENC Hardware GPU 3D Engine

NVENC SDK Available on NVIDIA developer zone https://developer.nvidia.com/nvidia-video-codec-sdk Current release 3.0 Release 4.0 in May 2014 with Maxwell support Interface header, documentation, sample application.dll/.so included in the driver Unified API for Windows and Linux Works on x86/x64 Various API s, presets, rate control modes for Transcoding Video conferencing GTC Session S4654

NVENC SDK (CONTD.) Advantages Flexibility Dynamic resolution/bitrate change CABAC vs CAVLC; low-level encoder settings, B-frames, sync vs async, custom QP Linux, Windows, DirectX, CUDA, OGL (via CUDA) Also works on GeForce hardware (2 sessions/gpu) Error concealment Reference picture invalidation Intra-refresh Quality Two-pass modes for higher quality Various presets with quality/performance trade-off 4:4:4 & lossless encoding (Maxwell only)

GRID SDK ENCODE Available on NVIDIA developer zone https://developer.nvidia.com/grid-app-game-streaming Current release: 2.2 Interface header, documentation, sample apps.dll/.so included in the driver Windows and Linux Works on x86/x64 Various presets and API s for Remote graphics (Cloud gaming, remote desktop, capture & stream) Optimized for low latency

GRID SDK (CONTD.) Advantages Simplicity Very simple API; single function call for capture + H.264 encode Low-latency, high performance Optimized API Error concealment Reference picture invalidation Intra-refresh Quality Two-pass modes for higher quality 4:4:4 & lossless encoding (Maxwell only)

PERFORMANCE AND QUALITY

PERFORMANCE 720P NVENC Performance at 720p, Low-Latency HP preset 231 fps Rate control modes CBR_IFRAME_2PASS 2_PASS_FRAMESIZE_CAP 2_PASS_QUALITY 232 fps 232 fps 504 fps 503 fps 505 fps Kepler (GRID) Maxwell 100 200 300 400 500 600 720p Performance (fps) Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application

PERFORMANCE 1080P NVENC Performance at 1080p, Low-Latency HP preset 118 fps Rate control modes CBR_IFRAME_2PASS 2_PASS_FRAMESIZE_CAP 2_PASS_QUALITY 118 fps 119 fps 239 fps 240 fps 238 fps Kepler (GRID) Maxwell 50 100 150 200 250 1080p Performance (fps) Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application

ENCODING QUALITY VS X264 ASSUMPTIONS Infinite GOP IPPP VBV buffer = bitrate/framerate x264 Zero latency CRF = 24 Preset = faster NVENC Preset = LOW_LATENCY_HQ RC = 2-pass-quality

NVENC/X264 QUALITY COMPARISON 45 Titan Fall 720p, 5 Mbps, Low-latency HQ 1.2 40 35 PSNR Y (db) 1.1 30 1 PSNR Y (db) 25 20 SSIM Y 0.9 0.8 SSIM Y PSNR NVENC PSNR x264 SSIM NVENC SSIM x264 15 0.7 10 5 0.6 0 1 101 201 301 401 501 601 701 801 901 0.5

NVENC/X264 QUALITY COMPARISON 60 Bunny 1080p, 12 Mbps, Low-latency HQ 1.5 50 1.4 1.3 40 1.2 PSNR Y (db) 30 20 PSNR Y (db) 1.1 1 SSIM Y PSNR NVENC PSNR x264 SSIM NVENC SSIM x264 SSIM Y 0.9 10 0.8 0 1 101 201 301 401 501 0.7

QUALITY COMPARISON PSNR PSNR Comparison - x264 vs NVENC PSNR Y (db) 50.00 db 45.00 db 40.00 db 35.00 db 30.00 db 25.00 db 20.00 db 15.00 db 10.00 db 5.00 db 0.00 db -5.00 db Bunny NFS Rivals NFS Rivals Titan Fall Titan Fall WoT - 3 WoT - 12 1080p 720p 1080p 720p 1080p 1280 768 1280 768 PSNR NVENC 47.24 db 34.05 db 35.51 db 30.58 db 28.13 db 34.15 db 35.60 db PSNR x264 43.71 db 33.18 db 34.39 db 29.78 db 30.63 db 33.41 db 34.72 db PSNR Difference 3.52 db 0.87 db 1.12 db 0.80 db -2.50 db 0.74 db 0.87 db

QUALITY COMPARISON SSIM SSIM Comparison - x264 vs NVENC 1.0000 0.8000 0.6000 SSIM Y 0.4000 0.2000 0.0000-0.2000 Bunny 1080p NFS Rivals NFS Rivals Titan Fall Titan Fall WoT - 3 WoT - 12 720p 1080p 720p 1080p 1280 768 1280 768 SSIM NVENC 0.9874 0.9217 0.9388 0.8350 0.8309 0.9101 0.9169 SSIM x264 0.9808 0.9103 0.9269 0.8073 0.8567 0.8930 0.9027 SSIM Difference 0.01 0.01 0.01 0.03-0.03 0.02 0.01

QUESTIONS?