PLANNING FOR DENSITY AND PERFORMANCE IN VDI WITH NVIDIA GRID JASON SOUTHERN SENIOR SOLUTIONS ARCHITECT FOR NVIDIA GRID



Similar documents
REMOTE HIGH FIDELITY VISUALIZATION. May 2015 Jeremy Main, Sr. Solution Architect GRID

Table of Contents. P a g e 2

ArcGIS Pro: Virtualizing in Citrix XenApp and XenDesktop. Emily Apsey Performance Engineer

NVIDIA GRID OVERVIEW SERVER POWERED BY NVIDIA GRID. WHY GPUs FOR VIRTUAL DESKTOPS AND APPLICATIONS? WHAT IS A VIRTUAL DESKTOP?

GTC EXPRESS WEBINAR: GETTING THE MOST OUT OF VMWARE HORIZON VIEW VDGA WITH NVIDIA GRID. Steve Harpster - Solution Architect NALA

PNY Professional Solutions NVIDIA GRID - GPU Acceleration for the Cloud

S4726 VIRTUALIZATION AN INTRO TO VIRTUALIZATION. Luke Wignall & Jared Cowart Senior Solution Architects GRID

NVIDIA GRID VGPU Steve Harpster SA NALA October 2013

S5445 BUILDING THE BEST USER EXPERIENCE WITH CITRIX XENAPP & NVIDIA GRID THOMAS POPPELGAARD

Performance Testing in Virtualized Environments. Emily Apsey Product Engineer

NVIDIA GRID DASSAULT CATIA V5/V6 SCALABILITY GUIDE. NVIDIA Performance Engineering Labs PerfEngDoc-SG-DSC01v1 March 2016

HOW MANY USERS CAN I GET ON A SERVER? This is a typical conversation we have with customers considering NVIDIA GRID vgpu:

Get the Best out of NVIDIA GPUs for 3D Design and Engineering in the Cloud

GRID VGPU FOR VMWARE VSPHERE

GPU virtualization with Citrix XenDesktop, using NVIDIA GRID graphics board on VMware vsphere 6

Accelerated Graphics Reference Architecture Addendum for

VMware and NVIDIA: Bringing Workstations to the cloud

S Hands on Tutorial: Deploying GRID in Citrix and VMware Virtual Desktop Environments

GRID VGPU FOR CITRIX XENSERVER

Cloud Gaming & Application Delivery with NVIDIA GRID Technologies. Franck DIARD, Ph.D. GRID Architect, NVIDIA

Implementation of a Collaborative Engineering Data Management System. Tony Metz Fermilab 04 May 2015

VMware Horizon 6 3D Engineering Workloads Reference Architecture TECHNICAL WHITE PAPER

Delivering 3D Graphics from the Private or Public Cloud

Virtualization of ArcGIS Pro. An Esri White Paper December 2015

OCIO HIGH RESOLUTION GRAPHICS DESKTOP VIRTUALIZATION POC. Chip Charnley Technical Expert Client Technologies Infrastructure Architecture

GPU Accelerated XenDesktop 3D Graphics beyond Designers and Engineers

Deploying Hardware-Accelerated Graphics with View Virtual Desktops in Horizon 6

Hardware Accelerated Graphics for VDI

GRID VGPU FOR CITRIX XENSERVER

CUSTOMER EXPERIENCES WITH GPU VIRTUALIZATION AND 3D REMOTING

Configuring XenServer v6.5.0 Service Pack 1 for Graphics

GRID VIRTUAL GPU FOR CITRIX XENSERVER Version ,

Virtual Machine Graphics Acceleration Deployment Guide WHITE PAPER

Virtual Desktop VMware View Horizon

Server and Storage Sizing Guide for Windows 7 TECHNICAL NOTES

GTC 2015 S5128 Case Study: Georgia Tech Uses Citrix XenApp with NVIDIA GRID to Deliver Engineering Applications

VDI: What Does it Mean, Deploying challenges & Will It Save You Money?

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

Characterize Performance in Horizon 6

WHITE PAPER 1

Graphics Acceleration in VMware Horizon View Virtual Desktops

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

A general-purpose virtualization service for HPC on cloud computing: an application to GPUs

SierraVMI Sizing Guide

VMware Horizon View 3D Graphics

Configuring RemoteFX on Windows Server 2012 R2

Vmware Horizon View with Rich Media, Unified Communications and 3D Graphics

Best Practices for Monitoring Databases on VMware. Dean Richards Senior DBA, Confio Software

Installation Guide. (Version ) Midland Valley Exploration Ltd 144 West George Street Glasgow G2 2HG United Kingdom

Chapter 19 Cloud Computing for Multimedia Services

Virtual Computing and VMWare. Module 4

YOUR STRATEGIC VIRTUALIZATION ALTERNATIVE. Greg Lissy Director, Red Hat Virtualization Business. James Rankin Senior Solutions Architect

Stream Processing on GPUs Using Distributed Multimedia Middleware

Parallel Large-Scale Visualization

NVIDIA VIDEO ENCODER 5.0

HIGH PERFORMANCE VIDEO ENCODING WITH NVIDIA GPUS

2014 Teradici Corporation.

Stratusphere Solutions

CUDA in the Cloud Enabling HPC Workloads in OpenStack With special thanks to Andrew Younge (Indiana Univ.) and Massimo Bernaschi (IAC-CNR)

VMware Virtual Infrastucture From the Virtualized to the Automated Data Center

AGENDA. Overview GPU Video Encoding NVIDIA Video Encoding Capabilities. Software API Performance & Quality. Kepler vs Maxwell GPU capabilities Roadmap

HIGH-PERFORMANCE GPU VIDEO ENCODING ABHIJIT PATAIT SR. MANAGER, NVIDIA

Monitoring Databases on VMware

VDI Optimization Real World Learnings. Russ Fellows, Evaluator Group

VMware Horizon 6 with View Performance and Best Practices TECHNICAL WHITE PAPER

ClearCube: Dedicated 1:1 Blade PC Workstations for Centralized and Virtualized Desktop Infrastructure

Reviewer s Guide for Remote 3D Graphics Apps

Managing Capacity Using VMware vcenter CapacityIQ TECHNICAL WHITE PAPER

按 一 下 以 編 輯 母 片 標 題 樣 式

Dell Desktop Virtualization Solutions Stack with Teradici APEX 2800 server offload card

HP Z Workstations graphics card options

Desktop Virtualization. The back-end

HP Workstations graphics card options

Parallels Virtuozzo Containers

Intel Graphics Virtualization Technology Update. Zhi Wang,

Desktop Consolidation. Stéphane Verdy, CTO Devon IT

QuickSpecs. NVIDIA Quadro M GB Graphics INTRODUCTION. NVIDIA Quadro M GB Graphics. Overview

QuickSpecs. NVIDIA Quadro K5200 8GB Graphics INTRODUCTION. NVIDIA Quadro K5200 8GB Graphics. Technical Specifications

GRID VIRTUAL GPU FOR CITRIX XENSERVER Version ,

GRIDCENTRIC VMS TECHNOLOGY VDI PERFORMANCE STUDY

Praktijkexamen met Project VRC. Virtual Reality Check

What s New with VMware Virtual Infrastructure

Maximizing SQL Server Virtualization Performance

System requirements for Autodesk Building Design Suite 2017

VIRTU Universal MVP Installation Guide

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop

Memory and SSD Optimization In Windows Server 2012 and SQL Server 2012

HP Workstations graphics card options

Citrix XenDesktop Modular Reference Architecture Version 2.0. Prepared by: Worldwide Consulting Solutions

Mark Bennett. Search and the Virtual Machine

E-Guide VIRTUALIZED GPUS TO IMPROVE VDI PERFORMANCE

XID ERRORS. vr352 May XID Errors

Introduction GPU Pass-Through Shared GPU Guest Support and Constraints Available NVIDIA GRID vgpu Types...

Transcription:

PLANNING FOR DENSITY AND PERFORMANCE IN VDI WITH NVIDIA GRID JASON SOUTHERN SENIOR SOLUTIONS ARCHITECT FOR NVIDIA GRID

AGENDA Recap on how vgpu works Planning for Performance - Design considerations - Benchmarking Optimizing for Density

recap Nvidia vgpu

SHARING THE GPU vgpu from NVIDIA Datacenter GPU-enabled server Hypervisor Virtual Machine Guest OS Apps Virtual Machine Guest OS Apps Notebook or thin client GRID Virtual GPU Manager NVIDIA Driver NVIDIA Driver Hypervisor Physical GPU Management Direct GPU access from guest VMs Graphics NVIDIA GRID vgpu

VIRTUAL GPU RESOURCE SHARING GPU-enabled server Hypervisor GRID Virtual GPU Manager Hypervisor NVIDIA GRID vgpu CPU MMU Channels Timeshared Scheduling 3D CE NVENC NVDEC VM 1 Guest OS Apps NVIDIA Driver VM 2 Guest OS Apps NVIDIA Driver GPU BAR VM1 BAR VM2 BAR Framebuffer VM1 FB VM2 FB Frame buffer - Fixed allocation - Allocated at VM startup GPU Engines Timeshared among VMs, like multiple contexts on single OS Dedicated secure data channels between VM & GPU

Building for Performance

WHAT AFFECTS OVERALL PERFORMANCE vcpu System Memory GPU Storage Performance

HOW DO WE CHECK GPU UTILIZATION? Nvidia-SMI - CLI - Realtime & Looping Perfmon - GUI - Realtime & logging GPU-Z - GUI - Realtime & Log to File Process Explorer - Per process information on utilisation GPUShark - Basic GUI - Realtime Lakeside Systrack / LWL Stratusphere - Detailed historical reporting

MONITORING PASSTHROUGH VS VGPU Measured against 100% of the GPU

BE CAREFUL THOUGH 320% Utilisation?

ASSESSMENT TOOLS Long term assessment data allows you to plan for the peak loads. GPU usage is often in bursts, plan for the peak not the mean. Use assessment tools that track GPU info e.g. - Lakeside Systrack 7 - Liquidware Labs Stratusphere FIT

PLAN FOR THE PEAKS

VCPU S Allow at least one for the Encoder (HDX or PCoIP) Allow at least one for the OS The rest are for the application(s) - How many did the workstations have? - How demanding is the application itself?

SYSTEM MEMORY => GPU Memory 2GB of System RAM & 4GB GPU Memory = Bottleneck! Memory overcommit / ballooning etc is not recommended.

CUDA Computational Usage GPGPU PhysX PASSTHROUGH OR VGPU When do I really need to use Passthrough? Troubleshooting vgpu issues Driver simplification - Kx80Q

CUDA WHAT IS IT NVIDIA s parallel computing architecture that enables dramatic increases in computing performance by harnessing the power of the GPU Applications & their features that use CUDA http://www.nvidia.com/object/gpu-accelerated-applications.html

Benchmarking

BENCHMARKING Remember you re benchmarking the entire VM, not just the GPU All of these have an impact on the result. - GPU - CPU - RAM - DISK Don t overlook User Experience testing. - Benchmarks are just numbers, user acceptance is king.

BENCHMARKING TOOLS CADalyst - For AutoCAD workloads http://www.cadalyst.com/benchmark-test 3D Mark 11 - Generic DirectX benchmarking http://www.futuremark.com/benchmarks/3dmark11 SPECViewperf 11 - OPENGL benchmarking tool - Has industry & application specific modules available - Version 12 has issues with virtualisation at present.. http://www.spec.org/gwpg/gpc.static/vp11info.html

Frame Rate Limiter & VSYNC

FRAME RATE LIMITER For vgpu we implement a frame Rate Limiter (FRL) Used in vgpu to balance performance across multiple vgpus executing on the same physical GPU. FRL imposes a max frames-per-second that vgpu will render at in a VM. - Q profiles render at 60fps max - non Q profiles are limited to 45fps max

VSYNC Setting is modified by applications or manually performed via the NVIDIA Control Panel Default setting allows the application to set the VSYNC policy Setting the VSYNC to on will synchronize the frame rate to 60Hz / 60 FPS for both pass-through and vgpu Setting the VSYNC to off will allow the GPU to render as many frames as possible In vgpu profiles, this setting does not override the FRL

70.0 VSYNC EFFECT ON VGPU - SINGLE VM 60.0 50.0 40.0 30.0 57.7 60.9 20.0 46.9 44.0 49.1 49.3 34.4 35.9 47.9 50.6 10.0 11.0 11.7 0.0 CATIA Siemens NX ProE SolidWorks Tcvis Ensight SPECviewperf 11 Scores K260Q K260Q VSYNC Off

70.0 FRL EFFECT ON VGPU SINGLE VM 60.0 50.0 40.0 30.0 57.7 61.2 20.0 46.9 36.5 49.1 50.1 34.4 29.7 47.9 43.6 10.0 11.0 9.8 0.0 CATIA Siemens NX ProE SolidWorks Tcvis Ensight SPECviewperf 11 Scores K260Q K260Q FRL Off

90.0 VSYNC + FRL EFFECT ON VGPU 80.0 70.0 60.0 50.0 40.0 78.2 75.4 75.0 74.3 30.0 20.0 10.0 0.0 46.9 58.3 57.7 60.2 57.0 60.9 61.2 57.7 49.1 49.3 50.1 50.6 47.9 44.0 43.6 36.5 34.4 35.9 37.2 39.2 29.7 11.0 11.7 9.8 11.3 11.1 CATIA Siemens NX ProE SolidWorks Tcvis Ensight SPECviewperf 11 Scores K260Q K260Q VSYNC Off K260Q FRL Off K260Q VSYNC + FRL Off Pass-through VSYNC Off

Am I using the right profile? Optimizing for Density

Quadro K6000 2880 CUDA cores 12GB COMPARING QUADRO TO VGPU Pass-through vgpu Quadro K5000 1536 CUDA cores 4GB Quadro K4000 768 CUDA cores 3GB Quadro K2000 384 CUDA cores 2GB GRID K2 2x 1536 CUDA cores 2x 4GB GRID K260Q 2x 1536 CUDA cores 4x 2GB GRID K240Q 2x 1536 CUDA cores 8x 1GB Quadro K600 192 CUDA cores 1GB Quadro 410 192 CUDA cores 512MB GRID K1 4x 192 CUDA cores 4x 4GB GRID K140Q 4x 192 CUDA cores 16x 1GB

vgpu Profiles In Current Driver Board GRID K1 vgpu type vgpus per board vgpus per GPU Per virtual GPU FB Heads Max Res GRID K120Q 32 8 512M 2 2560x1600 GRID K140Q 16 4 1G 2 2560x1600 GRID K160Q 8 2 2G 4 2560x1600 GRID K180Q 4 1 4G 4 2560x1600 Board vgpu type vgpus per board vgpus per GPU GRID K2 Per virtual GPU FB Heads Max Res GRID K220Q 16 8 512M 2 2560x1600 GRID K240Q 8 4 1G 2 2560x1600 GRID K260Q 4 2 2G 4 2560x1600 GRID K280Q 2 1 4G 4 2560x1600 What does the Q mean?

GRID K260Q 2GB framebuffer 4 heads, 2560x1600 ENGINEER DESIGNER GRID K2 2 high-end Kepler GPUs 3072 CUDA cores (1536 / GPU) 8GB GDDR5 (4GB / GPU) GRID K240Q 1GB framebuffer 2 heads, 2560x1600 POWER USER GRID K220Q 512MB framebuffer 2 heads, 1920x1200 KNOWLEDGE WORKER

LET S CONSIDER A SCENARIO. An organisation has trialled K1 s in passthrough on dual displays - Performance is perfect, but they want better density from their server purchase if possible. -2 K1 cards in a chassis = 8 Users in pass-through. Is there a way to get more users on the server with the same or better performance?

IT DEPENDS ON THE PEAK UTILIZATION Idle 10% GPU Framebuffer Load 25% Load 90% Idle 75% 90% of the GPU in use vgpu on K1 not an option 1 GB Framebuffer in use 3 GB going to waste.

Card VGPU OPTIONS ON A K2 CARD. Physical GPUs Virtual GPU Use Case Frame Buffer (MB) Virtual Display Heads Maximum Resolution Maximum vgpus per GPU per Board GRID K2 2 GRID K260Q Typical Designer 2048 4 2560x1600 2 4 GRID K2 2 GRID K240Q No Density improvement 4 VM s per card Entry-Level Designer 1024 2 2560x1600 4 8 Sufficient Guaranteed GPU capacity but too little Framebuffer < 1Gb GRID K2 2 GRID K220Q Knowledge Wkr 512 2 2560x1600 8 16 K1 192 Cores per GPU K2 1536 Cores per GPU So, let s assume that K220Q profiles have similar minimum GPU resources to K1 in pass-through

THE GOLDILOCKS PROFILE? Card Physical GPUs Virtual GPU Use Case Frame Buffer (MB) Virtual Display Heads Maximum Resolution Maximum vgpus per GPU per Board GRID K2 2 GRID K240Q Entry-Level Designer 1024 2 2560x1600 4 8 Idle 10% K1 Usage GPU K1 Usage Framebuffer Load 25% Load 90% Idle 75%

POTENTIAL SOLUTION K2 with 240Q profile would - Double the user density in the chassis to 16 - Increased GPU performance - CAPEX reduction due to less chassis needed.

Remember, this is just the start ENGINEER / DESIGNER GRID K2 High-end Kepler GPUs 3072 CUDA cores (1536 / GPU) 8GB GDDR5 (4GB / GPU) POWER USER GRID K1 Entry Kepler GPUs 768 CUDA cores (192 / GPU) 16GB DDR3 (4GB / GPU) KNOWLEDGE WORKER

Impact of Remoting Protocols One Last thing

THANK YOU