Delivering 3D Graphics from the Private or Public Cloud with XenDesktop and GRID Derek Thorslund, Director of Product Management Citrix Systems Q3 2013
Rich Apps as a Service
Citrix milestones in 3D graphics remoting
2006: Project K2 delivers CATIA to Boeing Dreamliner designers
2009: GA of XenDesktop HDX 3D Pro with Deep Compression
2011: XenServer 6.0 hypervisor introduces GPU Passthrough
2012: Higher fps via NVIDIA GRID API plus improved compression
2013: XenDesktop 7 GPU Sharing with high density GRID K1/K2 cards
Business Drivers for virtualizing 3D graphics apps & workstations
Global talent base Secure IP Work-from-home Disaster recovery Mobile device access Improve time-to-market Cost efficiency
Leverage worldwide talent pool
Intellectual property: Do you recognize this car?
Centralize and secure design IP Engineering drawings Bills of Materials Cost info Supplier info Customer info Lifecycle data Product design decisions
Work-from-home & Disaster recovery
Leverage mobile devices
Improve efficiency & agility Reduce operational costs
Global Product Development Teams Real Example United States Germany India China Korea Brazil Australia
Global Development Effort Real Example 30,000 CAD files or 70 GB of data to be synchronized every day Across 26 design centers (30,000+ users) Across 16 countries It took 2 weekends to sync all code updates! More challenging for 4,000+ suppliers and partners
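For a sense of scale, the daily figures on this slide can be turned into an average file size and a sustained transfer rate. This is only a rough sketch: it assumes decimal units (1 GB = 10^9 bytes) and a transfer spread evenly over 24 hours to a single destination, neither of which is stated on the slide. The function names are illustrative.

```python
# Figures from the slide: 30,000 CAD files / 70 GB synchronized every day.
# Assumption (not from the slide): decimal units, 1 GB = 10**9 bytes.
DAILY_SYNC_BYTES = 70 * 10**9
DAILY_FILE_COUNT = 30_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_file_size_mb(total_bytes: int, file_count: int) -> float:
    """Average size of one file in megabytes (10**6 bytes)."""
    return total_bytes / file_count / 10**6


def sustained_mbps(total_bytes: int, seconds: int) -> float:
    """Constant bit rate (Mbps) needed to move total_bytes in the given time."""
    return total_bytes * 8 / seconds / 10**6


if __name__ == "__main__":
    size = average_file_size_mb(DAILY_SYNC_BYTES, DAILY_FILE_COUNT)
    rate = sustained_mbps(DAILY_SYNC_BYTES, SECONDS_PER_DAY)
    print(f"Average file size: {size:.2f} MB")
    print(f"Sustained rate per destination: {rate:.2f} Mbps")
```

The rate is per destination; replicating to several design centers multiplies the traffic accordingly.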
Enhanced IP control, collaboration and global agility R & D QA R & D Sales & Marketing Supplier Support Manufacturing & Logistics Data stays in data center Access via LAN or WAN
Case studies
Case study and customer reference Manuel Killer, Project Manager CAx Technologies ABB Switzerland Ltd Power Electronics & MV Drives Global CAD access with HDX 3D Pro October 2011
Requirements from Business to IS Extended Engineering Workbench in India Global Document Mgmt Global Software Development and Engineering Tool Landscape Global Product Release Process Global Change Mgmt Process Global Product Development Process Global Development Global Engineering Global Production Engineers in India need to be able to work as if they were sitting in Switzerland Turgi
Challenges Of course there was more than one: 3D CAD data is large. Transferring our largest assemblies took 2.5 hours! ABB's corporate network: Latency, Bandwidth. Like one team. ABB Group, July 17, 2013
Learnings Service quality is a subjective matter. Using Dassault SolidWorks, 5-6 hours per day; Designers can work from India as if in Switzerland! [Chart: Latency effect (subjective scores). Axes: Latency [ms] versus System Quality [%]. Results without CloudBridge.]
HDX 3D Pro case study Wind turbine manufacturer Delivering PTC Pro/E and Dassault SolidWorks from Europe to other continents since 2008 (2,000 remote users) HDX 3D Pro protects Vestas intellectual property, supports workforce globalization, eliminates inconsistencies in engineering design versioning and overcomes regulatory challenges Reduced cost per running hour by 30% from 416 (traditional CAD workstations) to 291 (data center blade workstations) via follow-the-sun utilization (Denmark, UK, US, India, China) Citrix Confidential - Do Not Distribute
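The 30% figure can be checked directly from the two cost-per-running-hour values quoted on the slide. A minimal sketch follows; the currency unit is not given in the slide text, so the values are treated as plain numbers, and the function name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


if __name__ == "__main__":
    # 416 = traditional CAD workstations, 291 = data center blade workstations
    # (cost per running hour as quoted on the slide; unit not stated there).
    print(f"Reduction: {percent_reduction(416, 291):.1f}%")
```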
HDX 3D Pro case study Major European heavy vehicle manufacturer Access from Germany, Mexico and Brazil to Dassault CATIA apps hosted in Sweden At 220ms roundtrip latency, good performance working on models with 1500+ parts; bandwidth usage rarely reaches 2.5 Mbps Using 3D Space Mouse
HDX 3D Pro case study Daimler Digital Factory The Daimler Digitale Fabrik (Digital Factory) team can simulate an entire manufacturing plant in software HDX 3D Pro serves users across various Daimler plants and offices Siemens NX applications HP ws460c data center blade workstations NVIDIA Quadro Fermi Q2000 cards
Product Overview XenDesktop HDX 3D Pro
Segmenting the user population
Tier 1: Professional users (e.g. design engineers, radiologists): top rendering performance (dedicated GPU), deep compression on WAN links, 3D mouse
Tier 2: Power users (users who need to view/edit large 3D models): GPU sharing
Tier 3: Knowledge workers: highly shared GPU
XenDesktop HDX 3D Pro XenDesktop feature for high-end and second-tier 3D professional graphics GPU acceleration for hardware rendering of large 3D models Multiple compression options including deep compression codec for access over narrow WAN links VDI and Hosted Shared workloads
XenDesktop: Powerful and flexible infrastructure Universal client High-Definition User Experience Enterprise app store Flexible Desktop and App delivery Citrix Receiver NetScaler Gateway XenDesktop
VDI versus RDS (hosted shared) XenDesktop workloads
Tier 1: HDX 3D Pro on VDI (TOP PERFORMANCE)
ᵒ GPU acceleration of Direct3D, OpenGL, CUDA, OpenCL
ᵒ H.264-based Deep Compression
ᵒ Full desktop or seamless apps
ᵒ DirectX/OpenGL GPU sharing planned via GRID vGPU
ᵒ 3D mouse support
Tier 2: HDX 3D Pro on RDS (MOST COST-EFFECTIVE)
ᵒ GPU acceleration of Direct3D, OpenGL, CUDA*, OpenCL*
ᵒ H.264-based Deep Compression
ᵒ Full desktop or seamless apps
ᵒ DirectX/OpenGL GPU sharing
ᵒ Lower cost Microsoft licensing
ᵒ Apps must be RDS compatible
* Experimental pending field validation
What's new with HDX 3D Pro in XenDesktop 7?
Self-tuning codec technology
ᵒ Adaptive Display automatically detects transient and/or video images
ᵒ Image quality dynamically adapts to network bandwidth
ᵒ Fine Drawing codec eliminated; improved H.264 codec performs much better
HDX 3D Pro now available for Windows Server RDS workloads
ᵒ Adaptive H.264-based Deep Compression
ᵒ GPU acceleration and sharing for OpenGL and DirectX (including WPF), plus experimental support for CUDA and OpenCL
ᵒ Faster frame rate at higher resolutions compared to XenApp 6.5 GPU Sharing
What's new with HDX 3D Pro? (cont'd)
Auto screen resolution detection
ᵒ No longer necessary to disconnect/reconnect when changing resolution
5 versions of Receiver now include decoding of H.264-based Deep Compression: Windows, Linux, iOS, Mac, Android
HDX Monitor now reports on HDX 3D Pro
ᵒ Details on fps, codec, performance
ᵒ Replaces previous HDX 3D Pro Health Check Tool
Quad monitor support
ᵒ Not a hard limit, but we tested with up to 4 monitors with good performance
RDS-compatible professional graphics apps
Some examples from autodeskandcitrix.com, Citrix Ready site, etc.
Lots of Autodesk apps, including:
ᵒ AutoCAD
ᵒ Inventor
ᵒ Revit
ᵒ Navisworks
Bentley MicroStation
ESRI ArcGIS
Intergraph SmartPlant 3D
Adobe Photoshop
Citrix Ready partners for professional graphics on VDI
XenDesktop HDX 3D Pro Feature of XenDesktop Enterprise and Platinum editions Broad app compatibility OpenGL, DirectX (incl. WPF), CUDA, OpenCL No API hooking Blade/rack workstations are ideal, but any form factor can be used for the host Multiple users per host using GPU passthrough
HDX 3D Pro client options Desktop Virtualization for High-end Graphics Users
HDX 3D Pro on thin clients HDX Ready Premium thin clients support Deep Compression decoding Min. 1.6 GHz CPU required More to come, including lower cost HDX SoC devices Photos not to scale
Host requirements
XenServer 6.x or vSphere 5.1 or physical machine
Quad core CPU at 2.3 GHz or higher, or four vCPUs
4 GB of RAM minimum
GPU card supported by ISV
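The numeric minimums above lend themselves to a simple pre-flight check. The sketch below is illustrative only: `HostSpec` and `unmet_requirements` are hypothetical names, not part of any Citrix tool, and it assumes the 2.3 GHz floor applies to vCPUs as well as physical cores, which the slide does not spell out. Hypervisor version and ISV GPU support are not checked.

```python
from dataclasses import dataclass
from typing import List

# Minimums taken from the host requirements slide.
MIN_CORES = 4        # quad core CPU, or four vCPUs
MIN_CLOCK_GHZ = 2.3
MIN_RAM_GB = 4


@dataclass
class HostSpec:
    """Hypothetical description of a candidate HDX 3D Pro host."""
    cpu_cores: int    # physical cores or vCPUs
    cpu_ghz: float
    ram_gb: float


def unmet_requirements(host: HostSpec) -> List[str]:
    """Return a message for every minimum the host fails to meet."""
    problems = []
    if host.cpu_cores < MIN_CORES:
        problems.append(f"needs {MIN_CORES} cores/vCPUs, has {host.cpu_cores}")
    # Assumption: the clock floor is applied to virtual and physical CPUs alike.
    if host.cpu_ghz < MIN_CLOCK_GHZ:
        problems.append(f"needs {MIN_CLOCK_GHZ} GHz, has {host.cpu_ghz}")
    if host.ram_gb < MIN_RAM_GB:
        problems.append(f"needs {MIN_RAM_GB} GB RAM, has {host.ram_gb}")
    return problems


if __name__ == "__main__":
    for message in unmet_requirements(HostSpec(cpu_cores=2, cpu_ghz=2.0, ram_gb=8)):
        print(message)
```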
Seamless Application Delivery
End Point: Citrix Receiver
Host: XenDesktop VDA with HDX 3D Pro
Connection between End Point and Host: ICA
Compression Options Users can switch between codecs if desired: Deep Compression codec (default) CPU-based, min. 1.5 Mbps bandwidth (see next slide for real world examples) 100% lossless compression (e.g. medical imaging)
Deep Compression codec technology Customer-reported bandwidth utilization on long-haul connections First user requires 1.5 to 2 Mbps minimum Heavy equipment manufacturer: Branch with 12 concurrent users requires 700-800 Kbps per user Control valves manufacturer: 20 Mbps WAN link serves branch with 17 users, i.e. 1.2 Mbps/user Bandwidth requirement does not scale linearly
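The customer figures above can be reproduced with simple arithmetic, which also shows why the requirement does not scale linearly: the per-user share at 12 or 17 users is well below the 1.5 to 2 Mbps needed for the first user. A minimal sketch, with illustrative function names:

```python
def per_user_mbps(link_mbps: float, users: int) -> float:
    """Average share of a WAN link per concurrent user, in Mbps."""
    return link_mbps / users


def aggregate_kbps(per_user_kbps: int, users: int) -> int:
    """Total branch bandwidth in Kbps for a given per-user rate."""
    return per_user_kbps * users


if __name__ == "__main__":
    # Control valves manufacturer: 20 Mbps link, 17 users.
    print(f"Per user: {per_user_mbps(20, 17):.1f} Mbps")
    # Heavy equipment manufacturer: 12 users at 700-800 Kbps each.
    low, high = aggregate_kbps(700, 12), aggregate_kbps(800, 12)
    print(f"Branch total: {low}-{high} Kbps")
```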
Lossless Pixel-Perfect Compression Lossless (pixel-perfect) for Medical Images Lossless Systray icon Text displayed on hovering the mouse over the icon
Support for up to 4 monitors Citrix Receiver for Windows or Linux Efficient use of bandwidth
3D mouse support available on VDI USB redirection for 3D mice and similar devices Virtual Channel can be prioritized to maximize responsiveness
Citrix CloudBridge for WAN optimization Ideal for low bandwidth and high latency connections Improves responsiveness of apps delivered via HDX 3D Pro over high latency connections Reduces bandwidth consumption, enabling more users to share a given size of pipe (e.g. ABB reports 3:1 improvement at just 5 users)
GPU sharing for RDS workloads
Usually one VM per GPU (and one GPU per VM)
ᵒ On bare metal with OpenGL apps, multiple GPUs can serve one VM, but in general we recommend one GPU per VM using a hypervisor that supports GPU passthrough
Each VM is a multi-user Windows Server RDS workload
XenServer GPU Passthrough now supports up to 12 GPUs per server
ᵒ But typical high-end configuration is 3x NVIDIA GRID K2 for a total of 6 GPUs
Direct access to graphics driver and hardware, unlike software-based vGPU
User density depends on the apps, GPU processing power, video RAM, etc.
ᵒ No fixed limit; one customer reports 32 users on a Q6000 with Dassault 3DVIA player
RDS limitation: One user could impact performance of other users
ᵒ Recommend capping the number of users per VM
Available now!
GPU Passthrough (single-user & multi-user VMs)
Reduced cost per user
Introduced in XenServer 6 (October 2011)
Now also in vSphere/ESX with vDGA
Multiple GPUs per host
Servers with up to 12 GPUs currently on XenServer HCL
One graphics-accelerated VM (single-user or multi-user) per GPU
Depending on CPU power, same host may also support regular office workers
[Diagram: four VMs running on one hypervisor]
NVIDIA GRID K1 and NVIDIA GRID K2
Specification       GRID K1                    GRID K2
GPU                 4 Kepler GPUs              2 High End Kepler GPUs
CUDA cores          768 (192 per GPU)          3072 (1,536 per GPU)
Memory Size         16GB DDR3 (4GB per GPU)    8GB GDDR5 (4GB per GPU)
Max Power           130 W                      225 W
Cooling solution    Passive                    Passive
OpenGL              4.3                        4.3
DirectX             11                         11
GRID vGPU support   Yes                        Yes
Number of users depends on software solution, workload, and screen resolution
NVIDIA Confidential
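The per-card totals follow from the per-GPU figures, and the same arithmetic gives the GPU count for a multi-card host such as the three-card GRID K2 configuration mentioned earlier in the deck. A small sketch; the `GridCard` class and `gpus_in_host` helper are illustrative names, not an NVIDIA or Citrix API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCard:
    """Per-GPU figures for one GRID card, as listed on the slide."""
    name: str
    gpus: int
    cuda_cores_per_gpu: int
    memory_gb_per_gpu: int

    @property
    def total_cuda_cores(self) -> int:
        return self.gpus * self.cuda_cores_per_gpu

    @property
    def total_memory_gb(self) -> int:
        return self.gpus * self.memory_gb_per_gpu


K1 = GridCard("GRID K1", gpus=4, cuda_cores_per_gpu=192, memory_gb_per_gpu=4)
K2 = GridCard("GRID K2", gpus=2, cuda_cores_per_gpu=1536, memory_gb_per_gpu=4)


def gpus_in_host(card: GridCard, cards_per_host: int) -> int:
    """Number of physical GPUs available for passthrough in one host."""
    return card.gpus * cards_per_host


if __name__ == "__main__":
    for card in (K1, K2):
        print(card.name, card.total_cuda_cores, "CUDA cores,",
              card.total_memory_gb, "GB")
    print("3x GRID K2:", gpus_in_host(K2, 3), "GPUs")
```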
GPU Passthrough: XenDesktop Windows VMs [Diagram: 3D Pro VMs and non-3D VMs run side by side on the XenServer hypervisor and hardware platform; each 3D Pro VM is backed by its own GPU]
GPU Passthrough with RDS workloads [Diagram: user sessions 1 to N are spread across XenApp Windows Server VMs on the XenServer hypervisor; each XenApp VM is backed by its own GPU on the hardware platform]
XenDesktop GPU Sharing on hosted-shared
Multiple concurrent users per GPU
Ideal for second tier users of 3D professional graphics
Supports all versions of DirectX and OpenGL
GPU Sharing for DirectX has been available since XenApp 6.0
XenApp 6.5 OpenGL GPU Sharing feature add-on was introduced in March at GTC
Included in XenDesktop 7 for Hosted Shared workloads
Works with Fermi-generation NVIDIA Quadro cards and with the latest Kepler-architecture GRID K2 (higher user density)
Directly leverages the GPU video driver (unlike API Intercept vGPU)
Includes experimental support for CUDA and OpenCL
GPU sharing scalability With two NVIDIA Quadro 4000 cards we ran 9 users per GPU using a test app that works with ESRI ArcGIS, and we still had space for more Running Dassault SolidWorks, Ansys Workbench and Fluent, scalability was 6 to 10 users per Quadro 4000 The Quadro 6000 was able to support 30 users running Dassault 3DVIA Composer Player with only minor slowdown; and this test was harder on the graphics card than the real world is! We are getting 30 users of SAP Right Hemisphere 3D on a physical XenApp 6.5 server with a Quadro 2000 card New NVIDIA GRID K2 introduces even higher user densities!
GPU sharing for VDI workloads (coming soon)
GPU sharing for single-user Windows desktop VDI workloads requires GPU virtualization (vGPU)
Earlier vGPU technologies (Microsoft Hyper-V RemoteFX, VMware vSphere/ESX vSGA) are software-based (API Intercept approach)
ᵒ Designed for less demanding knowledge worker use cases
ᵒ Limited to smaller 3D models due to data transfer from user session to session 0
ᵒ Limited to older versions of DirectX/OpenGL
XenServer/NVIDIA GRID vGPU is hardware-based
ᵒ High performance, even with large models
ᵒ Supports the latest versions of DirectX and OpenGL
Public Tech Preview September 2013
GPU Virtualization: XenDesktop Windows VMs [Diagram: 3D Pro VMs 1 to N on the XenServer hypervisor, each with its own vGPU, sharing the physical GPUs of the hardware platform]
NVIDIA GRID Enabled OEM Platforms (Available H1 2013 / Available H2 2013)
Cisco UCS C240 M3: 2 GRID K1 or 2 GRID K2
HP ProLiant SL250 Gen8: 2 GRID K2
Dell PowerEdge R720: 2 GRID K1 or 2 GRID K2
HP ProLiant SL270: 4+ GRID K2
IBM iDataPlex dx360 M4: 2 GRID K1 or 2 GRID K2
HP ProLiant WS460c Gen8: 1 GRID K1 or 1 GRID K2
Asus ESC 4000 G2: 4 GRID K2
SuperMicro SYS-1027-TRF: 2 GRID K1 or 3 GRID K2
SuperMicro SYS-2027-TRF: 2 GRID K1 or 3 GRID K2
http://hcl.vmd.citrix.com/gpupass-throughdevicelist.aspx
Summary: Citrix solution for 3D graphics Proven solution for high-end 3D graphics delivery Best WAN performance on the market Lowest cost per user Access from any device First vendor to adopt NVIDIA GRID API and offer high performance GPU sharing
Additional Information Resources: www.citrix.com/xendesktop/hdx3d/ Blog: www.blogs.citrix.com/product/xendesktop/ Twitter: @xendesktop
Check out GRID talks @ SIGGRAPH Next Week
State-of-the-Art of Virtualized Graphics: Tuesday, July 23rd 10:40 AM PDT; Wednesday, July 24th 1:20 PM PDT; Thursday, July 25th 11:20 AM PDT
Bunkspeed: Bringing NVIDIA iray and Ease of Use to Designers: Thursday, July 25th 1:20 PM PDT
GTC 2014 Call for Submissions Looking for submissions in the fields of Science and research Professional graphics Mobile computing Automotive applications Game development Cloud computing Submit at www.gputechconf.com
Work better. Live better.