GPU Accelerated Signal Processing in OpenStack. John Paul Walters. Computer Scien5st, USC Informa5on Sciences Ins5tute jwalters@isi.



Similar documents
CUDA in the Cloud Enabling HPC Workloads in OpenStack With special thanks to Andrew Younge (Indiana Univ.) and Massimo Bernaschi (IAC-CNR)

Mobile Cloud Computing T Open Source IaaS

A general-purpose virtualization service for HPC on cloud computing: an application to GPUs

SR-IOV In High Performance Computing

Toward a practical HPC Cloud : Performance tuning of a virtualized HPC cluster

An Experimental Study of Load Balancing of OpenNebula Open-Source Cloud Computing Platform

OpenStack: we drink our own Champagne. Teun Docter Software developer

2) Xen Hypervisor 3) UEC

High Performance Computing in CST STUDIO SUITE

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager Product Marketing Manager

2972 Linux Options and Best Practices for Scaleup Virtualization

Virtualization & Cloud Computing (2W-VnCC)

KVM: A Hypervisor for All Seasons. Avi Kivity avi@qumranet.com

The High Performance Internet of Things: using GVirtuS for gluing cloud computing and ubiquitous connected devices

RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vsphere 5 R E S E A R C H N O T E

The QEMU/KVM Hypervisor

VERSUS VMWARE VSPHERE

Cloud Computing through Virtualization and HPC technologies

Beyond the Hypervisor

Cisco Application-Centric Infrastructure (ACI) and Linux Containers

ACANO SOLUTION VIRTUALIZED DEPLOYMENTS. White Paper. Simon Evans, Acano Chief Scientist

Operating Systems Virtualization mechanisms

Cloud Computing. Alex Crawford Ben Johnstone

PRIVATE CLOUD PLATFORM OPTIONS. Stephen Lee CEO, ArkiTechs Inc.

T Mobile Cloud Computing Private Cloud & Assignment

Ubuntu OpenStack on VMware vsphere: A reference architecture for deploying OpenStack while limiting changes to existing infrastructure

Scientific Computing Data Management Visions

HP ProLiant SL270s Gen8 Server. Evaluation Report

NVIDIA GRID OVERVIEW SERVER POWERED BY NVIDIA GRID. WHY GPUs FOR VIRTUAL DESKTOPS AND APPLICATIONS? WHAT IS A VIRTUAL DESKTOP?

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions

Comparing Open Source Private Cloud (IaaS) Platforms

Virtualization Performance on SGI UV 2000 using Red Hat Enterprise Linux 6.3 KVM

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o

CON9577 Performance Optimizations for Cloud Infrastructure as a Service

Sistemi Operativi e Reti. Cloud Computing

Comparing Ganeti to other Private Cloud Platforms. Lance Albertson

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK

Today. 1. Private Clouds. Private Cloud toolkits. Private Clouds and OpenStack Introduction

Enabling Technologies for Distributed and Cloud Computing

Software Testing and Deployment Using Virtualization and Cloud

Enabling Technologies for Distributed Computing

OPEN CLOUD INFRASTRUCTURE BUILT FOR THE ENTERPRISE

RED HAT ENTERPRISE VIRTUALIZATION FOR SERVERS: COMPETITIVE PRICING GUIDE

Cloud^H^H^H^H^H Virtualization Technology. Andrew Jones May 2011

OpenStack Introduction. November 4, 2015

Introduction to Virtualization & KVM

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment

I/O Virtualization Using Mellanox InfiniBand And Channel I/O Virtualization (CIOV) Technology

Red Hat Enterprise Virtualization Performance. Mark Wagner Senior Principal Engineer, Red Hat June 13, 2013

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud

Open Cirrus: Towards an Open Source Cloud Stack

How To Install Eucalyptus (Cont'D) On A Cloud) On An Ubuntu Or Linux (Contd) Or A Windows 7 (Cont') (Cont'T) (Bsd) (Dll) (Amd)

Full and Para Virtualization

A cure for Virtual Insanity: A vendor-neutral introduction to virtualization without the hype

An Introduction to OpenStack and its use of KVM. Daniel P. Berrangé

Virtual Switching Without a Hypervisor for a More Secure Cloud

Intro to Virtualization

Overview: Building Open Source Cloud Computing Environments

Deploying Business Virtual Appliances on Open Source Cloud Computing

Computing Service Provision in P2P Clouds

OpenStack Alberto Molina Coballes

Computing in High- Energy-Physics: How Virtualization meets the Grid

FREE AND OPEN SOURCE SOFTWARE FOR CLOUD COMPUTING SERENA SPINOSO FULVIO VALENZA

ZEN LOAD BALANCER EE v3.04 DATASHEET The Load Balancing made easy

Performance of Network Virtualization in Cloud Computing Infrastructures: The OpenStack Case.

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

HPC performance applications on Virtual Clusters

Basics of Virtualisation

RED HAT ENTERPRISE VIRTUALIZATION PERFORMANCE: SPECVIRT BENCHMARK

How To Build A Cloud Stack For A University Project

OpenStack IaaS. Rhys Oxenham OSEC.pl BarCamp, Warsaw, Poland November 2013

Performance of the Cloud-Based Commodity Cluster. School of Computer Science and Engineering, International University, Hochiminh City 70000, Vietnam

Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP

The Xen of Virtualization

KVM Virtualized I/O Performance

Vocera Voice 4.3 and 4.4 Server Sizing Matrix

9/26/2011. What is Virtualization? What are the different types of virtualization.

PNY Professional Solutions NVIDIA GRID - GPU Acceleration for the Cloud

ZEN LOAD BALANCER EE v3.02 DATASHEET The Load Balancing made easy

Masters Project Proposal

RUNNING vtvax FOR WINDOWS

Open Source Cloud Computing: Characteristics and an Overview

Cloud and Virtualization to Support Grid Infrastructures

How to Configure Intel Ethernet Converged Network Adapter-Enabled Virtual Functions on VMware* ESXi* 5.1

Comparison of Open Source Cloud System for Small and Medium Sized Enterprises

GRID SDK FAQ. PG January Frequently Asked Questions

White Paper. Recording Server Virtualization

KVM, OpenStack, and the Open Cloud

DOCLITE: DOCKER CONTAINER-BASED LIGHTWEIGHT BENCHMARKING ON THE CLOUD

How To Compare Cloud Computing To Cloud Platforms And Cloud Computing

Transcription:

GPU Accelerated Signal Processing in OpenStack John Paul Walters Computer Scien5st, USC Informa5on Sciences Ins5tute jwalters@isi.edu

Outline Motivation OpenStack Background Heterogeneous OpenStack GPU Performance across hypervisors Current Status Future work 2

Motivation Scientific workloads demand increasing performance with greater power efficiency Architectures have been driven towards specialization, heterogeneity Infrastructure-as-a-Service (IaaS) clouds can democratize access to the latest, most powerful accelerators Then why are most of today s clouds homogeneous? Of the major providers, only Amazon offers virtual machine access to GPUs in the public cloud 3

Cloud Computing and GPUs GPU passthrough has historically been hard Specific to particular GPUs, hypervisors, host OS Legacy VGA BIOS support, etc. Today we can access GPUs through most of the major hypervisors KVM, VMWare ESXi, Xen, LXC Combine this with a heterogeneous cloud 4

OpenStack Background OpenStack founded by Rackspace and NASA In use by Rackspace, HP, and others for their public clouds Open source with hundreds of participating companies In use for both public and private clouds Current stable release: OpenStack Havana OpenStack Icehouse to be released in April 5 120 100 80 60 40 20 0 Google Trends Searches for Common Open Source IaaS Projects openstack cloudstack opennebula eucalyptus cloud

OpenStack Architecture Image source: http://docs.openstack.org/training-guides/content/module001-ch004-openstackarchitecture.html 6

Supporting GPUs We re pursuing multiple approaches for GPU support in OpenStack LXC support for container-based VMs Xen support for fully virtualized guests KVM support for fully virtualized guests, SR-IOV Also compare against VMWare ESXi Our OpenStack work currently supports GPU-enabled LXC containers Xen prototype implementation as well Given widespread support for GPUs across hypervisors, does hypervisor choice impact performance? 7 7

GPU Performance Across Hypervisors 2 CPU Architectures, 2 GPU Architectures, 4 Hypervisors Sandy Bridge + Kepler, Westmere + Fermi KVM, Xen, LXC, VMWare ESXi Standardize on a common CentOS 6.4 base system for comparison Same 2.6.32-358.23.2 kernel across all guests and LXC 8

Hardware Setup Sandy Bridge + Kepler Westmere + Fermi CPU (cores) 2xE5-2670 (16) 2xX5660 (12) Clock Speed 2.6 GHz 2.6 GHz RAM 48 GB 192 GB NUMA Nodes 2 2 GPU 1xK20m 2xC2075 9

Hypervisor Configuration Hypervisor Linux Kernel Linux Distro KVM 3.12 Arch 2013.10.01 Xen 4.3.0-7 3.12 (dom0) Arch 2013.10.01 VMWare ESXi 5.5.0 N/A N/A LXC 2.6.32-358.23.2 CentOS 6.4 10

Benchmarks 3 Benchmarks SHOC OpenCL: signal processing GPU-LIBSVM: big data, machine learning HOOMD: Molecular dynamics, GPUDirect Virtual machines: CentOS 6.4 with 2.6.32-358.23.2 kernel, 20 GB RAM, and 1 CPU socket Control for NUMA effects 11

K20 Results - SHOC RelaGve Performance 1.01 1 0.99 0.98 0.97 0.96 SHOC Performance for Common Signal Processing Kernels KVM Xen LXC VMWare 12

K20 Results SHOC Outliers RelaGve Performance 1.06 1.04 1.02 1 0.98 0.96 0.94 SHOC OpenCL Level 1, Level 2 Outliers KVM Xen LXC VMWare 13

C2075 Results - SHOC RelaGve Performance 1.02 1 0.98 0.96 0.94 0.92 0.9 0.88 0.86 0.84 0.82 SHOC Performance for Common Signal Processing Kernels KVM Xen LXC VMWare 14

C2075 Results SHOC Outliers RelaGve Performance 1.1 1 0.9 0.8 0.7 0.6 SHOC OpenCL Level 1, Level 2 Outliers KVM Xen LXC VMWare 15

SHOC Observations Overall both Fermi and Kepler systems perform nearnative This is especially true for KVM and LXC Xen on the C2075 system shows some overhead Likely because Xen couldn t activate large page tables Some unexpected performance improvement for Kepler Spmv 16

K20 Results GPU-LIBSVM 1.02 GPU- LIBSVM RelaGve Performance 1 RelaGve Performance 0.98 0.96 0.94 0.92 0.9 0.88 1800 3600 4800 6000 # of training instances KVM Xen LXC VMWare 17

C2075 Results GPU-LIBSVM 1.4 GPU- LIBSVM RelaGve Performance 1.2 RelaGve Performance 1 0.8 0.6 0.4 0.2 0 1800 3600 4800 6000 # of training instances KVM Xen LXC VMWare 18

GPU-LIBSVM Observations Unexpected performance improvement for KVM on both systems Most pronounced on Westmere/Fermi platform This is due to the use of transparent hugepages (THP) Back the entire guest memory with hugepages Improves TLB performance Disabling hugepages on Westmere/Fermi platform reduces performance to 80-87% of the base system 19

Multi-GPU with GPUDirect Many real applications extend beyond a single node s capabilities Test multi-node performance with Infiniband SR-IOV and GPUDirect 2 Sandy Bridge nodes equipped with K20 GPUs ConnectX-3 IB with SR-IOV enabled Ported Mellanox OFED 2.1-1 to 3.13 kernel KVM hypervisor Test with HOOMD, a commonly used GPUDirect-enabled particle dynamics simulator 20

GPUDirect Advantage Image source: http://old.mellanox.com/content/pages.php?pg=products_dyn&product_family=116 21

HOOMD Performance HOOMD Lennard- Jones Liquid MD RelaGve Performance RelaGve Performance 1.02 1 0.98 0.96 0.94 0.92 16k 32k 64k 128k 256k 512k Number of ParGcles 22

Current Status Source code is available now https://github.com/usc-isi/nova Includes support for heterogeneity GPU-enabled LXC instances Bare-metal provisioning Architecture-aware scheduler Prototype Xen with GPU passthrough implementation 23

Future Work Primary focus: multi-node Greater range of applications, larger systems Integrate GPU passthrough support for KVM This might come free with the existing OpenStack PCI passthrough work NUMA support This work assumes perfect NUMA mapping OpenStack should be NUMA-aware 24