Achieving Performance Isolation with Lightweight Co-Kernels


Achieving Performance Isolation with Lightweight Co-Kernels
Jiannan Ouyang, Brian Kocoloski, John Lange (The Prognostic Lab @ University of Pittsburgh)
Kevin Pedretti (Sandia National Laboratories)
HPDC 2015

HPC Architecture (slide 2)
[Figure: traditional vs. in situ data processing. Traditional: the simulation runs on the supercomputer and analytics/visualization on a separate processing cluster, coupled through shared storage. In situ: both run on the compute node under a shared operating system and runtime (OS/R).]
- Problem: massive data movement over the interconnects
- Approach: move computation to the data
  - Improved data locality
  - Reduced power consumption
  - Reduced network traffic

Challenge: Predictable High Performance (slide 3)
- Tightly coupled HPC workloads are sensitive to OS noise and overhead [Petrini SC'03, Ferreira SC'08, Hoefler SC'10]
- Specialized kernels provide predictable performance:
  - Tailored from Linux: CNL for Cray supercomputers
  - Lightweight kernels (LWKs) developed from scratch: IBM CNK, Kitten
- Data processing workloads favor Linux environments
- Cross-workload interference:
  - Shared hardware (CPU time, cache, memory bandwidth)
  - Shared system software
- How can we provide both Linux and specialized kernels on the same node while ensuring performance isolation?

Approach: Lightweight Co-Kernels (slide 4)
[Figure: a node running simulation and analytics/visualization together on Linux, versus the same workloads split between Linux and an LWK, each on its own hardware partition.]
- Hardware resources on one node are dynamically composed into multiple partitions, or enclaves
- Independent software stacks are deployed on each enclave, optimized for particular applications and hardware
- Performance isolation at both the software and the hardware level

Agenda (slide 5): Introduction · The Pisces Lightweight Co-Kernel Architecture · Implementation · Evaluation · Related Work · Conclusion

Building Blocks: Kitten and Palacios (slide 6)
- The Kitten Lightweight Kernel (LWK)
  - Goal: provide predictable performance for massively parallel HPC applications
  - Simple resource management policies
  - Limited kernel I/O support + direct user-level network access
- The Palacios Lightweight Virtual Machine Monitor (VMM)
  - Goal: predictable performance
  - Lightweight resource management policies
  - Established history of providing virtualized environments for HPC [Lange et al. VEE'11, Kocoloski and Lange ROSS'12]
- Kitten: https://software.sandia.gov/trac/kitten
- Palacios: http://www.prognosticlab.org/palacios · http://www.v3vee.org/

The Pisces Lightweight Co-Kernel Architecture (slide 7)
[Figure: one node partitioned by Pisces. Linux runs ordinary applications and virtual machines on one partition; Kitten co-kernel (1) hosts an isolated virtual machine under the Palacios VMM; Kitten co-kernel (2) hosts an isolated native application.]
Design goals:
- Performance isolation at both the software and hardware level
- Dynamic creation of resizable enclaves
- Isolated virtual environments
http://www.prognosticlab.org/pisces/

Design Decisions (slide 8)
- Elimination of cross-OS dependencies
  - Each enclave must implement its own complete set of supported system calls
  - No system call forwarding is allowed
- Internalized management of I/O
  - Each enclave must provide its own I/O device drivers and manage its hardware resources directly
- Userspace cross-enclave communication
  - Cross-enclave communication is not a kernel-provided feature
  - Cross-enclave shared memory is set up explicitly at runtime (XEMEM); see the sketch below
- Virtualization is used to provide missing OS features
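To make the shared-memory point concrete, here is a minimal sketch of the XEMEM usage pattern: a producer exports a named shared-memory segment and a consumer attaches to it by name, entirely from userspace. Since XEMEM itself is not shown in the deck, the sketch emulates the pattern with POSIX shared memory; on a Pisces system the shm_open()/mmap() pair would be replaced by XEMEM's export and attach operations, and the two processes would live in different enclaves.

```c
/* Emulation of the XEMEM pattern with POSIX shared memory: the
 * segment is established explicitly at runtime under a well-known
 * name, and data then moves with no kernel-level forwarding. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SEG_NAME "/sim-output"   /* well-known segment name (example) */
#define SEG_SIZE (1 << 20)

int main(int argc, char **argv)
{
    int producer = (argc > 1 && strcmp(argv[1], "produce") == 0);

    /* Producer creates the segment; consumer attaches by name. */
    int fd = shm_open(SEG_NAME, producer ? (O_CREAT | O_RDWR) : O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (producer && ftruncate(fd, SEG_SIZE) < 0) { perror("ftruncate"); return 1; }

    char *buf = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    if (producer)
        strcpy(buf, "timestep 42 results...");    /* simulation writes in place */
    else
        printf("read across segment: %s\n", buf); /* analytics reads, zero copy */
    return 0;
}
```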

Cross Kernel Communication (slide 9)
[Figure: in user context, a control process and Linux-compatible workloads on the Linux hardware partition talk to a control process, isolated processes, and virtual machines on the Kitten co-kernel partition over shared-memory communication channels and a shared-memory control channel; in kernel context, the two kernels exchange cross-kernel messages.]
XEMEM: Efficient Shared Memory for Composed Applications on Multi-OS/R Exascale Systems [Kocoloski and Lange, HPDC'15]

Agenda (slide 10): Introduction · The Pisces Lightweight Co-Kernel Architecture · Implementation · Evaluation · Related Work · Conclusion

Challenges & Approaches (slide 11)
- How to boot a co-kernel?
  - Hot-remove resources from Linux and load the co-kernel (the hot-remove step is sketched after this list)
  - Reuse Linux boot code with a modified target kernel address
  - Restrict the Kitten co-kernel to its assigned resources only
- How to share hardware resources among kernels?
  - Hot-remove from Linux + direct assignment and adjustment (e.g., CPU cores, memory blocks, PCI devices)
  - Some resources are managed jointly by Linux and Pisces (e.g., the IOMMU)
- How to communicate with a co-kernel?
  - Kernel level: IPI + shared memory, primarily for Pisces commands
  - Application level: XEMEM [Kocoloski HPDC'15]
- How to route device interrupts? (next slide)
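The hot-remove step can be illustrated with Linux's standard sysfs hotplug interfaces. Pisces performs this from its kernel module together with loading the co-kernel image; the userspace sketch below shows only how a CPU core and a memory block are offlined so they can be handed to a co-kernel. The CPU and memory-block numbers are arbitrary examples, and it must run as root.

```c
/* Offline a CPU core and a memory block via sysfs so that they are
 * no longer used by Linux and can be assigned to a co-kernel. */
#include <stdio.h>

static int sysfs_write(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    int rc = (fputs(val, f) < 0) ? -1 : 0;
    fclose(f);
    return rc;
}

int main(void)
{
    /* Linux migrates tasks and IRQs off the core, leaving it idle. */
    if (sysfs_write("/sys/devices/system/cpu/cpu5/online", "0"))
        return 1;

    /* Memory is offlined in blocks; on x86_64 a block is typically
     * 128MB, matching the granularity in the Pisces numbers below. */
    if (sysfs_write("/sys/devices/system/memory/memory32/state", "offline"))
        return 1;

    puts("cpu5 and memory block 32 removed from Linux");
    return 0;
}
```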

I/O Interrupt Routing (slide 12)
[Figure: two schemes. Legacy interrupt forwarding: a legacy device's INTx interrupt reaches the management kernel's IRQ forwarder through the IO-APIC and is forwarded to the co-kernel's IRQ handler by IPI. Direct device assignment (w/ MSI): an MSI/MSI-X device's interrupts are delivered straight to the IRQ handler of the kernel that owns the device.]
- Legacy interrupt vectors are potentially shared among multiple devices
  - Pisces provides an IRQ forwarding service (sketched below)
  - For PCI devices, IRQ forwarding is only used during initialization
- Modern PCI devices support dedicated interrupt vectors (MSI/MSI-X)
  - These are routed directly to the corresponding enclave
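Below is a minimal sketch of what the legacy-IRQ forwarding path could look like inside the management kernel's Pisces module. request_irq()/free_irq() and the handler prototype are the standard Linux APIs; pisces_send_ipi() is a hypothetical stand-in for programming a raw IPI (target APIC ID + vector) through the local APIC, since the destination core has been removed from Linux and is not reachable through Linux's own SMP calls.

```c
/* Legacy IRQ forwarding sketch: catch a (possibly shared) INTx line
 * in Linux and re-inject it into the co-kernel enclave as an IPI. */
#include <linux/interrupt.h>
#include <linux/module.h>

struct pisces_irq_fwd {
    unsigned int linux_irq;  /* legacy line as seen by Linux */
    unsigned int dst_apic;   /* APIC ID of a core owned by the co-kernel */
    unsigned int dst_vector; /* vector the co-kernel's handler expects */
};

/* Hypothetical helper: write the IPI into the local APIC's ICR. */
void pisces_send_ipi(unsigned int apic_id, unsigned int vector);

static irqreturn_t pisces_irq_forwarder(int irq, void *data)
{
    struct pisces_irq_fwd *fwd = data;

    /* Legacy lines may be shared, so the line is forwarded
     * unconditionally and the co-kernel's handler checks its device. */
    pisces_send_ipi(fwd->dst_apic, fwd->dst_vector);
    return IRQ_HANDLED;
}

static struct pisces_irq_fwd fwd = {
    .linux_irq = 17, .dst_apic = 6, .dst_vector = 0xe9, /* example values */
};

static int __init fwd_init(void)
{
    /* IRQF_SHARED: coexist with Linux drivers on the same line. */
    return request_irq(fwd.linux_irq, pisces_irq_forwarder,
                       IRQF_SHARED, "pisces-irq-fwd", &fwd);
}

static void __exit fwd_exit(void)
{
    free_irq(fwd.linux_irq, &fwd);
}

module_init(fwd_init);
module_exit(fwd_exit);
MODULE_LICENSE("GPL");
```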

Implementation (slide 13)
- Pisces: a Linux kernel module supporting unmodified Linux kernels (2.6.3x – 3.x.y)
  - Co-kernel initialization and management (the general shape of such a module is sketched below)
- Kitten (~9,000 lines of code changed)
  - Management of assigned hardware resources
  - Dynamic resource assignment
  - Kernel-level communication channel
- Palacios (~5,000 lines of code changed)
  - Dynamic resource assignment
  - Command forwarding channel
- Pisces: http://www.prognosticlab.org/pisces/
- Kitten: https://software.sandia.gov/trac/kitten
- Palacios: http://www.prognosticlab.org/palacios · http://www.v3vee.org/
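For orientation, here is the general shape of such a control module: a loadable Linux kernel module exposing a character device through which userspace issues enclave-management commands. The device name and ioctl numbers are hypothetical stand-ins, not Pisces' actual control interface; the real work each command would do is indicated only in comments.

```c
/* Skeleton of a Pisces-style control module (hypothetical interface). */
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/module.h>

#define ENCLAVE_IOC_BOOT    _IOW('p', 1, unsigned long)
#define ENCLAVE_IOC_ADD_CPU _IOW('p', 2, unsigned long)

static long enclave_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
{
    switch (cmd) {
    case ENCLAVE_IOC_BOOT:
        /* Real module: offline the resources, load the Kitten image
         * into the reserved memory, and start its boot CPU using the
         * reused Linux boot code with a modified target address. */
        pr_info("pisces-sketch: boot co-kernel (image handle %lu)\n", arg);
        return 0;
    case ENCLAVE_IOC_ADD_CPU:
        /* Real module: hot-remove the CPU from Linux, then tell the
         * co-kernel over the shared-memory command channel. */
        pr_info("pisces-sketch: assign cpu %lu to enclave\n", arg);
        return 0;
    default:
        return -ENOTTY;
    }
}

static const struct file_operations enclave_fops = {
    .owner          = THIS_MODULE,
    .unlocked_ioctl = enclave_ioctl,
};

static struct miscdevice enclave_dev = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "pisces-sketch",
    .fops  = &enclave_fops,
};

static int __init enclave_init(void) { return misc_register(&enclave_dev); }
static void __exit enclave_exit(void) { misc_deregister(&enclave_dev); }

module_init(enclave_init);
module_exit(enclave_exit);
MODULE_LICENSE("GPL");
```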

Agenda (slide 14): Introduction · The Pisces Lightweight Co-Kernel Architecture · Implementation · Evaluation · Related Work · Conclusion

Evaluation (slide 15)
- 8-node Dell R420 cluster
  - Two six-core Intel Ivy Bridge Xeon processors per node
  - 24GB RAM split across two NUMA domains
  - QDR InfiniBand
  - CentOS 7, Linux kernel 3.16
- For the performance isolation experiments, the hardware is partitioned by NUMA domain: Linux on one NUMA domain, the co-kernel on the other

Fast Pisces Management Operations (slide 16)

Operation                      Latency (ms)
Booting a co-kernel            265.98
Adding a single CPU core        33.74
Adding a 128MB memory block     82.66
Adding an Ethernet NIC         118.98

Eliminating Cross Kernel Dependencies (slide 17)

Execution time of getpid() (a measurement loop is sketched below):

                 solitary workloads (us)   w/ other workloads (us)
Linux                 3.05                      3.48
co-kernel fwd         6.12                     14.00
co-kernel             0.39                      0.36

- The co-kernel has the best average performance
- The co-kernel has the most consistent performance
- System call forwarding has longer latency and suffers from cross-stack performance interference
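Numbers like these come from timing getpid() round trips in a tight loop; the sketch below shows one plausible way to do it (an assumption, as the paper's exact methodology is not shown here). syscall(SYS_getpid) is used so that a library-cached PID cannot short-circuit the kernel entry.

```c
/* Average getpid() latency over many iterations. */
#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define ITERS 1000000L

int main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < ITERS; i++)
        syscall(SYS_getpid);          /* one full kernel round trip */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("getpid: %.2f us/call on average\n", ns / ITERS / 1e3);
    return 0;
}
```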

Noise Analysis (slide 18)
[Figure: latency (us, 0–20) of each OS interruption over a 5-second window, for Linux and for the Kitten co-kernel, each (a) without and (b) with competing workloads; each point represents the latency of one OS interruption.]
Co-kernel: less noise + better isolation. (A measurement sketch follows.)
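Traces like these can be collected with a selfish-detour style benchmark (an assumption about methodology; the deck does not name the tool): a tight loop timestamps continuously, and any gap well above the loop's own cost marks an OS interruption whose length is that interruption's latency.

```c
/* Record OS interruptions: timestamp in a tight loop and report
 * every gap larger than a noise threshold. */
#include <stdio.h>
#include <time.h>

#define RUN_SECONDS 5
#define THRESH_NS   2000L  /* gaps above 2 us count as OS noise */

static long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000L + ts.tv_nsec;
}

int main(void)
{
    long start = now_ns(), prev = start;

    while (prev - start < RUN_SECONDS * 1000000000L) {
        long t = now_ns();
        if (t - prev > THRESH_NS)  /* the OS held the core for this long */
            printf("%.6f s\t%.2f us\n",
                   (t - start) / 1e9, (t - prev) / 1e3);
        prev = t;
    }
    return 0;
}
```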

Single Node Performance (slide 19)
[Figure: left panel, CoMD completion time (seconds) with and without background workloads; right panel, Stream throughput with and without background workloads; each compares CentOS, Kitten/KVM, and the co-kernel.]
Co-kernel: consistent performance + performance isolation

8 Node Performance (slide 20)
[Figure: throughput (GFLOP/s) vs. number of nodes (1–8) for co-VMM, native, and KVM, each with and without background workloads.]
- Without background workloads: co-VMM achieves native Linux performance
- With background workloads: co-VMM outperforms native Linux

Co-VMM for HPC in the Cloud (slide 21)
[Figure: CDF of HPCCG runtime (44–51 seconds) for Co-VMM, native, and KVM, running alongside Hadoop on 8 nodes.]
Co-VMM: consistent performance + performance isolation

Related Work (slide 22)
- Exascale operating systems and runtimes (OS/Rs):
  - Hobbes (SNL, LBNL, LANL, ORNL, U. Pitt, and other universities)
  - Argo (ANL, LLNL, PNNL, and various universities)
  - FusedOS (Intel/IBM)
  - mOS (Intel)
  - McKernel (RIKEN AICS, University of Tokyo)
- Our uniqueness: performance isolation, dynamic resource composition, lightweight virtualization

Conclusion (slide 23)
- Design and implementation of the Pisces co-kernel architecture:
  - Pisces framework: http://www.prognosticlab.org/pisces/
  - Kitten co-kernel: https://software.sandia.gov/trac/kitten
  - Palacios VMM for the Kitten co-kernel: http://www.prognosticlab.org/palacios
- Demonstrated that the co-kernel architecture provides:
  - Optimized execution environments for in situ processing
  - Performance isolation

Thank You
Jiannan Ouyang, Ph.D. Candidate @ University of Pittsburgh
ouyang@cs.pitt.edu · http://people.cs.pitt.edu/~ouyang/
The Prognostic Lab @ U. Pittsburgh: http://www.prognosticlab.org