A Holistic Model of the Performance and the Energy-Efficiency of Hypervisors in an HPC Environment
Mateusz Guzek, Sebastien Varrette, Valentin Plugaru, Johnatan E. Pecero and Pascal Bouvry
SnT & CSC, University of Luxembourg, Luxembourg
Summary
1. Introduction, Context & Motivations
2. Modeling
3. Experimental Setup & Experiments Performed
4. Results
5. Conclusion
Introduction, Context & Motivations
HPC at the Heart of our Daily Life
Today: research, industry, local authorities...
Tomorrow: applied research, digital health, nano/bio technologies
Cloud Computing in an HPC context
Horizontal scalability: perfect for replication / HA (High Availability)
- best suited for runs with minimal communication and I/O
- nearly useless for true parallel/distributed HPC runs
Cloud data storage
- data locality enforced for performance
- data outsourcing vs. legal obligation to keep data local
- accessibility and security challenges
Cost effectiveness
- chaos+gaia usage: 11,154,125 CPU-hours (1273 years) since 2007
- $15.06M on EC2 cc2.8xlarge vs. €4M cumulative HW investment
Virtualization layer impact on performance?
- most probably decreased performance
- huge overhead induced on I/O + no support for IB (QDR/FDR/EDR)
Objectives of this study
Better than assumptions / a priori claims: concrete models and experiments.
Evaluate the impact of the underlying hypervisor
- at the heart of any cloud middleware so far
- analysis of the most widespread virtualization frameworks
- propose a lightweight, high-level model of a virtualized machine
Evaluate a real HPC platform (or anything as close as possible)
- concrete deployment on top of the Grid'5000 platform
- select benchmarking tools that reflect an HPC usage
Abstract from the specifics of a single processor architecture
- evaluate Intel vs. AMD (thanks Georges ;))
Virtualization Frameworks
Enable finer-grained resource provisioning and provide new functionalities (e.g. migration, suspension). The study includes the most commonly used hypervisors (a quick hardware-support check is sketched below):

                    Xen 4.0            KVM 0.12      ESXi 5.1
Host architecture   x86, x86-64, ARM   x86, x86-64   x86-64
VT-x/AMD-V          Yes                Yes           Yes
Max. guest CPUs     128                64            32
Max. host memory    1TB                -             2TB
Max. guest memory   1TB                -             1TB
3D acceleration     Yes (HVM guests)   No            Yes
License             GPL                GPL/LGPL      Proprietary

Deployment on the same Debian instance on Grid'5000.
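The VT-x/AMD-V row of the table can be verified on a Linux host before deploying any hypervisor. A minimal sketch (the helper name is ours): the "vmx" flag in /proc/cpuinfo indicates Intel VT-x, "svm" indicates AMD-V.

```python
# Minimal sketch (assuming a Linux host): detect the hardware
# virtualization extensions listed in the table via /proc/cpuinfo.
def virtualization_support(cpuinfo_path="/proc/cpuinfo"):
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                flags = set(line.split(":", 1)[1].split())
                if "vmx" in flags:
                    return "Intel VT-x"
                if "svm" in flags:
                    return "AMD-V"
    return None

print(virtualization_support() or "no HW virtualization extensions found")
```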
HPC benchmarks
Selected to represent various use cases of HPC systems (a scripted launch is sketched below):
- HPCC: the reference benchmark suite for HPC; includes HPL; 7 tests to stress CPU/disk/RAM/network usage
- Bonnie++: a file system benchmarking suite
- IOZone: cross-platform benchmark of file operations (read, write, re-read, re-write, read backwards/strided, mmap...)
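A hedged sketch of how such a campaign can be scripted; the exact flags are assumptions, except the IOzone file/record sizes, which match the 64MB-file / 1MB-record test reported in the results.

```python
# Launch the three benchmark suites in sequence (flags are assumptions).
import os
import subprocess

os.makedirs("/tmp/bench", exist_ok=True)
BENCHMARKS = [
    ["mpirun", "-np", "12", "hpcc"],           # HPCC/HPL: 12 ranks = 2 x 6 cores (taurus)
    ["bonnie++", "-d", "/tmp/bench"],          # Bonnie++: file system benchmark
    ["iozone", "-a", "-s", "64m", "-r", "1m"], # IOzone: 64MB file, 1MB record
]
for cmd in BENCHMARKS:
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```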
Modeling
Resource model
The model divides a machine into resources of distinct types, further defined by resource supplies: their capacity and architecture. Example node (a structural sketch follows):

Node: Dell PowerEdge R310
- Processor: Intel Xeon X3430, 2.4 GHz, 8M cache, Turbo (4 cores)
- Memory: 4GB (2 x 2GB), 1333MHz single-ranked UDIMM
- Storage: 500GB 7.2K RPM SATA 3.5in, no RAID
- Network: Broadcom 5709 dual-port 1GbE NIC w/ TOE, iSCSI, PCIe x4 (2 x 1 Gbps)
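An illustrative sketch of this model (class and field names are ours, not the paper's notation), instantiated for the PowerEdge R310 above:

```python
# Each resource supply has a type, a capacity and an architecture tag.
from dataclasses import dataclass

@dataclass
class Resource:
    kind: str        # "processor", "memory", "storage" or "network"
    capacity: float  # cores, GB, GB or Gbps respectively
    arch: str

r310 = [
    Resource("processor", 4, "x86-64"),   # Intel Xeon X3430, 4 cores
    Resource("memory", 4, "DDR3-1333"),   # 2 x 2GB UDIMM
    Resource("storage", 500, "SATA"),     # 500GB 7.2K RPM disk
    Resource("network", 1, "1GbE"),       # one port of the dual-port NIC
    Resource("network", 1, "1GbE"),
]
```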
Resource allocation model
The resource allocation is three-tier: Task, VM, Machine.
[Figure: nine tasks (T1-T9, each with a demand D and an architecture A) mapped onto five typed VMs (each with a demand D, a provision P and an architecture A), themselves mapped onto two nodes (provision P, utilization U, architecture A).]
Each level is described by the previously presented resource model, either as a resource provision or as a resource demand; a feasibility check is sketched below.
Validation: can such a model estimate the average power draw of a node?
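A sketch of the feasibility rule the figure implies (identifiers are ours): children fit a layer if their summed demand does not exceed its provision and every architecture matches. The same rule applies between tasks and VMs, and between VMs and machines.

```python
# Feasibility check for one layer of the three-tier allocation.
def fits(demands, archs, provision, arch):
    return sum(demands) <= provision and all(a == arch for a in archs)

# VM1 (provision P=4, architecture A=1) hosting T1 (D=1, A=1) and T2 (D=2, A=1)
print(fits(demands=[1, 2], archs=[1, 1], provision=4, arch=1))  # True
```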
Experimental Setup & Experiments Performed
Setup

Name   Site   Cluster  #cpus/node  RAM   Processor                        R_peak
Intel  Lyon   taurus   2           32GB  Intel Xeon E5-2630@2.3GHz, 6C    110.4 GFlops
AMD    Reims  stremi   2           48GB  AMD Opteron 6164 HE@1.7GHz, 12C  163.2 GFlops

Deployment on recent (2011) platforms, Intel and AMD. The experimental setup is not straightforward: see the paper for details. A back-of-the-envelope check of the R_peak column follows.
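A hedged sanity check of the R_peak column using the standard peak formula, R_peak = cores x clock [GHz] x DP FLOPs/cycle. The per-cycle FLOP counts (8 for the AVX-capable Xeon E5, 4 for the Opteron 6164 HE) are our assumption, not stated on the slide; note that the listed Intel figure then corresponds to a single socket, while the AMD figure covers the whole two-socket node.

```python
# R_peak = cores x clock [GHz] x DP FLOPs/cycle (assumed values).
xeon_e5_2630 = 6 * 2.3 * 8        # 110.4 GFlops, one 6-core socket
opteron_6164 = 2 * 12 * 1.7 * 4   # 163.2 GFlops, both 12-core sockets
print(round(xeon_e5_2630, 1), round(opteron_6164, 1))
```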
Runs & Monitoring

Config     baseline  KVM  Xen  VMWare ESXi  Observation No.
stremi-3   5         5    5    0            10916
stremi-6   5         5    5    0            10907
stremi-30  5         5    5    5            13706
stremi-31  5         5    5    5            14026
taurus-7   5         5    5    5            6516
taurus-8   4         5    5    0            4769
taurus-9   5         5    5    0            5085
taurus-10  5         5    5    5            6545

Average power values based on:
- OmegaWatt (taurus): averaged, every 1s
- Raritan (stremi): instantaneous, every 3s

dstat used to capture utilization every 1s (a parsing sketch follows):
- CPU: user, system, idle, wio (%)
- Memory: used, buffered, cached, free (B)
- Disk: read, write (B)
- Network: received, sent (B)
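A minimal sketch of turning such dstat logs into model inputs, assuming CSV output and pandas; the file name is hypothetical and the number of metadata rows that `dstat --output` prepends varies with the dstat version.

```python
# Average the dstat utilization metrics over a run.
import pandas as pd

df = pd.read_csv("run.csv", skiprows=6)  # skip dstat's CSV preamble
# Average CPU utilization over the run: user, system, idle, wait-on-I/O (%)
print(df[["usr", "sys", "idl", "wai"]].mean())
```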
Results
Performance: HPCC results
[Figure: raw HPCC results (log scale) for Baseline, Xen, KVM and VMWare ESXi on the Intel and AMD platforms. Panels: HPL (TFlops), DGEMM (GFlops), STREAM Copy/Scale/Add/Triad (MB/s), RandomAccess (GUPs).]
Performance: IOzone results
[Figure: IOzone 64MB file, 1MB record test (rewrite, random_write, read, random_read, reread) for baseline, KVM, Xen and ESXi on stremi and taurus.]
- The Intel-based platform outperforms the AMD-based one.
- ESXi outperforms the baseline in some cases on the Intel platform (caching strategy?)
Consumption profiles I
[Figure: power usage [W] over time for Baseline and Xen on Intel (Lyon) and AMD (Reims); each curve is split into HPCC, Bonnie++ and IOZone phases, annotated with the total energy (J) and duration (s) of each phase.]
Consumption profiles II
[Figure: power usage [W] over time for KVM and VMWare ESXi on Intel (Lyon) and AMD (Reims); each curve is split into HPCC, Bonnie++ and IOZone phases, annotated with the total energy (J) and duration (s) of each phase.]
Models and their accuracy
The proposed models are based on the multiple linear regression principle and include different subsets of the available inputs (a regression sketch follows).

Model        R²     St.er.  Residuals: Min  1Q     Median   3Q    Max   Error [W]  Error [%]
Basic        0.959  10.4    -116            -3.54  0.8      5.07  117   6.67       3.8
Refined      0.959  10.4    -116            -3.54  0.806    5.07  117   6.67       3.8
No Phases    0.941  12.4    -127            -4.13  1.04     4.88  125   8.18       4.4
CPU Hom.     0.814  22      -147            -13.1  3.9      13.6  160   16.7       9.6
CPU Het.     0.922  14.2    -129            -5.12  -0.0472  4.89  129   9.73       5.0
Only phases  0.922  14.3    -114            -3.94  2.06     5.51  72.8  8.75       4.9
No group     0.856  19.4    -142            -8.45  2.96     11.3  129   14.1       8.0
Clusterwise  0.928  13.7    -122            -7.22  0.876    7.02  131   9.97       5.3
Group only   0.924  14.1    -113            -4.29  2.52     5.88  69.9  8.77       4.9
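A sketch of how such a model can be fitted, assuming pandas/statsmodels and our own column names (the paper's exact tooling is not stated): power is regressed on the categorical predictors (node, phase, hypervisor) plus the dstat utilization metrics.

```python
# Multiple linear regression with categorical and numerical predictors.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("observations.csv")  # one row per monitoring sample
model = smf.ols(
    "power ~ C(node) + C(phase) + C(hypervisor)"
    " + cpu_user + cpu_system + cpu_idle + cpu_wio"
    " + mem_used + mem_buffers + mem_cached + mem_free"
    " + disk_write + bytes_received + bytes_sent",
    data=df,
).fit()
print(model.rsquared)   # compare against the R² column above
print(model.params)     # fitted coefficients, as on the next slide
```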
Numerical values for a sample model

Categorical predictors:

node   stremi-3  stremi-30  stremi-31  stremi-6  taurus-10  taurus-7  taurus-8  taurus-9
value  0         -1.8       -14        -4.7      -45        -40       -46       -44

phase  Bonnie  DGEMM  FFT  HPL  IOZONE  PTRANS  RandAcc  STREAM  idle
value  0       5.7    6.5  16   0.012   -6.1    -11      3.1     6.1

hypervisor  ESXi  KVM   Xen   baseline
value       0     -3.6  -5.4  -19

Numerical predictors:

metric  Intercept  cpu user  cpu system  cpu idle  cpu wio  mem used
value   316        -0.78     -1.3        -1.7      -1.8     1.1E-09

metric  mem buffers  mem cached  mem free  disk write  bytes rec.  bytes sent
value   -3.1E-08     8.0E-10     8.3E-10   -1.8E-10    4.6E-05     -1.1E-04

A worked application of these coefficients follows.
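A worked example applying the sample coefficients above: predicted power = intercept + categorical offsets + sum of coefficient x metric. The utilization values are invented, and the memory, disk and network terms are omitted for brevity.

```python
# Hypothetical observation: HPL phase on bare-metal stremi-3, CPU-bound.
coef = {"cpu_user": -0.78, "cpu_system": -1.3, "cpu_idle": -1.7, "cpu_wio": -1.8}
obs = {"cpu_user": 90.0, "cpu_system": 5.0, "cpu_idle": 5.0, "cpu_wio": 0.0}

power = 316            # intercept
power += 0             # node: stremi-3 (reference level)
power += 16            # phase: HPL
power += -19           # hypervisor: baseline
power += sum(coef[k] * obs[k] for k in coef)
print(round(power, 1), "W predicted")  # 227.8 W
```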
Predictions I
[Figure: observed vs. predicted (Refined model) power [W] over time for Baseline Intel (Lyon) and Baseline AMD (Reims).]
Predictions II
[Figure: observed vs. predicted (Refined model) power [W] over time for Xen Intel (Lyon) and Xen AMD (Reims).]
Predictions III
[Figure: observed vs. predicted (Refined model) power [W] over time for KVM Intel (Lyon) and KVM AMD (Reims).]
Predictions IV
[Figure: observed vs. predicted (Refined model) power [W] over time for ESXi Intel (Lyon) and ESXi AMD (Reims).]
Conclusion
Conclusion
Cloud Computing in an HPC context requires a better understanding of the performance of virtualization middlewares. In this talk:
- evaluation of 3 widespread virtualization frameworks (Xen, KVM and VMware ESXi) vs. baseline
- practical and lightweight power model of virtualized nodes; the holistic model is the most accurate
- deployment/experiments on a real HPC environment: Grid'5000, Intel vs. AMD evaluation
- middleware affects both performance and energy results; the observed overhead (20-30%) is acceptable
- hardware heterogeneity is noticeable: better results on Intel than on AMD
Future work
1. Performance modelling, and its comparison with the power model
2. Analysis of the effects of multiple VMs on a single node
3. Inclusion of temperature as an environmental factor
4. Analysis of the overhead of cloud management systems: OpenNebula, OpenStack, Eucalyptus, Nimbus etc. (and Snooze ;))
5. Experiments with network-intensive workloads
6. Derivation of the model at the component level, rather than at the full-platform level
Thank you for your attention...

Contacts: {firstname.name}@uni.lu