DPDK Summit 2014 DPDK in a Virtual World




Transcription:

DPDK Summit 2014 DPDK in a Virtual World Bhavesh Davda (Sr. Staff Engineer, CTO Office, VMware) Rashmin Patel (DPDK Virtualization Engineer, Intel)

Agenda: Data Plane Virtualization Trends; DPDK Virtualization Support; VMware ESXi: Key Features and Performance

Data Plane Virtualization Trends
- Virtualization for Directed I/O: packets are routed to the Virtual Machine using DirectPath I/O (VT-d). Limited flexibility, but native performance. Suited to standalone appliance integration: firewall, WAN acceleration, traffic shaping.
- Hybrid switching solution: combines vSwitch support with direct assignment of SR-IOV Virtual Functions. Suited to service chaining, Unified Threat Management, intrusion detection/prevention.
- Optimized virtual switching solution: combines flexibility with performance, with support for live migration and data plane performance. Increasing flexibility through high-performance soft switching supporting both communications and compute workloads.
[Diagram: three configurations of a Compute VM plus a Data Plane Appliance VM running Intel DPDK - over a hypervisor with VT-d, over a hypervisor virtual switch plus SR-IOV, and over an optimized virtual switch - each on Intel Architecture network interfaces.]

Agenda: Data Plane Virtualization Trends; DPDK Virtualization Support; VMware ESXi: Key Features and Performance

DPDK Virtualization Support
Virtual Machine (guest) interfaces:
- Niantic 82599 Virtual Function PMD
- Fortville Virtual Function PMD
- E1000 emulated device (Intel 82540 Ethernet Controller) PMD
- Virtio paravirtual device (Qumranet device) PMD
- IVSHMEM shared-memory interface
- Virtqueue/grant-table interface in Xen DomU
- E1000 emulated device (Intel 82545EM Ethernet Controller) PMD - VMware ESXi
- E1000E emulated device (Intel 82574 Ethernet Controller) PMD - VMware ESXi
- VMXNET3 paravirtual device PMD - VMware ESXi
Hypervisor/Host:
- Niantic 82599 Physical Function driver
- Fortville Physical Function driver
- Userspace vhost backend support
- IVSHMEM backend support
[Diagram: DPDK applications in Fedora, Ubuntu, and CentOS guests running over a hypervisor with VT-d and a DPDK software switch on Intel Architecture.]
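All of the guest-side PMDs listed above are consumed through the same DPDK ethdev API, so a data plane application does not change when it is moved between an SR-IOV VF, an emulated E1000, or a paravirtual VMXNET3/virtio port. A minimal sketch of bringing up the first port the EAL probes inside a guest, assuming a recent DPDK release (exact function and constant names vary between versions):

    #include <stdlib.h>
    #include <rte_eal.h>
    #include <rte_debug.h>
    #include <rte_ethdev.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>

    int main(int argc, char **argv)
    {
        struct rte_eth_conf port_conf = {0};   /* default single-queue configuration */
        struct rte_mempool *pool;
        uint16_t port_id = 0;                  /* first port probed by the EAL */

        if (rte_eal_init(argc, argv) < 0)
            rte_exit(EXIT_FAILURE, "EAL init failed\n");

        /* Whether port 0 is backed by an ixgbe VF, e1000 emulation, virtio or
         * vmxnet3 is decided at probe time; the ethdev calls below are identical. */
        pool = rte_pktmbuf_pool_create("mbufs", 8192, 256, 0,
                                       RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
        if (pool == NULL)
            rte_exit(EXIT_FAILURE, "mbuf pool creation failed\n");

        if (rte_eth_dev_configure(port_id, 1, 1, &port_conf) < 0 ||
            rte_eth_rx_queue_setup(port_id, 0, 512, rte_socket_id(), NULL, pool) < 0 ||
            rte_eth_tx_queue_setup(port_id, 0, 512, rte_socket_id(), NULL) < 0 ||
            rte_eth_dev_start(port_id) < 0)
            rte_exit(EXIT_FAILURE, "port setup failed\n");

        return 0;
    }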

VMware ESXi Virtual Interfaces
- vSS/vDS/NSX: a software switch.
- Emulated devices: E1000 (Intel 82545EM), E1000E (Intel 82574), VLance (AMD PCnet32).
- Paravirtual devices: VMXNET, VMXNET2, and VMXNET3 interfaces.
- Direct assignment (vSwitch bypass, Intel VT-d/IOMMU required): DirectPath I/O (full PCI function) and SR-IOV (PCI VF assignment). Incompatible with virtualization features: vMotion, HA, FT, snapshots, network virtualization overlays (VXLAN/STT/Geneve).
[Diagram: guests running DPDK EM/IGB/IXGBE PMDs directly over VT-d (1G/10G/40G/SR-IOV VF), and DPDK EM and VMXNET3 PMDs over the virtual interface backend (vSwitch/vDS/NSX) with vmware/igb and vmware/ixgbe host drivers on Intel Architecture.]

VMware ESXi Emulated Device (E1000)
[Diagram: the guest OS E1000 driver exchanges Rx/Tx descriptor rings and packet buffer memory with the ESXi emulation backend and vSwitch; descriptor-ring doorbell updates go through device emulation (VM exit/VM entry).]
Every emulated PCI MMIO access from a VM results in a VM exit.
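The exits come from the doorbell writes in the guest driver. A conceptual illustration of why the emulated path is expensive (not the actual e1000 PMD source; the register offset is taken from the Intel 8254x datasheet): returning refilled descriptors to the NIC means writing its receive tail register, and on an emulated E1000 that MMIO page is trapped, so every write forces a VM exit into the emulation backend.

    #include <stdint.h>

    #define E1000_RDT0 0x02818  /* RX Descriptor Tail register (Intel 8254x datasheet) */

    /* Hand freshly refilled RX descriptors back to the (emulated) NIC.
     * On real hardware this store reaches the device directly; under E1000
     * emulation the BAR is trapped, so the store faults and the hypervisor's
     * device-emulation backend completes it: one VM exit per doorbell write. */
    static inline void
    e1000_rx_tail_update(volatile uint8_t *bar0, uint32_t tail)
    {
        *(volatile uint32_t *)(bar0 + E1000_RDT0) = tail;
    }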

VMware ESXi Paravirtual Device (VMXNET3)
[Diagram: the guest OS VMXNET3 driver and the ESXi VMXNET3 backend share receive and transmit queues - command rings, completion rings, and packet buffer memory - through a shared memory region.]
VM exits are reduced by moving the queue control registers into shared memory and passing hints from ESXi to the Guest OS.
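The shared memory region is what removes most of the doorbells. A rough sketch of the idea, with hypothetical structure and field names rather than the real VMXNET3 data structures: ring indices live in memory visible to both sides, so producing a descriptor is a plain store, and the MMIO doorbell (the only operation that can exit) is skipped whenever the backend hints that it is already polling the ring.

    #include <stdint.h>

    /* Queue control block shared between the guest driver and the hypervisor
     * backend (hypothetical layout, for illustration only). */
    struct pv_txq_shared {
        volatile uint32_t next_to_fill;    /* producer index, written by the guest */
        volatile uint32_t backend_active;  /* hint: backend is already polling */
    };

    /* Publish one more filled descriptor.  The index update is an ordinary
     * memory write and never causes a VM exit; the doorbell MMIO write is
     * skipped when the backend signals that it is already servicing the ring. */
    static inline void
    pv_txq_produce(struct pv_txq_shared *q, volatile uint32_t *doorbell)
    {
        q->next_to_fill++;
        if (!q->backend_active)
            *doorbell = q->next_to_fill;
    }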

ESXi 5.5 - VMXNET3 vs E1000
- Optimized Rx/Tx queue handling in VMXNET3, controlled through a shared memory region, reduces VM exits compared to E1000's inefficient MMIO emulation.
- The multiqueue infrastructure of VMXNET3 with RSS capability enhances performance with multiple cores in a VM.
- Average cost of a VM exit/VM entry sequence includes ~600 cycles for the VMCALL instruction; average cost of an EPT violation is ~1000 cycles (Intel Xeon Processor E5-4610 v2, 16M cache, 2.30 GHz).
[Chart: VM exit/VM entry frequency - EPT exits and total exits - for DPDK over VMXNET3 vs DPDK over E1000; setup: VMs running the Intel DPDK EM-PMD (e1000) and VMXNET3-PMD over the ESXi hypervisor vSwitch, driven by an external traffic generator.]
97% reduction of VM exits associated with a DPDK-based packet forwarding benchmark.

Hypervisor Backend Impact
Traffic flow: Traffic Gen. -> vSwitch -> VMXNET3 (or E1000) -> VM (DPDK) -> VMXNET3 (or E1000) -> vSwitch -> Traffic Gen.
The VM exit reduction doesn't translate into a big difference in packet throughput; the hypervisor backend and native networking stack need optimizations.

Device Model Research
Traffic flow: Traffic Gen. -> 10G (VT-d) -> VM1 (DPDK) -> VMXNET3 (or E1000) -> vSwitch -> VMXNET3 (or E1000) -> VM2 (DPDK) -> 10G (VT-d) -> Traffic Gen.
Important to understand for designing changes in the device model for future ESXi releases.

Agenda: Data Plane Virtualization Trends; DPDK Virtualization Support; VMware ESXi: Key Features and Performance

Key Properties of Virtual Machines
- Partitioning: run multiple operating systems on one physical machine; divide system resources between virtual machines.
- Isolation: fault and security isolation at the hardware level; advanced resource controls preserve performance.
- Encapsulation: the entire state of the virtual machine can be saved to files; move and copy virtual machines as easily as moving and copying files.
- Hardware Independence: provision or migrate any virtual machine to any similar or different physical server.

ESXi Networking Architecture Overview

ESXi Networking Datapath Overview
- Message copy from application to GOS (kernel).
- GOS (network stack) + vNIC driver queues the packet for the vNIC.
- VM exit to the VMM/hypervisor.
- The vNIC implementation emulates DMA from the VM and sends the packet to the vSwitch.
- The vSwitch queues the packet for the pNIC.
- The pNIC DMAs the packet and transmits it on the wire.
[Diagram: VMs, virtual switch, and pNIC inside the ESXi hypervisor, alongside management agents and background tasks, connected to a physical network switch.]

Transmit Processing for a VM
[Diagram: per-vNIC device emulation feeding the network virtualization layer and pNIC driver; stages convert ring entries to packets, perform switching/encapsulation/teaming, and convert packets back to ring entries for the destination.]
- One transmit thread per VM, executing all parts of the stack.
- The transmit thread can also execute the receive path for the destination VM.
- Activation of the transmit thread - two mechanisms: immediate, forcible activation by the VM (low delay, expensive), or opportunistic activation by other threads or when the VM halts (potentially higher delay, cheap).

Receive Processing for a VM
[Diagram: pNIC driver with shared and dedicated queues feeding the network virtualization layer and per-vNIC device emulation.]
- One thread per device; on NetQueue-enabled devices, one thread per NetQueue.
- Each NetQueue processes traffic for one or more MAC addresses (vNICs).
- The NetQueue-to-vNIC mapping is determined by unicast throughput and FCFS.
- vNICs can share queues due to low throughput, too many vNICs, or a queue type mismatch (LRO queue vs. non-LRO vNIC).

Improve receive throughput to a single VM
[Diagram: receive path from the pNIC through the network virtualization layer and vNIC device emulation, fanned out across vCPU0..vCPUn.]
- A single thread for receives can become a bottleneck at high packet rates (> 1 million PPS or > 15 Gbps).
- Use the VMXNET3 virtual device and enable RSS inside the guest.
- Enable RSS in the physical NICs (only available on some pNICs): add ethernetX.pnicFeatures = 4 to the .vmx file.
- Side effect: increased CPU cycles/byte.
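Enabling RSS inside the guest is done by the DPDK application when it configures the VMXNET3 port with multiple RX queues, one per polling lcore. A sketch using the classic ethdev RSS configuration (constant names follow DPDK releases of that era; newer releases prefix them with RTE_):

    #include <rte_ethdev.h>

    /* Spread receive traffic across nb_queues RX queues with RSS so that
     * multiple guest vCPUs (one polling lcore per queue) share the load. */
    static int
    enable_guest_rss(uint16_t port_id, uint16_t nb_queues)
    {
        struct rte_eth_conf conf = {
            .rxmode = { .mq_mode = ETH_MQ_RX_RSS },
            .rx_adv_conf.rss_conf = {
                .rss_key = NULL,                               /* driver default key */
                .rss_hf  = ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP,
            },
        };

        /* The per-queue rte_eth_rx_queue_setup() calls are done elsewhere. */
        return rte_eth_dev_configure(port_id, nb_queues, nb_queues, &conf);
    }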

Improve transmit throughput with multiple vNICs
[Diagram: a dedicated device-emulation/network-virtualization/pNIC transmit context per vNIC.]
- Some applications use multiple vNICs for very high throughput.
- A common transmit thread for all vNICs can become a bottleneck.
- Set ethernetX.ctxPerDev = 1 in the .vmx file.
- Side effect: increased CPU cost/byte.

The Latency Sensitivity Feature in vSphere 5.5
Minimize virtualization overhead for near bare-metal performance. New virtual machine property: Latency Sensitivity (High => lowest latency, Medium => low latency).
- Exclusively assign physical CPUs to the virtual CPUs of Latency Sensitivity = High VMs; those physical CPUs are not used for scheduling other VMs or ESXi tasks.
- Idle in the Virtual Machine Monitor (VMM) when the Guest OS is idle; this lowers the latency to wake up the idle Guest OS compared to idling in the ESXi vmkernel.
- Disable vNIC interrupt coalescing.
- For DirectPath I/O, optimize the interrupt delivery path for lowest latency.
- Make the ESXi vmkernel more preemptible, reducing jitter due to long-running kernel code.

ESXi 5.5 Network Latencies and Jitter
[Chart: response time (us) at the median, 90th, 99th, 99.9th, and 99.99th percentiles for VM-Default, VM-SRIOV, VM-Latency-SRIOV, and Native.]
Setup: single 2-vCPU VM, RHEL 6.2 x64, ping -i 0.001 RTT; Intel Xeon E5-2640 @ 2.50 GHz, Intel 82599EB pNIC.

ESXi 5.5 ultra-low latency with InfiniBand DirectPath I/O
[Chart: polling RDMA write latency (us) vs. message size (bytes), Native vs. ESXi 5.5.]
Setup: HP* ProLiant* DL380 Gen8, Intel Xeon E5-2667 v2 (2 x 8 cores @ 3.0 GHz), 2 x 64 GB memory, Mellanox* ConnectX-3 FDR InfiniBand*/RoCE adapter, Mellanox OFED (OpenFabrics Enterprise Distribution) ver 2.2.

ESXi 5.5 packet rates with Intel DPDK
[Chart: L3 forwarding rate in millions of packets per second (Mpps) vs. packet size (64 to 1512 bytes) on VMware ESXi 5.5 for Linux native, DPDK native, DPDK virtualized SR-IOV, and DPDK virtualized VMXNET3, with reference lines for rates typical of telco core networks and enterprise data center networks. Annotations: "Intel VT-d IOTLB limitation - close the gap in a future vSphere release with Intel VT-d SuperPage support (Intel E5-2600 v2 and newer)" and "Focus area for a future vSphere release".]
Setup: Intel Server Board S2600CP, IVT L1 @ 2.3 GHz, Intel 10GbE Ethernet Controller; Intel Virtualization Technology (Intel VT) for Directed I/O (Intel VT-d); Intel Data Plane Development Kit (Intel DPDK).
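The numbers above come from an L3 forwarding workload; at its core, the virtualized cases run the same run-to-completion loop as the native case, and only the PMD underneath changes (VF, VMXNET3, or emulated E1000). A stripped-down sketch of such a forwarding loop (port and queue numbers are illustrative, and the actual L3 lookup/header rewrite is omitted):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST 32

    /* Poll rx_port and forward every received burst out of tx_port.
     * Runs forever on one lcore. */
    static void
    forward_loop(uint16_t rx_port, uint16_t tx_port)
    {
        struct rte_mbuf *pkts[BURST];

        for (;;) {
            uint16_t nb_rx = rte_eth_rx_burst(rx_port, 0, pkts, BURST);
            if (nb_rx == 0)
                continue;

            uint16_t nb_tx = rte_eth_tx_burst(tx_port, 0, pkts, nb_rx);

            /* Free whatever the TX queue could not absorb. */
            while (nb_tx < nb_rx)
                rte_pktmbuf_free(pkts[nb_tx++]);
        }
    }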

NSX Network Virtualization Platform
[Diagram: a cloud management platform drives the NSX controller cluster through the northbound NSX API; L2/L3 virtual networks are realized by NSX vSwitches on vSphere hosts, Open vSwitch on KVM and Xen Server hosts, NSX Edge (DPDK), and hardware partner VTEPs (software VTEP API), all over VLANs on the physical network.]

Summary
- Virtual appliances performing network functions need high performance on commodity x86-based server platforms.
- The virtualized interfaces supported in DPDK offer multiple options and flexibility for data plane application developers.
- The ESXi hypervisor supports multiple interfaces, including paravirtual and emulated interfaces, while offering best-in-class virtualization features, as well as directly assigned interfaces via DirectPath I/O and SR-IOV.

vSphere ESXi Performance Related References
- Best Practices for Performance Tuning of Latency-Sensitive Workloads in vSphere VMs: http://www.vmware.com/files/pdf/techpaper/w-tuning-latency-sensitive-workloads.pdf
- Intel Data Plane Development Kit (Intel DPDK) with VMware vSphere: http://www.vmware.com/files/pdf/techpaper/intel-dpdk-vsphere-solution-brief.pdf
- Deploying Extremely Latency-Sensitive Applications in VMware vSphere 5.5: http://www.vmware.com/files/pdf/techpaper/latency-sensitive-perf-vsphere55.pdf
- The CPU Scheduler in VMware vSphere 5.1: https://www.vmware.com/files/pdf/techpaper/ware-vsphere-cpu-sched-perf.pdf
- RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vSphere 5: https://labs.vmware.com/academic/publications/ib-researchnote-apr2012

Backup

VMware vSphere Sample Features: Protect Against Failures, Eliminate Planned Downtime, Distributed Resource Scheduler

Getting Started: vSphere Hypervisor
Topics: minimum hardware requirements, creating virtual machines, vSphere deployment architecture.

Improve multicast throughput to multiple VMs
[Diagram: a receive thread (T0) on the pNIC/network-virtualization layer handing packets to per-VM device-emulation threads (T1, T2, ...).]
- Multicast receive traffic: a single thread has to duplicate packets for ALL VMs.
- Set ethernetX.emuRxMode = 1: the receive thread queues packets with the device emulation layer, and a per-VM thread picks up the packets and carries out receive processing.
- Side effects: increased receive throughput to a single VM, increased CPU cycles/byte.