High-performance vNIC framework for hypervisor-based NFV with userspace vswitch
Yoshihiro Nakajima, Hitoshi Masutani, Hirokazu Takahashi
NTT Labs.
Outline
- Motivation and background
- Issues in current NFV implementations
- vNIC design and implementation
- Performance evaluation
Goal
- To provide novel components that enable high-performance NFV on general-purpose hardware
- To provide a high-performance vNIC with operation-friendly features
How many cycles for one packet?
- Short packet (64 bytes): 14.88 Mpps, 67.2 ns per packet (2 GHz: 134 cycles; 3 GHz: 201 cycles)
- 1-KB packet: 1.2 Mpps, 835 ns per packet (2 GHz: 1670 cycles; 3 GHz: 2505 cycles)
[Chart: packets per second vs. packet size (bytes)]
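The per-packet budgets above follow directly from the 10 Gbps line rate: each frame occupies its own bytes plus a 20-byte overhead (8-byte preamble and 12-byte inter-frame gap) on the wire. A minimal sketch of that arithmetic (function names are ours, not from the slides):

```c
#include <assert.h>
#include <math.h>

/* Time budget per packet on a link of the given rate: wire size
 * includes the 8-byte preamble and 12-byte inter-frame gap. */
static double budget_ns(double frame_bytes, double gbps)
{
    return (frame_bytes + 20.0) * 8.0 / gbps;
}

/* CPU cycles available per packet at a given clock rate in GHz. */
static double budget_cycles(double frame_bytes, double gbps, double ghz)
{
    return budget_ns(frame_bytes, gbps) * ghz;
}
```

For a 64-byte frame at 10 Gbps this yields 67.2 ns, i.e. roughly 134 cycles at 2 GHz, matching the figures on the slide.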
Data Plane Development Kit (DPDK)
- x86-architecture-optimized dataplane library and NIC drivers: memory-structure-aware queue and buffer management, packet flow classification, polling-mode NIC drivers
- Low-overhead, high-speed runtime optimized for data-plane processing
- Abstraction layer for heterogeneous server environments
- BSD license
[Diagram: DPDK apps with dataplane/driver APIs alongside the socket API; the NIC DMA-writes into and DMA-reads from the packet buffer]
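The key idea behind DPDK's polling-mode drivers is to fetch packets in bursts from a descriptor ring without taking interrupts. The sketch below illustrates that pattern against an in-memory stub ring; real PMDs poll DMA-visible hardware descriptors instead, and all names here are illustrative, not DPDK's actual API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8

/* Stand-in for a NIC RX descriptor ring. */
struct rx_ring {
    void    *pkts[RING_SIZE];  /* packet pointers written by "hardware" */
    uint32_t head;             /* next slot hardware will fill */
    uint32_t tail;             /* next slot software will read */
};

/* Poll-mode receive: grab up to `burst` packets without sleeping or
 * taking an interrupt; returns the number actually received. */
static unsigned rx_burst(struct rx_ring *r, void **out, unsigned burst)
{
    unsigned n = 0;
    while (n < burst && r->tail != r->head) {
        out[n++] = r->pkts[r->tail % RING_SIZE];
        r->tail++;
    }
    return n;
}

/* Stub for the NIC DMA engine placing a packet into the ring. */
static int ring_put(struct rx_ring *r, void *pkt)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;             /* ring full: packet would be dropped */
    r->pkts[r->head % RING_SIZE] = pkt;
    r->head++;
    return 0;
}
```

An app's main loop simply calls the burst receive in a tight loop, trading one busy CPU core for the latency and overhead of interrupt-driven I/O.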
NFV requirements from 30,000 feet
- High-performance network I/O at all packet sizes, especially small packets (< 256 bytes)
- Low latency and low jitter in network I/O and packet processing
- Isolation: performance isolation between NFV VMs; security-related VM-to-VM isolation from untrusted apps
- Reliability, availability and serviceability (RAS) functions for long-term operation
[Diagram: Nf-Vi-H architecture — VMs connected through a virtual switch (vswitch) with management and API layers over hardware resources (NIC)]
What's wrong with NFV today
- Still-poor performance of NFV apps: low network I/O performance, large processing latency and jitter
- Limited deployment flexibility: SR-IOV has limitations in performance and configuration; combining DPDK apps on a guest VM with a DPDK-enabled vswitch constrains the possible configurations
- Limited operational support: DPDK is good for performance but offers limited dynamic reconfiguration; maintenance features are not realized
Issues in NFV middleware
Performance bottlenecks in NFV with a hypervisor
- System calls cause context switches on the guest VM
- Hardware emulation (QEMU) consumes CPU cycles, and packet receive/send causes VM transitions
- Privileged register accesses for the vNIC cause VM transitions
- One VM transition costs roughly 800 CPU cycles
[Diagram: guest VM (legacy NW apps, NW stack, packet buffer, e1000 NIC driver) connected through the QEMU NIC HW emulator, TAP driver, KVM driver and bridge to the host NIC driver and NIC]
vNIC strategy for performance and RAS
- Use a para-virtualization NIC framework; no full virtualization (emulation)
- Global shared-memory-based packet exchange to reduce memory copies
- User-space packet data exchange; no kernel-to-userspace packet data copies
Target NFV architecture with hypervisor
- DPDK apps or legacy apps on guest VMs + userspace vswitch, connected by shared-memory-based vNICs
- Reduced OS-kernel involvement: run in userspace to avoid VM transitions and context switches
- Memory-based packet transfer
[Diagram: Nf-Vi-H — VMs connected to the virtual switch (vswitch) by memory-based packet transfer over hardware resources (NIC)]
Existing vNICs for u-vsw and guest VM (1/2)
- Option 1: e1000 PMD on the guest with QEMU's e1000 full-virtualization device; DPDK-enabled vswitch connected via tap.
  Pros: legacy and DPDK support, opposite-side status detection. Cons: bad performance, many VM transitions and context switches.
- Option 2: virtio-net PV PMD with the QEMU virtio-net framework; vswitch connected via tap.
  Pros: legacy and DPDK support, opposite-side status detection. Cons: bad performance, many VM transitions and context switches.
- Option 3: virtio-net PV PMD with the vhost-net framework; vswitch connected via tap.
  Pros: legacy and DPDK support, opposite-side status detection. Cons: bad performance, many VM transitions and context switches.
[Diagrams: in each option, the guest VM's NW apps and ETHDEV/PMD exchange packets through QEMU, TAP, KVM and (for option 3) vhost-net on the host, reaching a DPDK-enabled vswitch and DPDK-enabled NIC]
Existing vNICs for u-vsw and guest VM (2/2)
- Option 4: virtio-net PV PMD with the QEMU virtio-net framework; the vswitch exposes a vhost-user backend to connect to the virtio-net PMD.
  Pros: good performance, supports both legacy and DPDK apps. Cons: no status tracking of the opposite device.
- Option 5: DPDK ring via the QEMU IVSHMEM extension; vswitch connected by shared memory.
  Pros: best performance. Cons: DPDK-only support, static configuration, no RAS.
[Diagrams: option 4 — guest virtio ring mmap'ed by the vhost-user backend in the DPDK-enabled vswitch; option 5 — DPDK RING API over a hugepage shared with the vswitch through the IVSHMEM device]
High-performance vNIC framework for NFV
vNIC requirements for NFV with u-vsw
- High performance: 10-Gbps network I/O throughput; no virtualization transitions between a guest VM and the u-vsw; simultaneous support for DPDK apps and the u-vsw
- Functionality for operation: isolation between NFV VMs and the u-vsw; flexible service-maintenance support; link-status notification on both sides
- Virtualization middleware support: open-source hypervisor (KVM); DPDK-app and legacy-app support; no OS (kernel) modification on the guest VM
vNIC design
- vNIC as an extension of the virtio-net framework (para-virtualization network interface)
- Packet communication via global shared memory; one copy per packet to ensure VM-to-VM isolation
- Control messages via inter-process communication between pseudo devices
[Diagram: guest VM (NW apps, ETHDEV/virtio-net PMD over the virtio-net-compatible device) and the DPDK-enabled vswitch (software dataplane, ETHDEV/PMD, pseudo-PMD-enabled device) exchanging packets over global shared memory, with IPC-based control communication]
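The one-copy isolation property can be sketched as a single-producer/single-consumer ring in the shared region: the guest copies each frame once into a shared slot, so the vswitch never dereferences guest-private memory. The layout and names below are illustrative, not the actual virtq-PMD structures:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS   4
#define SLOT_SZ 2048           /* big enough for a 1500-byte frame */

/* Ring living entirely in the global shared memory region. */
struct shm_ring {
    uint8_t  buf[SLOTS][SLOT_SZ];
    uint16_t len[SLOTS];
    uint32_t prod;             /* written only by the guest side */
    uint32_t cons;             /* written only by the vswitch side */
};

/* Guest side: the one copy, from private memory into shared memory. */
static int shm_send(struct shm_ring *r, const void *pkt, uint16_t len)
{
    if (len > SLOT_SZ || r->prod - r->cons == SLOTS)
        return -1;             /* oversized frame or ring full */
    uint32_t i = r->prod % SLOTS;
    memcpy(r->buf[i], pkt, len);
    r->len[i] = len;
    r->prod++;                 /* publish only after the copy completes */
    return 0;
}

/* vswitch side: a real dataplane would process the frame in place;
 * we copy it out here only to make the sketch easy to check. */
static int shm_recv(struct shm_ring *r, void *out, uint16_t *len)
{
    if (r->cons == r->prod)
        return -1;             /* ring empty */
    uint32_t i = r->cons % SLOTS;
    memcpy(out, r->buf[i], r->len[i]);
    *len = r->len[i];
    r->cons++;
    return 0;
}
```

Because each side writes only its own index, the two processes need no locks, which is what makes the shared-memory path cheap enough to beat the kernel path.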
vNIC implementation
Virtq-PMD driver (~4K LOC):
- virtio-net device with an extension API and a PV-based NIC (virtio-net) API
- Global shared-memory-based packet transmission on hugetlb pages
- UNIX-domain-socket-based control messages: event notification (link status, finalization)
- Polling-based mechanism to check the opposite device
QEMU (~1K LOC modification):
- virtio-net-ipc device on the shared memory space
- Shared-memory-based device mapping
[Diagram: guest VM (NW apps, ETHDEV/virtio-net PMD over the virtio-net device) and DPDK-enabled vswitch (software dataplane, ETHDEV/virtq-PMD) sharing global shared memory/hugetlb, with a UNIX domain socket carrying kick/queue-address messages and event notifications via the virtio-net-ipc device]
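The split described above keeps bulk data in shared memory while small control events (link status, finalization) travel over a UNIX domain socket. A minimal sketch of such a control channel follows; the message layout and names are hypothetical, not the actual virtio-net-ipc wire format:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical control-message types for the vNIC control channel. */
enum ctrl_type { CTRL_LINK_STATUS = 1, CTRL_FINALIZE = 2 };

struct ctrl_msg {
    uint16_t type;             /* one of enum ctrl_type */
    uint16_t value;            /* e.g. link up = 1 / down = 0 */
};

/* Notify the peer (guest PMD or vswitch) of an event. */
static int ctrl_notify(int fd, uint16_t type, uint16_t value)
{
    struct ctrl_msg m = { type, value };
    return write(fd, &m, sizeof m) == (ssize_t)sizeof m ? 0 : -1;
}

/* Blocking read of one control message from the peer. */
static int ctrl_wait(int fd, struct ctrl_msg *m)
{
    return read(fd, m, sizeof *m) == (ssize_t)sizeof *m ? 0 : -1;
}
```

Keeping notifications off the data path means the polling loop never blocks; the socket is consulted only for rare events such as a peer going down.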
Virtq-PMD connection sequence (Virtq PMD ↔ virtio-net-ipc device ↔ virtio-net device ↔ virtio-net PMD)
1. Initialization phase: initialization request → response
2. Configuration phase: shared-memory configuration request → mmap() → response
3. Run loop (interrupt mode or legacy mode): update notifications on the avail and used ring queues
4. Finalization phase (initiated from either side via virtio-net-ipc): finalization request (avail ring queue) → device close → response
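The phases above form a strict order: configuration is only legal after initialization, and the run loop only after the shared memory is mapped. A toy state machine capturing that ordering (states and event names are ours, not the driver's):

```c
#include <assert.h>

/* Phases of the Virtq-PMD connection sequence. */
enum vq_state { VQ_INIT, VQ_CONFIG, VQ_RUN, VQ_CLOSED };
enum vq_event { EV_INIT_OK, EV_SHM_MAPPED, EV_FINALIZE };

/* Advance the connection; returns 0 on a legal transition, -1 if the
 * event is not valid in the current phase. */
static int vq_step(enum vq_state *s, enum vq_event e)
{
    switch (*s) {
    case VQ_INIT:   if (e == EV_INIT_OK)    { *s = VQ_CONFIG; return 0; } break;
    case VQ_CONFIG: if (e == EV_SHM_MAPPED) { *s = VQ_RUN;    return 0; } break;
    case VQ_RUN:    if (e == EV_FINALIZE)   { *s = VQ_CLOSED; return 0; } break;
    case VQ_CLOSED: break;     /* terminal state */
    }
    return -1;
}
```

Rejecting out-of-order events is what lets either side detect a misbehaving or crashed peer and tear the device down cleanly.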
Performance evaluation
Evaluation setting and memory performance
Hardware configuration:
- CPU: Intel Xeon E5-2697 v2 (2.7 GHz) x 2
- CPU cores: 12 per socket, Hyper-Threading off (24 total)
- Memory: DDR3-1866 RDIMM ECC 8 GB x 8 (64 GB total)
- Motherboard: Supermicro X9DRH-7TF (Intel C602), Intel VT-x enabled
Software configuration:
- Distribution: Ubuntu 12.04.4 LTS; kernel: Linux 3.2.0-58-generic x86_64
- QEMU 1.0, GCC 4.8.2, DPDK 1.6.0 (testpmd apps)
- isolcpus: 2-11, 14-23
- Hugetlbfs: 1-GB pages x 40 for KVM and testpmd
- Guest VM: 4 CPU cores, 8 GB memory
CPU assignment (logical core → process):
- 0-3: none; 4-7: host testpmd; 8-11: QEMU and guest-VM testpmd (using all of 8-11); 12-23: none
[Chart: rte_memcpy vs. memcpy, seconds per byte vs. packet size (bytes)]
Performance benchmark
- Micro-benchmarking tool: testpmd, a polling-based bridge app that reads packets from one NIC and writes them to another in both directions
- null-PMD: a DPDK-enabled dummy PMD that generates packets from a memory buffer and discards packets into a memory buffer
[Diagrams: four measured configurations, each with the measurement point at the null PMD — (1) bare metal: testpmd on host with null PMD and virtq PMD; (2) virtq-PMD: testpmd on guest VM (null PMD, virtio-net PMD) over a virtqueue to virtq PMD in testpmd on host; (3) vhost app: testpmd on guest VM (virtio-net PMD) to a vhost application on host; (4) kernel driver: testpmd on guest VM (virtio-net driver) through the TAP driver to pcap PMD in testpmd on host]
Performance evaluation
[Charts: throughput in Mpps and Gbps vs. packet size (64-1536 bytes) for virtq PMD, vhost app, and Linux kernel driver, each with offloads on/off]
- Virtq-PMD achieved great performance: 62.45 Gbps (7.36 Mpps) unidirectional and 122.90 Gbps (14.72 Mpps) bidirectional throughput
- 5.7x faster than the Linux driver at 64 B; 2.8x faster at 1500 B
- Virtq-PMD also outperformed the vhost app at large packet sizes
Conclusion
- Virtq-PMD: a high-performance virtio-net-based vNIC framework for NFV with a hypervisor domain
- Achieved 122.90 Gbps (14.72 Mpps) throughput
- Supports both DPDK APIs and legacy network I/O APIs
- Operation-aware extension support
- Same overhead at any packet size
Please try and fork me
Virtq-PMD:
- https://github.com/lagopus/virtio-net-ipc-qemu-1.0
- https://github.com/lagopus/virtq-pmd-dpdk-1.6.
Lagopus vswitch: yet another high-performance SDN/OpenFlow switch:
- http://lagopus.github.io/
- https://github.com/lagopus/lagopus