Jun (Jim) Xu jun.xu@huawei.com Princial Engineer, Futurewei Technologies, Inc.
Linux K/QEMU Switch/Router NFV
Linux IP stack in Kernel ll lications will communicate via socket Limited raw socket alications Perfect world (really?) User Sace aache TCP dum Socket Interface L4 L3 L2
K (Kernel-based Virtual Machine) is a virtualization infrastructure for the Linux kernel that turns it into a hyervisor, which was merged into the Linux kernel mainline in February 2007 * Suorts multile architectures Common use in network areas and ISP/SP K inherits majority of Linux Kernel functions, including its IP stack P P P P P P P P K/Linux Bare Metal TCP/IP Stack IO Driver Memory Management Process scheduling Hyervisor Memory Management Process scheduling TCP/IP Stack IO Driver More. * From wikiedia
Traffic attern in DC: otraffic across within the host otraffic across hosts otraffic aggregate to core and goes to Edge Most traffics are the first two tyes (east-west) To handle the first two cases, virtual switch is introduced Huawei CE12800 Huawei CE12800 Huawei CE5800 Huawei CE5800 Intel Server Intel Server vswitch 11.1.1.1/24 11.1.1.2/24 11.1.1.1/24 11.1.1.2/24
Huawei CE12800 Huawei CE12800 Huawei CE5800 Huawei CE5800 Intel Server Intel Server Intel Server vswitch Intel Server vswitch 11.1.1.1/24 11.1.1.2/24 11.1.1.1/24 11.1.1.2/24 VxLN, STT, NVGRE can be used in virtual switch Distributed Router can be deloyed in the host as well
Huawei CE5800 Switch NIC Hyervisor Linux Kernel Virtual switch (e.g. OVS kernel module) 11.1.1.1/24 11.1.1.2/24 There is one oen source virtual switch OVS Current OVS suitable for endoint virtualization Perfect World! ( really?)
End Node 1 VNF1 LB VNF2 Firewall VNF3 nat VNF4 isec Virtualization Layer VNF5 router End Node 2 Bare Metal VNFs will be executed in s (for now). The acket erformance along with the functions of the virtual switches will ut significant imact on the success of the NFV These introduce new challenges Reference to www.etsi.org about NFV
Physical Switch Performance Challenges NIC Hyervisor Linux Kernel Virtual switch (e.g. OVS kernel module) 11.1.1.1/24 11.1.1.2/24
From: Lothar Braun, lexander Didebulidez, etc., Comaring and Imroving Current Packet Caturing Solutions based on Commodity Hardware in Internet Measurement Conference, 2010 htt://conferences.sigcomm.org/imc/2010/aer s/206.df
Intel Dual Core @1600Mhz 50% CPU Utilization 1.488 Ms Tx rate Test result shows NETMP with 60B acket costs ~1000cycles/acket * Netma is a oen source at htt://info.iet.unii.it/~luigi/netma/
K/Linux Kernel NIC NIC Virtual Switch/Virtual Router IP stack vnic vnic vnic IP IP IP SR-IOV is a ossible solution, but yet introduces other roblems
Huawei CE12800 Huawei CE12800 Huawei CE5800 Huawei CE5800 Intel Server K/Linux Kernel Virtual Switch/Virtual Intel Router Server 11.1.1.1/24 11.1.1.2/24 vnic 11.1.1.2 /24 IP vnic IP vnic In NFV, may not be the endoint anymore
User Sace Mirror DHCP IPv4 aache TCP dum Socket Interface LB NT IPv6 GRE MPLS Classifier QOS Redirect dd (Distributed) Routing dd MPLS dd (Distributed) Firewall dd QOS dd IP Filter dd Packet Classifier dd Redirect dd Load Balance Plus existing L2 functions. ll into (Linux) Kernel VxLN NVGRE VLN RP OS/hyervisor/Network be the monolithic iece of all (Sounds familiar?)
Use SIC to offload the switching function Common aroach from Network Vendors Challenges to address ortability, feature velocity in SICs NIC Hyervisor/ Linux Kernel Switch SIC vnic vnic IP IP IP
Huawei CE5800 Switch NIC K Linux Kernel QEMU-K vnic Guest Kernel Guest Usersace Processes Network Service liance SR-IOV rovides the searated access to a network adator among various PCIe hardware functions. It byasses the virtual switch function in the kernel to allow network traffic directly goes between VF and Combine with UIO, QEMU can access the FV and rovide the IO to Guest Kernel SR-IOV and UIO are orthogonal technology. Other IO solutions are also feasible.
DPDK demonstrates the desired network acket erformance for NFV DPDK usersace design oints to a better software aroach Packet rocessing enhancement rovides further oortunities: Integration of High Bandwidth PCIe Gen3 New VX Extensions Intel Virtualization Technology (Intel VT) Intel Data Direct I/O Technology (Intel DDIO) Quad Core Intel Core i7-3610qe Processor 2.30GHz (E1), 6MB L3 cache Mobile Intel QM77 Exress Chiset (1) Emerald Lake 2 Platform (CRB) DDR3 1600MHz, 2 x dual rank 4GB (total 8GB), Dual-Channel Configuration 2 x Intel 82599 Dual Port PCI-Exress x8 10 Gigabit Ethernet NIC *From Intel DPDK. For DPDK reference to htt://www.intel.com/go/ddk
Huawei CE5800 Switch K Linux Kernel NIC Usersace Virtual Switch/Router vnic QEMU-K QEMU-K Guest Kernel Guest Usersace Processes Guest Kernel Guest Usersace Processes Network Service liance K/Linux Kernel back to what it is designed for, and yet robust. Usersace Virtual Switch/Router uses UIO to directly access Physical NIC, and benefits from DDIO, PCIe Gen3, IOTLB, etc Usersace Virtual Switch/Router rovides vnic for for accomlishing the inter- communication Elastic erformance with multi-core suort.
Huawei CE5800 Switch K Linux Kernel NIC Usersace Virtual Switch/Router vnic QEMU-K Guest Kernel Guest Usersace Processes QEMU-K Guest Kernel Guest Usersace Processes Network Service liance Usersace acket ath can be directly from to Zero coy is ossible for inter- ackets in the same host when otimizing with Guest kernel UIO, and Frontend Driver in QEMU.
Suort more IO tyes Less intrusive IO Feature Rich Usersace Switch/Router Multi-core exansion Multile Instances to suort Multi-tenancy
Monolithic hyervisors may not be extensible with NFV deloyment Linux/K/hyervisor can focus on its main tasks, e.g. rocess scheduling, resource management, virtualization Usersace IP stack rovides another aroach for easy develoment environment, high acket erformance, and comatible communication suort.
Reference DPDK htts://01.org/acket-rocessing/overview/ddk-detail NETMP htt://info.iet.unii.it/~luigi/netma/ PR_RING htt://www.nto.org/roducts/f_ring/ K htt://www.linux-kvm.org/age/main_page Contact me at jun.xu@huawei.com