Journal of Information & Computational Science 9: 5 (2012) 1273-1280
Available at http://www.joics.com

VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing

Yuan Tang, Jianping Li, Yuanyuan Huang
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

Abstract

With the emergence of cloud computing, it becomes possible for Virtual Machines (VMs) hosting High Performance Computing (HPC) applications to migrate seamlessly between distributed cloud resources and tightly-coupled cluster resources. However, existing virtual computing environments that integrate VMs with overlay networks cannot support high-performance distributed computing. In this paper, we describe the design and implementation of a virtual overlay network, VON/K, which is integrated with the Kernel-based Virtual Machine (KVM). VON/K imposes negligible latency and bandwidth overheads on 1 Gbps Ethernet networks, providing near-native access to high performance networks.

Keywords: Virtual Machine; KVM; Overlay Network; Cloud Computing

1 Introduction

Virtual Machines (VMs) can greatly simplify cloud and distributed computing by lowering the level of abstraction relative to the traditional model. Moreover, the utility of virtual overlay networks has been clearly demonstrated in the context of tightly-coupled high performance computing and loosely-coupled cloud computing using VMs [1]. In this environment, an application is mapped onto a collection of VMs that are instantiated as needed and interconnected by an overlay network. However, the factor currently limiting the use of distributed computing for tightly-coupled systems is the performance of the virtual overlay network. Current overlay network systems have overhead low enough to effectively host loosely-coupled scalable applications, but their performance is insufficient for tightly-coupled applications [2].
Indeed, for loosely-coupled applications, this concept has readily moved from research to practice [3]. It is well known that cloud computing imposes minimal intra-node overhead, but its network infrastructure imposes significant and frequently unpredictable performance penalties. If the overhead of a virtual network were sufficiently low, it would be practical to use it in tightly-coupled clusters.

Corresponding author. Email address: ytang2222@uestc.edu.cn (Yuan Tang).

1548-7741 / Copyright 2012 Binary Information Press, May 2012
In response to the above limitation, we have designed and implemented VON/K, a virtual overlay network that provides a simple layer 2 abstraction: a collection of a user's VMs appear to be attached to the user's local area network, regardless of their actual locations. VON/K is an overlay implementation that is embedded in the Linux kernel and integrated with the open-source Kernel-based Virtual Machine (KVM) [4]. Our test results demonstrate that VON/K achieves near-native bandwidth and latency on 1 Gbps Ethernet networks. Through the use of a low-overhead overlay network in high-bandwidth, low-latency environments such as current clusters/supercomputers and future data centers, we seek to make it practical to use an overlay network at all times, even when running tightly-coupled applications in such high-end environments. This paper's contribution is a virtual network implementation that extends VMs to clusters and supercomputers with high performance networks.

The rest of the paper is organized as follows. In Section 2, we introduce the overlay network and the KVM virtualization infrastructure. In Section 3, a detailed description of VON/K's design and implementation is given. The performance of VON/K is evaluated in Section 4. The last section presents conclusions and future work.

2 Overlay and KVM

An overlay network with a layer 2 abstraction provides a powerful model for virtualizing wide-area distributed computing resources on collections of VMs. The following subsections discuss the overlay network, KVM, and their integration.

2.1 Virtual Overlay Network

An overlay network is a computer network built on top of another network. The utility of overlay networks has been recognized in wide-area distributed environments such as VIOLIN [5], IPOP [6], and ViNe [7]. In these systems, it is necessary for administrators to set up the overlay links.
The VIOLIN project aims to build a service-on-demand grid environment based on virtual server technology and virtual networking. It allows the dynamic setup of an arbitrary private link-layer and network-layer virtual network among virtual servers. IPOP is a system that leverages P2P technology to create virtual IP networks. ViNe, on the other hand, builds IP overlays on top of the Internet. It is similar to a traditional VPN, but solves some of VPN's issues. Perhaps closest to our work is VNET [8], a layer 2 overlay network for virtual machines that provides the abstraction of a virtual LAN. VNET is among the fastest virtual networks implemented using user-level code, achieving 21.5 MB/s [9] with a 1 ms latency overhead when communicating between Linux 2.6 VMs running on host machines with dual 2.0 GHz Xeon processors. These speeds are sufficient for its purpose of providing virtual networking for wide-area and/or loosely-coupled distributed computing. They are not, however, sufficient for use within a cluster at gigabit or greater speeds. VNET is fundamentally limited by the kernel/user space transitions needed to handle a guest's packet sends and receives.

2.2 KVM Virtual Machine Monitor

KVM is a Virtual Machine Monitor (VMM) that is embedded in the Linux kernel and supports
native virtualization using AMD-V or Intel VT-x.

Fig. 1: KVM-based structure

By adding virtualization capabilities to a standard Linux kernel (see Fig. 1), commonality and scalability are brought into the virtualized environment. Moreover, paravirtualization is also available for Linux and Windows guests through the Virtio [10] framework. By being integrated into the kernel, the KVM hypervisor automatically tracks the latest hardware and scalability features without additional effort. A normal Linux process has two modes of execution: kernel and user. KVM adds a third mode: guest mode (which has its own kernel and user modes, but these do not concern the hypervisor at all). Virtual computing environments integrating KVM with virtual overlay networks play an important role in distributed computing and cloud computing.

2.3 Integrating Overlay with KVM

VON/K is an implementation of an overlay network that is integrated with the KVM virtual machine monitor. This model supports adaptive computing on distributed computing resources, and parallel execution in a collection of VMs. VON/K is designed as a configurable overlay network that presents a simple layer 2 networking abstraction: the user's VMs appear to be located on the user's local area Ethernet network, regardless of their actual locations or the complexity of the network topology. The VON/K layer can also be used to monitor VM traffic and the performance of the underlying physical network. We evaluate VON/K, finding that it imposes negligible overheads on 1 Gbps Ethernet networks. Moreover, VON/K could be implemented in other VMMs, and as such provides a proof of concept that virtual networking for VMs, with performance overheads low enough to be inconsequential even in a cluster environment, is clearly possible.
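As a concrete illustration of KVM's kernel integration, the hypervisor exposes its interface to user space through the /dev/kvm device file. The following minimal sketch (not from the paper) queries the KVM API version; the function name is ours, and the ioctl number is hardcoded from <linux/kvm.h> so that no kernel headers are required.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>

/* KVM_GET_API_VERSION = _IO(0xAE, 0x00) from <linux/kvm.h>,
   hardcoded here to keep the sketch self-contained. */
#define VON_KVM_GET_API_VERSION 0xAE00

/* Returns the KVM API version (12 on every kernel since 2.6.22),
   or -1 if /dev/kvm is unavailable on this machine. */
int kvm_api_version(void)
{
    int fd = open("/dev/kvm", O_RDWR);
    if (fd < 0)
        return -1;
    int ver = ioctl(fd, VON_KVM_GET_API_VERSION, 0);
    close(fd);
    return ver;
}
```

Because the API is reached through an ordinary device file and ioctl, any kernel-resident component (such as an overlay network) can cooperate with KVM without user-space transitions on the data path.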
3 VON/K Design and Implementation

We now describe how VON/K has been architected and implemented in the context of KVM.
3.1 VON/K Architecture

The overall architecture of VON/K is shown in Fig. 2. KVM can run multiple guest VMs, and each VM provides a virtual (Ethernet) NIC to its guest. Linux injects all network packets into the physical network and receives packets from the network through the Ethernet card. For high performance applications, the virtual NIC conforms to the virtio interface [10]. The KVM virtio network device is created by the kernel and serves as the Ethernet network card for guest mode. All network packets sent to the outside network are delivered to VON/K by the virtio network card. These packets are then routed by VON/K inside the kernel, and are either injected into the virtio devices of other guests on the same physical host, or sent to the outside network through the VON/K Bridge. The guest VMs have modifications to support PCI passthrough (DMA addresses are offset) that bridge the guest VMs' network devices and the physical NIC. Using the PCI passthrough mechanism, the VMs have direct access to the Ethernet devices.

Fig. 2: VON/K architecture

VON/K comprises three major components: the Core component, responsible for packet routing, and the Bridge and Control components, both of which are implemented as kernel modules in the Linux kernel.

3.2 VON/K Core

The VON/K Core component, directly embedded in the Linux kernel, is essentially a packet processing and forwarding system. It is responsible for routing Ethernet packets between virtual NICs on the same machine, or between this machine and remote VON/K Cores on other machines. All forwarding rules are based on layer 2 addresses. VON/K forwards incoming packets from various sources, including the outside network and the virtual network devices.
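The layer 2 forwarding decision can be sketched as follows. This is an illustrative reconstruction, not the authors' code: all names (von_route, von_lookup, and so on) are hypothetical, and a real kernel implementation would hash the MAC rather than scan linearly.

```c
#include <stdint.h>
#include <string.h>

/* A frame's destination is either a local virtio interface
   or a bridge link to a remote VON/K peer. */
enum von_dest_type { VON_DEST_IFACE, VON_DEST_BRIDGE };

struct von_route {
    uint8_t mac[6];            /* destination guest MAC */
    enum von_dest_type type;   /* local NIC or overlay link */
    int id;                    /* interface index or link index */
};

#define VON_TABLE_MAX 64
static struct von_route table[VON_TABLE_MAX];
static int table_len;

/* Install a forwarding rule for a guest MAC. */
void von_add_route(const uint8_t mac[6], enum von_dest_type t, int id)
{
    if (table_len < VON_TABLE_MAX) {
        memcpy(table[table_len].mac, mac, 6);
        table[table_len].type = t;
        table[table_len].id = id;
        table_len++;
    }
}

/* Look up the route for a frame, which starts with the destination MAC.
   Returns NULL for an unknown destination (policy: flood or drop). */
struct von_route *von_lookup(const uint8_t *frame)
{
    for (int i = 0; i < table_len; i++)
        if (memcmp(table[i].mac, frame, 6) == 0)
            return &table[i];
    return NULL;
}
```

The key property is that the dispatch is purely a layer 2 lookup: neither the guest nor the Core needs any knowledge of IP topology to decide between local delivery and the bridge path.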
The internal packet processing logic of VON/K is illustrated in Fig. 3, and the routing logic and data structures are shown in Fig. 4. Routing rules are maintained in a routing table indexed by the MAC addresses of the source and destination virtual machines. The destination of a packet can be either a VON/K Interface or a VON/K Bridge, both of which are virtual devices. A VON/K Bridge connects to an overlay destination, the next UDP/IP-level hop of the packet on some other machine. For an interface destination, the VON/K Core delivers the packet directly to the corresponding virtual NIC.
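The bridge path carries a raw Ethernet frame, together with its destination link information, inside a UDP payload to the remote VON/K peer, and can be sketched as below. This is an assumption-laden illustration: the paper does not specify the wire format, so the 4-byte big-endian link-id prefix and the function names are ours.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Pack: [link_id, 4 bytes big-endian][raw Ethernet frame].
   Returns total payload length, or 0 if the buffer is too small. */
size_t von_encap(uint32_t link_id, const uint8_t *frame, size_t frame_len,
                 uint8_t *out, size_t out_cap)
{
    if (out_cap < 4 + frame_len)
        return 0;
    out[0] = (uint8_t)(link_id >> 24);
    out[1] = (uint8_t)(link_id >> 16);
    out[2] = (uint8_t)(link_id >> 8);
    out[3] = (uint8_t)(link_id);
    memcpy(out + 4, frame, frame_len);
    return 4 + frame_len;
}

/* Unpack on the receiving peer: recover the link id and a pointer to the
   inner frame. Returns the inner frame length, or 0 on a short payload. */
size_t von_decap(const uint8_t *payload, size_t len,
                 uint32_t *link_id, const uint8_t **frame)
{
    if (len < 4)
        return 0;
    *link_id = ((uint32_t)payload[0] << 24) | ((uint32_t)payload[1] << 16)
             | ((uint32_t)payload[2] << 8)  |  (uint32_t)payload[3];
    *frame = payload + 4;
    return len - 4;
}
```

The receiving peer uses the recovered link id to select a routing-table entry, then hands the inner frame back to its own Core for layer 2 delivery.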
Fig. 3: VON/K core logic

Fig. 4: VON/K routing logic

In addition to the front-end virtio drivers (implemented in the guest) and the back-end virtio drivers (implemented in KVM), VON/K virtio defines two layers to support guest-to-KVM communication. To attach front-end drivers to back-end drivers, the VON/K virtio network driver needs three virtual queue interfaces: one for receiving, one for transmitting, and a third for the control component. In the Linux kernel, a special virtio network device called the VON/K virtio device is created and configured. Virtio is an abstraction for a set of common emulated devices in a paravirtualized hypervisor, developed in recent Linux kernels. This design allows the hypervisor to export a common set of emulated devices and make them available through a common API. The VON/K virtio device is used as the communication device between the VON/K Bridge and the VON/K Core components.

The VON/K Core sends on all raw Ethernet packets destined for VMs outside this host machine. Along with each raw Ethernet packet, the destination link information is attached. These are sent to the VON/K Bridge through the VON/K virtio device. In the VON/K Bridge, the raw Ethernet packets from the VON/K Core are processed and sent to the VON/K peer on the destination machine through the Ethernet device. If a packet's destination VM resides in a physical machine located in the same cluster or local network as the source machine, the packet can be injected into the Ethernet without any further encapsulation.
This works on the condition that promiscuous mode is turned on for the Ethernet card in Linux. The destination physical machine will then receive all packets transmitted on the local network and pass them to the VON/K Core, where the packets are routed.

3.3 VON/K Control

VON/K supports configuration from user-level applications. The VON/K Control component allows remote and local configuration of interfaces and routing rules, so that an overlay network can be constructed and changed. The VON/K configuration console allows local control to be provided from a file or command. The VON/K Control component is responsible for validity checking before it transfers a new configuration to the VON/K Core through the bridge device.

4 Performance Evaluation

We consider communication between two machines whose NICs are directly connected. In the virtualized configuration, the guests and performance testing tools run on top of Linux with VON/K. In the native configuration, the same guest environments run directly on the hardware.

4.1 Testbed

The purpose of our performance evaluation is to determine how close VON/K comes to native throughput and latency in the most demanding (lowest latency, highest throughput) hardware environments. We measure round-trip latency using ping, which sends ICMP packets. Throughput is measured using the ttcp tool. Our testbed consists of two physical machines, called hosts. Each host has a dual quad-core 2.0 GHz AMD processor, 4 GB RAM, and a 1 Gbps Ethernet NIC. We considered three configurations:

Native: VON/K and KVM are not used; Linux runs directly on the host machines.

Passthrough: KVM runs on each host machine, while VON/K is not used. A single VM, configured with a single CPU core and 1 GB of RAM, has direct access to the Ethernet devices via a PCI passthrough device.

VON/K: This configuration corresponds to the architectural diagram given in Fig. 2.
The kernel running in the VM uses the same passthrough model as the above configuration.

4.2 Results Analysis

Fig. 5 shows the performance test results, comparing the latency and throughput of VON/K with the native and passthrough configurations. The end-to-end latencies, averaged over 100 measurements, are shown in Fig. 5 (a). Approximately half of the increase in latency compared to native performance is due to passthrough Ethernet device access, as can be seen from the figure. A significant component of this is due to current interrupt virtualization limitations in the processor: KVM exits on all interrupts, including those from the passthrough
device. For the latter, the interrupts are immediately re-vectored into the VM, but nonetheless at least one VM exit/entry cost is borne on each packet send or receive by the passthrough device in the Passthrough or VON/K configurations. Although latency is an important network metric, given the primary goal of VON/K, which is high-throughput virtual overlay networking for cluster-based local area networking, we care more about throughput than latency in our test case. The end-to-end throughput of VON/K on the 1 Gbps network is shown in Fig. 5 (b). Ttcp is configured to use 1450-byte packets sent as fast as possible over 60 seconds. For the 1 Gbps network, VON/K has no difficulty achieving native throughput. We also compared the throughput of the three configurations with different UDP packet sizes, as illustrated in Fig. 6. When the size of the UDP packet approaches a multiple of 1500 bytes (a standard MTU), the throughput reaches its highest point. As the UDP packet size increases, the trend lines indicate that throughput becomes more stable, and VON/K achieves near-native throughput.

Fig. 5: VON/K performance test. (a) End-to-end latency; (b) End-to-end throughput

Fig. 6: UDP throughput test

5 Conclusion

In this paper, we have described the VON/K model of an overlay network in a distributed computing environment. VON/K's design goal is to achieve near-native throughput and latency on 1 Gbps
Ethernet and on other high performance interconnects. To achieve high performance, VON/K relies on KVM virtualization in the Linux kernel and high-performance network I/O. Virtualization enables VON/K to provide a simple and flexible layer 2 Ethernet network abstraction across a large range of systems. We are currently working to further enhance the performance of VON/K so that tightly-coupled applications can seamlessly migrate to and from heterogeneous data center networks in cloud and HPC environments. Future functionality enhancements will focus on supercomputers.

References

[1] J. Lange, K. Pedretti, P. Dinda, C. Bae, P. Bridges, P. Soltero, A. Merritt, Minimal-overhead virtualization of a large scale supercomputer, Proceedings of the 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), March 2011

[2] S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, D. Epema, An early performance analysis of cloud computing services for scientific computing, Tech. Rep. PDS-2008-006, Delft University of Technology, Dec. 2008

[3] E. Walker, Benchmarking Amazon EC2 for high performance scientific computing, USENIX ;login:, 3(8), Oct. 2008, 18-23

[4] KVM homepage, www.linux-kvm.org

[5] P. Ruth, X. Jiang, D. Xu, VIOLIN: Virtual internetworking on overlay infrastructure, Tech. Rep. CSD TR 03-027, Purdue University, July 2003

[6] A. Ganguly, A. Agrawal, P. O. Boykin, R. Figueiredo, IP over P2P: Enabling self-configuring virtual IP networks for grid computing, Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2006

[7] M. Tsugawa, J. A. B. Fortes, A virtual network (ViNe) architecture for grid computing, Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2006

[8] A. I. Sundararaj, P. A.
Dinda, Towards virtual networks for virtual machine grid computing, Proceedings of the 3rd USENIX Virtual Machine Research and Technology Symposium, May 2004

[9] J. Lange, P. Dinda, Transparent network services via a virtual traffic layer for virtual machines, Proceedings of the 16th IEEE International Symposium on High Performance Distributed Computing, June 2007

[10] R. Russell, virtio: Towards a de-facto standard for virtual I/O devices, ACM SIGOPS Operating Systems Review, 42(5), July 2008