Blueprints: Configuring the network for virtual machines in a data center
Note: Before using this information and the product it supports, read the information in "Notices."

Second Edition (March 2013)

Copyright IBM Corporation 2011, 2013. US Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents

Chapter 1. Scope, requirements, and support
Chapter 2. Network setup
    Managing virtual machines with libvirt
    Considerations for live migration
Chapter 3. The VEPA scenario
    Adding a VEPA interface to a virtual machine
    Adding 802.1Qbg VSI types to a VEPA interface
    Installing and starting the lldpad daemon
    Specifying a VSI type
    VM startup
    VM migration
    Migrating your VM
    Shutting down a virtual machine
    Configuring a bonding device
Chapter 4. The VEB scenario
    Filtering
        Adding a filter reference to the domain XML
        The no-spoofing filter
    The SNMP bridge management information base
        Installing net-snmp
        Installing net-snmp-perl
        Installing net-snmp-utils
        Configuring and verifying net-snmp
Chapter 5. Relevant standards and protocols
Chapter 6. Shared storage pools
Notices
    Trademarks
Chapter 1. Scope, requirements, and support

This blueprint applies to System x running Linux. Learn how to configure virtual machines to connect more directly to the physical network adapter (Virtual Ethernet Port Aggregator/VEPA mode) and the necessary steps to use 802.1Qbg to configure the physical switches. You can also learn how to configure virtual machines connected to a virtual switch to use network filters (Virtual Ethernet Bridge/VEB mode). Key tools and technologies discussed in this blueprint include libvirt, net-snmp, and the lldpad daemon.

Systems to which this information applies

System x running Linux

New in the second edition (March 2013)

- Support for 802.1Qbg over bonding devices in Red Hat Enterprise Linux 6.4

Intended audience

This blueprint is intended for advanced Linux system administrators and programmers who need to configure virtual and physical switches in a Red Hat Enterprise Linux 6.1 KVM hypervisor environment.

Scope and purpose

To complete the instructions in this blueprint, the virtual machines must already be defined. Only the differences from a normal libvirt network setup are described. The process of defining virtual machines and all the details of a libvirt network setup are outside the scope of this blueprint.

This blueprint provides information about configuring libvirt to use macvtap devices for virtual network interfaces, and about how to use 802.1Qbg Virtual Station Interface (VSI) types with physical switches. This document also provides information about configuring the network filter for virtual switches in a libvirt-controlled environment, and about making the necessary changes to the appropriate XML files with the help of the virsh tool.

Test environment

The instructions in this blueprint were tested on System x HS22 blade systems running the Red Hat Enterprise Linux 6.1 KVM hypervisor with already defined virtual machines. The HS22 blade systems were connected to a VEPA and 802.1Qbg enabled switch.

Hardware, software, and other prerequisites

Host machines must be running either Intel VT chipsets or AMD-V chipsets that support hardware-assisted virtualization. For more information about hardware requirements, see the "Enabling KVM support on your hardware" section of the Quick Start Guide for installing and running KVM.

Author names

Gerhard Stenzel
Jens Osterkamp
Thomas Richter

Other contributors

Santwana Samantray
Heather Crognale
IBM Services

Linux offers flexibility, options, and competitive total cost of ownership with a world-class enterprise operating system. Community innovation integrates leading-edge technologies and best practices into Linux. IBM is a leader in the Linux community, with over 600 developers in the IBM Linux Technology Center working on over 100 open source projects in the community. IBM supports Linux on all IBM servers, storage, and middleware, offering the broadest flexibility to match your business needs. For more information about IBM and Linux, go to ibm.com/linux (https://www.ibm.com/linux).

IBM Support

Questions and comments regarding this documentation can be posted on the developerWorks Linux Virtualization Blueprint Community Forum. The IBM developerWorks discussion forums let you ask questions and share knowledge, ideas, and opinions about technologies and programming techniques with other developerWorks users. Use the forum content at your own risk. While IBM attempts to provide a timely response to all postings, the use of this developerWorks forum does not guarantee a response to every question that is posted, nor do we validate the answers or the code that are offered.
Chapter 2. Network setup

The use of virtual machines in physical host machines leads to a situation where there is no longer a strict separation between host and network administration, because some form of virtual network setup is needed to connect the virtual machines to the physical network. Depending on the type of workload on the virtual machines and the type of communication between them, the spectrum of the network setup for the guest within the hosts can range from a dedicated network card for a guest to a fully administratable virtual switch with advanced routing and filtering capabilities.

In this blueprint, two complementary scenarios are considered:

- The VEPA scenario, or network-centric scenario, where the administration of the network happens in one or more physical switches, and where the virtual machines are connected to the physical network as closely as possible.
- The VEB scenario, or host-centric scenario, where all administration, including the administration of the virtual network and switches, happens in the hypervisor host machines.

Managing virtual machines with libvirt

Both scenarios described in this blueprint assume that you will use libvirt. Many libvirt tasks, like defining, starting, or migrating a virtual machine, can be accomplished using virsh, the command-line utility. The graphical interface to libvirt, virt-manager, does not currently support all of the features discussed in this blueprint. If you decide to use virt-manager, though, you can do so with some manual editing of the virtual machine definition (the details of which are outside the scope of this blueprint). For more information about using libvirt, see http://libvirt.org/docs.html.

Considerations for live migration

Live migration of virtual machines requires that the virtual disk of a virtual machine is stored on a shared resource accessible to both the source and the target host, for example an NFS-based or iSCSI-based storage pool. For more information about live migration of virtual machines, see http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/Virtualization/
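Because virsh drives both scenarios, a quick sanity check before attempting a live migration is to confirm that the same storage pool is active on the source host and on the target host. The following sketch assumes the pool named dcnserver (defined in Chapter 6) and a target host named c7b5, as used in the migration examples later in this blueprint; adapt both names to your environment:

# On the source host: list the active storage pools
virsh pool-list

# On the target host, over SSH: the same pool should be listed
virsh -c qemu+ssh://c7b5/system pool-list

Both commands should report the shared pool in the "active" state before you start a migration.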
Chapter 3. The VEPA scenario

In the VEPA scenario, the virtual machines are connected to the physical network as closely as possible, so that the virtual network interface is directly connected to the physical network interface of the host without going through the virtual bridge of the Linux host. All network traffic from a virtual machine goes to the switch first, effectively bypassing the internal bridge even if the network traffic is intended for the host or another virtual machine on the same host. Communication between virtual machines on the same host therefore requires that the respective switch port is put in 'reflective relay' mode. This kind of communication will also impact the overall bandwidth of the respective physical interface and the latency of the network traffic.

Adding a VEPA interface to a virtual machine

A VEPA interface is added to a virtual machine by specifying the "direct" interface type in the domain XML. A typical network interface definition looks similar to the following:
<devices>
  ...
  <interface type='network'>
    <mac address='d0:0f:d0:0f:02:01'/>
    <source network='default'/>
    <model type='virtio'/>
  </interface>
  ...
</devices>

Change the interface definition using the virsh edit <virtual machine> command as follows to add a VEPA interface to a virtual machine (using eth2 with VLAN ID 20 in this example).

Note: If you are using BNT switches, the VLAN ID should be in the range of 2 to 4094.

<devices>
  ...
  <interface type='direct'>
    <mac address='d0:0f:d0:0f:02:01'/>
    <source dev='eth2.20' mode='vepa'/>
    <model type='virtio'/>
  </interface>
  ...
</devices>

This change causes libvirt to create a macvtap device, tie it to the specified physical network interface, and then pass the macvtap file descriptor to QEMU upon virtual machine start. This sequence works similarly for virtual machine shutdown and migration.

While VEPA mode provides the basis for network-centric management of the virtual machine's network activities, an additional step is necessary to actually achieve it: enforcing port policies, in this case through 802.1Qbg VSI types.

Adding 802.1Qbg VSI types to a VEPA interface

The VSI types are defined and stored in a database with an associated unique ID. The switch advertises its support for 802.1Qbg capabilities, including reflective relay. On the KVM host, the link layer discovery protocol agent daemon (lldpad) is used to configure the port in VEPA mode (hairpin mode), effectively offloading packet-switching functions from the KVM host to the adjacent switch. The lldpad daemon is notified by libvirt if VSI types are to be applied to a specific MAC/physical port pair. The unique ID of the VSI type and the virtual NIC MAC/VLAN information is registered with the adjacent switch port through lldpad, and the switch retrieves the VSI type that is based on the unique ID and associates the rules against the registered MAC/VLAN.

802.1Qbg requires support from the physical switch that is connected to the server. See your switch user guide to determine whether the switch supports 802.1Qbg, and how to configure it. Both IBM blade switches and rack switches support 802.1Qbg. If necessary, upgrade to a newer switch firmware version. Configuration typically involves enabling EVB on the switch port, and configuring connection details for the VSIDB server.
For more information about how to configure a switch for VEPA, see chapter 6 of the IBM Redbooks publication Implementing a VM-Aware Network Using VMready (http://www.redbooks.ibm.com/abstracts/sg247985.html).

Note: IBMNOS switches allow a port to be configured with "tag-pvid". With this configuration, all packets that leave the port are tagged, including the LLDP and ECP packets. This configuration is not recommended for ports on which 802.1Qbg is used, because it is not supported by the current lldpad implementation.

Installing and starting the lldpad daemon

To configure the port in VEPA mode, install and start the link layer discovery protocol agent daemon (lldpad).

Procedure

1. Install the lldpad daemon with the following command:
   yum install lldpad
2. Start the lldpad daemon with the following command:
   service lldpad start
3. Enable the admin status for the interface that is connected to the switch (in this example eth2) with the following commands:
   Note: These two commands address the nearest bridge and the nearest customer bridge.
   lldptool -L -i eth2 adminstatus=rxtx
   lldptool -i eth2 -g ncb -L adminstatus=rxtx
4. Enable transmission of the EVB TLV (edge virtual bridging type-length-value message) and then configure reflective relay mode and capabilities with the following sets of commands for both the nearest bridge and the nearest customer bridge. For Red Hat Enterprise Linux 6.4 and later, lldpad uses only the nearest customer bridge.
   a. For the nearest bridge (unnecessary for Red Hat Enterprise Linux 6.4 and later):
      lldptool -T -i eth2 -V evbcfg -c enabletx=yes
      lldptool -T -i eth2 -V evbcfg -c fmode=reflectiverelay
      lldptool -T -i eth2 -V evbcfg -c capabilities=rte,ecp,vdp
   b. For the nearest customer bridge:
      lldptool -i eth2 -T -g ncb -V evbcfg -c enabletx=yes
      lldptool -i eth2 -T -g ncb -V evbcfg -c fmode=reflectiverelay
      lldptool -i eth2 -T -g ncb -V evbcfg -c capabilities=rte,vdp,ecp
   c. Display the EVB parameters:
      lldptool -t -i eth2 -g ncb -V evbcfg -c enabletx
      lldptool -t -i eth2 -g ncb -V evbcfg -c fmode
      lldptool -t -i eth2 -g ncb -V evbcfg -c capabilities
5. Enable VDP with the following commands:
   Note: These commands address the nearest bridge and the nearest customer bridge.
   Red Hat Enterprise Linux 6.3 and before:
   lldptool -T -i eth2 -V vdp -c enabletx=yes
   lldptool -i eth2 -T -g ncb -V vdp -c enabletx=yes
   Red Hat Enterprise Linux 6.4 and later:
   lldptool -i eth2 -T -g ncb -V vdp -c enabletx=yes
6. Restart the lldpad daemon with the following command:
   service lldpad restart

Results

All the changes made with lldptool have to be made only once; they are saved to the lldpad configuration file and are available to lldpad automatically on the next start. The changes can be verified by examining the /var/lib/lldpad/lldpad.conf file. The settings that result from steps 4 and 5 appear in the tlvid001b3f00 and vdp blocks, under lldp for the nearest bridge and under nearest_customer_bridge for the nearest customer bridge:

dcbx : {
    version = "1.0";
    dcbx_version = 2;
};
nearest_customer_bridge : {
    eth2 : {
        tlvid00000001 : {
            info = "04001B2163BEE8";
        };
        tlvid00000002 : {
            info = "03001B2163BEE8";
        };
        adminstatus = 3;
        tlvid001b3f00 : {
            capabilities = "rte,vdp,ecp";
            fmode = "reflectiverelay";
            enabletx = true;
        };
        vdp : {
            enabletx = true;
        };
    };
    ...
};
lldp : {
    eth2 : {
        tlvid00000001 : {
            info = "04001B2163BEE8";
        };
        tlvid00000002 : {
            info = "03001B2163BEE8";
        };
        adminstatus = 3;
        vdp : {
            enabletx = true;
        };
        tlvid001b3f00 : {
            enabletx = true;
            info = "001B3F0080070000DF0C00000F";
            capabilities = "rte,ecp,vdp";
            fmode = "reflectiverelay";
        };
    };
    ...
};

- The status of the negotiation of EVB capabilities can be queried with lldptool with the following command for the nearest bridge:
  lldptool -t -i eth2 -V evbcfg
  The result of this command should look similar to the following:
  EVB Configuration TLV
      supported forwarding mode: (0x40) reflective relay
      supported capabilities: (0x07) RTE ECP VDP
      configured forwarding mode: (0x40) reflective relay
      configured capabilities: (0x07) RTE ECP VDP
      no. of supported VSIs: 0001
      no. of configured VSIs: 0000
      RTE: 16
- The status of the negotiation of EVB capabilities can be queried with the -g ncb option for the nearest customer bridge:
  lldptool -i eth2 -t -g ncb -V evbcfg
  The result of this command should look similar to the following:
  EVB Configuration TLV
      supported forwarding mode: (0x40) reflective relay
      supported capabilities: (0x07) RTE ECP VDP
      configured forwarding mode: (0x40) reflective relay
      configured capabilities: (0x07) RTE ECP VDP
      no. of supported VSIs: 0001
      no. of configured VSIs: 0000
      RTE: 16

Note: On some Red Hat Enterprise Linux 6.x versions, the configured forwarding mode and configured capabilities are incorrectly displayed as 0.

Specifying a VSI type

Specify the VSI type for a VEPA interface by adding a <virtualport/> element to the domain XML using virsh edit. The parameters of the virtualport element are documented in more detail in the IEEE 802.1Qbg standard. The values are network specific and should be provided by the network administrator. In 802.1Qbg terms, the Virtual Station Interface (VSI) represents the virtual interface of a virtual machine.

Important: Be sure to adapt the attribute values of the <parameters/> element to your specific network environment.

<devices>
  ...
  <interface type='direct'>
    <mac address='d0:0f:d0:0f:02:01'/>
    <source dev='eth2.20' mode='vepa'/>
    <virtualport type='802.1Qbg'>
      <parameters managerid='12' typeid='0x123456'
                  typeidversion='1' instanceid='09b00c53-8b5c-4eeb-8f00-d847aa05191b'/>
    </virtualport>
    <model type='virtio'/>
  </interface>
  ...
</devices>

VM startup

When the virtual machine is started, libvirt parses the virtualport type, determines the underlying physical device and VLAN ID, and sends a netlink message with an 'ASSOCIATE' request to the lldpad daemon. The lldpad daemon then sends an ASSOCIATE VDP message on the physical interface to the switch. Depending on the success or failure of registering the VSI type, you will see one of the following messages.

Table 1. System status messages

Output message shown:
    virsh start testvm3
    error: Failed to start domain testvm3
    error: internal error sending of PortProfileRequest failed
Status: The lldpad daemon is not running.

Output message shown:
    virsh start testvm3
    error: Failed to start domain testvm3
    error: internal error port-profile setlink timed out
Status: The environment does not have an 802.1Qbg enabled switch.

Output message shown:
    virsh start testvm3
    Domain testvm3 started
Status: The virtual machine is successfully created.

You can verify the status of the association of the profile with the following command:

lldptool -t -i eth2 -V vdp mode

The output will look similar to the following:

mode = mode: 2 (VDP_MODE_ASSOCIATED)
    response: 0 (success)
    state: 2 (VSI_ASSOCIATED)
    mgrid: 12
    id: 1193046 (0x123456)
    version: 3
    instance: 09b00c53-8b5c-4eeb-8f00-d847aa05191b
    mac: d0:0f:d0:0f:02:01
    vlan: 20
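The startup checks above can be combined into a short shell sketch. It assumes the domain testvm3 and the physical interface eth2 from the examples in this chapter; the macvtap device name is assigned by libvirt and may differ on your host:

# Start the guest; libvirt creates a macvtap device on top of eth2.20
virsh start testvm3

# List the macvtap devices that libvirt created for direct interfaces
ip -o link show | grep macvtap

# Confirm that the VSI association with the switch succeeded
lldptool -t -i eth2 -V vdp mode | grep VDP_MODE_ASSOCIATED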
VM migration

Live migration allows you to migrate a virtual machine from one host to another host while the virtual machine continues to run seamlessly. The virtual disk of the virtual machine is located in a shared storage pool and should be mounted on both the source and the destination. Configuring SSH to authenticate without requiring passwords will simplify the migration process. For more information about shared storage pools, see Chapter 6, Shared storage pools.

Migrating your VM

Use this procedure to migrate your virtual machines.

Before you begin

This procedure assumes that SSH keys are already set up for password-less authentication. If you do not have SSH keys set up for password-less authentication, do so before continuing with this procedure.

Procedure

1. Verify that password-less authentication is working:
   virsh -c qemu+ssh://c7b5/system list --all
   Id Name     State
   ----------------------------------
   -  testvm3  shut off
2. Verify that the virtual machine is running locally:
   virsh list --all
   Id Name     State
   ----------------------------------
   23 testvm3  running
3. Migrate the virtual machine to the destination host:
   virsh migrate testvm3 qemu+ssh://c7b5/system
4. Verify that the virtual machine is running on the destination:
   virsh -c qemu+ssh://c7b5/system list --all
   Id Name     State
   ----------------------------------
   17 testvm3  running
5. Migrate the virtual machine back with one of the following command strings:
   - [root@c7b4 ~]# virsh --connect qemu+ssh://c7b5/system \
     migrate testvm3 qemu+ssh:///system
   - virsh --connect qemu+ssh://c7b5/system migrate testvm3 \
     qemu+ssh://c7b4/system

Shutting down a virtual machine

You can shut down a virtual machine either directly from within the virtual machine or by using the virsh command from the host. To shut down the virtual machine from the host, use the following command:

virsh shutdown testvm3

If the command is successful, the following output is displayed:

Domain testvm3 is being shutdown

Configuring a bonding device

Use this procedure to configure a bonding device for lldpad.

Before you begin

Beginning with Red Hat Enterprise Linux 6.4, lldpad supports bonding devices. Only active-backup mode is supported. In this example, a bond device named bond0 has an IP address of 192.168.0.5 and two slaves named eth2 and eth3.

Note: Switch configuration to support a bond device on an HS22 blade has the following characteristics and limitations:
- Each slave device must be connected to a different switch. The switch firmware does not support the EVB protocol on external ports.
- The switches must be interconnected. Use an external port between both switches.
- Enable the external ports on both switches to transfer traffic for the VLAN ID that is defined in the lldpad configuration section.

To configure a bonding device for lldpad, complete the following procedure:

Procedure

1. Install the bonding device driver:
   modprobe bonding mode=active-backup miimon=100
   A device named bond0 is available.
2. Assign an IP address to the bond0 device:
   ifconfig bond0 192.168.0.5 netmask 255.255.255.0 up
3. Create slaves and assign them to the bond0 device:
   ifenslave bond0 eth2 eth3
4. Create a VLAN device that is named bond0.4 on top of the bond0 device:
   vconfig add bond0 4
   A new device that is called bond0.4 is associated with VLAN ID 4.
5. Assign an IP address to the bond0.4 VLAN device:
   ifconfig bond0.4 192.168.4.5 netmask 255.255.255.0 up
   lldpad uses the device bond0 for switch communication.
6. Enable transmission of the EVB TLV for the bond0 device:
   a. For the nearest customer bridge:
      lldptool -i bond0 -T -g ncb -V evbcfg -c enabletx=yes
      lldptool -i bond0 -T -g ncb -V evbcfg -c fmode=reflectiverelay
      lldptool -i bond0 -T -g ncb -V evbcfg -c capabilities=rte,vdp,ecp
      lldptool -i bond0 -T -g ncb -V vdp -c enabletx=yes
      lldptool -i bond0 -L -g ncb adminstatus=rxtx
   b. Display the EVB parameters:
      lldptool -t -i bond0 -g ncb -V evbcfg -c enabletx
      lldptool -t -i bond0 -g ncb -V evbcfg -c fmode
      lldptool -t -i bond0 -g ncb -V evbcfg -c capabilities
      lldptool -i bond0 -t -g ncb -V vdp -c enabletx
      lldptool -i bond0 -l -g ncb adminstatus

lldpad uses the bond device to communicate with the switch. If the active slave fails, the backup slave becomes the new active slave, and communication between the virtual machines, lldpad, and the switch resumes after a short wait period.
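The modprobe, ifconfig, ifenslave, and vconfig commands above configure the bond only for the running system and are lost on reboot. The following is a hedged sketch of a persistent equivalent using the Red Hat Enterprise Linux 6 network scripts; the addresses and bonding options mirror the example above, and you should verify the file contents against your distribution's documentation:

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.0.5
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth2 (create a matching file for eth3)
DEVICE=eth2
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-bond0.4
DEVICE=bond0.4
IPADDR=192.168.4.5
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
VLAN=yes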
Chapter 4. The VEB scenario

In the VEB scenario, all of the virtual machines in a host are connected to a virtual switch, or possibly several virtual switches. The host connects the virtual switch to the physical switch. This scenario has advantages where inter-guest communication is performance-critical, and also where no 802.1Qbg enabled switches are available.

To manage and potentially limit the network activities of the guest machines, network filters can be added to the virtual machine definition. The implementation of these network filters is a combination of iptables (layer 3) and ebtables (layer 2) rules, but the network filter rules are specified in an implementation-independent format, which is described in "Filtering."
The state of the virtual switch can be queried through SNMP. The setup of the net-snmp daemon to serve the bridge MIB is described in "The SNMP bridge management information base."

Filtering

Filtering allows the hypervisor to control which network packets are sent to, or received from, a virtual machine. For more information about network filters, including information about writing custom filters, see "Network Filters" at libvirt.org.

Adding a filter reference to the domain XML

You can add a filter reference to a guest XML definition by adding a <filterref/> element to the network interface definition using the virsh edit <virtual machine> command. In the following example, the no-spoofing filter is added:

...
<interface type='network'>
  <mac address='52:54:0:11:11:11'/>
  <source network='mynet'/>
  <model type='virtio'/>
  <filterref filter='no-spoofing'/>
</interface>
...

The no-spoofing filter

The no-spoofing filter in this example references the respective filters for preventing MAC, IP, and ARP spoofing. The XML file for the no-spoofing filter can be created by creating a file called no-spoofing.xml with the following content:

<filter name='no-spoofing' chain='root'>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-ip-spoofing'/>
  <filterref filter='no-arp-spoofing'/>
</filter>

Then, define the file to libvirt with the following command:

virsh nwfilter-define no-spoofing.xml
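As an aside, libvirt also ships a predefined clean-traffic filter that combines MAC, IP, and ARP anti-spoofing rules. A hedged alternative to the custom filter above is to reference it directly and optionally pin the permitted IP address instead of relying on learning it from DHCP; the address shown is the example address used later in this chapter:

...
<interface type='network'>
  <mac address='52:54:0:11:11:11'/>
  <source network='mynet'/>
  <model type='virtio'/>
  <filterref filter='clean-traffic'>
    <parameter name='IP' value='192.168.122.40'/>
  </filterref>
</interface>
...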
Tip: A filter can be examined using the following command:

virsh nwfilter-dumpxml no-spoofing

You can see the effect of specifying the no-spoofing filter when you view the ebtables output, as shown in the following examples:

1. With no virtual machine running, the output looks similar to the following:

# ebtables -t nat -L
Bridge table: nat
Bridge chain: PREROUTING, entries: 0, policy: ACCEPT
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT

2. After starting the virtual machine with the no-spoofing filter, the output looks similar to the following:

# ebtables -t nat -L
Bridge table: nat
Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
-i vnet0 -j libvirt-i-vnet0
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT
Bridge chain: libvirt-i-vnet0, entries: 4, policy: ACCEPT
-s ! 52:54:0:6e:4c:17 -j DROP
-p IPv4 -j ACCEPT
-p ARP -j ACCEPT
-j DROP

3. When the virtual machine requests an IP address through DHCP, the output looks similar to the following:

# ebtables -t nat -L
Bridge table: nat
Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
-i vnet0 -j libvirt-i-vnet0
Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-o vnet0 -j libvirt-o-vnet0
Bridge chain: libvirt-i-vnet0, entries: 2, policy: ACCEPT
-p IPv4 -j I-vnet0-ipv4
-p ARP -j I-vnet0-arp
Bridge chain: libvirt-o-vnet0, entries: 1, policy: ACCEPT
-p ARP -j O-vnet0-arp
Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT
-s ! 52:54:0:11:11:11 -j DROP
-p IPv4 --ip-src ! 192.168.122.40 -j DROP
Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT
-s ! 52:54:0:11:11:11 -j DROP
-p ARP --arp-mac-src ! 52:54:0:11:11:11 -j DROP
-p ARP --arp-ip-src ! 192.168.122.40 -j DROP
-p ARP --arp-op Request -j ACCEPT
-p ARP --arp-op Reply -j ACCEPT
-j DROP
Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT
-p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:11:11:11 -j DROP
-p ARP --arp-ip-dst ! 192.168.122.40 -j DROP
-p ARP --arp-op Request -j ACCEPT
-p ARP --arp-op Reply -j ACCEPT
-j DROP

These ebtables rules allow only packets to and from the virtual machine with the specified MAC and IP values. All other packets are dropped, effectively preventing spoofing attacks by the virtual machine.

The SNMP bridge management information base

With the bridge MIB, you can query information about the virtual switch in the KVM hypervisor through SNMP. This ability can be exploited by IBM Director, IBM Tivoli Network Manager, and other SNMP-capable tools.

The SNMP bridge Management Information Bases (MIBs) define the representation of the bridge data:

- The 802.1d bridge MIB (RFC 2674), for switches without VLANs (the P-BRIDGE-MIB file in the /usr/share/mibs/ietf/ directory of libsmi-0.4.8 in Red Hat Enterprise Linux 6.x).
- The 802.1q bridge MIB (RFC 4363), for switches with VLANs (the Q-BRIDGE-MIB file in the /usr/share/mibs/ietf/ directory of libsmi-0.4.8 in Red Hat Enterprise Linux 6.x). For more information, see "Definitions of Managed Objects for Bridges with Traffic Classes, Multicast Filtering, and Virtual LAN Extensions."
- The bridge MIB module for managing devices that support IEEE 802.1D (RFC 4188), with earlier revisions RFC 1286 (first revision) and RFC 1493 (second revision). For more information, see "Definitions of Managed Objects for Bridges."

In Red Hat Enterprise Linux 6.0, only RFC 4188 was supported. Support for RFC 4363 (VLAN support) was added in Red Hat Enterprise Linux 6.1. The following sections give a short outline of how to install, configure, and verify support for the bridge MIB. The bridge MIB is implemented as a Perl extension for the net-snmp daemon. If the MIB files are missing, use yum install libsmi-0.4.8-4.el6.x86_64 to install them.

Installing net-snmp

Before using the SNMP MIB, you must install the net-snmp package. To install the net-snmp package, run the following command:

yum install net-snmp
This command also installs all prerequisite packages.

Installing net-snmp-perl

Before using the SNMP MIB, you must install the net-snmp-perl package. The bridge MIB implementation is the /usr/bin/snmp-bridge-mib script, part of the net-snmp-perl RPM, which is installed with the following command:

yum install net-snmp-perl

Installing net-snmp-utils

Before using the SNMP MIB, you must install the net-snmp-utils package. The successful configuration of the bridge MIB can be verified with the snmpwalk utility, part of the net-snmp-utils RPM, which is installed with the following command:

yum install net-snmp-utils.x86_64

Configuring and verifying net-snmp

Before using the SNMP MIB, configure and verify bridge support.

Before you begin

For testing purposes only, you can grant access to snmpd for all users by adding the following line to the /etc/snmp/snmpd.conf file:

rocommunity public

Note: This access should be granted only for testing purposes. In a production environment, be sure to replace this line with the correct access controls for your environment.

Procedure

1. Start the snmp daemon:
   service snmpd restart
   Stopping snmpd: [FAILED]
   Starting snmpd: [ OK ]
2. Verify that the bridge module is not yet configured:
   snmpwalk -Os -c public -v 2c localhost .1.3.6.1.2.1.17.4
   mib-2.17.4 = No Such Object available on this agent at this OID
3. Configure support for the bridge MIB module by adding the following line to the /etc/snmp/snmpd.conf file:
   master agentx
   The changes to the /etc/snmp/snmpd.conf file look similar to the following:
   --- /etc/snmp/snmpd.conf 2010-07-28 10:14:00.000000000 +0200
   +++ snmpd.conf 2010-08-10 22:40:37.000000000 +0200
   @@ -16,6 +16,7 @@
    # Access Control
    ###########################################################################
   +rocommunity public
    # As shipped, the snmpd demon will only respond to queries on the
    # system mib group until this file is replaced or modified for
    # security purposes. Examples are shown below about how to increase the
   @@ -460,3 +461,4 @@
    # Further Information
    #
    # See the snmpd.conf manual page, and the output of "snmpd -H".
   +master agentx
4. Restart snmpd:
   service snmpd restart
   Stopping snmpd: [ OK ]
   Starting snmpd: [ OK ]
5. Run the following Perl script:
   perl /usr/bin/snmp-bridge-mib virbr0
6. Verify that the bridge is configured. In this example, no virtual machine is connected to the bridge. The following snmpwalk command typically generates long output:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17
   The MAC address of the virtual machine is used to verify correct operation. Assuming the MAC address contains the string 4c:17, then when you grep for the MAC address, no output is returned if the virtual machine is not running:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"
7. Start a virtual machine that connects to the bridge using the virsh start <virtual-machine> command, and then run the same command from step 6 again:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"
   BRIDGE-MIB::dot1dTpFdbAddress. RT.nL. = STRING: 52:54:0:6e:4c:17
   BRIDGE-MIB::dot1dTpFdbAddress..T.nL. = STRING: fe:54:0:6e:4c:17
8. Shut down the virtual machine, wait at least 30 seconds, and run the command again to verify that the output is empty:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"

Results

The bridge MIB module is now installed, configured, and verified to work.
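Instead of walking the whole subtree and filtering it with grep, the forwarding database can also be read as a table with the snmptable utility, which is also part of the net-snmp-utils RPM. This is a hedged sketch that assumes the same test-only rocommunity configuration shown above:

snmptable -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
localhost BRIDGE-MIB::dot1dTpFdbTable

Each row shows a learned MAC address, the bridge port it was learned on, and its status, which is a quick way to see which guests are currently attached to virbr0.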
Chapter 5. Relevant standards and protocols

The relevant 802.1Qbg protocols are listed here for your convenience.

- Edge Virtual Bridging (EVB): uses LLDP as transport; defines the locus of VM-to-VM switching; sets VEPA or VEB mode.
- Channel Discovery and Configuration Protocol (CDCP) [not implemented]: virtualizes the physical link to simultaneously support multiple VEPA/VEB components.
- Edge Control Protocol (ECP): provides a reliable, acknowledged transport for VDP, rather than using LLDP.
- Virtual Station Interface (VSI) Discovery Protocol (VDP): associates and de-associates an interface MAC/VLAN with a port profile.
Chapter 6. Shared storage pools

For live migration, the source and the target host of the to-be-migrated guest must both have access to the virtual disk of the guest. You can ensure this access by using a shared storage pool. In the following example, an NFS server is used, which exports a directory to all the host machines:

showmount -e dcnserver
Export list for dcnserver:
/server/sas/nfs *

The definition of an NFS-based storage pool looks similar to the following:

<pool type='netfs'>
  <name>dcnserver</name>
  <uuid>178a678a-ab47-b26b-cdcc-5bc3a33ffd79</uuid>
  <capacity>1082195443712</capacity>
  <allocation>24719130624</allocation>
  <available>1057476313088</available>
  <source>
    <host name='dcnserver'/>
    <dir path='/server/sas/nfs/images'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images</path>
    <permissions>
      <mode>0700</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

The disk from the storage pool is referenced in the domain XML:

<devices>
  ...
  <disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/var/lib/libvirt/images/f12nwtest.img'/>
    <target dev='sda' bus='scsi'/>
    <shareable/>
    <address type='drive' controller='0' bus='0' unit='0'/>
  </disk>
  ...
</devices>

Note also that the <shareable/> element is added to the disk definition.
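A pool definition like the one above must be registered and started on every host that takes part in the migration. The following is a minimal sketch with virsh, assuming the pool XML above is saved as dcnserver-pool.xml (a hypothetical file name):

# Register the pool, start it, and have it start automatically at boot
virsh pool-define dcnserver-pool.xml
virsh pool-start dcnserver
virsh pool-autostart dcnserver

# Confirm that the pool is active and list its volumes
virsh pool-list
virsh vol-list dcnserver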
Chapter 7. Configuring the network for virtual machines in a data center Learn how to configure virtual machines to connect more directly to the physical network adapter (Virtual Ethernet Port Aggregator/VEPA mode) and the necessary steps to use 802.1Qbg to configure the physical switches. You can also learn how to configure virtual machines connected to a virtual switch to use network filters (Virtual Ethernet Bridge/VEB mode). Key tools and technologies discussed in this demonstration include libvirt, net-snmp, and the lldpad daemon. Scope, requirements, and support Systems to which this information applies System x running Linux New in the second edition (March 2013) v Support for 802.1 Qbg over bonding devices in Red Hat Enterprise Linux 6.4 Intended audience This blueprint is intended for advanced Linux system administrators and programmers who need to configure virtual and physical switches in a Red Hat Enterprise Linux 6.1 KVM hypervisor environment. Scope and purpose To complete the instructions in this blueprint, the virtual machines must already be defined. Only the differences to a normal libvirt network setup are described. The process of defining virtual machines and all the details of a libvirt network setup is outside the scope of this blueprint. This blueprint provides information about configuring libvirt to use macvtap devices for virtual network interfaces, and how to use the 802.1Qbg Virtual Station Interface (VSI) types with physical switches. This document also provides information about configuring the network filter for virtual switches in a libvirt-controlled environment, and the necessary changes to the appropriate XML files with the help of the virsh tool. Test environment The instructions in this blueprint were tested on System x HS22 blade systems running Red Hat Enterprise Linux 6.1 KVM Hypervisor with already defined virtual machines. The HS22 blade systems were connected to a VEPA and 802.1 Qbg enabled switch. Hardware, software, and other prerequisites Host machines must be running either Intel VT chipsets or AMD-V chipsets that support hardware-assisted virtualization. For more information about hardware requirements, see the Enabling KVM support on your hardware section of the Quick Start Guide for installing and running KVM. Copyright IBM Corp. 2011, 2013 27
Author names Gerhard Stenzel Jens Osterkamp Thomas Richter Other contributors Santwana Samantray Heather Crognale IBM Services Linux offers flexibility, options, and competitive total cost of ownership with a world class enterprise operating system. Community innovation integrates leading-edge technologies and best practices into Linux. IBM is a leader in the Linux community with over 600 developers in the IBM Linux Technology Center working on over 100 open source projects in the community. IBM supports Linux on all IBM servers, storage, and middleware, offering the broadest flexibility to match your business needs. For more information about IBM and Linux, go to ibm.com/linux (https://www.ibm.com/linux) IBM Support Questions and comments regarding this documentation can be posted on the developerworks Systems Management Blueprint Community Forum: developerworks Linux Virtualization Blueprint Community Forum The IBM developerworks discussion forums let you ask questions, share knowledge, ideas, and opinions about technologies and programming techniques with other developerworks users. Use the forum content at your own risk. While IBM attempts to provide a timely response to all postings, the use of this developerworks forum does not guarantee a response to every question that is posted, nor do we validate the answers or the code that are offered. Network setup The use of virtual machines in physical host machines leads to a situation where there is no longer a strict separation between host and network administration because some form of virtual network setup is needed to connect the virtual machines to the physical network. Depending on the type of workload on the virtual machines and the type of communication between them, the spectrum of the network setup for the guest within the hosts can range from a dedicated network card for a guest to a fully administratable virtual switch with advanced routing and filtering capabilities. In this blueprint, two complementary scenarios are considered: v the VEPA scenario, or network centric scenario, where the administration of the network happens in one or more physical switches, and where the virtual machines are connected to the physical network as closely as possible. v the VEB scenario, or host centric scenario, where all administration including the administration of the virtual network and switches happens in the hypervisor host machines. 28 Blueprints: Configuring the network for virtual machines in a data center
Managing virtual machines with libvirt Both scenarios described in this blueprint assume that you will use libvirt. Many libvirt tasks, like defining, starting, or migrating a virtual machine can be accomplished using virsh, the command-line utility. The graphical interface to libvirt, virt-manager, does not currently support all of the features discussed in this blueprint. If you decide to use virt-manager, though, you can do so with some manual editing of the virtual machine definition (the details of which are outside of the scope of this blueprint). For more information about using libvirt, see http://libvirt.org/docs.html. Considerations for live migration Live migration of virtual machines requires that the virtual disk of a virtual machine is stored on a shared resource accessible to both source and target host. For example an NFS-based or iscsi-based storage pool. For more information about live migration of virtual machines, see http://www.redhat.com/docs/en- US/Red_Hat_Enterprise_Linux/6-Beta/html/Virtualization/ The VEPA scenario In the VEPA scenario, the virtual machines are connected to the physical network as closely as possible, so that the virtual network interface is directly connected to the physical network interface of the host without going through the virtual bridge of the Linux host. All network traffic from a virtual machine goes to the switch first, effectively bypassing the internal bridge even if the network traffic is intended for the host or another virtual machine on the same host. Communication between virtual machines on the same host therefore requires that the respective switch port is put in 'reflective relay' mode. This kind of communication will also impact the overall bandwidth of the respective physical interface and the latency of the network traffic. Chapter 7. Configuring the network for virtual machines in a data center 29
Adding a VEPA interface to a virtual machine A VEPA interface is added to a virtual machine by specifying the "direct" interface type in the domain XML. A typical network interface definition looks similar to the following: <devices>... <interface type= network > <mac address= d0:0f:d0:0f:02:01 /> <source network= default /> <model type= virtio /> </interface>... </devices> Change the interface definition using the virsh edit <virtual machine> command as follows to add a VEPA interface to a virtual machine (using eth2 with VLAN ID 20 in this example). Note: If you are using BNT switches, the VLAN ID should be in the range of 2 to 4094. <devices>... <interface type= direct > 30 Blueprints: Configuring the network for virtual machines in a data center
<mac address= d0:0f:d0:0f:02:01 /> <source dev= eth2.20 mode= vepa /> <source network= default /> <model type= virtio /> </interface>... </devices> This change causes libvirt to create a macvtap device, tie it to the specified physical network interface, and then pass the macvtap file descriptor to QEMU upon virtual machine start. This sequence works similarly for virtual machine shutdown and migration. While using the VEPA mode provides the base for the network-centric management of the virtual machines network activities, an additional step is necessary to actually do this: enforcing port policies. In this case, 802.1Qbg VSI types. Adding 802.1Qbg VSI types to a VEPA interface The VSI types are defined and stored in a database with an associated unique ID. The switch advertises its support for 802.1Qbg capabilities, including reflective relay. On the KVM host, the link layer discovery protocol agent daemon (lldpad) is used to configure the port in VEPA mode (hairpin mode), effectively offloading packet-switching functions from the KVM host to the adjacent switch. The lldpad daemon is notified by libvirt if VSI types are to be applied to a specific MAC/physical port pair. The unique ID of the VSI type and the virtual NIC MAC/VLAN information is registered with the adjacent switch port through lldpad and the switch retrieves the VSI type that is based on the unique ID and associates the rules against the registered MAC/VLAN. 802.1Qbg requires support from the physical switch that is connected to the server. See your switch user guide to determine whether the switch supports 802.1Qbg, and how to configure it. Both IBM blade switches and rack switches support 802.1Qbg. If necessary, upgrade to a newer switch firmware version. Configuration typically involves enabling EVB on the switch port, and configuring connection details for the VSIDB server. For more information about how to configure a switch for VEPA, see chapter 6 of the Implementing a VM-Aware Network Using VMready (http://www.redbooks.ibm.com/abstracts/sg247985.html) IBM Redbooks publication. Note: The switch IBMNOS allows a port to be configured with "tag-pvid". With this configuration, all packets that are leaving the port are tagged, including the LLDP and ECP packets. This configuration is not recommended for ports on which 802.1Qbg is used because it is not supported by the current lldpad implementation Installing and starting the lldpad daemon To configure the port in VEPA mode, install and start the link layer discover protocol agent daemon (lldpad). Chapter 7. Configuring the network for virtual machines in a data center 31
Procedure 1. Install the lldpad daemon with the following command: yum install lldpad 2. Start the lldpad daemon with the following commands: service lldpad start 3. Enable the admin status for the interface that is connected to the switch (in this example eth2) with the following commands: Note: These two commands address the nearest bridge and the nearest customer bridge. lldptool -L -i eth2 adminstatus=rxtx lldptool -i eth2 -g ncb -L adminstatus=rxtx 4. Enable transmission of the EVB TLV (edge virtual bridging type length value message) and then configure reflective relay mode and capabilities with the following sets of commands for both the nearest bridge and for the nearest customer bridge. For Red Hat Enterprise Linux 6.4 and later, lldpad uses only the nearest customer bridge. a. For the nearest bridge (unnecessary for Red Hat Enterprise Linux 6.4 and later): lldptool -T -i eth2 -V evbcfg -c enabletx=yes lldptool -T -i eth2 -V evbcfg -c fmode=reflectiverelay lldptool -T -i eth2 -V evbcfg -c capabilities=rte,ecp,vdp b. For the nearest customer bridge: lldptool -i eth2 -T -g ncb -V evbcfg -c enabletx=yes lldptool -i eth2 -T -g ncb -V evbcfg -c fmode=reflectiverelay lldptool -i eth2 -T -g ncb -V evbcfg -c capabilities=rte,vdp,ecp c. Display the EVB parameters: lldptool -t -i eth2 -g ncb -V ecbcfg -c enabletx lldptool -t -i eth2 -g ncb -V ecbcfg -c fmode lldptool -t -i eth2 -g ncb -V ecbcfg -c capabilities 5. Enable VDP with the following commands: Note: These commands address the nearest bridge and the nearest customer bridge. Red Hat Enterprise Linux 6.3 and before lldptool -T -i eth2 -V vdp -c enabletx=yes lldptool -i eth2 -T -g ncb -V vdp -c enabletx=yes Red Hat Enterprise Linux 6.4 and later lldptool -i eth2 -T -g ncb -V vdp -c enabletx=yes 6. Restart the lldpad daemon with the following command: service lldpad restart Results All the changes made with lldptool as above have to be made only once and will be saved to the lldpad configuration file. They will available to lldpad automatically on the next start. The changes can be verified by examining the /var/lib/lldpad/lldpad.conf file. The changes due to step 4a are highlighted: dcbx : { version = "1.0"; dcbx_version = 2; }; nearest_customer_bridge : { eth2 : { tlvid00000001 : { info = "04001B2163BEE8"; 32 Blueprints: Configuring the network for virtual machines in a data center
}; tlvid00000002 : { info = "03001B2163BEE8"; }; adminstatus = 3; tlvid001b3f00 : { capabilities = "rte,vdp,ecp"; fmode = "reflectiverelay"; enabletx = true; }; vdp : { enabletx = true; }; };... }; lldp : { eth2 : { tlvid00000001 : { info = "04001B2163BEE8"; }; tlvid00000002 : { info = "03001B2163BEE8"; }; adminstatus = 3; vdp : { enabletx = true; }; tlvid001b3f00 : { enabletx = true; info = "001B3F0080070000DF0C00000F"; capabilities = "rte,ecp,vdp"; fmode = "reflectiverelay"; }; };... }; v The status of the negotiation of EVB capabilities can be queried with lldptool with the following command for the nearest bridge: lldptool -t -i eth2 -V evbcfg The result of this command should look similar to the following: EVB Configuration TLV supported forwarding mode: (0x40) reflective relay supported capabilities: (0x07) RTE ECP VDP configured forwarding mode: (0x40) reflective relay configured capabilities: (0x07) RTE ECP VDP no. of supported VSIs: 0001 no. of configured VSIs: 0000 RTE: 16 v The status of the negotiation of EVB capabilities can be queried with the -g ncb option for the nearest customer bridge: lldptool -i eth2 -t -g ncb -V evbcfg The result of this command should look similar to the following: Chapter 7. Configuring the network for virtual machines in a data center 33
EVB Configuration TLV supported forwarding mode: (0x40) reflective relay supported capabilities: (0x07) RTE ECP VDP configured forwarding mode: (0x40) reflective relay configured capabilities: (0x07) RTE ECP VDP no. of supported VSIs: 0001 no. of configured VSIs: 0000 RTE: 16 Note: The displayed values of configured forwarding mode and configured capabilities, shown as 0, are incorrect on some Red Hat Enterprise Linux 6.x versions. Specifying a VSI type Specify the VSI type for a VEPA interface by adding a <virtualport/> element to the domain XML using virsh edit. The parameters of the virtualport element are documented in more detail in the IEEE 802.1Qbg standard. The values are network specific and should be provided by the network administrator. In 802.1Qbg terms, the Virtual Station Interface (VSI) represents the virtual interface of a virtual machine. Important: Be sure to adapt the attribute values for the <parameter/> element to your specific network environment. <devices>... <interface type= direct > <mac address= d0:0f:d0:0f:02:01 /> <source dev= eth2.20 mode= vepa /> <source network= default /> <virtualport type= 802.1Qbg > <parameters managerid= 12 typeid= 0x123456 typeidversion= 1 instanceid= 09b00c53-8b5c-4eeb-8f00-d847aa05191b /> </virtualport> <model type= virtio /> </interface>... </devices> VM startup When the virtual machine is started, libvirt parses the virtualport type, determines the physical device (nth parent) and VLAN ID, and sends a netlink message with an 'ASSOCIATE' request to the lldpad daemon. 34 Blueprints: Configuring the network for virtual machines in a data center
The lldpad daemon then sends an ASSOCIATE VDP message on the physical interface to the switch. Depending on the success or failure of registering the VSI type, you will see one of the following messages. Table 2. System status messages Output message shown virsh start testvm3 Status The lldpad daemon is not running. error: Failed to start domain testvm3 error: internal error sending of PortProfileRequest failed virsh start testvm3 error: Failed to start domain testvm3 error: internal error port-profile setlink timed out virsh start testvm3 The environment does not have an 802.1Qbg enabled switch. The virtual machine is successfully created. Domain testvm3 started You can verify the status of the association of the profile with the following command: lldptool -t -i eth2 -V vdp mode The output will look similar to the following output: mode = mode: 2 (VDP_MODE_ASSOCIATED) response: 0 (success) state: 2 (VSI_ASSOCIATED) mgrid: 12 id: 1193046 (0x123456) version: 3 instance: 09b00c53-8b5c-4eeb-8f00-d847aa05191b mac: d0:0f:d0:0f:02:01 vlan: 20 VM migration Live migration allows you to migrate a virtual machine from one host to another host while the virtual machine continues to run seamlessly. The virtual disk of the virtual machine is located in a shared storage pool and should be mounted on both the source and destination. Configuring SSH to authenticate without requiring passwords will simplify the migration process. For more information about shared storage pools, see Chapter 6, Shared storage pools, on page 25. Chapter 7. Configuring the network for virtual machines in a data center 35
Migrating your VM Use this procedure to migrate your virtual machines. Before you begin This procedure assumes that SSH keys are already set for password-less authentication. If you do not have SSH keys set for password authentication, do so before continuing with this procedure. Procedure 1. Verify that password-less authentication is working: virsh -c qemu+ssh://c7b5/system list --all Id Name State ---------------------------------- - testvm3 shut off 2. Verify that the virtual machine is running locally: virsh list --all Id Name State ---------------------------------- 23 testvm3 running 3. Migrate the virtual machine to the destination host: virsh migrate testvm3 qemu+ssh://c7b5/system 4. Verify that the virtual machine is running on the destination: virsh -c qemu+ssh://c7b5/system list --all Id Name State ---------------------------------- 17 testvm3 running 5. Migrate the virtual machine back with one of the following command strings: v [root@c7b4 ~]# virsh --connect qemu+ssh://c7b5/system \ migrate testvm3 qemu+ssh:///system v virsh --connect qemu+ssh://c7b5/system migrate testvm3 \ qemu+ssh://c7b4/system Shutting down a virtual machine You can shut down a virtual machine from either directly within the virtual machine or by using the virsh command from the host. To shut down the virtual machine from the host, use the following command: virsh shutdown testvm3 If the command is successful, the following output is displayed: Domain testvm3 is being shutdown 36 Blueprints: Configuring the network for virtual machines in a data center
Configuring a bonding device Use this procedure to configure a bonding device for lldpad. Before you begin Beginning with Red Hat Enterprise Linux 6.4, lldpad supports bonding devices. Only active-backup mode is supported. In this example, a bond device named bond0 has an IP address of 192.168.0.5, and two slaves named eth2 and eth3. Note: Switch configuration to support a bond device on an HS22 blade has the following characteristics and limitations: v Slave devices must be connected to a different switch device. The switch firmware does not support EVB protocol on external ports. v Switches must be interconnected. Use an external port between both switches. v Enable the external ports on both switches to transfer traffic for the VLAN id that is defined in the lldpad configuration section. To configure a bonding device for lldpad, complete the following procedure: Procedure 1. Install the bonding device driver: modprobe bonding mode=active-backup miimon=100 A device named bond0 is available. 2. Assign an IP address to the bond0 device: ifconfig bond0 192.168.0.5 netmask 255.255.255.0 up 3. Create slaves and assign them to the bond0 device: ifenslave bond0 eth2 eth3 4. Assign a VLAN device that is named bond0.4 on top of the bond0 device: vconfig add bond0 4 A new device that is called bond0.4 is associated with VLAN id 4. 5. Assign an IP address the the bond0.4 VLAN device: ifconfig bond0.4 192.168.4.5 netmask 255.255.255.0 up lldpad uses the device bond0 for switch communication. 6. Enable transmission of the EVB TLV for the bond0 device: a. For the nearest customer bridge: lldptool -i bond0 -T -g ncb -V evbcfg -c enabletx=yes lldptool -i bond0 -T -g ncb -V evbcfg -c fmode=reflectiverelay lldptool -i bond0 -T -g ncb -V evbcfg -c capabilities=rte,vdp,ecp lldptool -i bond0 -T -g ncb -V vdp -c enabletx=yes lldptool -i bond0 -L -g ncb adminstatus=rxtx b. Display the EVB parameters: Chapter 7. Configuring the network for virtual machines in a data center 37
lldptool -t -i bond0 -g ncb -V ecbcfg -c enabletx lldptool -t -i bond0 -g ncb -V ecbcfg -c fmode lldptool -t -i bond0 -g ncb -V ecbcfg -c capabilities lldptool -i bond0 -t -g ncb -V vdp -c enabletx lldptool -i bond0 -l -g ncb adminstatus lldpad uses the bond device to communicate with the switch. If the active slave fails, the backup slave becomes the new active slave and communication between virtual machines, lldpad, and the switch resumes after a short wait period. The VEB scenario In the VEB scenario, all of the virtual machines in a host are connected to a virtual switch, or possibly several virtual switches. The host connects the virtual switch to the physical switch. This scenario has advantages where the inter-guest communication is performance-critical and also where no 802.1Qbg enabled switches are available. 38 Blueprints: Configuring the network for virtual machines in a data center
The VEB scenario

In the VEB scenario, all of the virtual machines in a host are connected to a virtual switch, or possibly to several virtual switches. The host connects the virtual switch to the physical switch. This scenario has advantages where inter-guest communication is performance-critical, and also where no 802.1Qbg-enabled switches are available.

To manage, and potentially limit, the network activities of the guest machines, network filters can be added to the virtual machine definition. These network filters are implemented as a combination of iptables (layer 3) and ebtables (layer 2) rules, but the filter rules are specified in an implementation-independent format, which is described in Filtering on page 16. The state of the virtual switch can be queried through SNMP. The setup of the net-snmp daemon to serve the bridge MIB is described in The SNMP bridge management information base on page 18.
Filtering

Filtering allows the hypervisor to control which network packets are sent to, or received from, a virtual machine. For more information about network filters, including information about writing custom filters, see "Network Filters" at libvirt.org.

Adding a filter reference to the domain XML

You can add a filter reference to a guest XML definition by adding a <filterref/> element to the network interface definition, using the virsh edit <virtual machine> command. In the following example, the no-spoofing filter is added:
...
<interface type='network'>
  <mac address='52:54:0:11:11:11'/>
  <source network='mynet'/>
  <model type='virtio'/>
  <filterref filter='no-spoofing'/>
</interface>
...

The no-spoofing filter

The no-spoofing filter in this example references the respective filters for preventing MAC, IP, and ARP spoofing. To create the filter, create a file called no-spoofing.xml with the following content:
<filter name='no-spoofing' chain='root'>
  <filterref filter='no-mac-spoofing'/>
  <filterref filter='no-ip-spoofing'/>
  <filterref filter='no-arp-spoofing'/>
</filter>
Then, define the file to libvirt with the following command:
virsh nwfilter-define no-spoofing.xml
Tip: A filter can be examined with the following command:
virsh nwfilter-dumpxml no-spoofing
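The define-and-attach workflow can be driven entirely from the shell. The following is a minimal sketch using the virsh subcommands already shown plus virsh nwfilter-list; the domain name testvm3 is carried over from the earlier examples.

# Define the filter from the XML file created above
virsh nwfilter-define no-spoofing.xml

# Confirm that libvirt now knows the filter (it is listed with a UUID)
virsh nwfilter-list

# Inspect the filter definition as libvirt stores it
virsh nwfilter-dumpxml no-spoofing

# Add the <filterref filter='no-spoofing'/> element to the interface,
# then restart the domain so that the filter takes effect
virsh edit testvm3
virsh shutdown testvm3
virsh start testvm3

Note that libvirt also ships a predefined clean-traffic filter, which combines the no-mac-spoofing, no-ip-spoofing, and no-arp-spoofing rules with additional protections; it can be referenced in the same way.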
You can see the effect of specifying the no-spoofing filter when you view the ebtables output, as shown in the following examples:
1. With no virtual machine running, the output looks similar to the following:
   # ebtables -t nat -L
   Bridge table: nat
   Bridge chain: PREROUTING, entries: 0, policy: ACCEPT
   Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
   Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT
2. After starting the virtual machine with the no-spoofing filter, the output looks similar to the following:
   # ebtables -t nat -L
   Bridge table: nat
   Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
   -i vnet0 -j libvirt-i-vnet0
   Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
   Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT
   Bridge chain: libvirt-i-vnet0, entries: 4, policy: ACCEPT
   -s ! 52:54:0:6e:4c:17 -j DROP
   -p IPv4 -j ACCEPT
   -p ARP -j ACCEPT
   -j DROP
3. When the virtual machine requests an IP address through DHCP, the output looks similar to the following:
   # ebtables -t nat -L
   Bridge table: nat
   Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
   -i vnet0 -j libvirt-i-vnet0
   Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
   Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
   -o vnet0 -j libvirt-o-vnet0
   Bridge chain: libvirt-i-vnet0, entries: 2, policy: ACCEPT
   -p IPv4 -j I-vnet0-ipv4
   -p ARP -j I-vnet0-arp
   Bridge chain: libvirt-o-vnet0, entries: 1, policy: ACCEPT
   -p ARP -j O-vnet0-arp
   Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT
   -s ! 52:54:0:11:11:11 -j DROP
   -p IPv4 --ip-src ! 192.168.122.40 -j DROP
   Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT
   -s ! 52:54:0:11:11:11 -j DROP
   -p ARP --arp-mac-src ! 52:54:0:11:11:11 -j DROP
   -p ARP --arp-ip-src ! 192.168.122.40 -j DROP
   -p ARP --arp-op Request -j ACCEPT
   -p ARP --arp-op Reply -j ACCEPT
   -j DROP
   Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT
   -p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:11:11:11 -j DROP
   -p ARP --arp-ip-dst ! 192.168.122.40 -j DROP
   -p ARP --arp-op Request -j ACCEPT
   -p ARP --arp-op Reply -j ACCEPT
   -j DROP

These ebtables rules allow only packets to and from the virtual machine that carry the specified MAC and IP addresses. All other packets are dropped, which effectively prevents the virtual machine from spoofing other addresses.

The SNMP bridge management information base

With the bridge MIB, you can query information about the virtual switch in the KVM hypervisor through SNMP. This ability can be exploited by IBM Director, IBM Tivoli Network Manager, and other SNMP-capable tools.

The SNMP bridge Management Information Bases (MIBs) define the representation of the bridge data:
• The Q-BRIDGE-MIB, for switches with VLANs (located in the /usr/share/mibs/ietf/Q-BRIDGE-MIB file of libsmi-0.4.8 in Red Hat Enterprise Linux 6.x).
• The P-BRIDGE-MIB, for switches with traffic classes and multicast filtering but without VLANs (located in the /usr/share/mibs/ietf/P-BRIDGE-MIB file of libsmi-0.4.8 in Red Hat Enterprise Linux 6.x). Both modules are defined in RFC 4363, Definitions of Managed Objects for Bridges with Traffic Classes, Multicast Filtering, and Virtual LAN Extensions, which obsoletes RFC 2674.
• The bridge MIB module for managing devices that support IEEE 802.1D, defined in RFC 4188, Definitions of Managed Objects for Bridges (earlier revisions: RFC 1286, RFC 1493).

In Red Hat Enterprise Linux 6.0, only RFC 4188 was supported. Support for RFC 4363 (VLAN support) was added in Red Hat Enterprise Linux 6.1. The following sections give a short outline of how to install, configure, and verify support for the bridge MIB. The bridge MIB is implemented as a Perl extension for the net-snmp daemon. If the MIB files are missing, install them with the following command:
yum install libsmi-0.4.8-4.el6.x86_64

Installing net-snmp

Before using the SNMP MIB, you must install the net-snmp package.

To install the net-snmp package, run the following command:
yum install net-snmp
This command also installs all prerequisite packages.
Installing net-snmp-perl

Before using the SNMP MIB, you must install the net-snmp-perl package.

The bridge MIB implementation is the /usr/bin/snmp-bridge-mib script, which is part of the net-snmp-perl RPM. Install it with the following command:
yum install net-snmp-perl

Installing net-snmp-utils

Before using the SNMP MIB, you must install the net-snmp-utils package.

You can verify the successful configuration of the bridge MIB with the snmpwalk utility, which is part of the net-snmp-utils RPM. Install it with the following command:
yum install net-snmp-utils.x86_64

Configuring and verifying net-snmp

Before using the SNMP MIB, configure and verify bridge support.

Before you begin

For testing purposes only, you can grant all users access to snmpd by adding the following line to the /etc/snmp/snmpd.conf file:
rocommunity public
Note: Grant this access only for testing purposes. In a production environment, be sure to replace this line with the correct access controls for your environment.

Procedure

1. Start the snmp daemon:
   service snmpd restart
   Stopping snmpd: [FAILED]
   Starting snmpd: [ OK ]
   (The Stopping step fails if snmpd was not already running.)
2. Verify that the bridge module is not yet configured:
   snmpwalk -Os -c public -v 2c localhost .1.3.6.1.2.1.17.4
   mib-2.17.4 = No Such Object available on this agent at this OID
3. Configure support for the bridge MIB module by adding the following line to the /etc/snmp/snmpd.conf file:
   master agentx
   The changes to the /etc/snmp/snmpd.conf file look similar to the following:
   --- /etc/snmp/snmpd.conf 2010-07-28 10:14:00.000000000 +0200
   +++ snmpd.conf 2010-08-10 22:40:37.000000000 +0200
   @@ -16,6 +16,7 @@
   # Access Control
   ###########################################################################
   +rocommunity public
   # As shipped, the snmpd demon will only respond to queries on the
   # system mib group until this file is replaced or modified for
   # security purposes. Examples are shown below about how to increase the
   @@ -460,3 +461,4 @@
   # Further Information
   #
   # See the snmpd.conf manual page, and the output of "snmpd -H".
   +master agentx
4. Restart snmpd:
   service snmpd restart
   Stopping snmpd: [ OK ]
   Starting snmpd: [ OK ]
5. Run the bridge MIB implementation as a Perl script for the bridge virbr0:
   perl /usr/bin/snmp-bridge-mib virbr0
6. Verify that the bridge is configured. In this example, no virtual machine is connected to the bridge. The following snmpwalk command typically produces long output:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17
   The MAC address of the virtual machine is used to verify correct operation. Assuming that the MAC address contains the string 4c:17, grepping for it returns no output while the virtual machine is not running:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"
7. Start a virtual machine that connects to the bridge by using the virsh start <virtual-machine> command, and then run the same command from step 6 again:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"
   BRIDGE-MIB::dot1dTpFdbAddress.'RT.nL.' = STRING: 52:54:0:6e:4c:17
   BRIDGE-MIB::dot1dTpFdbAddress.'.T.nL.' = STRING: fe:54:0:6e:4c:17
8. Shut down the virtual machine, wait at least 30 seconds, and then run the command again to verify that the output is empty:
   snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
   localhost .1.3.6.1.2.1.17 | grep "4c:17"

Results

The bridge MIB module is now installed, configured, and verified to work.
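Steps 4 through 7 can be collected into a small verification script. The following is a minimal sketch, not part of the original procedure, assuming the bridge name virbr0, the MAC fragment 4c:17, and the snmpd configuration from this procedure.

#!/bin/bash
# Restart snmpd so that the "master agentx" setting takes effect
service snmpd restart

# Start the bridge MIB subagent for virbr0 in the background
perl /usr/bin/snmp-bridge-mib virbr0 &

# Give the subagent a moment to register with snmpd
sleep 5

# Walk the bridge MIB and look for the forwarding-table entry of the VM
snmpwalk -c public -v2c -m +BRIDGE-MIB -M +/usr/share/mibs/ietf/ \
    localhost .1.3.6.1.2.1.17 | grep "4c:17" \
    && echo "virtual machine is visible on the bridge" \
    || echo "no forwarding entry found (virtual machine not running?)"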
Relevant standards and protocols

The relevant 802.1Qbg protocols are listed here for your convenience.
• Edge Virtual Bridging (EVB) protocol
  – Uses LLDP as transport
  – Defines the locus of VM-to-VM switching
  – Sets VEPA or VEB mode
• Channel Discovery and Configuration Protocol (CDCP) [not implemented]
  – Virtualizes the physical link to simultaneously support multiple VEPA/VEB components
• Edge Control Protocol (ECP)
  – Provides a reliable, acknowledged transport for VDP, rather than using LLDP
• Virtual Station Interface (VSI) Discovery Protocol (VDP)
  – Associates and de-associates an interface MAC/VLAN with a port profile

Shared storage pools

For live migration, the source host and the target host of the guest to be migrated must both have access to the virtual disk of the guest. You can ensure this access by using a shared storage pool. In the following example, an NFS server is used, which exports a directory to all the host machines:
showmount -e dcnserver
Export list for dcnserver:
/server/sas/nfs *
The definition of an NFS-based storage pool looks similar to the following:
<pool type='netfs'>
  <name>dcnserver</name>
  <uuid>178a678a-ab47-b26b-cdcc-5bc3a33ffd79</uuid>
  <capacity>1082195443712</capacity>
  <allocation>24719130624</allocation>
  <available>1057476313088</available>
  <source>
    <host name='dcnserver'/>
    <dir path='/server/sas/nfs/images'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/var/lib/libvirt/images</path>
    <permissions>
      <mode>0700</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>
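To make the pool known to libvirt, the XML above can be saved to a file and loaded with virsh. The following is a minimal sketch; the file name dcnserver.xml is chosen here for illustration, and the commands must be run on every host that takes part in the migration.

# Define the pool from the XML shown above (file name is illustrative)
virsh pool-define dcnserver.xml

# Start the pool now, and have libvirt start it automatically at boot
virsh pool-start dcnserver
virsh pool-autostart dcnserver

# Verify that the pool is active
virsh pool-list --all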
The disk from the storage pool is referenced in the domain XML:
<devices>
...
  <disk type='file' device='disk'>
    <driver name='qemu' type='raw'/>
    <source file='/var/lib/libvirt/images/f12nwtest.img'/>
    <target dev='sda' bus='scsi'/>
    <shareable/>
    <address type='drive' controller='0' bus='0' unit='0'/>
  </disk>
...
</devices>
Note that the <shareable/> element is added to the disk definition.
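Before migrating, it can help to confirm on both hosts that the image is visible under the pool's target path and that the domain references it. The following is a minimal sketch, assuming the image and domain names from this example; virsh domblklist may not be available on older libvirt versions, in which case virsh dumpxml testvm3 can be inspected instead.

# Run on both the source and the destination host:
ls -l /var/lib/libvirt/images/f12nwtest.img

# Show which disk path the domain actually uses
virsh domblklist testvm3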
Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
Dept. LRAS/Bldg. 903
11501 Burnet Road
Austin, TX 78758-3400
U.S.A.

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to the manufacturer, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. The manufacturer, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

CODE LICENSE AND DISCLAIMER INFORMATION: The manufacturer grants you a nonexclusive copyright license to use all programming code examples from which you can generate similar function tailored to your own specific needs.

SUBJECT TO ANY STATUTORY WARRANTIES WHICH CANNOT BE EXCLUDED, THE MANUFACTURER, ITS PROGRAM DEVELOPERS AND SUPPLIERS MAKE NO WARRANTIES OR CONDITIONS EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OR CONDITIONS OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT, REGARDING THE PROGRAM OR TECHNICAL SUPPORT, IF ANY.

UNDER NO CIRCUMSTANCES IS THE MANUFACTURER, ITS PROGRAM DEVELOPERS OR SUPPLIERS LIABLE FOR ANY OF THE FOLLOWING, EVEN IF INFORMED OF THEIR POSSIBILITY:
1. LOSS OF, OR DAMAGE TO, DATA;
2. SPECIAL, INCIDENTAL, OR INDIRECT DAMAGES, OR FOR ANY ECONOMIC CONSEQUENTIAL DAMAGES; OR
3. LOST PROFITS, BUSINESS, REVENUE, GOODWILL, OR ANTICIPATED SAVINGS.
SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF DIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, SO SOME OR ALL OF THE ABOVE LIMITATIONS OR EXCLUSIONS MAY NOT APPLY TO YOU.

Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. Copyright IBM Corp. _enter the year or years_.

If you are viewing this information in softcopy, the photographs and color illustrations may not appear.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® and ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at Copyright and trademark information at www.ibm.com/legal/copytrade.shtml

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

Java and all Java-based trademarks and logos are registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, or service names may be trademarks or service marks of others.
Printed in USA