Best Practices for Virtual Networking
Karim Elatov, Technical Support Engineer, GSS
© 2009 VMware Inc. All rights reserved
Agenda
- Virtual Network Overview
- vSwitch Configurations
- Tips & Tricks
- Troubleshooting Virtual Networks
- What's New in vSphere 5.0
- Network Design Considerations
Virtual Network Overview - Physical to Virtual
- Conventional access, distribution, core design
- Design with redundancy for enhanced availability
- Under the covers, the virtual network works the same as the physical network
- The access layer is implemented as virtual switches
(Figure: physical access switches replaced by virtual switches uplinked to the physical distribution/core)
Virtual Switch Options
Virtual networking concepts are similar with all virtual switches.

vNetwork Standard Switch
- Host based: 1 or more per ESX host
- Same as the vSwitch in VI3

vNetwork Distributed Switch
- Distributed: 1 or more per Datacenter
- Expanded feature set: Private VLANs, bi-directional traffic shaping, Network VMotion
- Simplified management

Cisco Nexus 1000V
- Distributed: 1 or more per Datacenter
- Cisco Catalyst/Nexus feature set, Cisco NX-OS CLI
- Supports LACP
ESX Virtual Switch: Capabilities
- MAC address assigned to each vNIC
- NIC teaming of physical NIC(s) [uplink(s)] associated with vSwitches
- Layer 2 only: forwards frames VM <-> VM and VM <-> Uplink; no vSwitch <-> vSwitch or Uplink <-> Uplink forwarding
- As a result, the vSwitch will not create loops affecting Spanning Tree in the physical network
- Can terminate VLAN trunks (VST mode) or pass the trunk through to the VM (VGT mode)
Distributed Virtual Switch
- Exists across 2 or more clustered hosts
- Provides similar functionality to standard vSwitches
- Resides on top of hidden vSwitches on each host
- vCenter owns the configuration of the dvSwitch
- Ensures consistent host network configurations
Port Groups
- Template for one or more ports with a common configuration:
  - VLAN assignment
  - Security
  - Traffic shaping (limit egress traffic from the VM)
  - Failover & load balancing
- Distributed Virtual Port Group (Distributed Virtual Switch):
  - Bidirectional traffic shaping (ingress and egress)
  - Network VMotion: network port state migrated upon VMotion
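As a quick illustration, a port group with a VLAN assignment can be created from the ESX service console with esxcfg-vswitch. A minimal sketch, assuming a standard vSwitch; the switch name, port group name, uplink, and VLAN ID below are hypothetical example values:

  esxcfg-vswitch -a vSwitch1                          # create the virtual switch
  esxcfg-vswitch -L vmnic1 vSwitch1                   # attach a physical uplink
  esxcfg-vswitch -A "VM Network 105" vSwitch1         # add the port group
  esxcfg-vswitch -v 105 -p "VM Network 105" vSwitch1  # assign VLAN 105 (VST)
  esxcfg-vswitch -l                                   # verify the configuration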
NIC Teaming for Availability and Load Sharing
NIC teaming aggregates multiple physical uplinks for:
- Availability: reduce exposure to single points of failure (NIC, uplink, physical switch)
- Load sharing: distribute load over multiple uplinks (according to the selected NIC teaming algorithm)
Requirements:
- Two or more NICs on the same vSwitch
- Teamed NICs must have the same VLAN configuration
KB - NIC teaming in ESXi and ESX (1004088)
NIC Teaming Options

Name                          vmnic chosen based upon                Physical Network Considerations
Originating Virtual Port ID   vNIC port                              Teamed ports in same L2 domain (BP: team over two physical switches)
Source MAC Address            MAC seen on vNIC                       Teamed ports in same L2 domain (BP: team over two physical switches)
IP Hash*                      Hash(SrcIP, DstIP)                     Teamed ports configured in static 802.3ad EtherChannel - no LACP (Nexus 1000v for LACP) - needs MEC to span 2 switches
Explicit Failover Order       Highest order uplink from active list  Teamed ports in same L2 domain (BP: team over two physical switches)

Best Practices:
- Originating Virtual Port ID is the default for VMs; no extra configuration is needed
- With IP Hash, ensure the physical switch is properly configured for EtherChannel
*KB - ESX/ESXi host requirements for link aggregation (1001938)
*KB - Sample configuration of EtherChannel / Link aggregation with ESX/ESXi and Cisco/HP switches (1004048)
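When IP Hash teaming is used, the physical switch's channel hashing should match the vSwitch's source/destination IP hash. A hedged sketch for a Cisco Catalyst switch, where this is a global setting:

  ! Global setting - match the vSwitch "Route based on IP hash" policy
  port-channel load-balance src-dst-ip
  ! Verify the active algorithm
  show etherchannel load-balance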
Cisco Nexus 1000v Overview
Cisco Nexus 1000v is a software switch for vNetwork Distributed Switches (vDS), made up of:
- Virtual Supervisor Module (VSM)
- Virtual Ethernet Module (VEM)
Things to remember:
- The VSM uses the external network fabric to communicate with VEMs
- The VSM does not take part in forwarding packets
- A VEM does not switch traffic to another VEM without an uplink
Cisco Nexus 1000v Modules
Virtual Supervisor Module (VSM)
- Virtual or physical appliance running Cisco OS (supports HA)
- Performs management, monitoring, & configuration
- Tight integration with VMware vCenter Server
Virtual Ethernet Module (VEM)
- Enables advanced networking capability on the hypervisor
- Provides each VM with a dedicated switch port
- Collection of VEMs = 1 DVS
Cisco Nexus 1000V enables:
- Policy-based VM connectivity
- Mobility of network & security properties
- Non-disruptive operational model
(Figure: VEMs on each ESX host form one Nexus 1000V vDS, managed by the VSM through vCenter Server)
vSwitch Configurations
Cisco show run and show tech-support
Obtain the configuration of a Cisco router or switch by running these commands in privileged EXEC mode:
  show run
  show tech-support
The following is a sample Cisco EtherChannel configuration:
  interface Port-channel1
   switchport
   switchport access vlan 100
   switchport mode access
   no ip address
  !
  interface GigabitEthernet1/1
   switchport
   switchport access vlan 100
   switchport mode access
   no ip address
   channel-group 1 mode on
  !
KB - Troubleshooting network issues with the Cisco show tech-support command (1015437)
Traffic Types on a Virtual Network
- Virtual Machine Traffic
  - Traffic sourced and received from virtual machine(s)
  - Isolate VMs from each other based on service level
- VMotion Traffic
  - Traffic sent when moving a virtual machine from one ESX host to another
  - Should be isolated
- Management Traffic
  - Should be isolated from VM traffic (one or two Service Consoles)
  - If VMware HA is enabled, includes heartbeats
- IP Storage Traffic
  - NFS and/or iSCSI via a vmkernel interface
  - Should be isolated from other traffic types
- Fault Tolerance (FT) Logging Traffic
  - Low latency, high bandwidth
  - Should be isolated from other traffic types
How do we maintain traffic isolation without proliferating NICs? VLANs
Traffic Types on a Virtual Network, cont.
- Place the Service Console/vmkernel interfaces in port groups on dedicated VLANs on a management-only virtual switch
- Example: a production virtual switch carries virtual machine traffic, while a management virtual switch carries the VMotion (VLAN 106), storage (VLAN 107), and management (VLAN 108) port groups
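For instance, a dedicated vmkernel interface for VMotion on its own VLAN could be created from the service console roughly as follows. The switch name, port group name, and IP addressing are hypothetical; enabling VMotion on the interface is then done through the vSphere Client:

  esxcfg-vswitch -A "VMotion" vSwitch1         # port group on the management vSwitch
  esxcfg-vswitch -v 106 -p "VMotion" vSwitch1  # dedicated VLAN for VMotion traffic
  esxcfg-vmknic -a -i 10.10.106.21 -n 255.255.255.0 "VMotion"  # vmkernel interface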
VLAN Tagging Options
- EST - External Switch Tagging: the external physical switch applies VLAN tags (physical port: switchport access vlan)
- VST - Virtual Switch Tagging: VLAN assigned in the Port Group policy; tags applied in the vSwitch (physical port: switchport trunk)
- VGT - Virtual Guest Tagging: VLAN tags applied in the guest; PortGroup set to VLAN 4095 (physical port: switchport trunk)
VST is the best practice and the most common method.
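For VST, the physical switch port facing the ESX host is configured as an 802.1Q trunk. A hedged Catalyst-style sketch; the interface number and VLAN IDs are illustrative, and the encapsulation command is only needed on platforms that also support ISL:

  interface GigabitEthernet1/5
   description ESX host uplink (VST)
   switchport trunk encapsulation dot1q    ! platform-dependent
   switchport mode trunk
   switchport trunk allowed vlan 10,20,30  ! carry only the VLANs the port groups use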
DVS Support for Private VLAN (PVLAN)
- Enables users to restrict communications between VMs on the same VLAN or network segment
- Allows devices to share the same IP subnet while being Layer 2 isolated
PVLAN Types:
- Promiscuous: VMs can communicate with all VMs
- Community: VMs can communicate with VMs in the same community PVLAN and with the promiscuous PVLAN
- Isolated: VMs can only communicate with VMs on the promiscuous PVLAN
Benefits:
- Employ larger subnets (advantageous to hosting environments)
- Reduce management overhead
(Figure: DMZ network with the router on the promiscuous PVLAN, web/application/database/email servers on a community PVLAN, and a document server on an isolated PVLAN)
KB - Private VLAN (PVLAN) on vnetwork Distributed Switch - Concept Overview (1010691)
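On the DVS the PVLAN map is defined in vCenter, but when PVLAN traffic must traverse physical switches, those switches generally need matching PVLAN definitions. A hedged IOS-style sketch; the VLAN IDs 200/201 are examples, and some platforms require VTP transparent mode for PVLAN configuration:

  vlan 201
   private-vlan isolated         ! secondary VLAN for isolated ports
  vlan 200
   private-vlan primary          ! primary (promiscuous) VLAN
   private-vlan association 201  ! bind the secondary VLAN to the primary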
PVLAN Cost Benefit
- Without PVLANs: isolating 12 VMs on a Distributed Virtual Switch requires 12 port groups and 12 VLANs (one per VM) - TOTAL COST: 12 VLANs
- With PVLANs: a single port group on an isolated PVLAN achieves the same isolation - TOTAL COST: 1 PVLAN (over 90% savings)
Link Aggregation
- EtherChannel: port trunking between two to eight active Fast Ethernet, Gigabit Ethernet, or 10 Gigabit Ethernet ports
- EtherChannel vs. 802.3ad: EtherChannel is Cisco proprietary; 802.3ad is an open standard
- Note: ESX implements 802.3ad static mode link aggregation
- Link Aggregation Control Protocol (LACP), one of the implementations included in IEEE 802.3ad, controls the bundling of several physical ports into a single logical channel
- LACP is only supported on the Nexus 1000v
KB - ESX/ESXi host requirements for link aggregation (1001938)
Sample Link Aggregation Configuration
- Supported switch aggregation algorithm: IP-SRC-DST
- Supported virtual switch NIC teaming mode: IP HASH
KB - Sample configuration of EtherChannel / Link aggregation with ESX/ESXi and Cisco/HP switches (1004048)
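Once the channel-group is up (as in the EtherChannel sample shown earlier), its health can be checked on the Cisco switch. A brief hedged example:

  show etherchannel summary       ! the port-channel should show flags SU (Layer 2, in use)
  show etherchannel load-balance  ! confirm src-dst-ip to match the IP HASH teaming mode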
Failover Configurations
- Link Status: relies solely on the network adapter link state
  - Cannot detect configuration errors or upstream failures: Spanning Tree blocking, an incorrect VLAN, or cable pulls on the far side of the physical switch
- Beacon Probing: sends out and listens for beacon probes
  - Broadcast frames (ethertype 0x05ff)
- Beacon Probing Best Practice: use at least 3 NICs for triangulation
  - If only 2 NICs are in the team, the host can't determine which link failed, leading to "shotgun mode" (traffic sent down both uplinks)
KB - What is beacon probing? (1005577)
(Figure: using beacons to detect upstream network connection failures)
Spanning Tree Protocol (STP) Considerations
- Spanning Tree Protocol creates loop-free L2 tree topologies in the physical network; physical links are put in a blocking state to construct the loop-free tree
- Switches send BPDUs every 2s to construct and maintain the Spanning Tree topology
- The ESX vSwitch drops BPDUs: it does not participate in Spanning Tree and will not create loops with its uplinks
- ESX uplinks will not block and are always active (full use of all links)
Recommendations for the physical network configuration (sketched below):
1. Leave Spanning Tree enabled on the physical network and on ESX-facing ports (i.e. leave it as is!)
2. Use portfast or portfast trunk on ESX-facing ports (puts ports in the forwarding state immediately)
3. Use bpduguard to enforce the STP boundary
KB - STP may cause temporary loss of network connectivity when a failover or failback event occurs (1003804)
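A hedged Catalyst-style example of recommendations 2 and 3; the interface number is illustrative:

  interface GigabitEthernet1/10
   description ESX-facing trunk port
   switchport mode trunk
   spanning-tree portfast trunk    ! go straight to forwarding on link-up
   spanning-tree bpduguard enable  ! err-disable the port if a BPDU ever arrives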
Tips & Tricks
Tips & Tricks
Load-Based Teaming (LBT)
- Dynamically balances network load over the available uplinks
- Triggered by ingress or egress congestion at 75% mean utilization over a 30 second period
- Configured on the DVS via "Route based on physical NIC load"
- LBT is not available on the Standard vSwitch (it is a DVS feature, like ingress/egress traffic shaping)
Network I/O Control (NetIOC)
- DVS software scheduler to isolate and prioritize specific traffic types contending for bandwidth on the uplinks connecting ESX/ESXi 4.1 hosts with the physical network
Tips & Tricks
- Tip #1: After a physical-to-virtual migration, the VM MAC address can be changed for licensed applications that rely on the physical MAC address (KB 1008473)
- Tip #2: NLB multicast needs manual ARP resolution of the NLB cluster on the physical switch (KB 1006525)
- Tip #3: Cisco Discovery Protocol (CDP) gives switchport configuration information useful for troubleshooting (KB 1007069)
- Tip #4: Beacon probing and IP Hash - DO NOT MIX (duplicate packets and port flapping) (KB 1017612 & KB 1012819)
- Tip #5: Link aggregation is never supported across disparate trunked switches - use VSS with MEC (KB 1001938 & KB 1027731)
Tips & Tricks - Using 10GigE
- 2x 10GigE CNAs or NICs are the common/expected configuration
- Traffic types range from low-bandwidth SC/management to high-bandwidth (1-2Gbps) VMotion and variable/high-bandwidth (2Gbps+) VM traffic
- Possible deployment method:
  - Active/Standby on all port groups
  - VMs sticky to one vmnic; SC/vmk ports (SC, iSCSI, NFS, VMotion, FT logging) sticky to the other
  - Use ingress (into switch) traffic shaping policy control to manage each traffic type per Port Group
  - If FCoE, use Priority Group bandwidth reservation (in the CNA config utility)
- Best Practice: ensure drivers and firmware are compatible for success
- vSphere 4.1 supports up to four 10GigE NICs; 5.0 supports eight
Troubleshooting Virtual Networks
Network Troubleshooting Tips
Troubleshoot one component at a time:
- Physical NICs
- Virtual Switch
- Virtual NICs
- Physical Network
Tools for troubleshooting:
- vSphere Client
- Command line utilities (see the sketch below)
- ESXTOP
- Third-party tools
  - Ping and traceroute
  - Traffic sniffers & protocol analyzers (Wireshark)
- Logs
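A few of the service console commands commonly used when walking through these components, one at a time; a hedged sketch in which the vmkernel target IP is an example:

  esxcfg-nics -l         # list physical NICs, link state, speed, and duplex
  esxcfg-vswitch -l      # list vSwitches, port groups, VLANs, and uplinks
  esxcfg-vmknic -l       # list vmkernel interfaces and their IP configuration
  vmkping 10.10.106.22   # test vmkernel connectivity (e.g. another host's VMotion IP)
  esxtop                 # press 'n' for the network view to watch per-port statistics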
Capturing Traffic
- Best Practice: create a new management interface for this purpose
- The vSwitch must be in Promiscuous Mode to capture traffic destined for other ports (KBs 1004099 & 1002934)
- ESXi uses tcpdump-uw (KB 1031186)
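On ESXi, a capture of a vmkernel interface might look like the following; the interface name and file path are example values:

  # capture vmk0 traffic to a file for later analysis in Wireshark
  tcpdump-uw -i vmk0 -s 1514 -w /tmp/vmk0-capture.pcap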
What's New in vSphere 5.0
What's New in vSphere 5?
- Monitor and troubleshoot virtual infrastructure traffic:
  - NetFlow v5
  - Port mirror (SPAN)
- LLDP (standards-based link layer discovery protocol) support simplifies network configuration and management in non-Cisco switch environments
- Enhancements to Network I/O Control (NetIOC):
  - Ability to create user-defined resource pools
  - Support for a vSphere replication traffic type: a new system traffic type that carries replication traffic from one host to another
  - Support for IEEE 802.1p tagging
See the "What's New in VMware vSphere 5.0 - Networking" technical whitepaper
Network Design Considerations
Network Design Considerations
How do you design the virtual network for performance and availability while maintaining isolation between the various traffic types (e.g. VM traffic, VMotion, and Management)?
The starting point depends on:
- The number of available physical ports on the server
- The required traffic types
Guidelines:
- 2 NICs minimum for availability; 4+ NICs per server preferred
- 802.1Q VLAN trunking highly recommended for logical scaling (particularly with low-NIC-port servers)
The examples that follow are meant as guidance and do not represent strict design requirements; understand your requirements and resultant traffic types and design accordingly.
Example 1: Blade Server with 2 NIC Ports
Candidate design:
- Create one virtual switch and team both NIC ports (vmnic0, vmnic1)
- Create three port groups, each with an Active/Standby failover policy:
  - Portgroup1: Service Console (SC), VLAN 10
  - Portgroup2: VMotion, VLAN 20
  - Portgroup3: VM traffic, VLAN 30
- Use VLAN trunking: trunk VLANs 10, 20, 30 on each uplink
Note: team over dvUplinks with a vDS. A configuration sketch follows.
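A hedged sketch of this design from the service console, assuming the default vSwitch0 (where a Service Console port group typically already exists) and the VLAN IDs above; Active/Standby ordering per port group is set through the vSphere Client:

  esxcfg-vswitch -L vmnic0 vSwitch0             # uplink 1
  esxcfg-vswitch -L vmnic1 vSwitch0             # uplink 2 (team)
  esxcfg-vswitch -v 10 -p "Service Console" vSwitch0
  esxcfg-vswitch -A "VMotion" vSwitch0
  esxcfg-vswitch -v 20 -p "VMotion" vSwitch0
  esxcfg-vswitch -A "VM Traffic" vSwitch0
  esxcfg-vswitch -v 30 -p "VM Traffic" vSwitch0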
Example 2: Server with 4 NIC Ports
Candidate design:
- Create two virtual switches, teaming two NICs to each
- vSwitch0 (vmnic1 and vmnic3; use Active/Standby for each port group):
  - Portgroup1: Service Console (SC), VLAN 10
  - Portgroup2: VMotion, VLAN 20
- vSwitch1 (vmnic0 and vmnic2; use Originating Virtual Port ID):
  - Portgroup3: VM traffic #1, VLAN 30
  - Portgroup4: VM traffic #2, VLAN 40
- Use VLAN trunking:
  - vmnic1 and vmnic3: trunk VLANs 10, 20
  - vmnic0 and vmnic2: trunk VLANs 30, 40
Note: team over dvUplinks with a vDS.
Example 3: Server with 4 NIC Ports (Slight Variation)
Candidate design:
- Create one virtual switch (vSwitch0) with two NIC teams
- Use Active/Standby for Portgroups 1 & 2:
  - Portgroup1: Service Console (SC), VLAN 10
  - Portgroup2: VMotion, VLAN 20
- Use Originating Virtual Port ID for Portgroups 3 & 4:
  - Portgroup3: VM traffic #1, VLAN 30
  - Portgroup4: VM traffic #2, VLAN 40
- Use VLAN trunking:
  - vmnic1 and vmnic3: trunk VLANs 10, 20
  - vmnic0 and vmnic2: trunk VLANs 30, 40
Note: team over dvUplinks with a vDS.
Questions