MLAG on Linux - Lessons Learned. Scott Emery, Wilson Kok, Cumulus Networks Inc.




Agenda
- MLAG introduction and use cases
- Lessons learned
- MLAG control plane model
- MLAG data plane
- Linux kernel requirements
- Other important changes and considerations

MLAG introduction
- MLAG: a LAG across more than one node
- multi-homing for redundancy
- active-active to utilize all links which otherwise may get blocked by Spanning Tree
- no modification of LAG partner

MLAG terminology
- ISL: inter-switch link
- Primary role
- Secondary role
- Dually connected
- Singly connected

MLAG use case - hypervisor
- no MLAG: striping by VM MACs or other policies
- MLAG: it's a bond
[Diagram: in each case a host virtual switch with uplinks eth0 and eth1 connects to two switches.]

MLAG use case - L2 fabric
- no blocking links, full utilization of bandwidth
- load balancing and redundancy offered by LAG


Lessons learned
- L2 can be dangerous! Fail open by default, no TTL, unknown means flood...
- MLAG: more ways to live dangerously
- Rigorous and conservative interface state management needed
- Temporary loops or duplicates not acceptable

Lessons learned
- Fast convergence depends on a lot of things done right
- Proper daemon up/down sequences:
  - UP: STPd up > MLAGd up > interface enable
  - DOWN: interface disable > MLAGd down > STPd down
- Avoid split brain as much as possible:
  - changing LACP system id flaps bonds
  - have multiple heartbeat channels between MLAG daemons
- Failures besides link and node down do happen (e.g. daemon crash) and should not melt the network
- Need to fail closed, e.g. monit clean up
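The up/down ordering above can be sketched as a pair of helpers that walk one list forwards and backwards. This is only an illustration of the sequencing: the component names and the `start`/`stop` callbacks are hypothetical, and the slides do not say how the real daemons are launched.

```python
# Hypothetical component names; the slide only fixes their relative order.
UP_SEQUENCE = ["stpd", "mlagd", "interfaces"]


def bring_up(start):
    """Start components in the order STPd > MLAGd > interface enable."""
    started = []
    for name in UP_SEQUENCE:
        start(name)
        started.append(name)
    return started


def bring_down(stop):
    """Stop components in the exact reverse order of bring-up."""
    stopped = []
    for name in reversed(UP_SEQUENCE):
        stop(name)
        stopped.append(name)
    return stopped


# Usage: record the order in which the callbacks fire.
log = []
bring_up(lambda name: log.append(("up", name)))
bring_down(lambda name: log.append(("down", name)))
```

Deriving the down sequence from the up sequence keeps the two from drifting apart when a component is added.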

MLAG control plane model
- Linux kernel enforces default interface state on MLAG-enabled interfaces
- User space MLAG daemon maintains MLAG configuration, controls the formation of MLAG, and updates interface state and data path
- Analogous to the user space Spanning Tree model

MLAG data plane
- L2 must never have loops; redundant paths are blocked
- But we want to utilize all links, so we cannot block
- Answer: make the links appear logically the same to the protocols that are supposed to protect you!

MLAG data plane rules
- same packet is not delivered to a node more than once
- packet sourced from a dually connected node is not delivered back to the same node
- This means packets crossing the ISL and destined to:
  - dually connected links => drop
  - singly connected links => forward
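The ISL rule above reduces to a single check on the ingress and egress link types. The sketch below is illustrative only: `LinkKind` and `should_forward` are hypothetical names, and it assumes the bridge already refuses to send a frame back out of its ingress port.

```python
from enum import Enum


class LinkKind(Enum):
    ISL = "isl"
    DUAL = "dually connected"
    SINGLE = "singly connected"


def should_forward(ingress: LinkKind, egress: LinkKind) -> bool:
    """Return True if a frame may leave on `egress` given where it arrived."""
    if ingress is LinkKind.ISL and egress is LinkKind.DUAL:
        # The peer switch has its own leg of this MLAG, so forwarding here
        # would deliver a duplicate or return the frame to its source node.
        return False
    return True
```

Frames that arrive on a dually or singly connected link are unaffected; only traffic that has already crossed the ISL is restricted.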

Minimum Linux kernel requirements
- ability to set LACP system ID on bond independent of bond MAC address
- mlag_enable attribute on bond
- mechanism to keep member interface carrier down independent of admin state: IFF_PROTO_DOWN
- duplicate filtering of packets crossing the ISL

Interface bring up
- user enables an mlag (bond with mlag_enable = 1)
- bonding driver keeps the bond and all its slaves down
- MLAG daemon puts bond in dormant interface mode to begin
- when MLAG daemon peering is complete:
  - sets mlag LACP system id on bond (802.3ad mode)
  - brings slaves up
  - LACP can run, no data traffic
- LACP converges, bond moves from oper down to oper dormant
- MLAG daemon verifies MLAG membership, installs egress filter, then sets bond to oper up
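The bring-up steps above can be modelled as a small state machine that rejects out-of-order events. This is a sketch under assumptions: the class, its method names, and the example system id are hypothetical and do not come from the MLAG daemon or the bonding driver.

```python
class MlagBond:
    """Toy model of the bring-up sequence for one MLAG-enabled bond."""

    def __init__(self, name):
        self.name = name
        self.oper_state = "down"        # bonding driver holds the bond down
        self.slaves_up = False          # ...and all of its slaves
        self.lacp_system_id = None
        self.egress_filter_installed = False

    def on_peering_complete(self, mlag_system_id):
        # Set the shared LACP system id first, then let the slaves come up
        # so that LACP can run. No data traffic flows yet.
        self.lacp_system_id = mlag_system_id
        self.slaves_up = True

    def on_lacp_converged(self):
        if not self.slaves_up:
            raise RuntimeError("LACP cannot converge before peering completes")
        self.oper_state = "dormant"

    def on_membership_verified(self):
        if self.oper_state != "dormant":
            raise RuntimeError("bond must be oper dormant before going up")
        # The egress filter goes in before the bond carries data.
        self.egress_filter_installed = True
        self.oper_state = "up"


# Usage with a made-up system id.
bond = MlagBond("bond1")
bond.on_peering_complete("44:38:39:ff:00:01")
bond.on_lacp_converged()
bond.on_membership_verified()
```

The ordering inside `on_membership_verified` mirrors the slide: the filter is installed before the state flips to up, so there is no window in which duplicates can leak.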

Split brain handling
- MLAG daemon pair cannot talk to each other
- ISL down but MLAG daemons alive
- MLAG daemon with secondary role keeps all MLAGs in down state with IFF_PROTO_DOWN
- IFF_PROTO_DOWN indicates to kernel to not bring bond slaves carrier up until it is cleared
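A minimal sketch of the secondary's decision, assuming the daemon knows its role and whether the peer session is up. The function and parameter names are hypothetical. The slide does not describe what the primary does, or how a dead peer is told apart from an unreachable one, so neither is modelled here.

```python
def protodown_plan(role, peer_session_up, mlag_bonds):
    """Map each MLAG bond to True if it should be held protocol-down."""
    # Only the secondary backs off, and only while it cannot reach its peer.
    hold_down = role == "secondary" and not peer_session_up
    return {bond: hold_down for bond in mlag_bonds}


# Usage: the secondary loses its peer session and holds both bonds down.
plan = protodown_plan("secondary", False, ["bond1", "bond2"])
```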

Duplicate Filtering
- Packet ingress on ISL should only egress on singly connected links
- use ebtables: -i <ISL> -o <dually connected interface> -j DROP
- rule MUST be installed before dually connected interface is oper up
- rule MUST be uninstalled as soon as interface becomes singly connected
- One rule per dually connected interface; not scalable, especially in the case of non VLAN-aware bridge model with many bridges and many VLANs
- Better if:
  - ebtables can filter on the parent interface, e.g. eth1 instead of eth1.100, eth1.101, eth1.102, or
  - bridge driver can make use of the knowledge of which link is ISL and which are dually connected
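To make the scaling concern concrete, the sketch below builds the ebtables argument lists for a set of dually connected interfaces, first as one rule per interface and then as one rule per interface per VLAN, as in the non VLAN-aware bridge model. The interface names are made up, and placing the rule in the FORWARD chain is an assumption, since the slide gives only the match and the target. Nothing is executed; the functions only return the commands.

```python
def drop_rule(isl, dual_iface):
    """ebtables arguments that drop frames bridged from the ISL to one interface."""
    # FORWARD chain is an assumption; the slide shows only -i/-o/-j.
    return ["ebtables", "-A", "FORWARD", "-i", isl, "-o", dual_iface, "-j", "DROP"]


def per_interface_rules(isl, dual_ifaces):
    """One rule per dually connected interface."""
    return [drop_rule(isl, iface) for iface in dual_ifaces]


def per_vlan_rules(isl, dual_ifaces, vlans):
    """One rule per VLAN subinterface pair, e.g. peerlink.100 -> bond1.100."""
    rules = []
    for vlan in vlans:
        for iface in dual_ifaces:
            rules.append(drop_rule(f"{isl}.{vlan}", f"{iface}.{vlan}"))
    return rules


# Usage with hypothetical names: 2 bonds and 3 VLANs already need 6 rules.
rules = per_vlan_rules("peerlink", ["bond1", "bond2"], [100, 101, 102])
```

The rule count grows as interfaces times VLANs, which is why the slide asks for filtering on the parent interface or for ISL awareness in the bridge driver.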

Possible other Linux kernel requirements
- interface attribute to indicate ISL
- knowledge of the dual-connectedness of a link
- knowledge of mlag id of interfaces
- bridge filtering modifications based upon above

Other important changes and considerations
- Spanning Tree changes
- MAC address management
- IGMP group membership handling
- MLAG control traffic treatment

Spanning Tree Changes
- STP daemon connects to MLAG daemon and learns:
  - which is ISL
  - singly/dually connected interfaces and their MLAG id
  - when MLAG peering is up or down
- STP needs to run as if the two switches are one. Multiple approaches possible:
  - master STP daemon runs the protocol and maintains full state sync with the slave STP daemon, or
  - each STP daemon does independent calculation: loosely coupled, distributed processing
- Loosely coupled model is simpler and more scalable

Spanning Tree - Loosely coupled model
- use common bridge id (MLAG system id) when generating BPDUs
- only MLAG primary switch sends BPDU on dually connected links
- both MLAG switches send BPDU on singly connected links
- BPDU received on root port is processed and also relayed across ISL, replacing source MAC with MLAG id
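The transmit and relay rules of the loosely coupled model can be sketched as two small functions. All names are hypothetical, a BPDU is modelled as a plain dict, and the slide does not say how the MLAG id is encoded in the source address, so the relay simply stores the id in that field.

```python
def sends_bpdu(role, link_kind):
    """Decide whether this switch transmits BPDUs on a given link."""
    if link_kind == "dual":
        # Only the primary speaks on a shared MLAG, so the partner sees
        # a single bridge rather than two.
        return role == "primary"
    if link_kind == "single":
        return True
    raise ValueError(f"unknown link kind: {link_kind!r}")


def relay_over_isl(bpdu, mlag_id):
    """Return a copy of a root-port BPDU prepared for relay across the ISL."""
    relayed = dict(bpdu)
    relayed["src_mac"] = mlag_id  # encoding of the id is an assumption
    return relayed


# Usage with made-up values.
original = {"src_mac": "00:11:22:33:44:55", "root_id": "8000.001122334455"}
relayed = relay_over_isl(original, mlag_id=7)
```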

MAC address management
- Goals:
  - reduce unknown flood
  - eliminate constant MAC moves between ISL and MLAG
- Solution:
  - disable learning on ISL
  - synchronize MAC addresses:
    - install address learned on MLAG on one side to corresponding MLAG on the other side
    - install address learned on singly connected link on ISL on other side
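The two synchronization rules amount to a lookup on the peer: given where the other switch learned an address, choose the local port to install it on. The function name, arguments, and port names below are hypothetical, and the case where the matching local MLAG is down is not covered by the slide or by this sketch.

```python
def install_port_on_peer(learned_mlag_id, peer_mlag_ports, peer_isl):
    """Pick the port on which the peer installs a synchronized MAC address.

    learned_mlag_id: MLAG id of the port that learned the address on the
        other switch, or None if that port is singly connected.
    peer_mlag_ports: mapping of MLAG id -> local bond name on this switch.
    peer_isl: name of the local ISL interface.
    """
    if learned_mlag_id is None:
        # Host hangs off the other switch only, so it is reached via the ISL.
        return peer_isl
    # Same MLAG id on both sides means the same dually connected host.
    return peer_mlag_ports[learned_mlag_id]


# Usage with made-up port names.
ports = {1: "bond1", 2: "bond2"}
via_mlag = install_port_on_peer(2, ports, "peerlink")
via_isl = install_port_on_peer(None, ports, "peerlink")
```

Because both switches install entries explicitly, learning on the ISL can stay disabled without causing unknown-unicast floods for hosts behind the peer.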

IGMP Snooping
- MLAG daemons synchronize between themselves:
  - IGMP group membership for dually connected links
  - mrouter port information
- reports/queries may need to be flooded; the same duplicate filtering rule applies

MLAG control traffic
- control traffic shares the ISL with data traffic and needs to be given higher priority
- to stay independent of data traffic topology changes, use a separate VLAN device on the ISL which is not bridged
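One way to picture the separate VLAN device is the iproute2 commands that would create it. The sketch below only builds the argument lists and runs nothing; the interface name `peerlink` and VLAN id 4094 are made-up examples, and how the control traffic is prioritized is not shown because the slide does not specify it.

```python
def control_vlan_commands(isl, vlan_id):
    """iproute2 commands for a VLAN device on the ISL, left out of any bridge."""
    dev = f"{isl}.{vlan_id}"
    return [
        # Create the VLAN subinterface on top of the ISL.
        ["ip", "link", "add", "link", isl, "name", dev, "type", "vlan", "id", str(vlan_id)],
        # Bring it up; it is deliberately never enslaved to a bridge.
        ["ip", "link", "set", "dev", dev, "up"],
    ]


# Usage with hypothetical values.
commands = control_vlan_commands("peerlink", 4094)
```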

While we're at it...
- VLAN-aware bridge driver: great enhancement! More work needed:
  - scalability: vlan range*, per port per vlan local fdb*
  - usability: limited to single STP instance, per bridge igmp snooping control
- Bonding driver: a few issues with slave active state setting and MUX machine transitions*
(*patches submitted upstream)

Thank You!