1 BGP Prefix Independent Convergence draft-rtgwg-bgp-pic-00 Authors :, Cisco Systems Presenter : Clarence Filsfils, Cisco Systems Pradosh Mohapatra, Cisco Systems IETF85, Nov/2012 Atlanta, USA
2 Agenda Problem Solution Recovery from various failure scenarios
3 Agenda Problem Solution Recovery from various failure scenarios
What do we want to Achieve? 4 Serial nature of reachability propagation BGP convergence is inherently slow Can we adjust the data structure in time complexity that does not depend on BGP prefixes? If a path is lost or gained, modifications should be independent of the number of prefixes If there is a recovery, recover as soon as possible independent of the number of prefixes Objective Make re-convergence after topology change independent of the number of BGP prefixes
Terminology 5 Leaf: A prefix or local label container datastructure Path: a recursive or non-recursive path May be primary or backup Pathlist: an array of paths Each path carries its own pathindex OutLabel-Array: Array of outgoing labels and/or label actions* A possibly different outlabel-array attached to each leaf Each entry represents an outgoing label and/or label action for a path in the pathlist** Adjacency: The L2 information to send a packet to a directly connected router
6 Agenda Problem Solution Recovery from various failure scenarios
Basic Idea 7 BGP-PIC is really a FIB feature Hierarchical and shared forwarding chains On topology changes FIB modifies pathlists immediately without modifying leaves Restore traffic very quickly BGP modifies leaves later at a slower pace Behavior is internal to the router Completely transparent to operator Incrementally deployable
Sample Topology: VPN in LDP Core 8 LDP Core Two IGP paths Two BGP paths*
Forwarding Chain on Ingress PE 9 VPN-L21 VPN-L31 IGP leaf LDP-L11 LDP-L21 VPN leaf 1 NH1 NH2 Adj1 Shared BGP Pathlist (even among VRFs) Shared IGP Pathlist Adj2 VPN leaf 2 VPN-L22 IGP leaf LDP-L12 VPN-L32 BGP OutLabel Array LDP-L22 IGP OutLabel Array
Forwarding Chain on Egress 10 BGP OutLabel Array Unlabaled Swap VPN-L() VPN leaf IP prefix Leaf for Connected route Towards CE NH to CE VPN leaf Local Label address Shared BGP Pathlist IGP Leaf through core Towards
11 Agenda Problem Solution Recovery from various failure scenarios
Core Link/Node Failure 12 A Works whether BGP is running In the core or not B Failure IGP/LDP Convergence C
Core Link/Node Failure: FIB on 13 BGP leaf Core link Failure pathlist BGP leaf IGP/LDP Convergence IGP leaf pathlist BGP leaf IGP leaf BGP pathlist NOT modified pathlist NH1 NH1 Only IGP pathlist is modified IGP leaf NH2
14 Unipath PE Node* Failure: Protection at Ingress PE CE1 CE2 Works whether BGP is running in the core or not** must be pre- programmed as a backup path Core or Edge node Failure CE1 CE2 FIB Correction CE1 CE2
Unipath PE Node/PE-CE Link failure: FIB on 15 -LI -LI Failure must be pre- programmed as a backup path Backup PE BGP Pathlist recalculation -LI -LI Outlabel-array array does NOT change NH1 NH2 IGP leaf -LI -LI NH1 NH2 Path keeps its original index Use backup PE and backwalk to recalculate higher level LDIs IGP withdraws The leaf for NH2
Unipath PE-CE Link Failure: Protection at Egress PE 16 CE1 More useful for BGP-free core CE2 Locally detect failure 50ms recovery!! Failure CE1 CE2 FIB Correction CE1 CE2
Unipath PE-CE Link failure: FIB on 17 unlabeled Swap Backup path Going to Failure BGP Pathlist recalculation must be pre- programmed as a backup path CE1 Outlabel-array array does NOT change LDP LI To unlabeled Swap NH1 NH2 NH1 Locally detect failure 50ms recovery!! NH2 LDP Label To Path keeps its original index LDP LI To unlabeled Swap Swap incoming VPN label with VPN label then push transport label for NH2
Conclusion 18 Single elegant design to handle many convergence/protection cases in a BGP prefix independent time