Advances in BGP Gunter Van de Velde Sr. Technical Leader
What is BGP?
What does a Google search for the abbreviation "bgp" find? (Source: http://www.all-acronyms.com/bgp)
- Border Gateway Protocol
- Bacterial Growth Potential
- Battlegroup
- Becker, Green and Pearson
- <censored entry>
- Bermuda grass pollen
- Berri Gas Plant
- beta-glycerophosphate
- biliary glycoprotein
- blood group
- bone gamma-carboxyglutamic acid protei
- bone gamma-carboxyglutamic acid-contai
- bone gla protein
- bone Gla-containing protein
- Borders Group, Inc.
- brain-type glycogen phosphorylase
- Bridge Gateway Protocol
- Broader Gateway Protocol
- Bureau de Gestion de Projet
- Brain Gain Program
Without BGP the Internet would not exist in its current stable and simple form. It is the plumbing technology of the Internet.
Agenda
- Motivation to Enhance BGP
- Scale and Performance Enhancements
- What happened in BGP Landscape?
- Some new Cool features that may interest you
BGP started in 1989
Motivation and development of BGP: as the Internet grew and moved to an autonomous system (AS) mesh architecture, it needed a stable, non-chatty protocol with low CPU consumption to connect all of these ASes together. In June 1989 the first version of this new routing protocol was formalized with the publication of RFC 1105, A Border Gateway Protocol (BGP).
Service Provider Routing and Services progress
- Multimedia, Mobile Internet and Cloud Services will generate a massive bandwidth explosion
- Prefix growth is almost a linear curve
- Offered BGP services evolve from basic technologies to very advanced infrastructures
Control-plane Evolution
Most services are progressing towards BGP

Service/transport      | Before 2008   | 2013 and future
IDR (Peering)          | BGP           | BGP (IPv6)
SP L3VPN               | BGP           | BGP + FRR + Scalability
SP Multicast VPN       | PIM           | BGP Multicast VPN
DDOS mitigation        | CLI           | BGP flowspec
Network Monitoring     | SNMP          | BGP monitoring protocol
Security               | Filters       | BGP Sec (RPKI), DDoS Mitigation
Proximity              | -             | BGP connected app API
SP-L3VPN-DC            | -             | BGP Inter-AS, VPN4DC
Business & CE L2VPN    | LDP           | BGP PW Sign (VPLS)
DC Interconnect L2VPN  | -             | BGP MAC Sign (EVPN)
MPLS transport         | LDP           | BGP+Label (Unified MPLS)
Data Center            | OSPF/ISIS     | BGP + Multipath
Massive Scale DMVPN    | NHRP / EIGRP  | BGP + Path Diversity
Campus/Ent L3VPN       | BGP (IOS)     | BGP (NX-OS)
Why is BGP so successful?
- Robustness: runs over TCP
- Low-overhead protocol: sends an update once and then remains silent
- Scalability: path vector protocol, allows full mesh
- High availability: NSR, PIC
- Well known: tons of engineers know BGP
- Simplicity: BGP is simple (even if knobs make BGP big and sometimes less trivial to read)
- Multi-protocol: IPv4, IPv6, L2VPN, L3VPN, Multicast
- Incremental: easy to extend (NLRI, Path Attribute, Community)
- Flexible: policy
Scale & Performance Enhancements: BGP Scaling
- Update Generation Enhancements
  - Update generation is the most important, time-critical task
  - It is now a separate process, which gives it more CPU quantum
- Parallel Route Refresh
  - Significant delay (up to 15-30 minutes) was seen in advertising incremental updates while the RR was servicing route refresh requests or converging newly established peers
  - Refresh and incremental updates now run in parallel
- Keepalive Enhancements
  - Lost or delayed keepalive messages result in session flaps
  - Keepalive processing is therefore now placed in a separate process using a priority queuing mechanism
- Adaptive Update Cache Size
  - Instead of using a fixed cache size, the new code dynamically adapts to the address family used, the available router memory and the number of peers in an update group
Scale & Performance Enhancements
PE Scaling
- PE-CE Optimization
  - In old code, slow convergence was experienced with large numbers of CEs
  - Improved by intelligently evaluating VPN prefixes based upon the prefixes in the CE's VRF
- VRF-Based Advertise Bits
  - Memory consumption increased when the number of VRFs was scaled on a PE
  - Smart reuse of advertise bit space for VRFs
Route Reflector Scaling
- Selective RIB Download
  - A route reflector needs to receive the full RIB, but not all prefixes must be in the Forwarding Information Base (FIB)
  - A user policy can now select which prefixes are downloaded into the FIB
More about BGP performance tuning in BRKRST-3321
Slow Peer Management
BGP Resiliency/HA Enhancement
- Issue: slow peers in update groups block convergence of other update group members by filling message queues or transmitting slowly
- Persistent network issue affecting all BGP routers
- Two components to the solution: detection and protection
- Detection
  - BGP update timestamps
  - Peer's TCP connection characteristics
Slow Peer Management
BGP Resiliency/HA Enhancement
- Protection
  - Move slower peers out of the update group
  - A separate slow update group with matching policies is created
  - Any slow members are moved to the slow update group
- Detection can be automatic or manual with a CLI command
- Automatic recovery
  - Slow peers are periodically checked for recovery
  - Recovered peers rejoin the main update group
- Isolation of slow peers unblocks faster peers and lets them converge as fast as possible
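A minimal sketch of the detect/protect/recover cycle described on the two slides above. It is an illustration only, not the router implementation: the `Peer` fields, the `rebalance` helper and the 300-second threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical threshold: a peer whose oldest queued update is older than
# this many seconds is treated as slow. The slides do not give a value.
SLOW_THRESHOLD_S = 300.0


@dataclass
class Peer:
    name: str
    oldest_queued_update_age_s: float  # would be derived from BGP update timestamps


def rebalance(main_group, slow_group, threshold=SLOW_THRESHOLD_S):
    """One periodic pass: slow peers leave the main group, recovered peers rejoin it."""
    peers = main_group + slow_group
    main = [p for p in peers if p.oldest_queued_update_age_s <= threshold]
    slow = [p for p in peers if p.oldest_queued_update_age_s > threshold]
    return main, slow


main, slow = rebalance(
    main_group=[Peer("a", 1.0), Peer("b", 900.0)],
    slow_group=[Peer("c", 2.0)],  # previously slow, now recovered
)
```

After the pass, peer `b` no longer holds back `a`, and the recovered peer `c` is back in the main update group.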
ASR1000 RP2, RP1, ASR1001 and 7200 BGP Route and Session Scalability Comparison - RR

              | 7200 NPE-G2 (2GB) | ASR1000 RP1 (4GB) | ASR1001 (4GB) | ASR1001 (8GB) | ASR1001 (16GB) | ASR1000 RP2 (8GB) | ASR1000 RP2 (16GB)
ipv4 routes   | 4M                | 7M*               | 2M*           | 9M*           | 17M*           | 12M*              | 29M*
vpnv4 routes  | 7M                | 6M                | 2M            | 8M            | 16M            | 10M               | 24M
ipv6 routes   | 2M                | 5M*               | 2M*           | 8M*           | 15M*           | 9M*               | 24M*
vpnv6 routes  | 6M                | 5M                | 1.5M          | 7.5M          | 14.5M          | 9M                | 21M
BGP sessions  | <1000             | 4000              | 4000          | 4000          | 4000           | 8000              | 8000

* Tested with the BGP selective download feature for ipv4/ipv6 for a dedicated RR application. This feature prevents ipv4/ipv6 BGP routes from being installed in the RIB and FIB. It reduces memory usage per ipv4/ipv6 prefix and CPU utilization.
ASR 1000 with RP1 allocates ~1.7GB to IOSd and ASR 1001 with 4GB allocates ~1.4GB to IOSd, whereas on NPE-G2 the entire 2GB is used by IOS.
ASR 1000 RP1 and RP2 Convergence Performance Comparison - RR (for your reference)
Tested with 1M total unique routes. Total routes reflected by the RR to all clients = number of routes x number of clients. Convergence times are in seconds.

Test                          | Total routes reflected | ASR1000 RP1 (4GB) | ASR1001 (16GB) | ASR1000 RP2 (16GB)
ipv4 (1K RR clients)          | 1 Billion              | 220               | 133            | 75
vpnv4 (1K RR clients, 8K RT)  | 1 Billion              | 680               | 489            | 221
ipv6 (1K RR clients)          | 1 Billion              | 720               | 393            | 194
vpnv6 (1K RR clients, 8K RT)  | 1 Billion              | 877               | 811            | 293
ipv4 (2K RR clients)          | 2 Billion              | 375               | 270            | 138
vpnv4 (2K RR clients, 8K RT)  | 2 Billion              | 1285              | 797            | 394
ipv6 (2K RR clients)          | 2 Billion              | 1126              | 897            | 284
vpnv6 (2K RR clients, 8K RT)  | 2 Billion              | 1766              | 1691           | 551

- Tested with peer groups (1K RR clients per peer group)
- 7200 NPE-G2 cannot converge in the above test cases
- ASR1000 RP2 converges about twice as fast as 7200 NPE-G2, based on RR customer profile testing
- CPU utilization is below 5% after convergence
- Link to Isocore report: http://www.cisco.com/en/us/prod/collateral/routers/ps9343/itd13029-asr1000-rp2validationv1_1.pdf
What Happened in XR Landscape?
Releases: 4.0, 4.1, 4.1.1, 4.2, 4.2.1, 4.2.3, 4.2.4, 4.3.0, 4.3.1
- RT-Constraint
- Multi-Instance/Multi-AS
- Attribute Filtering and Error handling
- BGP Based DDoS Mitigation
- Add Path Support
- Accumulated Interior Gateway Protocol (AIGP) Metric Attribute
- Unipath PIC for non-VPN address families (6PE/IPv6/IPv4 Unicast)
- BGP Accept Own
- BGP 3107 PIC Update for Global Prefixes
- Prefix Origin Validation based on RPKI
- PIC for RIB and FIB
- DMZ Link Bandwidth for Unequal Cost Recursive Load Balancing
- Selective VRF Download
- 6PE/6vPE over L2TPv3
- Next-Generation Multicast VPN
What Happened in IOS Landscape?
Releases: 15.2(1)S, 15.2(2)S, 15.2(4)S, 15.3(1)S, 15.3(2)S
- Origin AS Validation
- Graceful Shutdown
- iBGP NSR
- mVPN BGP SAFI 129
- NSR without Route-Refresh
- Additional Path
- Attribute Filtering and Error Handling
- Diverse Path
- Graceful Shutdown
- IPv6 client for Single hop BFD
- IPv6 PIC Core and Edge
- RT Constraint
- IP Prefix export from a VRF into global Table
- mVPNv6 Extranet Support
- Local-AS allow-policy
- RT/VPN-ID Attribute Rewrite Wildcard
- VRF Aware Conditional Announcement
What Happened in XE Landscape?
Releases: 3.8, 3.9
- Multicast VPN BGP Dampening
- Multiple Cluster IDs
- VPN Distinguisher Attribute
- IPv6 NSR
- Local-AS Allow-policy
- RT or VPN-ID Rewrite Wildcard
- VRF Aware Conditional Advertisement
What Happened in NX-OS Landscape?
Releases: 5.2, 6.0, 6.1, 6.2
- Prefix Independent Convergence (Core)
- local-as
- AS Override (allowas-in)
- Disable 4-byte AS advertisement
- MP BGP MPLS VPNs, 6PE, MDT
- BGP AddPath
- BGP send community both
- BGP Neighbor AF weight command
- BGP med confed and AS multipath-relax
- BGP next hop self for route reflector
- Default information originate support
- Flexible distance manipulation with Inject map
- Unsuppress map
- as-format command for AS-plain & AS-dot
- Enhancements for removal of private AS
- Enable route target import-export in default VRF
- InterAS option B-lite
- BGP Authentication for Prefix-based neighbors
PIC Edge Feature Overview
- Internet Service Providers offer strict SLAs to their financial and business VPN customers, which require sub-second convergence in case of core/edge link or node failures in their network
- Prefix Independent Convergence (PIC) has been supported in IOS-XR/IOS for a while for core link failures as well as edge node failures
- The BGP Best-External project adds support for advertising the best-external path to the iBGP/RR peers when the locally selected bestpath is from an internal peer
- BGP PIC Unipath provides the capability to install a backup path into the forwarding table, giving prefix independent convergence in case of a PE-CE link failure
PIC Edge: PE-CE Link Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow]
- PE3 configured as primary, PE4 as backup
- PE3 preferred over PE4 by local preference
- CE2 has different RDs in the VRFs on PE3 and PE4
- PE4: advertise-best-external, to advertise the route via the PE4-CE2 link
- PE3: additional-paths install, to install the primary and backup path
PIC Edge: Link Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow]
- PE3 has a primary and a backup path
- Primary via the directly connected PE3-CE2 link
- Backup via the PE4 best external route
What happens when the PE3-CE2 link fails?
PIC Edge: Link Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow]
- CEF (via BFD or a link layer mechanism) detects the PE3-CE2 link failure
- CEF immediately swaps to the repair path label
- Traffic is shunted to PE4 and across the PE4-CE2 link
PIC Edge: Link Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow; "Withdraw route via PE3"]
- PE3 withdraws the route via the PE3-CE2 link
- The update is propagated to remote PE routers
PIC Edge: Link Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow; "Withdraw route via PE3"]
- BGP on remote PEs selects a new bestpath
- The new bestpath is via PE4
- Traffic flows directly to PE4 instead of via PE3
PIC Edge: Edge Node Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow]
- PE3 configured as primary, PE4 as backup
- PE3 preferred over PE4 by local preference
- CE2 has different RDs in the VRFs on PE3 and PE4
- PE4: advertise-best-external, to advertise the route via the PE4-CE2 link
- PE1: additional-paths install, to install the primary and backup path
PIC Edge: Edge Node Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow]
- PE1 has a primary and a backup path
- Primary via PE3
- Backup via the PE4 best external route
What happens when node PE3 fails?
PIC Edge: Edge Node Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow; PE3's /32 host route removed from IGP]
PIC Edge: Edge Node Protection
BGP Resiliency/HA Enhancement
[Figure: PE1, PE2, RR, PE3 (Primary), PE4 (Backup), MPLS cloud; CE1 at VPN1 Site #1 (10.1.1.0/24); CE2 at VPN1 Site #2 (10.2.2.0/24); traffic flow; PE3's /32 host route removed from IGP]
- PE1 detects the loss of PE3's /32 host route in the IGP
- CEF immediately swaps the forwarding destination label from PE3 to PE4 using the backup path
- BGP on PE1 computes a new bestpath later, choosing PE4
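Why the swap can be "prefix independent" is easier to see in a small sketch. The idea illustrated here is that many prefixes share one forwarding object that already holds the backup, so a failure is handled by flipping one flag rather than rewriting every prefix. The `PathList` class and `build_fib` helper are hypothetical names for this example, not CEF data structures.

```python
class PathList:
    """Shared forwarding object: one primary and one precomputed backup next hop."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_up = True

    def next_hop(self):
        return self.primary if self.primary_up else self.backup


def build_fib(prefixes, path_list):
    # Every prefix points at the same PathList object (no per-prefix copy).
    return {prefix: path_list for prefix in prefixes}


# 256 illustrative prefixes behind PE3, with PE4 installed as backup.
fib = build_fib([f"10.2.{i}.0/24" for i in range(256)], PathList("PE3", "PE4"))

# PE1 loses PE3's /32 host route: a single update repairs all prefixes at once.
fib["10.2.2.0/24"].primary_up = False
```

BGP still recomputes the bestpath afterwards, but forwarding has already moved to PE4 for every prefix that shares the path list.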
Enabling BGP PIC: Enabling IP Routing Fast Convergence (for your reference)
- BGP PIC leverages IGP convergence; make sure the IGP converges quickly
- IOS-XR: IGP timers are pretty much tuned by default
- IOS: sample OSPF config:

    process-max-time 50
    ip routing protocol purge interface
    interface
     carrier-delay msec 0
     negotiation auto
     ip ospf network point-to-point
     bfd interval 100 min_rx 100 mul 3
    router ospf 1
     ispf
     timers throttle spf 50 100 5000
     timers throttle lsa all 0 20 1000
     timers lsa arrival 20
     timers pacing flood 15
     passive-interface Loopback 0
     bfd all-interfaces
Enabling BGP PIC Edge: IOS-XR (for your reference)
Two BGP PIC Edge flavors: Multipath and Unipath
- Multipath: the re-routing router load-balances across multiple next hops; backup next hops are actively taking traffic and are active in the routing/forwarding plane. Commonly found in active/active redundancy scenarios. No configuration is needed apart from enabling BGP multipath (maximum-paths ...)
- Unipath: backup path(s) are NOT taking traffic, as found in active/standby scenarios

    route-policy backup
      ! Currently, only a single backup path is supported
      set path-selection backup 1 install [multipath-protect] [advertise]
    end-policy
    router bgp ...
     address-family ipv4 unicast
      additional-paths selection route-policy backup
     !
     address-family vpnv4 unicast
      additional-paths selection route-policy backup
     !
Enabling BGP PIC Edge: IOS (for your reference)
As in IOS-XR, PIC Edge with multipath requires no additional configuration.
PIC Edge unipath needs to be enabled explicitly...

    router bgp ...
     address-family ipv4 [vrf ...]
     or address-family vpnv4
      bgp additional-paths install

... or implicitly when enabling best external

    router bgp ...
     address-family ipv4 [vrf ...]
     or address-family vpnv4
      bgp advertise-best-external

http://www.cisco.com/en/us/docs/ios/iproute_bgp/configuration/guide/irg_bgp_mp_pic.html
http://www.cisco.com/en/us/docs/ios/ios_xe/iproute_bgp/configuration/guide/irg_best_external_xe.html
Question: How will my PEs learn about the alternate paths?
By default my RR only reflects the best route.
[Figure: prefix Z is reachable via E0 on PE2 and via E0 on PE3; PE2 advertises NH:PE2, P:Z and PE3 advertises NH:PE3, P:Z to the RR; the RR reflects only NH:PE2, P:Z to PE1, so PE1 has prefix Z via PE2 only]
Diverse BGP Path Distribution: Shadow Session
- Easy deployment: no upgrade of any existing router is required, just a new iBGP session for each extra path (CLI knob on RR1)
- The diverse iBGP session announces the second-best path
[Figure: PE2 advertises NH:PE2, P:Z and PE3 advertises NH:PE3, P:Z to RR1; RR1 sends both NH:PE2, P:Z and NH:PE3, P:Z to PE1; PE1 has prefix Z via PE2 and via PE3]
BGP Add-Path
- Add-Path signals diverse paths, from 2 to X paths
- Requires every Add-Path receiving BGP router to support the Add-Path capability
[Figure: PE2 advertises NH:PE2, P:Z and PE3 advertises NH:PE3, P:Z to RR1; RR1 sends NH:PE2, P:Z as AP 1 and NH:PE3, P:Z as AP 2 to PE1; PE1 has prefix Z via PE2 and via PE3]
BGP Add-path flavors
IETF defines 5 flavors of Add-x-Path. 2 are implemented by Cisco:
- Add-n-path: the route reflector does the best path computation for all paths and sends the n best to the BR/PE. Use case: primary + n-1 backup scenario. (The maximum n is 2 for IOS-XR and 3 for IOS.)
- Add-all-path: the route reflector does the primary best path computation (only on the first path) and then sends all paths to the BR/PE. Use case: large DC ECMP load balancing, hot potato routing scenario
Add-path: selecting second best
1. Select the best path
2. Simple rule: remove all paths whose next hop equals the best path's next hop (including the best path itself)
3. Run bestpath selection again on the remaining paths to select the backup
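The three steps above can be sketched as follows. The `best` function is a deliberately simplified stand-in for the real BGP bestpath algorithm (it compares only local preference, AS-path length and a next-hop tie-break), and the `Path` fields are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int


def best(paths):
    # Simplified stand-in for BGP bestpath: higher local-pref wins, then
    # shorter AS path, then lowest next-hop string as a deterministic tie-break.
    return min(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.next_hop))


def best_and_backup(paths):
    primary = best(paths)                                        # step 1
    remaining = [p for p in paths if p.next_hop != primary.next_hop]  # step 2
    backup = best(remaining) if remaining else None              # step 3
    return primary, backup


paths = [
    Path("PE2", 200, 2),
    Path("PE2", 100, 1),  # same next hop as the best path, so it is removed too
    Path("PE3", 100, 3),
    Path("PE4", 100, 2),
]
primary, backup = best_and_backup(paths)
```

Removing every path that shares the best path's next hop is what guarantees the backup survives a failure of that next hop.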
Add-Path Configuration IOS-XR (for your reference)
- Enable in global address-family mode: enables for all IBGP neighbors
- Enable/disable in neighbor mode

    router bgp 100
     address-family ipv4 unicast
      additional-paths send
     !
     address-family vpnv4 unicast
      additional-paths send
     !
     neighbor 1.1.1.1
      remote-as 100
      address-family ipv4 unicast
      !
      address-family vpnv4 unicast
      !
     !
     neighbor 2.2.2.2
      remote-as 100
      capability additional-paths send disable
      address-family ipv4 unicast
      !
Add-Path Configuration IOS-XR (for your reference)
- Enable in global address-family mode: enables for all IBGP neighbors
- Enable/disable in neighbor mode

    router bgp 100
     address-family ipv4 unicast
      additional-paths receive
     !
     address-family vpnv4 unicast
      additional-paths receive
     !
     neighbor 1.1.1.1
      remote-as 100
      address-family ipv4 unicast
      !
      address-family vpnv4 unicast
      !
     !
     neighbor 2.2.2.2
      remote-as 100
      capability additional-paths receive disable
      address-family ipv4 unicast
      !
     !
    !
PIC Edge: Test Results
BGP Resiliency/HA Enhancement

Test Setup           | Node Failure | Link Failure
No PIC Edge, No BFD  | 12-14 sec    | 8-17 sec
BFD Only             | 10-12 sec    | 6-12 sec
PIC Edge Only        | 8 sec        | 4 sec
PIC Edge, BFD        | 0 sec        | 0 sec
Automated Route Target Filtering
BGP Feature
- Increased VPN service deployment increases the load on VPN routers
  - 10% YOY VPN table growth
  - Highly desirable to filter unwanted VPN routes
- Multiple filtering approaches
  - New RT filter address family
  - Extended community ORF
Automated Route Target Filtering
BGP Feature
- Derive RT filtering information from VPN RT import lists automatically
- Exchange filtering info via the RT filter AF or extended community ORF
- Translate filter info received from neighbors into outbound filtering policies
- Generate incremental updates for received RT update queries
- Incremental deployment possible/desirable
Automated Route Target Filtering
[Figure: PE-1, PE-2, PE-3 and PE-4, each with two or three of VRF-Blue, VRF-Red, VRF-Green and VRF-Purple, peer with RR-1 and RR-2. RT-Constraint NLRI shown: {VRF-Blue, VRF-Red}; {VRF-Green, VRF-Purple}; {VRF-Blue, VRF-Red, VRF-Green}; {VRF-Green, VRF-Purple, VRF-Blue}; {VRF-Red, VRF-Green}; {VRF-Purple, VRF-Blue}]
Improves PE and RR scaling and performance by sending only relevant VPN routes
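The outbound filtering that RT-Constraint enables can be sketched in a few lines: the sender keeps only VPN routes that carry at least one route target the peer has signalled interest in. The route dictionaries, RT names and the `routes_to_send` helper are hypothetical and only illustrate the filtering rule.

```python
def routes_to_send(vpn_routes, peer_import_rts):
    """Keep only routes carrying at least one RT the peer advertised interest in."""
    wanted = set(peer_import_rts)
    return [r for r in vpn_routes if wanted & set(r["rts"])]


# Illustrative VPN routes held by a route reflector.
rr_table = [
    {"prefix": "10.1.0.0/16", "rts": ["RT-Blue"]},
    {"prefix": "10.2.0.0/16", "rts": ["RT-Green"]},
    {"prefix": "10.3.0.0/16", "rts": ["RT-Purple", "RT-Red"]},
]

# A PE that imports only the Blue and Red route targets.
sent = routes_to_send(rr_table, ["RT-Blue", "RT-Red"])
```

The Green-only route is never sent to this PE, which is where the scaling benefit comes from.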
IOS XR - Accept own
This feature allows a move from a PE-based service provisioning model to a centralized route reflector (RR)-based service provisioning model. With this feature, you can define the route-to-service-VRF mapping within a centralized route reflector and then propagate this information down to all the PE clients of that RR. Without this feature, you would define the route-to-service-VRF mapping in all PE devices, incurring a high configuration overhead that could result in more errors.

    router#configure
    router(config)#router bgp 100
    router(config-bgp)#neighbor 10.2.3.4
    router(config-bgp-nbr)#address-family vpnv4 unicast
    router(config-bgp-nbr-af)#accept-own

This feature enables a route reflector to modify the Route Target (RT) list of a VPN route that it distributes, so the route reflector can control how a route originated within one VRF is imported into other VRFs.
Overview AIGP
AIGP (Accumulated IGP Metric Attribute for BGP): http://tools.ietf.org/html/draft-ietf-idr-aigp-09
- Optional, non-transitive BGP path attribute
- Gives BGP a way to make its routing decision based on the IGP metric, so that it can choose the shortest path between two nodes across different ASes
- The main driving force for this feature is to solve the IGP scale issue seen in some ISP core networks
- Mainly to be deployed to carry next-hop prefixes/labels across different ASes within the same administrative domain
- The remote ingress PE selects its best path using the modified best path selection process, which uses the AIGP metric
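A rough sketch of the accumulation idea, under the assumption (taken from the AIGP draft as I understand it) that a speaker changing the next hop adds its IGP cost to the previous next hop, and that path comparison uses the AIGP value plus the local IGP cost to the next hop. Function names and numbers are illustrative only.

```python
def readvertise_aigp(received_aigp, igp_cost_to_next_hop):
    """Value to advertise onward when this router sets itself as next hop."""
    return received_aigp + igp_cost_to_next_hop


def pick_path(paths):
    """paths: list of (name, aigp_value, igp_cost_to_next_hop); lowest total wins."""
    return min(paths, key=lambda p: p[1] + p[2])


# An ASBR receives AIGP 10 and is 5 IGP units from the previous next hop.
onward = readvertise_aigp(10, 5)

# The ingress PE compares end-to-end cost, not just its local IGP cost.
chosen = pick_path([
    ("via-ASBR1", 100, 10),  # total 110
    ("via-ASBR2", 50, 40),   # total 90
])
```

Here the ingress PE picks the path through ASBR2 even though ASBR1 is closer in its own IGP, because the accumulated end-to-end metric is lower.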
Overview AIGP
- Passing the AIGP attribute to non-AIGP-capable neighbors
  - Translate AIGP into cost-community
    - Two POIs, pre-best-path and igp-cost, are supported
    - A transitive keyword makes the cost-community transitive to eBGP neighbors
  - Redistribute BGP (with AIGP) into the IGP
  - Translate the AIGP value into BGP MED
- Other software components
  - Route installation, for BGP to tag the AIGP metric during route installation
  - NH notification when the AIGP metric changes
  - Update generation throttling is not supported in 4.0
- It is highly recommended to deploy BGP best-external and additional-path in conjunction with the AIGP attribute, to effectively achieve the desired routing policy
AIGP: Originating AIGP
- AIGP is enabled between iBGP neighbors by default
- AIGP between eBGP neighbors needs to be enabled
- AIGP can be originated by using redistribute ospf, redistribute isis, redistribute static or the BGP network command
- AIGP can also be originated using a neighbor address-family inbound or outbound policy that sets AIGP to the IGP cost or to a fixed value

    route-policy set_aigp_1
      if destination in (61.1.1.0/24 le 32) then
        set aigp-metric 111
      elseif destination in (2100::1:0/112, 2100::2:0/112) then
        set aigp-metric igp-cost
      endif
    end-policy
    router bgp 1
     address-family ipv4 unicast
      redistribute ospf 1 route-policy set_aigp_1
What is Multi-Instance BGP?
- A new IOS-XR BGP architecture to support multiple instances, along the lines of OSPF instances
- Each BGP instance is a separate process running on the same or a different RP/DRP node
- The BGP instances do not share any prefix table between them
- No need for a common adj-rib-in (bRIB), as is the case with distributed BGP
- The BGP instances do not communicate with each other and do not set up peering with each other
- Each individual instance can set up peering with another router independently
What is Multi-AS BGP?
- Each instance of a multi-instance BGP can be configured with a different AS number
- Global address families can't be configured under more than one AS, except vpnv4 and vpnv6
- VPN address families may be configured under multiple AS instances that do not share any VRFs
Configuration Example for your reference 63
Attribute Filtering and error-handling
- Attribute filtering
  - Unwanted optional transitive attributes, such as ATTR_SET or a CONFED segment in AS4_PATH, have caused outages in some equipment
  - Prevent unwanted/unknown BGP attributes from hitting legacy equipment
  - Block specific attributes
  - Block a range of non-mandatory attributes
- Error-handling
  - draft-ietf-idr-optional-transitive-04.txt
  - Punishment should not exceed the crime
  - Gracefully fix or ignore non-severe errors
  - Avoid session resets in most cases
  - Never discard the update on error, as that can lead to inconsistencies
Architecture
[Figure labels: Malformed BGP Updates (Invalid Attribute Contents, Wrong Attribute Length); Transitive Attributes (Unknown Attributes, Unwanted Attributes); Attribute Filtering; Error-handling; NLRI processing]
Attribute filtering (for your reference)
- First level of inbound filtering
- Filtering is configured as a range of attribute codes and a corresponding action to take (note: never discard the update, as that can lead to inconsistencies)
- Actions
  - Discard the attribute
  - Treat-as-withdraw
- Applied when parsing each attribute in the received Update message
- When an attribute matches the filter, further processing of the attribute is stopped and the corresponding action is taken
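The range-plus-action filtering described above can be sketched like this. The filter tuples, action names and the `apply_filters` helper are hypothetical; attribute codes 1, 2 and 3 are ORIGIN, AS_PATH and NEXT_HOP, while the filtered codes are arbitrary examples.

```python
DISCARD_ATTRIBUTE = "discard-attribute"
TREAT_AS_WITHDRAW = "treat-as-withdraw"


def apply_filters(attr_codes, filters):
    """attr_codes: attribute type codes in the order received.
    filters: list of (low, high, action) code ranges.
    Returns (kept_codes, withdraw)."""
    kept = []
    for code in attr_codes:
        # First matching range decides; processing of that attribute stops there.
        action = next((a for lo, hi, a in filters if lo <= code <= hi), None)
        if action == TREAT_AS_WITHDRAW:
            return [], True
        if action == DISCARD_ATTRIBUTE:
            continue
        kept.append(code)
    return kept, False


# Drop any attribute with a code in 128-255, keep the rest of the update.
kept, withdraw = apply_filters([1, 2, 3, 130], [(128, 255, DISCARD_ATTRIBUTE)])
```

Note that neither action silently drops the whole update: the attribute is removed, or the NLRI is treated as withdrawn.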
Error-handling (for your reference)
- Comes into play after attribute filtering is applied
- Applies when one or more malformed attributes, NLRIs or other fields are detected in the Update message
- Steps
  - Classification of errors
  - Actions to be taken
  - Logging
BGP Origin Validation
- Origin validation for eBGP routes
- Next release to cover origin validation for locally sourced routes
- Supports the client functionality of the RPKI RTR protocol
- Separate database to store record entries from the cache
- Support for announcing the path validation state to iBGP neighbors using a well-known path validation state extended community
- Modified route policies to incorporate path validation states
Prefix hijacking
- Announce someone else's prefix
- Announce a more specific of someone else's prefix
- Either way, you are trying to steal someone else's traffic by getting it routed to you
- Capture, sniff, redirect, manipulate traffic as you wish
Source: NANOG 46 presentation
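Both hijack cases above map onto origin validation states. The sketch below follows the usual three-state outcome (valid, invalid, not found) and assumes each cache record is a simple `(prefix, max_length, origin_as)` tuple; the `validate` helper and the documentation prefixes and AS numbers are made up for illustration and say nothing about how the router stores its records.

```python
import ipaddress


def validate(prefix, origin_as, roas):
    """roas: list of (roa_prefix, max_length, asn).
    Returns 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # Skip records of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_len:
            return "valid"
    return "invalid" if covered else "not-found"


roas = [("192.0.2.0/24", 24, 64500)]

legit = validate("192.0.2.0/24", 64500, roas)
same_prefix_hijack = validate("192.0.2.0/24", 64501, roas)
more_specific_hijack = validate("192.0.2.0/25", 64501, roas)
unknown = validate("198.51.100.0/24", 64500, roas)
```

A route policy can then act on the state, for example by lowering the preference of invalid routes or dropping them.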
What does the solution look like?
Multicast VPN Solution Space (complete solution is now available)
- Service: IPv4 Native, IPv6 Native, IPv4 mVPN, IPv6 mVPN
- C-Multicast Signaling: PIM, PORT, BGP
- Core Tree Signaling: PIM (pt-mpt), MLDP (pt-mpt, mpt-mpt), P2MP TE (pt-mpt)
- Encapsulation/Forwarding: IP/GRE, LSM
Multicast VPN BGP Signaling
[Figure: Source behind CE1 and RP behind CE2; Receivers behind CE3 and CE4; PE1, PE2, PE3, PE4 and RR; PIM C-Join (*,G) or (S,G) between CEs and PEs; BGP auto-discovery and BGP C-mroutes between the PEs via the RR]
- BGP customer-multicast signaling and BGP auto-discovery are now added to the multicast VPN solution
- BGP as overlay allows Service Providers to capitalize on a single protocol
  - Auto-discovery of PEs and core tree/tunnel information
  - Advertisement of customer multicast routes
BGP Graceful Shutdown (RFC 6198, April 2011)
Old behaviour
- If a session drops, BGP withdraws all prefixes learned over that session
- BGP has no mechanism to signal that a prefix will soon be unreachable (for maintenance, for example)
- Historically RRs have worsened the issue, as they tend to hide the alternate path because they only forward the best path
BGP Graceful Shutdown allows maintenance on a router without service disruption.
[Figure: 1. "#Graceful Shutdown, please wait"; 2. BGP / Prefix 10.45 / localpref: 10; 3. Traffic is redirected]
- This new knob allows a router to notify its neighbors to redirect traffic to other paths; after some time it drops the BGP sessions
- The notification can be done using the Local Preference attribute or a user community attribute
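A small sketch of why lowering local preference before the session drops avoids disruption: the neighbor re-runs best path selection while both paths are still present and moves to the alternate one. The dictionary layout, helper names and router names are assumptions; the value 10 simply mirrors the local preference shown on the slide.

```python
def best_next_hop(paths):
    """paths: {next_hop: local_pref}; higher local preference wins."""
    return max(paths, key=paths.get)


def graceful_shutdown(paths, leaving, gshut_local_pref=10):
    """Return the neighbor's view after the leaving router signals maintenance."""
    updated = dict(paths)
    updated[leaving] = gshut_local_pref
    return updated


before = {"R1": 100, "R2": 90}
during = graceful_shutdown(before, leaving="R1")

old_choice = best_next_hop(before)  # R1 carries the traffic
new_choice = best_next_hop(during)  # traffic has moved before R1 drops its sessions
```

When R1 finally tears down its sessions, the withdrawn path is no longer the best path anywhere, so no traffic is lost.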
Graceful Shutdown: GSHUT well-known community
- The GSHUT community attribute is applied to a neighbor specified by the neighbor shutdown graceful command, thereby gracefully shutting down the link in an expected number of seconds
- The GSHUT community is specified in a community list, which is referenced by a route map and then used to make policy routing decisions

    neighbor {ipv4-address | ipv6-address | peer-group-name} shutdown graceful seconds {community value [local-preference value] | local-preference value}

http://www.cisco.com/en/us/docs/ios-xml/ios/iproute_bgp/configuration/15-s/irg-15-s-book.pdf
DDoS Mitigation: a stepping-stone approach
- Phase I: ACL, RTBH, PBR, uRPF
- Phase II (IOS-XR 4.3.1, IOS-XE partial): malicious traffic mitigation; cleaning of malicious traffic; dirty and clean traffic handling; usage of Multi-instance BGP
- Phase III (IOS-XR 5.2.0, IOS-XE 3.1.2): dynamic application aware redirection and traffic handling
DDoS Overview
- Distributed denial-of-service (DDoS) attacks target network infrastructures or computer services by sending an overwhelming number of service requests to the server from many sources
- Server resources are used up serving the fake requests, resulting in denial or degradation of legitimate service requests
- Addressing DDoS attacks
  - Detection: detect incoming fake requests
  - Mitigation
    - Diversion: send traffic to a specialized device that removes the fake packets from the traffic stream while retaining the legitimate packets
    - Return: send the clean traffic back to the server
DDOS impact on Customer Business 83
DDoS mitigation architecture
1. Detection (no DDoS)
- Scan Netflow data to detect DDoS attacks
[Figure: Security Server, DDoS Analyser, Sample Netflow, DDoS scrubber]
DDoS mitigation architecture
2. Detection (DDoS)
- Scan Netflow data
- Find DDoS signature
[Figure: Security Server, DDoS Analyser, Sample Netflow, DDoS scrubber]
DDoS mitigation architecture
3. Redirect traffic to DDoS scrubber
- Scan Netflow data
- Find DDoS signature
- BGP DDoS mitigation action: redirect to DDoS scrubber
[Figure: Security Server, DDoS Analyser, DDoS scrubber]
DDoS Mitigation: Architecture Considerations
- Normal traffic flow when there is no attack
- Redirect traffic from any edge PE to any specific DDoS scrubber
  - Including the PE that is connected to the host network
- Granular (prefix level/network) diversion
  - Customers buy the DDoS mitigation service for some prefixes
  - Pre-provisioned DDoS service for those prefixes (using policy such as a standard community flag)
- Centralized controller that injects the diversion route
- VPN-based labeled return path for the clean traffic
  - To prevent routing loops
- The solution supports redirection of BGP less/more specific prefixes or locally originated prefixes (static route, redistributed route)
- Support for multi-homed customers
  - During an attack, send clean traffic from the DDoS scrubber to multiple PEs
The concept: traffic under normal conditions
[Figure: Server, Scrubber, PE1, PE2, PE3, ISP, Security analyser, Security server, Internet users]
Traffic under normalized conditions
- Traffic takes the shortest path
- Upstream and downstream traffic follow traditional routing
Pre-provisioned DDoS instrumentation
- Traffic scrubber: separates clean and malicious traffic
- Security analyser: analyses Netflow/IPFIX statistics from the traffic flows
- Security server: acts upon the traffic analysis by communicating with infrastructure routers
BGP based DDoS: traffic under DDoS condition
[Figure: Server, Scrubber, PE1, PE2, PE3, ISP, Security analyser, Security server, Internet users]
Traffic under DDoS condition
- Traffic is redirected to a scrubber
- The scrubber separates the clean from the malicious traffic
- Clean traffic is returned to the original destination server
Goal
- Do not drop all traffic
- Collect traffic intelligence
- Operational simplicity
- Easy to remove the redirect when traffic normalizes
How does it work? Normal traffic condition
[Figure: Server 1.1.1.1/32; PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3) in the ISP; Internet and VPN Route-Reflector (labelled 5.5.5.5); Scrubber; Security analyser; Security server; Internet users]
- All PEs peer with the RR
- All PEs exchange both global Internet and VPN prefixes
- All PE interfaces are non-VPN
- The security analyser is analysing traffic
Routing table: destination 1.1.1.1/32, next hop 2.2.2.2
How does it work? Server is under DDoS
[Figure: Server 1.1.1.1/32; PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3) in the ISP; Internet and VPN Route-Reflector (labelled 5.5.5.5); Scrubber; Security analyser; Security server; Internet users]
- The flow is detected as dirty by the security analyser
- Result: the server is under attack
- Traffic needs to be redirected to the scrubber to mitigate the attack
Routing table: destination 1.1.1.1/32, next hop 2.2.2.2
How does it work? Server is under DDoS
[Figure: Server 1.1.1.1/32; PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3) in the ISP; Internet and VPN Route-Reflector and DDoS Route-Reflector (labelled 5.5.5.5); Scrubber; Security server; Internet users]
- The DDoS Route-Reflector was pre-provisioned
- The mitigation route to 1.1.1.1/32 is injected on the DDoS RR by the security server
- On the DDoS mitigation RR, the mitigation route to 1.1.1.1/32 points to 3.3.3.3
DDoS RR route: destination 1.1.1.1/32, next hop 3.3.3.3
How does it work? Server is under DDoS
[Figure: Server 1.1.1.1/32; PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3) in the ISP; Internet and VPN Route-Reflector and DDoS Route-Reflector (labelled 5.5.5.5); Scrubber; Security server; Internet users]
- The mitigation route to 1.1.1.1/32, pointing to 3.3.3.3, is signalled to all PEs
- All PEs receive the mitigation route from the DDoS mitigation RR
- Each PE now has 2 routes to reach 1.1.1.1/32
- Which route will the PE use?
BGP table: 1.1.1.1/32 via 2.2.2.2; 1.1.1.1/32 via 3.3.3.3
Routing table: 1.1.1.1/32 via ????
How does it work? Server is under DDoS
[Figure: Server 1.1.1.1/32; PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3) in the ISP; Internet and VPN Route-Reflector and DDoS Route-Reflector (labelled 5.5.5.5); Scrubber; Security server; Internet users]
Trick #1: the DDoS mitigation route will ALWAYS be preferred, even if
- Both prefix lengths are the same
- The DDoS prefix is shorter
- The original prefix has a better administrative distance
BGP table: 1.1.1.1/32 via 2.2.2.2; 1.1.1.1/32 via 3.3.3.3
Routing table: 1.1.1.1/32 via 3.3.3.3
How does it work? Server is under DDoS
The mitigated traffic flows towards PE3 (3.3.3.3).
PE3 sends the dirty flow towards the scrubber.
The scrubber will:
handle and remove the dirty traffic within the original flow;
send the cleaned traffic towards the original destination (1.1.1.1 at PE2, 2.2.2.2).
Diagram labels: Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3), Internet and VPN Route-Reflector (5.5.5.5), DDoS Route-Reflector (5.5.5.5), Server (1.1.1.1/32), Scrubber, Clean traffic.
BGP Table:
Destination    Next-hop
1.1.1.1/32     2.2.2.2
1.1.1.1/32     3.3.3.3
Routing Table:
Destination    Next-hop
1.1.1.1/32     3.3.3.3
How does it work? Server is under DDoS
Problem:
The scrubber sends the clean traffic back to PE3.
PE3 does a routing lookup for 1.1.1.1 and finds that it is directly attached.
Routing loop!
How do we fix this?
We use a new, isolated routing table for the clean traffic.
This routing table is pre-provisioned and lives inside a VPN.
Diagram labels: Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3), Internet and VPN Route-Reflector (5.5.5.5), DDoS Route-Reflector (5.5.5.5), Server (1.1.1.1/32), Scrubber, Clean traffic.
BGP Table:
Destination    Next-hop
1.1.1.1/32     2.2.2.2
1.1.1.1/32     3.3.3.3
Routing Table:
Destination    Next-hop
1.1.1.1/32     3.3.3.3
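The loop can be reproduced with a short hop-by-hop trace in Python. This is a simplified sketch: the dictionaries below are hypothetical per-node next hops for destination 1.1.1.1 in the global table only, and `trace` is an illustrative helper, not a router function.

```python
def trace(next_hop_by_node, start, target):
    """Follow next hops from `start` until `target` is reached.

    Returns (path, looped). `looped` is True as soon as a node is
    visited a second time, which is how the loop is detected here.
    """
    path = [start]
    visited = {start}
    node = start
    while node != target:
        node = next_hop_by_node[node]
        path.append(node)
        if node in visited:
            return path, True
        visited.add(node)
    return path, False


# Normal condition: traffic for 1.1.1.1 goes PE1 -> PE2 -> CE1 -> Server.
normal = {"PE1": "PE2", "PE2": "CE1", "CE1": "Server"}

# During mitigation, with only the global table: PE3 hands the flow to the
# scrubber, and the scrubber's clean traffic is looked up at PE3 again.
mitigation_global_only = {"PE1": "PE3", "PE3": "Scrubber", "Scrubber": "PE3"}

print(trace(normal, "PE1", "Server"))
print(trace(mitigation_global_only, "PE1", "Server"))
```

The second trace never reaches the server: the clean traffic bounces between PE3 and the scrubber, which is why a separate routing table is needed for it.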
How does it work? Server is under DDoS
The clean traffic is injected into PE3 on an interface that is a member of VPN Clean.
PE3 now does a destination routing lookup for 1.1.1.1 in VPN Clean.
The matching routing table entry points towards PE2 at 2.2.2.2.
The clean flow, which is now part of VPN Clean, is sent towards PE2, reachable at 2.2.2.2.
Diagram labels: Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3), Server (1.1.1.1/32), Scrubber, VPN Clean.
BGP Table:
Destination    Next-hop
1.1.1.1/32     2.2.2.2
1.1.1.1/32     3.3.3.3
Routing Table:
Destination    Next-hop    VPN
1.1.1.1/32     3.3.3.3     Global
1.1.1.1/32     2.2.2.2     Clean
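A minimal Python sketch of the per-VPN lookup on PE3, assuming the table contents shown on the slide. The interface names (`core-facing`, `scrubber-return`) and the `lookup` function are hypothetical; the point is only that the ingress interface selects the table, so the same destination resolves to different next hops.

```python
from ipaddress import ip_address, ip_network

# Hypothetical interface names on PE3; only the VPN membership matters here.
INTERFACE_VRF = {
    "core-facing": "global",     # dirty traffic arrives here
    "scrubber-return": "clean",  # clean traffic comes back from the scrubber here
}

# PE3 routing tables as shown on the slide: one entry per table.
TABLES = {
    "global": {"1.1.1.1/32": "3.3.3.3"},
    "clean": {"1.1.1.1/32": "2.2.2.2"},
}


def lookup(ingress_interface, destination):
    """Return (vrf, next_hop) for a packet received on `ingress_interface`.

    The table is chosen by the ingress interface; within that table the
    longest matching prefix wins. next_hop is None when nothing matches.
    """
    vrf = INTERFACE_VRF[ingress_interface]
    addr = ip_address(destination)
    matches = [
        (ip_network(prefix), next_hop)
        for prefix, next_hop in TABLES[vrf].items()
        if addr in ip_network(prefix)
    ]
    if not matches:
        return vrf, None
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return vrf, next_hop


print(lookup("core-facing", "1.1.1.1"))
print(lookup("scrubber-return", "1.1.1.1"))
```

Because the scrubber hands the traffic back on an interface in VPN Clean, the lookup no longer hits the diversion entry of the global table.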
How does it work? Server is under DDoS
PE2 receives the clean flow within VPN clean.
PE2 does a destination address routing lookup in VPN clean.
A matching route is found in VPN clean.
The flow is forwarded towards CE1 and onwards to the server.
Hold on a minute!
PE2 does not have any interface that is part of VPN clean.
All interfaces on PE2 are global interfaces, so how did that clean route for 1.1.1.1 get into VPN clean?
Diagram labels: Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), CE1, PE3 (3.3.3.3), Server (1.1.1.1/32), Scrubber.
Routing Table:
Destination    Next-hop    VPN
1.1.1.1/32     3.3.3.3     Global
1.1.1.1/32     CE1         Clean
How does it work?
Trick #2: copy the locally BGP-inserted route directly into the VPN clean BGP table.
Neighbour details are inherited from the global table (i.e. outgoing interface and next-hop).
The interface pointing towards CE1 is NOT VPN-aware.
VPN clean is distributed as a normal VPN.
New CLI command to do that: import from default-vrf route-policy ddos advertise-as-vpn
Diagram labels: Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), CE1, PE3 (3.3.3.3), Server (1.1.1.1/32), Scrubber.
BGP Table:
Destination    Next-hop    VPN
1.1.1.1/32     CE1         Global
1.1.1.1/32     3.3.3.3     Global
1.1.1.1        CE1         Clean
Routing Table:
Destination    Next-hop    VPN
1.1.1.1/32     3.3.3.3     Global
1.1.1.1/32     CE1         Clean
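Trick #2 can be pictured as a filtered copy from the global BGP table into the clean one. The Python sketch below is a toy model under stated assumptions, not the IOS XR implementation: `BgpRoute`, `import_from_default_vrf`, the policy function and the interface name `to-CE1` are all hypothetical. It shows the copied route keeping its next-hop and outgoing interface while gaining VPN membership.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    next_hop: str
    out_interface: str              # hypothetical field: interface towards the next hop
    vrf: str = "global"
    advertise_as_vpn: bool = False  # models the "advertise-as-vpn" keyword


def import_from_default_vrf(routes, policy, target_vrf="clean"):
    """Copy global routes accepted by `policy` into `target_vrf`.

    Next-hop and outgoing interface are inherited unchanged; only the
    table membership and the VPN advertisement flag differ in the copy.
    """
    return [
        replace(route, vrf=target_vrf, advertise_as_vpn=True)
        for route in routes
        if route.vrf == "global" and policy(route)
    ]


def ddos_policy(route):
    """Stand-in for route-policy 'ddos': accept only the protected prefix."""
    return route.prefix == "1.1.1.1/32"


# PE2's global BGP table: the locally inserted server route and an unrelated one.
pe2_global = [
    BgpRoute("1.1.1.1/32", next_hop="CE1", out_interface="to-CE1"),
    BgpRoute("10.0.0.0/8", next_hop="CE1", out_interface="to-CE1"),
]

clean_routes = import_from_default_vrf(pe2_global, ddos_policy)
print(clean_routes)
```

Only the protected prefix ends up in the clean table, still pointing at CE1 over the same non-VPN interface.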
Going back to traditional traffic flow
Remove the routing entry on the DDoS mitigation RR.
No route remains on the DDoS mitigation RR.
Traffic flows normally again.
Diagram labels (captioned "Server is under DDoS"): Internet users, ISP, PE1 (4.4.4.4), PE2 (2.2.2.2), PE3 (3.3.3.3), Internet and VPN Route-Reflector (5.5.5.5), DDoS Route-Reflector (5.5.5.5), Server (1.1.1.1/32), Scrubber, Security server.
Destination    Next-hop
1.1.1.1/32     3.3.3.3
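The life cycle of the mitigation route (injected during the attack, removed afterwards) can be sketched in Python. The class `DdosRouteReflector` and the function `pe_next_hop` are hypothetical names for a toy model; real PEs learn and withdraw the route through BGP updates rather than by reading a shared object.

```python
class DdosRouteReflector:
    """Toy stand-in for the DDoS mitigation RR: a store of mitigation routes."""

    def __init__(self):
        self._routes = {}

    def inject(self, prefix, next_hop):
        """Security server adds a mitigation route (attack detected)."""
        self._routes[prefix] = next_hop

    def withdraw(self, prefix):
        """Security server removes the mitigation route (attack over)."""
        self._routes.pop(prefix, None)

    def routes(self):
        return dict(self._routes)


def pe_next_hop(prefix, normal_routes, ddos_rr):
    """Next hop a PE uses: the mitigation route if present, else the normal one."""
    return ddos_rr.routes().get(prefix, normal_routes.get(prefix))


normal_routes = {"1.1.1.1/32": "2.2.2.2"}  # learned from the Internet and VPN RR
ddos_rr = DdosRouteReflector()

before = pe_next_hop("1.1.1.1/32", normal_routes, ddos_rr)
ddos_rr.inject("1.1.1.1/32", "3.3.3.3")
during = pe_next_hop("1.1.1.1/32", normal_routes, ddos_rr)
ddos_rr.withdraw("1.1.1.1/32")
after = pe_next_hop("1.1.1.1/32", normal_routes, ddos_rr)

print(before, during, after)
```

Nothing has to change on the PEs or on the regular route-reflector to go back to normal: once the mitigation route is gone, the original route is the only one left.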
Configuration (1)
    router bgp 99 instance ddos
     bgp router-id 3.3.3.3
     bgp read-only
     bgp install diversion
     address-family ipv4 unicast
    !
    router bgp 99
     bgp router-id 2.2.2.2
     address-family ipv4 unicast
    !
Notes on the ddos instance (in slide order):
Creation of the DDoS BGP instance.
Allows configuration of a second IPv4 or IPv6 instance.
Suppresses BGP update generation.
Triggers the BGP ddos instance to install the diversion path into the RIB, so that the paths are pushed down to the FIB.
Configuration (2)
Importing the global routes into the clean VRF
    vrf clean
     address-family ipv4 unicast
      import from default-vrf route-policy ddos advertise-as-vpn
      export route-target
       111:1
      !
     !
     address-family ipv6 unicast
      import from default-vrf route-policy ddos advertise-as-vpn
      export route-target
       111:1
      !
     !
    !