1 BGP EE 122, Fall 2013 Sylvia Ratnasamy Material thanks to Ion Stoica, Scott Shenker, Jennifer Rexford, and many other colleagues
2 BGP: The story so far
- Destinations are IP prefixes ( /8)
- Nodes are Autonomous Systems (ASes)
- Links represent both physical connections and business relationships
  - customer-provider or peer-to-peer
- BGP
  - path-vector protocol
  - policy-driven route selection
3 BGP: Today
- BGP policy
  - typical policies, how they're implemented
- BGP protocol details
- Issues with BGP
4 Policy imposed in how routes are selected and exported
[Figure: route selection and route export between a customer and a competitor; one announcement reads "Can reach 128.3/16"]
- Selection: Which path to use?
  - controls whether/how traffic leaves the network
- Export: Which path to advertise?
  - controls whether/how traffic enters the network
5 Typical Selection Policy
- In decreasing order of priority
  - make/save money (send to customer > peer > provider)
  - maximize performance (smallest AS path length)
  - minimize use of my network bandwidth ("hot potato")
6 Typical Export Policy

Destination prefix advertised by | Export route to
---------------------------------+---------------------------------------------
Customer                         | Everyone (providers, peers, other customers)
Peer                             | Customers
Provider                         | Customers

We'll refer to these as the Gao-Rexford rules (they capture common, but not required, practice).
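The export table can be written as a single predicate. This is a minimal sketch: the relationship labels and the function name `should_export` are illustrative assumptions, not part of BGP itself.

```python
# Relationship labels are hypothetical names chosen for this sketch.
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

def should_export(learned_from: str, export_to: str) -> bool:
    """Gao-Rexford export rule.

    Routes learned from a customer are exported to everyone;
    routes learned from a peer or provider are exported only to customers.
    """
    if learned_from == CUSTOMER:
        return True
    return export_to == CUSTOMER

# Print the full export matrix.
for src in (CUSTOMER, PEER, PROVIDER):
    targets = [dst for dst in (CUSTOMER, PEER, PROVIDER) if should_export(src, dst)]
    print(f"route learned from {src}: export to {targets}")
```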
7 Gao-Rexford
[Figure: an AS with its providers, peers, and customers]
With Gao-Rexford, the customer-provider graph is a DAG (directed acyclic graph) and routes are valley-free.
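A valley-free path climbs zero or more customer-to-provider links, crosses at most one peer link, and then descends zero or more provider-to-customer links. The check below is a sketch under that definition; the link labels "up", "peer", and "down" are assumptions made for the example.

```python
def is_valley_free(links):
    """Return True if a sequence of link types forms a valley-free path.

    links: iterable of "up" (customer -> provider), "peer", or
    "down" (provider -> customer), in the direction the traffic travels.
    """
    past_peak = False  # becomes True after a peer link or the first "down" link
    for link in links:
        if link == "up":
            if past_peak:
                return False  # climbing again after descending: a valley
        elif link == "peer":
            if past_peak:
                return False  # second peer link, or a peer link after descending
            past_peak = True
        elif link == "down":
            past_peak = True
        else:
            raise ValueError(f"unknown link type: {link!r}")
    return True

print(is_valley_free(["up", "up", "peer", "down"]))  # allowed
print(is_valley_free(["down", "up"]))                # a valley
```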
8 BGP: Today
- BGP policy
  - typical policies, how they're implemented
- BGP protocol details
  - stay awake as long as you can
- BGP issues
9 Who speaks BGP?
[Figure: an AS containing border routers and internal routers]
Border routers at an Autonomous System
10 What does "speak BGP" mean?
- Implement the standardized BGP protocol
  - read more here:
- Specifies what messages to exchange with other BGP speakers
  - message types: e.g., route advertisements
  - message syntax: e.g., first X bytes for dest prefix; next Y for AS path, etc.
- And how to process these messages
  - e.g., when you receive a message of type X, apply this selection rule, and so on, as per the BGP state machine in the protocol spec plus policy decisions
11 BGP sessions
[Figure: eBGP session]
A border router speaks BGP with border routers in other ASes
12 BGP sessions
[Figure: iBGP session]
A border router speaks BGP with other (interior and border) routers in its own AS
13 eBGP, iBGP, IGP
- eBGP: BGP sessions between border routers in different ASes
  - learn routes to external destinations
- iBGP: BGP sessions between border routers and other routers within the same AS
  - distribute externally learned routes internally
  - assume a full all-to-all mesh of iBGP sessions
- IGP: Interior Gateway Protocol = intradomain routing protocol
  - provide internal reachability
  - e.g., OSPF, RIP
14 Some Border Routers Don't Need BGP
- Customer that connects to a single upstream ISP
  - The ISP can advertise prefixes into BGP on behalf of the customer
  - and the customer can simply default-route to the ISP
[Figure: the provider installs routes for the customer's /16 pointing to the customer; the customer installs a default route ( /0) pointing to the provider]
15 Putting the pieces together
1. Provide internal reachability (IGP)
2. Learn routes to external destinations (eBGP)
3. Distribute externally learned routes internally (iBGP)
4. Travel shortest path to egress (IGP)
16 Basic Messages in BGP
- Open
  - Establishes BGP session
  - BGP uses TCP [will make sense in 1-2 weeks]
- Notification
  - Report unusual conditions
- Update
  - Inform neighbor of new routes
  - Inform neighbor of old routes that become inactive
- Keepalive
  - Inform neighbor that connection is still viable
17 BGP Operations
[Figure: BGP session between AS1 and AS2]
1. Open session on TCP port 179
2. Exchange all active routes
3. Exchange incremental updates
While the connection is alive, exchange route UPDATE messages
18 Route Updates
- Format: <IP prefix: route attributes>
  - attributes describe properties of the route
- Two kinds of updates
  - announcements: new routes or changes to existing routes
  - withdrawals: remove routes that no longer exist
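The two kinds of updates can be sketched as operations on a per-neighbor route table. This is a simplification for illustration: the function `apply_update`, the dictionary layout, and the example prefixes and AS numbers (taken from documentation ranges) are all assumptions, and the real UPDATE message encoding is not modeled.

```python
from typing import Dict, Optional

# prefix -> attributes of the route learned from one neighbor (hypothetical layout)
Rib = Dict[str, dict]

def apply_update(rib: Rib, prefix: str, attributes: Optional[dict] = None) -> None:
    """Apply one update to the table.

    An announcement (attributes given) adds a route or replaces the existing one.
    A withdrawal (attributes is None) removes the route if present.
    """
    if attributes is None:
        rib.pop(prefix, None)
    else:
        rib[prefix] = dict(attributes)

rib: Rib = {}
apply_update(rib, "192.0.2.0/24", {"as_path": [64500, 64501]})  # new route
apply_update(rib, "192.0.2.0/24", {"as_path": [64500]})         # change to existing route
apply_update(rib, "198.51.100.0/24", {"as_path": [64502]})      # new route
apply_update(rib, "198.51.100.0/24")                            # withdrawal
print(rib)
```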
19 Route Attributes
- Routes are described using attributes
  - Used in route selection/export decisions
- Some attributes are local
  - i.e., private within an AS, not included in announcements
  - e.g., LOCAL PREF, ORIGIN
- Some attributes are propagated with eBGP route announcements
  - e.g., NEXT HOP, AS PATH, MED, etc.
- There are many standardized attributes in BGP
  - We will discuss a few
20 Attributes (1): ASPATH
- Carried in route announcements
- Vector that lists all the ASes a route announcement has traversed (in reverse order)
[Figure: AS 88 (Princeton) originates a /16 prefix; the announcement passes through AS 7018 (AT&T) to a further AS, with the AS path shown at each hop]
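A small sketch of how the AS path is built and used: each AS prepends its own number before passing an announcement on, which is why the path reads in reverse order of traversal, and a path-vector speaker can reject any path that already contains its own AS number. The function names are hypothetical; AS 88 and AS 7018 come from the slide's example, and 64500 is a documentation-range AS number used for illustration.

```python
def advertise(as_path, my_asn):
    """Return the AS path to send to a neighbor: own ASN prepended."""
    return [my_asn] + list(as_path)

def accept(as_path, my_asn):
    """Loop check: reject a path that already contains our own ASN."""
    return my_asn not in as_path

origin_path = advertise([], 88)               # AS 88 originates the prefix
att_path = advertise(origin_path, 7018)       # AS 7018 passes it on
print(origin_path)
print(att_path)
print(accept(att_path, 88))     # AS 88 sees its own number in the path
print(accept(att_path, 64500))  # a different AS sees no loop
```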
21 Example: ASPATH
[Figure: a /16 prefix originated by AS 88 (Princeton), with the AS path seen at AS 7018 (AT&T), AS 1129 (Global Access), AS 1755 (Ebone), AS 1239 (Sprint), AS 3549 (Global Crossing), and the RIPE NCC RIS project; at the origin, AS Path = 88]
22 Attributes (2): NEXT HOP
- Carried in a route update message
- IP address of next hop router on path to destination
- Updated as the announcement leaves the AS
[Figure: AS 88 (Princeton) announces its /16 prefix with AS path = 88 to AS 7018 (AT&T), which announces it onward to a further AS; each announcement carries its own Next Hop]
23 Attributes (3): LOCAL PREF
- Local Preference
- Used to choose between different AS paths
- The higher the value, the more preferred
- Local to an AS; carried only in iBGP messages
- Ensures consistent route selection across an AS
[Figure: ASes AS1 through AS4 and the BGP table at AS4 with columns Destination, AS Path, Local Pref; one surviving row has AS path AS2 AS1 and Local Pref 100]
24 Example: iBGP and LOCAL PREF
- Both routers prefer the path on the left, which has Local Pref = 100 (versus Local Pref = 90 on the right)
[Figure: AS1, AS 2, AS 3, and AS 4; the two routers in the figure are connected by an iBGP session]
25 Attributes (4): ORIGIN
- Records who originated the announcement
- Local to an AS
- Options:
  - e: from eBGP
  - i: from iBGP
  - ?: incomplete; often used for static routes
- Typically: e > i > ?
26 Attributes (5): MED
- Multi-Exit Discriminator
- Used when ASes are interconnected via 2 or more links, to specify how close a prefix is to the link it is announced on
- Lower is better
- AS announcing prefix sets MED (AS2 in picture)
- AS receiving prefix (optionally!) uses MED to select link (AS1 in picture)
[Figure: AS1 and AS2 connected by Link A and Link B; the destination prefix is in AS3; AS2 announces it with MED=10 on one link and MED=50 on the other]
27 Attributes (6): IGP cost
- Used for hot-potato routing
- Each router selects the closest egress point based on the path cost in the intra-domain protocol
[Figure: intra-domain topology with routers A through G, link costs, and a destination dst; labeled "hot potato"]
28 IGP may conflict with MED
[Figure: the same route announced at two exits, NEXTHOP=SF with MED=100 and NEXTHOP=BOS with MED=500]
29 Using Attributes
- Rules for route selection in priority order

Priority | Rule        | Remarks
---------+-------------+--------------------------------------------------
1        | LOCAL PREF  | Pick highest LOCAL PREF
2        | ASPATH      | Pick shortest ASPATH length
3        | MED         | Lowest MED preferred
4        | eBGP > iBGP | Did AS learn route via eBGP (preferred) or iBGP?
5        | iBGP path   | Lowest IGP cost to next hop (egress router)
6        | Router ID   | Smallest router ID (IP address) as tie-breaker
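The priority order above maps naturally onto a tuple comparison. The sketch below is illustrative only: the `Route` fields and function names are assumptions, and it compares MED across all candidate routes, whereas real routers typically compare MED only between routes learned from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Route:
    local_pref: int            # higher is better
    as_path: Tuple[int, ...]   # shorter is better
    med: int                   # lower is better
    via_ebgp: bool             # eBGP-learned preferred over iBGP-learned
    igp_cost: int              # lower cost to the egress router is better
    router_id: int             # smallest wins as the final tie-breaker

def preference_key(r: Route):
    """Sort key: the smallest tuple is the most preferred route."""
    return (
        -r.local_pref,
        len(r.as_path),
        r.med,
        0 if r.via_ebgp else 1,
        r.igp_cost,
        r.router_id,
    )

def best_route(routes):
    return min(routes, key=preference_key)

a = Route(local_pref=100, as_path=(2, 1), med=0, via_ebgp=True, igp_cost=10, router_id=1)
b = Route(local_pref=200, as_path=(3, 5, 1), med=0, via_ebgp=True, igp_cost=10, router_id=2)
c = Route(local_pref=100, as_path=(3, 1), med=0, via_ebgp=False, igp_cost=5, router_id=3)

print(best_route([a, b, c]))  # b: highest LOCAL PREF beats the shorter AS path
print(best_route([a, c]))     # a: tie through MED, then eBGP beats iBGP
```

Because each rule is only consulted when all earlier rules tie, a lexicographic tuple comparison captures the priority order without nested conditionals.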
30 BGP UPDATE Processing
1. Receive BGP updates
2. Apply import policies: filter routes & tweak attributes
3. Best route selection, based on attribute values
4. Store best routes in the best route table; install forwarding entries for best routes in the IP forwarding table
5. Apply export policies: filter routes & tweak attributes
6. Transmit BGP updates
Applying policy is open-ended programming, constrained only by the vendor configuration language.
31 BGP: Today
- BGP policy
  - typical policies, how they're implemented
- BGP protocol details
- BGP issues
32 Issues with BGP
- Reachability
- Security
- Convergence
- Performance
33 Reachability
- In normal routing, if the graph is connected then reachability is assured
- With policy routing, this does not always hold
[Figure: AS 2 is a customer of two providers, AS 1 and AS 3]
34 Security
- An AS can claim to serve a prefix that it actually doesn't have a route to (blackholing traffic)
  - Problem not specific to policy or path vector
  - Important because of AS autonomy
  - Fixable: make ASes prove they have a path
- Note: an AS can also have an incentive to forward packets along a route different from what is advertised
  - Tell customers about a fictitious short path
  - Much harder to fix!
35 Convergence
- Result: If all AS policies follow the Gao-Rexford rules, BGP is guaranteed to converge (safety)
- For arbitrary policies, BGP may fail to converge!
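36 Example of Policy Oscillation
[Figure: nodes 0, 1, 2, 3 with their path preferences; the surviving label reads "1 prefers [path] over 1 0 to reach [destination]", the bracketed parts having been lost in extraction]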
37-51 Step-by-Step of Policy Oscillation
[Figure sequence; the node label that ended each caption was lost in extraction and is shown as [neighbor] or [destination]]
1. Initially: nodes 1, 2, 3 know only the shortest path to [destination]
2. 1 advertises its path 1 0 to [neighbor]
3. 3 advertises its path 3 0 to [neighbor]
4. 1 withdraws its path 1 0 from [neighbor]
5. 2 advertises its path 2 0 to [neighbor]
6. 3 withdraws its path 3 0 from [neighbor]
7. 1 advertises its path 1 0 to [neighbor]
8. 2 withdraws its path 2 0 from [neighbor]
9. We are back to where we started!
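The oscillation can be reproduced with a tiny simulation. Everything specific here is an assumption made for illustration: the preference table is a classic three-node cyclic-preference setup in which each node prefers a two-hop path through one neighbor over its direct path to 0 (the slide's exact preferences did not survive), and all nodes update in synchronous rounds rather than in the one-at-a-time order the slides walk through.

```python
# Assumed ranked paths per node, most preferred first. Destination is node 0.
PREFS = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}

def step(state):
    """One synchronous round: each node picks its best currently available path."""
    new_state = {}
    for node, ranked in PREFS.items():
        for path in ranked:
            next_hop = path[1]
            # The direct path is always available; a longer path is available
            # only if the next hop is currently using the rest of that path.
            if next_hop == 0 or state[next_hop] == path[1:]:
                new_state[node] = path
                break
    return new_state

def run(max_rounds=10):
    """Iterate until the state repeats; report whether it settled or cycled."""
    state = {node: (node, 0) for node in PREFS}  # start on the direct paths
    history = [state]
    for _ in range(max_rounds):
        state = step(state)
        if state == history[-1]:
            return history, "converged"
        if state in history:
            return history + [state], "oscillates"
        history.append(state)
    return history, "undecided"

history, verdict = run()
for round_number, snapshot in enumerate(history):
    print(round_number, snapshot)
print(verdict)
```

Under these assumed preferences no assignment of paths is stable: whenever a node gets its preferred path, it stops using the direct path that another node's preferred path depends on.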
52 Convergence
- Result: If all AS policies follow the Gao-Rexford rules, BGP is guaranteed to converge (safety)
- For arbitrary policies, BGP may fail to converge!
- Should this trouble us?
53 Performance Nonissues
- Internal routing (non)
  - Domains typically use "hot potato" routing
  - Not always optimal, but economically expedient
- Policy not about performance (non)
  - So policy-chosen paths aren't shortest
- Choosing among policy-compliant paths (non)
  - Fewest AS hops has little to do with actual delay
  - 20% of paths inflated by at least 5 router hops
54 Performance (example)
- AS path length can be misleading
  - An AS may have many router-level hops
[Figure: AS 4, AS 3, AS 2, AS 1; caption: "BGP says that path 4 1 is better than path [lost in extraction]"]
55 Real Performance Issue: Slow convergence
- BGP outages are the biggest source of Internet problems
- Labovitz et al., SIGCOMM '97
  - 10% of routes available less than 95% of the time
  - Less than 35% of routes available 99.99% of the time
- Labovitz et al., SIGCOMM 2000
  - 40% of path outages take 30+ minutes to repair
- But most popular paths are very stable
56 BGP Misconfigurations
- BGP protocol is both bloated and underspecified
  - lots of leeway in how to set and interpret attribute values, route selection rules, etc.
  - necessary to allow autonomy, diverse policies
  - but also gives operators plenty of rope
- Much of this configuration is manual and ad hoc
- And the core abstraction is fundamentally flawed
  - per-router configuration to effect AS-wide policy
  - now strong industry interest in changing this! [later: SDN]
57 BGP: How did we get here?
- BGP was designed for a different time
  - before commercial ISPs and their needs
  - before address aggregation
  - before multi-homing
- We don't get a second chance: "clean slate" designs virtually impossible to deploy
- Thought experiment: how would you design a policy-driven interdomain routing solution? How would you deploy it?

Timeline:
- 1989: BGP-1 [RFC 1105]; replacement for EGP (1984, RFC 904)
- 1990: BGP-2 [RFC 1163]
- 1991: BGP-3 [RFC 1267]
- 1995: BGP-4 [RFC 1771]; support for Classless Interdomain Routing (CIDR)
58 Next Time
- Wrap up the network layer!
  - the IPv4 header
  - IP routers