Route Servers for !dummies, or: Scaling is Hard; Let's Go Shopping!




Route Servers for !dummies, or: Scaling is Hard; Let's Go Shopping!
Nick Hilliard, Head of Operations
nick@inex.ie

Some Blurb on INEX
Currently the only member-owned IXP in Ireland
59 members: 46 full members, 13 associate
Estimate about 90% of eyeballs in Ireland (South)
Traffic levels: daytime peaks of 6G
Provide usual services: 10M to 10G ethernet
Two separate L2 infrastructures
Three PoPs: Telecity Dublin, DEG, Interxion DUB1
Mixture of Brocade (FES-X624, TI24X) and Cisco 6500
Fibre lit with Transmode DWDM kit: N x 10G
Free Beer at Meetings!!!11!!
Highly active community interest
Oh yeah, we have some route-servers too

Route Servers for Dummies
Platform for multilateral peering agreements (MLPA)
Similar to a route reflector, except it uses eBGP
Very fashionable at IXPs right now
Reduce administrative load of peering
Simple interconnection to lots of other partners
Instant RoI (ISP management likes this)
Outsourcing RIB calculations to fast machines(!)
Safe if the IXP has implemented prefix filtering
Considered ghetto routing by larger providers
  There are good reasons for this opinion
INEX recommends peering with route servers unless you know why you shouldn't
  Because route servers are not for everyone
Route prefix filtering considered indispensable by IXP participants

Route Servers for Dummies
[Diagrams: peering on an IXP without route servers; peering on an IXP with route servers]
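The administrative saving behind the two diagrams can be put in numbers. This is a minimal sketch, assuming every member wants to peer with every other member and that each member opens one session per route server; the function names are made up for illustration and the member count of 59 is taken from the INEX slide.

```python
def bilateral_sessions(members: int) -> int:
    """BGP sessions needed when every member peers directly with every other."""
    return members * (members - 1) // 2


def route_server_sessions(members: int, route_servers: int = 1) -> int:
    """BGP sessions needed when every member peers only with the route servers."""
    return members * route_servers


if __name__ == "__main__":
    members = 59  # INEX membership quoted earlier in the talk
    print("bilateral full mesh:", bilateral_sessions(members))
    print("one route server:   ", route_server_sessions(members))
    print("two route servers:  ", route_server_sessions(members, 2))
```

Under these assumptions the full mesh grows quadratically with membership while the route server model grows linearly, which is the "reduce administrative load" point on the slide.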

Single-RIB BGP policy problem
[Diagram: IXP fabric with a single-RIB route server and clients A, B, C and D; networks X and Y sit outside the fabric]
A, B, C and D are route server clients
X reaches the IXP via transit A and B
AYX is longer than BX
RS calculates best path as BX
B does not peer with D (tag 0:D)
D should see the AYX path
RS RIB has only one best path: BX
D does not see X
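The failure on this slide can be reproduced with a toy model. The sketch below assumes that best-path selection is nothing more than "shortest AS path wins" (real BGP has many more steps) and that the 0:D tag is modelled as a set of clients the route must not be exported to. The `Route` type and function names are hypothetical, not any route server's API.

```python
from collections import namedtuple

# announced_by: the client that sent the route to the route server
# no_export_to: clients excluded by the announcer's tag (e.g. 0:D)
Route = namedtuple("Route", "prefix as_path announced_by no_export_to")


def best_path(routes):
    """Pick the route with the shortest AS path (toy selection rule)."""
    return min(routes, key=lambda r: len(r.as_path))


def single_rib_exports(routes, clients):
    """Select ONE best path first, then apply export filtering per client."""
    best = best_path(routes)
    exports = {}
    for client in clients:
        blocked = client == best.announced_by or client in best.no_export_to
        exports[client] = None if blocked else best.as_path
    return exports


routes_to_x = [
    Route("X", ("A", "Y", "X"), "A", frozenset()),
    Route("X", ("B", "X"), "B", frozenset({"D"})),  # B tags 0:D
]
exports = single_rib_exports(routes_to_x, ["A", "B", "C", "D"])

if __name__ == "__main__":
    for client, path in exports.items():
        print(client, "->", path)
```

Because filtering happens after selection, D is left with no route to X even though the longer path through A was available.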

How Per-Client Loc-RIBs Work
[Diagram: multi-RIB route server holding Loc-RIB A, Loc-RIB B, Loc-RIB C and Loc-RIB D, with import and export rules between the clients and their Loc-RIBs; networks X and Y sit outside the fabric]
A, B, C and D are route server clients
X reaches the IXP via transit A and B
AYX is longer than BX
Import rule filters on tag
RIB C sees BX and AYX
RIB D only sees AYX
RIB C selects BX
RIB D selects AYX
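The per-client behaviour can be sketched the same way. This toy model assumes "shortest AS path wins" as the only selection rule and represents the 0:D tag as a set of excluded clients; the names are hypothetical and it is not how any particular route server implements its import rules.

```python
from collections import namedtuple

Route = namedtuple("Route", "prefix as_path announced_by no_export_to")


def per_client_best_paths(routes, clients):
    """Filter on the tag FIRST (import into each Loc-RIB), then select per client."""
    selected = {}
    for client in clients:
        loc_rib = [
            r for r in routes
            if r.announced_by != client and client not in r.no_export_to
        ]
        if loc_rib:
            selected[client] = min(loc_rib, key=lambda r: len(r.as_path)).as_path
        else:
            selected[client] = None
    return selected


routes_to_x = [
    Route("X", ("A", "Y", "X"), "A", frozenset()),
    Route("X", ("B", "X"), "B", frozenset({"D"})),  # B tags 0:D
]
selected = per_client_best_paths(routes_to_x, ["A", "B", "C", "D"])

if __name__ == "__main__":
    for client, path in selected.items():
        print("Loc-RIB", client, "selects", path)
```

The price is one filtered table and one selection run per client, which is the scaling cost discussed on the following slides.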

Ceiling Cat is Watching you Propagate

Problems with Multiple Loc-RIBs
Multiple Loc-RIBs mean:
  Memory and CPU consumption go from O(M) to O(N x M)
  N = number of clients
  M = total number of prefixes
Update processing resources required are:
  [formula not preserved in this transcript]
  where P(n) = the number of prefixes from peer n
  N = number of peers connected to the system
This scales as P_average x N^2
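The quadratic scaling can be recovered from a simple model. What follows is an assumption, not the slide's own formula: each prefix from peer n is handled once when it arrives at the route server and once for each of the other N - 1 peers it is sent to.

```latex
R \;\propto\; \sum_{n=1}^{N} P(n)\,\bigl[\,1 + (N-1)\,\bigr]
  \;=\; N \sum_{n=1}^{N} P(n)
  \;=\; N^{2}\, P_{\text{average}},
\qquad
P_{\text{average}} = \frac{1}{N}\sum_{n=1}^{N} P(n)
```

With N = 200 and P_average = 100 this gives 4,000,000 updates, the total quoted on the Facts and Figures slide for that case.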

Help!

[Graph: resources required against peers & prefixes, with a CPU / memory / network resource limit marked. Successive annotations on the curve: "Cute Kitteh!", "SRSLY Cute Kitteh!", "Evil Kitteh!", and, near the resource limit, "The bit you need to get worried about".]

[Graph: system response time against peers & prefixes, with a performance knee marked. Annotation past the knee: "Performance Goes to Hell in a Handcart".]

Facts and Figures
A single BGP prefix update:
  might take 10-30 bytes on the network to send to a peer
  might take 10-30 µs to process
Disclaimer: this ignores attributes, path length, CPU speed, and a pile of other highly relevant parameters
OK, it's hand-waving

Facts and Figures
200 clients, average 100 prefixes each
[Diagram: multi-RIB route server with RIB A, RIB B, RIB C ... RIB 200; each client (Client A, Client B, Client C ... Client 200) sends 100 prefixes and receives 19,900 prefixes]
RS to client updates: 19,900 x 200 = 3,980,000
Client to RS updates: 100 x 200 = 20,000
Total BGP updates: 4,000,000
40-120 MB of network traffic
40-120 s of CPU time

Facts and Figures
1000 clients, average 500 prefixes each
[Diagram: multi-RIB route server with RIB A, RIB B, RIB C ... RIB 1000; each client (Client A, Client B, Client C ... Client 1000) sends 500 prefixes and receives 499,500 prefixes]
RS to client updates: 499,500 x 1000 = 499.5 million
Client to RS updates: 500 x 1000 = 500,000
Total BGP updates: 500 million
5-15 GB of network traffic
About 83-250 minutes of CPU time
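The arithmetic on these slides is easy to rerun for other sizes. This is a back-of-the-envelope sketch that assumes every client receives every prefix except its own and uses the hand-waving 10-30 bytes and 10-30 µs per update quoted earlier; the function name and dictionary keys are made up for illustration.

```python
def full_table_cost(clients, prefixes_per_client,
                    bytes_per_update=(10, 30), micros_per_update=(10, 30)):
    """Estimate the cost of exchanging full tables through a multi-RIB route server."""
    total_prefixes = clients * prefixes_per_client
    # Each client receives everything except the prefixes it announced itself.
    per_client_rib = total_prefixes - prefixes_per_client
    rs_to_client = per_client_rib * clients
    client_to_rs = prefixes_per_client * clients
    total = rs_to_client + client_to_rs
    return {
        "per_client_rib": per_client_rib,
        "rs_to_client": rs_to_client,
        "client_to_rs": client_to_rs,
        "total_updates": total,
        "traffic_bytes": tuple(total * b for b in bytes_per_update),
        "cpu_seconds": tuple(total * us / 1_000_000 for us in micros_per_update),
    }


if __name__ == "__main__":
    for clients, prefixes in [(200, 100), (1000, 500)]:
        cost = full_table_cost(clients, prefixes)
        low, high = cost["cpu_seconds"]
        print(f"{clients} clients x {prefixes} prefixes: "
              f"{cost['total_updates']:,} updates, "
              f"{low / 60:.0f}-{high / 60:.0f} min CPU")
```

Going from 200 to 1000 clients is a fivefold increase in membership, yet under these assumptions the update count grows 125-fold because the prefix count per client rises as well.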

How Do We Fix This?
Super-linear scaling causes inherent breakage
Moving away from the one-Loc-RIB-per-client model is critical
Right now, this isn't the primary cause of IXP breakage
Three primary models to escape this limitation:
Collapse multiple Loc-RIBs in memory into a single gargantuan Loc-RIB
  less memory, less CPU
  You can run but you can't hide
Use prior knowledge
  Web-based peering
  Disable unique Loc-RIB on a per-client basis
BGP ADD_PATH capability
  Published as ID: draft-walton-bgp-add-paths
  Moves BGP best path selection to the client, so filtering can be performed without selecting