Network Tomography and Internet Traffic Matrices



Similar documents
An iterative tomogravity algorithm for the estimation of network traffic

Traffic Engineering With Traditional IP Routing Protocols

IP Forwarding Anomalies and Improving their Detection using Multiple Data Sources

Traffic & Peering Analysis

Simulation of Heuristic Usage for Load Balancing In Routing Efficiency

TE in action. Some problems that TE tries to solve. Concept of Traffic Engineering (TE)

Search Heuristics for Load Balancing in IP-networks

Internet Firewall CSIS Packet Filtering. Internet Firewall. Examples. Spring 2011 CSIS net15 1. Routers can implement packet filtering

Reformulating the monitor placement problem: Optimal Network-wide wide Sampling

Research on Errors of Utilized Bandwidth Measured by NetFlow

Question 1. [7 points] Consider the following scenario and assume host H s routing table is the one given below:

2004 Networks UK Publishers. Reprinted with permission.

TUTORIAL. Best Practices for Determining the Traffic Matrix in IP Networks V 3.0

MPLS Architecture for evaluating end-to-end delivery

Real-Time Traffic Engineering Management With Route Analytics

Understanding and Optimizing BGP Peering Relationships with Advanced Route and Traffic Analytics

Network congestion control using NetFlow

On the effect of forwarding table size on SDN network utilization

Routing and traffic measurements in ISP networks

SBSCET, Firozpur (Punjab), India

MPLS Network Design & Monitoring

Probability-Model based Network Traffic Matrix Estimation

Exterior Gateway Protocols (BGP)

Network traffic engineering

Advanced BGP Policy. Advanced Topics

Optimization of IP Load-Balanced Routing for Hose Model

Backbone Capacity Planning Methodology and Process

Traffic Engineering With Traditional IP Routing Protocols

CS 5480/6480: Computer Networks Spring 2012 Homework 4 Solutions Due by 1:25 PM on April 11 th 2012

Traffic Engineering for Multiple Spanning Tree Protocol in Large Data Centers

Multi-Commodity Flow Traffic Engineering with Hybrid MPLS/OSPF Routing

Robust Load Balancing using Multi-Topology Routing

Implementing MPLS VPN in Provider's IP Backbone Luyuan Fang AT&T

How to Configure BGP Tech Note

Flow Monitoring With Cisco Routers

Traffic Engineering Management Concepts

An Adaptive MT-BGP Traffic Engineering Based on Virtual Routing Topologies

Visualizing Traffic on Network Topology

Dynamic Routing Protocols II OSPF. Distance Vector vs. Link State Routing

Assignment #3 Routing and Network Analysis. CIS3210 Computer Networks. University of Guelph

Network traffic monitoring and management. Sonia Panchen 11 th November 2010

How To Understand and Configure Your Network for IntraVUE

Masterkurs Rechnernetze IN2097

Border Gateway Protocol BGP4 (2)

Software Defined Networking What is it, how does it work, and what is it good for?

A REPORT ON ANALYSIS OF OSPF ROUTING PROTOCOL NORTH CAROLINA STATE UNIVERSITY

Routing Protocols. Interconnected ASes. Hierarchical Routing. Hierarchical Routing

20. Switched Local Area Networks

Course Description. Students Will Learn

Network-Wide Change Management Visibility with Route Analytics

and reporting Slavko Gajin

Exercise 4 MPLS router configuration

APNIC elearning: BGP Attributes

TUTORIAL Best Practices for Determining the Traffic Matrix in IP Networks

Experimentation driven traffic monitoring and engineering research

Computer Networks II Master degree in Computer Engineering Exam session: 11/02/2009 Teacher: Emiliano Trevisani. Student Identification number

Multihoming and Multi-path Routing. CS 7260 Nick Feamster January

Table of Contents. Cisco How Does Load Balancing Work?

Software-Defined Traffic Measurement with OpenSketch

Obfuscation of sensitive data in network flows 1

Interconnecting Cisco Networking Devices: Accelerated (CCNAX) 2.0(80 Hs) 1-Interconnecting Cisco Networking Devices Part 1 (40 Hs)

Juniper / Cisco Interoperability Tests. August 2014

Load Balancing by MPLS in Differentiated Services Networks

A Performance Study of IP and MPLS Traffic Engineering Techniques under Traffic Variations

Based on Computer Networking, 4 th Edition by Kurose and Ross

Introduction. Technology background

Lecture 8: Routing I Distance-vector Algorithms. CSE 123: Computer Networks Stefan Savage

Network Measurement. Why Measure the Network? Types of Measurement. Traffic Measurement. Packet Monitoring. Monitoring a LAN Link. ScienLfic discovery

PART III. OPS-based wide area networks

Network-Wide Capacity Planning with Route Analytics

MPLS/BGP Network Simulation Techniques for Business Enterprise Networks

Cisco Application Networking for Citrix Presentation Server

Cracking Network Monitoring in DCNs with SDN

MPLS Traffic Engineering in ISP Network

Outline. EE 122: Interdomain Routing Protocol (BGP) BGP Routing. Internet is more complicated... Ion Stoica TAs: Junda Liu, DK Moon, David Zats

Enterprise Network Simulation Using MPLS- BGP

Introduction to LAN/WAN. Network Layer

Network-Wide Class of Service (CoS) Management with Route Analytics. Integrated Traffic and Routing Visibility for Effective CoS Delivery

Active measurements: networks. Prof. Anja Feldmann, Ph.D. Dr. Nikolaos Chatzis Georgios Smaragdakis, Ph.D.

Outline. Outline. Outline

Per-Packet Load Balancing

basic BGP in Huawei CLI

Transitioning to BGP. ISP Workshops. Last updated 24 April 2013

How to configure an Advanced Expert Probe as NetFlow Collector

The Value of Flow Data for Peering Decisions

Broadband Networks. Prof. Karandikar. Department of Electrical Engineering. Indian Institute of Technology, Bombay. Lecture - 26

Answers to Sample Questions on Network Layer

HPSR 2002 Kobe, Japan. Towards Next Generation Internet. Bijan Jabbari, PhD Professor, George Mason University

Transcription:

Network Tomography and Internet Traffic Matrices Matthew Roughan School of Mathematical Sciences <matthew.roughan@adelaide.edu.au> 1

Credits David Donoho Stanford Nick Duffield AT&T Labs-Research Albert Greenberg AT&T Labs-Research Carsten Lund AT&T Labs-Research Quynh Nguyen AT&T Labs Yin Zhang AT&T Labs-Research 2

Problem Have link traffic measurements Want to know demands from source to destination C B A TM =... x A B x..., A, C... 3 L L L L

Example App: reliability analysis C Under a link failure, routes change want to predict new link loads B A TM =... x A B x..., A, C... 4 L L L L

Network Engineering What you want to do a)reliability analysis b)traffic engineering c)capacity planning What do you need to know Network and routing Prediction and optimization techniques? Traffic matrix 5

Outline Part I: What do we have to work with data sources SNMP traffic data Netflow, packet traces Topology, routing and configuration Part II:Algorithms Gravity models Tomography Combination and information theory Part III: Applications Network Reliability analysis Capacity planning Routing optimization (and traffic engineering in general) 6

Part I: Data Sources 7

Traffic Data 8

Data Availability packet traces Packet traces limited availability like a high zoom snap shot special equipment needed (O&M expensive even if box is cheap) lower speed interfaces (only recently OC192) huge amount of data generated 9

Data Availability flow level data Flow level data not available everywhere like a home movie of the network historically poor vendor support (from some vendors) large volume of data (1:100 compared to traffic) feature interaction/performance impact 10

Netflow Measurements Detailed IP flow measurements Flow defined by 3 Source, Destination IP, 3 Source, Destination Port, 3 Protocol, 3 Time Statistics about flows 3 Bytes, Packets, Start time, End time, etc. Enough information to get traffic matrix Semi-standard router feature Cisco, Juniper, etc. not always well supported potential performance impact on router Huge amount of data (500GB/day) 11

Data Availability SNMP SNMP traffic data like a time lapse panorama MIB II (including IfInOctets/IfOutOctets) is available almost everywhere manageable volume of data (but poor quality) no significant impact on router performance 12

SNMP Pro Comparatively simple Relatively low volume It is used already (lots of historical data) Con Data quality an issue with any data source 3 Ambiguous 3 Missing data 3 Irregular sampling Octets counters only tell you link utilizations 3 Hard to get a traffic matrix 3 Can t tell what type of traffic 3 Can t easily detect DoS, or other unusual events Coarse time scale (>1 minute typically; 5 min in our case) 13

Topology and configuration Router configurations Based on downloaded router configurations, every 24 hours 3 Links/interfaces 3 Location (to and from) 3 Function (peering, customer, backbone, ) 3 OSPF weights and areas 3 BGP configurations Routing 3 Forwarding tables 3 BGP (table dumps and route monitor) 3 OSPF table dumps Routing simulations Simulate IGP and BGP to get routing matrices 14

Part II: Algorithms 15

The problem 1 route 3 y = + route 1 router x x 1 1 3 2 3 route 2 Want to compute the traffic x j along route j from measurements on the links, y i y 1 1 0 1 x 1 y 2 y 3 = 1 1 0 0 1 1 x 2 x 3 16

The problem 1 route 3 y = + route 1 router x x 1 1 3 2 3 route 2 Want to compute the traffic x j along route j from measurements on the links, y i y = Ax 17

Underconstrained linear inverse problem Traffic matrix y = Ax Link measurements Routing matrix Many more unknowns than measurements 18

Naive approach 19

Gravity Model Assume traffic between sites is proportional to traffic at each site x 1 y 1 y 2 x 2 y 2 y 3 x 3 y 1 y 3 Assumes there is no systematic difference between traffic in LA and NY Only the total volume matters Could include a distance term, but locality of information is not as important in the Internet as in other networks 20

Simple gravity model 21

Generalized gravity model Internet routing is asymmetric A provider can control exit points for traffic going to peer networks peer links access links 22

Generalized gravity model Internet routing is asymmetric A provider can control exit points for traffic going to peer networks Have much less control over where traffic enters peer links access links 23

Generalized gravity model 24

Tomographic approach 1 route 3 y = + route 1 router x x 1 1 3 2 3 route 2 y = A x 25

Direct Tomographic approach Under-constrained problem Find additional constraints Use a model to do so Typical approach is to use higher order statistics of the traffic to find additional constraints Disadvantage Complex algorithm doesn t scale (~1000 nodes, 10000 routes) Reliance on higher order stats is not robust given the problems in SNMP data Model may not be correct -> result in problems Inconsistency between model and solution 26

Combining gravity model and tomography 1. gravity solution 2. tomo-gravity solution argmin{ y Ax 2 + λ 2 J( x) } x tomographic constraints (from link measurements) 27

Regularization approach Minimum Mutual Information: minimize the mutual information between source and destination No information The minimum is independence of source and destination 3 P(S,D) = p(s) p(d) 3 P(D S) = P(D) 3 actually this corresponds to the gravity model Add tomographic constraints: 3 Including additional information as constraints 3 Natural algorithm is one that minimizes the Kullback-Liebler information number of the P(S,D) with respect to P(S) P(D) Max relative entropy (relative to independence) 28

Validation Results good: ±20% bounds for larger flows Observables even better 29

More results Large errors are in small flows >80% of demands have <20% error tomogravity method simple approximation 30

Robustness (input errors) 31

Robustness (missing data) 32

Dependence on Topology 30 star (20 nodes) clique relative errors (%) 25 20 15 10 5 0 random geographic Linear (geographic) 0 1 2 3 4 5 6 7 8 9 10 11 unknowns per measurement 33

Additional information Netflow 34

Part III: Applications 35

Applications Capacity planning Optimize network capacities to carry traffic given routing Timescale months Reliability Analysis Test network has enough redundant capacity for failures Time scale days Traffic engineering Optimize routing to carry given traffic Time scale potentially minutes 36

Capacity planning Plan network capacities No sophisticated queueing (yet) Optimization problem Used in AT&T backbone capacity planning For more than well over a year North American backbone Being extended to other networks 37

Network Reliability Analysis Consider the link loads in the network under failure scenarios Traffic will be rerouted What are the new link loads? Prototype used (> 1 year) Currently being turned form a prototype into a production tool for the IP backbone Allows what if type questions to be asked about link failures (and span, or router failures) Allows comprehensive analysis of network risks 3 What is the link most under threat of overload under likely failure scenarios 38

Example use: reliability analysis 39

Traffic engineering and routing optimization Choosing route parameters that use the network most efficiently In simple cases, load balancing across parallel routes Methods Shortest path IGP weight optimization 3Thorup and Fortz showed could optimize OSPF weights Multi-commodity flow optimization 3Implementation using MPLS 3Explicit route for each origin/destination pair 40

Comparison of route optimizations 41

Conclusion Properties Fast (a few seconds for 50 nodes) Scales (to hundreds of nodes) Robust (to errors and missing data) Average errors ~11%, bounds 20% for large flows Tomo-gravity implemented AT&T s IP backbone (AS 7018) Hourly traffic matrices for > 1 year Being extended to other networks http://www.maths.adelaide.edu.au/staff/applied/~roughan/ 42

Additional slides 43

Validation Look at a real network Get SNMP from links Get Netflow to generate a traffic matrix Compare algorithm results with ground truth Problems: 3 Hard to get Netflow along whole edge of network If we had this, then we wouldn t need SNMP approach 3 Actually pretty hard to match up data Is the problem in your data: SNMP, Netflow, routing, Simulation Simulate and compare Problems 3 How to generate realistic traffic matrices 3 How to generate realistic network 3 How to generate realistic routing 3 Danger of generating exactly what you put in 44

Our method We have netflow around part of the edge (currently) We can generate a partial traffic matrix (hourly) Won t match traffic measured from SNMP on links Can use the routing and partial traffic matrix to simulate the SNMP measurements you would get Then solve inverse problem Advantage Realistic network, routing, and traffic Comparison is direct, we know errors are due to algorithm not errors in the data 45

Estimates over time 46

Local traffic matrix (George Varghese) 0% 1% 5% 10% for reference previous case 47