Reformulating the monitor placement problem: Optimal Network-wide wide Sampling



Similar documents
2004 Networks UK Publishers. Reprinted with permission.

Network Tomography and Internet Traffic Matrices

Optimizing the deployment of Netflow to infer traffic matrices in large networks

Network-Wide Capacity Planning with Route Analytics

Nonlinear Optimization: Algorithms 3: Interior-point methods

A Framework For Maximizing Traffic Monitoring Utility In Network V.Architha #1, Y.Nagendar *2

How To Find The Optimal Sampling Rate Of Netflow On A Network

Load Distribution in Large Scale Network Monitoring Infrastructures

DESIGN AND ANALYSIS OF TECHNIQUES FOR MAPPING VIRTUAL NETWORKS TO SOFTWARE- DEFINED NETWORK SUBSTRATES

Introduction. The Inherent Unpredictability of IP Networks # $# #

Binary vs Analogue Path Monitoring in IP Networks

Experimentation driven traffic monitoring and engineering research

Traffic Engineering for Multiple Spanning Tree Protocol in Large Data Centers

Towards a Next- Generation Inter-domain Routing Protocol. L. Subramanian, M. Caesar, C.T. Ee, M. Handley, Z. Mao, S. Shenker, and I.

Routing and traffic measurements in ISP networks

LEISURE: A Framework for Load-Balanced Network-Wide Traffic Measurement

Adaptive Resource Management and Control in Software Defined Networks

Traffic Engineering With Traditional IP Routing Protocols

Lecture 3. Linear Programming. 3B1B Optimization Michaelmas 2015 A. Zisserman. Extreme solutions. Simplex method. Interior point method

Network traffic engineering

On the Interaction and Competition among Internet Service Providers

Efficiently Monitoring Bandwidth and Latency in IP Networks

An iterative tomogravity algorithm for the estimation of network traffic

Integrated Backup Topology Control and Routing of Obscured Traffic in Hybrid RF/FSO Networks*

Network Security through Software Defined Networking: a Survey

TE in action. Some problems that TE tries to solve. Concept of Traffic Engineering (TE)

Measuring the Shared Fate of IGP Engineering and Interdomain Traffic

Adaptive Online Gradient Descent

Network-Wide Change Management Visibility with Route Analytics

Multi-Commodity Flow Traffic Engineering with Hybrid MPLS/OSPF Routing

Load Balancing by MPLS in Differentiated Services Networks

Border Gateway Protocol (BGP)

Chapter 4. VoIP Metric based Traffic Engineering to Support the Service Quality over the Internet (Inter-domain IP network)

Measurement-aware Monitor Placement and Routing

A Closed-Loop Control Traffic Engineering System for the Dynamic Load Balancing of Inter-AS Traffic

Network-Wide Class of Service (CoS) Management with Route Analytics. Integrated Traffic and Routing Visibility for Effective CoS Delivery

Router Scheduling Configuration Based on the Maximization of Benefit and Carried Best Effort Traffic

B4: Experience with a Globally-Deployed Software Defined WAN TO APPEAR IN SIGCOMM 13

Optimal monitoring in large networks by Successive c-optimal Designs

Optimizing the measurement of the trac on large Networks: An Experimental Design Approach

E-Guide NETWORKING MONITORING BEST PRACTICES: SETTING A NETWORK PERFORMANCE BASELINE

Flow Monitoring With Cisco Routers

PATH SELECTION AND BANDWIDTH ALLOCATION IN MPLS NETWORKS

An Adaptive MT-BGP Traffic Engineering Based on Virtual Routing Topologies

Traffic Analysis With Netflow. The Key to Network Visibility

Infrastructure for active and passive measurements at 10Gbps and beyond

NetFlow Performance Analysis

Performance of Cisco IPS 4500 and 4300 Series Sensors

On Orchestrating Virtual Network Functions

Real-Time Traffic Engineering Management With Route Analytics

Simulation of Heuristic Usage for Load Balancing In Routing Efficiency

Deploying in a Distributed Environment

Understanding and Optimizing BGP Peering Relationships with Advanced Route and Traffic Analytics

Enhancing BoD Services based on Virtual Network Topology Control

Nonlinear Programming Methods.S2 Quadratic Programming

WAVE: Popularity-based and Collaborative In-network Caching for Content-Oriented Networks

Network traffic monitoring and management. Sonia Panchen 11 th November 2010

Linear Programming I

Network Level Multihoming and BGP Challenges

Multi-layer MPLS Network Design: the Impact of Statistical Multiplexing

Routing & Traffic Analysis for Converged Networks. Filling the Layer 3 Gap in VoIP Management

GLM, insurance pricing & big data: paying attention to convergence issues.

Outline. EE 122: Interdomain Routing Protocol (BGP) BGP Routing. Internet is more complicated... Ion Stoica TAs: Junda Liu, DK Moon, David Zats

Structural Analysis of Network Traffic Flows Eric Kolaczyk

Passively Detecting Remote Connectivity Issues Using Flow Accounting. 2nd EMANICS Workshop on Netflow/IPFIX usage in network management

Multiple Layer Traffic Engineering in NTT Network Service

Internetworking II: VPNs, MPLS, and Traffic Engineering

Functional Optimization Models for Active Queue Management

OpenTM: Traffic Matrix Estimator for OpenFlow Networks

Blind SDN vs. Insightful SDN in a Mobile Backhaul Environment Extending SDN with Network State+

HPAM: Hybrid Protocol for Application Level Multicast. Yeo Chai Kiat

Traffic Analysis with Netflow The Key to Network Visibility

A Network Recovery Scheme for Node or Link Failures using Multiple Routing Configurations

Introduction to Quality of Service. Andrea Bianco Telecommunication Network Group

Follow the Perturbed Leader

Transcription:

Reformulating the monitor placement problem: Optimal Network-wide wide Sampling Gianluca Iannaccone Intel Research @ Cambridge Joint work with: G. Cantieni,, P. Thiran (EPFL) C. Barakat (INRIA), C. Diot (Intel) 1

Motivation Router-embedded embedded monitoring functionalities are commonly used in small and large ISPs e.g., Cisco s NetFlow provide visibility over the entire network level of details is good enough (for now...) Challenge: how to configure a network-wide wide monitoring infrastructure with hundreds of viewpoints? 2

Why is it a hard challenge? Configure means setting the sampling rates on all individual interfaces Sampling rates needs to be low to reduce stress on routers Aggregate volume of information collected from the routers should be kept under control Measurement task unknown a priori and a single fixed layout does not perform well e.g., PoP-level traffic matrix estimation all edge routers with low sampling rates. e.g., focusing on specific prefix below the radars few monitors, relatively higher sampling rates. 3

Our objective Given a measurement task and a target accuracy, find a method that: sets the sampling rates on all interfaces guarantees optimal use of resources (in terms of processed packets) requires minimum configuration can adapt quickly to changes in the traffic Method should apply to a general class of measurement tasks 4

Picking a measurement task... Estimate amount of traffic flowing among a subset of origin-destination pairs Common task for traffic engineering apps Janet AS786 GEANT European Research Network 5

Problem formulation Choose vector of sampling rates p that maximizes utility function for OD pair k effective sampling rate for OD pair k sampling rate on link i max sampling rate for link i packets traversing link i system capacity (in packets) Effective sampling rate approximated by sum of sampling rates All constraints are linear and define a convex solution space Unique maximizer exists as long as M() is strictly concave 6

Algorithm Solve system defined by KKT conditions select set active/inactive constraints (equivalent to switching off/on a link monitor) use gradient projection method to explore space use KKT conditions to check optimality of solution Selection of active/inactive constraints is NP- hard no guarantee of convergence Limit algorithm runs to 2,000 iterations 98.6% optimum found (for our task) 7

The utility function Measures quality of sampling an OD pair Well behaved to make the algorithm run fast Mean square relative error good candidate E[SRE] = E[((X/ρ S) / S) 2 ] actually 1 E[SRE] mean square relative accuracy M(ρ) ) = 1 E[1/S] * (1/ρ 1); minor tweaking to force it to be zero when ρ = 0 needs E[1/S] where S is the size of the OD pair 8

Evaluation Consider NetFlow data from GEANT Collected using Juniper s s Traffic Sampling 1/1000 periodic sampling We scale the measurement by 1000 (we just need a realistic mix of OD pair sizes) Results based on one run of the algorithm One five minute snapshot of the network traffic Compute OD pair sizes and link loads Assume E[1/S] is known 9

Results highlights Measuring relative accuracy Defined as one minus relative error (not squared) Allows to validate manipulation of utility function and the use of effective sampling rate Accuracy is in the range 89-99% 99% Worst accuracy for JANET LU (it has just 20 pkts/sec) Measurement spread across 10 links Max sampling rates is 0.92% (lightly loaded links) Most links are around 0.1% No OD pair is monitored on more than two links Effective sampling rate (sum of sampling rates) is a good approximation of actual sampling rate 10

Comparing to naive solutions Why not just monitoring JANET access link? All the monitored traffic would be relevant! To achieve same accuracy over all OD pairs we need ~1% sampling rate 70% more packets are processed It s s not always possible to monitor both directions of access links Why not just monitoring all UK links? There are just 6 links leaving the UK Straightforward algorithm to set sampling rate (each OD pair is present on just one link), but... 11

Monitoring all UK links Why does our method work better? It looks across the entire network to find where small OD pairs manifest themselves without hiding behind large flows 12

Deployment on real networks Two aspects need to be addressed What prior knowledge about the network does the method need? need routing information need estimate of E[1/S] for each OD pair bootstrapping phase How does the method perform over time? time of day effect change E[1/S] and U i routing event change path taken by OD pairs adapt sampling rates 13

Bootstrapping phase 14

Performance over time OD pair volume drops to less than 10 pkts/sec 15

Performance over time (cont d) accuracy drops following time of day or other OD pair fluctuations 16

Performance over time (cont d) up to 120% more sampled packets than target capacity! 17

Adapting to traffic fluctuations Three different cases that require different approaches Link load increases more sampled packets, exceeding capacity find new sampling rates to enforce target capacity OD pair decreases in volume poor accuracy because of bad E[1/S] estimate adapt capacity Θ to keep target accuracy OD pair traverses different set of links missing entire OD pair monitor routing updates and re-bootstrap the algorithm 18

Fluctuations in OD pairs Monitoring accuracy of OD pairs This is not trivial. Accuracy is not known. Need to estimate E[1/S] from sampled data. Use simplest method Current size of OD pair Compute new sampling rates when estimated accuracy drops below target If the estimated accuracy is still below target, increase capacity by 10% Decrease capacity if estimated accuracy is above target for more than one hour 19

Fluctuations in OD pairs (cont d) 20

Fluctuations in OD pairs (cont d) 21

Fluctuations in OD pairs (cont d) 22

Related work Passive monitoring Suh et al, Locating Network Monitors..., Infocom 2005 two phase approach: select the monitors then optimize sampling near-optimal solutions Active monitoring Bejerano, Rastogi, Robust monitoring of link delays, Infocom 2003 Jamin et al., On the placement of Internet instrumentation, Infocom 2000 Improving NetFlow Estan et al, Building a better NetFlow, Sigcomm 2004 Baek-Yong et al.... Adaptive Sampling... TM estimation work really a different problem setting 23

Conclusion & Future work Set sampling rates of a network of monitors. General enough framework for large class of measurement tasks Working on finding new utility functions Looking into using better predictors for E[1/S] Open issue How long does it take to reconfigure NetFlow? 24