Structural Analysis of Network Traffic Flows Eric Kolaczyk

Structural Analysis of Network Traffic Flows
Eric Kolaczyk
Anukool Lakhina, Dina Papagiannaki, Mark Crovella, Christophe Diot, and Nina Taft

Traditional Network Traffic Analysis
- Focus: short, stationary timescales; traffic on a single link in isolation
- Principal results: scaling properties; packet delays and losses

What ISPs Care About
- Focus: long, nonstationary timescales; traffic on all links simultaneously
- Principal goals: capacity planning; traffic engineering; anomaly detection

Need for Whole-Network Traffic Analysis
- Traffic engineering: How does traffic move throughout the network?
- Anomaly detection: Which links show unusual traffic?
- Capacity planning: How much to upgrade, and where in the network?

This is Complicated!
- Measuring and modeling traffic on all links simultaneously is challenging.
  - Even single-link modeling is difficult
  - 100s of links in large IP networks
  - High-dimensional timeseries
  - Significant correlation in link traffic
- Is there a more fundamental representation?

Origin-Destination Flows
[Figure: total traffic on the link; axes: traffic versus time.]
- Link traffic arises from the superposition of Origin-Destination (OD) flows
- Modeling OD flows instead of link traffic removes a significant source of correlation
- A fundamental primitive for whole-network analysis

But, This Is Still Complicated
- Even more OD flows than links
- Still a high-dimensional, multivariate timeseries
- How do we extract meaning from this high-dimensional structure in a systematic manner?

High Dimensionality: A General Strategy
- Look for good low-dimensional representations
- Often a high-dimensional structure can be explained by a small number of independent variables
- A commonly used technique: Principal Component Analysis (PCA) (aka KL-Transform, SVD, ...)

Our work
- Measure complete sets of OD flow timeseries from two backbone networks
- Use PCA to understand their structure
  - Decompose OD flows into simpler features
  - Characterize individual features
  - Reconstruct OD flows as sum of features
- Call this structural analysis

Datasets
- Abilene: 11 PoPs, 121 OD flows
- Sprint-Europe: 13 PoPs, 169 OD flows
- Collect sampled traffic from every ingress link using NetFlow
- Use BGP tables to resolve egress points
- Week-long datasets, 5- or 10-minute timesteps
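The flow counts above are consistent with one OD flow per ordered pair of PoPs, including a PoP paired with itself. The short sketch below checks that arithmetic and the number of timesteps in a week-long trace; the function names are illustrative and not from the talk.

```python
# Illustrative arithmetic only; these helper names are hypothetical.
def od_flow_count(num_pops: int) -> int:
    """One OD flow per (origin PoP, destination PoP) pair, self-pairs included."""
    return num_pops * num_pops


def timesteps_per_week(bin_minutes: int) -> int:
    """Number of measurement bins in seven days."""
    return 7 * 24 * 60 // bin_minutes


abilene_flows = od_flow_count(11)      # matches the 121 flows on the slide
sprint_flows = od_flow_count(13)       # matches the 169 flows on the slide
steps_5min = timesteps_per_week(5)
steps_10min = timesteps_per_week(10)
```

So the OD flow matrix for one week has on the order of one to two thousand rows and 121 or 169 columns.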

Example OD Flows
[Figure: twelve example OD flow timeseries (traffic versus time): OD Flows 96, 18, 167, 124, 111, 157, 42, 84, 131 and Abilene OD Flows 29, 27, 59.]
Some have visible structure, some less so

Specific Questions of Structural Analysis
- Are there low dimensional representations for a set of OD flows?
- Do OD flows share common features?
- What do the features look like?
- Can we get a high-level understanding of a set of OD flows in terms of these features?

Principal Component Analysis
- Coordinate transformation method
[Figure: original data on axes x1, x2; transformed data on axes u1, u2.]

Properties of Principal Components
- Each PC is in the direction of maximum (remaining) energy in the set of OD flows
- PCs are ordered by the amount of energy they capture
- Eigenflow: the set of OD flows mapped onto a PC; a common trend
- Eigenflows are ordered from most common to least common trend

PCA on OD flows
$X = U \Sigma V^T$
- X: OD flow matrix (time × # OD pairs); each column is an OD flow
- U: eigenflow matrix (time × # OD pairs); each column is an eigenflow
- V: principal matrix (# OD pairs × # OD pairs); each column is a PC

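PCA on OD flows (2)
- Each eigenflow is a weighted sum of all OD flows: $u_i = X v_i / \sigma_i$
- Eigenflows are orthonormal; singular values indicate the energy attributable to a principal component
- Each OD flow is a weighted sum of all eigenflows: $x_j = \sum_i \sigma_i V_{ji} u_i$
(Here $u_i$ and $v_i$ are the $i$-th columns of U and V, $\sigma_i$ is the $i$-th singular value, and $x_j$ is the $j$-th column of X; both identities follow from $X = U \Sigma V^T$.)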
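A minimal NumPy sketch of the decomposition on these two slides, using a random matrix as a stand-in for measured OD flows. The sizes and variable names are assumptions, and the slides do not say whether X is mean-centered first, so no centering is applied here.

```python
import numpy as np

rng = np.random.default_rng(0)
t, p = 200, 12                      # timesteps, OD flows (synthetic sizes)
X = rng.normal(size=(t, p))         # stand-in for the OD flow matrix

# Thin SVD: X = U @ diag(s) @ Vt, with eigenflows in the columns of U
U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T                            # columns of V are the principal components

# Each eigenflow is a weighted sum of all OD flows: u_i = X v_i / sigma_i
i = 0
u_i = X @ V[:, i] / s[i]
eigenflow_ok = np.allclose(u_i, U[:, i])

# Eigenflows are orthonormal: U^T U = I
orthonormal_ok = np.allclose(U.T @ U, np.eye(p))

# Each OD flow is a weighted sum of all eigenflows:
# x_j = sum_i sigma_i * V[j, i] * u_i
j = 3
x_j = U @ (s * V[j, :])
reconstruct_ok = np.allclose(x_j, X[:, j])
```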

An Example Eigenflow and PC
[Figure: Eigenflow 6 versus time and PC 6 versus OD flow index; panels labeled OD Flow 167 and OD Flow 94, with traffic timeseries for OD Flow 167 and OD Flow 96.]

Outline For Rest of Talk
- Find intrinsic dimensionality of OD flows
- Structural Analysis
  - Decompose OD flows
  - Characterize eigenflows
  - Reconstruct OD flows
- Potential applications

Low Intrinsic Dimensionality of OD Flows
[Figure: magnitude of the singular values, in order, for Sprint 1 and Abilene.]
Plot of (square root of) energy captured by each dimension.
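One way to produce this kind of plot data is sketched below: take the singular values of the OD flow matrix and express each squared singular value as a fraction of the total energy. The synthetic matrix (three shared trends plus small noise) and the 99% cutoff are illustrative assumptions, not values from the talk.

```python
import numpy as np


def energy_per_component(X):
    """Return the singular values and the fraction of total energy each captures."""
    s = np.linalg.svd(X, compute_uv=False)
    energy = s**2 / np.sum(s**2)
    return s, energy


# Synthetic example: 40 "flows" driven by 3 shared trends plus small noise
rng = np.random.default_rng(1)
t, p, k = 300, 40, 3
X = rng.normal(size=(t, k)) @ rng.normal(size=(k, p))
X += 0.01 * rng.normal(size=(t, p))

s, energy = energy_per_component(X)
cumulative = np.cumsum(energy)
# Smallest number of components whose cumulative energy reaches 99%
num_for_99 = int(np.searchsorted(cumulative, 0.99)) + 1
```

A sharp knee in `s`, as in the slide's figure, is what indicates low intrinsic dimensionality.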

Approximating With Top 5 Eigenflows
[Figures: original timeseries versus 5-PC approximation for OD Flow 88, OD Flow 79, and OD Flow 96.]
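The approximation shown in these figures can be sketched as a truncated SVD: keep the top k eigenflows and discard the rest. The helper name and the synthetic rank-5 data are assumptions for illustration.

```python
import numpy as np


def approximate_with_top_k(X, k):
    """Rebuild every column of X from its top-k eigenflows only."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Synthetic data that is exactly rank 5, so 5 eigenflows recover it fully
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 20))
X_hat = approximate_with_top_k(X, 5)
relative_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

On real traffic the error would not be zero; the slides only show that five components track the original flows closely.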

Outline
- Find intrinsic dimensionality of OD flows
- Structural Analysis
  - Decompose OD flows
  - Characterize eigenflows
  - Reconstruct OD flows
- Potential applications

Structure of OD Flows
[Figure: CDF, Pr[X<x], of the number of eigenflows in an OD flow, for Sprint 1 and Abilene.]
- Most OD flows have fewer than 20 significant eigenflows
- Can think of each OD flow as having only a small set of features
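The slide does not say how "significant" is decided. The sketch below assumes one plausible rule: eigenflow i is significant for OD flow j when the magnitude of the weight V[j, i] exceeds 1/sqrt(p), the value every weight would have if all p eigenflows contributed equally. Both the rule and the function name are assumptions.

```python
import numpy as np


def significant_eigenflow_counts(X, threshold=None):
    """Count, for each OD flow, the eigenflows whose weight exceeds a threshold."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt.T                         # row j holds the weights for OD flow j
    p = V.shape[0]
    if threshold is None:
        threshold = 1.0 / np.sqrt(p)  # assumed cutoff, not from the talk
    return np.sum(np.abs(V) > threshold, axis=1)


rng = np.random.default_rng(7)
X = rng.normal(size=(200, 30))       # synthetic stand-in for OD flows
counts = significant_eigenflow_counts(X)
```

Sorting `counts` and plotting their empirical CDF would give a curve of the kind shown on the slide.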

Kinds of Eigenflows
[Figure: example timeseries for Eigenflow 2, Eigenflow 20, and Eigenflow 29.]
- Deterministic (d-eigenflows): predictable (periodic) trends
- Spike (s-eigenflows): sudden, isolated spikes and drops
- Noise (n-eigenflows): roughly stationary and Gaussian

D-eigenflows Have Periodicity
[Figure: Eigenflow 1 timeseries and its power spectrum (FFT energy versus period in hours, axis marked at 6, 12, 24, 36, 48) for Sprint 1 and Abilene.]
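A power spectrum like the one in this figure can be sketched with a real FFT. The example builds a synthetic eigenflow with 24-hour and 12-hour cycles over one week of 10-minute bins and reports the dominant period; the function name and the synthetic signal are assumptions.

```python
import numpy as np


def dominant_period_hours(series, bin_minutes):
    """Return the period (in hours) of the strongest non-constant frequency."""
    series = np.asarray(series, dtype=float)
    centered = series - series.mean()
    power = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(len(centered), d=bin_minutes / 60.0)  # cycles/hour
    peak = int(np.argmax(power[1:])) + 1    # skip the zero-frequency bin
    return 1.0 / freqs[peak]


bin_minutes = 10
hours = np.arange(7 * 24 * 6) * bin_minutes / 60.0   # one week of 10-minute bins
eigenflow = np.sin(2 * np.pi * hours / 24) + 0.3 * np.sin(2 * np.pi * hours / 12)
period = dominant_period_hours(eigenflow, bin_minutes)
```

For this synthetic input the strongest peak sits at the diurnal (24-hour) period.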

S-eigenflows Have Spikes
[Figure: Sprint 1 Eigenflow 8 and Abilene Eigenflow 10, each shown with a 5-sigma threshold.]
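The 5-sigma threshold in the figure suggests a simple test, sketched here: flag the timesteps where an eigenflow deviates from its mean by more than five standard deviations. Whether the authors measured sigma exactly this way is not stated, so treat the details as assumptions.

```python
import numpy as np


def spike_indices(eigenflow, num_sigmas=5.0):
    """Indices where the series is more than num_sigmas std devs from its mean."""
    e = np.asarray(eigenflow, dtype=float)
    deviation = np.abs(e - e.mean())
    return np.flatnonzero(deviation > num_sigmas * e.std())


# Synthetic eigenflow: low-level noise with one isolated spike at index 400
rng = np.random.default_rng(3)
eigenflow = 0.01 * rng.normal(size=1000)
eigenflow[400] = 1.0
spikes = spike_indices(eigenflow)
```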

N-eigenflows Are Gaussian
[Figure: Eigenflow 39 timeseries and qq-plot of input sample quantiles versus standard normal quantiles, for Sprint 1 and Abilene.]
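A qq-plot compares sorted sample values against standard normal quantiles; points near a straight line indicate a Gaussian sample. The sketch below reduces that visual check to a single number, the correlation between the two sets of quantiles. Using a correlation score is an assumption made for illustration, not the talk's stated test.

```python
from statistics import NormalDist

import numpy as np


def qq_correlation(sample):
    """Correlation between sorted sample values and standard normal quantiles."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    probs = (np.arange(1, n + 1) - 0.5) / n          # plotting positions in (0, 1)
    normal_q = np.array([NormalDist().inv_cdf(q) for q in probs])
    return float(np.corrcoef(normal_q, x)[0, 1])


rng = np.random.default_rng(5)
gaussian = rng.normal(size=1000)
spiky = 0.01 * gaussian.copy()
spiky[0] = 1.0                                       # one dominant spike

r_gaussian = qq_correlation(gaussian)                # close to 1
r_spiky = qq_correlation(spiky)                      # noticeably lower
```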

Hundreds of Eigenflows, But Only Three Basic Types

An OD Flow, Reconstructed
[Figure: original OD flow timeseries and its D-components (d-eigenflows), S-components (s-eigenflows), and N-components (n-eigenflows).]

Another OD Flow, Reconstructed
[Figure: original OD flow timeseries and its D-components (d-eigenflows), S-components (s-eigenflows), and N-components (n-eigenflows).]
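The reconstruction in these two figures can be sketched by grouping the SVD terms by eigenflow type and summing each group separately. The `labels` array (one of "D", "S", "N" per eigenflow) is hypothetical input here; in the talk it would come from the classification of each eigenflow.

```python
import numpy as np


def components_by_type(X, labels):
    """Split X into D-, S-, and N-components; the three parts sum back to X."""
    labels = np.asarray(labels)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    parts = {}
    for kind in ("D", "S", "N"):
        mask = labels == kind
        # Sum of sigma_i * u_i * v_i^T over the eigenflows of this type
        parts[kind] = (U[:, mask] * s[mask]) @ Vt[mask, :]
    return parts


rng = np.random.default_rng(8)
t, p = 150, 10
X = rng.normal(size=(t, p))                       # synthetic OD flow matrix
labels = ["D"] * 3 + ["S"] * 2 + ["N"] * (p - 5)  # made-up labelling
parts = components_by_type(X, labels)
rebuilt = parts["D"] + parts["S"] + parts["N"]
```

Column j of each part is the corresponding panel for OD flow j.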

Which Eigenflows Are Most Significant?
[Figure: type (D, S, or N) of each eigenflow, for Sprint and Abilene eigenflows in order.]
- 1-6: d-eigenflows appear to be most significant in both networks.
- 5-10: s-eigenflows are next in importance.
- 12 and beyond: n-eigenflows account for the rest.

Contribution of Eigenflow Types
Fraction of total OD flow energy captured by each type of eigenflow

Contribution to Each OD Flow (Sprint)
[Figure: fraction of total energy from deterministic, spike, and noise eigenflows for each OD flow, ordered large to small.]
- Largest OD flows: strong deterministic component.
- Smallest OD flows: primarily dominated by spikes.
- Regardless of size, n-eigenflows account for a fairly constant portion.

Contribution to Each OD Flow (Abilene)
[Figure: fraction of total energy from deterministic, spike, and noise eigenflows for each OD flow, ordered large to small.]
- Largest OD flows: strong deterministic component.
- Smallest OD flows: dominated by noise, but have diurnal trends also.
- Regardless of size, spikes account for a fairly constant portion.
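Because the eigenflows are orthonormal, the energy of OD flow j splits cleanly across eigenflows as (sigma_i * V[j, i])^2. The sketch below uses that identity to compute per-flow energy fractions by type. The `labels` array is hypothetical, and this is one plausible way to compute the plotted quantity rather than the authors' confirmed procedure.

```python
import numpy as np


def energy_fraction_by_type(X, labels):
    """For each OD flow, the share of its energy carried by D, S, and N eigenflows."""
    labels = np.asarray(labels)
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    # weights_sq[i, j] = energy that eigenflow i contributes to OD flow j
    weights_sq = (s[:, None] * Vt) ** 2
    total = weights_sq.sum(axis=0)
    return {kind: weights_sq[labels == kind].sum(axis=0) / total
            for kind in ("D", "S", "N")}


rng = np.random.default_rng(9)
t, p = 150, 10
X = rng.normal(size=(t, p))                       # synthetic OD flow matrix
labels = ["D"] * 3 + ["S"] * 2 + ["N"] * (p - 5)  # made-up labelling
fractions = energy_fraction_by_type(X, labels)
fraction_sum = fractions["D"] + fractions["S"] + fractions["N"]
```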

Summary: Specific Questions
- Are there low dimensional representations for a set of OD flows?
  - 5-10 eigenflows are sufficient for a good approximation of a set of 100+ OD flows
- Do OD flows share common features?
  - The common features across OD flows are eigenflows
- What do the features look like?
  - Each eigenflow can be categorized as D, S, or N
- Can we get a high-level understanding of a set of OD flows in terms of these features?
  - Both networks: large flows are primarily diurnal
  - Sprint: small flows are primarily spikes; noise constant
  - Abilene: small flows have N and D; spikes constant

Outline
- Find intrinsic dimensionality of OD flows
- Structural Analysis
  - Decompose OD flows
  - Characterize eigenflows
  - Reconstruct OD flows
- Potential applications

Traffic Matrix Estimation
- Problem statement: Infer OD flows (X) given link measurements (Y) and routing matrix (A): $Y^T = A X^T$
- State of the art: dim(X) > dim(Y), so treat as an ill-posed linear inverse problem. Infer X on stationary (short) timescales.
- Possible approach: On longer timescales, the intrinsic dimensionality of OD flows is small, so effective dim(X) < dim(Y). TM estimation of the largest eigenflows now becomes a well-posed problem.
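The slide only outlines this idea, so the following is a sketch under strong assumptions rather than the authors' method. It supposes the top-k principal components V_k are already known (for example from an earlier period with direct flow measurements) and that X is exactly rank k. Then X = Z V_k^T, so Y^T = (A V_k) Z^T has only k unknowns per timestep and can be solved by ordinary least squares. All sizes and the random routing matrix are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
t, p, m, k = 50, 30, 12, 4           # timesteps, OD flows, links, assumed rank

# Synthetic, exactly rank-k OD flows and a random 0/1 routing matrix
X = rng.normal(size=(t, k)) @ rng.normal(size=(k, p))
A = (rng.random(size=(m, p)) < 0.3).astype(float)
Y = X @ A.T                          # link loads, i.e. Y^T = A X^T

# Assumption: the top-k PCs are known in advance (taken from X itself here)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T                       # p x k

# More links than unknowns (m > k), so the system B Z^T = Y^T is overdetermined
B = A @ V_k                          # m x k
Z_T, *_ = np.linalg.lstsq(B, Y.T, rcond=None)
X_hat = Z_T.T @ V_k.T                # estimated OD flows
```

With p = 30 unknown flows and only m = 12 links, the direct problem is underdetermined; restricting to k = 4 components is what makes it solvable in this toy setting.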

Anomaly Detection
- State of the art: Use wavelets to detrend each flow in isolation. [Barford:IMW02]
- Possible approach: Detrend all OD flows simultaneously by subtracting d-eigenflows.
[Figure: original timeseries for OD Flow #57 with its d pseudo eigenflows, and the detrended timeseries with 4σ threshold.]
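A minimal sketch of the proposed approach: remove the component carried by the d-eigenflows from every OD flow at once, then flag residual points beyond a 4-sigma threshold. Which eigenflows count as deterministic (`d_index`) is supplied by hand here, and the per-flow sigma estimate is an assumption.

```python
import numpy as np


def detrend_and_flag(X, d_index, num_sigmas=4.0):
    """Subtract the d-eigenflow component from all flows and threshold the rest."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    trend = (U[:, d_index] * s[d_index]) @ Vt[d_index, :]
    residual = X - trend
    centered = residual - residual.mean(axis=0)
    flags = np.abs(centered) > num_sigmas * residual.std(axis=0)
    return residual, flags


# Synthetic flows: one shared periodic trend, unit noise, one injected anomaly
rng = np.random.default_rng(6)
t, p = 288, 8
n = np.arange(t)
weights = rng.uniform(0.5, 1.5, size=p)
X = 100.0 * np.outer(np.sin(2 * np.pi * n / 144), weights)
X += rng.normal(size=(t, p))
X[100, 2] += 50.0                     # anomaly at timestep 100 in flow 2

residual, flags = detrend_and_flag(X, d_index=[0])
```

In the raw series the anomaly is hidden inside a trend of amplitude near 100; after detrending it stands far above the noise.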

Traffic Forecasting
- State of the art: Treat each flow timeseries independently. Use wavelets to extract trends. Build timeseries forecasting models on trends. [PTZC:INFOCOM03]
- Possible approach: Build forecasting models on d-eigenflows as trends. Allows simultaneous examination and forecasting for the entire ensemble of OD flows.
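The talk does not name a forecasting model. As a stand-in, the sketch below uses a seasonal-naive rule (repeat the last full period of each d-eigenflow) and then maps the forecast eigenflows back to every OD flow in one step. The model choice, function name, and synthetic data are all assumptions.

```python
import numpy as np


def forecast_from_d_eigenflows(X, d_index, period, horizon):
    """Forecast all OD flows by repeating the last period of each d-eigenflow."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    last_period = U[-period:, d_index]             # shape (period, num d-eigenflows)
    reps = -(-horizon // period)                   # ceiling division
    future_U = np.tile(last_period, (reps, 1))[:horizon]
    # Map forecast eigenflows back to OD flows with the same weights as before
    return (future_U * s[d_index]) @ Vt[d_index, :]


# Synthetic flows sharing a single periodic trend, observed for three periods
period, p = 24, 6
n = np.arange(3 * period)
weights = np.linspace(1.0, 2.0, p)
X = np.outer(np.sin(2 * np.pi * n / period), weights)

forecast = forecast_from_d_eigenflows(X, d_index=[0], period=period, horizon=period)
```

Only a handful of eigenflow models are needed, however many OD flows there are.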

Traffic Engineering
- Problem statement: How does one identify important traffic flows, so that they can be treated differently?
- State of the art: Measure all flows on a single link. Find heavy-hitters or elephant flows based on preset thresholds. [PTC:INFOCOM04, PTBTSC:IMW02]
- Possible approach: Look across all flows and extract common features. Taxonomize each flow into D, S, or N.

Final thoughts
- OD flows are a useful primitive for engineering networks
- Sets of OD flows have low-dimensional representations
- A Structural Analysis approach can provide useful insight into the nature of OD flows

Thanks!
- Help with Abilene data: Rick Summerhill, Mark Fullmer (Internet2); Matthew Davy (Indiana University)
- Help with Sprint-Europe data: Bjorn Carlsson, Jeff Loughridge (SprintLink); Richard Gass (Sprint ATL)

Backup slides

Principal Component Analysis
- For any given dataset, PCA finds a new coordinate system that maps maximum variability in the data to a minimum number of coordinates
- New axes are called Principal Axes or Components

Properties of Principal Components
- Each PC is in the direction of maximum (remaining) energy in the set of OD flows
- PCs are ordered by the amount of energy they capture
- [Equation annotations: mapping of X onto one PC; difference between the original data and the data mapped onto the first k-1 PCs]
- Eigenflow: the set of OD flows mapped onto a PC; a common trend
- Eigenflows are ordered from most common to least common trend
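The two annotations on this slide appear to label terms in the usual variational definition of the PCs, whose equations did not survive in the transcript. A standard statement, given here as a reconstruction and not necessarily in the slide's exact notation:

```latex
v_1 = \arg\max_{\|v\| = 1} \| X v \|,
\qquad
v_k = \arg\max_{\|v\| = 1}
      \left\| \left( X - \sum_{i=1}^{k-1} X v_i v_i^T \right) v \right\|
```

Here $Xv$ is the mapping of X onto one PC, and the term in parentheses is the difference between the original data and the data mapped onto the first k-1 PCs.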

Energy captured by each PC
[Figure: energy captured versus principal component, for Sprint 1 and Abilene.]