Detecting Network Anomalies. Anant Shah



Similar documents
Structural Analysis of Network Traffic Flows Eric Kolaczyk

Mining Anomalies in Network-Wide Flow Data. Anukool Lakhina, Ph.D. with Mark Crovella and Christophe Diot

Conclusions and Future Directions

Design and simulation of wireless network for Anomaly detection and prevention in network traffic with various approaches

PART III. OPS-based wide area networks

AUTONOMOUS NETWORK SECURITY FOR DETECTION OF NETWORK ATTACKS

IP Forwarding Anomalies and Improving their Detection using Multiple Data Sources

Cisco IOS Flexible NetFlow Technology

CS 5480/6480: Computer Networks Spring 2012 Homework 4 Solutions Due by 1:25 PM on April 11 th 2012

Distance Vector Routing Protocols. Routing Protocols and Concepts Ola Lundh

Structural Analysis of Network Traffic Flows

Machine Learning in Statistical Arbitrage

Netflow Overview. PacNOG 6 Nadi, Fiji

Detecting Flooding Attacks Using Power Divergence

Denial of Service and Anomaly Detection

Network Management & Monitoring

HOW TO PREVENT DDOS ATTACKS IN A SERVICE PROVIDER ENVIRONMENT

Network Monitoring and Management NetFlow Overview

Vocia MS-1 Network Considerations for VoIP. Vocia MS-1 and Network Port Configuration. VoIP Network Switch. Control Network Switch

Diagnosing Network Disruptions with Network-Wide Analysis

Flow Monitoring With Cisco Routers

Characterization of Network-Wide Anomalies in Traffic Flows

Introduction to Netflow

Answers to Sample Questions on Network Layer

Question 1. [7 points] Consider the following scenario and assume host H s routing table is the one given below:

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

INCREASE NETWORK VISIBILITY AND REDUCE SECURITY THREATS WITH IMC FLOW ANALYSIS TOOLS

Load Balancing by MPLS in Differentiated Services Networks

Inter-domain Routing Basics. Border Gateway Protocol. Inter-domain Routing Basics. Inter-domain Routing Basics. Exterior routing protocols created to:

4 Internet QoS Management

RID-DoS: Real-time Inter-network Defense Against Denial of Service Attacks. Kathleen M. Moriarty. MIT Lincoln Laboratory.

Internet Firewall CSIS Packet Filtering. Internet Firewall. Examples. Spring 2011 CSIS net15 1. Routers can implement packet filtering

STANDPOINT FOR QUALITY-OF-SERVICE MEASUREMENT

Time-Frequency Detection Algorithm of Network Traffic Anomalies

Router and Routing Basics

Review Jeopardy. Blue vs. Orange. Review Jeopardy

On Entropy in Network Traffic Anomaly Detection

Increasing Reliability in Network Traffic Anomaly Detection

Introducing FortiDDoS. Mar, 2013

Region 10 Videoconference Network (R10VN)

Joint Entropy Analysis Model for DDoS Attack Detection

Classful IP Addressing (cont.)

Java Modules for Time Series Analysis

Internetworking II: VPNs, MPLS, and Traffic Engineering

Chapter 10 Link-State Routing Protocols

Combining Statistical and Spectral Analysis Techniques in Network Traffic Anomaly Detection

Enhancing Network Monitoring with Route Analytics

2004 Networks UK Publishers. Reprinted with permission.

configure WAN load balancing

Border Gateway Protocol (BGP-4)

Exterior Gateway Protocols (BGP)

Radware s Attack Mitigation Solution On-line Business Protection

How To Stop A Ddos Attack On A Network From Tracing To Source From A Network To A Source Address

Characteristics of Network Traffic Flow Anomalies

Understanding and Optimizing BGP Peering Relationships with Advanced Route and Traffic Analytics

Data Fusion Enhancing NetFlow Graph Analytics

2. What is the maximum value of each octet in an IP address? A. 28 B. 255 C. 256 D. None of the above

APNIC elearning: BGP Basics. Contact: erou03_v1.0

Detecting Anomalies in Network Traffic Using Maximum Entropy Estimation

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Network Monitoring Using Traffic Dispersion Graphs (TDGs)

co Characterizing and Tracing Packet Floods Using Cisco R

Application Notes for Configuring a SonicWALL VPN with an Avaya IP Telephony Infrastructure - Issue 1.0

Analysis of Network Packets. C DAC Bangalore Electronics City

VegaStream Information Note Considerations for a VoIP installation

ECE 358: Computer Networks. Solutions to Homework #4. Chapter 4 - The Network Layer

The Coremelt Attack. Ahren Studer and Adrian Perrig. We ve Come to Rely on the Internet

Monitoring sítí pomocí NetFlow dat od paketů ke strategiím

TD 271 Rev.1 (PLEN/15)

Layer 3 Routing User s Manual

Anomaly detection. Problem motivation. Machine Learning

Component Ordering in Independent Component Analysis Based on Data Power

NetFlow/IPFIX Various Thoughts

modeling Network Traffic

R2. The word protocol is often used to describe diplomatic relations. How does Wikipedia describe diplomatic protocol?

Troubleshooting and Maintaining Cisco IP Networks Volume 1

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Routing. Static Routing. Fairness. Adaptive Routing. Shortest Path First. Flooding, Flow routing. Distance Vector

Introduction to Performance Measurements and Monitoring. Vinayak

Datagram-based network layer: forwarding; routing. Additional function of VCbased network layer: call setup.

Internet Packets. Forwarding Datagrams

Contribution of the French MetroSec

Network Operations Analytics

From Active & Programmable Networks to.. OpenFlow & Software Defined Networks. Prof. C. Tschudin, M. Sifalakis, T. Meyer, M. Monti, S.

Technology Overview. Class of Service Overview. Published: Copyright 2014, Juniper Networks, Inc.

Network Measurement. Why Measure the Network? Types of Measurement. Traffic Measurement. Packet Monitoring. Monitoring a LAN Link. ScienLfic discovery

Transcription:

Detecting Network Anomalies using Traffic Modeling Anant Shah

Anomaly Detection Anomalies are deviations from established behavior In most cases anomalies are indications of problems The science of extracting deviations (and potentially uncovering problems) is called Anomaly Detection 1

Focus: Network Anomalies Network operators often face challenges such as DoS, Scans, Worms, Outages, etc. They manifest themselves as deviations in network traffic Sometimes these deviations are obvious to observe, sometimes they are hidden 2

Obvious Network Anomaly 3

Sometimes Network Anomalies are Hidden How many Anomalies can you point out? Border Router of Institution Let s model the degree of randomness: Entropy of destination IP and Port What factors contributed to the successful detection of this anomaly? 4

How was this Anomaly Detected? Count based metrics #bytes,#packets Simple plot of time series didn t reveal the anomaly Dataused is header fields destination IP and Port Scope of analysis is single incoming link Model of Entropy made the anomaly stand out 5

Network Anomaly Detection Dimensions Data Network Traffic and Statistics Bytes/sec, Packets/sec, Header fields, etc. Scope Per Link (Temporal analysis) All links at a time (Spatio-Temporal analysis) Scope Data Model Model Model network traffic characteristics Exploit an intrinsic characteristic of data that the anomaly impacted Frequency, Entropy, Principal Components, etc. 6

Interesting Questions to Answer What data to use? -Variety in available data is limited to data collection in operational networks What scope to focus on? -Choices range from one link to all the links in the network Which modeling technique to use? - Challenge to match model to anomaly - Most interesting aspect of anomaly detection 7

Taxonomy Data Sketches, Sampling PCA Data Reduction Probability Transformation Correlation Temporal Dynamics Entropy Frequency, Distribution Detection 8

Example Data Sketches, Sampling PCA Data Correlation Reduction Transformation Temporal Probability Dynamics Entropy Frequency, Distribution Detection Subspace Method by Lakhina et al: Detection EWMA Roadmap of the method Correlation Data Reduction Data Principal Components Sampling SNMP 9

Single Link Methods 10

1 Feature Vector Method Data {Variance, Count Metrics, Single Link} Scope Model Detection Variance Data Reduction Sampling Data Count Metrics 11

1 Generating Envelopes How to capture dynamic nature of traffic? -Use envelopes 12

1 Expected behavior as Envelopes Output 1 for packets per minute Packets per minute For each metric find which envelope the observed value falls and output corresponding integer Time Some other metrics: Number of broadcast packets, Load on adjacent network, percentage bandwidth used, etc 13

1 Anomaly Vectors State of network Matching Algorithm Fault Feature Vector Expected deviation of anomaly If distance is < 3 then there is a match Distance: (1,1)=> 0 (1,1)=> 0 (2,2)=> 0 (0,0)=> 0 (0,-1)=> 1 Anomaly Detected Sum=1 14

1 Results Authors analyze their department LAN data Best detection for broadcast storm Success for hardware faults is low Authors attribute this to lack of more metrics Claim addition of more metrics may help detecting hardware faults ROC Curve 15

Comments: Feature Vector Method Generating a good feature vector for an anomaly is very difficult and there are large number of anomalies Will not work for anomalies that do not cause significant volume changes Predictions are based on previous days data Intro Dimensions Method Conclusion Good for enterprise wide LAN but not a solution for ISPs 1 16

2 Signal Analysis Method Data {Frequency, Count Metrics, Single Link} Scope Model Detection Deviation Score Temporal Dynamics Frequency Data Reduction Sampling Data SNMP, Count Metrics 17

2 Why Signal Analysis? Some anomalies are instantaneous with sudden changes in the count metrics Simple variance may not be able to capture these in given time window Need: Segregate instantaneous changes with slow varying trends -Analyzing time series as a signal can help 18

2 Time Series Decomposition Each band captures different traffic variations We can now focus on just high and mid band to find instantaneous deviations Flows per second Days 19

2 Signal Analysis Method Overview Step 1: Calculate variance in each band using a sliding window Step 2: Combine these local variances using weighted sum to give deviation score Step 3: Apply simple threshold on deviation score 20

2 Signal Analysis Method Overview DoS Packets per second Deviation Score Day 21

2 Wavelet method shows good results for sharp instantaneous anomalies E.g., DoS, Link Outage Results Anomalies that impact a specific set of flows, could not be detected E.g., Port Scan 22

2 Comments: Signal Analysis Method Large anomalies may mask other subsequent small anomalies Selecting appropriate deviation threshold is challenging, since it will be different for different networks Need: Method to exploit flow level characteristics 23

3 Equilibrium Method Data {Distribution, IP Flows, Single Link} Scope Model Detection Error Bounds Temporal Dynamics Fitting Distribution Data Reduction Sampling Data IP Flows 24

3 Why Equilibrium Property? A Short Timescale Uncorrelated Traffic Equilibrium: ASTUTE What we have so far: Anomalies that impact traffic dispersion: Use Variance Anomalies that are instantaneous: Use Frequency These are just count metrics, how to find anomalies in flows? Need: A way to detect anomalies that impacts volume of flows w.r.t each other Observation: Over a short duration of time (<15mins) IP Flows are independent and stationary Intuitively: Independent flows cancel each other out 25

3 ASTUTE Method Example CLT : for large F, the AAV * has a standard Gaussian distribution A toy example : K = 2 i i+1 probability False positives False positives 0 ^ 3 flows δ = 1/3 +2 σ^ -1 2 = 7/3 AAV: K(F) 0.378 -K AAV K *AAV: Astute Assessment Value 26

3 ASTUTE v/s Wavelet Method Unlike wavelet method detection is based on a flow s correlation with other flows Best performance for port scan and prefix outage Poor performance for DoS detection 27

Single Link Methods Summary Summary Variance Use signal analysis Single parameter variance will not capture all traffic dynamic Need IP details and detection for non large volume change Use Variance envelopes of multiple parameters Use Equilibrium property Need better ways to capture instantaneous spikes 28

Single Link Methods Summary Summary Model Target anomaly class Best detected anomalies (+)Pros (-)Cons Variance envelopes Affecting dispersion Broadcast Storm Accounts for daily patterns Frequency Instantaneous DoS Segregation of highly variable data Equilibrium Affectingflow correlations Prefixoutage Noa-priori knowledge needed Needs significant volume changes Small anomalies can get masked Keeping per flow state not feasible 1. How to detect anomalies that span multiple links? 2. Can we separate out anomalous space? 29

Network Wide Methods 30

4 Subspace(PCA) Method Data {Principal Components, SNMP, Multi Link} Scope Model Detection EWMA Correlation PCA Data Reduction Sampling Data SNMP 31

4 Why Subspace Method? Data from multiple links Allows analysis of variance across all links Extracts principal components in decreasing order of variance, which allows you to define subspaces 32

4 Basic Understanding of PCA Y PC-1 captured most variance If there were n dimensions we would have n PCs In decreasing order of variance captured X 33

4 Dimensionality Reduction: PCA Most variance is captured by first 3 or 4 PC Separate into normal and anomalous subspace 34

4 Detecting Anomalies in Subspace First few components represent Normal Subspace Later components represent Anomalous Subspace 35

5 Subspace Entropy Compound Method Data {Entropy+PCA, IP Flows, Multi Link} Scope Model Detection EWMA Correlation PCA Probability Entropy Data Reduction OD Flows Data IP Flows 36

5 Origin Destination Flows A relation between links and flows A B C E All traffic that enters the network from a common ingress point and departs from a common egress point Routing Matrix A: #links x #OD flows Where A ij = 1 if flow j traverses link i D Example OD Flow AE AF Paths included A-B-C-E A-B-D-C-E A-B-C-E-F A-B-D-C-E-F A-B-C-F A-B-D-C-F F 37

5 Origin Destination Flows 38

5 Multi-way Subspace Method For all OD Flows calculate entropy of header fields Tile all 4 matrices and apply subspace method This reduces dimensionality Analyze residual vectors for anomalies 39

5 Results Detected by Subspace method Input: Byte count Detected by Entropy+Subspace method Input: Entropy of OD Flows (Aggregated IP Flows) 40

6 Comments: Multi-way Subspace Feature metrics capture different anomalies than volume ones Manual inspection is needed to derive anomalous IP flows from OD flows PCA is computationally heavy, need reduce data before using subspace Solution: Use Sketches! 41

6 Subspace Entropy Compound Method Data {Sketches+Entropy+PCA, IP Flows, Multi Link} Scope Model Detection EWMA Correlation PCA Probability Entropy Data Reduction Sketches Data IP Flows 42

6 What is a Sketch, Why use it? Keeping per flow state is not feasible A method to detect anomalies in massive data sets is needed Should work below any general detection method Sketch is a projection of IP Flows on a smaller space Sketch preserves the characteristics of original data 43

6 Sketches + Entropy + Subspace Build multiple sketches of feature entropies Applying this compound model: 1.Compute Local Sketches of entropiesat each router for 4 header fields 2. Compute Global Sketches and find entropies Combine entries belonging to same hash Intuitively: We have aggregated IP flows at all routers that hashed to same value 3. Detect anomalies by subspace analysis 44

6 Results This method detects different anomalies than detected by just Entropy method Detected by Entropy+Subspace method Input: Entropy of OD Flows (Aggregated IP Flows) Detected by Sketches + Entropy+Subspace method Input: IP Flow Sketches 45

Summary Network Wide Methods Summary Subspace Does not find anomalous flows Use OD Flows with Entropy Model Need IP Flow level details with reduction in search space Use Sketches before applying PCA and Entropy 46

Summary Network Wide Methods Summary Model Target anomaly class Best detected anomalies (+)Pros (-)Cons Subspace Network wide volume changes Alpha flows Exploitcorrelations among network links Does not drill down to anomalous flows OD flows + Entropy + Subspace Behavioraldistribution changes Port/Network Scans Detect anomalous flows Needs topology information Sketches + Entropy + Subspace Long lasting behavioral changes in IP flows Port/Network Scans Detect anomalous flows with less false positives Sensitive to choice of sketch keys 47

General Limitations Is sampled data sufficient? Sampling reduces the effectiveness of modeling methods For change in 10% sampling rate more than half anomalies were not detected Common assumptions may not be valid Availability of clean data False positives are okay Synthetic anomalies are representative Interpretation of anomalies is left up to the operator 48

Conclusion The problem space comprises of 3 dimensions Data, Scope and Model Each method can be broken down in multiple layers Multiple layers can used together to achieve multiple goals such as: Targeting different set of anomalies Reduction in dimensionality Achieve better granularity 49

Thank You Any Questions? 50