Blacklisting and Blocking Sources of Malicious Traffic. Athina Markopoulou, University of California, Irvine. Joint work with Fabio Soldo and Anh Le (UC Irvine) and Katerina Argyraki (EPFL).
Outline. Motivation on malicious Internet traffic: attack and defense. Two defense mechanisms: proactive (predictive blacklisting) and reactive (source-based filtering). Conclusion.
Malicious Traffic on the Internet. Compromising systems: scanning, worms, website attacks, phishing, social engineering attacks... Launching attacks: spam, click-fraud, Denial-of-Service attacks. Botnets: large groups of compromised hosts, remotely controlled.
The solution requires many components. Monitoring and detection of malicious activity, in the network and/or at hosts: signature-based, behavioral analysis. Mitigation: at the hosts, remove malicious code; in the network, block, rate-limit, or scrub malicious traffic. Internet architecture.
Defense at the edge of the network. [Figure: Networks 1 to 4 connected through routers, each network with its own logging, IDS, and firewall.] Our focus is on (1) blacklisting and (2) blocking malicious traffic.
Dshield Dataset. 6 months of IDS+firewall logs from Dshield.org (May-Oct 2008): ~600 contributing networks, 60M+ source IPs, 400M+ logs. [Figure: contributing networks send logs to Dshield.org.] Log fields: Time, Victim ID (contributor), Src IP, Dst IP, Src Port, Dst Port, Protocol, Flags. Pros: huge amount of data, diverse sample, used by many researchers. Cons: no detailed information on alerts, may include errors.
Outline. Background on malicious Internet traffic: attack and defense. Two defense mechanisms: proactive (predictive blacklisting) and reactive (source-based filtering). Conclusion.
Predictive Blacklisting. Problem definition: given past logs of malicious activity collected at various locations, predict the sources likely to send malicious traffic to each victim network in the future. Blacklist: list of the worst (e.g. top-100) attack sources. Prediction vs. detection.
Data analysis: superposition of several behaviors. [Figure: number of alerts by source (attacker) IP and day.]
A multi-level prediction model. Different predictors capture different patterns in the dataset: model temporal dynamics; model spatial correlation between victims/attackers; combine different predictors. Formulate as a recommendation system problem, in particular collaborative filtering.
Recommender systems: example. Netflix: you rate movies and you get suggestions.
Formulating Predictive Blacklisting as a Recommendation System (CF). Correspondence between recommendation systems and predictive blacklisting: users correspond to attackers, items to victims, and a user's rating to the attack volume. [Figure: R = rating matrix, sparse, with observed entries and unknown entries marked "?".] Goal: predict the rating matrix $r_{a,v}(t)$.
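A minimal sketch of how such a rating matrix could be filled from logs. It assumes each log record has been reduced to a hypothetical (day, victim_id, src_ip) tuple and that the rating $r_{a,v}(t)$ is simply the number of logs from attacker a at victim v on day t. Missing entries read as 0 here, whereas the slide treats them as unknown.

```python
from collections import Counter


def build_rating_matrix(logs):
    """Count logs per (attacker, victim, day).

    logs: iterable of (day, victim_id, src_ip) tuples (hypothetical layout).
    Returns a Counter keyed by (src_ip, victim_id, day).
    """
    return Counter((src_ip, victim_id, day) for day, victim_id, src_ip in logs)
```

A sparse dictionary is used because, as the slide's matrix suggests, most (attacker, victim) pairs are never observed.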
Predictor I: (attacker, victim) pair, temporal dynamics, $r^{TS}_{a,v}(t)$. Data analysis: attacks from the same source occur within a short time.
Predictor I: (a, v) time series, $r^{TS}_{a,v}(t)$. Data analysis: repeated attacks within short time periods. Prediction: use an EWMA (exponentially weighted moving average) model to capture this temporal trend. Accounts for the short memory of attack sources. Computationally efficient. Includes as special case t=1. [Equation annotations: past activity at time t; predicted activity.]
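The EWMA idea can be sketched as follows. This is an illustration under assumptions, not the exact formula from the slide: the function name, the recursive form, and the default smoothing weight `alpha` are all hypothetical choices.

```python
def ewma_forecast(history, alpha=0.9):
    """Forecast next-period activity for one (attacker, victim) pair.

    history: past per-period attack counts, oldest first.
    alpha: smoothing weight in (0, 1]; the default is an arbitrary choice.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecast = 0.0
    for count in history:
        # Recent periods dominate; older activity decays geometrically.
        forecast = alpha * count + (1 - alpha) * forecast
    return forecast
```

With `alpha` close to 1 the forecast depends almost entirely on the most recent period, which matches the short memory of attack sources noted on the slide.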
Predictor II: similar victims, spatial correlation. Data analysis: victims share common attackers [Katti et al., IMC 2005], [Zhang et al., USENIX Security 2008]. Our approach: [Figure: victims linked by their common attackers.]
Predictor II: similar victims, defining similarity. The similarity of victims u, v captures the number of common attackers and when they are attacked. Our approach: [Figure: victims v1 to v4 against attackers a1 to a4, with 0/1 entries 1 1 0 0; 1 1 0 0; 1 1 1 0; 0 0 1 1.]
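A small sketch of the "number of common attackers" part of this similarity, using the 0/1 example from the slide. It assumes rows are victims v1..v4 and columns are attackers a1..a4 (the orientation is not recoverable from the slide), and it ignores the "when they are attacked" part. The Jaccard normalization is an added illustration, not the slide's definition.

```python
# Attack sets read off the slide's 0/1 example, assuming rows are victims
# v1..v4 and columns are attackers a1..a4 (an assumption).
ATTACKED_BY = {
    "v1": {"a1", "a2"},
    "v2": {"a1", "a2"},
    "v3": {"a1", "a2", "a3"},
    "v4": {"a3", "a4"},
}


def common_attackers(u, v, attacked_by=ATTACKED_BY):
    """Number of attackers seen by both victims u and v."""
    return len(attacked_by[u] & attacked_by[v])


def jaccard_similarity(u, v, attacked_by=ATTACKED_BY):
    """Common attackers normalized by all attackers seen by either victim."""
    union = attacked_by[u] | attacked_by[v]
    if not union:
        return 0.0
    return len(attacked_by[u] & attacked_by[v]) / len(union)
```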
Predictor II: similar victims, k-nearest neighbors (kNN), $r^{KNN}_{a,v}(t)$. Traditional kNN: trust your peers; identify the k most similar victims (neighbors) and predict your rating based on theirs. New challenges due to time-varying ratings. Our approach: [Equation annotations: predicted activity; sum over the neighborhood of v; time series forecast given past logs; similarity between time-varying vectors.]
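The neighborhood prediction can be sketched as a similarity-weighted average of the neighbors' forecasts. The data layout (`forecasts`, `similarity`), the function name, and the normalization by the sum of similarities are assumptions made for illustration.

```python
def knn_predict(target, attacker, forecasts, similarity, k=2):
    """Predict attacker -> target activity from the k most similar victims.

    forecasts: {victim: {attacker: forecast}}, e.g. per-pair time series forecasts.
    similarity: {(target, other_victim): similarity score}.
    """
    def sim(v):
        return similarity.get((target, v), 0.0)

    # Keep the k victims most similar to the target.
    neighbors = sorted((v for v in forecasts if v != target), key=sim, reverse=True)[:k]
    total = sum(sim(v) for v in neighbors)
    if total == 0:
        return 0.0
    # Similarity-weighted average of what the neighbors expect from this attacker.
    return sum(sim(v) * forecasts[v].get(attacker, 0.0) for v in neighbors) / total
```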
Predictor III: Attackers-Victims co-clustering. Data analysis: a group of attackers consistently targets the same group of victims, and this behavior often persists over time. We used the Cross-Association (CA) method to automatically identify dense clusters of victims-attackers.
Predictor III: Attackers-Victims, prediction, $r^{\text{EWMA-CA}}_{a,v}(t)$. Intuition: pairs (a,v) in dense clusters are more likely to occur, so use the density of the cluster as the predictor. EWMA-CA: further weight by persistence over time.
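A sketch of the density idea only. It assumes the attacker and victim groups of one cluster are already given (for example by a co-clustering step, which is not implemented here) and computes the fraction of (attacker, victim) pairs in that block that were actually observed. The names are hypothetical.

```python
def cluster_density(observed_pairs, attackers, victims):
    """Fraction of (attacker, victim) cells in a cluster that were observed.

    observed_pairs: set of (attacker, victim) pairs seen in the logs.
    attackers, victims: the two sides of one cluster (assumed given).
    """
    cells = len(attackers) * len(victims)
    if cells == 0:
        return 0.0
    hits = sum((a, v) in observed_pairs for a in attackers for v in victims)
    return hits / cells
```

A pair that has not been observed yet but lies in a dense block would then receive a high score, which is the intuition stated above.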
A multi-level prediction model: summary. Different predictors capture different patterns. Temporal trends: EWMA TS of (attacker, victim). Neighborhood models: KNN (similarity of victims) and EWMA CA (interaction of attackers-victims). Combine different predictors.
Combining different predictors: weighted average, with weights proportional to the accuracy of each predictor on a pair (a,v).
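One way to read this weighted average, as a sketch: how "accuracy" is measured is not stated on the slide, so the accuracy scores are taken as given inputs, and the fallback to a plain mean is an added assumption.

```python
def combine_predictions(predictions, accuracies):
    """Accuracy-weighted average of several predictors for one (a, v) pair.

    predictions: {predictor_name: predicted activity}.
    accuracies: {predictor_name: non-negative accuracy score} (assumed given).
    """
    total = sum(accuracies[name] for name in predictions)
    if total == 0:
        # No predictor has a track record yet: fall back to a plain mean.
        return sum(predictions.values()) / len(predictions)
    return sum(accuracies[name] * predictions[name] for name in predictions) / total
```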
Performance Analysis: baseline blacklisting techniques. Local Worst Offender List (LWOL): most prolific local attackers; reactive but not proactive. Global Worst Offender List (GWOL): most prolific global attackers; might contain irrelevant attackers; non-prolific attackers are elusive to GWOL. Collaborative Blacklisting (HPB) [J. Zhang, P. Porras, J. Ullrich, Highly Predictive Blacklisting, USENIX Security 2008]: also implemented and offered as a service (HPB) by Dshield.org; methodology: use link analysis on the victims' similarity graph to predict future attacks.
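The two worst-offender baselines can be sketched with a single counting function. It assumes logs reduced to hypothetical (victim_id, src_ip) pairs and ranks sources by number of logs; "most prolific" could also be measured differently.

```python
from collections import Counter


def worst_offenders(logs, k, victim=None):
    """Top-k sources by number of logs.

    logs: iterable of (victim_id, src_ip) pairs (hypothetical layout).
    victim: restrict to one victim for a local list (LWOL-style);
            leave as None for a global list (GWOL-style).
    """
    counts = Counter(src for vic, src in logs if victim is None or vic == victim)
    return [src for src, _ in counts.most_common(k)]
```

In this toy view the global list can be dominated by sources that never touch a given victim, which is the "irrelevant attackers" caveat on the slide.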
Performance Analysis: total hit count. 60 days of Dshield logs, 5 days training, 1 day testing, BL length = 1000. The combined method significantly improves the hit count (up to 70%, 57% on average) and exhibits less variation over time. [Figure: hit count over time for the combined method, HPB, and GWOL.]
Predicting Attacks: what is the best we can do? Local Upper Bound: number of IPs in both the training and test windows of a particular contributor (in the example, LocalUB(v_i) = 3). Global Upper Bound: number of IPs in the training window of any contributor (in the example, GlobalUB = 5). [Figure: attacker-by-victim matrices for the training day t_1 and the test day t_2.]
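The hit count and the two upper bounds reduce to set intersections. The sketch below assumes training and test windows are given as hypothetical dictionaries mapping each contributor to the set of source IPs it reported, and it reads the global bound as "seen by any contributor in training and by this contributor in testing".

```python
def hit_count(blacklist, test_sources):
    """Number of blacklisted sources that actually appear in the test window."""
    return len(set(blacklist) & set(test_sources))


def local_upper_bound(train, test, victim):
    """Sources this victim saw in training that also attack it in the test window."""
    return len(train[victim] & test[victim])


def global_upper_bound(train, test, victim):
    """Sources any contributor saw in training that attack this victim in the test window."""
    seen_anywhere = set().union(*train.values())
    return len(seen_anywhere & test[victim])
```

The gap between the two bounds is what collaboration between contributors can, at best, add for one victim.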
Predicting Attacks: room for improvement. Collaboration helps! Large gap from prior methods. [Figure legend: our method (BL = 1000).]
Performance Analysis: robustness to random errors. Robustness is achieved by combining diverse methods. E.g., an attacker may send traffic to a single victim (detected by the temporal predictor) or to several victims (detected by the spatial predictors), or he can limit his attack activity.
Predictive Blacklisting as a Recommendation System: summary. Contributions: combined predictors that capture different patterns in the data; significant improvement with simple techniques, with still room for further improvement; a new formulation as a recommender system (collaborative filtering) problem, which paves the way to powerful techniques, e.g., capturing global structure (latent factors) and joint spatio-temporal models. References: F. Soldo, A. Le, A. Markopoulou, "Predictive Blacklisting as an Implicit Recommendation System," IEEE INFOCOM 2010 and on arxiv.org. In the news: MIT Technology Review, Slashdot, ACM TechNews.
How to use a list of malicious sources? A policy decision: e.g. scrub, give lower priority, block, monitor, do nothing. One option is to block (filter) malicious sources. When: during flooding attacks by million-node botnets. Where: at firewalls or at the routers.
Outline. Background on malicious Internet traffic: attack and defense. Two defense mechanisms: proactive (predictive blacklisting) and reactive (optimal source-based filtering). Conclusion.
Filtering at the routers. Access Control Lists (ACLs): match a packet header against rules, e.g. source and destination IP addresses. Source-based filter: an ACL that denies access to a source IP/prefix. Filters are implemented in TCAM: it can keep up with high speeds, but it is a limited resource. There are fewer filters than attack sources.
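As a software analogy only (routers do this matching in TCAM hardware), a source-based filter check can be sketched with Python's standard `ipaddress` module. The function name and the filter list format are assumptions.

```python
import ipaddress


def is_denied(source_ip, filters):
    """True if the packet's source address falls inside any filtered prefix.

    filters: prefixes in CIDR notation, e.g. "10.0.0.2/31" or "192.0.2.7/32".
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(prefix) for prefix in filters)
```

A single /31 entry here covers two addresses, which is why prefix filters stretch a limited filter budget further than per-address filters.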
Filter Selection at a Single Router. Tradeoff: number of filters vs. collateral damage. Filter an attack source (A.B.C.D) or filter a prefix (A.B.C.*). [Figure: attackers and legitimate users send traffic through the ISP edge router and a link of capacity C to the victim V.]
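The tradeoff can be made concrete with a small sketch that measures collateral damage as the total weight of good sources covered by a set of filters. The weights and addresses in the usage note are made up for illustration.

```python
import ipaddress


def collateral_damage(filters, good_weights):
    """Total weight of good sources that a set of prefix filters would block.

    filters: prefixes in CIDR notation.
    good_weights: {good source IP: weight}, e.g. its traffic volume.
    """
    nets = [ipaddress.ip_network(prefix) for prefix in filters]
    return sum(
        weight
        for ip, weight in good_weights.items()
        if any(ipaddress.ip_address(ip) in net for net in nets)
    )
```

For example, with bad sources 10.0.0.2, 10.0.0.3 and 10.0.0.7 and unit-weight good sources at 10.0.0.0, .1, .4, .5 and .6, the single filter 10.0.0.0/29 blocks all five good sources, while the two filters 10.0.0.2/31 and 10.0.0.7/32 block none.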
Optimal Source-Based Filtering. Design a family of filter selection algorithms that take as input: a blacklist of malicious (bad) sources; a whitelist of legitimate (good) sources; a constraint on the number of filters, Fmax; a constraint on the access bandwidth, C; and the operator's policy. The algorithms optimally select which source IP prefixes to filter so as to optimize the operator's objective subject to the constraints. So far, this has been done heuristically (through ACLs or rate limiters). [Figure: the IP address space from 0 to 2^32-1, with address A.B.C.D inside prefix A.B.C.*.]
Optimal Source-Based Filtering: a general framework. Notation: [l,r]: range in the IP space; p/l: prefix p of length l; Fmax: number of filters (<< N); a binary decision per range: whether we block range [l,r] or not; $w_i$: weight assigned to source IP address i; a cost of blocking a range [l,r].
Optimal Source-Based Filtering: expressing the operator's policy. The assignment of weights $w_i$ is the operator's knob: it indicates the volume of traffic sent, or the importance assigned by the operator. $w_i > 0$ (good source i), $w_i < 0$ (bad source i), $w_i = 0$ (indifferent). Objective function terms: cost of good sources in range [l,r]; cost of bad sources in range [l,r].
Filter Selection Algorithms: problem overview. RANGE-based (filter an IP or a range [l,r]) [Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA 09]: FILTER-ALL-RANGE, FILTER-SOME-RANGE, FILTER-ALL-DYNAMIC-RANGE. PREFIX-based (filter an IP source or a prefix) [Soldo, Markopoulou, Argyraki: INFOCOM 09, arxiv.org]: FILTER-ALL: block all malicious sources; FILTER-SOME: block some malicious sources; FILTER-ALL-DYNAMIC: BL varies over time; FLOODING: bandwidth constraint at the access router; DISTRIBUTED-FLOODING: filters at multiple routers.
Filter Selection Algorithms: algorithms overview. RANGE-based (filter an IP or a range [l,r]) [Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA 09]: FILTER-ALL-RANGE, FILTER-SOME-RANGE, FILTER-ALL-DYNAMIC-RANGE. PREFIX-based (filter an IP source or a prefix) [Soldo, Markopoulou, Argyraki: INFOCOM 09, arxiv.org]: FILTER-ALL: O(N); FILTER-SOME: O(N); FILTER-ALL-DYNAMIC: O(N); FLOODING: NP-hard, pseudo-polynomial algorithm $O(C^2 N)$ + heuristic; DISTRIBUTED-FLOODING: distributed solution following a dynamic programming approach.
Longest Common Prefix Tree of a BL. LCP-Tree(BL): binary tree whose leaves are the addresses in BL and whose intermediate nodes are their longest common prefixes. It can be found from the full binary tree of IP prefixes. E.g. for BL = {10.0.0.2, 10.0.0.3, 10.0.0.7}, the LCP-Tree(BL) is: root 10.0.0.0/29 (3 bad, 5 good addresses), with children 10.0.0.2/31 (0 good, 2 bad addresses) and leaf 10.0.0.7/32; node 10.0.0.2/31 has leaves 10.0.0.2/32 and 10.0.0.3/32. Finding a set of filters: there is no need to look at all possible sets of prefixes; it is sufficient to look only at prunings of the LCP tree, which lends itself to a dynamic programming approach.
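The intermediate nodes of the example tree can be reproduced with a short helper that computes the longest common prefix of two IPv4 addresses. This is a sketch using the standard `ipaddress` module; the function name is hypothetical.

```python
import ipaddress


def longest_common_prefix(ip_a, ip_b):
    """Longest IPv4 prefix that contains both addresses."""
    a = int(ipaddress.IPv4Address(ip_a))
    b = int(ipaddress.IPv4Address(ip_b))
    # Bits at and below the highest differing bit are not part of the prefix.
    length = 32 - (a ^ b).bit_length()
    mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    return ipaddress.IPv4Network((a & mask, length))
```

Applied to the blacklist above, 10.0.0.2 and 10.0.0.3 meet at 10.0.0.2/31, and 10.0.0.2 and 10.0.0.7 meet at 10.0.0.0/29, the root of the example tree.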
Filter-All-Prefix: problem statement. Given: a blacklist BL, a weight $w_i$ for each good IP i, and Fmax filters; choose: prefixes p/l ($x_{p/l}$); so as to: filter all bad addresses and minimize collateral damage.
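Filter-All-Prefix: dynamic programming algorithm. The recursion computes the cost of the optimal allocation of F filters within a prefix p, whose children in the LCP tree are $s_L$ and $s_R$: $F-n \ge 1$ filters within the left subtree and $n \ge 1$ filters within the right subtree, for $n = 1, \dots, F-1$. Requiring at least one filter in each subtree means that we want to block all malicious sources (leaves).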
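A compact sketch of a dynamic program of this kind, written from the description above rather than taken from the authors' implementation. It assumes unit weights for good addresses, allows "at most" f_max filters, and walks the LCP tree implicitly over the sorted blacklist; it makes no attempt to match the O(N) running time quoted for FILTER-ALL.

```python
import ipaddress
from functools import lru_cache


def filter_all_min_damage(blacklist, good, f_max):
    """Fewest good addresses blocked when at most f_max prefix filters
    must cover every blacklisted address (unit weights assumed)."""
    if f_max < 1:
        raise ValueError("need at least one filter")
    bad = sorted({int(ipaddress.IPv4Address(ip)) for ip in blacklist})
    if not bad:
        return 0
    good_ints = {int(ipaddress.IPv4Address(ip)) for ip in good} - set(bad)

    def good_inside(lo, hi):
        # Good addresses covered by the longest common prefix of lo and hi.
        size = 1 << (lo ^ hi).bit_length()
        start = lo & ~(size - 1)
        return sum(start <= g < start + size for g in good_ints)

    @lru_cache(maxsize=None)
    def cost(i, j, f):
        # bad[i:j] are the leaves below one LCP-tree node; f >= 1 filters.
        whole = good_inside(bad[i], bad[j - 1])
        if j - i == 1 or f == 1:
            return whole  # a single filter on this node's prefix
        top_bit = (bad[i] ^ bad[j - 1]).bit_length() - 1
        mid = next(k for k in range(i, j) if (bad[k] >> top_bit) & 1)
        best = whole
        for n in range(1, f):  # n filters on the right, f - n on the left
            best = min(best, cost(i, mid, f - n) + cost(mid, j, n))
        return best

    return cost(0, len(bad), f_max)
```

On the three-address blacklist from the LCP-tree example, with the five other addresses of 10.0.0.0/29 treated as good, one filter costs 5 good addresses and two filters cost 0.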
Filter-All-Prefix DP algorithm: example with N = 10, Fmax = 4. [Figure: LCP tree of the example blacklist; prefixes shown: 0/1, 32/5, 57/6, 58/6.]
Filter-Some-Prefix: example with N = 10, Fmax = 4. [Figure: LCP tree of the example blacklist; prefixes shown: 32/5, 3/6, 57/6, 58/6.]
Filter-All-Prefix-Dynamic: time-varying case, with N = 10, Fmax = 4. Need to be (re)computed: $O(F_{max} \log N)$. [Figure: LCP tree of the example blacklist with per-node values.]
FLOODING: problem statement. Given: a blacklist BL, a whitelist WL, a weight for each address equal to the traffic volume it generates, a constraint on the link capacity C, and Fmax filters; choose: source IP prefixes, $x_{p/l}$; so as to: minimize the collateral damage and fit the total traffic within the link capacity C.
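To make the problem statement concrete, the sketch below evaluates a candidate filter set against both constraints and then finds the best set by exhaustive search over a given list of candidate prefixes. Exhaustive search is used here only because it is easy to verify on toy inputs; it is not the dynamic programming algorithm of the talk, and all names and data layouts are assumptions.

```python
import ipaddress
from itertools import combinations


def evaluate(filters, volume, whitelist, capacity, f_max):
    """Return (feasible, collateral damage) for one candidate filter set.

    volume: {source IP: traffic volume}; whitelist: set of good source IPs.
    """
    nets = [ipaddress.ip_network(prefix) for prefix in filters]
    blocked = {ip for ip in volume if any(ipaddress.ip_address(ip) in n for n in nets)}
    passed = sum(v for ip, v in volume.items() if ip not in blocked)
    damage = sum(volume[ip] for ip in blocked if ip in whitelist)
    return len(filters) <= f_max and passed <= capacity, damage


def brute_force_flooding(candidates, volume, whitelist, capacity, f_max):
    """Cheapest feasible subset of candidate prefixes, or None if none fits."""
    best = None
    for k in range(f_max + 1):
        for chosen in combinations(candidates, k):
            feasible, damage = evaluate(chosen, volume, whitelist, capacity, f_max)
            if feasible and (best is None or damage < best[0]):
                best = (damage, chosen)
    return best
```

With a tight capacity and a single filter, the search is forced onto a coarse prefix that also drops good traffic; a second filter lets it cover the bad sources exactly.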
FLOODING: DP algorithm. FLOODING is NP-hard: reduction from knapsack with a cardinality constraint (1.5K). An optimal pseudo-polynomial dynamic programming algorithm solves the problem in $O((C \cdot F_{max})^2 N)$: it is similar to the previous DP but solves a 2-dimensional KP; the LCP-Tree includes both good and bad addresses; the DP is extended to take into account the capacity constraint. A heuristic: adjust the granularity (ΔC > 1) of C.
Distributed Flooding: filters at several routers. Deploying filters at several routers increases the total filter budget. Each router (u) has its own view of good/bad traffic, capacity on its incoming link, and filter budget. Filtering at several routers: decide not only which prefix to block but also on which router. Solution: can be solved in a distributed way; outperforms independent decisions. [Figure: attackers reach the victim through several routers.]
Evaluation using Dshield data: FLOODING vs. rate limiting. Attack sources: from the point of view of a single victim in Dshield. Good sources: [Kohler et al. TON 06, Barford et al. PAM 06]. Before the attack: good traffic was C/10 < C. During the attack: bad traffic is 10C. [Figure axis: CD/N.] Optimal filter selection preserves the good traffic and drops the bad.
Intuition: why optimization helps compared to non-optimized filtering. Malicious sources are clustered in the IP address space. Malicious sources are not co-located with legitimate sources. Filtering can block IP prefixes with malicious sources without penalizing (many) legitimate sources.
Evaluation using Dshield data (2): FILTER-ALL-PREFIX vs. generic clustering algorithms. Malicious addresses: those attacking 2 specific victim networks (most and least clustered) in the Dshield dataset. Good addresses: generated using a multifractal model [Kohler et al. TON 06, Barford et al. PAM 06]. Optimal filter selection outperforms generic clustering.
Evaluation using Dshield data (3): DISTRIBUTED-FLOODING, the value of coordination. [Figure axis: CD/N.] Coordination among routers helps.
Optimal Source-Based Filtering: summary. Framework for optimal filter selection: defined various filtering problems and designed efficient algorithms to solve them. Leads to significant improvements on real datasets compared to non-optimized filter selection, to generic clustering, or to uncoordinated routers, because of the clustering of malicious sources.
Outline. Background on malicious Internet traffic: attack and defenses. Two defense mechanisms: proactive (blacklisting as a recommendation system) and reactive (filtering as an optimization problem). Conclusion: both are parts of a larger system that collects and analyzes data from multiple sensors and takes appropriate action.
Thank you! athina@uci.edu http://newport.eecs/uci.edu/~athina