Detecting Flooding Attacks Using Power Divergence

Jean Tajer
IT Security for the Next Generation, European Cup, Prague, 17-19 February 2012

Agenda

1- Introduction
2- K-ary Sketch
3- Detection Threshold
4- Power Divergence
5- Experimental Results
   - Traffic behavior under normal conditions and DDoS attacks
   - SYN traffic under Power Divergence and dynamic threshold
   - Receiver Operating Characteristic (ROC)
6- Conclusion

Introduction

This paper deals with the detection of flooding attacks, the most common type of Denial of Service (DoS) attack. We propose a new framework that detects flooding attacks by applying Power Divergence to a Sketch data structure. The performance of the framework is evaluated in terms of detection probability, false alarm ratio, and the receiver operating characteristic (ROC). We focus on tuning the order parameter of the Power Divergence to optimize performance. The analysis is conducted on publicly available real IP traces into which flooding attacks have been injected. The results indicate that the proposed algorithm outperforms the Hellinger distance, the existing solution it is compared against.

Detection threshold

To differentiate network anomalies from normal behavior, a detection threshold must be applied to the Power Divergence. Instead of a static threshold, we use a dynamic one based on Jacobson's fast algorithm for estimating the RTT mean and variation.

Let PD(n) be the current value of the Power Divergence. The algorithm maintains the current and next exponentially smoothed average estimates of the Power Divergence. The deviation is the difference between the current measure PD(n) and the current average estimate, and an exponentially smoothed average of this deviation is maintained in the same way. The estimated threshold Thre(n + 1) is then computed from the smoothed average and the smoothed deviation. The smoothing and weighting parameters are all modifiable and can be adjusted numerically to improve the detection precision.
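The threshold update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact formula from the slides: it follows the classic form of Jacobson's estimator (smoothed mean plus a multiple of the smoothed absolute deviation). The names `gain_mean`, `gain_dev`, and `k` are hypothetical, and their defaults (1/8, 1/4, 4) are the values traditionally used for TCP retransmission timers, not values taken from this work.

```python
from dataclasses import dataclass


@dataclass
class DynamicThreshold:
    """Jacobson-style adaptive threshold (illustrative; names are assumptions)."""
    gain_mean: float = 0.125  # smoothing gain for the average (TCP default)
    gain_dev: float = 0.25    # smoothing gain for the deviation (TCP default)
    k: float = 4.0            # deviation multiplier (TCP default)
    mean: float = 0.0         # exponentially smoothed average of PD
    dev: float = 0.0          # exponentially smoothed absolute deviation

    def update(self, pd_n: float) -> float:
        """Fold PD(n) into the estimates and return Thre(n + 1)."""
        err = pd_n - self.mean                        # deviation from the average
        self.mean += self.gain_mean * err             # next smoothed average
        self.dev += self.gain_dev * (abs(err) - self.dev)  # next smoothed deviation
        return self.mean + self.k * self.dev


def detect(pd_series, threshold=None):
    """Return the indices n where PD(n) exceeds the threshold predicted for n."""
    thr = DynamicThreshold() if threshold is None else threshold
    alarms = []
    current = float("inf")  # no alarm is possible on the very first sample
    for n, value in enumerate(pd_series):
        if value > current:
            alarms.append(n)
        current = thr.update(value)
    return alarms
```

For example, `detect([0.1] * 20 + [1.0])` flags only the final spike, because the threshold tracks the flat baseline from above before the jump.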

K-ary Sketch

A Sketch generates a fixed number of time series for anomaly detection. It provides a finer-grained analysis than aggregating the whole traffic into a single time series.

The Sketch data structure is used for dimensionality reduction. It is based on the random aggregation of a traffic attribute (e.g. the number of packets) in different hash tables. A Sketch S is a 2D array of H × K cells (see figure), where K is the size of each hash table and H is the number of mutually independent hash functions (universal hash functions).

Each item is identified by a key κn and associated with a reward value νn. For each newly arriving item (κn, νn), the value νn is added to the cell S[i][j], where i is the index of the ith hash table (0 ≤ i ≤ H − 1) and j = hi(κn) is the hash of the key under the ith hash function. Data items whose keys hash to the same value are aggregated in the same cell, and their values are added to the existing counter in that cell.

Each hash table (each row) is used to derive a probability distribution, defined as the ratio of the counter in each cell to the sum of all cells in the row. The derived probability distributions (H in total, one per row) are used as inputs to the divergence measures.
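The update and normalization steps described above can be sketched as follows. This is an illustrative Python version, not the implementation used in the experiments: the class name `KarySketch`, the linear hash family modulo a Mersenne prime, and the use of CRC32 to turn keys into integers are all assumptions made for the example.

```python
import random
import zlib


class KarySketch:
    """H x K array of counters, one row per hash function (illustrative)."""

    PRIME = 2**61 - 1  # Mersenne prime used as the modulus of the hash family

    def __init__(self, H=4, K=1024, seed=0):
        rng = random.Random(seed)
        self.H, self.K = H, K
        # One (a, b) pair per row defines h_i(x) = ((a*x + b) mod p) mod K.
        self.coeffs = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                       for _ in range(H)]
        self.table = [[0] * K for _ in range(H)]

    def _hash(self, i, key):
        x = zlib.crc32(str(key).encode())  # map any key to a 32-bit integer
        a, b = self.coeffs[i]
        return ((a * x + b) % self.PRIME) % self.K

    def update(self, key, value=1):
        """Add `value` to cell S[i][h_i(key)] in every row i."""
        for i in range(self.H):
            self.table[i][self._hash(i, key)] += value

    def distributions(self):
        """Return one probability distribution per row (H in total)."""
        result = []
        for row in self.table:
            total = sum(row)
            result.append([c / total if total else 0.0 for c in row])
        return result
```

Calling `update(destination_ip)` once per SYN packet in a time interval and then `distributions()` yields the H per-row distributions that feed the divergence measure. A fresh sketch would be used for each interval.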

Power Divergence

The approach used in this paper to detect DDoS attacks is based on a probabilistic decision measure. The idea is to estimate the subjective prior distribution of the traffic and to use it as a baseline probability distribution, denoted by q = [q1, ..., qn]. In the presence of attacks, the probability distribution changes, and this change can be used to detect the attacks. However, the distribution also changes with ordinary traffic variations, even in the absence of attacks. Alarms raised by such benign changes are false alarms. The objective is therefore to find a method that detects the attacks while suppressing the false alarms.

This motivates the need for a quantitative measure of information, or more generally a decision-theoretic measure of divergence between the baseline distribution q and some other distribution p. For this work we choose the Power Divergence. It is a measure of the distance between two probability distributions, parameterized by an order β and defined through Ep, the expectation with respect to the posterior probability distribution p.

This divergence has some interesting special cases. For β = 0.5 it is proportional to the squared Hellinger distance between p and q, while for β = 1 it is equal to the Kullback-Leibler (KL) measure. Because the Power Divergence includes both as special cases, tuning β can match or improve on the KL and Hellinger measures.
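The two special cases can be checked numerically. The sketch below assumes one common form of the power divergence that is consistent with both of them, PD(p, q) = (Σ p_i^β q_i^(1−β) − 1) / (β(β − 1)), which equals (Ep[(p/q)^(β−1)] − 1) / (β(β − 1)). The symbol β, the exact normalization, and the `eps` floor for empty cells are assumptions of this example and may differ from the definition used in the original work.

```python
import math


def power_divergence(p, q, beta, eps=1e-12):
    """Power divergence of order beta between distributions p and q.

    Assumed form: (sum_i p_i**beta * q_i**(1 - beta) - 1) / (beta * (beta - 1)).
    The limit beta -> 1 is the Kullback-Leibler divergence.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Floor every cell at eps so empty sketch cells do not divide by zero.
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    if abs(beta - 1.0) < 1e-9:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if abs(beta) < 1e-9:
        raise ValueError("beta = 0 is a separate limit not handled here")
    s = sum(pi ** beta * qi ** (1.0 - beta) for pi, qi in zip(p, q))
    return (s - 1.0) / (beta * (beta - 1.0))
```

Under this form, β = 0.5 gives 4 · (1 − Σ √(p_i q_i)), which is four times the squared Hellinger distance when that distance is defined as 1 − Σ √(p_i q_i). In a sketch-based detector, p would be a row distribution from the current interval and q the corresponding baseline.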

Experimental Results

We present performance analysis results for the Power Divergence detection algorithm applied over a Sketch to detect SYN flooding attacks.

We use real Internet traces from the MAWI trans-Pacific link, captured on 15/04/2010 from 12h00 to 18h15, as a few hours in the life of the Internet on which to test the algorithms. IP addresses in the traces are scrambled by a modified version of the tcpdpriv tool, but the correlation between addresses is preserved.

We analysed these 6 hours and 15 minutes of wide area network traces using the Sketch technique. The sketch key is the destination IP address (κi = DIP), and the reward is νi = 1 for SYN requests and zero otherwise.

To simulate distributed SYN flooding attacks, we inject 9 real DDoS TCP SYN flooding attacks of different intensities into the MAWI public traces (tcpdump files). Each attack lasts 10 minutes and is followed by a 30-minute gap, so attacks start at t = 31, 71, 111, etc.

Fig. 2 illustrates the intensity of the injected SYN flooding attacks. The intensity is not constant: it starts at 10000 and decreases to 2000.
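The injection schedule can be written down explicitly, which is also a convenient way to build ground-truth labels for evaluation. The sketch below assumes that t is measured in minutes from the start of the trace and that each attack covers 10 consecutive one-minute bins. The constant and function names are hypothetical.

```python
TRACE_MINUTES = 6 * 60 + 15  # 12h00 to 18h15
FIRST_START = 31             # first attack starts at t = 31
ATTACK_MINUTES = 10          # each attack lasts 10 minutes
GAP_MINUTES = 30             # quiet period between two attacks
N_ATTACKS = 9


def attack_windows():
    """Return (start, end) minute pairs, end exclusive, for the 9 attacks."""
    period = ATTACK_MINUTES + GAP_MINUTES
    return [(FIRST_START + k * period,
             FIRST_START + k * period + ATTACK_MINUTES)
            for k in range(N_ATTACKS)]


def ground_truth():
    """Per-minute labels: True while an attack is active."""
    labels = [False] * (TRACE_MINUTES + 1)
    for start, end in attack_windows():
        for t in range(start, end):
            labels[t] = True
    return labels
```

With these values the ninth attack starts at t = 351 and ends before the trace does at t = 375.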

Traffic behavior under normal conditions and DDoS Attacks

Fig. 3 and Fig. 6 show the variation of the total number of packets (TCP, UDP and ICMP) before and after the SYN flooding attacks. Similarly, Fig. 4 and Fig. 7 show the variation of the number of TCP packets before and after the SYN flooding attacks. In both pairs of figures, the traffic has a similar shape before and after the attacks. This is because the intensity of the SYN flooding is small compared to the total number of packets. In such cases, detecting the attack is very challenging.

Fig. 5 and Fig. 8 show the variation of the number of SYN packets before and after the SYN flooding attacks. Here the two figures have different shapes. This is because the intensity of the SYN flooding attacks is high compared to the total number of SYN packets under normal conditions.

SYN traffic under Power Divergence and dynamic Threshold (1)

We ran the analysis for several values of the Power Divergence order β. Due to space limits, this section reports results for only two values: β = 0.5 and β = 1.5. We found β = 1.5 to be the best-performing value and compare it with β = 0.5, for which the Power Divergence corresponds to the Hellinger distance used in the literature. The two parameters of the dynamic threshold were both set to 0.8.

1) Power Divergence behavior for β = 0.5 and β = 1.5

As described before, with β = 0.5 the Power Divergence corresponds to the Hellinger Distance (HD). Fig. 9 illustrates the behavior of the SYN traffic with the SYN flooding attacks under the Power Divergence technique. With this value of β, the Power Divergence does not detect all 9 SYN flooding attacks: it detects the first 7 but misses the last 2, at t = 310 and t = 350.

Fig. 10 shows the same traffic under the Power Divergence with β = 1.5. With this value, all 9 attacks are detected. We conclude that β = 1.5 is better suited to this traffic than β = 0.5.

SYN traffic under Power Divergence and dynamic Threshold (2)

2) Dynamic threshold and Power Divergence for β = 0.5 and β = 1.5

In our experiments we use a dynamic threshold instead of a static one, applied to the SYN traffic with SYN flooding attacks under the Power Divergence technique. Whenever the threshold (dashed line) is above the Power Divergence of the SYN traffic, there is no attack. Whenever the threshold is below it, an attack is signalled.

For β = 0.5, the dynamic threshold detects the 7 attacks revealed by the Power Divergence, but it also raises many false alarms, as shown in Fig. 11. Fig. 12 shows that for β = 1.5 the dynamic threshold detects all 9 attacks revealed by the Power Divergence and, unlike the case of β = 0.5, raises no false alarms.

These results justify the use of a dynamic threshold instead of a static one. For example, with a constant threshold h = 0.5 and β = 1.5, the last attack at t = 350 is not detected. With a constant threshold h = 0.2, all 9 attacks are detected, but a false alarm is also raised at t = 140.

Receiver Operating Characteristic (ROC)

Fig. 13 and Fig. 14 show the receiver operating characteristic (ROC) curves of the Power Divergence algorithm for varying attack intensity, attack duration and normal traffic load. ROC curves display the trade-off between false alarm rate and detection rate. The performance of the Power Divergence varies significantly with the attack intensity. We plot the ROC by varying the value of the threshold.

For β = 0.5, as shown in Fig. 13, we achieve a detection rate of 67 % with a false alarm rate of 0. For β = 1.5, as shown in Fig. 14, we achieve a detection rate of 89 % with a false alarm rate of 0. The ROC curves therefore show that, at zero false alarms, β = 1.5 gives a better detection rate than β = 0.5.
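A threshold sweep of this kind can be sketched as follows. The example is illustrative only: it assumes one divergence score and one ground-truth label per time interval, counts detections per interval rather than per attack, and uses hypothetical function names. It is not the evaluation code behind the figures.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    An interval is flagged when its score is strictly above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one attack and one normal interval")
    thresholds = ([float("inf")]
                  + sorted(set(scores), reverse=True)
                  + [float("-inf")])
    points = []
    # Sweep from the strictest threshold (nothing flagged) to the loosest.
    for thr in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y and s > thr)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s > thr)
        points.append((fp / negatives, tp / positives))
    return points


def detection_rate_at_zero_false_alarms(scores, labels):
    """Best detection rate among ROC points with no false alarm."""
    return max(tpr for fpr, tpr in roc_points(scores, labels) if fpr == 0.0)
```

The second function extracts the operating point quoted on this slide: the highest detection rate reachable before the first false alarm appears.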

Conclusion

DDoS attacks are a real threat for any type of traffic. In this paper, we proposed a new framework based on Sketch and Power Divergence for anomaly detection over high speed links. Our experiments demonstrate the efficiency of the proposed approach through implementation and testing on real traces with injected distributed SYN flooding attacks. The results show that the approach can detect DDoS attacks even at low intensity.

Performance evaluation via the ROC shows that increasing the order β from 0.5 to 1.5 lets the Power Divergence preserve high detection accuracy even when the attack rate is very low. Among the values tested, β = 1.5 is the best: it minimises the false alarm ratio and increases the detection efficiency. For β = 1.5, our algorithm outperforms the Hellinger Distance (which corresponds to taking β = 0.5 in our algorithm).

In future work, we will focus on providing additional information to pinpoint malicious flows, in order to trigger automatic reaction against ongoing attacks. We also intend to provide a method for reducing the amount of monitoring data on high speed networks, and to analyze the impact of sampling on the precision of these divergence measures.

Thank You