Multistep Attack Detection and Alert Correlation in Intrusion Detection Systems

Size: px
Start display at page:

Download "Multistep Attack Detection and Alert Correlation in Intrusion Detection Systems"

Transcription

1 Multistep Attack Detection and Alert Correlation in Intrusion Detection Systems Fabio Manganiello, Mirco Marchetti, and Michele Colajanni Università degli Studi di Modena e Reggio Emilia, Department of Information Engineering Via Vignolese 905, Modena, Italy {fabio.manganiello, mirco.marchetti, michele.colajanni}@unimore.it Abstract. A growing trend in the cybersecurity landscape is represented by multistep attacks that involve multiple correlated intrusion activities to reach the intended target. The duty of correlating security alerts and reconstructing complete attack scenarios is left to system administrators because current Network Intrusion Detection Systems (NIDS) are still oriented to generate alerts related to single attacks, with no or minimal correlation analysis among different security alerts. We propose a novel approach for the automatic analysis of multiple security alerts generated by state-of-the-art signature-based NIDS. Our proposal is able to group security alerts that are likely to belong to the same attack scenario, and to identify correlations and causal relationships among them. This goal is achieved by combining alert classification through Self Organizing Maps and unsupervised clustering algorithms. The efficacy of the proposal is demonstrated through a prototype tested against network traffic traces containing multistep attacks. Keywords: Network security, machine learning, neural networks, alert clustering, alert correlation, Self-Organizing Maps 1 Introduction The presence of a Network Intrusion Detection System (NIDS) is a cornerstone in any modern security architecture. A typical NIDS analyzes network traffic and generates security alerts as soon as a malicious network packet is detected. Alert analysis is then performed manually by security experts that parse the NIDS logs to identify relevant alerts and verify possible causal relationships among them. While the log analysis can be simplified through some NIDS management interface, it is still a manual, time consuming and error prone process especially because we are experiencing a growing number of multistep attacks that involve multiple correlated intrusion activities. This paper proposes a novel alert correlation and clustering approach that helps security analysts in identifying multistep attacks by clustering similar alerts produced by a signature-based NIDS, such as Snort [11], and by highlighting

2 2 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni strong correlations and causal relationships among different clusters. This goal is achieved through multiple steps: the security alerts are preprocessed by a hierarchical clustering scheme; then, they are processed using a Self-Organizing Map ( SOM ) that is a type of auto associative neural network [6]; the alerts that likely belong to the same attack scenarios are clustered by using the k-means algorithm over the SOM output layer; finally, a correlation index is computed among the alert clusters in order to identify causal relationships that are typical of a multistep attack scenario. The final output of our framework for multistep attack detection is a set of oriented graphs. Each graph describes an attack scenario in which the vertices represent clusters belonging to the same attack scenarios, and the directed links denote alert clusters that are tied by causal relationships. In such a way, a security administrator can immediately identify correlated alerts by looking at the graphs, thus avoiding to waste time on checking false positives and irrelevant alerts. This paper presents several contributions with respect to the state of the art. The application of the k-means clustering algorithm to the output layer of a SOM allows us to perform robust and unsupervised clustering of correlated NIDS alerts. To the best of our knowledge, this solution has never been proposed in network security. Moreover, several original (for the security literature) heuristics for the SOM initialization and definition of the minimum size of the clusters produced by the k-means algorithm reduce the number of configuration parameters and allow our solution to autonomously adapt to different workloads. This is an important result because the typicial context is characterized by high variability and heterogeneity of security alerts. We demonstrate the feasibility and efficacy of the proposed solution through a prototype that was extensively tested using the most recent datasets released after the 2010 Capture the Flag competition [4]. This paper is organized as follows. Section 2 discusses related work in network security literature and outlines main improvements of this proposal with respect to the state of the art. Section 3 presents the software architecture of the proposed solution. Sections 4, 5 and 6 describe the SOM, the clustering algorithm applied to the SOM output, and the algorithm that evaluates correlations among alert clusters, respectively. Section 7 presents the experimental results. Section 8 draws the conclusions and outlines future work. 2 Related work The application of machine learning techniques, such as neural networks and clustering algorithms, for intrusion detection has been widely explored in the security literature. Several papers propose clustering algorithms, support vector machines [7] and neural networks as the main detection engine for the implementation of anomaly-based network intrusion detection systems. In particular, the use of Self-Organizing Maps (SOM) for the implementation of an anomaly-based IDS was proposed in [14], [9] and [2]. Unlike previous literature mainly oriented to anomaly-based detection, in this paper we propose SOM and clustering algo-

3 Multistep Attack Detection and Alert Correlation 3 rithms for the postprocessing of security alerts generated by a signature-based NIDS. Hence, our proposal relates to other papers focusing on techniques for NIDS alert correlation [3]. According to the comprehensive framework for NIDS alert correlation proposed in [13], the solution proposed in this paper can be classified as a multistep correlation component. In this context, two related papers are [8] and [5]. In [8] authors describe an alert correlation algorithm that uses a SOM in order to compute the correlation between security alerts produced by a NIDS. Alerts are modeled as numerical tuples, and their correlation is inversely proportional to the distance distance between the two corresponding neurons on the output layer of a SOM. Correlation computed according to [8] measures the similarity between two given security alerts, and is commutative by definition. Our work differentiates from [8] because our evaluation is not limited to compute alert similarity. Indeed, mapping security alerts on the output layer of the SOM is only one intermediate steps of our framework, that takes the SOM output as the input of the alert clustering algorithms. Moreover, with respect to [8] we propose an innovative initialization algorithm for the SOM and an adaptive training strategy that makes the SOM more robust with respect to perturbations in the training data (see Section 4 for details on the design and implementation of the SOM). Finally, instead of relying just on a commutative correlation index based on the distance between two alerts, our correlation index depends also on the type of alerts and on their detection time (see Section 6). The resulting correlation index expresses the causality relationships among alerts much better than the previous one. In [5] security alerts generated by a NIDS are grouped through a hierarchical clustering algorithm. The classification hierarchy, that is defined by the user, aggregates alerts of the same type targeting one host or a set of hosts connected to the same subnet. We use a similar hierarchical clustering scheme as a preprocessing step. We then use the alert clusters generated by this hierarchical clustering algorithm as an input for the SOM. Hence, we take advantage of the ability of the algorithm presented in [5] to reduce the number of alerts to be processed, and to group a high number of false positives. All the subsequent processing steps are novel. In such a way, the proposed alert correlation algorithm could work even without the preprocessing step of a hierarchical clustering although at the price of a higher computational cost. 3 Software architecture The architecture of the framework proposed in this paper consists of a preprocessing phase and of three main processing steps. The preprocessing phase, called hierarchical clustering in Figure 1, takes as its input the intrusion alerts generated by a signature-based NIDS. In our reference implementation, we refer to the well known Snort. Alerts are grouped according to a clustering hierarchy in a way similar to [5]. This preprocessing phase has two positive effects. It reduces the number of elements that have to be processed

4 4 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni by the subsequent steps, with an important reduction of the computational cost of the three processing phases. Moreover, it groups many false positives in the same cluster, thus simplifying human inspection of the security alerts. In our approach, clustered alerts are modelled as numerical normalized tuples, where each element identifies a property of a given cluster (timestamp, alert type, source and destination address and ports). Input: NIDS alerts First processing step:som Second processing step: k-means clustering Preprocessing: hierarchical clustering Third processing step: correlation index Output: set of correlation graphs Fig. 1. Main functional steps of the software architecture Clustered alerts are then processed by the Self Organizing Map (SOM) in Figure 1. The SOM is able to reduce the dimensionality of the input dataset by mapping each multidimensional tuple to one output neuron. Each output neuron is identified by its coordinates on the output layer of the SOM, hence the multi-dimensional tuples that represent clustered alerts are mapped to twodimensional coordinate sets on the output layer. Moreover, a SOM has the ability to map similar tuples to neurons that are close in the output layer. The use of

5 Multistep Attack Detection and Alert Correlation 5 SOM allows us to reduce the dimensionality of the dataset while preserving information related to input similarity. In particular, the distance between two output neurons on the output layer of the SOM is inversely proportional to the similarity of the related inputs. The output of the SOM is then analyzed by the second processing step, that is a k-means clustering algorithm aiming to detect similar alert clusters that are likely to belong to the same attack scenario. The basic idea is that similar alert clusters are mapped on close neurons of the SOM, hence it is possible to group similar alert clusters by executing a k-means clustering on the output layer of the SOM, as shown in Figure 1. The choice of the parameter k is critical. Our proposal includes a heuristic that computes the best k on the basis of the data analyzed so far, as illustrated in Section 5. This approach allows our clustering algorithm to automatically adapt its parameters without the need of a static value for k that risks to be unsuitable in most instances. The output of the proposed k-means clustering algorithm, described in Section 5, is a set of clusters each representing a likely attack scenario. The third (and last) processing step evaluates correlations between alert clusters belonging to the same attack scenario. It takes as its input the output layer of the SOM and the results of the k-means clustering algorithm, and then computes a correlation index between alert clusters that have been grouped within the same cluster by the k-means algorithm. The computation of the correlation index is based on the distance between the neurons on the output layer of the SOM, on the timing relationships between alerts and on their type. In particular, this algorithm identifies causal relationships between alerts by determining which of the two alerts occurred first, and whether historical data show that alerts of the same type usually occur one after another. If the correlation index between two alert clusters is above a dynamic threshold, then the two clusters are considered related. A description of the algorithm used to compute the alert correlation graph is in Section 6. The final output of the proposed algorithm for multistep alert correlation is represented by a set of directed graphs. Each graph represents a different attack scenario, whose vertices are clusters of similar alerts. The directed edges represent relationships between different alert clusters that belong to the same scenario. An example of graph produced by our algorithm is presented in Section 7. 4 Self-Organizing Map A Self-Organizing Map is an auto-associative neural network [6], that is commonly used to produce a low-dimensional (typically two-dimensional) representation of input vectors that belong to a high-dimensional input space. SOM networks are especially useful for the visualization of data with high dimensionality, since input vectors are mapped to coordinates in the output layer by a neighborhood function that preserves the topological properties of the input

6 6 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni space. Hence, input vectors that are close in the input space are mapped to near positions in the output layer of the SOM. Input similarity between any two input vectors is inversely proportional to the distance between their projections on the output layer. Given a SOM having K neurons in the input layer, and M N on the output layer, the SOM is a completely connected network having K M N links, so that each input neuron is connected to each neuron of the output layer. The output layer itself can be completely connected (that is, M(M 1) N(N 1) links) or simply connected (each neuron is connected only to the closest ones on the same row and column, and possibly on the same diagonal). Each link from the input to the output layer has a weight that expresses the strength of that link. In the described implementation, each alert is modelled as a numerical normalized tuple provided to the input layer. As an example, the i th alert x i is modelled as x i = (alertt ype i, srcip i, dstip i, srcp ort i, dstp ort i, timestamp i ) (1) where alertt ype i is the alert type as identified by Snort, srcip i and dst I P i are its source and destination IP addresses, srcp ort i and dstp ort i are the source and destination port numbers, and timestamp i is the time at which the alert has been issued. Each field is represented by a numerical value normalized in the range [0, 1]. The first step initializes the weights of the links between the input and the output layer. Being the involved training algorithm unsupervised, it is important to have an optimal weight initialization instead of a random initialization. This approach guarantees an acceptable associative accuracy in the least number of steps. The weight initialization algorithm used in this model is similar to that proposed in [12]. It involves a heuristics that aims to map on near points the items which are dimensionally or logically close, and on distant points the items which are dimensionally or logically distant. Let us consider the training space X = {x 1,..., x h } for our network. We pick up the two vectors x a, x b X, with x a = {x a1,..., x ak } and x b = {x b1,..., x bk } having the maximum K-dimensional Euclidean distance: where x a, x b X d(x a, x b ) d(x i, x j ) i, j = 1..h (2) d(x i, x j ) = K (x ip x jp ) 2 (3) p=1 The values of x a and x b are used for initializing the vectors of weights on the lower left, w M1 = x a, and the upper right corner, w 1N = x b. The idea is to map the two dimensionally most distant items on the opposite corners of the output layer. The values of the upper left corner weights vector w 11 are then initialized by picking up the vector x c X {x a, x b } having the maximum distance from x a

7 Multistep Attack Detection and Alert Correlation 7 and x b. Finally, the vector x d X {x a, x b, x c } having the maximum distance from x a, x b, x c initializes the values of the bottom right corner w MN. After this process, we need to initialize the weights of the remaining neurons on the four edges through a linear interpolations, according to the following relations: w 1j = j 1 N 1 w 1N + N j N 1 w 11 for j = 2,..., N 1 (4) w Mj = j 1 N 1 w MN + N j N 1 w M1 for j = 2,..., N 1 (5) w i1 = i 1 M 1 w M1 + M i M 1 w 11 for i = 2,..., M 1 (6) w in = i 1 M 1 w MN + M i M 1 w 1N for i = 2,..., M 1 (7) After this step, the remaining neurons are initialized through the following two-dimensional linear interpolation: w ij = (j 1)(i 1) (N 1)(M 1) w (j 1)(M i) MN + (N 1)(M 1) w 1N + (N j)(i 1) (N 1)(M 1) w (N j)(m i) M1 + (N 1)(M 1) w 11 (8) This heuristic-based initialization schema for the SOM reduces the number of steps to reach a good precision, and improves the accuracy of the network with respect to a random initialization schema with fewer training steps [12]. After the initialization of the weights, the network undergoes unsupervised and competitive training by using the same training set X = {x 1,..., x h } used for the initialization. For each training vector x i X, i = 1,..., h, the algorithm finds the neuron that most likely represents it, that is, the neuron having the weights vector w with the minimum distance from x i : x i w = min x i w jk (9) for j = 1,..., M, k = 1...N and the operator representing the Euclidean distance. w represents the best neuron associated to the training vector x i, that is, the one having the minimum weight distance from the input element x i. At the t-th learning step, the map weights are updated according to the following relation: w jk (t) = w jk (t 1) + δ( w, w jk )α(t) (x i w jk (t 1)) (10) for j = 1,..., M, k = 1,..., N. The value of the function δ is inversely proportional to the distance between the two neuron weights taken as their arguments. In Eq. 10 this value increases proportionally to the distance between the neuron at coordinates (i, j) and w. It expresses a kind of influence in the update steps over the training vector x i that augments as we approach the neuron having

8 8 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni weights w. Considering two neurons with coordinates (x, y) and (i, j), for the purposes of this paper we use the following δ function: δ(w xy, w jk ) = 1 ( x j + y k ) (11) The term in round parentheses at the denominator is the Manhattan distance between the coordinates of the two neurons. The idea behind this choice is to apply, at each learning step, some strong adjustments over the neurons that are relatively close to w, and the maximum adjustments on w itself, for which the δ function value is 1. The following step consists in the definition of a learning rate for the network expressed in function of the learning step t. In many SOM applications, this value is high at the beginning of the learning phase, when the network is still more prone to errors, and it decreases monotonically as the learning phase continues. However, as discussed in [15], this approach makes the learning process too much dependent on the first learning vectors, and not suited to the high variable context of NIDS alert analysis. To mitigate this issue, in this paper we use a learning rate function α(t) close to that proposed in [15] and shown in Figure 2: α(t) = t ( T exp 1 t ) T (12) T is a parameter that expresses how fast the learning rate should tend to 0. A low value of T implies a faster learning process on a smaller training set, while a high value implies a slower process on a larger training set. The parameter T is computed as a function of two user-defined parameters: C, that represents the cutoff threshold for the learning rate, i.e. the value under which the learning phase feedback becomes negligible; t c, that is computed by solving the equation α(t c ) = C. It represents the number of learning steps that should be taken before having a learning rate that equals the cutoff value, i.e. the number of steps after which the SOM network becomes nearly insensitive to further learning feedbacks. The parameter T in α(t) representation given t c, is computed by solving α(t c ) = C: t c T + log T = log t ce (13) C that is a logarithmic equation having T as a variable. Its approximated solution is T = t ( ce (W C exp 1 C )) (14) e where W 1 is the analytic continuation of the Lambert W function.

9 Multistep Attack Detection and Alert Correlation 9 Fig. 2. Plot of the function α(t) that expresses the learning rate in function of the current learning step An approximation of this function is computed through its Taylor series as proposed in [1]: where W 1 (z) = µ k = k 1 ( µk 2 k k=0 µ k p k (15) p = 2(ez + 1) (16) + α ) k 2 α k 4 2 µ k 1 k + 1 (17) k 1 α k = µ j µ k j+1 (18) j=2 µ 0 = 1, µ 1 = 1, α 0 = 2, α 1 = 1 (19)

10 10 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni The first terms of the series (eq. 15) are W 1 (z) = 1 + p 1 3 p p p p5... (20) The SOM is re-trained at regular intervals. The training phase, that is computationally expensive, is executed in a separated thread, as discussed in Section 7. On multi-core processors this feature allows us to execute this phase in parallel with analysis and at short time intervals (in our experiments, in the order of seconds). After the learning phase, the network is ready to associate alerts provided as the normalized numeric tuples as shown in Eq. 1. The network has as many input neurons as the number of items in the alert tuple, and its output layer size M N configured by the user. 5 Clustering algorithm The next step is to apply a k-means clustering algorithm on the SOM output layer to extract possible information over distinct attack scenarios. Hence, we need to initialize k centers, for a fixed value of k. Let M N be the size of the SOM output laver, and D be the data set containing all the pairs (i, j), 1 i M, 1 j N, having at least one associated alert. We pick up the points x µ = (i µ, j µ ) and x ν = (i ν, j ν ) out of the dataset D as the points having the maximum Euclidean distance: x µ, x ν D : x µ x ν x i x j, 1 i, j D (21) These points identify the first two centers for the clustering algorithm. The third center is chosen as the point in the data set having the maximum distance from x µ and x ν, the fourth center as the point having the maximum distance from the three centers chosen so far, and so on. After the initialization step, each element in the data set D chooses the centers closest to it as the identifier of its cluster. Therefore, at the t-th iteration, the set S (t) i, containing the points associated to the i-th center, for 1 i k, is defined as { S (t) i = x j : x j m (t) i } x j m (t) i i = 1..k (22) where m (t) i represents the coordinates of the i-th center at the step t. The centers of each cluster are the computed again as mean of the points inside of it: m (t+1) i = 1 S (t) i x j S (t) i x j (23) The algorithm is repeated until m (t) i = m (t+1) i i = 1,..., k, that is, until no more change is needed on the coordinates of the centers.

11 Multistep Attack Detection and Alert Correlation 11 A well known drawback of the k-means algorithm consists in the necessity to fix the value of the number of clusters k before the algorithm execution. To limit this risk, we use the Schwarz criterion [10] as our heuristics for computing the best number of clusters in our data set. In particular, for 1 k D, we compute several heuristic indexes that express, for that value of k, how large is the distortion value for that k, that is how large the average distance between each point and its associated center is: µ k = k log n + (x j m i ) 2 (24) x j D where m i, 1 i k, is the center associated by the k-means algorithm to the generic point x j D. The best value of k, k, is the one having µ k µ k, k = 1... D. The Schwarz heuristic allows us to build a clustering algorithm based on a near-optimal number of clusters without using fixed or user-provided parameters. This approach is able to adapt itself to the heterogeneity of the data set at the price of a higher computational cost. 6 Correlation index In the last processing step we compute correlations among alert clusters belonging to the same attack scenario. The correlation index between two alert clusters A i and A j mapped on two output neurons having coordinates (x[a i ], y[a i ]) and (x[a j ], y[a j ]) is computed as a function of the Euclidean distance between these two neurons normalized over the maximum distance on the output matrix, that is the distance between the extreme points on the same diagonal (denoted by (x 0, y 0 ) and (x M, y M )): { 0, 1] Corr(A i, A j ) = 1, 2] (25) 1+ (x[a i] x[a j]) 2 +(y[a i] y[a j]) 2 The case 1] comes when the distance between (x[a i ], y[a i ]) and (x[a j ], y[a j ]) is the same distance between (x 0, y 0 ) and (x M, y M ), that is, the two alerts are mapped on neurons having the maximum distance (in this case their correlation value is zero), otherwise we have the case 2]. The correlation value between two alerts is always normalized in [0, 1]. This definition of correlation is commutative, and it does not express any causality correlation. As we want to express the direction of the causality between two alerts, a of type A and b of type B with a relatively high correlation coefficient, we pursue the following approach: 1. If t[b] > t[a], i.e. the alert b was raised after the alert a, and we have no historical information over time relations between alerts of type A and alerts of type B, then a b, that is the alert a was likely to generate the alert b in a specific attack scenario;

12 12 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni 2. If we have logged information over alerts of both types A and B, and the number of times that an alert of type B occurred after an alert of type A (in a specified time window) is greater than the number of times when the opposite event occurred, then a b. The weight of this correlation value (denoting its accuracy), is computed as a function of the size of the knowledge base acquired so far, namely the number of alerts in the alert log L used to train the SOM. In particular, we want to keep a relatively low weight (close to 0) when the size of the alert log used for training the network is small, and to increase it when the system acquires new alerts. Given an alert log of size x = L, we compute the weight w(x) of the SOM-based index through a monotonically increasing function in [0, 1]. In particular, we will use the hyperbolic tangent tanh as the weight function: ( x ) w(x) = tanh = e x K e x K K e x K + e x K (26) The parameter K determines how fast we want the algorithm to tend to 1. It is set as a function of the parameter x M denoting the size of the alert log L in order to have w(x M ) = M. In this paper we use M = Considering 1 as the asymptotically maximum value for w(x), then x M denotes the size of the alert log L in order to have a weight that is 95% of its maximum value): w(x M ) = M e xm K e x MK = M (27) e x M K + e x M K Solving this equation through numerical approximation methods, for a fixed value of M and letting t = x M /K, we get an approximated solution t sol, used for computing K as: K = x M t sol (28) Once obtained the correlation coefficients for each pair of alerts, we consider two alerts A i, A j actually correlated only if their correlation value is greater than a threshold: is corr(a i, A j ) = true Corr(A i, A j ) µ corr + λσ corr (29) where µ corr is the average correlation value and σ corr is its standard deviation. λ R denotes how far from the average we want to shift in order to consider two alerts as correlated. Feasible values of λ for our purposes are in the interval λ [0, 2]. For λ 0 we consider as correlated all the pairs of alerts having a correlation value higher than the average one. This usually implies a relatively large correlation graph. A value of λ 2 brings to a smaller correlation graph, that only contains the most strongly correlated pairs of alerts. Higher values for λ could result in empty or poorly populated correlation graphs, since the correlation constraint could be too strict.

13 7 Experimental results Multistep Attack Detection and Alert Correlation 13 We carried out several experiments using a prototype implementation of the proposed multistep alert correlation architecture. The software has been mainly developed in C, as a module for Intrusion Detection System software Snort. It also includes a Web-based user interface realized in Perl, HTML and AJAX. A preliminary set of experiments, aiming to verify all the funcionalities of our software, were conducted using small-scale traffic traces, including attack scenarios performed within controlled network environment. Extensive experimental evaluation have then been carried out against the Capture the Flag (CTF) 2010 dataset, that includes 40 GB of network traffic in PCAP format and is publicly available [4]. Our goals are to demonstrate that the computational cost of the proposed solution is compatible with live traffic analysis and to verify the capability of the system to recongize and correlate alerts belonging to the same attack scenario. To achieve high performance, several different processing steps have been implemented as concurrent threads, that run in parallel with traffic analysis. In particular, alert clustering, training of the SOM and the periodic evaluation of the best value of k of the k-means algorithm, are performed in the background. As soon as the new weight of the SOM or the new best value of k are computed, they are substituted to the previous values. This design choice allows us to leverage modern, multi-core architectures to perform these expensive operations with no impacts on the overall performance of Snort. The two lines in Figure 3 show the memory usage of Snort while analyzing a 200MB traffic trace with and without or multistep alert correlation module. When our module is active, the memory usage of Snort is slightly higher. However, in both the cases the analysis of the traffic dump requires about 42 seconds, hence or module has no noticeable impact on the time required to analyze the traffic trace. Thanks to the self-adaptive choices of the parameters, our framework can easily adapt to heterogeneous traffic traces without the need for user-defined static configurations: the first alert clustering phase [5] is performed using the average heterogeneity of the alerts as grouping measure; the SOM distances depend on the size of the SOM network itself, but a greater size means a higher average distance, that anyway does not impact on the relative normalized distance values; The k-means clustering uses Schwarz criterion as heuristic for computing the best number of data clusters. The last processing step, that computes the correlation index among clustered alerts belonging to the same attack scenario, is quite sensitive to the choice of the parameter λ in Eq. 29. Feasible values are in the interval λ [0, 2], as discussed in Section 6. A value of λ close to zero produces a graph that contains many alert correlations (all the ones having a correlation value greater than the average one). This is useful when the user wants to know all the likely correlations. A value

14 14 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni Fig. 3. Memory usage of Snort analyzing 200 MB of CTF traffic, with and without our module closer to 2 highlights the few strongest correlations, having high correlation values. For example, by setting λ = 1.7, we obtain 4 different scenarios, two of which are reported in Figures 4 and 5. The scenario represented by the correlation graph in Figure 4 contains a likely shellcode execution (the grouping of several similar alerts inside of the same graph node is performed through Julisch method described in [5]) linked to an HTTP LONG HEADER alert and to a NON-RFC DEFINED CHAR alert raised by the http inspect module towards the same host and port in the same time window. The correlation graph depicted in Figure 5 represents an attack scenario in which a TCP portsweep alert and an ICMP sweep alert triggered towards the same subnet in the same time window are successfully correlated. On the other hand, by setting λ = 1.2 we obtain a larger number of likely scenarios, that can also be connected among them. The XML file 1 containing the 37 alert clusters that represent different attack scenarios and the corresponding correlation graphs 2 cannot be included and discussed in this paper for space limitations and are freely available online. 1 Available online at 2 Available online at

15 Multistep Attack Detection and Alert Correlation 15 Fig. 4. Example of a correlation sub-graph generated by the software using the Capture the Flag 2010 dataset, successfully finding a correlation in a scenario containing a shellcode NOOP alert and a HTTP LONG HEADER alert on the same port and in the same time window Fig. 5. Another example of a correlation found by the software, showing a correlation between some TCP portsweep and ICMP sweep alerts towards the same subnet and in the same time window 8 Conclusion This paper presents a novel multistep alert correlation algorithm that is able to group security alerts belonging to the same attack scenario and to identify correlations and casual relationships among intrusions activities. The input of the proposed algorithm is represented by security alerts generated by a signaturebased NIDS, such as Snort. Viability and efficacy of the proposed multistep alert correlation algorithm is demonstrated through a prototype, tested against recent and publicly available traffic traces. Experimental results show that the proposed multistep correlation algorithm is able to isolate and correlate intrusion activities belonging to the same attack scenario, thus helping security administrator in the analysis of alerts produced by a NIDS. Moreover, by leveraging modern multi-

16 16 Fabio Manganiello, Mirco Marchetti, and Michele Colajanni core architectures to perform training in parallel and non-blocking threads, the computational cost of our prototype is compatible with live traffic analysis. References 1. Chapeau-Blondeau, F.: Numerical evaluation of the lambert w function and application to generation of generalized gaussian noise with exponent 1/2. IEEE Transactions on Signal Processing 50(9) (2002) 2. Chen, Z.G., Zhang, G.H., Tian, L.Q., Geng, Z.L.: Intrusion detection based on self-organizing map and artificial immunisation algorithm. Engineering Materials 439(1), (2010) 3. Colajanni, M., Marchetti, M., Messori, M.: Selective and early threat detection in large networked systems. In: Proc. of the 10th IEEE International Conference on Computer and Information Technology (CIT 2010) (2010) 4. Capture the flag traffic dump, available online at links/dc-ctf.html 5. Julisch, K.: Clustering intrusion detection alarms to support root cause analysis. ACM Transactions on Information and System Security 6, (2003) 6. Kohonen, T.: The self-organizing map 78(9) (1990) 7. Mukkamala, S., Janoski, G., Sung, A.: Intrusion detection using neural networks and support vector machines. In: Proceedings of the 2002 International Joint Conference on Neural Networks (2002) 8. Munesh, K., Shoaib, S., Humera, N.: Feature-based alert correlation in security systems using self organizing maps. In: Proceedings of SPIE, the International Society for Optical Engineering (2009) 9. Patole, V.A., Pachghare, V.K., Kulkarni, P.: Article: Self organizing maps to build intrusion detection system. International Journal of Computer Applications 1(7), 1 4 (February 2010) 10. Pelleg, D., Moore, A.: X-means: Extending k-means with efficient estimation of the number of clusters. In: Proc. of the 17th International Conference on Machine Learning. pp Morgan Kaufmann (2000) 11. Snort home page, available online at Su, M.C., Liu, T.K., Chang, H.T.: Improving the self-organizing feature map algorithm using an efficient initialization scheme. Tamkang Journal of Science and Engineering 5(1), (2002) 13. Valeur, F., Vigna, G., Kruegel, C., Kemmerer, R.A.: A comprehensive approach to intrusion detection alert correlation. IEEE Transactions on Dependable and Secure Computing 1, (2004) 14. Vokorokos, L., Baláz, A., Chovanec, M.: Intrusion detection system using self organizing map 6(1) (2006) 15. Yoo, J.H., Kang, B.H., Kim, J.W.: A clustering analysis and learning rate for selforganizing feature map. In: Proc. of the 3rd International Conference on Fuzzy Logic, Neural Networks and Soft Computing (1994)

Identification of correlated network intrusion alerts

Identification of correlated network intrusion alerts Identification of correlated network intrusion alerts Mirco Marchetti, Michele Colajanni, Fabio Manganiello Department of Information Engineering University of Modena and Reggio Emilia Modena, Italy {mirco.marchetti,

More information

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks 2011 International Conference on Network and Electronics Engineering IPCSIT vol.11 (2011) (2011) IACSIT Press, Singapore An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks Reyhaneh

More information

A Study of Web Log Analysis Using Clustering Techniques

A Study of Web Log Analysis Using Clustering Techniques A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 21 CHAPTER 1 INTRODUCTION 1.1 PREAMBLE Wireless ad-hoc network is an autonomous system of wireless nodes connected by wireless links. Wireless ad-hoc network provides a communication over the shared wireless

More information

Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets

Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: macario.cordel@dlsu.edu.ph

More information

Detection and mitigation of Web Services Attacks using Markov Model

Detection and mitigation of Web Services Attacks using Markov Model Detection and mitigation of Web Services Attacks using Markov Model Vivek Relan RELAN1@UMBC.EDU Bhushan Sonawane BHUSHAN1@UMBC.EDU Department of Computer Science and Engineering, University of Maryland,

More information

Network Intrusion Detection Systems

Network Intrusion Detection Systems Network Intrusion Detection Systems False Positive Reduction Through Anomaly Detection Joint research by Emmanuele Zambon & Damiano Bolzoni 7/1/06 NIDS - False Positive reduction through Anomaly Detection

More information

Network Intrusion Detection Using an Improved Competitive Learning Neural Network

Network Intrusion Detection Using an Improved Competitive Learning Neural Network Network Intrusion Detection Using an Improved Competitive Learning Neural Network John Zhong Lei and Ali Ghorbani Faculty of Computer Science University of New Brunswick Fredericton, NB, E3B 5A3, Canada

More information

Index Terms Domain name, Firewall, Packet, Phishing, URL.

Index Terms Domain name, Firewall, Packet, Phishing, URL. BDD for Implementation of Packet Filter Firewall and Detecting Phishing Websites Naresh Shende Vidyalankar Institute of Technology Prof. S. K. Shinde Lokmanya Tilak College of Engineering Abstract Packet

More information

Intrusion Detection System using Log Files and Reinforcement Learning

Intrusion Detection System using Log Files and Reinforcement Learning Intrusion Detection System using Log Files and Reinforcement Learning Bhagyashree Deokar, Ambarish Hazarnis Department of Computer Engineering K. J. Somaiya College of Engineering, Mumbai, India ABSTRACT

More information

Efficient Security Alert Management System

Efficient Security Alert Management System Efficient Security Alert Management System Minoo Deljavan Anvary IT Department School of e-learning Shiraz University Shiraz, Fars, Iran Majid Ghonji Feshki Department of Computer Science Qzvin Branch,

More information

MANAGING QUEUE STABILITY USING ART2 IN ACTIVE QUEUE MANAGEMENT FOR CONGESTION CONTROL

MANAGING QUEUE STABILITY USING ART2 IN ACTIVE QUEUE MANAGEMENT FOR CONGESTION CONTROL MANAGING QUEUE STABILITY USING ART2 IN ACTIVE QUEUE MANAGEMENT FOR CONGESTION CONTROL G. Maria Priscilla 1 and C. P. Sumathi 2 1 S.N.R. Sons College (Autonomous), Coimbatore, India 2 SDNB Vaishnav College

More information

KEITH LEHNERT AND ERIC FRIEDRICH

KEITH LEHNERT AND ERIC FRIEDRICH MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They

More information

Load balancing in a heterogeneous computer system by self-organizing Kohonen network

Load balancing in a heterogeneous computer system by self-organizing Kohonen network Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.

More information

SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING

SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations

More information

An Analysis on Density Based Clustering of Multi Dimensional Spatial Data

An Analysis on Density Based Clustering of Multi Dimensional Spatial Data An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,

More information

Alarm Clustering for Intrusion Detection Systems in Computer Networks

Alarm Clustering for Intrusion Detection Systems in Computer Networks Alarm Clustering for Intrusion Detection Systems in Computer Networks Giorgio Giacinto, Roberto Perdisci, Fabio Roli Department of Electrical and Electronic Engineering, University of Cagliari Piazza D

More information

Experimentation driven traffic monitoring and engineering research

Experimentation driven traffic monitoring and engineering research Experimentation driven traffic monitoring and engineering research Amir KRIFA (Amir.Krifa@sophia.inria.fr) 11/20/09 ECODE FP7 Project 1 Outline i. Future directions of Internet traffic monitoring and engineering

More information

Analyzing TCP Traffic Patterns Using Self Organizing Maps

Analyzing TCP Traffic Patterns Using Self Organizing Maps Analyzing TCP Traffic Patterns Using Self Organizing Maps Stefano Zanero D.E.I.-Politecnico di Milano, via Ponzio 34/5-20133 Milano Italy zanero@elet.polimi.it Abstract. The continuous evolution of the

More information

Intrusion Detection in AlienVault

Intrusion Detection in AlienVault Complete. Simple. Affordable Copyright 2014 AlienVault. All rights reserved. AlienVault, AlienVault Unified Security Management, AlienVault USM, AlienVault Open Threat Exchange, AlienVault OTX, Open Threat

More information

A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique

A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique Aida Parbaleh 1, Dr. Heirsh Soltanpanah 2* 1 Department of Computer Engineering, Islamic Azad University, Sanandaj

More information

Detecting Denial of Service Attacks Using Emergent Self-Organizing Maps

Detecting Denial of Service Attacks Using Emergent Self-Organizing Maps 2005 IEEE International Symposium on Signal Processing and Information Technology Detecting Denial of Service Attacks Using Emergent Self-Organizing Maps Aikaterini Mitrokotsa, Christos Douligeris Department

More information

Web Forensic Evidence of SQL Injection Analysis

Web Forensic Evidence of SQL Injection Analysis International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

More information

An apparatus for P2P classification in Netflow traces

An apparatus for P2P classification in Netflow traces An apparatus for P2P classification in Netflow traces Andrew M Gossett, Ioannis Papapanagiotou and Michael Devetsikiotis Electrical and Computer Engineering, North Carolina State University, Raleigh, USA

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

NETWORK-BASED INTRUSION DETECTION USING NEURAL NETWORKS

NETWORK-BASED INTRUSION DETECTION USING NEURAL NETWORKS 1 NETWORK-BASED INTRUSION DETECTION USING NEURAL NETWORKS ALAN BIVENS biven@cs.rpi.edu RASHEDA SMITH smithr2@cs.rpi.edu CHANDRIKA PALAGIRI palgac@cs.rpi.edu BOLESLAW SZYMANSKI szymansk@cs.rpi.edu MARK

More information

Credit Card Fraud Detection Using Self Organised Map

Credit Card Fraud Detection Using Self Organised Map International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud

More information

Visualization of Breast Cancer Data by SOM Component Planes

Visualization of Breast Cancer Data by SOM Component Planes International Journal of Science and Technology Volume 3 No. 2, February, 2014 Visualization of Breast Cancer Data by SOM Component Planes P.Venkatesan. 1, M.Mullai 2 1 Department of Statistics,NIRT(Indian

More information

Segmentation of stock trading customers according to potential value

Segmentation of stock trading customers according to potential value Expert Systems with Applications 27 (2004) 27 33 www.elsevier.com/locate/eswa Segmentation of stock trading customers according to potential value H.W. Shin a, *, S.Y. Sohn b a Samsung Economy Research

More information

Cluster Analysis: Advanced Concepts

Cluster Analysis: Advanced Concepts Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means

More information

Clustering on Large Numeric Data Sets Using Hierarchical Approach Birch

Clustering on Large Numeric Data Sets Using Hierarchical Approach Birch Global Journal of Computer Science and Technology Software & Data Engineering Volume 12 Issue 12 Version 1.0 Year 2012 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global

More information

INTRUSION DETECTION SYSTEM USING SELF ORGANIZING MAP

INTRUSION DETECTION SYSTEM USING SELF ORGANIZING MAP Acta Electrotechnica et Informatica No. 1, Vol. 6, 2006 1 INTRUSION DETECTION SYSTEM USING SELF ORGANIZING MAP Liberios VOKOROKOS, Anton BALÁŽ, Martin CHOVANEC Technical University of Košice, Faculty of

More information

Network Based Intrusion Detection Using Honey pot Deception

Network Based Intrusion Detection Using Honey pot Deception Network Based Intrusion Detection Using Honey pot Deception Dr.K.V.Kulhalli, S.R.Khot Department of Electronics and Communication Engineering D.Y.Patil College of Engg.& technology, Kolhapur,Maharashtra,India.

More information

INTERACTIVE DATA EXPLORATION USING MDS MAPPING

INTERACTIVE DATA EXPLORATION USING MDS MAPPING INTERACTIVE DATA EXPLORATION USING MDS MAPPING Antoine Naud and Włodzisław Duch 1 Department of Computer Methods Nicolaus Copernicus University ul. Grudziadzka 5, 87-100 Toruń, Poland Abstract: Interactive

More information

The Integration of SNORT with K-Means Clustering Algorithm to Detect New Attack

The Integration of SNORT with K-Means Clustering Algorithm to Detect New Attack The Integration of SNORT with K-Means Clustering Algorithm to Detect New Attack Asnita Hashim, University of Technology MARA, Malaysia April 14-15, 2011 The Integration of SNORT with K-Means Clustering

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

Self Organizing Maps for Visualization of Categories

Self Organizing Maps for Visualization of Categories Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, julian.szymanski@eti.pg.gda.pl

More information

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering

Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering IV International Congress on Ultra Modern Telecommunications and Control Systems 22 Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering Antti Juvonen, Tuomo

More information

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

More information

A Novel Distributed Denial of Service (DDoS) Attacks Discriminating Detection in Flash Crowds

A Novel Distributed Denial of Service (DDoS) Attacks Discriminating Detection in Flash Crowds International Journal of Research Studies in Science, Engineering and Technology Volume 1, Issue 9, December 2014, PP 139-143 ISSN 2349-4751 (Print) & ISSN 2349-476X (Online) A Novel Distributed Denial

More information

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems

More information

Detection of Distributed Denial of Service Attack with Hadoop on Live Network

Detection of Distributed Denial of Service Attack with Hadoop on Live Network Detection of Distributed Denial of Service Attack with Hadoop on Live Network Suchita Korad 1, Shubhada Kadam 2, Prajakta Deore 3, Madhuri Jadhav 4, Prof.Rahul Patil 5 Students, Dept. of Computer, PCCOE,

More information

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

A Review of Anomaly Detection Techniques in Network Intrusion Detection System A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In

More information

CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION

CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION MATIJA STEVANOVIC PhD Student JENS MYRUP PEDERSEN Associate Professor Department of Electronic Systems Aalborg University,

More information

Reconstructing Self Organizing Maps as Spider Graphs for better visual interpretation of large unstructured datasets

Reconstructing Self Organizing Maps as Spider Graphs for better visual interpretation of large unstructured datasets Reconstructing Self Organizing Maps as Spider Graphs for better visual interpretation of large unstructured datasets Aaditya Prakash, Infosys Limited aaadityaprakash@gmail.com Abstract--Self-Organizing

More information

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,

More information

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

More information

A Secure Online Reputation Defense System from Unfair Ratings using Anomaly Detections

A Secure Online Reputation Defense System from Unfair Ratings using Anomaly Detections A Secure Online Reputation Defense System from Unfair Ratings using Anomaly Detections Asha baby PG Scholar,Department of CSE A. Kumaresan Professor, Department of CSE K. Vijayakumar Professor, Department

More information

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis A Web-based Interactive Data Visualization System for Outlier Subspace Analysis Dong Liu, Qigang Gao Computer Science Dalhousie University Halifax, NS, B3H 1W5 Canada dongl@cs.dal.ca qggao@cs.dal.ca Hai

More information

Self Organizing Maps: Fundamentals

Self Organizing Maps: Fundamentals Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen

More information

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework Jakrarin Therdphapiyanak Dept. of Computer Engineering Chulalongkorn University

More information

Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis

Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Abdun Mahmood, Christopher Leckie, Parampalli Udaya Department of Computer Science and Software Engineering University of

More information

Chapter ML:XI (continued)

Chapter ML:XI (continued) Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained

More information

USING SELF-ORGANISING MAPS FOR ANOMALOUS BEHAVIOUR DETECTION IN A COMPUTER FORENSIC INVESTIGATION

USING SELF-ORGANISING MAPS FOR ANOMALOUS BEHAVIOUR DETECTION IN A COMPUTER FORENSIC INVESTIGATION USING SELF-ORGANISING MAPS FOR ANOMALOUS BEHAVIOUR DETECTION IN A COMPUTER FORENSIC INVESTIGATION B.K.L. Fei, J.H.P. Eloff, M.S. Olivier, H.M. Tillwick and H.S. Venter Information and Computer Security

More information

Visualization of large data sets using MDS combined with LVQ.

Visualization of large data sets using MDS combined with LVQ. Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

Flexible Deterministic Packet Marking: An IP Traceback Scheme Against DDOS Attacks

Flexible Deterministic Packet Marking: An IP Traceback Scheme Against DDOS Attacks Flexible Deterministic Packet Marking: An IP Traceback Scheme Against DDOS Attacks Prashil S. Waghmare PG student, Sinhgad College of Engineering, Vadgaon, Pune University, Maharashtra, India. prashil.waghmare14@gmail.com

More information

8 Visualization of high-dimensional data

8 Visualization of high-dimensional data 8 VISUALIZATION OF HIGH-DIMENSIONAL DATA 55 { plik roadmap8.tex January 24, 2005} 8 Visualization of high-dimensional data 8. Visualization of individual data points using Kohonen s SOM 8.. Typical Kohonen

More information

Cost Effective Automated Scaling of Web Applications for Multi Cloud Services

Cost Effective Automated Scaling of Web Applications for Multi Cloud Services Cost Effective Automated Scaling of Web Applications for Multi Cloud Services SANTHOSH.A 1, D.VINOTHA 2, BOOPATHY.P 3 1,2,3 Computer Science and Engineering PRIST University India Abstract - Resource allocation

More information

The Effectiveness of Self-Organizing Network Toolbox (1)

The Effectiveness of Self-Organizing Network Toolbox (1) Security Threats to Signal Classifiers Using Self-Organizing Maps T. Charles Clancy Awais Khawar tcc@umd.edu awais@umd.edu Department of Electrical and Computer Engineering University of Maryland, College

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Volume 1 Issue 11 (November 2014)

International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Volume 1 Issue 11 (November 2014) Denial-of-Service Attack Detection Mangesh D. Salunke * Prof. Ruhi Kabra G.H.Raisoni CEM, SPPU, Ahmednagar HOD, G.H.Raisoni CEM, SPPU,Ahmednagar Abstract: A DoS (Denial of Service) attack as name indicates

More information

Conclusions and Future Directions

Conclusions and Future Directions Chapter 9 This chapter summarizes the thesis with discussion of (a) the findings and the contributions to the state-of-the-art in the disciplines covered by this work, and (b) future work, those directions

More information

Network (Tree) Topology Inference Based on Prüfer Sequence

Network (Tree) Topology Inference Based on Prüfer Sequence Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

More information

Network Intrusion Simulation Using OPNET

Network Intrusion Simulation Using OPNET Network Intrusion Simulation Using OPNET Shabana Razak, Mian Zhou, Sheau-Dong Lang* School of Electrical Engineering & Computer Science and National Center for Forensic Science* University of Central Florida,

More information

Keywords Attack model, DDoS, Host Scan, Port Scan

Keywords Attack model, DDoS, Host Scan, Port Scan Volume 4, Issue 6, June 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com DDOS Detection

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

Using Artificial Intelligence in Intrusion Detection Systems

Using Artificial Intelligence in Intrusion Detection Systems Using Artificial Intelligence in Intrusion Detection Systems Matti Manninen Helsinki University of Technology mimannin@niksula.hut.fi Abstract Artificial Intelligence could make the use of Intrusion Detection

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Detecting Flooding Attacks Using Power Divergence

Detecting Flooding Attacks Using Power Divergence Detecting Flooding Attacks Using Power Divergence Jean Tajer IT Security for the Next Generation European Cup, Prague 17-19 February, 2012 PAGE 1 Agenda 1- Introduction 2- K-ary Sktech 3- Detection Threshold

More information

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA Professor Yang Xiang Network Security and Computing Laboratory (NSCLab) School of Information Technology Deakin University, Melbourne, Australia http://anss.org.au/nsclab

More information

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE

More information

ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION

ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION 1 ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION B. Mikó PhD, Z-Form Tool Manufacturing and Application Ltd H-1082. Budapest, Asztalos S. u 4. Tel: (1) 477 1016, e-mail: miko@manuf.bme.hu

More information

Neural Network Add-in

Neural Network Add-in Neural Network Add-in Version 1.5 Software User s Guide Contents Overview... 2 Getting Started... 2 Working with Datasets... 2 Open a Dataset... 3 Save a Dataset... 3 Data Pre-processing... 3 Lagging...

More information

Hybrid Model For Intrusion Detection System Chapke Prajkta P., Raut A. B.

Hybrid Model For Intrusion Detection System Chapke Prajkta P., Raut A. B. www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume1 Issue 3 Dec 2012 Page No. 151-155 Hybrid Model For Intrusion Detection System Chapke Prajkta P., Raut A. B.

More information

Load Distribution in Large Scale Network Monitoring Infrastructures

Load Distribution in Large Scale Network Monitoring Infrastructures Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu

More information

A New Approach For Estimating Software Effort Using RBFN Network

A New Approach For Estimating Software Effort Using RBFN Network IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.7, July 008 37 A New Approach For Estimating Software Using RBFN Network Ch. Satyananda Reddy, P. Sankara Rao, KVSVN Raju,

More information

UNIVERSITY OF BOLTON SCHOOL OF ENGINEERING MS SYSTEMS ENGINEERING AND ENGINEERING MANAGEMENT SEMESTER 1 EXAMINATION 2015/2016 INTELLIGENT SYSTEMS

UNIVERSITY OF BOLTON SCHOOL OF ENGINEERING MS SYSTEMS ENGINEERING AND ENGINEERING MANAGEMENT SEMESTER 1 EXAMINATION 2015/2016 INTELLIGENT SYSTEMS TW72 UNIVERSITY OF BOLTON SCHOOL OF ENGINEERING MS SYSTEMS ENGINEERING AND ENGINEERING MANAGEMENT SEMESTER 1 EXAMINATION 2015/2016 INTELLIGENT SYSTEMS MODULE NO: EEM7010 Date: Monday 11 th January 2016

More information

Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints

Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Michael Bauer, Srinivasan Ravichandran University of Wisconsin-Madison Department of Computer Sciences {bauer, srini}@cs.wisc.edu

More information

Handling of incomplete data sets using ICA and SOM in data mining

Handling of incomplete data sets using ICA and SOM in data mining Neural Comput & Applic (2007) 16: 167 172 DOI 10.1007/s00521-006-0058-6 ORIGINAL ARTICLE Hongyi Peng Æ Siming Zhu Handling of incomplete data sets using ICA and SOM in data mining Received: 2 September

More information

A survey on Data Mining based Intrusion Detection Systems

A survey on Data Mining based Intrusion Detection Systems International Journal of Computer Networks and Communications Security VOL. 2, NO. 12, DECEMBER 2014, 485 490 Available online at: www.ijcncs.org ISSN 2308-9830 A survey on Data Mining based Intrusion

More information

Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach

Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach S. M. Ashraful Kadir 1 and Tazrian Khan 2 1 Scientific Computing, Royal Institute of Technology (KTH), Stockholm, Sweden smakadir@csc.kth.se,

More information

Quality Assessment in Spatial Clustering of Data Mining

Quality Assessment in Spatial Clustering of Data Mining Quality Assessment in Spatial Clustering of Data Mining Azimi, A. and M.R. Delavar Centre of Excellence in Geomatics Engineering and Disaster Management, Dept. of Surveying and Geomatics Engineering, Engineering

More information

Multi scale random field simulation program

Multi scale random field simulation program Multi scale random field simulation program 1.15. 2010 (Updated 12.22.2010) Andrew Seifried, Stanford University Introduction This is a supporting document for the series of Matlab scripts used to perform

More information

International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013 A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:

More information

Coordinated Scan Detection

Coordinated Scan Detection Coordinated Scan Detection Carrie Gates CA Labs, Islandia, NY carrie.gates@ca.com Abstract Coordinated attacks, where the tasks involved in an attack are distributed amongst multiple sources, can be used

More information

Efficiently Managing Firewall Conflicting Policies

Efficiently Managing Firewall Conflicting Policies Efficiently Managing Firewall Conflicting Policies 1 K.Raghavendra swamy, 2 B.Prashant 1 Final M Tech Student, 2 Associate professor, Dept of Computer Science and Engineering 12, Eluru College of Engineeering

More information

Data Mining Project Report. Document Clustering. Meryem Uzun-Per

Data Mining Project Report. Document Clustering. Meryem Uzun-Per Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...

More information

WEB APPLICATION FIREWALL

WEB APPLICATION FIREWALL WEB APPLICATION FIREWALL CS499 : B.Tech Project Final Report by Namit Gupta (Y3188) Abakash Saikia (Y3349) under the supervision of Dr. Dheeraj Sanghi submitted to Department of Computer Science and Engineering

More information

Dual Mechanism to Detect DDOS Attack Priyanka Dembla, Chander Diwaker 2 1 Research Scholar, 2 Assistant Professor

Dual Mechanism to Detect DDOS Attack Priyanka Dembla, Chander Diwaker 2 1 Research Scholar, 2 Assistant Professor International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Engineering, Business and Enterprise

More information

Clustering Technique in Data Mining for Text Documents

Clustering Technique in Data Mining for Text Documents Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Layered Approach of Intrusion Detection System with Efficient Alert Aggregation for Heterogeneous Networks

Layered Approach of Intrusion Detection System with Efficient Alert Aggregation for Heterogeneous Networks Layered Approach of Intrusion Detection System with Efficient Alert Aggregation for Heterogeneous Networks Lohith Raj S N, Shanthi M B, Jitendranath Mungara Abstract Protecting data from the intruders

More information

STANDARDISATION AND CLASSIFICATION OF ALERTS GENERATED BY INTRUSION DETECTION SYSTEMS

STANDARDISATION AND CLASSIFICATION OF ALERTS GENERATED BY INTRUSION DETECTION SYSTEMS STANDARDISATION AND CLASSIFICATION OF ALERTS GENERATED BY INTRUSION DETECTION SYSTEMS Athira A B 1 and Vinod Pathari 2 1 Department of Computer Engineering,National Institute Of Technology Calicut, India

More information

Application of Data Mining Techniques in Intrusion Detection

Application of Data Mining Techniques in Intrusion Detection Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology leiminxuan@sohu.com Abstract: The article introduced the importance of intrusion detection, as well as

More information

Bisecting K-Means for Clustering Web Log data

Bisecting K-Means for Clustering Web Log data Bisecting K-Means for Clustering Web Log data Ruchika R. Patil Department of Computer Technology YCCE Nagpur, India Amreen Khan Department of Computer Technology YCCE Nagpur, India ABSTRACT Web usage mining

More information

Classification of Engineering Consultancy Firms Using Self-Organizing Maps: A Scientific Approach

Classification of Engineering Consultancy Firms Using Self-Organizing Maps: A Scientific Approach International Journal of Civil & Environmental Engineering IJCEE-IJENS Vol:13 No:03 46 Classification of Engineering Consultancy Firms Using Self-Organizing Maps: A Scientific Approach Mansour N. Jadid

More information