A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique



Similar documents
A survey on Data Mining based Intrusion Detection Systems

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

A Survey on Intrusion Detection System with Data Mining Techniques

CHAPTER 1 INTRODUCTION

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Intrusion Detection via Machine Learning for SCADA System Protection

On Entropy in Network Traffic Anomaly Detection

Denial of Service Attack Detection Using Multivariate Correlation Information and Support Vector Machine Classification

Performance Evaluation of Intrusion Detection Systems using ANN

KEITH LEHNERT AND ERIC FRIEDRICH

Feature Subset Selection in Spam Detection

Efficient Security Alert Management System

Hybrid Intrusion Detection System Using K-Means Algorithm

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks

International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: Volume 1 Issue 11 (November 2014)

A Neuro Fuzzy Based Intrusion Detection System for a Cloud Data Center Using Adaptive Learning

Social Media Mining. Data Mining Essentials

Network Intrusion Detection using Semi Supervised Support Vector Machine

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

Intrusion Detection using Artificial Neural Networks with Best Set of Features

How To Detect Denial Of Service Attack On A Network With A Network Traffic Characterization Scheme

HYBRID INTRUSION DETECTION FOR CLUSTER BASED WIRELESS SENSOR NETWORK

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation

Adaptive Anomaly Detection for Network Security

Network Intrusion Detection Using an Improved Competitive Learning Neural Network

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Classifier

How To Cluster

Hybrid Intrusion Detection System Model using Clustering, Classification and Decision Table

Conclusions and Future Directions

A Dynamic Flooding Attack Detection System Based on Different Classification Techniques and Using SNMP MIB Data

Network Intrusion Detection Systems

How To Prevent Network Attacks

Honey Bee Intelligent Model for Network Zero Day Attack Detection

Index Terms: Intrusion Detection System (IDS), Training, Neural Network, anomaly detection, misuse detection.

Intrusion Detection for Mobile Ad Hoc Networks

Intrusion Detection System using Log Files and Reinforcement Learning

An Introduction to Data Mining

STUDY OF IMPLEMENTATION OF INTRUSION DETECTION SYSTEM (IDS) VIA DIFFERENT APPROACHS

Credit Card Fraud Detection Using Self Organised Map

Taxonomy of Intrusion Detection System

False Positives Reduction Techniques in Intrusion Detection Systems-A Review

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Combining Heterogeneous Classifiers for Network Intrusion Detection

The Integration of SNORT with K-Means Clustering Algorithm to Detect New Attack

IDS IN TELECOMMUNICATION NETWORK USING PCA

A New Method for Traffic Forecasting Based on the Data Mining Technology with Artificial Intelligent Algorithms

Support Vector Machines with Clustering for Training with Very Large Datasets

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Hybrid Model For Intrusion Detection System Chapke Prajkta P., Raut A. B.

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Data Mining for Network Intrusion Detection

CLASSIFYING NETWORK TRAFFIC IN THE BIG DATA ERA

A Content based Spam Filtering Using Optical Back Propagation Technique

Prediction of DDoS Attack Scheme

Fuzzy Threat Assessment on Service Availability with Data Fusion Approach

Layered Approach of Intrusion Detection System with Efficient Alert Aggregation for Heterogeneous Networks

A Survey on Intrusion Detection using Data Mining Technique

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 5, SEPTEMBER

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Application of Data Mining Techniques in Intrusion Detection

Detecting Denial of Service Attacks Using Emergent Self-Organizing Maps

Role of Anomaly IDS in Network

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

Development of a Network Intrusion Detection System

An Intrusion Detection System based on Support Vector Machine using Hierarchical Clustering and Genetic Algorithm

A Review on Hybrid Intrusion Detection System using TAN & SVM

ANALYTICS IN BIG DATA ERA

Classification algorithm in Data mining: An Overview

Enhancing Quality of Data using Data Mining Method

Speedy Signature Based Intrusion Detection System Using Finite State Machine and Hashing Techniques

Environmental Remote Sensing GEOG 2021

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

A Secure Intrusion detection system against DDOS attack in Wireless Mobile Ad-hoc Network Abstract

AUTONOMOUS NETWORK SECURITY FOR DETECTION OF NETWORK ATTACKS

Tracking and Recognition in Sports Videos

Multivariate Correlation Analysis Technique BasedonEuclideanDistanceMapfor Network Traffic Characterization

SURVEY OF INTRUSION DETECTION SYSTEM

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

NETWORK INTRUSION DETECTION SYSTEM USING HYBRID CLASSIFICATION MODEL

A Survey on Outlier Detection Techniques for Credit Card Fraud Detection

Neural Networks for Intrusion Detection and Its Applications

A SYSTEM FOR DENIAL OF SERVICE ATTACK DETECTION BASED ON MULTIVARIATE CORRELATION ANALYSIS

Network Traffic Prediction Based on the Wavelet Analysis and Hopfield Neural Network

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Advanced Ensemble Strategies for Polynomial Models

Adaptive Layered Approach using Machine Learning Techniques with Gain Ratio for Intrusion Detection Systems

Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Addressing the Class Imbalance Problem in Medical Datasets

Towards better accuracy for Spam predictions

Machine Learning CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

Data Mining For Intrusion Detection Systems. Monique Wooten. Professor Robila

Data Mining Practical Machine Learning Tools and Techniques

Survey of Data Mining Approach using IDS

An Overview of Knowledge Discovery Database and Data mining Techniques

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)

Keywords - Intrusion Detection System, Intrusion Prevention System, Artificial Neural Network, Multi Layer Perceptron, SYN_FLOOD, PING_FLOOD, JPCap

Transcription:

A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique Aida Parbaleh 1, Dr. Heirsh Soltanpanah 2* 1 Department of Computer Engineering, Islamic Azad University, Sanandaj Branch, Sanandaj, Iran. 2 Department of Industrial Engineering, Islamic Azad University, Sanandaj Branch, Sanandaj, Iran. * Corresponding author: Dr. Heirsh Soltanpanah ABSTRACT "Intrusion" signifies certain acts that are meant to hazard the integration, privacy and availability of a source. An intrusion detection system (IDS) can limit the access of user to the computer by implementation of some specific rules. These rules are based on the knowledge of the expert. In this research a model is suggested that is based on data analysis with the aim of increasing accuracy. This model consists of two layers. At the first one, the clustering algorithm is used with the aim of simplifying input space and decreasing calculations. At the second layer, classification algorithm is used to classify available samples in the cluster. The results show that this method has a better performance on KDD CUP data set in comparison with previous methods..key words: Intrusion detection, intrusion detection system, classification, clustering, training data set, test data set Introduction Intrusion is a set of operations that try to hazard the integration, privacy and availability of a source. An intrusion detection system can limit the access of user to the computer by the implementation of some specific rules. These rules are based on the knowledge of the expert. These experts are people who make attack scenarios. The system identifies all violations and accomplishes necessary steps for stopping attack on the database. In this regard, the problem of intrusion detection in computer security is widely investigated. Intrusion detection systems are divided into two categories based on the host and network. A system based on network can verify network traffic for specific parts of the network and detect any suspicious activity with analyzing them. The most significant variable problems concerned with IDS are [1]: Wideness and variability of attacks: Computer systems are faced with a wide range of attacks. Variability of attacks causes long time Detection. On the other hand, attacks do not have a specified pattern and everyone has a different behavior. So some attacks with unknown pattern are being expected. Likeness of behaviors of some attacks to regular activities behavior in the network: Some of attacks have a pattern similar to the typical pattern of activities in the network. The detection of these attacks is more difficult than others. Methodology of intrusion detection is divided into 3 categories [1], [3] and [4]: 1.1. signature-based detection A signature is a pattern that corresponds to a known attack. In this method, they obtain the patterns of occurred processes and compare each with the existing attack pattern. If it matches, it means that intrusion has occurred, if not the system is in a safe mode. The main advantages of this method are their simplicity and efficiency in detecting known attacks and the main disadvantage is that it cannot detect new attacks. 1.2. anomaly-based detections An unconventional behavior is a behavior that has some deviations from a conventional behavior. Conventional behaviors (normal behaviors) form the profiles. Patterns of these behaviors can be obtained from observing legal activities in a computer system. The profiles can be static or dynamic. Anomalybased detection systems compare ongoing processes behaviors with profiles and will announce intrusion if they observe deviation. Some sources call these methods behavior-based detection. The main advantage of these methods is their ability to detect

a new attack. The other advantage is their independence of the operating system. Their major shortcoming is their production of false alarms and their low performance in case where there is no good example. 1.3. Statement of protocol analysis These methods are based on knowing and tracing the state of protocols. They act like anomaly-based detection methods with one difference and that is related to the profiles. In this method, the profiles are based on standard rules that legislate in the network. In this paper there is a new method which is based on anomaly detection for intrusion detection in system. This model has two layers. In the first layer, the clustering algorithm is used with purpose of simplification of input space and decreasing calculations. In the second one, classification algorithm is used to classify available samples in the cluster. 2. Research Background Li uses a model of signaling game for intrusion detection in wireless networks. In this method, it is assumed that there is a central distributer network which any node of that has IDS but just IDS embedded in central cluster has the task of detection. The above method uses statistical method s techniques for this operation. Tian et al. utilized genetic algorithm [4]. First of all, labeled data set (normal or abnormal samples of network traffic data that can be collected by surveillance tools) is being analyzed and then some primary rules are being legislated. Then these primary rules can be used for the initial population. In the following, it is trying to extract some appropriate legislations for intrusion detection by use of genetic operations. Gosh et al. used forward multilayered perception network and also Elman returning network with the aim of classification of system calls[4]. They reported their results of the DARPA 1999 data set. Their results showed that Elman network has a better performance in comparison with multilayered perception network. This method is for intrusion detection and normal behavior modeling. With respect to the above case, Ghosh et al. used a clustering method based on grid for modeling the normal behaviors [4]. In this method, normal behaviors of users can be observed online. Afterward, based on them, normal patterns are being extracted and they divide into several clusters. A sample will be known as an abnormal one if it doesn t match with any cluster. A recent approach has been illustrated for intrusion detection in systems that use clustering and classifying models together. As an example, Park et al. had used a combination of clustering and classifying models for intrusion detection [5]. At first, with the usage of Fary clustering, divides data samples into several clusters and then tries to classify the samples by using artificial neural network as a classifier. Wang et al. also used a combined model like above one [7]. This model uses two algorithms; decision tree classification algorithm and ian clustering networks. The purpose of this method is reducing the complexity of classification. With clustering at first, it is tried to make some categories of data samples and then each category will be classified independently. Figure 1-2 shows an abstract of expressed methods. 3. Proposed model Combined models which incorporate clustering and classifying cause data reduction and so classifier algorithm can classify the samples better. As an example, SVM classifier algorithm uses relationship optimization. This equation is LaGrange equation. Suggested methods for solving LaGrange equation are sensitive to the number of samples. If there are many samples, they can t get the answer. Due to the above points, in this paper, a combined model will be presented which divides data into several classes and then uses classification algorithm for each category. The proposed model is as follows: Step 1: First of all, the best properties will be selected by feature selection methods. Step 2: Data is being clustered by using the clustering algorithms. Top-down hierarchical algorithm was used in this paper. Step 3: Finally, combination of several classifiers is used for classification of samples in any cluster. Categories used in this paper are as follow: - Support vector machine with Gaussian kernel - Support vector machine with polynomials - Support vector machine with quadratic kernel - tree - Multilayered perception network Figure 1-3 shows the above model. In the following, we will explain all the steps. fig (1-3) proposed model process (structure) 324

3.1. Feature selection Any sample in data set is exhibited by features that express the data set. Feature selection means to select such properties that can determine a sample better and can perform properly in separating that one than another sample. Different ways are suggested for feature selection and we use principle components analysis in this research. Principle component analysis: this method that is considered as a dimension reduction technique tries to combine features with the propose of reducing them. Indeed, this method reduces dimension by dimension transfer. PCA technique is the best way to reduce the data size linearly. It means that by omitting insignificant coefficients obtained from this transformation, missing information is less than other methods. 3.2. Clustering Clustering is one of unsupervised branches of learning. This process is automatic and during that, samples are being divided into some categories that their members are similar to each other. These categories are called cluster. So cluster is a set of objects in which they are similar to each other and they are different from objects in another cluster. We can consider various criteria for similarity. For example, the distance criterion can be used and objects that are near to each other can be put in one cluster. Clustering algorithms includes two categories; hierarchical algorithm and dividing algorithm. Hierarchical algorithms construct clusters gradually (grow like crystals) but dividing algorithms carry out clustering directly [8]. Hierarchical algorithm: is one of the clustering algorithms which with regards to some criteria, analyzes data hierarchically and then, tries to create various clusters by dividing and conquer methods. These algorithms divide into two groups: 1- Aggregation algorithms (down to up): in these algorithms, at first, it is assumed that number of clusters is equal to the number of samples. In each step two or more clusters combine with each other and make a new cluster. 2- Dividing algorithms( up to down ): in these algorithms, first of all, it is assumed that data set has one cluster and then in each step it will be tried to divide one cluster into two. This process carries out repeatedly to satisfy a certain criteria. In the present paper, a kind of hierarchical algorithm is suggested and it was used for clustering data samples. The performance of the algorithm is as follows: Step 1: In the first step, we assume the data set forms a cluster and we call it C1.then we will try to decompose the data. Decomposition is as follows: - We use below equation to solve the cluster center: M = f, Fi,jis determiner of ith feature magnitude in j th sample. - For each sample in cluster we measure its Euclidean distance from center. - Separate the sample which has the maximum distance from center and put it in a new cluster (Cnew) - For each sample in cluster C1, measure its distance from center of the new cluster (Dnew). Then if this distance is smaller than distance from its own center, assign that sample to the new cluster. It means that - IfD < D ThenassignS toc - We check whether breaking cluster causes increasing efficiency or not (evaluation - criteria). At this sta Criteria). At this stage, by using a cluster we can classify the sets C1 and Cnewand measure the accuracy. If the accuracy of the set Cnew is more thanc1, then breaking the two clusters is efficient and we continue this process. Otherwise, we don t divide upper cluster into tow. Classifier used in this research is support vector machine that will be explained later. Step 2: First step is learning phase of the above model, in the second stage, we should evaluate this model. In this stage one sample of total test (unseen) is given to this model. The distance of this sample from every cluster is being measured.then; sample will be attributed to a cluster which has minimum distance. And so this sample is classified by using classification which was trained on that cluster and its normality or offensiveness will be determined 325

3.3. Classification Classification algorithms try to classify samples of a Figure 2-3 shows pseudo code of the proposed model. Algorithm_1 Input: training data {(x, x ),, (x, y )} Output: a set of clusters in which each cluster has a own classifier {(C, h ),, (C, h )} While (satisfaction is true) 1. For each cluster C Divide the cluster C to C, C by calling algorithm_2 1.1. 1.2. Classify C,C, C using SVM 1.3. if accuracy of C, C bigger that C then 1.2. 1. put C, C into a stack and remove C 1.4. if stack is not empty then 1.4.1. C Pop a cluster from stack classification, decision tree and MLP artificial neural network is used with the aim of assortment and obtained boundary for that cluster will be resulted. 3.4. Combination of classifications Algorithm_2---------------------------------------------------------------------------------------- Input: (C ) Output: (C, C ) 1. S Calculate the center of C 2. For all sample (o ) in C do 2.1. d Calculate the distance from S 3. C max ( d ) 4. S Calculate the center of C 5. For all sample in cluster C do 5.1. D Calculate the distance from S 6. If (D <d ) then 6.1. Assign o into C and remove it from C 6.2. S Calculate the center of C 7.Ci1 = C 8.Ci2 = Ci set based on their features. Many classification algorithms had been developed.at the present model, SVM classification algorithm is used for classifying and a standard for breaking cluster. After that the targeted cluster breaks into two, we measure the accuracy of the SVM classifier on the tow separated clusters and the main one. If average size of the obtained accuracy of two separated clusters is more than the main cluster then breaking causes efficiency and we follow the process. Otherwise we stop fraction process. The next usage of SVM classifier is after creation of clusters. After obtaining clusters, samples in Every cluster should be classified. If a cluster s samples have an equal label, so it doesn t need classification and any sample that belongs to this cluster has that label. But if samples in the cluster have different labels( normal tag or attack tag), a combination of support vector machine In this research, two models of combination for combining classifiers were been used; model and selection model. We will illustrate both of them later. 3.4.1. Combining: Combining the output of the classifiers means that the final decision is being influenced by output of every classifiers. Some methods are suggested for this operation that in this research, majority of votes is used and below equation shows it: H(t) = max d. 3.4.2. : 326

means that for a specific data sample just one of the classifiers those participating in Forum section will decide. Indeed, Forum section output is the output of one of classifiers. The key question is that which of classifiers should be selected for a sample. Several methods were expanded that we will express them in the following. There are various methods for selection. In this paper the nearest neighbor mechanism that is more common than other methods is used. The nearest neighbor: in this method, for deciding on a sample, parameter K of the nearest neighbor of the targeted sample ( in the training data set ) is being obtained and then accuracy of any classifiers will be measured on K of the sample. Finally, any classifier that has higher accuracy is being selected for commenting. Neighbors are obtained on the base of the distance criterion.it means that K of any neighbor that has near distance will be measured. 4. Experiments 4.1. Data set In this section KDD CUP data set at 1999 was used for targeted purpose. Of course only some parts of KDD data set were used. This is because the comparison with other methods. Wang et al. after presentation of their method used 18285 samples of KDD data set for comparing their method with others [7]. These samples were selected randomly but they have some features that there are in table 3-4.as it is visible; this subset includes all samples of attack like Probe, R2L and also U2R. But because of large number of Dos samples, gust 10000 samples were selected. Also 3000 normal samples had been selected randomly. Connection Type Normal Dos Probe R2L U2R Training Data Set Table (4-1) tested data set 3000 10.000 4107 1126 52 4.2. Evaluation criteria In this paper several evaluation criteria had been used. They are as follow: Precision standard: equation 1 shows how this criterion can be measured. Precision = TP TP + FP (1) Recall standard: this criterion is correct classification rate of positive samples and equation 2 shows it: Recall = TPrate = TP TP + FN (2) F-measure standard: this standard is used for inclusion the accuracy and TP rate of a single criterion. Equation 3 shows it : F = (1 + β ). (Precision. TPrate) β (3) 2. Precision + TPrate 4.3. Results With the aim of evaluating the purposed method, the above model had been tested on the data set determined in chart 4-1 and based on expressed criteria. Chart 4-2 till 4-6 show obtained results from purposed model with other methods Table(4-2) The results of experiment on Normal data sample. BPNN FC- Precision 91.22 89.22 89.75 91.32 100.0 100.0 Recall 99.41 97.70 98.20 99.08 96.81 96.25 F-value 95.14 93.27 93.79 95.04 98.37 97.64 BPNN FC- Precision 99.84 99.69 99.79 99.91 97.73 98.51 Recall 97.24 96.65 97.20 96.70 100.0 88.87 F-value 98.52 98.15 98.48 98.28 98.37 97.64 327

Table(4-3) The results of experiment on Dos data samples BPNN FC- Precision 50.00 52.61 60.94 48.12 99.29 79.68 Recall 78.13 88.13 88.75 80.00 99.85 79.85 F-value 60.98 65.89 72.26 60.09 99.57 79.77 Table(4-4) The results of experiment on Probe data samples BPNN FC- Precision 33.33 46.15 57.14 93.18 100.0 90.0 Recall 1.43 8.57 5.71 58.57 87.07 37.21 F-value 2.74 14.58 10.39 71.93 92.80 45.89 Table(4-5) The results of experiment on R2l data samples BPNN FC- Precision 50.00 25.00 50.00 83.33 90.0 70.0 Recall 15.38 7.69 23.08 76.92 51.15 36.08 F-value 23.53 11.76 31.58 80.00 65.37 44.28 Table(4-6) The results of experiment on U2R data samples 5. Conclusion In this paper we stated intrusion detection in computer networks and expressed methods for that. It was stated that most of existing methods are based on machine learning. In this research, a combined model based on clustering and classification was presented.this model includes two layers. In the first layer clustering algorithm had been used with purpose of simplifying the input space and also decreasing calculations. In the second layer classification algorithm had been used for classifying existing samples in a cluster. For evaluating this method, a part of KDD CUP data set was used. References [1] A. Patcha and J. Park, An overview of anomaly detection techniques : Existing solutions and latest technological trends, Comput. Networks, vol. 51, pp. 3448 3470, 2007 [2] A.K. Ghosh, C. Michael, M. Schatz, A realtime intrusion detection system based on learning program behavior, in: Proceedings of the Third International Workshop on Recent Advances in Intrusion Detection Toulouse, France, 2000, pp. 93 109. [3] H. Liao, C. R. Lin, Y. Lin, and K. Tung, Intrusion detection system : A comprehensive review, J. Netw. Comput.Appl., vol. 36, no. 1, pp. 16 24, 2013. [4] M. X. Ã, S. H. Ã, B. Tian, and S. Parvin, Anomaly detection in wireless sensor networks : A survey, J. Netw. Comput.Appl., vol. 34, no. 4, pp. 1302 1325, 2011. [5] Park, Nam Hun, Sang Hyun Oh, and Won Suk Lee. "Anomaly intrusion detection by clustering transactional audit streams in a host computer." Information Sciences 180.12 (2010): 2375-2389. [6] S. Shen, Y. Li, H. Xu, and Q. Cao, Signaling game based strategy of intrusion detection in wireless sensor networks, Comput. Math. with Appl., vol. 62, no. 6, pp. 2404 2416, 2011. [7] Wang, Gang, et al. "A new approach to intrusion detection using Artificial Neural Networks and fuzzy clustering." Expert Systems with Applications 37.9, 2010. [8] Xiang, Cheng, Png Chin Yong, and Lim Swee Meng. "Design of multiple-level hybrid classifier for intrusion detection system using ian clustering and decision trees." Pattern Recognition Letters 29.7, 2008. [9] W. Li, Using Genetic Algorithm for Network Intrusion Detection, C.S.G. Department of Energy, 2004, pp. 1 8. 328