FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION



Similar documents
FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION. Abstract

Intrusion Detection Using Data Mining Along Fuzzy Logic and Genetic Algorithms

Development of a Network Intrusion Detection System

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING

Fuzzy Network Profiling for Intrusion Detection

Slow Port Scanning Detection

Intrusion Detection via Machine Learning for SCADA System Protection

System Specification. Author: CMU Team

Hybrid Model For Intrusion Detection System Chapke Prajkta P., Raut A. B.

The Integration of SNORT with K-Means Clustering Algorithm to Detect New Attack

PROFESSIONAL SECURITY SYSTEMS

How To Prevent Network Attacks

A TWO LEVEL ARCHITECTURE USING CONSENSUS METHOD FOR GLOBAL DECISION MAKING AGAINST DDoS ATTACKS

Intrusion Detection System Based Network Using SNORT Signatures And WINPCAP

Fuzzy Network Profiling for Intrusion Detection

Application of Data Mining Techniques in Intrusion Detection

Calculation Algorithm for Network Flow Parameters Entropy in Anomaly Detection

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

HYBRID INTRUSION DETECTION FOR CLUSTER BASED WIRELESS SENSOR NETWORK

IDS Categories. Sensor Types Host-based (HIDS) sensors collect data from hosts for

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Firewalls and Intrusion Detection

On A Network Forensics Model For Information Security

A Survey on Intrusion Detection System with Data Mining Techniques

Role of Anomaly IDS in Network

Firewalls, Tunnels, and Network Intrusion Detection. Firewalls

Firewalls, Tunnels, and Network Intrusion Detection

Network Intrusion Detection Systems. Beyond packet filtering

CSCI 4250/6250 Fall 2015 Computer and Networks Security

Intrusion Detection. Jeffrey J.P. Tsai. Imperial College Press. A Machine Learning Approach. Zhenwei Yu. University of Illinois, Chicago, USA

CS 356 Lecture 17 and 18 Intrusion Detection. Spring 2013

USING GENETIC ALGORITHM IN NETWORK SECURITY

Keywords - Intrusion Detection System, Intrusion Prevention System, Artificial Neural Network, Multi Layer Perceptron, SYN_FLOOD, PING_FLOOD, JPCap

Using Genetic Algorithm for Network Intrusion Detection

A SURVEY ON GENETIC ALGORITHM FOR INTRUSION DETECTION SYSTEM

Artificial Neural Networks for Misuse Detection

Network Monitoring On Large Networks. Yao Chuan Han (TWCERT/CC)

Advancement in Virtualization Based Intrusion Detection System in Cloud Environment

Firewalls. Basic Firewall Concept. Why firewalls? Firewall goals. Two Separable Topics. Firewall Design & Architecture Issues

Intrusion Detection Systems with Correlation Capabilities

Network- vs. Host-based Intrusion Detection

Network Based Intrusion Detection Using Honey pot Deception

Detecting Novel Network Intrusions Using Bayes Estimators

CURRENT STUDIES ON INTRUSION DETECTION SYSTEM, GENETIC ALGORITHM AND FUZZY LOGIC

Implementing Large-Scale Autonomic Server Monitoring Using Process Query Systems. Christopher Roblee Vincent Berk George Cybenko

A Novel Solution on Alert Conflict Resolution Model in Network Management

Intrusion Detection System for Cloud Network Using FC-ANN Algorithm

Introduction of Intrusion Detection Systems

Neural Networks for Intrusion Detection and Its Applications

SURVEY OF INTRUSION DETECTION SYSTEM

Intrusion Detection for Grid and Cloud Computing

Lehrstuhl für Informatik 4 Kommunikation und verteilte Systeme. Firewall

An Anomaly-Based Method for DDoS Attacks Detection using RBF Neural Networks

CYBER ATTACKS EXPLAINED: PACKET CRAFTING

HP Intelligent Management Center v7.1 Network Traffic Analyzer Administrator Guide

Hillstone T-Series Intelligent Next-Generation Firewall Whitepaper: Abnormal Behavior Analysis

Wharf T&T Limited DDoS Mitigation Service Customer Portal User Guide

Network Security Management

Network Monitoring Tool to Identify Malware Infected Computers

Detecting Anomalies in Network Traffic Using Maximum Entropy Estimation

Module II. Internet Security. Chapter 7. Intrusion Detection. Web Security: Theory & Applications. School of Software, Sun Yat-sen University

An Overview of Intrusion Detection System (IDS) along with its Commonly Used Techniques and Classifications

Configuring Personal Firewalls and Understanding IDS. Securing Networks Chapter 3 Part 2 of 4 CA M S Mehta, FCA

A Neural Network Based System for Intrusion Detection and Classification of Attacks

A Biologically Inspired Approach to Network Vulnerability Identification

Intrusion Detection Categories (note supplied by Steve Tonkovich of CAPTUS NETWORKS)

Intrusion Detection. Tianen Liu. May 22, paper will look at different kinds of intrusion detection systems, different ways of

Firewalls. Firewalls. Idea: separate local network from the Internet 2/24/15. Intranet DMZ. Trusted hosts and networks. Firewall.

Computational intelligence in intrusion detection systems

Dual Mechanism to Detect DDOS Attack Priyanka Dembla, Chander Diwaker 2 1 Research Scholar, 2 Assistant Professor

Data Mining Approach in Security Information and Event Management

Efficient Security Alert Management System

IDS / IPS. James E. Thiel S.W.A.T.

A Review on Network Intrusion Detection System Using Open Source Snort

Denial of Service attacks: analysis and countermeasures. Marek Ostaszewski

Network Intrusion Detection System Using Genetic Algorithm and Fuzzy Logic

Firewalls. configuring a sophisticated GNU/Linux firewall involves understanding

On the use of Honeypots for Detecting Cyber Attacks on Industrial Control Networks

CS155 - Firewalls. Simon Cooper <sc@sgi.com> CS155 Firewalls 22 May 2003

Name. Description. Rationale

SANS Top 20 Critical Controls for Effective Cyber Defense

Usage of Netflow in Security and Monitoring of Computer Networks

Rule 4-004M Payment Card Industry (PCI) Monitoring, Logging and Audit (proposed)

Transcription:

FUZZY DATA MINING AND GENETIC ALGORITHMS APPLIED TO INTRUSION DETECTION Susan M. Bridges Bridges@cs.msstate.edu Rayford B. Vaughn vaughn@cs.msstate.edu 23 rd National Information Systems Security Conference October 16-19, 2000

OUTLINE AI and Intrusion Detection Intrusion Detection System Design Fuzzy Logic and Data Mining Fuzzy Association Rules Fuzzy Frequency Episodes Intrusion Detection via Fuzzy Data Mining GA s for System Optimization Summary and Future Work

AI TECHNIQUES AND INTRUSION DETECTION Long history of AI techniques applied to intrusion detection. For example: Rule-Based Expert Systems( Lunt and Jagannathan 1988) State Transition Analysis (Ilgun and Kemmerer 1995) Genetic Algorithms (Me 1998) Inductive Sequential Patterns (Teng, Chen and Lu 1990) Artificial Neural Networks (Debar, Becker, and Siboni 1992) Data mining applied to intrusion detection is an active area of research. Examples include: Lee, Stolfo, and Mok (1998) Barbara, Jajodia, Wu, and Speegle (2000)

UNIQUE FEATURES OF OUR WORK Combines fuzzy logic with data mining Overcomes sharp boundary problems of many systems Reduces false positive errors Can be used for both anomaly detection and misuse detection Includes real-time components Uses genetic algorithms for system optimization

FUZZY LOGIC AND SECURITY Many security-related features are quantitative e.g., temporal statistical measurements (Porras and Valdes 1998; Lee and Stolfo 1998)» SN: number of SYN flags in TCP header during last 2s» FN: number of FIN flags in TCP header during last 2s» RN: number of RST flags in TCP header during last 2s» PN: number of distinct destination ports during last 2s 0.0 1.0 2.0 3.0 4.0 5.0 6.0 Timestamp first 2 seconds second 2 seconds third 2 seconds...

FUZZY LOGIC ALLOWS OVERLAPPING CATEGORIES 1 0.8 1 0.8 0.6 0.6 0.4 0.2 0.4 Low Medium High Low Medium High 0.2 0 0 5 10 15 20 25 30 Destination Ports a. Non-fuzzy sets 0 0 5 10 15 20 25 30 Destination Ports b. Fuzzy sets

FUZZY LOGIC OVERCOMES SHARP BOUNDARY PROBLEMS Medium A B 0 x1 x2 x3 x4... A B LOW MEDIUM HIGH 0 x1 x2 x3 x4... If Medium is the normal pattern, then without fuzzy sets, A & B are both outside of the normal pattern. Fuzzy logic allows degrees of normality.

INTELLIGENT INTRUSION DETECTION MODEL Integration of fuzzy logic with data mining Fuzzy association rules Fuzzy frequency episodes Preliminary architecture Includes both misuse detection and anamoly detection Integrates machine-level and network-level information Optimization using genetic algorithms

Network Traffic or Audit Data (1) Network Traffic or Audit Data (2)... Network Traffic or Audit Data (m) Background Unit Machine Learning Component ( by mining fuzzy association rules and fuzzy frequency episodes ) Core Component Intrusion Detection Module 1 Intrusion Detection Module n+1... Intrusion Detection Module n Experts... Anomaly Detection Intrusion Detection Module n Decision-Making Module Administrator Misuse Detection Server Communication Module Clients Intrusion Detection Sentry 1 Host or Network Device Intrusion Detection Sentry 2 Host or Network Device... Intrusion Detection Sentry s Host or Network Device

MINING FUZZY ASSOCIATION RULES Association rules represent commonly found patterns in data. Association Rule Format: R1: X Y, c, s X and Y are disjoint sets of items s (support) tells how often X and Y co-occur in the data c (confidence) tells how often Y is associated with X. Our system is unique: X and Y are fuzzy variables that take fuzzy sets as values

FUZZY ASSOCIATION RULES Sample Fuzzy Association Rule: { SN=LOW, FN=LOW } { RN=LOW }, c = 0.924, s = 0.49 Interpretation: SN, FN, and RN are fuzzy variables. The pattern { SN=LOW, FN=LOW, RN=LOW } has occurred in 49% of the training cases; When the pattern { SN=LOW, FN=LOW } occurs, there will be 92.4% probability that { RN=LOW } will occur at the same time.

SAMPLE FUZZY FREQUENCY EPISODE RULES { E1: PN=LOW, E2: PN=MEDIUM } { E3:PN=MEDIUM } c = 0.854, s = 0.108, w = 10 seconds E1, E2 and E3 are events occurring within the time window 10 seconds. PN is a fuzzy variable The events occur in the order E1, E2, E3, but there may also be intervening events { PN=LOW, PN=MEDIUM, PN=MEDIUM } has occurred 10.8% in all training cases; When { PN=LOW, PN=MEDIUM } occurs, { PN=MEDIUM } will follow with 85.4% probability.

FUZZY DATA MINING FOR INTRUSION DETECTION Modification of non-fuzzy methods developed by Lee, Stolfo, and Mok (1998) Anomaly Detection Approach Mine a set of fuzzy association rules from data with no anomalies. When given new data, mine fuzzy association rules from this data. Compare the similarity of the sets of rules mined from new data and normal data.

1 0.8 Similarity 0.6 0.4 0.2 0 T1 T2 T3 T4 T5 T6 T7 T8 T9 Similarity 0.773 0.795 0.775 0.564 0.513 0.288 0.1 0.116 0.0804 Test Data Sets Similarities between Training Data Set and Different Test Data Sets by Mining Fuzzy Association Rules on SN, FN, and RN. Training data collected in the afternoon. T1-T3 afternoon T4-T6 evening T7-T9 late night Data source: MSU CS network

1 0.8 Similarity 0.6 0.4 0.2 0 T1 T2 T3 T4 T5 T6 T7 T8 T9 Similarity 0.681 0.883 0.89 0.207 0.171 0.0645 7.37E-05 8.91E-05 1.67E-05 Test Data Sets Similarities between Training Data Set and Different Test Data Sets by Mining Fuzzy Frequency Episodes on PN. Training data collected in the afternoon. T1-T3 afternoon T4-T6 evening T7-T9 late night Data source: MSU CS network

Similarity 1 0.8 0.6 0.4 0.2 0 Baseline Network1 Network3 Similarity 0.744 0.309 0.315 Test Data Sets Similarities between Training Data Set and Different Test Data Sets by Mining Fuzzy Association Rules on SN, FN, and RN Similarity 1 0.8 0.6 0.4 0.2 0 Baseline Network1 Network3 Similarities between Training Data Set and Different Test Data Sets by Mining Fuzzy Frequency Episodes on PN Similarity 0.885 0 0.000155 Testing Data Sets Training data: no intrusions Test data: baseline (no intrusions) network1 (includes simulated IP Spoofing intrusions) network3 (includes simulated port scanning intrusions)

REAL-TIME INTRUSION DETECTION Given a fuzzy episode rule R: { e1,, ek-1 } { ek }, c, s, w, if {e1,, ek-1} has occurred in the current event sequence, then { ek } can be predicted to occur next with confidence of c. If the next event does not match any prediction from the rule set, it will be alarmed as an anomaly. Define anomaly percentage = number of anomalies / number of events

Anomaly Percentage (%) 40% 30% 20% 10% 0% T1' T2' T3' T4' T5' T6' Anomaly Percentage 8.99% 9.55% 7.30% 25.60% 33.71% 32.39% Test Data Sets Anomaly Percentages of Different Test Data Sets in Real-time Intrusion Detection by Mining Fuzzy Frequency Episodes on PN Training Data: No intrusions Test Data: T1 -T3 no intrusions T4 -T6 simulated mscan Experiments and Results

FUZZY VS. NON-FUZZY Comparing the false positive error rates of fuzzy episode rules with non-fuzzy versions for real-time intrusion detection 20% False Positive Error Rate (%) 15% 10% 5% 0% T1' T2' T3' T4' T5' T6' Fuzzy 8.99% 9.55% 7.30% 4.44% 6.67% 8.89% Non-Fuzzy 17.98% 12.92% 15.25% 11.11% 17.78% 11.11% Test Data Sets

USING GENETIC ALGORITHMS FOR OPTIMIZATION Problem with NIDS System uses a fixed set of features for all kinds of situations Fuzzy membership functions must be predefined. Hypothesis Different features may be useful for different classes of intrusion attacks and for different situations. The performance of the system can be improved by using a GA to evolve an optimal set of features and fuzzy membership functions.

GENETIC ALGORITHMS Optimization goals APPROACH Maximize the similarity of rules mined from normal data with baseline rule set Minimize the similarity of rules mined from abnormal data with baseline rule set Parameters to change Features available from audit data Fuzzy membership function parameters

0a' d c' a c b Z S PI X Y + = = = ),, ( ),, ( ),, ( ') ',, ( 1 ') ',, ( 1 2 1 2 0 ),, ( 2 2 d b b Z b d b S b d PI c a S c a Z a c c a c a a c S µ µ µ µ µ µ µ µ µ a 2 c a a + <µ c c a < + µ 2 c<µ µ µ < < b b FUZZY SETS ARE DEFINED BY PARAMETERIZED MEMBERSHIP FUNCTIONS

EXAMPLE RUN Membership functions before and after the GA optimization FN Original Membership Functions FN Learned Membership Functions 1 1 Membership 0.8 0.6 0.4 0.2 Membership 0.8 0.6 0.4 0.2 0 1 6 11 16 21 26 31 low medium high 36 41 46 Number 0 1 6 11 16 21 26 31 36 41 46 Number low medium high

OPTIMIZATION RESULTS Similarity to Normal 1.2 1 0.8 0.6 0.4 0.2 0 baseline network1 network3 pre-selected feature and fuzzy membership functions optimized features and fuzzy membership functions baseline (no intrusions) network1 (includes simulated IP Spoofing intrusions) network3 (includes simulated port scanning intrusions)

FEATURES SELECTED FOR IP SPOOFING AND PORT SCANNING ATTACKS IP Spoofing Port Scanning Feature Selected Luo s Features Source IP, FIN, Data Size, Port number Source IP, Destination IP, Source Port, and Data size. SYN, FIN, RN SYN, FIN, RN

CONCLUSIONS Developed an architecture for integrating machine learning methods with other intrusion detection methods. Extended data mining techniques by integrating fuzzy logic Demonstrated that these methods are superior to their non-fuzzy counterparts. Developed a method for real-time intrusion detection using fuzzy frequency episodes. Used GA s to improve the performance of the system by selecting best set of features and by tuning the fuzzy membership function parameters

Current and Future Work Further work with fuzzy frequency episodes and real-time intrusion detection Using fuzzy logic for data fusion by the decision module Generating misuse modules from association rules Using incremental data mining to deal with drift in normality Investigating intrusion detection in high speed clusters of workstations