Computational intelligence in intrusion detection systems: An introduction to an introduction. Rick Chang @ TEIL
Reference: Shelly Xiaonan Wu, Wolfgang Banzhaf. The use of computational intelligence in intrusion detection systems: A review. Applied Soft Computing, 2009.
Intrusion prevention techniques: firewalls, access control, encryption. Intrusion detection systems (IDS): data collection, data preprocessing, intrusion recognition, reporting, response.
History of IDS. 1987: D.E. Denning proposed an intrusion detection model. Early 1990s: combinations of expert systems and statistical approaches. Late 1990s: automated knowledge acquisition; combination with computational intelligence.
Computational intelligence. J.C. Bezdek (1994): A system is computationally intelligent when it deals with only numerical (low-level) data, has pattern recognition components, and does not use knowledge in the artificial intelligence sense; and additionally when it (begins to) exhibit (i) computational adaptivity, (ii) computational fault tolerance, (iii) speed approaching human-like turnaround, and (iv) error rates that approximate human performance.
Computational intelligence: artificial neural networks, fuzzy sets, evolutionary computation methods, artificial immune systems, swarm intelligence, soft computing.
Roadmap: introduction to intrusion detection systems (IDS); evolutionary computation methods; artificial immune systems; swarm intelligence; discussion.
Intrusion detection system (figure legend). Solid lines: data/control flow. Dashed lines: responses to intrusive activities.
Intrusion detection system. Misuse detection: relies on predefined descriptions of intrusive behaviors; supervised learning; fails easily when facing unknown intrusions. Anomaly detection: hypothesizes that abnormal behavior is rare and different from normal behavior; unsupervised learning; difficulties: deficiency of abnormal samples, adaptation to constantly changing normal behavior.
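The contrast between the two detection styles can be sketched in a few lines. This is a minimal illustration, not a method from the reviewed paper: the signature strings, the single "logins per hour" feature, and the threshold `k` are all assumptions made up for the example.

```python
from statistics import mean, stdev

# Hypothetical signature list for misuse detection (illustrative only).
SIGNATURES = ["/etc/passwd", "' OR 1=1", "../.."]

def misuse_detect(request: str) -> bool:
    """Misuse detection: flag a request containing a predefined signature."""
    return any(sig in request for sig in SIGNATURES)

def fit_normal_profile(samples):
    """Anomaly detection, step 1: summarize normal behavior of one feature."""
    return mean(samples), stdev(samples)

def anomaly_detect(value, profile, k=3.0):
    """Anomaly detection, step 2: flag values far from the normal profile."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma

# Assumed normal data, e.g. logins per hour observed during normal operation.
profile = fit_normal_profile([10, 12, 11, 9, 13, 10, 11])

print(misuse_detect("GET /../../etc/passwd"))  # known pattern is caught
print(misuse_detect("GET /new-exploit"))       # unknown intrusion is missed
print(anomaly_detect(60, profile))             # rare, different behavior is flagged
```

The sketch also shows the weaknesses listed on the slide: the signature list cannot catch what it has never seen, and the normal profile must be refitted whenever normal behavior drifts.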
Evolutionary computation
Evolutionary computation. Genetic algorithms: automatic model structure design, classifiers. Genetic programming: classifiers.
Automatic model structure design. Artificial neural networks need optimal structures. Clustering algorithms need the number of clusters. Use a GA to search for the right structure or parameters.
Classifiers: classification rules, transformation functions. GA: search the parameters. GP: search the functions.
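As a rough sketch of "GA: search the parameters", the code below evolves the single threshold `t` of a toy rule "flag x as an attack if x > t". The data, the operators (tournament selection, arithmetic crossover, Gaussian mutation, elitism), and all numeric settings are assumptions chosen for brevity, not the configuration of any system in the review.

```python
import random

# Assumed toy data: one numeric feature per connection.
NORMAL = [5, 8, 12, 15]
ATTACK = [40, 55, 70, 90]

def accuracy(t):
    """Fitness of the rule 'x > t means attack' on the toy data."""
    correct = sum(x <= t for x in NORMAL) + sum(x > t for x in ATTACK)
    return correct / (len(NORMAL) + len(ATTACK))

def ga_search(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(0, 100) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        pop.sort(key=accuracy, reverse=True)
        history.append(accuracy(pop[0]))
        next_pop = [pop[0]]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 2), key=accuracy)  # tournament selection
            b = max(rng.sample(pop, 2), key=accuracy)
            child = (a + b) / 2          # arithmetic crossover
            child += rng.gauss(0, 5)     # Gaussian mutation
            next_pop.append(min(100.0, max(0.0, child)))
        pop = next_pop
    best = max(pop, key=accuracy)
    history.append(accuracy(best))
    return best, history

best, history = ga_search()
print(best, history[-1])
```

A GP variant would evolve the form of the rule itself (the function), rather than a numeric parameter inside a fixed form.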
Niching and fitness function. Niching techniques are adopted: fitness sharing, crowding, voting, token competition. Fitness function: detection rate, false positive rate, conciseness.
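One way the three fitness ingredients and fitness sharing might be combined is sketched below. The weighted-sum form, the weights `w_dr`, `w_fpr`, `w_len`, and the triangular sharing function with radius `sigma_share` are assumptions for illustration; the slide does not prescribe a specific formula.

```python
def fitness(tp, fn, fp, tn, n_conditions, w_dr=1.0, w_fpr=1.0, w_len=0.01):
    """Reward detection rate, penalize false positives and long rules."""
    dr = tp / (tp + fn) if tp + fn else 0.0    # detection rate
    fpr = fp / (fp + tn) if fp + tn else 0.0   # false positive rate
    return w_dr * dr - w_fpr * fpr - w_len * n_conditions  # conciseness term

def shared_fitness(raw, distances, sigma_share=1.0):
    """Fitness sharing: divide raw fitness by the niche count.

    `distances` holds the distance from this individual to every member
    of the population, including itself (distance 0), so the niche
    count is at least 1.
    """
    niche_count = sum(max(0.0, 1.0 - d / sigma_share) for d in distances)
    return raw / niche_count

raw = fitness(tp=90, fn=10, fp=5, tn=95, n_conditions=3)
print(raw)
print(shared_fitness(raw, [0.0, 0.5, 2.0]))
```

Sharing lowers the fitness of individuals in crowded regions, which keeps the population spread over several niches so that rules for different intrusion types can coexist.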
Challenges: no reasonable termination criterion; niching; distributed EC models; unbalanced data distribution.
Artificial immune system
Human immune system: innate immune system, adaptive immune system.
Innate immune system. Surface barriers: 1. Skin 2. Respiratory tract 3. Gastrointestinal tract 4. Urogenital tract. Internal defenses: 1. Phagocytosis 2. Inflammation 3. Complement 4. Interferon.
Adaptive immune system
[Figure: adaptive immune response. Labels: M, IL-1, T4, IL-2, IL-6; T-cell (helper, killer, suppressor, memory); B-cell (plasma cell, memory, Ig).]
Why do lymphocytes normally not attack the body's own cells? Lymphocytes must mature before entering circulation: B cells in the red bone marrow, T cells in the thymus.
Maturation. To avoid autoimmunity, T cells and B cells must pass a negative selection stage, in which lymphocytes that match self cells are killed. (The mature lymphocytes that survive have not yet encountered antigens.)
Artificial immune system (AIS). Anomaly detection: instead of building a model of normal behavior, an AIS generates non-self (anomalous) patterns from the given normal data.
Negative selection
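A minimal sketch of negative selection over bit strings, assuming the r-contiguous-bits matching rule: candidate detectors are generated at random, and any candidate that matches a self (normal) sample is discarded, mirroring the maturation stage of lymphocytes. The string length, `r`, the detector count, and the example self set are illustrative assumptions.

```python
import random

def matches(detector, sample, r):
    """r-contiguous-bits rule: the strings agree on r consecutive positions."""
    run = 0
    for d, s in zip(detector, sample):
        run = run + 1 if d == s else 0
        if run >= r:
            return True
    return False

def generate_detectors(self_set, n_detectors, length, r, seed=0, max_tries=100000):
    """Keep only random candidates that match no self sample."""
    rng = random.Random(seed)
    detectors, tries = [], 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        cand = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(cand, s, r) for s in self_set):
            detectors.append(cand)  # survived negative selection
    return detectors

def is_anomalous(sample, detectors, r):
    """A sample is non-self if any mature detector matches it."""
    return any(matches(d, sample, r) for d in detectors)

self_set = ["00000000", "00001111"]  # assumed encoding of normal behavior
detectors = generate_detectors(self_set, n_detectors=10, length=8, r=4)
print(detectors)
print([is_anomalous(s, detectors, 4) for s in self_set])
```

By construction no self sample is ever flagged; whether a given non-self string is caught depends on detector coverage, which is the "holes" problem listed among the AIS challenges.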
Self non-self discrimination model
Lifespan model
An evolutionary AIS model. Three stages: gene library evolution, negative selection, clonal selection. Immature detectors, rather than being generated randomly, are created by selecting and rearranging useful genes. The library evolves. Clonal selection detects various intrusions with a limited number of detectors, generates memory detectors, and drives the gene library evolution.
Challenges: fitting to real-world environments; avoiding the scaling problem; detecting and filling holes; estimating the coverage of rule sets; dealing with high-volume, high-dimensional data; adapting to changes in self data; integrating immune responses.
Swarm intelligence
Ant colony optimization. Use ACO to keep track of intruder trails. Identify affected paths of intrusion in a sensor network by investigating the pheromone concentration. Clustering local strategy rules.
Particle swarm optimization. Learn classification rules by divide-and-conquer: use PSO to find the best rule covering the current training set, then remove the covered points.
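The divide-and-conquer loop can be sketched as follows, under heavy simplifying assumptions: each rule is a single interval `[lo, hi]` on one numeric feature, rule quality is "covered attacks minus covered normal points", and the PSO uses textbook inertia and acceleration constants. None of these choices comes from the reviewed systems, and only covered attack points are removed between rounds.

```python
import random

def covers(rule, x):
    lo, hi = rule
    return lo <= x <= hi

def rule_fitness(rule, pos, neg):
    """Covered attack points minus covered normal points."""
    return sum(covers(rule, x) for x in pos) - sum(covers(rule, x) for x in neg)

def pso_best_rule(pos, neg, rng, n_particles=15, iters=40, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO over 2-D positions interpreted as interval endpoints."""
    def fit(p):
        return rule_fitness((min(p), max(p)), pos, neg)
    xs = [[rng.uniform(0, 100), rng.uniform(0, 100)] for _ in range(n_particles)]
    vs = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    gbest = max(pbest, key=fit)[:]
    for _ in range(iters):
        for i, x in enumerate(xs):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - x[d])   # pull to own best
                            + c2 * r2 * (gbest[d] - x[d]))     # pull to swarm best
                x[d] += vs[i][d]
            if fit(x) > fit(pbest[i]):
                pbest[i] = x[:]
                if fit(x) > fit(gbest):
                    gbest = x[:]
    return (min(gbest), max(gbest))

def learn_rules(pos, neg, max_rules=5, seed=0):
    """Find the best rule for the remaining points, remove what it covers, repeat."""
    rng = random.Random(seed)
    remaining, rules = list(pos), []
    while remaining and len(rules) < max_rules:
        rule = pso_best_rule(remaining, neg, rng)
        if not any(covers(rule, x) for x in remaining):
            break  # the swarm found nothing useful; stop
        rules.append(rule)
        remaining = [x for x in remaining if not covers(rule, x)]
    return rules

print(learn_rules(pos=[40, 42, 45, 80, 83], neg=[5, 10, 60, 65]))
```

Because each round only sees the points that earlier rules missed, later rules tend to specialize on the remaining clusters of attacks.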
Discussion
Performance
Research
Challenges. Good benchmark datasets: the existing ones are old and unrealistic. Ability to adapt to constantly changing environments: intrusive behavior, legitimate behavior, systems, networks.
Thanks