Intrusion Detection Systems
Oussama El-Rawas

History and Concepts of IDSs

Overview
- A brief history of Intrusion Detection Systems (IDSs)
- An introduction to Intrusion Detection Systems, including:
  - Models
  - Architecture
- A recent paper on anomaly detection
- Response/Conclusion

Evolution of IDSs
Evolution of IDSs: Definitions
- IDS = Intrusion Detection System.
- Overall, there are two main types of intrusion detection:
  - Network Intrusion Detection (NID)
  - Host-Based Intrusion Detection (HID)

Network Intrusion Detection
- Uses packet sniffers to read and analyze packets exchanged between hosts.
- Usually deployed with broadcast-type networks (TCP/IP).
- Historically weak with:
  - Switched networks
  - Encrypted networks
  - High-speed networks
- Some of these weaknesses have since been tackled.

Host-Based Intrusion Detection
- Monitors the hosts themselves and responds to attacks on them.
- Mainly done through audit logs.
- Helps to combat internal threats: possibly 80% of all intrusions are committed by disgruntled and/or dishonest employees.
Evolution of IDSs
- 1980: The concept is born with Computer Security Threat Monitoring and Surveillance, by James Anderson.
  - Stressed the importance of auditing and audit trails
  - Start of host-based IDSs
- 1983: SRI International (Dr. Dorothy Denning) begins work on a government project.
  - Analyze audit trails
  - Create activity-based user profiles
  - First IDS model: IDES
- 1984: SRI tracks and analyzes audit data on ARPANET.
  - Dr. Denning: An Intrusion Detection Model
  - Revealed the information necessary for commercial IDSs
  - Basis of most work on IDSs
- 1988: University of California Davis, Lawrence Livermore Labs
  - The Haystack Project (US Air Force)
  - Comparing audit data to defined patterns
  - DIDS (Distributed IDS) for client and server tracking
- 1989: Haystack Labs founded.
  - Stalker: a host-based, pattern-matching system that included robust search capabilities to manually and automatically query the audit data
- 1990: Birth of NID.
  - Introduced by UC Davis's Todd Heberlein
  - Developed the Network Security Monitor (NSM)
  - Deployed at major government installations
  - Contributed to DIDS
  - Birth of hybrid intrusion detection
- Early 1990s: Commercial IDS development
  - Haystack: first to market, with Stalker
  - SAIC: Computer Misuse Detection System (CMDS), a host-based IDS
  - Air Force Cryptologic Support Center: Automated Security Measurement System (ASIM)
    - Better scalability and portability compared to other NIDs
    - First to incorporate both hardware and software into an NID
- 1994: ASIM developers form the Wheel Group.
  - NetRanger: first commercially viable NID device
- 1997 and beyond: IDSs gain popularity and revenue.
  - ISS develops an NID called RealSecure
  - Cisco purchases the Wheel Group
  - Centrax Corp.: merger of Haystack Labs and the CMDS team from SAIC
Intrusion Detection Systems
- Regular system behavior includes:
  - Predictable user/process patterns.
  - Command patterns that don't compromise system security.
  - Processes conforming to specification.
- Systems under attack fail to conform to at least one of the above => basis of intrusion detection.
- Based on: D. Denning, An Intrusion Detection Model.

Intrusion Detection Systems: Role
- To detect intrusions that use known and unknown techniques, irrespective of the source.
- To provide timely detection of intrusions.
- To present status data to the user (security officer).
- To be accurate.
Intrusion Detection Systems: Models
- The IDS model is responsible for classifying a sequence of states and actions.
- Different models correspond to the ways a system might not conform to regular behavior:
  - Deviation from usual actions => Anomaly Model
  - Actions that compromise security => Misuse Model
  - Out-of-spec actions from programs => Specification-based Model
- All models can be either static or adaptive.

Intrusion Detection Systems: Anomaly Modeling
- Definition: Anomaly detection analyzes a set of characteristics of the system and compares their behavior with a set of expected values. It reports when the computed statistics do not match the expected measurements.
- Based on statistical models.
- Three possible statistical models are used:
  - Threshold-based
  - Statistical moments
    - Based on statistical data (mean, standard deviation)
    - Adaptive
    - Models behavior
    - Selection of features and statistics is crucial to performance
    - Possible application for neural networks
  - Markov model
    - Needs (non-attack) training data to build a model
    - Describes the probability of transitioning from one state to another
    - A transition to a state of low probability is reported as an anomaly
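The Markov model bullets above can be illustrated with a small sketch. It is only an illustration under assumptions that are not in the slides: the event names, the 0.05 probability threshold, and the function names `train_transitions` and `find_anomalies` are all hypothetical.

```python
from collections import defaultdict


def train_transitions(traces):
    """Estimate P(next event | current event) from attack-free traces."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, nexts in counts.items():
        total = sum(nexts.values())
        probs[current] = {nxt: n / total for nxt, n in nexts.items()}
    return probs


def find_anomalies(trace, probs, threshold=0.05):
    """Return (current, next, probability) for every low-probability transition."""
    anomalies = []
    for current, nxt in zip(trace, trace[1:]):
        # Transitions never seen in training get probability 0.0.
        p = probs.get(current, {}).get(nxt, 0.0)
        if p < threshold:
            anomalies.append((current, nxt, p))
    return anomalies


# Hypothetical attack-free training traces.
training = [
    ["login", "read", "write", "logout"],
    ["login", "read", "read", "logout"],
    ["login", "write", "logout"],
]
model = train_transitions(training)
report = find_anomalies(["login", "read", "chmod", "logout"], model)
print(report)
```

In this toy run the two transitions involving `chmod` are reported, because neither appears in the training data. A real detector would need far more training data and some smoothing for rarely seen transitions.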
Intrusion Detection Systems: Anomaly Modeling
- Examples:
  - IDES
    - Statistics-based
    - Adaptive: biased towards newer statistical information
  - Haystack

Intrusion Detection Systems: Misuse Modeling
- Definition: Misuse detection determines whether a sequence of instructions being executed is known to violate the site security policy. If so, it reports a potential intrusion.
- Applies a rule set to a sequence of instructions. The rule set can be:
  - Static
  - Adaptive: new rules can be added at any time
- Site-centric.
- Example: Kumar and Spafford, IDIOT (Intrusion Detection In Our Time)
  - An event is a change in state
  - Attacks are classified in five ways:
    - Existence: creation of an entity
    - Sequence: sequential events
    - Partial order: several sequences
    - Duration: existence for a certain time
    - Interval: events separated by N units of time
  - Monitors audit logs
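As a rough illustration of the "Sequence" class of misuse rules above, the sketch below checks an audit log for known event sequences. It is a simplification and not how IDIOT itself is implemented; the rule names, event names, and function names are hypothetical.

```python
def contains_sequence(events, pattern):
    """True if the pattern's events occur in order (not necessarily adjacent)."""
    position = 0
    for event in events:
        if position < len(pattern) and event == pattern[position]:
            position += 1
    return position == len(pattern)


# Hypothetical rule set. New rules can be added at any time by adding entries.
RULES = {
    "symlink-race": ["create_symlink", "run_setuid_program", "spawn_shell"],
    "password-file-tamper": ["open_passwd_for_write", "add_user"],
}


def check_audit_log(events, rules=RULES):
    """Return the names of all rules whose sequence appears in the log."""
    return [name for name, pattern in rules.items()
            if contains_sequence(events, pattern)]


# Hypothetical audit log, already reduced to event names.
audit_log = [
    "login", "create_symlink", "list_directory",
    "run_setuid_program", "spawn_shell", "logout",
]
alerts = check_audit_log(audit_log)
print(alerts)
```

Only rules that are already in the rule set can fire, which is why a misuse model detects known attacks but misses unknown ones.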
Intrusion Detection Systems: Specification-based Modeling
- Definition: Specification-based detection determines whether or not a sequence of instructions violates a specification of how a program, or system, should execute. If so, it reports a potential intrusion.
- Relies mainly on system traces.
  - A system trace is a sequence of events during the execution of a system of processes.
- All running programs need to be modeled.
- Specification-based modeling is in its infancy.

Intrusion Detection Systems: Architecture
- Three major parts of the IDS architecture:
  - The Agent
  - The Director
  - The Notifier
- Agent: gathers information on the system from different sources (log files, processes, network) and relays it to the Director.
  - Possible preprocessing
  - Two types of Agents:
    - Host-based
      - Gathers information from local host sources (log file, policy checker)
      - Focused on inside intrusion
    - Network-based
      - Monitors traffic and content through network sniffers
      - Focused on outside intrusion and network-oriented attacks (e.g. DoS)
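Returning to the specification-based definition above, the following sketch checks a system trace against a per-program specification. The specification format (allowed operation and path-prefix pairs), the program name, and the paths are assumptions made for illustration only.

```python
# Hypothetical specification: for each program, the (operation, path prefix)
# pairs it is allowed to perform.
SPEC = {
    "lpd": {
        ("read", "/etc/printcap"),
        ("open", "/var/spool/lpd/"),
        ("write", "/var/spool/lpd/"),
    },
}


def violations(program, trace, spec=SPEC):
    """Return the trace events that fall outside the program's specification."""
    if program not in spec:
        # Every running program needs a model; without one nothing can be checked.
        raise KeyError(f"no specification for {program!r}")
    allowed = spec[program]
    bad = []
    for operation, path in trace:
        permitted = any(operation == op and path.startswith(prefix)
                        for op, prefix in allowed)
        if not permitted:
            bad.append((operation, path))
    return bad


# Hypothetical system trace for one execution of the program.
trace = [
    ("read", "/etc/printcap"),
    ("open", "/var/spool/lpd/job1"),
    ("write", "/etc/passwd"),
]
out_of_spec = violations("lpd", trace)
print(out_of_spec)
```

The write to `/etc/passwd` is reported because the specification does not allow it, whether or not that particular attack has been seen before.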
Intrusion Detection Systems: Architecture
- Director: uses an analysis engine to determine the existence of an attack based on information gathered by the Agents, and relays its result to the Notifier.
  - This is the brain of the IDS
  - Could possibly be adaptive
- Notifier: receives output from the Director and reports it to the user through different means (GUI, log files, email).

[Figure: IDS architecture diagram with the labels Init, IDS, Agent, Director, Notifier]

Accurately Detecting Source Code of Attacks that Increase Privilege, by R.K. Cunningham and C.S. Stevenson
- This paper addresses anomaly detection in software code on Unix-based systems.
- IDSs usually depend on audit logs to detect attacks. This is:
  - Too slow for some attacks
  - Allows the malware to disable the IDS and gain privileges
  => Detect the software before the attack takes place by identifying potentially malicious code.
- Two types of privilege-gaining programs:
  - Give access to an unauthorized user
  - Increase the privileges of an existing user
- The attack is assumed to be launched from the unknowing victim's machine.
- Steps of the attack:
  - Download the malware
  - Compile it on the local machine
  - Execute the attack using the compiled code
- Deals with C and shell script only.
- Most antivirus and IDS software uses:
  - Signature matching
  - Heuristics (to detect the steps taken by malicious code)
- This system uses a three-step approach:
  - Language identification
  - Feature extraction
  - Classifier
- For testing and training, sample malicious code was obtained from open-source projects and hacker sites.

  Language  Set       Files  Attack files
  C         training  5271   496
  C         test      3323   67
  Shell     training  476    119
  Shell     test      650    33

Language Identification
- A rule-based identifier.
- C is identified by:
  - Preprocessor directives
  - Reserved words (that are neither English nor shell)
  - C comments
- Shell is identified by:
  - '#!'
  - Shell comments '#'
  - '$' prefixes
- On a 450 MHz Sparc Ultra 60: 90 μs/KB
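A minimal sketch of such a rule-based identifier is shown below. The regular expressions, the chosen reserved words, and the scoring (counting how many rules of each language fire) are assumptions for illustration; the slides do not give the paper's actual rules.

```python
import re

# Hypothetical C rules: preprocessor directives, a few reserved words that are
# neither English nor shell, and C comments.
C_RULES = [
    re.compile(r"^\s*#\s*(include|define|ifdef|ifndef|endif)\b", re.M),
    re.compile(r"\b(typedef|struct|sizeof|extern)\b"),
    re.compile(r"/\*.*?\*/", re.S),
]

# Hypothetical shell rules: a leading '#!', '#' comments that are not
# preprocessor directives, and '$' prefixes.
SHELL_RULES = [
    re.compile(r"\A#!"),
    re.compile(r"^\s*#(?!\s*(include|define|ifdef|ifndef|endif)\b|!)", re.M),
    re.compile(r"\$\{?\w+"),
]


def identify_language(text):
    """Return 'c', 'shell', or 'unknown' based on which rule set fires more."""
    c_score = sum(1 for rule in C_RULES if rule.search(text))
    shell_score = sum(1 for rule in SHELL_RULES if rule.search(text))
    if c_score > shell_score:
        return "c"
    if shell_score > c_score:
        return "shell"
    return "unknown"


c_sample = (
    "#include <stdio.h>\n"
    "/* demo */\n"
    "int main(void) { struct stat st; return 0; }\n"
)
shell_sample = (
    "#!/bin/sh\n"
    "# add a user\n"
    "echo $HOME\n"
)
print(identify_language(c_sample), identify_language(shell_sample))
```

Files that trigger no rule, or the same number of rules for both languages, are returned as `unknown` in this sketch rather than being forced into one class.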
Feature Extraction
- Features are defined by regular expressions: <feature type, code category, encoding scheme>
- Code categories:
  - Comments
  - Strings
  - Code sans-strings
  - Code
- Encoding schemes:
  - Once
  - Count
  - Normalize

C detector
- 19 features described by regular expressions, including:
  - Creating and deleting file links (link, unlink, rmdir)
  - Getting privileged programs to execute malicious code (embedded)
  - Attack actions (using chown, passwd, ...)
- The features are fed into an MLP neural network (N, 2N, N), where N = 19.
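The <feature type, code category, encoding scheme> triples above can be sketched roughly as follows. The four example features, the simplified comment and string splitting, and the reading of "normalize" as hits divided by the length of the category text are all assumptions; the slides do not define them.

```python
import re

COMMENT_RE = re.compile(r"/\*.*?\*/", re.S)      # C block comments only
STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')     # double-quoted string literals


def split_categories(source):
    """Split C source into the four code categories (simplified)."""
    comments = " ".join(COMMENT_RE.findall(source))
    code = COMMENT_RE.sub(" ", source)
    strings = " ".join(STRING_RE.findall(code))
    code_sans_strings = STRING_RE.sub('""', code)
    return {
        "comments": comments,
        "strings": strings,
        "code": code,
        "code sans-strings": code_sans_strings,
    }


def encode(pattern, text, scheme):
    """Encode the matches of one regular expression as a single number."""
    hits = len(re.findall(pattern, text))
    if scheme == "once":
        return 1.0 if hits else 0.0
    if scheme == "count":
        return float(hits)
    if scheme == "normalize":
        # Assumption: normalize by the length of the category text.
        return hits / max(len(text), 1)
    raise ValueError(f"unknown encoding scheme: {scheme}")


# Hypothetical features: (regular expression, code category, encoding scheme).
FEATURES = [
    (r"\bunlink\s*\(", "code sans-strings", "count"),
    (r"\bmain\s*\(", "code sans-strings", "once"),
    (r"exploit", "comments", "once"),
    (r"/bin/sh", "strings", "once"),
]


def extract_features(source, features=FEATURES):
    categories = split_categories(source)
    return [encode(pattern, categories[category], scheme)
            for pattern, category, scheme in features]


sample = (
    "/* local root exploit */\n"
    "int main(void) {\n"
    '    unlink("/tmp/a"); unlink("/tmp/b");\n'
    '    execl("/bin/sh", "sh", (char *)0);\n'
    "    return 0;\n"
    "}\n"
)
vector = extract_features(sample)
print(vector)
```

The resulting vector has one entry per feature and is the kind of input a classifier such as the MLP would consume.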
- Classifier configurations:
  - Code with comments:
    - Embedded executable
    - Exploit comment
    - Calls to execute a command
    - Presence of main()
    - Local include file
  - Sans-comment:
    - Embedded executable
    - Calls to execute a command
    - Presence of main()
    - Link and system calls
- Near-zero false alarm rate.
- Fast, needing 666 μs/KB:
  - 77% C code analysis
  - 21% file read
  - 2% other

Shell detector
- Adapted the regular expressions from C to shell, and defined additional regular expressions for:
  - Altering environment variables
  - Acquiring a privileged interactive shell
  - Shell code used to create new users
- Backward feature selection determines the features that give the best performance.
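Backward feature selection, as mentioned above, can be sketched as a greedy loop that starts from all features and drops one at a time. This is a generic illustration, not the paper's exact procedure: the feature names and the toy scoring function are hypothetical, and in practice the score would come from training and validating the classifier on each subset.

```python
def backward_select(features, score, min_features=1):
    """Greedy backward elimination.

    Repeatedly drop the single feature whose removal gives the highest score,
    and stop as soon as every possible removal would lower the score.
    """
    current = list(features)
    best = score(current)
    while len(current) > min_features:
        candidates = [[f for f in current if f != drop] for drop in current]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score < best:
            break
        current, best = top, top_score
    return current, best


# Toy stand-in for classifier performance (hypothetical integer weights):
# useful features add to the score, noisy features subtract from it.
WEIGHTS = {
    "embedded_executable": 40,
    "exec_call": 30,
    "has_main": 20,
    "line_count": -5,
    "has_tabs": -10,
}


def toy_score(subset):
    return sum(WEIGHTS[name] for name in subset)


selected, best_score = backward_select(list(WEIGHTS), toy_score)
print(selected, best_score)
```

With this toy score the two noisy features are removed first, and the loop stops when removing any remaining feature would reduce the score.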
- Classifier configurations:
  - With comments:
    - Comment
    - Localhost
    - Chown (increase ownership)
    - Interactive (enter an interactive shell)
    - Shared (create a shared object)
  - Sans-comments:
    - Embedded executable: checks for embedded hex or octal code
    - Subshell: checks for invocation of a subshell
- The classifier is an MLP (N, 2N, N) with:
  - N = 12 for code with comments
  - N = 16 for code without comments
- Shell is more difficult and lengthy to classify than C because of its more complex regular expressions.
- Performance: 1071 μs/KB
  - 74% shell code analyzer
  - 21% reading files
  - 4% other
- The main weakness is that an attacker can add dummy code to modify the features extracted.
References
- Matt Bishop, Computer Security, Pearson Education Inc., 2003.
- Paul Innella, The Evolution of Intrusion Detection Systems, Nov 16 2001, http://www.securityfocus.com/infocus/1514
- R.K. Cunningham and C.S. Stevenson, Accurately Detecting Source Code of Attacks That Increase Privilege.

Questions?