ANALYSIS OF PAYLOAD BASED APPLICATION LEVEL NETWORK ANOMALY DETECTION


Like Zhang, Gregory B. White
Department of Computer Science, University of Texas at San Antonio

Abstract

Most network anomaly detection research is based on packet header fields, while the payload is usually discarded. The need to prevent unknown attacks and Internet worms has created a demand for application level network anomaly detection. Experimental results for payload based detection schemes are often misleading. In this paper, we discuss the problems associated with such experimental results. In the first section, a brief review is given of application level anomaly detection research. Several major payload based approaches are introduced in section 2. In section 3, we use the DARPA 99 dataset to evaluate the ALAD mechanism and discuss the problems of using the original DARPA 99 dataset for evaluation. In the fourth section, an improved method is proposed with a focus on detecting payload related attacks. In section 5, we demonstrate how to evaluate a payload based detection mechanism using the DARPA 99 dataset, and compare it with ALAD to demonstrate its advantages.

1. Introduction

Intrusion detection is a common method used by government entities to determine when their network is under attack. Anomaly detection, which attempts to identify attacks based on profiles of normal network activity, is supposed to be able to detect zero-day attacks, as described in [1]. However, it remains far from a practical solution, although it has been proposed since the late 1980s [2]. The most popular choice for today's network intrusion detection systems (NIDS) is still the signature based approach, which relies on signatures of already known attacks or vulnerabilities. This method works well when the specific patterns of an attack can be found, because the attack can then be detected by matching the pattern.
It is much more reliable than anomaly based methods, provided that the attack signature or fingerprint can be identified. However, for new attacks, or mutations of known attacks, whose fingerprints have not been discovered, the signature based approach can miss the attack.

Today, defending against zero-day attacks is becoming increasingly important. Zero-day attacks arise in two scenarios. In the first, the attack is brand new, and it takes some time to find the cause and identify the target. In the second, a vulnerability of a specific system or application is discovered and a patch is later released to fix the problem. However, during the time before the patch is released, systems could already have been compromised, as mentioned in [1]. Obviously, a signature based approach cannot provide effective protection against zero-day attacks, since there is no existing fingerprint when the attack is initially launched. A popular alternative is so-called protocol anomaly detection, which detects any activity that violates known protocols. However, it has no effect on new applications, and may even involve looking into the source code [3].

The signature based approach and protocol anomaly detection can be categorized as misuse detection. Misuse detection works well for existing attacks or systems, but not for unknown ones. In theory, the only approach capable of detecting any attack, regardless of whether it is known or unknown, is anomaly detection, which received early attention in IDS research. Many approaches have been proposed for network anomaly detection, and most of them apply machine learning or data mining techniques to network packets to construct a model of normal network activity.
In 1998, Wenke Lee and others at Columbia University first proposed using rule learning algorithms for host based anomaly detection [4]; they then applied similar methods to network intrusion detection by combining them with other classification and statistical techniques [5]. Other approaches include neural networks, support vector machines, nearest neighbors, and other statistical methods [6] [7] [8]. A detailed comparison of popular data mining based anomaly detection mechanisms can be found in [9].

Although much research has been done, the major problem of anomaly detection is still not solved. Anomaly detection requires a profile that describes normal network activity. Any event that does not conform to the profile is identified as abnormal. However, since network traffic is complicated and varies widely with user actions, it is extremely difficult to generate such a profile. At this moment, there is no reliable method to achieve this goal. All existing anomaly detection mechanisms share the same problems: detection rate and false alarm rate. According to [10], in the DARPA 99 intrusion detection experiment the best system could detect at most half of the attacks, and that figure includes both network and host based IDS. In the experiment described in [9], several popular data mining techniques were applied to network anomaly detection, but most of them could detect only about half, or less than half, of the attacks in the DARPA 99 dataset at a false positive rate of 0.02%. When the detection rate is pushed to 80%, the false positive rate rises to around 1%. That could mean thousands of false alarms per day in a normal traffic situation, which is not acceptable.

In the experiments described in [5][9][10], the major failure occurs when detecting application level attacks. For network level attacks, such as arp poison, SYN flood, teardrop, and others, most anomaly detection algorithms work well, and the detection rate can reach nearly 100% in some cases with low false positives [9]. However, problems occur when application level attacks are involved. In such cases, no anomaly detection method obtains satisfying results. For example, R2L is one of the popular attack classes in the DARPA 99 dataset. It tries to gain local access to a machine by taking advantage of specific vulnerabilities.
A typical example is using a dictionary to exploit weak or misconfigured system security policies. Such attacks almost always happen on the application layer, and experiments have shown that they are extremely difficult for most IDS systems to detect [7].

The reason is not complicated. All of these anomaly detection schemes consider only the packet header fields, such as the destination IP, destination port, and flags, so they perform well when only packet headers are involved in the attack. For application level attacks, which are mostly related to the packet payload, these header-based methods do not work because they do not check the payload at all. For example, a popular overflow attack sends some fields with extremely long arguments (e.g. ps and sendmail). Since the header fields are still valid, and the malicious payloads are filtered out in the detection phase, header-based NIDS consider the packets normal and fail to generate alarms.

In the DARPA 99 dataset, almost half of the attacks actually happen on the application level. In fact, most of today's attacks target the vulnerabilities of specific systems or applications, as mentioned in [11], or run on the application level in multiple steps, such as sshtrojan and crashiis in DARPA 99. If we consider only the packet header information, these attacks contain no malicious activity, because the header fields do not violate any protocol and the attacks do not always generate abnormal network traffic. So we have to depend on the packet payload to defend against such attacks. For signature based NIDS, finding the unique fingerprint of a specific attack is the key issue, which is usually done manually or semi-automatically. However, as noted earlier, the signature based approach has no effect on zero-day attacks, so we need an anomaly detection solution based on the packet payload. Some related research is introduced in section 2.
Since payload based anomaly detection is a fairly new topic, it lacks dedicated benchmarks and evaluation tools. The DARPA 99 dataset is a popular choice for evaluating network anomaly detection mechanisms. It contains various attacks including network probe, U2R (User to Root), R2L (Remote to Local), and DOS. The MIT lab provides 5 weeks of data for experiments. There are 201 attack instances in total, which fall into 58 categories. How many of the attacks are detected and how many false alarms are generated have become the benchmark for most network anomaly detection research. Although such a benchmark is convenient for NIDS research, we found it can be misleading, especially for payload based anomaly detection. If researchers simply focus on improving their results on DARPA 99, they may develop a method that looks good in the experiment but has little practical value. We demonstrate the problem in section 3. Then, in section 4, an improved payload based network anomaly detection method is introduced with a detailed explanation. In section 5, we compare the proposed algorithm with others and discuss why our method actually works better.

2. Related Works

Only in recent years has payload based anomaly detection received more than passing attention in network intrusion detection research. Unlike the header based approach, which can be implemented by applying data mining or machine learning algorithms to the standard packet header fields, the payload does not have any fixed format except for popular protocols such as HTTP or FTP. Even for these protocols, the known information makes up only a small portion of the payload, and the majority of what the payload carries is usually unknown. So the general goal of payload based anomaly detection is to extract as much information as possible from unknown payloads. So far, few solutions have been proposed. The following is an introduction to some current attempts that show advantages in some respects.

2.1 HTTP anomaly detection

An anomaly based method to detect web-based attacks was developed in [13][14][15]. Unlike other IDS techniques, which identify attacks based on packet fields such as source IP, destination IP, and destination port, this method is based only on the packet payload. Since this approach focuses only on HTTP traffic, it can take advantage of the known protocol format to extract useful fields from the HTTP request and then construct associated statistical models. In the earlier approach, described in [13], three properties were used: the request type, the request length, and the payload character distribution. More properties were incorporated in the later implementations in [14][15]. Although this method claims to have 0.06% or fewer false positives when tested on Google and campus networks, it focuses only on HTTP traffic and cannot be adopted for other applications.
So it is almost impossible to compare it with other methods.

2.2 PAYL (Payload-based Anomaly Detection)

Columbia University has been working on anomaly based network intrusion detection since 1997, and their previous effort involved applying data mining techniques to anomaly detection, as in [4][5]. In [16], an approach based on the payload byte distribution was proposed. A profile of the byte frequency distribution and standard deviation of the payload is built during the training phase. Then, in the detection phase, the Mahalanobis distance is used to measure the difference between the incoming data and the profile. The method works well at identifying new application level activities, including malicious executable files and Internet worms [17]. However, the false alarm problem remains when testing with the DARPA 99 dataset under a low false positive rate condition. The researchers claim it could be improved by cooperating with a signature-based approach, but this would only apply to known attacks. Overall, the PAYL approach demonstrates that payload based information is effective for detecting novel attacks, but accuracy is still a problem.

2.3 ALAD

ALAD (Application Level Anomaly Detection) is proposed in [18]. It attempts to extract keywords from the payload and associate them with other information to identify attacks. For any packet, the first word of each line is extracted as a keyword, so there can be multiple keywords per packet. Several pairs of attributes are then created for modeling. Based on the description in [18], most pairs are still built from packet header fields, such as source IP with destination IP, or destination IP with destination port; the keyword is used only in the pair of keyword with destination port. In the training phase, a statistical profile is constructed to record all values of these pairs that occur in the training set. Since the training set is attack free, the field values seen in this period are considered acceptable.
In the detection phase, each incoming packet is compared with each pair's profile. If a difference is found, an anomaly score is assigned and accumulated. Once the anomaly score reaches a certain threshold, an alarm is generated. The keyword extraction approach in ALAD is intuitive, because it tries to analyze the payload without any prior knowledge. However, there are some flaws in its implementation, and it shows many problems in our experiment, which are described in section 3.

The above projects represent the current state of payload based network anomaly detection research. Each has its strengths and weaknesses. The first showed good performance in detecting web based attacks by analyzing HTTP requests, but such an approach is closer to protocol based detection, since it relies on the already known HTTP protocol and cannot be applied to other applications. PAYL demonstrated the capability to detect novel application level attacks using the payload byte distribution, but it still has problems achieving a low false alarm rate. ALAD uses a more intuitive payload keyword approach, but its detection in fact depends on the header fields, and it has a low detection rate for application level attacks.

The DARPA 99 dataset has been used as an important evaluation tool for both PAYL and ALAD. Because ALAD offers source code and the corresponding evaluation code, we were able to perform in-depth experiments. However, we found a problem that distorts the evaluation. ALAD is supposed to analyze the payload and detect payload based attacks, but it does not work as expected, although the experimental result looks promising. This is because the DARPA 99 dataset is intended for general purpose NIDS, not for application level attacks only. The PAYL paper also mentions the same problem [16], but without further detailed discussion. In the next section, we discuss this issue based on the ALAD approach and the DARPA 99 dataset.

3. Evaluating ALAD with DARPA 99

ALAD (Application Level Anomaly Detection) was introduced in [18]. The main idea is to extract the first word of each line in the payload as a keyword. During the training phase, a keyword set is constructed by collecting all keywords that occur. These keywords are then associated with other corresponding properties to construct the profile. For example, in ALAD, keywords are associated with the destination port of the packet, as in 21:220 and 80:GET. Each port usually corresponds to a specific application or protocol, so it should have a limited set of keywords. In the detection phase, if a new keyword is found, ALAD increases the anomaly score. When the anomaly score reaches the threshold, an alarm is generated.

The key issue for application level NIDS is how to analyze the payload. The keyword approach in ALAD showed promise, so we focused on studying how this technique could help detect payload related attacks in the experiment. The source code of ALAD can be found at [19].
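The keyword and destination port pairing described above can be illustrated with a minimal sketch. It is not ALAD's implementation: the function names are hypothetical, and ALAD's anomaly scoring and threshold are replaced here by simply listing the keywords that were never seen on a port during training.

```python
from collections import defaultdict


def line_keywords(payload):
    """Return the first whitespace-delimited word of every non-empty line."""
    keywords = []
    for line in payload.splitlines():
        words = line.split()
        if words:
            keywords.append(words[0])
    return keywords


def train_port_keywords(packets):
    """Collect the set of keywords seen on each destination port.

    `packets` is an iterable of (dst_port, payload) tuples taken from
    attack-free traffic (a hypothetical input format).
    """
    profile = defaultdict(set)
    for dst_port, payload in packets:
        profile[dst_port].update(line_keywords(payload))
    return profile


def novel_keywords(profile, dst_port, payload):
    """Return the keywords of a packet that were never seen on its port."""
    seen = profile.get(dst_port, set())
    return [kw for kw in line_keywords(payload) if kw not in seen]


# Toy training data, invented for illustration only.
training = [
    (21, "220 ftp server ready\r\n"),
    (80, "GET /index.html HTTP/1.0\r\nHost: example.org\r\n"),
]
profile = train_port_keywords(training)
print(novel_keywords(profile, 80, "GET /a HTTP/1.0\r\nHost: x\r\n"))  # []
print(novel_keywords(profile, 80, "XYZZY /a\r\n"))  # ['XYZZY']
```

Because every line contributes a keyword, a single packet can produce several port and keyword pairs, which is the property the method proposed later in this paper deliberately drops.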
We tested ALAD on the DARPA 99 dataset using the same training set and testing set. As indicated by [18] and [10], we use week 3 as the training set, and weeks 4 and 5 as the testing set. The result is shown in table 1.

In table 1, we found some potential problems. ALAD filters out all non-TCP traffic, so it should detect only TCP based attacks. However, arp poison is an attack that sends malicious ARP packets, which should not be detectable by ALAD. Another is smurf, an ICMP based DOS attack, which also should not be detected. So why are these attacks detected? Could it be because the payload keyword is working? When we looked into the results, we found that the detection is largely coincidental. For arp poison, ALAD flags a packet because its source IP, destination IP, and destination port do not match the profile, and the packet came from the same location, at the same time, as the arp poison attack. This is in fact a byproduct of the attacker's activity, possibly because the attacker was also generating some TCP traffic while performing the arp poison attack. Arp poison itself should not generate any TCP communication, and thus should not be detectable. The same thing happens with the smurf attack, which is detected through a malicious payload keyword even though smurf itself does not have a keyword at all. This finding does not negate the fact that ALAD did detect these attacks in the DARPA 99 dataset, but it does indicate that the method does not work well for attacks such as arp poison or smurf, even if it can detect some instances of them. A more careful attacker could avoid detection entirely using the same attack.

Another finding is that although ALAD claims to be an application level detection approach, the experiment shows that most of the detected attacks are still identified from network layer information such as IP addresses or port numbers.
The reason is that those addresses are not contained in the constructed profile, which is a collection of all IP addresses and port numbers seen in the training phase. This is obviously unreasonable, since it means those attacks would not be detected if they came from IP addresses that appear in the training data. Besides, a new IP address should be allowed on any public network. Using IP addresses to detect attacks in the testing set might achieve a good result, but it has no practical use.
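The weakness of address based detection argued above can be shown in a few lines. This is an illustrative sketch only: the addresses are taken from documentation ranges, the function names are hypothetical, and the single membership test stands in for ALAD's richer attribute pairs.

```python
def train_addresses(packets):
    """Record every source address seen in attack-free training traffic."""
    return {src_ip for src_ip, _payload in packets}


def address_alarm(known_sources, src_ip):
    """Raise an alarm only when the source address was never seen in training."""
    return src_ip not in known_sources


# Hypothetical training traffic with a single known client.
known = train_addresses([("192.0.2.10", "GET / HTTP/1.0")])

# An attacker sending from an address that appeared in training is missed.
print(address_alarm(known, "192.0.2.10"))  # False

# A benign first-time visitor triggers a false alarm.
print(address_alarm(known, "198.51.100.7"))  # True
```

Both outcomes are independent of the payload, which is why a good score obtained this way says little about application level detection.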

Attack          FA (false alarms before detected)   ALAD output (as extracted)
arppoison       17                                  # = :80=
back            127                                 # = :80=
casesen         65                                  # = :80=
casesen         127                                 # = :25=
crashiis        127                                 # = :25=
crashiis        211                                 # = :80=
dosnuke         179                                 # = :25=
eject           562                                 # =AS/A/APF To= :20
ffbconfig       35                                  # = :25=
ftpwrite        750                                 # :79=
insidesniffer   37                                  # = :80=
mscan           157                                 # = :80=
netbus          278                                 # = :25=
ntinfoscan      246                                 # = :25=
ntinfoscan      413                                 # = :80=
portsweep       535                                 # :79=
ps              96                                  # =AS/AP/AF To= : =
satan           79                                  # = :80=
sechole         388                                 # = :80=
smurf           637                                 # 23=",identifier,
teardrop        47                                  # = :80=
teardrop        443                                 # = :80=
yaga            127                                 # = :25=

Table 1 Detection Result of ALAD for week 4 and 5

The DARPA 99 dataset contains 201 instances of 58 different kinds of attacks, but not all of them run on the application level or contain payloads. When people mention the detection rate or false alarms, they are usually talking about the detection rate over all 201 attacks. However, this is not accurate for an anomaly detection system that focuses on payload based attacks only. When discussing the accuracy of payload based detection, only the accuracy for attacks that run on the application level and contain payloads should be reported. While it is tempting to use all types of attacks when evaluating systems, for research purposes it is not correct to report a payload based detection approach as less accurate because it cannot detect arp poison or other network layer attacks.

To rate the result of a payload-based approach, the attack instances should include only those running on the application level. Based on the truth table from the MIT Lincoln Lab website, we list the payload-related attacks in table 2. There are a total of 33 types of attacks, most of which are U2R and R2L, with a total of 107 instances. Here we consider a payload-related attack to be any malicious activity running on the application level, even if it has an empty payload.
Using the information in table 2, and comparing with table 1, we found that ALAD detects only 17 payload-related instances. From the previous discussion, we also know that these 17 instances are not actually identified by payload information, but by IP addresses that differ from the training set. Such an approach cannot be accepted as a general application level NIDS mechanism because it depends on the network layer fields.

How to correctly use network layer information for NIDS is not the goal of this research effort. We want to know how the payload keyword performs in the experiment. The ALAD approach has six property combinations. The keyword is used in only one pair: keyword with destination port. To test how it affects the detection result, we removed this pair from the profile and conducted the same experiment again. Surprisingly, we found little difference whether the keyword was used or not. In fact, after removing the keyword and destination port pair, only the smurf attack is no longer detected; all others are the same. This indicates that the keyword as implemented in ALAD does not contribute much to the detection. Even so, the keyword approach is still an intuitive idea. In the next section, we propose another keyword based anomaly detection algorithm.

Attack          Instances #
Apache2         3
Back            4
CrashIIS        8
Mailbomb        4
Teardrop        3
Casesen         3
Eject           2
Ffbconfig       2
Fdformat        3
Loadmodule      3
Perl            4
Ps              4
Sechole         3
Xterm           3
Yaga            4
Framespoofer    1
Ftpwrite        2
Guest           3
Httptunnel      3
Imap            2
Named           3
Ncftp           5
Netbus          3
Netcat          4
Phf             4
Sendmail        2
Sshtrojan       3
Xlock           3
Xsnoop          3
Ntinfoscan      3
Satan           2
Guesstelnet     4
Guessftp        2
Guesspop        1
Anypw           1
Total           107

Table 2 Payload related attacks summary in DARPA 99

4. A keyword based approach

The proposed method builds on the idea of using a keyword, as in ALAD, but it differs in several respects. First, ALAD extracts the first word of each line in the payload as a keyword, so there are multiple keywords in one packet. Our method extracts only the first word of the first line, which usually contains the most important information for application level protocols. Second, ALAD associates the keyword with the destination port, but this proved to be of little use, as shown in section 3. ALAD actually still depends on header fields for detection. Our method is based on the packet payload only, and extracts more information from the payload than the keyword alone. In addition, ALAD and most other approaches arbitrarily select some packet fields for building the profile, while our approach first uses Principal Component Analysis (PCA) to reduce the data dimension and find the fields with the greatest variance. The method is divided into 2 phases, as shown in figures 1 and 2.

4.1 Training phase

[Figure 1 Training Phase. Flow: Extract packet fields -> Get the packet keyword and its value -> Number the keyword -> Perform PCA analysis -> Build profile]

Step 1: Extract packet fields
It is not necessary to extract all of the fields in the packets, such as the TCP flags.
Since we focus on payload-based attacks, only fields that could be related to payload content should be extracted. Here we pick 9 fields: Header Length, IP Version, Packet Length, Source IP, Destination IP, Source Port, Destination Port, Payload Size, and Payload.

Step 2: Packet keyword and its value
Usually, the first line of the payload follows the format of a keyword followed by parameters, such as GET /index.html or EHLO Jupiter.cherry.org. The first word is therefore defined as the keyword, and all subsequent parameters are defined as the corresponding value.

Step 3: Number the keywords
PCA calculates the eigenvectors of a matrix, so it cannot work with characters or strings. We save the unique keywords in an array, and use the corresponding sequence number as the keyword's id.

Step 4: PCA analysis
PCA is a popular technique in image processing, pattern recognition, and data analysis. It is used for data dimension reduction and multivariate analysis. Simply stated, it simplifies a dataset by using a linear transformation to map the original data into a new coordinate system. The greatest variance of the original data lies on the first coordinate of the new system, the second greatest variance on the second coordinate, and so on. Table 3 displays a sample result after applying PCA to the selected packet fields.

Table 3 PCA Results (rows: Header_Len, IP_Version, Packet_Len, Src_IP, Dst_IP, Src_Port, Dst_Port, Payload_Size, Keyword; eigenvector values missing)

Each column in table 3 stands for an eigenvector, and each row stands for a field in the original data. The first eigenvector shows that the most significant variance in the original data lies in the source IP and destination IP. The second eigenvector indicates the same result. However, as discussed in section 3, the IP address cannot be taken as a reliable basis for attack detection. The third and fourth eigenvectors indicate that the source port and destination port carry significant variance. The packet length and payload size stand out in the fifth, seventh, and eighth eigenvectors. The keyword is the most significant field in the sixth eigenvector. The last eigenvector is ignored.

Step 5: Build profile
In the PCA process, we found that the following properties exhibit great variance: source port, destination port, packet length, payload size, and keyword.
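Steps 2 to 4 above can be sketched in plain Python. The paper does not say how PCA was computed or whether the fields were normalized, so this sketch makes its own choices: it uses only two toy features (payload length and keyword id), and it finds just the first principal component by power iteration on the covariance matrix, a substituted technique chosen to avoid external libraries. All function names are hypothetical.

```python
import math


def first_keyword(payload):
    """Split the first payload line into (keyword, value)."""
    lines = payload.splitlines()
    parts = lines[0].split(None, 1) if lines else []
    if not parts:
        return "", ""
    value = parts[1] if len(parts) > 1 else ""
    return parts[0], value


def keyword_ids(keywords):
    """Number unique keywords in order of first appearance (Step 3)."""
    ids = {}
    for kw in keywords:
        ids.setdefault(kw, len(ids))
    return ids


def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Dominant eigenvector of a covariance matrix via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Toy payloads, invented for illustration only.
payloads = ["GET /index.html", "GET /a", "EHLO Jupiter.cherry.org",
            "GET /images/logo.gif"]
ids = keyword_ids(first_keyword(p)[0] for p in payloads)
rows = [[float(len(p)), float(ids[first_keyword(p)[0]])] for p in payloads]
pc1 = first_component(covariance(rows))
print(ids)  # {'GET': 0, 'EHLO': 1}
print(abs(pc1[0]) > abs(pc1[1]))  # True: payload length dominates here
```

On unscaled data the field with the largest numeric range dominates the first component, which is consistent with the IP addresses and port numbers dominating the leading eigenvectors reported in the text.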
Since packet length is the payload size plus the IP header length, and we consider only payload related attacks, packet length is removed from consideration. Thus we have only four parameters related to the payload: source port, destination port, keyword, and payload size. Since a port number is usually associated with a specific protocol, and each protocol has a stable collection of keywords, it is not necessary to relate the port number to the keywords. So we save the corresponding payload size for each keyword in a hash table.

4.2 Detection phase

[Figure 2 Detection Phase. Flow: Network packet -> Preprocessing -> Fields matching? -> Send alarm]

Step 1: Preprocessing
Preprocessing extracts the fields needed for profile matching (keyword and payload size).

Step 2: Profile matching
Matching the profile simply means comparing the property pairs in the profile. Since we saved each keyword and the corresponding payload length in a hash table, we compare the incoming keyword and its payload size with the data in the hash table. If they do not match, an alarm is generated.

5. Experiment and Comparison

The proposed method is tested using the DARPA 99 dataset. Week 3, which is attack free, is used for training, and weeks 4 and 5 are used for testing. Since it is a payload based detection approach, we used only the TCP traffic. Table 4 contains the detected attacks:

Attack Name     Total #   Detected #
PS              4         2
Guesstelnet     4         2
Netbus          3         2
Ntinfoscan      3         2
Teardrop        3         3
CrashIIS        8         5
Yaga            4         3
Casesen         3         1
Sshtrojan       3         1
Eject           2         1
Ftpwrite        2         1
Back            4         1
Ffbconfig       2         1
Netcat          4         1
Fdformat        3         1
Phf             4         1
Satan           2         1
Sechole         3         1
Netcat          4         1

Table 4 Detected Payload Related Attacks

The detection result is compared with ALAD in table 5. In addition to comparing the total attacks detected, as in many similar experiments, we also compare the payload-related attacks detected, that is, the detection rate on application level attacks.

Table 5 Comparison with ALAD (columns: ALAD, Our method; rows: Total Attacks Detected, Payload related Attacks, Overall False Positive Rate; values missing)

The data in table 5 is based on the detection result for the week 4 and 5 DARPA 99 inside network traffic data. There are 107 payload related instances in total, which belong to 33 categories. In the original ALAD, only 17 instances of 13 types were detected, while our method detects 31 instances of 19 types. Because our detection targets only payload related attacks, which mostly belong to U2R or R2L, the detection rate is far better than that of most previous approaches such as those in [9], which indicated that previous anomaly detection methods can detect very few or even no R2L or U2R attacks. Our method has a slightly higher false positive rate, but we found that this is because ALAD misses most application level attacks when they are deeply hidden in the traffic, while our method does not. This can be seen by exploring the detection results per day, as in figure 3.
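As a worked check on the comparison, the payload-related detection rates follow directly from the instance counts quoted in the text (17 and 31 out of 107). These percentages are derived here from those counts; they are not copied from table 5.

```latex
\[
\text{ALAD: } \frac{17}{107} \approx 15.9\%, \qquad
\text{proposed method: } \frac{31}{107} \approx 29.0\%
\]
```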
The higher false positive rate is due to days 1, 2, and 4. On these days there are very few payload related attacks, and they are difficult to detect. ALAD is not able to detect them and send out alarms, so it has very few false alarms. Our method has a better detection mechanism and is capable of detecting these attacks, so it is understandable that it produces a higher number of false alarms. Even so, the false alarms remain acceptable, ranging from 140 to 210 per day. On other days, when payload related attacks are common and ALAD is capable of detecting them, as on days 5, 7, and 8, the false positive rates of our method and ALAD are very close, while our method is almost always able to detect more attacks.

In figure 3, we compare the detection rate for payload related attacks and the corresponding false positive rate of both our method and ALAD over the 9 consecutive days of the DARPA 99 experiment. It supports the above conclusion: when payload related attacks are more common (days 5, 7, and 8), both methods have a similar false positive rate, while our method always has the better detection rate, especially when such attacks are rare and therefore difficult to detect, as on days 1, 2, and 4.
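The profile building and matching steps of section 4 can be sketched as follows. The text says only that the payload size for each keyword is stored in a hash table and compared at detection time, so this sketch assumes one particular reading: the profile keeps the set of payload sizes observed for each keyword, and a packet matches only if its exact size was seen for that keyword. The function names are hypothetical.

```python
from collections import defaultdict


def keyword_and_size(payload):
    """Return (first word of the first line, payload size in bytes)."""
    first_line = payload.split(b"\n", 1)[0]
    words = first_line.split()
    keyword = words[0].decode("latin-1") if words else ""
    return keyword, len(payload)


def build_profile(training_payloads):
    """Training: map each keyword to the payload sizes seen with it."""
    profile = defaultdict(set)
    for payload in training_payloads:
        keyword, size = keyword_and_size(payload)
        profile[keyword].add(size)
    return dict(profile)


def is_anomalous(profile, payload):
    """Detection: alarm when the keyword or its payload size is unseen."""
    keyword, size = keyword_and_size(payload)
    return size not in profile.get(keyword, set())


# Toy attack-free training payloads, invented for illustration only.
training = [b"GET /index.html\r\n", b"EHLO Jupiter.cherry.org\r\n"]
profile = build_profile(training)

print(is_anomalous(profile, b"GET /index.html\r\n"))            # False
print(is_anomalous(profile, b"GET /" + b"A" * 500 + b"\r\n"))   # True
print(is_anomalous(profile, b"XYZZY\r\n"))                      # True
```

Exact size matching is the strictest possible reading and would raise many alarms on real traffic; a size range or tolerance per keyword is an equally plausible interpretation of the description.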

[Figure 3 Comparison with ALAD for 9 days in week 4 and 5. Panels: (a) Detected payload related attacks, (b) False positive rate]

6. Conclusion

In this paper, we discussed the potential problems in evaluating payload based network anomaly detection, and then proposed a keyword based approach. The proposed anomaly detection method focuses on application level attacks. We developed the concept of a keyword for payload related attack detection. By combining the keyword with other information, such as payload length, our method demonstrates reasonable performance in the experiments.

The experiment demonstrated the advantage of extracting useful information from the packet payload for application level network attack detection, but there is much to accomplish in the future. The detection rate under low false alarm conditions is still not satisfying. Since many people have tried applying different algorithms to packet fields, there is not much room for improving the traditional approaches. New directions, however, are worth exploring. As described in this paper, reasonable performance can already be achieved using extracted keywords and the payload length alone. It could be much improved if we can find additional mechanisms to analyze the payload and obtain more useful information. Since many attacks are made up of multiple steps, and each single step looks valid to a NIDS, it is important to associate these isolated steps with one another. Thus it is necessary to start studying session-based detection mechanisms.

References

[1] Levy, E., Approaching Zero, IEEE Security & Privacy Magazine, vol. 2, issue 4, 2004
[2] Denning, D., An Intrusion Detection Model, IEEE Transactions on Software Engineering, vol. 13, 2 (Feb), 1987
[3] D. Wagner and D. Dean, Intrusion Detection via Static Analysis, IEEE Symposium on Security and Privacy, Oakland, California, May 2001
[4] Wenke Lee, Sal Stolfo, and Phil Chan, Learning Patterns from Unix Process Execution Traces for Intrusion Detection, AAAI Workshop: AI Approaches to Fraud Detection and Risk Management, July 1997
[5] Wenke Lee, Sal Stolfo, and Kui Mok, A Data Mining Framework for Building Intrusion Detection Models, Proceedings of the 1999 IEEE Symposium on Security and Privacy, Oakland, CA, May 1999
[6] S. Mukkamala, G. Janoski, A. Sung, Intrusion Detection Using Neural Networks and Support Vector Machines, Proceedings of IEEE International Joint Conference on Neural Networks, Hawaii, May 2002
[7] Ertoz, L., Eilertson, E., Lazarevic, A., Tan, P., Srivastava, J., Kumar, V., Dokas, P., The MINDS - Minnesota Intrusion Detection System, Next Generation Data Mining, MIT Press, 2004
[8] Xin Xu, Xuening Wang, An Adaptive Network Intrusion Detection Method Based on PCA and Support Vector Machines, Proceedings of the 1st International Conference on Advanced Data Mining and Applications (ADMA 05), Wuhan, China, July 22-24, 2005
[9] Lazarevic, A., Ertoz, L., Ozgur, A., Srivastava, J., Kumar, V., A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection, Proceedings of the 3rd SIAM Conference on Data Mining, San Francisco, May
[10] R. Lippmann, et al., The 1999 DARPA Off-Line Intrusion Detection Evaluation, Computer Networks, 34(4), 2000
[11] H. J. Wang, C. Guo, D. R. Simon, and A. Zugenmaier, Shield: Vulnerability-Driven Network Filters for Preventing Known Vulnerability Exploits, ACM SIGCOMM 04, Portland, USA, August 2004
[12] MIT Lincoln Lab, Info. System Tech. Group
[13] C. Kruegel, T. Toth, and E. Kirda, Service Specific Anomaly Detection for Network Intrusion Detection, Proceedings of the 2002 ACM Symposium on Applied Computing (SAC 2002), Madrid, Spain, 2002
[14] C. Kruegel, G. Vigna, Anomaly Detection of Web-based Attacks, Proceedings of the 10th ACM Conference on Computer and Communication Security (CCS 03), Washington, DC, October 2003
[15] Christopher Kruegel, Giovanni Vigna, and W. Robertson, A multi-model approach to the detection of web-based attacks, Computer Networks, vol. 48, no. 5, August 2005
[16] Ke Wang, S. J. Stolfo, Anomalous Payload-based Network Intrusion Detection, Recent Advances in Intrusion Detection, RAID 2004, Sophia Antipolis, France, September 2004
[17] Ke Wang, Gabriela Cretu, Salvatore J. Stolfo, Anomalous Payload-based Worm Detection and Signature Generation, Proceedings of the Eighth International Symposium on Recent Advances in Intrusion Detection (RAID 2005), 2005
[18] Matthew V. Mahoney and Philip K. Chan, Learning Nonstationary Models of Normal Traffic for Detecting Novel Attacks, Proceedings of the 8th International Conference on Knowledge Discovery and Data Mining, 2002
[19] Network Anomaly Intrusion Detection Research at Florida Institute of Technology