Measuring Intrusion Detection Capability: An Information-Theoretic Approach


Measuring Intrusion Detection Capability: An Information-Theoretic Approach

Guofei Gu, Prahlad Fogla, David Dagon, Wenke Lee (Georgia Institute of Technology, U.S.A.)
Boris Škorić (Philips Research Laboratories, Netherlands)

ABSTRACT

A fundamental problem in intrusion detection is what metric(s) can be used to objectively evaluate an intrusion detection system (IDS) in terms of its ability to correctly classify events as normal or intrusive. Traditional metrics (e.g., true positive rate and false positive rate) measure different aspects, but no single metric seems sufficient to measure the capability of intrusion detection systems. The lack of a single unified metric makes it difficult to fine-tune and evaluate an IDS. In this paper, we provide an in-depth analysis of existing metrics. Specifically, we analyze a typical cost-based scheme [6], and demonstrate that this approach is very confusing and ineffective when the cost factor is not carefully selected. In addition, we provide a novel information-theoretic analysis of IDS and propose a new metric that highly complements cost-based analysis. When examining the intrusion detection process from an information-theoretic point of view, intuitively, we should have less uncertainty about the input (event data) given the IDS output (alarm data). Thus, our new metric, C_ID (Intrusion Detection Capability), is defined as the ratio of the mutual information between the IDS input and output to the entropy of the input. C_ID has the desired properties that: (1) it naturally takes into account all the important aspects of detection capability, i.e., true positive rate, false positive rate, positive predictive value, negative predictive value, and base rate; (2) it objectively provides an intrinsic measure of intrusion detection capability; and (3) it is sensitive to IDS operation parameters such as true positive rate and false positive rate, so it can reflect the effect of even subtle changes to an intrusion detection system. We propose C_ID as the appropriate performance measure to maximize when fine-tuning an IDS. The obtained operation point is the best that can be achieved by the IDS in terms of its intrinsic ability to classify input data. We use numerical examples as well as experiments with actual IDSs on various data sets to show that by using C_ID, we can choose the best (optimal) operating point for an IDS and objectively compare different IDSs.

Categories and Subject Descriptors

C.2.0 [Computer-Communication Networks]: Security and Protection; C.4 [Performance of Systems]: Measurement techniques; H.1.1 [Systems and Information Theory]: Information theory

General Terms

Security, Measurement

Keywords

Intrusion detection, performance measurement, information-theoretic

1. INTRODUCTION

Evaluating intrusion detection systems is a fundamental topic in the field of intrusion detection. In this paper, we limit our focus of evaluation to measuring the effectiveness of an IDS in terms of its ability to correctly classify events as normal or intrusive.
Other important IDS performance objectives, such as economy in resource usage, resilience to stress [20], and the ability to resist attacks directed at the IDS itself [19, 17], are beyond the scope of this paper. Policy-dependent IDS evaluation is also beyond the scope. Measuring the capability of an IDS (to correctly classify events as normal or intrusive) is essential to both practice and research because it enables us to better fine-tune an IDS (selecting the best IDS configuration for an operation environment) and to compare different IDSs. For example, when deploying an anomaly-based IDS, we need to adjust some parameters (e.g., the threshold of deviation from a normal profile) to tune the IDS to an optimal operating point. Each adjustment (setting) is a different configuration. If we can measure the capability of an IDS at these configurations, we can simply choose the configuration that maximizes this capability metric. There are several existing metrics that measure different aspects of intrusion detection systems, but no single metric seems sufficient to objectively measure the capability of intrusion detection systems. The most basic and commonly used metrics are the true positive rate (TP, i.e., the probability that the IDS outputs an alarm when there is an intrusion) and the false positive rate (FP, i.e., the probability that the IDS outputs an alarm when no intrusion occurs). Alternatively, one can use the false negative rate FN = 1 - TP and the true negative rate TN = 1 - FP.

Table 1: Terminology used in this paper. For readability, we will use the terms listed in the leftmost column.

  Term      | Equivalent terms from IDS literature | Meaning
  FP, or α  | P(A|¬I)                              | False positive rate. The chance that there is an alert, A, when there is no intrusion, ¬I.
  TP        | 1 - β, P(A|I)                        | True positive rate (or detection rate). The chance there is an alert, A, when there is an intrusion, I.
  FN, or β  | P(¬A|I)                              | False negative rate. The chance there is no alert, ¬A, when there is an intrusion, I.
  TN        | 1 - α, P(¬A|¬I)                      | True negative rate. The chance there is no alert, ¬A, when there is no intrusion, ¬I.
  PPV       | Bayesian detection rate, P(I|A)      | Positive predictive value. The chance that an intrusion, I, is present when an IDS outputs an alarm, A.
  NPV       | P(¬I|¬A)                             | Negative predictive value. The chance that there is no intrusion, ¬I, when an IDS does not output an alarm, ¬A.
  B         | P(I)                                 | Base rate. The probability that there is an intrusion in the observed audit data.

When we fine-tune an IDS (particularly an anomaly detection system), for example by setting the threshold of deviation from a normal profile, there may be different TP and FP values associated with different IDS operation points (e.g., each with a different threshold). For example, at one configuration, TP = 0.8 and FP = 0.1, while at another configuration, TP = 0.9 and FP = 0.2. If only the metrics TP and FP are given, determining the better operation point is difficult. This naturally motivates us to find a new composite metric. Clearly, both TP and FP need to be considered in this new metric. The question is then how to use these two basic metrics together. A popular approach is to use an ROC (receiver operating characteristic) curve [9] to plot the different TP and FP values associated with different IDS operation points. For example, an ROC curve can show one (operation) point with ⟨TP = 0.99, FP = 0.01⟩ and another with ⟨TP = 0.999, FP = 0.1⟩. An ROC curve shows the relationship between TP and FP, but by itself, it cannot be used to determine the best IDS operation point. ROC curves may be used for comparing IDSs. If the ROC curves of two IDSs do not cross (i.e., one is always above the other), then the IDS with the upper ROC curve is better because, for every FP, it has a higher TP. However, if the curves do cross, then there is no easy way to compare the IDSs. It is not always appropriate to use the area under the ROC curve (AUC) for comparison because it measures all possible operation points of an IDS. One can argue that a comparison should be based on the best operation point of each IDS because, in practice, an IDS is fine-tuned to a particular configuration (e.g., using a particular threshold). One approach to integrating the metrics TP and FP is through cost-based analysis. Essentially, the tradeoff between a true positive and a false positive is considered in terms of cost measures (or estimates) of the damage caused by an intrusion and the inconvenience caused by a false alarm. Gaffney and Ulvila [6] used such an approach to combine ROC curves with cost analysis to compute an expected cost for each IDS operation point. The expected cost can be used to select the best operation point and to compare different IDSs. The quality of cost-based analysis depends on how well the cost estimates reflect reality. However, cost measures in security are often determined subjectively because of the lack of good (risk) analysis models. Thus, cost-based analysis cannot be used to objectively evaluate and compare IDSs. As shown in Section 3, this approach [6] is very confusing and ineffective when the cost factor is not carefully selected.
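To make the terminology in Table 1 concrete: given only the three free parameters B, α, and β, every other quantity in the table follows, with PPV and NPV obtained through Bayes' theorem as discussed in the next paragraphs. The short Python sketch below is ours (it does not appear in the paper) and simply spells out those relationships.

```python
def ids_metrics(B, alpha, beta):
    """Derive the Table 1 quantities from base rate B, FP rate alpha, FN rate beta."""
    TP = 1.0 - beta                        # detection rate, P(A | I)
    TN = 1.0 - alpha                       # P(not A | not I)
    p_alert = B * TP + (1.0 - B) * alpha   # P(A), overall probability of an alert
    PPV = B * TP / p_alert                 # P(I | A), the Bayesian detection rate
    NPV = (1.0 - B) * TN / ((1.0 - B) * TN + B * beta)   # P(not I | not A)
    return {"TP": TP, "TN": TN, "PPV": PPV, "NPV": NPV}

# Example: with a low base rate, even a small alpha drives PPV down sharply.
print(ids_metrics(B=1e-5, alpha=0.01, beta=0.1))
```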
Moreover, cost-based analysis does not provide an intrinsic measure of detection performance (or accuracy). In addition to TP and FP, two other useful metrics are the positive predictive value (PPV), which is the probability of an intrusion when the IDS outputs an alarm, and the negative predictive value (NPV), which is the probability of no intrusion when the IDS does not output an alarm. These metrics are very important from a usability point of view because, ultimately, the IDS alarms are useful to an intrusion response system (or administrative staff) only if the IDS has high PPV and NPV. Both PPV and NPV depend on TP and FP, and are very sensitive to the base rate (B), which is the prior probability of intrusion. Thus, these two metrics can be expressed using Bayes' theorem (PPV is called the Bayesian detection rate [1] in the IDS literature) so that the base rate can be entered as a piece of prior information about the IDS operational environment in the Bayesian equations. Similar to the situation with TP and FP, both PPV and NPV are needed when evaluating an IDS from a usability point of view, and currently there is no objective method to integrate both metrics. We need a single unified metric that takes into account all the important aspects of the detection capability, i.e., TP, FP, PPV, NPV, and B. That is, this metric should incorporate existing metrics because they are all useful in their own right. This metric needs to be objective; that is, it should not depend on any subjective measure. In addition, it needs to be sensitive to IDS operation parameters to facilitate fine-tuning and fine-grained comparison of IDSs. We use TP and FP as surrogates for the IDS operation parameters (e.g., a threshold) because changes to the operation parameters usually result in changes to TP and FP. Although it is difficult or sometimes impossible to control the base rate of an IDS, we still consider it an operation parameter because it is a measure of the environment in which the IDS operates. TP, FP, and B can be measured when

we evaluate an IDS because we have the evaluation data set and should know the ground truth. We propose an information-theoretic measure of the intrusion detection capability. At an abstract level, the purpose of an IDS is to classify the input data (i.e., the events that the IDS monitors) correctly as normal or intrusive. That is, the IDS output (i.e., the alarms) should faithfully reflect the truth about the input (i.e., whether an intrusion really occurs or not). From an information-theoretic point of view, we should have less uncertainty about the input given the IDS output. Thus, our metric, called the Intrusion Detection Capability, or C_ID, is simply the ratio of the mutual information between the IDS input and output to the entropy of the input. Mutual information measures the amount of uncertainty of the input resolved by knowing the IDS output. We normalize it using the entropy (the original uncertainty) of the input. Thus, the ratio provides a normalized measure of the amount of certainty gained by observing IDS outputs. This natural metric incorporates TP, FP, PPV, NPV, and B, and thus provides a unified measure of the detection capability of an IDS. It is also sensitive to TP, FP, and B, so it can reflect the effect of even subtle changes to an intrusion detection system.

This paper makes contributions to both research and practice. We provide an in-depth analysis of existing metrics and a better understanding of their limitations. We examine the intrusion detection process from an information-theoretic point of view and propose a new unified metric for the intrusion detection capability. C_ID is the appropriate performance measure to maximize when fine-tuning an IDS. The obtained operation point is the best that can be achieved by the IDS in terms of its intrinsic ability to classify input data. We use numerical examples as well as experiments with actual IDSs on various data sets to show that by using this metric, we can choose the best (optimal) operating point for an IDS and objectively compare different IDSs. Note that this new metric, C_ID, is not intended to replace existing metrics such as TP and FP. In fact, TP and FP are used as basic inputs to compute C_ID. Thus, C_ID presents a composite (unified) measure and a natural tradeoff between TP and FP. Furthermore, C_ID is just one possible measure for IDS evaluation. It is not meant to replace cost-based analysis; instead, it greatly complements the cost-based approach, particularly in cases where a risk model is unclear or unavailable. Finally, although our measure can be used in other domains, we focus on intrusion detection (specifically network-based intrusion detection) as a motivating example.

The rest of this paper is organized as follows. Section 2 provides an information-theoretic view of the intrusion detection process. After reviewing some essential information theory concepts, we introduce our unified metric of the intrusion detection capability, C_ID. Section 3 analyzes existing metrics and compares them with C_ID. Section 4 describes how C_ID can be used to select the best operation point of an IDS and to compare different IDSs. Section 5 discusses limitations and extensions. Section 6 introduces related work, and Section 7 concludes the paper and discusses future work.

2. AN INFORMATION-THEORETIC VIEW OF INTRUSION DETECTION

Let us revisit the intrusion detection process from an information-theoretic point of view.
At an abstract level, an IDS accepts and analyzes an input data stream and produces alerts that indicate intrusions. Every unit of the input data stream has either an intrusive or a normal status. Thus, we can model the input of an IDS as a random variable X, where X = 1 represents an intrusion and X = 0 represents normal traffic. The output alerts of an IDS are also modeled as a random variable Y, where Y = 1 indicates an alert of an intrusion, and Y = 0 represents no alert from the IDS. We assume here that there is an IDS output (decision) corresponding to each input. The exact encoding of X, Y is related to the unit of the input data stream, which is in fact related to the IDS analysis granularity, or the so-called unit of analysis [15]. For network-based IDSs such as Snort [22], the unit of analysis is a packet. Malicious packets are encoded as X = 1. The IDS examines every packet to classify it as malicious (Y = 1) or normal (Y = 0). There are also IDSs, such as Bro [17], which analyze events based on flows. In this case, a malicious flow is encoded as X = 1, and the output indicates whether this flow contains an attack (Y = 1) or not (Y = 0). An abstract model for intrusion detection is shown in Figure 1.

Figure 1: An abstract model for intrusion detection. The input X (an intrusion with probability p(X = 1) = B, normal with probability p(X = 0) = 1 - B) is mapped to the output Y with transition probabilities FN, FP, 1 - FN, and 1 - FP.

In this model, p(X = 1) is the base rate, i.e., the prior probability of intrusion in the input event data examined by the IDS. We denote it as B. An intrusion event has a probability p(Y = 0 | X = 1) of being considered normal by the IDS. This is the false negative rate (FN), denoted as β. Similarly, a normal event has a probability p(Y = 1 | X = 0) of being misclassified as an intrusion. This is the false positive rate (FP), denoted as α. We will use the notations (B, α, β) throughout this paper. Table 1 lists the terms used in this paper and their meanings. Note that when we evaluate an IDS, we should have an evaluation data set for which we know the ground truth. Thus, once the evaluation data set is given and the tests are run, we should be able to calculate B, α, and β. This model is useful because intrusion detection can then be analyzed from an information-theoretic point of view. We first review a few basic concepts in information theory [3], the building blocks of our proposed metric of the intrusion detection capability.

2.1 Information Theory Background

Definition 1. The entropy (or self-information) H(X) of a discrete random variable X is defined by [3]

H(X) = - \sum_{x} p(x) \log p(x)

This definition is commonly known as the Shannon entropy measure, or the uncertainty of X. A larger value of H(X) indicates that X is more uncertain. We use the convention that 0 log 0 = 0, which is easily justified by continuity because x log x -> 0 as x -> 0. The definition of entropy can be extended to the case of jointly distributed random variables.

Definition 2. If (X, Y) is jointly distributed as p(x, y), then the conditional entropy H(X|Y) is defined as [3]

H(X|Y) = - \sum_{y} \sum_{x} p(x, y) \log p(x|y)    (1)

Conditional entropy is the amount of remaining uncertainty of X after Y is known. We have H(X|Y) = 0 if and only if the value of X is completely determined by the value of Y. Conversely, H(X|Y) = H(X) if and only if X and Y are completely independent. Conditional entropy H(X|Y) has the following property:

H(X|Y) <= H(X)

Definition 3. Consider two random variables X and Y with a joint probability mass function p(x, y) and marginal probability mass functions p(x) and p(y). The mutual information I(X; Y) is defined as [3]

I(X; Y) = \sum_{x} \sum_{y} p(x, y) \log ( p(x, y) / (p(x) p(y)) )

Mutual information tells us the amount of information shared between the two random variables X and Y. Obviously, I(X; Y) = I(Y; X).

Theorem 1. Mutual information and entropy [3]:

I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

This equation shows that we can interpret mutual information as the amount of reduction of uncertainty in X after Y is known, with H(X|Y) being the remaining uncertainty. This theorem shows the relationship between conditional entropy and mutual information. We can also express this relationship in the Venn diagram shown in Figure 2. Here, mutual information I(X; Y) corresponds to the intersection of the information in X with the information in Y. Clearly, I(X; Y) <= H(X).

2.2 C_ID: A New Metric of the Intrusion Detection Capability

Our goal is to define a metric that measures the capability of an IDS to classify the input events correctly. At an abstract level, the purpose of an IDS is to classify the input correctly as normal or intrusive. That is, the IDS output should faithfully reflect the truth about the input (i.e., whether an intrusion occurs or not). From an information-theoretic point of view, we should have less uncertainty about the input, given the IDS output. Mutual information is a proper yardstick because it captures the reduction of the original uncertainty (intrusive or normal) given that we observe the IDS alerts. We propose a new metric, Intrusion Detection Capability, or C_ID, which is simply the ratio of the mutual information between IDS input and output to the entropy of the input.

Figure 2: Relationship between entropy and mutual information. For example, H(X) = I(X; Y) + H(X|Y). On the right, the entropy H(Y) is much larger than H(X). This reflects a likely IDS scenario, where the base rate is very small (close to zero), so H(X) is nearly zero. On the other hand, the IDS may produce quite a few false positives. Thus, H(Y) can be larger than H(X).

Definition 4. Let X be the random variable representing the IDS input and Y the random variable representing the IDS output. The Intrusion Detection Capability is defined as

C_ID = I(X; Y) / H(X)    (2)

As discussed in Section 2.1, mutual information measures the reduction of uncertainty of the input by knowing the IDS output. We normalize it using the entropy (i.e., the original uncertainty) of the input. Thus, C_ID is the ratio of the reduction of uncertainty of the IDS input, given the IDS output.
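To make Definition 4 concrete, the following Python sketch (ours, not part of the paper) computes H(X), I(X; Y), and C_ID for the binary model of Figure 1 directly from the three parameters B, α, and β. Later sketches in this document refer back to this c_id helper.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; 0*log(0) is taken as 0 by convention."""
    return -sum(p * log2(p) for p in probs if p > 0)

def c_id(B, alpha, beta):
    """Intrusion Detection Capability C_ID = I(X;Y)/H(X) for base rate B,
    false positive rate alpha, and false negative rate beta."""
    if B in (0.0, 1.0):
        return 0.0                       # H(X) = 0; the paper defines C_ID = 0 here
    # Joint distribution p(x, y) of the abstract model in Figure 1.
    p = {(1, 1): B * (1 - beta), (1, 0): B * beta,
         (0, 1): (1 - B) * alpha, (0, 0): (1 - B) * (1 - alpha)}
    px = {1: B, 0: 1 - B}
    py = {1: p[(1, 1)] + p[(0, 1)], 0: p[(1, 0)] + p[(0, 0)]}
    mutual_info = sum(v * log2(v / (px[x] * py[y]))
                      for (x, y), v in p.items() if v > 0)
    return mutual_info / entropy(px.values())

# Worked example from later in this section: at base rate 1e-5, cutting FP
# ten-fold (0.1 -> 0.01) raises C_ID far more than cutting FN ten-fold.
print(c_id(1e-5, 0.1, 0.1), c_id(1e-5, 0.01, 0.1), c_id(1e-5, 0.1, 0.01))
```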
Its value range is [0, 1]. Obviously, a larger C_ID value means that the IDS has a better capability of classifying input events accurately. C_ID can also be interpreted in the following way. Consider X as a stochastic binary vector that is the correct assessment of the input data stream S, i.e., the correct indication of whether each stream unit is an intrusion or not. The detection algorithm is a deterministic function acting on S, yielding a bitstring Y that should ideally be identical to X. The IDS has to make correct guesses about the unknown X, based on the input stream S. The actual number of required binary guesses is H(X), the real information content of X. Of these, the number correctly guessed by the IDS is I(X; Y) (see Figure 2 for the intersection of H(X) and H(Y)). Thus, I(X; Y)/H(X) is the fraction of correct guesses. Using the definitions in Section 2.1 and the abstract model of IDS input (X) and output (Y) shown in Figure 1, we can expand C_ID and see that it is a function of three basic variables: the base rate (B), FP (α), and FN (β). When B = 0 or B = 1 (i.e., the input is 100% normal or 100% intrusion), H(X) = 0. We define C_ID = 0 for these two cases. From Figure 3(a), we can see the effect of different base rates on C_ID. In realistic situations in which the base rate (B) is very low, an increase in B will improve C_ID. We should emphasize that the base rate is not normally controlled by an IDS. However, it is an important factor when studying intrusion detection capability. Figure 3(a) also clearly shows that for low base rates, it is better to decrease FP than FN in order to achieve a better C_ID.

Figure 3: Intrusion Detection Capability. (a) Realistic environment (base rate B plotted in log-scale): if the base rate is small, changes in α are more significant than changes in β. (b) The effect of FP (α): for a small, fixed base rate B, even tiny changes in α produce large changes in C_ID. (c) The effect of FN (β): for a small, fixed base rate B, only large changes in β have a significant effect on C_ID. Overall, for a realistic low base rate, C_ID is more sensitive to changes in α than to changes in β.

For example, suppose we have an IDS with a base rate B = 10^-5, FP α = 0.1, and FN β = 0.1. If we decrease the FP from 0.1 to 0.01 (a ten-fold decrease), the C_ID moves from 0.1405 to 0.3053. If we instead decrease the FN from 0.1 to 0.01, the C_ID only moves from about 0.1405 to 0.1778. Thus, for very low base rates, a reduction in FP yields more improvement in intrusion detection capability than the same reduction in FN. This is intuitive as well, if one realizes that both FN and FP are misclassification errors. When the base rate is low, there are many more normal packets that have a chance of being misclassified as false positives. Even a large change in FN may not be very beneficial if few attack packets are at risk of misclassification as false negatives. A formal proof that C_ID is more sensitive to FP than to FN is given in our technical report [8]. We know that in the perfect case where FP = FN = 0, C_ID is always the same (C_ID = 1) because the IDS classifies the events without a mistake. For realistic (low) base rates, the effects of FP and FN are shown in Figures 3(b) and 3(c). C_ID will improve with a decrease in either FP or FN. Note that any reasonable (or "allowable") IDS should have a detection rate greater than its false positive rate (1 - FN > FP). That is, an IDS should do better than random guessing, which has FP = FN = 50%. Thus, when 1 - FN < FP, we define C_ID = 0.

There do exist several other similar metrics based on normalized mutual information in other research areas. For example, in medical image processing, NMI (Normalized Mutual Information [18], defined as NMI = (H(X) + H(Y))/H(X, Y)) is used to compare the similarity of two medical images. In fact, NMI = (H(X) + H(Y))/H(X, Y) = (H(X, Y) + I(X; Y))/H(X, Y) = 1 + I(X; Y)/H(X, Y). It ranges from 1 to 2. For comparison with C_ID, we plot NMI using NMI' = I(X; Y)/H(X, Y) (omitting the constant 1) in Figure 4. We can see from Figure 4(a) that NMI shows a similar trend as C_ID. However, we clearly see that NMI is not sensitive to FN, in that a variation of FN has little effect. For example, when FP = 0.01, if we vary FN from 0.1 to 0.01, NMI remains almost the same. This is because in a realistic IDS operation environment, the base rate is very low (close to zero), indicating that the uncertainty of X is close to zero. Thus, the entropy of X (nearly zero) is far less than the entropy of Y, because the IDS can produce many false positives, as shown on the right side of Figure 2. We have NMI' = I(X; Y)/H(X, Y) = I(X; Y)/(H(X) + H(Y) - I(X; Y)), and H(Y) >> H(X) > I(X; Y). We also know that a change in FN will cause only a very slight change in I(X; Y). (Recall the discussion above, where a low base rate implies there are few attack packets exposed to the risk of being misclassified as false negatives.) Thus, a change in FN actually has very little effect on NMI.
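The insensitivity of NMI (and of the asymmetric variant NAMI discussed next) is easy to check numerically. The sketch below is ours; it recomputes the joint distribution of Figure 1 and reports C_ID, the plotted quantity NMI' = I(X; Y)/H(X, Y), and NAMI = I(X; Y)/H(Y).

```python
from math import log2

def _joint(B, alpha, beta):
    # Joint distribution p(x, y) of the abstract IDS model (Figure 1).
    return {(1, 1): B * (1 - beta), (1, 0): B * beta,
            (0, 1): (1 - B) * alpha, (0, 0): (1 - B) * (1 - alpha)}

def _H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def info_measures(B, alpha, beta):
    p = _joint(B, alpha, beta)
    px = {1: B, 0: 1 - B}
    py = {1: p[(1, 1)] + p[(0, 1)], 0: p[(1, 0)] + p[(0, 0)]}
    I = sum(v * log2(v / (px[x] * py[y])) for (x, y), v in p.items() if v > 0)
    Hx, Hy, Hxy = _H(px.values()), _H(py.values()), _H(p.values())
    return {"C_ID": I / Hx,      # proposed metric
            "NMI'": I / Hxy,     # NMI minus its constant 1, as plotted in Figure 4
            "NAMI": I / Hy}      # asymmetric variant I(X;Y)/H(Y)

# Varying FN ten-fold barely moves NMI'/NAMI at a low base rate, but moves C_ID:
for beta in (0.1, 0.01):
    print(beta, info_measures(B=1e-5, alpha=0.01, beta=beta))
```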
Furthermore, compare the plots in Figure 3(c) with Figure 4(c). For equivalent ranges of FN, the y-axis of the NMI plot in Figure 4 spans roughly 0 to 0.07, while the axis for C_ID spans roughly 0.1 to 0.6. Thus, C_ID is almost an order of magnitude more sensitive to changes in FN than NMI. Similarly, the corresponding FP plots in Figures 3(b) and 4(b) show that C_ID is approximately 10 times as sensitive to equivalent shifts in FP as NMI. For all these reasons, NMI is not a good measure of intrusion detection capability. In other domains, where the relationship H(X) << H(Y) does not apply, NMI may be a suitable metric. NMI is a symmetric measure. There is also an asymmetric measure called NAMI (Normalized Asymmetric Mutual Information) in [23], defined as NAMI = I(X; Y)/H(Y). This metric has the same problem as NMI in that it is relatively insensitive to changes in FN. In realistic IDS scenarios, the base rate is low, and H(X) << H(Y). Accordingly, H(Y) is approximately equal to H(X, Y). Thus, NAMI is approximately equal to NMI', and it is likewise unsuitable as an intrusion detection metric.

3. ANALYSIS AND COMPARISON

This section provides an in-depth analysis of existing IDS metrics and compares them with the new metric C_ID.

3.1 ROC Curve-Based Measurement

An ROC curve shows the relationship between TP and FP, but by itself, it cannot be used to determine the best IDS operation point. ROC curves can sometimes be used for comparing IDSs. If the ROC curves of two IDSs do not cross (i.e., one is always above the other), then the IDS with the top ROC curve is better.

Figure 4: NMI' = I(X; Y)/H(X, Y). Using a realistic base rate B, we plot NMI against changes in α and β. (a) Realistic environment (plotted in log-scale): for the same α values, changes in β produce almost no difference in NMI; the sets of lines almost overlap. (b) The effect of FP (α): for a realistic base rate B, NMI is nearly two orders of magnitude less sensitive to changes in α than C_ID (compare with Figure 3(b)). (c) The effect of FN (β): again, NMI is nearly one order of magnitude less sensitive to changes in β than C_ID (compare with Figure 3(c)). Overall, NMI is far less sensitive than C_ID; note the orders-of-magnitude difference between the scales of this figure and those of Figure 3.

However, if the curves do cross, the area under the ROC curve (AUC) can be used for comparison. This may not be a fair comparison, though, because AUC measures all possible operation points of an IDS, while in practice an IDS is fine-tuned to a particular (optimal) configuration (e.g., using a particular threshold). Gaffney and Ulvila [6] proposed to combine cost-based analysis with ROC curves to compute an expected cost for each IDS operation point. The expected cost can then be used to select the best operation point and to compare different IDSs. They assigned a cost C_α for responding to a false alarm and a cost C_β for every missed attack, and defined the cost ratio as C = C_β / C_α. Using a decision tree model, the expected cost of operating at a given point on the ROC curve is the sum of the products of the probabilities of the IDS alerts and the expected costs conditional on the alerts. This expected cost is given by the following equation:

C_exp = Min{C β B, (1 - α)(1 - B)} + Min{C (1 - β) B, α (1 - B)}    (3)

In a realistic IDS operation environment, the base rate is very low, say 10^-5. The false positive rate α is also very low, say 10^-3 (because most IDSs are tuned to have a very low α), while β may not be as low, say 0.1. Hence, we can reasonably assume B << α < β < 1. If we select a very small C (say, less than α/(B(1 - β))), then

C_exp = C β B + C (1 - β) B = C B

This suggests that, regardless of the false positive and false negative rates, the expected cost metric remains the same, CB! If we choose a very large C (say, larger than 1/B), then the expected cost becomes

C_exp = (1 - α)(1 - B) + α(1 - B) = 1 - B

Again, in this case, the expected cost has nothing to do with α and β. Considering that α and B are much smaller than 1 in realistic situations, we can approximate Eq. (3) as

C_exp = Min{C β B, 1} + Min{C (1 - β) B, α}    (4)

The above equation can be rewritten as

C_exp = C B,         if C B < α/(1 - β)
      = C β B + α,   if α/(1 - β) < C B < 1/β    (5)
      = 1 + α,       if C B > 1/β

From the above analysis, we can see that C is a very important factor in determining the expected cost. However, C is not an objective measure. In fact, in practice, the appropriate value of C is very hard to determine. Furthermore, in [6], Gaffney and Ulvila assumed a stationary cost ratio (C), which may not be appropriate because, in practical situations, the relative cost (or tradeoff) of a false alarm and a missed attack changes as the total number of false alarms and missed attacks changes. To conclude, using ROC curves alone has limitations.
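The dependence of Eq. (3) on the cost ratio C is easy to verify numerically. The following sketch is ours; it shows that for very small or very large C, the expected cost stops distinguishing detectors with different α, exactly as derived in Eq. (5).

```python
def expected_cost(C, B, alpha, beta):
    """Expected cost of the Gaffney-Ulvila decision-tree model, Eq. (3)."""
    return (min(C * beta * B, (1 - alpha) * (1 - B)) +
            min(C * (1 - beta) * B, alpha * (1 - B)))

B, alpha, beta = 1e-5, 1e-3, 0.1
# Compare an IDS against one with a ten-fold lower false positive rate:
for C in (1, 1e2, 1e4, 1e6):
    print(C, expected_cost(C, B, alpha, beta),
             expected_cost(C, B, alpha / 10, beta))
```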
Combining it with cost analysis can be useful, but it involves a subjective parameter that is very hard to estimate, because a good (risk) analysis model is hard to obtain in many cases. Our C_ID, on the other hand, is a very natural and objective metric. Therefore, it provides a very good complement to the cost-based approach.

3.2 Bayesian Detection Rate

The Bayesian detection rate [1] is, in fact, the positive predictive value (PPV), which is the probability of an intrusion when the IDS outputs an alarm. Similarly, the Bayesian negative rate (or negative predictive value, NPV) is the probability of no intrusion when the IDS does not output an alarm. These metrics are very important from a usability point of view because, ultimately, the IDS alarms are useful only if the IDS has high PPV and NPV. Both PPV and NPV depend on TP and FP, and are sensitive to the base rate. They can be expressed using Bayes' theorem so that the base rate can be entered as a piece of prior information about the IDS operational environment in the Bayesian equations. The Bayesian detection rate (PPV) is defined as [1]:

P(I|A) = P(I, A) / P(A) = P(I) P(A|I) / ( P(I) P(A|I) + P(¬I) P(A|¬I) )

Similarly, the Bayesian negative rate (NPV) is

P(¬I|¬A) = (1 - B)(1 - α) / ( (1 - B)(1 - α) + B β )

Clearly, PPV and NPV are functions of the variables B, α, and β. Their relationship is shown in Figure 5. We can see that both PPV and NPV will increase if FP and FN decrease. This is intuitive because lower FP and FN should yield better detection results. Figures 5(a) and 5(b) show that FP actually dominates PPV when the base rate is very low, which indicates that in most operation environments (when B is very low), PPV depends almost entirely on FP. It changes only very slightly with different FN values. For example, when FP = 0.01, if we vary FN from 0.1 to 0.01, PPV remains almost the same. This shows that PPV is not sensitive to FN. Figure 5(c) shows that PPV is not as sensitive to FN as C_ID. Similarly, Figures 5(d), 5(e), and 5(f) show that NPV is not sensitive to FP and FN. To conclude, both PPV and NPV are useful for evaluating an IDS from a usability point of view. However, similar to the situation with TP and FP, there is no existing objective method to integrate these metrics. On the other hand, H(X|Y) can be expanded as

H(X|Y) = - B(1 - β) \log PPV - B β \log (1 - NPV) - (1 - B)(1 - α) \log NPV - (1 - B) α \log (1 - PPV)

We can see that our new metric C_ID incorporates both PPV and NPV in measuring the intrusion detection capability. C_ID, in fact, unifies all existing commonly used metrics, i.e., TP, FP, PPV, and NPV. It also factors in the base rate, a measure of the IDS operation environment.

3.3 Sensitivity Analysis

We have already seen one important advantage of C_ID over existing metrics: it is a single unified metric, intuitive and appealing, with a grounding in information theory. In this section, we analyze in depth why C_ID is more sensitive than traditional measures in realistic situations (i.e., where the base rate is low). IDS design and deployment often result in slight changes in these parameters. For example, when fine-tuning an IDS (e.g., setting a threshold), different operation points have different TP and FP. Being sensitive means that C_ID can be used to measure even slight improvements to an IDS. PPV and NPV, on the other hand, require more dramatic improvements to an IDS to yield measurable differences. Similarly, C_ID provides a fairer comparison of two IDSs because, for example, a slightly better FN actually shows more of an improvement in capability than in PPV. In short, C_ID is a more precise metric. The scales of PPV, NPV, and C_ID are all the same, i.e., from 0 to 1. This provides a fair setting in which to test their sensitivity. To investigate how much more sensitive C_ID is compared to PPV and NPV, we can perform a differential analysis with respect to the base rate B, false positive rate FP, and false negative rate FN, to study the effect of changing these parameters on PPV, NPV, and C_ID. We can assume that B and α are much smaller than 1, i.e., for most IDSs and their operation environments, the base rate and false positive rate are very low. Approximate derivatives and detailed steps are shown in our technical report [8]. Note that although we originally plot Figure 6 according to these approximate expressions, in which we simplify using small B and α, we obtain almost the same figures when we numerically evaluate the derivatives of PPV, NPV, and C_ID without any simplification on B and α. Figure 6 shows the derivatives (in absolute value) for the different metrics.
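The derivative comparison of Figure 6 can be approximated with finite differences, without the closed-form expressions of the technical report [8]. The sketch below is ours and assumes the c_id helper from the Section 2.2 sketch is in scope.

```python
def sensitivity(metric, B, alpha, beta, which, eps=1e-8):
    """Finite-difference |d metric / d which| evaluated at (B, alpha, beta)."""
    args = {"B": B, "alpha": alpha, "beta": beta}
    lo, hi = dict(args), dict(args)
    lo[which] -= eps
    hi[which] += eps
    return abs(metric(**hi) - metric(**lo)) / (2 * eps)

def ppv(B, alpha, beta):
    return B * (1 - beta) / (B * (1 - beta) + (1 - B) * alpha)

def npv(B, alpha, beta):
    return (1 - B) * (1 - alpha) / ((1 - B) * (1 - alpha) + B * beta)

# Compare sensitivities of PPV, NPV, and C_ID at a low base rate and low FP.
for which in ("B", "alpha", "beta"):
    print(which, [round(sensitivity(m, 1e-4, 1e-3, 0.1, which), 3)
                  for m in (ppv, npv, c_id)])
```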
We only need to look at the absolute value of each derivative: a larger derivative value indicates more sensitivity to changes. For example, in Figure 6(a), a change in the base rate results in only a slight change in NPV. PPV improves with the change, but not as much as C_ID. Clearly, from Figure 6, we can see that C_ID is more sensitive to changes in B, FP, and FN than PPV and NPV. For small base rates and false negative rates, PPV is more sensitive to changes in the base rate than to changes in FP, and it is least sensitive to FN. Given the same base rate and FP, a change in FN has a very small effect on PPV, implying that for a large difference in FN but a small difference in FP, the IDS with the smaller FP will have the better PPV. For example, suppose we have two IDSs with the same base rate, where IDS 1 has FP = 0.2% and FN = 10% while IDS 2 has FP = 0.1% and FN = 30%. Although IDS 1 has a far lower FN (10% versus 30%) and only a slightly higher FP (0.2% > 0.1%), its PPV is still lower than that of IDS 2. On the other hand, its C_ID is greater than that of IDS 2. NPV, conversely, is more sensitive to B and FN and does not change much when FP changes. This implies that for a large difference in FP but a small difference in FN, the IDS with the smaller FN will have the better NPV. For example, consider two IDSs with the same base rate, where IDS 1 has FP = 0.1% and FN = 2% while IDS 2 has FP = 2% and FN = 1%. Although IDS 1 has a far lower FP (0.1% versus 2%) and only a slightly higher FN (2% > 1%), its NPV is still lower than that of IDS 2. On the other hand, its C_ID is greater than that of IDS 2. Hence, C_ID is a more precise and sensitive measure than PPV and NPV.

4. PERFORMANCE MEASUREMENT USING C_ID

4.1 Selection of the Optimal Operating Point

C_ID factors in all existing measurements, i.e., B, FP, FN, PPV, and NPV, and is the appropriate performance measure to maximize when fine-tuning an IDS (so as to select the best IDS operation point). The obtained operation point is the best that can be achieved by the IDS in terms of its intrinsic ability to classify input data. For anomaly detection systems, we can change some threshold in the detection algorithm to obtain different corresponding FP and FN values and create an ROC curve. To obtain the optimal operating point, we calculate the corresponding C_ID for every point on the ROC curve. We then select the point which gives the highest C_ID, and the threshold corresponding to this point is the optimal threshold to use in practice. We first give a numerical example.

Figure 5: Positive and negative predictive values. These plots, similar to those in Figure 4, show that PPV and NPV are not sensitive measures when the base rate is low. In (a), changes in β (for the same α values) have nearly no effect on PPV. In (b), for a low base rate, changes in α have only a small effect on PPV. The insensitivity of PPV is also seen in (c), where changes in β do not result in large changes in PPV. The same is true for NPV in panels (d), (e), and (f), which show that changes in α and β do not significantly affect NPV.

We take the two ROC examples from [6]. These two intrusion detectors, denoted as IDS 1 and IDS 2, have ROC curves that were determined from data in the 1998 DARPA off-line intrusion detection evaluation [7]. We do not address how these ROC curves were obtained; we merely use them to demonstrate how one selects an optimal operating point using C_ID. As in [6], each ROC curve is approximated by an exponential curve fitted to the evaluation data; we use the same fitted curves as [6]. For both IDSs, the base rate is B = 43/66,000. From these two ROC curves, we can obtain their corresponding C_ID curves, shown in Figure 7. We can see that IDS 1 achieves its best C_ID (approximately 0.4557) at a small false positive rate, corresponding to a detection rate of 0.687. Therefore, this point (with the corresponding threshold) is the optimal operating point for IDS 1. The optimal operating point for IDS 2 has a detection rate of approximately 0.47, and the corresponding maximized C_ID is approximately 0.243. Thus, to set the optimal threshold, one merely has to calculate C_ID for each known point (i.e., for its TP and FP) on the ROC curve and then select the maximum.

4.2 Comparison of Different IDSs

Once we obtain the maximized C_ID for each IDS, we can compare these values to tell which IDS has the better intrusion detection capability. For example, in the previous section, IDS 1 is clearly better than IDS 2 because it has a higher C_ID. Granted, in this case, IDS 1 and IDS 2 can be easily compared just from their ROC curves. However, in many cases, comparing ROC curves is not straightforward, particularly when the curves cross. Consider another simple numerical example with data taken from [12]. We compare two IDSs that have only single-point ROC curves (for PROBE attacks): IDS 1 has FP = 1/66,000 and TP = 0.88, while IDS 2 has FP = 7/66,000 and TP = 0.97; the base rate is the fraction of attack units among all units, on the order of 10^-4. We note that these single-point curves were critiqued in [15], but here we use them merely as a simple numerical example of how C_ID might compare two IDSs. IDS 1 has C_ID of approximately 0.839, and IDS 2 has C_ID of approximately 0.888. Thus, IDS 2 is a little better than IDS 1. Reaching this same conclusion using just the ROC points in [12] is not obvious. The relative C_ID ranking of different IDSs is fairly stable even if the base rate in realistic situations changes a little. This can be seen from Figure 3(a): the four curves do not intersect within the base rate range from 10^-7 to 10^-1.
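The selection procedure of Section 4.1 is straightforward to automate: walk the measured ROC points (or sweep the threshold), compute C_ID at each point, and keep the maximum. The sketch below is ours; it assumes the c_id helper from Section 2.2, and the ROC points and base rate are hypothetical placeholders rather than values from [6] or [12].

```python
def best_operating_point(roc_points, B):
    """roc_points: iterable of (threshold, alpha, beta) measured on evaluation data.
    Returns the point whose (B, alpha, beta) maximizes C_ID."""
    return max(roc_points, key=lambda point: c_id(B, point[1], point[2]))

# Hypothetical threshold sweep: (threshold, FP rate, FN rate).
roc = [(1.0, 0.0300, 0.05), (2.0, 0.0100, 0.10),
       (4.0, 0.0010, 0.30), (8.0, 0.0001, 0.60)]
threshold, alpha, beta = best_operating_point(roc, B=0.001)
print("optimal threshold:", threshold, "with C_ID =", c_id(0.001, alpha, beta))
```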
4.3 Experiments

To demonstrate how the sensitivity of the C_ID measurement can be used to select the optimal operation point (or fine-tune an IDS) in practice, we examined several existing anomaly detection systems and measured their accuracy, C_ID, under various configurations.

Figure 6: Derivative analysis (in absolute value): (a) dependence on the base rate (α, β fixed); (b) dependence on the false positive rate (B, β fixed); (c) dependence on the false negative rate (B, α fixed). In every situation, C_ID has the highest sensitivity compared to PPV and NPV; for realistic situations, its derivative is always higher than those of the other measures.

Figure 7: IDS 1 and IDS 2 ROC curves and the corresponding C_ID curves. These plots, based on values reported by Gaffney and Ulvila, show how C_ID can be used to select an optimal operating point. It is not clear how simple ROC analysis could arrive at this same threshold.

Specifically, we used two anomaly network intrusion detection systems, Packet Header Anomaly Detection (PHAD) [13] and Payload Anomaly Detection (PAYL) [25]. To demonstrate how to compare two different IDSs using C_ID, we compared the anomaly detection system PAYL with an open-source signature-based IDS, Snort [22], in terms of their capabilities to detect Web attacks on the same testing data set. PHAD and PAYL both detect anomalies at the packet level, with PHAD focusing on the packet header and PAYL using byte frequencies in the payload. We tested PHAD using the DARPA 1999 test data set [16], using week 3 for training and weeks 4 and 5 for testing. We configured PHAD to monitor only HTTP traffic. As noted in [25], it is difficult to find sufficient data in the DARPA 1999 data set to thoroughly test PAYL, so we used GTTrace, a backbone capture from our campus network. The GTTrace data set consists of approximately six hours of HTTP traffic captured on a very busy gigabit backbone. We filtered the GTTrace set to remove known attacks, split the trace into training and testing sets, and injected numerous HTTP attacks into the testing set using tools such as libwhisker [21]. We used C_ID to identify an optimal setting for each IDS. In PHAD, a score is computed based on selected fields in each packet header. If this score exceeds a threshold, an intrusion alert is issued. Adjusting this threshold yields different TP and FP values, shown in Figure 8(a). We configured PHAD to recognize attack packets, instead of the attack instances reported in [13]. We can see in Figure 8(a) that the C_ID curve almost follows the ROC curve (both are nearly straight lines). The reason is that with the DARPA data set, we found the false positive rate for PHAD to be fairly low, while the false negative rate was extremely high, with β close to 1. As shown in our technical report [8], given small values of α and large values of β, the ROC and C_ID curves can both be approximated as straight lines, and the equation for C_ID becomes essentially K(1 - β)/H(X), where K is a constant. We note that the authors of [13] used PHAD to monitor traffic of all types, and the details of training and testing were also different from our experiments. In particular, we configured PHAD to report each packet involved in an attack instead of reporting the attack instance. Therefore, our PHAD has a higher β than reported in [13]. One can argue that just selecting the point on the ROC curve with the highest detection rate is an adequate way to tune an IDS.
This may be true in anecdotal cases, as illustrated by our configuration of PHAD. However, it is not always the case, as shown in other situations such as Figure 7. Our analysis of PHAD, therefore, illustrates a worst-case scenario for C_ID. With β close to 1 and α close to 0, C_ID identifies an operating point no better than existing detection measurements, e.g., the ROC curve. Note, however, that C_ID will never return a worse operating point. In other situations, C_ID will outperform existing metrics. Indeed, our analysis of PAYL and the GTTrace data set illustrates a situation in which C_ID provides a better measure than simple ROC analysis. PAYL requires the user to select an optimal threshold for determining whether observed byte frequencies vary significantly from a trained model. For example, a threshold of 256 allows each character in an observed payload to vary within one standard deviation of the model [25]. As before, we can experiment with different threshold values and measure the resulting FP and FN rates.

Figure 8: Experiment results. (a) A low false positive rate and a very high false negative rate in the PHAD test mean that C_ID is no better (and no worse) than ROC analysis. (b) C_ID identifies an optimal threshold in PAYL. A simple ROC analysis would fail to find this point because the detection rate keeps increasing.

In Figure 8(b), we see that for the GTTrace data, as the threshold drops, C_ID reaches a peak and then drops, while the ROC curve (shown in the top graph) continues to increase slowly. An analyst using just the top graph in Figure 8(b) might be tempted to set a threshold lower than, say, 8 (where α is about 3 x 10^-3), because the detection rate still increases even as the false positive rate grows slightly. However, using C_ID, we see that the detection capability actually declines after the peak C_ID value (marked in Figure 8(b) with a vertical line). Thus, C_ID identifies a higher, but optimal, operating threshold of 64 (where α = 0.7 x 10^-3 and β = 0.563). In this situation, C_ID provides a better operating point. It is not obvious how ROC analysis could provide the same optimal threshold. To demonstrate how C_ID can be used to compare different IDSs, we ran Snort on the same data as PAYL to compare their capabilities. Because the injected attacks were generated with libwhisker, which tries to evade Snort, Snort shows a poor detection rate, with β = 0.7 (worse than PAYL), but a good false positive rate (better than PAYL). Without C_ID, we cannot tell which IDS is better based on the existing metrics. Using the base rate of this testing data, we can calculate C_ID for Snort; it is clearly lower than the maximized C_ID of PAYL, so (optimally configured) PAYL performs better than Snort on our test data. Again, we emphasize that, as with all evaluation attempts, the above results are strongly tied to the testing data in use.

5. DISCUSSION

5.1 Trace-driven Evaluation and Ground Truth

Currently, all IDS evaluation work is trace-driven, meaning that when evaluating IDSs, we should have an evaluation data set for which we know the ground truth, i.e., which data are attacks and which data are normal traffic. Thus, we can easily find the base rate, which is the fraction of attacks in the whole data set. After testing the IDS on this evaluation data set, we compare the IDS alerts with the ground truth; we can then calculate the false positive rate (the fraction of misclassified normal data among all normal data) and the false negative rate (the fraction of undetected attacks among all attack data). Using these basic metrics as inputs, we can finally compute C_ID. In our technical report [8], we also briefly discuss the estimation of prior probabilities and transition probabilities in real situations.

5.2 Unit of Analysis

An important problem in IDS evaluation is the unit of analysis [15]. As we mentioned when introducing the abstract IDS model, for network-based intrusion detection there are at least two units of analysis across different IDSs. Some IDSs (e.g., Snort and PAYL [25]) analyze packets and output alerts on packets, while other IDSs, such as Bro, analyze traffic based on flows. Although a different unit of analysis will result in a different base rate even on the same evaluation data set, this does not affect the use of C_ID for fine-tuning an IDS to obtain the optimal operation point. However, when comparing different IDSs, we must consider this problem.
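The trace-driven procedure of Section 5.1 is mechanical once every analysis unit carries a ground-truth label and an IDS verdict. The sketch below is ours (the toy trace is hypothetical) and assumes the c_id helper from Section 2.2.

```python
def rates_from_trace(ground_truth, alerts):
    """ground_truth, alerts: parallel sequences of 0/1 per analysis unit
    (1 = attack / alert). Returns (B, alpha, beta) as defined in Section 5.1."""
    pairs = list(zip(ground_truth, alerts))
    attack_alerts = [a for x, a in pairs if x == 1]
    normal_alerts = [a for x, a in pairs if x == 0]
    B = len(attack_alerts) / len(pairs)              # fraction of attack units
    alpha = sum(normal_alerts) / len(normal_alerts)  # misclassified normal units
    beta = attack_alerts.count(0) / len(attack_alerts)  # undetected attack units
    return B, alpha, beta

# Toy labeled trace and IDS output (hypothetical):
truth  = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
output = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
B, a, b = rates_from_trace(truth, output)
print(B, a, b, c_id(B, a, b))
```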
In this paper, we are not trying to solve the unit-of-analysis problem, because it is not peculiar to C_ID but is a general problem for all existing evaluation metrics, e.g., TP and FP. Thus, in order to provide a fair and meaningful comparison, we recommend running the IDSs with the same unit of analysis, as well as the same data set and the same detection spaces (or attack coverages). Regardless of the metrics being used, the unit of analysis is a general yet difficult problem in IDS evaluation. In some cases, we can convert different units to the same one when comparing different IDSs. For example, we can convert a packet-level analysis to a flow-level analysis by defining a flow as malicious when it contains any malicious packet, and as normal otherwise. Using such a conversion allows a comparison between a packet-level IDS and a flow-level IDS based on the same ("virtual") granularity, or unit of analysis.

However, this kind of conversion does not always work, particularly when the two units, such as a packet sequence and a system call sequence, are totally unrelated.

5.3 Involving Cost Analysis in C_ID

We have shown that C_ID is a natural and objective metric for measuring the intrusion detection capability, and we claim that it is a good complement to the cost-based approach. In some cases, however, we may want to include a subjective cost analysis, particularly when a good risk analysis model is available. We note that C_ID has some connection to cost-based metrics if the log term is considered a cost function. In addition, we can easily incorporate cost analysis into C_ID as an extension. A possible solution is to use a weighted conditional entropy H_w(X|Y) when calculating C_ID = (H(X) - H(X|Y))/H(X). We can change the original form of the conditional entropy slightly by placing weights in it:

H_w(X|Y) = - ( \sum_x \sum_y w_xy p(x, y) \log p(x|y) ) / ( \sum_x \sum_y w_xy )

where w_xy is the weight/cost assigned to the situation X = x, Y = y. We can set a larger weight w_xy when we believe the situation X = x, Y = y costs more. For instance, for a military network, we can assign a very large weight w_10, which gives more weight to missed attacks (X = 1 while Y = 0), i.e., false negatives. In this weighted setting, C_ID will give more importance to FN than to FP. Similarly, we can set a larger weight w_01 in the case of a single overloaded operator (or an automated response system), which indicates that a false positive (X = 0, Y = 1) is more important in the analysis. With such a cost-based extension, C_ID can achieve a capability similar to that of ROC analysis combined with cost analysis. Further study of cost-based extensions of C_ID is left to future work.

6. RELATED WORK

Intrusion detection has been a field of active research for more than two decades, and many IDSs have been developed. There are several relevant fundamental (theoretical) research efforts in this field. Denning [5] introduced an intrusion detection model and proposed several statistical models to build normal profiles. Helman and Liepins [10] studied the statistical foundations of audit trail analysis for the detection of computer misuse. They modeled the normal traffic and attack traffic as the output of two independent stationary stochastic processes. Axelsson [2] argued that the well-established signal detection and estimation theory bears similarities to the IDS domain. However, the benefits of these similarities for the design and evaluation of IDSs in practice are as yet unclear. Maxion et al. [14] studied the relationship between data regularity and anomaly detection performance. The study focused on sequence data, and hence regularity was defined as conditional entropy. The key result from experiments on synthetic data was that when an anomaly detection model was tested on data sets with varying regularity values, the detection performance also varied. Lee et al. [11] applied information-theoretic measures to describe the characteristics of audit data sets, suggest appropriate anomaly detection models, and explain the performance of the models. Our work is another application of information theory to IDS and provides a natural and unified metric of the intrusion detection capability. In the area of IDS evaluation, the true positive rate and false positive rate are two commonly used metrics.
To consider both of these metrics, we can use ROC (receiver operating characteristic) curve [9] based analysis, which has already been well studied in other fields such as medical diagnostic tests [24]. Lippmann et al. [12] evaluated IDSs on the 1998 DARPA Intrusion Detection Evaluation Data Set and used ROC curves to evaluate (and implicitly compare) them. McHugh [15] pointed out that the evaluation in [12] had serious shortcomings. For example, the proper unit of analysis and measurement was different for different detectors. McHugh also called for a more helpful measure of IDS performance. Our work is an attempt to develop such a better metric. Gaffney and Ulvila [6] combined ROC curves with cost analysis methods to compute the expected cost of an IDS so that different IDSs can be evaluated and compared based on their expected costs. This approach is not practical because the result depends on a subjective estimate of the cost ratio between false alarms and missed attacks. Axelsson [1] proposed two other metrics: the Bayesian detection rate and the Bayesian negative rate. These are in fact the Bayesian representations of the positive predictive value (PPV) and negative predictive value (NPV), both commonly used in medical diagnosis [24]. Axelsson's main conclusion is that, given that the base rate is very low in most environments, the false alarm rate needs to be a lot lower than what most current algorithms can achieve in order to obtain a reasonable Bayesian detection rate. The existing metrics are all useful. However, the lack of a unified metric makes it hard to fine-tune and evaluate an IDS. Our new metric, Intrusion Detection Capability, derived from analyzing the intrusion detection process from an information-theoretic point of view, naturally unifies all the existing objective measures of the IDS detection capability. The IBM Zurich team behind RIDAX [4] (developed in the context of the European MAFTIA project) proposed a set of metrics such as precision and recall, as used in the information-retrieval field. Their approach is very different from C_ID because they focus on assessing the completeness and utility of arbitrary IDS combinations, while we try to capture the intrinsic capability of an IDS using an information-theoretic approach. Our new metric is similar to, but different from, NMI (Normalized Mutual Information, (H(A) + H(B))/H(A, B)) used in medical image registration [18] and NAMI (Normalized Asymmetric Mutual Information, I(X; Y)/H(Y)) [23]. These other metrics are not as sensitive as C_ID for realistic intrusion detection scenarios, as discussed in Section 2.2.

7. CONCLUSION AND FUTURE WORK

The contributions of this paper are both theoretical and practical. We provided an in-depth analysis of existing IDS metrics and argued that the lack of a unified metric makes it hard to fine-tune an IDS and compare different IDSs. We then studied the intrusion detection process from the viewpoint of information theory and proposed a natural, unified metric that measures the capability of an IDS in terms of its ability to correctly classify the input events. Our metric, Intrusion Detection Capability, or C_ID, is simply the ratio of the mutual information between the IDS input and output to the entropy of the IDS input.

false positive rate, and both positive and negative predictive values. It also factors in the base rate, an important measure of the IDS operation environment. As a composite (unified) metric, C_ID greatly complements the cost-based approach. Using this metric, one can choose the best (optimal) operation point of an IDS (e.g., the threshold for an anomaly detection system). Furthermore, since C_ID is normalized, we can compare different IDSs even when their FP and FN rates differ. We presented numerical experiments and case studies to show its utility.

This paper has not presented every application of Intrusion Detection Capability, and numerous extensions are possible. An obvious extension of C_ID comes from rethinking the simple model of IDS inputs and outputs, X and Y, each represented as a 0 or a 1. We can instead encode different types of attacks into X and Y, creating a more accurate model, particularly in the context of signature-based IDSs. With this more accurate model, we plan to use C_ID to measure more signature-based detection systems. Our abstract model of the intrusion detection process can also be studied further using channel capacity models from information theory: multiple processes (or layers) of an IDS can be viewed as multiple (chained) channels, allowing us to analyze and improve both the internal and external designs of an IDS instead of treating the intrusion detection process as a single black box.

8. ACKNOWLEDGMENTS

This work is supported in part by NSF grant CCR and Army Research Office contract W9NF539. The contents of this work are solely the responsibility of the authors and do not necessarily represent the official views of NSF or the U.S. Army.

9. REFERENCES

[1] S. Axelsson. The base-rate fallacy and its implications for the difficulty of intrusion detection. In Proceedings of ACM CCS 1999, November 1999.
[2] S. Axelsson. A preliminary attempt to apply detection and estimation theory to intrusion detection. Technical Report 00-4, Dept. of Computer Engineering, Chalmers University of Technology, Sweden, March 2000.
[3] T. Cover and J. Thomas. Elements of Information Theory. John Wiley, 1991.
[4] M. Dacier. Design of an intrusion-tolerant intrusion detection system. MAFTIA Project deliverable.
[5] D. Denning. An intrusion-detection model. IEEE Transactions on Software Engineering, 13(2), Feb 1987.
[6] J. E. Gaffney and J. W. Ulvila. Evaluation of intrusion detectors: A decision theory approach. In Proceedings of the 2001 IEEE Symposium on Security and Privacy, May 2001.
[7] I. Graf, R. Lippmann, R. Cunningham, K. K. D. Fried, S. Webster, and M. Zissman. Results of DARPA 1998 off-line intrusion detection evaluation. Presented at the DARPA PI Meeting, 15 December 1998.
[8] G. Gu, P. Fogla, D. Dagon, W. Lee, and B. Skoric. An information-theoretic measure of intrusion detection capability. Technical Report GIT-CC-05-10, College of Computing, Georgia Tech, 2005.
[9] J. Hancock and P. Wintz. Signal Detection Theory. McGraw-Hill, 1966.
[10] P. Helman and G. Liepins. Statistical foundations of audit trail analysis for the detection of computer misuse. IEEE Transactions on Software Engineering, 19(9), September 1993.
[11] W. Lee and D. Xiang. Information-theoretic measures for anomaly detection. In Proceedings of the 2001 IEEE Symposium on Security and Privacy, May 2001.
[12] R. P. Lippmann, D. J. Fried, I. Graf, et al. Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation. In Proceedings of the 2000 DARPA Information Survivability Conference and Exposition (DISCEX 00), 2000.
[13] M. V. Mahoney and P. K. Chan. PHAD: Packet header anomaly detection for identifying hostile network traffic. Technical Report CS-2001-04, Florida Tech, 2001.
[14] R. Maxion and K. M. C. Tan. Benchmarking anomaly-based detection systems. In Proceedings of DSN 2000, 2000.
[15] J. McHugh. Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA off-line intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Transactions on Information and System Security, 3(4), November 2000.
[16] MIT Lincoln Laboratory. 1999 DARPA intrusion detection evaluation data set overview.
[17] V. Paxson. Bro: A system for detecting network intruders in real-time. Computer Networks, 31(23-24), December 1999.
[18] J. Pluim, J. Maintz, and M. Viergever. Mutual information based registration of medical images: A survey. IEEE Transactions on Medical Imaging, 22(8):986-1004, Aug 2003.
[19] T. H. Ptacek and T. N. Newsham. Insertion, evasion, and denial of service: Eluding network intrusion detection. Technical report, Secure Networks Inc., January 1998.
[20] N. J. Puketza, K. Zhang, M. Chung, B. Mukherjee, and R. A. Olsson. A methodology for testing intrusion detection systems. IEEE Transactions on Software Engineering, 22(10):719-729, 1996.
[21] R. F. Puppy. Libwhisker official release v2.0, 2004.
[22] M. Roesch. Snort - lightweight intrusion detection for networks. In Proceedings of USENIX LISA 99, 1999.
[23] A. Strehl. Relationship-based clustering and cluster ensembles for high-dimensional data mining. PhD thesis, The University of Texas at Austin, May 2002.
[24] J. A. Swets. Measuring the accuracy of diagnostic systems. Science, 240(4857), 1988.
[25] K. Wang and S. J. Stolfo. Anomalous payload-based network intrusion detection. In Proceedings of RAID 2004, September 2004.