Combining Weak Statistical Evidence in Cyber Security
1 Combining Weak Statistical Evidence in Cyber Security. Nick Heard, Department of Mathematics, Imperial College London; Heilbronn Institute for Mathematical Research, University of Bristol. Intelligent Data Analysis XIV, 23 October 2015. Nick Heard (Imperial College), Combining evidence in cyber, 23 October 2015
2 Collaborative work with Patrick Rubin-Delanchy, University of Bristol; Melissa Turcotte & Alex Kent, Los Alamos National Laboratory; Josh Neil, Ernst & Young.
3 Combining p-values
4 Abstract setting. It will be supposed that $n$ independent hypothesis tests are being conducted. For the $i$th test, let $H_{0,i}$, $H_{1,i}$ be the null and alternative hypotheses; let $p_i$ be the p-value derived from some test statistic $t_i$, for example the upper tail probability of $t_i$ under $H_{0,i}$, $p_i = \Pr_{H_{0,i}}(T_i \geq t_i)$; and let $t_i$ have a continuous distribution $\implies$ under $H_{0,i}$, $p_i \sim U(0,1)$.
5 The global hypothesis test that will be considered is $H_0: \forall i \in \{1,\dots,n\}$, $H_{0,i}$ is true, against $H_1: \exists I_n \subseteq \{1,\dots,n\}$, $I_n \neq \emptyset$, s.t. $\forall i \in I_n$, $H_{1,i}$ is true. The test statistic $T(p_1,\dots,p_n)$ will be a combiner of the p-values to a single value. Finally, for defining some such test statistics, let $p_{(1)} \leq p_{(2)} \leq \dots \leq p_{(n)}$ be the order statistics of the $n$ p-values.
6 In some settings, it is more sensible (and more powerful) to combine individual test statistics rather than their p-values. But sometimes in meta-analysis the individual test statistics may no longer be available, or the individual tests are very different in nature and difficult to combine. Combining p-values is mathematically equivalent to the multiple comparison problem in bulk hypothesis testing, but with a different (in my opinion, less vague) motivation.
7 Motivating example: Cyber security. "Low and slow": A sophisticated cyber attack will try to blend in with the existing traffic in a computer network, gradually achieving its objectives. Possibly only a small subset of the traffic will be malicious and yield significant p-values from statistical models of normal network/user behaviour. $\implies$ The signal of the intruder can be drowned out. Need to: detect the intruder from a weak signal; identify specific compromised services to limit damage. Question: How can p-values be filtered and then combined to most efficiently lead to detection?
8 Fisher's method. $\tilde{s}_n = \sum_{i=1}^n \log p_i = \log \prod_{i=1}^n p_i$. Under $H_0$, $-2\tilde{s}_n \sim \chi^2_{2n}$ and so the upper tail probability from that distribution is $U(0,1)$. Mathematically convenient. Intuitive: $\tilde{s}_n$ is (a monotonic transformation of) the joint cdf of the p-values under $H_0$. [Recall the p-values are independent and since $p_i \sim U(0,1)$, $\Pr(p_i \leq p) = p$.] $\implies$ the combined p-value derived from Fisher's method reports the probability of observing an even lower joint cdf value. LRT: Under a special case of the global alternative hypothesis, $H_1: \exists a \in (0,1)$ s.t. $\forall i$, $f_{1,i}(p_i) \propto p_i^{a-1}$, this is the uniformly most powerful (UMP) test.
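A minimal sketch of Fisher's method as described on this slide, assuming independent p-values that are strictly positive. The function name `fisher_combined_pvalue` is illustrative and not from the talk. It uses the fact that a chi-squared upper tail with an even number of degrees of freedom has a finite-sum closed form, so only the standard library is needed.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combined p-value from Fisher's method for independent, continuous p-values."""
    n = len(pvalues)
    # x_half = -sum(log p_i); twice this is chi-squared with 2n d.o.f. under H0.
    x_half = -sum(math.log(p) for p in pvalues)
    # For even degrees of freedom 2n, the chi-squared upper tail at x is
    # exp(-x/2) * sum_{j=0}^{n-1} (x/2)^j / j!, so no special functions are needed.
    term = 1.0
    total = 1.0
    for j in range(1, n):
        term *= x_half / j
        total += term
    return math.exp(-x_half) * total

# Hypothetical p-values, for illustration only.
print(fisher_combined_pvalue([0.04, 0.20, 0.35, 0.60]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the tail formula.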
9 Sum of p-values. $s_n = \sum_{i=1}^n p_i$. Under $H_0$, $s_n$ follows an Irwin-Hall distribution, which can be well approximated by $N(n/2, n/12)$ for $n > 20$. Mathematically less convenient. Unintuitive: the sum has a more awkward probabilistic interpretation under $H_0$ as the mixture cdf of the p-values. Almost nobody uses it. LRT: And yet, under another special case of the global alternative hypothesis, $H_1: \exists b > 0$ s.t. $\forall i$, $f_{1,i}(p_i) \propto e^{-bp_i}$, the sum provides the UMP test.
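The sum combiner can be sketched in a few lines, using the normal approximation $N(n/2, n/12)$ quoted on this slide rather than the exact Irwin-Hall cdf. The function name `sum_combined_pvalue` is an assumption for illustration. Small sums count as evidence against $H_0$, so the lower tail is reported.

```python
import math

def sum_combined_pvalue(pvalues):
    """Approximate combined p-value from the sum of p-values (intended for n > 20)."""
    n = len(pvalues)
    # Standardise the sum with the Irwin-Hall mean n/2 and variance n/12.
    z = (sum(pvalues) - n / 2) / math.sqrt(n / 12)
    # Lower-tail standard normal probability via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Hypothetical input: 24 moderately small p-values.
print(sum_combined_pvalue([0.35] * 24))
```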
10 Linear combination. $w \sum_{i=1}^n \log p_i + (1-w) \sum_{i=1}^n p_i$. Mathematically inconvenient: no simple closed form under $H_0$. Unintuitive: an even more awkward probabilistic interpretation under $H_0$, a weighted mixture of the mixture and joint cdfs of the p-values! Nobody uses this. LRT: Under yet another special case of the global alternative hypothesis, $H_1: \exists a \in (0,1), b > 0$ s.t. $\forall i$, $f_{1,i}(p_i) \propto p_i^{a-1} e^{-bp_i}$, the linear combination with $w = (1-a)/(1+b-a)$ provides the UMP test $\implies$ knowledge of $a$ and $b$ would be required to choose $w$.
11 Each of the LRT alternative densities is a decreasing function of $p_i$. Note that as $p_i \to 0$, the first density $\to \infty$, whereas the second density remains bounded. $\implies$ $\tilde{s}_n$ is optimal when some p-values are extreme; $s_n$ is more appropriate for bunched p-values.
12 [Figure: significance levels from two p-values combined using $\tilde{s}_2$ (left) and $s_2$ (right); axes $p_1$ and $p_2$.] Truncated product/sum
13 Many other plausible combiners. Birnbaum (1954): Any combiner which is monotonic in the p-values provides the most powerful test for some special case of the alternative hypothesis $H_1: \exists I_n \subseteq \{1,\dots,n\}$, $I_n \neq \emptyset$, s.t. $\forall i \in I_n$, $p_i \sim f_{1,i}$, where the alternative densities $f_{1,i}$ are non-increasing. Question: Which combiners are good when $|I_n| \ll n$?
14 Standard normal example: change in mean. Donoho and Jin (2004): Suppose $H_{0,i}: t_i \sim N(0,1)$, $H_{1,i}: t_i \sim N(\sqrt{2r \log n}, 1)$, $r \in (0,1)$, and $p_i = 1 - \Phi(t_i)$. If $H_{1,i}$ holds $\forall i \in I_n$ and $|I_n| \ll n$, then for a range of suitably small $r$, as $n \to \infty$: Fisher's method cannot separate $H_0$ from $H_1$; Simes' method and Higher Criticism separate $H_0$ from $H_1$ w.p. 1.
15 Simes' method. Simes (1986): $T_{\mathrm{Simes}} = n \min_{1 \leq i \leq n} p_{(i)}/i$. Under $H_0$, $T_{\mathrm{Simes}} \sim U(0,1)$. Equivalent to the Benjamini-Hochberg (1995) procedure for controlling FDR. Simes' method is a KS-type test of the ratio of lower tail probabilities for two distribution functions. That is, $T_{\mathrm{Simes}}^{-1} = \sup_{p \in (0,1)} F_n(p)/p$, where $F_n$ is the ecdf of the $n$ p-values (Chang, 1955; Mason and Schuenemeyer, 1983). Under $H_0$, $\lim_{n \to \infty} \Pr\left(\arg\min_{1 \leq i \leq n} p_{(i)}/i = 1\right) = e^{-1}$.
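Simes' statistic is simple enough to sketch directly from its definition: sort the p-values, scale the $i$th smallest by $n/i$, and take the minimum. Because the statistic is uniform under $H_0$ for independent continuous p-values, it can be read as a combined p-value. The function name `simes_pvalue` is illustrative.

```python
def simes_pvalue(pvalues):
    """Simes' combined p-value: n * min_i p_(i) / i over the ordered p-values."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    # enumerate from 1 so that i is the rank of each ordered p-value.
    return min(n * p / i for i, p in enumerate(ordered, start=1))

# Hypothetical p-values, for illustration only.
print(simes_pvalue([0.9, 0.01, 0.5]))
```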
16 Higher Criticism. Donoho and Jin (2004): $HC_n = \max_{1 \leq i \leq n} \dfrac{i/n - p_{(i)}}{\sqrt{p_{(i)}(1 - p_{(i)})/n}} = \max_{1 \leq i \leq n} \dfrac{i - np_{(i)}}{\sqrt{np_{(i)}(1 - p_{(i)})}}$. Distribution under $H_0$ obtained by Monte Carlo. Can be viewed as a weighted KS test, $HC_n = \sup_{p \in (0,1)} w(p)(F_n(p) - p)$, where $w(p)^{-2} = p(1-p)/n$ is the variance of $F_n(p)$; or as a GLRT, approximating binomial distributions with Gaussians: for $n$ i.i.d. $U(0,1)$ variates, the number $\leq p_{(i)}$ is $\mathrm{Binomial}(n, p_{(i)}) \approx N(np_{(i)}, np_{(i)}(1 - p_{(i)}))$.
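A sketch of the Higher Criticism statistic with a Monte Carlo null distribution, as the slide suggests. The function names, the number of simulations and the seed are assumptions for illustration. Ordered p-values equal to exactly 0 or 1 are skipped to avoid dividing by zero, which is a choice made here and not something stated in the talk.

```python
import math
import random

def higher_criticism(pvalues):
    """HC_n = max_i (i - n p_(i)) / sqrt(n p_(i) (1 - p_(i)))."""
    n = len(pvalues)
    best = -math.inf
    for i, p in enumerate(sorted(pvalues), start=1):
        if 0.0 < p < 1.0:  # skip degenerate values (assumption of this sketch)
            best = max(best, (i - n * p) / math.sqrt(n * p * (1.0 - p)))
    return best

def hc_monte_carlo_pvalue(pvalues, n_sim=2000, seed=0):
    """Estimate Pr(HC_n >= observed) under H0 by simulating uniform p-values."""
    rng = random.Random(seed)
    n = len(pvalues)
    observed = higher_criticism(pvalues)
    exceed = 0
    for _ in range(n_sim):
        null_sample = [rng.random() for _ in range(n)]
        if higher_criticism(null_sample) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)

# Hypothetical data: a few very small p-values among unremarkable ones.
example = [1e-6] * 5 + [0.5] * 45
print(hc_monte_carlo_pvalue(example, n_sim=500))
```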
17 Under $H_1: \exists I_n \subseteq \{1,\dots,n\}$, $I_n \neq \emptyset$ and $a \in [0,1)$ s.t. $\forall i \in I_n$, $f_{1,i}(p_i) \propto p_i^{a-1}$, or $H_1: \exists I_n \subseteq \{1,\dots,n\}$, $I_n \neq \emptyset$ and $b > 0$ s.t. $\forall i \in I_n$, $f_{1,i}(p_i) \propto e^{-bp_i}$, both Simes' method and HC can still lack power as they are invariant to local changes in the smallest p-values which are deemed significant. It is desirable to make use of all of the significant p-values.
18 Partial Fisher scores/sums. Two proposed combiners for those alternatives are $\min_{1 \leq k \leq n} \tilde{\zeta}_k(\tilde{s}_k)$, with $\tilde{s}_k = \sum_{i=1}^k \log p_{(i)}$, and $\min_{1 \leq k \leq n} \zeta_k(s_k)$, with $s_k = \sum_{i=1}^k p_{(i)}$, where $\tilde{\zeta}_k$, $\zeta_k$ approximate the corresponding distribution functions.
19 Under $H_0$, $\tilde{s}_k$ has a closed-form cdf $F(\tilde{s}_k)$, a finite alternating sum over $i = 1,\dots,n-k$ and $j = 0,\dots,k-1$ with terms involving $\gamma(j+1, \cdot)$, where $\gamma$ is the lower incomplete gamma function; but for $n > 13$ this is not nice to work with. There is no closed form for the cdf of $s_k$ under $H_0$.
20 Instead, we follow Donoho and Jin and approximate the true distributions with Gaussians, yielding simple test statistics $\tilde{S}_n = \min_{1 \leq k \leq n} \{(\tilde{s}_k - \tilde{\mu}_k)/\tilde{\sigma}_k\}$ and $S_n = \min_{1 \leq k \leq n} \{(s_k - \mu_k)/\sigma_k\}$, where $\tilde{\mu}_k = 1 - \psi(k+1) + \psi(n+1)$, $\tilde{\sigma}_k^2 = 1/k + \psi_1(k+1) - \psi_1(n+1)$, $\mu_k = k(k+1)/(2(n+1))$, $\sigma_k^2 = k(k+1)(2n(1+2k) - (3k+2)(k-1))/(12(n+1)^2(n+2))$, and $\psi$ and $\psi_1$ are the digamma and trigamma functions. The distributions of $\tilde{S}_n$ and $S_n$ are much simpler to obtain by Monte Carlo.
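The partial-sum statistic can be sketched as follows, using the mean $k(k+1)/(2(n+1))$ and the variance expression quoted on this slide for the sum of the $k$ smallest of $n$ uniform p-values. Only the sum version is shown. The function names, simulation count and seed are assumptions for illustration, and the null distribution is obtained by Monte Carlo as the slide recommends.

```python
import math
import random

def partial_sum_statistic(pvalues):
    """min over k of the standardised sum of the k smallest p-values."""
    n = len(pvalues)
    best = math.inf
    s_k = 0.0
    for k, p in enumerate(sorted(pvalues), start=1):
        s_k += p
        # Null mean and variance of the sum of the k smallest of n uniforms.
        mu = k * (k + 1) / (2 * (n + 1))
        var = (k * (k + 1) * (2 * n * (1 + 2 * k) - (3 * k + 2) * (k - 1))
               / (12 * (n + 1) ** 2 * (n + 2)))
        best = min(best, (s_k - mu) / math.sqrt(var))
    return best

def partial_sum_monte_carlo_pvalue(pvalues, n_sim=2000, seed=0):
    """Estimate Pr(S_n <= observed) under H0 by simulating uniform p-values."""
    rng = random.Random(seed)
    n = len(pvalues)
    observed = partial_sum_statistic(pvalues)
    below = sum(
        partial_sum_statistic([rng.random() for _ in range(n)]) <= observed
        for _ in range(n_sim)
    )
    return (below + 1) / (n_sim + 1)

# Hypothetical data: 20 small p-values among 50.
print(partial_sum_monte_carlo_pvalue([1e-4] * 20 + [0.5] * 30, n_sim=500))
```

For $k = n$ the variance expression reduces to $n/12$, the Irwin-Hall variance, which is a useful check on the formula.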
21 [Figure: significance levels from two p-values combined using $\tilde{S}_2$ (left) and $S_2$ (right); axes $p_1$ and $p_2$.] Product/sum
22 Empirical comparison. Recall the example of Donoho and Jin (2004), $H_{0,i}: t_i \sim N(0,1)$, $H_{1,i}: t_i \sim N(\sqrt{2r \log n}, 1)$, $r \in (0,1)$, where $H_{1,i}$ holds for some non-empty $I_n \subset \{1,\dots,n\}$ s.t. $|I_n| \ll n$. For illustration, let $|I_n| = \sqrt[3]{n}$ and $r = 4/15$ (the smallest detectable $r$ assuming $|I_n| = \sqrt[3]{n}$ is 1/6).
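The simulation setting can be sketched as below, assuming the number of true alternatives is the integer part of the cube root of $n$ (which matches the subset sizes quoted for the power curves that follow) and that the signal mean is $\sqrt{2r \log n}$. The function names and the seed are illustrative, not from the talk.

```python
import math
import random
from statistics import NormalDist

def integer_cube_root(n):
    """Largest integer k with k**3 <= n, guarding against floating-point error."""
    k = round(n ** (1 / 3))
    while k ** 3 > n:
        k -= 1
    while (k + 1) ** 3 <= n:
        k += 1
    return k

def simulate_pvalues(n, r=4 / 15, seed=0):
    """Draw n upper-tail p-values, of which floor(n^(1/3)) carry a mean shift."""
    rng = random.Random(seed)
    n_signal = integer_cube_root(n)
    mu = math.sqrt(2 * r * math.log(n))
    std_normal = NormalDist()
    pvalues = []
    for i in range(n):
        shift = mu if i < n_signal else 0.0
        t = rng.gauss(shift, 1.0)
        pvalues.append(1.0 - std_normal.cdf(t))  # p_i = 1 - Phi(t_i)
    return pvalues, n_signal

pvals, k = simulate_pvalues(1000)
print(k, min(pvals))
```

The returned list could be passed to any combiner of p-values to estimate power by repeated simulation.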
23 [Figure: power curve for $n = 2$, $|I_n| = 1$; $F_1(p)$ against $p$ for the partial product, Higher Criticism, Simes' method, Fisher's method and $H_0$.]
24 [Figure: power curve for $n = 10$, $|I_n| = 2$; same methods.]
25 [Figure: power curve for $n = 100$, $|I_n| = 4$; same methods.]
26 [Figure: power curve for $n = 1{,}000$, $|I_n| = 10$; same methods.]
27 [Figure: power curve for $n = 10{,}000$, $|I_n| = 21$; same methods.]
28 [Figure: power curve for $n = 100{,}000$, $|I_n| = 46$; same methods.]
29 [Figure: power curve for $n = 1{,}000{,}000$, $|I_n| = 100$; same methods.]
30 [Figure: power curve for $n = 10{,}000{,}000$, $|I_n| = 215$; same methods.]
31 [Figure: power curve for $n = 10{,}000{,}000$, $|I_n| = 215$; same methods plus the partial sum.]
32 [Figure: distribution of significant subset size under $H_1$, $n = 1{,}000$, $|I_n| = 10$, for the partial product, Higher Criticism and Simes' method.]
33 Concluding remarks on combining p-values. The asymptotic assumptions from Donoho and Jin (2004) used here imply that a decreasing proportion of the alternative hypotheses are true and that the effect size increases with the number of tests, which seems an unlikely scenario (but allowed some impressive maths). A more realistic scenario might assume a constant proportion of the alternative hypotheses are true, with individual test effect sizes which decrease with $n$. Finding the correct number of significant p-values seems very difficult. In many (most?) applications, Fisher's method is really good.
34 Discrete p-values
35 Discrete p-values. In many practical applications, some of the test statistics for the individual hypothesis tests will be discrete, either because they naturally are, or because they are recorded with limited precision, leading to interval censoring. Even under $H_0$, discrete p-values are not $U(0,1)$. They are stochastically larger. $\implies$ Discrete p-values are conservative. $\implies$ Combining discrete p-values can be really conservative. $\implies$ Subset selection becomes even more important.
36 Mid-p-values. To combat the conservatism of discrete p-values, some practitioners advocate mid-p-values. If the regular p-value is $p_i = \Pr_{H_{0,i}}(T_i \geq t_i)$, the corresponding mid-p-value is $p_{i,\mathrm{mid}} = \tfrac{1}{2}\Pr_{H_{0,i}}(T_i \geq t_i) + \tfrac{1}{2}\Pr_{H_{0,i}}(T_i > t_i)$. Mid-p-values are not stochastically larger than $U(0,1)$. They are less than $U(0,1)$ in the convex order (they have mean ½, but are less variable) and not conservative: for some values of $\alpha$, $\Pr_{H_{0,i}}(p_{i,\mathrm{mid}} \leq \alpha) = 2\alpha$.
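As a small worked example of the two definitions, the sketch below computes the regular and mid-p-values for an upper-tail binomial test. The binomial null and the function name `binomial_pvalues` are assumptions chosen for illustration; the talk does not tie mid-p-values to any particular discrete test.

```python
from math import comb

def binomial_pvalues(t, n, prob):
    """Regular and mid-p-values for observing T = t when T ~ Binomial(n, prob) under H0."""
    pmf = [comb(n, x) * prob ** x * (1 - prob) ** (n - x) for x in range(n + 1)]
    upper_closed = sum(pmf[t:])      # Pr(T >= t), the regular p-value
    upper_open = sum(pmf[t + 1:])    # Pr(T > t)
    p_mid = 0.5 * upper_closed + 0.5 * upper_open
    return upper_closed, p_mid

# Hypothetical test: 2 trials with success probability 0.5, observing 1 success.
print(binomial_pvalues(1, 2, 0.5))
```

Averaging the mid-p-value over the null distribution of $T$ gives exactly ½ in this example, in line with the mean stated above.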
37 Random p-values. If we randomly draw $p_{i,\mathrm{random}} \sim U(\Pr_{H_{0,i}}(T_i > t_i), \Pr_{H_{0,i}}(T_i \geq t_i))$, then marginally $p_{i,\mathrm{random}} \sim U(0,1)$. Very natural to think this way when discreteness arises through censoring. But when combining random p-values, there is no rule which could guarantee monotonicity in the p-values.
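The marginal uniformity of random p-values can be checked empirically with a short simulation. The sketch assumes a Binomial(5, 0.3) null purely for illustration; the function name, sample size and seed are likewise hypothetical.

```python
import random
from math import comb

def randomised_binomial_pvalue(t, n, prob, rng):
    """Draw uniformly between Pr(T > t) and Pr(T >= t) for T ~ Binomial(n, prob)."""
    pmf = [comb(n, x) * prob ** x * (1 - prob) ** (n - x) for x in range(n + 1)]
    lower = sum(pmf[t + 1:])  # Pr(T > t)
    upper = sum(pmf[t:])      # Pr(T >= t)
    return rng.uniform(lower, upper)

rng = random.Random(1)
draws = []
for _ in range(20000):
    # Simulate the test statistic under H0, then randomise its p-value.
    t = sum(rng.random() < 0.3 for _ in range(5))
    draws.append(randomised_binomial_pvalue(t, 5, 0.3, rng))

mean_p = sum(draws) / len(draws)
frac_below = sum(d <= 0.1 for d in draws) / len(draws)
print(mean_p, frac_below)
```

Under $H_0$ the sample mean should be close to 0.5 and the fraction at or below 0.1 close to 0.1.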
38 Cyber security
39 Cyber security: big business. Former MI5 director Jonathan Evans, October 2015: "Cyber security is a recognised major international issue." "There are now a lot of serious people focused on cyber security and there is a lot of investment, not only in resilience, but also by venture capitalists in cyber security startups." "The amount of money, intellectual activity and resource going into this will have an impact and, as this matures, there will be a balancing out between the attackers and the defenders, but right now there is still a lot of work to be done."
43 Statistical challenges in cyber. No network can be made fully secure. It is not unreasonable to assume that most networks are compromised to some degree. Statistical monitoring of network traffic offers a robust ("signature free") second level of defence. For those attacks which do penetrate a network perimeter, we need technologies for identifying malign activity amongst the bulk traffic. This is a data mining problem. A model-based approach: We can learn about normal behaviour in a network by gathering data and building statistical models (null hypotheses). Anomalous behaviour w.r.t. those models can indicate potential breaches requiring further inspection. We are less able to gather data on potential attack behaviours, which are more adaptive $\implies$ cannot assume a model for compromised behaviour $\implies$ reliant on p-values from the null model, rather than e.g. Bayes factors.
44 User credentials: authentication & computer event logs. Computer event logs are a critical resource for investigating security incidents, providing detailed information at a machine level: authentication, logons, ...; processes; applications/services. Many of these log entries are tied to a user credential action. Reusable user credentials are one of the most powerful items an attacker can obtain. Adversaries require user credentials to traverse the network. Due to single sign-on (favoured for convenience and usability), credentials and hashed passwords are stored in computer memory, making them simple to obtain and reuse. Can we detect network intrusions from event logs by ranking the most unusual behaviour according to statistical models of each user credential?
45 Los Alamos National Laboratory computer network. LANL is a US Department of Energy research lab. Their substantial Internet presence, state-of-the-art computer systems and huge stores of proprietary information make LANL a prime target for hacking. They face several million cyber attacks each day. Here we consider Windows-based authentication event logs from the LANL enterprise computer network. Features: two months of data, 444 million events for 10k users; a month-long red-team exercise in the second month of data, with 78 known compromised credentials; a random selection of 1,000 credentials plus the compromised credentials; 50 million associated events. These data are now freely available at
46 [Figure: distribution of authentication event types in the LANL network: network logon, interactive logon, remote desktop, process start, kerberos, other.]
47 A user credential authentication modelling approach. Each user credential is monitored for the first month of data. The authentication events on the network using that credential form a sequence $\{(c_t, s_t, e_t)\}_{t \in \mathbb{N}}$, where $c_t \in C$ is the client IP address, $s_t \in S$ is the server IP address and $e_t \in E$ is the authentication event type. From our model we should like to capture statistical surprise in each of these three data items. Is this a new client/server for the user? And if so, is this an unusual choice of client/server? And given the client-server pair, is this an unusual authentication event type? Although each user will be modelled independently for tractability and parallelisation, hyperparameters will be pooled across users to learn typical client/server behaviour across the network.
48 To specify the model, let $C_t$, $S_t$ respectively be the set of unique clients or servers authenticated on by the user prior to time $t$, and define binary variables $\chi_t, \sigma_t \in \{0,1\}$ s.t. $\chi_t = 1 \iff c_t \notin C_t$, $\sigma_t = 1 \iff s_t \notin S_t$. Then the probability mass function for the triple $(c_t, s_t, e_t)$ is specified through $\Pr(c_t, s_t, e_t) = \Pr(\chi_t \mid \chi_{t-1}) \Pr(c_t \mid \chi_t, c_{t-1}) \Pr(\sigma_t \mid c_t, \sigma_{t'}) \Pr(s_t \mid c_t, \sigma_t, s_{t'}) \Pr(e_t \mid s_t)$, where $t'$ is the last event time previous to $t$ where the client was $c_t$. Bayesian multinomial-Dirichlet distributions are fit to each of these conditional distributions using the first month of data, assuming either an i.i.d. sequence or a first-order Markov chain.
49 In cases where a new client or server is used ($\chi_t = 1$ or $\sigma_t = 1$), time-varying Dirichlet prior parameters for the possible IP addresses are set proportional to the current in/out degrees of the chosen client/server at discrete time $t$. When the client or server is drawn from $C_t$ or $S_t$, the Dirichlet parameters are set equal to the empirical frequencies of each IP address. Similarly for event types: for new client-server edges the Dirichlet prior parameters are determined by the observed event type frequencies from other users on the same client-server edge, and by the empirical distribution otherwise.
50 Given this predictive mass function, the (discrete) p-value for an observation being as unlikely as $(c_t, s_t, e_t)$ is given by $p_t = \sum_{(c,s,e) \in C \times S \times E} I\{\Pr(c, s, e) \leq \Pr(c_t, s_t, e_t)\} \Pr(c, s, e)$. The corresponding mid-p-value is given by $p_{t,\mathrm{mid}} = \tfrac{1}{2} p_t + \tfrac{1}{2} \sum_{(c,s,e) \in C \times S \times E} I\{\Pr(c, s, e) < \Pr(c_t, s_t, e_t)\} \Pr(c, s, e)$. Detecting fraudulent misuse of user credentials is a task of combining subsets of small p-values indicating anomalous authentication behaviour mixed amongst the bulk traffic.
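The p-value construction on this slide applies to any finite predictive mass function, so it can be sketched with a plain dictionary from outcomes to probabilities. The toy outcomes and the function name `discrete_pvalues` are hypothetical; in the talk's setting the keys would be (client, server, event type) triples with probabilities taken from the fitted model.

```python
def discrete_pvalues(pmf, observed):
    """Return (p-value, mid-p-value) for `observed` under the predictive pmf.

    The p-value sums the mass of every outcome that is no more likely than
    the observed one; the mid-p-value averages it with the strictly-less sum.
    """
    p_obs = pmf[observed]
    at_most = sum(q for q in pmf.values() if q <= p_obs)
    strictly_less = sum(q for q in pmf.values() if q < p_obs)
    return at_most, 0.5 * at_most + 0.5 * strictly_less

# Hypothetical predictive distribution over three outcomes.
toy_pmf = {"usual": 0.7, "occasional": 0.2, "rare": 0.1}
print(discrete_pvalues(toy_pmf, "occasional"))
```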
51 p-values from authentication modelling. [Figure: density functions $f(p)$ of discrete p-values for infected (left) and uninfected (right) user IDs.]
52 [Figure: ROC curve (true positive rate against false positive rate) for discrete p-values: Fisher's method and $H_0$.]
53 [Figure: ROC curve for discrete p-values: Higher Criticism, Simes' method, Fisher's method and $H_0$.]
54 [Figure: ROC curve for discrete p-values: partial product, Higher Criticism, Simes' method, Fisher's method and $H_0$.]
55 [Figure: ROC curve for discrete p-values: partial sum, partial product, Higher Criticism, Simes' method, Fisher's method and $H_0$.]
56 [Figure: ROC curve for mid-p-values: partial sum, partial product, Higher Criticism, Simes' method, Fisher's method and $H_0$.]
57 [Figure: ROC curve for median from random p-values: partial sum, partial product, Higher Criticism, Simes' method, Fisher's method and $H_0$.]
58 Concluding remarks. Combining p-values is an increasingly important topic in the Big Data paradigm shift. We cannot build a big joint distribution of everything and calculate the true p-value. Away from authentication, there are many contexts just within cyber where combining weak evidence, in the form of p-values or weak signals, is important. In NetFlow data, for example, we get a different view from the router level of IP-IP communications: service port numbers, TCP flags, numbers of packets/bytes, timings typically to the millisecond. All of these can be informative. Beyond signature detection, there is often no smoking gun in cyber; just bits and pieces of evidence to piece together. Statistical cyber security is a data fusion exercise, and combining p-values badly can undo the benefits of good modelling.
Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80) CS 30, Winter 2016 by Prasad Jayanti 1. (10 points) Here is the famous Monty Hall Puzzle. Suppose you are on a game show, and you
More informationStatistical Testing of Randomness Masaryk University in Brno Faculty of Informatics
Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics Jan Krhovják Basic Idea Behind the Statistical Tests Generated random sequences properties as sample drawn from uniform/rectangular
More informationMONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010
MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times
More informationIntrusion Detection: Game Theory, Stochastic Processes and Data Mining
Intrusion Detection: Game Theory, Stochastic Processes and Data Mining Joseph Spring 7COM1028 Secure Systems Programming 1 Discussion Points Introduction Firewalls Intrusion Detection Schemes Models Stochastic
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationGambling Systems and Multiplication-Invariant Measures
Gambling Systems and Multiplication-Invariant Measures by Jeffrey S. Rosenthal* and Peter O. Schwartz** (May 28, 997.. Introduction. This short paper describes a surprising connection between two previously
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationMATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationExploratory Data Analysis
Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
More informationCyber Watch. Written by Peter Buxbaum
Cyber Watch Written by Peter Buxbaum Security is a challenge for every agency, said Stanley Tyliszczak, vice president for technology integration at General Dynamics Information Technology. There needs
More informationCS 5410 - Computer and Network Security: Intrusion Detection
CS 5410 - Computer and Network Security: Intrusion Detection Professor Kevin Butler Fall 2015 Locked Down You re using all the techniques we will talk about over the course of the semester: Strong access
More informationCYBER ATTACKS EXPLAINED: PACKET CRAFTING
CYBER ATTACKS EXPLAINED: PACKET CRAFTING Protect your FOSS-based IT infrastructure from packet crafting by learning more about it. In the previous articles in this series, we explored common infrastructure
More informationThe Variability of P-Values. Summary
The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report
More informationSpeedy Signature Based Intrusion Detection System Using Finite State Machine and Hashing Techniques
www.ijcsi.org 387 Speedy Signature Based Intrusion Detection System Using Finite State Machine and Hashing Techniques Utkarsh Dixit 1, Shivali Gupta 2 and Om Pal 3 1 School of Computer Science, Centre
More informationConfidence Intervals for the Difference Between Two Means
Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means
More informationNonparametric adaptive age replacement with a one-cycle criterion
Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk
More information3.4 Statistical inference for 2 populations based on two samples
3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted
More informationHYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationHandling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
More informationNormal distribution. ) 2 /2σ. 2π σ
Normal distribution The normal distribution is the most widely known and used of all distributions. Because the normal distribution approximates many natural phenomena so well, it has developed into a
More informationSupplement to Call Centers with Delay Information: Models and Insights
Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290
More informationThe Exponential Distribution
21 The Exponential Distribution From Discrete-Time to Continuous-Time: In Chapter 6 of the text we will be considering Markov processes in continuous time. In a sense, we already have a very good understanding
More information1 Prior Probability and Posterior Probability
Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which
More informationAn Introduction to Modeling Stock Price Returns With a View Towards Option Pricing
An Introduction to Modeling Stock Price Returns With a View Towards Option Pricing Kyle Chauvin August 21, 2006 This work is the product of a summer research project at the University of Kansas, conducted
More informationInference of Probability Distributions for Trust and Security applications
Inference of Probability Distributions for Trust and Security applications Vladimiro Sassone Based on joint work with Mogens Nielsen & Catuscia Palamidessi Outline 2 Outline Motivations 2 Outline Motivations
More informationOverview of Monte Carlo Simulation, Probability Review and Introduction to Matlab
Monte Carlo Simulation: IEOR E4703 Fall 2004 c 2004 by Martin Haugh Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab 1 Overview of Monte Carlo Simulation 1.1 Why use simulation?
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationAachen Summer Simulation Seminar 2014
Aachen Summer Simulation Seminar 2014 Lecture 07 Input Modelling + Experimentation + Output Analysis Peer-Olaf Siebers pos@cs.nott.ac.uk Motivation 1. Input modelling Improve the understanding about how
More informationTwo-Sample T-Tests Assuming Equal Variance (Enter Means)
Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of
More informationThe CUSUM algorithm a small review. Pierre Granjon
The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationExact Nonparametric Tests for Comparing Means - A Personal Summary
Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola
More informationPermutation & Non-Parametric Tests
Permutation & Non-Parametric Tests Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate
More informationLecture 6: Discrete & Continuous Probability and Random Variables
Lecture 6: Discrete & Continuous Probability and Random Variables D. Alex Hughes Math Camp September 17, 2015 D. Alex Hughes (Math Camp) Lecture 6: Discrete & Continuous Probability and Random September
More informationTHE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok
THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE Alexer Barvinok Papers are available at http://www.math.lsa.umich.edu/ barvinok/papers.html This is a joint work with J.A. Hartigan
More informationTests for Two Proportions
Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics
More informationStochastic Inventory Control
Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the
More informationHypothesis testing. c 2014, Jeffrey S. Simonoff 1
Hypothesis testing So far, we ve talked about inference from the point of estimation. We ve tried to answer questions like What is a good estimate for a typical value? or How much variability is there
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationFor a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )
Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll
More informationPrincipled Reasoning and Practical Applications of Alert Fusion in Intrusion Detection Systems
Principled Reasoning and Practical Applications of Alert Fusion in Intrusion Detection Systems Guofei Gu College of Computing Georgia Institute of Technology Atlanta, GA 3332, USA guofei@cc.gatech.edu
More informationPart 2: One-parameter models
Part 2: One-parameter models Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes
More informationTwo-Sample T-Tests Allowing Unequal Variance (Enter Difference)
Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption
More informationStatistics in Medicine Research Lecture Series CSMC Fall 2014
Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014 Overview Review concept of statistical power
More informationUsing LYNXeon with NetFlow to Complete Your Cyber Security Picture
Using LYNXeon with NetFlow to Complete Your Cyber Security Picture 21CT.COM Combine NetFlow traffic with other data sources and see more of your network, over a longer period of time. Introduction Many
More informationNon-Parametric Tests (I)
Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent
More informationSignal detection and goodness-of-fit: the Berk-Jones statistics revisited
Signal detection and goodness-of-fit: the Berk-Jones statistics revisited Jon A. Wellner (Seattle) INET Big Data Conference INET Big Data Conference, Cambridge September 29-30, 2015 Based on joint work
More informationIntrusion Detection Systems
CSE497b Introduction to Computer and Network Security - Spring 2007 - Professor Jaeger Intrusion Detection Systems CSE497b - Spring 2007 Introduction Computer and Network Security Professor Jaeger www.cse.psu.edu/~tjaeger/cse497b-s07/
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More information1 Nonparametric Statistics
1 Nonparametric Statistics When finding confidence intervals or conducting tests so far, we always described the population with a model, which includes a set of parameters. Then we could make decisions
More informationNormality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com
More informationSystem Specification. Author: CMU Team
System Specification Author: CMU Team Date: 09/23/2005 Table of Contents: 1. Introduction...2 1.1. Enhancement of vulnerability scanning tools reports 2 1.2. Intelligent monitoring of traffic to detect
More informationPermutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
More informationMath 431 An Introduction to Probability. Final Exam Solutions
Math 43 An Introduction to Probability Final Eam Solutions. A continuous random variable X has cdf a for 0, F () = for 0 <
More informationStandard Deviation Estimator
CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of
More informationNon-Inferiority Tests for Two Means using Differences
Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous
More information