ORF 245 Fundamentals of Statistics Chapter 8 Hypothesis Testing


ORF 245 Fundamentals of Statistics
Chapter 8: Hypothesis Testing
Robert Vanderbei, Fall 2015
Slides last edited on December 11, 2015
http://www.princeton.edu/~rvdb

Coin Tossing Example

Consider two coins. Coin 0 is fair (p = 0.5) but coin 1 is biased in favor of heads: p = 0.7. Imagine tossing one of these coins n times. The number of heads is a binomial random variable with the corresponding value of p. The probability mass function is therefore

    p(x) = [n! / (x! (n − x)!)] p^x (1 − p)^(n − x)

For n = 8, the probability mass functions are

    x        0       1       2       3       4       5       6       7       8
    p(x|H0)  0.0039  0.0313  0.1094  0.2188  0.2734  0.2188  0.1094  0.0313  0.0039
    p(x|H1)  0.0001  0.0012  0.0100  0.0467  0.1361  0.2541  0.2965  0.1977  0.0576

Suppose that we know that our coin is either the fair coin or the biased coin. We flip it n = 8 times and it comes up heads 3 times. The probability of getting 3 heads with the fair coin is p(3|H0) = 0.2188, whereas the probability of getting 3 heads with the biased coin is p(3|H1) = 0.0467. Hence, it is 0.2188/0.0467 = 4.69 times more likely that our coin is the fair coin than that it is the biased coin. The ratio p(x|H0)/p(x|H1) is called the likelihood ratio:

    x                 0        1        2        3       4       5       6       7       8
    p(x|H0)/p(x|H1)   59.5374  25.5160  10.9354  4.6866  2.0086  0.8608  0.3689  0.1581  0.0678
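The tables above can be reproduced directly from the binomial probability mass function. A minimal sketch (the pmf is computed from scratch with `math.comb`; no parameters beyond those on the slide):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 8
p0, p1 = 0.5, 0.7  # fair coin (H0) vs. biased coin (H1)

pmf0 = [binom_pmf(x, n, p0) for x in range(n + 1)]
pmf1 = [binom_pmf(x, n, p1) for x in range(n + 1)]
lr = [a / b for a, b in zip(pmf0, pmf1)]  # likelihood ratio p(x|H0)/p(x|H1)
```

For x = 3 this gives p(3|H0) ≈ 0.2188, p(3|H1) ≈ 0.0467, and a likelihood ratio of about 4.69, matching the tables.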

Bayesian Approach

We consider two hypotheses: H0 = "coin 0 is being tossed" and H1 = "coin 1 is being tossed". Hypothesis H0 is called the null hypothesis, whereas H1 is called the alternative hypothesis. Suppose we have a prior belief P(H0) that coin 0 is being tossed, and let P(H1) = 1 − P(H0) denote our prior belief that coin 1 is being tossed. A natural initial choice is P(H0) = P(H1) = 1/2. If we now toss the coin n times and observe x heads, then the posterior probability that coin 0 is being tossed is

    P(H0|x) = p(x|H0) P(H0) / [p(x|H0) P(H0) + p(x|H1) P(H1)]

Hence, we get a simple, intuitive formula for the posterior odds:

    P(H0|x) / P(H1|x) = [P(H0) / P(H1)] × [p(x|H0) / p(x|H1)]

Making a Decision

If we must make a choice, we would probably go with H0 if

    P(H0|x) / P(H1|x) = [P(H0) / P(H1)] × [p(x|H0) / p(x|H1)] > 1

and with H1 otherwise. This inequality can be expressed in terms of the likelihood ratio:

    p(x|H0) / p(x|H1) > P(H1) / P(H0) =: c

where c denotes the ratio of our priors. If we use an unbiased prior, then c = 1, and we will accept H0 if X ≤ 4 and reject H0 if X > 4.
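With a uniform prior (c = 1), this decision rule can be checked for every possible head count; the sketch below reuses a from-scratch binomial pmf:

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p0, p1 = 8, 0.5, 0.7
c = 1.0  # ratio of priors P(H1)/P(H0) under a uniform prior

# accept H0 exactly when the likelihood ratio exceeds c
decisions = ["H0" if binom_pmf(x, n, p0) / binom_pmf(x, n, p1) > c else "H1"
             for x in range(n + 1)]
```

The resulting rule accepts H0 for X ≤ 4 and rejects it for X > 4, as stated above.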

Type I and Type II Errors

With c = 1, we will accept H0 if X ≤ 4 and reject H0 if X > 4.

Type I Error: Reject H0 when it is correct: P(reject H0 | H0) = P(X > 4 | H0) = 0.3634
Type II Error: Accept H0 when it is incorrect: P(accept H0 | H1) = P(X ≤ 4 | H1) = 0.1941

With c = 5, we will accept H0 if X ≤ 2 and reject H0 if X > 2. In this case, we get

Type I Error: P(reject H0 | H0) = P(X > 2 | H0) = 0.8556
Type II Error: P(accept H0 | H1) = P(X ≤ 2 | H1) = 0.0113

The parameter c controls the trade-off between Type-I and Type-II errors.
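The four error probabilities quoted above can be verified by summing binomial tail probabilities (again with a from-scratch pmf):

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p0, p1 = 8, 0.5, 0.7

# c = 1: accept H0 when X <= 4
alpha_1 = 1 - binom_cdf(4, n, p0)  # P(X > 4 | H0), about 0.363
beta_1 = binom_cdf(4, n, p1)       # P(X <= 4 | H1), about 0.194

# c = 5: accept H0 when X <= 2
alpha_5 = 1 - binom_cdf(2, n, p0)  # P(X > 2 | H0), about 0.856
beta_5 = binom_cdf(2, n, p1)       # P(X <= 2 | H1), about 0.011
```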

Type-I and Type-II Errors

                 Reality: H0 true    Reality: H1 true
    Accept H0    Yes!                Type-II error
    Reject H0    Type-I error        Yes!

A Type-I error is also called a false discovery or a false positive. A Type-II error is also called a missed discovery or a false negative.

Neyman-Pearson Paradigm

Abandon the Bayesian approach. Instead, focus on the probabilities of Type-I and Type-II errors associated with a threshold c.

Things to Consider

It is conventional to choose the simpler of two hypotheses as the null. The consequences of incorrectly rejecting one hypothesis may be graver than those of incorrectly rejecting the other. In such a case, the former should be chosen as the null hypothesis, because the probability of falsely rejecting it can be controlled by choosing the significance level. Examples of this kind arise in screening new drugs; frequently, it must be documented rather conclusively that a new drug has a positive effect before it is accepted for general use. In scientific investigations, the null hypothesis is often a simple explanation that must be discredited in order to demonstrate the presence of some physical phenomenon or effect. In criminal matters: innocent until proven guilty!

Terminology

Null Hypothesis: H0
Alternative Hypothesis: H1
Type I Error: Rejecting H0 when it is true.
Significance Level: The probability of a Type-I error. Usually denoted α.
Type II Error: Accepting H0 when it is false. Probability usually denoted β.
Power: 1 − β.
Test Statistic: The measured quantity on which a decision is based.
Rejection Region: The set of values of the test statistic that lead to rejection of the null hypothesis.
Acceptance Region: The set of values of the test statistic that lead to acceptance of the null hypothesis.
Null Distribution: p(x|H0)

Unrealistic Example

Let X1, X2, …, Xn be a random sample from a normal distribution having known variance σ² but unknown mean µ. We consider two hypotheses:

    H0: µ = µ0        H1: µ = µ1

where µ0 and µ1 are two specific possible values (suppose that µ0 < µ1). The likelihood ratio is the ratio of the joint density functions:

    f0(x1, …, xn) / f1(x1, …, xn) = exp( −Σ_{i=1}^n (xᵢ − µ0)² / 2σ² ) / exp( −Σ_{i=1}^n (xᵢ − µ1)² / 2σ² )

We will accept the null hypothesis if this ratio is larger than some positive threshold. Let's call the threshold e^c and take logs of both sides:

    −Σ_{i=1}^n (xᵢ − µ0)² / 2σ² + Σ_{i=1}^n (xᵢ − µ1)² / 2σ² > c

Example Continued

Expanding the squares, combining the two sums, and simplifying, we get

    Σ_{i=1}^n [ 2xᵢ(µ0 − µ1) − µ0² + µ1² ] > 2σ²c

Dividing both sides by 2n and by µ1 − µ0, we get

    −x̄ + (µ0 + µ1)/2 > σ²c / (n(µ1 − µ0))

where x̄ = (1/n) Σ_{i=1}^n xᵢ. Finally, isolating x̄ from the other terms, we get

    x̄ < (µ0 + µ1)/2 − σ²c / (n(µ1 − µ0))

Choosing c = 0 corresponds to equally probable priors (since e⁰ = 1). In this case, we get the intuitive result that we should accept the null hypothesis if

    x̄ < (µ0 + µ1)/2

From these calculations, we see that the test statistic is the sample mean X̄ = (1/n) Σ_{i=1}^n Xᵢ and that we should reject the null hypothesis when the test statistic is larger than some threshold value, call it x0.

Example: Significance Level

In hypothesis testing, one designs the test according to a set value α for the significance level. In this problem, the significance level is given by

    α = P(X̄ > x0 | H0)

If we subtract µ0 from both sides and then divide by σ/√n, we get a random variable with a standard normal distribution:

    α = P( (X̄ − µ0)/(σ/√n) > (x0 − µ0)/(σ/√n) | H0 ) = 1 − F_N( (x0 − µ0)/(σ/√n) )

where F_N denotes the cumulative distribution function of a standard normal random variable. In a similar manner, we can compute the probability of a Type-II error:

    β = P(X̄ ≤ x0 | H1) = P( (X̄ − µ1)/(σ/√n) ≤ (x0 − µ1)/(σ/√n) | H1 ) = F_N( (x0 − µ1)/(σ/√n) )
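These two formulas are easy to evaluate with the standard normal CDF. A sketch with assumed illustrative values µ0 = 0, µ1 = 1, σ = 1, n = 25 (none of these numbers come from the slide):

```python
from statistics import NormalDist

mu0, mu1, sigma, n = 0.0, 1.0, 1.0, 25  # assumed values for illustration
se = sigma / n**0.5                     # sigma / sqrt(n)
x0 = (mu0 + mu1) / 2                    # threshold at the midpoint

alpha = 1 - NormalDist().cdf((x0 - mu0) / se)  # P(Xbar > x0 | H0)
beta = NormalDist().cdf((x0 - mu1) / se)       # P(Xbar <= x0 | H1)
```

With the threshold at the midpoint, the two error probabilities are equal by symmetry (here both are about 0.006).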

Example: α vs. β Trade-Off

[Figure: densities of X̄ under H0 (µ0 = 2) and H1 (µ1 = π), with the areas α and β shaded on either side of the threshold x0.]

As x0 slides from left to right, α goes down whereas β goes up. The sum α + β is minimized at x0 = (µ0 + µ1)/2. To make both α and β smaller, we need to make σ/√n smaller; i.e., increase n.

Generalized Likelihood Ratios

Let X1, …, Xn be iid normal random variables with unknown mean µ and known variance σ². We consider two hypotheses:

    H0: µ = µ0        H1: µ ≠ µ0

The null hypothesis H0 is simple. The alternative H1 is composite. For simple hypotheses, the likelihood that the hypothesis is true given specific observed values x1, x2, …, xn is just the joint pdf. Hence, the likelihood that H0 is true is given by:

    f0(x1, …, xn) = (1/(σ√(2π))ⁿ) exp( −(1/2σ²) Σ_{i=1}^n (xᵢ − µ0)² )

For composite hypotheses, the generalized likelihood is taken to be the largest possible likelihood that could be obtained over all choices of the distribution. So, the likelihood that H1 is true is given by:

    f1(x1, …, xn) = max_µ (1/(σ√(2π))ⁿ) exp( −(1/2σ²) Σ_{i=1}^n (xᵢ − µ)² ) = (1/(σ√(2π))ⁿ) exp( −(1/2σ²) Σ_{i=1}^n (xᵢ − x̄)² )

NOTE: The maximum is achieved at the maximum likelihood estimator, which we have already seen is just x̄.
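The claim that the maximum over µ is attained at the sample mean can be spot-checked numerically. This sketch uses a made-up sample and σ = 1 (both assumptions, not from the slide):

```python
from math import log, pi

def log_lik(data, mu, sigma=1.0):
    """Log-likelihood of iid N(mu, sigma^2) observations."""
    n = len(data)
    return (-n / 2) * log(2 * pi * sigma**2) \
        - sum((x - mu)**2 for x in data) / (2 * sigma**2)

data = [0.3, -0.1, 0.7, 1.2, 0.4]  # made-up sample
xbar = sum(data) / len(data)       # MLE of mu

# the log-likelihood is strictly smaller at any mu != xbar
worse = max(log_lik(data, xbar - 0.2), log_lik(data, xbar + 0.2))
```

Since the sum of squares Σ(xᵢ − µ)² is minimized at µ = x̄, the likelihood at x̄ dominates the likelihood at any other µ.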

Likelihood Ratios Continued

As before, we define our acceptance region based on the log-likelihood ratio:

    Λ(x1, …, xn) = log [ f0(x1, …, xn) / f1(x1, …, xn) ] = −(1/2σ²) Σ_{i=1}^n (xᵢ − µ0)² + (1/2σ²) Σ_{i=1}^n (xᵢ − x̄)²

Note that f0 ≤ f1 and therefore Λ ≤ 0. We accept the null hypothesis when Λ is not too negative:

    Λ(x1, …, xn) > −c

for some positive constant c. Again as before, we expand the quadratics and simplify the inequality to get

    Σ_{i=1}^n [ 2xᵢ(µ0 − x̄) − µ0² + x̄² ] > −2σ²c

Now, we use the fact that Σᵢ xᵢ = n x̄ to help us further simplify:

    2x̄(µ0 − x̄) − µ0² + x̄² > −2σ²c/n

Some final algebraic manipulations give:

    (x̄ − µ0)² < 2σ²c/n

Likelihood Ratios Continued

And, switching to capital-letter random-variable notation, we get that we should accept H0 if

    |X̄ − µ0| < √(2c) σ/√n

or, in other words,

    µ0 − √(2c) σ/√n < X̄ < µ0 + √(2c) σ/√n
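The resulting two-sided rule is simple enough to code directly. A sketch with assumed values (µ0 = 0, σ = 1, n = 100, and c chosen so that √(2c) = 1.96, which corresponds to the familiar 5% level; none of these numbers come from the slide):

```python
from math import sqrt

mu0, sigma, n = 0.0, 1.0, 100  # assumed values for illustration
c = 1.9208                     # sqrt(2c) = 1.96, the 5% two-sided cutoff

half_width = sqrt(2 * c) * sigma / sqrt(n)

def accept_H0(xbar):
    """Accept H0: mu = mu0 when |xbar - mu0| < sqrt(2c) * sigma / sqrt(n)."""
    return abs(xbar - mu0) < half_width
```

Here half_width = 0.196, so a sample mean of 0.1 is accepted while 0.3 is rejected.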

More Realistic Example

As before, let X1, X2, …, Xn be a random sample from a normal distribution having known variance σ² but unknown mean µ. This time, let's consider these two hypotheses:

    H0: µ = µ0        H1: µ ≠ µ0

where µ0 is a specific given value. Reject the null hypothesis if

    |X̄ − µ0| > x0

where x0 is chosen so that P_{H0}( |X̄ − µ0| > x0 ) = α (note that the probability is computed assuming H0 is true). Since we know that the distribution is normal, we see that

    x0 = z(α/2) σ_{X̄}

Confidence Interval vs. Hypothesis Test

It is easy to check that

    P_{H0}( |X̄ − µ0| > z(α/2) σ_{X̄} ) = α

is equivalent to

    P_{H0}( X̄ − z(α/2) σ_{X̄} ≤ µ0 ≤ X̄ + z(α/2) σ_{X̄} ) = 1 − α

The 100(1 − α)% confidence interval for µ is given by

    P_µ( X̄ − z(α/2) σ_{X̄} ≤ µ ≤ X̄ + z(α/2) σ_{X̄} ) = 1 − α

Comparing the acceptance region of the test to the confidence interval, we see that µ0 lies in the confidence interval for µ if and only if the hypothesis test accepts the null hypothesis. In other words, the confidence interval consists precisely of those values of µ0 for which the null hypothesis H0: µ = µ0 is accepted.
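The duality can be demonstrated with a tiny check: a hypothesized µ0 is accepted by the test exactly when it falls inside the confidence interval. All numbers here (x̄ = 10.3, σ = 2, n = 16, α = 0.05) are assumed for illustration:

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2), about 1.96

xbar, sigma, n = 10.3, 2.0, 16           # assumed sample summary
se = sigma / n**0.5

ci = (xbar - z * se, xbar + z * se)      # 95% confidence interval for mu

def accept(mu0):
    """The level-alpha test accepts H0: mu = mu0 iff |xbar - mu0| <= z*se."""
    return abs(xbar - mu0) <= z * se
```

For any candidate µ0, `accept(mu0)` is True exactly when ci[0] ≤ µ0 ≤ ci[1].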

p-values

Local Climate Data: Forty-Year Differences

[Figure: forty-year temperature differences plotted against date, 1995-2010, ranging from about -40 to +40 degrees.]

On a per-century basis: X̄ = 4.25 °F/century, S = 26.5 °F/century, S/√n = 0.35 °F/century.

Hypotheses: H0: µ = 0, H1: µ ≠ 0.

Number of standard deviations out: z = X̄/S_{X̄} = 4.25/0.35 = 12.1.

Probability that such an outlier happened by chance: p = P(N > 12.1) = 5 × 10⁻³⁴. Now I'm way convinced!

Caveat: The random variables are not fully independent. If it's warmer than average today, then it is likely to be warmer than average tomorrow. Had we used daily averages from just one day a week, the variables would be closer to independent. We would still reject the null hypothesis, but z would be smaller (by a factor of √7) and so the p-value would be larger.
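The z statistic and p-value on this slide can be reproduced in a few lines; `math.erfc` is used because the tail probability is far below where 1 − Φ(z) underflows in double precision:

```python
from math import erfc, sqrt

xbar, se = 4.25, 0.35  # F/century, from the slide
z = xbar / se          # about 12.1 standard errors from zero

# one-sided tail P(N > z) = erfc(z / sqrt(2)) / 2
p = erfc(z / sqrt(2)) / 2
```

This gives z ≈ 12.14 and p on the order of 10⁻³⁴, in line with the slide's 5 × 10⁻³⁴.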