Hypothesis Testing: p-value


STAT 101, Dr. Kari Lock Morgan
Hypothesis Testing, SECTION 4.2: Randomization distribution, p-value

Paul the Octopus
http://www.youtube.com/watch?v=3esgpumj9e

Hypotheses
In 2010, Paul the Octopus predicted the outcomes of 8 World Cup games, and predicted them all correctly. Is this evidence that Paul's chance of guessing correctly, p, is really greater than 50%? What are the null and alternative hypotheses?
a) H0: p ≠ 0.5, Ha: p = 0.5
b) H0: p = 0.5, Ha: p ≠ 0.5
c) H0: p = 0.5, Ha: p > 0.5
d) H0: p > 0.5, Ha: p = 0.5

Key Question
How unusual is it to see a sample statistic as extreme as that observed, if H0 is true? If it is very unusual, we have statistically significant evidence against the null hypothesis.
Today's Question: How do we measure how unusual a sample statistic is, if H0 is true?

Measuring Evidence against H0
To see if a statistic provides evidence against H0, we need to see what kind of sample statistics we would observe, just by random chance, if H0 were true.

Paul the Octopus
We need to know what kinds of statistics we would observe just by random chance, if the null hypothesis were true. How could we figure this out? Simulate many samples of size n = 8 with p = 0.5.

Simulate!
We can simulate this with a coin! Each coin flip = a guess between two teams (Heads = correct, Tails = incorrect). Flip a coin 8 times, count the number of heads, and calculate the sample proportion of heads. Did you get all 8 heads (correct)?
a) Yes
b) No
How extreme is Paul's sample proportion of 1?

Paul the Octopus
Based on your simulation results, for a sample size of n = 8, do you think p̂ = 1 is statistically significant?
a) Yes
b) No

Randomization Distribution
A randomization distribution is a collection of statistics from samples simulated assuming the null hypothesis is true.

Lots of simulations!
For a better randomization distribution, we need many more simulations!
www.lock5stat.com/statkey

Randomization Distribution
The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true.

Paul the Octopus
Based on StatKey's simulation results, for a sample size of n = 8, do you think p̂ = 1 is statistically significant?
a) Yes
b) No
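
The coin-flipping exercise can also be run in software. The following is a minimal Python sketch of that idea, not StatKey's implementation; the function name simulate_paul, the number of simulations, and the seed are assumptions chosen for illustration.

```python
import random
from collections import Counter

def simulate_paul(n_sims=10_000, n_games=8, p_null=0.5, seed=1):
    """Simulate sample proportions of correct guesses assuming H0: p = p_null."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    props = []
    for _ in range(n_sims):
        # One simulated sample: 8 "coin flips", each correct with probability p_null
        correct = sum(rng.random() < p_null for _ in range(n_games))
        props.append(correct / n_games)
    return props

props = simulate_paul()
counts = Counter(props)
# Tabulate the randomization distribution of the sample proportion
for value in sorted(counts):
    print(f"p-hat = {value:.3f}: {counts[value]} of {len(props)} simulations")
```

The printed table is a text version of the randomization distribution: most simulated proportions sit near 0.5, and only a small share of samples reach 1.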

Key Question
How unusual is it to see a sample statistic as extreme as that observed, if H0 is true? A randomization distribution tells us what kinds of statistics we would see just by random chance, if the null hypothesis is true. This makes it straightforward to assess how extreme the observed statistic is!

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. What do we require about the method to produce randomization samples?
a) μ = 12
b) μ < 12
c) x̄ = 10.2
We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. Where will the randomization distribution be centered?
a) 10.2
b) 12
c) 45
d) 1.8
Randomization distributions are always centered around the null hypothesized value.

Randomization Distribution Center
A randomization distribution simulates samples assuming the null hypothesis is true, so a randomization distribution is centered at the value of the parameter given in the null hypothesis.

Randomization Distribution
In a hypothesis test for H0: μ = 12 vs Ha: μ < 12, we have a sample with n = 45 and x̄ = 10.2. What will we look for on the randomization distribution?
a) How extreme 10.2 is
b) How extreme 12 is
c) How extreme 45 is
d) What the standard error is
e) How many randomization samples we collected
We want to see how extreme the observed statistic is.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. What do we require about the method to produce randomization samples?
a) μ1 = μ2
b) μ1 > μ2
c) x̄1 = 26, x̄2 = 21
d) x̄1 − x̄2 = 5
We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. Where will the randomization distribution be centered?
a) 0
b) 1
c) 21
d) 26
e) 5
The randomization distribution is centered around the null hypothesized value, μ1 − μ2 = 0.

Randomization Distribution
In a hypothesis test for H0: μ1 = μ2 vs Ha: μ1 > μ2, we have a sample with x̄1 = 26 and x̄2 = 21. What do we look for on the randomization distribution?
a) The standard error
b) The center point
c) How extreme 26 is
d) How extreme 21 is
e) How extreme 5 is
We want to see how extreme the observed difference in means is.

Quantifying Evidence
We need a way to quantify evidence against the null.

p-value
The p-value is the chance of obtaining a sample statistic as extreme as (or more extreme than) the observed sample statistic, if the null hypothesis is true. The p-value can be calculated as the proportion of statistics in a randomization distribution that are as extreme as (or more extreme than) the observed sample statistic.

p-value (1000 simulations)
Paul the Octopus: the p-value is the chance of getting all 8 out of 8 guesses correct, if p = 0.5. What proportion of statistics in the randomization distribution are as extreme as p̂ = 1?
Proportion as extreme as p̂ = 1: p-value = 0.004
If Paul is just guessing, the chance of him getting all 8 correct is 0.004.
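
For Paul's example the simulated proportion can be checked against an exact calculation, because under H0 each of the 8 guesses is an independent 50/50 outcome. This worked equation is a supplement rather than part of the original slides; X, the number of correct guesses out of 8, is a symbol introduced here.

```latex
P(X = 8 \mid p = 0.5) = \left(\tfrac{1}{2}\right)^{8} = \tfrac{1}{256} \approx 0.0039
```

This agrees with the simulated p-value of 0.004 to the precision that 1000 simulations can provide.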

Calculating a p-value
1. What kinds of statistics would we get, just by random chance, if the null hypothesis were true? (randomization distribution)
2. What proportion of these statistics are as extreme as our original sample statistic? (p-value)

ESP
For our ESP example, the p-value is the chance of getting a sample proportion as high as 0.26, from a sample of n = 98, if p = 0.2. Simulate a randomization distribution with p = 0.2 and n = 98, and see what proportion of simulated statistics are as extreme as 0.26.
www.lock5stat.com/statkey

ESP
Proportion as extreme as p̂ = 0.26: p-value = 0.072
If you were all just guessing randomly, the chance of us getting a sample proportion as high as 0.26 is 0.072.

Randomization Distributions
p-values can be calculated by randomization distributions:
1. simulate samples, assuming H0 is true
2. calculate the statistic of interest for each sample
3. find the p-value as the proportion of simulated statistics as extreme as the observed statistic
Let's do a randomization distribution for a randomized experiment.

Cocaine Addiction
In a randomized experiment on treating cocaine addiction, 48 people were randomly assigned to take either Desipramine (a new drug) or Lithium (an existing drug), and then followed to see who relapsed.
Question of interest: Is Desipramine better than Lithium at treating cocaine addiction?

Cocaine Addiction
What are the null and alternative hypotheses?
p_D, p_L: proportion of cocaine addicts who relapse after taking Desipramine or Lithium, respectively
H0: p_D = p_L
Ha: p_D < p_L
What are the possible conclusions?
Reject H0: Desipramine is better than Lithium.
Do not reject H0: We cannot determine from these data whether Desipramine is better than Lithium.
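
Returning to the ESP example, the three-step recipe (simulate under H0, compute the statistic, take the proportion as extreme as observed) can be sketched in Python. This is an illustrative sketch under stated assumptions, not StatKey's code: the function name esp_p_value, the 10,000 simulations, and the seed are choices made here, so the result will be close to, but not exactly, the 0.072 shown on the slide.

```python
import random

def esp_p_value(n=98, p_null=0.2, observed=0.26, n_sims=10_000, seed=1):
    """Upper-tail randomization p-value for H0: p = p_null vs Ha: p > p_null."""
    rng = random.Random(seed)  # seed is an assumption, used only for repeatability
    extreme = 0
    for _ in range(n_sims):
        # Step 1: simulate one sample of n guesses, assuming H0 is true
        correct = sum(rng.random() < p_null for _ in range(n))
        # Step 2: compute the statistic of interest (sample proportion)
        p_hat = correct / n
        # Step 3: count simulated statistics at least as high as the observed one
        if p_hat >= observed:
            extreme += 1
    return extreme / n_sims

p_value = esp_p_value()
print(f"simulated p-value: {p_value:.3f}")
```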

Cocaine Addiction (R = Relapse, N = No Relapse)
1. Randomly assign units to treatment groups (Desipramine, Lithium)
2. Conduct experiment
3. Observe relapse counts in each group
Desipramine: 10 relapse, 14 no relapse
Lithium: 18 relapse, 6 no relapse
p̂_D − p̂_L = 10/24 − 18/24 = −0.333

Measuring Evidence against H0
To see if a statistic provides evidence against H0, we need to see what kind of sample statistics we would observe, just by random chance, if H0 were true.

Cocaine Addiction
"by random chance" means by the random assignment to the two treatment groups.
"if H0 were true" means if the two drugs were equally effective at preventing relapses (equivalently: whether a person relapses or not does not depend on which drug is taken).
Simulate what would happen just by random chance, if H0 were true.

Observed assignment
Desipramine: 10 relapse, 14 no relapse
Lithium: 18 relapse, 6 no relapse

Simulate another randomization
Desipramine: 16 relapse, 8 no relapse
Lithium: 12 relapse, 12 no relapse
p̂_D − p̂_L = 16/24 − 12/24 = 0.167
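
The reshuffling shown above can be repeated many times in code. The sketch below is one way to do it in Python, assuming (as H0 implies) that each person's outcome is fixed and only the group labels are random; the function name cocaine_p_value, the simulation count, and the seed are assumptions, and this is not StatKey's implementation. The proportion of reshuffles with a difference at least as negative as the observed one estimates the one-sided p-value for Ha: p_D < p_L.

```python
import random

def cocaine_p_value(n_sims=10_000, seed=1):
    """One-sided randomization p-value for H0: p_D = p_L vs Ha: p_D < p_L."""
    # 28 relapses (R) and 20 non-relapses (N) in total, as observed
    outcomes = ["R"] * 28 + ["N"] * 20
    observed_diff = 10 / 24 - 18 / 24  # p-hat_D minus p-hat_L = -0.333
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    extreme = 0
    for _ in range(n_sims):
        # Re-randomize: the first 24 go to Desipramine, the remaining 24 to Lithium
        rng.shuffle(outcomes)
        desipramine, lithium = outcomes[:24], outcomes[24:]
        diff = desipramine.count("R") / 24 - lithium.count("R") / 24
        # Lower tail; the small tolerance guards against float rounding
        if diff <= observed_diff + 1e-12:
            extreme += 1
    return extreme / n_sims

p_value = cocaine_p_value()
print(f"simulated one-sided p-value: {p_value:.3f}")
```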

Simulate another randomization
Desipramine: 17 relapse, 7 no relapse
Lithium: 11 relapse, 13 no relapse
p̂_D − p̂_L = 17/24 − 11/24 = 0.250
www.lock5stat.com/statkey

p-value
Proportion as extreme as the observed statistic: if the two drugs are equal regarding cocaine relapse rates, we have a 1.3% chance of seeing a difference in proportions as extreme as that observed.

Death Penalty
A random sample of people were asked "Are you in favor of the death penalty for a person convicted of murder?" Did the proportion of Americans who favor the death penalty decrease from 1980 to 2010?

        Yes   No
1980    663   342
2010    640   360

"Death Penalty," Gallup, www.gallup.com

Death Penalty
p_1980, p_2010: proportion of Americans who favor the death penalty in 1980, 2010
H0: p_1980 = p_2010
Ha: p_1980 > p_2010
p̂_1980 = 0.66, p̂_2010 = 0.64
So the sample statistic is: p̂_1980 − p̂_2010 = 0.66 − 0.64 = 0.02
How extreme is 0.02, if p_1980 = p_2010? (StatKey)

Death Penalty
p-value = 0.164
If the proportion supporting the death penalty has not changed from 1980 to 2010, we would see differences this extreme about 16% of the time.

Alternative Hypothesis
A one-sided alternative contains either > or <.
A two-sided alternative contains ≠.
The p-value is the proportion in the tail in the direction specified by Ha.
For a two-sided alternative, the p-value is twice the proportion in the smallest tail.
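
For the death penalty example, an upper-tail p-value can be approximated by pooling all responses and reshuffling them between the two years. This is a hedged Python sketch: the reshuffling scheme, the function name death_penalty_p_value, the 2,000 simulations, and the seed are all assumptions made here. StatKey's exact procedure is not described in the slides, so a run of this sketch should not be expected to reproduce the 0.164 above exactly.

```python
import random

def death_penalty_p_value(n_sims=2_000, seed=1):
    """Upper-tail randomization p-value for H0: p_1980 = p_2010 vs Ha: p_1980 > p_2010."""
    yes_1980, n_1980 = 663, 663 + 342
    yes_2010, n_2010 = 640, 640 + 360
    observed_diff = yes_1980 / n_1980 - yes_2010 / n_2010
    total_yes = yes_1980 + yes_2010
    # Pool every response: 1 = "Yes", 0 = "No"
    pooled = [1] * total_yes + [0] * (342 + 360)
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    extreme = 0
    for _ in range(n_sims):
        # Under H0 the year label is unrelated to the answer, so reshuffle labels
        rng.shuffle(pooled)
        sim_yes_1980 = sum(pooled[:n_1980])
        sim_yes_2010 = total_yes - sim_yes_1980
        diff = sim_yes_1980 / n_1980 - sim_yes_2010 / n_2010
        # Upper tail, because Ha contains ">"; tolerance guards against rounding
        if diff >= observed_diff - 1e-12:
            extreme += 1
    return extreme / n_sims

p_value = death_penalty_p_value()
print(f"simulated upper-tail p-value: {p_value:.3f}")
```

Either way the conclusion matches the slides: a difference of about 0.02 is not unusual when the two proportions are equal.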

p-value and Ha
Upper-tail (Right Tail): H0: μ = 0, Ha: μ > 0, x̄ = 2
Lower-tail (Left Tail): H0: μ = 0, Ha: μ < 0, x̄ = 1
Two-tailed: H0: μ = 0, Ha: μ ≠ 0, x̄ = 2

Sleep versus Caffeine
Recall the sleep versus caffeine experiment from last class. μ_s and μ_c are the mean number of words recalled after sleeping and after caffeine.
H0: μ_s = μ_c
Ha: μ_s ≠ μ_c
Let's find the p-value! (Two-tailed alternative)
www.lock5stat.com/statkey

Sleep or Caffeine for Memory?
www.lock5stat.com/statkey
p-value = 2 × 0.022 = 0.044

p-value and H0
If the p-value is small, then a statistic as extreme as that observed would be unlikely if the null hypothesis were true, providing significant evidence against H0. The smaller the p-value, the stronger the evidence against the null hypothesis and in favor of the alternative.
[Figure: randomization distribution of x̄_S − x̄_C when H0 is true, with the observed x̄_S − x̄_C = 3 marked]

p-value and H0
The smaller the p-value, the stronger the evidence against H0.

Summary
The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true.
A p-value is the chance of getting a statistic as extreme as that observed, if H0 is true.
A p-value can be calculated as the proportion of statistics in the randomization distribution as extreme as the observed sample statistic.
The smaller the p-value, the greater the evidence against H0.
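
The tail rules used in the examples (upper tail, lower tail, and twice the smaller tail for a two-sided alternative) can be collected into one small helper. This is a sketch, not a standard library function: the name randomization_p_value and the tail labels are invented here, capping the two-tailed value at 1 is an added safeguard, and the toy statistics are made up purely to exercise the function.

```python
def randomization_p_value(stats, observed, tail):
    """Proportion of simulated statistics as extreme as `observed`.

    tail: "upper" if Ha contains >, "lower" if Ha contains <,
          "two" if Ha contains the not-equal sign.
    """
    n = len(stats)
    upper = sum(s >= observed for s in stats) / n
    lower = sum(s <= observed for s in stats) / n
    if tail == "upper":
        return upper
    if tail == "lower":
        return lower
    if tail == "two":
        # Twice the proportion in the smaller tail, capped at 1 (cap is an assumption)
        return min(1.0, 2 * min(upper, lower))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# Toy, made-up randomization statistics (NOT data from the slides)
toy_stats = [-3, -2, -1, -1, 0, 0, 1, 1, 2, 3]
print(randomization_p_value(toy_stats, 2, "upper"))  # 2 of the 10 values are >= 2
print(randomization_p_value(toy_stats, 2, "two"))    # twice the smaller tail
```

The sleep versus caffeine calculation, 2 × 0.022 = 0.044, is the "two" branch applied to a smaller-tail proportion of 0.022.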

To Do
Read Section 4.2
Project 1 proposal (due Wednesday, 2/19)