Introduction to Hypothesis Testing




Slide 1. Introduction to Hypothesis Testing

Slide 2. Hypothesis Testing
- A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population.
- The hypothesis is stated in terms of the population.
- Predict sample statistics based on population parameters (e.g. µ).
- Select a random sample from the population.
- Compare the observed sample data with the predicted values.

Slide 3. Step 1: State the Hypotheses
- The null hypothesis, H0, states that in the population there is no change, no difference, or no relationship.
- H0: µ_treatment = constant (e.g. µ)
- e.g. H0: µ_treatment = 100
- This is read as: "The null hypothesis is that the population mean of people receiving the treatment equals 100."
- H0 states that the treatment had no effect.

Slide 4. H0
- The null hypothesis must contain an equal sign of some sort (=, ≤, ≥).
- Statistical tests are designed to reject H0, never to accept it.

Slide 5. H1: The Alternative Hypothesis
- The alternative hypothesis usually takes the following form: H1: µ_treatment ≠ constant (e.g. µ)
- e.g. H1: µ_treatment ≠ 100
- This is read as: "The alternative hypothesis states that the population mean of people receiving the treatment does not equal 100."
- H1 states that the treatment had an effect.

Slide 6. H0 and H1
- Together, the null and alternative hypotheses must be mutually exclusive and exhaustive.
- Mutually exclusive means that H0 and H1 cannot both be true at the same time.
- Exhaustive means that each of the possible outcomes of the experiment must make either H0 or H1 true.

Slide 7. Step 2: Set the Decision Criteria
- What sample means are consistent with H0 and what sample means are consistent with H1?
- Separate the distribution of sample means into two sets of regions: one whose means are consistent with H0 and one whose means are consistent with H1.
- [Figure: distribution of sample means for n = 25, µ = 100, σ = 15, x-axis from 90 to 110. Center: sample means close to H0, high-probability values if H0 is true. Both tails: extreme, low-probability values if H0 is true.]

Slide 8. α Level
- The α level (alpha level; level of significance) is a probability value that is used to define the very unlikely sample outcomes if H0 is true.
- Psychologists usually adopt α = 0.05, although α = 0.01 and α = 0.001 are sometimes used.
- The critical region is composed of the extreme sample values that are very unlikely (as specified by the α level) to be obtained if H0 is true.

Slide 9. Critical Regions
- Since we can reject H0 two ways (extremely small or extremely large sample means), the α level is divided across the two tails of the distribution.
- Find the z-score whose area above equals α / 2: z = 1.96 for α = 0.05.
- Find the raw scores that correspond to that z score, using the standard error σ / √n = 15 / √25 = 3:
  X = 100 + 1.96 × 3 = 105.9
  X = 100 − 1.96 × 3 = 94.1
- [Figure: distribution of sample means, x-axis from 90 to 110. Center: sample means close to H0, high-probability values if H0 is true. Tails beyond z = −1.96 and z = 1.96: extreme, low-probability values if H0 is true.]
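The cutoffs on slide 9 can be reproduced numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` in place of a printed z table; the variable names are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the slides: mu = 100, sigma = 15, n = 25, alpha = 0.05.
mu, sigma, n, alpha = 100, 15, 25, 0.05

# Standard error of the mean: sigma / sqrt(n) = 3.
standard_error = sigma / sqrt(n)

# Two-tailed test: place alpha / 2 in each tail of the standard normal.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Convert the critical z-scores back to raw sample means.
upper_cutoff = mu + z_critical * standard_error
lower_cutoff = mu - z_critical * standard_error

print(round(z_critical, 2), round(lower_cutoff, 1), round(upper_cutoff, 1))
# prints: 1.96 94.1 105.9
```

Sample means below about 94.1 or above about 105.9 fall in the critical region.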

Slide 10. Step 3: Collect Data & Compute Sample Statistics
- Randomly sample from the population. In this example, n = 25.
- Give the sample the treatment.
- Measure the dependent variable.
- Calculate the z score of the sample mean in the sampling distribution.
- In this example the sample statistics are: sample mean = 107, s = 14. The population parameters are those from slide 7 (IQs).

Slide 11. Step 4: Make a Decision
- If the sample mean's z-score is in the extreme tails of the sampling distribution (i.e. in the critical region), reject H0; otherwise, fail to reject H0.
- The critical region is z > 1.96 or z < −1.96 for α = 0.05.
- The example z is 2.33. It is in the critical region. Therefore, reject H0.
- It is likely that the treatment had an effect.
- [Figure: distribution of sample means, x-axis from 90 to 110, with critical regions beyond z = −1.96 and z = 1.96. The sample mean of 107 (z = 2.33) falls in the upper critical region.]

Slide 12. Reject H0 or Fail to Reject H0
- The only decisions you ever make in hypothesis testing are "reject H0" or "fail to reject H0".
- No other decisions are possible.
- Never reject H1.
- Never accept H1.
- Never accept H0.
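Steps 3 and 4 amount to one calculation and one comparison. The sketch below is an illustration using the numbers from the slides (µ = 100, σ = 15, n = 25, sample mean 107); the variable names are assumptions made for the example.

```python
from math import sqrt

# Population parameters and sample statistics from the slides.
mu, sigma, n, sample_mean = 100, 15, 25, 107
z_critical = 1.96  # two-tailed critical value for alpha = 0.05 (slide 9)

# z score of the sample mean in the distribution of sample means.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu) / standard_error

# The only two possible decisions: reject H0 or fail to reject H0.
decision = "reject H0" if abs(z) > z_critical else "fail to reject H0"
print(round(z, 2), decision)  # prints: 2.33 reject H0
```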

Slide 13. Type I (α) Error
- A type I (or α) error occurs when a researcher rejects H0 when H0 is really true.
- The researcher concludes that the treatment had an effect when it did not.
- This should happen with a probability equal to α.

Slide 14. Type II (β) Errors
- A type II (or β) error occurs when a researcher fails to reject H0 when H0 is really false.
- The researcher concludes that there is insufficient evidence to suggest that the treatment had an effect when in fact it does have an effect.
- This should happen with a probability equal to β.

Slide 15. β
- Unlike α, β is not directly set by the researcher.
- β depends on the sample size (n).
- β depends on how much the treatment affects the dependent variable.
- β depends on the variability of the data.
- β depends on α.
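The claim that a Type I error happens with probability α can be checked by simulation. This is a rough sketch, not part of the original slides: it assumes normally distributed scores and repeatedly samples from a population in which H0 is true; the function name and the defaults for trial count and seed are hypothetical choices.

```python
import random
from math import sqrt

def type_one_error_rate(trials=10_000, mu=100, sigma=15, n=25,
                        z_critical=1.96, seed=0):
    """Estimate how often a two-tailed z test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the treatment has no effect.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / standard_error
        if abs(z) > z_critical:
            rejections += 1  # H0 is true here, so every rejection is a Type I error
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # expected to land near alpha = 0.05
```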

Slide 16. Type-I and Type-II Errors
- Ideally, we would like to minimize both Type-I and Type-II errors.
- This is not possible for a given sample size.
- When we lower the α level to minimize the probability of making a Type-I error, the β level will rise.
- When we lower the β level to minimize the probability of making a Type-II error, the α level will rise.

Slide 17. Type-I and Type-II Errors

Slide 18. Factors that Influence a Hypothesis Test
- The size of the mean difference: the larger the mean difference is, the more likely you are to reject H0.
- The variability of the scores: the more variable the scores are, the less likely you are to reject H0.
- The number of scores in the sample: the larger the sample size, the more likely you are to reject H0.
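The trade-offs on slides 16 and 18 can be made concrete by computing β directly for a two-tailed z test. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the 5-point true effect is an invented scenario, not a value from the slides.

```python
from math import sqrt
from statistics import NormalDist

def beta_two_tailed(alpha, effect, sigma, n):
    """Probability of a Type II error for a two-tailed z test."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Shift of the sampling distribution, in standard-error units.
    shift = effect / (sigma / sqrt(n))
    # Power: chance the sample z lands in either critical region when H0 is false.
    power = (1 - std_normal.cdf(z_critical - shift)) + std_normal.cdf(-z_critical - shift)
    return 1 - power

# Hypothetical scenario: a true 5-point effect with sigma = 15.
baseline = beta_two_tailed(0.05, 5, 15, 25)
lower_alpha = beta_two_tailed(0.01, 5, 15, 25)     # lower alpha -> higher beta
larger_sample = beta_two_tailed(0.05, 5, 15, 100)  # larger n -> lower beta
more_variable = beta_two_tailed(0.05, 5, 30, 25)   # more variability -> higher beta

print(round(baseline, 2), round(lower_alpha, 2),
      round(larger_sample, 2), round(more_variable, 2))
```

For a fixed sample size, lowering α raises β; increasing n is the way to lower β without raising α.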

Slide 19. Assumptions of the z-score Hypothesis Test
- Random sampling: if the sample is not selected randomly from the population, it probably will not represent the population.
- Independent observations.
- σ does not change as a result of the treatment.
- The distribution of sample means is normal.

Slide 20. Directional vs Non-Directional Hypotheses
- The hypotheses we have been talking about are called non-directional hypotheses because they do not specify how the population mean should differ from the constant.
- That is, they do not say that the population mean should be larger than the constant.
- They only state that the population mean should differ from the constant.
- Tests of non-directional hypotheses are sometimes called two-tailed tests.

Slide 21. Directional vs Non-Directional Hypotheses
- Directional hypotheses include an ordinal relation between the population mean and the constant.
- That is, they state that the population mean should be larger than the constant.
- For directional hypotheses, H0 and H1 are written as:
  H0: µ_treatment ≤ constant
  H1: µ_treatment > constant
- Tests of directional hypotheses are sometimes called one-tailed tests.

Slide 22. 1 Tailed
- When performing a one-tailed test, all of the critical region is in one tail of the distribution of sample means.
- Do not divide α by two when finding the z score for the critical region.
- This increases statistical power (the probability of correctly rejecting a false H0).

Slide 23. 1 Tailed vs. 2 Tailed
- [Figure: two standard normal curves, z-axis from −3 to 3. 1 tailed: α = .05, critical region in one tail beyond z = 1.65. 2 tailed: α = .05, critical regions in two tails beyond z = −1.96 and z = 1.96.]

Slide 24. Concerns about Hypothesis Testing
- Hypothesis testing focuses on the data, and not the hypothesis.
- When we reject H0, we should really say: "This specific sample mean is very unlikely (p < .05) if the null hypothesis is true."
- Statistical significance ≠ practical significance.
- The effect size can be small, but still be statistically significant if the sample size is sufficiently large.
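The two critical values compared on slide 23 follow from where α is placed. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` instead of a z table (the slides round the one-tailed value of 1.645 up to 1.65):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# One-tailed: all of alpha sits in a single tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split, with alpha / 2 in each tail.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3), round(z_two_tailed, 3))  # prints: 1.645 1.96
```

Because the one-tailed cutoff is closer to the center of the distribution, a sample mean in the predicted direction needs to be less extreme to reach the critical region.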

Slide 25. Effect Size
- A measure of effect size is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used.
- Cohen's d is a measure of effect size.

Slide 26. Effect Size
- What is the effect size for the example on slide 5?

  Magnitude of d    Evaluation of Effect Size
  d = 0.2           Small effect
  d = 0.5           Medium effect
  d = 0.8           Large effect

- This is a small effect.

Slide 27. Statistical Power
- Statistical power is the probability that a statistical test will correctly reject a false H0.
- It is the probability that a statistical test will identify a treatment effect if one really exists.
- Power = 1 − β = 1 − probability of a Type II error
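For a one-sample design, Cohen's d is commonly computed as the mean difference divided by the standard deviation. The sketch below assumes that definition, since the formula did not survive in the transcription. As an illustration it uses the hypothesized 3-point IQ increase with σ = 15 from the power example on slide 29. The function names are hypothetical, and treating the benchmarks in the table as lower cutoffs is a simplifying assumption.

```python
def cohens_d(mean_treatment, mean_null, sigma):
    """Cohen's d: the mean difference in standard-deviation units."""
    return (mean_treatment - mean_null) / sigma

def label_effect(d):
    """Rough label using the benchmarks 0.2, 0.5, and 0.8 as lower cutoffs."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small benchmark"

d = cohens_d(103, 100, 15)
print(d, label_effect(d))  # prints: 0.2 small
```

Note that n does not appear anywhere in the calculation, which is what makes d independent of sample size.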

Slide 28. Statistical Power
- Calculate power before performing the study.
- Need to know or estimate:
  - How much the treatment changes the dependent variable
  - Sample size
  - α
  - σ, µ

Slide 29. Statistical Power Example
- How much the treatment changes the dependent variable: researchers hypothesize that having proper nutrition during the first two years will increase IQ by 3 points (note that this is a one-tailed hypothesis).
- µ = 100, σ = 15
- Sample size: n = 25
- α = .05

Slide 30. Distribution of Sample Means
- If the treatment has no effect, by the central limit theorem, the distribution of sample means will have:
  - a mean = population mean = 100
  - a standard deviation = σ / √n = 15 / √25 = 3
- If the treatment has the hypothesized effect, the distribution of sample means will have:
  - a mean = population mean + effect of treatment = 100 + 3 = 103
  - a standard deviation = σ / √n = 15 / √25 = 3
- Adding a constant to all scores does not change the standard deviation.
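The two distributions of sample means on slide 30 can be written down directly. This is a small sketch using the slide's numbers; the variable names are illustrative, and `statistics.NormalDist` is used only as a convenient container for a mean and a standard deviation.

```python
from math import sqrt
from statistics import NormalDist

# Values from slide 29: mu = 100, sigma = 15, n = 25, hypothesized effect = 3.
mu, sigma, n, effect = 100, 15, 25, 3

standard_error = sigma / sqrt(n)  # 15 / 5 = 3

# Sampling distributions of the mean without and with the treatment effect.
no_treatment = NormalDist(mu, standard_error)
with_treatment = NormalDist(mu + effect, standard_error)

print(no_treatment.mean, with_treatment.mean, with_treatment.stdev)
# prints: 100.0 103.0 3.0
```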

Slide 31. z Score of Critical Region
- This is a one-tailed test with α = .05.
- Consult a table to find the z with an area above equal to .05: z = 1.65.

Slide 32. Statistical Power Example
- [Figure: x-axis of sample means from 91 to 115 in steps of 3, with a z axis marked at 0, 1, 1.65, and 2.]

Slide 33. Statistical Power Example
- Power equals the area to the right of the z score for the critical region under the treatment distribution of sample means.
- Areas to the right of the z score for the critical region correspond to rejecting H0.
- Areas under the treatment distribution of sample means correspond to a false H0.
- Both combined correspond to rejecting a false H0 = power.

Slide 34. Statistical Power Example
- Find the z score in the treatment distribution of sample means that is at the same location as the z score for the critical region in the no-treatment distribution of sample means:
  z_treatment = z_critical region − z_mean of treatment
  z_mean of treatment = (103 − 100) / 3 = 1
  z_treatment = 1.65 − 1 = 0.65
- Power = area above z = 0.65
- Power = .26
- There is only about a 1 in 4 chance of observing this effect.

Slide 35. Factors that Influence Power
- Sample size: as sample size increases, power increases.
- α level: as α decreases (fewer Type I errors), β increases (more Type II errors), and 1 − β (power) decreases.
- Number of tails (directional vs non-directional): one-tailed tests have more statistical power than two-tailed tests. Can you explain why?
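The power calculation on slide 34 and the factors on slide 35 can be checked with a short function. This is a sketch under the slides' assumptions (known σ, normal distribution of sample means, true mean above the null mean); the function name and its `tails` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_z_test(effect, sigma, n, alpha=0.05, tails=1):
    """Power of a z test when the true mean exceeds the null mean by `effect`."""
    std_normal = NormalDist()
    shift = effect / (sigma / sqrt(n))  # z of the treatment mean, e.g. 3 / 3 = 1
    z_critical = std_normal.inv_cdf(1 - alpha / tails)
    # Area above the critical z, measured in the treatment distribution.
    power = 1 - std_normal.cdf(z_critical - shift)
    if tails == 2:
        # Add the small chance of landing in the lower critical region.
        power += std_normal.cdf(-z_critical - shift)
    return power

print(round(power_z_test(3, 15, 25), 2))           # prints: 0.26
print(round(power_z_test(3, 15, 25, tails=2), 2))  # prints: 0.17
print(round(power_z_test(3, 15, 100), 2))          # larger n, more power
```

The one-tailed result matches the slide's .26. The two-tailed test has less power for the same α because its upper cutoff sits further out (1.96 instead of 1.65).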