AMS 5 TESTS FOR TWO SAMPLES


Test for difference We will consider the problem of comparing the means of two populations. The main tool for hypothesis testing in this case is the z-test for the difference between two populations, which is based on the standardized difference between the averages of the two groups. We will consider the following topics: how to calculate the standard error for the difference, how to compare the two averages, the case of binary populations, and two-tailed versus one-tailed tests.

Test for difference There are many situations where the population is split into two groups and we want to test whether there are significant differences between them. As we have seen so far, the key issue in hypothesis testing is that we need a standardized measure of the distance between what we observe and what we postulate in the null hypothesis. To obtain this we need an estimate of the standard error. Consider two boxes, A and B, and suppose 400 draws are made at random with replacement from box A and 100 from box B.

Test for difference We need to estimate and test the difference between the two samples. Given the numbers in the boxes we have: BOX A: average of the 400 draws = 110, with SE = 60/√400 = 3, so 110 ± 3. BOX B: average of the 100 draws = 90, with SE = 40/√100 = 4, so 90 ± 4. For the difference we have an expected value of 110 - 90 = 20. What is the corresponding standard error?

Test for difference The SE for the difference of two independent quantities is the square root of the sum of their squared SEs. In the previous example we get √(3² + 4²) = 5, so we expect the difference to be 20 ± 5.
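As a quick numerical check of this arithmetic, here is a minimal Python sketch; the helper name se_of_difference is just for illustration and is not part of the course notes. It computes the SE of each average as SD/√(number of draws) and then applies the square-root formula.

```python
import math

def se_of_difference(sd_a, n_a, sd_b, n_b):
    """Combine the SEs of two independent sample averages with the square-root formula."""
    se_a = sd_a / math.sqrt(n_a)   # SE of the average of the draws from box A
    se_b = sd_b / math.sqrt(n_b)   # SE of the average of the draws from box B
    return math.sqrt(se_a**2 + se_b**2)

# Box A: SD 60, 400 draws (SE = 3); box B: SD 40, 100 draws (SE = 4)
print(se_of_difference(60, 400, 40, 100))  # prints 5.0
```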

Independent? For the square root formula to apply we need independence. Example 1: 100 draws are made with replacement from each of two boxes, C and D. Find the expected value and SE for the difference between the number of 1's drawn from C and the number of 4's drawn from D. We expect to have 50 ± 5 1's and 50 ± 5 4's. The expected value for the difference is therefore 50 - 50 = 0. The draws are made independently, and therefore the SE for the difference is √(5² + 5²) ≈ 7.

Independent? Example 2: 100 draws are made with replacement from a single box. The expected number of 1's is 20 with an SE of 4. The expected number of 5's is also 20 with an SE of 4. The expected value of the difference between the number of 1's and the number of 5's is 0, but the SE is NOT √(4² + 4²), since there is no independence (if one count is large the other is likely to be small).
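To see the effect of this dependence, here is a small simulation sketch. It assumes, purely for illustration, a box containing the tickets 1 through 5 (so 1's and 5's each have a 20% chance, matching the expected counts of 20 ± 4 above) and compares the simulated SE of the difference with the value the square-root formula would give.

```python
import numpy as np

rng = np.random.default_rng(0)
box = np.array([1, 2, 3, 4, 5])   # hypothetical box; chosen so that 1's and 5's each have chance 0.2

diffs = []
for _ in range(100_000):
    draws = rng.choice(box, size=100, replace=True)
    diffs.append(np.sum(draws == 1) - np.sum(draws == 5))

print("simulated SE of (number of 1's - number of 5's):", round(float(np.std(diffs)), 2))   # about 6.3
print("square-root formula, wrongly assuming independence:", round(float(np.hypot(4, 4)), 2))  # about 5.66
```

Because the two counts move in opposite directions, the true SE of their difference comes out larger than the square-root formula suggests.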

Test statistic A nationwide sample of 1000 17-year-old students was given a math test in 1978. This was repeated in 1992. The average score went from 300.4 the first time to 306.7 the second time, so the difference is 6.3 points. Is this significant? Using hypothesis testing terminology we set the null hypothesis as H0: the difference is 0, and the alternative hypothesis as H1: the average of 1992 is bigger than that of 1978. The averages were obtained from samples of 1,000 results from each math test. The corresponding SEs are 1.1 for 1978 and 1.0 for 1992. Since we can assume that the samples are independent, we can compute the SE for the difference as √(1.0² + 1.1²) ≈ 1.5.

Test statistic Then we can obtain the z-test as z = (observed difference - expected difference under H0) / (SE for the difference). This is equal to z = (6.3 - 0)/1.5 ≈ 4.2. Since the area under the normal curve corresponding to values above 4.2 is negligible, we reject the null hypothesis and conclude that the difference is significantly large. Thus, students performed better in 1992 than in 1978.
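The same calculation can be written out in a few lines of Python. This is only a sketch of the arithmetic above, using scipy's normal survival function for the upper-tail area.

```python
import math
from scipy.stats import norm

se_1978, se_1992 = 1.1, 1.0
se_diff = math.sqrt(se_1978**2 + se_1992**2)   # about 1.5

z = (306.7 - 300.4 - 0) / se_diff              # observed minus expected difference, over the SE
p_one_tailed = norm.sf(z)                      # area under the normal curve above z

print(round(z, 1), p_one_tailed)               # z is about 4.2, P-value essentially 0
```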

Example 1. The students of the class were divided into two groups: the pink group and the lavender group. Each person counted the amount of cents in their pockets. The results are an average of 94.46 cents for the pink group with an SE of 28.99 cents, and an average of 79.63 cents for the lavender group with an SE of 38.81 cents. Is there enough evidence to support the claim that the students in the pink group had, on average, more change than the students in the lavender group? The standard error of the difference is √(28.99² + 38.81²) = 48.44. Thus the test statistic is z = (94.46 - 79.63)/48.44 ≈ 0.31. The probability that a standard normal will have values above 0.31 is about 38%. Since this is a large P-value, there is no reason to reject the hypothesis that there is no difference between the groups.
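Because the same three steps (SE of the difference, z statistic, tail area) recur in every example, a small reusable helper makes them explicit. This is only an illustrative sketch; the function name two_sample_z is not from the original notes.

```python
import math
from scipy.stats import norm

def two_sample_z(avg_1, se_1, avg_2, se_2):
    """Return the z statistic and one-tailed P-value for the difference of two independent averages."""
    se_diff = math.sqrt(se_1**2 + se_2**2)
    z = (avg_1 - avg_2) / se_diff
    return z, norm.sf(z)

# Pink versus lavender pocket change
z, p = two_sample_z(94.46, 28.99, 79.63, 38.81)
print(round(z, 2), round(p, 2))   # z is about 0.31, P-value about 0.38
```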

Example 2. A safety engineer compares the braking distances of two sets of tires by performing 50 braking tests for each set. The results: set 1 has an average braking distance of 42 feet with an SD of 4.7 feet; set 2 has an average braking distance of 55 feet with an SD of 5.3 feet. Is there enough evidence to support the claim that the second set of tires has a larger mean braking distance than the first? The SEs of the two averages are 4.7/√50 ≈ 0.66 feet and 5.3/√50 ≈ 0.75 feet, so the standard error of the difference is √(0.66² + 0.75²) ≈ 1.0 feet. Thus the test statistic is z = (55 - 42)/1.0 ≈ 13. The probability that a standard normal will have values above 13 is essentially zero. Since this is an extremely small P-value, we reject the null hypothesis that the two sets have the same mean braking distance.
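Here is the same calculation as a short Python sketch, reusing the SD/√n step for each average; the numbers are the ones quoted above.

```python
import math
from scipy.stats import norm

n = 50
se_set1 = 4.7 / math.sqrt(n)        # SE of the average braking distance for set 1
se_set2 = 5.3 / math.sqrt(n)        # SE of the average braking distance for set 2
se_diff = math.sqrt(se_set1**2 + se_set2**2)

z = (55 - 42) / se_diff
print(round(se_diff, 2), round(z, 1), norm.sf(z))   # SE about 1.0, z about 13, P-value essentially 0
```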

Binary boxes In a sample of 200 male students, 107 use a personal computer on a regular basis. In a sample of 300 female students, 132 use personal computers on a regular basis. Is the difference between the two groups real or due to chance variation? We can think of two binary boxes: one for the males and one for the females. The boxes have 1's for those who use a PC and 0's for those who don't. The percentage of 1's is 53.5% in the males' box and 44.0% in the females' box. According to the null hypothesis the percentage of 1's in both boxes is the same. The standard error for the percentage of PC users is approximated as √(0.535 × (1 - 0.535)/200) ≈ 3.5% for the group of male students and √(0.44 × (1 - 0.44)/300) ≈ 2.9% for the group of female students.

Binary boxes These SEs are obtained by approximating the SD of each box using the SD of the corresponding sample. The SE for the difference is √(3.5² + 2.9²) ≈ 4.5%. Thus the test statistic is z = (53.5% - 44.0% - 0)/4.5% ≈ 2.1, and this value corresponds to a P-value of approximately 2%. So we reject the null hypothesis and conclude that the difference is real.
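The two-proportion version of the test follows the same pattern. Below is a minimal sketch of the computation above; the function name two_proportion_z is just for illustration.

```python
import math
from scipy.stats import norm

def two_proportion_z(count_1, n_1, count_2, n_2):
    """z statistic and one-tailed P-value for the difference of two independent sample proportions."""
    p1, p2 = count_1 / n_1, count_2 / n_2
    se1 = math.sqrt(p1 * (1 - p1) / n_1)    # SE for the first percentage
    se2 = math.sqrt(p2 * (1 - p2) / n_2)    # SE for the second percentage
    z = (p1 - p2) / math.sqrt(se1**2 + se2**2)
    return z, norm.sf(z)

# Male versus female PC use
z, p = two_proportion_z(107, 200, 132, 300)
print(round(z, 1), round(p, 2))   # z is about 2.1, P-value about 0.02
```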

Experiments Suppose a clinical trial to test the effectiveness of vitamin C is conducted. 200 subjects participate in the trial. Half of them are randomized to get 200 mg of vitamin C and the other half gets 200 mg of a placebo. The results are that, over the period of the trial, the treatment group averages 2.3 colds with an SD of 3.1, and the control group averages 2.6 colds with an SD of 2.9. Is the difference significant? The difference is -0.3 and the SEs are obtained as 3.1/√100 = 0.31 and 2.9/√100 = 0.29. The standard error of the difference is √(0.31² + 0.29²) ≈ 0.42. Thus the z-test is z = (-0.3 - 0.0)/0.42 ≈ -0.7.

Experiments This z-value corresponds to a P-value of around 24%. Thus, the difference is not significant. Is this answer OK? If we think of a box model we realize that this solution involves two mistakes: (i) the draws are made without replacement, but the SEs are computed as if they were made with replacement; and (ii) the two averages are not independent, but the SEs are combined as if they were. The consequences of these mistakes are negligible if the number of draws is small relative to the population, but this is seldom the case for clinical trials. The first mistake inflates the SE, while the second mistake deflates it. The two mistakes tend to compensate, and usually the result is a small overestimation of the SE.
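A small simulation sketch can illustrate why the two mistakes roughly cancel. It assumes, purely for illustration, a hypothetical population of 200 responses that would be the same under treatment and placebo (the strict null of no treatment effect), randomizes the subjects into two groups of 100 many times, and compares the resulting spread of the difference in averages with the with-replacement, independence-based formula used above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical number of colds for each of the 200 subjects (assumed identical
# under either assignment, i.e. the strict null hypothesis).
responses = rng.poisson(2.5, size=200)

# SE from the formula in the notes: SD/sqrt(100) for each group, combined with the
# square-root formula as if the two groups were independent with-replacement samples.
formula_se = np.sqrt(2) * responses.std() / np.sqrt(100)

# SE obtained by actually re-randomizing the subjects into two groups of 100.
diffs = []
for _ in range(20_000):
    shuffled = rng.permutation(responses)
    diffs.append(shuffled[:100].mean() - shuffled[100:].mean())

print(round(float(formula_se), 3), round(float(np.std(diffs)), 3))   # the two values come out very close
```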

Two versus one tailed tests You want to see if a coin is fair. The coin is tossed 100 times and lands heads on 61 of the tosses. The null hypothesis is that the coin is fair, so under the null we have an expected value of 50 heads. The SE is √100 × 0.5 = 5, thus the test statistic is z = (61 - 50)/5 = 2.2. Consider testing against the alternative hypothesis that the coin is biased towards heads, that is, that the probability of heads is bigger than 1/2. In that case big values of z favor the alternative hypothesis. The P-value under this alternative is the area under the normal curve corresponding to values greater than 2.2, which is equal to 1.4%.

Two versus one tailed tests Suppose that we consider a different alternative hypothesis, namely that the probability of heads is different from 1/2, in either direction. The values of z that favor this alternative hypothesis are large values, either negative or positive. The P-value is given by the area under the normal curve corresponding to values less than -2.2 or greater than 2.2, which is 2.8%. The first test is a one-tailed test; the second is a two-tailed test. Two-tailed tests have a P-value that is double that of one-tailed tests. The choice between a one-tailed and a two-tailed test should be made before looking at the data; otherwise you could be manipulating the results.
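Both P-values can be read off the normal curve directly; this short sketch just reproduces the 1.4% and 2.8% figures.

```python
import math
from scipy.stats import norm

heads, tosses = 61, 100
expected = tosses * 0.5
se = math.sqrt(tosses) * 0.5              # SE for the number of heads of a fair coin

z = (heads - expected) / se               # 2.2
p_one_tailed = norm.sf(z)                 # area above 2.2: about 1.4%
p_two_tailed = 2 * norm.sf(abs(z))        # both tails: about 2.8%

print(z, round(p_one_tailed, 3), round(p_two_tailed, 3))
```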

Example One hundred draws are made at random with replacement from box A and 250 are made from box B. The boxes contain numbered tickets; the numbers can be positive or negative. 1. 50 of the draws from box A are positive and 131 of the draws from box B are positive. Is the difference real or due to chance? The proportion of positive draws is 50% for box A and 52.4% for box B. The SEs for the two proportions are given by √(0.5 × 0.5/100) = 0.05 and √(0.524 × 0.476/250) ≈ 0.032. The standard error of the difference is √(0.05² + 0.032²) ≈ 0.059, that is, 5.9%. Thus the z-test is z = (52.4 - 50 - 0)/5.9 ≈ 0.4.

Example The probability that a standard normal will be above 0.4 is about 34%. So for either a one-tailed or a two-tailed test we conclude that the difference is likely due to chance. 2. The draws from box A average 1.4 with an SD of 15.3, and the draws from box B average 6.3 with an SD of 16.1. Is the difference between the averages statistically significant? We can obtain the SEs as 15.3/√100 = 1.53 and 16.1/√250 ≈ 1.02. The standard error of the difference is √(1.53² + 1.02²) ≈ 1.83. Thus the z-test is z = (6.3 - 1.4 - 0)/1.83 ≈ 2.68.

Example The probability that a standard normal will be above 2.68 is about 0.004, so even for a two-tailed test the P-value is very small. The conclusion is that the difference between the two boxes seems real.
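As a final check, this short Python sketch reproduces both parts of the example with the same SE and z arithmetic used throughout.

```python
import math
from scipy.stats import norm

# Part 1: proportions of positive draws
p_a, n_a = 50 / 100, 100
p_b, n_b = 131 / 250, 250
se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z1 = (p_b - p_a) / se_diff
print(round(z1, 1), round(norm.sf(z1), 2))        # z about 0.4, one-tailed P-value about 0.34

# Part 2: averages of the draws
se_diff_avg = math.sqrt((15.3 / math.sqrt(100))**2 + (16.1 / math.sqrt(250))**2)
z2 = (6.3 - 1.4) / se_diff_avg
print(round(z2, 2), round(2 * norm.sf(z2), 3))    # z about 2.7, two-tailed P-value below 1%
```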