Chi-square (χ 2 ) Tests

Similar documents
DIRECTIONS. Exercises (SE) file posted on the Stats website, not the textbook itself. See How To Succeed With Stats Homework on Notebook page 7!

Chapter 23. Two Categorical Variables: The Chi-Square Test

Chi-square test Fisher s Exact test

Is it statistically significant? The chi-square test

The Chi-Square Test. STAT E-50 Introduction to Statistics

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Crosstabulation & Chi Square

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

Name: (b) Find the minimum sample size you should use in order for your estimate to be within 0.03 of p when the confidence level is 95%.

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Section 12 Part 2. Chi-square test

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

UNDERSTANDING THE TWO-WAY ANOVA

Odds ratio, Odds ratio test for independence, chi-squared statistic.

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

3.4 Statistical inference for 2 populations based on two samples

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Association Between Variables

Chapter 19 The Chi-Square Test

Tests of Hypotheses Using Statistics

Elementary Statistics Sample Exam #3

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%

Mind on Statistics. Chapter 15

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Simulating Chi-Square Test Using Excel

Chi Square Distribution

Contingency Tables and the Chi Square Statistic. Interpreting Computer Printouts and Constructing Tables

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

1 Basic ANOVA concepts

Chi Square Tests. Chapter Introduction

Math 251, Review Questions for Test 3 Rough Answers

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

8 6 X 2 Test for a Variance or Standard Deviation

Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Tests for Two Proportions

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Research Methods & Experimental Design

First-year Statistics for Psychology Students Through Worked Examples

SAS Software to Fit the Generalized Linear Model

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Testing Research and Statistical Hypotheses

Topic 8. Chi Square Tests

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Tests for Two Survival Curves Using Cox s Proportional Hazards Model

Mind on Statistics. Chapter 4

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Chapter 3 RANDOM VARIATE GENERATION

Statistical tests for SPSS

Math 108 Exam 3 Solutions Spring 00

Unit 1. Today I am going to discuss about Transportation problem. First question that comes in our mind is what is a transportation problem?

MBA 611 STATISTICS AND QUANTITATIVE METHODS

Chapter 7 TEST OF HYPOTHESIS

Elementary Statistics

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

Logit Models for Binary Data

Goodness of Fit. Proportional Model. Probability Models & Frequency Data

Difference of Means and ANOVA Problems

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Row vs. Column Percents. tab PRAYER DEGREE, row col

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Test Positive True Positive False Positive. Test Negative False Negative True Negative. Figure 5-1: 2 x 2 Contingency Table

Study Guide for the Final Exam

Using Excel in Research. Hui Bian Office for Faculty Excellence

Additional sources Compilation of sources:

Using Excel for inferential statistics

Ordinal Regression. Chapter

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic

Lecture 8. Confidence intervals and the central limit theorem

Two Correlated Proportions (McNemar Test)

Poisson Models for Count Data

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Lesson 3: Calculating Conditional Probabilities and Evaluating Independence Using Two-Way Tables

Friedman's Two-way Analysis of Variance by Ranks -- Analysis of k-within-group Data with a Quantitative Response Variable

Confidence intervals

IB Practice Chi Squared Test of Independence

Chapter 13. Chi-Square. Crosstabs and Nonparametric Tests. Specifically, we demonstrate procedures for running two separate

Binary Diagnostic Tests Two Independent Samples

Types of Data, Descriptive Statistics, and Statistical Tests for Nominal Data. Patrick F. Smith, Pharm.D. University at Buffalo Buffalo, New York

Solutions to Homework 10 Statistics 302 Professor Larget

Hypothesis Testing: Two Means, Paired Data, Two Proportions

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Chi-Squared Tests 97. Although we made up several potential test statistics in class, the standard test statistic for this situation is

4. Continuous Random Variables, the Pareto and Normal Distributions

Weight of Evidence Module

Final Exam Practice Problem Answers

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

Testing Hypotheses About Proportions

Types of Studies. Systematic Reviews and Meta-Analyses

TABLE OF CONTENTS. About Chi Squares What is a CHI SQUARE? Chi Squares Hypothesis Testing with Chi Squares... 2

Transcription:

Math 442 - Mathematical Statistics II May 5, 2008 Common Uses of the χ 2 test. 1. Testing Goodness-of-fit. Chi-square (χ 2 ) Tests 2. Testing Equality of Several Proportions. 3. Homogeneity Test. 4. Testing Independence. Testing Goodness-of-fit of Data from Multinomial Experiment. Properties of the Multinomial Experiment. 1. The experiment consists of n identical trials. 2. There are k possible outcomes to each trial. These possible outcomes are sometimes called classes or categories. 3. The probabilities of the k outcomes, denoted by p 1, p 2,..., p k, remain the same from trial to trial, and p 1 + p 2 + + p k 1. 4. The trials are independent. 5. The random variables of interest are the cell counts, n 1, n 2,..., n k, of the number of observations that fall in each of the k categories. (note: n 1 + n 2 + + n k n). Cat 1 Cat 2 Cat k Observed n 1 n 2 n k Expected n p 1 n p 2 n p k Chi-Square Statistic. The chi-square statistic is a measure of how much the observed cell counts diverge from the expected cell counts. The formula for the statistic is X 2 (observed count expected count) 2 expected count (O E) 2 E (n i n p i ) 2 (1) (2) n p i χ 2 k 1. (3) This X 2 statistic follows approximately the χ 2 distribution with k 1 degrees of freedom. Note:The chi-square approximation is adequate for practical use when the average expected cell is 5 or greater and all individual expected counts are at least 1, except in a 2 2 table where all four expected counts should be at least 5. Null Hypothesis and Alternative Hypothesis. 1. Null Hypothesis: H 0 : p 1 p 01, p 2 p 02,..., p k p 0k 2. Alternative Hypothesis: H 1 : The null hypothesis is not true.

Example: 1. Nature (Sept. 1993) reported on a study of animal and plant species hotspots in Great Britain. A hotspot is defined as a 10-km 2 area that is species-rich, that is, is heavily populated by the species of interest. Analogously, a coldspot is a 10-km 2 area that is species-poor. The table below gives the number of butterfly hotspots and the number of butterfly coldspots in a sample of 2,588 10-km 2 areas. In theory, 5% should be butterfly coldspots, while the remaining areas (90%) are neutral. Test the theory using α 0.01. hotspots coldspots Neutral Areas Total Observed 123 147 2318 2588 Expected 2. Car Crashes and Age Brackets. Among drivers who have had a car crash in the last year, 88 are randomly selected and categorized by age, with the results listed in the accompanying table. If all ages have the same crash rate, we would expect (because of the age distribution of licensed drivers) the given categories to have 16%, 44%, 27%, and 13% of the subjects, respectively. At the 0.05 level of significance, test the claim that the distribution of crashes conforms to the distribution of ages. Does any age group appear to have a disproportionate number of crashes? Age Under 25 25-44 45-64 Over 64 No. of Crashes 36 21 12 19 3. Checking Normality. The table below shows the number of values in a sample of size n 200 observations falling in each category. At the 0.05 level of significance, test whether the sample can be reasonably assumed to have come from the Standard Normal distribution. z values z < 1 1 z < 0.5 0.5 z < 0 No. of Observations 30 38 43 z values 0 z < 0.5 0.5 z < 1 z > 1 No. of Observations 38 20 31

Testing Equality of Several Proportions. 1. Null Hypothesis: H 0 : p 1 p 2 p k 2. Alternative Hypothesis: H 1 : Not all are equal. Examples. 1. No Smoking. The accompanying table summarizes successes and failures when subjects used different methods in trying to stop smoking. The determination of smoking or not smoking was made five months after the treatment was begun, and the data are based on results from the Centers for Disease Control and Prevention. Use a 0.05 level of significance to test the claim that success is independent of the method used. If someone wants to stop smoking, does the choice of the method make a difference? Nicotine Gum Nicotine Patch Nicotine Inhaler Smoking 191 263 95 No smoking 59 57 27 2. Survey Refusals and Age Bracket. A study of people who refused to answer survey questions provided the randomly selected sample data shown in the table. At the 0.01 significance level, test the claim that the cooperation of the subject (response or refusal) is independent of the age category. Does any particular age group appear to be particularly uncooperative? Age 18-21 22-29 30-39 40-49 50-59 60 and over Responded 73 255 245 136 138 202 Refused 11 20 33 16 27 49

Testing for Homogeneity (In a Two-Way Contingency Tables). 1. Null Hypothesis: H 0 : p 1j p 2j p kj, j 1, 2,..., J 2. Alternative Hypothesis: H 1 : H 0 is not true. 3. Test Statistic: The chi-square statistic X 2 is a measure of how much the observed cell counts diverge from the expected cell counts. The formula for the statistic is very similar to the one we have before. Example. X 2 all cells J j1 j1 (observed count expected count) 2 (O ij E ij ) 2 E ij expected count (4) J (n ij r i ˆp j ) 2 (5) r i ˆp j j1 J (n ij r i c j /n) 2 χ 2 r i c j /n (k 1)(J 1). (6) where, r i and c j are the total of the ith row and jth column, respectively. This X 2 statistic follows approximately the χ 2 distribution with (k 1)(J 1) degrees of freedom. Note:The chi-square approximation is adequate for practical use when the average expected cell is 5 or greater and all individual expected counts are at least 1, except in a 2 2 table where all four expected counts should be at least 5. 1. A company packages a particular product in cans of three different sizes, each one using a different production line. Most cans conform to specifications, but a quality control engineer has identified the following reasons for nonconformance: Blemish on can, crack in can, improper pull tab location, pull tab missing, and other. A sample of nonconforming units is selected from each of the three lines, and each unit is categorized according to reason for nonconformity, resulting in the following contingency table data: Production Blemish Crack Location Missing Other Total Line 1 34 65 17 21 13 150 Line 2 23 52 25 19 6 125 Line 3 32 28 16 14 10 100 Total 89 145 58 54 29 375

Testing for Independence (In a Two-Way Contingency Tables). 1. Null Hypothesis: Factor A is independent of factor B. 2. Alternative Hypothesis: Factors A and B are not independent. 3. Test Statistic: The same chi-square statistic X 2. Example. 1. Background Music. Market researchers know that background music can influence the mood and purchasing behavior of customers. One study in a supermarket in Northern Ireland compared three treatments: no music, French accordion music, and Italian string music. Under each condition, the researchers recorded the numbers of bottles of French, Italian, and other wine purchased. Here is the two-way table that summarizes the data: Wine French music Italian music No music French 39 30 30 Italian 1 19 11 Other 35 35 43 Homework problems: Section 14.1: pp. 686-688; # 2, 4, 8. Section 14.2: pp. 693-696; # 10, 14, 15. Section 14.3: pp. 698-701; # 18, 19, 20.