Chapter 23. Two Categorical Variables: The Chi-Square Test




Two-Way Tables

Note. We quickly review two-way tables with an example.

Example. Exercise 23.2a page 550.

Expected Counts in Two-Way Tables

Definition. The expected count in any cell of a two-way table when $H_0$ is true is

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}.$$

Example. Exercise 23.6 page 554.

The Chi-Square Test

Definition. The chi-square statistic is a measure of how far the observed counts in a two-way table are from the expected counts. The formula for the statistic is

$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}.$$

The sum is over all cells in the table.
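The two definitions above translate directly into a few lines of code. The following is a minimal sketch in plain Python; the function names and the 2 × 2 table of counts are hypothetical and chosen only for illustration, not taken from the text.

```python
def expected_counts(table):
    """Expected counts under H0: (row total * column total) / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells of the table."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of observed counts (not from the text).
observed = [[20, 30],
            [30, 20]]

print(expected_counts(observed))       # every expected count is 50 * 50 / 100 = 25.0
print(chi_square_statistic(observed))  # four cells, each contributing 5**2 / 25 = 1
```

Every row and column total in this made-up table is 50, so each expected count is 25 and each cell contributes 1 to the statistic.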

Cell Counts Required for the Chi-Square Test

Note. You can safely use the chi-square test with critical values from the chi-square distribution when no more than 20% of the expected counts are less than 5 and all individual expected counts are 1 or greater. In particular, all four expected counts in a 2 × 2 table should be 5 or greater.

Uses of the Chi-Square Test

Note. Use the chi-square test to test the null hypothesis

$H_0$: there is no relationship between two categorical variables

when you have a two-way table from one of these situations:

- Independent SRSs from each of two or more populations, with each individual classified according to one categorical variable. (The other variable says which sample the individual comes from.)
- A single SRS, with each individual classified according to both of two categorical variables.
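The cell-count guideline stated above can be checked mechanically once the expected counts are known. This is a small sketch; the function name `cell_counts_ok` and both tables of expected counts are hypothetical.

```python
def cell_counts_ok(expected):
    """Guideline: at most 20% of expected counts below 5, and none below 1.

    In a 2 x 2 table a single small cell is already 25% of the cells,
    so the rule reduces to requiring all four expected counts to be >= 5.
    """
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5 <= 0.20 and min(cells) >= 1


# Hypothetical expected counts (not from the text).
print(cell_counts_ok([[25.0, 25.0], [25.0, 25.0]]))  # True: no small cells
print(cell_counts_ok([[12.0, 8.0], [6.0, 4.0]]))     # False: 1 of 4 cells is below 5
```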

The Chi-Square Distributions

Note. The chi-square distributions are a family of distributions that take only positive values and are skewed to the right. A specific chi-square distribution is specified by giving its degrees of freedom. The chi-square test for a two-way table with $r$ rows and $c$ columns uses critical values from the chi-square distribution with $(r - 1)(c - 1)$ degrees of freedom. The P-value is the area to the right of $\chi^2$ under the density curve of this chi-square distribution. Table E gives the relationships between the degrees of freedom, $\chi^2$, and P-values.

Example. Exercise 23.40 page 577.

Example S.23.1. $\chi^2$-Stooges. We now consider all 190 Three Stooges films and two categories. One category is the role of third stooge (Curly/Shemp/Joe) and the other is the number of slaps in the film (which we break into the intervals $[0, 10]$, $[11, 20]$, $[21, 30]$, $[31, 40]$, $[41, \infty)$). Notice that both of these are in fact categorical variables, even though the number of slaps in the film could be dealt with as a quantitative variable. The data can be put in a two-way table as follows. (This is similar to Example S.6.1.)

                      Curly   Shemp     Joe   TOTAL
0 to 10 slaps            49      34      10      93
11 to 20 slaps           36      21       5      62
21 to 30 slaps            7      14       1      22
31 to 40 slaps            3       2       0       5
more than 40 slaps        2       6       0       8
TOTAL                    97      77      16     190

Calculate the $\chi^2$ statistic and perform a $\chi^2$ test on $H_0$: there is no relationship between the two categorical variables.

Solution. First observe that the number of degrees of freedom is df $= (r - 1)(c - 1) = (5 - 1)(3 - 1) = 8$. Using the first formula of this chapter, we find the following expected counts:

                      Curly   Shemp     Joe   TOTAL
0 to 10 slaps         47.48   37.69    7.83      93
11 to 20 slaps        31.65   25.13    5.22      62
21 to 30 slaps        11.23    8.92    1.85      22
31 to 40 slaps         2.55    2.03    0.42       5
more than 40 slaps     4.08    3.24    0.67       8
TOTAL                    97      77      16     190

We now sum over the 15 table entries to calculate the $\chi^2$ statistic:

$$
\begin{aligned}
\chi^2 ={}& \frac{(49 - 47.48)^2}{47.48} + \frac{(34 - 37.69)^2}{37.69} + \frac{(10 - 7.83)^2}{7.83} \\
&+ \frac{(36 - 31.65)^2}{31.65} + \frac{(21 - 25.13)^2}{25.13} + \frac{(5 - 5.22)^2}{5.22} \\
&+ \frac{(7 - 11.23)^2}{11.23} + \frac{(14 - 8.92)^2}{8.92} + \frac{(1 - 1.85)^2}{1.85} \\
&+ \frac{(3 - 2.55)^2}{2.55} + \frac{(2 - 2.03)^2}{2.03} + \frac{(0 - 0.42)^2}{0.42} \\
&+ \frac{(2 - 4.08)^2}{4.08} + \frac{(6 - 3.24)^2}{3.24} + \frac{(0 - 0.67)^2}{0.67} \\
={}& 11.754.
\end{aligned}
$$

We now look for $\chi^2 = 11.754$ in Table E in the row containing df $= 8$. We see that 11.754 lies in Table E between $p = 0.20$ and $p = 0.15$. Software (Minitab, say) gives $p = 0.1625$. The P-value is not small enough for us to reject the null hypothesis that there is no relationship between the two categorical variables.

The Chi-Square Test for Goodness of Fit

Note. A categorical variable has $k$ possible outcomes with probabilities $p_1, p_2, p_3, \ldots, p_k$. That is, $p_i$ is the probability of the $i$th outcome. We have $n$ independent observations from this categorical variable. To test the null hypothesis that the probabilities have specified values

$$H_0 : p_1 = p_{10},\ p_2 = p_{20},\ \ldots,\ p_k = p_{k0}$$

use the chi-square statistic

$$\chi^2 = \sum \frac{(\text{count of outcome } i - n p_{i0})^2}{n p_{i0}}.$$

The P-value is the area to the right of $\chi^2$ under the density curve of the chi-square distribution with $k - 1$ degrees of freedom.

Example. Exercise 23.16 page 568.

rbg-4-4-2009
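As an illustration of the goodness-of-fit statistic, here is a minimal sketch in plain Python. The function name and the data (60 hypothetical rolls of a die, tested against the null hypothesis that all six faces are equally likely) are assumptions made for this example and do not come from the text.

```python
def goodness_of_fit_statistic(counts, null_probs):
    """Sum over outcomes of (count_i - n * p_i0)^2 / (n * p_i0)."""
    if len(counts) != len(null_probs):
        raise ValueError("need one null probability per outcome")
    if abs(sum(null_probs) - 1.0) > 1e-9:
        raise ValueError("null probabilities must sum to 1")
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, null_probs))


# Hypothetical data: 60 rolls of a die; H0 says each face has probability 1/6.
counts = [8, 12, 9, 11, 6, 14]
null_probs = [1 / 6] * 6

chi2 = goodness_of_fit_statistic(counts, null_probs)
df = len(counts) - 1  # k - 1 degrees of freedom

print(f"chi-square = {chi2:.2f}, df = {df}")
```

Each expected count here is 60 × 1/6 = 10, so the statistic is (4 + 4 + 1 + 1 + 16 + 16)/10 = 4.2 with 5 degrees of freedom. The P-value would then be read from Table E or computed with software.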