Hypothesis Testing for Categorical Data

1. A die is rolled 36 times, with the following results:

VALUE   1   2   3   4   5   6
TIMES   8   1   7   9   5   6

Is the die fair? Perform a Hypothesis Test with $\alpha = .05$.

Step One:
$H_0$: The die is fair. ($p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = 1/6$)
$H_A$: The die is not fair.
$\alpha = .05$

Step Two: This is a $\chi^2$ test with $\alpha = .05$. Since we have six categories, we have 5 degrees of freedom. The table value is 11.07. We therefore reject the Null Hypothesis if our test statistic is greater than 11.07.

Step Three: We compute our test statistic by first creating the expected table. If the Null Hypothesis is correct then we should receive the same number of each value, in this case six. Our table is therefore

VALUE   1   2   3   4   5   6
TIMES   6   6   6   6   6   6

Recall that the observed table was

VALUE   1   2   3   4   5   6
TIMES   8   1   7   9   5   6

This makes the test statistic

\[
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}
= \frac{(8-6)^2}{6} + \frac{(1-6)^2}{6} + \frac{(7-6)^2}{6} + \frac{(9-6)^2}{6} + \frac{(5-6)^2}{6} + \frac{(6-6)^2}{6}
= 6\tfrac{2}{3}.
\]

Since $\chi^2 = 6\tfrac{2}{3}$ is not $> 11.07$, we fail to reject the Null Hypothesis and conclude that the die may be fair.
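The arithmetic in Step Three can be checked with a few lines of Python. This is only a sketch of the calculation described above; the variable names are illustrative, and the critical value is the table value quoted in Step Two rather than something the code computes.

```python
# Observed counts for faces 1..6, taken from the problem statement.
observed = [8, 1, 7, 9, 5, 6]

# Under H0 every face has probability 1/6, so each expected count is n/6.
n = sum(observed)
expected = [n / 6] * 6

# Chi-square statistic: sum over categories of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 11.07  # table value for 5 degrees of freedom, alpha = .05
reject_h0 = chi_sq > critical_value

print(round(chi_sq, 4), reject_h0)  # 6.6667 False
```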

2. A simple random sample of size 1600 was taken, and two questions were asked: "What is your (political) party affiliation?" and "Did you vote on November 2?" The results were:

        DEM   REP   IND
YES     390   400   170
NO      250   240   150

Is party affiliation independent of voter turnout? Perform a Hypothesis Test with $\alpha = .025$.

Step One:
$H_0$: Party affiliation is independent of voter turnout.
$H_A$: Party affiliation is not independent of voter turnout.
$\alpha = .025$

Step Two: This is a $\chi^2$ test with $\alpha = .025$. We have two variables, one with three categories and the other with two categories, making six categories total. We have $(3-1)(2-1) = 2$ degrees of freedom. The table value is 7.378. We therefore reject the Null Hypothesis if our test statistic is greater than 7.378.
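As a side check on the table value: with exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form $P(\chi^2 > x) = e^{-x/2}$, so the critical value can be computed directly. The sketch below relies on that special case only; for other degrees of freedom a table (or statistical software) is still needed.

```python
import math

alpha = 0.025

# For 2 degrees of freedom, P(chi-square > x) = exp(-x / 2),
# so the critical value solves exp(-x / 2) = alpha.
critical_value = -2 * math.log(alpha)

print(round(critical_value, 3))  # 7.378
```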

Step Three: We compute our test statistic by first creating the expected table. If the Null Hypothesis is correct then we should receive the same proportion of voters and non-voters within each party. Since of the 1600 in the sample, 960 voted and 640 did not vote, we should have 3/5 voters and 2/5 non-voters in each party. The party totals are 640 DEM, 640 REP, and 320 IND, so our table is therefore

        DEM   REP   IND
YES     384   384   192
NO      256   256   128

Recall that the observed table was

        DEM   REP   IND
YES     390   400   170
NO      250   240   150

This makes the test statistic

\[
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}
= \frac{(390-384)^2}{384} + \frac{(400-384)^2}{384} + \frac{(170-192)^2}{192} + \frac{(250-256)^2}{256} + \frac{(240-256)^2}{256} + \frac{(150-128)^2}{128}
= 8.203125.
\]

Since $\chi^2 = 8.203125$ is $> 7.378$, we reject the Null Hypothesis and conclude that party affiliation is not independent of voter turnout.
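The expected table and the statistic can be reproduced from the observed counts alone. The sketch below uses the standard rule that, under independence, each expected count is (row total × column total) / sample size, which here gives the same 3/5 and 2/5 split used above. Variable names are illustrative. Summing the six terms this way gives 8.203125, which exceeds 7.378.

```python
# Observed counts from the problem statement.
observed = [
    [390, 400, 170],  # YES: DEM, REP, IND
    [250, 240, 150],  # NO:  DEM, REP, IND
]

row_totals = [sum(row) for row in observed]        # voters, non-voters
col_totals = [sum(col) for col in zip(*observed)]  # DEM, REP, IND
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all six cells of (O - E)^2 / E.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)
print(df, chi_sq, chi_sq > 7.378)
```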

3. A course with 100 students had three exams. The results were:

        TEST 1   TEST 2   TEST 3
A         15       30       15
B         35       25       30
C         18       20       22
D         17       19       24
F         15        6        9

Was the grade distribution the same on each test? Perform a Hypothesis Test with $\alpha = .05$.

Step One:
$H_0$: The three exams have the same distribution.
$H_A$: The three exams do not have the same distribution.
$\alpha = .05$

Step Two: This is a $\chi^2$ test with $\alpha = .05$. We have two variables, one with three categories and the other with five categories, making fifteen categories total. We have $(3-1)(5-1) = 8$ degrees of freedom. The table value is 15.507. We therefore reject the Null Hypothesis if our test statistic is greater than 15.507.

Step Three: We compute our test statistic by first creating the expected table. If the Null Hypothesis is correct then we should receive the same proportion of As, Bs, Cs, Ds, and Fs on each exam. Since of the 300 exams, 60 were As, 90 were Bs, 60 were Cs, 60 were Ds, and 30 were Fs, we should have 20 As, 30 Bs, 20 Cs, 20 Ds, and 10 Fs on each exam. Our table is therefore

        TEST 1   TEST 2   TEST 3
A         20       20       20
B         30       30       30
C         20       20       20
D         20       20       20
F         10       10       10

Recall that the observed table was

        TEST 1   TEST 2   TEST 3
A         15       30       15
B         35       25       30
C         18       20       22
D         17       19       24
F         15        6        9

This makes the test statistic

\[
\begin{aligned}
\chi^2 = \sum_{i=1}^{15} \frac{(O_i - E_i)^2}{E_i}
&= \frac{(15-20)^2}{20} + \frac{(30-20)^2}{20} + \frac{(15-20)^2}{20} \\
&\quad + \frac{(35-30)^2}{30} + \frac{(25-30)^2}{30} + \frac{(30-30)^2}{30} \\
&\quad + \frac{(18-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(22-20)^2}{20} \\
&\quad + \frac{(17-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(24-20)^2}{20} \\
&\quad + \frac{(15-10)^2}{10} + \frac{(6-10)^2}{10} + \frac{(9-10)^2}{10} \\
&= 15.0\overline{6}.
\end{aligned}
\]

Since $\chi^2 = 15.0\overline{6}$ is not $> 15.507$, we fail to reject the Null Hypothesis and conclude that the three exams might have the same grade distributions.
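A short Python sketch can confirm this total and show how much each grade contributes to it. It assumes that the entries missing from the tables as printed are all 20; this is the only value for Test 2's C count that makes the column total 100 students, and it matches the stated grade totals of 60, 90, 60, 60, and 30. Because every exam has the same number of students, the expected count per exam is simply the grade total divided by 3.

```python
grades = ["A", "B", "C", "D", "F"]

# Observed counts per grade on Tests 1, 2, 3.
observed = [
    [15, 30, 15],  # A
    [35, 25, 30],  # B
    [18, 20, 22],  # C (the 20 is an assumed reconstruction of a missing entry)
    [17, 19, 24],  # D
    [15, 6, 9],    # F
]

# Sanity check on the reconstruction: each test should total 100 students.
col_totals = [sum(col) for col in zip(*observed)]

# All three exams have equal totals, so under H0 the expected count
# for a grade on each exam is that grade's overall total divided by 3.
expected_per_exam = [sum(row) / 3 for row in observed]

# Contribution of each grade (row) to the chi-square statistic.
contributions = {
    g: sum((o - e) ** 2 / e for o in row)
    for g, row, e in zip(grades, observed, expected_per_exam)
}
chi_sq = sum(contributions.values())

print(col_totals)
print(contributions)
print(chi_sq, chi_sq > 15.507)
```

Under this reading of the data, the A row (7.5) and the F row (4.2) account for most of the statistic, while the C row contributes only 0.4.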

4. A standardized exam is given which typically results in 15% As, 20% Bs, 30% Cs, 20% Ds, and 15% Fs. Of the 1000 students who take the exam, the following grades were given:

A     B     C     D     F
128   194   326   193   159

Was the grade distribution typical for this exam? Perform a Hypothesis Test with $\alpha = .1$.

Step One:
$H_0$: The grade distribution is 15% A, 20% B, 30% C, 20% D, 15% F.
$H_A$: The grade distribution is not 15% A, 20% B, 30% C, 20% D, 15% F.
$\alpha = .1$

Step Two: This is a $\chi^2$ test with $\alpha = .1$. Since we have five categories, we have 4 degrees of freedom. The table value is 7.779. We therefore reject the Null Hypothesis if our test statistic is greater than 7.779.

Step Three: We compute our test statistic by first creating the expected table. If the Null Hypothesis is correct then we should receive 15% As, 20% Bs, 30% Cs, 20% Ds, and 15% Fs. With 1000 students, our table is therefore

A     B     C     D     F
150   200   300   200   150

Recall that the observed table was

A     B     C     D     F
128   194   326   193   159

This makes the test statistic

\[
\chi^2 = \sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i}
= \frac{(128-150)^2}{150} + \frac{(194-200)^2}{200} + \frac{(326-300)^2}{300} + \frac{(193-200)^2}{200} + \frac{(159-150)^2}{150}
= 6.445.
\]

Since $\chi^2 = 6.445$ is not $> 7.779$, we fail to reject the Null Hypothesis and conclude that the grade distribution may have been typical for this exam.
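Unlike the die problem, the hypothesized proportions here are not all equal, so each expected count is the sample size times its own proportion. The sketch below assumes the B and D percentages are 20% each; those two figures are missing from the text as printed, and 20% is the value that reproduces the stated statistic of 6.445. Variable names are illustrative.

```python
# Observed grade counts from the problem statement.
observed = {"A": 128, "B": 194, "C": 326, "D": 193, "F": 159}

# Hypothesized proportions under H0.
# The 0.20 entries for B and D are an assumed reconstruction.
typical = {"A": 0.15, "B": 0.20, "C": 0.30, "D": 0.20, "F": 0.15}

n = sum(observed.values())

# Expected count for each grade: sample size times hypothesized proportion.
expected = {g: n * p for g, p in typical.items()}

# Chi-square statistic: sum over grades of (O - E)^2 / E.
chi_sq = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

critical_value = 7.779  # table value for 4 degrees of freedom, alpha = .1
print(round(chi_sq, 3), chi_sq > critical_value)  # 6.445 False
```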