Math 122 Intro to Stats, Chapter 6, Semester II
Inference for Categorical Data

Hypothesis Testing for a Proportion

In a survey, 1864 out of 2246 randomly selected adults said texting while driving should be illegal. Use a significance level of 0.05 to test the claim that more than 80% of adults believe texting while driving should be illegal.

1. $H_0: p$ ____    $H_A: p$ ____
2. Test statistic and p-value:
3. Conclusions:
Use STAT, TESTS, 1-PropZTest.

1. Let p0 be the null or hypothesized value of p.
2. Let x be the number of successes.
3. Let n be the sample size.
4. Choose ≠, <, or > to correspond to $H_A$.
5. Choose Calculate and hit ENTER.

Use MENU, STAT, TEST (F3), (F1), (F3).

1. Specify the sidedness of the test.
2. Enter the null value, p0.
3. Enter the number of successes, x.
4. Let n be the sample size.
5. Choose EXE.
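As a cross-check on the calculator output, the same test can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual normal approximation with the standard error computed from the null value p0. The function name `one_prop_z_test` and its argument names are made up for illustration and are not calculator commands. The inputs are the texting-survey counts from the first example.

```python
import math

def one_prop_z_test(x, n, p0, alternative="greater"):
    """One-proportion z-test using the normal approximation.

    x: number of successes, n: sample size, p0: null value of p.
    alternative: "greater", "less", or "two-sided" (matches H_A).
    """
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H_0
    z = (p_hat - p0) / se
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = 1 - upper_tail
    else:
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value

# Texting survey: 1864 successes out of 2246, claim p > 0.80.
z, p_value = one_prop_z_test(1864, 2246, 0.80, alternative="greater")
print(f"z = {z:.3f}, p-value = {p_value:.5f}")
```

The `alternative` argument plays the role of the ≠, <, > choice on the calculator screen.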
Clinical trials involved treating patients with Tamiflu. Among 724 patients treated, 72 experienced nausea as an adverse reaction. Use a significance level of 0.05 to test the claim that the rate of nausea is greater than the 6% rate experienced by patients given a placebo. Does nausea appear to be a concern for those given the Tamiflu treatment?

1. $H_0: p$ ____    $H_A: p$ ____
2. Test statistic and p-value:
3. Conclusions:
In a survey of 703 randomly selected workers, 61% got their jobs through networking. Use a significance level of 0.05 to test the claim that most (more than 50%) workers got their jobs through networking. What does the result suggest about the strategy for finding a job after graduation?

1. $H_0: p$ ____    $H_A: p$ ____
2. Test statistic and p-value:
3. Conclusions:
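The calculator asks for a whole-number count x, but this problem reports only a percentage. The short sketch below shows one way to handle that, assuming the count is recovered by rounding n times the reported proportion to the nearest integer. The variable names are illustrative only.

```python
import math

n = 703                # sample size
p_hat_reported = 0.61  # reported sample proportion
p0 = 0.50              # null value for "most workers"

# The calculator requires an integer number of successes.
x = round(n * p_hat_reported)

z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 0.5 * math.erfc(z / math.sqrt(2))  # right-tail area for H_A: p > p0
print(f"x = {x}, z = {z:.2f}, p-value = {p_value:.2e}")
```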
Hypothesis Testing About Two Proportions

In clinical trials of Lipitor, 94 subjects were treated with Lipitor and 270 subjects were given a placebo. Among those treated with Lipitor, 7 developed infections. Among the control group, 27 developed infections. Use a significance level of 0.05 to test the claim that the rate of infection was the same for both groups.

1. $H_0: p_1$ ____ $p_2$    $H_A: p_1$ ____ $p_2$
2. Test statistic and p-value:
3. Conclusions:
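A two-proportion test of this kind can be sketched as follows. This is a minimal sketch, assuming the common pooled-proportion form of the standard error and a two-sided alternative, which matches a claim that the two rates are the same. The function name `two_prop_z_test` is hypothetical.

```python
import math

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimate under H_0: p1 = p2
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z >= |z|)
    return z, p_value

# Lipitor group: 7 infections out of 94; placebo group: 27 out of 270.
z, p_value = two_prop_z_test(7, 94, 27, 270)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```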
Chi-Square Distribution

The distribution we will use is the $\chi^2$ distribution. Some properties of the $\chi^2$ distribution are:

1. The $\chi^2$ distribution is not symmetric; it is skewed to the right.
2. The values of the distribution cannot be negative.
3. The $\chi^2$ distribution is different for each number of degrees of freedom.
[Figure: graphs of the $\chi^2$ distribution for various degrees of freedom]
Goodness of Fit

Assumptions

1. The data have been randomly selected.
2. The sample data consist of frequency counts for each of the different categories.
3. For each category, the expected frequency is at least 5.

A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits some claimed distribution.
Hypotheses

The form of the hypotheses is

$H_0: p_1 = p_2 = \cdots = p_k$
$H_A$: at least one proportion is different

or

$H_0: p_1 = \text{value}_1,\ p_2 = \text{value}_2,\ \ldots,\ p_k = \text{value}_k$
$H_A$: at least one proportion is not as claimed
Notation for Goodness-of-Fit

O: observed frequency of an outcome
E: expected frequency of an outcome
k: number of different categories or outcomes
n: total number of trials

Finding Expected Frequencies

If all expected frequencies are equal, then $E = n/k$.
If the expected frequencies are not equal, then $E = np$ for each category, where p is the probability of being in the category.

Test Statistic for Goodness-of-Fit

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
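The expected-frequency rules and the test statistic above can be sketched in Python as follows. The function name `gof_statistic` is hypothetical, and the counts in the example are invented purely to show the arithmetic. They are not data from these notes.

```python
def gof_statistic(observed, probs=None):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.

    probs: claimed probability for each category. If omitted, every
    category gets the same expected count E = n / k.
    """
    n = sum(observed)  # total number of trials
    k = len(observed)  # number of categories
    if probs is None:
        expected = [n / k] * k
    else:
        expected = [n * p for p in probs]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

# Hypothetical counts for three equally likely categories.
observed = [20, 15, 10]
chi2, expected = gof_statistic(observed)
print(expected, round(chi2, 3))
```

Passing `probs` covers the unequal case, for example `gof_statistic(counts, [0.5, 0.3, 0.2])`.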
Close agreement between observed and expected frequencies leads to a small value for the $\chi^2$ statistic and supports the null hypothesis. Significant differences between observed and expected frequencies lead to a large value for the test statistic and result in a rejection of the null hypothesis. The null hypothesis is that the data fit a particular distribution.
A researcher designs an experiment in which a rat chooses between doors of three different colors. Does the rat have a preference for one of the doors?

            Green   Red   Blue
Frequency

I. $H_0$:    $H_A$:
II. Test statistic and p-value:
III. Conclusions:
1. Enter the observed counts into list L1 and the expected counts into list L2.
2. Choose STAT.
3. Right arrow to TESTS.
4. Down arrow and choose $\chi^2$GOF-Test.
5. Enter the degrees of freedom after df:
6. Choose Calculate and hit ENTER.
1. Navigate to STAT (MENU), then hit the (2) button or select STAT.
2. Enter the observed counts into a list (e.g. List 1) and the expected counts into another list (e.g. List 2).
3. Choose the TEST option (F3).
4. Choose the CHI option (F3).
5. Choose the GOF option (F1).
6. Adjust the Observed and Expected lists to the corresponding list numbers from Step 2.
7. Enter the degrees of freedom, df.
8. Specify a list where the contributions to the test statistic will be reported using CNTRB. This list number should be different from the others.
9. Hit the EXE button.
Enter observed values in L1 and expected values in L2. With L3 highlighted, enter (L1 − L2)² / L2.

The keystrokes are the following:
( 2ND 1 − 2ND 2 ) x² ÷ 2ND 2

QUIT, then 2ND STAT and scroll to MATH. Select option 5:sum(, enter 2ND 3, and press ENTER. This is the $\chi^2$ test statistic.
We will use the calculator to determine the p-value, which is the area to the right of the test statistic. The number of degrees of freedom is one less than the number of categories.

2ND VARS 7: $\chi^2$cdf(test statistic, BIG NUMBER, df)
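The right-tail area that the calculator returns can be approximated in Python without extra libraries. The following is a sketch, assuming whole-number degrees of freedom. It starts from the exact tail for df = 1 or df = 2 and steps up two degrees of freedom at a time. The function name `chi2_right_tail` is hypothetical, and the statistic in the usage line is made up. A statistics library would normally be the more convenient choice.

```python
import math

def chi2_right_tail(stat, df):
    """P(X >= stat) for a chi-square variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    half = stat / 2
    if df % 2 == 0:
        tail = math.exp(-half)             # exact right tail for df = 2
        j = 2
    else:
        tail = math.erfc(math.sqrt(half))  # exact right tail for df = 1
        j = 1
    # Step up two degrees of freedom at a time:
    # Q(x; j + 2) = Q(x; j) + (x/2)^(j/2) * e^(-x/2) / Gamma(j/2 + 1)
    while j < df:
        tail += half ** (j / 2) * math.exp(-half) / math.gamma(j / 2 + 1)
        j += 2
    return tail

# Hypothetical statistic from a test with 3 categories, so df = 3 - 1.
p_value = chi2_right_tail(3.333, df=3 - 1)
print(f"p-value = {p_value:.4f}")
```

This plays the same role as entering the test statistic as the lower bound and a very large number as the upper bound on the calculator.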
Mars, Inc., claims that its M&M's are distributed with the following color percentages: 30% brown, 20% yellow, 20% red, 10% orange, 10% green, and 10% blue. A sample of 100 M&M's yielded the following data. Test the claim that the color distribution is as claimed.

            Brown   Yellow   Red   Orange   Green   Blue
Frequency

I. $H_0$:    $H_A$:
II. Test statistic and p-value:
III. Conclusions:
Contingency Tables

A contingency table is a table in which frequencies correspond to two variables.

            Male   Female
Nebraska
Other
Test of Independence/Homogeneity

1. The null hypothesis $H_0$ is the statement that the row and column variables are independent. The alternative $H_A$ is that the row and column variables are dependent.
2. For every cell, the expected frequency is at least 5.
3. The number of degrees of freedom is $(r - 1)(c - 1)$, where r is the number of rows and c is the number of columns.
Test Statistic for Test of Independence

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Expected Frequency for a Contingency Table

$$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$
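The two formulas above, together with the degrees of freedom (r − 1)(c − 1), can be sketched in Python as follows. The function name `independence_statistic` is hypothetical, and the 2 x 2 counts are invented to show the arithmetic. They are not data from these notes.

```python
def independence_statistic(table):
    """Chi-square statistic, df, and expected counts for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # E = (row total)(column total) / (grand total) for every cell
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 counts.
table = [[30, 20],
         [20, 30]]
chi2, df, expected = independence_statistic(table)
print(expected, chi2, df)
```

The `expected` matrix corresponds to the expected-value matrix that the calculator stores after running its two-way test.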
1. Hit 2ND x⁻¹ (i.e. MATRIX).
2. Right arrow to EDIT.
3. Hit 1 or ENTER to select matrix A.
4. Enter the dimensions by typing #rows, ENTER, #columns, ENTER.
5. Enter the data from the two-way table.
6. Choose STAT.
7. Right arrow to TESTS.
8. Down arrow and choose C:$\chi^2$-Test.
9. Down arrow, choose Calculate, and hit ENTER.
1. Navigate to STAT (MENU button, then hit the 2 button or select STAT).
2. Choose the TEST option (F3 button).
3. Choose the CHI option (F3 button).
4. Choose the 2WAY option (F2 button).
5. Enter the data into a matrix: Hit MAT (F2 button). Navigate to a matrix you would like to use (e.g. Mat C) and hit EXE. Specify the matrix dimensions: m is for rows, n is for columns. Enter the data. Return to the test page by hitting EXIT twice.
6. Enter the Observed matrix that was used by hitting MAT (F1 button) and the matrix letter (e.g. C).
7. Enter the Expected matrix where the expected values will be stored (e.g. D).
8. Hit the EXE button.
            Male   Female
Nebraska
Other

I. $H_0$:    $H_A$:
II. Test statistic and p-value:
III. Conclusions:
A researcher was interested in comparing three of the most commonly used NSAIDs (nonsteroidal anti-inflammatory drugs): Motrin, Advil, and Aleve. Ninety (90) people who suffer from headaches were randomly assigned to each of these three treatments.

Felt Better   Motrin   Advil   Aleve
Yes
No

I. $H_0$:    $H_A$:
II. Test statistic and p-value:
III. Conclusions: