Objectives. 9.1, 9.2 Inference for two-way tables. The hypothesis: no association. Expected cell counts. The chi-square test.

Transcription

1 Objectives. 9.1, 9.2 Inference for two-way tables. The hypothesis: no association. Expected cell counts. The chi-square test. Using software. Further reading:

2 Independence/Association: Sample and Population
In the previous section we defined independence and dependence (also called association) using two-way contingency tables. Recall that two variables are independent if the probability of one variable conditioned on the other is the same as its marginal probability.
Example: In Chapter 13 we gave an example where the gender of a person did not change the chance of a pass or fail. This means that gender and passing are independent variables.
If the marginal probabilities and the conditional probabilities are not the same, then there is an association between the variables.
Example: In Chapter 13 we gave an example where the gender of a person changed (dramatically) the chance of them wearing a dress (or not) to the Oscars. This means there is an association between dress wearing and gender. Knowledge of one variable (i.e., the person is female) changes the chance of wearing a dress or not.
In reality we never observe the whole population, and the numbers in a two-way table come from a sample of that population. In such a case, even if the variables are independent, sampling variation means that the marginal proportions in the sample will not be exactly the same as the conditional proportions.

3 Example: The number of males and females in higher education is known to be equal. However, in a given class the numbers of males and females are likely to be different. Thus we need to 'test for independence' between the variables given the data set. We do this by 'predicting' what the numbers in the table would be under the scenario of independence and comparing them with what is actually observed in the data. This is best explained through several examples.

4 But in reality these are only samples from the entire population. For a sample we cannot expect, even in the case of independence (no association), that the marginal probabilities and the conditional probabilities will be exactly the same. They will differ due to random variation in the sample. As in everything we have done so far, we are interested not in the sample but in the population itself. What can we infer about the population based on the sample? We are therefore interested in whether there is an association between the two variables in the population. When we have a two-by-two table, for example the hair and Minidoxil example, we can test whether the conditional probabilities (the proportion who saw an improvement using Minidoxil vs. placebo) are the same by using the test for two proportions. However, this method cannot be extended to larger tables such as the student smoking example. Instead we take a slightly different approach: we calculate what we would expect to see if there is no dependence and compare it with what we do observe. We formalize this on the next slides.

5 Hypothesis test for association
When we have two categorical variables, it is often of interest to determine whether they are associated. As always, a firm decision can only be made by rejecting a null hypothesis using an appropriate test procedure. This is because we need to know if the apparent differences among sample proportions are likely to have occurred just by chance due to random sampling.
The hypotheses are $H_0$: the variables are not associated vs. $H_a$: the variables are associated.
We will use the chi-square ($\chi^2$) test to assess the null hypothesis based on how well the data fit what $H_0$ predicts the counts of a two-way table to be.

6 Expected cell counts
To test the null hypothesis of no association between the variables, we must compare the actual (observed) counts from the sample data with the expected counts. The null hypothesis predicts that the cell proportions of the column variable within each row are the same as their overall proportions for the whole table. Specifically, the expected count in any cell of a two-way table when $H_0$ is true is:
$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$
Do not round the expected count to a whole number; it usually is not one. The expected count is a mean, not a value we would actually see.
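The expected-count rule can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is made up here, and the input is the pass/fail by gender table of Example 2, where the observed counts happen to equal the expected ones.

```python
def expected_counts(observed):
    """Expected cell counts under H0 (no association).

    `observed` is a list of rows, each a list of cell counts
    (margins are NOT included).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # expected count = row total * column total / total sample size
    return [[r * c / n for c in col_totals] for r in row_totals]


# Illustrative input: rows = (pass, fail), columns = (male, female).
grades = [[108, 180],
          [12, 20]]
expected = expected_counts(grades)
print(expected)  # [[108.0, 180.0], [12.0, 20.0]]
```

The results are kept as floats on purpose, since expected counts are means and should not be rounded to whole numbers.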

7 Example 1: Oscars and dresses
Let us return to the Oscars data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e., the number who wore dresses, the number who did not wear dresses, the number of females and the number of males.

Marginal totals (share of the 420 people in parentheses):
            Male          Female        Total
Dress                                   215 (0.512)
No Dress                                205 (0.488)
Total       200 (0.476)   220 (0.524)   420

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?
Answer: Only if dress and gender are independent variables. In this case the conditional and marginal probabilities are the same. If there is no dependence between gender and whether they wear a dress or not, then we can use the marginal probabilities to predict the numbers.

Expected counts under independence:
            Male                          Female                          Total
Dress       51.2% of 200 males = 102.38   51.2% of 220 females = 112.62   215 (0.512)
No Dress    48.8% of 200 males = 97.62    48.8% of 220 females = 107.38   205 (0.488)
Total       200 (0.476)                   220 (0.524)                     420

Compare the numbers from what is observed and expected under independence: they are completely different!
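The "51.2% of 200 males" reasoning can be written as a short Python sketch that fills in a table from its margins alone. The helper name `expected_from_margins` is hypothetical; the margins are the Oscars totals quoted in the slide.

```python
def expected_from_margins(row_totals, col_totals):
    """Fill in a two-way table from its margins, assuming independence."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same n")
    table = []
    for r in row_totals:
        row_share = r / n  # e.g. 215/420 = 0.512 of all people wore a dress
        # Under independence the same share applies within every column.
        table.append([row_share * c for c in col_totals])
    return table


# Oscars margins: rows = (dress, no dress), columns = (male, female).
oscars = expected_from_margins([215, 205], [200, 220])
for label, row in zip(["Dress", "No Dress"], oscars):
    print(label, [round(x, 2) for x in row])
```

Applying the marginal share to a column total gives the same number as row total times column total divided by the table total, so this is the usual expected-count formula in another form.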

8 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \sum_{\text{cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
This difference is huge! It means that our predictions are completely wrong, and this is because we made the predictions under the assumption that dress and gender are completely independent. Clearly they are not, which is why there is such a big difference. Observe that the p-value is very small. This tells us we reject the null hypothesis, which is that there is no dependence between gender and whether they wear a dress or not.

9 Example 2: Gender and grades
Let us return to the gender and grade data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e., the number who passed, the number who did not pass, the number of females and the number of males.

Marginal totals:
        Male   Female   Total
Pass                    288
Fail                    32
Total   120    200      320

We 'fill in' the middle of the table with what we expect to see if there is no dependence between gender and grades. If there is no dependence between gender and grades, then we can use the marginal probabilities to predict the numbers.

Expected counts under independence:
        Male                     Female                     Total
Pass    90% of 120 males = 108   90% of 200 females = 180   288 (0.90)
Fail    10% of 120 males = 12    10% of 200 females = 20    32 (0.10)
Total   120 (0.375)              200 (0.625)                320

Compare the numbers from what is observed and expected under independence: they are exactly the same!

10 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \frac{(108-108)^2}{108} + \frac{(180-180)^2}{180} + \frac{(12-12)^2}{12} + \frac{(20-20)^2}{20} = 0$$
There is no difference at all; the observed counts are exactly what is expected under independence of grade and gender. Look at the p-value: it is one. This tells us we cannot reject the null hypothesis that there is no dependence between gender and grades.
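As a rough sketch, the "observed versus predicted" comparison can be computed in Python as below. The function name `chi_square_statistic` is an illustrative choice, and the second table is a made-up example included only to show a non-zero result.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under H0
            chi2 += (obs - exp) ** 2 / exp
    return chi2


# Gender and grades: observed counts equal the expected counts.
grades = [[108, 180],
          [12, 20]]
print(chi_square_statistic(grades))  # 0.0

# Hypothetical table (not from the slides): every expected count is 15.
made_up = [[10, 20],
           [20, 10]]
print(round(chi_square_statistic(made_up), 2))  # 6.67
```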

11 Example 3: Minidoxil and hair
Let us return to the Minidoxil data. Suppose we miss all the numbers inside the table and only observe the subtotals, i.e., the number who saw an improvement, the number who did not, and the numbers in both groups.

Marginal totals:
                 Minidoxil   Placebo   Subtotals
Improvement                            159
No improvement                         453
Total            310         302       612

Question: These are the marginal numbers. Can we use these to deduce the numbers inside the boxes?
Again we can only use the marginals if there is no dependence/association between the treatment and hair growth. If this is the case, we obtain the expected counts below.

Expected counts under independence:
                 Minidoxil             Placebo               Subtotals
Improvement      26% of 310 = 80.54    26% of 302 = 78.46    159 (26%)
No improvement   74% of 310 = 229.46   74% of 302 = 223.54   453 (74%)
Total            310 (50.7%)           302 (49.3%)           612

Unlike the previous two examples, the numbers neither match nor are completely different. How should we interpret these differences?

12 We measure the difference between what is actually observed in the data and what we predict if there is no association:
$$\chi^2 = \sum_{\text{cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
This difference is not zero (as in Example 2), nor is it huge (as in Example 1). Can we explain this difference by sampling variation alone, when in fact Minidoxil is no different from the placebo? It is hard to judge based on the difference 11.58 alone; we need to know the distribution associated with $\chi^2$ and from there deduce the p-value. We see that the p-value is 0.7%; in the next few slides we explain where this comes from.

13 The chi-squared distribution
This is what a chi-squared distribution looks like. It tells us that if there is no association between gender and binge drinking, the chi-square statistic is likely to be quite small. In fact the chances of it being large are quite slim. These chances can be obtained using the critical values for the chi-squared distribution given in the chi-squared tables. Looking up the chi-squared tables with 1 df, we see that there is a 25% chance the chi-square value will be more than 1.32 and a 5% chance it will be more than 3.84. We apply this to our chi-square value.

14 For the chi-square test, $H_0$ states that there is no association between the row and column variables in a two-way table. The alternative is that these variables are related. If $H_0$ is true, the chi-square test statistic has approximately a chi-square distribution with $(r-1)(c-1)$ degrees of freedom. Use the chi-square table (Table F) to get the P-value. The P-value for the chi-square test is the area to the right of the test statistic $\chi^2$ under this distribution.
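For readers without Table F at hand, the right-tail area can be approximated in plain Python. The sketch below is not the slides' method: it uses standard closed-form expressions for the chi-square upper tail with whole-number degrees of freedom, and the function name `chi_square_p_value` is an assumption made for illustration.

```python
import math


def chi_square_p_value(chi2, df):
    """Area to the right of `chi2` under a chi-square curve (integer df >= 1)."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if chi2 <= 0:
        return 1.0
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# The 1 df critical values quoted in the slides: about 25% and about 5%.
print(chi_square_p_value(1.32, 1))
print(chi_square_p_value(3.84, 1))
```

With 1 degree of freedom the two printed areas come out close to 0.25 and 0.05, in line with the table values 1.32 and 3.84.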

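15 Table F: df = $(r-1)(c-1)$. Find the row for the degrees of freedom (for example, df = 2) and the two adjacent critical values that bracket $\chi^2$; the P-value lies between the corresponding tail probabilities p.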

16 The chi-square value for the Minidoxil example lies far to the right under the chi-squared distribution. The area to the RIGHT of it is 0.7%. This matches the StatCrunch output. It can also be deduced from the tables: since the chi-square value is greater than 3.84 (which corresponds to 5% in the chi-squared table), it is clear that the corresponding p-value is a lot less than 5%.
Interpretation: If there is no association between treatment and hair growth, then about 0.7% of the time we would observe a chi-square value this large or larger.

17 Example 4: Treating cocaine addiction
74 patients addicted to cocaine were assigned at random to one of three possible treatments. The observed variable is whether or not they relapsed into addiction after their treatment. We test whether the chance of relapse is related to treatment with α = 0.05.
[Figure: Observed % of No Relapse by Treatment]
[Figure: Expected % of No Relapse by Treatment: 35.14% for each of the three treatments]
Overall 26/74 = 35.14% did not relapse. If this occurrence does not depend on the treatment, then we would expect 35.14% of each group not to relapse. This is what a null hypothesis of no association would predict.

18 Treating cocaine addiction, cont.
We have noted that, assuming treatment has no effect on the relapse response, each treatment group should be 35.14% no relapse and 64.86% relapse. The expected counts are computed using the margin totals from the observed (actual) counts.

Observed counts (relapse):
              No   Yes   Total
Desipramine   15   10    25
Lithium        7   19    26
Placebo        4   19    23
Total         26   48    74

Expected counts (relapse):
              No                    Yes                     Total
Desipramine   (25 × 26)/74 = 8.78   (25 × 48)/74 = 16.22    25
Lithium       (26 × 26)/74 = 9.14   (26 × 48)/74 = 16.86    26
Placebo       (23 × 26)/74 = 8.08   (23 × 48)/74 = 14.92    23
Total         26                    48                      74

Row totals are the same for both tables. Column totals are the same for both tables.
Comparing the observed and expected counts, we can see that the two do not agree very well. But to say whether this is significant, we need to compute a test statistic and a P-value.

19 Treating cocaine addiction, cont.
Now we compute the $\chi^2$ test statistic from the observed and expected counts:
$$\chi^2 = \sum \frac{(\text{obs.} - \text{exp.})^2}{\text{exp.}} = \frac{(15-8.78)^2}{8.78} + \frac{(10-16.22)^2}{16.22} + \frac{(7-9.14)^2}{9.14} + \frac{(19-16.86)^2}{16.86} + \frac{(4-8.08)^2}{8.08} + \frac{(19-14.92)^2}{14.92} = 10.74$$
The degrees of freedom are (3 − 1)(2 − 1) = 2. From Table F, we find that the area to the right of 10.74 is between 0.0025 and 0.005. The P-value is less than α = 0.05, so we conclude that the chance of relapse is related to the treatment. A causal effect is also indicated because the treatment was applied and then the response (relapse/no relapse) was observed. For many data sets we cannot say there is a causal effect; see HW9, Q4.
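The whole calculation for this example can be checked with a short Python sketch. The cell counts used here are an assumption reconstructed from figures quoted in the slides (60% of 25 Desipramine patients, 7 of 26 Lithium patients and 4 of 23 Placebo patients did not relapse; column totals 26 and 48). The P-value uses the fact that, for 2 degrees of freedom, the right-tail area is exp(-χ²/2).

```python
import math

# (no relapse, relapse) per treatment; reconstructed counts, see the note above.
observed = {
    "Desipramine": (15, 10),
    "Lithium": (7, 19),
    "Placebo": (4, 19),
}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
for r_tot, row in zip(row_totals, rows):
    for c_tot, obs in zip(col_totals, row):
        exp = r_tot * c_tot / n          # unrounded expected count
        chi2 += (obs - exp) ** 2 / exp

df = (len(rows) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi2 / 2)            # closed form, valid only for df = 2
print(f"chi2 = {chi2:.2f}, df = {df}, P-value = {p_value:.4f}")
```

With unrounded expected counts the statistic comes out near 10.73; rounding the expected counts to two decimals first gives about 10.74. Either way the P-value is roughly 0.005, well below 0.05.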

20 Cocaine addiction, cont.
[Figure: Observed % of no relapse]
From StatCrunch: use Stat-Tables-Contingency-with summary. The counts are to be provided in a table format. The P-value is very small, which is very significant. We reject the null hypothesis of no association and conclude there is a relationship between the treatment and the outcome (relapse or not).

21 Meaning of conditional probabilities: Cocaine addiction, cont.
Since the outcome (relapse or not) is a response variable, it is sensible to ask about its conditional distribution, given each treatment, and then to compare across treatments. For example, 60% of addicts treated with Desipramine did not relapse and 27% of addicts treated with Lithium did not relapse, while only 17% of addicts treated with the placebo did not relapse. But it is not sensible to look at the conditional proportions for treatment, given the outcome, because treatment is not a response variable in this study: it was applied to the patients (subjects). In fact, the row totals are the sizes of three samples and are not random.
P(No | Placebo) = 4/23 = 17.39% has meaning.
P(Placebo | No) = 4/26 = 15.38% has no interpretation.
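The two kinds of conditional proportion can be contrasted in a small Python sketch. The helper names `row_percent` and `column_percent` are illustrative, and the relapse counts (10, 19, 19) are an assumption derived from the group sizes 25, 26 and 23 and the no-relapse counts 15, 7 and 4.

```python
counts = {
    "Desipramine": {"No": 15, "Yes": 10},
    "Lithium": {"No": 7, "Yes": 19},
    "Placebo": {"No": 4, "Yes": 19},
}


def row_percent(table, row, col):
    """P(col | row): share of the row total that falls in column `col`."""
    return table[row][col] / sum(table[row].values())


def column_percent(table, row, col):
    """P(row | col): share of the column total that falls in row `row`."""
    col_total = sum(cells[col] for cells in table.values())
    return table[row][col] / col_total


# Meaningful here: outcome given treatment.
for treatment in counts:
    print(treatment, f"{row_percent(counts, treatment, 'No'):.2%}")

# Computable, but not interpretable in this design: treatment given outcome.
print(f"{column_percent(counts, 'Placebo', 'No'):.2%}")
```

Both numbers can always be computed from a table; the study design decides which one answers a sensible question.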

22 Cocaine addiction, cont.
Using the confidence interval formula for a single proportion, we can estimate the individual proportions of no relapse for each treatment (the multiplier 1.96 corresponds to 95% confidence).
For Desipramine: $0.600 \pm 1.96\sqrt{0.600(0.400)/25} = 0.600 \pm 0.192 = (0.408, 0.792)$.
For Lithium: $0.269 \pm 1.96\sqrt{0.269(0.731)/26} = 0.269 \pm 0.170 = (0.099, 0.439)$.
For Placebo: $0.174 \pm 1.96\sqrt{0.174(0.826)/23} = 0.174 \pm 0.155 = (0.019, 0.329)$.
We can also compare two groups, say Desipramine and Lithium:
$$(0.600 - 0.269) \pm 1.96\sqrt{\frac{0.600(0.400)}{25} + \frac{0.269(0.731)}{26}} = 0.331 \pm 0.257 = (0.074, 0.588)$$
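These intervals can be reproduced with a short Python sketch. The function names are illustrative, and the multiplier z = 1.96 (95% confidence) is an assumption inferred from the margins of error shown on the slide.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Large-sample interval for one proportion: p_hat +/- z * SE."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def difference_ci(s1, n1, s2, n2, z=1.96):
    """Large-sample interval for the difference of two proportions."""
    p1, p2 = s1 / n1, s2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# No-relapse counts out of each treatment group.
for name, (s, n) in {"Desipramine": (15, 25),
                     "Lithium": (7, 26),
                     "Placebo": (4, 23)}.items():
    low, high = proportion_ci(s, n)
    print(f"{name}: ({low:.3f}, {high:.3f})")

low, high = difference_ci(15, 25, 7, 26)
print(f"Desipramine - Lithium: ({low:.3f}, {high:.3f})")
```

Small differences in the third decimal place relative to the slide are possible because the code does not round the sample proportions before computing the margin of error.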

23 Example 5: Left-handed students
Does right/left-handedness of students vary between genders? We will test $H_0$: there is no relationship between handedness and gender, vs. $H_a$: there is some relationship, at significance level α. From the class survey, we get the summary in the table at right. In StatCrunch use: Stat-Tables-Contingency-with data. The results of a chi-square test show that the P-value is 4.33%. Since this is bigger than α, we do not reject $H_0$. There is insufficient evidence to conclude that handedness differs by gender. Note: only 407 students responded to both questions.

24 Example 6: Parental smoking Does parental smoking influence the smoking habits of their high school children? High school students were asked whether they smoke and whether their parents smoke. In StatCrunch, request row percents (if the column variable is a response) and column percents (if the row variable is a response). The proportion of students who smoke, among those for whom both parents smoke, is 400/1780 = 22.47%. The proportion of students for whom both parents smoke, among those who smoke, is 400/1004 = 39.84%. The percent of students who smoke is greatest when both parents smoke and least when neither parent smokes (22% vs. 14%). If a student smokes, it is more likely that both parents do (40%) than that neither parent does (19%).

25 Experimental designs for two-way tables
The chi-square test is an overall technique for determining evidence of a relationship between two categorical variables. There are two cases.
Case 1: Compare category proportions for several populations. A simple random sample (SRS) is selected from each population and a single categorical variable is observed. The cocaine addiction study is an example of this: the populations are the 3 treatments, an SRS was obtained for each treatment, and one response was observed.
Case 2: Test independence of two categorical variables in a single population. A single random sample is obtained from the population and each individual is classified according to the two categorical variables. The handedness survey is an example of this: there was one random sample from the population and each student responded to the two categorical questions.
We use the $\chi^2$ test to test the null hypothesis of no relationship in both cases.

26 Review: The chi-square test statistic
The chi-square statistic ($\chi^2$) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. Summing over all $r \times c$ cells in the table (r and c are the number of rows and columns), the formula for the $\chi^2$ statistic is
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
Beware: the denominators are expected counts. Do not include the margins in the sum. Large values of $\chi^2$ represent strong deviations from the expected distribution under $H_0$ and provide evidence against $H_0$. How large a $\chi^2$ value is required for statistical significance depends on its degrees of freedom, df = $(r-1)(c-1)$.

27 Summary of testing for association
This test pertains to association between two categorical variables only. The hypotheses are $H_0$: the variables are not associated vs. $H_a$: the variables are associated.
The data are summarized in an $r \times c$ table with cells containing the observed counts for each combination of categories of the two variables. r is the number of rows (categories of the 1st variable) and c is the number of columns (categories of the 2nd variable).
The expected count for each cell in the table is computed by
$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{total sample size}}$$
The test statistic is
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
The P-value is computed from the chi-square distribution as the area to the right of the $\chi^2$ statistic, with df = $(r-1)(c-1)$.

28 When is it safe to use a $\chi^2$ test?
Like the z-tests for proportions, the chi-square test is based on an approximation. We can safely use the test when:
- The samples are simple random samples (SRS).
- All but one or two individual observed counts are 1 or more.
- All expected counts are 5 or more, except perhaps one which should be at least 1.
- For a 2 x 2 table, all four expected counts should be 5 or more.
If the approximation is not appropriate, a statistician should be consulted for the proper procedure.
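The count-based conditions can be checked mechanically; the sampling condition (SRS) cannot, since it depends on how the data were collected. The sketch below is one literal reading of the rules on this slide, with a hypothetical function name, and is not a substitute for judgment.

```python
def chi_square_conditions_ok(observed):
    """Check the count-based rules of thumb for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    flat_observed = [x for row in observed for x in row]

    # "All but one or two individual observed counts are 1 or more."
    if sum(1 for x in flat_observed if x < 1) > 2:
        return False

    small = [e for e in expected if e < 5]
    if len(observed) == 2 and len(observed[0]) == 2:
        # 2 x 2 table: all four expected counts must be 5 or more.
        return not small
    # Larger tables: at most one expected count below 5, and it must be >= 1.
    return len(small) <= 1 and all(e >= 1 for e in small)


# Cocaine study counts (no relapse, relapse), reconstructed from the slides.
print(chi_square_conditions_ok([[15, 10], [7, 19], [4, 19]]))  # True

# Hypothetical 2 x 2 table with one tiny expected count.
print(chi_square_conditions_ok([[3, 2], [4, 30]]))  # False
```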

29 Accompanying problems associated with this chapter: Quiz, Homework 9