
RELATIVE RISK AND ODDS RATIOS

Other summaries that are often computed when investigating the relationship between two categorical variables are the relative risk ratio and the odds ratio.

EXAMPLE: Consider the relationship between gender and survival status for the Titanic data. The contingency table and the row percentages are given below:

Questions:
1. What is the probability a passenger died given they were female?
2. What is the probability a passenger died given they were male?

RISK DIFFERENCE AND RELATIVE RISK: We have seen that the probability of death was greater for males than for females. As seen earlier, one way to compare the two groups (males and females) is to look at the risk difference (i.e., the difference in proportions).

Risk Difference: This is simply the difference in the two probabilities:

P(died given they were female) - P(died given they were male) =

We can also measure the amount of discrepancy between these two probabilities based on something called relative risk.

Relative Risk: This is a measure of how much a particular risk factor influences the risk of a specified outcome. For the Titanic data, we calculate the relative risk as follows:

Relative Risk = P(Passenger Died Given They Were Female) / P(Passenger Died Given They Were Male)
              = (Proportion of Females Who Died) / (Proportion of Males Who Died)
              =

Comments:
1. We interpret this number by saying that the probability of death for females is 4/10 the probability of death for males on the Titanic.
2. A relative risk value of 1.0 is the reference value for making comparisons. That is, a relative risk of 1.0 says that there is no difference in the two probabilities.
3. The risk difference and relative risk ratio are easily displayed in our mosaic plot:
4. When you are interpreting a relative risk, you MUST consider which value you have in the numerator. For example, we could have calculated the relative risk as follows:

Relative Risk = P(Passenger Died Given They Were Male) / P(Passenger Died Given They Were Female)
              = (Proportion of Males Who Died) / (Proportion of Females Who Died)
              = 2.50

Question: How would we interpret this value?
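The two comparisons above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the counts are hypothetical (they are NOT the Titanic counts). The sketch also shows that swapping the numerator and denominator groups gives the reciprocal relative risk.

```python
def risk(events, total):
    """Proportion of a group that experienced the outcome."""
    return events / total


def risk_difference(events_1, total_1, events_2, total_2):
    """Risk in group 1 minus risk in group 2."""
    return risk(events_1, total_1) - risk(events_2, total_2)


def relative_risk(events_1, total_1, events_2, total_2):
    """Risk in group 1 divided by risk in group 2 (group 1 is the numerator)."""
    return risk(events_1, total_1) / risk(events_2, total_2)


# Hypothetical counts for illustration only (NOT the Titanic data):
# (number with the outcome, group size)
group_1 = (30, 100)
group_2 = (60, 100)

rd = risk_difference(*group_1, *group_2)
rr = relative_risk(*group_1, *group_2)          # group 1 in the numerator
rr_flipped = relative_risk(*group_2, *group_1)  # group 2 in the numerator

print(f"risk difference (group 1 - group 2): {rd:.2f}")
print(f"relative risk (group 1 / group 2):   {rr:.2f}")
print(f"relative risk (group 2 / group 1):   {rr_flipped:.2f}")
```

With these made-up counts, the first relative risk is below 1.0 and the flipped one is above 1.0, yet both describe the same data. This is why the group in the numerator must always be stated.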

ODDS RATIOS

Another quantity that is used to describe differences in proportions is the odds ratio. This ratio is used more commonly than the relative risk ratio; however, it is more difficult to interpret. Before computing an odds ratio, we need to compute the odds:

Odds: Consider our Titanic example. With counts given for two distinct response categories (e.g., Died and Survived), the odds of Died versus Survived are computed, for each group, as the number of passengers who died divided by the number of passengers who survived. Recall the contingency table for this example. Find the odds of death for both males and females:

Odds of Death for Females = (Number of Females who Died) / (Number of Females who Survived) =

Odds of Death for Males = (Number of Males who Died) / (Number of Males who Survived) =

The odds ratio is simply the ratio of the odds for the two groups:

Odds Ratio = (Odds of Death for Females) / (Odds of Death for Males) =

Interpretation: The odds of death for females are about 1/10 those of males.

We could also have calculated the odds ratio as follows:

Odds Ratio = (Odds of Death for Males) / (Odds of Death for Females) =

Interpretation: The odds of death for males are about 10 times the odds of death for females.

Comments:
1. An odds ratio of 1.0 implies that there is no observable difference between the two odds. This is always the reference value!
2. The odds can also be visualized in the following graphic:
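As a rough sketch of the odds calculations described in this section, the Python below computes the odds for two groups and the odds ratio in both directions. The function names are made up for this sketch, and the counts are hypothetical (they are NOT the Titanic counts).

```python
def odds(events, non_events):
    """Odds of the outcome: number with the outcome divided by number without."""
    return events / non_events


def odds_ratio(events_1, non_events_1, events_2, non_events_2):
    """Odds for group 1 divided by odds for group 2 (group 1 is the numerator)."""
    return odds(events_1, non_events_1) / odds(events_2, non_events_2)


# Hypothetical counts for illustration only (NOT the Titanic data):
# group 1: 30 with the outcome, 70 without
# group 2: 60 with the outcome, 40 without
or_1_over_2 = odds_ratio(30, 70, 60, 40)
or_2_over_1 = odds_ratio(60, 40, 30, 70)

# The same odds ratio can be written as a cross-product of the four cell counts.
cross_product = (30 * 40) / (70 * 60)

print(f"odds, group 1: {odds(30, 70):.3f}")
print(f"odds, group 2: {odds(60, 40):.3f}")
print(f"odds ratio (group 1 / group 2): {or_1_over_2:.3f}")
print(f"odds ratio (group 2 / group 1): {or_2_over_1:.3f}")
```

As with relative risk, the two odds ratios are reciprocals of one another, so the interpretation depends on which group is in the numerator. Note that odds divide by the number WITHOUT the outcome, whereas a risk divides by the group total, so the two measures generally give different values for the same table.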

PRACTICE PROBLEM: The following data are from a study to investigate the relationship between condom use and the contraction of HIV. This study involved couples where it was known by both partners that one person was HIV positive. For each couple, it was noted whether or not the second person contracted HIV and whether or not condoms were always used.

                  Second Person Contracted HIV
Condom Use        Yes        No        Totals
Always             21       677           698
Not Always         20       137           157
Totals             41       814           855

1. Consider the above contingency table, and determine whether each of the following statements is valid or invalid.
   a. Note that 21 people contracted HIV when a condom was always used; 20 people contracted HIV when a condom was not always used. We can compare these raw counts when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
   b. We can compare 20/157 = 12.7% to 21/698 = 3.0% when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
   c. We can compare 21/41 = 51.2% to 20/41 = 48.8% when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
2. Find the risk (i.e., probability) of the second person contracting HIV given that condoms are NOT ALWAYS used.
3. Find the risk (i.e., probability) of the second person contracting HIV given that condoms are ALWAYS used.

4. Find and interpret the relative risk of the second person contracting HIV (use the risk for the group that does NOT ALWAYS use a condom in the numerator).
5. Find and interpret the odds ratio for contracting HIV (use the group that does NOT ALWAYS use condoms in the numerator).
6. Consider these mosaic plots. First, shade the column for Condom Use=Always to show the case in which a person who does NOT ALWAYS wear a condom is equally likely as a person who ALWAYS wears a condom to contract HIV (i.e., relative risk = 1.0). Then, shade the column for Condom Use=Always to show the case in which a person who does NOT ALWAYS wear a condom is twice as likely as a person who ALWAYS wears a condom to contract HIV (i.e., relative risk = 2.0).

[Two mosaic plots to shade, titled "Relative Risk = 1.0" and "Relative Risk = 2.0"]

Risk Difference, Relative Risk, and Odds Ratios in JMP

EXAMPLE: We can use JMP to calculate these quantities for the Titanic data.

Select Analyze > Fit Y by X. Move Survived to the Y, Response box and Gender to the X, Factor box.

From the red drop-down arrow next to Contingency Analysis, select Relative Risk. Select Survived = No as your response category of interest, and use Females in the numerator.

JMP then displays the relative risk:

As seen earlier, if you select Risk Difference from the same red drop-down arrow, JMP displays the following:

Finally, select Odds Ratio:

EXAMPLE: Consider a study in which low birth weight was investigated. Several risk factors (Previous history of low birth weight, Race, Hypertension, Smoking, and Uterine irritation during pregnancy) were considered in hopes of better understanding some of the possible contributors to low birth weight. The data can be found in the file LowBirth.JMP.

First consider finding the risk of low birth weight for those with and without a Previous History. That is, we must estimate the probability of having a low birth weight baby for each group.

P(Low Birth Weight Given a Previous History) =

P(Low Birth Weight Given No Previous History) =

PROBLEM: This is what is known as a case-control study. People with a particular disease agree to participate (these are the cases), and people who are similar to those in the case group but who do NOT have the disease in question are also investigated (these are the controls). In this type of study, the numbers of cases and controls who were exposed to a certain risk factor are then identified.

The problem with computing the risks given above is that these quantities are affected by the number of cases recruited for the study! That is, we have artificially inflated the probability of having a low birth weight baby. Therefore, RELATIVE RISK SHOULD NOT BE USED IN A CASE-CONTROL STUDY.
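To see why the number of cases recruited matters, consider the following Python sketch. It uses hypothetical counts (NOT the LowBirth.JMP data) and made-up function names. It assumes the researchers recruit k times as many cases while the controls, and the share of cases who were exposed, stay the same. The computed risks and relative risk change with k, but the odds ratio computed from the same table does not.

```python
def risk(cases, controls):
    """Proportion of a group that are cases."""
    return cases / (cases + controls)


def odds_ratio(cases_1, controls_1, cases_2, controls_2):
    """Odds of being a case in group 1 divided by the odds in group 2."""
    return (cases_1 / controls_1) / (cases_2 / controls_2)


# Hypothetical counts for illustration only (NOT from LowBirth.JMP):
# (cases, controls) for those with and without the risk factor
exposed = (20, 30)
unexposed = (40, 160)

results = []
for k in (1, 2, 5):  # k = how many times as many cases are recruited
    e_cases, e_controls = exposed[0] * k, exposed[1]
    u_cases, u_controls = unexposed[0] * k, unexposed[1]
    risk_exposed = risk(e_cases, e_controls)
    risk_unexposed = risk(u_cases, u_controls)
    results.append({
        "k": k,
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": odds_ratio(e_cases, e_controls, u_cases, u_controls),
    })

for r in results:
    print(f"k={r['k']}: risk(exposed)={r['risk_exposed']:.3f}, "
          f"risk(unexposed)={r['risk_unexposed']:.3f}, "
          f"RR={r['relative_risk']:.3f}, OR={r['odds_ratio']:.3f}")
```

The factor k multiplies the case counts in both the numerator odds and the denominator odds, so it cancels in the odds ratio. It does not cancel in the risks, which is why they depend on how many cases the study happened to recruit.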

Instead, we will use odds ratios to compare the various risk factors. We will compute the odds ratios so that they are greater than one (that is, the risk factor level which is more likely to produce a low birth weight baby will be used in the numerator).

Risk Factor            Contingency Table    Odds Ratio
Previous History       [not shown]          (18/12) / (41/115) = 4.21
Race                   [not shown]          (36/55) / (23/72)  = 2.05
Hypertension           [not shown]          (7/6) / (52/121)   = 2.71
Smoke                  [not shown]          (30/43) / (29/84)  = 2.02
Uterine Irritation     [not shown]          (14/14) / (45/113) = 2.51
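The odds ratios in the table can be checked with a short Python sketch. The dictionary layout and variable names are made up for this sketch; the counts are the fractions shown in the table, read here as (odds for the numerator level) over (odds for the other level). That reading is an assumption, since the contingency tables themselves are not reproduced.

```python
# Each entry: ((a, b), (c, d)), read as the odds a/b for the level used in the
# numerator and the odds c/d for the other level (an assumed reading of the
# fractions printed in the table).
risk_factors = {
    "Previous History": ((18, 12), (41, 115)),
    "Race": ((36, 55), (23, 72)),
    "Hypertension": ((7, 6), (52, 121)),
    "Smoke": ((30, 43), (29, 84)),
    "Uterine Irritation": ((14, 14), (45, 113)),
}

computed = {}
for name, ((a, b), (c, d)) in risk_factors.items():
    # Odds ratio = (a/b) / (c/d), rounded to two decimals as in the table
    computed[name] = round((a / b) / (c / d), 2)

for name, value in computed.items():
    print(f"{name:<20} {value:.2f}")
```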

Questions:
1. Which risk factor appears to have the most influence on low birth weight?
2. Which risk factor appears to have the least influence on low birth weight?
3. Do all of the risk factors appear to have at least some influence on low birth weight? Explain.