1. What is the probability a passenger died given they were female? 2. What is the probability a passenger died given they were male?

RELATIVE RISK AND ODDS RATIOS

Other summaries that are often computed when investigating the relationship between two categorical variables are the relative risk ratio and the odds ratio.

EXAMPLE: Consider the relationship between gender and survival status for the Titanic data. The contingency table and the row percentages are given below:

Questions:
1. What is the probability a passenger died given they were female?
2. What is the probability a passenger died given they were male?

RISK DIFFERENCE AND RELATIVE RISK: We have seen that the probability of death was greater for males than for females. As seen earlier, one way to compare the two groups (males and females) is to look at the risk difference (i.e., the difference in proportions).

Risk Difference: This is simply the difference in the two probabilities:

P(died given they were female) - P(died given they were male) =

We can also measure the amount of discrepancy between these two probabilities based on something called relative risk.

Relative Risk: This is a measure of how much a particular risk factor influences the risk of a specified outcome. For the Titanic data, we calculate the relative risk as follows:

Relative Risk = P(Passenger Died Given They Were Female) / P(Passenger Died Given They Were Male)
              = (Proportion of Females Who Died) / (Proportion of Males Who Died)

Comments:
1. We interpret this number by saying that the probability of death for females is 4/10 the probability of death for males on the Titanic.
2. A relative risk value of 1.0 is the reference value for making comparisons. That is, a relative risk of 1.0 says that there is no difference in the two probabilities.
3. The risk difference and relative risk ratio are easily displayed in our mosaic plot:
4. When you are interpreting a relative risk, you MUST consider which value you have in the numerator. For example, we could have calculated the relative risk as follows:

Relative Risk = P(Passenger Died Given They Were Male) / P(Passenger Died Given They Were Female)
              = (Proportion of Males Who Died) / (Proportion of Females Who Died)
              = 2.50

Question: How would we interpret this value?
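The arithmetic behind the risk difference and the relative risk can be sketched in a few lines of Python. The counts below are hypothetical and are NOT the Titanic data (the contingency table is not reproduced here); they were chosen only so that the relative risk comes out to 4/10, matching the interpretation above. The function and variable names are illustrative.

```python
def risk(events, total):
    """Proportion of a group that experienced the outcome."""
    return events / total

# Hypothetical counts (not the Titanic data), chosen so that RR = 0.40.
died_a, total_a = 40, 200    # group A: 40 of 200 died
died_b, total_b = 100, 200   # group B: 100 of 200 died

risk_a = risk(died_a, total_a)
risk_b = risk(died_b, total_b)

risk_difference = risk_a - risk_b   # group A minus group B
rr_a_over_b = risk_a / risk_b       # group A in the numerator
rr_b_over_a = risk_b / risk_a       # group B in the numerator

print(f"Risk difference:      {risk_difference:.2f}")
print(f"Relative risk (A/B):  {rr_a_over_b:.2f}")
print(f"Relative risk (B/A):  {rr_b_over_a:.2f}")
```

Swapping the numerator and denominator gives the reciprocal (1 / 0.40 = 2.50), which is why the interpretation must always state which group is in the numerator.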

ODDS RATIOS

Another quantity that is used to describe differences in proportions is the odds ratio. This ratio is used more commonly than the relative risk ratio; however, it is more difficult to interpret and is harder to understand. Before computing an odds ratio, we need to compute the odds:

Odds: Consider our Titanic example. With counts given for two distinct response categories (e.g., Died and Survived), the odds of Died versus Survived are computed as the number of passengers who died divided by the number of passengers who survived, separately for each group. Recall the contingency table for this example. Find the odds of death for both males and females:

Odds of Death for Females = (Number of Females who Died) / (Number of Females who Survived)

Odds of Death for Males = (Number of Males who Died) / (Number of Males who Survived)

The odds ratio is simply the ratio of the odds for the two groups:

Odds Ratio = (Odds of Death for Females) / (Odds of Death for Males)

Interpretation: The odds of death for females are about 1/10 those of males.

We could also have calculated the odds ratio as follows:

Odds Ratio = (Odds of Death for Males) / (Odds of Death for Females)

Interpretation: The odds of death for males are about 10 times the odds of death for females.
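A minimal Python sketch of the odds and odds ratio calculations follows. The counts are hypothetical and are NOT the Titanic data; the names are illustrative. The same made-up groups have risks of 0.20 and 0.50, so the example also shows that the odds ratio is generally not equal to the relative risk.

```python
def odds(events, non_events):
    """Odds of the outcome: count with the outcome divided by count without."""
    return events / non_events

# Hypothetical counts (not the Titanic data).
died_a, survived_a = 40, 160     # group A
died_b, survived_b = 100, 100    # group B

odds_a = odds(died_a, survived_a)
odds_b = odds(died_b, survived_b)

or_a_over_b = odds_a / odds_b    # group A in the numerator
or_b_over_a = odds_b / odds_a    # group B in the numerator

# Relative risk for the same counts, for comparison.
rr_a_over_b = (died_a / (died_a + survived_a)) / (died_b / (died_b + survived_b))

print(f"Odds of death, group A: {odds_a:.2f}")
print(f"Odds of death, group B: {odds_b:.2f}")
print(f"Odds ratio (A/B):       {or_a_over_b:.2f}")
print(f"Odds ratio (B/A):       {or_b_over_a:.2f}")
print(f"Relative risk (A/B):    {rr_a_over_b:.2f}")
```

As with the relative risk, reversing the groups gives the reciprocal of the odds ratio.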

Comments:
1. An odds ratio of 1.0 implies that there is no observable difference between the two odds. This is always the reference value!
2. The odds can also be visualized in the following graphic:

PRACTICE PROBLEM: The following data are from a study to investigate the relationship between condom use and the contraction of HIV. This study involved couples where it was known by both partners that one person was HIV positive. For each couple, it was noted whether or not the second person contracted HIV and whether or not condoms were always used.

                 Second Person Contracted HIV
Condom Use        Yes       No     Totals
Always             21      677        698
Not Always         20      137        157
Totals             41      814        855

1. Consider the above contingency table, and determine whether each of the following statements is valid or invalid.
   a. Note that 21 people contracted HIV when a condom was always used; 20 people contracted HIV when a condom was not always used. We can compare these raw counts when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
   b. We can compare 20/157 = 12.7% to 21/698 = 3.0% when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
   c. We can compare 21/41 = 51.2% to 20/41 = 48.8% when we are investigating whether using a condom reduces the chances of the second partner getting HIV. (Valid or Invalid)
2. Find the risk (i.e., probability) of the second person contracting HIV given that condoms are NOT ALWAYS used.
3. Find the risk (i.e., probability) of the second person contracting HIV given that condoms are ALWAYS used.

4. Find and interpret the relative risk of the second person contracting HIV (use the risk for the group that does NOT ALWAYS use a condom in the numerator).
5. Find and interpret the odds ratio for contracting HIV (use the group that does NOT ALWAYS use condoms in the numerator).
6. Consider these mosaic plots. First, shade the column for Condom Use=Always to show the case in which a person who does NOT ALWAYS wear a condom is equally likely as a person who ALWAYS wears a condom to contract HIV (i.e., relative risk = 1.0). Then, shade the column for Condom Use=Always to show the case in which a person who does NOT ALWAYS wear a condom is twice as likely as a person who ALWAYS wears a condom to contract HIV (i.e., relative risk = 2.0).

(Two mosaic plots to shade: Relative Risk = 1.0 and Relative Risk = 2.0)
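One way to check hand calculations for problems 2 through 5 is a short Python sketch like the one below. It uses the counts from the contingency table in this practice problem; the function name and the dictionary layout are illustrative choices, not part of the original handout.

```python
def two_by_two_summary(events_1, non_events_1, events_2, non_events_2):
    """Risks, relative risk, and odds ratio with group 1 in the numerator."""
    risk_1 = events_1 / (events_1 + non_events_1)
    risk_2 = events_2 / (events_2 + non_events_2)
    odds_1 = events_1 / non_events_1
    odds_2 = events_2 / non_events_2
    return {
        "risk_1": risk_1,
        "risk_2": risk_2,
        "relative_risk": risk_1 / risk_2,
        "odds_ratio": odds_1 / odds_2,
    }

# Counts from the table: group 1 = Not Always, group 2 = Always.
summary = two_by_two_summary(events_1=20, non_events_1=137,
                             events_2=21, non_events_2=677)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The two risks should agree with the percentages quoted in statement 1b (12.7% and 3.0%).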

Risk Difference, Relative Risk, and Odds Ratios in JMP

EXAMPLE: We can use JMP to calculate these quantities for the Titanic data.

Select Analyze > Fit Y by X. Move Survived to the Y, Response box and Gender to the X, Factor box.

From the red drop-down arrow next to Contingency Analysis, select Relative Risk. Select Survived = No as your response category of interest, and use Females in the numerator. JMP then displays the relative risk:

As seen earlier, if you select Risk Difference from the same red drop-down arrow, JMP displays the following:

Finally, select Odds Ratio:

EXAMPLE: Consider a study in which low birth weight was investigated. Several risk factors (Previous history of low birth weight, Race, Hypertension, Smoking, and Uterine irritation during pregnancy) were considered in hopes of better understanding some of the possible contributors to low birth weight. The data can be found in the file LowBirth.JMP.

First consider finding the risk of low birth weight for those with and without a Previous History. That is, we must estimate the probability of having a low birth weight baby for each group.

P(Low Birth Weight Given a Previous History) =

P(Low Birth Weight Given No Previous History) =

PROBLEM: This is what is known as a case-control study. People with a particular disease agree to participate (these are the cases), and people who are similar to those in the case group but who do NOT have the disease in question are also investigated (these are the controls). In this type of study, the numbers of cases and controls who were exposed to a certain risk factor are then identified.

The problem with computing the risks given above is that these quantities are affected by the number of cases recruited for the study! Because the cases were recruited deliberately, the proportion of low birth weight babies in the sample does not reflect the proportion in the population. That is, we have artificially inflated the probability of having a low birth weight baby. Therefore, RELATIVE RISK SHOULD NOT BE USED IN A CASE-CONTROL STUDY.
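The dependence of relative risk on recruitment can be illustrated with a small Python sketch. It starts from the Previous History counts used in these notes (18/12 and 41/115), read here as low birth weight versus normal birth weight counts, which is an assumption about what those fractions mean. The second scenario, in which ten times as many controls are recruited with the same exposure pattern, is entirely hypothetical.

```python
def rr_and_or(cases_exp, controls_exp, cases_unexp, controls_unexp):
    """Relative risk and odds ratio, exposed group in the numerator."""
    risk_exp = cases_exp / (cases_exp + controls_exp)
    risk_unexp = cases_unexp / (cases_unexp + controls_unexp)
    odds_exp = cases_exp / controls_exp
    odds_unexp = cases_unexp / controls_unexp
    return risk_exp / risk_unexp, odds_exp / odds_unexp

# Counts as recruited (assumed reading: low / normal birth weight).
rr_original, or_original = rr_and_or(18, 12, 41, 115)

# Hypothetical: ten times as many controls, same exposure pattern.
rr_more_controls, or_more_controls = rr_and_or(18, 120, 41, 1150)

print(f"As recruited:      RR = {rr_original:.2f}, OR = {or_original:.2f}")
print(f"10x more controls: RR = {rr_more_controls:.2f}, OR = {or_more_controls:.2f}")
```

The apparent relative risk changes with the ratio of cases to controls, which the investigator chooses, while the odds ratio stays the same. That is the reason the odds ratio is the appropriate summary for a case-control design.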

Instead, we will use odds ratios to compare the various risk factors. We will compute the odds ratios so that they are greater than one (that is, the risk factor level which is more likely to produce a low birth weight baby will be used in the numerator).

Risk Factor          Numerator Odds    Denominator Odds    Odds Ratio
Previous History          18/12             41/115            4.21
Race                      36/55             23/72             2.05
Hypertension               7/6              52/121            2.71
Smoke                     30/43             29/84             2.02
Uterine Irritation        14/14             45/113            2.51
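The odds ratios above can be reproduced from the listed odds with a short Python sketch. The data structure and names are illustrative; the counts are the ones given for each risk factor, with the higher-risk level in the numerator.

```python
# (numerator odds, denominator odds), each as (count, count).
risk_factors = {
    "Previous History":   ((18, 12), (41, 115)),
    "Race":               ((36, 55), (23, 72)),
    "Hypertension":       ((7, 6),   (52, 121)),
    "Smoke":              ((30, 43), (29, 84)),
    "Uterine Irritation": ((14, 14), (45, 113)),
}

odds_ratios = {}
for name, ((a, b), (c, d)) in risk_factors.items():
    odds_ratios[name] = (a / b) / (c / d)

# List the risk factors from largest to smallest odds ratio.
ranked = sorted(odds_ratios.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name:<20} {value:.2f}")
```

Sorting the odds ratios in this way makes it easier to compare the risk factors when answering the questions that follow.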

Questions:
1. Which risk factor appears to have the most influence on low birth weight?
2. Which risk factor appears to have the least influence on low birth weight?
3. Do all of the risk factors appear to have at least some influence on low birth weight? Explain.