Interpretation of the Fitted Logistic Regression Model

Transcription

1 Interpretation of the Fitted Logistic Regression Model April 19, 2016

2 Table of contents: 1 Introduction; 2 Odds ratio, Example; 3 Cell Coding, Deviation from means; 4; 5 Statistical Adjustment, Statistical Interaction, Examples; 6; 7

3 Assumptions Before interpreting the logistic regression model we assume that the following conditions are met: (1) the model has been fit (Chapter 5); (2) the variables in the model are significant; (3) the model fits according to a statistical measure of fit.

4 Introduction What do the estimated coefficients in the model tell us about the research questions that motivated the study? In linear regression $y = \beta_0 + \beta_1 x$ the estimated coefficient $\beta_1$ represents the slope, the rate of change of the outcome per unit of change in the independent variable. (The intercept $\beta_0$ is of little interest in most models.)

5 Introduction Thus, interpretation of the logistic regression model involves two issues: 1 Determining the functional relationship between the dependent and independent variable 2 Defining the unit of change for the independent variable

6 Link function Definition: A link function is a function of the dependent variable which yields a linear function of the independent variable. Examples of link functions: in linear regression it is the identity function; in logistic regression it is the logit transformation $g(x) = \ln\left(\frac{\pi(x)}{1-\pi(x)}\right) = \beta_0 + \beta_1 x$, with $\pi(x) = \frac{e^{\beta_0+\beta_1 x}}{1+e^{\beta_0+\beta_1 x}}$.
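The relationship between the logit and the probability can be checked numerically. The following is a minimal sketch; the coefficient values are hypothetical and chosen only for illustration, not taken from any fitted model in these slides.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(g):
    """Probability that corresponds to a log-odds value g."""
    return 1.0 / (1.0 + math.exp(-g))

# Hypothetical coefficients (assumption, for illustration only).
beta0, beta1 = -1.5, 0.8

def pi(x):
    """Model probability pi(x) = e^(b0 + b1 x) / (1 + e^(b0 + b1 x))."""
    return inv_logit(beta0 + beta1 * x)

# Applying the link to pi(x) recovers the linear predictor b0 + b1 x.
linear_predictor = beta0 + beta1 * 2.0
recovered = logit(pi(2.0))
print(linear_predictor, recovered)
```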

7 logit function While the slope $\beta_1$ in linear regression is the change in the outcome variable corresponding to a one-unit change in the independent variable, $\beta_1 = y(x+1) - y(x)$, in logistic regression $\beta_1$ is the analogous change in the logit, $\beta_1 = g(x+1) - g(x)$. Hence, interpretation requires placing meaning on the difference between two values of the logit function.

8 We assume that the independent variable x is nominal scaled and dichotomous. In most cases that means it is coded as either 0 or 1. Calculating the slope $\beta_1$ is then straightforward and takes only a few steps.

9 The steps are: (1) define two values of the covariate to be compared ($x = 1$, $x = 0$); (2) substitute those two values into the logit ($g(1)$, $g(0)$); (3) calculate the difference $g(1) - g(0) = (\beta_0 + \beta_1 \cdot 1) - (\beta_0 + \beta_1 \cdot 0) = (\beta_0 + \beta_1) - \beta_0 = \beta_1$.

10 Odds ratio The practical problem is that a change on the log-odds scale is hard to explain and may not be especially meaningful to a subject-matter audience. To provide a more meaningful interpretation we introduce the odds ratio as a measure of association. Definition: The odds ratio OR is the ratio of the odds for $x = 1$ to the odds for $x = 0$, $OR = \frac{\pi(1)/[1-\pi(1)]}{\pi(0)/[1-\pi(0)]}$.

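11 Odds ratio Substituting the logistic regression model probabilities into the OR, we obtain the relationship between the odds ratio and the regression coefficient: $OR = \frac{\left(\frac{e^{\beta_0+\beta_1}}{1+e^{\beta_0+\beta_1}}\right) \big/ \left(\frac{1}{1+e^{\beta_0+\beta_1}}\right)}{\left(\frac{e^{\beta_0}}{1+e^{\beta_0}}\right) \big/ \left(\frac{1}{1+e^{\beta_0}}\right)} = \frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = e^{\beta_1}$.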

12 Odds ratio Thus, with a fourth and final step, we have: (1) define two values of the covariate to be compared ($x = 1$, $x = 0$); (2) substitute those two values into the logit ($g(1)$, $g(0)$); (3) calculate the difference $g(1) - g(0)$; (4) exponentiate the logit difference to obtain an odds ratio.
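The four steps can be traced in a few lines of code. This is a sketch with hypothetical coefficients (assumed values, not estimates from the slides); it compares the odds ratio built from the model probabilities with the exponentiated coefficient.

```python
import math

# Hypothetical coefficients (assumption, for illustration only).
beta0, beta1 = -1.5, 0.8

def g(x):
    """Logit for covariate value x."""
    return beta0 + beta1 * x

def prob(x):
    """Model probability for covariate value x."""
    return 1.0 / (1.0 + math.exp(-g(x)))

def odds(p):
    return p / (1.0 - p)

# Steps 1-3: compare x = 1 with x = 0 on the logit scale.
logit_difference = g(1) - g(0)

# Step 4: exponentiate the logit difference.
or_from_coefficient = math.exp(logit_difference)

# The same quantity computed from the definition of the odds ratio.
or_from_probabilities = odds(prob(1)) / odds(prob(0))
print(or_from_coefficient, or_from_probabilities)
```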

13 Odds ratio The odds ratio approximates how much more likely or unlikely (in terms of odds) it is for the outcome to be present among subjects with x = 1 than among subjects with x = 0. Example: Assume that Y is the presence or absence of heart disease and X denotes whether or not the person engages in regular strenuous physical exercise. If the odds ratio is OR = 0.5 (OR = 2), then the odds of heart disease among subjects who exercise are one-half (twice) the odds of heart disease among subjects who do not exercise in the study population.

14 Odds ratio $\widehat{OR}$ tends to have a distribution that is highly skewed to the right, because its range is between 0 and $\infty$ (only for extremely large sample sizes would the distribution be approximately normal). Hence, inferences are usually based on the sampling distribution of $\ln(\widehat{OR}) = \hat\beta_1$, which tends to follow a normal distribution for much smaller sample sizes. A $100(1-\alpha)\%$ confidence interval for the odds ratio is given by $\exp[\hat\beta_1 \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_1)]$.
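A sketch of this interval in code follows. The function name `odds_ratio_ci` and the input values are assumptions made for illustration; the normal quantile comes from the Python standard library.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(beta_hat, se_hat, level=0.95):
    """Point estimate and Wald-type confidence limits for exp(beta).

    The interval is built on the log-odds scale and then exponentiated,
    so it is not symmetric around the point estimate.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    lower = math.exp(beta_hat - z * se_hat)
    upper = math.exp(beta_hat + z * se_hat)
    return math.exp(beta_hat), lower, upper

# Hypothetical coefficient and standard error (assumption).
est, lo, hi = odds_ratio_ci(0.7, 0.25)
print(est, lo, hi)
```

Because the limits are exponentiated, the point estimate is the geometric mean of the two limits rather than their midpoint.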

15 Odds ratio What if x is coded using values a and b? Then the four steps are: (1) choose a and b to be compared; (2) substitute them into the logit to obtain $\hat g(a) = \hat\beta_0 + \hat\beta_1 a$ and $\hat g(b) = \hat\beta_0 + \hat\beta_1 b$; (3) calculate the difference $\hat g(a) - \hat g(b) = \hat\beta_1 (a - b)$; (4) exponentiate the logit difference to obtain the odds ratio $\widehat{OR}(a, b) = e^{\hat\beta_1 (a - b)}$.
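As a quick worked instance of these four steps, suppose (purely as an assumed illustration, not a coding taken from the slides) that the dichotomous variable is coded $a = 1$ and $b = -1$ instead of 1 and 0:

```latex
\hat g(1) - \hat g(-1) = \hat\beta_1 \,\bigl(1 - (-1)\bigr) = 2\hat\beta_1,
\qquad
\widehat{OR}(1, -1) = e^{2\hat\beta_1}.
```

Under this coding, exponentiating $\hat\beta_1$ alone would give the square root of the odds ratio, not the odds ratio itself.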

16 Example

17 Example Since $\hat\beta_1 = 1.064$, the estimated odds ratio is (step 4) $\widehat{OR} = e^{\hat\beta_1} = e^{1.064} = 2.9$. Another way of obtaining the odds ratio is directly from the cell frequencies of the $2 \times 2$ table, which gives $\widehat{OR} = 2.897$; to go full circle we verify that $\hat\beta_1 = \ln(2.897) = 1.064$. The 95% confidence interval for the odds ratio is $\exp[1.064 \pm 1.96\,\widehat{SE}(\hat\beta_1)] = (1.87, 4.49)$. This suggests that the odds of a fracture during follow-up among women with a prior fracture could be as little as 1.9 times or as much as 4.5 times the odds for women without a prior fracture, at the 95% level of confidence.

18 Now we'll assume that the independent variable has k > 2 distinct values. For example, x could denote the county of residence, the clinic used for health care, etc.

19 Cell Coding Since we cannot model a nominal scale variable as if it were an interval scale variable, we have to use design variables (see Chapter 2). The method for specifying the design variables involves setting all of them equal to 0 for the reference group, and then setting a single design variable equal to 1 for each of the other groups (cell coding). This is illustrated in Table 3.6.
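A minimal sketch of reference cell coding for a three-level variable follows. The helper name `reference_cell_coding` is hypothetical; the level names less, same and more are the RATERISK categories used in the example slides.

```python
def reference_cell_coding(levels, reference):
    """Map each level to k-1 design variables using reference cell coding.

    The reference level gets all zeros; every other level gets a single 1
    in the position of its own design variable.
    """
    others = [level for level in levels if level != reference]
    return {
        level: tuple(1 if level == other else 0 for other in others)
        for level in levels
    }

coding = reference_cell_coding(["less", "same", "more"], reference="less")
for level, design in coding.items():
    print(level, design)
```

With k levels this produces k - 1 design variables, so the coefficient of each design variable is the logit difference between its level and the reference level.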

20 Cell Coding Example

21 Cell Coding Example Comparing the estimated coefficients in Table 3.7 to the log-odds ratios in Table 3.5, we find that $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat\beta_1 = 0.546$ and $\ln[\widehat{OR}(\text{more}, \text{less})] = \hat\beta_2$. To check: (1) compare level same to level less; (2) the logits for same and less are, respectively, $\hat g(\text{same}) = \hat\beta_0 + \hat\beta_1 (1) + \hat\beta_2 (0)$ and $\hat g(\text{less}) = \hat\beta_0 + \hat\beta_1 (0) + \hat\beta_2 (0)$; (3) the logit difference is $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat g(\text{same}) - \hat g(\text{less}) = \hat\beta_1$.

22 Cell Coding Example Generally, the estimates of the standard errors found in the logistic regression output are identical to the estimates obtained using the cell frequencies from the contingency table: $\widehat{SE}(\hat\beta_1)$, for instance, is the square root of the sum of the reciprocals of the four cell frequencies being compared. $100(1-\alpha)\%$ confidence intervals for the coefficients are calculated as $\hat\beta_j \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_j)$, and analogously for the odds ratios as $\exp[\hat\beta_j \pm z_{1-\alpha/2}\,\widehat{SE}(\hat\beta_j)]$.

23 Deviation from means This coding expresses an effect as the deviation of the group mean (the logit for the group) from the overall mean (the average logit over all groups). It is obtained by setting the value of all the design variables equal to $-1$ for one of the categories, and then using the 0, 1 coding for the remaining categories.
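The coding can be sketched as follows, assuming the usual convention in which the one designated category receives -1 on every design variable. The helper name `deviation_coding` is hypothetical.

```python
def deviation_coding(levels, minus_one_level):
    """Map each level to k-1 design variables using deviation from means coding.

    The designated level gets -1 on every design variable; every other
    level gets a single 1 in the position of its own design variable.
    """
    others = [level for level in levels if level != minus_one_level]
    coding = {}
    for level in levels:
        if level == minus_one_level:
            coding[level] = tuple(-1 for _ in others)
        else:
            coding[level] = tuple(1 if level == other else 0 for other in others)
    return coding

coding = deviation_coding(["less", "same", "more"], minus_one_level="less")
for level, design in coding.items():
    print(level, design)
```

Each column of design variables sums to zero over the levels, which is what makes the intercept the average logit over all groups.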

24 Deviation from means Example

25 Deviation from means Example The logits for the three categories of RATERISK are $\hat g_1 = -1.602$, $\hat g_2 = -1.056$ and $\hat g_3$, with mean $\bar g = (\hat g_1 + \hat g_2 + \hat g_3)/3$. The estimated coefficients in Table 3.9 are then $\hat\beta_1 = \hat g_2 - \bar g$ and $\hat\beta_2 = \hat g_3 - \bar g$; that is, each coefficient is the deviation of a group logit from the mean logit.

26 Deviation from means Example - Interpretation Exponentiation of the estimated coefficients yields the ratio of the odds for the particular group to the geometric mean of the odds: $\exp(\hat\beta_1) = \exp(\hat g_2 - \bar g) = 1.06$. This is not a true odds ratio, because the quantities in the numerator and denominator do not represent the odds for two distinct categories. The interpretation of this value depends on whether the geometric mean odds is at all meaningful in the context of the study.

27 Deviation from means Example - Interpretation To estimate the odds ratio for one category relative to a reference category from the estimated coefficients: (1) compare level same (RATERISK = 2) to level less (RATERISK = 1); (2) the logits for same and less under deviation from means coding are $\hat g(\text{same}) = \hat\beta_0 + \hat\beta_1 (\text{RATERISK2D} = 1) + \hat\beta_2 (\text{RATERISK3D} = 0)$ and $\hat g(\text{less}) = \hat\beta_0 + \hat\beta_1 (\text{RATERISK2D} = -1) + \hat\beta_2 (\text{RATERISK3D} = -1)$; (3) the logit difference is $\ln[\widehat{OR}(\text{same}, \text{less})] = \hat g(\text{same}) - \hat g(\text{less}) = 2\hat\beta_1 + \hat\beta_2$; (4) $\widehat{OR}(\text{same}, \text{less}) = e^{2\hat\beta_1 + \hat\beta_2}$.
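The result $2\hat\beta_1 + \hat\beta_2$ can be verified numerically. The sketch below uses hypothetical group logits (assumed values, not those of Table 3.9) and assumes that the category less is the one coded $-1$.

```python
import math

# Hypothetical group logits (assumption, not the values from the slides).
g = {"less": -1.6, "same": -1.1, "more": -0.7}
g_bar = sum(g.values()) / len(g)

# Deviation from means coefficients: intercept is the mean logit,
# each slope is a group logit minus the mean logit.
beta0 = g_bar
beta1 = g["same"] - g_bar
beta2 = g["more"] - g_bar

# Step 3: logit difference for same versus less.
log_or = 2 * beta1 + beta2

# Step 4: exponentiate.
odds_ratio = math.exp(log_or)

# Direct calculation from the two group logits, for comparison.
direct = math.exp(g["same"] - g["less"])
print(odds_ratio, direct)
```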

28 With a continuous independent variable, the interpretation of the estimated coefficient depends on how the variable is entered into the model and on its particular units. We assume that the logit is linear in the variable (see Chapter 4).

29 Again the four-step method is used to calculate the estimator of the odds ratio, with the logit $\hat g(x) = \hat\beta_0 + \hat\beta_1 x$: (1) a one-unit increase: $x + 1$ compared to $x$; (2) obtain the corresponding logits $\hat g(x)$ and $\hat g(x+1)$; (3) the logit difference is $\ln[\widehat{OR}] = \hat g(x+1) - \hat g(x) = \hat\beta_1 (x + 1 - x) = \hat\beta_1$; (4) $\widehat{OR} = e^{\hat\beta_1}$. Thus the estimator has exactly the same form as the one for a dichotomous variable.

30 Substituting an arbitrary increment c we obtain analogously: (1) a c-unit increase: $x + c$ compared to $x$; (2) obtain the corresponding logits $\hat g(x)$ and $\hat g(x+c)$; (3) the logit difference is $\ln[\widehat{OR}] = \hat g(x+c) - \hat g(x) = \hat\beta_1 (x + c - x) = c\,\hat\beta_1$; (4) $\widehat{OR}(c) = e^{c\,\hat\beta_1}$.

31 Again we can calculate the standard error $\widehat{SE}(c\,\hat\beta_1) = c\,\widehat{SE}(\hat\beta_1)$ and a $100(1-\alpha)\%$ confidence interval as $\exp[c\,\hat\beta_1 \pm z_{1-\alpha/2}\, c\,\widehat{SE}(\hat\beta_1)]$. Thus, the choice of c is crucial. As a rule of thumb, multiples of 2, 5 or 10 are most meaningful and easily understood.
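The effect of the choice of c can be sketched as follows. The function name and the coefficient values are assumptions for illustration; the absolute value of c is used for the standard error so that the sketch also covers a decrease (negative c).

```python
import math

def odds_ratio_for_change(beta_hat, se_hat, c, z=1.96):
    """Odds ratio and approximate 95% limits for a c-unit change in x."""
    log_or = c * beta_hat
    half_width = z * abs(c) * se_hat
    return (
        math.exp(log_or),
        math.exp(log_or - half_width),
        math.exp(log_or + half_width),
    )

# Hypothetical per-unit coefficient and standard error (assumption).
beta_hat, se_hat = 0.05, 0.01

for c in (1, 5, 10):
    est, lo, hi = odds_ratio_for_change(beta_hat, se_hat, c)
    print(c, round(est, 3), round(lo, 3), round(hi, 3))
```

A coefficient that looks negligible per unit can correspond to a substantial odds ratio over a clinically meaningful increment.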

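32 Example The estimated logit is a linear function of AGE, $\hat g(\text{AGE}) = \hat\beta_0 + \hat\beta_1 \cdot \text{AGE}$, and the estimated odds ratio for an increase of 10 years is $\widehat{OR}(10) = \exp(10\,\hat\beta_1) = 3.03$.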

33 Example Thus, for every increase of 10 years in age, the odds of CHD being present are estimated to increase by a factor of 3.03. The unavoidable consequence of a logit that is linear in age is that the estimated increase in the odds of CHD for a 30-year-old compared to a 20-year-old is the same as for a 60-year-old compared to a 50-year-old.

34 Univariable models rarely provide a complete analysis of the data, since the independent variables are usually associated with one another. Thus multivariable analysis is used. Two questions are of importance: (1) What effect does each variable have on the estimated effects of the other independent variables? (2) To what extent does the estimated log-odds for one independent variable change depending on the value of another independent variable?

35 Statistical Adjustment

36 Statistical Adjustment The difference in mean weight between the two groups, evaluated at the group means $\bar a_1$ and $\bar a_2$ of the covariate a, is $(w_2 - w_1) = (\beta_0 + \beta_1 + \beta_2 \bar a_2) - (\beta_0 + \beta_2 \bar a_1) = \beta_1 + \beta_2 (\bar a_2 - \bar a_1)$. If the comparison is made at a common mean $\bar a$ we obtain $(w_4 - w_3) = \beta_1$.

37 Statistical Adjustment Now, instead of weight being the outcome variable, assume that the outcome is a dichotomous variable and that the vertical axis denotes the logit or log-odds of the outcome. That is, the logit of the outcome is given by the equation $g(x, a) = \beta_0 + \beta_1 x + \beta_2 a$.

38 Statistical Adjustment The unadjusted difference $(w_2 - w_1)$ and the adjusted difference $(w_4 - w_3)$ of the log-odds are $(w_2 - w_1) = \beta_1 + \beta_2 (\bar a_2 - \bar a_1)$ and $(w_4 - w_3) = \beta_1$. What conditions are required for them to be the same? $\beta_1 + \beta_2 (\bar a_2 - \bar a_1) = \beta_1$ holds if and only if $(\bar a_2 - \bar a_1) = 0$ or $\beta_2 = 0$. How can we test for those two conditions?

39 Statistical Adjustment (1) Fit a model containing only d (excluding the adjustment covariate a), with $\hat\theta_1$ being the coefficient of d; (2) fit the full model, with $\hat\beta_1, \hat\beta_2$ as the estimated coefficients; (3) now $\hat\theta_1 \approx \hat\beta_1 + \hat\beta_2 (\bar a_2 - \bar a_1)$, thus $(\hat\theta_1 - \hat\beta_1) \approx \hat\beta_2 (\bar a_2 - \bar a_1)$; (4) as the amount of adjustment is relative, we scale it by dividing by $\hat\beta_1$ and obtain the delta-beta-hat-percent $\Delta\hat\beta\% = 100\,\frac{\hat\theta_1 - \hat\beta_1}{\hat\beta_1}$.
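The delta-beta-hat-percent is a one-line calculation once both models have been fit. In the sketch below the function name and the two coefficient values are hypothetical; in practice they would come from the fitted unadjusted and adjusted models.

```python
def delta_beta_hat_percent(theta_hat, beta_hat):
    """Percent change in the coefficient of d due to adjustment.

    theta_hat: coefficient of d in the model without the covariate.
    beta_hat:  coefficient of d in the model with the covariate.
    """
    return 100.0 * (theta_hat - beta_hat) / beta_hat

# Hypothetical coefficients from the two fits (assumption).
theta_hat = 1.10   # unadjusted
beta_hat = 1.00    # adjusted

change = delta_beta_hat_percent(theta_hat, beta_hat)
print(round(change, 1))
```

The sign indicates the direction of the adjustment, so it is usually the absolute value that is compared with a threshold.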

40 Statistical Adjustment The amount of adjustment is expressed as a percentage of the adjusted log-odds ratio. If $\Delta\hat\beta\%$ exceeds a chosen threshold (for example 10, 20 or 25), then the covariate is needed to adjust the effect of the other covariate (Chapter 4). In the case of statistical adjustment we speak of x as a confounder (for example AGE).

41 Statistical Interaction

42 Statistical Interaction The outcome variable is the presence or absence of CHD, the risk factor is GENDER, and the covariate is AGE. Let $l_1$ and $l_2$ be the logits for females and males, respectively. Since the difference $l_2 - l_1$ is the same at every age, there is no interaction. If instead $l_3$ is the logit for males, then the difference $l_3 - l_1$ varies with age, indicating that the relationship between AGE and CHD for males is different from that for females. When this occurs we say that there is an interaction between AGE and GENDER.

43 Effect Modifier We cannot estimate the odds ratio for GENDER without first specifying the AGE at which the comparison is being made. In other words, age modifies the effect of gender, so in this terminology age is called an effect modifier.

44 Examples Consider models with a pair of independent variables, d (dichotomous) and x (continuous). The role of x with respect to the effect of d in the model can be one of three possibilities: (1) there is no statistical adjustment or interaction; (2) there is statistical adjustment but no interaction, and covariate x is a confounder; (3) there is statistical interaction, and covariate x is an effect modifier.

45 1. No statistical adjustment or interaction

46 1. No statistical adjustment or interaction There is little change in the estimate of the coefficient for PRIORFRAC in Model 2, as $\Delta\hat\beta\% = 100\,\frac{\hat\theta_1 - \hat\beta_1}{\hat\beta_1} = 5.1$, which indicates that inclusion of HEIGHT does not statistically adjust the coefficient of PRIORFRAC: HEIGHT is not a confounder of PRIORFRAC.

47 1. No statistical adjustment or interaction The statistical interaction of prior fracture (PRIORFRAC) and height (HEIGHT) is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has p = 0.492, and thus is not significant. In these data HEIGHT is not an effect modifier of prior fracture.

48 1. No statistical adjustment or interaction Hence, the choice is between Model 1 and Model 2. Even though the estimate of the effect of prior fracture is basically the same for the two models, we would choose Model 2 as height (HEIGHT) is not only statistically significant in Model 2, but is an important clinical covariate as well.

49 2. Statistical adjustment, no interaction

50 2. Statistical adjustment, no interaction In Model 1 the Wald test for the coefficient of GENDER is not significant. We know that gender can be an important covariate, but it is not significant in the univariable model. Thus, under some model building methods it might not be considered for a multivariable model. This is a common dilemma which is addressed in Chapter 4.

51 2. Statistical adjustment, no interaction There is a considerable change in the estimate of the coefficient for GENDER in Model 2, which is now significant: $\Delta\hat\beta\% = 37.6$, which indicates that inclusion of SPHEQ does statistically adjust the coefficient of GENDER. SPHEQ is a confounder of GENDER.

52 2. Statistical adjustment, no interaction The statistical interaction of SPHEQ and GENDER is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has p = 0.185, and thus is not significant. In these data SPHEQ is not an effect modifier of GENDER.

53 2. Statistical adjustment, no interaction In this case we use Model 2 as it adjusts for SPHEQ.

54 3. Statistical interaction

55 3. Statistical interaction In Model 1 the Wald test for the coefficient of PRIORFRAC is significant.

56 3. Statistical interaction There is a considerable change in the estimate of the coefficient for PRIORFRAC in Model 2: $\Delta\hat\beta\% = 26.8$, which indicates that inclusion of AGE does statistically adjust the coefficient of PRIORFRAC. AGE is a confounder of PRIORFRAC.

57 3. Statistical interaction The statistical interaction of prior fracture (PRIORFRAC) and AGE is added to Model 2 to obtain Model 3. The Wald statistic for the added product term has p = 0.022, and thus is significant. In these data AGE is an effect modifier of PRIORFRAC.

58 3. Statistical interaction Hence, the choice is Model 3.

59 3. Statistical interaction If a model contains an interaction term, as Model 3 does, one must follow the four-step method to calculate the odds ratio: (1) compare (PRIORFRAC = 1, AGE = a) with (PRIORFRAC = 0, AGE = a); (2) substitute them into the estimated logit: $\hat g(1, a) = \hat\beta_0 + \hat\beta_1 \cdot 1 + \hat\beta_2 a + \hat\beta_3 \cdot 1 \cdot a$ and $\hat g(0, a) = \hat\beta_0 + \hat\beta_1 \cdot 0 + \hat\beta_2 a + \hat\beta_3 \cdot 0 \cdot a$; (3) the logit difference is $\hat g(1, a) - \hat g(0, a) = \hat\beta_1 + \hat\beta_3 a$; (4) $\widehat{OR}((1, a), (0, a)) = e^{\hat\beta_1 + \hat\beta_3 a}$. The results can then be either graphed or tabulated.
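Tabulating the age-specific odds ratio might look like the sketch below. All numeric inputs are hypothetical (assumed values, not the Model 3 estimates). The standard error of the logit difference uses the variance of a linear combination, $Var(\hat\beta_1) + a^2\,Var(\hat\beta_3) + 2a\,Cov(\hat\beta_1, \hat\beta_3)$.

```python
import math

# Hypothetical estimates for the main effect and the product term (assumption).
b1, b3 = 4.0, -0.05
# Hypothetical variances and covariance of those two estimates (assumption).
var_b1, var_b3, cov_b1_b3 = 2.5, 0.0005, -0.035

def or_at_age(age):
    """Odds ratio for PRIORFRAC = 1 versus 0 at a given age."""
    return math.exp(b1 + b3 * age)

def se_log_or_at_age(age):
    """Standard error of the logit difference b1 + b3 * age."""
    variance = var_b1 + age ** 2 * var_b3 + 2 * age * cov_b1_b3
    return math.sqrt(variance)

ors = {}
for age in (55, 60, 65, 70, 75, 80, 85):
    log_or = b1 + b3 * age
    se = se_log_or_at_age(age)
    ors[age] = or_at_age(age)
    lower = math.exp(log_or - 1.96 * se)
    upper = math.exp(log_or + 1.96 * se)
    print(age, round(ors[age], 2), round(lower, 2), round(upper, 2))
```

With a negative product-term coefficient, as assumed here, the estimated odds ratio shrinks as age increases, which is why a single odds ratio for the risk factor cannot be reported.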

64 So far we have discussed using the logistic regression model coefficients to estimate odds ratios and construct confidence intervals. However, there are situations where the fitted values (i.e., the estimated probabilities) from the model are equally, if not more, important.

65

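66 The values plotted for INH_INJ = 1 (analogously for INH_INJ = 0), with reference to Section 2.5, are obtained as follows: (1) calculate the fitted logit $\hat g_1(a) = \hat g(\text{TBSA} = a, \text{INH\_INJ} = 1)$, which is a linear function of a; (2) use the values from the estimated covariance matrix to calculate $\widehat{Var}[\hat g_1(a)]$, which is a quadratic function of a; (3) calculate the fitted probability $\hat\pi_1(a) = \frac{e^{\hat g_1(a)}}{1 + e^{\hat g_1(a)}}$; (4) the confidence limits are then given by $\frac{e^{\hat g_1(a) \pm 1.96\,\widehat{SE}(\hat g_1(a))}}{1 + e^{\hat g_1(a) \pm 1.96\,\widehat{SE}(\hat g_1(a))}}$, with $\widehat{SE}(\hat g) = \sqrt{\widehat{Var}(\hat g)}$.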
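These four steps can be sketched in code. The coefficient vector and covariance matrix below are hypothetical (assumed values, not the fitted burn model); the variance of the logit is the quadratic form $x^\top \hat\Sigma x$ for the covariate vector $x = (1, a, 1)$.

```python
import math

def inv_logit(g):
    return 1.0 / (1.0 + math.exp(-g))

def fitted_probability_ci(x, beta, cov, z=1.96):
    """Fitted probability and confidence limits for covariate vector x.

    x includes a leading 1 for the intercept. The limits are computed on
    the logit scale and then transformed, so they stay inside (0, 1).
    """
    n = len(x)
    g = sum(beta[i] * x[i] for i in range(n))
    variance = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    se = math.sqrt(variance)
    return inv_logit(g), inv_logit(g - z * se), inv_logit(g + z * se)

# Hypothetical estimates: intercept, TBSA, INH_INJ (assumption).
beta = [-4.0, 0.08, 1.5]
# Hypothetical covariance matrix of those estimates (assumption).
cov = [
    [0.09, -0.001, -0.01],
    [-0.001, 0.00004, 0.0],
    [-0.01, 0.0, 0.12],
]

# Burn area a = 30 with inhalation injury present.
p, lower, upper = fitted_probability_ci([1.0, 30.0, 1.0], beta, cov)
print(round(p, 3), round(lower, 3), round(upper, 3))
```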

67

68 We would like to plot the estimated probability of death as a function of burn area (as before) and inhalation injury, but now controlling for the other four covariates in the model. This is done either by choosing typical values (median age and 0 for the dichotomous variables) or by calculating a modified logit, which subtracts the contribution of burn area and inhalation injury from the logit and uses the median of the result as a way to control for the additional model covariates.

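69 Calculating the modified logit: (1) the logit for the fitted model, $\hat g(x)$, is a linear function of TBSA, INH_INJ and the other four covariates (including RACE); (2) the proposed modified logit removes the burn area and inhalation injury terms, $\hat g_m(x) = \hat g(x) - (\hat\beta_{TBSA}\,\text{TBSA} + \hat\beta_{INH\_INJ}\,\text{INH\_INJ})$, where $\hat\beta_{TBSA}$ and $\hat\beta_{INH\_INJ}$ are the fitted coefficients of those two variables; (3) its median value over the 1000 subjects is $\hat g_{m50} = 5.349$.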

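70 This gives the two adjusted logits $\hat g_j(a) = \hat g_{m50} + \hat\beta_{TBSA}\,a + \hat\beta_{INH\_INJ}\,j$, where $\hat\beta_{TBSA}$ and $\hat\beta_{INH\_INJ}$ are the fitted coefficients of burn area and inhalation injury, and the adjusted probabilities $\hat\pi_j(a) = \frac{e^{\hat g_j(a)}}{1 + e^{\hat g_j(a)}}$, $j = 0, 1$.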

71 In a stratified analysis of $2 \times 2$ tables to assess interaction and to control confounding, the essential objective is to produce an adjusted odds ratio (Mantel-Haenszel estimator or weighted logit-based estimator), after first determining whether the odds ratios are constant, or homogeneous, across the strata.

72

73 Mantel-Haenszel Estimator The Mantel-Haenszel estimator is a weighted average of the stratum-specific odds ratios $\widehat{OR}_i = \frac{a_i d_i}{b_i c_i}$, where $a_i, b_i, c_i, d_i$ are the observed cell frequencies: $\widehat{OR}_{MH} = \frac{\sum_i a_i d_i / N_i}{\sum_i b_i c_i / N_i} = 3.65$, with $N_i = a_i + b_i + c_i + d_i$. The logit-based summary estimator of the odds ratio is a weighted average in which each weight $w_i$ is the inverse of the variance of the stratum-specific log-odds ratio: $\widehat{OR}_L = \exp\left[\frac{\sum_i w_i \ln(\widehat{OR}_i)}{\sum_i w_i}\right]$.
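Both summary estimators can be sketched in a few lines. The strata below are hypothetical $2 \times 2$ tables given as tuples (a, b, c, d), and the variance of each stratum-specific log-odds ratio is estimated by the sum of the reciprocals of the four cell frequencies.

```python
import math

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def logit_based_or(strata):
    """Inverse-variance weighted average of the stratum log-odds ratios.

    Requires all cell frequencies to be positive.
    """
    weights = [1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d) for a, b, c, d in strata]
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in strata]
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    return math.exp(pooled)

# Hypothetical strata (assumption): stratum odds ratios are 4 and 3.
strata = [(10, 20, 5, 40), (15, 10, 10, 20)]
print(mantel_haenszel_or(strata), logit_based_or(strata))
```

Note that `logit_based_or` fails on a table with a zero cell, whereas `mantel_haenszel_or` only needs the denominator sum to be positive.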

74 In general, the estimators are similar when the data are not too sparse within the strata. One considerable advantage of the Mantel-Haenszel estimator is that it may be computed when some of the cell entries are 0.

75 These estimators provide a correct estimate of the effect of the risk factor only when the odds ratio is constant across the strata. Some test statistics: (1) a weighted sum of the squared deviations of the stratum-specific log-odds ratios from their weighted mean, $X_H^2 = \sum_i \{ w_i [\ln(\widehat{OR}_i) - \ln(\widehat{OR}_L)]^2 \}$; (2) a comparison of the value of $a_i$ to an estimated expected value $\hat e_i$, calculated under the assumption that the odds ratio is constant in all strata (Breslow 1996), $X_{BD}^2 = \sum_i \frac{(a_i - \hat e_i)^2}{\hat v_i} - \frac{[\sum_i (a_i - \hat e_i)]^2}{\sum_i \hat v_i}$, where $\hat v_i$ is an estimate of the variance of $a_i$.
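The first statistic, $X_H^2$, is simple enough to sketch directly. The strata are hypothetical tables given as tuples (a, b, c, d), and the weights are the same inverse-variance weights used by the logit-based estimator. Under homogeneity the statistic is referred to a chi-square distribution with degrees of freedom equal to the number of strata minus 1.

```python
import math

def homogeneity_statistic(strata):
    """Weighted sum of squared deviations of the stratum log-odds ratios
    from their inverse-variance weighted mean (all cells must be positive)."""
    weights = [1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d) for a, b, c, d in strata]
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in strata]
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    return sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))

# Hypothetical strata (assumption).
equal_ors = [(10, 20, 5, 40), (20, 40, 10, 80)]     # both odds ratios equal 4
unequal_ors = [(10, 20, 5, 40), (15, 10, 10, 20)]   # odds ratios 4 and 3

print(homogeneity_statistic(equal_ors), homogeneity_statistic(unequal_ors))
```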

76 The ( odd ) end.