Lecture 19: Conditional Logistic Regression

Similar documents

Lecture 14: GLM Estimation and Logistic Regression

VI. Introduction to Logistic Regression

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Lecture 18: Logistic Regression Continued

Basic Statistical and Modeling Procedures Using SAS

STATISTICA Formula Guide: Logistic Regression. Table of Contents

SAS Software to Fit the Generalized Linear Model

Beginning Tutorials. PROC FREQ: It s More Than Counts Richard Severino, The Queen s Medical Center, Honolulu, HI OVERVIEW.

Multinomial and Ordinal Logistic Regression

Logit Models for Binary Data

Wes, Delaram, and Emily MA751. Exercise p(x; β) = [1 p(xi ; β)] = 1 p(x. y i [βx i ] log [1 + exp {βx i }].

11. Analysis of Case-control Studies Logistic Regression

Lecture 3: Linear methods for classification

Poisson Models for Count Data

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes

Two Correlated Proportions (McNemar Test)

Cool Tools for PROC LOGISTIC

PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY

Multivariate Logistic Regression

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

The Probit Link Function in Generalized Linear Models for Data Mining Applications

Logistic Regression (1/24/13)

Statistics in Retail Finance. Chapter 2: Statistical models of default

LOGISTIC REGRESSION ANALYSIS

Tests for Two Proportions

Basics of Statistical Machine Learning

Generalized Linear Models

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Statistical Machine Learning

Statistics 305: Introduction to Biostatistical Methods for Health Sciences

Ordinal Regression. Chapter

CREDIT SCORING MODEL APPLICATIONS:

Logit and Probit. Brad Jones 1. April 21, University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science

SUGI 29 Statistics and Data Analysis

CHAPTER 2 Estimating Probabilities

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Poisson Regression or Regression of Counts (& Rates)

Reject Inference in Credit Scoring. Jie-Men Mok

LOGIT AND PROBIT ANALYSIS

Chapter 12 Nonparametric Tests. Chapter Table of Contents

Developing Risk Adjustment Techniques Using the System for Assessing Health Care Quality in the

Chapter 3 RANDOM VARIATE GENERATION

Categorical Data Analysis

Common Univariate and Bivariate Applications of the Chi-square Distribution

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Linear Models for Continuous Data

SUMAN DUVVURU STAT 567 PROJECT REPORT

SAS/STAT. 9.2 User s Guide. Introduction to. Nonparametric Analysis. (Book Excerpt) SAS Documentation

Parametric and non-parametric statistical methods for the life sciences - Session I

Some Essential Statistics The Lure of Statistics

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Statistics and Data Analysis

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

ABSTRACT INTRODUCTION

Estimation of σ 2, the variance of ɛ

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)

Section 13, Part 1 ANOVA. Analysis Of Variance

Likelihood: Frequentist vs Bayesian Reasoning

Dongfeng Li. Autumn 2010

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

Chapter 29 The GENMOD Procedure. Chapter Table of Contents

Maximum Likelihood Estimation

Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Introduction to Quantitative Methods

Non-Inferiority Tests for Two Proportions

ln(p/(1-p)) = α +β*age35plus, where p is the probability or odds of drinking

Final Exam Practice Problem Answers

Numerical Issues in Statistical Computing for the Social Scientist

2WB05 Simulation Lecture 8: Generating random variables

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Chi-square test Fisher s Exact test

6.2 Permutations continued

Confidence Intervals for the Difference Between Two Means

Multinomial and ordinal logistic regression using PROC LOGISTIC Peter L. Flom National Development and Research Institutes, Inc

Notes on the Negative Binomial Distribution

Introduction to Fixed Effects Methods

Additional sources Compilation of sources:

Lin s Concordance Correlation Coefficient

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Regression 3: Logistic Regression

Chapter 13 Introduction to Nonlinear Regression( 非線性迴歸 )

Linear Classification. Volker Tresp Summer 2015

Variables Control Charts

Nonparametric Statistics

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.

Statistics, Data Analysis & Econometrics

Logistic Regression.

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Maximum Likelihood Estimation of Logistic Regression Models: Theory and Implementation

Transcription:

Lecture 19: Conditional Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 19: Conditional Logistic Regression p. 1/40

Conditional Logistic Regression Purpose 1. Eliminate unwanted nuisance parameters 2. Use with sparse data Prior to the development of the conditional likelihood, lets review the unconditional (regular) likelihood associated with the logistic regression model. Suppose, we can group our covariates into J unique combinations and as such, we can form J (2 2) tables Lecture 19: Conditional Logistic Regression p. 2/40

First, consider the j th (2 2) table: TABLE j (or stratum W = j ) Variable (Y ) 1 2 Variable (X) 1 2 Y j11 Y j12 Y j1+ Y j21 Y j22 Y j2+ Y j+1 Y j+2 Y j++ Lecture 19: Conditional Logistic Regression p. 3/40

We can write the (2 2) data table of probabilities for stratum W = j as X Y 1 2 1 p j1 1 p j1 1 2 p j2 1 p j2 1 Lecture 19: Conditional Logistic Regression p. 4/40

Unconditional Likelihood Let p j be the probability that a subject in stratum j has a success. Then the probability of observing y j events in stratum j is P(Y j = y j ) = n j y j! p y j j (1 p j) n j y j However, we know p j is a function of covariates Without loss of generality, assume we are interested in two covariates, x j1 and x j2, such that p j = eβ 0+β 1 x j1 +β 2 x j2 1 + e β 0+β 1 x j1 +β 2 x j2 Lecture 19: Conditional Logistic Regression p. 5/40

Then, the likelihood function can be written as L( y) = = = = JQ JQ JQ JQ n j y j! p y j j (1 p j) n j y j e β 0 +β 1 x j1 +β 2 x j2 1+e β 0 +β 1 x j1 +β 2 x j2 «yj e β 0 +β 1 x j1 +β 2 x j2 y j 1+e β 0 +β 1 x j1 +β 2 x j2 n j e β 0 t 0 +β 1 t 1 +β 2 t 2 1 1+e β 0 +β 1 x j1 +β 2 x j2 JQ 1+e β 0 +β 1 x j1 +β 2 x n j2 j n j y j! n j y j! «(nj y j ) n j y j! where t k = JX y j x jk and t 0 = JX y j Lecture 19: Conditional Logistic Regression p. 6/40

The functions t 0, t 1, and t 2 are sufficient statistics for the data. Suppose we want to test β 2 = 0 using a likelihood ratio test. Then L(β 0, β 1 ) = and the LRT would equal JQ e β 0t 0 +β 1 t 1 `1 + e β 0 +β 1 x j1 nj JY n j y j! Λ( y) = 2(lnL(β 0, β 1 ) ln(l(β 0, β 1, β 2 ))) = 2ln L(β 0,β 1 ) = 2ln L(β 0,β 1,β 2 ) e β 0 t 0 +β 1 t 1 JQ 1+e β 0 +β 1 x j1 e β 0 t 0 +β 1 t 1 +β 2 t 2 JQ 1+e β 0 +β 1 x j1 +β 2 x j2 «nj J Q 0 @ «nj J Q n j y j 0 @ 1 A n j y j 1 A Which is distributed approximately chi-square with 1 df. Lecture 19: Conditional Logistic Regression p. 7/40

When you have small sample sizes, the chi-square approximation is not valid. We need a method to calculate the exact p value of H 0 : β 2 = 0 from the exact null distribution of Λ( y). Note that the distribution of Λ( y) depends on the exact distribution of y. Thus, we need to derive the exact distribution of y. Lecture 19: Conditional Logistic Regression p. 8/40

Under, H 0, β 2 = 0, so we will begin with P( y) = L(β 0, β 1 ) = JQ e β 0t 0 +β 1 t 1 `1 + e β 0 +β 1 x j1 nj JY n j y j! We cannot use the above probability statement because it relies on the population parameters (i.e., unknown constants) β 0 and β 1. That is, β 0 and β 1 are a nuisance to our calculations and as such are called nuisance parameters Similar to what we did in the Fisher s Exact Test, we are going to condition out β 0 and β 1 Lecture 19: Conditional Logistic Regression p. 9/40

Using Bayes Law, P( y t 0, t 1 ) = P JQ e β 0 t 0 +β 1 t 1 JQ 1+e β 0 +β 1 x n j1 j e β 0 t 0 +β 1 t 1 JQ JQ y Γ 1+e β 0 +β 1 x n j1 j n j y j n j y j!! where Γ is the set of all vectors of y such that JX y j = t 0 and 0 y j n j and y j is an integer. JX y j x 1j = t 1 Lecture 19: Conditional Logistic Regression p. 10/40

Then, P( y t 0, t 1 ) can be simplified to P( y t 0, t 1 ) = = = P e β 0 t 0 +β 1 t 1 JQ 1+e β 0 +β 1 x j1 e β 0 t 0 +β 1 t 1 y Γ JQ 1+e β 0 +β 1 x j1 e β 0 t 0 +β 1 t 1 JQ 1+e β 0 +β 1 x j1 e β 0 t 0 +β 1 t 1 JQ 1+e β 0 +β 1 x j1 0 JQ @ n 1 j A y j 0 P JQ @ n 1 j A y j y Γ «nj J Q 0 @ n 1 j A y j 0 Q «J @ n j nj y j 1 A 0 1 Q «J @ n j nj y j A «nj P 0 JQ @ y Γ n 1 j A y j which does not contain any unknown parameters and we can calculate the exact p value. Lecture 19: Conditional Logistic Regression p. 11/40

Let λ by any value such that Λ( y) = λ and denote Γ λ = { y : y Γ and Λ( y) = λ} Then P(Λ( y) = λ) = X y Γ λ P( y t 0, t 1 ) To obtain the exact p value for testing β 2 = 0 we need to: 1. Enumerate all values of λ 2. Calculate P(Λ( y) = λ) 3. p value equals the sum of the as extreme or more extreme values of P(Λ( y) = λ) Or use SAS PROC LOGISTIC or LogXact Lecture 19: Conditional Logistic Regression p. 12/40

Example Mantel (1963) Effectiveness of immediately injected penicillin or 1.5 hour delayed injected penicillin in protecting against β hemolytic Streptococci: Penicillin RESPONSE Level DELAY Died Cured 1/8 None 0 6 1.5 hr 0 8 1/4 None 3 3 1.5 hr 1 6 1/2 None 6 0 1.5 hr 2 4 1 None 5 1 1.5 hr 6 0 4 None 2 0 1.5 hr 5 0 Lecture 19: Conditional Logistic Regression p. 13/40

Note, there are many 0 cells in the table; may have problems with the large sample normal approximations. We want to estimate the (common) OR between Delay and Response, given strata (Penicillin). We can consider the data as arising from J = 5, (2 2) tables, where J = 5 penicillin levels. In the k th row of (2 2) table j, we assume the data have the following logistic regression model: logit(p[cured penicillin=j, DELAY k ]) = µ + α j + βdelay k, where DELAY k = ( 0 if none 1 if 1.5 hours. There are problems with the unconditional (usual) MLE, as we ll see in the computer output. Lecture 19: Conditional Logistic Regression p. 14/40

This is equivalent to the logistic model logit{p[cured penicillin=j, DELAY]} = µ + α 1 z 1 +... + α 4 z 4 + βdelay k, where z j = ( 1 if penicillin=j 0 if otherwise. Note, we have not taken the ordinal nature of penicillin into account We are considering penicillin (and the associated dose) as a nuisance parameter and are not interested in drawing inference about penicillin We are interested in learning about the timing to give any dose of penicillin Lecture 19: Conditional Logistic Regression p. 15/40

SAS Proc Logistic Using SAS Proc Logistic, we get the following results: data cmh; input pen delay response count; cards; 0.125 0 0 0 /* response: 1 = cured, 0 = died */ 0.125 0 1 6 /* delay: 0 = none, 1 = 1.5 hrs */ 0.125 1 0 0 0.125 1 1 8 0.250 0 0 3 0.250 0 1 3 0.250 1 0 1 0.250 1 1 6 0.500 0 0 6 0.500 0 1 0 0.500 1 0 2 0.500 1 1 4 1.000 0 0 5 1.000 0 1 1 1.000 1 0 6 1.000 1 1 0 4.000 0 0 2 4.000 0 1 0 4.000 1 0 5 4.000 1 1 0 ; Lecture 19: Conditional Logistic Regression p. 16/40

proc logistic descending; class pen(param=ref); model response = pen delay ; freq count; run; /* SELECTED OUTPUT */ Model Convergence Status Quasi-complete separation of data points detected. WARNING: The maximum likelihood estimate may not exist. WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity the model fit is questionable. Lecture 19: Conditional Logistic Regression p. 17/40

Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept 1-13.7143 163.8 0.0070 0.9333 pen 0.125 1 25.1650 199.4 0.0159 0.8996 pen 0.25 1 13.6947 163.8 0.0070 0.9334 pen 0.5 1 11.9500 163.8 0.0053 0.9419 pen 1 1 10.0640 163.8 0.0038 0.9510 delay 1 1.8461 0.9288 3.9508 0.0468 WARNING: The validity of the model fit is questionable. Odds Ratio Estimates Point 95% Wald Effect Estimate Confidence Limits pen 0.125 vs 4 >999.999 <0.001 >999.999 pen 0.25 vs 4 >999.999 <0.001 >999.999 pen 0.5 vs 4 >999.999 <0.001 >999.999 pen 1 vs 4 >999.999 <0.001 >999.999 delay 6.335 1.026 39.115 Lecture 19: Conditional Logistic Regression p. 18/40

Possible remedies There are (at least) two possible remedies to this problem: 1. Assign a score w j to penicillin level j and use this with logistic regression, or 2. Since we are interested in β = log(or RESPONSE,DELAY ), an alternative is to eliminate the nuisance parameters (the effects of PENICILLIN LEVEL) by using conditional logistic regression. Lecture 19: Conditional Logistic Regression p. 19/40

Assigning Scores Revised data step: data cmh; /* delay = 0 if None, 1 if 1.5 hr */ /* response = 1 if Cured, 0 if Died */ input pen delay response count; if pen = 0.125 then pen1 = 1; else pen1=0; if pen = 0.250 then pen2 = 1; else pen2=0; if pen = 0.500 then pen3 = 1; else pen3=0; if pen = 1.000 then pen4 = 1; else pen4=0; pen_cont = pen; cards;... (same as before) proc logistic descending; model response = pen_cont delay /aggregate scale=1; freq count; run; Lecture 19: Conditional Logistic Regression p. 20/40

Selected Results Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept 1 2.2328 0.8051 7.6910 0.0055 pen_cont 1-6.8591 1.9781 12.0238 0.0005 delay 1 1.6984 0.8623 3.8792 0.0489 Odds Ratio Estimates Point 95% Wald Effect Estimate Confidence Limits pen_cont 0.001 <0.001 0.051 delay 5.465 1.008 29.620 Note: This model converged Lecture 19: Conditional Logistic Regression p. 21/40

Approach 2 - Elimination of Nuisance Parameters For the prospective study we have, the rows (y j1+ and y j2+ ) of each (2 2) table are fixed, and we have two independent binomials: 1) Row 1: Y j11 Bin(y j1+, p j1 ), where and 2) Row 2: where p j1 = P[Cured penicillin=j, DELAY = 1] Y j21 Bin(y j2+, p j2 ), p j2 = P[Cured penicillin=j, DELAY = 0] Lecture 19: Conditional Logistic Regression p. 22/40

In terms of the logistic model, logit(p[cured penicillin=j, DELAY = 1]) = logit(p j1 ) = µ + α j + β, and logit(p[cured penicillin=j, DELAY = 0]) = logit(p j2 ) = µ + α j Then the log-odds ratio for the j th, (2 2) table is log p j1/(1 p j1 ) p j2 /(1 p j2 ) = logit(p j1 ) logit(p j2 ) = [µ + α j + β] [µ + α j ] = β, and, the odds ratio is (OR RESPONSE,DELAY ) = p j1/(1 p j1 ) p j2 /(1 p j2 ) = eβ Lecture 19: Conditional Logistic Regression p. 23/40

Conditional Logistic Regression The conditional likelihood is JQ n j! L c ( y) = P JQ y j n j! y Γ y j which is the product of the probability functions over the J tables or strata. We can find the conditional MLE, ˆβ CMLE by maximizing L c (β). As with a single table, in large samples, ˆβ will be approximately unbiased, approximately normal, and have variance equal to the negative inverse of the (expected) second derivative. Lecture 19: Conditional Logistic Regression p. 24/40

What we mean by LARGE SAMPLES For proper properties of the conditional likelihood L c (β), we need large samples. There are two kinds of large samples that allow ˆβ CMLE to be approximately normal (by the central limit theorem). 1. We can have the within stata sample size (y j++ ) large (with the number of strata J being small). 2. As the number of strata, J, becomes large (we can y j++ <, and, actually, we can have y j++ as small as 2, which we will discuss later). As a result, we can have both (y j++ ) large and J large, and this would be a large sample as well. Lecture 19: Conditional Logistic Regression p. 25/40

This is in contrast to the usual (unconditional MLE); if you look at the logistic regression model logit{p[died penici=j, DELAY k ]} = µ + α j + βdelay k, in order to be able to estimate each α j, we need each y j++ to be large Lecture 19: Conditional Logistic Regression p. 26/40

In particular, if J is large, using unconditional logistic regression for logit(p jk ) = µ + α j + βx k, we will have a large number of parameters (α j s) to estimate, and we cannot unless each y j++ is also large. If y j++ is also large, then bβ CMLE b β MLE, but the MLE is preferable since it is simpler computationally, and has nice properties. Lecture 19: Conditional Logistic Regression p. 27/40

Mantel-Haenzsel Estimate One other possibility is the Mantel Haenszel estimate of a common odds ratio for a set of J, (2 2) tables, As with the conditional MLE, the Mantel-Haenszel estimator of the common odds ratio is (asymptotically unbiased) if either (y j++ ) is large or J is large or both. However, the Mantel-Haenzel estimate does not generalize to data with many covariates, whereas the conditional likelihood does. Lecture 19: Conditional Logistic Regression p. 28/40

CMH Using PROC FREQ However, for the data studied today, we can calculate the CMH (or just the Mantel-Haenzsel) proc freq; tables pen*delay*response/cmh; weight count; run; Estimates of the Common Relative Risk (Row1/Row2) Type of Study Method Value 95% Confidence Limits ------------------------------------------------------------------------- Case-Control Mantel-Haenszel 4.6316 0.9178 23.3731 (Odds Ratio) Logit ** 3.9175 0.6733 22.7925 ** These logit estimators use a correction of 0.5 in every cell of those tables that contain a zero. Tables with a zero row or a zero column are not included in computing the logit estimators. Lecture 19: Conditional Logistic Regression p. 29/40

Conditional Logistic Regression We can use SAS Proc Logistic as before to do conditional logistic regression. SAS Proc Logistic will give us the conditional logistic regression estimate of the odds ratio, and an exact 95% confidence interval for the odds ratio using the conditional likelihood. Lecture 19: Conditional Logistic Regression p. 30/40

SAS Proc Logistic proc logistic descending; class pen(param=ref); model response = pen delay ; exact delay / estimate = both /*both = logor & or */; freq count; run; /* SOME OUTPUT */ Conditional Exact Tests --- p-value --- Effect Test Statistic Exact Mid delay Score 4.0880 0.0587 0.0384 Probability 0.0407 0.0587 0.0384 Lecture 19: Conditional Logistic Regression p. 31/40

Exact Parameter Estimates 95% Confidence Parameter Estimate Limits p-value delay 1.6903-0.2120 4.1588 0.0937 Exact Odds Ratios 95% Confidence Parameter Estimate Limits p-value delay 5.421 0.809 63.995 0.0937 Lecture 19: Conditional Logistic Regression p. 32/40

Effectiveness of immediately injected penicillin or 1.5 hour OR of Delayed injected penicillin in protecting against Streptococci ODDS 95% Confidence Limits Variable Ratio Lower Upper -------------------------------------------- COND. LOGIT 5.421 0.809 63.995 MANTEL-HA 3.918 0.673 22.793 LOGISTIC* 6.335 1.026 39.115 LOGISTIC** 5.465 1.008 29.620 -------------------------------------------- * From Proc Logistic with warning: The validity of the model fit is questionable. Uses dose as a class **Using dose with scores Lecture 19: Conditional Logistic Regression p. 33/40

From each, we see that odds of being cured if there is no delay is about 5 times of what it is if there is a delay. Although we get an estimate from (ordinarly) logistic regression, it is not as stable as the other methods. Note, these strata sizes were small, but there are situations when they are even smaller (matched pairs, discussed later). First though, let s look at a couple of other models. In particular, let s test for interaction. Lecture 19: Conditional Logistic Regression p. 34/40

Penicillin Data If we fit the model with interaction between delay and penicillin, (with 4 dummy variables for penicillin and the 4 for the interaction), and the unconditional had the same problems as above. The conditional also has problems: Proc Logistic gave me the following: proc logistic descending; class pen(param=ref); model response = pen delay pen*delay ; exact pen*delay /estimate; freq count; run; Lecture 19: Conditional Logistic Regression p. 35/40

SOME OUTPUT Exact Conditional Analysis Conditional Exact Tests --- p-value --- Effect Test Statistic Exact Mid delay*pen Score 7.0666 0.1354 0.0990 Probability 0.0729 0.1354 0.0990 Exact Parameter Estimates 95% Confidence Parameter Estimate Limits p-value delay*pen 0.125.#... delay*pen 0.25.#... delay*pen 0.5.#... delay*pen 1.#... NOTE: # indicates that the conditional distribution is degenerate. Lecture 19: Conditional Logistic Regression p. 36/40

Proc Logistic can give us an exact conditional Score test (with p value =.1354), even though it cannot give us estimates of the interaction terms. Thus, I fit a little simpler model, with the main effects of penicillin fitted using dummy variables, but the interaction with penicillin continuous (w j ): logit(p[cured penici=j, DELAY=k]) = µ + α j + β 1 x k + β 12 x k w j, where x k = ( 0 if DELAY = none (k = 1) 1 if DELAY = 1.5 hours (k = 2). and w j equals the actual dose (0.125,.25,.5, or 1), treated as continuous. We are interested in testing for homogeneity of the odds ratio across the J stratum. Lecture 19: Conditional Logistic Regression p. 37/40

proc logistic descending; class pen(param=ref); model response = pen delay pen_cont*delay ; exact pen_cont*delay /estimate; freq count; run; Exact Parameter Estimates 95% Confidence Parameter Estimate Limits p-value delay*pen_cont -4.3902-15.4187 2.8438 0.2708 Lecture 19: Conditional Logistic Regression p. 38/40

When W (pennicilin) is treated as continuous, logit(p[cured penici=j, DELAY=k]) = µ + γ 1 w j + β 1 x k + β 12 x k w j the unconditional logistic regression also converged, and we get the following comparison 95% Confidence Limits PEN*DELAY --------------------------------------------- Parameter METHOD Estimate P Value --------------------------------------------- COND -4.3902 0.2708 UNCOND -3.4990 0.4288 --------------------------------------------- Although both algorithms converged, the estimates are pretty large, and the confidence intervals are very wide (not shown). However, we have indication that the common odds ratio assumption Lecture 19: Conditional is valid. Logistic Regression p. 39/40

LogXact All of the models could have been fit in LogXact, a software package specializing in conditional (or exact) logistic regression I noted small variations in the parameter estimates between SAS and LogXact. LogXact is mouse driven and is easy to use SAS is easier to do model building activities creation of derived variables, selection = routines and copy and paste model code Lecture 19: Conditional Logistic Regression p. 40/40