# Time varying (or time-dependent) covariates

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Chapter 9 Time varying (or time-dependent) covariates References: Allison (*) p Hosmer & Lemeshow Chapter 7, Section 3 Kalbfleisch & Prentice Section 5.3 Collett Chapter 7 Kleinbaum Chapter 6 Cox & Oakes Chapter 8 Andersen & Gill Page 168 (Advanced!) 1

2 2 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Our Goal here So far, we ve been considering the following Cox PH model: p λ(t, Z) =λ 0 (t) exp(βz) =λ 0 (t)exp β j Z j where the covariates Z j are measured at study entry (t =0). Important feature of this model: The hazard ratio λ(t,z=z) =exp(βz) depends on the covariates λ(t,z=0) z 1,..., z p, but not on time t. Nowwewantto relax this assumption, and allow the hazard ratio to depend on time t. allow to incorporate time-varying covariates j=1

3 9.1. EXAMPLES TO MOTIVATE TIME-DEPENDENT COVARIATES Examples to motivate time-dependent covariates Stanford Heart transplant example: Variables: survival - days since program enrollment until death or censoring dead - indicator of death (1) or censoring (0) transpl - whether patient ever had transplant (1 if yes, 2 if no) surgery - previous heart surgery prior to program (1=yes, 0=no) age - age at time of acceptance into program wait - days from acceptance into program until transplant surgery (=. for those without transplant)

4 4 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Initially, a Cox PH model was fit for predicting survival time: λ(t, Z) = λ 0 (t) exp(β 1 transpl + β 2 surgery + β 3 age) Does this fit in the framework we have seen so far? Why or why not?

5 9.1. EXAMPLES TO MOTIVATE TIME-DEPENDENT COVARIATES 5 λ(t, Z) = λ 0 (t) exp(β 1 transpl + β 2 surgery + β 3 age) (9.1) As the covariate transpl really changes over time and gets a value depending on how long the patient has been followed... this is not a regular Cox PH model as we know it. This model could give misleading results, since patients who died more quickly had less time available to get transplants. A model with a timedependent indicator of whether a patient had a transplant at each point in time might be more appropriate: λ(t, Z) = λ 0 (t) exp(β 1 trnstime(t)+β 2 surgery + β 3 age) (9.2) where trnstime(t) =1iftranspl=1 and wait<t

6 6 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES SAS code for these two models Time-independent covariate for transpl: proc phreg data=stanford; model survival*dead(0)=transpl surgery age; run; Time-dependent covariate for transpl: proc phreg data=stanford; model survival*dead(0)=trnstime surgery age; if wait>survival or wait=. then trnstime=0; else trnstime=1; run;

7 9.1. EXAMPLES TO MOTIVATE TIME-DEPENDENT COVARIATES 7 If we add time-dependent covariates or interactions with time to the Cox proportional hazards model, then it is not a proportional hazards model any longer. Werefertoitasanextended Cox model. Comparison with a single binary predictor (like heart transplant): The Cox PH model 9.1 would compare the survival distributions between those without a transplant (ever) to those with a transplant. A subject s transplant status at the end of the study would determine which category they were put into for the entire study follow-up. This does not make much sense. An extended Cox model 9.2 would compare the risk of an event between already transplanted and non-yet-transplanted at each event time, and would re-evaluate which risk group each person belonged in based on whether they d had a transplant by that time.

8 8 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Recidivism Example: (see Allison, p.42) 432 male inmates were followed for one year after release from prison, to evaluate risk of re-arrest as function of financial aid (fin), age at release (age), race (race), full-time work experience prior to first arrest (wexp), marital status (mar), parole status (paro=1 if released with parole, 0 otherwise), and number of prior convictions (prio). Data were also collected on employment status over time during the year. Time-independent model: includes employment status of the individual at the beginning of the study (1 if employed, 0 if unemployed), or perhaps at any point during the year. Time-dependent model: However, employment status changes over time, and it may be the more recent employment status that would affect the hazard for re-arrest. E.g., we might want to define a time-dependent covariate for each month of the study that indicates whether the individual was employed during the past month.

9 9.2. EXTENDED COX MODEL Extended Cox Model Framework: For individual i, suppose we have their observation time, failure indicator, and a summary of their covariate values over time: (X i,δ i, {Z i (t),t [0,X i ]}), {Z i (t),t [0,X i ]} represents the covariate path for the i-th individual while they are in the study, and the covariates can take different values at different times. Assumption: conditional on an individual s covariate history, the Cox model for the hazard holds: λ(t; {Z i (u),u [0,t]}) =λ(t; Z i (t)) = λ 0 (t) e βz i(t) This means we record in Z(t) the part of the history that influences the hazard at time t.

10 10 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Survivor function: S(t; Z) = exp{ t 0 exp(βz(u)) λ 0 (u)du} and depends on the values of the time dependent variables over the interval from 0 to t. This is the classic formulation of the time varying Cox regression survival model. For Z(u) is step function with one change point at t 1 <t: { [ t1 t ]} S(t; Z) = exp exp(βz(u)) λ 0 (u)du + exp(βz(u)) λ 0 (u)du 0 t 1 = exp { [ exp(βz(0)) t 1 0 λ 0(u)du + exp(βz(t 1 )) t t 1 ]} λ 0 (u)du And prediction??

11 9.2. EXTENDED COX MODEL 11 Kinds of time-varying covariates: internal covariates: variables that relate to the individuals, and can only be measured when an individual is alive, e.g. white blood cell count, CD4 count external covariates: variable which changes in a known way, e.g. age, dose of drug (if non-dynamic drug regime) variable that exists totally independently of all individuals, e.g. air temperature time itself

12 12 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES 9.3 Types of applications and Examples The extended Cox model is used: I. When important covariates change during a study Framingham Heart study 5209 subjects followed since 1948 to examine relationship between risk factors and cardiovascular disease. A particular example: Outcome: time to congestive heart failure Predictors: age, systolic blood pressure, # cigarettes per day

13 9.3. TYPES OF APPLICATIONS AND EXAMPLES 13 Liver Cirrhosis (Andersen and Gill, p.528) Clinical trial comparing treatment to placebo for cirrhosis. The outcome of interest is time to death. Patients were seen at the clinic after 3, 6 and 12 months, then yearly. Fixed covariates: treatment, gender, age (at diagnosis) Time-varying covariates: alcohol consumption, nutritional status, bleeding, albumin, bilirubin, alkaline phosphatase and prothrombin. The paper on obesity and heart failure... Recidivism Study (Allison, p.42)

14 14 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES II. For cross-over studies, to indicate change in treatment Stanford heart study (Cox and Oakes p.129) Between 1967 and 1980, 249 patients entered a program at Stanford University where they were registered to receive a heart transplant. Of these, 184 received transplants, 57 died while waiting, and 8 dropped out of the program for other reasons. Does getting a heart transplant improve survival? Here is a sample of the data: Waiting transplant? survival post total final time transplant survival status etc (survival is not indicated above for those without transplants, but was available in the dataset) Compare the total survival of transplanted and non- Naive approach: transplanted. Problem: Length Bias!

15 9.3. TYPES OF APPLICATIONS AND EXAMPLES 15 III. For Surrogate Outcome Analysis For example, in cancer clinical trials, tumor response (or shrinking of the tumor) is often used as an outcome. However, clinicians want to know whether tumor response correlates with survival. For this purpose, we can fit an extended Cox model for time to death, with tumor response as a time dependent covariate. Remember: association causation!

16 16 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES IV. For testing the PH assumption For example, we can fit these two models: (1) Time independent covariate Z 1 λ(t, Z) =λ 0 (t) exp(β 1 Z 1 ) The hazard ratio for Z 1 is exp(β 1 ). (2) Time dependent covariate Z 1 λ(t, Z) =λ 0 (t) exp(β 1 Z 1 + β 2 Z 1 t) The hazard ratio for Z 1 is exp(β 1 + β 2 t). (note: we may want to replace t by (t t 0 ), so that exp(β 1 ) represents HR at some convenient time, like the median survival time.) A test of the parameter β 2 is a test of the PH assumption. (how do we get the test?...using the Wald test from the output of second model, or LR test formed by comparing the loglikelihoods of the two models)

17 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES Partial likelihood with time-varying covariates Starting out just as before, relying on non-informative censoring Suppose there are K distinct failure (or death) times, and let (τ 1,...τ K ) represent the K ordered, distinct death times. For now, assume there are no tied death times. Risk Set: Let R(t) ={i : x i t} denote the set of individuals who are at risk for failure at time t. Failure: Let i j denote the label or identity of the individual who fails at time τ j, including the value of their time-varying covariate during their time in the study {Z ij (t),t [0,τ j ]} History: Let H j denote the history of the entire data set, up to the j-th death or failure time, including the time of the failure, but not the identity of the one who fails, also including the values of all covariates for everyone up to and including time τ j.

18 18 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Partial Likelihood: We retain the previous construction of the partial likelihood. It can be written as L(β) = d P (i j H j ) j=1 = d j=1 λ(τ j ; Z ij (τ j )) l R(τ j ) λ(τ j; Z l (τ j ))

19 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 19 Under the PH assumption, this is: d exp(βz ij (τ j )) L(β) = l R(τ j ) exp(βz l(τ j )) j=1 What if Z is not measured for person l at time τ j? use the most recent value (assumes step function) interpolate impute based on some model Inference (i.e. estimating the regression coefficients, constructing score tests, etc.) proceeds similarly to standard case. The main difference is that the values of Z will change at each risk set. It is very easy to write down a Cox model with time-dependent covariates, but much harder to fit (computationally) and interpret!

20 20 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Old Example revisited: Group 0: 4 +, 7, 8 +, 9, 10 + Group 1: 3, 5, 5 +, 6, 8 + Let Z 1 be group, and add another fixed covariate Z 2 ID fail censor Z 1 Z 2 e (β 1Z 1 +β 2 Z 2 ) e β 1+β e β e β 1+β e β e β 1+β e β e β e β ordered Partial failure Individuals Likelihood time (τ j ) at risk failure ID contribution

21 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 21 Example continued: λ(t) =λ 0 (t)exp(β 1 Z 1 + β 2 Z 2 (t)) Now suppose Z 2 (a completely different covariate) is a time varying covariate: Z 2 (t) ID fail censor Z ordered Partial failure Individuals Likelihood time (τ j ) at risk failure ID contribution

22 22 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES SAS solution to previous examples Title Ph regression: small class example ; data ph; input time status group z3 z4 z5 z6 z7 z8 z9; cards; run; proc phreg ; model time*status(0)=group z3 ; run; proc phreg ; model time*status(0)=group z ; z=z3; if (time >= 4) then z=z4; if (time >= 5) then z=z5; etc.; run;

23 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 23 Model with z3: Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 2 DF (p=0.1965) Score with 2 DF (p=0.1597) Wald with 2 DF (p=0.2315) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio GROUP Z

24 24 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Model with time-dependent Z: Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 2 DF (p=0.2558) Score with 2 DF (p=0.2560) Wald with 2 DF (p=0.3212) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio GROUP Z

25 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 25 The Stanford Heart Transplant data data heart; infile heart.dat ; input wait trans post surv status ; run; data heart; set heart; if trans=2 then surv=wait; run; *** naive analysis; proc phreg; model surv*status(2)=tstat; tstat=2-trans; *** analysis with time-dependent covariate; proc phreg; model surv*status(2)=tstat; tstat = 0; if (trans=1 and surv >= wait) then tstat = 1; run; The second model took about twice as long to run as the first model, which is usual.

26 26 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES RESULTS for Stanford Heart Transplant data: Naive model with fixed transplant indicator: Criterion Covariates Covariates Model Chi-Square -2 LOG L with 1 DF (p=0.0001) Score with 1 DF (p=0.0001) Wald with 1 DF (p=0.0001) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio TSTAT Model with time-dependent transplant indicator: Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 1 DF (p=0.0001) Score with 1 DF (p=0.0001) Wald with 1 DF (p=0.0001) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio TSTAT

27 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 27 Recidivism Example: Hazard for arrest within one year of release from prison: Model without employment status Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 7 DF (p=0.0001) Score with 7 DF (p=0.0001) Wald with 7 DF (p=0.0001) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio FIN AGE RACE WEXP MAR PARO PRIO What are the important predictors of recidivism?

28 28 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Recidivism Example: (cont d) Now, we use the indicators of employment status for each of the 52 weeks in the study, recorded as emp1-emp52. We can fit the model in 2 different ways: proc phreg data=recid; model week*arrest(0)=fin age race wexp mar parro prio employed / ties=efron; array emp(*) emp1-emp52; do i=1 to 52; if week=i then employed=emp(i); end; run; *** a shortcut; proc phreg data=recid; model week*arrest(0)=fin age race wexp mar parro prio employed / ties=efron; array emp(*) emp1-emp52; employed=emp(week); run; The second way takes 23% less time than the first way, but the results are the same.

29 9.4. PARTIAL LIKELIHOOD WITH TIME-VARYING COVARIATES 29 Recidivism Example: Output Model WITH employment as time-dependent covariate Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio FIN AGE RACE WEXP MAR PARO PRIO EMPLOYED Is current employment important? Do the other covariates change much? Can you think of any problem with using current employment as a predictor?

30 30 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Another option for assessing impact of employment Allison suggests using the employment status of the past week rather than the current week, as follows: proc phreg data=recid; where week>1; model week*arrest(0)=fin age race wexp mar parro prio employed / ties=efron; array emp(*) emp1-emp52; employed=emp(week-1); run; The coefficient for employed changes from to -0.79, so the risk ratio is about 0.45 instead of It is still highly significant with χ 2 =13.1. Does this model improve the causal interpretation? Other options for time-dependent covariates: multiple lags of employment status (week-1, week-2, etc.) cumulative employment experience (proportion of weeks worked)

31 9.5. SOME CAUTIONARY NOTES Some cautionary notes Time-varying covariates must be carefully constructed to ensure interpretability There is no point adding a time-varying covariate whose value changes the same as study time... you will get the same answer as using a fixed covariate measured at study entry. For example, suppose we want to study the effect of age on time to death. We could 1. use age at start of the study as a fixed covariate 2. age as a time varying covariate However, the results will be the same! Why?

32 32 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Using time-varying covariates to assess model fit Suppose we have just fit the following model: λ(t; Z) =λ 0 (t) exp(β 1 Z 1 + β 2 Z β p Z p ) E.g., the nursing home data with gender, marital status and health. Suppose we want to test the proportionality assumption on health (Z p ) Create a new variable: Z p+1 (t) =Z p γ(t) where γ(t) is a known function of time, such as γ(t) = t or log(t) or e ρt or I {t>t } Then testing H 0 : β p+1 =0is a test for non-proportionality

33 9.5. SOME CAUTIONARY NOTES 33 Illustration: Colon Cancer data *** model without time*covariate interaction; proc phreg data=surv; model survtime*censs(1) = trtm stagen ; Model without time*stage interaction Event and Censored Values Percent Total Event Censored Censored Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 2 DF (p=0.0001) Score with 2 DF (p=0.0001) Wald with 2 DF (p=0.0001) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio TRTM STAGEN

34 34 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES *** model WITH time*covariate interaction; proc phreg data=surv ; model survtime*censs(1) = trtm stagen tstage ; tstage=stagen*exp(-survtime/1000); Model WITH time*stage interaction Testing Global Null Hypothesis: BETA=0 Without With Criterion Covariates Covariates Model Chi-Square -2 LOG L with 3 DF (p=0.0001) Score with 3 DF (p=0.0001) Wald with 3 DF (p=0.0002) Analysis of Maximum Likelihood Estimates Parameter Standard Wald Pr > Risk Variable DF Estimate Error Chi-Square Chi-Square Ratio TRTM STAGEN TSTAGE Like Cox and Oakes, we can run a few different models

35 9.5. SOME CAUTIONARY NOTES 35 Time-varying covariates in Stata Create a data set with an ID column, and one line per person for each different value of the time varying covariate.. infile id time status group z using cox4_stata.dat or. input id time status group z end. stset time status. cox time group z, dead(status) tvid(id)

36 36 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES time status Coef. Std. Err. z P> z [95% Conf. Interval] group z

37 9.5. SOME CAUTIONARY NOTES 37 Time-varying covariates in Splus Create a data set with start and stop values of time: id start stop status group z

38 38 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES Then the Splus commands and results are: Commands: y_read.table("cox4_splus.dat",header=t) coxph(surv(y\$start,y\$stop,y\$status) ~ cbind(y\$group,y\$z)) Results: Alive Dead Deleted coef exp(coef) se(coef) z p [1,] [2,] exp(coef) exp(-coef) lower.95 upper.95 [1,] [2,] Likelihood ratio test= 2.73 on 2 df, p=0.256 Efficient score test = 2.73 on 2 df, p=0.256

39 9.6. PIECEWISE COX MODEL: (COLLETT, CHAPTER 10) Piecewise Cox Model: (Collett, Chapter 10) A time dependent covariate can be used to create a piecewise PH cox model. Suppose we are interested in comparing two treatments, and: HR=θ 1 during the interval (0,t 1 ) HR=θ 2 during the interval (t 1,t 2 ) HR=θ 3 during the interval (t 2, ) Define the following covariates: X - treatment indicator (X =0 standard, X =1 new treatment) Z 2 - indicator of change in HR during 2nd interval { 1 if t (t1,t Z 2 (t) = 2 ) and X =1 0 otherwise Z 3 - indicator of change in HR during 3rd interval { 1 if t (t2, ) and X =1 Z 3 (t) = 0 otherwise The model for the hazard for individual i is: λ i (t) =λ 0 (t)exp{β 1 x i + β 2 z 2i (t)+β 3 z 3i (t)}

40 40 CHAPTER 9. TIME VARYING (OR TIME-DEPENDENT) COVARIATES

41 Contents 9 Time varying (or time-dependent) covariates Examples to motivate time-dependent covariates Extended Cox Model Types of applications and Examples Partial likelihood with time-varying covariates Some cautionary notes Piecewise Cox Model: (Collett, Chapter 10)

### Introduction. Survival Analysis. Censoring. Plan of Talk

Survival Analysis Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 01/12/2015 Survival Analysis is concerned with the length of time before an event occurs.

### Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

### Survival Analysis Using SPSS. By Hui Bian Office for Faculty Excellence

Survival Analysis Using SPSS By Hui Bian Office for Faculty Excellence Survival analysis What is survival analysis Event history analysis Time series analysis When use survival analysis Research interest

### Chapter 5: Cox Proportional Hazards Model A popular model used in survival analysis that can be used to assess the importance of various covariates

Chapter 5: Cox Proportional Hazards Model A popular model used in survival analysis that can be used to assess the importance of various covariates in the survival times of individuals or objects through

### Lecture 2 ESTIMATING THE SURVIVAL FUNCTION. One-sample nonparametric methods

Lecture 2 ESTIMATING THE SURVIVAL FUNCTION One-sample nonparametric methods There are commonly three methods for estimating a survivorship function S(t) = P (T > t) without resorting to parametric models:

### Introduction to Survival Analysis

John Fox Lecture Notes Introduction to Survival Analysis Copyright 2014 by John Fox Introduction to Survival Analysis 1 1. Introduction I Survival analysis encompasses a wide variety of methods for analyzing

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Competing-risks regression

Competing-risks regression Roberto G. Gutierrez Director of Statistics StataCorp LP Stata Conference Boston 2010 R. Gutierrez (StataCorp) Competing-risks regression July 15-16, 2010 1 / 26 Outline 1. Overview

### 7.1 The Hazard and Survival Functions

Chapter 7 Survival Models Our final chapter concerns models for the analysis of data which have three main characteristics: (1) the dependent variable or response is the waiting time until the occurrence

### The Cox Proportional Hazards Model

The Cox Proportional Hazards Model Mario Chen, PhD Advanced Biostatistics and RCT Workshop Office of AIDS Research, NIH ICSSC, FHI Goa, India, September 2009 1 The Model h i (t)=h 0 (t)exp(z i ), Z i =

### Lecture 15 Introduction to Survival Analysis

Lecture 15 Introduction to Survival Analysis BIOST 515 February 26, 2004 BIOST 515, Lecture 15 Background In logistic regression, we were interested in studying how risk factors were associated with presence

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

### Checking proportionality for Cox s regression model

Checking proportionality for Cox s regression model by Hui Hong Zhang Thesis for the degree of Master of Science (Master i Modellering og dataanalyse) Department of Mathematics Faculty of Mathematics and

### Tests for Two Survival Curves Using Cox s Proportional Hazards Model

Chapter 730 Tests for Two Survival Curves Using Cox s Proportional Hazards Model Introduction A clinical trial is often employed to test the equality of survival distributions of two treatment groups.

### Simple Sensitivity Analyses for Matched Samples. CRSP 500 Spring 2008 Thomas E. Love, Ph. D., Instructor. Goal of a Formal Sensitivity Analysis

Goal of a Formal Sensitivity Analysis To replace a general qualitative statement that applies in all observational studies the association we observe between treatment and outcome does not imply causation

### How Do We Test Multiple Regression Coefficients?

How Do We Test Multiple Regression Coefficients? Suppose you have constructed a multiple linear regression model and you have a specific hypothesis to test which involves more than one regression coefficient.

### 5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits

Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. Hosmer-Lemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional

### Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

### Comparison of Survival Curves

Comparison of Survival Curves We spent the last class looking at some nonparametric approaches for estimating the survival function, Ŝ(t), over time for a single sample of individuals. Now we want to compare

### Non-Parametric Estimation in Survival Models

Non-Parametric Estimation in Survival Models Germán Rodríguez grodri@princeton.edu Spring, 2001; revised Spring 2005 We now discuss the analysis of survival data without parametric assumptions about the

### ˆx wi k = i ) x jk e x j β. c i. c i {x ik x wi k}, (1)

Residuals Schoenfeld Residuals Assume p covariates and n independent observations of time, covariates, and censoring, which are represented as (t i, x i, c i ), where i = 1, 2,..., n, and c i = 1 for uncensored

### Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln Log-Rank Test for More Than Two Groups Prepared by Harlan Sayles (SRAM) Revised by Julia Soulakova (Statistics)

### Introduction to Event History Analysis DUSTIN BROWN POPULATION RESEARCH CENTER

Introduction to Event History Analysis DUSTIN BROWN POPULATION RESEARCH CENTER Objectives Introduce event history analysis Describe some common survival (hazard) distributions Introduce some useful Stata

### SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

### Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Tips for surviving the analysis of survival data Philip Twumasi-Ankrah, PhD Big picture In medical research and many other areas of research, we often confront continuous, ordinal or dichotomous outcomes

### Logistic Regression. Introduction CHAPTER The Logistic Regression Model 14.2 Inference for Logistic Regression

Logistic Regression Introduction The simple and multiple linear regression methods we studied in Chapters 10 and 11 are used to model the relationship between a quantitative response variable and one or

### Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)

Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.) Logistic regression generalizes methods for 2-way tables Adds capability studying several predictors, but Limited to

### Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL

### Raul Cruz-Cano, HLTH653 Spring 2013

Multilevel Modeling-Logistic Schedule 3/18/2013 = Spring Break 3/25/2013 = Longitudinal Analysis 4/1/2013 = Midterm (Exercises 1-5, not Longitudinal) Introduction Just as with linear regression, logistic

### Ordinal Regression. Chapter

Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

### I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

### Multivariate Logistic Regression

1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

### Binary Logistic Regression

Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

### Randomized trials versus observational studies

Randomized trials versus observational studies The case of postmenopausal hormone therapy and heart disease Miguel Hernán Harvard School of Public Health www.hsph.harvard.edu/causal Joint work with James

### Developing Business Failure Prediction Models Using SAS Software Oki Kim, Statistical Analytics

Paper SD-004 Developing Business Failure Prediction Models Using SAS Software Oki Kim, Statistical Analytics ABSTRACT The credit crisis of 2008 has changed the climate in the investment and finance industry.

### Parametric Survival Models

Parametric Survival Models Germán Rodríguez grodri@princeton.edu Spring, 2001; revised Spring 2005, Summer 2010 We consider briefly the analysis of survival data when one is willing to assume a parametric

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Engagement at Work Predicts Changes in Depression and Anxiety Status in the Next Year

Engagement at Work Predicts Changes in Depression and Anxiety Status in the Next Year October 2009 Sangeeta Agrawal, MS and James Harter, Ph.D. For more information about Gallup Consulting or our solutions

### SAS and R calculations for cause specific hazard ratios in a competing risks analysis with time dependent covariates

SAS and R calculations for cause specific hazard ratios in a competing risks analysis with time dependent covariates Martin Wolkewitz, Ralf Peter Vonberg, Hajo Grundmann, Jan Beyersmann, Petra Gastmeier,

### III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

III. INTRODUCTION TO LOGISTIC REGRESSION 1. Simple Logistic Regression a) Example: APACHE II Score and Mortality in Sepsis The following figure shows 30 day mortality in a sample of septic patients as

### Vignette for survrm2 package: Comparing two survival curves using the restricted mean survival time

Vignette for survrm2 package: Comparing two survival curves using the restricted mean survival time Hajime Uno Dana-Farber Cancer Institute March 16, 2015 1 Introduction In a comparative, longitudinal

### 5 Modeling Survival Data with Parametric Regression

5 Modeling Survival Data with Parametric Regression Models 5. The Accelerated Failure Time Model Before talking about parametric regression models for survival data, let us introduce the accelerated failure

### Survival analysis methods in Insurance Applications in car insurance contracts

Survival analysis methods in Insurance Applications in car insurance contracts Abder OULIDI 1-2 Jean-Marie MARION 1 Hérvé GANACHAUD 3 1 Institut de Mathématiques Appliquées (IMA) Angers France 2 Institut

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Statistical Modeling Using SAS

Statistical Modeling Using SAS Xiangming Fang Department of Biostatistics East Carolina University SAS Code Workshop Series 2012 Xiangming Fang (Department of Biostatistics) Statistical Modeling Using

### Multiple logistic regression analysis of cigarette use among high school students

Multiple logistic regression analysis of cigarette use among high school students ABSTRACT Joseph Adwere-Boamah Alliant International University A binary logistic regression analysis was performed to predict

### Today s lecture. Lecture 6: Dichotomous Variables & Chi-square tests. Proportions and 2 2 tables. How do we compare these proportions

Today s lecture Lecture 6: Dichotomous Variables & Chi-square tests Sandy Eckel seckel@jhsph.edu Dichotomous Variables Comparing two proportions using 2 2 tables Study Designs Relative risks, odds ratios

### More details on the inputs, functionality, and output can be found below.

Overview: The SMEEACT (Software for More Efficient, Ethical, and Affordable Clinical Trials) web interface (http://research.mdacc.tmc.edu/smeeactweb) implements a single analysis of a two-armed trial comparing

### Lecture 19: Conditional Logistic Regression

Lecture 19: Conditional Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina

### LOGISTIC REGRESSION ANALYSIS

LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic

### Comparison of resampling method applied to censored data

International Journal of Advanced Statistics and Probability, 2 (2) (2014) 48-55 c Science Publishing Corporation www.sciencepubco.com/index.php/ijasp doi: 10.14419/ijasp.v2i2.2291 Research Paper Comparison

### Overview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models

Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear

### Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2

Bivariate Analysis Variable 2 LEVELS >2 LEVELS COTIUOUS Correlation Used when you measure two continuous variables. Variable 2 2 LEVELS X 2 >2 LEVELS X 2 COTIUOUS t-test X 2 X 2 AOVA (F-test) t-test AOVA

### LMM: Linear Mixed Models and FEV1 Decline

LMM: Linear Mixed Models and FEV1 Decline We can use linear mixed models to assess the evidence for differences in the rate of decline for subgroups defined by covariates. S+ / R has a function lme().

### Using Stata for Categorical Data Analysis

Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,

### Correlational Research

Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

### Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

### Modelling spousal mortality dependence: evidence of heterogeneities and implications

1/23 Modelling spousal mortality dependence: evidence of heterogeneities and implications Yang Lu Scor and Aix-Marseille School of Economics Lyon, September 2015 2/23 INTRODUCTION 3/23 Motivation It has

### Chapter 1: Role of statistical models

Chapter 1: Role of statistical models Statistical models 2 Statistical model (1)........................................................ 3 Statistical model (2)........................................................

### Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

### Prognosis of survival for breast cancer patients

Prognosis of survival for breast cancer patients Ken Ryder Breast Cancer Unit Data Section Guy s Hospital Patrick Royston, MRC Clinical Trials Unit London Outline Introduce the data and outcomes requested

### Logit Models for Binary Data

Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

### CRJ Doctoral Comprehensive Exam Statistics Friday August 23, :00pm 5:30pm

CRJ Doctoral Comprehensive Exam Statistics Friday August 23, 23 2:pm 5:3pm Instructions: (Answer all questions below) Question I: Data Collection and Bivariate Hypothesis Testing. Answer the following

### Lecture 16: Logistic regression diagnostics, splines and interactions. Sandy Eckel 19 May 2007

Lecture 16: Logistic regression diagnostics, splines and interactions Sandy Eckel seckel@jhsph.edu 19 May 2007 1 Logistic Regression Diagnostics Graphs to check assumptions Recall: Graphing was used to

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Application of Odds Ratio and Logistic Models in Epidemiology and Health Research

Application of Odds Ratio and Logistic Models in Epidemiology and Health Research By: Min Qi Wang, PhD; James M. Eddy, DEd; Eugene C. Fitzhugh, PhD Wang, M.Q., Eddy, J.M., & Fitzhugh, E.C.* (1995). Application

### Introduction to Fixed Effects Methods

Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed

### STATISTICAL ANALYSIS OF SAFETY DATA IN LONG-TERM CLINICAL TRIALS

STATISTICAL ANALYSIS OF SAFETY DATA IN LONG-TERM CLINICAL TRIALS Tailiang Xie, Ping Zhao and Joel Waksman, Wyeth Consumer Healthcare Five Giralda Farms, Madison, NJ 794 KEY WORDS: Safety Data, Adverse

### 1. Role of statistical models

1. Role of statistical models Statistical models 2 Statistical model........................................................... 3 Statistical model...........................................................

### A Tutorial on Logistic Regression

A Tutorial on Logistic Regression Ying So, SAS Institute Inc., Cary, NC ABSTRACT Many procedures in SAS/STAT can be used to perform logistic regression analysis: CATMOD, GENMOD,LOGISTIC, and PROBIT. Each

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### CHAPTER 6 EXAMPLES: GROWTH MODELING AND SURVIVAL ANALYSIS

Examples: Growth Modeling And Survival Analysis CHAPTER 6 EXAMPLES: GROWTH MODELING AND SURVIVAL ANALYSIS Growth models examine the development of individuals on one or more outcome variables over time.

### ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

### Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis

### Introduction to mixed model and missing data issues in longitudinal studies

Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models

### Regression Modeling Strategies

Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### How to get accurate sample size and power with nquery Advisor R

How to get accurate sample size and power with nquery Advisor R Brian Sullivan Statistical Solutions Ltd. ASA Meeting, Chicago, March 2007 Sample Size Two group t-test χ 2 -test Survival Analysis 2 2 Crossover

### An Application of the Cox Proportional Hazards Model to the Construction of Objective Vintages for Credit in Financial Institutions, Using PROC PHREG

Paper 3140-2015 An Application of the Cox Proportional Hazards Model to the Construction of Objective Vintages for Credit in Financial Institutions, Using PROC PHREG Iván Darío Atehortua Rojas, Banco Colpatria

### Sample Size Planning, Calculation, and Justification

Sample Size Planning, Calculation, and Justification Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

### Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

### Primary Prevention of Cardiovascular Disease with a Mediterranean diet

Primary Prevention of Cardiovascular Disease with a Mediterranean diet Alejandro Vicente Carrillo, Brynja Ingadottir, Anne Fältström, Evelyn Lundin, Micaela Tjäderborn GROUP 2 Background The traditional

### Modern Methods for Missing Data

Modern Methods for Missing Data Paul D. Allison, Ph.D. Statistical Horizons LLC www.statisticalhorizons.com 1 Introduction Missing data problems are nearly universal in statistical practice. Last 25 years

### Problem of Missing Data

VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;

### Survival Analysis And The Application Of Cox's Proportional Hazards Modeling Using SAS

Paper 244-26 Survival Analysis And The Application Of Cox's Proportional Hazards Modeling Using SAS Tyler Smith, and Besa Smith, Department of Defense Center for Deployment Health Research, Naval Health

### VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

### Multivariable survival analysis S10. Michael Hauptmann Netherlands Cancer Institute Amsterdam, The Netherlands

Multivariable survival analysis S10 Michael Hauptmann Netherlands Cancer Institute Amsterdam, The Netherlands m.hauptmann@nki.nl 1 Confounding A variable correlated with the variable of interest and with

### CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical

### Module 14: Missing Data Stata Practical

Module 14: Missing Data Stata Practical Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine www.missingdata.org.uk Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724

### BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS An Addendum to Negative Binomial Regression Cambridge University Press (2007) Joseph M. Hilbe 2008, All Rights Reserved This short monograph is intended

### Chi-square test Fisher s Exact test

Lesson 1 Chi-square test Fisher s Exact test McNemar s Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

### Logistic regression diagnostics

Logistic regression diagnostics Biometry 755 Spring 2009 Logistic regression diagnostics p. 1/28 Assessing model fit A good model is one that fits the data well, in the sense that the values predicted

### A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic

A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic Report prepared for Brandon Slama Department of Health Management and Informatics University of Missouri, Columbia

### 11 Logistic Regression - Interpreting Parameters

11 Logistic Regression - Interpreting Parameters Let us expand on the material in the last section, trying to make sure we understand the logistic regression model and can interpret Stata output. Consider

### Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom