The Cox Proportional Hazards Model




Mario Chen, PhD
Advanced Biostatistics and RCT Workshop
Office of AIDS Research, NIH
ICSSC, FHI
Goa, India, September 2009

The Model

$$h_i(t) = h_0(t)\exp(Z_i), \qquad Z_i = \beta_1 T_i + \beta_2 X_{i1} + \beta_3 X_{i2} + \dots + \beta_k X_{ip}$$

where
- $h_0(t)$: baseline hazard
- $i = 1, \dots, N$: individuals
- $t$: time variable
- $T_i$: treatment variable (dichotomous)
- $X_{ij}$: predictors, $j = 1, \dots, p$ (so $k = p + 1$)
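As a small illustration of the model above, the following Python sketch evaluates $h_i(t)$ for two individuals who differ only in treatment. The constant baseline hazard, the coefficient values, and the function names are made up for illustration; the Cox model itself leaves $h_0(t)$ unspecified.

```python
import math

def linear_predictor(betas, covariates):
    """Z_i: sum of coefficient * covariate value for one individual."""
    return sum(b * x for b, x in zip(betas, covariates))

def hazard(t, baseline_hazard, betas, covariates):
    """h_i(t) = h_0(t) * exp(Z_i)."""
    return baseline_hazard(t) * math.exp(linear_predictor(betas, covariates))

# Hypothetical values, for illustration only (not from the slides).
def h0(t):
    return 0.001                # assumed constant baseline hazard per day

betas = [-0.5, 0.03]            # assumed coefficients: treatment, age
treated = [1, 25]               # T_i = 1, age 25
control = [0, 25]               # T_i = 0, age 25

h_treated = hazard(100, h0, betas, treated)
h_control = hazard(100, h0, betas, control)
```

Dividing `h_treated` by `h_control` gives `exp(-0.5)`, the hazard ratio for treatment under these assumed coefficients.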

Characteristics
- Baseline hazard function is left unspecified (nonparametric)
- Partial likelihood estimation
- Effect of covariates: exponential, $h_i(t) = h_0(t)\exp(Z_i)$, which guarantees $h_i(t) \ge 0$
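To make partial likelihood estimation concrete, here is a minimal sketch that fits a single coefficient by Newton-Raphson. It assumes one covariate and no tied event times, and the toy data are invented; it is an illustration of the idea, not the algorithm used by any particular package.

```python
import math

def cox_fit_one_covariate(times, events, x, n_iter=25):
    """Maximise the Cox partial likelihood for one covariate by
    Newton-Raphson, assuming no tied event times."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored subjects only appear in risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            score += x[i] - s1 / s0               # first derivative
            info += s2 / s0 - (s1 / s0) ** 2      # minus second derivative
        if info == 0:
            break
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta

# Invented toy data: time, event indicator (1 = event), binary covariate.
times = [5, 8, 12, 15, 20, 22, 30, 34]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [1, 0, 1, 1, 0, 0, 1, 0]

beta_hat = cox_fit_one_covariate(times, events, x)
```

Note that the baseline hazard never appears in the computation: only the ordering of event times and the composition of each risk set matter.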

The Proportional Hazard Assumption

Measure of effect: hazard ratio (HR)

$$\frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\exp(Z_i)}{h_0(t)\exp(Z_j)} = \frac{\exp(Z_i)}{\exp(Z_j)} = \exp(Z_i - Z_j)$$
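The cancellation of the baseline hazard can be checked numerically. The sketch below uses an arbitrary time-varying baseline and two arbitrary linear predictors (all hypothetical) and shows that the ratio of hazards is the same at every time point.

```python
import math

def hazard_ratio(z_i, z_j):
    """HR between individuals i and j: exp(Z_i - Z_j)."""
    return math.exp(z_i - z_j)

def baseline(t):
    """Hypothetical baseline hazard that changes with time."""
    return 0.002 * t ** 0.5

z_i, z_j = 0.8, 0.3   # assumed linear predictors for two individuals

# Ratio of the two hazards at several time points.
ratios = [
    (baseline(t) * math.exp(z_i)) / (baseline(t) * math.exp(z_j))
    for t in (1, 30, 180, 365)
]
```

Every entry of `ratios` equals `hazard_ratio(z_i, z_j)` up to floating-point error, which is what "proportional hazards" means.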

Example

Is the intervention effective in reducing the risk of pregnancy?
- Treatment (AP; 0 = Control, 1 = Intervention)
- Need to control for:
    - Site (NEV)
    - AGE
    - RACE
    - Marital status (MARRIED)
    - Hormonal contraceptive use at baseline (HEMETH)
    - Ever been pregnant before the study (EVERPREG)

Example

Analysis of Maximum Likelihood Estimates

Parameter   DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq
AP           1        -0.03381            0.18867         0.0321       0.8578
NEV          1         0.56274            0.23317         5.8248       0.0158
age          1        -0.05018            0.03977         1.5920       0.2070
race         1         0.52562            0.28002         3.5234       0.0605
married      1         0.21621            0.38229         0.3199       0.5717
hemeth       1        -0.44360            0.20379         4.7381       0.0295
everpreg     1         1.20215            0.21032        32.6720       <.0001

Parameter   Hazard Ratio   95% Hazard Ratio Confidence Limits
AP             0.967            0.668      1.399
NEV            1.755            1.112      2.772
age            0.951            0.880      1.028
race           1.692            0.977      2.928
married        1.241            0.587      2.626
hemeth         0.642            0.430      0.957
everpreg       3.327            2.203      5.025
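The hazard ratio columns follow directly from the estimates and standard errors. The sketch below reproduces the AP row, assuming the usual Wald construction (normal quantile of about 1.96, 1-df chi-square test); the function name and return layout are illustrative.

```python
import math

def wald_summary(estimate, std_error, z=1.959964):
    """Hazard ratio, 95% Wald confidence limits, Wald chi-square and
    its p-value (1 df) from a coefficient and its standard error."""
    chi_sq = (estimate / std_error) ** 2
    return {
        "hr": math.exp(estimate),
        "lower": math.exp(estimate - z * std_error),
        "upper": math.exp(estimate + z * std_error),
        "chi_sq": chi_sq,
        # Upper tail of a chi-square with 1 df, via the normal distribution.
        "p_value": math.erfc(math.sqrt(chi_sq / 2)),
    }

# AP row of the table: estimate -0.03381, standard error 0.18867.
ap = wald_summary(-0.03381, 0.18867)
```

Rounded to the precision of the table, `ap` matches the reported values for AP: hazard ratio 0.967, limits 0.668 to 1.399, chi-square 0.0321, p-value 0.8578.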

Evaluating the PH assumption
- Graphical approach: S(t) or 1 - S(t) plots

[Figure: probability of pregnancy versus days since enrollment (0 to 390), by treatment group (Standard, Advanced)]

Evaluating the PH assumption
- Graphical approach: -ln[-ln(S(t))] plots

[Figure: -ln[-ln(survival probability)] versus ln(analysis time), for ap = 0 and ap = 1]
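The points for such a plot can be computed from a Kaplan-Meier estimate of S(t). The following is a minimal sketch with invented data for a single group; in practice it would be run once per treatment group and the curves plotted together. Under proportional hazards the transformed curves should be roughly parallel.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def log_minus_log(curve):
    """(ln t, -ln(-ln S(t))) for points with t > 0 and 0 < S(t) < 1."""
    return [
        (math.log(t), -math.log(-math.log(s)))
        for t, s in curve
        if t > 0 and 0 < s < 1
    ]

# Invented data for one group: follow-up time and event indicator.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 1])
lml_points = log_minus_log(km_curve)
```

Points where S(t) reaches 0 are dropped because the transform is undefined there.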

Evaluating the PH assumption
- Time-dependent covariates approach:
    - Add a covariate-by-time interaction term
    - Use t, log t, or another function of t

Example

Analysis of Maximum Likelihood Estimates

Parameter     DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq
AP             1         1.17916            0.88025         1.7945       0.1804
NEV            1         0.99259            1.16493         0.7260       0.3942
age            1        -0.24842            0.18507         1.8017       0.1795
race           1        -0.96664            1.63552         0.3493       0.5545
married        1        -0.89980            2.43640         0.1364       0.7119
hemeth         1        -1.72984            1.07774         2.5762       0.1085
everpreg       1         0.87371            0.94901         0.8476       0.3572
aplogt         1        -0.25049            0.17632         2.0184       0.1554
nevlogt        1        -0.08709            0.23039         0.1429       0.7054
agelogt        1         0.04105            0.03701         1.2303       0.2674
racelogt       1         0.30029            0.31818         0.8907       0.3453
marriedlogt    1         0.22417            0.47203         0.2255       0.6349
hemethlogt     1         0.26080            0.21249         1.5064       0.2197
prlogt         1         0.06870            0.19028         0.1304       0.7181
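With an interaction term in the model, the treatment hazard ratio is no longer a single number: it becomes exp(b_AP + b_aplogt * g(t)). The sketch below evaluates it at a few days using the AP and aplogt estimates from this table. The choice g(t) = log(t + 1) is an assumption borrowed from the SAS example in these slides; the slides do not state which function of time produced this table. Because the aplogt term is not statistically significant here (p = 0.1554), the numbers are purely illustrative.

```python
import math

BETA_AP = 1.17916        # AP estimate from the table
BETA_APLOGT = -0.25049   # aplogt estimate from the table

def log_time(t):
    """Assumed time function g(t) = log(t + 1)."""
    return math.log(t + 1)

def treatment_hr(t, g=log_time):
    """Treatment hazard ratio at time t: exp(b_AP + b_aplogt * g(t))."""
    return math.exp(BETA_AP + BETA_APLOGT * g(t))

# Hazard ratio at a few follow-up days (days chosen arbitrarily).
hr_by_day = {day: treatment_hr(day) for day in (30, 90, 180, 360)}
```

This is what "test HR at different time points" amounts to in practice: the estimated ratio starts above 1 and falls below 1 later in follow-up, so a single summary HR would hide the pattern if the interaction were real.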

Likelihood Ratio Test

Likelihood ratio test comparing two nested models, with and without the interaction terms:

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio      9.534      7     0.2165
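The reported p-value can be checked from the statistic and its degrees of freedom (7, one per interaction term). The sketch below computes the chi-square upper-tail probability for integer degrees of freedom with a closed-form series, so it needs only the standard library; the function name is illustrative.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in x.
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

# Likelihood ratio statistic and degrees of freedom from the slide.
p_value = chi2_sf(9.534, 7)
```

The result agrees with the reported 0.2165 to about three decimal places, so the joint test gives no strong evidence against proportional hazards.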

Solutions to violations of the PH assumption
- Leave the appropriate interaction terms with time in the model
    - Problematic if the PH assumption is violated for the treatment effect
    - Need to interpret the interaction with time; test the HR at different time points
- Use an approach not based on PH, e.g., stratified logrank or other models

Solutions to violations of the PH assumption
- Use a stratified Cox model
    - Different baseline hazard for each level of the stratification variable: $h_{01}(t), h_{02}(t), \dots$
    - Same covariate model across strata, i.e., same coefficients and covariates
    - Appropriate if the stratification variable is not an effect of interest (i.e., not the treatment variable) and it does not interact with the effect of interest
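In symbols, the stratified model can be sketched as follows. The notation is assumed here rather than taken from the slides: $s$ indexes strata, $D_s$ is the set of events in stratum $s$, $R_s(t_i)$ is the set of subjects in stratum $s$ still at risk at time $t_i$, and event times are taken to be untied.

```latex
h_{is}(t) = h_{0s}(t)\,\exp(Z_i), \qquad
L(\beta) = \prod_{s} L_s(\beta), \qquad
L_s(\beta) = \prod_{i \in D_s} \frac{\exp(Z_i)}{\sum_{j \in R_s(t_i)} \exp(Z_j)}
```

Each stratum contributes its own risk sets, while the coefficients in $Z_i$ are shared across strata. This is why no coefficient is estimated for the stratification variable itself.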

Example: Stratification by site (NEV)

Analysis of Maximum Likelihood Estimates

Parameter   DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq
AP           1        -0.03055            0.18866         0.0262       0.8714
age          1        -0.05041            0.03978         1.6056       0.2051
race         1         0.52794            0.27983         3.5594       0.0592
married      1         0.21478            0.38237         0.3155       0.5743
hemeth       1        -0.44427            0.20381         4.7514       0.0293
everpreg     1         1.20246            0.21037        32.6715       <.0001

Parameter   Hazard Ratio   95% Hazard Ratio Confidence Limits
AP             0.970            0.670      1.404
age            0.951            0.880      1.028
race           1.695            0.980      2.934
married        1.240            0.586      2.623
hemeth         0.641            0.430      0.956
everpreg       3.328            2.204      5.027

Time-dependent covariates
- A time-dependent covariate is one that changes over time:
    - Interactions with time
    - Internal covariates (e.g., contraceptive use, SBP, white blood cell count)
- The model assumes that the effect of a time-dependent covariate on the hazard at time t depends on the value of the covariate at the same time t
- May use a lag-time effect
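One way to picture the "value at time t" rule, and a lagged variant of it, is a lookup into a subject's covariate history. The sketch below assumes the covariate is a step function that keeps its last recorded value until the next change; the function name, the data layout, and the example history are all hypothetical.

```python
from bisect import bisect_right

def covariate_at(t, change_times, values, lag=0.0):
    """Value of a step-function covariate at time t - lag.

    change_times must be sorted; values[k] applies from change_times[k]
    until the next change. Returns None before the first record.
    """
    idx = bisect_right(change_times, t - lag) - 1
    if idx < 0:
        return None
    return values[idx]

# Hypothetical contraceptive-use history for one subject:
# using at day 0, stops at day 60, resumes at day 200.
change_times = [0, 60, 200]
values = [1, 0, 1]

current = covariate_at(100, change_times, values)          # value at day 100
lagged = covariate_at(100, change_times, values, lag=50)   # value at day 50
```

With no lag, the hazard at day 100 uses the value in effect at day 100 (0, not using). With a 50-day lag it uses the value in effect at day 50 (1, using).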

A note on ties
- The standard partial likelihood assumes that no two events occur at the same time (no ties)
- Methods for handling ties:
    - Exact: time consuming
    - Discrete: time consuming
    - Efron: closer to the exact methods
    - Breslow: computationally most efficient
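The difference between the Breslow and Efron approximations shows up in the denominator used for a set of tied events. The sketch below computes the log partial likelihood contribution of one event time under each method, given the linear predictors of the tied events and of everyone at risk. The function names and numbers are illustrative.

```python
import math

def breslow_term(eta_events, eta_risk):
    """Breslow: every tied event uses the full risk-set sum."""
    d = len(eta_events)
    risk_sum = sum(math.exp(e) for e in eta_risk)
    return sum(eta_events) - d * math.log(risk_sum)

def efron_term(eta_events, eta_risk):
    """Efron: the l-th tied event removes a fraction l/d of the tied
    subjects' weight from the risk-set sum."""
    d = len(eta_events)
    risk_sum = sum(math.exp(e) for e in eta_risk)
    event_sum = sum(math.exp(e) for e in eta_events)
    return sum(eta_events) - sum(
        math.log(risk_sum - (l / d) * event_sum) for l in range(d)
    )

# Hypothetical linear predictors; the risk set includes the tied events.
eta_risk = [0.5, -0.2, 0.1, 0.0]
eta_tied = [0.5, -0.2]

breslow_value = breslow_term(eta_tied, eta_risk)
efron_value = efron_term(eta_tied, eta_risk)
```

With a single event at a time point the two methods coincide; they diverge only when ties are present, and more so as ties make up a larger share of the risk set.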

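Software: SAS

    proc phreg data = sas.survex;
        /* pregevt = 0 marks censored observations */
        model timepr*pregevt(0) = ap age race married hemeth everpreg aplogt
              / ties=efron rl;            /* rl: confidence limits for hazard ratios */
        strata nev;                       /* separate baseline hazard for each site */
        aplogt = ap*log(timepr+1);        /* treatment-by-log(time) interaction */
    run;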

Software: Stata

    stset timepr, failure(pregevt=1)
    stcox ap nev age race married hemeth everpreg, efron schoenfeld(res*)
    stphtest, log
    stphplot, by(ap)

Options for time-dependent covariates: tvc(varlist), texp(exp), e.g. texp(ln(_t))

Software: SPSS

    TIME PROGRAM.
    COMPUTE T_COV_ = LN(T_) * AP.
    COXREG timepr
      /STATUS=pregevt(1)
      /STRATA=NEV
      /METHOD=ENTER T_COV_ AP age race married hemeth everpreg.

Concluding Remarks
- Choose model
- Specify model (no variable selection)
- Check data
- Estimate parameters
- Run model checks
- Improve model (minimal)
- Interpret