R²-type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models




Faculty of Health Sciences R²-type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models Inference & application to prediction of kidney graft failure Paul Blanche joint work with M-C. Fournier & E. Dantan (Nantes, France) July 2015 Slide 1/29

Slide 1/29 P Blanche et al. Context & Motivation Medical researchers hope to improve patient management using earlier diagnoses. Statisticians can help by fitting prediction models. The making of so-called "dynamic" predictions has recently received a lot of attention. In order to be useful for medical practice, predictions should be "accurate". How should we evaluate dynamic prediction accuracy?

Slide 2/29 P Blanche et al. Data & Clinical goal Data: DIVAT cohort data of kidney transplant recipients (subsample, n = 4,119). Clinical goal: dynamic prediction of the risk of kidney graft failure (death or return to dialysis), using repeated measurements of serum creatinine.

Slide 3/29 P Blanche et al. DIVAT data sample (n = 4,119) French cohort; adult recipients; transplanted after 2000; creatinine measured every year; 6 centers (www.divat.fr)

Slide 4/29 P Blanche et al. Statistical challenges discussed How to evaluate and/or compare dynamic predictions? Using the concepts of discrimination and calibration, while accounting for the dynamic setting and the censoring issue.

Slide 5/29 P Blanche et al. Basic idea & concepts for evaluating predictions Basic idea: comparing predictions and observations (simple!) Concepts: Discrimination: a model has high discriminative power if the range of predicted risks is wide and subjects with low (high) predicted risk are less (more) likely to experience the event. Calibration: a model is calibrated if we can expect x subjects out of 100 to experience the event among all subjects that receive a predicted risk of x% (weak definition).

Slide 6/29 P Blanche et al. Dynamic prediction s: landmark time at which predictions are made (varies). t: prediction horizon (fixed). [Figure: an individual creatinine trajectory (µmol/L) over follow-up time, with landmark time s = 4 years, prediction horizon s + t = 9 years, and predicted risk π_i(s, t) = Pr(graft failure in (s, s+t]) = 64%.]

Slide 7/29 P Blanche et al. Right censoring issue D_i(s, t) = 1{event occurs in (s, s + t]}. [Figure: timeline from the landmark time s (when predictions are made) to time s + t (end of the prediction window), showing uncensored and censored subjects.] For a subject i censored within (s, s + t], the status D_i(s, t) is unknown.

Slide 8/29 P Blanche et al. Notations for population parameters Indicator of event in (s, s + t]: D_i(s, t) = 1{s < T_i ≤ s + t}, where T_i is the time-to-event. Dynamic predictions: π_i(s, t) = P_ξ( D_i(s, t) = 1 | T_i > s, Y_i(s), X_i ), where ξ are previously estimated parameters (from independent training data), Y_i(s) are the marker measurements observed before time s, and X_i are baseline covariates.

Slide 9/29 P Blanche et al. Predictive accuracy How close are the predicted risks π_i(s, t) to the true underlying risk P( event occurs in (s, s+t] | information at s )? Prediction Error: PE_π(s, t) = E[ {D(s, t) − π(s, t)}² | T > s ]. The lower, the better. PE decomposes as Bias² + Variance, so it evaluates both calibration and discrimination. It depends on P( event occurs in (s, s+t] | at risk at s ), and is often called the "expected Brier score".

Slide 10/29 P Blanche et al. How does the PE relate to calibration and discrimination? PE_π(s, t) = E[ Var{D(s, t) | H_π(s)} | T > s ] + E[ {E[D(s, t) | H_π(s)] − π(s, t)}² | T > s ], where E[D(s, t) | H_π(s)] is the true underlying risk and H_π(s) = {X_π(s), T > s} denotes the subject-specific history used by π at time s. The first term reflects discrimination and does NOT depend on π(s, t): the more discriminating H_π(s), the smaller Var{D(s, t) | H_π(s)}. The second term reflects calibration and depends on π(s, t): E[D(s, t) | H_π(s)] − π(s, t) ≡ 0 defines "strong" calibration.

Slide 11/29 P Blanche et al. R²_π-type criterion Benchmark PE_0: the best "null" prediction tool, which gives the same (marginal) predicted risk 1 − S(s + t | s) = E[D(s, t) | H_0(s)], with H_0(s) = {T > s}, to all subjects, leads to PE_0(s, t) = Var{D(s, t) | T > s} = S(s + t | s){1 − S(s + t | s)}. Simple idea: R²_π(s, t) = 1 − PE_π(s, t) / PE_0(s, t).

Slide 12/29 P Blanche et al. Why bother? R²_π(s, t) aims to circumvent the difficult interpretation of: the scale on which PE(s, t) is measured, and the trend of PE(s, t) vs s. Because the meaning of the scale on which PE(s, t) is measured changes with s, an increasing/decreasing trend can be due to changes in the quality of the predictions and/or in the at-risk population.

Slide 13/29 P Blanche et al. Changes in the quality of the predictions "Essentially, all models are wrong, but some are useful." (G. Box) The prediction model from which we have obtained the predictions can be more wrong for some s than for others. The calibration term of PE(s, t) changes with s. We can work on it!

Slide 14/29 P Blanche et al. Changes in the at-risk population An example: patients with a cardiovascular history (CV) all die early, so only those without CV remain at risk for late s. Then: the later s, the more homogeneous the at-risk population, and CV is useful for prediction at early s but useless at late s. The available information can be more informative for some s than for others. The discrimination term of PE(s, t) changes with s. This is just how it is; there is nothing we can do (we can only work with the data we have).

Slide 15/29 P Blanche et al. R²_π(s, t) interpretation Always true: a measure of how the prediction tool π(s, t) performs compared to the benchmark null prediction tool, which gives the same predicted risk (the marginal risk) to all subjects. When predictions are (strongly) calibrated, after a little algebra: R²_π(s, t) = Var{π(s, t) | T > s} / Var{D(s, t) | T > s} (explained variation) = Corr²{D(s, t), π(s, t) | T > s} (correlation) = E{π(s, t) | D(s, t) = 1, T > s} − E{π(s, t) | D(s, t) = 0, T > s} (mean risk difference).
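To see why these expressions agree under strong calibration, note that calibration gives E[D | π, T > s] = π, so Var{D | T > s} = E{π(1 − π) | T > s} + Var{π | T > s} and Cov{D, π | T > s} = Var{π | T > s}. The quick simulation below (my illustration, not part of the talk; no censoring, a single landmark) checks numerically that 1 − PE/PE_0, the explained variation, the squared correlation, and the mean risk difference coincide for perfectly calibrated predictions.

```r
## Numerical check (illustration only): for perfectly calibrated predictions,
## i.e. events drawn with success probability equal to the predicted risk pi,
## the four expressions of R2_pi coincide up to Monte Carlo error.
set.seed(2)
n  <- 1e5
pi <- runif(n, 0.05, 0.95)          # predicted risks among subjects at risk at s
D  <- rbinom(n, 1, pi)              # event indicators drawn from those risks
PE  <- mean((D - pi)^2)             # prediction error of pi
PE0 <- mean((D - mean(D))^2)        # prediction error of the marginal ("null") risk
c(one_minus_PE_ratio   = 1 - PE / PE0,
  explained_variation  = var(pi) / var(D),
  squared_correlation  = cor(D, pi)^2,
  mean_risk_difference = mean(pi[D == 1]) - mean(pi[D == 0]))
```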

Slide 16/29 P Blanche et al. Observations & IPCW PE estimator Observations (i.i.d.): { (T̃_i, Δ_i, π_i(·, ·)), i = 1, ..., n }, where T̃_i = T_i ∧ C_i and Δ_i = 1{T_i ≤ C_i}. Indicator of observed event occurrence in (s, s + t]: D̃_i(s, t) = 1{s < T̃_i ≤ s + t, Δ_i = 1}, equal to 1 if the event is observed to occur and 0 if the event did not occur or the observation is censored. Inverse Probability of Censoring Weighting (IPCW) estimator: P̂E_π(s, t) = (1/n) Σ_{i=1}^{n} Ŵ_i(s, t) {D̃_i(s, t) − π_i(s, t)}², and R̂²_π(s, t) = 1 − P̂E_π(s, t) / P̂E_0(s, t).

Slide 17/29 P Blanche et al. Inverse Probability of Censoring Weights Ŵ_i(s, t) = 1{s < T̃_i ≤ s + t} Δ_i / Ĝ(T̃_i | s) + 1{T̃_i > s + t} / Ĝ(s + t | s) + 0 for subjects censored within (s, s + t], with Ĝ(u | s) the Kaplan-Meier estimator of P(C > u | C > s). [Figure: timeline from the landmark time s to time s + t, illustrating which weight applies to uncensored and censored subjects.]
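As a concrete illustration of the estimator above, here is a minimal base-R sketch (my sketch, not the authors' code) that computes P̂E_π(s, t), P̂E_0(s, t) and R̂²_π(s, t) for a single landmark s and horizon t. The data frame and column names (d, Ttilde, delta, pi) are illustrative assumptions; Ĝ(· | s) is obtained as the Kaplan-Meier estimator of the censoring distribution among subjects still at risk at s.

```r
## Minimal sketch of the IPCW estimates of PE_pi(s,t), PE_0(s,t), R2_pi(s,t).
## Assumed input (illustrative names): data frame `d` with columns
##   Ttilde = observed time (min of event and censoring times),
##   delta  = 1 if the event was observed, 0 if censored,
##   pi     = predicted risk of an event in (s, s+t] given being at risk at s.
library(survival)

ipcw_r2 <- function(d, s, t) {
  d <- d[d$Ttilde > s, ]                      # subjects still at risk at landmark s
  ## Kaplan-Meier of the censoring distribution among the at-risk subjects:
  ## estimates G(u | s) = P(C > u | C > s) under independent censoring
  Gfit <- survfit(Surv(Ttilde, 1 - delta) ~ 1, data = d)
  G <- stepfun(Gfit$time, c(1, Gfit$surv))
  ev <- d$Ttilde <= s + t & d$delta == 1      # event observed in (s, s+t]
  af <- d$Ttilde > s + t                      # known event-free at s+t
  W <- numeric(nrow(d))                       # weight 0 if censored within (s, s+t]
  W[ev] <- 1 / G(d$Ttilde[ev])
  W[af] <- 1 / G(s + t)
  D <- as.numeric(ev)                         # observed event status D_i(s,t)
  PE  <- mean(W * (D - d$pi)^2)               # IPCW prediction error of pi
  p0  <- mean(W * D)                          # IPCW estimate of the marginal risk
  PE0 <- mean(W * (D - p0)^2)                 # prediction error of the "null" tool
  c(PE = PE, PE0 = PE0, R2 = 1 - PE / PE0)
}

## Toy usage with simulated data and arbitrary predicted risks (in the talk the
## risks pi_i(s,t) come from a joint model fitted on independent training data).
set.seed(1)
n    <- 500
Tev  <- rexp(n, 0.10)                         # event times
Cens <- rexp(n, 0.05)                         # censoring times
d <- data.frame(Ttilde = pmin(Tev, Cens), delta = as.numeric(Tev <= Cens),
                pi = plogis(rnorm(n)))
ipcw_r2(d, s = 1, t = 5)
```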

Slide 18/29 P Blanche et al. Asymptotic i.i.d. representation Lemma: assume that the censoring time C is independent of (T, η, π(·, ·)), and let θ denote PE_π, R²_π, or a difference in PE or R²_π. Then √n ( θ̂(s, t) − θ(s, t) ) = (1/√n) Σ_{i=1}^{n} IF_θ(T̃_i, Δ_i, π_i(s, t), s, t) + o_p(1), where the IF_θ(T̃_i, Δ_i, π_i(s, t), s, t) are zero-mean i.i.d. terms that are easy to estimate (using Nelson-Aalen & Kaplan-Meier estimators).

Slide 19/29 P Blanche et al. Pointwise confidence interval (fixed s) Asymptotic normality: √n ( θ̂(s, t) − θ(s, t) ) →_D N(0, σ²_{s,t}). 95% confidence interval: θ̂(s, t) ± z_{1−α/2} σ̂_{s,t} / √n, where z_{1−α/2} is the 1 − α/2 quantile of N(0, 1). Variance estimator: σ̂²_{s,t} = (1/n) Σ_{i=1}^{n} { ÎF_θ(T̃_i, Δ_i, π_i(s, t), s, t) }².

Slide 20/29 P Blanche et al. Simultaneous confidence band over s ∈ S: { θ̂(s, t) ± q^(S,t)_{1−α} σ̂_{s,t} / √n, s ∈ S }. Computation of q^(S,t)_{1−α} by a simulation algorithm (wild bootstrap, justified by the conditional multiplier central limit theorem): 1. For b = 1, ..., B (say B = 4000), do: (a) generate {ω^b_1, ..., ω^b_n} as n i.i.d. N(0, 1) variables; (b) using the plug-in estimator ÎF_θ(·), compute Υ^b = sup_{s ∈ S} | (1/√n) Σ_{i=1}^{n} ω^b_i ÎF_θ(T̃_i, Δ_i, π_i(s, t), s, t) / σ̂_{s,t} |. 2. Compute q^(S,t)_{1−α} as the 100(1 − α)th percentile of {Υ^1, ..., Υ^B}.
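Given estimated influence functions, steps 1-2 of this algorithm are short to code. The sketch below is mine, under the assumption that an n × |S| matrix IFhat of plug-in influence-function values ÎF_θ is already available (its computation via Nelson-Aalen and Kaplan-Meier estimators is not shown); it returns the quantile q^(S,t)_{1−α} used for the band.

```r
## Minimal sketch of the wild-bootstrap quantile for the simultaneous band.
## Assumed input: IFhat, an n x length(S) matrix whose (i, k) entry is the
## estimated influence function of theta for subject i at landmark S[k].
wild_band_quantile <- function(IFhat, alpha = 0.05, B = 4000) {
  n <- nrow(IFhat)
  sigma <- sqrt(colMeans(IFhat^2))            # sigma_hat_{s,t}, one per landmark s
  Ups <- replicate(B, {
    omega <- rnorm(n)                         # i.i.d. N(0,1) multipliers
    max(abs(colSums(omega * IFhat) / sqrt(n)) / sigma)
  })
  unname(quantile(Ups, probs = 1 - alpha))    # q_{1-alpha}^{(S,t)}
}

## The band at each s in S is then: theta_hat(s,t) +/- q * sigma_hat_s / sqrt(n).
## Illustrative call with a placeholder matrix standing in for the real IFhat:
## q <- wild_band_quantile(matrix(rnorm(500 * 11), nrow = 500))
```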

Slide 21/29 P Blanche et al. DIVAT sample Population-based study of kidney recipients (n = 4,119). Split the data into training (2/3) and validation (1/3) samples. T: time from 1 year after transplantation to graft failure, defined as death OR return to dialysis. Censoring due to delayed entries (2000-2013) and end of follow-up (2014). Baseline covariates: age, sex, cardiovascular history. Longitudinal biomarker (measured yearly): serum creatinine.

Descriptive statistics & censoring issue s ∈ S = {0, 0.5, ..., 5}, t = 5 years. [Bar chart: for each landmark time s = 0, 1, ..., 5 years, the number of subjects censored in (s, s+5], known to be event-free at s+5, and with an observed failure in (s, s+5].] Slide 22/29 P Blanche et al.

Slide 23/29 P Blanche et al. Joint model Longitudinal: log[Y_i(t_ij)] = (β_0 + b_0i) + β_{0,age} AGE_i + β_{0,sex} SEX_i + (β_1 + b_1i + β_{1,age} AGE_i) t_ij + ε_ij = m_i(t_ij) + ε_ij. Survival (hazard): h_i(t) = h_0(t) exp{ γ_age AGE_i + γ_CV CV_i + α_1 m_i(t) + α_2 dm_i(t)/dt }. (Fitted using the R package JM.)
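The slide notes the model was fitted with the R package JM. Below is a minimal sketch of the general JM workflow on JM's own bundled aids data (the DIVAT variables are not available here), with the hazard depending on the current marker value only; the talk's model additionally includes the marker's slope dm_i(t)/dt, which JM supports through its parameterization and derivForm arguments (not shown).

```r
## Minimal sketch of the JM workflow (general pattern only; not the DIVAT model).
library(JM)   # also attaches nlme and survival

## Longitudinal sub-model: linear mixed model for the biomarker trajectory
lmeFit <- lme(CD4 ~ obstime + obstime:drug, random = ~ obstime | patient, data = aids)
## Survival sub-model (x = TRUE is required by jointModel)
coxFit <- coxph(Surv(Time, death) ~ drug, data = aids.id, x = TRUE)
## Joint model linking the current marker value m_i(t) to the hazard
jointFit <- jointModel(lmeFit, coxFit, timeVar = "obstime",
                       method = "piecewise-PH-aGH")

## Dynamic prediction for one subject: conditional survival probabilities given
## survival up to the last marker measurement, hence pi_i(s,t) = 1 - survival.
nd <- aids[aids$patient == "2", ]             # this subject's longitudinal records
survfitJM(jointFit, newdata = nd, idVar = "patient")
```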

R²_π(s, t) vs s (t = 5 years) [Figure: estimated R²_π(s, t) over landmark times s = 0 to 5 years (y-axis: 0%-30%), with 95% pointwise confidence intervals and a 95% simultaneous confidence band.] Slide 24/29 P Blanche et al.

Comparing R²_π(s, t) vs s for different π(s, t) [Figures: R²_π(s, t) over landmark times s = 0 to 5 years (y-axis: 0%-30%) for three prediction tools: T ~ Age + CV + m(t) + m'(t) (joint model), T ~ Age + CV + Y(t=0), and T ~ Age + CV; followed by the difference in R²_π(s, t) between prediction tools over s (y-axis: −5% to 20%).] Slide 25/29 P Blanche et al.

Calibration plot (example for s = 3 years) [Bar chart: mean predicted 5-year risk vs observed risk (standard error) within ten predicted-risk quantile groups, e.g. 2% predicted vs 8 (4)% observed in the lowest-risk group and 77% predicted vs 74 (8)% observed in the highest-risk group.] Slide 26/29 P Blanche et al.
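A calibration plot of this kind can be produced by grouping the at-risk subjects into predicted-risk quantile groups and contrasting, within each group, the mean predicted risk with a Kaplan-Meier estimate of the observed risk. The sketch below is mine (it reuses the toy data frame d from the earlier IPCW example, ignores standard errors, and assumes enough subjects per group).

```r
## Minimal sketch of a calibration table/plot at landmark s with horizon t:
## quantile groups of predicted risk vs Kaplan-Meier estimates of observed risk.
library(survival)

calibration_groups <- function(d, s, t, K = 10) {
  d <- d[d$Ttilde > s, ]                      # subjects at risk at landmark s
  grp <- cut(d$pi, quantile(d$pi, probs = seq(0, 1, length.out = K + 1)),
             include.lowest = TRUE)
  pred <- tapply(d$pi, grp, mean)             # mean predicted risk per group
  obs <- sapply(split(d, grp), function(g) {  # KM-based observed risk per group
    km <- survfit(Surv(Ttilde, delta) ~ 1, data = g)
    1 - summary(km, times = s + t, extend = TRUE)$surv
  })
  cbind(predicted = pred, observed = obs)
}

cal <- calibration_groups(d, s = 1, t = 5)
barplot(t(cal), beside = TRUE, legend.text = c("Predicted", "Observed"),
        ylab = "5-year risk", xlab = "Predicted-risk quantile groups")
```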

Slide 27/29 P Blanche et al. Area under the ROC(s, t) curve vs s [Figure: AUC_π(s, t) over landmark times s = 0 to 5 years (y-axis: 50%-100%) for three prediction tools: T ~ Age + CV + m(t) + m'(t) (JM), T ~ Age + CV + Y(t=0), and T ~ Age + CV.]
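A dynamic AUC(s, t) can be estimated with the same IPCW weighting as the prediction error: cases are subjects with an observed event in (s, s+t], controls are subjects known to be event-free at s + t, and concordant case-control pairs are weighted by the inverse censoring probabilities. The sketch below is my illustration of that idea on the toy data (not necessarily the estimator used for the figure; dedicated implementations exist, e.g. in the timeROC package).

```r
## Minimal sketch of an IPCW estimate of the dynamic AUC(s,t), reusing the
## toy data frame `d` (Ttilde, delta, pi) and the same weighting as above.
ipcw_auc <- function(d, s, t) {
  d <- d[d$Ttilde > s, ]                      # subjects at risk at landmark s
  Gfit <- survfit(Surv(Ttilde, 1 - delta) ~ 1, data = d)
  G <- stepfun(Gfit$time, c(1, Gfit$surv))    # G(u | s) = P(C > u | C > s)
  case <- d$Ttilde <= s + t & d$delta == 1    # observed event in (s, s+t]
  ctrl <- d$Ttilde > s + t                    # known event-free at s+t
  wcase <- 1 / G(d$Ttilde[case])
  wctrl <- rep(1 / G(s + t), sum(ctrl))
  conc <- outer(d$pi[case], d$pi[ctrl], ">")  # case predicted riskier than control
  sum(outer(wcase, wctrl) * conc) / (sum(wcase) * sum(wctrl))
}

ipcw_auc(d, s = 1, t = 5)   # close to 0.5 here, since the toy risks are random
```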

Summing up The R²-type curve summarizes calibration and discrimination simultaneously and has an understandable trend. Inference is simple and model-free: predictions can be obtained from any model, we do not assume any model to hold, and fair comparisons of different predictions are possible. The method accounts for censoring and for the dynamic setting (the at-risk population changes). Slide 28/29 P Blanche et al.

Discussion The strong calibration assumption allows different interesting interpretations: explained variation, correlation, and mean risk difference. Unfortunately, strong calibration cannot be checked (curse of dimensionality). However, the weak and strong definitions are closely related: strong calibration implies weak calibration; weak calibration can often be seen as a reasonable approximation of strong calibration in practice; and weak calibration can be assessed (calibration plots). Thank you for your attention! Slide 29/29 P Blanche et al.