FINDING SUBGROUPS OF ENHANCED TREATMENT EFFECT. Jeremy M G Taylor Jared Foster University of Michigan Steve Ruberg Eli Lilly


2 1. INTRODUCTION and MOTIVATION 2. PROPOSED METHOD Random Forests Classification and Regression Trees 3. SIMULATED DATA 2

3 SETTING: RANDOMIZED CLINICAL TRIAL Two treatment groups Binary outcome Efficacy Toxicity Lots of baseline covariates Range 5 to 100 Trial is already completed, small or marginal overall effect 3

4 GOAL
Find subgroup of patients with enhanced treatment effect, if it exists
Issues
What do you mean by enhanced?
Desire subgroup to be based on a small number of covariates
What is the strategy for finding the subgroup?
Can you provide honest estimates of how good the subgroup is?

5 SEARCHING FOR SUBGROUPS IN RANDOMIZED CLINICAL TRIAL DATA IS A STATISTICAL NO-NO Data dredging Mining the data Overfitting the data Look hard enough you will find something Sample sizes tend not to be large enough to find subgroups 5

6 Large literature on dangers of subgroup analysis
Pocock et al 2002, Rothwell et al 2005, Lagakos 2009, Brookes et al 2001, 2004, Cui et al 2002, Yusuf et al 1991, Assmann et al 2000
Examples of people finding sign of the zodiac being important (Peto et al 1995)
Message: use extreme caution in interpreting subgroups

7 Consensus opinion: Need a predefined plan for subgroup analysis Interpretation 1. Predefine the subgroups you are going to look at Interpretation 2. Predefine the strategy for searching for subgroups 7

8 A Clinicostatistical Tragedy (Feinstein 1998) Believes there is a patho-physiology reason for existence of categories Believes statistical doctrines have become too dominant Potential tragedy now is what may seem to be good statistics will be bad science 8

9 Need for validation External validation: Ideally find subgroup in one trial, validate in other trials Internal validation: Try to give honest estimate of quality of subgroup using the same dataset 9

10 Notation
$Y$ = outcome, binary
$T$ = treatment group, binary
$X_1, \ldots, X_p$ = baseline covariates
$P(Y = 1 \mid T, X)$
A subgroup ($A$) is a region of the design space
e.g. $A = \{X_1 > 3\}$
e.g. $A = \{X_1 > 3 \text{ and } X_7 < 6\}$
e.g. $A = \{2X_1 + 3X_4 < 2\}$

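11 Artificial data
1000 observations, 15 $X$'s
Generated from a logistic model for $P(Y = 1)$ with terms in the $X$'s, $T$, an $X_2$-by-$X$ product, and $T \cdot I(X \in A)$ [coefficients not preserved]
$A = \{X_1 > 0, X_2 < 0\}$, 25% of observations
Treatment group response rate =
Control group response rate =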

12 [Figure: Scatterplot for controls; axis labels X, X1; legend: Responders, Non responders]

13 [Figure: Scatterplot for treated individuals; axis labels X, X1; legend: Responders, Non responders]

14 [Figure: Response rate by X1 for Treated and Controls; X1 bins (-Inf, -1.5], (-1.5, -0.5], (-0.5, 0.5], (0.5, 1.5], (1.5, Inf]]

15 [Figure: Response rate by X2 for Treated and Controls; X2 bins (-Inf, -1.5], (-1.5, -0.5], (-0.5, 0.5], (0.5, 1.5], (1.5, Inf]]

16 Enhanced treatment effect: no unique definition
Difference in absolute risk: $P(Y = 1 \mid T = 1, X) - P(Y = 1 \mid T = 0, X)$
Relative risk: $P(Y = 1 \mid T = 1, X) / P(Y = 1 \mid T = 0, X)$
Difference in log-odds: $\text{logit}(P(Y = 1 \mid T = 1, X)) - \text{logit}(P(Y = 1 \mid T = 0, X))$
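A minimal sketch of why the choice of definition matters. The three measures on this slide are computed for two covariate profiles; the response probabilities are made-up illustrative values, not numbers from the talk.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def effect_measures(p1, p0):
    """Three candidate treatment-effect measures for one covariate profile.

    p1 = P(Y = 1 | T = 1, X), p0 = P(Y = 1 | T = 0, X).
    """
    return {
        "risk_difference": p1 - p0,
        "relative_risk": p1 / p0,
        "log_odds_difference": logit(p1) - logit(p0),
    }

# Hypothetical profiles (assumed values, for illustration only).
low_risk = effect_measures(p1=0.10, p0=0.05)
high_risk = effect_measures(p1=0.60, p0=0.50)

for name in ("risk_difference", "relative_risk", "log_odds_difference"):
    print(name, round(low_risk[name], 3), round(high_risk[name], 3))
```

With these values the high-risk profile has the larger absolute risk difference, while the low-risk profile has the larger relative risk and log-odds difference, so "enhanced" depends on the scale chosen.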

17 Simple model
Treatment effect
$X_1$ is prognostic
No interaction
$\text{logit}(P(Y = 1 \mid T, X_1)) = 1 + T + X_1$

18 [Figure: Sample response probabilities with no region A advantage; response probability versus X1 for Treated and Controls]

19 Interactions in statistical models
Main Effects Model
$\text{logit}(P(Y = 1 \mid T, X)) = \alpha + \beta T + \gamma_1 X_1 + \gamma_2 X_2$
Note: no interaction on the logit scale may still give an interaction on the absolute risk scale
Main Effects + Interaction
$\text{logit}(P(Y = 1 \mid T, X)) = \alpha + \beta T + \gamma_1 X_1 + \gamma_2 X_2 + \delta T X_1$
$\text{logit}(P(Y = 1 \mid T, X)) = \alpha + \beta T + \gamma_1 X_1 + \gamma_2 X_2 + \delta T I(X \in A)$
Need large sample sizes to find interactions
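The note on this slide can be checked numerically. The sketch below uses a main-effects-only logit model with one covariate and assumed coefficients ($\alpha = -1$, $\beta = 1$, $\gamma = 1$, chosen only for illustration). It shows that the log-odds difference is constant in $X_1$ while the absolute risk difference is not.

```python
import math

def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))

def logit(p):
    return math.log(p / (1.0 - p))

# Assumed coefficients for illustration; there is no T-by-X1 term in the model.
ALPHA, BETA, GAMMA = -1.0, 1.0, 1.0

def response_prob(t, x1):
    return expit(ALPHA + BETA * t + GAMMA * x1)

risk_diffs = {}
log_odds_diffs = {}
for x1 in (-2.0, 0.0, 2.0):
    p1, p0 = response_prob(1, x1), response_prob(0, x1)
    risk_diffs[x1] = p1 - p0
    log_odds_diffs[x1] = logit(p1) - logit(p0)
    print(x1, round(risk_diffs[x1], 3), round(log_odds_diffs[x1], 3))
```

The risk difference peaks where the response probabilities are near 0.5 and shrinks toward the extremes, even though the model contains no interaction term.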

20 Naive method: Forward stepwise logistic regression
Include
Main effects for $T$ and $X$'s
Interactions $X_j X_k$
Interactions $T X_j$
Interactions $T X_j X_k$
Estimate $\hat{P}_{1i} = P(Y_i = 1 \mid T_i = 1, X_i)$ and $\hat{P}_{0i} = P(Y_i = 1 \mid T_i = 0, X_i)$ for each person $i$
New variable $Z_i = \hat{P}_{1i} - \hat{P}_{0i}$ is then created
Subject $i$ is in group $A$ if $Z_i > c$ ($c$ = cutoff, say 0.15)

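21 [Table 1: Logistic Regression Results; columns: Coefficients, Estimate, SE, p-value; rows include main effects for X's and T, an interaction involving X2, and the interactions X1:T and X7:T; numeric entries not preserved]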

22 X-by-T interaction found Region Â estimated as Ẑi >

23 [Figure: Histogram of difference in estimated response probabilities from logit model; horizontal axis $\hat{P}_{1i} - \hat{P}_{0i}$, vertical axis Frequency]

24 [Table 2: Number of subjects in 4 cells (Logit); Â / not Â by Treatment / Control, plus Overall; counts not preserved]

25 [Table 3: Response rate in 4 cells (Logit); Treatment / Control by Â / not Â, plus Overall; rates not preserved]

26 Table 4: How close is Â to A (Logit)
[Cross-tabulation of Â / not Â by A / not A, plus Overall; counts not preserved]
Sensitivity = 0.24
Specificity = 0.80
Positive Predictive Value = 0.27
Negative Predictive Value =

27 Virtual Twins method For each person think about outcome if they got treatment and outcome if they got placebo Two steps Step 1. Use Random Forests (RF) on all the data Step 2. Run output from RF down a regression tree to find region A 27

28 Random Forests: A type of non-parametric regression
A statistical learning algorithm for estimating the function $f(\cdot)$ in the model $P(Y = 1) = f(T, X_1, \ldots, X_p)$
An ensemble method combining 250 trees
Combine many simple trees
Uses bootstrap samples
Uses random subsets of covariates at each split
Combine 250 predictions
A black box
Input: values of $T$ and $X$'s
Output: estimate of $P(Y = 1 \mid T, X)$

29 Step 1. Apply Random Forests to all the data
Covariates: $X$, $T$, $X \cdot I(T = 0)$, $X \cdot I(T = 1)$
Produces black box predictor
For each subject apply predictor twice
Once to $(T = 1, X_{1i}, \ldots, X_{pi})$
Once to $(T = 0, X_{1i}, \ldots, X_{pi})$
Gives $\hat{P}_{1i} = P(Y_i = 1 \mid T_i = 1, X_i)$ and $\hat{P}_{0i} = P(Y_i = 1 \mid T_i = 0, X_i)$
Form $Z_i = \hat{P}_{1i} - \hat{P}_{0i}$
A measure of the treatment effect for subject $i$
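The "apply the predictor twice" idea can be sketched independently of how the predictor is fitted. In the code below, `predict(t, x)` stands for any fitted black box that returns an estimate of $P(Y = 1 \mid T = t, X = x)$. The `toy_predict` function is a hypothetical stand-in with made-up coefficients, not a random forest and not the model from the talk.

```python
import math

def expit(v):
    return 1.0 / (1.0 + math.exp(-v))

def virtual_twin_effects(predict, covariates):
    """Apply a fitted predictor to each subject twice, as treated and as control.

    predict(t, x) -> estimated P(Y = 1 | T = t, X = x)
    Returns a list of (p1_hat, p0_hat, z) with z = p1_hat - p0_hat.
    """
    effects = []
    for x in covariates:
        p1_hat = predict(1, x)   # the subject's "twin" under treatment
        p0_hat = predict(0, x)   # the subject's "twin" under control
        effects.append((p1_hat, p0_hat, p1_hat - p0_hat))
    return effects

def toy_predict(t, x):
    """Hypothetical stand-in for the fitted black box (assumed coefficients)."""
    in_region = x[0] > 0 and x[1] < 0
    return expit(-1.0 + 0.5 * x[0] + 0.1 * t + 1.0 * t * in_region)

subjects = [(1.0, -1.0), (-1.0, 1.0)]
effects = virtual_twin_effects(toy_predict, subjects)
for x, (p1_hat, p0_hat, z) in zip(subjects, effects):
    print(x, round(p1_hat, 3), round(p0_hat, 3), round(z, 3))
```

Passing the predictor as a callable keeps this step separate from the fitting step, so the same code would work for a random forest, a logistic model, or any other predictor.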

30 [Figure: Estimated P_1i vs. Estimated P_0i; axes Hat P_1i, Hat P_0i]

31 [Figure: Estimated P_0i vs. True P_0i; axes Hat P_0i, True P_0i]

32 [Figure: Estimated P_1i vs. True P_1i; axes Hat P_1i, True P_1i]

33 [Figure: Estimated P_1i vs. True P_1i; legend: Treated, Control]

34 [Figure: Estimated P_1i vs. True P_1i for Treated; legend: Responders, Non Responders]

35 [Figure: Histogram of difference in estimated response probabilities; horizontal axis Hat P_1i - Hat P_0i, vertical axis Frequency]

36 Step 2. Regression Trees
Find small number of variables that are most associated with $Z_i$
Estimate regression tree for $Z_i$ with covariates $X_{1i}, \ldots, X_{pi}$
The result: a small number of $X$'s with cutpoints
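To make the tree step concrete, here is a sketch of the first split only. For each covariate it tries every midpoint cut and keeps the one that minimizes the within-node sum of squared errors of $Z_i$. This is the usual least-squares splitting rule, assumed here for illustration. A full regression tree would recurse and prune, and the toy data are invented.

```python
def best_split(xs, zs):
    """Best single cutpoint on one covariate by sum of squared errors.

    Returns (cut, sse), or None if the covariate has a single unique value.
    """
    def sse(vals):
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2.0
        left = [z for x, z in zip(xs, zs) if x < cut]
        right = [z for x, z in zip(xs, zs) if x >= cut]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (cut, total)
    return best

def best_variable(columns, zs):
    """columns: dict of covariate name -> values. Returns (name, cut, sse)."""
    results = []
    for name, xs in columns.items():
        split = best_split(xs, zs)
        if split is not None:
            results.append((name, split[0], split[1]))
    return min(results, key=lambda r: r[2])

# Invented toy data: Z is large only when x1 > 0.
columns = {
    "x1": [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0],
    "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
z = [0.00, 0.02, 0.01, 0.30, 0.28, 0.31]
name, cut, error = best_variable(columns, z)
print(name, cut, round(error, 5))
```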

37 [Figure: Virtual twins tree; splits on x1, x7, x12 and x2; preserved node sizes: n=28, n=233, n=101; split values and remaining node sizes not preserved]

38 Â classification
[Table 5: Number of subjects in 4 cells (VT); Â / not Â by Treatment / Control, plus Overall; counts not preserved]

39 [Figure: Scatterplot for control individuals by estimated A membership; axis labels X, X1; legend: A hat, Not A hat]

40 [Figure: Scatterplot for treated individuals by estimated A membership; axis labels X, X1; legend: A hat, Not A hat]

41 [Figure: Scatterplot for control individuals in A hat; axis labels X, X1; legend: Responders, Non responders]

42 [Figure: Scatterplot for treated individuals in A hat; axis labels X, X1; legend: Responders, Non responders]

43 [Figure: Scatterplot for control individuals not in A hat; axis labels X, X1; legend: Responders, Non responders]

44 [Figure: Scatterplot for treated individuals not in A hat; axis labels X, X1; legend: Responders, Non responders]

45 [Table 6: Response rate in 4 cells (VT); Treatment / Control by Â / not Â, plus Overall; rates not preserved]

46 Properties of subgroup How big is Â? What X s does it depend on Quantify magnitude of enhanced treatment effect If know true A Are we finding the correct X s How close is Â to true A Sensitivity, Specificity Positive Predictive Value, Negative Predictive Value 46

47 Table 7: How close is Â to A (VT)
[Cross-tabulation of Â / not Â by A / not A, plus Overall; counts not preserved]
Sensitivity = 0.68
Specificity = 0.77
Positive Predictive Value = 0.48
Negative Predictive Value =
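When the true region A is known, as in simulated data, the four agreement measures follow directly from the cross-tabulation of Â against A. A small sketch with invented membership indicators (not the data behind the tables):

```python
def subgroup_agreement(in_true_a, in_est_a):
    """Sensitivity, specificity, PPV and NPV of the estimated region vs. the true one."""
    pairs = list(zip(in_true_a, in_est_a))
    tp = sum(1 for a, e in pairs if a and e)          # in A and in A-hat
    fn = sum(1 for a, e in pairs if a and not e)      # in A, missed by A-hat
    fp = sum(1 for a, e in pairs if not a and e)      # not in A, flagged by A-hat
    tn = sum(1 for a, e in pairs if not a and not e)  # outside both

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Invented example: 8 subjects, 3 truly in A, 3 flagged by the estimate.
true_a = [True, True, True, False, False, False, False, False]
est_a = [True, True, False, True, False, False, False, False]
agreement = subgroup_agreement(true_a, est_a)
print(agreement)
```

An empty Â makes the positive predictive value undefined, so the helper returns NaN in that case rather than raising an error.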

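48 Metrics for enhanced treatment effects
$Q(\hat{A}) = \left( P(Y = 1 \mid T = 1, X \in \hat{A}) - P(Y = 1 \mid T = 0, X \in \hat{A}) \right) - \left( P(Y = 1 \mid T = 1) - P(Y = 1 \mid T = 0) \right)$
[Table 8: Response rate in 4 cells (VT); Treatment / Control by Â / not Â, plus Overall; rates not preserved]
$\hat{Q}(\hat{A})_{VT}$ = ( ) - ( ) =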
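A sketch of the resubstitution version of this metric, where each probability is replaced by the observed response rate in the corresponding cell. The eight-subject dataset is invented for illustration.

```python
def q_hat_resub(y, t, in_a):
    """Observed treatment effect inside the region minus the overall effect.

    y: 0/1 outcomes, t: 0/1 treatment indicators, in_a: booleans for region membership.
    """
    def rate(keep):
        vals = [yi for yi, k in zip(y, keep) if k]
        if not vals:
            raise ValueError("empty cell: response rate undefined")
        return sum(vals) / len(vals)

    treated_in_a = [ti == 1 and a for ti, a in zip(t, in_a)]
    control_in_a = [ti == 0 and a for ti, a in zip(t, in_a)]
    effect_in_a = rate(treated_in_a) - rate(control_in_a)
    effect_overall = rate([ti == 1 for ti in t]) - rate([ti == 0 for ti in t])
    return effect_in_a - effect_overall

# Invented data: effect 1.0 inside the region, 0.5 overall.
y = [1, 1, 0, 0, 1, 0, 1, 0]
t = [1, 1, 0, 0, 1, 1, 0, 0]
in_a = [True, True, True, True, False, False, False, False]
q = q_hat_resub(y, t, in_a)
print(q)
```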

49 [Table 9: Response rate in 4 cells (Logit); Treatment / Control by Â / not Â, plus Overall; rates not preserved]
$\hat{Q}(\hat{A})_{Logit}$ = ( ) - ( ) =

50 Notation
$Q(\hat{A})$ = true value of $Q$ for $\hat{A}$
$\hat{Q}(\hat{A})$ = estimate of $Q$ for $\hat{A}$
Want estimates to have low bias and small variability (small SE)

51 Resubstitution estimates
$\hat{Q}(\hat{A})_{VT}$ =
$\hat{Q}(\hat{A})_{Logit}$ =
Almost certainly optimistically biased estimates
Need honest estimate of $Q(\hat{A})$
What would $Q(\hat{A})$ be with this $\hat{A}$ in the next very large trial?

52 Methods of estimating Q(Â) Resubstitution method Simulate new data Cross-validation of ˆP 1i and ˆP 0i. Full Cross-validation 52

53 Simulate new data
Simulate new data that looks like the original data, but is independent
Generate binary $Y_i$ using either $\hat{P}_{1i}$ or $\hat{P}_{0i}$
if $T_i = 1$ then $Y_i \sim \text{Bernoulli}(\hat{P}_{1i})$
if $T_i = 0$ then $Y_i \sim \text{Bernoulli}(\hat{P}_{0i})$
Calculate $\hat{Q}(\hat{A})$ from these new data
Repeat many times and average
$\hat{Q}(\hat{A})_{VT} = 0.095$, $\hat{Q}(\hat{A})_{Logit}$ =
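The procedure on this slide can be sketched as follows. The region Â and the treatment assignments are held fixed, outcomes are redrawn from the estimated probabilities, and the resulting $\hat{Q}$ values are averaged. The probabilities, sample size and number of repetitions below are invented for illustration. In practice $\hat{P}_{1i}$ and $\hat{P}_{0i}$ would come from the fitted random forest or logistic model.

```python
import random

def q_hat(y, t, in_a):
    """Observed effect inside the region minus the overall observed effect."""
    def rate(keep):
        vals = [yi for yi, k in zip(y, keep) if k]
        return sum(vals) / len(vals)
    effect_in_a = (rate([ti == 1 and a for ti, a in zip(t, in_a)])
                   - rate([ti == 0 and a for ti, a in zip(t, in_a)]))
    effect_all = rate([ti == 1 for ti in t]) - rate([ti == 0 for ti in t])
    return effect_in_a - effect_all

def simulated_q(p1, p0, t, in_a, n_sims=200, seed=0):
    """Average Q-hat over datasets simulated from the estimated probabilities."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        # Y_i ~ Bernoulli(p1_i) if treated, Bernoulli(p0_i) if control.
        y = [1 if rng.random() < (p1[i] if t[i] == 1 else p0[i]) else 0
             for i in range(len(t))]
        total += q_hat(y, t, in_a)
    return total / n_sims

# Invented inputs: 400 subjects, half in the region, alternating treatment.
n = 400
in_a = [i < 200 for i in range(n)]
t = [i % 2 for i in range(n)]
p1 = [0.60 if a else 0.35 for a in in_a]
p0 = [0.30] * n
estimate = simulated_q(p1, p0, t, in_a)
print(round(estimate, 3))
```

With these inputs the value implied by the probabilities is $(0.60 - 0.30) - (0.475 - 0.30) = 0.125$, and the simulated average should land close to it.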

54 Cross-validation of $\hat{P}_{1i}$ and $\hat{P}_{0i}$
Same as Simulate New Data, except $\hat{P}_{1i}$ and $\hat{P}_{0i}$ are derived after cross-validation
Take 9/10 of data, run Random Forest (or Logit model with forward selection), predict for left out 1/10
Repeat 10 times
$\hat{Q}(\hat{A})_{VT} = 0.124$, $\hat{Q}(\hat{A})_{Logit}$ =

55 Full Cross-validation
Take 9/10 of data
Find region $\hat{A}_k$
Find $\hat{Q}(\hat{A}_k)$ for left out 1/10
Repeat 10 times
Combine 10 separate $\hat{Q}(\hat{A}_k)$ to final $\hat{Q}(\hat{A})$
$\hat{Q}(\hat{A})_{VT} = 0.089$, $\hat{Q}(\hat{A})_{Logit}$ =
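A sketch of the full cross-validation loop under two loudly stated assumptions. First, the slide does not say how the ten fold estimates are combined, so a plain average is used here. Second, `toy_find_region` is a hypothetical stand-in for the whole region-finding procedure (random forest plus tree): it only chooses between two fixed candidate rules. The data are simulated with invented parameters.

```python
import random

def q_hat(y, t, in_a):
    """Effect inside the region minus overall effect; None if a cell is empty."""
    def rate(keep):
        vals = [yi for yi, k in zip(y, keep) if k]
        return sum(vals) / len(vals) if vals else None
    r_ta = rate([ti == 1 and a for ti, a in zip(t, in_a)])
    r_ca = rate([ti == 0 and a for ti, a in zip(t, in_a)])
    r_t = rate([ti == 1 for ti in t])
    r_c = rate([ti == 0 for ti in t])
    if None in (r_ta, r_ca, r_t, r_c):
        return None
    return (r_ta - r_ca) - (r_t - r_c)

def full_cv_q(rows, find_region, k=10, seed=0):
    """rows: list of (y, t, x). find_region(train_rows) returns rule(x) -> bool."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    fold_qs = []
    for f in range(k):
        held_out = set(idx[f::k])
        train = [rows[i] for i in idx if i not in held_out]
        test = [rows[i] for i in held_out]
        rule = find_region(train)  # region is found without the held-out fold
        q = q_hat([r[0] for r in test], [r[1] for r in test],
                  [rule(r[2]) for r in test])
        if q is not None:
            fold_qs.append(q)
    # Assumption: combine folds by simple averaging.
    return sum(fold_qs) / len(fold_qs) if fold_qs else None

def toy_find_region(train):
    """Hypothetical region search: pick the better of two fixed candidate rules."""
    candidates = [lambda x: x[0] > 0, lambda x: x[0] <= 0]
    def score(rule):
        q = q_hat([r[0] for r in train], [r[1] for r in train],
                  [rule(r[2]) for r in train])
        return float("-inf") if q is None else q
    return max(candidates, key=score)

# Invented data: treatment helps only when x[0] > 0.
rng = random.Random(1)
rows = []
for _ in range(2000):
    x = (rng.gauss(0, 1),)
    t = rng.randint(0, 1)
    p = 0.3 + 0.3 * t * (x[0] > 0)
    rows.append((1 if rng.random() < p else 0, t, x))
cv_q = full_cv_q(rows, toy_find_region)
print(round(cv_q, 3))
```

The key property is that each held-out fold plays no part in choosing its region, which is what makes the estimate honest compared with resubstitution.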

56 [Table 10: Summary of estimates of Q(Â); columns: Estimation Method, Virtual Twins, Logit; rows: Resubstitution, Simulate new data, Cross-validation of $\hat{P}_{1i}$ and $\hat{P}_{0i}$, Full Cross-validation, True value of Q(Â), True value of Q(A); numeric entries not preserved]

57 Sampling variability of $\hat{Q}(\hat{A})$
Also desirable to attach standard errors to $\hat{Q}(\hat{A})$
We propose the following:
1. Simulate many datasets using $\hat{P}_{1i}$ and $\hat{P}_{0i}$ from random forest/logistic method
2. For each data set, estimate a new $\hat{A}$ and calculate $\hat{Q}(\hat{A})$
3. Estimated standard error equals the standard deviation of these $\hat{Q}(\hat{A})$

58 [Table 11: Standard errors for $\hat{Q}(\hat{A})$; columns: Method, then Est. and SE for Virtual Twins and for Logit; rows: Resubstitution, Sim. new data; numeric entries not preserved]

59 Null distribution of $\hat{Q}(\hat{A})$ and p-values
Null distribution: no region of enhanced treatment effect
Null should allow possibility of main effects for $X$ and $T$, but with no interaction
We propose the following:
1. Define $V_i = \text{logit}(\hat{P}_{1i}) - \text{logit}(\hat{P}_{0i})$ and $\bar{V} = \frac{1}{n} \sum_i V_i$
2. Define
$\hat{P}^N_{1i} = \text{expit}\left( \frac{\text{logit}(\hat{P}_{1i}) + \text{logit}(\hat{P}_{0i})}{2} + \frac{\bar{V}}{2} \right)$
$\hat{P}^N_{0i} = \text{expit}\left( \frac{\text{logit}(\hat{P}_{1i}) + \text{logit}(\hat{P}_{0i})}{2} - \frac{\bar{V}}{2} \right)$
3. Simulate many datasets using $\hat{P}^N_{1i}$ and $\hat{P}^N_{0i}$, and

60 4. For each data set, first estimate $\hat{A}$ and calculate $\hat{Q}(\hat{A})$
5. P-value is the fraction of these $\hat{Q}(\hat{A})$ that are larger than the observed $\hat{Q}(\hat{A})$
6. If original $\hat{A}$ is empty, take p-value to be
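Steps 1, 2 and 5 are simple enough to sketch directly. The code builds the null probabilities so that every subject has the same treatment effect on the logit scale (the average $\bar{V}$), then computes a p-value as the fraction of null $\hat{Q}$ values exceeding the observed one. The input probabilities and the list of null $\hat{Q}$ values are invented. Steps 3 and 4 (simulating datasets and re-estimating Â) are not shown.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(v):
    return 1.0 / (1.0 + math.exp(-v))

def null_probabilities(p1_hat, p0_hat):
    """Null versions of the estimated probabilities with a constant logit-scale effect."""
    v = [logit(a) - logit(b) for a, b in zip(p1_hat, p0_hat)]
    v_bar = sum(v) / len(v)
    p1_null, p0_null = [], []
    for a, b in zip(p1_hat, p0_hat):
        midpoint = (logit(a) + logit(b)) / 2.0   # keeps each subject's prognostic level
        p1_null.append(expit(midpoint + v_bar / 2.0))
        p0_null.append(expit(midpoint - v_bar / 2.0))
    return p1_null, p0_null, v_bar

def p_value(null_qs, observed_q):
    """Fraction of null Q-hat values strictly larger than the observed Q-hat."""
    return sum(1 for q in null_qs if q > observed_q) / len(null_qs)

# Invented estimated probabilities for three subjects.
p1_hat = [0.60, 0.35, 0.50]
p0_hat = [0.30, 0.30, 0.45]
p1_null, p0_null, v_bar = null_probabilities(p1_hat, p0_hat)
null_effects = [logit(a) - logit(b) for a, b in zip(p1_null, p0_null)]
print([round(e, 4) for e in null_effects], round(v_bar, 4))

# Invented null Q-hat values, standing in for the output of steps 3 and 4.
print(p_value([0.01, 0.05, 0.20, 0.30], observed_q=0.10))
```

Under this construction the main effects of $X$ and $T$ survive, since subjects still differ in their midpoint and the average treatment effect is kept, but no subject has a larger logit-scale effect than any other.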

61 [Table 12: P-values for $\hat{Q}(\hat{A})$; columns: Method, Virtual Twins, Logit; rows: Resubstitution, Simulate new data; numeric entries not preserved]

62 Simulation study
Generate multiple datasets from
$\text{logit}(P(Y = 1 \mid T, X)) = \alpha + \beta T + \gamma h(X) + \theta T I(X \in A)$
$A$ is a known region in the design space defined by a small number of $X$'s
Possible factors to consider: sample size, number of $X$'s, correlation between $X$'s, size of true $A$, number of $X$'s that determine true $A$, strength of enhanced treatment effect
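A sketch of a data generator for a model of this form. The region $A = \{X_1 > 0, X_2 < 0\}$ and $\theta = 0.75$ appear elsewhere in the slides. Everything else is an assumption made for illustration: independent standard normal covariates, 1:1 randomization, $h(X) = X_1 + X_2$, and the values of $\alpha$, $\beta$ and $\gamma$.

```python
import math
import random

def expit(v):
    return 1.0 / (1.0 + math.exp(-v))

def in_region_a(x):
    """Region A = {X1 > 0, X2 < 0}; x is 0-indexed, so X1 is x[0]."""
    return x[0] > 0 and x[1] < 0

def simulate_trial(n, p, alpha=-1.0, beta=0.1, gamma=0.5, theta=0.75, seed=0):
    """Draw n subjects with p covariates from the logit model.

    alpha, beta, gamma and the form of h(X) are assumed values, not from the talk.
    Returns a list of (y, t, x) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]   # assumed independent N(0, 1)
        t = rng.randint(0, 1)                         # assumed 1:1 randomization
        h = x[0] + x[1]                               # hypothetical h(X)
        eta = alpha + beta * t + gamma * h + theta * t * in_region_a(x)
        y = 1 if rng.random() < expit(eta) else 0
        rows.append((y, t, x))
    return rows

trial = simulate_trial(n=4000, p=15)
share_in_a = sum(in_region_a(x) for _, _, x in trial) / len(trial)
print(len(trial), round(share_in_a, 3))
```

With independent symmetric covariates the region covers about a quarter of subjects, matching the "25% of observations" figure given for the artificial data example.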

63 Properties of subgroup Simulations (know true A) Are we finding the correct X s How big is Â? How close is Â to true A Sensitivity, Specificity Positive Predictive Value, Negative Predictive Value Accuracy of estimates of Q(Â) 63

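64 Used model: a logistic model for $P(Y = 1)$ with an intercept, main effects for $X$'s and $T$ with coefficients of magnitude 0.5 where preserved, the interaction $0.5 X_2 X_7$, and $\theta T I\{X \in A\}$ [remaining subscripts and signs not preserved]
100 datasets generated
For each, n =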

65 Table 13: Selected X's, θ = 0.75, A = {X1 > 0, X2 < 0}
Columns: Logit, Virtual Twins (entries listed in that order where both are preserved)
Mean (# Unique X's):
SD (# Unique X's):
Pct. Found X1 (int):
Pct. Found X2 (int):
Pct. Found X7 (main):
Pct. Found X3 (null): 0, 13
Pct. Found X1 at top of tree: 71
Pct. Found X1 top 2 of tree: 62
Pct. Found no int/tree:

66 Table 14: Compare A to Â, θ = 0.75, A = {X1 > 0, X2 < 0}
Columns: Logit, Virtual Twins
percent Â empty: 39, 7
size of Â (median):
Sensitivity:
Specificity:
PPV:
NPV:
AUC:

67 [Table 15: Q Estimates, θ = 0.75, A = {X1 > 0, X2 < 0}; columns: Mean and SD for Logit and for Virtual Twins; rows: Q(A), Q(Â), and $\hat{Q}(\hat{A})$ by Resub, SimNewDat, Cr.Val, FullCr.Val; numeric entries not preserved]

68 Table 16: Selected X's, Null case
Columns: Logit, Virtual Twins
Mean (# Unique X's):
SD (# Unique X's):
Pct. Found X1 (int): 6, 72
Pct. Found X2 (int): 6, 62
Pct. Found X7 (main): 2, 70
Pct. Found X3 (null): 0, 9
Pct. Found no tree/int:

69 [Table 17: Properties of Â, Null case; columns: Logit, Virtual Twins; rows: percent Â empty, Specificity; numeric entries not preserved]

70 [Table 18: Q Estimates, Null case; columns: Mean and SD for Logit and for Virtual Twins; rows: Q(A), Q(Â), and $\hat{Q}(\hat{A})$ by Resub, SimNewDat, Cr.Val, FullCr.Val; numeric entries not preserved]

71 Lots of possibilities for adaptation and extension Replace Random Forest with other predictor Generalize to high dimensional data (genomic/genetic) Build in to the study design Vary thresholds to make smaller or larger Â Use different metrics for Q(A) Use different definitions of Z i 71