A simple to implement algorithm for natural direct and indirect effects in survival studies with a repeatedly measured mediator
Susanne Strohmaier (1), Nicolai Rosenkranz (2), Jørn Wetterslev (2) and Theis Lange (3)
(1) University of Oslo, (2) Copenhagen University Hospital, (3) University of Copenhagen
ISCB Utrecht, August 2015
Outline
- Background and motivation
  - The 6S study
- Generalisation of natural direct and indirect effects
  - Data structure and notation
  - Objectives and encountered challenges
- Estimation
  - General idea
  - Proposed algorithm
- Application results
- Discussion
- References
- Additional Material
Background and motivation - The 6S study
Motivating example - The 6S trial
6S: Scandinavian Starch for Severe Sepsis/Septic Shock trial
Background: Intravenous fluids are the mainstay of treatment for patients with hypovolemia due to severe sepsis, given to obtain fast circulatory stabilisation.
Commonly applied interventions:
- Crystalloids, including saline and dextrose (Ringer's solution)
- Colloids, containing larger molecules such as starch or gelatine
Preferred choice in Scandinavian intensive care units (ICUs): hydroxyethyl starch (HES) 130/0.42 (previously high molecular weight HES, which was reported to cause acute kidney failure and bleeding).
However, HES 130/0.42 is largely unstudied in patients with severe sepsis: efficacy data are lacking and there are concerns about safety.
Background and motivation - The 6S study
6S in brief
Study population: Patients with severe sepsis admitted to an intensive care unit (ICU) who needed fluid for circulatory stabilisation.
Randomisation to:
- Hydroxyethyl starch (HES 130/0.42)
- Ringer's solution
Primary outcome: Death or end-stage kidney failure at 90 days after randomisation.
Results reported in NEJM (Perner et al. (2012)): At 90 days after randomisation, 201 of 398 patients assigned to HES had died, as compared with 172 of 400 patients assigned to Ringer's acetate (RR 1.17; 95% confidence interval, 1.01 to 1.36; p=0.03).
Speculated mechanism: The effect of HES vs. Ringer's solution is mediated through acute kidney impairment, i.e. starch particles cause acute kidney injury.
Background and motivation - The 6S study
Particular data collection
[Figure: Assumed causal structure, with treatment A, baseline confounders L, repeated mediator measurements M_1, M_2, ..., M_τ and event process N_1, N_2, ..., N_τ]
Generalisation of natural direct and indirect effects - Data structure and notation
Assumed structure
[Figure: Local independence graph with nodes A, L, M, N/dN/T and C]
Some notation
- $A$: dichotomous treatment, values $\{0, 1\}$
- $L$: a set of baseline confounders
- $T = \min(\tilde T, U)$: minimum of event time and censoring time
- $\Delta = I(\tilde T \le U)$: censoring indicator
- $M$: mediator process
- $M_t$: mediator measured at time $t$
- $\bar M_t$: mediator history up to time $t$
- $N_t = I(T \le t, \Delta = 1)$: counting process for the event of interest
- $C_t$: censoring process
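As a small illustration of the notation above, the following Python sketch builds the observed time, the censoring indicator and the counting process for one subject. The function names are hypothetical, and the reading of the garbled symbols as T = min(event time, censoring time) with indicator I(event time ≤ censoring time) is an assumption.

```python
def observe(event_time, censoring_time):
    """Return the observed time T = min(event, censoring) and the indicator Delta."""
    observed_time = min(event_time, censoring_time)
    delta = 1 if event_time <= censoring_time else 0  # 1 = event observed
    return observed_time, delta


def counting_process(observed_time, delta, t):
    """N_t = I(T <= t, Delta = 1): jumps to 1 once an observed event has occurred."""
    return 1 if (observed_time <= t and delta == 1) else 0


# Toy subject: event on day 12, administrative censoring on day 90.
T, delta = observe(event_time=12.0, censoring_time=90.0)
path = [counting_process(T, delta, t) for t in (5, 12, 30)]
```

For a subject censored before the event, `delta` is 0 and the counting process stays at 0 for all t.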
Generalisation of natural direct and indirect effects - Objectives and encountered challenges
Objectives
- Estimation of average natural direct and indirect effects
- Questions we could consider:
  - What would the effect of treatment on time-to-event be if treatment had had no effect on the mediator?
  - How much of the treatment effect is attributable to the treatment-induced change in the mediator?
- A simple to implement estimation algorithm based on direct parametrisation of those effects
Generalisation of natural direct and indirect effects - Objectives and encountered challenges
Effect definitions
Average natural direct effect
$$E[\tilde T_{a \bar M_{a^*}} - \tilde T_{a^* \bar M_{a^*}} \mid L],$$
comparing the survival times (within levels of baseline covariates $L$) had the treatment been changed from $a^*$ to $a$, while the mediator history had followed the path it would naturally have followed under the reference treatment condition $A = a^*$.
Average natural indirect effect
$$E[\tilde T_{a \bar M_{a}} - \tilde T_{a \bar M_{a^*}} \mid L],$$
which compares the effect of the mediator histories $\bar M_a$ and $\bar M_{a^*}$ on the outcome when the treatment is set to $A = a$.
Effect decomposition on the mean survival scale
$$E[\tilde T_a - \tilde T_{a^*} \mid L] = E[\tilde T_{a \bar M_a} - \tilde T_{a^* \bar M_{a^*}} \mid L] = E[\tilde T_{a \bar M_{a^*}} - \tilde T_{a^* \bar M_{a^*}} \mid L] + E[\tilde T_{a \bar M_a} - \tilde T_{a \bar M_{a^*}} \mid L].$$
Generalisation of natural direct and indirect effects - Objectives and encountered challenges
Encountered challenges
- Counterfactual survival times $\tilde T_a$, mediator trajectories $\bar M_a$ and the respective nested counterfactuals $\tilde T_{a \bar M_{a^*}}$ have only been defined for mediators measured at one single time point.
- Consider under which conditions the generalised effects are well-defined quantities (Zheng and van der Laan (2012)).
- Extend the respective no-unmeasured-confounder assumptions and the sequential ignorability assumption to the process setting.
Estimation - General idea
Directly parameterise the natural direct and indirect effects.
Extension of Lange et al. (2012), "A simple unified approach for estimating natural direct and indirect effects", to the current setting.
For GLMs and a mediator measured at a single point in time:
$$g(E[Y_{a M_{a^*}}]) = c_0 + c_1 a + c_2 a^* + c_3 a a^*$$
The method builds on ideas of Hong (2010), but has been generalised to various types of outcomes and mediators.
Estimate the effects through analysis of an expanded and reweighted data set (the weights are ratios of mediator probabilities under different treatment regimes).
Estimation - General idea
Considering survival outcomes
Assuming independent censoring, Cox and Aalen models can be specified for the hazard function corresponding to the counterfactual survival time.
Cox natural effects model
$$\lambda_0(t) \exp\{c_0 + c_1 a + c_2 a^* + c_3 a a^* + c_4 L\}$$
Aalen natural effects model
$$\gamma_0(t) + c_1 a + c_2 a^* + c_3 a a^* + c_4 L$$
Natural effects models for the counterfactual variable $\tilde T_{a \bar M_{a^*}}$ can be expressed in the same fashion.
Effect decomposition can also be done on the hazard or hazard ratio scale (VanderWeele (2011)).
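To make the hazard ratio decomposition concrete, the following worked equations read the effects off the Cox natural effects model for a binary treatment. Taking $a^* = 0$ as the reference level for the direct effect and $a = 1$ as the fixed level for the indirect effect is a convention assumed here, not something the slides specify.

$$HR_{\text{direct}} = \frac{\lambda(t \mid a = 1, a^* = 0, L)}{\lambda(t \mid a = 0, a^* = 0, L)} = \exp(c_1)$$

$$HR_{\text{indirect}} = \frac{\lambda(t \mid a = 1, a^* = 1, L)}{\lambda(t \mid a = 1, a^* = 0, L)} = \exp(c_2 + c_3)$$

$$HR_{\text{total}} = \frac{\lambda(t \mid a = 1, a^* = 1, L)}{\lambda(t \mid a = 0, a^* = 0, L)} = \exp(c_1 + c_2 + c_3) = HR_{\text{direct}} \times HR_{\text{indirect}}$$

The baseline hazard and the terms that do not involve $a$ or $a^*$ cancel in each ratio, so on this scale the total effect is the product of the direct and indirect effects.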
Estimation - Proposed algorithm
Proposed estimation algorithm (1/3)
1. Organise your data in wide format. Censored or dead subjects will have missing values for the mediator at the times where it was not observed.
2. Using the original data set, estimate models for the mediator at each measurement time $t$, conditional on the assigned treatment $A$, the mediator history and the baseline confounders $L$.
3. Construct a new data set by repeating each observation twice and including an additional auxiliary variable $A^*$, which equals $A$ for the first replication and $1 - A$ for the second replication. Further, add an indicator identifying which rows originate from the same subject.
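Step 3 can be sketched in a few lines of Python. This is a minimal illustration on a list of dictionaries; the column names (`A`, `A_star`, `id`, `M1`, `M2`, `time`, `event`) and the toy rows are assumptions made for the example, not the 6S data layout.

```python
def expand_dataset(rows, treatment_key="A"):
    """Repeat each wide-format row twice, adding A_star and a subject id.

    A_star equals A in the first copy and 1 - A in the second copy.
    """
    expanded = []
    for subject_id, row in enumerate(rows):
        a = row[treatment_key]
        for a_star in (a, 1 - a):
            new_row = dict(row)          # copy, so the original data stay untouched
            new_row["id"] = subject_id   # links the two copies of one subject
            new_row["A_star"] = a_star
            expanded.append(new_row)
    return expanded


# Toy wide-format data: None marks a mediator that was not observed.
rows = [
    {"A": 1, "L": 0.3, "M1": 0, "M2": 1, "time": 12.0, "event": 1},
    {"A": 0, "L": -1.2, "M1": 2, "M2": None, "time": 1.5, "event": 1},
]
expanded = expand_dataset(rows)
```

The mediator models from step 2 are fitted on the original `rows`, not on `expanded`; the expanded data are only used for the weights and the outcome model.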
Estimation - Proposed algorithm
Proposed estimation algorithm (2/3)
4. Apply the predict function twice to predict from the models in step 2, once based on $A$ and once based on $A^*$. Dividing the predicted values gives the local weights at time $s$:
$$W_j^s = \frac{P(M_s = m_j^s \mid \bar M_{s-1} = \bar m_j^{s-1}, A = a_j^*, L = l_j)}{P(M_s = m_j^s \mid \bar M_{s-1} = \bar m_j^{s-1}, A = a_j, L = l_j)} \quad (1)$$
5. Compute the cumulative weights by multiplying the local weights obtained in step 4 across the relevant rows:
$$W_j = \prod_{s=1}^{T_j} W_j^s \quad (2)$$
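Steps 4 and 5 amount to a ratio of predicted mediator probabilities, multiplied over the observed measurement times. The sketch below assumes a hypothetical callable `mediator_prob(s, m, history, a, l)` standing in for the predict step of the fitted mediator models; the toy predictor (a binary mediator that ignores history and covariates) is purely illustrative.

```python
from math import prod


def cumulative_weight(mediators, a, a_star, l, mediator_prob):
    """Product over observed times of P(M_s | history, A = a_star) / P(M_s | history, A = a)."""
    history = []
    local_weights = []
    for s, m in enumerate(mediators, start=1):
        if m is None:  # mediator not observed: subject already censored or dead
            break
        numerator = mediator_prob(s, m, tuple(history), a_star, l)
        denominator = mediator_prob(s, m, tuple(history), a, l)
        local_weights.append(numerator / denominator)
        history.append(m)
    return prod(local_weights)


def toy_mediator_prob(s, m, history, a, l):
    """Hypothetical predictor: P(M = 1) is 0.6 under treatment and 0.3 under control."""
    p_one = 0.6 if a == 1 else 0.3
    return p_one if m == 1 else 1.0 - p_one


w_switched = cumulative_weight([1, 0], a=1, a_star=0, l=None, mediator_prob=toy_mediator_prob)
w_same = cumulative_weight([1, 0], a=1, a_star=1, l=None, mediator_prob=toy_mediator_prob)
```

Rows of the expanded data set with `A_star` equal to `A` always receive weight 1, so only the switched copies are reweighted.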
Estimation - Proposed algorithm
Proposed estimation algorithm (3/3)
6. Fit a suitable outcome model to the expanded data set, e.g. a Cox model including $A$ and $A^*$, possible interactions and baseline covariates. The model should be weighted by the weights from step 5.
7. Confidence intervals can be obtained using the bootstrap.
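Step 7 could be sketched as a percentile bootstrap over subjects. The slides only say "bootstrap", so the percentile interval is an assumption, as is resampling the original subjects and rerunning steps 2 to 6 inside `estimator` so that the uncertainty in the weights is carried through. The function names and the toy estimator (a plain mean) are hypothetical.

```python
import random


def bootstrap_ci(subjects, estimator, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling whole subjects with replacement."""
    rng = random.Random(seed)
    n = len(subjects)
    estimates = []
    for _ in range(n_boot):
        resample = [subjects[rng.randrange(n)] for _ in range(n)]
        # In the mediation setting, estimator would refit the mediator models,
        # rebuild the expanded data and weights, and refit the outcome model.
        estimates.append(estimator(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def toy_estimator(sample):
    """Stand-in for the full pipeline: here simply the sample mean."""
    return sum(sample) / len(sample)


data = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7, 0.3, 0.6]
ci = bootstrap_ci(data, toy_estimator)
```

Resampling subjects, not rows of the expanded data set, keeps the two copies of each subject together.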
Estimation - Proposed algorithm
Some intuition...
- The resulting regression coefficient associated with $A$ will capture the natural direct effect, while the coefficient of $A^*$ will capture the natural indirect effect.
- The derived weights are a tool to distinguish between direct and indirect effects: they give more weight to observations whose observed mediator trajectory would have been more likely to occur under a treatment level different from the one actually received.
- For a constructive illustration see Hong (2010).
Application results
Back to the original medical problem: a quick recapitulation
- Patient selection: 795 patients admitted to intensive care units (396 in the HES group vs. 399 in the Ringer's solution group)
- Mediator: KDIGO (Kidney Disease: Improving Global Outcome) score, taking values {0, 1, 2, 3}, measured daily within the first 5 days of follow-up
- Mediator model: Multinomial logistic regression
- Outcome: Time to death within 90 days after randomisation
- Outcome model: Cox regression
Application results
Table: Direct effect, indirect effect through KDIGO and total effect, with 95% confidence intervals (CI), comparing HES to Ringer's solution.

                  HR (95% CI)
Direct effect     1.16 (0.95-1.40)
Indirect effect   1.05 (0.98-1.12)
Total effect      1.21 (1.01-1.47)

- Neither the direct nor the indirect effect is significant.
- However, the magnitude of the direct effect indicates that there could be another important pathway through which the treatment brings about the outcome.
Potential pathways
- Coagulopathy (bleeding disorder)
- Patients generally have a low immune response; additionally adding foreign bodies could cause their immune system to crash.
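As a rough consistency check on the table, and assuming the fitted Cox natural effects model contained no interaction between $A$ and $A^*$ (the slides do not state which terms were included), the total hazard ratio should be close to the product of the direct and indirect hazard ratios:

$$HR_{\text{total}} \approx HR_{\text{direct}} \times HR_{\text{indirect}} = 1.16 \times 1.05 \approx 1.22$$

This agrees with the reported total effect of 1.21 up to rounding of the two inputs.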
Discussion
- Easy to implement in standard software (predict functionality, weighted modelling)
- Generally applicable to various mediator types, although some caution is needed with continuous mediators (unstable weights)
- Can handle treatment-by-mediator interactions
- The effect definitions involve critical assumptions
References
Hong, G. (2010). Ratio of mediator probability weighting for estimating natural direct and indirect effects. Proceedings of the American Statistical Association, Biometrics Section, pages 2401-2415.
Lange, T., Vansteelandt, S., and Bekaert, M. (2012). A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology, 176(3):190-195.
Pearl, J. (2001). Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, pages 411-420.
Perner, A., Haase, N., Guttormsen, A., Tenhunen, J., Klemenzson, G., Åneman, A., Madsen, K., Møller, M., Elkjær, J., Poulsen, L., Bendtsen, A., Winding, R., Steensen, M., Berezowicz, P., Søe-Jensen, P., Bestle, M., Strand, K., Wiis, J., White, J., Thornberg, K., Quist, L., Nielsen, J., Andersen, L., Holst, L., Thormar, K., Kjældgaard, A., Fabritius, M., Mondrup, F., Pott, F., Møller, T., Winkel, P., Wetterslev, J., the 6S Trial Group, and the Scandinavian Critical Care Trials Group (2012). Hydroxyethyl starch 130/0.42 versus Ringer's acetate in severe sepsis. New England Journal of Medicine, 367:124-134.
VanderWeele, T. (2011). Causal mediation analysis with survival data. Epidemiology, 22(4):582-585.
VanderWeele, T. J. and Vansteelandt, S. (2009). Conceptual issues concerning mediation, interventions and composition. Stat Interface, pages 457-468.
Zheng, W. and van der Laan, M. J. (2012). Causal mediation in a survival setting with time-dependent mediators. Berkeley Division of Biostatistics Working Paper Series. Working paper 295.
Additional Material
(Nested) counterfactual quantities
Assuming no censoring:
- Counterfactual survival time $\tilde T_{a \bar m}$: the outcome one would, possibly contrary to fact, observe had the treatment $A$ been set to the value $a$ and the entire mediator history $\bar M$ to $\bar m$.
- Counterfactual mediator trajectory $\bar M_a$: the mediator history that one would, possibly contrary to fact, observe had treatment $A$ been set to the value $a$.
- Nested counterfactual $\tilde T_{a \bar M_{a^*}}$: the outcome one would have observed if $A$ had been set to $a$ and $\bar M$ had been set to the values it would have taken if $A$ had been set to $a^*$.
Additional Material
Assumptions... to relate the counterfactual quantities to the observed data
[Figure: Augmented DAG with baseline confounders L, treatment nodes A and A*, mediator measurements M_1, M_2, ..., M_τ and event process N_1, N_2, ..., N_τ]
- Treatment is randomised.
- No unmeasured confounding between treatment and the outcome process, treatment and the mediator process, and the mediator and outcome processes.
Additional Material
Non-parametric structural equation models
Current mediator value $M_t$:
$$M^t_a = f_M(\bar M^{t-1}_a, A = a, L = l, \varepsilon^t_a),$$
with $\varepsilon^t_a$ denoting an error term.
Caution: This contains the assumption that all time-dependent predictors of both the mediator and survival status are contained in the mediator process (critical discussion in Zheng and van der Laan (2012)).
Counterfactual event process:
$$N^t_{a a^*} = f_N(\bar M^t_{a^*}, N^{t-1}_{a a^*}, A = a, L = l, \gamma^t_a),$$
with $\gamma^t_a$ again denoting an error term.
Using those equations, the assumptions can be phrased in terms of the error terms.
Additional Material
Assumptions
Consistency: As stated in the main text, consistency assumes that the counterfactual outcome $\tilde T_{a \bar m}$ equals the observed outcome $T$ with probability 1 among subjects with $A = a$ and $\bar M = \bar m$, and that the counterfactual mediator trajectory $\bar M_a$ equals the observed mediator trajectory $\bar m$ with probability 1 among subjects with $A = a$.
Positivity: Further, to ensure that the equations above describe non-degenerate probability distributions, we assume that the error terms follow uniform distributions,
$$\varepsilon^t_a \sim \text{unif}([0, 1]), \quad a = 0, 1 \quad (3)$$
$$\gamma^t_a \sim \text{unif}([0, 1]), \quad a = 0, 1 \quad (4)$$
and that $f_M^t$ and $f_N^t$ are not constant over the whole range $[0, 1]$ in the arguments $\varepsilon^t$ and $\gamma^t$. Further, the full vectors of error terms $(\varepsilon^1_a, \dots, \varepsilon^\tau_a)$ and $(\gamma^1_a, \dots, \gamma^\tau_a)$ are each assumed to be independent and identically distributed sequences.
Additional Material
Assumptions cont.
No unmeasured confounders: Random treatment allocation should ensure that neither the treatment-outcome nor the treatment-mediator relationship is confounded by measured or unmeasured confounders. This can be expressed as
$$(\varepsilon^t_0, \varepsilon^t_1, \gamma^t_0, \gamma^t_1) \perp\!\!\!\perp A. \quad (5)$$
For the mediator-outcome relationship we assume
$$\varepsilon^t_0 \perp\!\!\!\perp \gamma^t_0 \mid A \quad \text{and} \quad \varepsilon^t_1 \perp\!\!\!\perp \gamma^t_1 \mid A.$$
Additional Material
Assumptions cont.
Extended sequential ignorability: The sequential ignorability assumption, known from the single mediator case to be required for identifying natural effects, needs to be modified to accommodate the multiple measurements. In terms of the error terms it can be expressed as
$$\varepsilon^t_a \perp\!\!\!\perp (\gamma^t_0, \gamma^t_1), \quad a = 0, 1 \quad \text{and} \quad \gamma^t_a \perp\!\!\!\perp (\varepsilon^t_0, \varepsilon^t_1), \quad a = 0, 1. \quad (6)$$
Additional Material
Proof of the algorithm
Under these assumptions $E[g((T, \Delta)_{a \bar M_{a^*}})]$ can be linked to the observed data as follows, where $g(\cdot)$ is any measurable function (this ensures that the distributions are the same, not just a single expected value) and $h(\cdot)$ denotes a density with respect to a suitable measure.
$$E[g((T, \Delta)_{a \bar M_{a^*}}) \mid L = l] = E[g(T, \Delta) \mid A = a, A^* = a^*, L = l]$$
Recall that $\tau$ denotes the maximal follow-up time, so we have
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta)\, h(t, \delta \mid A = a, A^* = a^*, L = l).$$
Further, let $\mathcal{M}$ denote the space of possible mediator values; then
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta) \int_{\bar m_t \in \mathcal{M}^t} h(t, \delta, \bar m_t \mid A = a, A^* = a^*, L = l)\, d\bar m_t$$
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta) \int_{\bar m_t \in \mathcal{M}^t} h(t, \delta \mid \bar m_t, A = a, A^* = a^*, L = l)\, h(\bar m_t \mid A = a, A^* = a^*, L = l)\, \frac{h(\bar m_t \mid A = a, A^* = a, L = l)}{h(\bar m_t \mid A = a, A^* = a, L = l)}\, d\bar m_t.$$
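Additional Material
Proof continued
Conditioning on $\bar m_t$ in the first term allows $A^*$ to be removed from the conditioning set; further, using that $\bar M_t$ depends only on $A^*$ gives the following equality:
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta) \int_{\bar m_t \in \mathcal{M}^t} h(t, \delta \mid \bar m_t, A = a, L = l)\, h(\bar m_t \mid A^* = a^*, L = l)\, \frac{h(\bar m_t \mid A^* = a, L = l)}{h(\bar m_t \mid A^* = a, L = l)}\, d\bar m_t.$$
Applying the same independence arguments again, we have
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta) \int_{\bar m_t \in \mathcal{M}^t} h(t, \delta \mid \bar m_t, A = a, A^* = a, L = l)\, h(\bar m_t \mid A = a, A^* = a, L = l)\, \frac{h(\bar m_t \mid A^* = a^*, L = l)}{h(\bar m_t \mid A^* = a, L = l)}\, d\bar m_t,$$
which can be written as
$$= \sum_{\delta = 0,1} \sum_{t=1}^{\tau} g(t, \delta) \int_{\bar m_t \in \mathcal{M}^t} W(\bar m_t, a, a^*, l)\, h(t, \delta, \bar m_t \mid A = a, A^* = a, L = l)\, d\bar m_t,$$
with $W(\bar m_t, a, a^*, l) = h(\bar m_t \mid A^* = a^*, L = l) / h(\bar m_t \mid A^* = a, L = l)$.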
Additional Material
Proof continued
The last expression has the sample counterpart
$$\frac{1}{\#\{N\}} \sum_{i \in N} g(T_i, \Delta_i)\, W(\bar M_i^{T_i}, a, a^*, L_i),$$
where
$$N = \{i \in \{1, \dots, n\} : A_i = a, L_i = l\}.$$
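The sample counterpart is a weighted average within the stratum of subjects with the given treatment and covariate value. A minimal Python sketch follows, assuming the cumulative weights have already been computed and stored under the hypothetical key `W`; the record layout and the choice of g (survival beyond day 30) are illustrative only.

```python
def weighted_sample_mean(records, a, l, g):
    """Average of g(T_i, Delta_i) * W_i over subjects with A_i = a and L_i = l."""
    stratum = [r for r in records if r["A"] == a and r["L"] == l]
    if not stratum:
        raise ValueError("no subjects with the requested A and L")
    total = sum(g(r["T"], r["delta"]) * r["W"] for r in stratum)
    return total / len(stratum)


def survived_past_30(t, delta):
    """Example g: indicator that the observed time exceeds 30 days."""
    return 1.0 if t > 30 else 0.0


# Toy records with precomputed cumulative weights W.
records = [
    {"A": 1, "L": 0, "T": 10.0, "delta": 1, "W": 0.5},
    {"A": 1, "L": 0, "T": 40.0, "delta": 0, "W": 1.5},
    {"A": 0, "L": 0, "T": 25.0, "delta": 1, "W": 1.0},
]
estimate = weighted_sample_mean(records, a=1, l=0, g=survived_past_30)
```

With continuous covariates an exact-match stratum is rarely available, which is one reason the algorithm fits a weighted regression model instead.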