Generalized linear models and software for network meta-analysis
Sofia Dias & Gert van Valkenhoef
Tufts University, Boston MA, USA, June 2012
Generalized linear model (GLM) framework
- Pairwise meta-analysis and indirect comparisons are special cases of mixed treatment comparisons (MTC), also called network meta-analysis (NMA).
- All are types of linear regression, so the familiar GLM framework can be used to define the NMA model:
  - Define a likelihood $l(y \mid \gamma)$ with some unknown parameters.
  - Use a link function $g(\cdot)$ to map the parameter of interest, $\gamma$, onto the real line (where a linear relationship is assumed).
  - Define a model for the linear predictor.
- The GLM for (network) meta-analysis can be written as
  $g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
  with $i = 1, \dots, M$ indexing studies, $k = 1, \dots, na_i$ indexing the arms of study $i$, and $I$ the indicator function (0 if $k = 1$; 1 if $k \neq 1$).
The Model
$g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is a continuous measure of the effect of the treatment in arm $k$ of study $i$.
- $\delta_{ik}$ are the trial-specific treatment effects of the treatment in arm $k$ relative to the treatment in arm 1.
- In a random effects (RE) model the $\delta_{ik}$ are assumed to be exchangeable: $\delta_{ik} \sim N(d_{t_{i1},t_{ik}}, \sigma^2)$, where $d_{t_{i1},t_{ik}} = d_{1,t_{ik}} - d_{1,t_{i1}}$.
  - When multi-arm trials are available the RE distribution is multivariate normal.
  - Suitable prior distributions need to be defined for $\mu_i$, $d_{1k}$ and $\sigma$.
- In a fixed effects (FE) model, the GLM simplifies to
  $g(\gamma) = \theta_{ik} = \mu_i + (d_{1,t_{ik}} - d_{1,t_{i1}}) I_{\{k \neq 1\}}$
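As a rough illustration of the fixed effects linear predictor, the sketch below computes $\theta_{ik}$ for one trial from the basic parameters $d_{1k}$. The function name `fe_linear_predictor` and all numeric values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def fe_linear_predictor(mu_i, d, treatments):
    """Fixed effects linear predictor theta_ik for every arm of one trial.

    mu_i       : trial baseline (value of the linear predictor in arm 1)
    d          : dict mapping treatment code -> effect relative to treatment 1
                 (d[1] must be 0, the reference treatment)
    treatments : treatment codes t_i1, t_i2, ... for the arms of this trial
    """
    t_first = treatments[0]
    # For arm 1 the difference d[t] - d[t_first] is zero, so theta_i1 = mu_i
    return [mu_i + d[t] - d[t_first] for t in treatments]

# Hypothetical basic parameters for three treatments (treatment 1 is the reference)
d = {1: 0.0, 2: 0.4, 3: -0.3}

# A trial comparing treatments 2 and 3: its relative effect is d[3] - d[2]
theta = fe_linear_predictor(mu_i=-1.0, d=d, treatments=[2, 3])
print(theta)
```

The point of the sketch is that a trial that does not include treatment 1 still informs the basic parameters, because its relative effect is expressed as a difference of two of them.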
GLM framework: Binomial/logit
Example
- Data: number of events, $r_{ik}$, out of total number of participants, $n_{ik}$, in arm $k$ of trial $i$.
- The likelihood is $r_{ik} \sim \mathrm{Binomial}(p_{ik}, n_{ik})$.
- Use the logit link to map the probabilities onto the real line.
- Model: $\theta_{ik} = \mathrm{logit}(p_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-odds of an event on each arm of the trial.
- Define priors etc.
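A minimal sketch of how the logit link connects the linear predictor to the event probabilities in one trial. The helper names and the numbers are assumptions made for illustration only.

```python
import math

def inv_logit(theta):
    """Map a log-odds value on the real line back to a probability."""
    return 1.0 / (1.0 + math.exp(-theta))

def arm_probabilities(mu_i, deltas):
    """Event probabilities p_ik for each arm of one trial.

    mu_i   : log-odds of an event in arm 1 (trial baseline)
    deltas : log-odds ratios of arms 2..na_i relative to arm 1
    """
    # delta_ik only enters for k != 1, which is what the indicator encodes
    thetas = [mu_i] + [mu_i + delta for delta in deltas]
    return [inv_logit(theta) for theta in thetas]

# Hypothetical two-arm trial: baseline log-odds -2, log-odds ratio 0.5
p = arm_probabilities(mu_i=-2.0, deltas=[0.5])
print(p)
```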
GLM framework: Poisson/log
Example
- Data are number of events, $r_{ik}$, occurring in arm $k$ of trial $i$ over an exposure period $E_{ik}$ in person-years.
- The likelihood is $r_{ik} \sim \mathrm{Poisson}(\lambda_{ik} E_{ik})$.
- Use the log link to map the rates onto the real line.
- Model: $\theta_{ik} = \log(\lambda_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-rate of an event on each arm of the trial.
- Define priors etc.
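For rate data the same linear predictor is used, but the log link and the exposure time determine the Poisson mean. The sketch below shows this under hypothetical values; `expected_events` is an illustrative name, not part of any existing code.

```python
import math

def expected_events(mu_i, deltas, exposures):
    """Poisson means lambda_ik * E_ik for each arm of one trial.

    mu_i      : log-rate in arm 1 (trial baseline)
    deltas    : log rate ratios of arms 2..na_i relative to arm 1
    exposures : person-years at risk E_ik, one entry per arm
    """
    thetas = [mu_i] + [mu_i + delta for delta in deltas]
    rates = [math.exp(theta) for theta in thetas]  # inverse of the log link
    return [rate * exposure for rate, exposure in zip(rates, exposures)]

# Hypothetical trial: 0.02 events per person-year in arm 1, rate ratio 0.5 in arm 2
means = expected_events(
    mu_i=math.log(0.02),
    deltas=[math.log(0.5)],
    exposures=[1000.0, 1500.0],
)
print(means)
```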
The GLM and WinBUGS
- GLMs are ideally suited for coding in WinBUGS due to their modular structure.
- We have developed WinBUGS code which directly translates GLM theory.
  - One generic model structure for FE, one for RE.
- Code can be adapted for various data types by changing only the likelihood and link function.
  - The meta-analysis model for the linear predictor $\theta_{ik}$ is always the same.
FE model: Binomial/logit
# Binomial likelihood, logit link
# Fixed effects model
model{                                  # *** PROGRAM STARTS
for(i in 1:ns){                         # LOOP THROUGH STUDIES
    mu[i] ~ dnorm(0,.0001)              # vague priors for all trial baselines
    for (k in 1:na[i]) {                # LOOP THROUGH ARMS
        r[i,k] ~ dbin(p[i,k],n[i,k])    # binomial likelihood
        # model for linear predictor
        logit(p[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
    }
}
d[1]<-0    # treatment effect is zero for reference treatment
# vague priors for treatment effects
for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
}                                       # *** PROGRAM ENDS
FE model: Poisson/log
# Poisson likelihood, log link
# Fixed effects model
model{                                  # *** PROGRAM STARTS
for(i in 1:ns){                         # LOOP THROUGH STUDIES
    mu[i] ~ dnorm(0,.0001)              # vague priors for all trial baselines
    for (k in 1:na[i]) {                # LOOP THROUGH ARMS
        r[i,k] ~ dpois(beta[i,k])       # Poisson likelihood
        beta[i,k] <- lambda[i,k]*e[i,k] # failure rate * exposure
        # model for linear predictor
        log(lambda[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
    }
}
d[1]<-0    # treatment effect is zero for reference treatment
# vague priors for treatment effects
for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
}                                       # *** PROGRAM ENDS
Data for Binomial/logit example
- Define number of treatments, nt, and number of studies, ns:
list(nt=4,ns=24)
- Data given as one trial per row.
- Columns are: events, number of patients, treatments compared and number of arms in trial.
- Data can be copied from spreadsheet software:
r[,1]  n[,1]  r[,2]  n[,2]  r[,3]  n[,3]  t[,1]  t[,2]  t[,3]  na[]
9      140    23     140    10     138    1      3      4      3
11     78     12     85     29     170    2      3      4      3
75     731    363    714    NA     NA     1      3      NA     2
2      106    9      205    NA     NA     1      3      NA     2
...
END
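Because the data are entered by hand or pasted from a spreadsheet, a quick consistency check before running the model can catch layout mistakes. The sketch below is one possible check, written in Python rather than WinBUGS: it parses the four rows shown on the slide and confirms that `na[]` equals the number of non-missing treatment codes in each row. The function `parse_rectangular` is a hypothetical helper, not part of WinBUGS or the TSD code.

```python
def parse_rectangular(text):
    """Parse whitespace-separated rectangular data with a header row into columns."""
    rows = [line.split() for line in text.strip().splitlines()]
    header = rows[0]
    body = [row for row in rows[1:] if row != ["END"]]
    columns = {name: [] for name in header}
    for row in body:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} values, got {len(row)}: {row}")
        for name, value in zip(header, row):
            # NA marks a missing arm; every other entry is numeric
            columns[name].append(None if value == "NA" else float(value))
    return columns

# The four rows shown on the slide (the full dataset has 24 trials)
DATA = """
r[,1] n[,1] r[,2] n[,2] r[,3] n[,3] t[,1] t[,2] t[,3] na[]
9 140 23 140 10 138 1 3 4 3
11 78 12 85 29 170 2 3 4 3
75 731 363 714 NA NA 1 3 NA 2
2 106 9 205 NA NA 1 3 NA 2
END
"""

cols = parse_rectangular(DATA)

# Number of non-missing treatment codes per trial, to compare against na[]
arms_found = [
    sum(cols[f"t[,{k}]"][i] is not None for k in (1, 2, 3))
    for i in range(len(cols["na[]"]))
]
declared = [int(n) for n in cols["na[]"]]
print(arms_found == declared)
```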
Initial values for Binomial/logit example
- Define values where the simulation will start.
# Initial values
# Chain 1
list(d=c(NA,0,0,0), mu=c(0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0))
# Chain 2
list(d=c(NA,0.1,-1,-0.2), mu=c(1,-1,-2,0,0, -2,1,0,2,2, 1,-1,-2,0,0, -2,1,0,2,2, -2,-0.5,-3,0.5))
- The first element of d is NA because d[1] is fixed at zero in the model and must not be given an initial value.
- Run WinBUGS.
NICE DSU Technical Support Documents
- Series of Technical Support Documents (TSDs) on Evidence Synthesis commissioned by NICE DSU.
- Available from http://www.nicedsu.org.uk
- The GLM theory for (network) meta-analysis is set out with a variety of worked examples and code in TSD2.
- Other TSDs deal with heterogeneity and meta-regression (TSD3), inconsistency (TSD4), baseline models (TSD5) and software (TSD6).
- TSD7 has a checklist for reviewers of NMA submissions.
- Primarily for NICE Technology Appraisals, but relevant for submissions to journals as well.
Advantages of TSD Code
- Several worked examples available:
  - Number of events: Binomial/logit
  - Rate data: Poisson/log and Binomial/cloglog
  - Competing risks: Multinomial/log
  - Continuous: Normal/identity, including change from baseline, relative effect data, SMD
  - Ordered categorical data: Multinomial/probit
- Code is very general and will handle: any combination of likelihood/link function; any number of trials and treatments; any number of multi-arm trials; arm-based data or data in relative effect format.
- Correctly accounts for the correlations in multi-arm trials.
- Easy to set up shared parameter models, for example when some data are in arm-based and some in relative effect formats.
Other bits of code...
- Basic code will provide all treatment effects relative to treatment 1 (the chosen reference).
- Due to the modular nature of WinBUGS, it is easy to add extra code to provide other output such as:
  - Assessing model fit (residual deviance);
  - Obtaining all relative treatment effects;
  - Obtaining relative effects on a different scale (e.g. odds ratio) with correct uncertainty;
  - Obtaining NNT or absolute probabilities/rates with associated uncertainty;
  - Obtaining probabilities that each treatment is the best, second best etc.
- TSD2 provides sample code.
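The extra outputs listed above are all functions of the posterior draws of the basic parameters. The following sketch, in Python rather than WinBUGS, shows the idea on a handful of made-up draws: relative effects for every pair, an odds ratio computed draw by draw (so that its uncertainty is carried through correctly), and the probability that each treatment is best. The draws are hypothetical and far too few for real inference, and the assumption that a lower log-odds is better holds only for harmful events.

```python
import math
from statistics import mean

def relative_effects(d_draw):
    """All pairwise effects d[c] - d[b] (b < c) from one posterior draw.

    d_draw[0] is the reference treatment (fixed at 0); keys use 1-based
    treatment codes to match the WinBUGS convention.
    """
    nt = len(d_draw)
    return {(b + 1, c + 1): d_draw[c] - d_draw[b]
            for b in range(nt) for c in range(b + 1, nt)}

def prob_best(draws, lower_is_better=True):
    """Share of draws in which each treatment has the best effect."""
    nt = len(draws[0])
    pick = min if lower_is_better else max
    counts = [0] * nt
    for d in draws:
        counts[pick(range(nt), key=lambda k: d[k])] += 1
    return [count / len(draws) for count in counts]

# Hypothetical posterior draws of (d[1], d[2], d[3]) on the log-odds scale
draws = [
    [0.0, -0.2, 0.1],
    [0.0, 0.3, -0.4],
    [0.0, -0.5, -0.1],
    [0.0, -0.1, 0.2],
]

# Transform each draw to the odds ratio scale first, then summarize
or_12 = [math.exp(relative_effects(d)[(1, 2)]) for d in draws]
mean_or_12 = mean(or_12)
p_best = prob_best(draws)
print(mean_or_12, p_best)
```

Summarizing after transforming matters: the mean of the draw-wise odds ratios differs from the exponential of the mean log-odds ratio.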
Advantages of using WinBUGS for NMA
- Code already available for many data types, so no need for extra coding.
- No need for data preparation before running the model.
- Produces a sample from the true posterior distribution.
  - CODA output can be used directly to inform economic models.
- Due to the flexibility of WinBUGS, code can easily be extended to more complex models: including covariates (meta-regression, see TSD3); class effects models; using IPD, etc.
Disadvantages of using WinBUGS for NMA
- Requires knowledge of MCMC methods to check convergence and detect problems.
  - But will still provide output, which can be misinterpreted...
- Some models may require many iterations, which can take some time to run.
- Graphical capabilities are very limited, so results need to be exported to other software.
- Setting up initial values may be tricky in some models.
- May have problems converging when the network is sparse and/or has many zero cells.
Using the TSD WinBUGS Code
- Need basic knowledge of Stats!!
- Choose appropriate code from the website, decide which nodes to monitor, and how to interpret the output.
- Input data and number of studies and treatments.
- User needs to define:
  - Overall baseline or reference treatment (treatment 1) for NMA;
  - Treatment coding order;
  - Priors (can be tricky for the heterogeneity in the RE model);
  - Initial values for the MCMC simulation to start.
- Before valid output can be obtained users also need to check:
  - Convergence;
  - Model fit;
  - Consistency.
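To make the convergence check concrete, here is a minimal sketch of the basic Gelman-Rubin potential scale reduction factor for one monitored node, computed from several chains started at different initial values. It is written in Python for illustration and is not the diagnostic implemented in WinBUGS itself; the chains are tiny made-up examples.

```python
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    chains : list of equal-length lists of post-burn-in samples,
             one list per chain (at least two chains, two samples each)
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean(variance(chain) for chain in chains)  # W: mean within-chain variance
    between = n * variance(chain_means)                 # B: between-chain variance
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return (pooled / within) ** 0.5

# Hypothetical chains: the first pair overlaps, the second pair does not
r_mixed = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
r_stuck = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
print(r_mixed, r_stuck)
```

Values close to 1 are consistent with the chains sampling the same distribution; values well above 1, as in the second example, indicate that the chains have not mixed.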
Automated model generation
- Generate model: abstract representation
  - Structure: basic parameters, study baselines
  - Priors
  - Starting values
- Abstract representation → concrete implementation
  - BUGS syntax (templates based on NICE TSDs)
  - JAGS syntax (templates based on NICE TSDs)
  - YADAS MCMC models in Java
Current model generation capabilities (1/4)
- Model structure depends on type: consistency / node-split / inconsistency
  - Random effects, homogeneous variance
- General method for priors:
  - Use a simple heuristic
  - Define what is a large deviation → vague priors
- General method for starting values:
  - Sample from over-dispersed MLEs
  - Requires that parameters are directly measured
  - Additional constraint for model structure
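One way to read "sample from over-dispersed MLEs" is sketched below for a log-odds ratio that is directly measured in a two-arm trial. This is an illustration under stated assumptions, not GeMTC's actual procedure: the 0.5 continuity correction, the dispersion factor `scale=2.5` and the function names are all hypothetical choices. The counts are taken from the third data row shown earlier (75/731 versus 363/714).

```python
import math
import random

def log_odds_ratio_mle(r1, n1, r2, n2):
    """Estimate and standard error of the log-odds ratio of arm 2 vs arm 1.

    A 0.5 continuity correction is added to every cell (an assumption of
    this sketch) so the estimate stays finite when a cell is zero.
    """
    a, b = r2 + 0.5, n2 - r2 + 0.5
    c, d = r1 + 0.5, n1 - r1 + 0.5
    estimate = math.log(a / b) - math.log(c / d)
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_error

def starting_value(estimate, std_error, rng, scale=2.5):
    """Draw one starting value from a normal that is wider than the MLE's
    sampling distribution by the (hypothetical) factor `scale`."""
    return rng.gauss(estimate, scale * std_error)

est, se = log_odds_ratio_mle(r1=75, n1=731, r2=363, n2=714)

rng = random.Random(2012)  # fixed seed so the example is reproducible
starts = [starting_value(est, se, rng) for _chain in range(2)]
print(est, se, starts)
```

Drawing each chain's starting value separately gives dispersed but plausible starting points, which is what between-chain convergence diagnostics need.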
Current model generation capabilities (2/4)
Consistency model generation (under review)
- Consistency model generation is easy, even arbitrary
- Method for generating starting values restricts structure
  - Basic parameters must be directly measured
  - They are a spanning tree of the evidence graph
  - Will choose a compact tree → good for convergence
[Figure: example evidence graphs, one with nodes A, B, C, D, E and one with treatments tPA, UK, ASPAC, AtPA, Ten, SK, SKtPA, Ret]
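The requirement that the basic parameters form a spanning tree of the evidence graph can be illustrated with a breadth-first search, which yields a tree of minimal depth from the chosen root. This is a sketch of the general idea only; the algorithm GeMTC uses to pick a compact tree may differ, and the edge list below is hypothetical because the edges of the graphs on the slides are not available.

```python
from collections import deque

def spanning_tree(edges, root):
    """Breadth-first spanning tree of an undirected evidence graph.

    edges : pairs of treatments compared directly in at least one trial
    root  : treatment to start from (for example the reference treatment)
    Returns the tree edges; each one is a directly measured comparison
    that can serve as a basic parameter.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    tree, seen, queue = [], {root}, deque([root])
    while queue:
        current = queue.popleft()
        for nxt in sorted(neighbours[current]):  # sorted for a reproducible result
            if nxt not in seen:
                seen.add(nxt)
                tree.append((current, nxt))
                queue.append(nxt)
    return tree

# Hypothetical evidence graph on five treatments
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D"), ("D", "E")]
tree = spanning_tree(edges, root="A")
print(tree)
```

A connected network of nt treatments always gives nt - 1 tree edges, which matches the number of basic parameters in the consistency model.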
Current model generation capabilities (3/4)
Node-split model generation (draft)
- Node-splitting models require some recoding
- Generally there will be many nodes to split
  - Inconvenient to do by hand
- Will present this @ SRSM
- Main problem is choosing nodes to split
  - If the right nodes are chosen, model generation is again easy
Current model generation capabilities (4/4)
Inconsistency model generation (published, but imperfect)
- Inconsistency model generation is HARD
- Algorithm inefficient for multi-arm trials
- My current work leaves much to be desired
- I won't go into further detail
GeMTC: MTC model generation
- Java library (open source, reusable) for model generation
- Command-line interface / R-package ("GeMTC CLI")
- Simplistic GUI ("GeMTC GUI")
- Now: (very) quick demo of GeMTC GUI
  - Loading a data file
  - Generating a node-split model
  - Quick look at generated code
http://drugis.org/gemtc
Beyond model generation
- Model generation alone is not enough:
  - GUI for network meta-analysis
  - Pseudo-automated convergence checking
  - Automatically generate the right summaries, tables, figures
  - Data entry / management
- We have this in ADDIS!
ADDIS goals
The goals (will take a while to get there...):
- Database of trials, really structured
- Meta-analysis, network meta-analysis, decision analysis
  - Inform health care policy (regulation, guidelines, reimbursement)
- Automate systematic review (i.e. eliminate the grunt work)
  - Sourcing from abstract databases, systematic reviews, registries
- So, kind of business intelligence for health care policy
ADDIS current status
- Somewhat advanced trial data model
  - XML schema available
  - Inspired by CDISC / BRIDG / OCRe
  - Being vetted by CDISC expert now
- Tools for study selection falling behind
  - But receiving some attention right now!
- Hardly any data sourcing (so far focussed on regulators)
- Analysis tools have received most attention
  - This was/is the focus of my PhD research
Network meta-analysis in ADDIS
Demo!
- The example dataset
- Building a network meta-analysis
- Running the models, assessing convergence
- Assessing inconsistency
- Consistency results
http://drugis.org/addis
Generalized linear models
- Very flexible & general
- Requires a lot of knowledge from the user
- Some models (node-split, inconsistency) are complicated
- Automation could help to:
  - Make analysis faster / easier
  - Prevent coding mistakes
  - Ensure necessary steps are taken
Model generation / GeMTC
- Given a dataset, generates the model
- Everything else done in WinBUGS
- Requires some knowledge from the user
- Generated models can be customized
- Only the most common types of model are available
Model generation wishlist
Near future:
- Relative-effect data
- Detect sparse / invalid / problematic data
- Fixed effects / random effects with heterogeneous variance
- User-defined priors / knowledge-based prior selection
- R package based on GeMTC, rjags, coda
More distant future:
- Covariates
- Better inconsistency DAG generation
Software is open source: I welcome contributions!
Automation / ADDIS
ADDIS is...
- Much more ambitious
  - Network meta-analysis is a means, not an end
  - Database of trials → decision support
- Less flexible
- Not even near finished
Relative to WinBUGS:
- No manual coding
- One-click interface to run models
- User is explicitly asked to look at convergence
- Models to assess inconsistency directly available
- Appropriate tables & plots
Discussion
- Something in between GeMTC and ADDIS needed?
  - Or integrating GeMTC in an R package?
- There will always be a need for the raw WinBUGS code
  - Automated interface should not get in the way
  - Should give the user the ability to drop down to code level
- And there is much work to be done!
Thank you! Questions?