Generalized linear models and software for network meta-analysis




Sofia Dias & Gert van Valkenhoef
Tufts University, Boston MA, USA, June 2012

Generalized linear model (GLM) framework
- Pairwise meta-analysis and indirect comparisons are special cases of mixed treatment comparisons (or NMA).
- All are types of linear regression, so the familiar GLM framework can be used to define the NMA model.
- Define a likelihood $l(y \mid \gamma)$ with some unknown parameters.
- Use a link function $g(\cdot)$ to map the parameter of interest, $\gamma$, onto the real line (assume a linear relationship).
- Define a model for the linear predictor.
- The GLM for (network) meta-analysis can be written as $g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$, with $i = 1, \dots, M$, $k = 1, \dots, na_i$, and $I$ the indicator function (0 if $k = 1$; 1 if $k \neq 1$).

The Model
- $g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is a continuous measure of the effect of the treatment in arm $k$ of study $i$.
- $\delta_{ik}$ are the trial-specific treatment effects of the treatment in arm $k$ relative to the treatment in arm 1.
- In a random effects (RE) model the $\delta_{ik}$ are assumed to be exchangeable: $\delta_{ik} \sim N(d_{t_{i1},t_{ik}}, \sigma^2)$.
- When multi-arm trials are available the RE distribution is multivariate normal.
- Suitable prior distributions need to be defined for $\mu_i$, $d_{1k}$ and $\sigma$.
- In a fixed effects (FE) model, the GLM simplifies to $g(\gamma) = \theta_{ik} = \mu_i + (d_{1,t_{ik}} - d_{1,t_{i1}}) I_{\{k \neq 1\}}$.
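For a three-arm trial, the multivariate normal RE distribution mentioned above can be written out explicitly. This is a sketch under the usual homogeneous-variance assumption, which implies a correlation of 0.5 between any two trial-specific effects in the same trial; the three-arm case is chosen only for illustration.

```latex
\begin{pmatrix} \delta_{i2} \\ \delta_{i3} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} d_{t_{i1},t_{i2}} \\ d_{t_{i1},t_{i3}} \end{pmatrix},
\begin{pmatrix} \sigma^2 & \sigma^2/2 \\ \sigma^2/2 & \sigma^2 \end{pmatrix}
\right),
\qquad
d_{t_{i1},t_{ik}} = d_{1,t_{ik}} - d_{1,t_{i1}}
```

The second identity is the consistency relation that the FE model uses directly in its linear predictor.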

GLM framework: Binomial/logit Example
- Data: number of events, $r_{ik}$, out of total number of participants, $n_{ik}$, in arm $k$ of trial $i$.
- The likelihood is $r_{ik} \sim \text{Binomial}(p_{ik}, n_{ik})$.
- Use the logit link to map the probabilities onto the real line.
- Model: $\theta_{ik} = \text{logit}(p_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-odds of an event on each arm of the trial.
- Define priors etc.
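The binomial/logit pieces above can be checked numerically with a small sketch. This is plain Python for illustration, not the WinBUGS code; the function names and the value of `delta_i2` are assumptions, and the counts 9/140 and 23/140 are borrowed only as example inputs.

```python
import math

def logit(p):
    """Map a probability in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(theta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-theta))

def arm_probability(mu_i, delta_ik, k):
    """Event probability in arm k: theta_ik = mu_i + delta_ik * I{k != 1}."""
    theta_ik = mu_i + (delta_ik if k != 1 else 0.0)
    return inv_logit(theta_ik)

def binomial_loglik(r, n, p):
    """Log-likelihood of r events out of n with event probability p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    return log_choose + r * math.log(p) + (n - r) * math.log(1.0 - p)

# Illustrative values only (not estimates from any fitted model).
mu_i = logit(9 / 140)   # baseline log-odds chosen to match 9 events in 140
delta_i2 = 1.0          # hypothetical log-odds ratio, arm 2 vs arm 1
p_i1 = arm_probability(mu_i, delta_i2, k=1)
p_i2 = arm_probability(mu_i, delta_i2, k=2)
loglik = binomial_loglik(9, 140, p_i1) + binomial_loglik(23, 140, p_i2)
```

The indicator means the trial-specific effect never enters arm 1, so `mu_i` is exactly the log-odds on the trial's first arm.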

GLM framework: Poisson/log Example
- Data are number of events, $r_{ik}$, occurring in arm $k$ of trial $i$ over an exposure period $E_{ik}$ in person-years.
- The likelihood is $r_{ik} \sim \text{Poisson}(\lambda_{ik} E_{ik})$.
- Use the log link to map the rates onto the real line.
- Model: $\theta_{ik} = \log(\lambda_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-rate of an event on each arm of the trial.
- Define priors etc.
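The same structure with a Poisson likelihood and log link can be sketched as follows. Only the likelihood and the link change; the linear predictor is built exactly as before. Function names and the numeric inputs are illustrative assumptions.

```python
import math

def arm_rate(mu_i, delta_ik, k):
    """Event rate in arm k: log(lambda_ik) = mu_i + delta_ik * I{k != 1}."""
    theta_ik = mu_i + (delta_ik if k != 1 else 0.0)
    return math.exp(theta_ik)

def poisson_loglik(r, rate, exposure):
    """Log-likelihood of r events given a rate and an exposure in person-years."""
    mean = rate * exposure
    return r * math.log(mean) - mean - math.lgamma(r + 1)

# Hypothetical trial: baseline rate 0.05 per person-year, rate ratio 0.5.
mu_i = math.log(0.05)
delta_i2 = math.log(0.5)
rate_i1 = arm_rate(mu_i, delta_i2, k=1)
rate_i2 = arm_rate(mu_i, delta_i2, k=2)
loglik = poisson_loglik(12, rate_i1, 200.0) + poisson_loglik(5, rate_i2, 210.0)
```

Because the exposure multiplies the rate inside the likelihood, trials with different follow-up times contribute on a common per-person-year scale.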

The GLM and WinBUGS
- GLMs are ideally suited for coding in WinBUGS due to their modular structure.
- We have developed WinBUGS code which directly translates GLM theory.
- One generic model structure for FE, one for RE.
- Code can be adapted for various data types by changing only the likelihood and link function.
- The meta-analysis model for the linear predictor $\theta_{ik}$ is always the same.

FE model: Binomial/logit

    # Binomial likelihood, logit link
    # Fixed effects model
    model{                                  # *** PROGRAM STARTS
      for(i in 1:ns){                       # LOOP THROUGH STUDIES
        mu[i] ~ dnorm(0,.0001)              # vague priors for all trial baselines
        for (k in 1:na[i]) {                # LOOP THROUGH ARMS
          r[i,k] ~ dbin(p[i,k],n[i,k])      # binomial likelihood
          # model for linear predictor
          logit(p[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
        }
      }
      d[1] <- 0    # treatment effect is zero for reference treatment
      # vague priors for treatment effects
      for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
    }                                       # *** PROGRAM ENDS

FE model: Poisson/log

    # Poisson likelihood, log link
    # Fixed effects model
    model{                                  # *** PROGRAM STARTS
      for(i in 1:ns){                       # LOOP THROUGH STUDIES
        mu[i] ~ dnorm(0,.0001)              # vague priors for all trial baselines
        for (k in 1:na[i]) {                # LOOP THROUGH ARMS
          r[i,k] ~ dpois(beta[i,k])         # Poisson likelihood
          beta[i,k] <- lambda[i,k]*e[i,k]   # failure rate * exposure
          # model for linear predictor
          log(lambda[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
        }
      }
      d[1] <- 0    # treatment effect is zero for reference treatment
      # vague priors for treatment effects
      for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
    }                                       # *** PROGRAM ENDS

Data for Binomial/logit example
- Define the number of treatments, nt, and the number of studies, ns: list(nt=4,ns=24)
- Data are given as one trial per row.
- Columns are: events, number of patients, treatments compared and number of arms in trial.
- Data can be copied from spreadsheet software:

    r[,1]  n[,1]  r[,2]  n[,2]  r[,3]  n[,3]  t[,1]  t[,2]  t[,3]  na[]
    9      140    23     140    10     138    1      3      4      3
    11     78     12     85     29     170    2      3      4      3
    75     731    363    714    NA     NA     1      3      NA     2
    2      106    9      205    NA     NA     1      3      NA     2
    ...
    END
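Before pasting such a table into WinBUGS it can help to check that each row is internally consistent. The sketch below is an illustrative Python helper (the names `parse_bugs_table` and `check_arms` are made up here, not part of any TSD code); it reads the column layout described above and verifies that arms up to `na[]` are filled and later arms are `NA`. Only the four rows shown on the slide are included.

```python
DATA = """\
r[,1] n[,1] r[,2] n[,2] r[,3] n[,3] t[,1] t[,2] t[,3] na[]
9 140 23 140 10 138 1 3 4 3
11 78 12 85 29 170 2 3 4 3
75 731 363 714 NA NA 1 3 NA 2
2 106 9 205 NA NA 1 3 NA 2
END
"""

def parse_bugs_table(text):
    """Parse a rectangular BUGS-style table into a dict of column lists."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    if rows and rows[-1] == ["END"]:
        rows = rows[:-1]
    columns = {name: [] for name in header}
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} cells, got {len(row)}")
        for name, cell in zip(header, row):
            columns[name].append(None if cell == "NA" else int(cell))
    return columns

def check_arms(columns, max_arms=3):
    """Return (trial, arm) pairs whose cells disagree with the na[] column."""
    problems = []
    for i, na in enumerate(columns["na[]"]):
        for k in range(1, max_arms + 1):
            cells = [columns[f"{name}[,{k}]"][i] for name in ("r", "n", "t")]
            expected = k <= na   # arms beyond na[] should be NA
            if any((cell is not None) != expected for cell in cells):
                problems.append((i + 1, k))
    return problems

columns = parse_bugs_table(DATA)
problems = check_arms(columns)
```

An empty `problems` list means every trial has data for exactly the number of arms it declares.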

Initial values for Binomial/logit example
- Define values where the simulation will start, one list per chain. The first element of d is NA because d[1] is fixed at zero in the model.

    # Initial values
    # Chain 1
    list(d=c(NA,0,0,0),
         mu=c(0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0))
    # Chain 2
    list(d=c(NA,0.1,-1,-0.2),
         mu=c(1,-1,-2,0,0, -2,1,0,2,2, 1,-1,-2,0,0, -2,1,0,2,2, -2,-0.5,-3,0.5))

- Run WinBUGS.

NICE DSU Technical Support Documents
- Series of Technical Support Documents (TSDs) on Evidence Synthesis commissioned by NICE DSU.
- Available from http://www.nicedsu.org.uk
- The GLM theory for (network) meta-analysis is set out with a variety of worked examples and code in TSD2.
- Other TSDs deal with Heterogeneity and meta-regression (TSD3), Inconsistency (TSD4), Baseline Models (TSD5) and Software (TSD6).
- TSD7 has a checklist for reviewers of NMA submissions.
- Primarily for NICE Technology Appraisals, but relevant for submissions to journals as well.

Advantages of TSD Code
- Several worked examples available:
  - Number of events: Binomial/logit
  - Rate data: Poisson/log and Binomial/cloglog
  - Competing risks: Multinomial/log
  - Continuous: Normal/identity, including change from baseline, relative effect data, SMD
  - Ordered categorical data: Multinomial/probit
- Code is very general and will handle any combination of likelihood and link function; any number of trials and treatments; any number of multi-arm trials; arm-based data or data in relative effect format.
- Correctly accounts for the correlations in multi-arm trials.
- Easy to set up shared parameter models, for example when some data are in arm-based and some in relative effect formats.

Other bits of code...
- Basic code will provide all treatment effects relative to treatment 1 (the chosen reference).
- Due to the modular nature of WinBUGS, it is easy to add extra code to provide other output such as:
  - Assessing model fit (residual deviance);
  - Obtaining all relative treatment effects;
  - Obtaining relative effects on a different scale (e.g. odds ratio) with correct uncertainty;
  - Obtaining NNT or absolute probabilities/rates with associated uncertainty;
  - Obtaining probabilities that each treatment is the best, second best etc.
- TSD2 provides sample code.
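Two of the extra outputs listed above can be illustrated outside WinBUGS by post-processing posterior draws. The sketch below is plain Python with made-up function names, and the draws are simulated stand-ins for MCMC output, not results from any real analysis. The key point is that each draw is transformed first and summarized afterwards, which is what carries the uncertainty through correctly.

```python
import math
import random

def odds_ratio_summary(log_or_draws):
    """Median and 95% interval of the odds ratio, computed draw by draw."""
    ors = sorted(math.exp(x) for x in log_or_draws)
    n = len(ors)

    def quantile(prob):
        return ors[min(n - 1, int(prob * n))]

    return {"median": quantile(0.5), "lower": quantile(0.025), "upper": quantile(0.975)}

def prob_best(d_draws, higher_is_better=True):
    """Share of draws in which each treatment has the best effect."""
    nt = len(d_draws[0])
    counts = [0] * nt
    pick = max if higher_is_better else min
    for draw in d_draws:
        best = pick(range(nt), key=lambda k: draw[k])
        counts[best] += 1
    return [c / len(d_draws) for c in counts]

# Simulated stand-in for posterior draws of d[1..4]; d[1] is the reference (0).
rng = random.Random(1)
d_draws = [
    [0.0, rng.gauss(0.5, 0.4), rng.gauss(0.8, 0.4), rng.gauss(1.0, 0.4)]
    for _ in range(4000)
]
summary_4_vs_1 = odds_ratio_summary([draw[3] for draw in d_draws])
p_best = prob_best(d_draws)
```

Whether a larger effect is better depends on the outcome, hence the `higher_is_better` switch.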

Advantages of using WinBUGS for NMA
- Code already available for many data types so no need for extra coding.
- No need for data preparation before running the model.
- Produces a sample from the true posterior distribution.
- CODA output can be used directly to inform economic models.
- Due to WinBUGS flexibility, the code can easily be extended to more complex models: including covariates (meta-regression, see TSD3); class effects models; using IPD, etc.

Disadvantages of using WinBUGS for NMA
- Requires knowledge of MCMC methods to check convergence and detect problems.
  - But it will still provide output, which can be misinterpreted...
- Some models may require many iterations, which can take some time to run.
- Graphical capabilities are very limited, so results need to be exported to other software.
- Setting up initial values may be tricky in some models.
- May have problems converging when the network is sparse and/or has many zero cells.
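As a reminder of what a convergence check looks at, here is a minimal sketch of the classic variance-based Gelman-Rubin statistic for one parameter. It is not the exact diagnostic plotted by WinBUGS, and the chains below are independent random draws used as stand-ins for real MCMC output.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    within = mean([variance(chain) for chain in chains])
    between = n * variance([mean(chain) for chain in chains])
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Two chains exploring the same distribution: the statistic should be near 1.
mixed = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(2)]
# Two chains stuck in different regions: the statistic should be well above 1.
stuck = [
    [rng.gauss(0, 1) for _ in range(2000)],
    [rng.gauss(3, 1) for _ in range(2000)],
]
r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

This is also why over-dispersed initial values matter: chains started close together can agree with each other without having found the posterior.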

Using the TSD WinBUGS Code
- Need basic knowledge of Stats!!
  - Choose appropriate code from the website, decide which nodes to monitor, and how to interpret the output.
- Input data and number of studies and treatments.
- User needs to define:
  - Overall baseline or reference treatment (treatment 1) for NMA;
  - Treatment coding order;
  - Priors, which can be tricky for the heterogeneity in the RE model;
  - Initial values for MCMC simulation to start.
- Before valid output can be obtained users also need to check:
  - Convergence;
  - Model fit;
  - Consistency.

Automated model generation
- Generate model: abstract representation
  - Structure: basic parameters, study baselines
  - Priors
  - Starting values
- Abstract representation -> concrete implementation
  - BUGS syntax (templates based on NICE TSDs)
  - JAGS syntax (templates based on NICE TSDs)
  - YADAS MCMC models in Java

Current model generation capabilities (1/4)
- Model structure depends on type: consistency / node-split / inconsistency
  - Random effects, homogeneous variance
- General method for priors: use a simple heuristic
  - Define what is a large deviation -> vague priors
- General method for starting values: sample from over-dispersed MLEs
  - Requires that parameters are directly measured
  - Additional constraint for model structure
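The idea of sampling starting values from over-dispersed MLEs can be sketched for a single directly measured log odds ratio. This is an illustration of the general idea only, not the algorithm implemented in GeMTC; the 0.5 continuity correction, the `scale` factor of 2.5 and the function names are all assumptions, and the counts are example inputs.

```python
import math
import random

def log_odds_ratio(r1, n1, r2, n2):
    """MLE of the log odds ratio (arm 2 vs arm 1) and its standard error."""
    cells = [r1, n1 - r1, r2, n2 - r2]
    if 0 in cells:
        # Assumed continuity correction so that zero cells do not break the logs.
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    estimate = math.log(c / d) - math.log(a / b)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_err

def starting_value(estimate, std_err, scale, rng):
    """Draw a starting value from a normal inflated by the factor `scale`."""
    return rng.gauss(estimate, scale * std_err)

rng = random.Random(42)
estimate, std_err = log_odds_ratio(75, 731, 363, 714)
starts = [starting_value(estimate, std_err, scale=2.5, rng=rng) for _ in range(4)]
```

This also shows why the parameter has to be directly measured: without a trial comparing the two treatments there is no estimate to centre the draws on.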

Current model generation capabilities (2/4)
- Consistency model generation (under review)
- Consistency model generation is easy, even arbitrary
- Method for generating starting values restricts structure
  - Basic parameters must be directly measured
  - They are a spanning tree of the evidence graph
- Will choose a compact tree -> good for convergence

[Figure: evidence graphs accompanying this slide, with node labels A, B, C, D, E and tpa, UK, ASPAC, AtPA, Ten, SK, SKtPA, Ret.]
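Choosing basic parameters as a spanning tree of the evidence graph can be sketched with a breadth-first search, which is one simple way to get a shallow tree. This is not necessarily how GeMTC selects its tree, and the edges below are hypothetical: the actual edges of the graphs on the slide are not recoverable from the transcript.

```python
from collections import deque

def spanning_tree(comparisons, root):
    """Breadth-first spanning tree over directly compared treatment pairs."""
    adjacency = {}
    for a, b in comparisons:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                tree.append((node, neighbour))   # one basic parameter per tree edge
                queue.append(neighbour)
    if len(seen) != len(adjacency):
        raise ValueError("evidence graph is not connected")
    return tree

# Hypothetical evidence graph: each pair has at least one trial comparing it.
comparisons = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
basic_parameters = spanning_tree(comparisons, root="A")
```

Every edge in the tree is a direct comparison, which satisfies the constraint that basic parameters must be directly measured; the comparison B vs C is left out of the tree and becomes a functional parameter.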

Current model generation capabilities (3/4)
- Node-split model generation (draft)
- Node-splitting models require some recoding
  - Generally there will be many nodes to split
  - Inconvenient to do by hand
- Will present this at SRSM
  - Main problem is choosing nodes to split
  - If the right nodes are chosen, model generation is again easy
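A simplified version of the node-selection problem: a comparison is worth splitting only if it has direct evidence and the two treatments are still connected through other comparisons, so that indirect evidence exists. The sketch below applies just that rule to a hypothetical graph. It ignores multi-arm trials, which complicate the real criterion, and it is not the method the slide refers to.

```python
def connected_without(edges, a, b):
    """True if a and b remain connected after removing the direct edge a-b."""
    adjacency = {}
    for x, y in edges:
        if {x, y} == {a, b}:
            continue   # drop the direct comparison
        adjacency.setdefault(x, set()).add(y)
        adjacency.setdefault(y, set()).add(x)
    seen, stack = {a}, [a]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return b in seen

def split_candidates(edges):
    """Direct comparisons that also have an indirect path."""
    return [edge for edge in edges if connected_without(edges, *edge)]

# Hypothetical evidence graph with one closed loop (A-B-C).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
candidates = split_candidates(edges)
```

Only the comparisons inside the loop qualify; C vs D and D vs E have no indirect evidence to compare against.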

Current model generation capabilities (4/4)
- Inconsistency model generation (published, but imperfect)
- Inconsistency model generation is HARD
  - Algorithm inefficient for multi-arm trials
- My current work leaves much to be desired
  - I won't go into further detail

GeMTC: MTC model generation
- Java library (open source, reusable) for model generation
- Command-line interface / R-package ("GeMTC CLI")
- Simplistic GUI ("GeMTC GUI")
- Now: (very) quick demo of GeMTC GUI
  - Loading a data file
  - Generating a node-split model
  - Quick look at generated code

http://drugis.org/gemtc

Beyond model generation
- Model generation alone is not enough:
  - GUI for network meta-analysis
  - Pseudo-automated convergence checking
  - Automatically generate the right summaries, tables, figures
  - Data entry / management
- We have this in ADDIS!

ADDIS goals
- The goals (will take a while to get there...):
  - Database of trials, really structured
  - Meta-analysis, network meta-analysis, decision analysis
  - Inform health care policy (regulation, guidelines, reimbursement)
  - Automate systematic review (i.e. eliminate the grunt work)
  - Sourcing from abstract databases, systematic reviews, registries
- So, a kind of business intelligence for health care policy

ADDIS current status
- Somewhat advanced trial data model
  - XML schema available
  - Inspired by CDISC / BRIDG / OCRe
  - Being vetted by CDISC expert now
- Tools for study selection falling behind
  - But receiving some attention right now!
- Hardly any data sourcing (so far focussed on regulators)
- Analysis tools have received most attention
  - This was/is the focus of my PhD research

Network meta-analysis in ADDIS
- Demo!
  - The example dataset
  - Building a network meta-analysis
  - Running the models, assessing convergence
  - Assessing inconsistency
  - Consistency results

http://drugis.org/addis

Generalized linear models
- Very flexible & general
- Requires a lot of knowledge from the user
- Some models (node-split, inconsistency) are complicated
- Automation could help to:
  - Make analysis faster / easier
  - Prevent coding mistakes
  - Ensure necessary steps are taken

Model generation / GeMTC
- Given a dataset, generates the model
- Everything else is done in WinBUGS
  - Requires some knowledge from the user
  - Generated models can be customized
- Only the most common types of model are available

Model generation wishlist
- Near future:
  - Relative-effect data
  - Detect sparse / invalid / problematic data
  - Fixed effects / random effects with heterogeneous variance
  - User-defined priors / knowledge-based prior selection
  - R package based on GeMTC, rjags, coda
- More distant future:
  - Covariates
  - Better inconsistency
  - DAG generation
- Software is open source: I welcome contributions!

Automation / ADDIS
- ADDIS is...
  - Much more ambitious: network meta-analysis is a means, not an end
  - Database of trials -> decision support
  - Less flexible
  - Not even near finished
- Relative to WinBUGS:
  - No manual coding
  - One-click interface to run models
  - User is explicitly asked to look at convergence
  - Models to assess inconsistency directly available
  - Appropriate tables & plots

Discussion
- Something in between GeMTC and ADDIS needed?
  - Or integrating GeMTC in an R package?
- There will always be a need for the raw WinBUGS code
  - Automated interface should not get in the way
  - Should give the user the ability to drop down to code level
- And there is much work to be done!

Thank you! Questions?