Generalized linear models and software for network meta-analysis


1 Generalized linear models and software for network meta-analysis
Sofia Dias & Gert van Valkenhoef
Tufts University, Boston MA, USA, June 2012

2 Generalized linear model (GLM) framework
- Pairwise meta-analysis and indirect comparisons are special cases of mixed treatment comparisons (MTC), also called network meta-analysis (NMA).
- All are types of linear regression, so the familiar GLM framework can be used to define the NMA model:
  - Define a likelihood $l(y \mid \gamma)$ with some unknown parameters.
  - Use a link function $g(\cdot)$ to map the parameter of interest, $\gamma$, onto the real line (where a linear relationship is assumed).
  - Define a model for the linear predictor.
- The GLM for (network) meta-analysis can be written as
  $g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
  with $i = 1, \dots, M$, $k = 1, \dots, na_i$, and $I$ the indicator function (0 if $k = 1$; 1 if $k \neq 1$).

3 The model
$g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is a continuous measure of the effect of the treatment in arm $k$ of study $i$.
- $\delta_{ik}$ are the trial-specific treatment effects of the treatment in arm $k$ relative to the treatment in arm 1.
- In a random effects (RE) model the $\delta_{ik}$ are assumed to be exchangeable:
  $\delta_{ik} \sim N(d_{t_{i1},t_{ik}}, \sigma^2)$, where under consistency $d_{t_{i1},t_{ik}} = d_{1,t_{ik}} - d_{1,t_{i1}}$.
- When multi-arm trials are available the RE distribution is multivariate normal.
- Suitable prior distributions need to be defined for $\mu_i$, $d_{1k}$ and $\sigma$.
- In a fixed effects (FE) model, the GLM simplifies to
  $g(\gamma) = \theta_{ik} = \mu_i + (d_{1,t_{ik}} - d_{1,t_{i1}}) I_{\{k \neq 1\}}$
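The multivariate normal random effects distribution for a multi-arm trial is not written out on the slide. A sketch of the usual form, assuming a single homogeneous between-trial variance $\sigma^2$ (which implies a covariance of $\sigma^2/2$ between any two effects measured against the same arm 1), is:

```latex
\begin{pmatrix} \delta_{i2} \\ \vdots \\ \delta_{i,na_i} \end{pmatrix}
\sim N\left(
\begin{pmatrix} d_{t_{i1},t_{i2}} \\ \vdots \\ d_{t_{i1},t_{i,na_i}} \end{pmatrix},
\begin{pmatrix}
\sigma^2   & \sigma^2/2 & \cdots & \sigma^2/2 \\
\sigma^2/2 & \sigma^2   & \cdots & \sigma^2/2 \\
\vdots     & \vdots     & \ddots & \vdots     \\
\sigma^2/2 & \sigma^2/2 & \cdots & \sigma^2
\end{pmatrix}
\right),
\qquad d_{t_{i1},t_{ik}} = d_{1,t_{ik}} - d_{1,t_{i1}}
```

For a two-arm trial ($na_i = 2$) this reduces to the univariate distribution $\delta_{i2} \sim N(d_{t_{i1},t_{i2}}, \sigma^2)$.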

4 GLM framework: Binomial/logit example
- Data: number of events, $r_{ik}$, out of the total number of participants, $n_{ik}$, in arm $k$ of trial $i$.
- The likelihood is $r_{ik} \sim \text{Binomial}(p_{ik}, n_{ik})$.
- Use the logit link to map the probabilities onto the real line.
- Model: $\theta_{ik} = \text{logit}(p_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-odds of an event on each arm of the trial.
- Define priors, etc.
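To make the logit scale concrete, the following minimal Python sketch computes the observed log-odds per arm for a single two-arm trial and reads off the baseline and the log odds ratio. The counts are hypothetical and the helper names `logit` and `expit` are illustrative; this is plain arithmetic, not a fitted model.

```python
import math


def logit(p):
    """Map a probability in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))


def expit(theta):
    """Inverse of logit: map a log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-theta))


# Hypothetical two-arm trial i: events r_ik out of n_ik participants.
r = [10, 20]
n = [100, 100]

# Observed log-odds in each arm (the scale of the linear predictor).
theta = [logit(rk / nk) for rk, nk in zip(r, n)]

mu_i = theta[0]                 # baseline: log-odds in arm 1
delta_i2 = theta[1] - theta[0]  # log odds ratio, arm 2 vs arm 1

print(f"mu_i = {mu_i:.3f}, delta_i2 = {delta_i2:.3f}")
```

In the Bayesian model both quantities are estimated jointly across all trials rather than computed trial by trial.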

5 GLM framework: Poisson/log example
- Data: number of events, $r_{ik}$, occurring in arm $k$ of trial $i$ over an exposure period $E_{ik}$ in person-years.
- The likelihood is $r_{ik} \sim \text{Poisson}(\lambda_{ik} E_{ik})$.
- Use the log link to map the rates onto the real line.
- Model: $\theta_{ik} = \log(\lambda_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$
- The linear predictor $\theta_{ik}$ is the log-rate of an event on each arm of the trial.
- Define priors, etc.

6 The GLM and WinBUGS
- GLMs are ideally suited for coding in WinBUGS due to their modular structure.
- We have developed WinBUGS code which directly translates GLM theory.
- One generic model structure for FE, one for RE.
- Code can be adapted for various data types by changing only the likelihood and link function.
- The meta-analysis model for the linear predictor $\theta_{ik}$ is always the same.
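The modularity described here can be sketched outside WinBUGS as well. In the Python sketch below, one function computes the fixed effects linear predictor and two interchangeable functions evaluate the log-likelihood for binomial/logit and Poisson/log data. All names, parameter values and counts are hypothetical, chosen only to show that swapping data type changes the likelihood and link but not the linear predictor.

```python
import math


def linear_predictor(mu_i, d, treatments, k):
    """theta_ik = mu_i + d[t_ik] - d[t_i1]; k is a 0-based arm index."""
    return mu_i + d[treatments[k]] - d[treatments[0]]


def binomial_loglik(r, n, theta):
    """Binomial likelihood with logit link: theta is the log-odds."""
    p = 1.0 / (1.0 + math.exp(-theta))
    log_choose = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    return log_choose + r * math.log(p) + (n - r) * math.log(1.0 - p)


def poisson_loglik(r, exposure, theta):
    """Poisson likelihood with log link: theta is the log-rate."""
    mean = math.exp(theta) * exposure
    return r * math.log(mean) - mean - math.lgamma(r + 1)


# Hypothetical parameter values: d[1] = 0 for the reference treatment.
d = {1: 0.0, 2: 0.4}
treatments = [1, 2]
theta = [linear_predictor(-2.0, d, treatments, k) for k in range(2)]

# The same theta feeds either likelihood; only the data and link differ.
loglik_binomial = sum(
    binomial_loglik(r, n, th) for r, n, th in zip([10, 16], [100, 100], theta)
)
loglik_poisson = sum(
    poisson_loglik(r, e, th) for r, e, th in zip([12, 18], [90.0, 88.5], theta)
)
```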

7 FE model: Binomial/logit

    # Binomial likelihood, logit link
    # Fixed effects model
    model{                                # *** PROGRAM STARTS
    for(i in 1:ns){                       # LOOP THROUGH STUDIES
        mu[i] ~ dnorm(0,.0001)            # vague priors for all trial baselines
        for (k in 1:na[i]) {              # LOOP THROUGH ARMS
            r[i,k] ~ dbin(p[i,k],n[i,k])  # binomial likelihood
            # model for linear predictor
            logit(p[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
        }
    }
    d[1] <- 0    # treatment effect is zero for reference treatment
    # vague priors for treatment effects
    for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
    }                                     # *** PROGRAM ENDS

8 FE model: Poisson/log

    # Poisson likelihood, log link
    # Fixed effects model
    model{                                # *** PROGRAM STARTS
    for(i in 1:ns){                       # LOOP THROUGH STUDIES
        mu[i] ~ dnorm(0,.0001)            # vague priors for all trial baselines
        for (k in 1:na[i]) {              # LOOP THROUGH ARMS
            r[i,k] ~ dpois(beta[i,k])     # Poisson likelihood
            beta[i,k] <- lambda[i,k]*e[i,k]   # failure rate * exposure
            # model for linear predictor
            log(lambda[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
        }
    }
    d[1] <- 0    # treatment effect is zero for reference treatment
    # vague priors for treatment effects
    for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
    }                                     # *** PROGRAM ENDS

9 Data for Binomial/logit example
- Define number of treatments, nt, and number of studies, ns:

    list(nt=4,ns=24)

- Data given as one trial per row.
- Columns are: events, number of patients, treatments compared and number of arms in trial.
- Data can be copied from spreadsheet software:

    r[,1] n[,1] r[,2] n[,2] r[,3] n[,3] t[,1] t[,2] t[,3] na[]
    [24 data rows, one per trial; values not preserved in this transcript, NA marks a missing third arm]
    END
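The rectangular layout above can be read and sanity-checked before it is pasted into WinBUGS. The Python sketch below parses such a block into columns and checks that the number of non-missing treatment codes in each row matches `na[]`. The two data rows are invented for illustration (the original rows did not survive in this transcript), and `parse_rectangular` and `arms_per_trial` are hypothetical helper names.

```python
def parse_rectangular(text):
    """Parse a WinBUGS-style rectangular data block into {column: values}."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0]
    rows = []
    for fields in lines[1:]:
        if fields[0].upper() == "END":
            break
        # NA marks a missing value, e.g. the third arm of a two-arm trial.
        rows.append([None if f == "NA" else float(f) for f in fields])
    return {name: [row[j] for row in rows] for j, name in enumerate(header)}


def arms_per_trial(columns, max_arms=3):
    """Count the non-missing treatment codes t[,k] in each trial."""
    n_trials = len(columns["na[]"])
    return [
        sum(columns[f"t[,{k}]"][i] is not None for k in range(1, max_arms + 1))
        for i in range(n_trials)
    ]


# Illustrative rows only: these numbers are made up, not the slide's data.
example = """
r[,1] n[,1] r[,2] n[,2] r[,3] n[,3] t[,1] t[,2] t[,3] na[]
10 100 15 100 NA NA 1 2 NA 2
8 120 12 118 20 121 1 3 4 3
END
"""

data = parse_rectangular(example)
consistent = arms_per_trial(data) == [int(x) for x in data["na[]"]]
```

A mismatch between the counted arms and `na[]` is a common source of errors when rows are edited by hand.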

10 Initial values for Binomial/logit example
- Define values where the simulation will start:

    # Initial values
    # Chain 1
    list(d=c(NA,0,0,0),
         mu=c(0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0))
    # Chain 2
    list(d=c(NA,0.1,-1,-0.2),
         mu=c(1,-1,-2,0,0, -2,1,0,2,2, 1,-1,-2,0,0, -2,1,0,2,2, -2,-0.5,-3,0.5))

- Run WinBUGS.
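Initial value lists like these are easy to mistype for larger networks, so they can be generated instead. The sketch below is a hypothetical Python helper (`bugs_inits` is not part of WinBUGS) that writes a list in the format shown above. The first element of `d` is written as `NA` because `d[1]` is fixed at zero in the model and therefore cannot be given an initial value.

```python
def bugs_vector(values):
    """Format a Python list as a BUGS c(...) vector; None becomes NA."""
    return "c(" + ",".join("NA" if v is None else format(v, "g") for v in values) + ")"


def bugs_inits(d, mu):
    """Build one chain's initial-value list for treatment effects and baselines."""
    return f"list(d={bugs_vector(d)}, mu={bugs_vector(mu)})"


nt, ns = 4, 24  # number of treatments and studies, as in the example

# Chain 1: everything starts at zero; d[1] is a constant, hence NA.
chain1 = bugs_inits([None] + [0] * (nt - 1), [0] * ns)
print(chain1)
```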

11 NICE DSU Technical Support Documents
- Series of Technical Support Documents (TSDs) on evidence synthesis commissioned by NICE DSU, available from the NICE DSU website.
- The GLM theory for (network) meta-analysis is set out with a variety of worked examples and code in TSD2.
- Other TSDs deal with heterogeneity and meta-regression (TSD3), inconsistency (TSD4), baseline models (TSD5) and software (TSD6).
- TSD7 has a checklist for reviewers of NMA submissions.
- Primarily for NICE Technology Appraisals, but relevant for submissions to journals as well.


13 Advantages of TSD code
- Several worked examples available:
  - Number of events: Binomial/logit
  - Rate data: Poisson/log and Binomial/cloglog
  - Competing risks: Multinomial/log
  - Continuous: Normal/identity, including change from baseline, relative effect data, SMD
  - Ordered categorical data: Multinomial/probit
- Code is very general and will handle any combination of likelihood and link function; any number of trials and treatments; any number of multi-arm trials; arm-based data or data in relative effect format.
- Correctly accounts for the correlations in multi-arm trials.
- Easy to set up shared parameter models, for example when some data are in arm-based and some in relative effect formats.

14 Other bits of code...
- Basic code will provide all treatment effects relative to treatment 1 (the chosen reference).
- Due to the modular nature of WinBUGS, it is easy to add extra code to provide other output, such as:
  - Assessing model fit (residual deviance);
  - Obtaining all relative treatment effects;
  - Obtaining relative effects on a different scale (e.g. odds ratio) with correct uncertainty;
  - Obtaining NNT or absolute probabilities/rates with associated uncertainty;
  - Obtaining probabilities that each treatment is the best, second best, etc.
- TSD2 provides sample code.
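Several of these outputs are simple transformations applied to each posterior draw. The Python sketch below shows the idea on a tiny made-up set of draws of d (one list per MCMC iteration, with the reference treatment fixed at 0): all pairwise effects, an odds ratio summarised after transforming every draw, and the probability that each treatment is best. Function names are illustrative, and this is a post-processing analogue rather than the TSD2 WinBUGS code.

```python
import math
import statistics


def all_pairwise(d):
    """All relative effects d_ck = d_1k - d_1c (c < k), keyed by 1-based labels."""
    nt = len(d)
    return {(c + 1, k + 1): d[k] - d[c] for c in range(nt) for k in range(c + 1, nt)}


def odds_ratio_summary(samples, k):
    """Median odds ratio of treatment k (1-based) vs treatment 1.

    Each draw is exponentiated first, so uncertainty is carried through
    the transformation rather than applied to a point estimate.
    """
    return statistics.median(math.exp(d[k - 1]) for d in samples)


def prob_best(samples, higher_is_better=True):
    """Proportion of draws in which each treatment has the best effect."""
    nt = len(samples[0])
    pick = max if higher_is_better else min
    wins = [0] * nt
    for d in samples:
        wins[pick(range(nt), key=lambda j: d[j])] += 1
    return [w / len(samples) for w in wins]


# Four made-up posterior draws of (d[1], d[2], d[3]) on the log-odds scale.
samples = [[0.0, 0.5, 1.0], [0.0, 1.2, 0.3], [0.0, -0.1, -0.4], [0.0, 0.2, 0.9]]

print(all_pairwise(samples[0]))
print(odds_ratio_summary(samples, 2))
print(prob_best(samples))
```

Whether a higher or lower value is "best" depends on whether the outcome is beneficial or harmful, hence the `higher_is_better` switch.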

15 Advantages of using WinBUGS for NMA
- Code is already available for many data types, so there is no need for extra coding.
- No need for data preparation before running the model.
- Produces samples from the true posterior distribution.
- CODA output can be used directly to inform economic models.
- Due to the flexibility of WinBUGS, code can easily be extended to more complex models: covariates (meta-regression, see TSD3); class effects models; use of IPD; etc.

16 Disadvantages of using WinBUGS for NMA
- Requires knowledge of MCMC methods to check convergence and detect problems.
  - But it will still provide output, which can be misinterpreted...
- Some models may require many iterations, which can take some time to run.
- Graphical capabilities are very limited, so results need to be exported to other software.
- Setting up initial values may be tricky in some models.
- May have problems converging when the network is sparse and/or has many zero cells.
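One common numerical convergence check compares the variation between chains with the variation within them. The Python sketch below computes a basic potential scale reduction factor (R-hat, in the spirit of the Gelman-Rubin family of diagnostics) for one parameter from several equal-length chains. It is an illustration of the idea, assuming chains have already been exported (for example as CODA output), and is not the exact statistic plotted by WinBUGS.

```python
import math
import statistics


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin R-hat for one parameter from equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    # W: average of the within-chain sample variances.
    within = statistics.mean(statistics.variance(c) for c in chains)
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Two toy chains that agree, and two that sit in different places.
agree = potential_scale_reduction([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
disagree = potential_scale_reduction([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
```

Values close to 1 are consistent with convergence, while values well above 1 indicate that the chains have not yet mixed. Trace plots should still be inspected.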

17 Using the TSD WinBUGS code
- Need basic knowledge of statistics!
- Choose appropriate code from the website, decide which nodes to monitor, and how to interpret the output.
- Input the data and the number of studies and treatments.
- The user needs to define:
  - Overall baseline or reference treatment (treatment 1) for NMA;
  - Treatment coding order;
  - Priors, which can be tricky for the heterogeneity in the RE model;
  - Initial values for the MCMC simulation to start.
- Before valid output can be obtained users also need to check:
  - Convergence;
  - Model fit;
  - Consistency.
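For binomial data, model fit is typically assessed through the residual deviance. The Python sketch below evaluates the standard binomial deviance contribution of one arm, given the observed count and a fitted probability, with the convention 0 log 0 = 0 so that arms with zero events are handled. The function names and numbers are illustrative. In a Bayesian analysis this quantity would be computed at each iteration and its posterior mean compared with the number of data points.

```python
import math


def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_residual_deviance(r, n, p_hat):
    """Deviance contribution of one arm: observed r of n, fitted probability p_hat."""
    r_hat = n * p_hat  # fitted number of events
    return 2.0 * (_xlogy(r, r / r_hat) + _xlogy(n - r, (n - r) / (n - r_hat)))


def total_residual_deviance(arms):
    """Sum the contributions over (r, n, p_hat) triples for all arms."""
    return sum(binomial_residual_deviance(r, n, p) for r, n, p in arms)


# Hypothetical arms: a perfect fit contributes 0, a zero cell is still finite.
perfect = binomial_residual_deviance(10, 100, 0.1)
zero_cell = binomial_residual_deviance(0, 50, 0.02)
total = total_residual_deviance([(10, 100, 0.1), (0, 50, 0.02)])
```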

18 Automated model generation
- Generate model: abstract representation
  - Structure: basic parameters, study baselines
  - Priors
  - Starting values
- Abstract representation → concrete implementation
  - BUGS syntax (templates based on NICE TSDs)
  - JAGS syntax (templates based on NICE TSDs)
  - YADAS MCMC models in Java

19 Current model generation capabilities (1/4)
- Model structure depends on type:
  - Consistency / node-split / inconsistency
  - Random effects, homogeneous variance
- General method for priors:
  - Use a simple heuristic
  - Define what counts as a large deviation → vague priors
- General method for starting values:
  - Sample from over-dispersed MLEs
  - Requires that parameters are directly measured
  - Additional constraint for model structure
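As an illustration of "sample from over-dispersed MLEs", the Python sketch below computes the log odds ratio and its approximate standard error from one directly measured comparison, then draws one starting value per chain from a normal distribution centred on that estimate with an inflated standard deviation. The inflation factor of 2.5, the function names and the data are assumptions made for this example. The slides do not state how GeMTC implements this step.

```python
import math
import random


def log_odds_ratio_mle(r1, n1, r2, n2):
    """Log odds ratio (arm 2 vs arm 1) and its approximate standard error.

    Requires non-zero events and non-events in both arms.
    """
    a, b, c, d = r2, n2 - r2, r1, n1 - r1
    estimate = math.log((a / b) / (c / d))
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_err


def overdispersed_starts(estimate, std_err, n_chains, scale=2.5, seed=1):
    """One starting value per chain, drawn from N(estimate, (scale * std_err)^2)."""
    rng = random.Random(seed)
    return [rng.gauss(estimate, scale * std_err) for _ in range(n_chains)]


# Hypothetical direct comparison: 10/100 events vs 20/100 events.
estimate, std_err = log_odds_ratio_mle(10, 100, 20, 100)
starts = overdispersed_starts(estimate, std_err, n_chains=4)
```

This also shows why the parameters must be directly measured: without at least one trial comparing the two treatments there is no estimate to centre the draws on.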

20 Current model generation capabilities (2/4)
Consistency model generation (under review)
- Consistency model generation: easy, even arbitrary
- Method for generating starting values restricts structure
  - Basic parameters must be directly measured
  - They are a spanning tree of the evidence graph
  - Will choose compact tree → good for convergence
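The spanning-tree idea can be sketched as follows. The Python code below builds an evidence graph from the treatments compared in each study and extracts a breadth-first spanning tree from a chosen root, so that every tree edge is a directly measured comparison. Breadth-first search is used here only as one simple way to keep the tree shallow. It is an assumption of this sketch, not necessarily the algorithm GeMTC uses, and the study list is hypothetical.

```python
from collections import deque
from itertools import combinations


def evidence_graph(studies):
    """Adjacency sets: two treatments are linked if some study compares them."""
    graph = {}
    for arms in studies:
        for a, b in combinations(sorted(set(arms)), 2):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def spanning_tree(graph, root):
    """Breadth-first spanning tree as a list of (parent, child) edges."""
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                tree.append((node, neighbour))
                queue.append(neighbour)
    if len(seen) != len(graph):
        raise ValueError("evidence network is not connected")
    return tree


# Hypothetical network: one entry per study, listing the treatments it compares.
studies = [["A", "B"], ["A", "C"], ["B", "C", "D"], ["D", "E"]]
tree = spanning_tree(evidence_graph(studies), "A")
print(tree)
```

Each edge of the tree corresponds to one basic parameter, so a network of five treatments yields four basic parameters.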

21 Current model generation capabilities (2/4)
[Figure: example evidence graph with treatments A, B, C, D, E]

22 Current model generation capabilities (2/4)
[Figure: evidence graphs; node labels A, B, C, D, E and tPA, UK, ASPAC, AtPA, Ten, SK, SKtPA, Ret]

23 Current model generation capabilities (3/4)
Node-split model generation (draft)
- Node-splitting models require some recoding
  - Generally there will be many nodes to split
  - Inconvenient to do by hand
- Will present at SRSM
  - Main problem is choosing nodes to split
  - If the right nodes are chosen, model generation is again easy
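A naive way to list candidate nodes to split is to keep every directly measured comparison whose two treatments remain connected through other comparisons once the direct one is set aside, since only those have both direct and indirect evidence. The Python sketch below does exactly that for a hypothetical network of two-arm studies. It deliberately ignores the complications introduced by multi-arm trials and should not be read as the selection rule used by GeMTC.

```python
from itertools import combinations


def direct_comparisons(studies):
    """Sorted list of treatment pairs that are compared in at least one study."""
    return sorted(
        {pair for arms in studies for pair in combinations(sorted(set(arms)), 2)}
    )


def connected_without(edges, removed, start, goal):
    """Is goal reachable from start when the edge `removed` is left out?"""
    remaining = [e for e in edges if e != removed]
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in remaining:
            if node in (a, b):
                other = b if node == a else a
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return goal in seen


def split_candidates(studies):
    """Comparisons with direct evidence and at least one indirect path."""
    edges = direct_comparisons(studies)
    return [e for e in edges if connected_without(edges, e, e[0], e[1])]


# Hypothetical two-arm studies: A-B-C form a loop, D hangs off C.
studies = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
print(split_candidates(studies))
```

Here the comparison C vs D is excluded because it is informed by direct evidence only.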

24 Current model generation capabilities (4/4)
Inconsistency model generation (published, but imperfect)
- Inconsistency model generation is HARD
- Algorithm is inefficient for multi-arm trials
- My current work leaves much to be desired
- I won't go into further detail

25 GeMTC: MTC model generation
- Java library (open source, reusable) for model generation
- Command-line interface / R package ("GeMTC CLI")
- Simplistic GUI ("GeMTC GUI")
- Now: (very) quick demo of GeMTC GUI
  - Loading a data file
  - Generating a node-split model
  - Quick look at generated code


27 Beyond model generation
- Model generation alone is not enough:
  - GUI for network meta-analysis
  - Pseudo-automated convergence checking
  - Automatically generate the right summaries, tables, figures
  - Data entry / management
- We have this in ADDIS!

28 ADDIS goals
The goals (will take a while to get there...):
- Database of trials, really structured
- Meta-analysis, network meta-analysis, decision analysis
- Inform health care policy (regulation, guidelines, reimbursement)
- Automate systematic review (i.e. eliminate the grunt work)
- Sourcing from abstract databases, systematic reviews, registries
- So, a kind of business intelligence for health care policy

29 ADDIS current status
- Somewhat advanced trial data model
  - XML schema available
  - Inspired by CDISC / BRIDG / OCRe
  - Being vetted by a CDISC expert now
- Tools for study selection falling behind
  - But receiving some attention right now!
- Hardly any data sourcing (so far focussed on regulators)
- Analysis tools have received most attention
  - This was/is the focus of my PhD research

30 Network meta-analysis in ADDIS
Demo!
- The example dataset
- Building a network meta-analysis
- Running the models, assessing convergence
- Assessing inconsistency
- Consistency results


32 Generalized linear models
- Very flexible and general
- Requires a lot of knowledge from the user
- Some models (node-split, inconsistency) are complicated
- Automation could help to:
  - Make analysis faster / easier
  - Prevent coding mistakes
  - Ensure necessary steps are taken

33 Model generation / GeMTC
- Given a dataset, generates a model
- Everything else is done in WinBUGS
- Requires some knowledge from the user
- Generated models can be customized
- Only the most common types of model are available

34 Model generation wishlist
- Near future:
  - Relative-effect data
  - Detect sparse / invalid / problematic data
  - Fixed effects / random effects with heterogeneous variance
  - User-defined priors / knowledge-based prior selection
  - R package based on GeMTC, rjags, coda
- More distant future:
  - Covariates
  - Better inconsistency DAG generation
- Software is open source: I welcome contributions!

35 Automation / ADDIS
- ADDIS is...
  - Much more ambitious
    - Network meta-analysis is a means, not an end
    - Database of trials → decision support
  - Less flexible
  - Not even near finished
- Relative to WinBUGS:
  - No manual coding
  - One-click interface to run models
  - User is explicitly asked to look at convergence
  - Models to assess inconsistency directly available
  - Appropriate tables and plots

36 Discussion
- Something in between GeMTC and ADDIS needed?
  - Or integrating GeMTC in an R package?
- There will always be a need for the raw WinBUGS code
  - Automated interface should not get in the way
  - Should give the user the ability to drop down to code level
- And there is much work to be done!

37 Thank you! Questions?


More information

Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

More information

Introduction to Fixed Effects Methods

Introduction to Fixed Effects Methods Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed

More information

Training/Internship Brochure Advanced Clinical SAS Programming Full Time 6 months Program

Training/Internship Brochure Advanced Clinical SAS Programming Full Time 6 months Program Training/Internship Brochure Advanced Clinical SAS Programming Full Time 6 months Program Domain Clinical Data Sciences Private Limited 8-2-611/1/2, Road No 11, Banjara Hills, Hyderabad Andhra Pradesh

More information

Bayesian Statistics in One Hour. Patrick Lam

Bayesian Statistics in One Hour. Patrick Lam Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical

More information

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: PRELIS User s Guide

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: PRELIS User s Guide Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng LISREL for Windows: PRELIS User s Guide Table of contents INTRODUCTION... 1 GRAPHICAL USER INTERFACE... 2 The Data menu... 2 The Define Variables

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015 R Tools Evaluation A review by Analytics @ Global BI / Local & Regional Capabilities Telefónica CCDO May 2015 R Features What is? Most widely used data analysis software Used by 2M+ data scientists, statisticians

More information

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments

More information

Spreadsheet software for linear regression analysis

Spreadsheet software for linear regression analysis Spreadsheet software for linear regression analysis Robert Nau Fuqua School of Business, Duke University Copies of these slides together with individual Excel files that demonstrate each program are available

More information

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc.

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Paper 264-26 Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Abstract: There are several procedures in the SAS System for statistical modeling. Most statisticians who use the SAS

More information

Towards running complex models on big data

Towards running complex models on big data Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation

More information

Analyzing Structural Equation Models With Missing Data

Analyzing Structural Equation Models With Missing Data Analyzing Structural Equation Models With Missing Data Craig Enders* Arizona State University [email protected] based on Enders, C. K. (006). Analyzing structural equation models with missing data. In G.

More information

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing [email protected]

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing [email protected] IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

More information

BUSINESS RULES CONCEPTS... 2 BUSINESS RULE ENGINE ARCHITECTURE... 4. By using the RETE Algorithm... 5. Benefits of RETE Algorithm...

BUSINESS RULES CONCEPTS... 2 BUSINESS RULE ENGINE ARCHITECTURE... 4. By using the RETE Algorithm... 5. Benefits of RETE Algorithm... 1 Table of Contents BUSINESS RULES CONCEPTS... 2 BUSINESS RULES... 2 RULE INFERENCE CONCEPT... 2 BASIC BUSINESS RULES CONCEPT... 3 BUSINESS RULE ENGINE ARCHITECTURE... 4 BUSINESS RULE ENGINE ARCHITECTURE...

More information

NICE DSU TECHNICAL SUPPORT DOCUMENT 6: EMBEDDING EVIDENCE SYNTHESIS IN PROBABILISTIC COST-EFFECTIVENESS ANALYSIS: SOFTWARE CHOICES

NICE DSU TECHNICAL SUPPORT DOCUMENT 6: EMBEDDING EVIDENCE SYNTHESIS IN PROBABILISTIC COST-EFFECTIVENESS ANALYSIS: SOFTWARE CHOICES NICE DSU TECHNICAL SUPPORT DOCUMENT 6: EMBEDDING EVIDENCE SYNTHESIS IN PROBABILISTIC COST-EFFECTIVENESS ANALYSIS: SOFTWARE CHOICES REPORT BY THE DECISION SUPPORT UNIT May 2011 (last updated April 2012)

More information

Big Data, Statistics, and the Internet

Big Data, Statistics, and the Internet Big Data, Statistics, and the Internet Steven L. Scott April, 4 Steve Scott (Google) Big Data, Statistics, and the Internet April, 4 / 39 Summary Big data live on more than one machine. Computing takes

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Markov Chain Monte Carlo Simulation Made Simple

Markov Chain Monte Carlo Simulation Made Simple Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Model Calibration with Open Source Software: R and Friends. Dr. Heiko Frings Mathematical Risk Consulting

Model Calibration with Open Source Software: R and Friends. Dr. Heiko Frings Mathematical Risk Consulting Model with Open Source Software: and Friends Dr. Heiko Frings Mathematical isk Consulting Bern, 01.09.2011 Agenda in a Friends Model with & Friends o o o Overview First instance: An Extreme Value Example

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

Gamma Distribution Fitting

Gamma Distribution Fitting Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

More information

Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

More information

Package dsmodellingclient

Package dsmodellingclient Package dsmodellingclient Maintainer Author Version 4.1.0 License GPL-3 August 20, 2015 Title DataSHIELD client site functions for statistical modelling DataSHIELD

More information

WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math

WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math Textbook Correlation WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math Following Directions Unit FIRST QUARTER AND SECOND QUARTER Logic Unit

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Poisson Regression or Regression of Counts (& Rates)

Poisson Regression or Regression of Counts (& Rates) Poisson Regression or Regression of (& Rates) Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Generalized Linear Models Slide 1 of 51 Outline Outline

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

IBM SPSS Data Preparation 22

IBM SPSS Data Preparation 22 IBM SPSS Data Preparation 22 Note Before using this information and the product it supports, read the information in Notices on page 33. Product Information This edition applies to version 22, release

More information

MATLAB and Big Data: Illustrative Example

MATLAB and Big Data: Illustrative Example MATLAB and Big Data: Illustrative Example Rick Mansfield Cornell University August 19, 2014 Goals Use a concrete example from my research to: Demonstrate the value of vectorization Introduce key commands/functions

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

Poisson Models for Count Data

Poisson Models for Count Data Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

More information

Examining a Fitted Logistic Model

Examining a Fitted Logistic Model STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information