Generalized linear models and software for network meta-analysis
1 Generalized linear models and software for network meta-analysis Sofia Dias & Gert van Valkenhoef Tufts University, Boston MA, USA, June 2012
2 Generalized linear model (GLM) framework
Pairwise meta-analysis and indirect comparisons are special cases of mixed treatment comparisons (or NMA).
All are types of linear regression.
Use the familiar GLM framework to define the NMA model:
  Define a likelihood $l(y \mid \gamma)$ with some unknown parameters.
  Use a link function $g(\cdot)$ to map the parameter of interest, $\gamma$, onto the real line (assume a linear relationship).
  Define a model for the linear predictor.
The GLM for (network) meta-analysis can be written as
$$g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$$
with $i = 1, \ldots, M$, $k = 1, \ldots, na_i$, and $I$ the indicator function (0 if $k = 1$; 1 if $k \neq 1$).
3 The Model
$$g(\gamma) = \theta_{ik} = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$$
The linear predictor $\theta_{ik}$ is a continuous measure of the effect of the treatment in arm $k$ of study $i$.
$\delta_{ik}$ are the trial-specific treatment effects of the treatment in arm $k$ relative to the treatment in arm 1.
In a random effects (RE) model the $\delta_{ik}$ are assumed to be exchangeable:
$$\delta_{ik} \sim N(d_{t_{i1},t_{ik}}, \sigma^2)$$
where, under consistency, $d_{t_{i1},t_{ik}} = d_{1,t_{ik}} - d_{1,t_{i1}}$.
When multi-arm trials are available the RE distribution is multivariate normal.
Suitable prior distributions need to be defined for $\mu_i$, $d_{1k}$ and $\sigma$.
In a fixed effects (FE) model, the GLM simplifies to
$$g(\gamma) = \theta_{ik} = \mu_i + (d_{1,t_{ik}} - d_{1,t_{i1}}) I_{\{k \neq 1\}}$$
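As a rough illustration of the fixed effects linear predictor above, the following Python sketch evaluates $\theta_{ik}$ for the arms of a single study. The function name `linear_predictor` and all numeric values are hypothetical, chosen only for illustration; they are not taken from the slides or the TSDs.

```python
def linear_predictor(mu_i, d, treatments):
    """Return theta_ik for each arm k of one study under the FE model.

    mu_i       -- study-specific baseline (arm 1) on the link scale
    d          -- dict: treatment code -> effect relative to treatment 1 (d[1] == 0)
    treatments -- treatment codes t_i1, t_i2, ... for the arms of the study
    """
    t_ref = treatments[0]
    # For k = 1 the difference d[t_i1] - d[t_i1] is zero, which plays the
    # role of the indicator I{k != 1} in the formula.
    return [mu_i + d[t] - d[t_ref] for t in treatments]


# Hypothetical basic parameters for a 3-treatment network.
d = {1: 0.0, 2: 0.5, 3: 0.8}

# A study comparing treatments 2 and 3 (no arm on the reference treatment).
theta = linear_predictor(-1.0, d, [2, 3])
```

The second element of `theta` is the baseline plus `d[3] - d[2]`, so a trial that does not include treatment 1 still informs the basic parameters through their difference.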
4 GLM framework: Binomial/logit Example
Data: number of events, $r_{ik}$, out of total number of participants, $n_{ik}$, in arm $k$ of trial $i$.
The likelihood is
$$r_{ik} \sim \mathrm{Binomial}(p_{ik}, n_{ik})$$
Use the logit link to map the probabilities onto the real line.
Model:
$$\theta_{ik} = \mathrm{logit}(p_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$$
The linear predictor $\theta_{ik}$ is the log-odds of an event on each arm of the trial.
Define priors etc.
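A minimal Python sketch of the binomial/logit combination may help make the link function concrete. The helper names (`inv_logit`, `binomial_loglik`) and the values of `mu` and `delta` are assumptions for illustration only.

```python
import math


def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binomial_loglik(r, n, theta):
    """Log-likelihood of r events out of n when logit(p) = theta."""
    p = inv_logit(theta)
    log_choose = math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
    return log_choose + r * math.log(p) + (n - r) * math.log(1.0 - p)


# Hypothetical two-arm trial: baseline log-odds mu, log odds ratio delta.
mu, delta = -1.0, 0.5
p_control = inv_logit(mu)          # arm 1: theta = mu
p_treat = inv_logit(mu + delta)    # arm 2: theta = mu + delta

# On the logit scale the treatment effect is a log odds ratio.
odds_ratio = (p_treat / (1 - p_treat)) / (p_control / (1 - p_control))
```

Here `odds_ratio` recovers `exp(delta)` up to floating point error, which is why effects estimated on the logit scale are reported as (log) odds ratios.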
5 GLM framework: Poisson/log Example
Data are number of events, $r_{ik}$, occurring in arm $k$ of trial $i$ over an exposure period $E_{ik}$ in person-years.
The likelihood is
$$r_{ik} \sim \mathrm{Poisson}(\lambda_{ik} E_{ik})$$
Use the log link to map the rates onto the real line.
Model:
$$\theta_{ik} = \log(\lambda_{ik}) = \mu_i + \delta_{ik} I_{\{k \neq 1\}}$$
The linear predictor $\theta_{ik}$ is the log-rate of an event on each arm of the trial.
Define priors etc.
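The same pattern applies to rate data. The sketch below, with the hypothetical helper `poisson_loglik` and made-up numbers, shows that only the likelihood and link change while the linear predictor keeps its form.

```python
import math


def poisson_loglik(r, exposure, theta):
    """Log-likelihood of r events over `exposure` person-years when log(rate) = theta."""
    rate = math.exp(theta)      # events per person-year
    mean = rate * exposure      # expected number of events in this arm
    return r * math.log(mean) - mean - math.lgamma(r + 1)


# Hypothetical arm: log-rate mu + delta, 10 person-years of exposure.
mu, delta = math.log(0.1), 0.0
loglik_zero_events = poisson_loglik(0, 10.0, mu + delta)
```

With a rate of 0.1 per person-year and 10 person-years the expected count is 1, so the log-likelihood of observing zero events is approximately -1.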
6 The GLM and WinBUGS
GLMs are ideally suited for coding in WinBUGS due to their modular structure.
We have developed WinBUGS code which directly translates GLM theory.
One generic model structure for FE, one for RE.
Code can be adapted for various data types by changing only the likelihood and link function.
The meta-analysis model for the linear predictor $\theta_{ik}$ is always the same.
7 FE model: Binomial/logit
# Binomial likelihood, logit link
# Fixed effects model
model{                                 # *** PROGRAM STARTS
  for(i in 1:ns){                      # LOOP THROUGH STUDIES
    mu[i] ~ dnorm(0,.0001)             # vague priors for all trial baselines
    for (k in 1:na[i]) {               # LOOP THROUGH ARMS
      r[i,k] ~ dbin(p[i,k],n[i,k])     # binomial likelihood
      # model for linear predictor
      logit(p[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
    }
  }
  d[1] <- 0    # treatment effect is zero for reference treatment
  # vague priors for treatment effects
  for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
}                                      # *** PROGRAM ENDS
8 FE model: Poisson/log
# Poisson likelihood, log link
# Fixed effects model
model{                                 # *** PROGRAM STARTS
  for(i in 1:ns){                      # LOOP THROUGH STUDIES
    mu[i] ~ dnorm(0,.0001)             # vague priors for all trial baselines
    for (k in 1:na[i]) {               # LOOP THROUGH ARMS
      r[i,k] ~ dpois(beta[i,k])        # Poisson likelihood
      beta[i,k] <- lambda[i,k]*e[i,k]  # failure rate * exposure
      # model for linear predictor
      log(lambda[i,k]) <- mu[i] + d[t[i,k]] - d[t[i,1]]
    }
  }
  d[1] <- 0    # treatment effect is zero for reference treatment
  # vague priors for treatment effects
  for (k in 2:nt){ d[k] ~ dnorm(0,.0001) }
}                                      # *** PROGRAM ENDS
9 Data for Binomial/logit example
Define number of treatments, nt, and number of studies, ns:
list(nt=4,ns=24)
Data given as one trial per row.
Columns are: events, number of patients, treatments compared and number of arms in trial.
Data can be copied from spreadsheet software:
r[,1] n[,1] r[,2] n[,2] r[,3] n[,3] t[,1] t[,2] t[,3] na[]
... (24 data rows; values not preserved in this transcription, NA marks the missing third arm of two-arm trials)
END
10 Initial values for Binomial/logit example
Define values where simulation will start
# Initial values
# Chain 1
list(d=c(NA,0,0,0),
  mu=c(0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0) )
# Chain 2
list(d=c(NA,0.1,-1,-0.2),
  mu=c(1,-1,-2,0,0, -2,1,0,2,2, 1,-1,-2,0,0, -2,1,0,2,2, -2,-0.5,-3,0.5) )
Run WinBUGS
11 NICE DSU Technical Support Documents
Series of Technical Support Documents (TSDs) on Evidence Synthesis commissioned by NICE DSU, available from the NICE DSU website.
The GLM theory for (network) meta-analysis is set out with a variety of worked examples and code in TSD2.
Other TSDs deal with Heterogeneity and meta-regression (TSD3), Inconsistency (TSD4), Baseline Models (TSD5) and Software (TSD6).
TSD7 has a checklist for reviewers of NMA submissions.
Primarily for NICE Technology Appraisals, but relevant for submissions to journals as well.
13 Advantages of TSD Code
Several worked examples available:
  Number of events: Binomial/logit
  Rate data: Poisson/log and Binomial/cloglog
  Competing risks: Multinomial/log
  Continuous: Normal/identity, including change from baseline, relative effect data, SMD
  Ordered categorical data: Multinomial/probit
Code is very general and will handle any combination of likelihood/link function; any number of trials and treatments; any number of multi-arm trials; arm-based data or data in relative effect format.
Correctly accounts for the correlations in multi-arm trials.
Easy to set up shared parameter models, for example when some data are in arm-based and some in relative effect formats.
14 Other bits of code...
Basic code will provide all treatment effects relative to treatment 1 (the chosen reference).
Due to the modular nature of WinBUGS, it is easy to add extra code to provide other output such as:
  Assessing model fit (residual deviance);
  Obtaining all relative treatment effects;
  Obtaining relative effects on a different scale (e.g. odds ratio) with correct uncertainty;
  Obtaining NNT or absolute probabilities/rates with associated uncertainty;
  Obtaining probabilities that each treatment is the best, second best etc.
TSD2 provides sample code.
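The idea behind "correct uncertainty" is that summaries are computed on each posterior draw and only then averaged. The Python sketch below illustrates this for odds ratios and the probability of being best. The four posterior draws are invented for illustration (real output would be thousands of MCMC samples of `d`), and the function names are hypothetical; this is not the TSD2 code.

```python
import math

# Hypothetical posterior draws of (d[1], d[2], d[3]) on the log-odds scale,
# with d[1] fixed at 0 for the reference treatment.
samples = [
    [0.0, 0.5, 0.9],
    [0.0, 0.4, 0.3],
    [0.0, 0.6, 1.1],
    [0.0, -0.1, 0.2],
]


def odds_ratio_draws(samples, a, b):
    """Odds ratio of treatment b versus a, computed separately on every draw."""
    return [math.exp(s[b - 1] - s[a - 1]) for s in samples]


def prob_best(samples, higher_is_better=True):
    """Share of draws in which each treatment has the best effect."""
    nt = len(samples[0])
    pick = max if higher_is_better else min
    counts = [0] * nt
    for s in samples:
        best = pick(range(nt), key=lambda k: s[k])
        counts[best] += 1
    return [c / len(samples) for c in counts]


or_32 = odds_ratio_draws(samples, 2, 3)   # treatment 3 versus treatment 2
mean_or_32 = sum(or_32) / len(or_32)      # posterior mean odds ratio
p_best = prob_best(samples)               # assumes a higher log-odds is better
```

Transforming draw by draw matters: the posterior mean odds ratio obtained this way differs from the exponential of the mean log odds ratio.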
15 Advantages of using WinBUGS for NMA
Code already available for many data types so no need for extra coding.
No need for data preparation before running model.
Produces sample from true posterior distribution.
CODA output can be used directly to inform economic models.
Due to WinBUGS flexibility, code can easily be extended to more complex models: including covariates (meta-regression, see TSD3); class effects models; using IPD, etc.
16 Disadvantages of using WinBUGS for NMA
Requires knowledge of MCMC methods to check convergence and detect problems.
  But will still provide output, which can be misinterpreted...
Some models may require many iterations, which can take some time to run.
Graphical capabilities very limited, so need to export results to other software.
Setting up initial values may be tricky in some models.
May have problems converging when network is sparse and/or has many zero cells.
17 Using the TSD WinBUGS Code
Need basic knowledge of Stats!!
Choose appropriate code from the website, decide which nodes to monitor, and how to interpret the output.
Input data and number of studies and treatments.
User needs to define:
  Overall baseline or reference treatment (treatment 1) for NMA;
  Treatment coding order;
  Priors, which can be tricky for the heterogeneity in the RE model;
  Initial values for MCMC simulation to start.
Before valid output can be obtained users also need to check:
  Convergence;
  Model fit;
  Consistency.
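One common way to check convergence with two or more chains started from different initial values is the potential scale reduction factor of Gelman and Rubin. The Python sketch below is a simplified version for a single monitored node, offered as an assumption about how such a check could look; it is not the exact diagnostic WinBUGS displays, and the chain values are invented.

```python
import math
import statistics


def potential_scale_reduction(chains):
    """Simplified Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Hypothetical draws: two chains exploring the same region ...
mixed = [[0.10, 0.20, 0.15, 0.05, 0.12], [0.12, 0.05, 0.20, 0.10, 0.15]]
# ... and two chains stuck in different regions.
stuck = [[0.10, 0.20, 0.15, 0.05, 0.12], [5.12, 5.05, 5.20, 5.10, 5.15]]

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
```

Values close to 1 are consistent with convergence, while values well above 1 (as for `stuck`) indicate the chains have not yet mixed. Such a number complements, rather than replaces, visual inspection of trace plots.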
18 Automated model generation
Generate model: abstract representation
  Structure: basic parameters, study baselines
  Priors
  Starting values
Abstract representation → concrete implementation
  BUGS syntax (templates based on NICE TSDs)
  JAGS syntax (templates based on NICE TSDs)
  YADAS MCMC models in Java
19 Current model generation capabilities (1/4)
Model structure depends on type:
  Consistency / node-split / inconsistency
  Random effects, homogeneous variance
General method for priors:
  Use a simple heuristic
  Define what is a "large deviation" → vague priors
General method for starting values:
  Sample from over-dispersed MLEs
  Requires that parameters are directly measured
  Additional constraint for model structure
20 Current model generation capabilities (2/4)
Consistency model generation (under review)
Consistency model generation is easy, even arbitrary
Method for generating starting values restricts structure:
  Basic parameters must be directly measured
  They are a spanning tree of the evidence graph
  Will choose a compact tree → good for convergence
21-22 Current model generation capabilities (2/4), continued
[Figure: example evidence graph, shown first with nodes A, B, C, D, E and then with treatment labels tPA, UK, ASPAC, AtPA, Ten, SK, SKtPA, Ret]
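The requirement that basic parameters form a spanning tree of directly measured comparisons can be sketched in Python as follows. This uses a breadth-first search from the reference treatment, which tends to give a shallow tree; it is an illustrative assumption and not the algorithm GeMTC uses. The study list and treatment names are hypothetical.

```python
from collections import deque
from itertools import combinations


def evidence_graph(studies):
    """Adjacency sets: two treatments are linked if some study compares them."""
    adj = {}
    for arms in studies:
        for a, b in combinations(sorted(set(arms)), 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def spanning_tree(adj, root):
    """Breadth-first spanning tree; each edge is a directly measured comparison."""
    visited = {root}
    queue = deque([root])
    edges = []
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in visited:
                visited.add(v)
                edges.append((u, v))   # candidate basic parameter d[u, v]
                queue.append(v)
    return edges


# Hypothetical network: each inner list is the set of treatments in one study.
studies = [["A", "B"], ["A", "C"], ["B", "C", "D"], ["D", "E"]]
adj = evidence_graph(studies)
edges = spanning_tree(adj, "A")
```

For a connected network with T treatments the tree has T - 1 edges, one per basic parameter, and every edge corresponds to a comparison observed in at least one study, so a starting value can be derived from data.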
23 Current model generation capabilities (3/4)
Node-split model generation (draft)
Node-splitting models require some recoding
  Generally there will be many nodes to split
  Inconvenient to do by hand
Will present at SRSM
Main problem is choosing nodes to split
  If the right nodes are chosen, model generation is again easy
24 Current model generation capabilities (4/4)
Inconsistency model generation (published, but imperfect)
Inconsistency model generation is HARD
Algorithm inefficient for multi-arm trials
My current work leaves much to be desired
I won't go into further detail
25 GeMTC: MTC model generation
Java library (open source, reusable) for model generation
Command-line interface / R package ("GeMTC CLI")
Simplistic GUI ("GeMTC GUI")
Now: (very) quick demo of GeMTC GUI
  Loading a data file
  Generating a node-split model
  Quick look at generated code
27 Beyond model generation
Model generation alone is not enough:
  GUI for network meta-analysis
  Pseudo-automated convergence checking
  Automatically generate the right summaries, tables, figures
  Data entry / management
We have this in ADDIS!
28 ADDIS goals
The goals (will take a while to get there...):
  Database of trials, really structured
  Meta-analysis, network meta-analysis, decision analysis
  Inform health care policy (regulation, guidelines, reimbursement)
  Automate systematic review (i.e. eliminate the grunt work)
  Sourcing from abstract databases, systematic reviews, registries
So, kind of business intelligence for health care policy
29 ADDIS current status
Somewhat advanced trial data model
  XML schema available
  Inspired by CDISC / BRIDG / OCRe
  Being vetted by CDISC expert now
Tools for study selection falling behind
  But receiving some attention right now!
Hardly any data sourcing (so far focussed on regulators)
Analysis tools have received most attention
  This was/is the focus of my PhD research
30 Network meta-analysis in ADDIS
Demo!
  The example dataset
  Building a network meta-analysis
  Running the models, assessing convergence
  Assessing inconsistency
  Consistency results
32 Generalized linear models
Very flexible & general
Requires a lot of knowledge from user
Some models (node-split, inconsistency) complicated
Automation could help to:
  Make analysis faster / easier
  Prevent coding mistakes
  Ensure necessary steps are taken
33 Model generation / GeMTC
Given dataset, generates model
Everything else done in WinBUGS
  Requires some knowledge from user
  Generated models can be customized
Only most common types of model available
34 Model generation wishlist
Near future:
  Relative-effect data
  Detect sparse / invalid / problematic data
  Fixed effects / Random effects, heterogeneous variance
  User-defined priors / knowledge-based prior selection
  R package based on GeMTC, rjags, coda
More distant future:
  Covariates
  Better inconsistency DAG generation
Software is open source: I welcome contributions!
35 Automation / ADDIS
ADDIS is...
  Much more ambitious
    Network meta-analysis is a means, not an end
    Database of trials → decision support
  Less flexible
  Not even near finished
Relative to WinBUGS:
  No manual coding
  One-click interface to run models
  User is explicitly asked to look at convergence
  Models to assess inconsistency directly available
  Appropriate tables & plots
36 Discussion
Something in between GeMTC and ADDIS needed?
  Or integrating GeMTC in an R package?
There will always be need for the raw WinBUGS code
  Automated interface should not get in the way
  Should give user ability to drop down to code level
And there is much work to be done!
37 Thank you! Questions?
Regression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
Model Calibration with Open Source Software: R and Friends. Dr. Heiko Frings Mathematical Risk Consulting
Model with Open Source Software: and Friends Dr. Heiko Frings Mathematical isk Consulting Bern, 01.09.2011 Agenda in a Friends Model with & Friends o o o Overview First instance: An Extreme Value Example
Normality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]
business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
Handling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
Package dsmodellingclient
Package dsmodellingclient Maintainer Author Version 4.1.0 License GPL-3 August 20, 2015 Title DataSHIELD client site functions for statistical modelling DataSHIELD
WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math
Textbook Correlation WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math Following Directions Unit FIRST QUARTER AND SECOND QUARTER Logic Unit
WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
Poisson Regression or Regression of Counts (& Rates)
Poisson Regression or Regression of (& Rates) Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Generalized Linear Models Slide 1 of 51 Outline Outline
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
IBM SPSS Data Preparation 22
IBM SPSS Data Preparation 22 Note Before using this information and the product it supports, read the information in Notices on page 33. Product Information This edition applies to version 22, release
MATLAB and Big Data: Illustrative Example
MATLAB and Big Data: Illustrative Example Rick Mansfield Cornell University August 19, 2014 Goals Use a concrete example from my research to: Demonstrate the value of vectorization Introduce key commands/functions
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Poisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
Examining a Fitted Logistic Model
STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
