Web-based Supplementary Materials for "Bayesian Effect Estimation Accounting for Adjustment Uncertainty" by Chi Wang, Giovanni Parmigiani, and Francesca Dominici (Biometrics)




In Web Appendix A, we provide detailed MCMC algorithms for BAC and TBAC. In Web Appendices B-E, we provide additional simulation results to further evaluate the performance of TBAC and BAC under various situations. A summary of these simulation results is provided in Section 4.1 of the paper.

Web Appendix A: Details of the MCMC Algorithms

BAC

The posterior samples of $(\alpha^X, \alpha^Y, \beta_{\alpha^Y})$ are obtained by iteratively sampling from $P(\alpha^X \mid \beta_{\alpha^Y}, \alpha^Y, D)$, $P(\alpha^Y \mid \beta_{\alpha^Y}, \alpha^X, D)$, and $P(\beta_{\alpha^Y} \mid \alpha^X, \alpha^Y, D)$. The three full conditionals are:

1) $P(\alpha^X \mid \beta_{\alpha^Y}, \alpha^Y, D) \propto P(X \mid \alpha^X)\,P(\alpha^X \mid \alpha^Y)$, since
$$
P(\alpha^X \mid \beta_{\alpha^Y}, \alpha^Y, D)
\overset{\text{using A3}}{=} P(\alpha^X \mid \alpha^Y, X)
= \frac{P(X \mid \alpha^X, \alpha^Y)\,P(\alpha^X \mid \alpha^Y)}{P(X \mid \alpha^Y)}
\overset{\text{using A2}}{=} \frac{P(X \mid \alpha^X)\,P(\alpha^X \mid \alpha^Y)}{P(X \mid \alpha^Y)},
$$
where, based on Raftery et al. (1997),
$$
P(X \mid \alpha^X) = \frac{\Gamma(\frac{\nu+n}{2})\,(\nu\lambda)^{\nu/2}}{\pi^{n/2}\,\Gamma(\frac{\nu}{2})}\,
\bigl| I_n + \phi^2 W_{\alpha^X} \Sigma_{0\alpha^X} W_{\alpha^X}' \bigr|^{-1/2}
\bigl\{ \lambda\nu + (X - W_{\alpha^X}\mu_{0\alpha^X})' (I_n + \phi^2 W_{\alpha^X}\Sigma_{0\alpha^X}W_{\alpha^X}')^{-1} (X - W_{\alpha^X}\mu_{0\alpha^X}) \bigr\}^{-\frac{\nu+n}{2}},
$$
$W_{\alpha^X}$ is the design matrix of the exposure regression, $I_n$ is the $n \times n$ identity matrix, and $n$ is the sample size.

2) $P(\alpha^Y \mid \beta_{\alpha^Y}, \alpha^X, D) \propto P(\tilde{Y} \mid \alpha^Y)\,P(\alpha^Y \mid \alpha^X)$, where $\tilde{Y} = Y - \beta_{\alpha^Y} X$, since
$$
P(\alpha^Y \mid \beta_{\alpha^Y}, \alpha^X, D)
\overset{\text{using A1}}{=} P(\alpha^Y \mid \alpha^X, \tilde{Y})
= \frac{P(\tilde{Y} \mid \alpha^X, \alpha^Y)\,P(\alpha^Y \mid \alpha^X)}{P(\tilde{Y} \mid \alpha^X)}
\overset{\text{using A4}}{=} \frac{P(\tilde{Y} \mid \alpha^Y)\,P(\alpha^Y \mid \alpha^X)}{P(\tilde{Y} \mid \alpha^X)}.
$$
Let $W_{\alpha^Y}$ be the design matrix of the outcome regression, and suppose the observations of $X$ are placed in the first column of $W_{\alpha^Y}$.
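Both full conditionals above require evaluating a marginal likelihood of the Raftery et al. (1997) form. As a minimal numpy sketch (the function name and interface are ours, not from the paper), the expression can be computed on the log scale for numerical stability:

```python
import numpy as np
from scipy.special import gammaln


def log_marginal_likelihood(X, W, mu0, Sigma0, nu, lam, phi):
    """Log of the closed-form marginal likelihood P(X | alpha^X):
    a multivariate-t marginal with location W mu0, scale matrix
    lam * (I_n + phi^2 W Sigma0 W'), and nu degrees of freedom."""
    n = len(X)
    V = np.eye(n) + phi**2 * W @ Sigma0 @ W.T
    resid = X - W @ mu0
    _, logdet = np.linalg.slogdet(V)          # V is positive definite
    quad = resid @ np.linalg.solve(V, resid)  # resid' V^{-1} resid
    return (gammaln((nu + n) / 2) - gammaln(nu / 2)
            + (nu / 2) * np.log(nu * lam)
            - (n / 2) * np.log(np.pi)
            - 0.5 * logdet
            - ((nu + n) / 2) * np.log(lam * nu + quad))
```

A convenient check on this sketch is that the expression is exactly the log density of a multivariate t distribution with location $W\mu_0$, shape matrix $\lambda(I_n + \phi^2 W\Sigma_0 W')$, and $\nu$ degrees of freedom.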

Let $\tilde{W}_{\alpha^Y}$ be the second through $(M+1)$th columns of $W_{\alpha^Y}$. Based on Raftery et al. (1997),
$$
P(\tilde{Y} \mid \alpha^Y) = \frac{\Gamma(\frac{\nu+n}{2})\,(\nu\lambda)^{\nu/2}}{\pi^{n/2}\,\Gamma(\frac{\nu}{2})}\,
\bigl| I_n + \phi^2 \tilde{W}_{\alpha^Y} \tilde{\Sigma}_{0\alpha^Y} \tilde{W}_{\alpha^Y}' \bigr|^{-1/2}
\bigl\{ \lambda\nu + (\tilde{Y} - \tilde{W}_{\alpha^Y}\tilde{\mu}_{0\alpha^Y})' (I_n + \phi^2 \tilde{W}_{\alpha^Y}\tilde{\Sigma}_{0\alpha^Y}\tilde{W}_{\alpha^Y}')^{-1} (\tilde{Y} - \tilde{W}_{\alpha^Y}\tilde{\mu}_{0\alpha^Y}) \bigr\}^{-\frac{\nu+n}{2}},
$$
where $\tilde{\mu}_{0\alpha^Y}$ is the second through $(M+1)$th elements of $\mu_{0\alpha^Y}$ and $\tilde{\Sigma}_{0\alpha^Y}$ is the second through $(M+1)$th rows and columns of $\Sigma_{0\alpha^Y}$.

3) $P(\beta_{\alpha^Y} \mid \alpha^X, \alpha^Y, D) \overset{\text{using A3}}{=} P(\beta_{\alpha^Y} \mid \alpha^Y, D)$. Based on Bernardo and Smith (2000), we obtain
$$
\beta_{\alpha^Y} \mid \alpha^Y, D \sim t_{n+\nu}(\beta_{n\alpha^Y}, \sigma^2_{n\alpha^Y}),
$$
where $\beta_{n\alpha^Y}$ is the first element of $\theta_{n\alpha^Y}$, $\sigma^2_{n\alpha^Y}$ is the $(1,1)$ element of $S_{n\alpha^Y}$, and
$$
\theta_{n\alpha^Y} = (W_{\alpha^Y}' W_{\alpha^Y} + \Sigma_{0\alpha^Y}^{-1}/\phi^2)^{-1} (\Sigma_{0\alpha^Y}^{-1}\mu_{0\alpha^Y}/\phi^2 + W_{\alpha^Y}' Y),
$$
$$
S_{n\alpha^Y} = (n+\nu)^{-1} \bigl\{ \nu\lambda + (Y - W_{\alpha^Y}\theta_{n\alpha^Y})' Y + (\mu_{0\alpha^Y} - \theta_{n\alpha^Y})' \Sigma_{0\alpha^Y}^{-1}\mu_{0\alpha^Y}/\phi^2 \bigr\} (W_{\alpha^Y}' W_{\alpha^Y} + \Sigma_{0\alpha^Y}^{-1}/\phi^2)^{-1}.
$$

TBAC

We implement two separate MCMC algorithms to draw from $P(\alpha^X \mid X)$ and from $P(\beta_{\alpha^Y}, \alpha^Y \mid D)$. First, we use the MC$^3$ method (Madigan and York, 1995) to sample from $P(\alpha^X \mid X)$ and count the appearance frequency of each $\alpha^X$. Second, we draw a posterior sample of $(\alpha^Y, \beta_{\alpha^Y})$ to approximate $P(\alpha^Y, \beta_{\alpha^Y} \mid D)$. Using $P(\alpha^Y \mid X)$ in equation (7) as the prior of $\alpha^Y$, we have
$$
P(\alpha^Y, \beta_{\alpha^Y} \mid D)
= \frac{P(Y \mid \alpha^Y, \beta_{\alpha^Y})\,P(\beta_{\alpha^Y} \mid \alpha^Y)\,P(\alpha^Y)}{P(Y)}
= \sum_{\alpha^X} \frac{P(Y \mid \alpha^Y, \beta_{\alpha^Y})\,P(\beta_{\alpha^Y} \mid \alpha^Y)\,P(\alpha^Y \mid \alpha^X)\,P(\alpha^X \mid X)}{P(Y)}
= \sum_{\alpha^X} P_{\alpha^X}(\alpha^Y, \beta_{\alpha^Y} \mid Y)\,P(\alpha^X \mid X),
$$
where $P_{\alpha^X}(\alpha^Y, \beta_{\alpha^Y} \mid Y) = P(Y \mid \alpha^Y, \beta_{\alpha^Y})\,P(\beta_{\alpha^Y} \mid \alpha^Y)\,P(\alpha^Y \mid \alpha^X)/P(Y)$ is the joint posterior of $(\alpha^Y, \beta_{\alpha^Y})$ with the prior on $\alpha^Y$ specified as $P(\alpha^Y \mid \alpha^X)$.
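The conjugate $t$-distribution update in step 3 is a standard normal-inverse-gamma computation. A minimal numpy sketch, following the $\theta_{n\alpha^Y}$ and $S_{n\alpha^Y}$ expressions above (function and variable names are our choices, and we assume, as in the text, that $X$ occupies the first column of the design matrix):

```python
import numpy as np


def beta_posterior_t(Y, W, mu0, Sigma0, nu, lam, phi):
    """Return (beta_n, sigma2_n, df) so that beta | alpha^Y, D follows the
    location-scale t distribution t_df(beta_n, sigma2_n); the exposure X
    is assumed to be the first column of the design matrix W."""
    n = len(Y)
    Sig0inv = np.linalg.inv(Sigma0)
    A = W.T @ W + Sig0inv / phi**2
    # theta_n = A^{-1} (Sigma0^{-1} mu0 / phi^2 + W' Y)
    theta_n = np.linalg.solve(A, Sig0inv @ mu0 / phi**2 + W.T @ Y)
    # scalar factor of S_n
    s = (nu * lam
         + (Y - W @ theta_n) @ Y
         + (mu0 - theta_n) @ Sig0inv @ mu0 / phi**2) / (n + nu)
    S_n = s * np.linalg.inv(A)
    return theta_n[0], S_n[0, 0], n + nu


def draw_beta(rng, beta_n, sigma2_n, df):
    # one draw from the location-scale t distribution t_df(beta_n, sigma2_n)
    return beta_n + np.sqrt(sigma2_n) * rng.standard_t(df)
```

With an informative sample, the posterior location of the first coefficient should be close to the least-squares estimate, which gives a quick sanity check on the sketch.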

For each given $\alpha^X$, we draw a sample of $(\alpha^Y, \beta_{\alpha^Y})$ from $P_{\alpha^X}(\alpha^Y, \beta_{\alpha^Y} \mid Y)$ by iteratively sampling from $P_{\alpha^X}(\alpha^Y \mid \beta_{\alpha^Y}, Y)$ and $P_{\alpha^X}(\beta_{\alpha^Y} \mid \alpha^Y, Y)$. These two full conditionals can be derived as follows:

1) $P_{\alpha^X}(\alpha^Y \mid \beta_{\alpha^Y}, Y) \overset{\text{using A1}}{=} P_{\alpha^X}(\alpha^Y \mid \tilde{Y}) \propto P(\tilde{Y} \mid \alpha^Y)\,P(\alpha^Y \mid \alpha^X)$.

2) $P_{\alpha^X}(\beta_{\alpha^Y} \mid \alpha^Y, Y)$ follows the t-distribution $t_{n+\nu}(\beta_{n\alpha^Y}, \sigma^2_{n\alpha^Y})$.

We take the sample size for each $\alpha^X$ equal to its frequency in the Markov chain from $P(\alpha^X \mid X)$ and combine the samples from the different $\alpha^X$'s to obtain a sample of $(\alpha^Y, \beta_{\alpha^Y})$ from their joint posterior $P(\alpha^Y, \beta_{\alpha^Y} \mid D)$.

Web Appendix B: Simulation Results in the Presence of Predictors Only Correlated with X

Consider the true model: $Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \epsilon^Y_i$, where $i = 1, \ldots, 1000$ and the $\epsilon^Y_i$ are independent $N(0, 1)$. $(X_i, U_{1i}, U_{2i}, U_{3i}, U_{4i})$ are independent normal vectors with mean zero and covariance matrix $\Sigma = (\sigma_{kl})_{5 \times 5}$, where $\sigma_{kk} = 1$ for $k = 1, \ldots, 5$, $\sigma_{12} = \sigma_{21} = \sigma_{14} = \sigma_{41} = \rho$, $\sigma_{15} = \sigma_{51} = \rho/2$, and all other $\sigma_{kl}$ equal zero. Under this scenario, $U_3$ and $U_4$ are two predictors that are correlated with $X$ but not with $Y$ (given $X$). The set of potential confounders $U$ includes $U_1, \ldots, U_4$ as well as 49 additional independent $N(0, 1)$ random variables. In our simulation, $\rho$ is set to 0.6 and $\beta = \delta^Y_1 = \delta^Y_2 = 0.1$. Five hundred independent data sets were generated. We applied odds priors for both BAC and TBAC, and chose the value of the dependence parameter $\omega$ to be 2, 4, 10, or $\infty$. The results are summarized in Web Table 1.

[Web Table 1 about here.]

For both BAC and TBAC, the standard errors of the estimates increase as $\omega$ increases. Since the data include two predictors correlated with $X$ but not with $Y$ (given $X$), increasing $\omega$ assigns higher probability to including them in the outcome model, which yields larger standard errors. On the other hand, the bias decreases as $\omega$ increases: since the data include $U_1$, a confounder strongly correlated with $X$ but only weakly correlated with $Y$, increasing $\omega$ assigns higher probability to including this predictor in the outcome model, which yields less bias.
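For concreteness, one data set under the Web Appendix B design described above could be generated as follows. This is a sketch under our own naming; the ordering $(X, U_1, U_2, U_3, U_4)$ and the covariance entries follow the text:

```python
import numpy as np


def simulate_scenario_b(n=1000, rho=0.6, beta=0.1, delta=0.1, n_noise=49, seed=0):
    """One Web Appendix B data set: (X, U1, ..., U4) jointly normal with the
    structured covariance Sigma, plus 49 extra independent N(0, 1) predictors."""
    rng = np.random.default_rng(seed)
    Sigma = np.eye(5)
    # ordering (X, U1, U2, U3, U4): corr(X, U1) = corr(X, U3) = rho,
    # corr(X, U4) = rho / 2, all other off-diagonal entries zero
    Sigma[0, 1] = Sigma[1, 0] = rho
    Sigma[0, 3] = Sigma[3, 0] = rho
    Sigma[0, 4] = Sigma[4, 0] = rho / 2
    Z = rng.multivariate_normal(np.zeros(5), Sigma, size=n)
    X, U = Z[:, 0], Z[:, 1:]
    # outcome model: Y = beta X + delta U1 + delta U2 + eps, eps ~ N(0, 1)
    Y = beta * X + delta * U[:, 0] + delta * U[:, 1] + rng.normal(size=n)
    noise = rng.normal(size=(n, n_noise))
    return Y, X, np.hstack([U, noise])
```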

In terms of MSE, which balances bias and variation, taking $\omega$ less than infinity yields smaller MSEs. In terms of the coverage probability of the 95% CIs, however, a small $\omega$, such as $\omega = 2$, provides lower coverage than desired. Interestingly, $\omega = 10$ has the same coverage probability as $\omega = \infty$ but a much smaller MSE, and is therefore a better choice under this simulation scenario.

Web Appendix C: Simulation Results when the Exposure Model is Misspecified

Consider the same true outcome model as in the first simulation scenario in the paper: $Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \epsilon^Y_i$, where $i = 1, \ldots, 1000$ and the $\epsilon^Y_i$ are independent $N(0, 1)$. The set of potential confounders $U$ includes $U_1$ and $U_2$ as well as 49 additional random variables; all of these potential confounders are independent $N(0, 1)$. In this scenario, the exposure $X$ is a non-linear function of $U_1$: $X_i = \delta^X_1 U_{1i}^3 + \epsilon^X_i$, where the $\epsilon^X_i$ are independent $N(0, 0.5)$. In our simulation, we set $\beta = \delta^Y_1 = \delta^Y_2 = 0.1$ and $\delta^X_1 = 0.7$. We generated 500 data sets and applied BAC and TBAC with $\omega = \infty$. The results are summarized in Web Table 2.

[Web Table 2 about here.]

Estimates from BAC and TBAC are very similar to each other, and both are close to the results obtained from the true model. In this simulation scenario, both methods are robust to misspecification of the exposure model.

Web Appendix D: Simulation Results for Comparing TBAC vs. BAC

Consider the true model: $Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \delta^Y_3 U_{3i} + \delta^Y_4 U_{4i} + \epsilon^Y_i$, where $i = 1, \ldots, 100$ and the $\epsilon^Y_i$ are independent $N(0, 1)$.

The set of potential confounders $U$ consists of $U_{1i}, \ldots, U_{4i}$, which are independent $N(0, 1)$ random variables. $X$ is modeled by $X_i = \delta^X_1 U_{1i} + \delta^X_2 U_{2i} + \delta^X_3 U_{3i} + \delta^X_4 U_{4i} + \epsilon^X_i$, where the $\epsilon^X_i$ are independent $N(0, 0.5^2)$. In our simulation, we set $\beta = \delta^X_1 = \delta^X_2 = \delta^Y_1 = \delta^Y_3 = 0.1$, $\delta^X_3 = \delta^X_4 = 0.6$, and $\delta^Y_2 = \delta^Y_4 = 2$. Five hundred independent data sets were generated, and BAC and TBAC with $\omega = \infty$ were applied.

We first compared the marginal posterior distributions of $\alpha^X$ (Web Table 3). For both BAC and TBAC, all of the posterior weight is assigned to models containing $U_3$ and $U_4$, since these two predictors are strongly correlated with $X$. TBAC assigns equal weights to $\alpha^X = (0, 1, 1, 1)$ and $\alpha^X = (1, 0, 1, 1)$, since $U_1$ and $U_2$ have the same correlation coefficient with $X$. In contrast, BAC assigns much higher weight to $\alpha^X = (0, 1, 1, 1)$ than to $\alpha^X = (1, 0, 1, 1)$. This is the result of the feedback effect, since $U_2$ is strongly correlated with $Y$ while $U_1$ is only weakly correlated with $Y$.

[Web Table 3 about here.]

We next compared the marginal posterior distributions of $\alpha^Y$ (Web Table 4). For both BAC and TBAC, the posterior weights are concentrated on models containing $U_2$, $U_3$, and $U_4$, since these three predictors are highly correlated with $X$, $Y$, or both. Large weights are assigned to $\alpha^Y = (0, 1, 1, 1)$, the model not containing $U_1$, since $U_1$ is only weakly correlated with both $X$ and $Y$. Compared to TBAC, BAC assigns more weight to $\alpha^Y = (0, 1, 1, 1)$. By accounting for the feedback effect and jointly modeling the exposure and outcome, BAC tends to assign higher weights to more parsimonious models.

[Web Table 4 about here.]

Finally, we compared the estimates of the exposure effect, $\beta$. As shown in Web Table 5, the estimates from the two methods are very similar.

[Web Table 5 about here.]
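The Web Appendix D data-generating mechanism above can be sketched as follows (a minimal numpy sketch with names of our choosing):

```python
import numpy as np


def simulate_scenario_d(n=100, beta=0.1, seed=0):
    """One Web Appendix D data set: four independent N(0, 1) potential
    confounders, with U3 and U4 strongly predictive of X (delta^X = 0.6)
    and U2 and U4 strongly predictive of Y (delta^Y = 2)."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(n, 4))                 # U1, ..., U4
    dX = np.array([0.1, 0.1, 0.6, 0.6])         # delta^X_1, ..., delta^X_4
    dY = np.array([0.1, 2.0, 0.1, 2.0])         # delta^Y_1, ..., delta^Y_4
    X = U @ dX + rng.normal(scale=0.5, size=n)  # eps^X ~ N(0, 0.5^2)
    Y = beta * X + U @ dY + rng.normal(size=n)  # eps^Y ~ N(0, 1)
    return Y, X, U
```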

Web Appendix E: Simulation Results for Comparing the MSEs from BAC and TBAC versus that from BMA

We used the same models as in the two simulation scenarios in the paper, but considered a smaller sample size of 100. For each simulation scenario, we generated 500 replications. The dependence parameter $\omega$ in both BAC and TBAC was set to $\infty$. The estimation results are summarized in Web Table 6. When the sample size is 100, the MSE from BMA is lower than those from BAC and TBAC in simulation scenario one but higher in scenario two. When the sample size increases to 1000, as shown in the paper, the MSEs of BAC and TBAC are lower in both scenarios.

[Web Table 6 about here.]

References

Bernardo, J. M. and Smith, A. F. M. (2000). Bayesian Theory. John Wiley & Sons, England.

Madigan, D. and York, J. (1995). Bayesian graphical models for discrete data. International Statistical Review 63, 215-232.

Raftery, A. E., Madigan, D., and Hoeting, J. A. (1997). Bayesian model averaging for linear regression models. Journal of the American Statistical Association 92, 179-191.

Web Table 1
Comparison of estimates of β from BAC and TBAC using odds priors. BIAS is the difference between the mean of the estimates of β and the true value, SEE is the mean of the standard error estimates, SSE is the standard error of the estimates of β, MSE is the mean squared error, and CP is the coverage probability of the 95% confidence interval or credible interval.

Method           BIAS    SEE    SSE    MSE    CP
True model       0.002   0.040  0.040  0.002  0.94
BAC,  ω = ∞      0.001   0.072  0.074  0.006  0.93
BAC,  ω = 10     0.019   0.055  0.054  0.003  0.93
BAC,  ω = 4      0.025   0.050  0.051  0.003  0.90
BAC,  ω = 2      0.029   0.047  0.050  0.003  0.88
TBAC, ω = ∞      0.001   0.073  0.074  0.006  0.94
TBAC, ω = 10     0.018   0.055  0.054  0.003  0.94
TBAC, ω = 4      0.025   0.051  0.051  0.003  0.91
TBAC, ω = 2      0.029   0.049  0.049  0.003  0.89
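The summary measures reported in Web Tables 1-6 (BIAS, SEE, SSE, MSE, CP, as defined in the Web Table 1 caption) can be computed from simulation output as follows; this is a minimal numpy sketch with names of our choosing:

```python
import numpy as np


def summarize(est, se, truth, z=1.96):
    """BIAS, SEE, SSE, MSE, CP across simulated data sets: est and se are
    arrays of point estimates and standard error estimates of beta."""
    est, se = np.asarray(est), np.asarray(se)
    bias = est.mean() - truth                # mean estimate minus true value
    see = se.mean()                          # mean of standard error estimates
    sse = est.std(ddof=1)                    # standard error of the estimates
    mse = np.mean((est - truth) ** 2)        # mean squared error
    # coverage of the interval est +/- z * se
    cp = np.mean((est - z * se <= truth) & (truth <= est + z * se))
    return {"BIAS": bias, "SEE": see, "SSE": sse, "MSE": mse, "CP": cp}
```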

Web Table 2
Comparison of estimates of β from BAC and TBAC when the exposure model is misspecified. The dependence parameter ω in both BAC and TBAC is set to ∞.

Method       BIAS    SEE    SSE    MSE     CP
True model   0.000   0.017  0.018  0.0003  0.95
BAC          0.001   0.017  0.018  0.0003  0.95
TBAC         0.001   0.017  0.018  0.0003  0.95

Web Table 3
Comparison of marginal posterior distributions of α^X. The dependence parameter ω in both BAC and TBAC is set to ∞.

Model       P(α^X | D) from BAC   P(α^X | D) from TBAC
(0,0,1,1)   0.62                  0.43
(0,1,1,1)   0.27                  0.23
(1,0,1,1)   0.07                  0.23
(1,1,1,1)   0.03                  0.11

Web Table 4
Comparison of marginal posterior distributions of α^Y. The dependence parameter ω in both BAC and TBAC is set to ∞.

Model       P(α^Y | D) from BAC   P(α^Y | D) from TBAC
(0,1,1,1)   0.80                  0.61
(1,1,1,1)   0.20                  0.39

Web Table 5
Comparison of estimates of β from BAC and TBAC. The dependence parameter ω in both BAC and TBAC is set to ∞.

Model                      BIAS    SEE    SSE    MSE    CP
MLE from model (0,1,1,1)   0.040   0.203  0.200  0.042  0.94
MLE from model (1,1,1,1)   0.001   0.207  0.204  0.042  0.94
BAC                        0.032   0.206  0.202  0.042  0.95
TBAC                       0.025   0.207  0.201  0.041  0.95

Web Table 6
Comparison of MSEs from BAC, TBAC, and BMA. The data were generated from the same models as in the two simulation scenarios in the paper, but with sample size 100. The dependence parameter ω in both BAC and TBAC is set to ∞.

Scenario   Method   BIAS    SEE    SSE    MSE    CP
One        BAC      0.006   0.142  0.152  0.023  0.93
           TBAC     0.005   0.149  0.155  0.024  0.94
           BMA      0.043   0.116  0.124  0.017  0.93
Two        BAC      0.059   0.162  0.170  0.032  0.92
           TBAC     0.041   0.175  0.178  0.033  0.94
           BMA      0.153   0.129  0.127  0.040  0.76