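Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici

In Web Appendix A, we provide detailed MCMC algorithms for BAC and TBAC. In Web Appendices B-E, we provide additional simulation results to further evaluate the performance of TBAC and BAC under various situations. A summary of these simulation results is provided in Section 4.1 of the paper.

Web Appendix A: Details of the MCMC Algorithms

BAC

The posterior samples of $(\alpha^X, \alpha^Y, \beta^{\alpha^Y})$ are obtained by iteratively sampling from $P(\alpha^X \mid \beta^{\alpha^Y}, \alpha^Y, D)$, $P(\alpha^Y \mid \beta^{\alpha^Y}, \alpha^X, D)$ and $P(\beta^{\alpha^Y} \mid \alpha^X, \alpha^Y, D)$. The three full conditionals are as follows.

1)
$$P(\alpha^X \mid \beta^{\alpha^Y}, \alpha^Y, D) \overset{\text{using A3}}{=} P(\alpha^X \mid \alpha^Y, X) = \frac{P(X \mid \alpha^X, \alpha^Y)\, P(\alpha^X \mid \alpha^Y)}{P(X \mid \alpha^Y)} \overset{\text{using A2}}{=} \frac{P(X \mid \alpha^X)\, P(\alpha^X \mid \alpha^Y)}{P(X \mid \alpha^Y)} \propto P(X \mid \alpha^X)\, P(\alpha^X \mid \alpha^Y),$$
where, based on Raftery et al. (1997),
$$P(X \mid \alpha^X) = \frac{\Gamma\!\left(\frac{\nu+n}{2}\right)(\nu\lambda)^{\nu/2}}{\pi^{n/2}\,\Gamma\!\left(\frac{\nu}{2}\right)\left|I_n + \phi^2 W_{\alpha^X}\Sigma_{0\alpha^X}W_{\alpha^X}^{\top}\right|^{1/2}} \left\{\lambda\nu + (X - W_{\alpha^X}\mu_{0\alpha^X})^{\top}\left(I_n + \phi^2 W_{\alpha^X}\Sigma_{0\alpha^X}W_{\alpha^X}^{\top}\right)^{-1}(X - W_{\alpha^X}\mu_{0\alpha^X})\right\}^{-\frac{\nu+n}{2}},$$
$W_{\alpha^X}$ is the design matrix of the exposure regression, $I_n$ is the $n \times n$ identity matrix, and $n$ is the sample size.

2)
$$P(\alpha^Y \mid \beta^{\alpha^Y}, \alpha^X, D) \overset{\text{using A1}}{=} P(\alpha^Y \mid \alpha^X, \tilde{Y}) = \frac{P(\tilde{Y} \mid \alpha^X, \alpha^Y)\, P(\alpha^Y \mid \alpha^X)}{P(\tilde{Y} \mid \alpha^X)} \overset{\text{using A4}}{=} \frac{P(\tilde{Y} \mid \alpha^Y)\, P(\alpha^Y \mid \alpha^X)}{P(\tilde{Y} \mid \alpha^X)} \propto P(\tilde{Y} \mid \alpha^Y)\, P(\alpha^Y \mid \alpha^X),$$
where $\tilde{Y} = Y - \beta^{\alpha^Y} X$. Let $W_{\alpha^Y}$ be the design matrix of the outcome regression and suppose the observations of $X$ are placed in the first column of $W_{\alpha^Y}$. Let $\tilde{W}_{\alpha^Y}$ be the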
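second to the $(M+1)$th columns of $W_{\alpha^Y}$. Based on Raftery et al. (1997),
$$P(\tilde{Y} \mid \alpha^Y) = \frac{\Gamma\!\left(\frac{\nu+n}{2}\right)(\nu\lambda)^{\nu/2}}{\pi^{n/2}\,\Gamma\!\left(\frac{\nu}{2}\right)\left|I_n + \phi^2 \tilde{W}_{\alpha^Y}\tilde{\Sigma}_{0\alpha^Y}\tilde{W}_{\alpha^Y}^{\top}\right|^{1/2}} \left\{\lambda\nu + (\tilde{Y} - \tilde{W}_{\alpha^Y}\tilde{\mu}_{0\alpha^Y})^{\top}\left(I_n + \phi^2 \tilde{W}_{\alpha^Y}\tilde{\Sigma}_{0\alpha^Y}\tilde{W}_{\alpha^Y}^{\top}\right)^{-1}(\tilde{Y} - \tilde{W}_{\alpha^Y}\tilde{\mu}_{0\alpha^Y})\right\}^{-\frac{\nu+n}{2}},$$
where $\tilde{\mu}_{0\alpha^Y}$ is the second to the $(M+1)$th elements of $\mu_{0\alpha^Y}$ and $\tilde{\Sigma}_{0\alpha^Y}$ is the second to the $(M+1)$th rows and the second to the $(M+1)$th columns of $\Sigma_{0\alpha^Y}$.

3)
$$P(\beta^{\alpha^Y} \mid \alpha^X, \alpha^Y, D) \overset{\text{using A3}}{=} P(\beta^{\alpha^Y} \mid \alpha^Y, D).$$
Based on Bernardo and Smith (2000), we obtain
$$\beta^{\alpha^Y} \mid \alpha^Y, D \sim t_{n+\nu}\!\left(\beta_{n\alpha^Y}, \sigma^2_{n\alpha^Y}\right),$$
where $\beta_{n\alpha^Y}$ is the first element of $\theta_{n\alpha^Y}$, $\sigma^2_{n\alpha^Y}$ is the (1,1) element of $S_{n\alpha^Y}$, and
$$\theta_{n\alpha^Y} = \left(W_{\alpha^Y}^{\top}W_{\alpha^Y} + \Sigma_{0\alpha^Y}^{-1}/\phi^2\right)^{-1}\left(\Sigma_{0\alpha^Y}^{-1}\mu_{0\alpha^Y}/\phi^2 + W_{\alpha^Y}^{\top}Y\right),$$
$$S_{n\alpha^Y} = (n+\nu)^{-1}\left\{\nu\lambda + (Y - W_{\alpha^Y}\theta_{n\alpha^Y})^{\top}Y + (\mu_{0\alpha^Y} - \theta_{n\alpha^Y})^{\top}\Sigma_{0\alpha^Y}^{-1}\mu_{0\alpha^Y}/\phi^2\right\}\left(W_{\alpha^Y}^{\top}W_{\alpha^Y} + \Sigma_{0\alpha^Y}^{-1}/\phi^2\right)^{-1}.$$

TBAC

We implement two separate MCMC algorithms to draw from $P(\alpha^X \mid X)$ and from $P(\beta^{\alpha^Y}, \alpha^Y \mid D)$. First, we use the MC$^3$ method (Madigan and York, 1995) to sample from $P(\alpha^X \mid X)$ and count the appearance frequency of each $\alpha^X$. Second, we draw a posterior sample of $(\alpha^Y, \beta^{\alpha^Y})$ to approximate $P(\alpha^Y, \beta \mid D)$. Using $P(\alpha^Y \mid X)$ in equation (7) as the prior of $\alpha^Y$, we have
$$P(\alpha^Y, \beta^{\alpha^Y} \mid D) = \frac{P(Y \mid \alpha^Y, \beta^{\alpha^Y})\, P(\beta^{\alpha^Y} \mid \alpha^Y)\, P(\alpha^Y)}{P(Y)} = \frac{P(Y \mid \alpha^Y, \beta^{\alpha^Y})\, P(\beta^{\alpha^Y} \mid \alpha^Y) \sum_{\alpha^X} P(\alpha^Y \mid \alpha^X)\, P(\alpha^X \mid X)}{P(Y)} = \sum_{\alpha^X} P_{\alpha^X}(\alpha^Y, \beta^{\alpha^Y} \mid Y)\, P(\alpha^X \mid X),$$
where $P_{\alpha^X}(\alpha^Y, \beta^{\alpha^Y} \mid Y) = P(Y \mid \alpha^Y, \beta^{\alpha^Y})\, P(\beta^{\alpha^Y} \mid \alpha^Y)\, P(\alpha^Y \mid \alpha^X)/P(Y)$ is the joint posterior of $(\alpha^Y, \beta^{\alpha^Y})$ with the prior on $\alpha^Y$ specified as $P(\alpha^Y \mid \alpha^X)$. For each given $\alpha^X$, we draw a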
sample of $(\alpha^Y, \beta^{\alpha^Y})$ from $P_{\alpha^X}(\alpha^Y, \beta^{\alpha^Y} \mid Y)$ by iteratively sampling from $P_{\alpha^X}(\alpha^Y \mid \beta^{\alpha^Y}, Y)$ and $P_{\alpha^X}(\beta^{\alpha^Y} \mid \alpha^Y, Y)$. These two full conditionals can be derived as follows:

1) $P_{\alpha^X}(\alpha^Y \mid \beta^{\alpha^Y}, Y) \overset{\text{using A1}}{=} P_{\alpha^X}(\alpha^Y \mid \tilde{Y}) \propto P(\tilde{Y} \mid \alpha^Y)\, P(\alpha^Y \mid \alpha^X)$.

2) $P_{\alpha^X}(\beta^{\alpha^Y} \mid \alpha^Y, Y)$ follows the t-distribution $t_{n+\nu}(\beta_{n\alpha^Y}, \sigma^2_{n\alpha^Y})$.

For each $\alpha^X$, we take the sample size equal to the frequency of that $\alpha^X$ in the Markov chain from $P(\alpha^X \mid X)$, and we combine the samples from different $\alpha^X$ to obtain the sample of $(\alpha^Y, \beta^{\alpha^Y})$ from their joint posterior $P(\alpha^Y, \beta^{\alpha^Y} \mid D)$.

Web Appendix B: Simulation Results in the Presence of Predictors Only Correlated with X

Consider the true model
$$Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \epsilon^Y_i,$$
where $i = 1, \ldots, 1000$, and $\epsilon^Y_i$ are independent N(0, 1). $(X_i, U_{1i}, U_{2i}, U_{3i}, U_{4i})$ are independent normal vectors with mean zero and covariance matrix $\Sigma = (\sigma_{kl})_{5 \times 5}$, where $\sigma_{kk} = 1$, $k = 1, \ldots, 5$, $\sigma_{12} = \sigma_{21} = \sigma_{14} = \sigma_{41} = \rho$, $\sigma_{15} = \sigma_{51} = \rho/2$, and all other $\sigma_{kl}$ equal to zero. Under this scenario, $U_3$ and $U_4$ are two predictors that are correlated with X but not with Y (given X). The set of potential confounders U includes $U_1, \ldots, U_4$ as well as 49 additional independent N(0, 1) random variables. In our simulation, $\rho$ is set to 0.6 and $\beta = \delta^Y_1 = \delta^Y_2 = 0.1$. Five hundred independent data sets were generated. We applied odds priors for both BAC and TBAC, and chose the value of the dependence parameter $\omega$ to be 2, 4, 10 or $\infty$. The results are summarized in Web Table 1.

[Web Table 1 about here.]

For both BAC and TBAC, the standard errors of the estimates increase as $\omega$ increases. Since the data include two predictors correlated with X but not with Y (given X), increasing $\omega$ assigns higher probabilities to including them in the outcome model, which yields larger standard errors. On the other hand, the bias decreases as $\omega$ increases. Since
the data include $U_1$, a confounder strongly correlated with X but weakly correlated with Y, increasing $\omega$ assigns higher probability to including this predictor in the outcome model, which yields less bias. In terms of MSE, which balances bias and variance, finite values of $\omega$ yield smaller MSEs than $\omega = \infty$. In terms of the coverage probability of 95% CIs, however, a small $\omega$, such as $\omega = 2$, provides lower coverage than desired. Interestingly, $\omega = 10$ has the same coverage probability as $\omega = \infty$ but a much smaller MSE, and therefore is a better choice under this simulation scenario.

Web Appendix C: Simulation Results when the Exposure Model is Misspecified

Consider the same true outcome model as in the first simulation scenario in the paper:
$$Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \epsilon^Y_i,$$
where $i = 1, \ldots, 1000$, and $\epsilon^Y_i$ are independent N(0, 1). The set of potential confounders U includes $U_1$, $U_2$ as well as 49 additional random variables. All these potential confounders are independent N(0, 1). In this scenario, the exposure X is a non-linear function of $U_1$:
$$X_i = \delta^X_1 U_{1i}^3 + \epsilon^X_i,$$
where $\epsilon^X_i$ are independent N(0, 0.5). In our simulation, we set $\beta = \delta^Y_1 = \delta^Y_2 = 0.1$, and $\delta^X_1 = 0.7$. We generated 500 data sets and applied BAC and TBAC with $\omega = \infty$. The results are summarized in Web Table 2.

[Web Table 2 about here.]

Estimates from BAC and TBAC are very similar to each other, and both are close to the results obtained from the true model. In this simulation scenario, both methods are robust to the misspecification of the exposure model.

Web Appendix D: Simulation Results for Comparing TBAC vs. BAC

Consider the true model
$$Y_i = \beta X_i + \delta^Y_1 U_{1i} + \delta^Y_2 U_{2i} + \delta^Y_3 U_{3i} + \delta^Y_4 U_{4i} + \epsilon^Y_i,$$
where $i = 1, \ldots, 100$, and $\epsilon^Y_i$ are independent N(0, 1). The set of potential confounders U consists
of $U_{1i}, \ldots, U_{4i}$, which are independent N(0, 1) random variables. X is modeled by
$$X_i = \delta^X_1 U_{1i} + \delta^X_2 U_{2i} + \delta^X_3 U_{3i} + \delta^X_4 U_{4i} + \epsilon^X_i,$$
where $\epsilon^X_i$ are independent N(0, ). In our simulation, we set $\beta = \delta^X_1 = \delta^X_2 = \delta^Y_1 = \delta^Y_3 = 0.1$, $\delta^X_3 = \delta^X_4 = 0.6$ and $\delta^Y_2 = \delta^Y_4 = 2$. Five hundred independent data sets were generated, and BAC and TBAC with $\omega = \infty$ were applied.

We first compared the marginal posterior distributions of $\alpha^X$ (Web Table 3). For both BAC and TBAC, all the posterior weight is assigned to models containing $U_3$ and $U_4$, since these two predictors are strongly correlated with X. TBAC assigns equal weights to $\alpha^X = (0, 1, 1, 1)$ and $\alpha^X = (1, 0, 1, 1)$, since $U_1$ and $U_2$ have the same correlation coefficient with X. In contrast, BAC assigns much higher weight to $\alpha^X = (0, 1, 1, 1)$ than to $\alpha^X = (1, 0, 1, 1)$. This is the result of the feedback effect, since $U_2$ is strongly correlated with Y while $U_1$ is weakly correlated with Y.

[Web Table 3 about here.]

We next compared the marginal posterior distributions of $\alpha^Y$ (Web Table 4). For both BAC and TBAC, the posterior weights are concentrated on models containing $U_2$, $U_3$ and $U_4$, since these three predictors are highly correlated with either X or Y or both. Large weights are assigned to $\alpha^Y = (0, 1, 1, 1)$, the model not containing $U_1$, since $U_1$ is only weakly correlated with both X and Y. Compared to TBAC, BAC assigns more weight to $\alpha^Y = (0, 1, 1, 1)$. By considering the feedback effect and jointly modeling the exposure and the outcome, BAC tends to assign higher weights to more parsimonious models.

[Web Table 4 about here.]

Finally, we compared the estimation of the exposure effect, $\beta$. As shown in Web Table 5, the estimates from the two methods are very similar.

[Web Table 5 about here.]
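The data-generating model of Web Appendix D can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, and the variance of the exposure noise does not appear in the text above, so `sigma2_x=1.0` is a placeholder assumption. The helper `ols_exposure_effect` fits the full model (1,1,1,1) by least squares as a simple sanity check and is not part of BAC or TBAC.

```python
import numpy as np


def simulate_appendix_d(n=100, sigma2_x=1.0, seed=0):
    """Draw one data set (X, Y, U) from the Web Appendix D model.

    sigma2_x is the variance of the exposure noise. Its value is not
    given in the text, so the default of 1.0 is an assumption.
    """
    rng = np.random.default_rng(seed)
    beta = 0.1
    delta_x = np.array([0.1, 0.1, 0.6, 0.6])  # delta^X_1, ..., delta^X_4
    delta_y = np.array([0.1, 2.0, 0.1, 2.0])  # delta^Y_1, ..., delta^Y_4
    U = rng.standard_normal((n, 4))           # independent N(0, 1) confounders
    X = U @ delta_x + rng.normal(0.0, np.sqrt(sigma2_x), size=n)
    Y = beta * X + U @ delta_y + rng.standard_normal(n)
    return X, Y, U


def ols_exposure_effect(X, Y, U):
    """Least-squares coefficient of X when Y is regressed on (X, U)."""
    design = np.column_stack([X, U])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef[0]


if __name__ == "__main__":
    X, Y, U = simulate_appendix_d()
    print(ols_exposure_effect(X, Y, U))
```

Repeating the call with 500 different seeds would give replicate data sets of the kind described above, again under the assumed noise variance.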
Web Appendix E: Simulation Results for Comparing the MSEs from BAC and TBAC versus that from BMA

We used the same models as in the two simulation scenarios in the paper, but considered a smaller sample size of 100. For each simulation scenario, we generated 500 replications. The dependence parameters $\omega$ in both BAC and TBAC are set to $\infty$. The estimation results are summarized in Web Table 6. When the sample size is 100, the MSE from BMA is lower than those from BAC and TBAC in simulation scenario one but higher in scenario two. When the sample size increases to 1000, as shown in the paper, the MSEs of BAC and TBAC are lower in both scenarios.

[Web Table 6 about here.]

References

Bernardo, J. M. and Smith, A. F. M. (2000). Bayesian Theory. John Wiley & Sons, England.

Madigan, D. and York, J. (1995). Bayesian graphical models for discrete data. International Statistical Review 63.

Raftery, A. E., Madigan, D., and Hoeting, J. A. (1997). Bayesian model averaging for linear regression models. Journal of the American Statistical Association 92.
Web Table 1
Comparison of estimates of β from BAC and TBAC using odds priors. BIAS is the difference between the mean of the estimates of β and the true value, SEE is the mean of the standard error estimates, SSE is the standard error of the estimates of β, MSE is the mean squared error, and CP is the coverage probability of the 95% confidence interval or credible interval.

Method          BIAS   SEE   SSE   MSE   CP
True model
BAC    ω =
       ω =
       ω =
       ω =
TBAC   ω =
       ω =
       ω =
       ω =
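The summary measures defined in the caption of Web Table 1 can be computed from replicated simulation output as in the following Python sketch. The function name and argument layout are hypothetical, and the use of the sample standard deviation (`ddof=1`) for SSE is an assumption, since the caption does not specify the divisor.

```python
import numpy as np


def summarize_estimates(estimates, std_errors, lower, upper, true_beta):
    """Summarize replicated estimates of beta as in Web Table 1.

    estimates, std_errors, lower and upper each hold one value per
    simulated data set; lower and upper are the 95% interval limits.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    covered = (lower <= true_beta) & (true_beta <= upper)
    return {
        "BIAS": estimates.mean() - true_beta,          # mean estimate minus truth
        "SEE": std_errors.mean(),                      # mean estimated standard error
        "SSE": estimates.std(ddof=1),                  # spread of the estimates (assumed ddof=1)
        "MSE": np.mean((estimates - true_beta) ** 2),  # mean squared error
        "CP": covered.mean(),                          # share of intervals covering the truth
    }
```

Comparing SEE with SSE indicates whether the reported standard errors match the actual spread of the estimates across replications.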
Web Table 2
Comparison of estimates of β from BAC and TBAC when the exposure model is misspecified. The dependence parameters ω in both BAC and TBAC are set to ∞.

Method          BIAS   SEE   SSE   MSE   CP
True model
BAC
TBAC

Web Table 3
Comparison of marginal posterior distributions of α^X. The dependence parameters ω in both BAC and TBAC are set to ∞.

Model       P(α^X | D) from BAC   P(α^X | D) from TBAC
(0,0,1,1)
(0,1,1,1)
(1,0,1,1)
(1,1,1,1)

Web Table 4
Comparison of marginal posterior distributions of α^Y. The dependence parameters ω in both BAC and TBAC are set to ∞.

Model       P(α^Y | D) from BAC   P(α^Y | D) from TBAC
(0,1,1,1)
(1,1,1,1)

Web Table 5
Comparison of estimates of β from BAC and TBAC. The dependence parameters ω in both BAC and TBAC are set to ∞.

Model                        BIAS   SEE   SSE   MSE   CP
MLE from model (0,1,1,1)
MLE from model (1,1,1,1)
BAC
TBAC

Web Table 6
Comparison of MSEs from BAC, TBAC and BMA. The data were generated from the same models as in the two simulation scenarios in the paper, but with sample size 100. The dependence parameters ω in both BAC and TBAC are set to ∞.

Simulation Scenario   Method   BIAS   SEE   SSE   MSE   CP
One                   BAC
                      TBAC
                      BMA
Two                   BAC
                      TBAC
                      BMA
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationMarkov Chain Monte Carlo Simulation Made Simple
Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More informationFitting Subject-specific Curves to Grouped Longitudinal Data
Fitting Subject-specific Curves to Grouped Longitudinal Data Djeundje, Viani Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK E-mail: vad5@hw.ac.uk Currie,
More informationPS 271B: Quantitative Methods II. Lecture Notes
PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More informationproblem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a non-random
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationExtreme Value Modeling for Detection and Attribution of Climate Extremes
Extreme Value Modeling for Detection and Attribution of Climate Extremes Jun Yan, Yujing Jiang Joint work with Zhuo Wang, Xuebin Zhang Department of Statistics, University of Connecticut February 2, 2016
More informationAnalysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
More informationBayesian Model Averaging CRM in Phase I Clinical Trials
M.D. Anderson Cancer Center 1 Bayesian Model Averaging CRM in Phase I Clinical Trials Department of Biostatistics U. T. M. D. Anderson Cancer Center Houston, TX Joint work with Guosheng Yin M.D. Anderson
More informationModeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data
Modeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data Brian J. Smith, Ph.D. The University of Iowa Joint Statistical Meetings August 10,
More informationImputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More information11. Time series and dynamic linear models
11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd
More information1 Prior Probability and Posterior Probability
Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationIllustration (and the use of HLM)
Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we does the use of the software HLM. In addition, we will
More informationValidation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT
Validation of Software for Bayesian Models using Posterior Quantiles Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Abstract We present a simulation-based method designed to establish that software
More informationUniversity of Maryland Fraternity & Sorority Life Spring 2015 Academic Report
University of Maryland Fraternity & Sorority Life Academic Report Academic and Population Statistics Population: # of Students: # of New Members: Avg. Size: Avg. GPA: % of the Undergraduate Population
More informationComparison of Estimation Methods for Complex Survey Data Analysis
Comparison of Estimation Methods for Complex Survey Data Analysis Tihomir Asparouhov 1 Muthen & Muthen Bengt Muthen 2 UCLA 1 Tihomir Asparouhov, Muthen & Muthen, 3463 Stoner Ave. Los Angeles, CA 90066.
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationModel Selection and Claim Frequency for Workers Compensation Insurance
Model Selection and Claim Frequency for Workers Compensation Insurance Jisheng Cui, David Pitt and Guoqi Qian Abstract We consider a set of workers compensation insurance claim data where the aggregate
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationSample Size Calculation for Longitudinal Studies
Sample Size Calculation for Longitudinal Studies Phil Schumm Department of Health Studies University of Chicago August 23, 2004 (Supported by National Institute on Aging grant P01 AG18911-01A1) Introduction
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationEstimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationBayesian Approaches to Handling Missing Data
Bayesian Approaches to Handling Missing Data Nicky Best and Alexina Mason BIAS Short Course, Jan 30, 2012 Lecture 1. Introduction to Missing Data Bayesian Missing Data Course (Lecture 1) Introduction to
More informationCourse 4 Examination Questions And Illustrative Solutions. November 2000
Course 4 Examination Questions And Illustrative Solutions Novemer 000 1. You fit an invertile first-order moving average model to a time series. The lag-one sample autocorrelation coefficient is 0.35.
More informationLasso on Categorical Data
Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationComparison of resampling method applied to censored data
International Journal of Advanced Statistics and Probability, 2 (2) (2014) 48-55 c Science Publishing Corporation www.sciencepubco.com/index.php/ijasp doi: 10.14419/ijasp.v2i2.2291 Research Paper Comparison
More informationMultilevel Modelling of medical data
Statistics in Medicine(00). To appear. Multilevel Modelling of medical data By Harvey Goldstein William Browne And Jon Rasbash Institute of Education, University of London 1 Summary This tutorial presents
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationProbability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
More informationApplications of R Software in Bayesian Data Analysis
Article International Journal of Information Science and System, 2012, 1(1): 7-23 International Journal of Information Science and System Journal homepage: www.modernscientificpress.com/journals/ijinfosci.aspx
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationA Latent Variable Approach to Validate Credit Rating Systems using R
A Latent Variable Approach to Validate Credit Rating Systems using R Chicago, April 24, 2009 Bettina Grün a, Paul Hofmarcher a, Kurt Hornik a, Christoph Leitner a, Stefan Pichler a a WU Wien Grün/Hofmarcher/Hornik/Leitner/Pichler
More informationGaussian Processes to Speed up Hamiltonian Monte Carlo
Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo
More informationBayesian Model Averaging Continual Reassessment Method BMA-CRM. Guosheng Yin and Ying Yuan. August 26, 2009
Bayesian Model Averaging Continual Reassessment Method BMA-CRM Guosheng Yin and Ying Yuan August 26, 2009 This document provides the statistical background for the Bayesian model averaging continual reassessment
More informationConfidence Intervals for Cp
Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process
More informationLogit Models for Binary Data
Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response
More informationWeb-based Supplementary Materials
Web-based Supplementary Materials Continual Reassessment Method for Partial Ordering by Nolan A. Wages, Mark R. Conaway, and John O Quigley Web Appendix A: Further details for matrix orders In this section,
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationQUALITY ENGINEERING PROGRAM
QUALITY ENGINEERING PROGRAM Production engineering deals with the practical engineering problems that occur in manufacturing planning, manufacturing processes and in the integration of the facilities and
More informationSupplement to Call Centers with Delay Information: Models and Insights
Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290
More informationTwo Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
More informationBayesian Adaptive Designs for Early-Phase Oncology Trials
The University of Hong Kong 1 Bayesian Adaptive Designs for Early-Phase Oncology Trials Associate Professor Department of Statistics & Actuarial Science The University of Hong Kong The University of Hong
More informationD-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:
More informationGenerating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010
Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte
More informationUsing the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes
Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes JunXuJ.ScottLong Indiana University August 22, 2005 The paper provides technical details on
More informationMultivariate normal distribution and testing for means (see MKB Ch 3)
Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................