Bayesian inference for population prediction of individuals without health insurance in Florida

Size: px
Start display at page:

Download "Bayesian inference for population prediction of individuals without health insurance in Florida"

Transcription

1 Bayesian inference for population prediction of individuals without health insurance in Florida Neung Soo Ha 1 1 NISS 1 / 24

2 Outline Motivation Description of the Behavioral Risk Factor Surveillance System, BRFSS Description of posterior predictive distribution under informative/non-informative sampling Selection bias Inference for proportions of individuals without the health insurance Variation of maps 2 / 24

3 Motivation: Interest: 1. Inference for the county-level proportions of persons without health insurance in FL, using the 2010 Behavioral Risk Factor Surveillance System, BRFSS. 2. Display the variation of maps 3 / 24

4 2010 BRFSS: Survey description The largest on-going telephone based survey; administered by the Centers for Disease Control and Prevention. Collect data on individual risk behaviors and preventive health practices for the adult population(18 years of age and older); collects state-specific data ex: alcohol/cigarette consumption, general health status, health insurance, Uses a disproportionate stratified sample (DSS) design. In FL, 63 out of 67 counties were sampled. No cell phones for / 24

5 Examples of observed data Observed proportion of non insured persons Observed population proportion of non insured persons: Hisp Figure: Observed: All races Figure: Observed: Hispanics white=having insurance, blue=not having insurance 5 / 24

6 Complication 1. Small or non-existent number of observations for some sub-population groups in certain geographical areas. 2. Possible presence of selection bias in the sample 6 / 24

7 Model specification for the BRFSS data P(Y ikl = 1 θ ik ) = θ ik logit(θ ik ) = X k β + ν i ν i N(0, σ 2 ν) Y ikl = 1 : not having insurance, Y ikl = 0 : having insurance i = 1,..., M: county k = 1,..., K: population class (9 age groups 3 races 2 genders ) l = 1,..., N ik : units in county i and group k N ik = population size in ith county and kth group X k = vector of indicators for group k. p(β) = const on (, ) Gelman et al.(2004) p(σ 2 ν) = const on (0, ) 7 / 24

8 Model Evaluation Model fit Bayes residual plots QQ plots Bayesian tests: Partial posterior predictive p-values, Bayarri and Berger(2002) Predictions Cross-validation to simulate finite population inference Use 100(1-p)% of observed data to make predictions for the 100*p% that were held-out 8 / 24

9 Posterior predictive distribution with non-informative sampling Use posterior predictive distribution to make inference for the remaining non-sampled units Under the non-informative sampling, i.e. independence between the probability of selecting a person and the response: Y s = sampled units, Y ns = non-sampled units Y ns Y s θ, X f (Y ns Y s, X) = g(y ns Y s, X, θ)p(θ Y s, X)dθ = g(y ns θ, X)p(θ Y s, X)dθ p(θ Y s, X) L(θ Y s, X)p(θ β, σν)p(β, 2 σν): 2 posterior distribution θ = a vector of model parameters (β, σ 2 ν ) = a vector of hyperparameters 9 / 24

10 Under informative sampling Selection probabilities are related to the outcome variables. The observed outcomes may not be representative of the population outcomes. May cause bias for inferences about the finite population parameters. 10 / 24

11 Selection bias Let I = (I 1,..., I N ), I i = 1 if i s, I i = 0 if i / s, s = sample The posterior predictive distribution with I f (Y ns Y s, X, θ, I) g(y ns θ, X)p(I Y, X, θ) (1) Note: p(i X, Y, θ) p(i X, θ) i.e. selection bias. Selection information only comes from the sampled units. 11 / 24

12 Procedure: Ha and Sedransk(2015) 1. Pfeffermann and Sverchkov (2009): p(i Y, X, θ) = p(i Y, X) = k f 1(I k Y, X k ) 2. Using the Poisson sampling approximation, Chambers et. al.(1998) p(i Y, X) = M K { P(I ikl = 1 Y, X k )}{ (1 P(I ikl = 1 Y, X k ))} l s ik l/ s ik i=1 k=1 3. P(I ikl = 1 Y, X k ) = E U (π ikl Y, X k ) = {E s (w ikl Y, X k )} 1, Pfeffermann and Sverchkov (2009), πikl = probability of unit selected w ikl = 1/π ikl 12 / 24

13 Posterior predictive distribution with inclusion probability g(y ns Y s, X, θ, I) g(y ns θ, X) M i=1 K k=1 (1 a k) M ik1(1 b k ) M ik0, a k = 1 w k,y0, b k = 1 w k,y1 w k,y0 = average of inverse probability of selected units for class k with response Y ikl = 0 wk,y1 = average of inverse probability of selected units for class k with response Y ikl = 1 Mik1 : # non-sampled units with Y ikl = 1 M ik0 : # non-sampled units with Y ikl = 0 M ik1 + M ik0 = N ik n ik Use rejection sampling algorithm to obtain g(y ns ), Robert, C. and Casella, G. (2004) In our data set, 0.96 (1 a k )/(1 b k ) 1.01 with 43 of the 54 K groups. No substantial selection bias. 13 / 24

14 Comparison between prediction to observation County comparison Observed proportion of non insured persons Predicted population proportion of non insured persons Figure: Observed: All races Figure: Predicted: All races 14 / 24

15 Comparison between prediction to observation Observed population proportion of non insured persons: Hisp Predicted population proportion of non insured persons: Hisp Figure: Observed: Hispanics Figure: Predicted: Hispanics 15 / 24

16 Comparison between races Predicted population proportion of non insured persons: White Predicted population proportion of non insured persons: Hisp Figure: Predicted: Whites Figure: Predicted: Hispanics 16 / 24

17 Variation of maps Objective: Assess the variation of the entire map as a unit rather than expressing the variation separately for each geographic unit. Produce a map, based on each MCMC replicate. Analyze variations from a set of maps Illustrate variation between the counties from ν i, the county-level random effect. 17 / 24

18 random effect: mean Mean choropleth map of random effect quintiles [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] Posterior means of county-level random effects, ν i, are partitioned into quintiles Each quintile is color coded Figure: Mean map of ˆν i 18 / 24

19 Besag Method, Besag et al (1995) Provide 100(1 a)% regions for each of ν i : ν i (L) < ν i < ν i (U) Construct lower choropleth map of ν i (L) and ν i (U) 1. Denote the posterior MCMC sample by {ν (t) i : i = 1,..., M, t = 1,..., T } 2. Order {ν (t) i x [t] i and ranks r (t) i : t = 1,..., T } separately for each i to obtain order statistics 3. For fixed k {1,..., T }, let t be the smallest integer such that x [T +1 t ] i x (t) i x [t ] i, for all i, for at least k values of t 4. t = kth order statistic from the set a (t) = max{max i r (t) i, T + 1 min i r (t) i }, t = 1,..., T, i.e. t = a [k] 5. Then, {[x [T +1 t ] i, x [t ] i ] : i = 1,..., M} are a set of simultaneous credible regions containing at least 100k/T % of the distribution 19 / 24

20 Lower and Upper maps random effect: Besag L 90 random effect: mean quintiles random effect: Besag U 90 [ 1.12, 0.76] ( 0.76, 0.562] ( 0.562, 0.359] ( 0.359, 0.19] ( 0.19,0.264] [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] [ 0.172,0.203] (0.203,0.359] (0.359,0.519] (0.519,0.697] (0.697,1.07] Figure: 90 % Lower, Mean, 90% Upper 20 / 24

21 Variation in Choropleth map Goal: Illustrate the uncertainty of the choropleth maps. Use T = 1000 MCMC replications of ν i Each replication can be used to produce a choropleth map. Each quintile is color coded. For each replication, the value of ith county is color-coded according to its quintile. 21 / 24

22 Heat map Q1 Q2 Q3 Q4 Q5 Rows: replicates, columns: counties Variation in maps is evaluated by color changes 22 / 24

23 Summary Present a method for posterior predictive inference that includes selection information for sampled units Illustrate variation of maps 23 / 24

24 Reference 1. Chambers, R. L., Dorfman, A. and Wang, W. (1998). Limited Information Likelihood Analysis of Survey Data. Journal of the Royal Statistical Society, 60, Bayesian benchmarking with application to small area estimation Test Gelman, A., Carlin, J. Stern, H. and Rubin, D. (2004), Bayesian Data Analysis. Chapman and Hall/ CRC, New York, NY. 3. Pfeffermann, D. and Sverchkov, M. (2009) Inference under Informative Sampling Sample Surveys: Inference and Analysis, 29B, Robert, C. and Casella, G. (2004), Monte Carlo Statistical Methods, Springer, New York, NY. 24 / 24

Applications of R Software in Bayesian Data Analysis

Applications of R Software in Bayesian Data Analysis Article International Journal of Information Science and System, 2012, 1(1): 7-23 International Journal of Information Science and System Journal homepage: www.modernscientificpress.com/journals/ijinfosci.aspx

More information

Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

More information

11. Time series and dynamic linear models

11. Time series and dynamic linear models 11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd

More information

Handling attrition and non-response in longitudinal data

Handling attrition and non-response in longitudinal data Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

More information

Probabilistic Methods for Time-Series Analysis

Probabilistic Methods for Time-Series Analysis Probabilistic Methods for Time-Series Analysis 2 Contents 1 Analysis of Changepoint Models 1 1.1 Introduction................................ 1 1.1.1 Model and Notation....................... 2 1.1.2 Example:

More information

Validation of Software for Bayesian Models Using Posterior Quantiles

Validation of Software for Bayesian Models Using Posterior Quantiles Validation of Software for Bayesian Models Using Posterior Quantiles Samantha R. COOK, Andrew GELMAN, and Donald B. RUBIN This article presents a simulation-based method designed to establish the computational

More information

Parallelization Strategies for Multicore Data Analysis

Parallelization Strategies for Multicore Data Analysis Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management

More information

Validation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT

Validation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Validation of Software for Bayesian Models using Posterior Quantiles Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Abstract We present a simulation-based method designed to establish that software

More information

More details on the inputs, functionality, and output can be found below.

More details on the inputs, functionality, and output can be found below. Overview: The SMEEACT (Software for More Efficient, Ethical, and Affordable Clinical Trials) web interface (http://research.mdacc.tmc.edu/smeeactweb) implements a single analysis of a two-armed trial comparing

More information

Sampling for Bayesian computation with large datasets

Sampling for Bayesian computation with large datasets Sampling for Bayesian computation with large datasets Zaiying Huang Andrew Gelman April 27, 2005 Abstract Multilevel models are extremely useful in handling large hierarchical datasets. However, computation

More information

Gaussian Processes to Speed up Hamiltonian Monte Carlo

Gaussian Processes to Speed up Hamiltonian Monte Carlo Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo

More information

STAT3016 Introduction to Bayesian Data Analysis

STAT3016 Introduction to Bayesian Data Analysis STAT3016 Introduction to Bayesian Data Analysis Course Description The Bayesian approach to statistics assigns probability distributions to both the data and unknown parameters in the problem. This way,

More information

Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

More information

Bayesian Phylogeny and Measures of Branch Support

Bayesian Phylogeny and Measures of Branch Support Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The

More information

Bayesian Analysis of Comparative Survey Data

Bayesian Analysis of Comparative Survey Data Bayesian Analysis of Comparative Survey Data Bruce Western 1 Filiz Garip Princeton University April 2005 1 Department of Sociology, Princeton University, Princeton NJ 08544. We thank Sara Curran for making

More information

Health Insurance Estimates for States 1

Health Insurance Estimates for States 1 Health Insurance Estimates for States 1 Robin Fisher and Jennifer Campbell, U.S. Census Bureau, HHES-SAEB, FB 3 Room 1462 Washington, DC, 20233 Phone: (301) 763-5897, email: robin.c.fisher@census.gov Key

More information

Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data

Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data (Oxford) in collaboration with: Minjie Xu, Jun Zhu, Bo Zhang (Tsinghua) Balaji Lakshminarayanan (Gatsby) Bayesian

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????,??,?, pp. 1 14 C???? Biometrika Trust Printed in

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Measures of explained variance and pooling in multilevel models

Measures of explained variance and pooling in multilevel models Measures of explained variance and pooling in multilevel models Andrew Gelman, Columbia University Iain Pardoe, University of Oregon Contact: Iain Pardoe, Charles H. Lundquist College of Business, University

More information

Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS).

Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS. Vladislav Beresovsky National Center for Health Statistics 3311 Toledo Road Hyattsville, MD 078

More information

Bayesian Statistics in One Hour. Patrick Lam

Bayesian Statistics in One Hour. Patrick Lam Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical

More information

Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach

Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Refik Soyer * Department of Management Science The George Washington University M. Murat Tarimcilar Department of Management Science

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

More information

Foundations of Statistics Frequentist and Bayesian

Foundations of Statistics Frequentist and Bayesian Mary Parker, http://www.austincc.edu/mparker/stat/nov04/ page 1 of 13 Foundations of Statistics Frequentist and Bayesian Statistics is the science of information gathering, especially when the information

More information

Inference on the parameters of the Weibull distribution using records

Inference on the parameters of the Weibull distribution using records Statistics & Operations Research Transactions SORT 39 (1) January-June 2015, 3-18 ISSN: 1696-2281 eissn: 2013-8830 www.idescat.cat/sort/ Statistics & Operations Research c Institut d Estadstica de Catalunya

More information

Basic Bayesian Methods

Basic Bayesian Methods 6 Basic Bayesian Methods Mark E. Glickman and David A. van Dyk Summary In this chapter, we introduce the basics of Bayesian data analysis. The key ingredients to a Bayesian analysis are the likelihood

More information

Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu)

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Paper Author (s) Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Lei Zhang, University of Maryland, College Park (lei@umd.edu) Paper Title & Number Dynamic Travel

More information

Bayesian credibility methods for pricing non-life insurance on individual claims history

Bayesian credibility methods for pricing non-life insurance on individual claims history Bayesian credibility methods for pricing non-life insurance on individual claims history Daniel Eliasson Masteruppsats i matematisk statistik Master Thesis in Mathematical Statistics Masteruppsats 215:1

More information

Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models

Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Clement A Stone Abstract Interest in estimating item response theory (IRT) models using Bayesian methods has grown tremendously

More information

THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE

THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE Batsirai Winmore Mazviona 1 Tafadzwa Chiduza 2 ABSTRACT In general insurance, companies need to use data on claims gathered from

More information

Anomaly detection for Big Data, networks and cyber-security

Anomaly detection for Big Data, networks and cyber-security Anomaly detection for Big Data, networks and cyber-security Patrick Rubin-Delanchy University of Bristol & Heilbronn Institute for Mathematical Research Joint work with Nick Heard (Imperial College London),

More information

Bayesian Factorization Machines

Bayesian Factorization Machines Bayesian Factorization Machines Christoph Freudenthaler, Lars Schmidt-Thieme Information Systems & Machine Learning Lab University of Hildesheim 31141 Hildesheim {freudenthaler, schmidt-thieme}@ismll.de

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Estimating Industry Multiples

Estimating Industry Multiples Estimating Industry Multiples Malcolm Baker * Harvard University Richard S. Ruback Harvard University First Draft: May 1999 Rev. June 11, 1999 Abstract We analyze industry multiples for the S&P 500 in

More information

STATISTICAL DATA ANALYSIS

STATISTICAL DATA ANALYSIS STATISTICAL DATA ANALYSIS INTRODUCTION Fethullah Karabiber YTU, Fall of 2012 The role of statistical analysis in science This course discusses some statistical methods, which involve applying statistical

More information

2WB05 Simulation Lecture 8: Generating random variables

2WB05 Simulation Lecture 8: Generating random variables 2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating

More information

The Chinese Restaurant Process

The Chinese Restaurant Process COS 597C: Bayesian nonparametrics Lecturer: David Blei Lecture # 1 Scribes: Peter Frazier, Indraneel Mukherjee September 21, 2007 In this first lecture, we begin by introducing the Chinese Restaurant Process.

More information

A Bayesian hierarchical surrogate outcome model for multiple sclerosis

A Bayesian hierarchical surrogate outcome model for multiple sclerosis A Bayesian hierarchical surrogate outcome model for multiple sclerosis 3 rd Annual ASA New Jersey Chapter / Bayer Statistics Workshop David Ohlssen (Novartis), Luca Pozzi and Heinz Schmidli (Novartis)

More information

A Bayesian Model to Enhance Domestic Energy Consumption Forecast

A Bayesian Model to Enhance Domestic Energy Consumption Forecast Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management Istanbul, Turkey, July 3 6, 2012 A Bayesian Model to Enhance Domestic Energy Consumption Forecast Mohammad

More information

Analyzing Clinical Trial Data via the Bayesian Multiple Logistic Random Effects Model

Analyzing Clinical Trial Data via the Bayesian Multiple Logistic Random Effects Model Analyzing Clinical Trial Data via the Bayesian Multiple Logistic Random Effects Model Bartolucci, A.A 1, Singh, K.P 2 and Bae, S.J 2 1 Dept. of Biostatistics, University of Alabama at Birmingham, Birmingham,

More information

Black-box Performance Models for Virtualized Web. Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang ardagna@elet.polimi.it

Black-box Performance Models for Virtualized Web. Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang ardagna@elet.polimi.it Black-box Performance Models for Virtualized Web Service Applications Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang ardagna@elet.polimi.it Reference scenario 2 Virtualization, proposed in early

More information

BayesX - Software for Bayesian Inference in Structured Additive Regression

BayesX - Software for Bayesian Inference in Structured Additive Regression BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich

More information

Audit Sampling for Tests of Controls and Substantive Tests of Transactions

Audit Sampling for Tests of Controls and Substantive Tests of Transactions Audit Sampling for Tests of Controls and Substantive Tests of Transactions Chapter 15 2008 Prentice Hall Business Publishing, Auditing 12/e, Arens/Beasley/Elder 15-1 Learning Objective 1 Explain the concept

More information

Operational Risk Management: Added Value of Advanced Methodologies

Operational Risk Management: Added Value of Advanced Methodologies Operational Risk Management: Added Value of Advanced Methodologies Paris, September 2013 Bertrand HASSANI Head of Major Risks Management & Scenario Analysis Disclaimer: The opinions, ideas and approaches

More information

Data Collection and Sampling OPRE 6301

Data Collection and Sampling OPRE 6301 Data Collection and Sampling OPRE 6301 Recall... Statistics is a tool for converting data into information: Statistics Data Information But where then does data come from? How is it gathered? How do we

More information

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

More information

How To Find Out What Search Strategy Is Used In The U.S. Auto Insurance Industry

How To Find Out What Search Strategy Is Used In The U.S. Auto Insurance Industry Simultaneous or Sequential? Search Strategies in the U.S. Auto Insurance Industry Current version: April 2014 Elisabeth Honka Pradeep Chintagunta Abstract We show that the search method consumers use when

More information

Monte Carlo Methods and Models in Finance and Insurance

Monte Carlo Methods and Models in Finance and Insurance Chapman & Hall/CRC FINANCIAL MATHEMATICS SERIES Monte Carlo Methods and Models in Finance and Insurance Ralf Korn Elke Korn Gerald Kroisandt f r oc) CRC Press \ V^ J Taylor & Francis Croup ^^"^ Boca Raton

More information

Section 5. Stan for Big Data. Bob Carpenter. Columbia University

Section 5. Stan for Big Data. Bob Carpenter. Columbia University Section 5. Stan for Big Data Bob Carpenter Columbia University Part I Overview Scaling and Evaluation data size (bytes) 1e18 1e15 1e12 1e9 1e6 Big Model and Big Data approach state of the art big model

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

DISCUSSION PAPER ANALYSIS OF VARIANCE WHY IT IS MORE IMPORTANT THAN EVER 1. BY ANDREW GELMAN Columbia University

DISCUSSION PAPER ANALYSIS OF VARIANCE WHY IT IS MORE IMPORTANT THAN EVER 1. BY ANDREW GELMAN Columbia University The Annals of Statistics 2005, Vol. 33, No. 1, 1 53 DOI 10.1214/009053604000001048 Institute of Mathematical Statistics, 2005 DISCUSSION PAPER ANALYSIS OF VARIANCE WHY IT IS MORE IMPORTANT THAN EVER 1

More information

Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). 1

Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). 1 Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). 1 Vladislav Beresovsky (hvy4@cdc.gov), Janey Hsiao National Center for Health Statistics, CDC

More information

Spatial modelling of claim frequency and claim size in non-life insurance

Spatial modelling of claim frequency and claim size in non-life insurance Spatial modelling of claim frequency and claim size in non-life insurance Susanne Gschlößl Claudia Czado April 13, 27 Abstract In this paper models for claim frequency and average claim size in non-life

More information

Bayesian Hierarchical Classes Analysis

Bayesian Hierarchical Classes Analysis Bayesian Hierarchical Classes Analysis Iwin Leenen Universidad Complutense de Madrid University of Leuven Iven Van Mechelen University of Leuven Andrew Gelman Columbia University Stijn De Knop University

More information

Sample Size Designs to Assess Controls

Sample Size Designs to Assess Controls Sample Size Designs to Assess Controls B. Ricky Rambharat, PhD, PStat Lead Statistician Office of the Comptroller of the Currency U.S. Department of the Treasury Washington, DC FCSM Research Conference

More information

Nominal and ordinal logistic regression

Nominal and ordinal logistic regression Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome

More information

Approaches for Analyzing Survey Data: a Discussion

Approaches for Analyzing Survey Data: a Discussion Approaches for Analyzing Survey Data: a Discussion David Binder 1, Georgia Roberts 1 Statistics Canada 1 Abstract In recent years, an increasing number of researchers have been able to access survey microdata

More information

Statistical issues in the analysis of microarray data

Statistical issues in the analysis of microarray data Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data

More information

Dealing with Missing Data

Dealing with Missing Data Res. Lett. Inf. Math. Sci. (2002) 3, 153-160 Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Dealing with Missing Data Judi Scheffer I.I.M.S. Quad A, Massey University, P.O. Box 102904

More information

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1 Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

More information

Summary. Modern Bayesian inference is highly computational but commonly proceeds without reference to modern developments in statistical graphics.

Summary. Modern Bayesian inference is highly computational but commonly proceeds without reference to modern developments in statistical graphics. Summary. Modern Bayesian inference is highly computational but commonly proceeds without reference to modern developments in statistical graphics. This should change. Visualization has two important roles

More information

AALBORG UNIVERSITY. An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants

AALBORG UNIVERSITY. An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants AALBORG UNIVERSITY An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants by Jesper Møller, Anthony N. Pettitt, Kasper K. Berthelsen and Robert W. Reeves

More information

AClass: An online algorithm for generative classification

AClass: An online algorithm for generative classification AClass: An online algorithm for generative classification Vikash K. Mansinghka MIT BCS and CSAIL vkm@mit.edu Daniel M. Roy MIT CSAIL droy@mit.edu Ryan Rifkin Honda Research USA rrifkin@honda-ri.com Josh

More information

Objections to Bayesian statistics

Objections to Bayesian statistics Bayesian Analysis (2008) 3, Number 3, pp. 445 450 Objections to Bayesian statistics Andrew Gelman Abstract. Bayesian inference is one of the more controversial approaches to statistics. The fundamental

More information

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data Faming Liang University of Florida August 9, 2015 Abstract MCMC methods have proven to be a very powerful tool for analyzing

More information

33. STATISTICS. 33. Statistics 1

33. STATISTICS. 33. Statistics 1 33. STATISTICS 33. Statistics 1 Revised September 2011 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in high-energy physics. In statistics, we are interested in using a

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

Problem of Missing Data

Problem of Missing Data VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;

More information

Poisson Models for Count Data

Poisson Models for Count Data Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

More information

Estimation across multiple models with application to Bayesian computing and software development

Estimation across multiple models with application to Bayesian computing and software development Estimation across multiple models with application to Bayesian computing and software development Trevor J Sweeting (Department of Statistical Science, University College London) Richard J Stevens (Diabetes

More information

AN INTRODUCTION TO MARKOV CHAIN MONTE CARLO METHODS AND THEIR ACTUARIAL APPLICATIONS. Department of Mathematics and Statistics University of Calgary

AN INTRODUCTION TO MARKOV CHAIN MONTE CARLO METHODS AND THEIR ACTUARIAL APPLICATIONS. Department of Mathematics and Statistics University of Calgary AN INTRODUCTION TO MARKOV CHAIN MONTE CARLO METHODS AND THEIR ACTUARIAL APPLICATIONS DAVID P. M. SCOLLNIK Department of Mathematics and Statistics University of Calgary Abstract This paper introduces the

More information

A hidden Markov model for criminal behaviour classification

A hidden Markov model for criminal behaviour classification RSS2004 p.1/19 A hidden Markov model for criminal behaviour classification Francesco Bartolucci, Institute of economic sciences, Urbino University, Italy. Fulvia Pennoni, Department of Statistics, University

More information

NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES

NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES Silvija Vlah Kristina Soric Visnja Vojvodic Rosenzweig Department of Mathematics

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Multilevel Modeling of Complex Survey Data

Multilevel Modeling of Complex Survey Data Multilevel Modeling of Complex Survey Data Sophia Rabe-Hesketh, University of California, Berkeley and Institute of Education, University of London Joint work with Anders Skrondal, London School of Economics

More information

A Statistical Framework for Operational Infrasound Monitoring

A Statistical Framework for Operational Infrasound Monitoring A Statistical Framework for Operational Infrasound Monitoring Stephen J. Arrowsmith Rod W. Whitaker LA-UR 11-03040 The views expressed here do not necessarily reflect the views of the United States Government,

More information

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of

More information

PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE

PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE BY P.D. ENGLAND AND R.J. VERRALL ABSTRACT This paper extends the methods introduced in England & Verrall (00), and shows how predictive

More information

HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

More information

Model-based Synthesis. Tony O Hagan

Model-based Synthesis. Tony O Hagan Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

More information

Bayes and Naïve Bayes. cs534-machine Learning

Bayes and Naïve Bayes. cs534-machine Learning Bayes and aïve Bayes cs534-machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule

More information

Bayesian Methods for the Social and Behavioral Sciences

Bayesian Methods for the Social and Behavioral Sciences Bayesian Methods for the Social and Behavioral Sciences Jeff Gill Harvard University 2007 ICPSR First Session: June 25-July 20, 9-11 AM. Email: jgill@iq.harvard.edu TA: Yu-Sung Su (ys463@columbia.edu).

More information

Item selection by latent class-based methods: an application to nursing homes evaluation

Item selection by latent class-based methods: an application to nursing homes evaluation Item selection by latent class-based methods: an application to nursing homes evaluation Francesco Bartolucci, Giorgio E. Montanari, Silvia Pandolfi 1 Department of Economics, Finance and Statistics University

More information

Topic models for Sentiment analysis: A Literature Survey

Topic models for Sentiment analysis: A Literature Survey Topic models for Sentiment analysis: A Literature Survey Nikhilkumar Jadhav 123050033 June 26, 2014 In this report, we present the work done so far in the field of sentiment analysis using topic models.

More information

Obesity in America: A Growing Trend

Obesity in America: A Growing Trend Obesity in America: A Growing Trend David Todd P e n n s y l v a n i a S t a t e U n i v e r s i t y Utilizing Geographic Information Systems (GIS) to explore obesity in America, this study aims to determine

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

More information

Machine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University

Machine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University Machine Learning Methods for Causal Effects Susan Athey, Stanford University Guido Imbens, Stanford University Introduction Supervised Machine Learning v. Econometrics/Statistics Lit. on Causality Supervised

More information

Comparison of Imputation Methods in the Survey of Income and Program Participation

Comparison of Imputation Methods in the Survey of Income and Program Participation Comparison of Imputation Methods in the Survey of Income and Program Participation Sarah McMillan U.S. Census Bureau, 4600 Silver Hill Rd, Washington, DC 20233 Any views expressed are those of the author

More information

Lecture 9: Introduction to Pattern Analysis

Lecture 9: Introduction to Pattern Analysis Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns

More information

Note on the EM Algorithm in Linear Regression Model

Note on the EM Algorithm in Linear Regression Model International Mathematical Forum 4 2009 no. 38 1883-1889 Note on the M Algorithm in Linear Regression Model Ji-Xia Wang and Yu Miao College of Mathematics and Information Science Henan Normal University

More information

MODELLING AND ANALYSIS OF

MODELLING AND ANALYSIS OF MODELLING AND ANALYSIS OF FOREST FIRE IN PORTUGAL - PART I Giovani L. Silva CEAUL & DMIST - Universidade Técnica de Lisboa gsilva@math.ist.utl.pt Maria Inês Dias & Manuela Oliveira CIMA & DM - Universidade

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics

An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics Slide 1 An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics Dr. Christian Asseburg Centre for Health Economics Part 1 Slide 2 Talk overview Foundations of Bayesian statistics

More information