Bayesian inference for population prediction of individuals without health insurance in Florida


 Colleen Houston
 3 years ago
 Views:
Transcription
1 Bayesian inference for population prediction of individuals without health insurance in Florida Neung Soo Ha 1 1 NISS 1 / 24
2 Outline Motivation Description of the Behavioral Risk Factor Surveillance System, BRFSS Description of posterior predictive distribution under informative/noninformative sampling Selection bias Inference for proportions of individuals without the health insurance Variation of maps 2 / 24
3 Motivation: Interest: 1. Inference for the countylevel proportions of persons without health insurance in FL, using the 2010 Behavioral Risk Factor Surveillance System, BRFSS. 2. Display the variation of maps 3 / 24
4 2010 BRFSS: Survey description The largest ongoing telephone based survey; administered by the Centers for Disease Control and Prevention. Collect data on individual risk behaviors and preventive health practices for the adult population(18 years of age and older); collects statespecific data ex: alcohol/cigarette consumption, general health status, health insurance, Uses a disproportionate stratified sample (DSS) design. In FL, 63 out of 67 counties were sampled. No cell phones for / 24
5 Examples of observed data Observed proportion of non insured persons Observed population proportion of non insured persons: Hisp Figure: Observed: All races Figure: Observed: Hispanics white=having insurance, blue=not having insurance 5 / 24
6 Complication 1. Small or nonexistent number of observations for some subpopulation groups in certain geographical areas. 2. Possible presence of selection bias in the sample 6 / 24
7 Model specification for the BRFSS data P(Y ikl = 1 θ ik ) = θ ik logit(θ ik ) = X k β + ν i ν i N(0, σ 2 ν) Y ikl = 1 : not having insurance, Y ikl = 0 : having insurance i = 1,..., M: county k = 1,..., K: population class (9 age groups 3 races 2 genders ) l = 1,..., N ik : units in county i and group k N ik = population size in ith county and kth group X k = vector of indicators for group k. p(β) = const on (, ) Gelman et al.(2004) p(σ 2 ν) = const on (0, ) 7 / 24
8 Model Evaluation Model fit Bayes residual plots QQ plots Bayesian tests: Partial posterior predictive pvalues, Bayarri and Berger(2002) Predictions Crossvalidation to simulate finite population inference Use 100(1p)% of observed data to make predictions for the 100*p% that were heldout 8 / 24
9 Posterior predictive distribution with noninformative sampling Use posterior predictive distribution to make inference for the remaining nonsampled units Under the noninformative sampling, i.e. independence between the probability of selecting a person and the response: Y s = sampled units, Y ns = nonsampled units Y ns Y s θ, X f (Y ns Y s, X) = g(y ns Y s, X, θ)p(θ Y s, X)dθ = g(y ns θ, X)p(θ Y s, X)dθ p(θ Y s, X) L(θ Y s, X)p(θ β, σν)p(β, 2 σν): 2 posterior distribution θ = a vector of model parameters (β, σ 2 ν ) = a vector of hyperparameters 9 / 24
10 Under informative sampling Selection probabilities are related to the outcome variables. The observed outcomes may not be representative of the population outcomes. May cause bias for inferences about the finite population parameters. 10 / 24
11 Selection bias Let I = (I 1,..., I N ), I i = 1 if i s, I i = 0 if i / s, s = sample The posterior predictive distribution with I f (Y ns Y s, X, θ, I) g(y ns θ, X)p(I Y, X, θ) (1) Note: p(i X, Y, θ) p(i X, θ) i.e. selection bias. Selection information only comes from the sampled units. 11 / 24
12 Procedure: Ha and Sedransk(2015) 1. Pfeffermann and Sverchkov (2009): p(i Y, X, θ) = p(i Y, X) = k f 1(I k Y, X k ) 2. Using the Poisson sampling approximation, Chambers et. al.(1998) p(i Y, X) = M K { P(I ikl = 1 Y, X k )}{ (1 P(I ikl = 1 Y, X k ))} l s ik l/ s ik i=1 k=1 3. P(I ikl = 1 Y, X k ) = E U (π ikl Y, X k ) = {E s (w ikl Y, X k )} 1, Pfeffermann and Sverchkov (2009), πikl = probability of unit selected w ikl = 1/π ikl 12 / 24
13 Posterior predictive distribution with inclusion probability g(y ns Y s, X, θ, I) g(y ns θ, X) M i=1 K k=1 (1 a k) M ik1(1 b k ) M ik0, a k = 1 w k,y0, b k = 1 w k,y1 w k,y0 = average of inverse probability of selected units for class k with response Y ikl = 0 wk,y1 = average of inverse probability of selected units for class k with response Y ikl = 1 Mik1 : # nonsampled units with Y ikl = 1 M ik0 : # nonsampled units with Y ikl = 0 M ik1 + M ik0 = N ik n ik Use rejection sampling algorithm to obtain g(y ns ), Robert, C. and Casella, G. (2004) In our data set, 0.96 (1 a k )/(1 b k ) 1.01 with 43 of the 54 K groups. No substantial selection bias. 13 / 24
14 Comparison between prediction to observation County comparison Observed proportion of non insured persons Predicted population proportion of non insured persons Figure: Observed: All races Figure: Predicted: All races 14 / 24
15 Comparison between prediction to observation Observed population proportion of non insured persons: Hisp Predicted population proportion of non insured persons: Hisp Figure: Observed: Hispanics Figure: Predicted: Hispanics 15 / 24
16 Comparison between races Predicted population proportion of non insured persons: White Predicted population proportion of non insured persons: Hisp Figure: Predicted: Whites Figure: Predicted: Hispanics 16 / 24
17 Variation of maps Objective: Assess the variation of the entire map as a unit rather than expressing the variation separately for each geographic unit. Produce a map, based on each MCMC replicate. Analyze variations from a set of maps Illustrate variation between the counties from ν i, the countylevel random effect. 17 / 24
18 random effect: mean Mean choropleth map of random effect quintiles [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] Posterior means of countylevel random effects, ν i, are partitioned into quintiles Each quintile is color coded Figure: Mean map of ˆν i 18 / 24
19 Besag Method, Besag et al (1995) Provide 100(1 a)% regions for each of ν i : ν i (L) < ν i < ν i (U) Construct lower choropleth map of ν i (L) and ν i (U) 1. Denote the posterior MCMC sample by {ν (t) i : i = 1,..., M, t = 1,..., T } 2. Order {ν (t) i x [t] i and ranks r (t) i : t = 1,..., T } separately for each i to obtain order statistics 3. For fixed k {1,..., T }, let t be the smallest integer such that x [T +1 t ] i x (t) i x [t ] i, for all i, for at least k values of t 4. t = kth order statistic from the set a (t) = max{max i r (t) i, T + 1 min i r (t) i }, t = 1,..., T, i.e. t = a [k] 5. Then, {[x [T +1 t ] i, x [t ] i ] : i = 1,..., M} are a set of simultaneous credible regions containing at least 100k/T % of the distribution 19 / 24
20 Lower and Upper maps random effect: Besag L 90 random effect: mean quintiles random effect: Besag U 90 [ 1.12, 0.76] ( 0.76, 0.562] ( 0.562, 0.359] ( 0.359, 0.19] ( 0.19,0.264] [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] [ 0.172,0.203] (0.203,0.359] (0.359,0.519] (0.519,0.697] (0.697,1.07] Figure: 90 % Lower, Mean, 90% Upper 20 / 24
21 Variation in Choropleth map Goal: Illustrate the uncertainty of the choropleth maps. Use T = 1000 MCMC replications of ν i Each replication can be used to produce a choropleth map. Each quintile is color coded. For each replication, the value of ith county is colorcoded according to its quintile. 21 / 24
22 Heat map Q1 Q2 Q3 Q4 Q5 Rows: replicates, columns: counties Variation in maps is evaluated by color changes 22 / 24
23 Summary Present a method for posterior predictive inference that includes selection information for sampled units Illustrate variation of maps 23 / 24
24 Reference 1. Chambers, R. L., Dorfman, A. and Wang, W. (1998). Limited Information Likelihood Analysis of Survey Data. Journal of the Royal Statistical Society, 60, Bayesian benchmarking with application to small area estimation Test Gelman, A., Carlin, J. Stern, H. and Rubin, D. (2004), Bayesian Data Analysis. Chapman and Hall/ CRC, New York, NY. 3. Pfeffermann, D. and Sverchkov, M. (2009) Inference under Informative Sampling Sample Surveys: Inference and Analysis, 29B, Robert, C. and Casella, G. (2004), Monte Carlo Statistical Methods, Springer, New York, NY. 24 / 24
Applications of R Software in Bayesian Data Analysis
Article International Journal of Information Science and System, 2012, 1(1): 723 International Journal of Information Science and System Journal homepage: www.modernscientificpress.com/journals/ijinfosci.aspx
More informationBayesian Statistics: Indian Buffet Process
Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note
More informationHandling attrition and nonresponse in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 6372 Handling attrition and nonresponse in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More information11. Time series and dynamic linear models
11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd
More informationProbabilistic Methods for TimeSeries Analysis
Probabilistic Methods for TimeSeries Analysis 2 Contents 1 Analysis of Changepoint Models 1 1.1 Introduction................................ 1 1.1.1 Model and Notation....................... 2 1.1.2 Example:
More informationValidation of Software for Bayesian Models Using Posterior Quantiles
Validation of Software for Bayesian Models Using Posterior Quantiles Samantha R. COOK, Andrew GELMAN, and Donald B. RUBIN This article presents a simulationbased method designed to establish the computational
More informationData Analysis and Uncertainty Part 3: Hypothesis Testing/Sampling
Data Analysis and Uncertainty Part 3: Hypothesis Testing/Sampling Instructor: Sargur N. University at Buffalo The State University of New York srihari@cedar.buffalo.edu Topics 1. Hypothesis Testing 1.
More informationSTAT3016 Introduction to Bayesian Data Analysis
STAT3016 Introduction to Bayesian Data Analysis Course Description The Bayesian approach to statistics assigns probability distributions to both the data and unknown parameters in the problem. This way,
More informationParallelization Strategies for Multicore Data Analysis
Parallelization Strategies for Multicore Data Analysis WeiChen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management
More informationBayesian modeling of inseparable spacetime variation in disease risk
Bayesian modeling of inseparable spacetime variation in disease risk Leonhard KnorrHeld Laina Mercer Department of Statistics UW May 23, 2013 Motivation Area and timespecific disease rates Area and
More informationValidation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT
Validation of Software for Bayesian Models using Posterior Quantiles Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT Abstract We present a simulationbased method designed to establish that software
More informationSampling for Bayesian computation with large datasets
Sampling for Bayesian computation with large datasets Zaiying Huang Andrew Gelman April 27, 2005 Abstract Multilevel models are extremely useful in handling large hierarchical datasets. However, computation
More informationMore details on the inputs, functionality, and output can be found below.
Overview: The SMEEACT (Software for More Efficient, Ethical, and Affordable Clinical Trials) web interface (http://research.mdacc.tmc.edu/smeeactweb) implements a single analysis of a twoarmed trial comparing
More informationBayesian Analysis of Comparative Survey Data
Bayesian Analysis of Comparative Survey Data Bruce Western 1 Filiz Garip Princeton University April 2005 1 Department of Sociology, Princeton University, Princeton NJ 08544. We thank Sara Curran for making
More informationGaussian Processes to Speed up Hamiltonian Monte Carlo
Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo
More informationSample Size Calculation for Finding Unseen Species
Bayesian Analysis (2009) 4, Number 4, pp. 763 792 Sample Size Calculation for Finding Unseen Species Hongmei Zhang and Hal Stern Abstract. Estimation of the number of species extant in a geographic region
More informationCombining information from different survey samples  a case study with data collected by world wide web and telephone
Combining information from different survey samples  a case study with data collected by world wide web and telephone Magne Aldrin Norwegian Computing Center P.O. Box 114 Blindern N0314 Oslo Norway Email:
More informationTutorial on Markov Chain Monte Carlo
Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????,??,?, pp. 1 14 C???? Biometrika Trust Printed in
More informationBayesian Phylogeny and Measures of Branch Support
Bayesian Phylogeny and Measures of Branch Support Bayesian Statistics Imagine we have a bag containing 100 dice of which we know that 90 are fair and 10 are biased. The
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationMODELLING OCCUPATIONAL EXPOSURE USING A RANDOM EFFECTS MODEL: A BAYESIAN APPROACH ABSTRACT
MODELLING OCCUPATIONAL EXPOSURE USING A RANDOM EFFECTS MODEL: A BAYESIAN APPROACH Justin Harvey * and Abrie van der Merwe ** * Centre for Statistical Consultation, University of Stellenbosch ** University
More informationHealth Insurance Estimates for States 1
Health Insurance Estimates for States 1 Robin Fisher and Jennifer Campbell, U.S. Census Bureau, HHESSAEB, FB 3 Room 1462 Washington, DC, 20233 Phone: (301) 7635897, email: robin.c.fisher@census.gov Key
More informationSampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data
Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data (Oxford) in collaboration with: Minjie Xu, Jun Zhu, Bo Zhang (Tsinghua) Balaji Lakshminarayanan (Gatsby) Bayesian
More informationEC 6310: Advanced Econometric Theory
EC 6310: Advanced Econometric Theory July 2008 Slides for Lecture on Bayesian Computation in the Nonlinear Regression Model Gary Koop, University of Strathclyde 1 Summary Readings: Chapter 5 of textbook.
More informationLab 8: Introduction to WinBUGS
40.656 Lab 8 008 Lab 8: Introduction to WinBUGS Goals:. Introduce the concepts of Bayesian data analysis.. Learn the basic syntax of WinBUGS. 3. Learn the basics of using WinBUGS in a simple example. Next
More informationData Collection and Sampling OPRE 6301
Data Collection and Sampling OPRE 6301 Recall... Statistics is a tool for converting data into information: Statistics Data Information But where then does data come from? How is it gathered? How do we
More informationMethodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS).
Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS. Vladislav Beresovsky National Center for Health Statistics 3311 Toledo Road Hyattsville, MD 078
More informationApproaches for Analyzing Survey Data: a Discussion
Approaches for Analyzing Survey Data: a Discussion David Binder 1, Georgia Roberts 1 Statistics Canada 1 Abstract In recent years, an increasing number of researchers have been able to access survey microdata
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationModeling and Analysis of Call Center Arrival Data: A Bayesian Approach
Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Refik Soyer * Department of Management Science The George Washington University M. Murat Tarimcilar Department of Management Science
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationMeasures of explained variance and pooling in multilevel models
Measures of explained variance and pooling in multilevel models Andrew Gelman, Columbia University Iain Pardoe, University of Oregon Contact: Iain Pardoe, Charles H. Lundquist College of Business, University
More informationClass 19: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More informationChenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu)
Paper Author (s) Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Lei Zhang, University of Maryland, College Park (lei@umd.edu) Paper Title & Number Dynamic Travel
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationFoundations of Statistics Frequentist and Bayesian
Mary Parker, http://www.austincc.edu/mparker/stat/nov04/ page 1 of 13 Foundations of Statistics Frequentist and Bayesian Statistics is the science of information gathering, especially when the information
More informationThe Use of Sampling Weights in Bayesian Hierarchical Models for Small Area Estimation
The Use of Sampling Weights in Bayesian Hierarchical Models for Small Area Estimation Cici X. Chen Department of Statistics, University of Washington Seattle, USA Thomas Lumley Department of Statistics,
More informationHierarchical Bayes Small Area Estimates of Adult Literacy Using Unmatched Sampling and Linking Models
Hierarchical Bayes Small Area Estimates of Adult Literacy Using Unmatched Sampling and Linking Models Leyla Mohadjer 1, J.N.K. Rao, Benmei Liu 1, Tom Krenzke 1, and Wendy Van de Kerckhove 1 Leyla Mohadjer,
More informationHandling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationInference on the parameters of the Weibull distribution using records
Statistics & Operations Research Transactions SORT 39 (1) JanuaryJune 2015, 318 ISSN: 16962281 eissn: 20138830 www.idescat.cat/sort/ Statistics & Operations Research c Institut d Estadstica de Catalunya
More informationEstimating Industry Multiples
Estimating Industry Multiples Malcolm Baker * Harvard University Richard S. Ruback Harvard University First Draft: May 1999 Rev. June 11, 1999 Abstract We analyze industry multiples for the S&P 500 in
More informationAn Analysis of Bullying and Suicide in the United States using a Non Gaussian Multivariate Spatial Model
Proceedings of The National Conference On Undergraduate Research (NCUR) 015 Eastern Washington University, Cheney, WA April 1618, 015 An Analysis of Bullying and Suicide in the United States using a Non
More informationApplied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
More informationAdvanced Quantitative Methods: Mathematics (p)review
Advanced Quantitative Methods: Mathematics (p)review University College Dublin 22 January 2013 1 Course outline 2 3 Outline Course outline 1 Course outline 2 3 Introduction Course outline Topic: advanced
More informationUsing SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models
Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Clement A Stone Abstract Interest in estimating item response theory (IRT) models using Bayesian methods has grown tremendously
More informationA Bayesian Model to Enhance Domestic Energy Consumption Forecast
Proceedings of the 2012 International Conference on Industrial Engineering and Operations Management Istanbul, Turkey, July 3 6, 2012 A Bayesian Model to Enhance Domestic Energy Consumption Forecast Mohammad
More informationA Bayesian hierarchical surrogate outcome model for multiple sclerosis
A Bayesian hierarchical surrogate outcome model for multiple sclerosis 3 rd Annual ASA New Jersey Chapter / Bayer Statistics Workshop David Ohlssen (Novartis), Luca Pozzi and Heinz Schmidli (Novartis)
More informationAnomaly detection for Big Data, networks and cybersecurity
Anomaly detection for Big Data, networks and cybersecurity Patrick RubinDelanchy University of Bristol & Heilbronn Institute for Mathematical Research Joint work with Nick Heard (Imperial College London),
More informationMethodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). 1
Methodological aspects of small area estimation from the National Electronic Health Records Survey (NEHRS). 1 Vladislav Beresovsky (hvy4@cdc.gov), Janey Hsiao National Center for Health Statistics, CDC
More informationBayesian credibility methods for pricing nonlife insurance on individual claims history
Bayesian credibility methods for pricing nonlife insurance on individual claims history Daniel Eliasson Masteruppsats i matematisk statistik Master Thesis in Mathematical Statistics Masteruppsats 215:1
More informationTHE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE
THE USE OF STATISTICAL DISTRIBUTIONS TO MODEL CLAIMS IN MOTOR INSURANCE Batsirai Winmore Mazviona 1 Tafadzwa Chiduza 2 ABSTRACT In general insurance, companies need to use data on claims gathered from
More informationINTRODUCTORY STATISTICS
INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore
More informationChisquare test Testing for independeny The r x c contingency tables square test
Chisquare test Testing for independeny The r x c contingency tables square test 1 The chisquare distribution HUSRB/0901/1/088 Teaching Mathematics and Statistics in Sciences: Modeling and Computeraided
More informationBasic Bayesian Methods
6 Basic Bayesian Methods Mark E. Glickman and David A. van Dyk Summary In this chapter, we introduce the basics of Bayesian data analysis. The key ingredients to a Bayesian analysis are the likelihood
More informationBayesian Multiple Imputation of Zero Inflated Count Data
Bayesian Multiple Imputation of Zero Inflated Count Data ChinFang Weng chin.fang.weng@census.gov U.S. Census Bureau, 4600 Silver Hill Road, Washington, D.C. 202331912 Abstract In government survey applications,
More informationBayesian Factorization Machines
Bayesian Factorization Machines Christoph Freudenthaler, Lars SchmidtThieme Information Systems & Machine Learning Lab University of Hildesheim 31141 Hildesheim {freudenthaler, schmidtthieme}@ismll.de
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationA methodology for estimating joint probability density functions
A methodology for estimating joint probability density functions Mauricio Monsalve Computer Science Department, Universidad de Chile Abstract We develop a theoretical methodology for estimating joint probability
More informationDealing with Missing Data
Res. Lett. Inf. Math. Sci. (2002) 3, 153160 Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Dealing with Missing Data Judi Scheffer I.I.M.S. Quad A, Massey University, P.O. Box 102904
More information2WB05 Simulation Lecture 8: Generating random variables
2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating
More informationDISCUSSION PAPER ANALYSIS OF VARIANCE WHY IT IS MORE IMPORTANT THAN EVER 1. BY ANDREW GELMAN Columbia University
The Annals of Statistics 2005, Vol. 33, No. 1, 1 53 DOI 10.1214/009053604000001048 Institute of Mathematical Statistics, 2005 DISCUSSION PAPER ANALYSIS OF VARIANCE WHY IT IS MORE IMPORTANT THAN EVER 1
More informationObjections to Bayesian statistics
Bayesian Analysis (2008) 3, Number 3, pp. 445 450 Objections to Bayesian statistics Andrew Gelman Abstract. Bayesian inference is one of the more controversial approaches to statistics. The fundamental
More informationSTATISTICAL DATA ANALYSIS
STATISTICAL DATA ANALYSIS INTRODUCTION Fethullah Karabiber YTU, Fall of 2012 The role of statistical analysis in science This course discusses some statistical methods, which involve applying statistical
More informationBayesian Methods. 1 The Joint Posterior Distribution
Bayesian Methods Every variable in a linear model is a random variable derived from a distribution function. A fixed factor becomes a random variable with possibly a uniform distribution going from a lower
More informationThe Chinese Restaurant Process
COS 597C: Bayesian nonparametrics Lecturer: David Blei Lecture # 1 Scribes: Peter Frazier, Indraneel Mukherjee September 21, 2007 In this first lecture, we begin by introducing the Chinese Restaurant Process.
More informationAnalyzing Clinical Trial Data via the Bayesian Multiple Logistic Random Effects Model
Analyzing Clinical Trial Data via the Bayesian Multiple Logistic Random Effects Model Bartolucci, A.A 1, Singh, K.P 2 and Bae, S.J 2 1 Dept. of Biostatistics, University of Alabama at Birmingham, Birmingham,
More informationOperational Risk Management: Added Value of Advanced Methodologies
Operational Risk Management: Added Value of Advanced Methodologies Paris, September 2013 Bertrand HASSANI Head of Major Risks Management & Scenario Analysis Disclaimer: The opinions, ideas and approaches
More informationBayesX  Software for Bayesian Inference in Structured Additive Regression
BayesX  Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, LudwigMaximiliansUniversity Munich
More informationWebbased Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Webbased Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationSummary. Modern Bayesian inference is highly computational but commonly proceeds without reference to modern developments in statistical graphics.
Summary. Modern Bayesian inference is highly computational but commonly proceeds without reference to modern developments in statistical graphics. This should change. Visualization has two important roles
More informationMonte Carlo Methods and Models in Finance and Insurance
Chapman & Hall/CRC FINANCIAL MATHEMATICS SERIES Monte Carlo Methods and Models in Finance and Insurance Ralf Korn Elke Korn Gerald Kroisandt f r oc) CRC Press \ V^ J Taylor & Francis Croup ^^"^ Boca Raton
More informationBlackbox Performance Models for Virtualized Web. Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang ardagna@elet.polimi.it
Blackbox Performance Models for Virtualized Web Service Applications Danilo Ardagna, Mara Tanelli, Marco Lovera, Li Zhang ardagna@elet.polimi.it Reference scenario 2 Virtualization, proposed in early
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study loglinear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationSection 5. Stan for Big Data. Bob Carpenter. Columbia University
Section 5. Stan for Big Data Bob Carpenter Columbia University Part I Overview Scaling and Evaluation data size (bytes) 1e18 1e15 1e12 1e9 1e6 Big Model and Big Data approach state of the art big model
More informationDirect Methods for Solving Linear Systems. Linear Systems of Equations
Direct Methods for Solving Linear Systems Linear Systems of Equations Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University
More informationLecture 6: The Bayesian Approach
Lecture 6: The Bayesian Approach What Did We Do Up to Now? We are given a model Loglinear model, Markov network, Bayesian network, etc. This model induces a distribution P(X) Learning: estimate a set
More informationObesity in America: A Growing Trend
Obesity in America: A Growing Trend David Todd P e n n s y l v a n i a S t a t e U n i v e r s i t y Utilizing Geographic Information Systems (GIS) to explore obesity in America, this study aims to determine
More informationAudit Sampling for Tests of Controls and Substantive Tests of Transactions
Audit Sampling for Tests of Controls and Substantive Tests of Transactions Chapter 15 2008 Prentice Hall Business Publishing, Auditing 12/e, Arens/Beasley/Elder 151 Learning Objective 1 Explain the concept
More informationSimultaneous or Sequential? Search Strategies in the U.S. Auto Insurance Industry
Simultaneous or Sequential? Search Strategies in the U.S. Auto Insurance Industry Current version: April 2014 Elisabeth Honka Pradeep Chintagunta Abstract We show that the search method consumers use when
More informationIn all 1336 (35.9%) said yes, 2051 (55.1%) said no and 335 (9.0%) said not sure
1. (10 points total) The magazine BusinessWeek ran an online poll on their website and asked the readers the following question: Do you think Google is too powerful? Poll participants were offered three
More informationHow To Use A Monte Carlo Study To Decide On Sample Size and Determine Power
How To Use A Monte Carlo Study To Decide On Sample Size and Determine Power Linda K. Muthén Muthén & Muthén 11965 Venice Blvd., Suite 407 Los Angeles, CA 90066 Telephone: (310) 3919971 Fax: (310) 3918971
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationElementary Statistics
Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,
More informationUsing the American Community Survey Data
Using the American Community Survey Data The ACS website is accessible from www.census.gov/acs. In the middle of the screen choose Indepth Data Go directly to the ACS Datasets Tab The ACS Datasets Tab
More informationBayesian Hierarchical Classes Analysis
Bayesian Hierarchical Classes Analysis Iwin Leenen Universidad Complutense de Madrid University of Leuven Iven Van Mechelen University of Leuven Andrew Gelman Columbia University Stijn De Knop University
More informationSpatial modelling of claim frequency and claim size in nonlife insurance
Spatial modelling of claim frequency and claim size in nonlife insurance Susanne Gschlößl Claudia Czado April 13, 27 Abstract In this paper models for claim frequency and average claim size in nonlife
More informationItem selection by latent classbased methods: an application to nursing homes evaluation
Item selection by latent classbased methods: an application to nursing homes evaluation Francesco Bartolucci, Giorgio E. Montanari, Silvia Pandolfi 1 Department of Economics, Finance and Statistics University
More informationThe Bootstrap. 1 Introduction. The Bootstrap 2. Short Guides to Microeconometrics Fall Kurt Schmidheiny Unversität Basel
Short Guides to Microeconometrics Fall 2016 The Bootstrap Kurt Schmidheiny Unversität Basel The Bootstrap 2 1a) The asymptotic sampling distribution is very difficult to derive. 1b) The asymptotic sampling
More informationStatistical issues in the analysis of microarray data
Statistical issues in the analysis of microarray data Daniel Gerhard Institute of Biostatistics Leibniz University of Hannover ESNATS Summerschool, Zermatt D. Gerhard (LUH) Analysis of microarray data
More informationProblem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VAaffiliated statisticians;
More informationSample Size Designs to Assess Controls
Sample Size Designs to Assess Controls B. Ricky Rambharat, PhD, PStat Lead Statistician Office of the Comptroller of the Currency U.S. Department of the Treasury Washington, DC FCSM Research Conference
More informationUsing LinkTracing Data to Inform Epidemiology
Using LinkTracing Data to Inform Epidemiology Krista J. Gile Nuffield College, Oxford joint work with Mark S. Handcock University of Washington, Seattle 23 October, 2008 For details, see: Gile, K.J. (2008).
More informationData Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1
Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields
More informationArtificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =
Artificial Intelligence 15381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models  part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby
More informationNominal and ordinal logistic regression
Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome
More information7. Tests of association and Linear Regression
7. Tests of association and Linear Regression In this chapter we consider 1. Tests of Association for 2 qualitative variables. 2. Measures of the strength of linear association between 2 quantitative variables.
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More information