Bayesian inference for population prediction of individuals without health insurance in Florida


 Colleen Houston
 1 years ago
 Views:
Transcription
1 Bayesian inference for population prediction of individuals without health insurance in Florida Neung Soo Ha 1 1 NISS 1 / 24
2 Outline Motivation Description of the Behavioral Risk Factor Surveillance System, BRFSS Description of posterior predictive distribution under informative/noninformative sampling Selection bias Inference for proportions of individuals without the health insurance Variation of maps 2 / 24
3 Motivation: Interest: 1. Inference for the countylevel proportions of persons without health insurance in FL, using the 2010 Behavioral Risk Factor Surveillance System, BRFSS. 2. Display the variation of maps 3 / 24
4 2010 BRFSS: Survey description The largest ongoing telephone based survey; administered by the Centers for Disease Control and Prevention. Collect data on individual risk behaviors and preventive health practices for the adult population(18 years of age and older); collects statespecific data ex: alcohol/cigarette consumption, general health status, health insurance, Uses a disproportionate stratified sample (DSS) design. In FL, 63 out of 67 counties were sampled. No cell phones for / 24
5 Examples of observed data Observed proportion of non insured persons Observed population proportion of non insured persons: Hisp Figure: Observed: All races Figure: Observed: Hispanics white=having insurance, blue=not having insurance 5 / 24
6 Complication 1. Small or nonexistent number of observations for some subpopulation groups in certain geographical areas. 2. Possible presence of selection bias in the sample 6 / 24
7 Model specification for the BRFSS data P(Y ikl = 1 θ ik ) = θ ik logit(θ ik ) = X k β + ν i ν i N(0, σ 2 ν) Y ikl = 1 : not having insurance, Y ikl = 0 : having insurance i = 1,..., M: county k = 1,..., K: population class (9 age groups 3 races 2 genders ) l = 1,..., N ik : units in county i and group k N ik = population size in ith county and kth group X k = vector of indicators for group k. p(β) = const on (, ) Gelman et al.(2004) p(σ 2 ν) = const on (0, ) 7 / 24
8 Model Evaluation Model fit Bayes residual plots QQ plots Bayesian tests: Partial posterior predictive pvalues, Bayarri and Berger(2002) Predictions Crossvalidation to simulate finite population inference Use 100(1p)% of observed data to make predictions for the 100*p% that were heldout 8 / 24
9 Posterior predictive distribution with noninformative sampling Use posterior predictive distribution to make inference for the remaining nonsampled units Under the noninformative sampling, i.e. independence between the probability of selecting a person and the response: Y s = sampled units, Y ns = nonsampled units Y ns Y s θ, X f (Y ns Y s, X) = g(y ns Y s, X, θ)p(θ Y s, X)dθ = g(y ns θ, X)p(θ Y s, X)dθ p(θ Y s, X) L(θ Y s, X)p(θ β, σν)p(β, 2 σν): 2 posterior distribution θ = a vector of model parameters (β, σ 2 ν ) = a vector of hyperparameters 9 / 24
10 Under informative sampling Selection probabilities are related to the outcome variables. The observed outcomes may not be representative of the population outcomes. May cause bias for inferences about the finite population parameters. 10 / 24
11 Selection bias Let I = (I 1,..., I N ), I i = 1 if i s, I i = 0 if i / s, s = sample The posterior predictive distribution with I f (Y ns Y s, X, θ, I) g(y ns θ, X)p(I Y, X, θ) (1) Note: p(i X, Y, θ) p(i X, θ) i.e. selection bias. Selection information only comes from the sampled units. 11 / 24
12 Procedure: Ha and Sedransk(2015) 1. Pfeffermann and Sverchkov (2009): p(i Y, X, θ) = p(i Y, X) = k f 1(I k Y, X k ) 2. Using the Poisson sampling approximation, Chambers et. al.(1998) p(i Y, X) = M K { P(I ikl = 1 Y, X k )}{ (1 P(I ikl = 1 Y, X k ))} l s ik l/ s ik i=1 k=1 3. P(I ikl = 1 Y, X k ) = E U (π ikl Y, X k ) = {E s (w ikl Y, X k )} 1, Pfeffermann and Sverchkov (2009), πikl = probability of unit selected w ikl = 1/π ikl 12 / 24
13 Posterior predictive distribution with inclusion probability g(y ns Y s, X, θ, I) g(y ns θ, X) M i=1 K k=1 (1 a k) M ik1(1 b k ) M ik0, a k = 1 w k,y0, b k = 1 w k,y1 w k,y0 = average of inverse probability of selected units for class k with response Y ikl = 0 wk,y1 = average of inverse probability of selected units for class k with response Y ikl = 1 Mik1 : # nonsampled units with Y ikl = 1 M ik0 : # nonsampled units with Y ikl = 0 M ik1 + M ik0 = N ik n ik Use rejection sampling algorithm to obtain g(y ns ), Robert, C. and Casella, G. (2004) In our data set, 0.96 (1 a k )/(1 b k ) 1.01 with 43 of the 54 K groups. No substantial selection bias. 13 / 24
14 Comparison between prediction to observation County comparison Observed proportion of non insured persons Predicted population proportion of non insured persons Figure: Observed: All races Figure: Predicted: All races 14 / 24
15 Comparison between prediction to observation Observed population proportion of non insured persons: Hisp Predicted population proportion of non insured persons: Hisp Figure: Observed: Hispanics Figure: Predicted: Hispanics 15 / 24
16 Comparison between races Predicted population proportion of non insured persons: White Predicted population proportion of non insured persons: Hisp Figure: Predicted: Whites Figure: Predicted: Hispanics 16 / 24
17 Variation of maps Objective: Assess the variation of the entire map as a unit rather than expressing the variation separately for each geographic unit. Produce a map, based on each MCMC replicate. Analyze variations from a set of maps Illustrate variation between the counties from ν i, the countylevel random effect. 17 / 24
18 random effect: mean Mean choropleth map of random effect quintiles [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] Posterior means of countylevel random effects, ν i, are partitioned into quintiles Each quintile is color coded Figure: Mean map of ˆν i 18 / 24
19 Besag Method, Besag et al (1995) Provide 100(1 a)% regions for each of ν i : ν i (L) < ν i < ν i (U) Construct lower choropleth map of ν i (L) and ν i (U) 1. Denote the posterior MCMC sample by {ν (t) i : i = 1,..., M, t = 1,..., T } 2. Order {ν (t) i x [t] i and ranks r (t) i : t = 1,..., T } separately for each i to obtain order statistics 3. For fixed k {1,..., T }, let t be the smallest integer such that x [T +1 t ] i x (t) i x [t ] i, for all i, for at least k values of t 4. t = kth order statistic from the set a (t) = max{max i r (t) i, T + 1 min i r (t) i }, t = 1,..., T, i.e. t = a [k] 5. Then, {[x [T +1 t ] i, x [t ] i ] : i = 1,..., M} are a set of simultaneous credible regions containing at least 100k/T % of the distribution 19 / 24
20 Lower and Upper maps random effect: Besag L 90 random effect: mean quintiles random effect: Besag U 90 [ 1.12, 0.76] ( 0.76, 0.562] ( 0.562, 0.359] ( 0.359, 0.19] ( 0.19,0.264] [ 0.616, 0.236] ( 0.236, ] ( ,0.074] (0.074,0.246] (0.246,0.684] [ 0.172,0.203] (0.203,0.359] (0.359,0.519] (0.519,0.697] (0.697,1.07] Figure: 90 % Lower, Mean, 90% Upper 20 / 24
21 Variation in Choropleth map Goal: Illustrate the uncertainty of the choropleth maps. Use T = 1000 MCMC replications of ν i Each replication can be used to produce a choropleth map. Each quintile is color coded. For each replication, the value of ith county is colorcoded according to its quintile. 21 / 24
22 Heat map Q1 Q2 Q3 Q4 Q5 Rows: replicates, columns: counties Variation in maps is evaluated by color changes 22 / 24
23 Summary Present a method for posterior predictive inference that includes selection information for sampled units Illustrate variation of maps 23 / 24
24 Reference 1. Chambers, R. L., Dorfman, A. and Wang, W. (1998). Limited Information Likelihood Analysis of Survey Data. Journal of the Royal Statistical Society, 60, Bayesian benchmarking with application to small area estimation Test Gelman, A., Carlin, J. Stern, H. and Rubin, D. (2004), Bayesian Data Analysis. Chapman and Hall/ CRC, New York, NY. 3. Pfeffermann, D. and Sverchkov, M. (2009) Inference under Informative Sampling Sample Surveys: Inference and Analysis, 29B, Robert, C. and Casella, G. (2004), Monte Carlo Statistical Methods, Springer, New York, NY. 24 / 24
Bayesian Statistics: Indian Buffet Process
Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note
More informationState and local health departments, communitybased
Research and Practice Methods Comparison of SmallArea Analysis Techniques for Estimating CountyLevel Outcomes Haomiao Jia, PhD, Peter Muennig, MD, MPH, Elaine Borawski, PhD Background: Methods: Results:
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationPredicting the Present with Bayesian Structural Time Series
Predicting the Present with Bayesian Structural Time Series Steven L. Scott Hal Varian June 28, 2013 Abstract This article describes a system for short term forecasting based on an ensemble prediction
More informationWORKING P A P E R. The Impact of Nearly Universal Insurance Coverage on Health Care Utilization and Health. Evidence from Medicare
WORKING P A P E R The Impact of Nearly Universal Insurance Coverage on Health Care Utilization and Health Evidence from Medicare DAVID CARD, CARLOS DOBKIN, NICOLE MAESTAS WR197 October 2004 This product
More informationA Bayesian Approach to Person Fit Analysis in Item Response Theory Models
A Bayesian Approach to Person Fit Analysis in Item Response Theory Models Cees A. W. Glas and Rob R. Meijer University of Twente, The Netherlands A Bayesian approach to the evaluation of person fit in
More informationAgricultural Contracts and Alternative Marketing Options: A Matching Analysis. Ani L. Katchova University of Illinois
Agricultural Contracts and Alternative Marketing Options: A Matching Analysis Ani L. Katchova University of Illinois Contact Information: Ani L. Katchova University of Illinois 326 Mumford Hall, MC710
More informationDRIVER ATTRIBUTES AND REAREND CRASH INVOLVEMENT PROPENSITY
U.S. Department of Transportation National Highway Traffic Safety Administration DOT HS 809 540 March 2003 Technical Report DRIVER ATTRIBUTES AND REAREND CRASH INVOLVEMENT PROPENSITY Published By: National
More informationTrends in Asthma Morbidity and Mortality
Trends in Asthma Morbidity and Mortality American Lung Association Epidemiology and Statistics Unit Research and Health Education Division September 2012 Table of Contents Asthma Mortality, 19992009 Asthma
More informationDesign of Experiments (DOE)
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. Design
More informationA METHODOLOGY TO MATCH DISTRIBUTIONS OF BOTH HOUSEHOLD AND PERSON ATTRIBUTES IN THE GENERATION OF SYNTHETIC POPULATIONS
A METHODOLOGY TO MATCH DISTRIBUTIONS OF BOTH HOUSEHOLD AND PERSON ATTRIBUTES IN THE GENERATION OF SYNTHETIC POPULATIONS Xin Ye Department of Civil and Environmental Engineering Arizona State University,
More informationOneyear reserve risk including a tail factor : closed formula and bootstrap approaches
Oneyear reserve risk including a tail factor : closed formula and bootstrap approaches Alexandre Boumezoued R&D Consultant Milliman Paris alexandre.boumezoued@milliman.com Yoboua Angoua NonLife Consultant
More informationEstimation and comparison of multiple changepoint models
Journal of Econometrics 86 (1998) 221 241 Estimation and comparison of multiple changepoint models Siddhartha Chib* John M. Olin School of Business, Washington University, 1 Brookings Drive, Campus Box
More informationA Direct Approach to Data Fusion
A Direct Approach to Data Fusion Zvi Gilula Department of Statistics Hebrew University Robert E. McCulloch Peter E. Rossi Graduate School of Business University of Chicago 1101 E. 58 th Street Chicago,
More informationDOES THE RIGHT TO CARRY CONCEALED HANDGUNS DETER COUNTABLE CRIMES? ONLY A COUNT ANALYSIS CAN SAY* Abstract
DOES THE RIGHT TO CARRY CONCEALED HANDGUNS DETER COUNTABLE CRIMES? ONLY A COUNT ANALYSIS CAN SAY* FLORENZ PLASSMANN State University of New York at Binghamton and T. NICOLAUS TIDEMAN Virginia Polytechnic
More information8 INTERPRETATION OF SURVEY RESULTS
8 INTERPRETATION OF SURVEY RESULTS 8.1 Introduction This chapter discusses the interpretation of survey results, primarily those of the final status survey. Interpreting a survey s results is most straightforward
More informationRelative Status and WellBeing: Evidence from U.S. Suicide Deaths
FEDERAL RESERVE BANK OF SAN FRANCISCO WORKING PAPER SERIES Relative Status and WellBeing: Evidence from U.S. Suicide Deaths Mary C. Daly Federal Reserve Bank of San Francisco Daniel J. Wilson Federal
More informationAN INTRODUCTION TO MARKOV CHAIN MONTE CARLO METHODS AND THEIR ACTUARIAL APPLICATIONS. Department of Mathematics and Statistics University of Calgary
AN INTRODUCTION TO MARKOV CHAIN MONTE CARLO METHODS AND THEIR ACTUARIAL APPLICATIONS DAVID P. M. SCOLLNIK Department of Mathematics and Statistics University of Calgary Abstract This paper introduces the
More informationWhat are plausible values and why are they useful?
What are plausible values and why are they useful? Matthias von Davier Educational Testing Service, Princeton, New Jersey, United States Eugenio Gonzalez Educational Testing Service, Princeton, New Jersey,
More informationUnderstanding and Using ACS SingleYear and Multiyear Estimates
Appendix. Understanding and Using ACS SingleYear and Multiyear Estimates What Are SingleYear and Multiyear Estimates? Understanding Period Estimates The ACS produces period estimates of socioeconomic
More informationNICE DSU TECHNICAL SUPPORT DOCUMENT 2: A GENERALISED LINEAR MODELLING FRAMEWORK FOR PAIRWISE AND NETWORK METAANALYSIS OF RANDOMISED CONTROLLED TRIALS
NICE DSU TECHNICAL SUPPORT DOCUMENT 2: A GENERALISED LINEAR MODELLING FRAMEWORK FOR PAIRWISE AND NETWORK METAANALYSIS OF RANDOMISED CONTROLLED TRIALS REPORT BY THE DECISION SUPPORT UNIT August 2011 (last
More informationReview of basic statistics and the simplest forecasting model: the sample mean
Review of basic statistics and the simplest forecasting model: the sample mean Robert Nau Fuqua School of Business, Duke University August 2014 Most of what you need to remember about basic statistics
More informationCommon Core State Standards for Mathematical Practice 4. Model with mathematics. 7. Look for and make use of structure.
Who Sends the Most Text Messages? Written by: Anna Bargagliotti and Jeanie Gibson (for ProjectSET) Loyola Marymount University and Hutchison School abargagl@lmu.edu, jgibson@hutchisonschool.org, www.projectset.com
More informationEarly Prediction of Cardiac Arrest (Code Blue) using Electronic Medical Records
Early Prediction of Cardiac Arrest (Code Blue) using Electronic Medical Records Sriram Somanchi Carnegie Mellon University Pittsburgh, USA somanchi@cmu.edu Elena Eneva Accenture San Franscisco, USA elena@elenaeneva.com
More informationWhat s Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks
Journal of Machine Learning Research 6 (2005) 1961 1998 Submitted 8/04; Revised 3/05; Published 12/05 What s Strange About Recent Events (WSARE): An Algorithm for the Early Detection of Disease Outbreaks
More informationA 61MillionPerson Experiment in Social Influence and Political Mobilization
Supplementary Information for A 61MillionPerson Experiment in Social Influence and Political Mobilization Robert M. Bond 1, Christopher J. Fariss 1, Jason J. Jones 2, Adam D. I. Kramer 3, Cameron Marlow
More informationWhen is statistical significance not significant?
brazilianpoliticalsciencereview ARTICLE Dalson Britto Figueiredo Filho Political Science Department, Federal University of Pernambuco (UFPE), Brazil Ranulfo Paranhos Social Science Institute, Federal University
More informationSmall area estimation of turnover of the Structural Business Survey
12 1 Small area estimation of turnover of the Structural Business Survey Sabine Krieg, Virginie Blaess, Marc Smeets The views expressed in this paper are those of the author(s) and do not necessarily reflect
More informationRipley s K function. Philip M. Dixon Volume 3, pp 1796 1803. Encyclopedia of Environmetrics (ISBN 0471 899976) Edited by
Ripley s K function Philip M. Dixon Volume 3, pp 1796 183 in Encyclopedia of Environmetrics (ISBN 471 899976) Edited by Abdel H. ElShaarawi and Walter W. Piegorsch John Wiley & Sons, Ltd, Chichester,
More informationSAMESEX UNMARRIED PARTNER COUPLES IN THE AMERICAN COMMUNITY SURVEY: THE ROLE OF MISREPORTING, MISCODING AND MISALLOCATION
SAMESEX UNMARRIED PARTNER COUPLES IN THE AMERICAN COMMUNITY SURVEY: THE ROLE OF MISREPORTING, MISCODING AND MISALLOCATION GARY J. GATES AND MICHAEL D. STEINBERGER* *Gary J. Gates is the Williams Distinguished
More information