Package zic. September 4, 2015



Similar documents
Variable Selection for Health Care Demand in Germany

Package ChannelAttribution

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University

Package PACVB. R topics documented: July 10, Type Package

Gaussian Processes to Speed up Hamiltonian Monte Carlo

Bayesian Statistics: Indian Buffet Process

Package mcmcse. March 25, 2016

Package EstCRM. July 13, 2015

Imperfect Debugging in Software Reliability

Parameter estimation for nonlinear models: Numerical approaches to solving the inverse problem. Lecture 12 04/08/2008. Sven Zenker

STA 4273H: Statistical Machine Learning

A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data

11. Time series and dynamic linear models

Package fastghquad. R topics documented: February 19, 2015

Standard errors of marginal effects in the heteroskedastic probit model

Markov Chain Monte Carlo Simulation Made Simple

BayesX - Software for Bayesian Inference in Structured Additive Regression

Package SurvCorr. February 26, 2015

Inference on Phase-type Models via MCMC

MCMC-Based Assessment of the Error Characteristics of a Surface-Based Combined Radar - Passive Microwave Cloud Property Retrieval

Bayesian Statistics in One Hour. Patrick Lam

Package ATE. R topics documented: February 19, Type Package Title Inference for Average Treatment Effects using Covariate. balancing.

Package MBA. February 19, Index 7. Canopy LIDAR data

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

R2MLwiN Using the multilevel modelling software package MLwiN from R

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes

An Introduction to Using WinBUGS for Cost-Effectiveness Analyses in Health Economics

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Handling attrition and non-response in longitudinal data

Tutorial on Markov Chain Monte Carlo


Reliability estimators for the components of series and parallel systems: The Weibull model

Simple Linear Regression Inference

Package CoImp. February 19, 2015

Package changepoint. R topics documented: November 9, Type Package Title Methods for Changepoint Detection Version 2.

Journal of Statistical Software

The frequency of visiting a doctor: is the decision to go independent of the frequency?

Package tagcloud. R topics documented: July 3, 2015

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

BAYESIAN ANALYSIS OF AN AGGREGATE CLAIM MODEL USING VARIOUS LOSS DISTRIBUTIONS. by Claire Dudley

Calculating VaR. Capital Market Risk Advisors CMRA

More details on the inputs, functionality, and output can be found below.

Model Selection and Claim Frequency for Workers Compensation Insurance

Validation of Software for Bayesian Models using Posterior Quantiles. Samantha R. Cook Andrew Gelman Donald B. Rubin DRAFT

A Latent Variable Approach to Validate Credit Rating Systems using R

Package bigdata. R topics documented: February 19, 2015

Model Calibration with Open Source Software: R and Friends. Dr. Heiko Frings Mathematical Risk Consulting

Sampling for Bayesian computation with large datasets

Package MFDA. R topics documented: July 2, Version Date Title Model Based Functional Data Analysis

Multilevel Modelling of medical data

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach

Imputing Missing Data using SAS

APPLIED MISSING DATA ANALYSIS

DURATION ANALYSIS OF FLEET DYNAMICS

Package acrm. R topics documented: February 19, 2015

Some Examples of (Markov Chain) Monte Carlo Methods

A Two-Stage Bayesian Model for Predicting Winners in Major League Baseball

Package metafuse. November 7, 2015

Package missforest. February 20, 2015

Dealing with large datasets

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Probability Calculator


Centre for Central Banking Studies

Statistical Machine Learning

Bayesian Estimation Supersedes the t-test

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Package bcrm. September 24, 2015


Markov Chain Monte Carlo and Applied Bayesian Statistics: a short course Chris Holmes Professor of Biostatistics Oxford Centre for Gene Function

Online Appendix The Earnings Returns to Graduating with Honors - Evidence from Law Graduates

Statistical Machine Learning from Data

M1 in Economics and Economics and Statistics Applied multivariate Analysis - Big data analytics Worksheet 1 - Bootstrap

Package survpresmooth

BayeScan v2.1 User Manual

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

Probabilistic Fitting

Bayesian Image Super-Resolution

Big Data need Big Model 1/44

Basic Bayesian Methods

Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization

An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment

Transcription:

Package zic September 4, 2015 Version 0.9 Date 2015-09-03 Title Bayesian Inference for Zero-Inflated Count Models Author Markus Jochmann <markus.jochmann@ncl.ac.uk> Maintainer Markus Jochmann <markus.jochmann@ncl.ac.uk> Description Provides MCMC algorithms for the analysis of zero-inflated count models. The case of stochastic search variable selection (SVS) is also considered. All MCMC samplers are coded in C++ for improved efficiency. A data set considering the demand for health care is provided. License GPL (>= 2) Depends R (>= 3.0.2) Imports Rcpp (>= 0.11.0), coda (>= 0.14-2) LinkingTo Rcpp, RcppArmadillo NeedsCompilation yes Repository CRAN Date/Publication 2015-09-04 01:07:45 R topics documented: docvisits........................................... 2 zic.............................................. 3 zic.svs............................................ 5 Index 8 1

2 docvisits docvisits Demand for Health Care Data Description Usage Format This data set gives the number of doctor visits in the last three months for a sample of German male individuals in 1994. The data set is taken from Riphahn et al. (2003) and is a subsample of the German Socioeconomic Panel (SOEP). In contrast to Riphahn et al. (2003) only male individuals from the last wave are considered. See Jochmann (2013) for further details. data(docvisits) This data frame contains 1812 observations on the following 22 variables: docvisits number of doctor visits in last 3 months age age agesq age squared / 1000 age30 1 if age >= 30 age35 1 if age >= 35 age40 1 if age >= 40 age45 1 if age >= 45 age50 1 if age >= 50 age55 1 if age >= 55 age60 1 if age >= 60 health health satisfaction, 0 (low) - 10 (high) handicap 1 if handicapped, 0 otherwise hdegree degree of handicap in percentage points married 1 if married, 0 otherwise schooling years of schooling hhincome household monthly net income, in German marks / 1000 children 1 if children under 16 in the household, 0 otherwise self 1 if self employed, 0 otherwise civil 1 if civil servant, 0 otherwise bluec 1 if blue collar employee, 0 otherwise employed 1 if employed, 0 otherwise public 1 if public health insurance, 0 otherwise addon 1 if add-on insurance, 0 otherwise

zic 3 References Jochmann, M. (2013). What Belongs Where? Variable Selection for Zero-Inflated Count Models with an Application to the Demand for Health Care, Computational Statistics, 28, 1947 1964. Riphahn, R. T., Wambach, A., Million, A. (2003). Incentive Effects in the Demand for Health Care: A Bivariate Panel Count Data Estimation, Journal of Applied Econometrics, 18, 387 405. Wagner, G. G., Frick, J. R., Schupp, J. (2007). The German Socio-Economic Panel Study (SOEP) Scope, Evolution and Enhancements, Schmollers Jahrbuch, 127, 139 169. zic Bayesian Inference for Zero-Inflated Count Models Description zic fits zero-inflated count models via Markov chain Monte Carlo methods. Usage zic(formula, data, a0, b0, c0, d0, e0, f0, n.burnin, n.mcmc, n.thin, tune = 1.0, scale = TRUE) Arguments formula data A symbolic description of the model to be fit specifying the response variable and covariates. A data frame in which to interpret the variables in formula. a0 The prior variance of α. b0 The prior variance of β j. c0 The prior variance of γ. d0 The prior variance of δ j. e0 The shape parameter for the inverse gamma prior on σ 2. f0 The inverse scale parameter the inverse gamma prior on σ 2. n.burnin n.mcmc n.thin tune scale Number of burn-in iterations of the sampler. Number of iterations of the sampler. Thinning interval. Tuning parameter of Metropolis-Hastings step. If true, all covariates (except binary variables) are rescaled by dividing by their respective standard errors.

4 zic Details The considered zero-inflated count model is given by yi Poisson[exp(ηi )], ηi = α + x iβ + ε i, ε i N(0, σ 2 ), d i = γ + x iδ + ν i, ν i N(0, 1), y i = 1(d i > 0)yi, where y i and x i are observed. The assumed prior distributions are α N(0, a 0 ), β k N(0, b 0 ), k = 1,..., K, γ N(0, c 0 ), δ k N(0, d 0 ), k = 1,..., K, σ 2 Inv-Gamma (e 0, f 0 ). The sampling algorithm described in Jochmann (2013) is used. Value A list containing the following elements: alpha Posterior draws of α (coda mcmc object). beta Posterior draws of β (coda mcmc object). gamma delta sigma2 acc References Posterior draws of γ (coda mcmc object). Posterior draws of δ (coda mcmc object). Posterior draws of σ 2 (coda mcmc object). Acceptance rate of the Metropolis-Hastings step. Jochmann, M. (2013). What Belongs Where? Variable Selection for Zero-Inflated Count Models with an Application to the Demand for Health Care, Computational Statistics, 28, 1947 1964. Examples ## Not run: data( docvisits ) mdl <- docvisits ~ age + agesq + health + handicap + hdegree + married + schooling + hhincome + children + self + civil + bluec + employed + public + addon post <- zic( f, docvisits, 10.0, 10.0, 10.0, 10.0, 1.0, 1.0, 1000, 10000, 10, 1.0, TRUE ) ## End(Not run)

zic.svs 5 zic.svs SVS for Zero-Inflated Count Models Description zic.svs applies SVS to zero-inflated count models Usage zic.svs(formula, data, a0, g0.beta, h0.beta, nu0.beta, r0.beta, s0.beta, e0, f0, c0, g0.delta, h0.delta, nu0.delta, r0.delta, s0.delta, n.burnin, n.mcmc, n.thin, tune = 1.0, scale = TRUE) Arguments formula data A symbolic description of the model to be fit specifying the response variable and covariates. A data frame in which to interpret the variables in formula. a0 The prior variance of α. g0.beta The shape parameter for the inverse gamma prior on κ β k. h0.beta The inverse scale parameter for the inverse gamma prior on κ β k. nu0.beta Prior parameter for the spike of the hypervariances for the β k. r0.beta Prior parameter of ω β. s0.beta Prior parameter of ω β. e0 The shape parameter for the inverse gamma prior on σ 2. f0 The inverse scale parameter the inverse gamma prior on σ 2. c0 The prior variance of γ. g0.delta The shape parameter for the inverse gamma prior on κ δ k. h0.delta The inverse scale parameter for the inverse gamma prior on κ δ k. nu0.delta Prior parameter for the spike of the hypervariances for the δ k. r0.delta Prior parameter of ω δ. s0.delta Prior parameter of ω δ. n.burnin n.mcmc n.thin tune scale Number of burn-in iterations of the sampler. Number of iterations of the sampler. Thinning interval. Tuning parameter of Metropolis-Hastings step. If true, all covariates (except binary variables) are rescaled by dividing by their respective standard errors.

6 zic.svs Details Value The considered zero-inflated count model is given by y i Poisson[exp(η i )], η i = α + x iβ + ε i, ε i N(0, σ 2 ), d i = γ + x iδ + ν i, ν i N(0, 1), y i = 1(d i > 0)y i, where y i and x i are observed. The assumed prior distributions are α N(0, a 0 ), β k N(0, τ β k κβ k ),, k = 1,..., K, κ β j Inv-Gamma(gβ 0, hβ 0 ), τ β k (1 ωβ )δ ν β + ω β δ 1, 0 ω β Beta(r β 0, sβ 0 ), γ N(0, c 0 ), δ k N(0, τ δ k κ δ k), k = 1,..., K, κ δ k Inv-Gamma(g δ 0, h δ 0), τ δ k (1 ω δ )δ ν δ 0 + ω δ δ 1, ω δ Beta(r δ 0, s δ 0), σ 2 Inv-Gamma (e 0, f 0 ). The sampling algorithm described in Jochmann (2013) is used. A list containing the following elements: alpha beta gamma delta sigma2 Posterior draws of α (coda mcmc object). Posterior draws of β (coda mcmc object). Posterior draws of γ (coda mcmc object). Posterior draws of δ (coda mcmc object). Posterior draws of σ 2 (coda mcmc object). I.beta Posterior draws of indicator whether τ β j I.delta omega.beta omega.delta acc Posterior draws of indicator whether τ δ j Posterior draws of ω β (coda mcmc object). Posterior draws of ω δ (coda mcmc object). Acceptance rate of the Metropolis-Hastings step. is one (coda mcmc object). is one (coda mcmc object).

zic.svs 7 References Jochmann, M. (2013). What Belongs Where? Variable Selection for Zero-Inflated Count Models with an Application to the Demand for Health Care, Computational Statistics, 28, 1947 1964. Examples ## Not run: data( docvisits ) mdl <- docvisits ~ age + agesq + health + handicap + hdegree + married + schooling + hhincome + children + self + civil + bluec + employed + public + addon post <- zic.ssvs( mdl, docvisits, 10.0, 5.0, 5.0, 1.0e-04, 2.0, 2.0, 1.0, 1.0, 10.0, 5.0, 5.0, 1.0e-04, 2.0, 2.0, 1000, 10000, 10, 1.0, TRUE ) ## End(Not run)

Index Topic datasets docvisits, 2 docvisits, 2 zic, 3 zic.svs, 5 8