Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes


Jun Xu and J. Scott Long
Indiana University
August 22, 2005

This paper provides technical details on the methods described in Jun Xu and J. Scott Long (forthcoming), "Confidence Intervals for Predicted Outcomes in Regression Models for Categorical Outcomes," The Stata Journal. These formulas were incorporated into prvalue. See for further details.

1 General Formula

The delta method is a general approach for computing confidence intervals for functions of maximum likelihood estimates. It takes a function whose variance is too complex to compute analytically, creates a linear approximation of that function, and then computes the variance of the simpler linear function, which can be used for large-sample inference.

We begin with a general result from maximum likelihood theory. Under standard regularity conditions, if $\hat\beta$ is a vector of ML estimates, then

\[ \sqrt{n}\,\bigl(\hat\beta - \beta\bigr) \xrightarrow{d} N\bigl[0, \operatorname{Var}(\hat\beta)\bigr]. \tag{1} \]

Let $G(\beta)$ be some function, such as predicted probabilities from a logit or ordinal logit model. The Taylor series expansion of $G(\hat\beta)$ is

\[ G(\hat\beta) = G(\beta) + (\hat\beta - \beta)' G'(\beta) + (\hat\beta - \beta)' G''(\beta^*) (\hat\beta - \beta)/2 \tag{2} \]
\[ \phantom{G(\hat\beta)} \approx G(\beta) + (\hat\beta - \beta)' G'(\beta), \]

where $G'(\beta)$ and $G''(\beta)$ are matrices of first and second partial derivatives with respect to $\beta$, and $\beta^*$ is some value between $\hat\beta$ and $\beta$. Then,

\[ \sqrt{n}\,\bigl[G(\hat\beta) - G(\beta)\bigr] \approx \sqrt{n}\,(\hat\beta - \beta)' G'(\beta). \tag{3} \]

This leads to

\[ G(\hat\beta) \overset{a}{\sim} N\!\left( G(\beta),\; \frac{\partial G(\beta)}{\partial \beta'} \operatorname{Var}(\hat\beta) \frac{\partial G(\beta)}{\partial \beta} \right) \]

(Greene 2000; Agresti 2002). To estimate the variance, we evaluate the partials at the ML estimates, $\left.\partial G(\beta)/\partial \beta'\right|_{\hat\beta}$, which leads to

\[ \operatorname{Var}\bigl(G(\hat\beta)\bigr) = \frac{\partial G(\hat\beta)}{\partial \hat\beta'} \operatorname{Var}(\hat\beta) \frac{\partial G(\hat\beta)}{\partial \hat\beta}. \tag{4} \]
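As a rough illustration of equation 4, the sketch below computes a delta-method standard error and a normal-approximation interval for a scalar function of the estimates. It is not the prvalue implementation: the helper names (`numerical_gradient`, `delta_method_ci`), the example function, and the values of `beta_hat` and `vcov` are all hypothetical, and the gradient is approximated by finite differences rather than derived analytically as in the text.

```python
import numpy as np


def numerical_gradient(g, beta, eps=1e-6):
    """Central finite-difference approximation to dG/dbeta (an assumption;
    the paper derives gradients analytically)."""
    beta = np.asarray(beta, dtype=float)
    grad = np.zeros_like(beta)
    for k in range(beta.size):
        step = np.zeros_like(beta)
        step[k] = eps
        grad[k] = (g(beta + step) - g(beta - step)) / (2.0 * eps)
    return grad


def delta_method_ci(g, beta_hat, vcov, z=1.96):
    """Return (estimate, se, lower, upper) for G(beta_hat) using equation 4."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    vcov = np.asarray(vcov, dtype=float)
    estimate = g(beta_hat)
    grad = numerical_gradient(g, beta_hat)
    variance = grad @ vcov @ grad  # gradient' * Var(beta_hat) * gradient
    se = np.sqrt(variance)
    return estimate, se, estimate - z * se, estimate + z * se


# Hypothetical estimates and covariance matrix, for illustration only.
beta_hat = np.array([0.5, -1.2])
vcov = np.array([[0.04, 0.01],
                 [0.01, 0.09]])


def g_example(beta):
    """Hypothetical smooth function of the parameters."""
    return np.exp(beta[0] + 2.0 * beta[1])


estimate, se, lower, upper = delta_method_ci(g_example, beta_hat, vcov)
print(estimate, se, lower, upper)
```

The interval here is the symmetric large-sample interval, estimate plus or minus z times the standard error; z = 1.96 is an assumed 95% level.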
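For example, consider the logit model with

\[ G(\beta) = \Pr(y = 1 \mid x) = \Lambda(x'\beta). \tag{5} \]

To compute the confidence interval, we need the gradient vector

\[ \frac{\partial G(\beta)}{\partial \beta} = \left[ \frac{\partial \Lambda(x'\beta)}{\partial \beta_0} \;\; \frac{\partial \Lambda(x'\beta)}{\partial \beta_1} \;\; \cdots \;\; \frac{\partial \Lambda(x'\beta)}{\partial \beta_K} \right]'. \tag{6} \]

Since $\Lambda$ is a cdf,

\[ \frac{\partial \Lambda(x'\beta)}{\partial \beta_k} = \frac{\partial \Lambda(x'\beta)}{\partial x'\beta} \frac{\partial x'\beta}{\partial \beta_k} = \lambda(x'\beta)\, x_k. \]

Then

\[ \frac{\partial G(\beta)}{\partial \beta} = \left[ \lambda(x'\beta) \;\; \lambda(x'\beta) x_1 \;\; \cdots \;\; \lambda(x'\beta) x_K \right]'. \tag{7} \]

To compute the confidence interval for a change in the probability as the independent variables change from $x_a$ to $x_b$, we use the function

\[ G(\beta) = \Lambda(x_a'\beta) - \Lambda(x_b'\beta), \tag{8} \]

where

\[ \frac{\partial G(\beta)}{\partial \beta} = \frac{\partial \left[ \Lambda(x_a'\beta) - \Lambda(x_b'\beta) \right]}{\partial \beta} = \frac{\partial \Lambda(x_a'\beta)}{\partial \beta} - \frac{\partial \Lambda(x_b'\beta)}{\partial \beta}. \tag{9} \]

Substituting this result into equation 4,

\[ \operatorname{Var}\bigl(G(\beta)\bigr) = \frac{\partial \Lambda(x_a'\beta)}{\partial \beta'} \operatorname{Var}(\hat\beta) \frac{\partial \Lambda(x_a'\beta)}{\partial \beta} - \frac{\partial \Lambda(x_a'\beta)}{\partial \beta'} \operatorname{Var}(\hat\beta) \frac{\partial \Lambda(x_b'\beta)}{\partial \beta} - \frac{\partial \Lambda(x_b'\beta)}{\partial \beta'} \operatorname{Var}(\hat\beta) \frac{\partial \Lambda(x_a'\beta)}{\partial \beta} + \frac{\partial \Lambda(x_b'\beta)}{\partial \beta'} \operatorname{Var}(\hat\beta) \frac{\partial \Lambda(x_b'\beta)}{\partial \beta}. \tag{10} \]

We now apply these formulas to the models for which prvalue computes confidence intervals.

2 Binary Models

In binary models, $G(\beta) = \Pr(y = 1 \mid x) = F(x'\beta)$, where $F$ is the cdf for the logistic, normal, or cloglog function. The gradient is

\[ \frac{\partial F(x'\beta)}{\partial \beta_k} = \frac{\partial F(x'\beta)}{\partial x'\beta} \frac{\partial x'\beta}{\partial \beta_k} = f(x'\beta)\, x_k, \tag{11} \]

where $f$ is the pdf corresponding to $F$. For the vector $x$ it follows that

\[ \frac{\partial F(x'\beta)}{\partial \beta} = f(x'\beta)\, x. \tag{12} \]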
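The logit gradient in equation 7 and the four-term variance of a discrete change in equation 10 can be sketched as follows. This is an illustrative sketch, not the prvalue code: the function names and all numeric values are hypothetical, and each x vector is assumed to carry a leading 1 for the intercept.

```python
import numpy as np


def logistic_cdf(z):
    """Lambda(z), the logistic cdf."""
    return 1.0 / (1.0 + np.exp(-z))


def logistic_pdf(z):
    """lambda(z), the logistic pdf, written as Lambda(z) * (1 - Lambda(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def prob_and_gradient(x, beta):
    """Predicted probability (equation 5) and its gradient (equation 7)."""
    xb = x @ beta
    return logistic_cdf(xb), logistic_pdf(xb) * x


def discrete_change(x_a, x_b, beta, vcov):
    """Change Lambda(x_a'b) - Lambda(x_b'b) and its variance via equation 10."""
    p_a, g_a = prob_and_gradient(x_a, beta)
    p_b, g_b = prob_and_gradient(x_b, beta)
    variance = (g_a @ vcov @ g_a - g_a @ vcov @ g_b
                - g_b @ vcov @ g_a + g_b @ vcov @ g_b)
    return p_a - p_b, variance


# Hypothetical estimates; the first element of each x is the constant.
beta_hat = np.array([-0.5, 0.8, 0.3])
vcov = np.array([[0.10, 0.01, 0.00],
                 [0.01, 0.05, 0.00],
                 [0.00, 0.00, 0.02]])
x_a = np.array([1.0, 0.0, 2.0])
x_b = np.array([1.0, 1.0, 2.0])

change, variance = discrete_change(x_a, x_b, beta_hat, vcov)
se = np.sqrt(variance)
print(change, se, change - 1.96 * se, change + 1.96 * se)
```

The four terms in equation 10 collapse to a single quadratic form in the difference of the two gradients, which is a convenient check on the arithmetic.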
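From equation 4,

\[ \operatorname{Var}\left[\Pr(y = 1 \mid x)\right] = f(x'\beta)\, x' \operatorname{Var}(\hat\beta)\, x\, f(x'\beta) = f(x'\beta)^2\, x' \operatorname{Var}(\hat\beta)\, x. \tag{13} \]

The variances of $\Pr(y = 1 \mid x)$ and $\Pr(y = 0 \mid x)$ are equal since

\[ \frac{\partial \left[ 1 - F(x'\beta) \right]}{\partial \beta} = -f(x'\beta)\, x \tag{14} \]
\[ \operatorname{Var}\left[\Pr(y = 0 \mid x)\right] = \left[ -f(x'\beta) \right]^2 x' \operatorname{Var}(\hat\beta)\, x. \tag{15} \]

3 Ordered Logit and Probit

Assume that there are $m = 1, J$ outcome categories, where

\[ \Pr(y = m \mid x) = F(\tau_m - x'\beta) - F(\tau_{m-1} - x'\beta) \quad \text{for } m = 1, J. \tag{16} \]

Since we assume that $\tau_0 = -\infty$ and $\tau_J = \infty$, $F(\tau_0 - x'\beta) = 0$ and $F(\tau_J - x'\beta) = 1$. To compute the gradient,

\[ \frac{\partial F(\tau_m - x'\beta)}{\partial \beta_k} = \frac{\partial F(\tau_m - x'\beta)}{\partial (\tau_m - x'\beta)} \frac{\partial (\tau_m - x'\beta)}{\partial \beta_k} \tag{17} \]
\[ \phantom{\frac{\partial F(\tau_m - x'\beta)}{\partial \beta_k}} = f(\tau_m - x'\beta)(-x_k) \tag{18} \]
\[ \frac{\partial F(\tau_m - x'\beta)}{\partial \tau_j} = \frac{\partial F(\tau_m - x'\beta)}{\partial (\tau_m - x'\beta)} \frac{\partial (\tau_m - x'\beta)}{\partial \tau_j}. \tag{19} \]

It follows that

\[ \frac{\partial F(\tau_m - x'\beta)}{\partial \tau_j} = f(\tau_m - x'\beta) \quad \text{if } j = m \tag{20} \]
\[ \frac{\partial F(\tau_m - x'\beta)}{\partial \tau_j} = 0 \quad \text{if } j \neq m. \tag{21} \]

Using these results with equation 16,

\[ \frac{\partial \Pr(y_i = m \mid x_i)}{\partial \beta_k} = \left[ f(\tau_m - x'\beta)(-x_k) \right] - \left[ f(\tau_{m-1} - x'\beta)(-x_k) \right] \tag{22} \]
\[ \phantom{\frac{\partial \Pr(y_i = m \mid x_i)}{\partial \beta_k}} = -x_k \left[ f(\tau_m - x'\beta) - f(\tau_{m-1} - x'\beta) \right] \]
\[ \frac{\partial \Pr(y_i = m \mid x_i)}{\partial \tau_j} = \frac{\partial F(\tau_m - x'\beta)}{\partial \tau_j} - \frac{\partial F(\tau_{m-1} - x'\beta)}{\partial \tau_j} \tag{23} \]
\[ \phantom{\frac{\partial \Pr(y_i = m \mid x_i)}{\partial \tau_j}} = \begin{cases} f(\tau_m - x'\beta) & \text{if } j = m \\ -f(\tau_{m-1} - x'\beta) & \text{if } j = m - 1 \\ 0 & \text{otherwise.} \end{cases} \]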
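For example, with three categories:

\[ \Pr(y = 1 \mid x) = F(\tau_1 - x'\beta) - 0 \tag{24} \]
\[ \Pr(y = 2 \mid x) = F(\tau_2 - x'\beta) - F(\tau_1 - x'\beta) \tag{25} \]
\[ \Pr(y = 3 \mid x) = 1 - F(\tau_2 - x'\beta), \tag{26} \]

then

\[ \frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial \beta_k} = -x_k \left[ f(\tau_1 - x'\beta) \right] \tag{27} \]
\[ \frac{\partial \Pr(y_i = 2 \mid x_i)}{\partial \beta_k} = -x_k \left[ f(\tau_2 - x'\beta) - f(\tau_1 - x'\beta) \right] \tag{28} \]
\[ \frac{\partial \Pr(y_i = 3 \mid x_i)}{\partial \beta_k} = -x_k \left[ -f(\tau_2 - x'\beta) \right]. \tag{29} \]

With respect to $\tau$,

\[ \frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial \tau_1} = f(\tau_1 - x'\beta) \tag{30} \]
\[ \frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial \tau_2} = 0 \tag{31} \]
\[ \frac{\partial \Pr(y_i = 2 \mid x_i)}{\partial \tau_1} = -f(\tau_1 - x'\beta) \tag{32} \]
\[ \frac{\partial \Pr(y_i = 2 \mid x_i)}{\partial \tau_2} = f(\tau_2 - x'\beta) \tag{33} \]
\[ \frac{\partial \Pr(y_i = 3 \mid x_i)}{\partial \tau_1} = 0 \tag{34} \]
\[ \frac{\partial \Pr(y_i = 3 \mid x_i)}{\partial \tau_2} = -f(\tau_2 - x'\beta). \tag{35} \]

To implement these procedures in Stata, we create the augmented matrices:

\[ \beta^* = \left[ \beta' \;\; \tau_1 \;\; \cdots \;\; \tau_{J-1} \right]' \tag{36} \]
\[ x_1^* = \left[ -x' \;\; 1 \;\; 0 \;\; \cdots \;\; 0 \right]', \quad x_2^* = \left[ -x' \;\; 0 \;\; 1 \;\; \cdots \;\; 0 \right]', \quad \ldots, \quad x_{J-1}^* = \left[ -x' \;\; 0 \;\; 0 \;\; \cdots \;\; 1 \right]', \tag{37} \]

such that

\[ x_j^{*\prime} \beta^* = \tau_j - x\beta. \tag{38} \]

We then create the gradients described above.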
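The ordered logit gradients in equations 22 and 23 can be assembled into one matrix with a row per outcome and a column per element of the stacked parameter vector (the coefficients followed by the thresholds). The sketch below is only an illustration under stated assumptions: it uses the logistic cdf for F, the function names are hypothetical, x excludes a constant, and the parameter values and covariance matrix are made up.

```python
import numpy as np


def logistic_cdf(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)


def padded_index(x, beta, tau):
    """tau_m - x'beta for m = 0..J, with tau_0 = -inf and tau_J = +inf."""
    return np.concatenate(([-np.inf], tau - x @ beta, [np.inf]))


def ordered_probs(x, beta, tau):
    """Pr(y = m | x) for m = 1..J from equation 16."""
    return np.diff(logistic_cdf(padded_index(x, beta, tau)))


def ordered_gradient(x, beta, tau):
    """Rows: outcomes m = 1..J. Columns: (beta_1..beta_K, tau_1..tau_{J-1})."""
    K, J = beta.size, tau.size + 1
    f = logistic_pdf(padded_index(x, beta, tau))  # f[0] = f[J] = 0
    grad = np.zeros((J, K + J - 1))
    for m in range(1, J + 1):
        grad[m - 1, :K] = -x * (f[m] - f[m - 1])      # equation 22
        if m <= J - 1:
            grad[m - 1, K + m - 1] = f[m]             # equation 23, j = m
        if m >= 2:
            grad[m - 1, K + m - 2] = -f[m - 1]        # equation 23, j = m - 1
    return grad


# Hypothetical values for a three-category model with two regressors.
x = np.array([0.5, -1.0])
beta_hat = np.array([0.4, 0.2])
tau_hat = np.array([-0.5, 1.0])
vcov = 0.01 * np.eye(4)  # hypothetical Var of (beta, tau)

probs = ordered_probs(x, beta_hat, tau_hat)
grad = ordered_gradient(x, beta_hat, tau_hat)
variances = np.diag(grad @ vcov @ grad.T)  # equation 4, one value per outcome
print(probs, np.sqrt(variances))
```

Because the probabilities sum to one, each column of the gradient matrix sums to zero, which gives a quick sanity check on the signs.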
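4 Generalized Ordered Logit^1

The generalized ordered logit model is identical to the ordinal logit model except that the coefficients associated with $x$ differ for each outcome. Since there is an intercept for each outcome, the $\tau$'s are fixed to zero and $\beta_J = 0$ for identification. Then,

\[ \Pr(y_i = 1 \mid x_i) = F(-x'\beta_1) \tag{39} \]
\[ \Pr(y_i = m \mid x_i) = F(-x'\beta_m) - F(-x'\beta_{m-1}) \quad \text{for } m = 2, J-1 \tag{40} \]
\[ \Pr(y_i = J \mid x_i) = 1 - F(-x'\beta_{J-1}). \tag{41} \]

The gradient with respect to the $\beta$'s is

\[ \frac{\partial F(-x'\beta_m)}{\partial \beta_{m,k}} = f(-x'\beta_m)(-x_k), \tag{42} \]

while no gradient for thresholds is needed. Then,

\[ \frac{\partial \Pr(y_i = m \mid x_i)}{\partial \beta_{m,k}} = -x_k\, f(-x'\beta_m), \qquad \frac{\partial \Pr(y_i = m \mid x_i)}{\partial \beta_{m-1,k}} = x_k\, f(-x'\beta_{m-1}). \tag{43} \]

^1 Confidence intervals for the generalized ordered logit model are not supported in prvalue.

5 Multinomial Logit

Assuming outcomes 1 through $J$,

\[ \Pr(y = m \mid x) = \frac{\exp(x\beta_m)}{\sum_{j=1}^{J} \exp(x\beta_j)}, \tag{44} \]

where without loss of generality we assume that $\beta_1 = 0$ to identify the model (accordingly, the derivatives below do not apply to the partial with respect to $\beta_1$). To simplify notation, let $\Sigma = \sum_{j=1}^{J} \exp(x\beta_j)$. The derivative of the probability of $m$ with respect to $\beta_n$ is

\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_n} = \frac{\partial \left[ \exp(x\beta_m)\, \Sigma^{-1} \right]}{\partial \beta_n}. \tag{45} \]

Using the quotient rule,

\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_n} = \frac{ \Sigma\, \dfrac{\partial \exp(x\beta_m)}{\partial \beta_n} - \exp(x\beta_m)\, \dfrac{\partial \Sigma}{\partial \beta_n} }{ \Sigma^2 }. \tag{46} \]

Examining each partial in turn,

\[ \frac{\partial \exp(x\beta_m)}{\partial \beta_n} = \frac{\partial \exp(x\beta_m)}{\partial x\beta_m} \frac{\partial x\beta_m}{\partial \beta_n} \tag{47} \]
\[ \phantom{\frac{\partial \exp(x\beta_m)}{\partial \beta_n}} = \begin{cases} \exp(x\beta_m)\, x & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases} \]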
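\[ \frac{\partial \Sigma}{\partial \beta_n} = \frac{\partial \sum_{j=1}^{J} \exp(x\beta_j)}{\partial \beta_n} \tag{48} \]
\[ \phantom{\frac{\partial \Sigma}{\partial \beta_n}} = \exp(x\beta_n)\, x. \]

The last equality follows since the partial of $\exp(x\beta_j)$ with respect to $\beta_n$ is 0 unless $j = n$. Combining these results, if $m = n$,

\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_m} = \frac{ \Sigma \exp(x\beta_m)\, x - \exp(x\beta_m)^2\, x }{ \Sigma^2 } \tag{49} \]
\[ = \left[ \frac{\exp(x\beta_m)}{\Sigma} - \frac{\exp(x\beta_m)^2}{\Sigma^2} \right] x \]
\[ = \left[ \frac{\exp(x\beta_m)}{\Sigma} - \frac{\exp(x\beta_m)}{\Sigma} \frac{\exp(x\beta_m)}{\Sigma} \right] x \]
\[ = \left[ \Pr(y = m) - \Pr(y = m) \Pr(y = m) \right] x \]
\[ = \Pr(y = m) \left[ 1 - \Pr(y = m) \right] x. \]

For $m \neq n$,

\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_n} = \frac{ 0 - \exp(x\beta_m) \exp(x\beta_n)\, x }{ \Sigma^2 } \tag{50} \]
\[ = -\frac{\exp(x\beta_m)}{\Sigma} \frac{\exp(x\beta_n)}{\Sigma}\, x \]
\[ = -\Pr(y = m) \Pr(y = n)\, x. \]

For example, for two $x$'s and $m = 1$:

\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_m} = \left[ p_m (1 - p_m) x_1 \;\; p_m (1 - p_m) x_2 \;\; p_m (1 - p_m) \right]' \tag{51} \]
\[ \frac{\partial \Pr(y = m \mid x)}{\partial \beta_{n \neq m}} = \left[ -p_m p_n x_1 \;\; -p_m p_n x_2 \;\; -p_m p_n \right]'. \tag{52} \]

6 Poisson and Negative Binomial Regression

In the Poisson regression model,

\[ \mu_i = \exp(x_i'\beta), \tag{53} \]

so that

\[ \frac{\partial \mu}{\partial \beta_k} = \frac{\partial \exp(x'\beta)}{\partial \beta_k} \tag{54} \]
\[ \phantom{\frac{\partial \mu}{\partial \beta_k}} = \frac{\partial \exp(x'\beta)}{\partial x'\beta} \frac{\partial x'\beta}{\partial \beta_k} \]
\[ \phantom{\frac{\partial \mu}{\partial \beta_k}} = \exp(x'\beta)\, x_k \]
\[ \phantom{\frac{\partial \mu}{\partial \beta_k}} = \mu x_k. \]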
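Stacking the elementwise result in equation 54 over k gives a gradient of mu times x, so equation 4 yields a variance for the predicted rate of mu squared times x' Var(beta-hat) x. A minimal sketch of that calculation follows; the function name and all numbers are hypothetical, and x is assumed to include a leading 1 for the constant.

```python
import numpy as np


def poisson_rate_ci(x, beta_hat, vcov, z=1.96):
    """Predicted rate mu = exp(x'b) with a delta-method interval."""
    mu = np.exp(x @ beta_hat)
    grad = mu * x                      # equation 54 stacked over k
    se = np.sqrt(grad @ vcov @ grad)   # equation 4
    return mu, se, mu - z * se, mu + z * se


# Hypothetical estimates; the first element of x is the constant.
x = np.array([1.0, 2.0, 0.5])
beta_hat = np.array([0.2, 0.1, -0.4])
vcov = np.array([[0.050, 0.002, 0.000],
                 [0.002, 0.010, 0.001],
                 [0.000, 0.001, 0.020]])

mu, se, lower, upper = poisson_rate_ci(x, beta_hat, vcov)
print(mu, se, lower, upper)
```

Nothing in this symmetric interval keeps the lower bound above zero, so for small rates it can extend below the range of a rate.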
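Using matrices,

\[ \frac{\partial \mu}{\partial \beta} = \frac{\partial \exp(x'\beta)}{\partial \beta} = \frac{\partial \exp(x'\beta)}{\partial x'\beta} \frac{\partial x'\beta}{\partial \beta} = \mu x. \tag{55} \]

The probability of a given count is

\[ \Pr(y \mid x) = \frac{\exp(-\mu)\, \mu^y}{y!}, \tag{56} \]

so we can compute the gradient as:

\[ \frac{\partial \left[ \exp(-\mu)\, \mu^y / y! \right]}{\partial \beta_k} = \frac{1}{y!} \frac{\partial \left[ \exp(-\mu)\, \mu^y \right]}{\partial \mu} \frac{\partial \mu}{\partial \beta_k}. \tag{57} \]

Since the last term was computed above, we only need to derive

\[ \frac{\partial \left[ \exp(-\mu)\, \mu^y \right]}{\partial \mu} = \exp(-\mu) \frac{\partial \mu^y}{\partial \mu} + \mu^y \frac{\partial \exp(-\mu)}{\partial \mu} \]
\[ \phantom{\frac{\partial \left[ \exp(-\mu)\, \mu^y \right]}{\partial \mu}} = \exp(-\mu)\, y \mu^{y-1} - \mu^y \exp(-\mu). \tag{58} \]

This leads to

\[ \frac{\partial \Pr(y \mid x)}{\partial \beta_k} = \frac{1}{y!} \left[ \exp(-\mu)\, y \mu^{y-1} - \mu^y \exp(-\mu) \right] \mu x_k \tag{59} \]
\[ = \frac{ \exp(-\mu)\, y \mu^y - \mu^{y+1} \exp(-\mu) }{ y! }\, x_k \]
\[ = \frac{ y \mu^y - \mu^{y+1} }{ \exp(\mu)\, y! }\, x_k. \]

Using matrices,

\[ \frac{\partial \Pr(y \mid x)}{\partial \beta} = \frac{ \exp(-\mu)\, y \mu^y - \mu^{y+1} \exp(-\mu) }{ y! }\, x \tag{60} \]
\[ = \frac{ y \mu^y - \mu^{y+1} }{ \exp(\mu)\, y! }\, x. \]

The negative binomial model is specified as

\[ \mu = \exp(x'\beta + \varepsilon) \tag{61} \]
\[ \phantom{\mu} = \exp(x'\beta) \exp(\varepsilon), \]

where $\varepsilon$ has a gamma distribution with variance $\alpha$. The counts have a negative binomial distribution

\[ \Pr(y_i \mid x_i) = \frac{\Gamma(y_i + \nu)}{y_i!\, \Gamma(\nu)} \left( \frac{\nu}{\nu + \mu_i} \right)^{\nu} \left( \frac{\mu_i}{\nu + \mu_i} \right)^{y_i}, \tag{62} \]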
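where $\nu = \alpha^{-1}$. The derivatives of the log-likelihood are given in Stata Reference, Version 8, page 10. To simplify notation, we define $\tau = \ln \alpha$, $m = 1/\alpha$, $p = 1/(1 + \alpha\mu)$, and $\mu = \exp(x\beta)$. With $\psi(z)$ being the digamma function evaluated at $z$,

\[ \frac{\partial \ln \Pr(y \mid x)}{\partial (x\beta)} = p\,(y - \mu) \tag{63} \]
\[ \frac{\partial \ln \Pr(y \mid x)}{\partial \tau} = -m \left\{ \frac{\alpha (\mu - y)}{1 + \alpha\mu} - \ln(1 + \alpha\mu) + \psi(y + m) - \psi(m) \right\}. \tag{64} \]

Then by the chain rule,

\[ \frac{\partial \ln \Pr(y \mid x)}{\partial \beta} = \frac{\partial \ln \Pr(y \mid x)}{\partial \Pr(y \mid x)} \frac{\partial \Pr(y \mid x)}{\partial \beta} = \frac{1}{\Pr(y \mid x)} \frac{\partial \Pr(y \mid x)}{\partial \beta}, \tag{65} \]

so that

\[ \frac{\partial \Pr(y \mid x)}{\partial \beta} = \Pr(y \mid x) \frac{\partial \ln \Pr(y \mid x)}{\partial \beta}. \tag{66} \]

Similarly for $\tau$,

\[ \frac{\partial \Pr(y \mid x)}{\partial \tau} = \Pr(y \mid x) \frac{\partial \ln \Pr(y \mid x)}{\partial \tau}. \tag{67} \]

7 References

Agresti, Alan. 2002. Categorical Data Analysis. 2nd Edition. New York: Wiley.

Greene, William H. 2000. Econometric Analysis. 4th Edition. New York: Prentice Hall.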
More informationExam C, Fall 2006 PRELIMINARY ANSWER KEY
Exam C, Fall 2006 PRELIMINARY ANSWER KEY Question # Answer Question # Answer 1 E 19 B 2 D 20 D 3 B 21 A 4 C 22 A 5 A 23 E 6 D 24 E 7 B 25 D 8 C 26 A 9 E 27 C 10 D 28 C 11 E 29 C 12 B 30 B 13 C 31 C 14
More informationInterpretation of Somers D under four simple models
Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms
More informationP (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i )
Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =
More informationLogistic Regression, Part III: Hypothesis Testing, Comparisons to OLS
Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals heavily
More informationStatistics in Retail Finance. Chapter 2: Statistical models of default
Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision
More informationMarkov Chain Monte Carlo Simulation Made Simple
Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical
More informationModels for Count Data With Overdispersion
Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extrapoisson variation and the negative binomial model, with brief appearances
More informationAnalysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
More information