Multivariate Ordered Regression
February 28, 2012
Valentino Dardanoni, Antonio Forcina, Paolo Li Donni

ABSTRACT TO BE WRITTEN
JEL codes:
Keywords:

1 Introduction

Many interesting problems in economics involve the study of how an ordered response variable depends on a set of regressors. In many situations we may be concerned with the more general problem of modelling how the joint distribution of several closely related ordered response variables $(Y_1, \dots, Y_K)$ depends on a vector of covariates $z$. Problems of this kind arise, for example, when we consider different choices made by each subject at a given point in time, or repeated choices on the same item made by each subject at different points in time.¹

In the microeconometric literature, the current approach to modelling the joint conditional distribution of ordinal response variables relies on assuming the existence of $K$ latent variables which form a regression system

$$Y_k^* = z_k'\tau_k + \epsilon_k, \quad k = 1, \dots, K,$$

where $z_k$ denotes the subset of the regressors $z$ relevant to the $k$th equation; a set of threshold parameters transforms the continuous latent distribution into the actual discrete one. In addition, the joint distribution of the errors $H(\epsilon_1, \dots, \epsilon_K)$ may also be specified.

An alternative approach is to consider the conditional distribution of the ordered response variables as a multi-way table of joint probabilities, to arrange these into the vector $\pi(z)$, and to define a suitable multivariate transformation of $\pi(z)$, known as a link function, which makes the dependence on covariates linear:

$$\pi(z) = g[\lambda(z)] = g(\alpha + Z\beta); \qquad (1)$$

here $Z$ is a matrix of known constants which depends on $z$, and the vector of parameters $\lambda(z)$ describes relevant aspects of $(Y_1, \dots, Y_K \mid z)$ which have, hopefully, an interpretation in terms of economic theory.
¹ If data involve observations taken at different points in time, the model described in this paper can be seen as a static panel model with free correlations across periods. With panel data, more specific approaches have been designed to take into account individual heterogeneity, lagged dependent variables and serial correlation.

Models of this kind have, essentially, two components:
- The regression model $\lambda(z) = \alpha + Z\beta$: this is the parametric component of the model.
- The link function $\pi(z) = g[\lambda(z)]$: this is the potentially nonparametric component of the model. It maps the linear function of covariates $\lambda(z)$ onto a discrete joint distribution.

Equation (1) defines a wide class of multivariate ordered regression models, whose elements are characterized by the specific link function $g$. Well-known examples are the log-linear and the probit links. The log-linear link function (see e.g. Amemiya [3], Chapter 9, or Agresti [1], Chapter 4), in spite of its appealing simplicity, does not allow the univariate marginals to be modelled directly, cannot take into account the ordered nature of the response variables, and does not have a latent variable interpretation. On the other hand, the so-called multivariate ordered probit model, which is based on the probit link, does not suffer from any of these limitations. However, because of its simplicity, the probit link imposes some strong restrictions on the association structure of the response variables, namely (i) each bivariate marginal distribution has only one association parameter; and (ii) all log-linear interactions of order higher than two are constrained to zero. In addition, fitting models based on the probit link may be computationally demanding when there are several response variables.

Within the class of models defined by equation (1), we are typically interested in link functions whose elements have, possibly, the following desirable features:
- they describe relevant aspects of $(Y_1, \dots, Y_K) \mid z$ which have a substantive economic interpretation;
- they take into account the ordered nature of the response variables;
- they make it possible to model the univariate marginals and the association structure directly;
- they have an interpretation in terms of the system of latent equations.
By exploiting recent advances in the theory of marginal modelling (Bartolucci, Colombi and Forcina [4]), we propose a link function having most of the desirable properties listed above. The available theory of marginal models can handle a link function where the association structure is unrestricted; however, this would make the exposition heavier with little practical advantage. Though the link function we discuss in this paper implies certain mild constraints on the association structure, it may be further simplified by testable restrictions. In this formulation, the table $\pi(z)$, which arrays the joint probabilities of the response variables $(Y_1, \dots, Y_K) \mid z$, is decomposed into two main sets of parameters of interest: the global logits, which model the marginal distributions of the ordered discrete variables, and the global log-odds ratios, which describe their association. As we show in Section 2.4 below, these parameters have a natural interpretation in terms of stochastic dominance, thus allowing a clear economic interpretation of the estimated parameters.

In Section 2 we derive the main properties of the marginal modelling approach and its relation with the latent variable model. Section 3 discusses the computational properties of maximum likelihood estimates, and the asymptotic distribution of the likelihood ratio under suitable equality and inequality constraints. Section 4 illustrates the use of these models with an application to testing for asymmetric information in the Medigap insurance market.
2 Ordered regression

To help the understanding of our approach to the multivariate case, we start by briefly reviewing the well-known univariate case with a single ordinal response variable $Y$ taking values in $\{1, \dots, m\}$. Let $q(z)$ be the $(m-1) \times 1$ vector denoting the survival function of $Y$, conditionally on $z$:

$$q(z) = \big(\Pr(Y > 1 \mid z), \dots, \Pr(Y > m-1 \mid z)\big)',$$

and notice that the term $\Pr(Y > 0 \mid z) = 1$ can be omitted. A generalized ordered regression model is an equation that relates $q(z)$ to $z$ through a vector-valued function $g: R^{m-1} \to (0,1)^{m-1}$, known as a link function (McCullagh and Nelder [12]), which is assumed to be invertible and twice differentiable and such that

$$q(z) = g(\alpha + Z\beta), \qquad (2)$$

where $Z$ is a matrix of known constants which depend on $z$. Since the link function is invertible, there is a well-defined set of $m-1$ parameters which are linearly related to $Z$:

$$\lambda(z) = g^{-1}(q(z)) = \alpha + Z\beta. \qquad (3)$$

It is well known (see, for example, Wooldridge [19], p. 457) that, under a few additional assumptions, the ordered regression model (2) is equivalent to assuming the existence of a continuous latent variable $Y^*$ which follows a linear regression model

$$Y^* = \tau_0 + z'\tau + \epsilon, \qquad (4)$$

where the error $\epsilon$ is independent of $z$ and has cumulative distribution function $P(\epsilon \le v) = G(v)$, and there is a vector of $m-1$ unknown parameters $\gamma$ (called thresholds), with $\gamma_1 < \dots < \gamma_{m-1}$, such that $Y \le j \Leftrightarrow Y^* \le \gamma_j$. Notice that, since we can add an arbitrary constant to $\tau_0$ and subtract the same constant from $\gamma$ without affecting $\alpha$, these two parameters cannot be identified simultaneously; in the sequel we assume that $\tau_0 = 0$, so that the regression model in (4) has no intercept. When $\epsilon$ has a standard logistic distribution, the latent regression model (4) is equivalent to an ordered regression model where $\Pr(\epsilon \le v) = G(v) = e^v/(1 + e^v)$.
By inverting the survival function $\Pr(Y > j \mid z) = G(-\gamma_j + z'\beta)$, it follows that $\lambda(z)$ is a vector of so-called global logits which are linearly related to $z$,

$$\lambda_j(z) = \log[\Pr(Y > j \mid z)/\Pr(Y \le j \mid z)] = -\gamma_j + z'\beta, \quad j = 1, \dots, m-1, \qquad (5)$$

and can be seen as the natural generalization of the standard binary logits when the variable is ordered. In fact, global logits can be interpreted as binary logits computed after dichotomizing the response categories at each cut point into a low and a high level. The standard ordered logit model can be seen as a special case of the ordered regression model (2) with $G$ being the logit link. Different choices of the link function $G$ give rise to different assumptions on the distribution of the latent regression error $\epsilon$ and to different definitions of the parameter vector $\lambda(z)$. The best-known alternative to the ordered logit model is of course the ordered probit model, where $G$ is the standard normal cdf.
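The relation in (5) between thresholds, regression coefficients and global logits can be checked numerically. The following is a minimal sketch, assuming a logistic error; the function names and the numerical values of the thresholds, coefficients and covariates are hypothetical and chosen only for illustration.

```python
import math


def logistic_cdf(v):
    """Standard logistic cdf G(v) = e^v / (1 + e^v)."""
    return 1.0 / (1.0 + math.exp(-v))


def ordered_logit_probs(thresholds, z, beta):
    """Category probabilities Pr(Y = j | z), j = 1..m, implied by (5)."""
    xb = sum(zi * bi for zi, bi in zip(z, beta))
    # Survival function: Pr(Y > j | z) = G(-gamma_j + z'beta), j = 1..m-1
    surv = [logistic_cdf(-g + xb) for g in thresholds]
    upper = [1.0] + surv   # Pr(Y > j - 1 | z) for j = 1..m
    lower = surv + [0.0]   # Pr(Y > j | z) for j = 1..m
    return [u - l for u, l in zip(upper, lower)]


def global_logits(probs):
    """log[Pr(Y > j) / Pr(Y <= j)] for j = 1..m-1."""
    out, cum = [], 0.0
    for p in probs[:-1]:
        cum += p
        out.append(math.log((1.0 - cum) / cum))
    return out


# Hypothetical values: m = 3 categories, two covariates.
gammas = [-0.5, 1.0]
beta = [0.8, -0.2]
z = [1.0, 2.0]

probs = ordered_logit_probs(gammas, z, beta)
logits = global_logits(probs)  # each entry should equal -gamma_j + z'beta
```

Recovering the global logits from the category probabilities returns the linear predictor $-\gamma_j + z'\beta$ for each cut point, which is the sense in which the link is invertible.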
2.1 Multivariate ordered regression

The univariate latent regression model (4) may be extended to the multivariate case by assuming, in addition:
- the existence of $K$ latent continuous variables $Y_k^*$, $k = 1, \dots, K$, such that
$$Y_k^* = z_k'\tau_k + \epsilon_k, \quad k = 1, \dots, K; \qquad (6)$$
- the existence of a joint distribution for the errors $\epsilon_1, \dots, \epsilon_K$.

It then becomes a seemingly unrelated regression system formulated in terms of latent variables. Let $H$ denote the joint distribution of the errors, so that $H(a_1, \dots, a_K) = P(\epsilon_1 \le a_1, \dots, \epsilon_K \le a_K)$, and let $H_k$, $k = 1, \dots, K$, denote the corresponding univariate marginal distributions. The simplest modelling strategy would be to assume that the $K$ error components in the latent regression models are independent, so that the regression system (6) implies $K$ separate ordered regression models of the form

$$H_k^{-1}[P(Y_k > j(k) \mid z)] = -\gamma_{j(k)} + z_k'\tau_k, \quad k = 1, \dots, K, \; j = 1, \dots, m_k - 1,$$

which may be estimated as in the univariate case, and no additional theory is required. However, there are several reasons for also modelling the association structure of $(\epsilon_1, \dots, \epsilon_K)$. Apart from the fact that, by assuming that $(\epsilon_1, \dots, \epsilon_K)$ are independent, we are likely to misspecify the true model with a loss of efficiency, the main reason for estimating the whole system of latent equations is that the nature and the degree of association between the response variables (conditionally on the covariates) may be of substantive interest, as in the application considered in this paper.

2.2 Multivariate Link Functions as copulas

The probabilities which define the joint density of $Y_1, \dots, Y_K$ conditionally on $z$ can be displayed in a table with $t = \prod_{k=1}^{K} m_k$ cells. It is convenient to arrange these probabilities into the vector $\pi(z)$ in lexicographic order by letting variables $Y_k$ with a larger index $k$ run faster from 1 to $m_k$; if unrestricted, $\pi(z)$ belongs to the $t$-dimensional simplex $\Pi$ defined by $1_t'\pi(z) = 1$ and $\pi(z) \ge 0$.
As in the univariate case, we are interested in invertible and differentiable mappings which link individual response probabilities to a common vector of parameters,

$$\pi(z) = g[\lambda(z)] = g(\alpha + Z\beta) = g(X\psi), \qquad (7)$$

where the elements of $Z$ are known functions of the vector of covariates $z$, $X$ equals $(I \;\; Z)$, and $\psi = (\alpha' \;\; \beta')'$ collects the model parameters. In view of the fact that the marginal logits are directly related to the univariate latent regression models, it may be convenient to partition $\lambda(z)$ into two components. The first one, denoted by $\lambda^u(z)$, contains the $s = \sum_{k=1}^{K}(m_k - 1)$ global logits which determine the univariate marginal distributions, while the second one, denoted by $\lambda^a(z)$, contains a suitable subset of the remaining $t - 1 - s$ parameters which model the association between the $K$ response variables. Thus, it is convenient to rewrite the regression system (7) above as

$$\lambda(z) = \begin{pmatrix} \lambda^u(z) \\ \lambda^a(z) \end{pmatrix} = \begin{pmatrix} \alpha^u + Z^u\beta^u \\ \alpha^a + Z^a\beta^a \end{pmatrix} = X\psi, \qquad (8)$$
where $\alpha^u$, $\beta^u$ and $Z^u$ denote respectively the intercepts, the regression coefficients and the covariate matrix for the set of univariate logits, while $\alpha^a$, $\beta^a$ and $Z^a$ refer to those for the association parameters.

By modelling the univariate component directly, the association component of the link function outlined above defines a multivariate copula, a conceptual tool for modelling the association among the errors in the $K$ regression equations in a way which treats the response variables symmetrically. Sklar's theorem implies that any continuous distribution $F$ may be determined by its marginal distributions $F_k$ and a copula

$$C_F(u_1, \dots, u_K) = F(F_1^{-1}(u_1), \dots, F_K^{-1}(u_K)), \quad u_k \in [0, 1] \text{ for all } k;$$

thus a copula describes the association structure of $F$ irrespective of its univariate marginals. Nelsen [14] provides an excellent introduction to copulas and their properties. This tool is particularly appropriate for describing the association between ordinal variables because copulas are invariant to strictly monotone transformations of the random variables; the other basic property of copulas is that univariate marginals and association structure may be modelled separately and then combined. Therefore, the latent regression model (6) above can be described by the set of regression coefficients $\beta_k$, the threshold parameters $\gamma_k$, the univariate marginal distributions of the $\epsilon_k$ and the copula $C_H$, which in turn may also depend on covariates.

A well-known family of parametric copulas is the Gaussian copula:

$$C_\rho(u_1, \dots, u_K) = \Psi_K(\Psi_1^{-1}(u_1), \dots, \Psi_1^{-1}(u_K); \rho),$$

where $\Psi_Q$ denotes the standard $Q$-variate normal distribution and $\rho$ is the $K(K-1)/2$ vector of all bivariate correlation coefficients.
When the Gaussian copula is combined with $K$ standard normal marginal distributions it gives the multivariate normal distribution which, when employed in the latent regression model (6), gives rise to the multivariate ordered probit model (sometimes also called the multiresponse ordered probit model). Though this model may look appealing, the set of discrete multivariate distributions which are compatible with this copula is very limited. The reason is that, once the cut points are determined in accordance with the univariate marginals, each discrete bivariate marginal distribution, say $(Y_k, Y_h)$, has $(m_k - 1)(m_h - 1)$ additional free cells which are not constrained by the univariate marginals. Under the Gaussian copula all of these probabilities must conform to a bivariate normal and are determined by a single additional parameter (the correlation coefficient). This implies a rather strong restriction unless, of course, the underlying response variables are binary.

2.3 The Multivariate Ordered Logit Model

The copula that we are going to describe in this section is determined by a suitable set of interaction parameters which, together with the global logits (of which they are the natural extension), constitute a parametrization of a relevant subset of $\Pi$ which is of more direct interest. The approach that can be derived from this formulation has two main advantages with respect to the Gaussian copula:
1. the parameterization is easily invertible without the need for numerical integration;
2. the dependence structure is more flexible because, in the unrestricted model, the number of parameters that determine each bivariate distribution equals the number of free cells in the corresponding frequency table.
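The difference in flexibility can be made concrete by counting association parameters. The sketch below compares the $(m_k - 1)(m_h - 1)$ free cells per pair with the single parameter per pair allowed by the Gaussian copula; the function name and the example of three responses with three categories each are assumptions made for illustration.

```python
from itertools import combinations


def association_param_counts(m):
    """Count bivariate association parameters.

    m: list with the number of categories m_k of each response variable.
    Returns (unrestricted, one_per_pair):
      unrestricted  = sum over pairs of (m_k - 1)(m_h - 1) free cells,
      one_per_pair  = one parameter per pair, as under the Gaussian copula.
    """
    pairs = list(combinations(range(len(m)), 2))
    unrestricted = sum((m[k] - 1) * (m[h] - 1) for k, h in pairs)
    one_per_pair = len(pairs)
    return unrestricted, one_per_pair


# Hypothetical example: three responses with three categories each.
counts = association_param_counts([3, 3, 3])
```

With binary responses the two counts coincide, which is why the single-parameter restriction is not binding in that case.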
However, when the association parameters are constrained to be equal, the complexity of our model is identical to that of the Gaussian copula.

Global interaction parameters

Dale (1986) was among the first to consider this parametrization in the bivariate case; for an extension to multivariate distributions see Molenberghs and Lesaffre [13]. Bartolucci, Colombi and Forcina [4] provided a general framework for parameterizing discrete distributions with different kinds of marginal interaction parameters; some of their results are used here. Loosely speaking, an interaction term is a parameter measuring the association among a set of variables, say $I \subseteq \{1, \dots, K\}$, which may be defined within a marginal distribution, say $M$, such that $I \subseteq M$. In the following we restrict attention to bivariate interactions defined within the corresponding bivariate distribution. The bivariate interactions, which are the key association parameters in our model, are called global log-odds ratios and, for any two response variables $Y_i$ and $Y_j$, dichotomized at the cut points $c_i$ and $c_j$ respectively, may be written as

$$\lambda_{\{i,j\}}(c_i, c_j \mid z) = \log\left(\frac{\Pr(Y_i > c_i, Y_j > c_j \mid z)\,\Pr(Y_i \le c_i, Y_j \le c_j \mid z)}{\Pr(Y_i \le c_i, Y_j > c_j \mid z)\,\Pr(Y_i > c_i, Y_j \le c_j \mid z)}\right).$$

The link function and its inverse

Recall that the multinomial distribution, being a member of the exponential family, may also be parameterized by a vector, say $\theta(z)$, of canonical parameters; these are log-linear parameters defined within the overall joint distribution (rather than the corresponding marginals). Though these parameters do not usually have a direct interpretation, they can be easily converted into probabilities and provide a useful tool for defining the link function and its inverse.
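The global log-odds ratios defined above can be computed directly from a bivariate table of joint probabilities. The following is a minimal sketch; the function name, the nested-list representation of the table and the example values are assumptions made for illustration.

```python
import math


def global_log_odds_ratios(table):
    """Global log-odds ratios of a bivariate table.

    table[a][b] = Pr(Y_i = a + 1, Y_j = b + 1 | z).
    Returns a dict keyed by the cut points (c_i, c_j),
    with c_i = 1..m_i - 1 and c_j = 1..m_j - 1.
    """
    rows, cols = len(table), len(table[0])
    out = {}
    for ci in range(1, rows):
        for cj in range(1, cols):
            # Quadrant probabilities after dichotomizing at (c_i, c_j)
            hh = sum(table[a][b] for a in range(ci, rows) for b in range(cj, cols))
            ll = sum(table[a][b] for a in range(ci) for b in range(cj))
            lh = sum(table[a][b] for a in range(ci) for b in range(cj, cols))
            hl = sum(table[a][b] for a in range(ci, rows) for b in range(cj))
            out[(ci, cj)] = math.log(hh * ll / (lh * hl))
    return out


# Hypothetical 3 x 2 table built under independence: all ratios should be 0.
p, q = [0.2, 0.3, 0.5], [0.4, 0.6]
indep = [[pa * qb for qb in q] for pa in p]
lor_indep = global_log_odds_ratios(indep)

# Hypothetical 2 x 2 table with positive association.
lor_pos = global_log_odds_ratios([[0.4, 0.1], [0.1, 0.4]])
```

A table with $m_i \times m_j$ cells yields $(m_i - 1)(m_j - 1)$ ratios, one for each pair of cut points.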
The link function that we propose is such that each univariate distribution $Y_i$ is determined by $m_i - 1$ global logits, each bivariate distribution $(Y_i, Y_j)$ is determined by $(m_i - 1)(m_j - 1)$ global interactions, and all log-linear interactions of order greater than two are constrained to 0. Let $\Pi_D$ denote the subset of the probability simplex satisfying the above restrictions and let $v = \sum_i (m_i - 1) + \sum_{i>j}(m_i - 1)(m_j - 1)$. Bartolucci, Colombi and Forcina [4] provide a simple algorithm for constructing a design matrix $G_D$ such that, when $\theta(z)$ varies in $R^v$,

$$\pi(z) = \frac{\exp[G_D\theta(z)]}{1'\exp[G_D\theta(z)]}$$

varies in $\Pi_D$; the same authors ([4], Appendix) provide an algorithm for constructing a contrast matrix $C$ and a marginalization matrix $M$ such that

$$\lambda(z) = \begin{pmatrix} \lambda^u(z) \\ \lambda^a(z) \end{pmatrix} = C\log[M\pi(z)], \quad \pi(z) \in \Pi_D. \qquad (9)$$

Let $L = \{\lambda(z) : \lambda(z) = C\log[M\pi(z)] \text{ for some } \pi(z) \in \Pi_D\}$ denote the space of compatible marginal interaction parameters. We can now state the main result of this section:

Theorem 1 The mapping defined by (9) from $\Pi_D$ to $L$ is invertible and differentiable and thus defines a proper link function $\pi(z) = g(X\psi)$.
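For a pair of binary responses the map (9) and a numerical inverse can be written out in full. The sketch below assumes NumPy is available; the matrices `G`, `M` and `C` are built by hand for the 2 x 2 case (they are not produced by the algorithm of [4]), the inverse uses a Newton-type update based on the Jacobian $C\,\mathrm{diag}(M\pi)^{-1}M\,\mathrm{diag}(\pi)G$, and the target value of $\lambda$ is hypothetical.

```python
import numpy as np

# 2 x 2 table, cells ordered (1,1), (1,2), (2,1), (2,2): Y_2 runs faster.
# Hand-built log-linear design matrix (assumption for this illustration).
G = np.array([[0, 0, 0],
              [0, 1, 0],
              [1, 0, 0],
              [1, 1, 1]], dtype=float)
# M stacks the two univariate marginals and the joint table.
M = np.array([[1, 1, 0, 0],    # Pr(Y_1 = 1)
              [0, 0, 1, 1],    # Pr(Y_1 = 2)
              [1, 0, 1, 0],    # Pr(Y_2 = 1)
              [0, 1, 0, 1],    # Pr(Y_2 = 2)
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
# C takes contrasts of the log marginal probabilities.
C = np.array([[-1, 1, 0, 0, 0, 0, 0, 0],     # global logit of Y_1
              [0, 0, -1, 1, 0, 0, 0, 0],     # global logit of Y_2
              [0, 0, 0, 0, 1, -1, -1, 1]],   # global log-odds ratio
             dtype=float)


def probs(theta):
    """pi = exp(G theta) / 1'exp(G theta)."""
    e = np.exp(G @ theta)
    return e / e.sum()


def link(pi):
    """lambda = C log(M pi), as in (9)."""
    return C @ np.log(M @ pi)


def invert(lam, tol=1e-10, max_iter=50):
    """Newton-type inversion from lambda back to pi, starting at theta = 0."""
    theta = np.zeros(G.shape[1])
    for _ in range(max_iter):
        pi = probs(theta)
        gap = lam - link(pi)
        if np.linalg.norm(gap) < tol:
            break
        # Jacobian of lambda with respect to theta
        J = C @ np.diag(1.0 / (M @ pi)) @ M @ np.diag(pi) @ G
        theta = theta + np.linalg.solve(J, gap)
    return probs(theta)


target = np.array([0.5, -0.3, 1.0])  # hypothetical (logit_1, logit_2, log-odds ratio)
pi_hat = invert(target)
```

Applying `link` to `pi_hat` returns the target vector up to the tolerance, which illustrates the invertibility stated in Theorem 1 for this small case.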
The theorem is a special case of Theorem 1 in Bartolucci, Colombi and Forcina [4], who consider a more general class of hierarchical parametrizations with $I \subseteq M$. Unfortunately, (9) has no analytical inverse; however, a numerical inverse may be computed by a Newton algorithm, which is extremely fast and reliable.

Lemma 1 The mapping from $\lambda(z)$ to $\theta(z) \in R^v$, or equivalently to $\pi(z) \in \Pi_D$, may be computed by the following algorithm:
1. at the initial step choose a value $\theta^{(0)}$ such that $\lambda^{(0)}$ is sufficiently close to $\lambda(z)$;
2. at the $h$-th step update the vector of canonical parameters by the first-order approximation
$$\theta^{(h)} = \theta^{(h-1)} + D[\lambda(z) - \lambda^{(h-1)}], \quad \text{where } D = \partial\theta/\partial\lambda' = [C\,\mathrm{diag}(M\pi)^{-1}M\,\mathrm{diag}(\pi)G_D]^{-1};$$
3. iterate until the norm of $\lambda(z) - \lambda^{(h-1)}$ is close to 0.

The Lemma is a direct application of the Newton algorithm. Since the mapping from $\theta(z)$ to $\lambda(z)$ has continuous second-order derivatives with respect to $\theta(z)$ whose elements are finite, the result may be derived, for example, from Theorem 4.4 in Süli and Mayers [18].² Forcina and Dardanoni [9] study in detail the nature of the copula defined by the multivariate ordered logit link function and show that there exists a continuous multivariate latent distribution which has exactly the same parameters as its discrete analog.

² In our experience, by setting $\theta^{(0)} = 0_v$, the algorithm always converges as long as $\lambda(z)$ is not too close to the boundary of the parameter space, which may happen, for example, when one or more elements are much larger than 20 in modulus.

2.4 Interpretation of the parameters in terms of stochastic dominance

The main parameters of interest in our model are the univariate global logits and the bivariate global log-odds ratios. To understand their economic significance, in this section we explore their properties in terms of stochastic dominance.

Global logits

It is interesting to note that the global logits satisfy a stochastic ordering property which seems particularly appropriate when the response variables have an ordinal nature, in the sense that their relevant properties are preserved under arbitrary monotonic transformations:

Lemma 2 Given two discrete ordered random variables $Y_h$ and $Y_k$ in $\{1, \dots, m\}$, the following conditions are equivalent:
1. $E[u(Y_h) \mid z] \ge E[u(Y_k) \mid z]$ for any function $u$ which is nondecreasing;
2. $Y_h$ is stochastically greater than $Y_k$ conditionally on $z$;
3. $\log[P(Y_h > j \mid z)/P(Y_h \le j \mid z)] \ge \log[P(Y_k > j \mid z)/P(Y_k \le j \mid z)]$ for all $j < m$.
Proof The equivalence between the first two conditions is well known (see for example Hadar and Russell [10], Theorems 1 and 2). The equivalence with the third condition follows by noting that global logits are strictly increasing transformations of the survival function (equivalently, strictly decreasing transformations of the cumulative distribution function).

If we regress a global logit on a given covariate and the regression coefficient is positive, then the response variable becomes stochastically larger whenever that covariate increases; because of this, regression coefficients in the ordered logit regression have a direct interpretation in terms of stochastic dominance.³

Global log-odds ratios

The log-odds ratios, which determine the association for any pair of responses, are also closely related to the notion of positive quadrant dependence (PQD), an instance of positive dependence between ordinal variables first introduced by Lehmann [11]: two random variables $Y_h$, $Y_k$ taking values in $\{1, \dots, m_h\}$ and $\{1, \dots, m_k\}$ are PQD if

$$\Pr(Y_h \le i, Y_k \le j) \ge \Pr(Y_h \le i)\Pr(Y_k \le j), \quad i \in \{1, \dots, m_h\}, \; j \in \{1, \dots, m_k\},$$

which intuitively means that, compared to the case of independence, small values of $Y_h$ tend to go with small values of $Y_k$. Negative quadrant dependence is defined by reversing the inequality above. The ordinal nature of $Y_h$ and $Y_k$ seems to motivate the requirement that their relevant properties should be preserved under arbitrary monotonic transformations. The following result in the theory of stochastic orderings links the notion of positive dependence, PQD, to the log-odds ratios:

Lemma 3 Given a pair of discrete ordered random variables $Y_h$ and $Y_k$ taking values in $\{1, \dots, m_h\}$ and $\{1, \dots, m_k\}$, and any pair of increasing functions $u$, $v$, the following conditions are equivalent:
1. $\mathrm{Cov}[u(Y_h), v(Y_k) \mid z] \ge 0$;
2. $Y_h$ and $Y_k$ are PQD conditionally on $z$;
3. the log-odds ratios satisfy $\lambda_{\{h,k\}}(c_i, c_j \mid z) \ge 0$ for all $c_i < m_h$, $c_j < m_k$.
Proof See Nelsen [14], exercises 5.22 and 5.27, p. […]

This result may be interpreted as saying that, if the ordered variables are the discrete version of continuous latent variables discretized at arbitrary thresholds, the log-odds ratios are the most appropriate measure of association, in the sense that the sign of the dependence between the underlying variables is preserved, irrespective of how the ordered categories are constructed.

³ Notice instead that the standard interpretation of ordered logit coefficients (see for example Crawford, Pollak and Vella, 1998), which refers to the density rather than the cumulative distribution of the response variable, often implies a rather convoluted interpretation.

3 Statistical inference

3.1 Hypotheses of interest

A convenient feature of the multivariate ordered regression model defined by equation (8) is that all the relevant hypotheses of interest can be expressed in the form of linear equality and inequality
constraints on the vector of model parameters $\psi = (\alpha' \;\; \beta')'$. A relevant set of testable restrictions consists in assuming that the bivariate association parameters do not depend on the cut points, an assumption which is the multivariate analog of the Plackett distribution. The family of bivariate Plackett distributions, introduced by Plackett [15], has been extended to the multivariate case by Molenberghs and Lesaffre [13]. Forcina and Dardanoni [9] discuss the multivariate ordered regression model under the Plackett distribution; that model is a close analog to the multivariate ordered probit model, with the correlation coefficients replaced by the corresponding bivariate log-odds ratios. The advantage of our modelling strategy is that these assumptions are imposed by means of testable restrictions.

Linear inequality constraints may be used to test a stochastic dominance effect of certain covariates on a set of latent regressions. We may also be interested in testing positive dependence between a pair of responses against conditional independence, by imposing that all the $(m_h - 1)(m_k - 1)$ log-odds ratios are positive against being zero. Generally speaking, any set of hypotheses of interest may be expressed in the form

$$H : \{\psi : E\psi = 0, \; U\psi \ge 0\}$$

by an appropriate choice of the equality and inequality matrices $E$ and $U$. Clearly, the cases with $E$ or $U$ equal to the null matrix correspond to restrictions with only inequalities or only equalities, respectively.

3.2 Likelihood inference

Suppose we have independent observations $(Y_{1i}, \dots, Y_{Ki}, z_i)$ for a sample of $n$ units. Let $t(z_i)$ be a vector of size $\prod_k m_k$ made of 0's except for the element corresponding to the observed combination of $(Y_1, \dots, Y_K)$ for the $i$th unit, which is equal to 1. To simplify notation, in the sequel we write $t(i)$ instead of $t(z_i)$; a similar convention will be adopted for any vector which depends on $z_i$.
Under independent sampling, conditionally on $z_i$, $t(i)$ has a multinomial distribution with vector of probabilities $\pi(i)$. In order to manipulate the likelihood function more easily, we write the multinomial as an exponential family with the vector of canonical parameters introduced before Lemma 2; in practice, these are all the log-linear interactions which belong to the hierarchical set of marginals $D$, so that $\lambda(i)$ has the same dimension as $\theta(i)$. The contribution of the $i$th unit to the log-likelihood is $L(i) = t(i)'\log[\pi(i)]$ which, by expressing $\pi(i)$ in terms of the canonical parameters, may be written as

$$L(i) = t(i)'G_D\theta(i) - \log[1'\exp(G_D\theta(i))].$$

If we define

$$D(i) = \frac{\partial\theta(i)}{\partial\lambda(i)'} = \left[\frac{\partial\lambda(i)}{\partial\theta(i)'}\right]^{-1} = \big[C\,\mathrm{diag}[M\pi(i)]^{-1}M\,\mathrm{diag}[\pi(i)]\,G_D\big]^{-1},$$

the individual score vector is easily computed by differentiating $L(i)$ with respect to the vector of model parameters $\psi$ by the chain rule:

$$s(i) = \frac{\partial L(i)}{\partial\psi} = \frac{\partial\lambda(i)'}{\partial\psi}\,\frac{\partial\theta(i)'}{\partial\lambda(i)}\,\frac{\partial L(i)}{\partial\theta(i)} = X(i)'D(i)'G_D'[t(i) - \pi(i)].$$
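The innermost factor of the score, $\partial L(i)/\partial\theta(i) = G_D'[t(i) - \pi(i)]$, can be verified by finite differences. The sketch below assumes NumPy is available and uses a hand-built design matrix for a 2 x 2 table; the matrix `G`, the value of `theta` and the observed cell are hypothetical, and the remaining chain-rule factors $X(i)'D(i)'$ are left out.

```python
import numpy as np

# Hand-built log-linear design matrix for a 2 x 2 table (illustrative assumption).
G = np.array([[0, 0, 0],
              [0, 1, 0],
              [1, 0, 0],
              [1, 1, 1]], dtype=float)


def loglik(theta, t):
    """L(i) = t'G theta - log[1'exp(G theta)]."""
    eta = G @ theta
    return t @ eta - np.log(np.exp(eta).sum())


def score_theta(theta, t):
    """dL(i)/dtheta = G'(t - pi)."""
    e = np.exp(G @ theta)
    pi = e / e.sum()
    return G.T @ (t - pi)


theta = np.array([0.2, -0.4, 0.7])     # hypothetical canonical parameters
t = np.array([0.0, 0.0, 1.0, 0.0])     # unit observed in cell (2, 1)

analytic = score_theta(theta, t)

# Central finite differences of the log-likelihood, one coordinate at a time.
eps = 1e-6
numeric = np.array([
    (loglik(theta + eps * e, t) - loglik(theta - eps * e, t)) / (2 * eps)
    for e in np.eye(3)
])
```

Agreement between `analytic` and `numeric` confirms the exponential-family form of the log-likelihood contribution for this small case.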
The individual contribution to the expected information matrix also has a simple form because $E\{G_D'[t(i) - \pi(i)]\} = 0$:

$$F(i) = -E\left[\frac{\partial^2 L(i)}{\partial\psi\,\partial\psi'}\right] = X(i)'D(i)'G_D'\,\Omega(i)\,G_D D(i)X(i),$$

where $\Omega(i) = \mathrm{diag}[\pi(i)] - \pi(i)\pi(i)'$ is the kernel of the variance matrix of the multinomial distribution. Having assumed that the units are independent, the log-likelihood is $L(\psi) = \sum_i L(i)$; thus the score vector $s(\psi) = \partial L(\psi)/\partial\psi$ and the expected information matrix $F(\psi) = -E(\partial^2 L(\psi)/\partial\psi\,\partial\psi')$ can be obtained by summing over units. Dardanoni, Fiorini and Forcina (2012), for the case of two ordinal responses, discuss an approach to likelihood inference similar to the one proposed here.

3.3 Parameter estimation

Maximum likelihood estimates of $\psi$ under $H$ can be obtained by an algorithm which extends to inequality constraints the seminal algorithm introduced by Aitchison and Silvey [2]. The Aitchison and Silvey algorithm (see for instance Colombi and Forcina [7]) is based on iterated linear approximations of the regression model onto the space of the canonical parameters, which are variation independent; the approximation is updated until convergence. Formally:
- assign a starting value $\psi^{(0)}$ which produces compatible $\lambda(i)$ for all units;
- at the $h$th step, compute a linear approximation of $\theta(i)$ and a quadratic approximation of the log-likelihood,
$$\theta(i)^{(h)} = \theta(i)^{(h-1)} + D(i)^{(h-1)}[X(i)\psi^{(h)} - \lambda(i)^{(h-1)}],$$
$$Q_h(\psi) = -(\psi - b^{(h)})'F^{(h-1)}(\psi - b^{(h)})/2,$$
where $b^{(h)} = [F^{(h-1)}]^{-1}[s^{(h-1)} + s_1^{(h-1)}]$ and $s_1^{(h-1)} = \sum_i X(i)'D(i)^{(h-1)\prime}G_D'\,\Omega(i)^{(h-1)}G_D D(i)^{(h-1)}\lambda(i)^{(h-1)}$;
- set $\psi^{(h)}$ equal to the constrained maximum of $Q_h(\psi)$ under $H$;
- iterate until convergence, that is, until the estimate of $\lambda$ is sufficiently stable and the linear approximation of $\theta$ is sufficiently accurate.

The starting point must be chosen so that the corresponding $\lambda(i)^{(0)}$ is compatible for all subjects. This may be achieved by setting to zero the intercepts of the association parameters and all the regression coefficients corresponding to the covariates.
Notice that, when inequality constraints are present, so that $U$ is not the null matrix, the maximization of $Q_h(\psi)$ at each step requires a quadratic optimization which is itself iterative; there are many algorithms for quadratic optimization under inequality constraints, which are usually very fast and reliable. Since the likelihood function and the transformation from $\theta(i)$ to $\lambda(i)$ satisfy the conditions discussed by Aitchison and Silvey ([2], p. 817), it follows that, as $n$ increases, the probability that a constrained maximum exists tends to one. If the algorithm converges, it must converge to a local maximum by the argument of Aitchison and Silvey ([2], p. 826). Notice that our parameterization satisfies the two basic assumptions given by Rao ([16], p. 296), namely identifiability and continuity of the transformation from $\psi$ to $\pi$. It follows that, provided that
$\psi_0$, the true value of $\psi$ under $H$, is an interior point of the parameter space, the m.l.e. of $\psi$ under $H$ exists, is consistent and has an asymptotic normal distribution.

4 An application to the Positive Correlation Test of asymmetric information in insurance markets

There is a constantly growing body of empirical literature studying the existence of asymmetric information in insurance markets (for a review see Cohen and Siegelman [6] and Einav et al. [8]). Standard economic theory predicts that risk occurrence and insurance coverage are positively correlated, since individuals who know they are riskier tend to buy more coverage (adverse selection) or to consume more for a given structure of the contract (moral hazard). The theoretically predicted positive correlation has inspired the seminal Positive Correlation (PC) test by Chiappori and Salanié [5]. The PC test rejects the null of absence of asymmetric information in a given insurance market when, conditional on the consumers' characteristics used by insurance companies to price contracts, individuals with more coverage experience more of the insured risk. In their seminal paper Chiappori and Salanié [5] provide simple empirical strategies to test the positive correlation hypothesis when both insurance coverage and risk occurrence are binary variables. The PC test has been applied to hundreds of insurance markets, including acute health, long-term care, automobile, annuities, life, reverse mortgages and crop. In most applications, its implementation relies on a simple bivariate probit model, where the null of the absence of private information is tested by the absence of correlation between the residual errors.

In this application, we explain how our approach can be used to implement the PC test empirically when insurance coverage and risk are ordered categorical variables. We focus on the Medigap health insurance market in the US.
Medigap is a private health insurance designed to cover some gaps in the coverage left by Medicare, the public health insurance program that provides coverage for all individuals aged 65 and over in the US. In general, the structure of Medicare is such that it leaves beneficiaries at risk of large out-of-pocket expenses. As a result, the elderly may purchase voluntary supplemental private policies, such as Medigap, to fill Medicare's gaps in non-covered health care services and to limit cost sharing. The Medigap insurance market is highly regulated by Federal law, which designed a particular mechanism favoring the insured. In particular, insurance companies must offer a basic plan if they offer any other more generous plan; in addition, there is an enrolment period during which insurance companies cannot refuse any applicant even if there are pre-existing conditions. Finally, federal regulation allows insurance companies to set premiums by the individual's age and gender.

To study the Medigap health insurance market we use data from the Health and Retirement Study conducted during the year […].⁴ Since we focus specifically on Medigap, we exclude individuals younger than 65 and those who received additional coverage through a former employer or a spouse, or who are covered by some other government agency. We then consider only those who deliberately bought additional coverage. The final sample size is then given by $N = 3290$ observations.

⁴ We do not use the last few available waves because there is no specific information on the Medigap plan's letter.

Since the Medigap plans differ in how generous the coverage is, we define the Medigap insurance coverage indicator (Plan), which equals 0 if the individual has no coverage, 1 if she is covered by Medigap plan A or B, and 2 if
12 she is covered by any other more generous Medigap plan. Risk occurrence is measured by the number of doctor visits and hospital admissions in the previous two years. Since only a very small number of elders had no doctor visits, we constructed the variable Doc, which takes 0 if individual had less than five doctor visits, 1 if she had between five and ten, and 2 if she visited a doctor more than ten times. Since a very small number of elders had more than two hospital admissions, we defined the variable Hosp equals 0 if respondent had no hospital admission, 1 if she had one hospital admission and 2 if she had at least two hospital inpatient staying. Finally, given that Federal Law allows insurance companies to set premium according to the insured s age and gender, we use as control only whether the individual is a female (Fem) and 26 age dummy variables ranging from 65 to 90 years old or more. 4.1 Empirical strategy and results Let i = 1,..., N denote individual, z i, denote the vector collecting age and gender of individual i, and P lan i, Doc i, and Hosp i denote insurance coverage, and doctor and hospital use of individual i. We can rewrite model (6) as P lan = α p + z β p + ɛ p Doc = α d + z β d + ɛ d (10) Hosp = α h + z β h + ɛ h where α and β are vectors of unknown parameters, and the errors ɛ k, k = p, d, h have standard logistic marginal distribution but unspecified association structure (copula). Within this structure, the null of the absence of asymmetric information amounts to testing independence of ɛ p, ɛ d and ɛ p, ɛ h. If we assume that ɛ p, ɛ d, ɛ h in model (10) are jointly distributed as multivariate normal with standard marginal distributions we obtain the multivariate ordered probit which can be estimated by simulated Maximum Likelihood using for example the Stata CMP module (Roodman [17]). Estimated coefficients are reported in table 2, and correlation terms are reported in Panel A of table 4. 
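
The coding of the three ordered variables Plan, Doc and Hosp described in Section 4 can be sketched as follows. This is only an illustrative sketch: the function names and the raw inputs (a Medigap plan letter, or `None` for no coverage, and counts of doctor visits and hospital admissions over the previous two years) are assumptions made for the example, not the actual variables of the Health and Retirement Study.

```python
from typing import Optional


def code_plan(plan_letter: Optional[str]) -> int:
    """Plan: 0 = no coverage, 1 = Medigap plan A or B, 2 = any other plan."""
    if plan_letter is None:
        return 0
    return 1 if plan_letter.strip().upper() in {"A", "B"} else 2


def code_doc(doctor_visits: int) -> int:
    """Doc: 0 = fewer than five visits, 1 = five to ten, 2 = more than ten."""
    if doctor_visits < 5:
        return 0
    return 1 if doctor_visits <= 10 else 2


def code_hosp(hospital_admissions: int) -> int:
    """Hosp: 0 = no admission, 1 = one admission, 2 = at least two."""
    if hospital_admissions < 0:
        raise ValueError("the number of admissions cannot be negative")
    return min(hospital_admissions, 2)


# Hypothetical respondent: plan F, seven doctor visits, no hospital admission.
print(code_plan("F"), code_doc(7), code_hosp(0))  # prints: 2 1 0
```

Each variable then takes values in {0, 1, 2}, so every pair of responses forms a 3 × 3 table.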

The correlation between Plan and Doc strongly rejects the null of no asymmetric information; on the other hand, the null of no asymmetric information between Plan and Hosp has a p-value of [...]. As a comparison, Table 3 and Panel B of Table 1 report the estimated coefficients of our multivariate logit model with the Plackett restriction, which has the same complexity (same number of parameters) as the multivariate probit, since it restricts each bivariate association to a single parameter. A glance at the tables reveals that the two models (probit and Plackett) give a very similar qualitative picture. Allowing for the different scale, the covariate parameters follow very similar patterns and, more importantly, the association coefficients (correlation coefficients for the probit and log-odds ratios for the logit) have very similar z-ratios and significance. Thus, the main advantage of our model is that it does not require simulation methods for estimation, and tends to be more accurate and faster.⁵

⁵ On a 2.4 GHz P8600 processor, estimating the logit model took about 178 seconds, which is about half the time (358 seconds) taken by the CMP module with the default number of draws set by the program (115).

Both the multivariate probit and the multivariate logit with Plackett restrictions, however, suffer from the limitation of imposing a restrictive structure on the bivariate associations of interest. To relax the Plackett assumption, we allow $\lambda_{\{p,d\}}$, $\lambda_{\{p,h\}}$ and $\lambda_{\{d,h\}}$ to vary across the categories of Plan, Doc and Hosp. Since these variables have three categories, this implies estimating $3 \times 4 = 12$ association parameters rather than three with the Plackett restriction. Estimated parameters for this model are reported in Table 4 and in Panel C of Table 1. A glance at Panel C of Table 1 reveals that, while the 4 association coefficients for the two health care variables Doc and Hosp are similar across categories, they differ in the case of the coverage/risk associations Plan-Doc and Plan-Hosp. A formal test of the Plackett assumption

$$\lambda_{\{p,d\}}(1,1) = \lambda_{\{p,d\}}(1,2) = \lambda_{\{p,d\}}(2,1) = \lambda_{\{p,d\}}(2,2)$$

and

$$\lambda_{\{p,h\}}(1,1) = \lambda_{\{p,h\}}(1,2) = \lambda_{\{p,h\}}(2,1) = \lambda_{\{p,h\}}(2,2)$$

has LR test statistic equal to [...], which is asymptotically distributed as a $\chi^2$ with $12 - 3 = 9$ dof. Therefore the null is overwhelmingly rejected, with a p-value equal to [...].⁶

Table 1: Estimated correlation terms
Plan-Doc Plan-Hosp Doc-Hosp
Coef. S.E. Coef. S.E. Coef. S.E.
Panel A
ρ (0.0271) (0.0307)
Panel B
λ (0.0783) (0.0911) (0.0725)
Panel C
λ(1, 1) (0.0903) (0.0936) (0.0872)
λ(1, 2) (0.0951) (0.139) (0.137)
λ(2, 1) (0.106) (0.107) (0.0850)
λ(2, 2) (0.110) (0.164) (0.116)
Notes: For all equations control variables are age and gender dummies. Omitted age category is 65 years old.

Estimated parameters reveal that the coverage/risk correlation is not homogeneous across coverage and risk categories. In particular, the association between coverage and risk, for both doctor visits and hospital stays, is significantly positive only for moderate levels of health care use. In other words, conditional on age and gender, the null of no asymmetric information cannot be rejected if actual risk is defined as heavy use of health resources. Our results show that allowing the association to vary across categories may provide a clearer picture of the effects of underlying individual heterogeneity. Recall, however, that finding residual coverage/risk correlation does not necessarily help to understand whether this is due to the structure of the contract (moral hazard) or rather to the existence of individual risk that is unpriced by the insurer (adverse selection).

⁶ For robustness we have also computed the LR test relaxing the equality assumptions separately. The LR test statistics for the null of equal λs for Plan-Doc and for Plan-Hosp are equal to [...] and 9.47, and both nulls are rejected, with p-values of [...] and [...] respectively. By contrast, the LR test statistic for the null of equal λs for Doc-Hosp is equal to [...], and this null is not rejected, with a p-value of [...].

References

[1] Agresti, A. (2002). Categorical data analysis. Wiley-Interscience.
[2] Aitchison, J. and Silvey, S. D. (1958). Maximum-likelihood estimation of parameters subject to restraints. The Annals of Mathematical Statistics 29.
[3] Amemiya, T. (1985). Advanced econometrics. Harvard University Press.
[4] Bartolucci, F., Colombi, R. and Forcina, A. (2007). An extended class of marginal link functions for modelling contingency tables by equality and inequality constraints. Statistica Sinica 17: 691.
[5] Chiappori, P.-A. and Salanié, B. (2000). Testing for asymmetric information in insurance markets. The Journal of Political Economy 108.
[6] Cohen, A. and Siegelman, P. (2010). Testing for adverse selection in insurance markets. Journal of Risk and Insurance 77.
[7] Colombi, R. and Forcina, A. (2001). Marginal regression models for the analysis of positive association of ordinal response variables. Biometrika 88.
[8] Einav, L., Finkelstein, A. and Levin, J. (2010). Beyond testing: Empirical models of insurance markets. Annual Review of Economics 2.
[9] Forcina, A. and Dardanoni, V. (2008). Regression models for multivariate ordered responses via the Plackett distribution. Journal of Multivariate Analysis 99.
[10] Hadar, J. and Russell, W. R. (1969). Rules for ordering uncertain prospects. The American Economic Review 59.
[11] Lehmann, E. L. (1966). Some concepts of dependence. The Annals of Mathematical Statistics 37.
[12] McCullagh, P. and Nelder, J. (1989). Generalized linear models (Monographs on statistics and applied probability 37). Chapman & Hall, London.
[13] Molenberghs, G. and Lesaffre, E. (1994). Marginal modeling of correlated ordinal data using a multivariate Plackett distribution. Journal of the American Statistical Association 89.
[14] Nelsen, R. (2006). An introduction to copulas. Springer Verlag.
[15] Plackett, R. L. (1965). A class of bivariate distributions. Journal of the American Statistical Association 60.
[16] Rao, C. (1973). Linear statistical inference and its applications. Wiley (New York).
[17] Roodman, D. (2011). Fitting fully observed recursive mixed-process models with cmp. Stata Journal 11: (48).
[18] Süli, E. and Mayers, D. (2003). An introduction to numerical analysis. Cambridge University Press.
[19] Wooldridge, J. (2002). Econometric analysis of cross section and panel data. The MIT Press.

5 Appendix

Table 2: Estimated α and β parameters for the multivariate probit model

        Plan          Doc           Hosp
        Coef. S.E.    Coef. S.E.    Coef. S.E.
/cut    (0.0972)      (0.0832)      (0.0958)
/cut    (0.0978)      (0.0843)      (0.0977)
aged    (0.141)       (0.115)       (0.134)
aged    (0.129)       (0.111)       (0.131)
aged    (0.130)       (0.112)       (0.131)
aged    (0.138)       (0.115)       (0.131)
aged    (0.134)       (0.113)       (0.134)
aged    (0.135)       (0.115)       (0.131)
aged    (0.137)       (0.117)       (0.137)
aged    (0.145)       (0.123)       (0.137)
aged    (0.142)       (0.120)       (0.141)
aged    (0.160)       (0.128)       (0.144)
aged    (0.159)       (0.131)       (0.147)
aged    (0.155)       (0.129)       (0.148)
aged    (0.149)       (0.127)       (0.143)
aged    (0.165)       (0.138)       (0.157)
aged    (0.172)       (0.132)       (0.146)
aged    (0.184)       (0.137)       (0.156)
aged    (0.178)       (0.151)       (0.165)
aged    (0.202)       (0.148)       (0.163)
aged    (0.196)       (0.154)       (0.166)
aged    (0.183)       (0.158)       (0.169)
aged    (0.197)       (0.166)       (0.186)
aged    (0.214)       (0.183)       (0.188)
aged    (0.226)       (0.190)       (0.213)
aged    (0.382)       (0.241)       (0.285)
aged    (0.156)       (0.123)       (0.134)
fem     (0.0492)      (0.0404)      (0.0455)
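
To make the role of the two cut points listed in Table 2 concrete, the sketch below turns a linear index and a pair of thresholds into the three category probabilities of a single ordered response. It assumes the common ordered-response convention $P(Y \le j \mid z) = F(c_j - z'\beta)$ for increasing cut points $c_j$, which may differ in sign or parameterization from the one used for estimation; the function names, the cut points and the index value are hypothetical and are not the estimates of Table 2. Passing the logistic CDF instead of the normal CDF gives the analogous marginal probabilities under logistic errors.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x: float) -> float:
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(index, cuts, cdf=normal_cdf):
    """Return [P(Y=0), ..., P(Y=J)] given a linear index and J increasing cut points."""
    cuts = list(cuts)
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut points must be strictly increasing")
    # Cumulative probabilities P(Y <= j), closed with P(Y <= J) = 1.
    cumulative = [cdf(c - index) for c in cuts] + [1.0]
    # Category probabilities are successive differences of the cumulative ones.
    return [cumulative[0]] + [
        cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
    ]


# Hypothetical values, chosen only for illustration.
probit_probs = category_probs(index=0.3, cuts=(-0.5, 0.8))
logit_probs = category_probs(index=0.3, cuts=(-0.5, 0.8), cdf=logistic_cdf)
print(probit_probs, logit_probs)
```

Under this convention a larger index moves probability mass toward the highest category, which is how a positive coefficient would be read.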

Table 3: Estimated α and β parameters for the Plackett model

        Plan          Doc           Hosp
        Coef. S.E.    Coef. S.E.    Coef. S.E.
/cut    (0.0434)      (0.0354)      (0.0393)
/cut    (0.0505)      (0.0396)      (0.0571)
aged    (0.251)       (0.188)       (0.238)
aged    (0.220)       (0.181)       (0.230)
aged    (0.223)       (0.184)       (0.228)
aged    (0.241)       (0.188)       (0.227)
aged    (0.233)       (0.186)       (0.238)
aged    (0.230)       (0.189)       (0.227)
aged    (0.234)       (0.192)       (0.238)
aged    (0.250)       (0.200)       (0.233)
aged    (0.246)       (0.198)       (0.245)
aged    (0.284)       (0.212)       (0.247)
aged    (0.278)       (0.215)       (0.251)
aged    (0.272)       (0.212)       (0.257)
aged    (0.254)       (0.209)       (0.247)
aged    (0.287)       (0.226)       (0.272)
aged    (0.315)       (0.215)       (0.249)
aged    (0.343)       (0.226)       (0.268)
aged    (0.306)       (0.245)       (0.276)
aged    (0.381)       (0.240)       (0.273)
aged    (0.356)       (0.253)       (0.279)
aged    (0.313)       (0.259)       (0.284)
aged    (0.348)       (0.274)       (0.318)
aged    (0.370)       (0.297)       (0.313)
aged    (0.397)       (0.312)       (0.364)
aged    (0.773)       (0.385)       (0.526)
aged    (0.281)       (0.203)       (0.228)
fem     (0.0861)      (0.0659)      (0.0779)
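
The single association parameter per pair of responses in the Plackett model, and the four parameters λ(1, 1), ..., λ(2, 2) per pair that replace it when the restriction is relaxed, can be illustrated on a 3 × 3 table of joint probabilities. The sketch assumes that λ(i, j) is the global log-odds ratio obtained by dichotomizing both variables at cut points i and j, which is the association measure that the Plackett distribution holds constant across cut points; the function name and the example table are hypothetical and unrelated to the Medigap data.

```python
import math


def global_log_odds_ratios(table):
    """Map each cut-point pair (i, j) to the log of the global odds ratio.

    table[r][c] is the joint probability (or count) of row category r and
    column category c. Cut point i splits rows into {0..i-1} and {i..},
    and cut point j does the same for columns. All four quadrant totals
    must be strictly positive.
    """
    n_rows, n_cols = len(table), len(table[0])
    result = {}
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            low_low = sum(table[r][c] for r in range(i) for c in range(j))
            low_high = sum(table[r][c] for r in range(i) for c in range(j, n_cols))
            high_low = sum(table[r][c] for r in range(i, n_rows) for c in range(j))
            high_high = sum(
                table[r][c] for r in range(i, n_rows) for c in range(j, n_cols)
            )
            result[(i, j)] = math.log(
                (low_low * high_high) / (low_high * high_low)
            )
    return result


# Hypothetical joint distribution with more mass on the main diagonal.
example = [
    [0.20, 0.08, 0.02],
    [0.08, 0.24, 0.08],
    [0.02, 0.08, 0.20],
]
for cut, value in sorted(global_log_odds_ratios(example).items()):
    print(cut, round(value, 3))
```

Under independence all four values are zero; the Plackett restriction corresponds to the four values being equal, and the relaxed model lets them differ.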

Table 4: Estimated α and β parameters

        Plan          Doc           Hosp
        Coef. S.E.    Coef. S.E.    Coef. S.E.
/cut    (0.0434)      (0.0355)      (0.0394)
/cut    (0.0505)      (0.0397)      (0.0572)
aged    (0.251)       (0.188)       (0.238)
aged    (0.219)       (0.181)       (0.231)
aged    (0.223)       (0.183)       (0.229)
aged    (0.240)       (0.188)       (0.228)
aged    (0.233)       (0.186)       (0.238)
aged    (0.229)       (0.189)       (0.228)
aged    (0.234)       (0.192)       (0.239)
aged    (0.248)       (0.200)       (0.233)
aged    (0.246)       (0.197)       (0.245)
aged    (0.286)       (0.211)       (0.247)
aged    (0.279)       (0.215)       (0.251)
aged    (0.272)       (0.212)       (0.257)
aged    (0.254)       (0.209)       (0.247)
aged    (0.288)       (0.226)       (0.271)
aged    (0.316)       (0.215)       (0.249)
aged    (0.345)       (0.226)       (0.268)
aged    (0.304)       (0.245)       (0.276)
aged    (0.378)       (0.239)       (0.273)
aged    (0.354)       (0.253)       (0.279)
aged    (0.311)       (0.258)       (0.284)
aged    (0.351)       (0.273)       (0.316)
aged    (0.371)       (0.296)       (0.312)
aged    (0.395)       (0.312)       (0.363)
aged    (0.750)       (0.384)       (0.522)
aged    (0.283)       (0.202)       (0.228)
fem     (0.0858)      (0.0659)      (0.0779)
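
The likelihood-ratio comparison between the Plackett model (Table 3) and the model with category-specific association parameters (Table 4) can be sketched as below. The two log-likelihood values are hypothetical placeholders, since the maximized log-likelihoods are not reported here; only the degrees of freedom, 12 − 3 = 9, come from Section 4.1. The chi-square tail probability is computed with the standard library through the usual recurrence for the regularized upper incomplete gamma function, so that no external package is assumed.

```python
import math


def chi2_sf(x: float, dof: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square with integer dof."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if x <= 0.0:
        return 1.0
    h = x / 2.0
    # Start from Q(1, h) = exp(-h) for even dof, or Q(1/2, h) = erfc(sqrt(h))
    # for odd dof, then apply Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    # until a reaches dof / 2.
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < dof / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def lr_test(loglik_restricted: float, loglik_unrestricted: float, dof: int):
    """Return the LR statistic 2 * (l_unrestricted - l_restricted) and its p-value."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2_sf(stat, dof)


# Hypothetical log-likelihoods; dof = 12 - 3 = 9 as in Section 4.1.
stat, p_value = lr_test(loglik_restricted=-9010.0, loglik_unrestricted=-8990.0, dof=9)
print(stat, p_value)
```

A small p-value leads to rejecting the restricted (Plackett) model in favor of the model with category-specific associations.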
More informationSensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS
Sensitivity Analysis 3 We have already been introduced to sensitivity analysis in Chapter via the geometry of a simple example. We saw that the values of the decision variables and those of the slack and
More informationGeneral Sampling Methods
General Sampling Methods Reference: Glasserman, 2.2 and 2.3 Claudio Pacati academic year 2016 17 1 Inverse Transform Method Assume U U(0, 1) and let F be the cumulative distribution function of a distribution
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationMarkov random fields and Gibbs measures
Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationHow To Test Granger Causality Between Time Series
A general statistical framework for assessing Granger causality The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published
More informationState Space Time Series Analysis
State Space Time Series Analysis p. 1 State Space Time Series Analysis Siem Jan Koopman http://staff.feweb.vu.nl/koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2011 State
More informationCHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
More informationCHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationNAG C Library Chapter Introduction. g08 Nonparametric Statistics
g08 Nonparametric Statistics Introduction g08 NAG C Library Chapter Introduction g08 Nonparametric Statistics Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric
More informationLinear Models for Continuous Data
Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear
More informationMATH 590: Meshfree Methods
MATH 590: Meshfree Methods Chapter 7: Conditionally Positive Definite Functions Greg Fasshauer Department of Applied Mathematics Illinois Institute of Technology Fall 2010 fasshauer@iit.edu MATH 590 Chapter
More informationPermutation Tests for Comparing Two Populations
Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a
More informationBayesX - Software for Bayesian Inference in Structured Additive Regression
BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich
More information