Quantitative Marketing and Economics, 3, 71–93, 2005. © 2005 Springer Science + Business Media, Inc. Printed in the United States.

Estimating Discrete Joint Probability Distributions for Demographic Characteristics at the Store Level Given Store Level Marginal Distributions and a City-Wide Joint Distribution

CHARLES J. ROMEO
Economist, Antitrust Division, U.S. Department of Justice, Washington, D.C., 20530
E-mail: charles.romeo@usdoj.gov

Abstract. This paper provides a solution to the problem of estimating a joint distribution using the associated marginal distributions and a related joint distribution. The particular application we have in mind is estimating joint distributions of demographic characteristics corresponding to market areas for individual retail stores. Marginal distributions are generally available at the census tract level, but joint distributions are only available for Metropolitan Statistical Areas, which are generally much larger than the market for a single retail store. Joint distributions over demographics are an important input into mixed logit demand models for aggregate data. Market shares that vary systematically with demographics are essential for relieving the restrictions imposed by the Independence from Irrelevant Alternatives property of the logit model. We approach this problem by formulating a parametric function that incorporates both the city-wide joint distributional information and marginal information specific to the retail store's market area. To estimate the function, we form moment conditions equating the moments of the parametric function to observed data, and we input these into a GMM objective. In one of our illustrations we use four marginal demographic distributions from each of eight stores in Dominick's Finer Foods data archive to estimate a four-dimensional joint distribution for each store.
Our results show that our GMM approach produces estimated joint distributions that differ substantially from the product of marginal distributions and yield marginals that closely match the observed marginal distributions. Mixed logit demand estimates are also presented which show the estimates to be sensitive to the formulation of the demographics distribution.

Key words: mixed logit, discrete joint probability distributions, generalized method of moments

JEL Classifications: C51, C81

1. Introduction

The advantage of the mixed logit model for aggregate data pioneered by Berry (1994) and Berry, Levinsohn, and Pakes (1995) (henceforth BLP) is that it allows one to solve for the primitives of a flexible differentiated products model using only aggregate data on prices, quantities sold, and product characteristics. Heterogeneity is introduced by interacting randomly generated consumer tastes with the characteristics of products in a logit demand function. The BLP paper has been followed by a steady stream of papers in the

The views expressed are not purported to reflect those of the United States Department of Justice.

economics and marketing literatures, as it offers the possibility of flexible inference with readily available data. 1 However, the flexibility engendered by mixing the logit model does not come about magically. Generating elasticities relieved of the restrictions imposed by the Independence from Irrelevant Alternatives (IIA) property of the logit model generally requires information beyond just aggregate data on product characteristics. BLP recognized this in their seminal paper by introducing demographic data on income in addition to normal random variates to represent individual types. Extensions produced since BLP have pushed in the direction of incorporating additional demographic information into the model. Nevo (2001) and Davis (1998) introduced draws from a joint distribution of demographic information into the second stage of the demand hierarchy. Berry, Levinsohn, and Pakes (2003) and Petrin (2002) incorporated moment conditions based on consumer survey data into the GMM objective to improve the fit of certain aspects of the model. Dube (2002) and Hendel (1999) present multiple discrete choice models that completely mix micro with aggregate data, while Chintagunta and Dube (2003) present a BLP-type model in which they integrate household level purchase data with store level market share data to improve the estimates of both the mean response and the heterogeneity distribution over what could be obtained with a single data source. A difficulty that one sometimes faces with these models is in obtaining a joint distribution of demographics that matches the contours of the market for the products under study. The particular application we have in mind for this paper is the Dominick's store level data available on the University of Chicago, Graduate School of Business web site. 2 This data archive contains as many as 400 weeks of store level observations on a myriad of supermarket products.
In addition, a file of store level demographic distributions is available that provides a snapshot of the characteristics of the households and the local economy for each of the 89 Dominick's stores. However, all the distributions in the demographic file are marginals, and this limits their usefulness for mixing with BLP class models. One is either limited to drawing a single demographic characteristic, as BLP did in their original work, or to forming store level joint distributions as a product of marginals and hoping that the difference between joint distributions approximated in this manner and the true store level joint distributions is empirically unimportant. 3 This limitation is not specific to the Dominick's data. Marginal distributional information is available for numerous demographic variables at the census tract level, while joint distributions can only be formed for a few variables. Consequently, the potential is there for researchers to face this limitation whenever the focus is on modeling demand at the retail outlet level and the market area for the outlet's goods is a subset of the census tracts in the Metropolitan Statistical Area (MSA). 4

1 Kadiyali, Sudhir, and Rao (2001) provide a survey.
3 Meza and Sudhir (2003) appear to take this approach.
4 At the MSA or Primary MSA (PMSA) level, joint distributional information is readily available for a wide variety of variables from the Current Population Survey web site (

The innovation offered by this paper is to develop a Generalized Method of Moments (GMM) approach for consistently estimating discrete store level joint distributions by combining discrete store level marginal distributions with information from a discrete joint distribution for the same set of variables from the associated MSA. The essence of our approach is to use the available store level and MSA level information to form an initial estimate of the joint distribution of interest that contains all of the elements of variation of the true store level distribution. For example, suppose we are interested in the joint distribution of income and race for an individual store. To form an initial estimate of this joint distribution, we could specify a parametric function that incorporates the MSA level joint distribution over income and race, to capture joint variation in these two variables that is not specific to the individual retail outlet, and the store level marginal distributions for income and race, to capture information specific to the individual store. This function varies over both income and race, and is specific to each individual retail outlet. Moment conditions are then formed that equate moments formulated using this parametric function to observed moments. Previous researchers have faced this problem in other contexts, and two previous solutions have been offered. To our knowledge the first solution is attributed to Deming and Stephan (1940). Their method, Iterative Proportional Fitting (IPF), was used to estimate internal cells of a two-way contingency table for the total census population. The inputs they used were a two-way contingency table for the same variables generated from a five percent census sample and marginal frequencies for the total census population. The approach is iterative in that row frequencies are matched first, then column frequencies.
Matching column frequencies alters the row frequencies and vice versa, so each is then updated in second and subsequent iterations until convergence is achieved. The objective function underlying IPF is a constrained minimum chi-square. This is an intrinsically statistical objective, but, to our knowledge, the statistical properties of the IPF estimator have never been developed. 5 We do not develop them here as that is outside the scope of this research. More recently, Putler, Kalyanam, and Hodges (1996) have offered a Bayesian solution. Their interest is in estimating joint distributions over demographics to improve the targeting of marketing efforts. They too use MSA level Census data, in their case to provide a prior joint distribution, and they use smaller area marginal distributions as data inputs to the posterior. They form a posterior distribution over free cells, i.e., those not constrained by the marginal information. In comparison to our moment based approach, the Putler et al. Bayesian approach has the advantage of incorporating the structural likelihood information, which should improve the fit of the model to the data. On the other hand, the number of parameters to be estimated grows rapidly with both the number of cells in a given dimension and the number of dimensions. This limits the

5 This was an active area of research throughout the 1940s until at least the early 1960s. Researchers offered a variety of modifications to Deming and Stephan's IPF algorithm (Stephan, 1942; Smith, 1947; Friedlander, 1961), but the focus was generally on providing a more efficient algorithm to reduce computational costs. In the days when computational power was defined by pencil and paper, statistical distributions that were not feasible to calculate may have been perceived as too esoteric to invest in describing.
The most recent discussion of IPF that we have found is in Bishop, Fienberg, and Holland (1975), which also does not contain any discussion of the statistical properties of this estimator.
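The alternating row/column matching just described can be sketched in a few lines. This is a generic illustration of the IPF update, not code from the paper; the seed table and target margins below are hypothetical:

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting: rescale a seed two-way table until its
    row and column margins match the target marginal frequencies."""
    p = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        p *= (row_targets / p.sum(axis=1))[:, None]   # match row margins
        p *= col_targets / p.sum(axis=0)              # match column margins
        if np.allclose(p.sum(axis=1), row_targets, atol=tol):
            break
    return p

# hypothetical inputs: a "five percent sample" joint table and census margins
sample_joint = np.array([[0.20, 0.30], [0.40, 0.10]])
fitted = ipf(sample_joint, row_targets=[0.6, 0.4], col_targets=[0.5, 0.5])
```

In Deming and Stephan's application, the seed would be the sample contingency table and the targets the full-census marginal frequencies.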

feasibility of the Putler et al. approach to estimating joint distributions with relatively few free cells. Two illustrations are provided. In the first illustration, we use an example from Putler et al. to facilitate a comparison of the three approaches to solving this problem: IPF, Bayesian, and GMM. This example shows that all three methods produce similar estimates of the joint distribution. In the second illustration, we estimate a four-dimensional joint distribution over demographics for each of eight Dominick's stores using only the GMM approach. Our results show that the model produces an excellent fit to the moment conditions, and that the estimated joint distributions produce marginal distributions that are generally the same as the observed marginals to at least two decimal places. In addition, we evaluate the empirical importance of the demographic distribution formulation in a mixed logit demand model. To do this, we generate two sets of estimates for an equilibrium mixed logit demand and supply system for bath tissue data from these eight Dominick's stores. For the first estimates, we draw the demographics from a joint distribution formulated as a product of marginal distributions, while for the second estimates we draw demographics from the joint distribution estimated by GMM. The results show substantial, though generally not statistically significant, differences between the estimates. The remainder of this paper is organized as follows. Section 2 contains the methodology for initializing and estimating store level discrete joint distributions. This section is developed in five parts. In part one, we formulate the parametric function and moment conditions for a two-by-two discrete probability distribution. In part two, we formulate the parametric function for the general case, while part three contains the associated moment conditions.
Differences in model parameterization for the GMM, IPF, and Bayesian approaches to inference are discussed in part four, and part five discusses GMM estimation. Section 3 contains the two illustrations, and Section 4 contains conclusions.

2. Formulating and estimating a discrete joint distribution

2.1. The two-by-two case

Suppose we observe joint probabilities for a city-wide area, and marginal probabilities for the market area associated with a retail outlet within the city, as in Table 1, where \(c_{jk} \geq 0\), \(\sum_{j,k} c_{jk} = 1\), and \(d_j, d'_k \geq 0\), \(\sum_j d_j = 1\), \(\sum_k d'_k = 1\). Our goal is to use this information to parameterize a joint probability distribution for the retail outlet. Let \(p(\theta) = (p_{11}(\theta), p_{12}(\theta), p_{21}(\theta), p_{22}(\theta))\) be the unknown joint probabilities formulated in terms of observed data and unknown parameters \(\theta\). Suppose now we take log odds transformations of the data in Table 1 and incorporate this information into a logistic

Table 1. A city-wide joint probability distribution with associated retail outlet marginal distributions for the two-by-two case.

City-wide joint distribution:

            z2 = 1    z2 = 2
  z1 = 1    c_{11}    c_{12}
  z1 = 2    c_{21}    c_{22}

Retail outlet marginal distributions:

  z1:  d_1, d_2        z2:  d'_1, d'_2

distribution for \(p(\theta)\) as follows,

\[
p_{jk}(\theta) = \frac{\exp\{A_{jk}\ln(c_{jk}/c_{22}) + \beta_1 I_{[j=1]}\ln(d_j/d_2) + \beta_2 I_{[k=1]}\ln(d'_k/d'_2)\}}{\sum_{s,r=1,2}\exp\{A_{sr}\ln(c_{sr}/c_{22}) + \beta_1 I_{[s=1]}\ln(d_s/d_2) + \beta_2 I_{[r=1]}\ln(d'_r/d'_2)\}}. \tag{1}
\]

The log odds transformations are formed by dividing each element of the distributions in Table 1 by the last element in that distribution and then taking logs. Since, by construction, \(\ln(c_{22}/c_{22}) = \ln(d_2/d_2) = \ln(d'_2/d'_2) = 0\), the specifications for \(p_{12}(\theta)\), \(p_{21}(\theta)\), and \(p_{22}(\theta)\) each have fewer terms than that for \(p_{11}(\theta)\). Define \(A = [A_{jr}]_{j,r=1,2}\), and \(\theta = (A, \beta_j,\ j = 1,2)\). The \(I_{[\cdot]}\) are indicator functions equal to one if the condition in the brackets is satisfied. The advantage of specifying a parametric form for \(p(\theta)\) is that it contains all the elements of variation of the true unknown joint distribution. Incorporating the \(\ln(c_{jk}/c_{22})\) terms reflects the joint variation at the city level, while the \(\ln(d_j/d_2)\) and \(\ln(d'_k/d'_2)\) terms incorporate information specific to the retail outlet into \(p(\theta)\) that adjusts the city level joint variation. Using a parametric form for \(p(\theta)\) enables us to fit the moment conditions with fewer parameters than required for either the IPF or Bayesian approaches. In addition, the number of parameters to be estimated will grow more slowly with problem size than for either of these other approaches. We form three sets of moment conditions and an associated GMM objective to consistently estimate \(\theta\). The first set of conditions is formed as the difference between the estimated and observed marginals: 6

\[
p_{j\cdot}(\theta) - d_j = 0,\ j = 1,2, \qquad p_{\cdot r}(\theta) - d'_r = 0,\ r = 1,2. \tag{2}
\]

Given the adding up constraints on the marginal distributions, only two of the four moment conditions above are independent. Using these four moment conditions alone will

6 It is a slight abuse of notation to set the following moment conditions exactly to zero. Rather, the GMM objective will make the discrepancies between the estimated and observed moments as small as possible. We extend our notation to remedy this abuse in Section 2.3.

only enable us to consistently estimate \(\beta_1\) and \(\beta_2\), with \(A\) set to \([1]\), a matrix of ones. 7 To estimate parameters in \(A\) we introduce a second condition relating the city-wide and retail outlet covariances:

\[
\operatorname{cov}(z_1, z_2; \theta) - \text{city-wide } \operatorname{cov}(z_1, z_2) = 0, \tag{3}
\]

where

\[
\operatorname{cov}(z_1, z_2; \theta) = \sum_{j,k=1,2} (z_{1j} - E[z_1; \theta])(z_{2k} - E[z_2; \theta])\, p_{jk}(\theta),
\]

and \(E[z_1; \theta] = \sum_{j=1,2} z_{1j}\, p_{j\cdot}(\theta)\), and \(E[z_2; \theta]\) is similarly formulated. Covariance is not a very meaningful measure in discrete distributions. We chose to use a condition based on covariance discrepancies because covariances are the simplest moments that are formulated using bivariate distributional information. This condition penalizes differences in the estimated and city-wide bivariate distributions, \(p_{jk}(\theta)\) and the \(c_{jk}\) respectively. Including it improved the model fit in both our illustrations. The model in (1) contains five parameters; \(A_{22}\) does not enter the model. For purposes of parsimony and identification, we structure \(A\) as the product of two \(2 \times 1\) vectors \(\alpha_1\) and \(\alpha_2\), such that \(A = \alpha_1 \alpha_2'\), and set one of these vectors to a column of ones. 8 For the third set of conditions we specify the joint distribution for each cell as the product of a conditional distribution derived from (1) and a retail outlet marginal distribution. Taking the difference between two formulations of each cell provides the moment conditions. Specifically, form moment conditions as

\[
d_j\, p_{jr}(\theta)/p_{j\cdot}(\theta) - d'_r\, p_{jr}(\theta)/p_{\cdot r}(\theta) = d_j\, p_{\cdot r}(\theta) - d'_r\, p_{j\cdot}(\theta) = 0,\quad j, r = 1,2. \tag{4}
\]

As the second expression in (4) shows, this condition simplifies to a difference of products of marginal distributions of \(d\) and \(p\). There is a condition in the form of (4) for each cell in the joint probability distribution, and this penalizes the model for any discrepancies in the joint probabilities. Since this condition relies on the same sample moments as (2), it does not increase the number of parameters in \(\theta\) that can be identified, but, as our illustrations show, including this condition improves model fit.

7 For problems larger than \(2 \times 2\), enough independent marginal moment conditions are available to make elements in \(A\) estimable with these conditions alone, at least in principle. In one of our illustrations, however, estimating the \(\beta\)s and elements in \(A\) with just these conditions produced estimated joint distributions that were very close to joint distributions formulated as a product of marginal distributions.

8 In the \(2 \times 2\) case we still have four parameters and only three independent moment conditions if \(A\) is structured this way, and hence we still could not estimate either \(\alpha\) vector. This is only a problem for this particular case.
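To make the 2 × 2 construction concrete, the sketch below evaluates equation (1) and stacks the moment discrepancies of (2)–(4) into the GMM criterion, here with A fixed at a matrix of ones. This is our own illustration, not code from the paper; the data, helper names, and support points are hypothetical:

```python
import numpy as np

def p_theta(c, d_row, d_col, A, b1, b2):
    """Eq. (1): logistic joint distribution built from log odds of the
    city-wide joint c and the store marginals d_row (z1) and d_col (z2)."""
    x = np.log(c / c[1, 1])          # log odds of the city-wide joint
    y1 = np.log(d_row / d_row[1])    # log odds of the z1 marginal; last entry 0
    y2 = np.log(d_col / d_col[1])    # log odds of the z2 marginal; last entry 0
    e = np.exp(A * x + b1 * y1[:, None] + b2 * y2[None, :])
    return e / e.sum()

def cov2(p, z1, z2):
    """Covariance implied by a 2x2 joint distribution on supports z1, z2."""
    e1, e2 = z1 @ p.sum(axis=1), z2 @ p.sum(axis=0)
    return ((z1 - e1)[:, None] * (z2 - e2)[None, :] * p).sum()

def gmm_objective(b1, b2, c, d_row, d_col, z1, z2):
    p = p_theta(c, d_row, d_col, np.ones((2, 2)), b1, b2)
    m1 = np.concatenate([p.sum(axis=1) - d_row, p.sum(axis=0) - d_col])  # (2)
    m2 = np.array([cov2(p, z1, z2) - cov2(c, z1, z2)])                   # (3)
    m3 = (np.outer(d_row, p.sum(axis=0))
          - np.outer(p.sum(axis=1), d_col)).ravel()                      # (4)
    T = np.concatenate([m1, m2, m3])
    return T @ T

# hypothetical city-wide joint and store marginals
c = np.array([[0.30, 0.20], [0.20, 0.30]])
d_row, d_col = np.array([0.55, 0.45]), np.array([0.60, 0.40])
z1 = z2 = np.array([1.0, 2.0])
val = gmm_objective(1.0, 1.0, c, d_row, d_col, z1, z2)
```

Any derivative-free minimizer or a coarse grid search over (β1, β2) then delivers the GMM estimate; that step is omitted here.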

2.2. The general case: formulating an initial estimate of the store level joint probabilities

Let \(z = (z_1, \ldots, z_J)\) be a discrete random vector. For each \(z_j\), \(j = 1, \ldots, J\), let \(m_j = (m_{j1}, \ldots, m_{jk_j})\) be a vector of \(k_j < \infty\) support points, and let \(m = \{(m_{1l_1}, m_{2l_2}, \ldots, m_{Jl_J}) : l_j = 1, \ldots, k_j,\ j = 1, \ldots, J\}\) denote the set of all \(J\)-dimensional support points for \(z\). Indexing stores by \(s = 1, \ldots, S\), our goal is to consistently estimate the true joint probabilities \(P_z^0(m \mid s) = [P_z^0(m_{1l_1}, m_{2l_2}, \ldots, m_{Jl_J} \mid s)]\), of dimension \((k_J \cdots k_2) \times k_1\) (joint over the random vector \(z\)), for each \(s\), given the known joint probabilities \(C_z(m) = [C_z(m_{1l_1}, m_{2l_2}, \ldots, m_{Jl_J})]\), also \((k_J \cdots k_2) \times k_1\), for the city as a whole, and known marginal distributions \(D_{z_j}(m_j \mid s) = [D_{z_j}(m_{jl_j} \mid s)]\), \(k_j \times 1\) (marginal with respect to the \(z_j\)). Our first step in estimating \(P_z^0(m \mid s)\) is to use the available information to parameterize a function, say \(P_z(m \mid s; \theta)\), that contains all of the elements of variation in \(P_z^0(m \mid s)\). To formulate \(P_z(m \mid s; \theta)\) we generate the necessary data inputs from log odds transformations of our known city-wide and store level distributions. From the city-wide distribution we generate the data

\[
x(m_{1l_1}, \ldots, m_{Jl_J}) = \ln\!\left(\frac{C_z(m_{1l_1}, \ldots, m_{Jl_J})}{C_z(m_{1k_1}, \ldots, m_{Jk_J})}\right), \quad l_j = 1, \ldots, k_j,\ j = 1, \ldots, J, \tag{5}
\]

while we use the store level marginals to provide data

\[
y_j(m_{jl_j} \mid s) = \ln\!\left(\frac{D_{z_j}(m_{jl_j} \mid s)}{D_{z_j}(m_{jk_j} \mid s)}\right), \quad l_j = 1, \ldots, k_j,\ j = 1, \ldots, J, \tag{6}
\]

for each \(s\). It is useful to organize the matrix \(x = [x(m_{1l_1}, \ldots, m_{Jl_J})]\) so that it is of dimension \((k_J k_{J-1} \cdots k_2) \times k_1\), as we are going to organize the corresponding matrix of unknown parameters \(A\) conformably with \(x\). Specifically, define the set of parameter vectors \(\{\alpha_1, \ldots, \alpha_J\}\) such that \(\alpha_j\) is \(k_j \times 1\) for each \(j\), and formulate the matrix \(A\) of parameters as \(A = \alpha_J \otimes \cdots \otimes \alpha_3 \otimes (\alpha_2 \alpha_1')\). \(A\) has the same dimensions as \(x\), and \(A(m_{1l_1}, \ldots, m_{Jl_J}) = \alpha_{1l_1} \alpha_{2l_2} \cdots \alpha_{Jl_J}\). In general, we set \(\{\alpha_j = \iota_{k_j},\ j = 1, \ldots, J,\ j \neq r\}\), where \(\iota_{k_j}\) is a \(k_j\)-vector of ones, so that only one \(\alpha\) vector is estimated. The choice of which vector to allow to vary freely, \(\alpha_r\), depends in part on the number of linear and covariance constraints available, and in part on model fit criteria. Use (5) and (6) to formulate \(P_z(m \mid s; \theta) = [P_z(m_{1l_1}, m_{2l_2}, \ldots, m_{Jl_J} \mid s; \theta)]\), of dimension \((k_J \cdots k_2) \times k_1\), as a logistic distribution having elements

\[
P_z(m_{1l_1}, \ldots, m_{Jl_J} \mid s; \theta) = \frac{\exp\big(A(m_{1l_1}, \ldots, m_{Jl_J})\, x(m_{1l_1}, \ldots, m_{Jl_J}) + \beta_1 y_1(m_{1l_1} \mid s) + \cdots + \beta_J y_J(m_{Jl_J} \mid s)\big)}{\sum_{l_1, \ldots, l_J} \exp\big(A(m_{1l_1}, \ldots, m_{Jl_J})\, x(m_{1l_1}, \ldots, m_{Jl_J}) + \beta_1 y_1(m_{1l_1} \mid s) + \cdots + \beta_J y_J(m_{Jl_J} \mid s)\big)}, \quad l_j = 1, \ldots, k_j,\ j = 1, \ldots, J, \tag{7}
\]

where the \(\beta_j\) are scalars and \(\theta = (\alpha_r, \beta_1, \ldots, \beta_J)\). Equation (7) contains all the elements of variation contained in the unknown store level distributions \(P_z^0(m \mid s)\): \(x(m_{1l_1}, \ldots, m_{Jl_J})\) allows for variation among the \((z_1, \ldots, z_J)\) unconditional on \(s\), while each of the \(y_j(\cdot \mid s)\) allows for adjustments to the city-wide distribution for a particular \(z_j\) conditional on \(s\). Finally, setting \(\alpha_r = \iota_{k_r}\) and setting \(\beta_j = 1\) for all \(j\) gives us an initial estimate of \(P_z^0(m \mid s)\). To improve upon this estimate, we specify a set of moment conditions having \(P_z^0(m \mid s)\) as their unique solution. Choosing \(\theta\) to minimize the GMM criterion formed from these moment conditions provides a consistent estimate of \(P_z^0(m \mid s)\).

2.3. Moment conditions

Using (7), specify marginal probabilities as

\[
P_{z_r}(m_{rl_r} \mid s; \theta) = \sum_{l_1, \ldots, l_{r-1}, l_{r+1}, \ldots, l_J} P_z(m_{1l_1}, \ldots, m_{Jl_J} \mid s; \theta). \tag{8}
\]

Form the following three sets of moment conditions for each \(s\):

(M1) \(P_{z_j}(m_{jl_j} \mid s; \theta) - D_{z_j}(m_{jl_j} \mid s) = \delta_{jl_j}\), \(l_j = 1, \ldots, k_j\), \(j = 1, \ldots, J\);

(M2) \(\operatorname{Cov}(z_j, z_r \mid s; \theta) - \text{city-wide } \operatorname{Cov}(z_j, z_r) = \eta_{jr}\), each \(j, r\), \(j \neq r\);

(M3) \(P_{z_g}(m_{gl_g} \mid s; \theta)\, D_{z_r}(m_{rl_r} \mid s) - P_{z_r}(m_{rl_r} \mid s; \theta)\, D_{z_g}(m_{gl_g} \mid s) = \nu^{rg}_{l_1, \ldots, l_J}\), \(l_r = 1, \ldots, k_r\), \(l_g = 1, \ldots, k_g\), \(r, g = 1, \ldots, J\), \(r < g\).

Condition (M1) is formed as the difference between the estimated and observed marginal distributions at each point of support. There are \(k_j\) moment conditions constraining the marginals for each \(j\). The difference between the estimated and observed marginals is assumed to equal an error \(\delta_{jl_j}\) having the properties \(E[\delta_{jl_j}] = 0\) and \(\operatorname{Var}[\delta_{jl_j}] < \infty\), at each \(l_j\), all \(j\). Moment condition (M2) imposes covariance assumptions in the estimation. There are \(\binom{J}{2} = J(J-1)/2\) of these conditions available. We assume the difference between the estimated and observed covariance to equal an error \(\eta_{jr}\) having the properties \(E[\eta_{jr}] = 0\) and \(\operatorname{Var}[\eta_{jr}] < \infty\), all \(j, r\). As shown in (4) above, condition (M3) is equivalent to formulating the joint distribution \(P_z(m \mid s; \theta)\) two different ways at each point of support \(m\), with each formulation mixing a different estimated conditional distribution with an observed store level marginal. There are \(k_1 k_2 \cdots k_J\) conditions (M3), corresponding to the same number of points of support of \(P_z(m \mid s; \theta)\), for each \((r, g)\) pair, and there are \(\binom{J}{2} = J(J-1)/2\) \((r, g)\) pairs. The difference in the two formulations of the joint moments is assumed to be equal to an error \(\nu^{rg}_{l_1, \ldots, l_J}\) that has the properties \(E[\nu^{rg}_{l_1, \ldots, l_J}] = 0\) and \(\operatorname{Var}[\nu^{rg}_{l_1, \ldots, l_J}] < \infty\), at each \(l_1, \ldots, l_J\), all \(r, g\).
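For the general case it is convenient to hold the joint distribution as a J-dimensional array, so that the marginal sums in (8) and the residuals (M1)–(M3) reduce to a few array operations. The sketch below is our own illustration, not code from the paper; helper names are hypothetical, and (M3) is implemented in the simplified pairwise-marginal form of equation (4):

```python
import numpy as np
from itertools import combinations

def marginal(P, j):
    """Eq. (8): sum the joint array over every axis except j."""
    return P.sum(axis=tuple(a for a in range(P.ndim) if a != j))

def cov_pair(P, support, j, r):
    """Covariance of (z_j, z_r) implied by joint array P (assumes j < r)."""
    Ej = support[j] @ marginal(P, j)
    Er = support[r] @ marginal(P, r)
    Pjr = P.sum(axis=tuple(a for a in range(P.ndim) if a not in (j, r)))
    return ((support[j] - Ej)[:, None] * (support[r] - Er)[None, :] * Pjr).sum()

def stacked_residuals(P, C, D, support):
    """Stack (M1), (M2), and pairwise (M3) residuals into one long vector T;
    the GMM criterion is then T @ T."""
    J = P.ndim
    m1 = [marginal(P, j) - D[j] for j in range(J)]                        # (M1)
    m2 = [np.array([cov_pair(P, support, j, r) - cov_pair(C, support, j, r)])
          for j, r in combinations(range(J), 2)]                          # (M2)
    m3 = [np.outer(marginal(P, g), D[r]) - np.outer(D[g], marginal(P, r))
          for r, g in combinations(range(J), 2)]                          # (M3)
    return np.concatenate([v.ravel() for v in m1 + m2 + m3])

# a 2x2x2 sanity check: when the store equals the city, all residuals vanish
C = np.ones((2, 2, 2)) / 8.0
D = [marginal(C, j) for j in range(3)]
support = [np.array([0.0, 1.0])] * 3
T = stacked_residuals(C, C, D, support)
```

Here the residual vector has length (2+2+2) + 3 + 3·(2·2) = 21 and is identically zero, since the candidate joint reproduces both the marginals and the city-wide covariances.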

2.4. Differences in parameterization of the GMM, IPF, and Bayesian approaches to inference

In the general case, there are \(P_J = \prod_{j=1}^J k_j - 1\) independent cells in the joint probability distribution and \(S_J = \sum_{j=1}^J k_j - J\) independent marginal relationships. In addition, we add \(J(J-1)/2\) covariance conditions, yielding a total of \(S_J + J(J-1)/2\) independent constraints for identifying \(\theta\). Identification requires the number of free parameters in \(\theta\) to be less than or equal to the number of independent constraints. In general, our approach requires many fewer free parameters than there are independent constraints to achieve a good model fit. We associate one \(\beta\) parameter with each store level marginal, and allow one \(\alpha\) vector, the \(r\)th, to be free, for a total of \(J + k_r\) free parameters. Underlying the IPF estimator is a chi-square criterion that minimizes the difference between the unknown probabilities \(P_z^0(m \mid s)\) and the observed city-wide joint distribution \(C_z(m)\). This criterion is then subject to the \(S_J\) marginal constraints, each interacted with an unknown Lagrange multiplier. Estimation of the Lagrange multipliers is the focus of the Iterative Proportional Fitting algorithm. Then, as Deming and Stephan (1940) show, using the estimated Lagrange multipliers one can infer estimates for all the free cells in \(P_z^0(m \mid s)\). In general, the number of Lagrange multipliers, \(S_J\), exceeds the \(J + k_r\) parameters estimated using our GMM approach, but not substantially so. Putler, Kalyanam, and Hodges (1996) take a Bayesian approach. They use the city-wide joint distribution to form a Dirichlet prior and specify a multinomial likelihood over the store level joint distribution. They do not parameterize the probabilities in the multinomial distribution and, as such, this leaves them with a posterior distribution over \(DF = P_J - S_J\) free parameters. Since \(DF\) grows quickly in both the number of cells in any given dimension and in the number of dimensions, the computational cost of posterior inference grows more quickly than for either the parametric GMM approach we propose or the IPF approach. The Putler et al. approach can be extended by specifying parametric functions for the Dirichlet prior probabilities and for the probabilities in the likelihood to reduce the dimensionality of the estimation problem. This will, however, complicate the structure of the posterior, as conjugacy of the prior and the likelihood will be lost by introducing a parametric form for \(P_z(m \mid s)\).

2.5. Estimation

To form a GMM objective function, we input the moment conditions in one long vector. We chose this formulation because there is a different number of moments associated with each set of moment conditions, and for moment condition (M1), the number of moments varies with each marginal distribution. Hence there is no natural way to allow the moment conditions to freely correlate. Define

\[
\Delta(\theta) = \big[\delta_{11}(\theta), \delta_{12}(\theta), \ldots, \delta_{1k_1}(\theta), \ldots, \delta_{Jk_J}(\theta)\big]',
\]
\[
H(\theta) = \big[\eta_{12}(\theta), \eta_{13}(\theta), \ldots, \eta_{1J}(\theta), \ldots, \eta_{J-1,J}(\theta)\big]',
\]
\[
V(\theta) = \big[\nu^{1,2}_{1,\ldots,1}(\theta), \ldots, \nu^{1,2}_{k_1,\ldots,k_J}(\theta), \ldots, \nu^{J-1,J}_{k_1,\ldots,k_J}(\theta)\big]',
\]

and define \(T(\theta) = [\Delta(\theta)', H(\theta)', V(\theta)']'\), where the vector \(T(\theta)\) has length \(\sum_{j=1}^J k_j + J(J-1)/2 + (k_1 k_2 \cdots k_J) J(J-1)/2\). Now specify the objective function as

\[
\hat{\theta} = \operatorname{argmin}_\theta \{T(\theta)' T(\theta)\}. \tag{9}
\]

Estimates of \(\theta\) from (9) have asymptotic normal distribution \(\hat{\theta} \sim N(\theta, \sigma^2 (G'G)^{-1})\), where \(\sigma^2 = \operatorname{Var}[T(\hat{\theta})]\) and \(G = \partial T(\hat{\theta})/\partial \hat{\theta}'\). In addition, \(P_z(m \mid s; \theta)\) is asymptotically normally distributed with mean \(P_z(m \mid s; \hat{\theta})\) and variance \(\sigma^2 H(G'G)^{-1} H'\), where \(H = \partial P_z(m \mid s; \hat{\theta})/\partial \hat{\theta}'\).

3. Illustrations

We present two illustrations. The first uses data and results from an illustration presented in Putler et al. The authors present joint distributions estimated three ways: as a product of marginals, using IPF, and conducting posterior inference. To these results we add a column obtained using our GMM approach. The second illustration uses demographic data from eight Dominick's stores and from the Chicago PMSA. We use the GMM approach to estimate a joint distribution over demographics for each of these stores, and present model fit statistics. We then determine if the formulation of the joint demographic distribution has empirically important effects on the results of an equilibrium mixed logit demand-supply model for bath tissue consumption. Two sets of results are generated. For the first set of results we draw individual types from a joint demographic distribution formulated from a product of marginals, while for the second, we take draws from our GMM estimate of the joint demographic distribution.

3.1. Targeting the market for stain resistant carpeting

As discussed in Putler et al., the target market for stain resistant carpeting is married couples who are homeowners with young children living at home. To estimate a joint distribution for these three variables for Sioux Falls, South Dakota, the authors use marginals for Sioux Falls, and a prior joint distribution that corresponds to the whole state of South Dakota.
The variables are each coded into binary categories: (renter, homeowner), (married, unmarried), (children under 18, no children under 18). The true distribution for Sioux Falls is available for evaluating goodness of fit. These data are presented in Table 2, along with estimates of the joint distribution derived four different ways: independence estimate, IPF, posterior mode, and GMM estimate. As the table shows, the independence estimate is a poor approximation to the true joint distribution. In fact, this estimate is considerably worse than using the prior distribution for the entire state of South Dakota to represent the distribution of these variables in Sioux Falls. On the other hand, the IPF, posterior, and GMM results each produce substantial improvements over both the independence estimate and the prior. All three approaches produce very similar estimates of the joint distribution, and all three are found to be statistically indistinguishable

Table 2. Estimates of joint probabilities for stain-resistant carpeting direct mail campaign. Columns: cell descriptor a (housing, married, children <18); actual proportion a; prior proportion a; independence estimate a; IPF a; posterior mode a; GMM estimate. Rows: (rent, no, no), (rent, no, yes), (rent, yes, no), (rent, yes, yes), (own, no, no), (own, no, yes), (own, yes, no), (own, yes, yes); χ2 goodness of fit measure b. [Numeric entries not recoverable from this transcription; standard errors for the posterior mode and GMM columns appeared in parentheses.]

a Source: Table 5, Putler et al. (1996). b All χ2 are based on 2083 observations.

from the true joint distribution by a χ2 goodness of fit test at the one percent significance level. 9 To produce the GMM results we formulated \(A = \alpha_r \otimes (\alpha_m \alpha_a')\), with \(r, m, a\) representing the housing status, marital status, and child status dimensions respectively, and we tested which \(2 \times 1\) \(\alpha\) vectors to estimate and which ones to set equal to \(\iota\), a vector of ones. Since we have three independent moment conditions on the marginal distributions and three more covariance conditions, in principle we can identify up to six parameters in \(\theta\). In practice we found that the χ2 goodness of fit statistic was smallest with \(A\) set to \([1]\). Estimating any of the \(\alpha\) vectors produced a small increase in the χ2 statistic. We also tried estimating the model using just marginal moment conditions (M1), and including only conditions (M2) or only (M3) in addition to (M1). We found that excluding conditions of type (M2) and/or (M3) caused a slight deterioration in the fit. For example, estimating the three \(\beta\) parameters using only moment conditions (M1) produced a χ2 statistic equal to 16.54, up from the value obtained using all three sets of moment conditions.

9 Putler et al. also include posterior mean estimates that yield χ2 statistics as small as 16.1.
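The Kronecker structuring of A used in this example can be written out directly; a minimal sketch with illustrative values, setting every α vector to ones as in the best-fitting specification reported above:

```python
import numpy as np

# A = alpha_r ⊗ (alpha_m alpha_a'), with 2x1 vectors for the binary
# housing (r), marital (m), and child (a) status dimensions
alpha_r = np.ones(2)   # each alpha set to a vector of ones (iota) here
alpha_m = np.ones(2)
alpha_a = np.ones(2)
A = np.kron(alpha_r[:, None], np.outer(alpha_m, alpha_a))
# A is (2*2) x 2, conformable with x, and with all alphas equal to
# iota it is a matrix of ones ("A set = [1]")
```

Estimating one α vector while fixing the others at ι adds at most two free parameters, which is why at most six parameters are identifiable from the six independent conditions noted above.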

3.2. Dominick's data

As stated in the introduction, the motivation behind this research is to estimate joint distributions corresponding to a subset of variables in the store level marginal distributions provided in the Dominick's data archive. These estimated joint distributions will then be used as our source of individual types in a mixed logit model for aggregate data. Data for the Chicago PMSA from the March 1996 Current Population Survey is used to provide the city-wide joint distribution for our GMM procedure. To begin, we extracted demographic data from the Dominick's data archive for eight stores that were reasonably representative of the total population of Dominick's stores. We limit attention to eight stores to keep sample size within the range of computational feasibility for mixed logit estimation. Three criteria were used to choose stores. First, we wanted to have at least two stores from each of the three pricing regimes that Dominick's employs. To this end we chose two low price, three medium price, and three high price stores, while the population of Dominick's stores contains 9.4, 64.7, and 25.9 percent low, medium, and high price stores, respectively. Second, the stores were chosen to exhibit substantial variation over our four variables of interest. Third, we closely matched the means and correlations of our sample and the Dominick's population for these variables in order to produce a sample that is representative of the Dominick's population. The variables we chose are income, number of persons in a household, race, and number of units in a housing structure. Two pieces of information are provided regarding the income distribution for the market area surrounding each Dominick's store: the log median income and the standard deviation of income. The researcher is left to guess what continuous probability distribution these variables parameterize.
It is straightforward to show that the log median of a lognormal distribution is the mean of the associated normal distribution, hence we inferred that a lognormal was used.¹⁰ However, since the standard deviations provided appear to be for a lognormal, we are left with a normal mean and a lognormal standard error. To make these two statistics coherent with one another, we solved for the lognormal mean, θ, using the relationship µ = 2 ln θ − 0.5 ln(θ² + λ²) (see, e.g., Greene, 1997, p. 71), where µ is the normal mean and λ² is the lognormal variance.¹¹ With the income distribution parameterized, we index it by i and discretize it into 17 adjacent intervals (in $000s): [0,10), [10,20), [20,30), [30,40), [40,50), [50,60), [60,70), [70,80), [80,90), [90,100), [100,125), [125,150), [150,175), [175,200), [200,300), [300,400), [400,∞). For number of persons in household, n, the Dominick's data provides a distribution with four points of support: 1 person, 2 persons, 3 or 4 persons, and 5 or more persons. For race, indexed by r, the data provides the percentage of nonwhites. The percentage of detached houses, u, is our housing units variable. Corresponding variables and the city-wide joint distribution were readily available for the Chicago PMSA from the March 1996 Current Population Survey.

10 Specifically, if x has a lognormal distribution with parameters µ and σ², such that E[x] = exp(µ + σ²/2), Var[x] ≡ λ² = exp(2µ + σ²)(exp(σ²) − 1), and median(x) = γ, then y = ln(x) ~ N(µ, σ²) and ln(γ) = µ.

11 We used a bisection algorithm to solve this relationship for θ.
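The two steps above (the bisection of footnote 11, then discretizing the resulting lognormal into the 17 income intervals) can be sketched as follows. This is our own illustration, not the paper's GAUSS code, and the $25,000 standard deviation is a hypothetical input.

```python
import math

def lognormal_mean_from_median(mu, lam2, lo=1e-6, hi=1e9, tol=1e-6):
    """Solve mu = 2*ln(theta) - 0.5*ln(theta**2 + lam2) for the lognormal
    mean theta, given the normal mean mu (the log median) and the lognormal
    variance lam2.  The left-hand side is increasing in theta, so bisection
    brackets a unique root."""
    f = lambda th: 2.0 * math.log(th) - 0.5 * math.log(th * th + lam2) - mu
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

def interval_probs(mu, sigma2, cuts):
    """P[cuts[k] <= income < cuts[k+1]) under a lognormal(mu, sigma2),
    via the normal CDF applied in logs."""
    def cdf(x):
        if x <= 0.0:
            return 0.0
        if x == math.inf:
            return 1.0
        z = (math.log(x) - mu) / math.sqrt(sigma2)
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return [cdf(b) - cdf(a) for a, b in zip(cuts[:-1], cuts[1:])]

# Illustration: median income $40,000 (so mu = ln 40000) and a
# hypothetical lognormal standard deviation of $25,000.
mu = math.log(40000.0)
lam2 = 25000.0 ** 2
theta = lognormal_mean_from_median(mu, lam2)   # lognormal mean
sigma2 = 2.0 * (math.log(theta) - mu)          # since mu = ln(theta) - sigma2/2
cuts = [0.0] + [c * 1000.0 for c in
                (10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
                 125, 150, 175, 200, 300, 400)] + [math.inf]
probs = interval_probs(mu, sigma2, cuts)       # 17 interval probabilities
```

The recovered σ² uses the identity µ = ln θ − σ²/2 noted in footnote 10, so the lognormal mean always exceeds the median whenever λ² > 0.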

Table 3 contains the raw data for our eight store sample, and descriptive statistics to compare our eight stores with the whole population of Dominick's stores and the Chicago PMSA. As the table shows, the means for our eight store sample match the Dominick's population quite closely for all variables except the proportion of nonwhites: our sample contains nearly five percent more nonwhites than the Dominick's population. In comparison with the March 1996 CPS for the Chicago PMSA, the Dominick's market areas have similar means for ln(median income) and proportion of nonwhites, but have fewer persons per household and fewer detached houses. A comparison of sample correlations between our sample and the Dominick's population is contained in Table 4. While less closely matched than the means, the correlations all have the same signs and are similar in magnitude. In this example, J = 4, k_i = 17, k_n = 4, k_r = 2, and k_u = 2. This yields k_i + k_n + k_r + k_u = 25 marginal distribution moment conditions of type (M1), J(J − 1)/2 = 6 covariance conditions of type (M2), and k_i × k_n × k_r × k_u = 272 moment conditions of type (M3) for each pair of variables, with J(J − 1)/2 = 6 variable pairs. Together, these yield a total of 1663 moment conditions, 31 of which provide identifying information about θ. A separate joint distribution is estimated for the marketing area corresponding to each of our eight stores, and three criteria are used to gauge model fit for each store: the Euclidean distance of the estimated joint distribution from a joint distribution formed from a product of marginal distributions, the weighted average Euclidean distance of all moment conditions from zero, and the GMM function value.
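The second criterion, the weighted average Euclidean distance of the moment conditions from zero, can be sketched as follows. This is a minimal illustration with hypothetical residual vectors standing in for the (M1)-(M3) sets; the weights follow footnote 12.

```python
import numpy as np

def weighted_avg_distance(moment_sets):
    """Weighted average Euclidean distance of several sets of moment
    conditions from zero, where each set's weight is its share of the
    total number of conditions."""
    sizes = np.array([len(g) for g in moment_sets], dtype=float)
    weights = sizes / sizes.sum()
    dists = np.array([np.linalg.norm(np.asarray(g, dtype=float))
                      for g in moment_sets])
    return float(weights @ dists)

# Hypothetical residuals: a 2-element set at distance 5 from zero and a
# 3-element set exactly at zero give 0.4*5 + 0.6*0 = 2.0.
d = weighted_avg_distance([[3.0, 4.0], [0.0, 0.0, 0.0]])
```

Because excluded moment sets are still passed in, the metric penalizes specifications that drop conditions which then drift away from zero.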
The first criterion enables us to gauge the impact of excluding moment conditions (M2) and/or (M3), and of the parameterization of A, on our ability to estimate a joint distribution that differs substantially from a joint distribution formed under the assumption of independence. To form the second criterion we first evaluate the Euclidean distance of each set of moment conditions from zero, and then use the weighted average of these distances as a model fit metric.¹² In forming this measure we incorporate both included and excluded moment conditions. So, for example, if we estimate the model without moment conditions of type (M3), this will enable us to determine whether a substantial deterioration in fit occurred, as the (M3) moment conditions still get included in the metric. The third criterion, the GMM function value, provides a fit metric that is affected only by the included moment conditions. This metric is less reliable, as it can show improvements even if overall fit, as measured by the Euclidean distance metric, has deteriorated. To parameterize θ, we tested letting each of the α vectors in A be free, individually and in various combinations. These tests showed that allowing α_n to vary freely produced marginally better results than allowing α_r or α_u to be free. Allowing α_i to vary freely caused problems with inverting (G′G), as did allowing the α's to vary freely in any combination. In addition, we estimated the model with different combinations of the moment conditions included and with A = [1]. Table 5 contains the results of tests of various moment and parameter configurations for one of our Dominick's stores (store #111). Results for other stores were similar. The results in Column 1 include all three sets of moment conditions in the model and allow α_n to vary freely. This combination produces the best fit statistics. The estimated

12 The proportions of moment conditions of each type are used as the weights.

Table 3. The eight store sample, with a comparison of descriptive statistics for the eight stores and the Dominick's population. [For each store (two low-priced, three medium-priced, three high-priced) the table reports the persons/household shares (1, 2, 3 or 4, 5+), mean persons/household, ln(median income), proportion of detached houses, and proportion of nonwhites, followed by means (std. dev.) for the eight store sample, the Dominick's population, and the Chicago PMSA. Mean persons/household is calculated by indexing 1, 2, 3 or 4, and 5+ persons with 1, 2, 3.5, and 6 respectively. The numeric entries were lost in transcription.]

Table 4. A comparison of sample correlations between the eight store sample and the Dominick's population. [The table reports the pairwise correlations among ln(median income), mean persons/household, detached house, and nonwhite for the eight store sample and for the Dominick's population. The numeric entries were lost in transcription.]

Table 5. Effect of model specification on fit criteria for Dominick's store 111.

                                               (1)   (2)   (3)   (4)   (5)   (6)
  Marginal conditions (M1) included            Yes   Yes   Yes   Yes   Yes   Yes
  Covariance conditions (M2) included          Yes   Yes   Yes   No    No    No
  Individual cell conditions (M3) included     Yes   No    Yes   Yes   Yes   No
  α_n free                                     Yes   Yes   No    No    Yes   Yes

  Fit criteria: distance ||P_z(u, r, n, i | s; θ) − P(u)P(r)P(n)P(i)||, weighted average distance of all moment conditions from 0, and GMM function value. [The numeric fit values were lost in transcription.]

joint distribution is substantially different from a product of marginal distributions, and the weighted average distance of all moment conditions from 0 is the smallest of the first four columns. In Columns 2-4, we exclude moment conditions (M3), fix α_n = ı_4, and exclude conditions (M2) and fix α_n = ı_4, respectively. Each of these changes produces a deterioration in fit as measured by the weighted average distance metric, and some changes in the distance of the joint distribution from a product of marginals. In Columns 5 and 6 we exclude the covariance conditions (M2) and let α_n vary freely. This produces large changes in the estimated distribution, as it moves much closer to a product of independent marginals. The important point to take from the results in Columns 5 and 6 is that one has to be wary when estimating an α vector without including covariance moment conditions. The GMM and weighted average Euclidean distance criteria both show that the model in these two columns fits better than in any of the previous columns. The reality, however,

is that the GMM objective is optimized by placing very little weight on the city-wide joint distribution. Each estimated element of α_n is less than 0.01 in absolute value and, as such, yields a joint distribution that is very close to a product of marginals.

Table 6. Summary measures of model fit: weighted average Euclidean distances and GMM objective function values for a model including all three types of moment conditions and letting α_n vary freely. [For each store number the table reports the initial and final weighted average Euclidean distance, the Euclidean and maximum absolute distance between the estimated and independence distributions, and the GMM objective function value. The numeric entries were lost in transcription.]

Table 6 includes summary measures of model fit for all eight Dominick's stores for the model including all three sets of moment conditions and with θ = (α_n, β_j), j = u, r, n, i.¹³ The results indicate that the model fits the moment conditions well for all eight stores, and it estimates joint distributions that differ substantially from joints formed as a product of marginal distributions. In addition to the three fit measures discussed above, we include the maximum absolute distance between the estimated and independence joint distributions. This is done to give the reader a better feel for how far the estimated model is from a joint distribution formed as a product of marginals. More specifically, these joint probability distributions each contain 272 cells. Hence, the average probability associated with each cell is 1/272 ≈ 0.0037. The Euclidean distance is an order of magnitude larger than this value for seven of eight stores, and the maximum absolute distance is an order of magnitude larger for all eight stores, indicating that the differences from independence are substantial. Tables 7a and 7b provide the estimated joint probability distribution and the observed and estimated marginal distributions for stores 18 and 111 respectively.
These tables are provided to show that the estimated marginals closely match the observed marginals, and that the estimated joint distributions for the two stores are substantially different. The two stores serve very different demographic populations. Store 18's market population is 91 percent white, 61 percent of whom live in detached houses. For store 111, 99.5 percent of its market population is nonwhite, and only 31 percent live in detached homes. There are also substantial differences in the income distributions, and store 111 has a higher average number of persons per household.

13 We estimated this model using GAUSS on a 2 GHz Pentium 4. Total estimation time for all eight stores was 3.2 seconds.
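The distance-from-independence measures reported in Table 6 can be sketched as follows. In the application, `marginals` would be the four estimated store-level marginals (17, 4, 2, and 2 points of support, giving 272 joint cells); the two binary marginals below are toy inputs for illustration only.

```python
import numpy as np

def independence_joint(marginals):
    """Joint distribution formed as the product of marginal distributions,
    returned as a flat vector (272 cells in the Dominick's application)."""
    joint = np.ones(1)
    for m in marginals:
        joint = np.outer(joint, np.asarray(m, dtype=float)).ravel()
    return joint

def distance_from_independence(p_hat, marginals):
    """Euclidean and maximum absolute distance between an estimated joint
    distribution and the joint formed under independence."""
    diff = np.asarray(p_hat, dtype=float) - independence_joint(marginals)
    return float(np.linalg.norm(diff)), float(np.abs(diff).max())

# Toy example with two binary marginals: the independence joint is
# [0.125, 0.375, 0.125, 0.375]; an estimate equal to it is at distance 0.
marg = [[0.5, 0.5], [0.25, 0.75]]
eucl, max_abs = distance_from_independence([0.125, 0.375, 0.125, 0.375], marg)
```

Comparing these distances to the average cell probability (1/272 in the application) gives the order-of-magnitude check used in the text.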

Table 7a. Estimated joint and marginal probability distributions for store 18. [Rows cross units (attached, detached), race (white, nonwhite), and persons (1, 2, 3 or 4, 5+); columns are income interval midpoints (in $000s). Below the joint distribution, the table reports the observed and estimated store level marginal distributions for income, persons (1, 2, 3 or 4, 5+), race (white, nonwhite), and units (attached, detached). Notes: flagged entries indicate that a probability element's value is in the interval [10⁻⁸, 10⁻³), while a 0 entry indicates the element is strictly less than 10⁻⁸. Standard errors not reported to reduce clutter. The numeric entries were lost in transcription.]

Table 7b. Estimated joint and marginal probability distributions for store 111. [Rows cross units (attached, detached), race (white, nonwhite), and persons (1, 2, 3 or 4, 5+); columns are income interval midpoints (in $000s). Below the joint distribution, the table reports the observed and estimated store level marginal distributions for income, persons (1, 2, 3 or 4, 5+), race (white, nonwhite), and units (attached, detached). Notes: flagged entries indicate that a probability element's value is in the interval [10⁻⁸, 10⁻³), while a 0 entry indicates the element is strictly less than 10⁻⁸. Standard errors not reported to reduce clutter. The numeric entries were lost in transcription.]

The observed and estimated marginals match to the second decimal place in most cases. The estimated race distribution for store 111 misses the observed distribution by a full three percent, but this is because of the extreme skewing of the distribution toward nonwhites. The model does a much better job of matching the somewhat less skewed race distribution of store 18.

4. A study of the effect of formulation of the joint demographic distribution on the results of an equilibrium model for demand and supply

We estimate an equilibrium demand-supply model for bathroom tissue using one year of weekly data from each of the eight Dominick's stores; the demand model is mixed logit, and a static Bertrand-Nash equilibrium condition is used to generate the supply function.¹⁴ Two versions of the model are run, each using a different estimate of the joint demographic distribution for each of the eight stores. In the first version, the joint demographic distribution is estimated as a product of marginals; in the second version it is estimated using GMM.

4.1. Demand model. We use a random coefficients specification to represent the conditional indirect utility of consumption, c_ijmt, for consumer i from product j purchased from store m in week t, yielding

    U(c_ijmt; θ_d) = x_j^a θ^a + x_j^b θ_im^b − p_jmt α_im + ξ_j + ξ_jmt + ε_ijmt,
        i = 1, ..., N, j = 0, ..., J_mt, m = 1, ..., M, t = 1, ..., T,    (9)

where for each product j we observe characteristics x_j = (x_j^a, x_j^b) and prices p_jmt. We decompose x_j into subvectors x^a and x^b to highlight the point that we restrict x^a to enter only the mean, while x^b is allowed to influence both the mean and the random coefficients. In addition, the x's are subscripted only by j because all product characteristics other than price remain constant across stores and throughout the time period.
Different products may be available over time or across stores, but the characteristics of products with a particular UPC number do not change.¹⁵ Examples of product characteristics are the color of the tissue, the size of the roll (single or double), the ply of the paper (1- or 2-ply), the lotion content of the paper (with or without lotion), and the scent of the paper (scented or scent free). ξ_j is the mean valuation of the unobserved (by the econometrician) product characteristic across all of the stores in our dataset, and ξ_jmt is the store-week specific deviation from that mean. Following Nevo (2001), we use brand dummy variables to control for the ξ_j, leaving the ξ_jmt as our error terms. We expect that consumers and firms take the characteristics of all J_mt products into consideration when making decisions, and hence the ξ_jmt will be

14 We sketch out the demand model here, but do not present the supply function. A structural supply function is incorporated to provide additional structure that improves the precision of the demand model estimates. Supply function estimates, however, are not substantially affected by the choice of demographic distribution. Hence its development is ancillary to our main focus. The supply function is developed in detail in Romeo and Sullivan (2004).

15 In each store-week we observe between 28 and 42 bath tissue UPCs.

correlated with the prices of all products available in store m in week t. We define alternative j = 0 as the outside good. Since we do not have detailed information about this alternative, we retain the intercept and normalize the other elements of x, p_0mt, ξ_0, and ξ_0mt to equal 0. To control for the effect of demographic differences on bath tissue choices in each store's market area, we specify the vector of consumer taste parameters (θ_im^b, α_im) as a function of store area demographics and a random normal component, as in

    (θ_im^b, α_im) = (θ̄^b, ᾱ) + Γ a_im + Υ ν_im,    i = 1, ..., N, m = 1, ..., M,    (10)

where each a_im draw is an L × 1 vector having probability distribution P̂(l_1, ..., l_L), and Γ is a matrix of unknown parameters. P̂ is the estimated joint demographic distribution, Υ is specified as a diagonal matrix of unknown random coefficient parameters, and the ν_im are draws from N(0, I). Finally, the ε_ijmt are unobserved buyer attributes that are assumed to follow a type 1 extreme value distribution that is independent across individuals, products, and time periods. In addition, a_im, ν_im, ξ_jmt, and ε_ijmt are assumed to be mutually independent. Aggregate demand shares are obtained by assuming that individuals make utility-maximizing choices in their consumption of bath tissue, and integrating the ε_ijmt, a_im, and ν_im over the appropriate regions. Integration over ε yields the logit choice probabilities. a and ν are integrated numerically by drawing 200 samples from P̂ and N(0, I) for each store. As discussed in BLP, Nevo (2001), and Romeo and Sullivan (2004), integration of a and ν is embedded in a contraction mapping for determining mean utility.

4.2. Demand model results. Table 8 contains estimates of the product characteristic parameters for the demand model.¹⁶ The table contains two columns of results.
In the first column, results incorporate draws taken from demographic distributions formulated as products of marginal distributions, while the second-column results incorporate draws taken from the demographic distributions estimated using GMM; 200 draws for each store are used in both cases. The results show that the choice of demographic distribution produces substantially different results, though the differences are generally not statistically significant. For example, the price coefficient is two units larger in absolute value when the GMM-estimated distribution is used. This difference produces own- and cross-elasticity estimates (not reported here) that are 5-10 percent larger for the model using the GMM-estimated demographic distributions. Finally, the objective function value indicates that the model using the GMM-estimated joint distributions fits the data better. The overidentifying conditions are not rejected at the 5 percent level for this model, while they are rejected at the 5 percent level for the model with draws from the product of the marginals.

16 The demand model also includes month and brand dummies. Instrumental variables issues are discussed in Romeo and Sullivan (2004).
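The demand model's share simulation — 200 draws of a_im from the estimated joint demographic distribution P̂ and of ν_im from N(0, I), coefficient deviations Γa_im + Υν_im, and logit probabilities averaged over draws — can be sketched as follows. This is our own minimal illustration (all names and shapes are assumptions), and it omits the contraction mapping that recovers the mean utilities.

```python
import numpy as np

def simulated_shares(delta, x_b, p_hat, demog, Gamma, Upsilon,
                     n_draws=200, seed=0):
    """Simulated mixed logit shares for one store-week.

    delta   : (J,) mean utilities (covers x_j^a theta^a, x_j^b theta-bar^b,
              -p_jmt alpha-bar, and brand effects)
    x_b     : (J, K) characteristics with random coefficients
              (minus price would be one column)
    p_hat   : (C,) estimated joint demographic distribution over C cells
    demog   : (C, L) demographic vector attached to each cell
    Gamma   : (K, L) demographic interaction parameters
    Upsilon : (K,) diagonal of the random-coefficient scale matrix
    """
    rng = np.random.default_rng(seed)
    cells = rng.choice(len(p_hat), size=n_draws, p=p_hat)  # a_im ~ P-hat
    a = demog[cells]                                       # (n_draws, L)
    nu = rng.standard_normal((n_draws, x_b.shape[1]))      # nu_im ~ N(0, I)
    coef_dev = a @ Gamma.T + nu * Upsilon                  # (n_draws, K)
    u = delta[None, :] + coef_dev @ x_b.T                  # (n_draws, J)
    expu = np.exp(u)
    # Outside good j = 0 has utility normalized to zero, hence the 1 + ...
    probs = expu / (1.0 + expu.sum(axis=1, keepdims=True))
    return probs.mean(axis=0)
```

With Γ = 0 and Υ = 0 the demographics drop out and this collapses to plain logit shares, which is precisely the restriction that mixing over P̂ is meant to relax.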


BayesX - Software for Bayesian Inference in Structured Additive Regression

BayesX - Software for Bayesian Inference in Structured Additive Regression BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

Gaussian Conjugate Prior Cheat Sheet

Gaussian Conjugate Prior Cheat Sheet Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian

More information

SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

A logistic approximation to the cumulative normal distribution

A logistic approximation to the cumulative normal distribution A logistic approximation to the cumulative normal distribution Shannon R. Bowling 1 ; Mohammad T. Khasawneh 2 ; Sittichai Kaewkuekool 3 ; Byung Rae Cho 4 1 Old Dominion University (USA); 2 State University

More information

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

More information

A Simple Model of Price Dispersion *

A Simple Model of Price Dispersion * Federal Reserve Bank of Dallas Globalization and Monetary Policy Institute Working Paper No. 112 http://www.dallasfed.org/assets/documents/institute/wpapers/2012/0112.pdf A Simple Model of Price Dispersion

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

Markups and Firm-Level Export Status: Appendix

Markups and Firm-Level Export Status: Appendix Markups and Firm-Level Export Status: Appendix De Loecker Jan - Warzynski Frederic Princeton University, NBER and CEPR - Aarhus School of Business Forthcoming American Economic Review Abstract This is

More information

1 Teaching notes on GMM 1.

1 Teaching notes on GMM 1. Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in

More information

A Log-Robust Optimization Approach to Portfolio Management

A Log-Robust Optimization Approach to Portfolio Management A Log-Robust Optimization Approach to Portfolio Management Dr. Aurélie Thiele Lehigh University Joint work with Ban Kawas Research partially supported by the National Science Foundation Grant CMMI-0757983

More information

A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA

A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA REVSTAT Statistical Journal Volume 4, Number 2, June 2006, 131 142 A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA Authors: Daiane Aparecida Zuanetti Departamento de Estatística, Universidade Federal de São

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Recent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago

Recent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:

More information

Financial Market Microstructure Theory

Financial Market Microstructure Theory The Microstructure of Financial Markets, de Jong and Rindi (2009) Financial Market Microstructure Theory Based on de Jong and Rindi, Chapters 2 5 Frank de Jong Tilburg University 1 Determinants of the

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

Life Table Analysis using Weighted Survey Data

Life Table Analysis using Weighted Survey Data Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Analysis of Bayesian Dynamic Linear Models

Analysis of Bayesian Dynamic Linear Models Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

On Correlating Performance Metrics

On Correlating Performance Metrics On Correlating Performance Metrics Yiping Ding and Chris Thornley BMC Software, Inc. Kenneth Newman BMC Software, Inc. University of Massachusetts, Boston Performance metrics and their measurements are

More information

Inflation. Chapter 8. 8.1 Money Supply and Demand

Inflation. Chapter 8. 8.1 Money Supply and Demand Chapter 8 Inflation This chapter examines the causes and consequences of inflation. Sections 8.1 and 8.2 relate inflation to money supply and demand. Although the presentation differs somewhat from that

More information

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College.

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College. The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables Kathleen M. Lang* Boston College and Peter Gottschalk Boston College Abstract We derive the efficiency loss

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

Multiple Choice Models II

Multiple Choice Models II Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical

More information

The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities

The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities Elizabeth Garrett-Mayer, PhD Assistant Professor Sidney Kimmel Comprehensive Cancer Center Johns Hopkins University 1

More information

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015. Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring. Jie-Men Mok Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

More information

Internet Appendix for Money Creation and the Shadow Banking System [Not for publication]

Internet Appendix for Money Creation and the Shadow Banking System [Not for publication] Internet Appendix for Money Creation and the Shadow Banking System [Not for publication] 1 Internet Appendix: Derivation of Gross Returns Suppose households maximize E β t U (C t ) where C t = c t + θv

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

Goodness of fit assessment of item response theory models

Goodness of fit assessment of item response theory models Goodness of fit assessment of item response theory models Alberto Maydeu Olivares University of Barcelona Madrid November 1, 014 Outline Introduction Overall goodness of fit testing Two examples Assessing

More information

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Ordinal Regression. Chapter

Ordinal Regression. Chapter Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

More information

1 The Brownian bridge construction

1 The Brownian bridge construction The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

More information

Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation

Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation Average Redistributional Effects IFAI/IZA Conference on Labor Market Policy Evaluation Geert Ridder, Department of Economics, University of Southern California. October 10, 2006 1 Motivation Most papers

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

Machine Learning Logistic Regression

Machine Learning Logistic Regression Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.

More information

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014 LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph

More information

Part 2: One-parameter models

Part 2: One-parameter models Part 2: One-parameter models Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes

More information

Lecture notes: single-agent dynamics 1

Lecture notes: single-agent dynamics 1 Lecture notes: single-agent dynamics 1 Single-agent dynamic optimization models In these lecture notes we consider specification and estimation of dynamic optimization models. Focus on single-agent models.

More information

Linear Classification. Volker Tresp Summer 2015

Linear Classification. Volker Tresp Summer 2015 Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

More information

Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

More information