# What's New in Econometrics? Lecture 8: Cluster and Stratified Sampling



Jeff Wooldridge, NBER Summer Institute, 2007

1. The Linear Model with Cluster Effects
2. Estimation with a Small Number of Groups and Large Group Sizes
3. What if $G$ and $M_g$ are Both Large?
4. Nonlinear Models

## 1. The Linear Model with Cluster Effects

For each group or cluster $g$, let $\{(y_{gm}, x_g, z_{gm}) : m = 1, \ldots, M_g\}$ be the observable data, where $M_g$ is the number of units in cluster $g$, $y_{gm}$ is a scalar response, $x_g$ is a $1 \times K$ vector containing explanatory variables that vary only at the group level, and $z_{gm}$ is a $1 \times L$ vector of covariates that vary within (as well as across) groups. The linear model with an additive error is

$$y_{gm} = \alpha + x_g \beta + z_{gm} \gamma + v_{gm} \tag{1}$$

for $m = 1, \ldots, M_g$, $g = 1, \ldots, G$. Key questions: Are we primarily interested in $\beta$ or $\gamma$? Does $v_{gm}$ contain a common group effect, as in

$$v_{gm} = c_g + u_{gm}, \quad m = 1, \ldots, M_g, \tag{2}$$

where $c_g$ is an unobserved cluster effect and $u_{gm}$ is the idiosyncratic error? Are the regressors $(x_g, z_{gm})$ appropriately exogenous? How big are the group sizes ($M_g$) and the number of groups ($G$)?

Two kinds of sampling schemes. First, from a large population of relatively small clusters, we draw a large number of clusters ($G$), where cluster $g$ has $M_g$ members. For example, sampling a large number of families, classrooms, or firms from a large population. This is like the panel data setup we have covered: in the panel data setting, $G$ is the number of cross-sectional units and $M_g$ is the number of time periods for unit $g$.

A different sampling scheme results in data sets that can also be arranged by group, but is better

interpreted in the context of sampling from different populations or different strata within a population. We stratify the population into $G \geq 2$ nonoverlapping groups. Then, we obtain a random sample of size $M_g$ from each group. Ideally, the group sizes are large in the population, hopefully resulting in large $M_g$.

### Large Group Asymptotics

The theory with $G \to \infty$ and the group sizes, $M_g$, fixed is well developed. How should one use these methods? If

$$E(v_{gm} \mid x_g, z_{gm}) = 0, \tag{3}$$

then the pooled OLS estimator from the regression of $y_{gm}$ on $1, x_g, z_{gm}$, $m = 1, \ldots, M_g$, $g = 1, \ldots, G$, is consistent for $(\alpha, \beta, \gamma)$ (as $G \to \infty$ with $M_g$ fixed) and $\sqrt{G}$-asymptotically normal. In the panel data case, (3)

does allow for non-strictly exogenous covariates, but only if there is no unobserved effect. A robust variance matrix is needed to account for correlation within clusters or heteroskedasticity in $\mathrm{Var}(v_{gm} \mid x_g, z_{gm})$, or both. Write $W_g$ for the $M_g \times (1 + K + L)$ matrix of all regressors for group $g$. Then the $(1 + K + L) \times (1 + K + L)$ variance matrix estimator is

$$\widehat{\mathrm{Avar}}(\hat{\theta}_{POLS}) = \left( \sum_{g=1}^{G} W_g' W_g \right)^{-1} \left( \sum_{g=1}^{G} W_g' \hat{v}_g \hat{v}_g' W_g \right) \left( \sum_{g=1}^{G} W_g' W_g \right)^{-1}, \tag{4}$$

where $\hat{v}_g$ is the $M_g \times 1$ vector of pooled OLS residuals for group $g$. This asymptotic variance is now computed routinely using "cluster" options.
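As a concrete sketch of (4), the following numpy implementation computes the cluster-robust sandwich for pooled OLS. The function and variable names, and the toy data-generating process, are our own illustration, not from the notes; the same sandwich form, with demeaned regressors and FE residuals, gives the fully robust FE variance in (10) below.

```python
import numpy as np

def cluster_robust_avar(W, vhat, groups):
    """Sandwich estimator (4): (sum_g Wg'Wg)^-1 (sum_g Wg' vg vg' Wg)
    (sum_g Wg'Wg)^-1, summing score outer products cluster by cluster."""
    bread = np.linalg.inv(W.T @ W)
    k = W.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        sg = W[groups == g].T @ vhat[groups == g]   # cluster g score
        meat += np.outer(sg, sg)
    return bread @ meat @ bread

# toy cluster sample with a common cluster effect left in the error
rng = np.random.default_rng(0)
G, M = 200, 5
groups = np.repeat(np.arange(G), M)
x = rng.standard_normal(G * M)
v = np.repeat(rng.standard_normal(G), M) + rng.standard_normal(G * M)
y = 1.0 + 2.0 * x + v
W = np.column_stack([np.ones(G * M), x])
theta = np.linalg.lstsq(W, y, rcond=None)[0]        # pooled OLS
vhat = y - W @ theta
se_cluster = np.sqrt(np.diag(cluster_robust_avar(W, vhat, groups)))
```

Because the simulated error contains a cluster effect, the cluster-robust standard error for the intercept comes out noticeably larger than the classical one, which is exactly the Moulton-type inflation the formula is designed to capture.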

If we strengthen the exogeneity assumption to

$$E(v_{gm} \mid x_g, Z_g) = 0, \quad m = 1, \ldots, M_g,\; g = 1, \ldots, G, \tag{5}$$

where $Z_g$ is the $M_g \times L$ matrix of unit-specific covariates, then we can use GLS. This is about the strongest assumption we can make. As discussed in the linear panel data notes, the random effects approach makes enough assumptions so that the $M_g \times M_g$ variance-covariance matrix of $v_g = (v_{g1}, v_{g2}, \ldots, v_{g,M_g})'$ has the so-called random effects form,

$$\mathrm{Var}(v_g) = \sigma_c^2 j_{M_g} j_{M_g}' + \sigma_u^2 I_{M_g}, \tag{6}$$

where $j_{M_g}$ is the $M_g \times 1$ vector of ones and $I_{M_g}$ is the $M_g \times M_g$ identity matrix. Plus, the usual assumptions include the system homoskedasticity assumption,

$$\mathrm{Var}(v_g \mid x_g, Z_g) = \mathrm{Var}(v_g). \tag{7}$$

The random effects (RE) estimator is asymptotically more efficient than pooled OLS under (5), (6), and (7) as $G \to \infty$ with the $M_g$ fixed. The RE estimates and test statistics are computed routinely by popular software packages. An important point is often overlooked: one can, and in many cases should, make inference completely robust to an unknown form of $\mathrm{Var}(v_g \mid x_g, Z_g)$, whether we have a true cluster sample or panel data.

Cluster sample example: the random coefficient model,

$$y_{gm} = \alpha + x_g \beta + z_{gm} \gamma_g + v_{gm}. \tag{8}$$

By estimating a standard random effects model that

assumes common slopes, we effectively include $z_{gm}(\gamma_g - \gamma)$ in the idiosyncratic error; this generally creates within-group correlation because $z_{gm}(\gamma_g - \gamma)$ and $z_{gp}(\gamma_g - \gamma)$ will be correlated for $m \neq p$, conditional on $Z_g$.

If we are only interested in estimating $\gamma$, the fixed effects (FE) or within estimator is attractive. The within transformation subtracts off group averages from the dependent variable and explanatory variables:

$$y_{gm} - \bar{y}_g = (z_{gm} - \bar{z}_g)\gamma + u_{gm} - \bar{u}_g, \tag{9}$$

and this equation is estimated by pooled OLS. (Of course, the $x_g$ get swept away by the within-group demeaning.) It is often important to allow $\mathrm{Var}(u_g \mid Z_g)$ to have an arbitrary form, including within-group correlation and heteroskedasticity. We certainly should

for panel data (serial correlation), but also for cluster sampling. In the linear panel data notes, we saw that FE can consistently estimate the average effect in the random coefficient case, but $(z_{gm} - \bar{z}_g)(\gamma_g - \gamma)$ appears in the error term. A fully robust variance matrix estimator is

$$\widehat{\mathrm{Avar}}(\hat{\gamma}_{FE}) = \left( \sum_{g=1}^{G} \ddot{Z}_g' \ddot{Z}_g \right)^{-1} \left( \sum_{g=1}^{G} \ddot{Z}_g' \hat{u}_g \hat{u}_g' \ddot{Z}_g \right) \left( \sum_{g=1}^{G} \ddot{Z}_g' \ddot{Z}_g \right)^{-1}, \tag{10}$$

where $\ddot{Z}_g$ is the matrix of within-group deviations from means and $\hat{u}_g$ is the $M_g \times 1$ vector of fixed effects residuals. This estimator is justified with large-$G$ asymptotics. Should we use the large-$G$ formulas with

large $M_g$? What if one applies robust inference in scenarios where the fixed-$M_g$, $G \to \infty$ asymptotic analysis is not realistic? Hansen (2007) has recently derived properties of the cluster-robust variance matrix and related test statistics under various scenarios that help us more fully understand the properties of cluster-robust inference across different data configurations.

First consider how his results apply to true cluster samples. Hansen (2007, Theorem 2) shows that, with $G$ and the $M_g$ both getting large, the usual inference based on (4) is valid with arbitrary correlation among the errors, $v_{gm}$, within each group. Because we usually think of $v_{gm}$ as including the group effect $c_g$, this means that, with

large group sizes, we can obtain valid inference using the cluster-robust variance matrix, provided that $G$ is also large. So, for example, if we have a sample of $G = 100$ schools and roughly $M_g = 100$ students per school, and we use pooled OLS leaving the school effects in the error term, we should expect the inference to have roughly the correct size. Probably we leave the school effects in the error term because we are interested in a school-specific explanatory variable, perhaps indicating a policy change.

Unfortunately, pooled OLS with cluster effects when $G$ is small and the group sizes are large falls outside Hansen's theoretical findings. Generally, we should not expect good properties of cluster-robust inference with a small number of groups and very

large group sizes when cluster effects are left in the error term. As an example, suppose that $G = 10$ hospitals have been sampled with several hundred patients per hospital. If the explanatory variable of interest is exogenous and varies only at the hospital level, it is tempting to use pooled OLS with cluster-robust inference. But we have no theoretical justification for doing so, and reasons to expect it will not work well. In the next section we discuss other approaches available with small $G$ and large $M_g$.

If the explanatory variables of interest vary within group, FE is attractive for a couple of reasons. The first advantage is the usual one: $c_g$ can be arbitrarily correlated with the $z_{gm}$. The second advantage is that, with large $M_g$, we

can treat the $c_g$ as parameters to estimate, because we can estimate them precisely and then assume that the observations are independent across $m$ (as well as $g$). This means that the usual inference is valid, perhaps with adjustment for heteroskedasticity. The fixed-$G$, large-$M_g$ asymptotic results in Theorem 4 of Hansen (2007) for cluster-robust inference apply in this case. But using cluster-robust inference is likely to be very costly in this situation: the cluster-robust variance matrix actually converges to a random variable, and $t$ statistics based on the adjusted version of (10), multiplied by $G/(G-1)$, have an asymptotic $t_{G-1}$ distribution. Therefore, while the usual or heteroskedasticity-robust inference can be based on the standard normal distribution, the cluster-robust

inference is based on the $t_{G-1}$ distribution (and the cluster-robust standard errors may be larger than the usual standard errors). For panel data applications, Hansen's (2007) results, particularly Theorem 3, imply that cluster-robust inference for the fixed effects estimator should work well when the cross section ($N$) and time series ($T$) dimensions are similar and not too small. If full time effects are allowed in addition to unit-specific fixed effects, as they often should be, then the asymptotics must be with $N$ and $T$ both getting large. In this case, any serial dependence in the idiosyncratic errors is assumed to be weakly dependent. The simulations in Bertrand, Duflo, and Mullainathan (2004) and Hansen (2007) verify that the fully robust

cluster-robust variance matrix works well. There is some scope for applying the fully robust variance matrix estimator when $N$ is small relative to $T$ and unit-specific fixed effects are included. But allowing time effects causes problems in this case; we really want large $N$ and $T$ to allow for a full set of time and unit-specific effects.

## 2. Estimation with a Small Number of Groups and Large Group Sizes

When $G$ is small and each $M_g$ is large, thinking of sampling from different strata in a population, or even from different populations, makes more sense. Alternatively, we might think that the clusters were randomly drawn from a large population, but only a small number were drawn. Either way, except for the relative dimensions of $G$ and $M_g$, the resulting

data set is essentially indistinguishable from a data set obtained by sampling clusters.

The problem of proper inference when $M_g$ is large relative to $G$ (the Moulton (1990) problem) has recently been studied by Donald and Lang (2007). DL treat the parameters associated with the different groups as outcomes of random draws (so it seems more like the second sampling experiment). Simplest case: a single regressor that varies only by group:

$$y_{gm} = \alpha + \beta x_g + c_g + u_{gm} \tag{11}$$
$$y_{gm} = \alpha_g + \beta x_g + u_{gm}. \tag{12}$$

Notice how (12) is written as a model with a common slope, $\beta$, but an intercept, $\alpha_g$, that varies across $g$. Donald and Lang focus on (11), where $c_g$ is assumed to be independent of $x_g$ with zero mean.
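A small numpy simulation of model (11), together with the group-averaged regression that DL propose below, can make the setup concrete. All names and numbers here are illustrative; the inference described in the notes uses $G - 2$ degrees of freedom.

```python
import numpy as np

def donald_lang(y, x, groups):
    """OLS of group means ybar_g on (1, xbar_g), with classical
    standard errors and G - 2 degrees of freedom."""
    gs = np.unique(groups)
    ybar = np.array([y[groups == g].mean() for g in gs])
    xbar = np.array([x[groups == g].mean() for g in gs])
    X = np.column_stack([np.ones(len(gs)), xbar])
    coef = np.linalg.lstsq(X, ybar, rcond=None)[0]
    resid = ybar - X @ coef
    df = len(gs) - 2
    s2 = resid @ resid / df
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se, df

# toy data: G = 8 groups, regressor varies only at the group level
rng = np.random.default_rng(2)
G, M = 8, 500
groups = np.repeat(np.arange(G), M)
x = np.repeat(rng.standard_normal(G), M)
v = 0.5 * np.repeat(rng.standard_normal(G), M) + rng.standard_normal(G * M)
y = 1.0 + 2.0 * x + v
coef, se, df = donald_lang(y, x, groups)
```

Even with 4,000 observations, only $G - 2 = 6$ degrees of freedom are available for inference on the slope, which is the central point of the DL critique.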

They use this formulation to highlight the problems of applying standard inference to (11), leaving $c_g$ as part of the error term, $v_{gm} = c_g + u_{gm}$. We know that standard pooled OLS inference can be badly biased because it ignores the cluster correlation, and Hansen's results do not apply. (We cannot use fixed effects here.) The DL solution is to study the OLS estimate in the between-group regression

$$\bar{y}_g \text{ on } 1, x_g, \quad g = 1, \ldots, G, \tag{13}$$

which is identical to pooled OLS when the group sizes are the same. Conditional on the $x_g$, $\hat{\beta}$ inherits its distribution from $\{\bar{v}_g : g = 1, \ldots, G\}$, the within-group averages of the composite errors.

If we add some strong assumptions, there is an exact solution to the inference problem. In addition

to assuming $M_g = M$ for all $g$, assume $c_g \mid x_g \sim \mathrm{Normal}(0, \sigma_c^2)$ and $u_{gm} \mid x_g, c_g \sim \mathrm{Normal}(0, \sigma_u^2)$. Then $\bar{v}_g$ is independent of $x_g$ and $\bar{v}_g \sim \mathrm{Normal}(0, \sigma_c^2 + \sigma_u^2/M)$ for all $g$. Because we assume independence across $g$, the equation

$$\bar{y}_g = \alpha + \beta x_g + \bar{v}_g, \quad g = 1, \ldots, G, \tag{14}$$

satisfies the classical linear model assumptions. We can use inference based on the $t_{G-2}$ distribution to test hypotheses about $\beta$, provided $G > 2$. If $G$ is small, the requirements for a significant $t$ statistic using the $t_{G-2}$ distribution are much more stringent than if we use the $t_{M_1 + M_2 + \cdots + M_G - 2}$ distribution, which is what we would be doing if we used the usual pooled OLS statistics. Using (14) is not the same as using cluster-robust

standard errors for pooled OLS. Those are not even justified and, besides, we would use the wrong df in the $t$ distribution. We can apply the DL method without normality of the $u_{gm}$ if the group sizes are large, because $\mathrm{Var}(\bar{v}_g) = \sigma_c^2 + \sigma_u^2/M_g$, so that $\bar{u}_g$ is a negligible part of $\bar{v}_g$. But we still need to assume $c_g$ is normally distributed.

If $z_{gm}$ appears in the model, then we can use the averaged equation

$$\bar{y}_g = \alpha + x_g \beta + \bar{z}_g \gamma + \bar{v}_g, \quad g = 1, \ldots, G, \tag{15}$$

provided $G > K + L + 1$. If $c_g$ is independent of $(x_g, \bar{z}_g)$ with a homoskedastic normal distribution and the group sizes are large, inference can be carried out using the $t_{G-K-L-1}$ distribution. Regressions like (15) are reasonably common, at

least as a check on results using disaggregated data, but usually with larger $G$ than just a few.

If $G = 2$, should we give up? Suppose $x_g$ is binary, indicating treatment and control. The DL estimate of $\beta$ is the usual one: $\hat{\beta} = \bar{y}_1 - \bar{y}_0$. But in the DL setting, we cannot do inference (there are zero df). So the DL setting rules out the standard comparison of means. It also rules out the typical setup for difference-in-differences, where there would be four groups, for the same reason. Can we still obtain inference on estimated policy effects using randomized or quasi-randomized interventions when the policy effects are just identified? Not according to the DL approach.

Even when we can apply the approach, should we? Suppose there are $G = 4$ groups, with groups one

and two as control groups ($x_1 = x_2 = 0$) and groups three and four as treatment groups ($x_3 = x_4 = 1$). The DL approach would involve computing the averages for each group, $\bar{y}_g$, and running the regression $\bar{y}_g$ on $1, x_g$, $g = 1, \ldots, 4$; inference is based on the $t_2$ distribution. The estimator $\hat{\beta}$ in this case can be written as

$$\hat{\beta} = (\bar{y}_3 + \bar{y}_4)/2 - (\bar{y}_1 + \bar{y}_2)/2. \tag{16}$$

With $\hat{\beta}$ written as in (16), it is clearly approximately normal (for almost any underlying population distribution) provided the group sizes $M_g$ are moderate, yet the DL approach would base inference on a $t_2$ distribution. In effect, the DL approach rejects the usual inference based on group means from large sample sizes because it may not be the case that $\mu_1 = \mu_2$ and $\mu_3 = \mu_4$.

Equation (16) hints at a different way to view the small $G$, large $M_g$ setup. We estimated two parameters, $\alpha$ and $\beta$, given four moments that we can estimate with the data. The OLS estimates can be interpreted as minimum distance (MD) estimates that impose the restrictions $\mu_1 = \mu_2$ and $\mu_3 = \mu_4$. If we use the $4 \times 4$ identity matrix as the weight matrix, we get (16) and $\hat{\alpha} = (\bar{y}_1 + \bar{y}_2)/2$.

With large group sizes, and whether or not $G$ is especially large, we can put the problem generally into an MD framework, as done, for example, by Loeb and Bound (1996), who had $G = 36$ cohort-division groups and many observations per group. For each group $g$, write

$$y_{gm} = \alpha_g + z_{gm} \gamma_g + u_{gm}, \tag{17}$$

where we assume random sampling within each group and independent sampling across groups. Generally, the OLS estimates within group are $\sqrt{M_g}$-asymptotically normal. The presence of $x_g$ can be viewed as putting restrictions on the intercepts, $\alpha_g$, in the separate group models in (17). In particular,

$$\alpha_g = \alpha + x_g \beta, \quad g = 1, \ldots, G, \tag{18}$$

where we now think of $x_g$ as fixed, observed attributes of heterogeneous groups. With $K$ attributes we must have $G \geq K + 1$ to determine $\alpha$ and $\beta$. In the first stage, we obtain the $\hat{\alpha}_g$, either by group-specific regressions or by pooling to impose some common slope elements in the $\gamma_g$. Let $\hat{V}$ be the $G \times G$ estimated (asymptotic) variance matrix of

the $G \times 1$ vector $\hat{\alpha}$. Then the MD estimator is

$$\hat{\theta} = (X' \hat{V}^{-1} X)^{-1} X' \hat{V}^{-1} \hat{\alpha}, \tag{19}$$

where $X$ is the $G \times (K+1)$ matrix of restrictions with $g$-th row $(1, x_g)$. The asymptotics are as each group size gets large; $\hat{\theta}$ has an asymptotic normal distribution, and its estimated asymptotic variance is $(X' \hat{V}^{-1} X)^{-1}$. When separate regressions are used, the $\hat{\alpha}_g$ are independent, and $\hat{V}$ is diagonal. We can test the overidentification restrictions; if we reject, we can go back to the DL approach (or find more elements of $x_g$). With large group sizes, we can justify analyzing

$$\hat{\alpha}_g = \alpha + x_g \beta + c_g, \quad g = 1, \ldots, G, \tag{20}$$

as a classical linear model because $\hat{\alpha}_g = \alpha_g + O_p(M_g^{-1/2})$, provided $c_g$ is normally distributed.
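A minimal numpy sketch of the MD estimator (19), with names of our own choosing. Applied to the four-group design from earlier with $\hat{V}$ proportional to the identity, it reproduces (16) and $\hat{\alpha} = (\bar{y}_1 + \bar{y}_2)/2$; the minimized distance serves as the overidentification statistic.

```python
import numpy as np

def minimum_distance(alpha_hat, X, V):
    """MD estimator (19), its estimated asymptotic variance (X'V^-1 X)^-1,
    and the minimized distance for the overidentification test."""
    Vinv = np.linalg.inv(V)
    avar = np.linalg.inv(X.T @ Vinv @ X)
    theta = avar @ X.T @ Vinv @ alpha_hat
    resid = alpha_hat - X @ theta
    q = resid @ Vinv @ resid      # ~ chi2(G - K - 1) under the restrictions
    return theta, avar, q

# four estimated group intercepts: two control (x = 0), two treatment (x = 1)
alpha_hat = np.array([0.9, 1.1, 3.0, 3.2])
x = np.array([0.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones(4), x])
V = 0.01 * np.eye(4)              # diagonal: separate group regressions
theta, avar, q = minimum_distance(alpha_hat, X, V)
```

Here $\hat{\alpha} = 1.0$ (the control-group average) and $\hat{\beta} = 2.1$ (the treatment-minus-control difference of averaged means), exactly as (16) prescribes.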

## 3. What if $G$ and $M_g$ are Both Large?

If we have a reasonably large $G$ in addition to large $M_g$, we have more flexibility. In addition to ignoring the estimation error in the $\hat{\alpha}_g$ (because of large $M_g$), we can also drop the normality assumption on $c_g$ (because, as $G$ gets large, we can apply the central limit theorem). But, of course, we are still assuming that the deviations, $c_g$, in $\alpha_g = \alpha + x_g \beta + c_g$ are at least uncorrelated with $x_g$. We can apply IV methods in this setting, though, if we have suitable instruments.

## 4. Nonlinear Models

Many of the issues for nonlinear models are the same as for linear models. The biggest difference is that, in many cases, standard approaches require distributional assumptions about the unobserved

group effects. In addition, it is more difficult in nonlinear models to allow for group effects correlated with covariates, especially when group sizes differ.

### Large Group Asymptotics

We can illustrate many issues using an unobserved effects probit model. Let $y_{gm}$ be a binary response, with $x_g$ and $z_{gm}$, $m = 1, \ldots, M_g$, $g = 1, \ldots, G$, defined as in Section 1. Assume that

$$y_{gm} = 1[x_g \beta + z_{gm} \gamma + c_g + u_{gm} > 0] \tag{21}$$
$$u_{gm} \mid x_g, Z_g, c_g \sim \mathrm{Normal}(0, 1) \tag{22}$$

(where $1[\cdot]$ is the indicator function). Then

$$P(y_{gm} = 1 \mid x_g, z_{gm}, c_g) = \Phi(x_g \beta + z_{gm} \gamma + c_g), \tag{23}$$

where $\Phi$ is the standard normal cumulative distribution function (cdf).
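The scaling result discussed next (pooled probit consistently estimates the coefficients multiplied by $(1 + \sigma_c^2)^{-1/2}$ when $c_g$ is normal and independent of the covariates) can be checked by simulation. The grid-search "MLE" below is a deliberately crude, self-contained stand-in for a real probit routine; every name and the data-generating process are our own illustration.

```python
import numpy as np
from math import erf

def Phi(a):
    """Standard normal cdf, applied elementwise."""
    return 0.5 * (1.0 + np.array([erf(t / np.sqrt(2.0)) for t in a]))

def pooled_probit_slope(y, z, grid):
    """Crude one-parameter pooled probit MLE via grid search on the slope."""
    best, best_ll = grid[0], -np.inf
    for b in grid:
        p = np.clip(Phi(b * z), 1e-10, 1.0 - 1e-10)
        ll = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
        if ll > best_ll:
            best, best_ll = b, ll
    return best

# simulate a probit with a normal cluster effect c_g in the latent error
rng = np.random.default_rng(3)
G, M, beta, sigma_c = 400, 10, 1.0, 1.0
z = rng.standard_normal(G * M)
c = np.repeat(sigma_c * rng.standard_normal(G), M)
y = (beta * z + c + rng.standard_normal(G * M) > 0).astype(float)
b_pooled = pooled_probit_slope(y, z, np.arange(0.3, 1.3, 0.02))
scale = beta / np.sqrt(1.0 + sigma_c**2)   # attenuated plim, about 0.707
```

The pooled estimate lands near $\beta/\sqrt{2} \approx 0.71$ rather than the structural $\beta = 1$, illustrating that pooled probit recovers the scaled coefficients, which is what average partial effects depend on.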

We already discussed the issue of quantities of interest, including parameters and average partial effects. For estimation, if we assume $c_g$ is independent of $(x_g, Z_g)$ with a $\mathrm{Normal}(0, \sigma_c^2)$ distribution, then pooled probit consistently estimates the scaled coefficients (multiplied by $(1 + \sigma_c^2)^{-1/2}$). The pooled or partial maximum likelihood estimator is sometimes called a pseudo maximum likelihood estimator.

If we add the conditional independence assumption that $u_{g1}, \ldots, u_{g,M_g}$ are independent conditional on $(x_g, Z_g, c_g)$, then we can use random effects probit, albeit in an unbalanced case. As we saw before, all parameters are identified.

A challenging task, and one that appears not to

have gotten much attention for true cluster samples, is allowing correlation between the unobserved heterogeneity, $c_g$, and the covariates that vary within group, $z_{gm}$. For linear models, we know that the fixed effects estimator allows arbitrary correlation and does not restrict the within-cluster dependence of $(u_{g1}, \ldots, u_{g,M_g})$. Unfortunately, allowing correlation between $c_g$ and $(z_{g1}, z_{g2}, \ldots, z_{g,M_g})$ is much more difficult in nonlinear models. Even if we assume normality and exchangeability in the mean, we must at least allow for different variances:

$$c_g \mid z_{g1}, \ldots, z_{g,M_g} \sim \mathrm{Normal}(\psi + \bar{z}_g \xi, \sigma_{a,M_g}^2), \tag{24}$$

where $\sigma_{a,M_g}^2$ denotes a different variance for each group size, $M_g$. Then the marginal distributions are

$$P(y_{gm} = 1 \mid Z_g) = \Phi\left[(\psi + z_{gm} \gamma + \bar{z}_g \xi) / (1 + \sigma_{a,M_g}^2)^{1/2}\right]. \tag{25}$$
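Formula (25) is straightforward to evaluate directly. The small sketch below (names ours) shows how the group-size-specific variance $\sigma_{a,M_g}^2$ attenuates the index.

```python
from math import erf, sqrt

def Phi(a):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(a / sqrt(2.0)))

def marginal_prob(index, sigma2_a):
    """Marginal response probability (25): Phi(index / (1 + sigma2_a)^{1/2}),
    where index = psi + z_gm*gamma + zbar_g*xi and sigma2_a is the
    group-size-specific variance sigma^2_{a,Mg}."""
    return Phi(index / sqrt(1.0 + sigma2_a))
```

For a given positive index, a larger $\sigma_{a,M_g}^2$ pulls the response probability toward 1/2, which is why estimation must keep track of group size.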

Any estimation must account for the different variances for different group sizes. With very large $G$ and little variation in $M_g$, we might just use the unrestricted estimates $\hat{\psi}_{M_g}$, $\hat{\gamma}_{M_g}$, $\hat{\xi}_{M_g}$, estimate the APEs for each group size, and then average these across group size. But more work needs to be done to see if such an approach loses too much in terms of efficiency.

The methods of Altonji and Matzkin (2005) can be applied to allow more flexible relationships between $c_g$ and $\bar{z}_g$, say, or other functions of $(z_{g1}, \ldots, z_{g,M_g})$.

The logit conditional MLE applies to cluster samples without change, so we can estimate parameters without restricting $D(c_g \mid z_{g1}, \ldots, z_{g,M_g})$.

### A Small Number of Groups and Large Group Sizes

Unlike in the linear case, for nonlinear models exact inference is unavailable even under the strongest set of assumptions. But approximate inference is available if the group sizes $M_g$ are reasonably large. With small $G$ and random sampling of $\{(y_{gm}, z_{gm}) : m = 1, \ldots, M_g\}$, write

$$P(y_{gm} = 1 \mid z_{gm}) = \Phi(\alpha_g + z_{gm} \gamma) \tag{26}$$
$$\alpha_g = \alpha + x_g \beta, \quad g = 1, \ldots, G. \tag{27}$$

Using a minimum distance approach, in a first step we estimate a series of $G$ probits (or pool across $g$ to impose common slopes) and obtain the group "fixed effects" $\hat{\alpha}_g$, $g = 1, \ldots, G$. Then, we impose the restrictions in (27) using linear MD estimation,

just as before. Now, the asymptotic variances $\mathrm{Avar}(\hat{\alpha}_g)$ come from the probits. The DL approach also applies with large $M_g$, but we again must assume $\alpha_g = \alpha + x_g \beta + c_g$, where $c_g$ is independent of $x_g$ and homoskedastic normal. As in the linear case, we just use classical linear model inference in the equation $\hat{\alpha}_g = \alpha + x_g \beta + c_g$, provided $G > K + 1$. The same holds for virtually any nonlinear model with an index structure: the second step is linear regression.

### Large $G$ and Large $M_g$

As in the linear case, more flexibility is afforded if $G$ is somewhat large along with large $M_g$, because we can relax the normality assumption on $c_g$ in analyzing the regression $\hat{\alpha}_g$ on $1, x_g$, $g = 1, \ldots, G$.

A version of the method proposed by Berry, Levinsohn, and Pakes (1995) for estimating structural models using individual-level data, market-level data, or both can be treated in the large $G$, large $M_g$ framework, where $g$ indexes the good or market and $m$ indexes individuals within a market. Suppose there is a single good across many markets (so $G$ is large) and we have many individuals within each market (the $M_g$ are large). The main difference from what we have done up until now is that BLP must allow correlation between $x_g$ (particularly, price) and the unobserved product attributes in market $g$, $c_g$. So the second step involves instrumental variables estimation of the equation $\hat{\alpha}_g = \alpha + x_g \beta + c_g$. If the $M_g$ are large enough to ignore estimation error in

the $\hat{\alpha}_g$, then the second step can be analyzed just like standard cross-section IV estimation. (BLP actually derive an asymptotic variance that accounts for estimation error in the $\hat{\alpha}_g$, along with the uncertainty in $c_g$, and simulation error, which comes from difficult computational problems in the first stage.)
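The second step just described can be sketched as a just-identified IV regression of the estimated market intercepts on price. In the toy example below, `w` plays the role of an excluded cost shifter; all names and the data-generating process are our own illustration, not BLP's, and first-stage estimation error in the intercepts is ignored (the large-$M_g$ case).

```python
import numpy as np

def iv_second_step(alpha_hat, x, w):
    """Just-identified IV for alpha_hat_g = alpha + beta*x_g + c_g,
    instrumenting x_g (price) with w_g (an excluded cost shifter)."""
    X = np.column_stack([np.ones(len(x)), x])
    Z = np.column_stack([np.ones(len(x)), w])
    return np.linalg.solve(Z.T @ X, Z.T @ alpha_hat)

# toy markets: price x_g is correlated with the unobserved attribute c_g
rng = np.random.default_rng(4)
G = 500
c = rng.standard_normal(G)                  # unobserved product attribute
w = rng.standard_normal(G)                  # instrument (cost shifter)
x = w + 0.8 * c + rng.standard_normal(G)    # price, endogenous
alpha_hat = 1.0 - 2.0 * x + c               # "large M_g": sampling error ignored
theta_iv = iv_second_step(alpha_hat, x, w)
```

OLS of $\hat{\alpha}_g$ on price is biased toward zero here because price and $c_g$ are positively correlated, while the IV slope recovers the true value of $-2$.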


### Testing for serial correlation in linear panel-data models

The Stata Journal (2003) 3, Number 2, pp. 168 177 Testing for serial correlation in linear panel-data models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear panel-data

### The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College.

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables Kathleen M. Lang* Boston College and Peter Gottschalk Boston College Abstract We derive the efficiency loss

### CAPM, Arbitrage, and Linear Factor Models

CAPM, Arbitrage, and Linear Factor Models CAPM, Arbitrage, Linear Factor Models 1/ 41 Introduction We now assume all investors actually choose mean-variance e cient portfolios. By equating these investors

### The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium

### Panel Data Econometrics

Panel Data Econometrics Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans University of Orléans January 2010 De nition A longitudinal, or panel, data set is

### Instrumental Variables & 2SLS

Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20 - Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

### AP Statistics 2011 Scoring Guidelines

AP Statistics 2011 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

### Sampling Distributions and the Central Limit Theorem

135 Part 2 / Basic Tools of Research: Sampling, Measurement, Distributions, and Descriptive Statistics Chapter 10 Sampling Distributions and the Central Limit Theorem In the previous chapter we explained

### Clustered Errors in Stata

10 Sept 2007 Clustering of Errors Cluster-Robust Standard Errors More Dimensions A Seemingly Unrelated Topic Clustered Errors Suppose we have a regression model like Y it = X it β + u i + e it where the

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008. Description of the course

ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008 Instructor: Maria Heracleous Lectures: M 8:10-10:40 p.m. WARD 202 Office: 221 Roper Phone: 202-885-3758 Office Hours: M W

### Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics

Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics John Pepper Assistant Professor Department of Economics University of Virginia 114 Rouss

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### TIME SERIES REGRESSION

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 TIME SERIES REGRESSION I. AGENDA: A. A couple of general considerations in analyzing time series data B. Intervention analysis

### Machine Learning Methods for Demand Estimation

Machine Learning Methods for Demand Estimation By Patrick Bajari, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang Over the past decade, there has been a high level of interest in modeling consumer behavior

### UNIVERSITY OF WAIKATO. Hamilton New Zealand

UNIVERSITY OF WAIKATO Hamilton New Zealand Can We Trust Cluster-Corrected Standard Errors? An Application of Spatial Autocorrelation with Exact Locations Known John Gibson University of Waikato Bonggeun

### Non Parametric Inference

Maura Department of Economics and Finance Università Tor Vergata Outline 1 2 3 Inverse distribution function Theorem: Let U be a uniform random variable on (0, 1). Let X be a continuous random variable

### Econometrics of Survey Data

Econometrics of Survey Data Manuel Arellano CEMFI September 2014 1 Introduction Suppose that a population is stratified by groups to guarantee that there will be enough observations to permit suffi ciently

### IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

### Models for Longitudinal and Clustered Data

Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052)

Department of Economics Session 2012/2013 University of Essex Spring Term Dr Gordon Kemp EC352 Econometric Methods Solutions to Exercises from Week 10 1 Problem 13.7 This exercise refers back to Equation

### Multivariate Analysis of Health Survey Data

10 Multivariate Analysis of Health Survey Data The most basic description of health sector inequality is given by the bivariate relationship between a health variable and some indicator of socioeconomic

### Financial Risk Management Exam Sample Questions/Answers

Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period

### Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions

Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially

### Problems with OLS Considering :

Problems with OLS Considering : we assume Y i X i u i E u i 0 E u i or var u i E u i u j 0orcov u i,u j 0 We have seen that we have to make very specific assumptions about u i in order to get OLS estimates

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

### PANEL DATA METHODS FOR FRACTIONAL RESPONSE VARIABLES WITH AN APPLICATION TO TEST PASS RATES

PANEL DATA METHODS FOR FRACTIONAL RESPONSE VARIABLES WITH AN APPLICATION TO TEST PASS RATES Leslie E. Papke Department of Economics Michigan State University East Lansing, MI 48824-1038 papke@msu.edu Jeffrey

### C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)}

C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)} 1. EES 800: Econometrics I Simple linear regression and correlation analysis. Specification and estimation of a regression model. Interpretation of regression

### Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one

Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win \$1 if head and lose \$1 if tail. After 10 plays, you lost \$2 in net (4 heads

### Introduction to Regression Models for Panel Data Analysis. Indiana University Workshop in Methods October 7, 2011. Professor Patricia A.

Introduction to Regression Models for Panel Data Analysis Indiana University Workshop in Methods October 7, 2011 Professor Patricia A. McManus Panel Data Analysis October 2011 What are Panel Data? Panel

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Estimating the random coefficients logit model of demand using aggregate data

Estimating the random coefficients logit model of demand using aggregate data David Vincent Deloitte Economic Consulting London, UK davivincent@deloitte.co.uk September 14, 2012 Introduction Estimation

### The Method of Least Squares

Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this

### ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems

Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed

### 1 Short Introduction to Time Series

ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The

### Lecture 3: Differences-in-Differences

Lecture 3: Differences-in-Differences Fabian Waldinger Waldinger () 1 / 55 Topics Covered in Lecture 1 Review of fixed effects regression models. 2 Differences-in-Differences Basics: Card & Krueger (1994).

### Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

### Estimation and Inference in Cointegration Models Economics 582

Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there

### Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

### My factorial research design yields panel data in which i vignettes (i = 1,, 10) are nested

1 SUPPLEENTAL ATERIAL Analytic Strategy: Studies 1ab y factorial research design yields panel data in which i vignettes (i = 1,, 10) are nested within individuals ( = 1,, J). As a result, I estimated two-level

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 2 Simple Linear Regression

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 2 Simple Linear Regression Hi, this is my second lecture in module one and on simple

### Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston

### Regression, least squares

Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

### CHAPTER 6. SIMULTANEOUS EQUATIONS

Economics 24B Daniel McFadden 1999 1. INTRODUCTION CHAPTER 6. SIMULTANEOUS EQUATIONS Economic systems are usually described in terms of the behavior of various economic agents, and the equilibrium that

### Homework #3 is due Friday by 5pm. Homework #4 will be posted to the class website later this week. It will be due Friday, March 7 th, at 5pm.

Homework #3 is due Friday by 5pm. Homework #4 will be posted to the class website later this week. It will be due Friday, March 7 th, at 5pm. Political Science 15 Lecture 12: Hypothesis Testing Sampling

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### An Introduction to Hierarchical Linear Modeling for Marketing Researchers

An Introduction to Hierarchical Linear Modeling for Marketing Researchers Barbara A. Wech and Anita L. Heck Organizations are hierarchical in nature. Specifically, individuals in the workplace are entrenched