# What's New in Econometrics? Lecture 10: Difference-in-Differences Estimation


Jeff Wooldridge, NBER Summer Institute, 2007

1. Review of the Basic Methodology
2. How Should We View Uncertainty in DD Settings?
3. General Settings for DD Analysis: Multiple Groups and Time Periods
4. Individual-Level Panel Data
5. Semiparametric and Nonparametric Approaches
6. Synthetic Control Methods for Comparative Case Studies

## 1. Review of the Basic Methodology

The standard case: outcomes are observed for two groups for two time periods. One of the groups is exposed to a treatment in the second period but not in the first period. The second group is not exposed to the treatment during either period. In the case where the same units within a group are observed in each time period (panel data), the average gain in the second (control) group is subtracted from the average gain in the first (treatment) group. This removes biases in second-period comparisons between the treatment and control groups that could result from permanent differences between those groups, as well as biases from comparisons over time in the treatment group that could be the result of trends.

With repeated cross sections, let A be the control group and B the treatment group. Write

$y = \beta_0 + \beta_1 dB + \delta_0 d2 + \delta_1 d2 \cdot dB + u,$ (1)

where y is the outcome of interest. The dummy dB captures possible differences between the treatment and control groups prior to the policy change. The dummy d2 captures aggregate factors that would cause changes in y even in the absence of a policy change. The coefficient of interest is $\delta_1$. The difference-in-differences estimate is

$\hat\delta_1 = (\bar{y}_{B,2} - \bar{y}_{B,1}) - (\bar{y}_{A,2} - \bar{y}_{A,1}).$ (2)

Inference based on even moderate sample sizes in each of the four groups is straightforward, and it is easily made robust to different group/time-period variances in the regression framework.

More convincing analysis is sometimes available by refining the definition of treatment and control groups. Example: a change in state health care policy aimed at the elderly. One could use data only on people in the state with the policy change, both before and after the change, with the control group being people aged 55 to 65 (say) and the treatment group being people over 65. This DD analysis assumes that the paths of health outcomes for the younger and older groups would not be systematically different in the absence of intervention. Instead, one might use the over-65 population from another state as an additional control. Let dE be a dummy equal to one for someone over 65.
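As a numerical check of (1) and (2): the OLS coefficient on $d2 \cdot dB$ in the saturated regression reproduces the four-cell-means estimate exactly. A minimal numpy sketch on simulated repeated cross sections (all parameter values are invented for illustration; the true effect is 2.0):

```python
import numpy as np

# Simulated repeated cross sections for the two-group/two-period design in (1).
rng = np.random.default_rng(0)
n = 2000
dB = rng.integers(0, 2, n)    # treatment-group dummy
d2 = rng.integers(0, 2, n)    # second-period dummy
y = 1.0 + 1.0 * dB + 0.5 * d2 + 2.0 * d2 * dB + rng.normal(0, 1, n)

# DD from the four group/period cell means, as in (2)
dd_means = ((y[(dB == 1) & (d2 == 1)].mean() - y[(dB == 1) & (d2 == 0)].mean())
            - (y[(dB == 0) & (d2 == 1)].mean() - y[(dB == 0) & (d2 == 0)].mean()))

# The same number as the OLS coefficient on d2*dB in (1)
X = np.column_stack([np.ones(n), dB, d2, d2 * dB])
dd_ols = np.linalg.lstsq(X, y, rcond=None)[0][3]
```

The regression form is what makes robust inference easy: one simply uses heteroskedasticity-robust standard errors on the interaction coefficient.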

The expanded equation is

$y = \beta_0 + \beta_1 dB + \beta_2 dE + \beta_3 dB \cdot dE + \delta_0 d2 + \delta_1 d2 \cdot dB + \delta_2 d2 \cdot dE + \delta_3 d2 \cdot dB \cdot dE + u.$ (3)

The coefficient of interest is $\delta_3$, the coefficient on the triple interaction term, $d2 \cdot dB \cdot dE$. The OLS estimate $\hat\delta_3$ can be expressed as follows:

$\hat\delta_3 = [(\bar{y}_{B,E,2} - \bar{y}_{B,E,1}) - (\bar{y}_{A,E,2} - \bar{y}_{A,E,1})] - [(\bar{y}_{B,N,2} - \bar{y}_{B,N,1}) - (\bar{y}_{A,N,2} - \bar{y}_{A,N,1})],$ (4)

where the A subscript means the state not implementing the policy and the N subscript represents the non-elderly. This is the difference-in-difference-in-differences (DDD) estimate. Covariates can be added to either the DD or DDD analysis to (hopefully) control for compositional changes. Multiple time periods and groups can be used.
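A numerical sketch of (3) and (4): the coefficient on the triple interaction in the saturated regression equals the DDD combination of the eight cell means. All parameter values are invented; the true $\delta_3$ is 1.0:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8000
dB = rng.integers(0, 2, n)   # policy-state dummy
dE = rng.integers(0, 2, n)   # elderly dummy
d2 = rng.integers(0, 2, n)   # second-period dummy
y = (0.3 + 0.5 * dB + 0.4 * dE + 0.2 * dB * dE + 0.6 * d2
     + 0.1 * d2 * dB + 0.3 * d2 * dE + 1.0 * d2 * dB * dE
     + rng.normal(0, 1, n))

def cell_mean(b, e, t):
    return y[(dB == b) & (dE == e) & (d2 == t)].mean()

# DDD from the eight cell means, as in (4)
ddd_means = ((cell_mean(1, 1, 1) - cell_mean(1, 1, 0))
             - (cell_mean(0, 1, 1) - cell_mean(0, 1, 0))
             - ((cell_mean(1, 0, 1) - cell_mean(1, 0, 0))
                - (cell_mean(0, 0, 1) - cell_mean(0, 0, 0))))

# Identical to the OLS coefficient on d2*dB*dE in the saturated model (3)
X = np.column_stack([np.ones(n), dB, dE, dB * dE, d2,
                     d2 * dB, d2 * dE, d2 * dB * dE])
ddd_ols = np.linalg.lstsq(X, y, rcond=None)[0][7]
```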

## 2. How Should We View Uncertainty in DD Settings?

Standard approach: all uncertainty in inference enters through sampling error in estimating the means of each group/time period combination. This has a long history in the analysis of variance.

Recently, different approaches have been suggested that focus on different kinds of uncertainty, perhaps in addition to sampling error in estimating means. Bertrand, Duflo, and Mullainathan (2004), Donald and Lang (2007), Hansen (2007a,b), and Abadie, Diamond, and Hainmueller (2007) argue for additional sources of uncertainty. In fact, for the most part, the additional uncertainty is assumed to swamp the sampling error in estimating group/time period means. (See the DL approach in the cluster sample notes, although we did not explicitly introduce a time dimension there.)

One way to view the uncertainty introduced in the DL framework, and a perspective explicitly taken by ADH, is that our analysis should better reflect the uncertainty in the quality of the control groups.

Issue: In the standard DD and DDD cases, the policy effect is just identified in the sense that we do not have multiple treatment or control groups assumed to have the same mean responses. So the DL approach does not allow inference.

Example from Meyer, Viscusi, and Durbin (1995) on estimating the effects of benefit generosity on the length of time a worker spends on workers' compensation. MVD have the standard DD setting: a before and after period, where the policy change was to raise the cap on covered earnings; the control group is low earners. Using Kentucky and a total sample size of 5,626, the DD estimate of the policy change is about 19.2% (longer time on workers' compensation). Using Michigan, with a total sample size of 1,524, the DD estimate is a very similar 19.1%, but with a much larger standard error. (Adding controls does not help reduce the standard error.) There seems to be plenty of uncertainty in the estimate even with a pretty large sample size. Should we conclude that we really have no usable data for inference?

## 3. General Settings for DD Analysis: Multiple Groups and Time Periods

The DD and DDD methodologies can be applied to more than two time periods. In the DD case, add a full set of time dummies to the equation. This assumes the policy has the same effect in every year, an assumption that is easily relaxed. In a DDD analysis, a full set of dummies is included for each of the two kinds of groups and all time periods, as well as all pairwise interactions. Then, a policy dummy (or sometimes a continuous policy variable) measures the effect of the policy. See Meyer (1995) for applications.

With many time periods and groups, a general framework considered by BDM (2004) and Hansen (2007b) is useful. The equation at the individual level is

$y_{igt} = \lambda_t + \alpha_g + x_{gt}\beta + z_{igt}\gamma_{gt} + v_{gt} + u_{igt}, \quad i = 1, \ldots, M_{gt},$ (5)

where i indexes individuals, g indexes groups, and t indexes time. This model has a full set of time effects, $\lambda_t$, a full set of group effects, $\alpha_g$, group/time period covariates, $x_{gt}$ (these are the policy variables), individual-specific covariates, $z_{igt}$, unobserved group/time effects, $v_{gt}$, and individual-specific errors, $u_{igt}$. We are interested in estimating $\beta$. As in the cluster sample cases, we can write

$y_{igt} = \delta_{gt} + z_{igt}\gamma_{gt} + u_{igt}, \quad i = 1, \ldots, M_{gt},$ (6)

which shows a model at the individual level where both the intercepts and slopes are allowed to differ across all (g, t) pairs. Then, we think of $\delta_{gt}$ as

$\delta_{gt} = \lambda_t + \alpha_g + x_{gt}\beta + v_{gt}.$ (7)

We can think of (7) as a regression model at the group/time period level.

As discussed by BDM, a common way to estimate $\beta$ and perform inference in (5) is to ignore $v_{gt}$, so that the individual-level observations are treated as independent. When $v_{gt}$ is present, the resulting inference can be very misleading. BDM and Hansen (2007b) allow serial correlation in $\{v_{gt}: t = 1, 2, \ldots, T\}$ but assume independence across g.

If we view (7) as ultimately of interest, there are simple ways to proceed. We observe $x_{gt}$, $\lambda_t$ is handled with year dummies, and $\alpha_g$ just represents group dummies. The problem, then, is that we do not observe $\delta_{gt}$. Use OLS on the individual-level data to estimate the $\delta_{gt}$, assuming $E(z_{igt}' u_{igt}) = 0$ and that the group/time period sizes, $M_{gt}$, are reasonably large.

Sometimes one wishes to impose some homogeneity in the slopes, say $\gamma_{gt} = \gamma_g$ or even $\gamma_{gt} = \gamma$, in which case pooling can be used to impose such restrictions. In any case, proceed as if the $M_{gt}$ are large enough to ignore the estimation error in the $\hat\delta_{gt}$; instead, the uncertainty comes through $v_{gt}$ in (7). The MD approach from the cluster sample notes effectively drops $v_{gt}$ from (7) and views $\delta_{gt} = \lambda_t + \alpha_g + x_{gt}\beta$ as a set of deterministic restrictions to be imposed on the $\delta_{gt}$. Inference using the efficient MD estimator uses only sampling variation in the $\hat\delta_{gt}$. Here, we proceed ignoring estimation error, and so act as if (7) is, for $t = 1, \ldots, T$, $g = 1, \ldots, G$,

$\hat\delta_{gt} = \lambda_t + \alpha_g + x_{gt}\beta + v_{gt}.$ (8)
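The two-step procedure just described, first estimating each $\delta_{gt}$ from individual-level regressions as in (6) and then running the group/time-level regression (8), can be sketched as follows. The data-generating process and all parameter values are invented for illustration (true $\beta = 0.7$):

```python
import numpy as np

rng = np.random.default_rng(2)
G, T, M = 20, 6, 200                    # groups, periods, cell size M_gt
alpha = rng.normal(0, 1, G)             # group effects
lam = rng.normal(0, 1, T)               # time effects
x = rng.normal(0, 1, (G, T))            # group/time policy variable
v = rng.normal(0, 0.3, (G, T))          # unobserved group/time effects
delta = alpha[:, None] + lam[None, :] + 0.7 * x + v   # delta_gt in (7)

# Step 1: estimate delta_gt cell by cell, regressing y on (1, z) within each (g, t)
delta_hat = np.empty((G, T))
for g in range(G):
    for t in range(T):
        z = rng.normal(0, 1, M)
        y = delta[g, t] + 0.5 * z + rng.normal(0, 1, M)
        Z = np.column_stack([np.ones(M), z])
        delta_hat[g, t] = np.linalg.lstsq(Z, y, rcond=None)[0][0]

# Step 2: regress delta_hat on x_gt plus full group and year dummies, as in (8)
g_idx = np.repeat(np.arange(G), T)
t_idx = np.tile(np.arange(T), G)
D = np.column_stack([x.ravel(),
                     (g_idx[:, None] == np.arange(G)).astype(float),
                     (t_idx[:, None] == np.arange(1, T)).astype(float)])
beta_hat = np.linalg.lstsq(D, delta_hat.ravel(), rcond=None)[0][0]
```

With the cell sizes $M_{gt}$ reasonably large, the first-stage estimation error is small relative to $v_{gt}$, which is the premise of the approach in the text.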

We can apply the BDM findings and the Hansen (2007a) results directly to this equation. Namely, if we estimate (8) by OLS, with full year and group effects along with $x_{gt}$, then the OLS estimator has satisfying properties as G and T both increase, provided $\{v_{gt}: t = 1, 2, \ldots, T\}$ is a weakly dependent time series for all g. The simulations in BDM and Hansen (2007a) indicate that cluster-robust inference, where each cluster is a set of time periods, works reasonably well when $v_{gt}$ follows a stable AR(1) model and G is moderately large.

Hansen (2007b), noting that the OLS estimator (the fixed effects estimator) applied to (8) is inefficient when $v_{gt}$ is serially correlated, proposes feasible GLS. When T is small, estimating the parameters in $\Omega_g = \mathrm{Var}(v_g)$, where $v_g$ is the $T \times 1$ error vector for each g, is difficult when group effects have been removed. The bias in estimates based on the FE residuals $\hat{v}_{gt}$ disappears as $T \to \infty$, but it can be substantial for small T. In the AR(1) case, $\hat\rho$ comes from the pooled OLS regression

$\hat{v}_{gt} \text{ on } \hat{v}_{g,t-1}, \quad t = 2, \ldots, T, \ g = 1, \ldots, G.$ (9)

One way to account for the bias in $\hat\rho$ is to use fully robust inference. But, as Hansen (2007b) shows, this can be very inefficient relative to his suggestion: bias-adjust $\hat\rho$ and then use the bias-adjusted estimator in feasible GLS. (Hansen covers the general AR(p) model.) Hansen shows that an iterative bias-adjustment procedure works well in the setting where the unadjusted estimator is also consistent, namely with G and T both tending to infinity. Most importantly for the application to DD problems, the feasible GLS estimator based on the iterative procedure has the same asymptotic distribution as the infeasible GLS estimator when $G \to \infty$ and T is fixed. Even when G and T are both large, so that the unadjusted AR coefficients also deliver asymptotic efficiency, the bias-adjusted estimates deliver higher-order improvements in the asymptotic distribution.

One limitation of Hansen's results is that they assume $\{x_{gt}: t = 1, \ldots, T\}$ is strictly exogenous. If we just use OLS, that is, the usual fixed effects estimator, strict exogeneity is not required for consistency as $T \to \infty$. That GLS relies on strict exogeneity in serial correlation cases is nothing new. In intervention analysis, this might be a concern if the policies can switch on and off over time.

With large G and small T, one can estimate an unrestricted variance matrix $\Omega_g$ and proceed with GLS; this is the approach suggested by Kiefer (1980) and studied more recently by Hausman and Kuersteiner (2003). It works pretty well with G = 50 and T = 10, but there are substantial size distortions for G = 50 and T = 20.

If the $M_{gt}$ are not large, one might worry about ignoring the estimation error in the $\hat\delta_{gt}$. One can instead aggregate the equations over individuals, giving

$\bar{y}_{gt} = \lambda_t + \alpha_g + x_{gt}\beta + \bar{z}_{gt}\gamma + v_{gt} + \bar{u}_{gt}, \quad t = 1, \ldots, T, \ g = 1, \ldots, G.$ (10)

This can be estimated by FE with fully robust inference, because the composite error, $r_{gt} \equiv v_{gt} + \bar{u}_{gt}$, is weakly dependent.
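The group/time-level fixed effects regression with cluster-robust inference, where each cluster is one group's T time periods, can be sketched as below. The AR(1) data-generating process and all parameter values are invented for illustration (true $\beta = 1.0$):

```python
import numpy as np

rng = np.random.default_rng(3)
G, T = 50, 10
ar1 = 0.5                                # AR(1) coefficient for v_gt
v = np.zeros((G, T))
for t in range(T):
    v[:, t] = (ar1 * v[:, t - 1] if t > 0 else 0.0) + rng.normal(0, 1, G)
x = rng.normal(0, 1, (G, T))             # group/time policy variable
alpha = rng.normal(0, 1, G)              # group effects
lam = rng.normal(0, 1, T)                # time effects
y = alpha[:, None] + lam[None, :] + 1.0 * x + v

# OLS with full group and year dummies: the fixed effects estimator for (8)/(10)
g_idx = np.repeat(np.arange(G), T)
t_idx = np.tile(np.arange(T), G)
D = np.column_stack([x.ravel(),
                     (g_idx[:, None] == np.arange(G)).astype(float),
                     (t_idx[:, None] == np.arange(1, T)).astype(float)])
yv = y.ravel()
coef, *_ = np.linalg.lstsq(D, yv, rcond=None)
resid = yv - D @ coef

# Cluster-robust (sandwich) variance, clustering by group, so the serial
# correlation in v_gt within a group is handled without modeling it
XtX_inv = np.linalg.pinv(D.T @ D)
meat = np.zeros((D.shape[1], D.shape[1]))
for g in range(G):
    rows = g_idx == g
    score = D[rows].T @ resid[rows]
    meat += np.outer(score, score)
se_beta = np.sqrt((XtX_inv @ meat @ XtX_inv)[0, 0])
```

This is the robust OLS benchmark against which Hansen's feasible GLS offers efficiency gains when the serial correlation is well approximated by a low-order AR model.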

The Donald and Lang (2007) approach applies in the current setting by using finite-sample analysis applied to the pooled regression (10). However, DL assume that the errors $v_{gt}$ are uncorrelated across time, and so, even though for small G and T their method uses small degrees of freedom in a t distribution, it does not account for uncertainty due to serial correlation in $v_{gt}$.

## 4. Individual-Level Panel Data

Let $w_{it}$ be a binary indicator, which is unity if unit i participates in the program at time t. Consider

$y_{it} = \delta\, d2_t + \tau w_{it} + c_i + u_{it}, \quad t = 1, 2,$ (11)

where $d2_t = 1$ if $t = 2$ and zero otherwise, $c_i$ is an unobserved effect, and the $u_{it}$ are idiosyncratic errors. The coefficient $\tau$ is the treatment effect. A simple estimation procedure is to first difference to remove $c_i$:

$y_{i2} - y_{i1} = \delta + \tau(w_{i2} - w_{i1}) + (u_{i2} - u_{i1}),$ (12)

or

$\Delta y_i = \delta + \tau \Delta w_i + \Delta u_i.$ (13)

If $E(\Delta w_i \Delta u_i) = 0$, that is, if the change in treatment status is uncorrelated with changes in the idiosyncratic errors, then OLS applied to (13) is consistent. If $w_{i1} = 0$ for all i, the OLS estimate is

$\hat\tau = \overline{\Delta y}_{treat} - \overline{\Delta y}_{control},$ (14)

which is a DD estimate, except that we difference the means of the same units over time.

With many time periods and arbitrary treatment patterns, we can use

$y_{it} = \lambda_t + \tau w_{it} + x_{it}\beta + c_i + u_{it}, \quad t = 1, \ldots, T,$ (15)

which accounts for aggregate time effects and allows for controls, $x_{it}$. Estimation by FE or FD to remove $c_i$ is standard, provided the policy indicator, $w_{it}$, is strictly exogenous: correlation between $w_{it}$ and $u_{ir}$ for any t and r causes inconsistency in both estimators (with FE having some advantages for larger T if $u_{it}$ is weakly dependent).

What if program designation is correlated with unit-specific trends? Consider the correlated random trend model

$y_{it} = c_i + g_i t + \lambda_t + \tau w_{it} + x_{it}\beta + u_{it},$ (16)

where $g_i$ is the trend for unit i. A general analysis allows arbitrary correlation between $(c_i, g_i)$ and $w_{it}$, which requires at least $T \geq 3$.
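Returning to the two-period case, the numerical equivalence of the OLS estimate from (13) and the difference in mean gains in (14) when $w_{i1} = 0$ is easy to verify on simulated data (all values invented; true effect 1.5):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1000
c = rng.normal(0, 1, N)                 # unobserved effects c_i
w2 = rng.integers(0, 2, N)              # treatment in period 2 only, so w_i1 = 0
y1 = c + rng.normal(0, 1, N)
y2 = 0.8 + 1.5 * w2 + c + rng.normal(0, 1, N)
dy, dw = y2 - y1, w2                    # with w_i1 = 0, Δw_i = w_i2

# OLS of Δy on (1, Δw), as in (13); differencing has removed c_i
X = np.column_stack([np.ones(N), dw])
tau_fd = np.linalg.lstsq(X, dy, rcond=None)[0][1]

# Equals the difference in mean gains, as in (14)
tau_dd = dy[dw == 1].mean() - dy[dw == 0].mean()
```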

First differencing (16) gives, for $t = 2, \ldots, T$,

$\Delta y_{it} = g_i + \eta_t + \tau \Delta w_{it} + \Delta x_{it}\beta + \Delta u_{it},$ (17)

where $\eta_t \equiv \lambda_t - \lambda_{t-1}$. We can difference again or estimate (17) by FE.

Standard panel data approaches can be derived using the counterfactual framework from the treatment effects literature. For each (i, t), let $y_{it}(1)$ and $y_{it}(0)$ denote the counterfactual outcomes, and assume there are no covariates. Unconfoundedness, conditional on unobserved heterogeneity, can be stated as

$E[y_{it}(0) \mid w_i, c_{i0}, c_{i1}] = E[y_{it}(0) \mid c_{i0}],$ (18)

$E[y_{it}(1) \mid w_i, c_{i0}, c_{i1}] = E[y_{it}(1) \mid c_{i1}],$ (19)

where $w_i = (w_{i1}, \ldots, w_{iT})$ is the time sequence of all treatments. If the gain from treatment depends only on t,

$E[y_{it}(1) \mid c_{i1}] = E[y_{it}(0) \mid c_{i0}] + \tau_t,$ (20)

and then

$E[y_{it} \mid w_i, c_{i0}, c_{i1}] = E[y_{it}(0) \mid c_{i0}] + \tau_t w_{it}.$ (21)

If we assume

$E[y_{it}(0) \mid c_{i0}] = \eta_{t0} + c_{i0},$ (22)

then

$E[y_{it} \mid w_i, c_{i0}, c_{i1}] = \eta_{t0} + c_{i0} + \tau_t w_{it},$ (23)

an estimating equation that leads to FE or FD (often with $\tau_t = \tau$). If we add strictly exogenous covariates, assume linearity of the conditional expectations, and allow the gain from treatment to depend on $x_{it}$ and an additive unobserved effect $a_i$, we get

$E[y_{it} \mid w_i, x_i, c_{i0}, a_i] = \eta_{t0} + \tau_t w_{it} + x_{it}\beta_0 + w_{it}(x_{it} - \psi_t)\delta + c_{i0} + a_i w_{it},$ (24)

with $\psi_t = E(x_{it})$, a correlated random coefficient model, because the coefficient on $w_{it}$ is $\tau_t + a_i$. We can eliminate $a_i$ (and $c_{i0}$) by suitable transformations. Or, with $\tau_t = \tau$, we can estimate the unit-specific coefficients $\hat\tau_i$ and then use

$\hat\tau = N^{-1}\sum_{i=1}^{N} \hat\tau_i.$ (25)

See Wooldridge (2002, Section 11.2) for a standard error, or use bootstrapping.

## 5. Semiparametric and Nonparametric Approaches

Return to the setting with two groups and two time periods. Athey and Imbens (2006) generalize the standard DD model in several ways. Let the two time periods be t = 0 and 1, and label the two groups g = 0 and 1. Let $Y_i(0)$ be the counterfactual outcome in the absence of intervention and $Y_i(1)$ the counterfactual outcome with intervention. AI assume that

$Y_i(0) = h_0(U_i, T_i),$ (26)

where $T_i$ is the time period and

$h_0(u, t)$ is strictly increasing in u for $t = 0, 1$. (27)

The random variable $U_i$ represents all unobservable characteristics of individual i. Equation (26) incorporates the idea that the outcome of an individual with $U_i = u$ will be the same in a given time period, irrespective of group membership. The distribution of $U_i$ is allowed to vary across groups, but not over time within groups, so that

$D(U_i \mid T_i, G_i) = D(U_i \mid G_i).$ (28)

The standard DD model can be expressed in this way, with

$h_0(u, t) = u + \delta t$ (29)

and

$U_i = \alpha G_i + V_i, \quad V_i \text{ independent of } (G_i, T_i),$ (30)

although, because of the linearity, we can get by with the mean independence assumption $E(V_i \mid G_i, T_i) = 0$. With a constant treatment effect $\tau$,

$Y_i = \delta T_i + \alpha G_i + \tau G_i T_i + V_i.$ (31)

Because $E(V_i \mid G_i, T_i) = 0$, the parameters in (31) can be estimated by OLS (the usual DD analysis).

Athey and Imbens call their extension of the usual DD model the changes-in-changes (CIC) model. Under their assumptions (with an extra support condition), one can recover

$D(Y_i(0) \mid G_i = 1, T_i = 1).$ (32)

In fact, if $F^0_{gt}(y)$ is the cumulative distribution function of $D(Y_i(0) \mid G_i = g, T_i = t)$ for g = 0, 1 and t = 0, 1, and $F_{gt}(y)$ is the cdf of the observed outcome $Y_i$ conditional on $G_i = g$ and $T_i = t$, then

$F^0_{11}(y) = F_{10}(F^{-1}_{00}(F_{01}(y))),$ (33)

where $F^{-1}_{00}$ is the inverse function of $F_{00}$, which exists under the strict monotonicity assumption. Because $F^1_{11}(y) = F_{11}(y)$, we can estimate the entire distributions of both counterfactuals conditional on intervention, $G_i = T_i = 1$. This can be applied to repeated cross sections or panel data. Of course, we can also identify the average treatment effect. In particular,

$\tau_{CIC} = E[Y_{11}(1)] - E[Y_{11}(0)]$ (34)

$= E(Y_{11}) - E[F^{-1}_{01}(F_{00}(Y_{10}))],$ (35)

where $Y_{gt}$ denotes the outcome for group g in period t.

Other approaches are available with panel data: Altonji and Matzkin (2005), under exchangeability in $D(U_i \mid W_{i1}, \ldots, W_{iT})$; Heckman, Ichimura, Smith, and Todd (1997); and Abadie (2005). Consider the basic setup with two time periods and no treated units in the first time period. Without an i subscript, let $Y_t(w)$ be the counterfactual outcome for treatment level w, w = 0, 1, at time t. The parameter of interest is the average treatment effect on the treated,

$\tau_{ATT} = E[Y_1(1) - Y_1(0) \mid W = 1].$ (36)

Remember, in the current setup, no units are treated in the initial time period, so W = 1 means treatment in the second time period.
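A sketch of the CIC point estimate (35), using empirical cdfs and quantiles in place of $F_{00}$ and $F^{-1}_{01}$. The simulated design below satisfies the CIC assumptions with $h_0(u, t) = u + 0.5t$ and a treatment effect of 2.0 on the treated; all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20000
# U | g=0 ~ N(0,1), U | g=1 ~ N(1,1); h0(u, t) = u + 0.5 t; effect tau = 2.0
y00 = rng.normal(0.0, 1.0, n)            # g=0, t=0
y01 = rng.normal(0.5, 1.0, n)            # g=0, t=1
y10 = rng.normal(1.0, 1.0, n)            # g=1, t=0
y11 = rng.normal(1.5, 1.0, n) + 2.0      # g=1, t=1, with treatment effect

def ecdf(sample):
    s = np.sort(sample)
    return lambda y: np.searchsorted(s, y, side="right") / len(s)

# tau_CIC = E[Y_11] - E[F_01^{-1}(F_00(Y_10))], as in (35)
q = np.clip(ecdf(y00)(y10), 1e-6, 1 - 1e-6)   # stay off the boundary for the quantile step
counterfactual = np.quantile(y01, q)          # empirical F_01^{-1} applied to F_00(Y_10)
tau_cic = y11.mean() - counterfactual.mean()
```

In this linear design, $F^{-1}_{01}(F_{00}(y)) = y + 0.5$ in the population, so the imputed counterfactual simply shifts the group-1 first-period outcomes by the control group's time effect.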

The key unconfoundedness assumption is

$E[Y_1(0) - Y_0(0) \mid X, W] = E[Y_1(0) - Y_0(0) \mid X].$ (37)

The overlap condition

$P(W = 1 \mid X) < 1$ (38)

is also critical. Under (37) and (38),

$\tau_{ATT} = E\left\{\frac{[W - p(X)](Y_1 - Y_0)}{1 - p(X)}\right\} \Big/ P(W = 1),$ (39)

where $Y_t$, t = 0, 1, are the observed outcomes (for the same unit) and $p(X) = P(W = 1 \mid X)$ is the propensity score. Dehejia and Wahba (1999) derived (39) for the cross-sectional case. All quantities are observed or, in the case of $p(X)$ and $P(W = 1)$, can be estimated. As in Hirano, Imbens, and Ridder (2003), a flexible logit model can be used for $p(X)$; the fraction of units treated, $\hat\rho$, would be used for $P(W = 1)$. Then

$\hat\tau_{ATT} = \hat\rho^{-1} N^{-1} \sum_{i=1}^{N} \frac{[W_i - \hat{p}(X_i)]\,\Delta Y_i}{1 - \hat{p}(X_i)}$ (40)

is consistent and $\sqrt{N}$-asymptotically normal. HIR discuss variance estimation. Imbens and Wooldridge (2007) provide a simple adjustment that is available when $\hat{p}$ is treated as a parametric model. A similar approach works for the ATE.

Regression version: run the regression $\Delta Y_i$ on $1, W_i, \hat{p}(X_i), W_i \cdot \hat{p}(X_i)$, $i = 1, \ldots, N$. The coefficient on $W_i$ is the estimated ATE. This requires some functional form restrictions, but it is certainly preferred to running the regression $Y_{it}$ on $1, d1_t, d1_t \cdot W_i, \hat{p}(X_i)$.
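A sketch of the weighting estimator (40), with the propensity score fit by a simple logit, estimated here with a few Newton steps rather than a canned routine. The data-generating process and all parameter values are invented; the true (constant) treatment effect is 2.0:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000
x = rng.normal(0, 1, n)
p_true = 1.0 / (1.0 + np.exp(-(0.2 + 0.5 * x)))
w = (rng.uniform(size=n) < p_true).astype(float)
dy = 1.0 + 0.5 * x + 2.0 * w + rng.normal(0, 1, n)   # observed gains ΔY_i

# Logit propensity score, fit by Newton's method on the logistic log likelihood
X1 = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X1 @ b))
    grad = X1.T @ (w - p)
    hess = (X1 * (p * (1 - p))[:, None]).T @ X1
    b += np.linalg.solve(hess, grad)
p_hat = 1.0 / (1.0 + np.exp(-X1 @ b))

# Weighting estimator (40); rho_hat is the fraction of treated units
rho_hat = w.mean()
att = np.mean((w - p_hat) * dy / (1.0 - p_hat)) / rho_hat
```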

This latter regression requires unconfoundedness in the levels, and it is dominated by the basic DD estimate from the regression of $\Delta Y_i$ on $1, W_i$. Regression adjustment can also be used, as in HIST (1997).

## 6. Synthetic Control Methods for Comparative Case Studies

Abadie, Diamond, and Hainmueller (2007) argue that in policy analysis at the aggregate level, there is little or no estimation uncertainty: the goal is to determine the effect of a policy on an entire population, and the aggregate is measured without error (or with very little error). Application: the effect of California's tobacco control program on state-wide smoking rates.

ADH focus on the uncertainty in choosing a suitable control for California from among the other states (those that did not implement comparable policies over the same period). ADH suggest using many potential control groups (38 states or so) to create a single synthetic control group.

Consider two time periods: one before the policy and one after. Let $y_{it}$ be the outcome for unit i in time t, with i = 1 the treated unit. Suppose there are J possible control units, indexed $2, \ldots, J+1$. Let $x_i$ be observed covariates for unit i that are not (or would not be) affected by the policy; $x_i$ may contain period t = 2 covariates provided they are not affected by the policy. Generally, we can estimate the effect of the policy as

$y_{12} - \sum_{j=2}^{J+1} \hat{w}_j y_{j2},$ (41)

where the $\hat{w}_j$ are nonnegative weights that add up to one. How should the weights be chosen to best estimate the intervention effect? ADH propose choosing the weights so as to minimize the distance between $(y_{11}, x_1)$ and $\sum_{j=2}^{J+1} \hat{w}_j (y_{j1}, x_j)$, that is, to match functions of the pre-treatment outcomes and the predictors of post-treatment outcomes. ADH propose permutation methods for inference, which require estimating a placebo treatment effect for each region, using the same synthetic control method as for the region that underwent the intervention.
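A simplified sketch of the computation behind (41): choose nonnegative weights summing to one that make the weighted controls match the treated unit's pre-treatment data, here by projected gradient descent on a plain least-squares objective. (ADH's actual implementation also optimizes over how the individual predictors are weighted; that refinement is omitted.) All data are simulated and the true effect of 1.5 is invented:

```python
import numpy as np

rng = np.random.default_rng(8)
J, k = 10, 4                              # control units; pre-treatment variables (y_{j1}, x_j)
V = rng.normal(0.0, 1.0, (J, k))          # controls' pre-treatment data
w_true = np.zeros(J)
w_true[0], w_true[1] = 0.6, 0.4
v1 = w_true @ V                           # treated unit lies in the controls' convex hull

def project_simplex(v):
    # Euclidean projection onto {w : w >= 0, sum(w) = 1}
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[idx] - 1.0) / (idx + 1)
    return np.maximum(v - theta, 0.0)

# Minimize ||V'w - v1||^2 over the simplex by projected gradient descent
A = V.T
step = 1.0 / np.linalg.norm(A, 2) ** 2
w = np.full(J, 1.0 / J)
for _ in range(10000):
    w = project_simplex(w - step * (A.T @ (A @ w - v1)))

# Post-period: invented control outcomes, and the treated outcome with effect 1.5
y2 = V[:, 0] + rng.normal(0.0, 0.01, J)
y12 = w_true @ V[:, 0] + 1.5
effect = y12 - w @ y2                     # the synthetic control estimate (41)
```

The permutation inference ADH propose would repeat this whole calculation with each control unit playing the role of the treated unit.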


5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial

### A Basic Introduction to Missing Data

John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

### Propensity scores for the estimation of average treatment effects in observational studies

Propensity scores for the estimation of average treatment effects in observational studies Leonardo Grilli and Carla Rampichini Dipartimento di Statistica Giuseppe Parenti Universit di Firenze Training

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### Implementing Propensity Score Matching Estimators with STATA

Implementing Propensity Score Matching Estimators with STATA Barbara Sianesi University College London and Institute for Fiscal Studies E-mail: barbara_s@ifs.org.uk Prepared for UK Stata Users Group, VII

### Profit-sharing and the financial performance of firms: Evidence from Germany

Profit-sharing and the financial performance of firms: Evidence from Germany Kornelius Kraft and Marija Ugarković Department of Economics, University of Dortmund Vogelpothsweg 87, 44227 Dortmund, Germany,

### The Method of Least Squares

Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this

### Introduction to Regression Models for Panel Data Analysis. Indiana University Workshop in Methods October 7, 2011. Professor Patricia A.

Introduction to Regression Models for Panel Data Analysis Indiana University Workshop in Methods October 7, 2011 Professor Patricia A. McManus Panel Data Analysis October 2011 What are Panel Data? Panel

### The synthetic control method compared to difference in diff

The synthetic control method compared to difference in differences: discussion April 2014 Outline of discussion 1 Very brief overview of synthetic control methods 2 Synthetic control methods for disaggregated

### Mgmt 469. Fixed Effects Models. Suppose you want to learn the effect of price on the demand for back massages. You

Mgmt 469 Fixed Effects Models Suppose you want to learn the effect of price on the demand for back massages. You have the following data from four Midwest locations: Table 1: A Single Cross-section of

### Variance of OLS Estimators and Hypothesis Testing. Randomness in the model. GM assumptions. Notes. Notes. Notes. Charlie Gibbons ARE 212.

Variance of OLS Estimators and Hypothesis Testing Charlie Gibbons ARE 212 Spring 2011 Randomness in the model Considering the model what is random? Y = X β + ɛ, β is a parameter and not random, X may be

### 1 Introduction. 2 The Econometric Model. Panel Data: Fixed and Random Effects. Short Guides to Microeconometrics Fall 2015

Short Guides to Microeconometrics Fall 2015 Kurt Schmidheiny Unversität Basel Panel Data: Fixed and Random Effects 2 Panel Data: Fixed and Random Effects 1 Introduction In panel data, individuals (persons,

### LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### the effectiveness of private voucher education: evidence from structural school switches bernardo lara _ alejandra mizala _ andrea repetto

m a y o 001 the effectiveness of private voucher education: evidence from structural school switches bernardo lara _ alejandra mizala _ andrea repetto The Effectiveness of Private Voucher Education: Evidence

### INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its

### Imbens/Wooldridge, Lecture Notes 5, Summer 07 1

Imbens/Wooldridge, Lecture Notes 5, Summer 07 1 What s New in Econometrics NBER, Summer 2007 Lecture 5, Monday, July 30th, 4.30-5.30pm Instrumental Variables with Treatment Effect Heterogeneity: Local

### FULLY MODIFIED OLS FOR HETEROGENEOUS COINTEGRATED PANELS

FULLY MODIFIED OLS FOR HEEROGENEOUS COINEGRAED PANELS Peter Pedroni ABSRAC his chapter uses fully modified OLS principles to develop new methods for estimating and testing hypotheses for cointegrating

### ECON 459 Game Theory. Lecture Notes Auctions. Luca Anderlini Spring 2015

ECON 459 Game Theory Lecture Notes Auctions Luca Anderlini Spring 2015 These notes have been used before. If you can still spot any errors or have any suggestions for improvement, please let me know. 1

### Missing Data Dr Eleni Matechou

1 Statistical Methods Principles Missing Data Dr Eleni Matechou matechou@stats.ox.ac.uk References: R.J.A. Little and D.B. Rubin 2nd edition Statistical Analysis with Missing Data J.L. Schafer and J.W.

### Does Disability Insurance Receipt Discourage Work? Using Examiner Assignment to Estimate Causal Effects of SSDI Receipt

CENTER for DISABILITY RESEARCH Does Disability Insurance Receipt Discourage Work? Using Examiner Assignment to Estimate Causal Effects of SSDI Receipt Nicole Maestas, RAND Kathleen Mullen, RAND Alexander

### Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics

Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics John Pepper Assistant Professor Department of Economics University of Virginia 114 Rouss

### Centre for Central Banking Studies

Centre for Central Banking Studies Technical Handbook No. 4 Applied Bayesian econometrics for central bankers Andrew Blake and Haroon Mumtaz CCBS Technical Handbook No. 4 Applied Bayesian econometrics

### Regression analysis in practice with GRETL

Regression analysis in practice with GRETL Prerequisites You will need the GNU econometrics software GRETL installed on your computer (http://gretl.sourceforge.net/), together with the sample files that

### Panel Data: Fixed and Random Effects

Short Guides to Microeconometrics Fall 2015 Kurt Schmidheiny Unversität Basel Panel Data: Fixed and Random Effects 1 Introduction In panel data, individuals (persons, firms, cities, ) are observed at several

### The Potential Outcome of Schooling: Individual Heterogeneity, Program Risk and Residual Wage Inequality

The Potential Outcome of Schooling: Individual Heterogeneity, Program Risk and Residual Wage Inequality Winfried Pohlmeier University of Konstanz CoFE, ZEW Anton L. Flossmann University of Konstanz Very

### DEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests

DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### 2. What are the theoretical and practical consequences of autocorrelation?

Lecture 10 Serial Correlation In this lecture, you will learn the following: 1. What is the nature of autocorrelation? 2. What are the theoretical and practical consequences of autocorrelation? 3. Since

### Using Hierarchical Linear Models to Measure Growth. Measurement Incorporated Hierarchical Linear Models Workshop

Using Hierarchical Linear Models to Measure Growth Measurement Incorporated Hierarchical Linear Models Workshop Chapter 6 Motivation Linear Growth Curves Quadratic Growth Curves Some other Growth Curves

### Random Effects Models for Longitudinal Survey Data

Analysis of Survey Data. Edited by R. L. Chambers and C. J. Skinner Copyright 2003 John Wiley & Sons, Ltd. ISBN: 0-471-89987-9 CHAPTER 14 Random Effects Models for Longitudinal Survey Data C. J. Skinner

### NOTES ON HLM TERMINOLOGY

HLML01cc 1 FI=HLML01cc NOTES ON HLM TERMINOLOGY by Ralph B. Taylor breck@rbtaylor.net All materials copyright (c) 1998-2002 by Ralph B. Taylor LEVEL 1 Refers to the model describing units within a grouping:

### Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

### Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

### 6/15/2005 7:54 PM. Affirmative Action s Affirmative Actions: A Reply to Sander

Reply Affirmative Action s Affirmative Actions: A Reply to Sander Daniel E. Ho I am grateful to Professor Sander for his interest in my work and his willingness to pursue a valid answer to the critical

### 1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans Université d Orléans April 2010 Introduction De nition We now consider

### Statistical Tests for Multiple Forecast Comparison

Statistical Tests for Multiple Forecast Comparison Roberto S. Mariano (Singapore Management University & University of Pennsylvania) Daniel Preve (Uppsala University) June 6-7, 2008 T.W. Anderson Conference,

### Generalized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)

Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through

### NBER WORKING PAPER SERIES TEACHER QUALITY AT THE HIGH-SCHOOL LEVEL: THE IMPORTANCE OF ACCOUNTING FOR TRACKS. C. Kirabo Jackson

NBER WORKING PAPER SERIES TEACHER QUALITY AT THE HIGH-SCHOOL LEVEL: THE IMPORTANCE OF ACCOUNTING FOR TRACKS C. Kirabo Jackson Working Paper 17722 http://www.nber.org/papers/w17722 NATIONAL BUREAU OF ECONOMIC

### Multiple Imputation for Missing Data: A Cautionary Tale

Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust

### Simultaneous or Sequential? Search Strategies in the U.S. Auto. Insurance Industry

Simultaneous or Sequential? Search Strategies in the U.S. Auto Insurance Industry Elisabeth Honka 1 University of Texas at Dallas Pradeep Chintagunta 2 University of Chicago Booth School of Business September

### Returns to education on earnings in Italy: an evaluation using propensity score techniques

Returns to education on earnings in Italy: an evaluation using propensity score techniques Elena Cottini Università Cattolica del Sacro Cuore, Milan PRELIMINARY DRAFT (June 2005) ABSTRACT: This paper investigates

### Handling attrition and non-response in longitudinal data

Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

### Poor identification and estimation problems in panel data models with random effects and autocorrelated errors

Poor identification and estimation problems in panel data models with random effects and autocorrelated errors Giorgio Calzolari Laura Magazzini January 7, 009 Submitted for presentation at the 15th Conference

### Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

### Multiple Regression: What Is It?

Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in