Clustered Standard Errors



Similar documents
Lecture 3: Differences-in-Differences

Clustering in the Linear Model

Online Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy

HOW MUCH SHOULD WE TRUST DIFFERENCES-IN-DIFFERENCES ESTIMATES?

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week (0.052)

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College.

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

Integrated Resource Plan

Redistributive Taxation and Personal Bankruptcy in US States

Difference in differences and Regression Discontinuity Design

FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

August 2012 EXAMINATIONS Solution Part I

Clustered Errors in Stata

How Much Equity Does the Government Hold?

Optimal Social Insurance Design: UI Benefit Levels

Part 2: Analysis of Relationship Between Two Variables

The Real Business Cycle model

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

Redistributional impact of the National Health Insurance System

UNIVERSITY OF WAIKATO. Hamilton New Zealand

Online Appendices to the Corporate Propensity to Save

ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD

Least Squares Estimation

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

14.74 Lecture 7: The effect of school buildings on schooling: A naturalexperiment

Introduction to Longitudinal Data Analysis

A Basic Introduction to Missing Data

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

VI. Real Business Cycles Models

From the help desk: Bootstrapped standard errors

Chapter 7: Simple linear regression Learning Objectives

Statistics 104: Section 6!

4. Simple regression. QBUS6840 Predictive Analytics.

How to choose an analysis to handle missing data in longitudinal observational studies

Intermediate Macroeconomics: The Real Business Cycle Model

PRELIMINARY DRAFT: PLEASE DO NOT QUOTE

Forecasting in supply chains

Evaluating one-way and two-way cluster-robust covariance matrix estimates

Fixed-Effect Versus Random-Effects Models

Lecture 14 More on Real Business Cycles. Noah Williams

Learning objectives. The Theory of Real Business Cycles

Revealing Taste-Based Discrimination in Hiring: A Correspondence Testing Experiment with Geographic Variation

Construction of variables from CE

problem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved

Lecture 15. Endogeneity & Instrumental Variable Estimation

Internet Appendix to. Why does the Option to Stock Volume Ratio Predict Stock Returns? Li Ge, Tse-Chun Lin, and Neil D. Pearson.

The Effects of Unemployment on Crime Rates in the U.S.

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012

2. Real Business Cycle Theory (June 25, 2013)

Multiple Linear Regression in Data Mining

Retirement routes and economic incentives to retire: a cross-country estimation approach Martin Rasmussen

The Impact of the Medicare Rural Hospital Flexibility Program on Patient Choice

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Do Commodity Price Spikes Cause Long-Term Inflation?

I. Basic concepts: Buoyancy and Elasticity II. Estimating Tax Elasticity III. From Mechanical Projection to Forecast

2013 MBA Jump Start Program. Statistics Module Part 3

Financial Planning with Your Retirement Portfolio

Week 5: Multiple Linear Regression

Poorly Measured Confounders are More Useful on the Left Than on the Right

2. Linear regression with multiple regressors

Simulating Investment Portfolios

The RBC methodology also comes down to two principles:

Why Do Self-Employed Immigrants in Denmark and Sweden Have Such Low Incomes?

Lecture 8 (Chapter 11): Project Analysis and Evaluation

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

Comparison of Estimation Methods for Complex Survey Data Analysis

1 Short Introduction to Time Series

SAMPLE SELECTION BIAS IN CREDIT SCORING MODELS

COURSES: 1. Short Course in Econometrics for the Practitioner (P000500) 2. Short Course in Econometric Analysis of Cointegration (P000537)

Moderation. Moderation

3. Regression & Exponential Smoothing

Department of Economics and Related Studies Financial Market Microstructure. Topic 1 : Overview and Fixed Cost Models of Spreads

Stress Test Modeling in Cards Portfolios

Russian migrants to Russia: assimilation and local labor market effects

A Latent Variable Approach to Validate Credit Rating Systems using R

16 : Demand Forecasting

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Correlated Random Effects Panel Data Models

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

The Wage Impact of the Marielitos: A Reappraisal. George J. Borjas Harvard University. September 2015

17. SIMPLE LINEAR REGRESSION II

Regression Analysis: A Complete Example

From the help desk: hurdle models

Regression step-by-step using Microsoft Excel

Transcription:

Clustered Standard Errors 1. The Attraction of Differences in Differences 2. Grouped Errors Across Individuals 3. Serially Correlated Errors

1. The Attraction of Differences in Differences Estimates Typically evaluate programs which differ across groups, such as U.S. States e.g., effect of changes in state minimum wage laws or state welfare programs on earnings or unemployment Treat selection (heterogeneity) bias by removing state effects (one diff ) Treat common economic fluctuations by removing year effects (the other diff ) Hence the appealing nickname diffs in diffs

2. The Grouped Error Problem: Binary covariates define groups within which errors are potentially correlated (e.g., cities, states, years, states after treatment, self-employed, etc..) - remember that errors contain unobserved variables Y ist = A st + B t + cx ist + βi st + ε ist, s are groups (perhaps states) t is time I is an indicator for treatment, which occurs as the group x time level ε is an error term, which is not necessary iid.

2. Grouped Errors Across Individuals E.g., Minimum wages on NJ/Penn border Card and Krueger (1994) looked at the effects of minimum wages on employment in fast-food restaurants near the NJ Penn border. Data collected before and after NJ raised its minimum wage by 80 cents (in 1992). i - restaurant, s state, t time S=2, T=2, N is large. They found small positive effects within a small confidence interval of zero.

2. Grouped Errors Across Individuals E.g., Mariel Boatlift Card (1990) looked at the effects of a surprise supply shock of immigrants to Miami due to a temporary lifting of emigration restrictions by Cuba in 1980. He estimates the effect of the boatlift on unemployment and wages of low skill workers in Miami using four other cities as comparisons (Atlanta, Houston, LA and Tampa-St. Petersburg) with CPS data. i - individual, s city, t time S=5, T~=2, N is large. He finds no statistically significant effect on employment or wages of the labor supply shock.

2. Grouped Errors Across Individuals How big does the number of groups (S, or S*T) have to be? Y ist = a st + d t + cz ist + βi st + ε ist, Donald and Lang (2004): In the (plausible) case where we have some within-group correlation, and under generous assumptions the t-statistics converge to a normal distribution at rate S*T no matter what N is. Intuition: Imagine that within s,t groups the errors are perfectly correlated. Then you might as well aggregate and run the regression with S*T observations. Intuition: 2 step estimator If group and time effects are included, with normally distributed group-time specific errors under generous assumptions, the t- statistics have a t distribution with S*T-S-T degrees of freedom, no matter what N is. (Table 3) Donald-Lang suggested estimator has this flavor. (Table 3) Alternative: collapse into s,t groups 3 issues: consistent s.e., efficient s.e. and distribution of t-stat in small samples

Distribution of t-ratio, 4 d.o.f, β = 0 When N=250 the simulated distribution is almost identical

3. Correlations over time in panels Y ist = A st + B t + cx ist + βi st + ε ist, S are groups (perhaps states) t is time I is an indicator for treatment, which occurs as the group x time level Correlations within group, period (i.e., s,t) cells only is very restrictive. In general we want to allow correlations over time as well (within s but not within t)

Lots of DD papers T is large The variables tend to be serially corr. So are std. errors consistent?

Placebo Binary Laws Randomly choose a year between 79-99 & randomly assign a law to 25 states til end of 99 Rej. rate is % for which t>1.96

Placebo Binary Laws Type I error is worst when T is large

Solutions: AR(1) correction N=50, T=21 AR(1) biased for small T Process looks more like AR(2)

Solutions: Ignore TS Information correct size but loss of power Residual aggregation is a Frisch-Waugh exercise: first - regress on other variables, then - aggregate residuals before and after treatment

Solutions: Cluster within states (over time) simple, easy to implement Works well for N=10 But this is only one data set and one variable (CPS, log weekly earnings) -

Current Standard Practice Be conservative: cluster by group or time (not the interaction) and report the larger std. error - note: this may get size and power wrong Better.. you can cluster on both! Cameron, Gelbach, and Miller (2006, NBER Technical WP) method not coded in Stata yet, but you can get an.ado from Doug Miller s Stata page http://www.econ.ucdavis.edu/faculty/dlmiller/statafiles/ Do you have enough groups for a normal approximation?.. Check with a Wild Bootstrap Cameron, Gelbach, Miller (ReStat 2008);.do file on Miller s page. May be argument for using Newey-West std. errors. Ask Gordon Dahl, who is working on a better method

Exam? Wed Dec 7 in Granger room, 3PM