
The synthetic control method compared to difference in differences: discussion April 2014

Outline of discussion
1. Very brief overview of synthetic control methods
2. Synthetic control methods for disaggregated data
3. Matching vs/and synthetic control methods
4. Indirect effects

Synthetic control methods: overview
Goal: to evaluate the impact of a treatment implemented at the aggregate level in one (or very few) units, using a small number of controls to build the counterfactual.
Synthetic control methods use (long) longitudinal data to build the weighted average of non-treated units that best reproduces the characteristics of the treated unit over time, prior to treatment.
This weighted average is the synthetic cohort.
The impact of treatment is quantified by a simple difference after treatment: treated vs synthetic cohort.

Synthetic control methods: formalisation
Units: $j = 0, 1, \ldots, J$, where $j = 0$ is the treated unit and $j = 1, \ldots, J$ are the controls.
Time frame: $t = 1, \ldots, T_1$, split in two periods: before treatment, $t = 1, \ldots, T_0$, and after treatment, $t = T_0 + 1, \ldots, T_1$.
Potential and observed outcomes for the treated unit are $(Y^0_{0t}, Y^1_{0t})$ and
$$Y_{0t} = \begin{cases} Y^0_{0t} & \text{for } t = 1, \ldots, T_0 \\ Y^1_{0t} & \text{for } t = T_0 + 1, \ldots, T_1 \end{cases}$$
The aim is to estimate $\alpha_{0t} = Y^1_{0t} - Y^0_{0t}$ for $t = T_0 + 1, \ldots, T_1$.

Synthetic control methods: formalisation
Model of untreated outcomes for unit $j = 0, \ldots, J$ and time $t = 1, \ldots, T_1$:
$$Y^0_{jt} = \delta_t + \theta_t Z_j + \lambda_t \mu_j + \varepsilon_{jt}$$
$Z_j$ are the observed, pre-treatment covariates
$\mu_j$ are permanent unobserved variables
$\delta_t$ are common time effects
$\varepsilon_{jt}$ are unobserved transitory shocks at the unit level, with zero mean

Synthetic control methods: formalisation
Choose $W = (w_1, \ldots, w_J) \in [0,1]^J$, with weights adding up to 1, to minimise the distance in pre-treatment characteristics between the treated unit and the weighted average of controls.
The treatment effect is estimated by the simple difference
$$\hat\alpha_{0t} = Y_{0t} - \sum_{j=1}^{J} w_j Y_{jt} \quad \text{for } t = T_0 + 1, \ldots, T_1$$
Ideally, one would want to select $W$ such that $\sum_{j=1}^{J} w_j Z_j = Z_0$ and $\sum_{j=1}^{J} w_j \mu_j = \mu_0$, so that $\hat\alpha_{0t}$ is unbiased (given that $\varepsilon$ is mean-independent of $(Z, \mu)$ and independent across units and over time).
This is not feasible since $\mu$ is unobserved.
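The reason these two conditions would deliver unbiasedness can be spelled out from the model of untreated outcomes. The following is a sketch, assuming that controls are unaffected by the treatment (so $Y_{jt} = Y^0_{jt}$ for $j \geq 1$) and writing $J$ for the number of controls. For $t > T_0$:

$$\hat\alpha_{0t} - \alpha_{0t} = Y^0_{0t} - \sum_{j=1}^{J} w_j Y^0_{jt} = \theta_t \Big( Z_0 - \sum_{j=1}^{J} w_j Z_j \Big) + \lambda_t \Big( \mu_0 - \sum_{j=1}^{J} w_j \mu_j \Big) + \Big( \varepsilon_{0t} - \sum_{j=1}^{J} w_j \varepsilon_{jt} \Big)$$

The common time effect $\delta_t$ cancels because the weights add up to 1. If the weights reproduce both $Z_0$ and $\mu_0$, the first two terms vanish and only the zero-mean transitory shocks remain.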

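Synthetic control methods: formalisation
Solution: choose $W$ satisfying
$$\sum_{j=1}^{J} w_j Z_j = Z_0, \quad \sum_{j=1}^{J} w_j Y_{j1} = Y_{01}, \quad \ldots, \quad \sum_{j=1}^{J} w_j Y_{jT_0} = Y_{0T_0}$$
The bias can be bounded under mild conditions:
$$E(\hat\alpha_{0t} - \alpha_{0t}) < \beta \, J^{1/p} \max\left\{ \left( \frac{m_p}{T_0^{\,p-1}} \right)^{1/p}, \; \frac{\sigma}{\sqrt{T_0}} \right\}$$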
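The choice of weights can be illustrated with a small numerical sketch. It is not the implementation used in the applied literature: it simply minimises the unweighted squared distance between the treated unit's pre-treatment vector $(Z, Y_1, \ldots, Y_{T_0})$ and the weighted average of controls, using projected gradient descent onto the set of non-negative weights adding up to 1. The function names, the step size and all numbers are made up for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        candidate = (cumulative - 1.0) / i
        if u_i - candidate > 0:
            theta = candidate  # the last index passing the test gives the shift
    return [max(x - theta, 0.0) for x in v]


def fit_weights(treated, controls, steps=20000, lr=0.005):
    """Minimise the squared pre-treatment distance over simplex weights.

    treated:  list of K pre-treatment values for the treated unit
    controls: list of J lists, each with the same K values for one control
    lr is a step size tuned to the toy data below (an assumption).
    """
    n_controls, n_features = len(controls), len(treated)
    w = [1.0 / n_controls] * n_controls
    for _ in range(steps):
        resid = [
            sum(w[j] * controls[j][k] for j in range(n_controls)) - treated[k]
            for k in range(n_features)
        ]
        grad = [
            2.0 * sum(resid[k] * controls[j][k] for k in range(n_features))
            for j in range(n_controls)
        ]
        w = project_to_simplex([w[j] - lr * grad[j] for j in range(n_controls)])
    return w


# Toy data, invented for illustration: each row is (Z, Y_1, Y_2, Y_3).
controls_pre = [
    [1.0, 2.0, 2.5, 3.0],
    [2.0, 3.0, 3.5, 5.0],
    [0.0, 5.0, 1.0, 0.5],
]
# Built as 0.6 * first control + 0.4 * second control.
treated_pre = [1.4, 2.4, 2.9, 3.8]

# Post-treatment outcomes (also invented).
controls_post = [[3.5, 4.0], [5.5, 6.0], [0.7, 0.9]]
treated_post = [5.0, 5.5]

weights = fit_weights(treated_pre, controls_pre)

# Treatment effect: simple difference between treated and synthetic cohort.
synthetic_post = [
    sum(w * c[t] for w, c in zip(weights, controls_post))
    for t in range(len(treated_post))
]
effects = [y - s for y, s in zip(treated_post, synthetic_post)]

print([round(w, 3) for w in weights])
print([round(e, 3) for e in effects])
```

In this toy example the treated unit is an exact weighted average of the first two controls, so the fitted weights recover that combination and put no weight on the third control.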

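Synthetic control methods: bias
$$E(\hat\alpha_{0t} - \alpha_{0t}) < \beta \, J^{1/p} \max\left\{ \left( \frac{m_p}{T_0^{\,p-1}} \right)^{1/p}, \; \frac{\sigma}{\sqrt{T_0}} \right\}$$
The bias is small when $T_0$ is large relative to the scale of $\varepsilon$.
Intuition: a synthetic cohort can fit $(Z_0, Y_{01}, \ldots, Y_{0T_0})$ for a large $T_0$ only if it fits $(Z_0, \mu_0)$.
But a large $J$ does not help reduce the bias once these conditions are met:
$$\sum_{j=1}^{J} w_j Z_j = Z_0, \quad \sum_{j=1}^{J} w_j Y_{j1} = Y_{01}, \quad \ldots, \quad \sum_{j=1}^{J} w_j Y_{jT_0} = Y_{0T_0}$$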

Synthetic control methods for disaggregated data
Some issues
1. Many controls are not necessarily beneficial, although this depends on how small $J$ is and whether the aggregate treated unit lies outside the domain of the controls
2. The scale of the transitory shock can be larger at the disaggregated level, if the aggregate outcome is the average of outcomes for smaller units, in which case more time periods are needed to keep the bias down
3. Possibly more serious interpolation bias
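The second issue can be made concrete with a small arithmetic sketch. It rests on assumptions that are not in the slides: the aggregate outcome is the mean of n smaller units with independent transitory shocks of equal standard deviation, and the relevant bias term scales as the shock standard deviation divided by the square root of the number of pre-treatment periods. Function names and numbers are hypothetical.

```python
import math


def aggregate_shock_sd(unit_sd, n_units):
    """Sd of the mean of n independent shocks with common sd unit_sd."""
    return unit_sd / math.sqrt(n_units)


def matching_pre_periods(aggregate_t0, n_units):
    """Pre-treatment periods needed at the unit level to reach the same
    sd / sqrt(T0) ratio as the aggregate series (under the assumptions above)."""
    return aggregate_t0 * n_units


unit_sd = 2.0        # hypothetical shock sd for one small unit
n_units = 16         # small units averaged into the aggregate
aggregate_t0 = 25    # pre-treatment periods available for the aggregate

aggregate_sd = aggregate_shock_sd(unit_sd, n_units)
aggregate_ratio = aggregate_sd / math.sqrt(aggregate_t0)

unit_t0 = matching_pre_periods(aggregate_t0, n_units)
unit_ratio = unit_sd / math.sqrt(unit_t0)

print(aggregate_sd, aggregate_ratio, unit_t0, unit_ratio)
```

Under these assumptions, the number of pre-treatment periods needed grows in proportion to the number of small units being averaged.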

Synthetic control methods for disaggregated data
Interpolation bias
The synthetic control method relies on the linearity of the model of untreated outcomes:
$$Y^0_{jt} = \delta_t + \theta_t Z_j + \lambda_t \mu_j + \varepsilon_{jt}$$
Even if linearity is violated, the model can be a good local approximation.
But the bias can be large if the characteristics of control units are far from those of the treated.
In aggregate studies: pick and choose the control units that most closely resemble the treated.
But this is difficult to implement with more and smaller units, or when treated and control units are of a different nature.
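A toy calculation shows how interpolation bias arises. The quadratic outcome function, the covariate values and the weights are all invented for illustration. In both cases the weighted controls reproduce the treated unit's covariate exactly, but the synthetic outcome is off when the controls are far away.

```python
def untreated_outcome(z):
    """Hypothetical nonlinear untreated outcome, chosen only for illustration."""
    return z ** 2


def interpolation_bias(treated_z, control_zs, weights):
    """Synthetic minus true untreated outcome when weights match Z exactly."""
    synthetic_z = sum(w * z for w, z in zip(weights, control_zs))
    # The weights are required to reproduce the treated covariate.
    assert abs(synthetic_z - treated_z) < 1e-12
    synthetic_y = sum(w * untreated_outcome(z) for w, z in zip(weights, control_zs))
    return synthetic_y - untreated_outcome(treated_z)


# Controls far from the treated unit (Z = 0 and Z = 1 around Z = 0.5).
bias_far = interpolation_bias(0.5, [0.0, 1.0], [0.5, 0.5])

# Controls close to the treated unit (Z = 0.4 and Z = 0.6).
bias_near = interpolation_bias(0.5, [0.4, 0.6], [0.5, 0.5])

print(round(bias_far, 4), round(bias_near, 4))
```

With distant controls the bias is 0.25, while with nearby controls it is about 0.01, which is the sense in which a linear model can still work as a local approximation.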

Matching and synthetic control methods
DID has been used with matching to relax strict functional form assumptions.
Matching selects controls that each have characteristics $(Z_j, Y_{j1}, \ldots, Y_{jT_0})$ close to those of the treated.
Matching is a local estimator, hence it is less sensitive to interpolation bias and allows for a more general specification of the model of untreated outcomes.

Matching and synthetic control methods
A more general model, under the assumption that $\varepsilon$ is balanced conditional on $(Z_j, Y_{j1}, \ldots, Y_{jT_0})$:
$$Y^0_{jt} = f_t(Z_j, \mu_j) + \varepsilon_{jt}$$
Matching is unbiased if it removes unobserved permanent differences between treated and control units.
But it may be impossible to match closely on the many characteristics $(Z_j, Y_{j1}, \ldots, Y_{jT_0})$.
Combine matching with synthetic cohorts for more disaggregated data?
  If $f$ is smooth, a local polynomial approximation of $f$ is accurate.
  Matching prior to applying a synthetic control method could help ensure the comparability of admissible controls.
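The idea of matching before applying a synthetic control method can be sketched as a pre-selection step: keep only the k candidate controls whose pre-treatment characteristics are closest to the treated unit, then fit weights on that reduced pool. The unstandardised Euclidean distance, the function name and the data below are assumptions made for illustration; in practice the characteristics would usually be rescaled first.

```python
import math


def nearest_donors(treated, donors, k):
    """Return the names of the k donors closest to the treated unit.

    treated: list of pre-treatment characteristics (Z, Y_1, ..., Y_T0)
    donors:  dict mapping a donor name to a list of the same characteristics
    """
    def distance(name):
        return math.sqrt(
            sum((a - b) ** 2 for a, b in zip(treated, donors[name]))
        )

    return sorted(donors, key=distance)[:k]


# Toy data, invented for illustration.
treated_unit = [1.4, 2.4, 2.9]
candidate_donors = {
    "A": [1.0, 2.0, 2.5],
    "B": [2.0, 3.0, 3.5],
    "C": [9.0, 9.0, 9.0],
    "D": [1.5, 2.5, 3.0],
}

donor_pool = nearest_donors(treated_unit, candidate_donors, k=2)
print(donor_pool)
```

Restricting the pool in this way trades a smaller number of controls for controls that are closer to the treated unit, which is where a local approximation of $f$ is most credible.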

Indirect effects
At the aggregate level, it is quite plausible that what happens in one unit affects other units.
Applications by Abadie and co-authors:
  California's tobacco control programme may have influenced behaviour and legislation in other states.
  The unification of Germany (and indeed aggregate shocks to its economy) may affect the economic outcomes of closely connected countries.
More similar control units may be more exposed to indirect effects.
Trade-off between interpolation bias and indirect effects.