Implementing Propensity Score Matching Estimators with STATA



Similar documents
AN INTRODUCTION TO MATCHING METHODS FOR CAUSAL INFERENCE

Propensity scores for the estimation of average treatment effects in observational studies

How To Find Out If A Local Market Price Is Lower Than A Local Price

Profit-sharing and the financial performance of firms: Evidence from Germany

Estimation of average treatment effects based on propensity scores

Editor Executive Editor Associate Editors Copyright Statement:

Propensity Score Matching Regression Discontinuity Limited Dependent Variables

Some Practical Guidance for the Implementation of Propensity Score Matching

Syntax Menu Description Options Remarks and examples Stored results Methods and formulas References Also see. Description

Returns to education on earnings in Italy: an evaluation using propensity score techniques

Online Appendix The Earnings Returns to Graduating with Honors - Evidence from Law Graduates

THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok

Linear Threshold Units

Lecture 3: Linear methods for classification

Machine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University

Introduction to mixed model and missing data issues in longitudinal studies

The synthetic control method compared to difference in diff

Classification by Pairwise Coupling

Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation

problem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved

Introduction to General and Generalized Linear Models

Nonparametric Statistics

Moving the Goalposts: Addressing Limited Overlap in Estimation of Average Treatment Effects by Changing the Estimand

Does Health Insurance Decrease Savings and Boost Consumption in Rural China?

A repeated measures concordance correlation coefficient

Econometrics Simple Linear Regression

MACHINE LEARNING IN HIGH ENERGY PHYSICS

Journal of Statistical Software

Turning Unemployment into Self-Employment: Effectiveness of Two Start-Up Programmes Å

Logit and Probit. Brad Jones 1. April 21, University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science

Linear Discrimination. Linear Discrimination. Linear Discrimination. Linearly Separable Systems Pairwise Separation. Steven J Zeil.

Classification Problems

Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs

Statistical Machine Learning

Journal of Statistical Software

Handling missing data in Stata a whirlwind tour

Standard errors of marginal effects in the heteroskedastic probit model

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Robust Tree-based Causal Inference for Complex Ad Effectiveness Analysis

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

Nominal and ordinal logistic regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Impact of Farmer Field Schools on Agricultural Income and Skills: Evidence from an Aid-Funded Project in Rural Ethiopia

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST

DATA ANALYSIS II. Matrix Algorithms

Overview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models

Evaluation of Credit Guarantee Policy using Propensity Score Matching

Multivariate Logistic Regression

Goodness of fit assessment of item response theory models

The Stata Journal. Lisa Gilmore Gabe Waggoner

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Imbens/Wooldridge, Lecture Notes 5, Summer 07 1

Improving Experiments by Optimal Blocking: Minimizing the Maximum Within-block Distance

UNTIED WORST-RANK SCORE ANALYSIS

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering

Propensity Scores: What Do They Do, How Should I Use Them, and Why Should I Care?

Estimation of Unknown Comparisons in Incomplete AHP and It s Compensation

Wes, Delaram, and Emily MA751. Exercise p(x; β) = [1 p(xi ; β)] = 1 p(x. y i [βx i ] log [1 + exp {βx i }].

Linear Classification. Volker Tresp Summer 2015

Effect Size Calculation for experimental & quasiexperimental

Medical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu

Do exports generate higher productivity? Evidence from Slovenia

Transcription:

Implementing Propensity Score Matching Estimators with STATA Barbara Sianesi University College London and Institute for Fiscal Studies E-mail: barbara_s@ifs.org.uk Prepared for UK Stata Users Group, VII Meeting London, May 2001 1

BACKGROUND: THE EVALUATION PROBLEM POTENTIAL-OUTCOME APPROACH Evaluating the causal effect of some treatment on some outcome Y experienced by units in the population of interest. Y 1i Y 0i the outcome of unit i if i were exposed to the treatment the outcome of unit i if i were not exposed to the treatment D i {0, 1} indicator of the treatment actually received by unit i Y i = Y 0i + D i (Y 1i Y 0i ) the actually observed outcome of unit i X the set of pre-treatment characteristics CAUSAL EFFECT FOR UNIT i Y 1i Y 0i THE FUNDAMENTAL PROBLEM OF CAUSAL INFERENCE impossible to observe the individual treatment effect impossible to make causal inference without making generally untestable assumptions 2

Under some assumptions: estimate the average treatment effect at the population, or at a subpopulation, level: average treatment effect average treatment effect on the untreated AVERAGE TREATMENT EFFECT ON THE TREATED: E(Y1 Y0 D=1) = E(Y1 D=1) E(Y0 D=1) Need to construct the counterfactual E(Y0 D=1) the outcome participants would have experienced, on average, had they not participated. E(Y0 D=0)? In non-experimental studies: need to adjust for confounding variables 3

MATCHING METHOD 1. assume that all relevant differences between the two groups are captured by their observables X: Y 0 D X (A1) 2. select from the non-treated pool a control group in which the distribution of observed variables is as similar as possible to the distribution in the treated group For this need: 0 < Prob{D=1 X=x } < 1 for x ~ Χ (A2) matching has to be performed over the common support region PROPENSITY SCORE MATCHING p(x) Pr{D=1 X=x} A1) & A2) Y 0 D p(x) for X in ~ Χ 4

OVERVIEW: TYPES OF MATCHING ESTIMATORS pair to each treated individual i some group of comparable non-treated individuals and then associate to the outcome of the treated individual i, y i, the (weighted) outcomes of his neighbours j in the comparison group: where: y$ = w y i ij j j 0 C ( ) C 0 (pi) is the set of neighbours of treated i in the control group p i w ij [0, 1] with w ij = 0 j C ( ) p i 1 is the weight on control j in forming a comparison with treated i Two broad groups of matching estimators individual neighbourhood weights 5

Associate to the outcome yi of treated unit i a matched outcome given by 1. the outcome of the most observably similar control unit TRADITIONAL MATCHING ESTIMATORS: one-to-one matching 0 C ( pi) = j : pi pj = min { pi pk } k { D= 0} w ik = 1(k=j) 2. a weighted average of the outcomes of more (possibly all) nontreated units where the weight given to non-treated unit j is in proportion to the closeness of the observables of i and j SMOOTHED WEIGHTED MATCHING ESTIMATORS: kernel-based matching C 0 (pi) = {D=0} (for gaussian kernel).4 w ij K p i h p j K(.) non-negative symmetric unimodal K.3.2.1 0-5 -4-3 -2-1 0 1 2 3 4 5 u 6

IMPLEMENTING PROPENSITY SCORE MATCHING ESTIMATORS WITH STATA Preparing the dataset Keep only one observation per individual Estimate the propensity score on the X s e.g. via probit or logit and retrieve either the predicted probability or the index Necessary variables: á the 1/0 dummy variable identifying the treated/controls á the predicted propensity score á the variable identifying the outcome to be evaluated á [optionally: the individual identifier variable] 7

ONE-TO-ONE MATCHING WITH REPLACEMENT (WITHIN CALIPER) Nearest-neighbour matching Treated unit i is matched to that non-treated unit j such that: p p = min { p p } i j k { D=0} i k Caliper matching For a pre-specified δ>0, treated unit i is matched to that non-treated unit j such that: δ > p p = min { p p } i j k { D= 0} i k If none of the non-treated units is within δ from treated unit i, i is left unmatched. 8

. psmatch treated, on(score) cal(.01) [id(serial)] [outcome(wage)] Creates: 1) _times number of times used use _times as frequency weights to identify the matched treated and the (possibly repeatedly) matched controls 2) _matchdif pairwise difference in score. sum _matchdif, det for matching quality If id(idvar) specified 3) _matchedid the idvar of the matched control If outcome(outcomevar) specified: directly calculates and displays: Mean wage of matched treated = 640.39 Mean wage of matched controls = 582.785 Effect = 57.605 Std err = 74.251377 Note: takes account of possibly repeated use of control observations but NOT of estimation of propensity score. T-statistics for H0: effect=0 is.77581053 9

KERNEL-BASED MATCHING Idea associate to the outcome y i of treated unit i a matched outcome given by a kernel-weighted average of the outcome of all non-treated units, where the weight given to non-treated unit j is in proportion to the closeness between i and j: y$ i = j { D= 0} j { D= 0} K p p i j h K p p i h j y j Control j s outcome yj is weighted by w ij = K p p i j h K p p i h j { D= 0} j Option smooth(outcomevar) creates: _moutcomevar the matched smoothed outcomevar $y i 10

Bandwidth h selection a central issue in non-parametric analysis trade-off bias-variability Kernel K choice Gaussian Ku ( ) exp( u 2 / 2) uses all the non-treated units. psmatch treated, on(score) cal(0.06) smooth(wage) Mean wage of matched treated = 642.70352 Mean wage of matched controls = 677.1453 Effect =-34.441787 2 Epanechnikov Ku ( ) ( 1 u) if u <1 (zero otherwise) uses a moving window within the D=0 group, i.e. only those non-treated units within a fixed caliper of h from pi: pi pj < h. psmatch treated, on(score) cal(0.06) smooth(wage)epan 11

Common support if not ruled out by the option nocommon, common support is imposed on the treated units: treated units whose p is larger than the largest p in the non-treated pool are left unmatched.. psmatch treated, on(score) cal(0.06) smooth(wage) [epan] nocommon 12

SMOOTHING THE TREATED TOO For kernel-based matching: for each i {D=1}, smooth non-parametrically E(Y D=1, P(X)=pi) $y i s (to be used instead of the observed yi). psmatch treated, on(score) cal(0.06) smooth(wage) [epan] [nocommon] both In addition to _moutcomevar the matched smoothed outcomevar $y i option both creates: _soutcomevar the treated smoothed outcomevar $y i s E.g.. psmatch treated, on(score) cal(0.06) smooth(wage) both Mean wage of matched treated = 642.9774 Mean wage of matched controls = 677.1453 Effect = -34.167822 13

MAHALANOBIS METRIC MATCHING Replace pi pj above with d(i,j) = (Pi Pj) S-1 (Pi Pj) where Pi is the (2 1) vector of scores of unit i Pj is the (2 1) vector of scores of unit j S is the pooled within-sample (2 2) covariance matrix of P based on the sub-samples of the treated and complete non-treated pool. Useful in particular for multiple treatment framework. psmatch treated, on(score1 score2) cal(.06) [smooth(wage)] [epan] [both] [nocommon] 14

ESSENTIAL REFERENCES ½ Propensity score matching Rosenbaum, P.R. and Rubin, D.B. (1983), The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika, 70, 1, 41-55. ½ Caliper matching Cochran, W. and Rubin, D.B. (1973), Controlling Bias in Observational Studies, Sankyha, 35, 417-446. ½ Kernel-based matching Heckman, J.J., Ichimura, H. and Todd, P.E. (1997), Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme, Review of Economic Studies, 64, 605-654. Heckman, J.J., Ichimura, H. and Todd, P.E. (1998), Matching as an Econometric Evaluation Estimator, Review of Economic Studies, 65, 261-294. ½ Mahalanobis distance matching Rubin, D.B. (1980), Bias Reduction Using Mahalanobis-Metric Matching, Biometrics, 36, 293-298. 15