Counterfactual distributions in Stata. estimation and inference in Stata

Similar documents
Quantile Regression under misspecification, with an application to the U.S. wage structure

Chapter 3: The Multiple Linear Regression Model

1 Another method of estimation: least squares

Chapter 4: Statistical Hypothesis Testing

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

The Dynamics of UK and US In ation Expectations

Lecture 9: Keynesian Models

Consumer Behaviour - The Best Non Parametric Bounds on Demand Responses

Lecture 2 ESTIMATING THE SURVIVAL FUNCTION. One-sample nonparametric methods

BayesX - Software for Bayesian Inference in Structured Additive Regression

PS 271B: Quantitative Methods II. Lecture Notes

Innovation and Top Income Inequality

IDENTIFICATION IN A CLASS OF NONPARAMETRIC SIMULTANEOUS EQUATIONS MODELS. Steven T. Berry and Philip A. Haile. March 2011 Revised April 2011

Regression with a Binary Dependent Variable

CAPM, Arbitrage, and Linear Factor Models

Review of Bivariate Regression

EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA

Herding, Contrarianism and Delay in Financial Market Trading

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems

From the help desk: Bootstrapped standard errors

Moral Hazard and Adverse Selection in Private Health Insurance

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

Panel Data Econometrics

THE IMPACT OF 401(K) PARTICIPATION ON THE WEALTH DISTRIBUTION: AN INSTRUMENTAL QUANTILE REGRESSION ANALYSIS

Chapter 2. Dynamic panel data models

Moral Hazard and Adverse Selection in Private Health Insurance

How To Model The Fate Of An Animal

Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics

Redwood Building, Room T204, Stanford University School of Medicine, Stanford, CA

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors

7 Generalized Estimating Equations

Fiscal and Monetary Policy in Australia: an SVAR Model

Statistical Machine Learning

SUMAN DUVVURU STAT 567 PROJECT REPORT

Exact Nonparametric Tests for Comparing Means - A Personal Summary

Portfolio selection based on upper and lower exponential possibility distributions

A revisit of the hierarchical insurance claims modeling

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Statistics in Retail Finance. Chapter 6: Behavioural models

Multinomial and Ordinal Logistic Regression

R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

Pricing Cloud Computing: Inelasticity and Demand Discovery

Direct and indirect effects in a logit model

Paid and Unpaid Labor in Developing Countries: an inequalities in time use approach

Hierarchical Insurance Claims Modeling

Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model

AN INTRODUCTION TO MATCHING METHODS FOR CAUSAL INFERENCE

Binary Outcome Models: Endogeneity and Panel Data

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

What is it About Schooling that the Labor Market Rewards? The Components of the Return to Schooling

VaR-x: Fat tails in nancial risk management

University of Saskatchewan Department of Economics Economics Homework #1

Introduction to Event History Analysis DUSTIN BROWN POPULATION RESEARCH CENTER

Logistic Regression (1/24/13)

Sampling Error Estimation in Design-Based Analysis of the PSID Data

Survival Analysis of Left Truncated Income Protection Insurance Data. [March 29, 2012]

Interpretation of Somers D under four simple models

Common sense, and the model that we have used, suggest that an increase in p means a decrease in demand, but this is not the only possibility.

Discussion Section 4 ECON 139/ Summer Term II

The zero-adjusted Inverse Gaussian distribution as a model for insurance claims

A.II. Kernel Estimation of Densities

Wealth inequality: Britain in international perspective. Frank Cowell: Wealth Seminar June 2012

LOGIT AND PROBIT ANALYSIS

Basel II Operational Risk Modeling Implementation & Challenges

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

An Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides:

A note on the impact of options on stock return volatility 1

Prediction for Multilevel Models

Investment and Financial Constraints: Empirical Evidence for Firms in Brazil and China

Multivariate Normal Distribution

College degree supply, productivity spillovers and occupational allocation of graduates in Central European countries

Testosterone levels as modi ers of psychometric g

Syntax Menu Description Options Remarks and examples Stored results Methods and formulas References Also see. Description

Sharp and Diffuse Incentives in Contracting

Econometric Analysis of Cross Section and Panel Data

Local classification and local likelihoods

Master programme in Statistics

VICTOR CHERNOZHUKOV. Professor of Economics Massachusetts Institute of Technology

Modeling the Claim Duration of Income Protection Insurance Policyholders Using Parametric Mixture Models

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group

Employment E ects of Service O shoring: Evidence from Matched Firms

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

10. Fixed-Income Securities. Basic Concepts

Dongfeng Li. Autumn 2010

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

Our development of economic theory has two main parts, consumers and producers. We will start with the consumers.

Estimating the random coefficients logit model of demand using aggregate data

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering

Changing income shocks or changed insurance - what determines consumption inequality?

STA 4273H: Statistical Machine Learning

Applying Survival Analysis Techniques to Loan Terminations for HUD s Reverse Mortgage Insurance Program - HECM

Growth in the Nonpro t Sector and Competition for Funding

Simple Linear Regression Inference

Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation

E3: PROBABILITY AND STATISTICS lecture notes

Dynamic Macroeconomics I Introduction to Real Business Cycle Theory

UNDERGRADUATE DEGREE DETAILS : BACHELOR OF SCIENCE WITH

Transcription:

Counterfactual distributions: estimation and inference in Stata Victor Chernozhukov Iván Fernández-Val Blaise Melly MIT Boston University Bern University November 17, 2016 Swiss Stata Users Group Meeting in Bern

Questions I What would have been the wage distribution in 1979 if the workers had the same distribution of characteristics as in 1988? I What would be the distribution of housing prices resulting from cleaning up a local hazardous-waste site? I What would be the distribution of wages for female workers if female workers were paid as much as male workers with the same characteristics? I In general, given an outcome Y and a covariate vector X. What is the e ect on F Y of a change in 1. F X (holding F Y jx xed)? 2. F Y jx (holding F X xed)? I To answer these questions we need to estimate counterfactual distributions.

Counterfactual distributions I Let 0 denote 1979 and 1 denote 1988. I Y is wages and X is a vector of worker characteristics (education, experience,...). I F Xk (x) is worker composition in k 2 f0, 1g; F Yj (y j x) is wage structure in j 2 f0, 1g. I De ne Z F Y hjjki (y) := F Yj (y j x)df Xk (x). I F Y h0j0i is the observed distribution of wages in 1979; F Y h0j1i is the counterfactual distribution of wages in 1979 if workers have 1988 composition. I Common support: F Y h0j1i is well de ned if the support of X 1 is included in the support of X 0.

E ect of changing F X I We are interested in the e ect of shifting the covariate distribution from 1979 to that of 1988. I Distribution e ects DE (y) = F Y h0j1i (y) I The quantiles are often also of interest: F Y h0j0i (y) Q Y hjjki (τ) = inffy : F Y hjjki (y) ug, 0 < τ < 1. Quantile e ects QE (τ) = Q Y h0j1i (τ) I In general, for a functional φ, the e ects is Q Y h0j0i (τ) (w) := φ(f Y h0j1i ) (w) φ(f Y h0j0i ) (w). Special cases: Lorenz curve, Gini coe cient, interquartile range, and more trivially the mean and the variance.

Types of counterfactual changes in F X 1. Groups correspond to di erent subpopulations (di erent time periods, male vs. female, black vs. white). 2. Transformations of the population: X 1 = g(x 0 ): I Unit change in location of one covariate: X 1 = X 0 + 1 where X is the number of cigarettes smoked by the mother and Y is the birthweight of the newborn. I Neutral redistribution of income: X 1 = µ X0 + α(x 0 µ X0 ), where Y is the food expenditure (Engel curve). I Stock (1991): e ect on housing prices of removing hazardous waste disposal site. In 1. and 2., bf X1 (x) = n 1 1 n 1 i =1 1fX 1i xg. 3. Change in some variable(s) but not in the other ones: unionization rate in 1988 and other characteristics from 1979. In 3., d bf X1 (x) = d ˆF U1 jc 1 (ujc)d ˆF C0 (c).

E ect of changing F Y jx I We are often interested in the e ect of changing the conditional distribution of the outcome for a given population. I Program evaluation: Group 1 is treated and group 0 is the control group. The quantile treatment e ect on the treated is QTET = Q Y h1j1i (τ) Q Y h0j1i (τ). I The counterfactual distributions are always statistically well-de ned object. The e ects are of interest even in non-causal framework (e.g. gender wage gap). I Causal interpretation under additional assumptions that give a structural interpretation to the conditional distribution. Selection on observables: the conditional distribution may be estimated using quantile or distribution regression. Endogenous groups: IV quantile regression (e.g. Chernozhukov and Hansen 2005).

Decompositions I The counterfactual distributions that we analyze are the key ingredients of the decomposition methods often used in economics. I Blinder/Oaxaca decomposition (parametric, linear decomposition of the mean di erence): Ȳ 0 Ȳ 1 = ( X 0 β 0 X 1 β 0 ) + ( X 1 β 0 X 1 β 1 ). This ts in our framework (even if our machinery is not needed in this simple case) as Y h0j0i Y h1j1i = Y h0j0i Y h0j1i + Y h0j1i Y h1j1i. I Our results allow us to do similar decomposition of any functional of the distribution. E.g. a quantile decomposition Q Y h0j0i (τ) Q Y h0j1i (τ) + Q Y h0j1i (τ) Q Y h1j1i (τ).

Estimation: plug-in principle I We estimate the unknown elements in R F Y0 (y j x)df X1 (x) by analog estimators. I We estimate the distribution of X 1 by the empirical distribution for group 1. I The conditional distribution can be estimated by: 1. Location and location-scale shift models (e.g. OLS and independent errors), 2. Quantile regression, 3. Duration models (e.g. proportional hazard model), 4. Distribution regression. I Our results also cover other methods (e.g. IV quantile regression).

Outline of the algorithm for F Y h0j1i (y) 1. Estimation 1.1 Estimate F X1 (x) by bf X1 (x). 1.2 Estimate F Y0 (y j x) by bf Y0 jx 0 (yjx). 1.3 bf Y h0j1i (y) = R bf Y0 jx 0 (yjx)d bf X1 (x) (in most cases: n1 1 n 1 i =1 F b Y0 jx 0 (yjx 1i )). 2. Pointwise inference 2.1 Bootstrap bf Y h0j1i (y) to obtain the pointwise s.e. ˆΣ (y). 2.2 Obtain a 95% CI as bf Y h0j1i (y) 1.96 ˆΣ (y). 3. Uniform inference Obtain the 95% con dence bands as bf Y h0j1i (y) ˆt ˆΣ (y), where ˆt is the 95th percentile of the bootstrap draws of the maximal t statistic over y.

Conditional quantile models I Location shift model (OLS with independent error term): Y = X 0 β + V, V?? X Q Y (ujx) = x 0 β + Q V (u). Parsimonious but restrictive, X only impact location of Y. I Quantile regression (Koenker and Bassett 1978): Y = X 0 β(u), U j X U(0, 1) Q Y (ujx) = x 0 β(u). X can change shape of entire conditional distribution. I Connect the conditional distribution with the conditional quantile F Y0 (yjx) Z 1 0 1fQ Y0 (ujx) ygdu.

Quantile regression y 0 1 2 3 4 0.2.4.6.8 1 x y Median First quartile Third quartile

Conditional distribution models I Distribution regression model (Foresi and Peracchi 1995): F Y (yjx) = Λ(x 0 β(y)), where Λ is a link function (probit, logit, cauchit). X can have heterogeneous e ects across the distribution. I Cox (72) proportional hazard model is a special case with complementary log-log link and constant slope parameter F Y (yjx) = 1 exp( exp(β 0 (y) x 0 β 1 )) In other words: β(y) is assumed to be constant. I Estimate functional parameter vector y 7! β(y) by MLE: 1. Create indicators 1fY yg, 2. Probit/logit of 1fY yg on X.

Distribution regression yi 0 1 2 3 0.2.4.6.8 1 Conditional distribution 0.2.4.6.8 1 x y Prob(Y<1.5 x) Prob(Y<1.15 x) Prob(Y<1.85 x)

Comparison: QR vs DR I QR and DR are exible semiparametric models for the conditional distribution that generalize important classical models. I Equivalent if X is saturated; but not nested otherwise. Choice cannot be made on the basis of generality. I QR requires smooth conditional density of Y. I QR usually overperforms DR under smoothness, but is less robust when Y has mass points. I Di erent ability to deal with data limitations: censoring and rounding.

Pointwise and uniform inference I The covariance function of bf Y h0j1i (y) is cumbersome to estimate =) exchangeable bootstrap (covers empirical bootstrap, weighted bootstrap and subsampling) provides the pointwise s.e. ˆΣ (y). I Many policy questions of interest involve functional hypotheses: no e ect, constant e ect, stochastic dominance. =) uniform con dence bands: bf Y h0j1i (y) bt ˆΣ (y). The true t corresponds to the 95th percentile of the distribution of the maximum t-statistic sup y bσ(y) 1/2 jbf Y h0j1i (y) F Y h0j1i (y)j, which is unknown. We use the bootstrap to estimate it.

Theoretical results I Under high level conditions we prove 1. Functional central limit theorems p n b F h0j1i (y) F Y h0j1i (y) Z h0j1i (y) where Z h0j1i (y) is a tight zero-mean Gaussian process. 2. The validity of the uniform con dence bands n o lim Pr F n! Y h0j1i (y) 2 [bf Y h0j1i (y) bt ˆΣ (y)] for all y = 0.95 I Under standard primitive conditions we show that QR and DR satisfy the high level conditions, i.e. functional central limit theorem and validity of the bootstrap for the coe cients processes.

The cdeco command Quantile decomposition (go to de nition): cdeco depvar indepvars [if ] [in] [weight], group(varname) [options] I group(varname): binary variable de ning the groups. I quantiles(numlist): quantile(s) τ at which the decomposition will be estimated. I method(string): estimator of the conditional distribution; available: qr (the default), loc, locsca, cqr, cox, logit, probit, and lpm. I nreg(#): number of regressions estimated to approximate the conditional distribution; default is 100. I reps(#): number of bootstrap replications. I noboot: suppresses the bootstrap.

Application: private-public sector wage di erences I Data: Merged Outgoing Rotation Groups from the Current Population Survey in 2015. I Sample: white males between 25 and 60 years old. I Stata command and head of output:

Total di erence (private - public)

Characteristics (private - public)

Private sector wage premium

Summary

The counterfactual command I The counterfactual command estimates the e ect of changing the distribution of the covariates on the distribution of the outcome (link to de nition). I Syntax: counterfactual depvar indepvars [if ] [in] [weight] [, group(varname) counterfactual(varlist) other options] I Either the option group or counterfactual must be speci ed: I I group if X 0 and X 1 correspond to di erent subpopulations, counterfactual if X 1 is a transformations X 0. This option must provide a list of the counterfactual covariates that corresponds to the reference covariates given in indepvars. The order matters! I The other options are the same as for cdeco.

Application: Engel curve

Engel curve: e ect of redistributing income

Summary I Chernozhukov, Fernandez-Val and Melly (2013) suggest regression-based estimation and inference methods for counterfactual distributions. I cdeco and counterfactual implement these methods in Stata. I To do list: I I I I Write an article to submit to the Stata Journal. For non-continuous outcomes: implement the procedure in Chernozhukov, Fernandez-Val, Melly and Wüthrich (2016). Detailed decomposition: work in progress with Philipp Ketz. Faster algorithms for quantile and distribution regression.