An introduction to GMM estimation using Stata

Similar documents

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week (0.052)

From the help desk: Swamy s random-coefficients model

From the help desk: hurdle models

Sample Size Calculation for Longitudinal Studies

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects

Standard errors of marginal effects in the heteroskedastic probit model

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

Lecture 15. Endogeneity & Instrumental Variable Estimation

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Correlated Random Effects Panel Data Models

problem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved

Milk Data Analysis. 1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED

Estimating the random coefficients logit model of demand using aggregate data

From the help desk: Bootstrapped standard errors

Handling missing data in Stata a whirlwind tour

The following postestimation commands for time series are available for regress:

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

Clustering in the Linear Model

Estimating the marginal cost of rail infrastructure maintenance. using static and dynamic models; does more data make a

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

From the help desk: Demand system estimation

The marginal cost of rail infrastructure maintenance; does more data make a difference?

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

Nonlinear Regression Functions. SW Ch 8 1/54/

Discussion Section 4 ECON 139/ Summer Term II

Panel Data Analysis Josef Brüderl, University of Mannheim, March 2005

Using instrumental variables techniques in economics and finance

Analyses on Hurricane Archival Data June 17, 2014

Chapter 10: Basic Linear Unobserved Effects Panel Data. Models:

MULTIPLE REGRESSION EXAMPLE

Dynamic Panel Data estimators

Introduction to structural equation modeling using the sem command

Rockefeller College University at Albany

Maximum likelihood estimation of a bivariate ordered probit model: implementation and Monte Carlo simulations

August 2012 EXAMINATIONS Solution Part I

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector

PS 271B: Quantitative Methods II. Lecture Notes

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised February 21, 2015

Statistical Machine Learning

xtmixed & denominator degrees of freedom: myth or magic

Panel Data: Linear Models

Competing-risks regression

Introduction to General and Generalized Linear Models

SYSTEMS OF REGRESSION EQUATIONS

Linear Classification. Volker Tresp Summer 2015

On Marginal Effects in Semiparametric Censored Regression Models

Institut für Soziologie Eberhard Karls Universität Tübingen

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes

For more information on the Stata Journal, including information for authors, see the web page.

Online Appendix The Earnings Returns to Graduating with Honors - Evidence from Law Graduates

Editor Executive Editor Associate Editors Copyright Statement:

Least Squares Estimation

Panel Data Analysis Fixed and Random Effects using Stata (v. 4.2)

Multicollinearity Richard Williams, University of Notre Dame, Last revised January 13, 2015

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Basic Statistical and Modeling Procedures Using SAS

Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?*

Lecture 3: Linear methods for classification

FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Correlation and Regression

Module 14: Missing Data Stata Practical

Quick Stata Guide by Liz Foster

Poisson Models for Count Data

Stata Walkthrough 4: Regression, Prediction, and Forecasting

1 Teaching notes on GMM 1.

Econometrics Simple Linear Regression

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Interaction effects between continuous variables (Optional)

SAS Software to Fit the Generalized Linear Model

Weak instruments: An overview and new techniques

ADVANCED FORECASTING MODELS USING SAS SOFTWARE

2. Linear regression with multiple regressors

Additional sources Compilation of sources:

Regression Analysis. Regression Analysis MIT 18.S096. Dr. Kempthorne. Fall 2013

The Regression Calibration Method for Fitting Generalized Linear Models with Additive Measurement Error

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Factor Analysis. Factor Analysis

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

The Probit Link Function in Generalized Linear Models for Data Mining Applications

Nonlinear relationships Richard Williams, University of Notre Dame, Last revised February 20, 2015

Addressing Alternative. Multiple Regression Spring 2012

1 Simple Linear Regression I Least Squares Estimation

STATISTICA Formula Guide: Logistic Regression. Table of Contents

UNIVERSITY OF WAIKATO. Hamilton New Zealand

Chapter 6: Multivariate Cointegration Analysis

Linear Regression Models with Logarithmic Transformations

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

Chapter 3: The Multiple Linear Regression Model

Multiple Linear Regression in Data Mining

Robust Inferences from Random Clustered Samples: Applications Using Data from the Panel Survey of Income Dynamics

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Transcription:

An introduction to GMM estimation using Stata David M. Drukker StataCorp German Stata Users Group Berlin June 2010 1 / 29

Outline 1 A quick introduction to GMM 2 Using the gmm command 2 / 29

A quick introduction to GMM What is GMM? The generalize method of moments (GMM) is a general framework for deriving estimators Maximum likelihood (ML) is another general framework for deriving estimators. 3 / 29

A quick introduction to GMM GMM and ML I 4 / 29 ML estimators use assumptions about the specific families of distributions for the random variables to derive an objective function We maximize this objective function to select the parameters that are most likely to have generated the observed data GMM estimators use assumptions about the moments of the random variables to derive an objective function The assumed moments of the random variables provide population moment conditions We use the data to compute the analogous sample moment conditions We obtain parameters estimates by finding the parameters that make the sample moment conditions as true as possible This step is implemented by minimizing an objective function

A quick introduction to GMM GMM and ML II ML can be more efficient than GMM ML uses the entire distribution while GMM only uses specified moments GMM can be produce estimators using few assumptions More robust, less efficient ML is a special case of GMM Solving the ML score equations is equivalent to maximizing the ML objective function The ML score equations can be viewed as moment conditions 5 / 29

A quick introduction to GMM What is generalized about GMM? In the method of moments (MM), we have the same number of sample moment conditions as we have parameters In the generalized method of moments (GMM), we have more sample moment conditions than we have parameters 6 / 29

A quick introduction to GMM Method of Moments (MM) 7 / 29 We estimate the mean of a distribution by the sample, the variance by the sample variance, etc We want to estimate µ = E[y] The population moment condition is E[y] µ = 0 The sample moment condition is (1/N) N y i µ = 0 i=1 Our estimator is obtained by solving the sample moment condition for the parameter Estimators that solve sample moment conditions to produce estimates are called method-of-moments (MM) estimators This method dates back to Pearson (1895)

A quick introduction to GMM Ordinary least squares (OLS) is an MM estimator We know that OLS estimates the parameters of the condtional expectation of y i = x i β + ɛ i under the assumption that E[ɛ x] = 0 Standard probability theory implies that E[ɛ x] = 0 E[xɛ] = 0 So the population moment conditions for OLS are E[x(y xβ)] = 0 The corresponding sample moment condtions are (1/N) N i=1 x i(y i x i β) = 0 Solving for β yields ( N ) 1 β OLS = i=1 x i x N i i=1 x i y i 8 / 29

A quick introduction to GMM Generalized method-of-moments (GMM) The MM only works when the number of moment conditions equals the number of parameters to estimate If there are more moment conditions than parameters, the system of equations is algebraically over identified and cannot be solved Generalized method-of-moments (GMM) estimators choose the estimates that minimize a quadratic form of the moment conditions GMM gets as close to solving the over-identified system as possible GMM reduces to MM when the number of parameters equals the number of moment condtions 9 / 29

A quick introduction to GMM Definition of GMM estimator 10 / 29 Our research question implies q population moment conditions E[m(w i, θ)] = 0 m is q 1 vector of functions whose expected values are zero in the population w i is the data on person i θ is k 1 vector of parmeters, k q The sample moments that correspond to the population moments are m(θ) = (1/N) N i=1 m(w i, θ) When k < q, the GMM choses the parameters that are as close as possible to solving the over-identified system of moment conditions θ GMM arg min θ m(θ) Wm(θ)

A quick introduction to GMM Some properties of the GMM estimator θ GMM arg min θ m(θ) Wm(θ) When k = q, the MM estimator solves m(θ) exactly so m(θ) Wm(θ) = 0 W only affects the efficiency of the GMM estimator Setting W = I yields consistent, but inefficent estimates Setting W = Cov[m(θ)] 1 yields an efficient GMM estimator We can take multiple steps to get an efficient GMM estimator 1 Let W = I and get θ GMM1 arg min θ m(θ) m(θ) 2 Use θ GMM1 to get Ŵ, which is an estimate of Cov[m(θ)] 1 3 Get θ GMM2 arg min θ m(θ) Ŵm(θ) 4 Repeat steps 2 and 3 using θ GMM2 in place of θ GMM1 11 / 29

The gmm command The command gmm estimates paramters by GMM gmm is similar to nl, you specify the sample moment conditions as substitutable expressions Substitutable expressions enclose the model parameters in braces {} 12 / 29

The syntax of gmm I For many models, the population moment conditions have the form E[ze(β)] = 0 where z is a q 1 vector of instrumental variables and e(β) is a scalar function of the data and the parameters β The corresponding syntax of gmm is gmm (eb expression) [ if ][ in ][ weight ], instruments(instrument varlist) [ options ] where some options are onestep use one-step estimator (default is two-step estimator) winitial(wmtype) initial weight-matrix W wmatrix(witype) weight-matrix W computation after first step vce(vcetype) vcetype may be robust, cluster, bootstrap, hac 13 / 29

Modeling crime data I We have data. use cscrime, clear. describe Contains data from cscrime.dta obs: 10,000 vars: 5 24 May 2008 17:01 size: 480,000 (98.6% of memory free) (_dta has notes) storage display value variable name type format label variable label policepc double %10.0g police officers per thousand arrestp double %10.0g arrests/crimes convictp double %10.0g convictions/arrests legalwage double %10.0g legal wage index 0-20 scale crime double %10.0g property-crime index 0-50 scale Sorted by: 14 / 29

Modeling crime data II We specify that crime i = β 0 + policepc i β 1 + legalwage i β 2 + ɛ i We want to model E[crime policepc, legalwage] = β 0 + policepcβ 1 + legalwageβ 2 If E[ɛ policepc, legalwage] = 0, the population moment conditions [( ) ] ( ) policepc 0 E ɛ = legalwage 0 hold 15 / 29

OLS by GMM I. gmm (crime - policepc*{b1} - legalwage*{b2} - {b3}), /// > instruments(policepc legalwage) nolog Final GMM criterion Q(b) = 6.61e-32 GMM estimation Number of parameters = 3 Number of moments = 3 Initial weight matrix: Unadjusted Number of obs = 10000 GMM weight matrix: Robust Robust Coef. Std. Err. z P> z [95% Conf. Interval] /b1 -.4203287.0053645-78.35 0.000 -.4308431 -.4098144 /b2-7.365905.2411545-30.54 0.000-7.838559-6.893251 /b3 27.75419.0311028 892.34 0.000 27.69323 27.81515 Instruments for equation 1: policepc legalwage _cons 16 / 29

OLS by GMM I. regress crime policepc legalwage, robust Linear regression Number of obs = 10000 F( 2, 9997) = 4422.19 Prob > F = 0.0000 R-squared = 0.6092 Root MSE = 1.8032 Robust crime Coef. Std. Err. t P> t [95% Conf. Interval] policepc -.4203287.0053653-78.34 0.000 -.4308459 -.4098116 legalwage -7.365905.2411907-30.54 0.000-7.838688-6.893123 _cons 27.75419.0311075 892.20 0.000 27.69321 27.81517 17 / 29

IV and 2SLS Using the gmm command For some variables, the assumption E[ɛ x] = 0 is too strong and we need to allow for E[ɛ x] 0 If we have q variables z for which E[ɛ z] = 0 and the correlation between z and x is sufficiently strong, we can estimate β from the population moment conditions E[z(y xβ)] = 0 z are known as instrumental variables If the number of variables in z and x is the same (q = k), solving the the sample moment contions yield the MM estimator known as the instrumental variables (IV) estimator If there are more variables in z than in x (q > k) and we let ( N 1 W = i=1 z i i) z in our GMM estimator, we obtain the two-stage least-squares (2SLS) estimator 18 / 29

2SLS on crime data I The assumption that E[ɛ policepc] = 0 is false, if communities increase policepc in response to an increase in crime (an increase in ɛ i ) The variables arrestp and convictp are valid instruments, if they measure some components of communities toughness-on crime that are unrelated to ɛ but are related to policepc We will continue to maintain that E[ɛ legalwage] = 0 19 / 29

2SLS by GMM I. gmm (crime - policepc*{b1} - legalwage*{b2} - {b3}), /// > instruments(arrestp convictp legalwage ) nolog onestep Final GMM criterion Q(b) =.001454 GMM estimation Number of parameters = 3 Number of moments = 4 Initial weight matrix: Unadjusted Number of obs = 10000 Robust Coef. Std. Err. z P> z [95% Conf. Interval] /b1-1.002431.0455469-22.01 0.000-1.091701 -.9131606 /b2-1.281091.5890977-2.17 0.030-2.435702 -.1264811 /b3 30.0494.1830541 164.16 0.000 29.69062 30.40818 Instruments for equation 1: arrestp convictp legalwage _cons 20 / 29

2SLS by GMM II. ivregress 2sls crime legalwage (policepc = arrestp convictp), robust Instrumental variables (2SLS) regression Number of obs = 10000 Wald chi2(2) = 1891.83 Prob > chi2 = 0.0000 R-squared =. Root MSE = 3.216 Robust crime Coef. Std. Err. z P> z [95% Conf. Interval] policepc -1.002431.0455469-22.01 0.000-1.091701 -.9131606 legalwage -1.281091.5890977-2.17 0.030-2.435702 -.1264811 _cons 30.0494.1830541 164.16 0.000 29.69062 30.40818 Instrumented: Instruments: policepc legalwage arrestp convictp 21 / 29

Count-data model with endogenous variables Consider a model for count data in which some of the covariates are exogenous E[y x, ν] = exp(xβ + β 0 )ν By conditioning on the unobserved ν, we are allowing for ν to be related to x. Mullahy (1997) showed we can estimate the parameters of this model using the moment conditions E[y/ exp(xβ + β 0 ) 1] = 0 Numerically more stable moment conditions E[y/ exp(xβ) γ] = 0 22 / 29

Count-data model with endogenous variables. use accidents2, clear. gmm (accidents/(exp({traffic}*traffic+{tickets}*tickets)) - {cons}), /// > instruments(traffic cvalue kids) nolog Final GMM criterion Q(b) =.0000881 GMM estimation Number of parameters = 3 Number of moments = 4 Initial weight matrix: Unadjusted Number of obs = 9999 GMM weight matrix: Robust Robust Coef. Std. Err. z P> z [95% Conf. Interval] /traffic.6468395.1473377 4.39 0.000.3580628.9356161 /tickets.2488043.0839933 2.96 0.003.0841805.4134282 /cons.3854079.0097498 39.53 0.000.3662985.4045172 Instruments for equation 1: traffic cvalue kids _cons 23 / 29

More complicated moment conditions The structure of the moment conditions for some models is too complicated to fit into the interactive syntax used thus far For example, Wooldridge (1999, 2002); Blundell, Griffith, and Windmeijer (2002) discuss estimating the fixed-effects Poisson model for panel data by GMM. In the Poisson panel-data model we are modeling E[y it x it, η i ] = exp(x it β + η i ) Hausman, Hall, and Griliches (1984) derived a conditional log-likelihood function when the outcome is assumed to come from a Poisson distribution with mean exp(x it β + η i ) and η i is an observed component that is correlated with the x it 24 / 29

Wooldridge (1999) showed that you could estimate the parameters of this model by solving the sample moment conditions ( ) i t x y it y it µ i it = 0 µ i These moment conditions do not fit into the interactive syntax because the term µ i depends on the parameters Need to use moment-evaluator program syntax 25 / 29

Moment-evaluator program syntax An abreviated form of the syntax for gmm is gmm moment progam [ if ][ in ][ weight ], equations(moment cond names) parameters(parameter names) [ instruments() options ] The moment program is an ado-file of the form program gmm_eval version 11 syntax varlist if, at(name) quietly { <replace elements of varlist with error part of moment conditions> } end 26 / 29

program xtfe version 11 syntax varlist if, at(name) quietly { tempvar mu mubar ybar generate double mu = exp(kids* at [1,1] /// + cvalue* at [1,2] /// + tickets* at [1,3]) if egen double mubar = mean( mu ) if, by(id) egen double ybar = mean(accidents) if, by(id) replace varlist = accidents /// - mu * ybar / mubar if } end 27 / 29

FE Poisson by gmm. use xtaccidents. by id: egen max_a = max(accidents ). drop if max_a ==0 (3750 observations deleted). gmm xtfe, equations(accidents) parameters(kids cvalue tickets) /// > instruments(kids cvalue tickets, noconstant) /// > vce(cluster id) onestep nolog Final GMM criterion Q(b) = 1.50e-16 GMM estimation Number of parameters = 3 Number of moments = 3 Initial weight matrix: Unadjusted Number of obs = 1250 (Std. Err. adjusted for 250 clusters in id) Robust Coef. Std. Err. z P> z [95% Conf. Interval] /kids -.4506245.0969133-4.65 0.000 -.6405711 -.2606779 /cvalue -.5079946.0615506-8.25 0.000 -.6286315 -.3873577 /tickets.151354.0873677 1.73 0.083 -.0198835.3225914 Instruments for equation 1: kids cvalue tickets 28 / 29

FE Poisson by xtpoisson, fe. xtpoisson accidents kids cvalue tickets, fe nolog vce(robust) Conditional fixed-effects Poisson regression Number of obs = 1250 Group variable: id Number of groups = 250 Obs per group: min = 5 avg = 5.0 max = 5 Wald chi2(3) = 84.89 Log pseudolikelihood = -351.11739 Prob > chi2 = 0.0000 (Std. Err. adjusted for clustering on id) Robust accidents Coef. Std. Err. z P> z [95% Conf. Interval] kids -.4506245.0969133-4.65 0.000 -.6405712 -.2606779 cvalue -.5079949.0615506-8.25 0.000 -.6286319 -.3873579 tickets.151354.0873677 1.73 0.083 -.0198835.3225914 29 / 29

References References Blundell, Richard, Rachel Griffith, and Frank Windmeijer. 2002. Individual effects and dynamics in count data models, Journal of Econometrics, 108, 113 131. Hausman, Jerry A., Bronwyn H. Hall, and Zvi Griliches. 1984. Econometric models for count data with an application to the patents R & D relationship, Econometrica, 52(4), 909 938. Mullahy, J. 1997. Instrumental variable estimation of Poisson Regression models: Application to models of cigarette smoking behavior, Review of Economics and Statistics, 79, 586 593. Pearson, Karl. 1895. Contributions to the mathematical theory of evolution II. Skew variation in homogeneous material, Philosophical Transactions of the Royal Society of London, Series A, 186, 343 414. Wooldridge, Jeffrey. 2002. Econometric Analysis of Cross Section and Panel Data, Cambridge, Massachusetts: MIT Press. 29 / 29

Wooldridge, Jeffrey M. 1999. Distribution-free estimation of some nonlinear panel-data models, Journal of Econometrics, 90, 77 90. 29 / 29