ECON 5360 Class Notes: Systems of Equations


Introduction

Here we consider two types of systems of equations. The first is a system of Seemingly Unrelated Regressions (SUR), introduced by Arnold Zellner (1962). Sets of equations with distinct dependent and independent variables are often linked by some common unmeasurable factor. Examples include systems of factor demands by a particular firm, agricultural supply-response equations, and capital-asset pricing models. The methods presented here can also be viewed as an alternative estimation framework for panel-data models.

The second is a Simultaneous Equations System, which involves the interaction of multiple endogenous variables within a system of equations. Estimating the parameters of such a system is typically not as simple as doing OLS equation by equation. Issues such as identification (whether the parameters are even estimable) and endogeneity bias are the primary topics in this section.

Seemingly Unrelated Regressions (SUR) Model

Consider the following set of equations:

$$y_i = X_i \beta_i + \varepsilon_i, \qquad i = 1, \ldots, M,$$

where $y_i$, $X_i$ and $\beta_i$ are of dimension $(T \times 1)$, $(T \times K_i)$ and $(K_i \times 1)$, respectively. The stacked system in matrix form is

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & X_M \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_M \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{bmatrix}, \qquad \text{or} \qquad Y = X\beta + \varepsilon.$$

Although each of the $M$ equations may seem unrelated (i.e., each has potentially distinct coefficient vectors, dependent variables and explanatory variables), the equations are linked through their mean-zero error structure. (Although each equation typically represents a separate time series, it is possible that $T$ instead denotes the number of cross sections within an equation.) Specifically,

$$E(\varepsilon\varepsilon') = \begin{bmatrix} \sigma_{11} I_T & \sigma_{12} I_T & \cdots & \sigma_{1M} I_T \\ \sigma_{21} I_T & \sigma_{22} I_T & \cdots & \sigma_{2M} I_T \\ \vdots & & \ddots & \vdots \\ \sigma_{M1} I_T & \sigma_{M2} I_T & \cdots & \sigma_{MM} I_T \end{bmatrix} = \Sigma \otimes I_T,$$

where the $(M \times M)$ matrix $\Sigma$ is the variance-covariance matrix of the error vector $\varepsilon_t = (\varepsilon_{1t}, \ldots, \varepsilon_{Mt})'$ for each $t = 1, \ldots, T$.

Generalized Least Squares (GLS)

The system resembles the one we studied in the chapter on nonspherical disturbances. The efficient estimator in this context is the GLS estimator

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}Y) = \left[X'(\Sigma^{-1} \otimes I)X\right]^{-1}\left[X'(\Sigma^{-1} \otimes I)Y\right].$$

Assuming all the classical assumptions hold (other than that of spherical disturbances), GLS is the best linear unbiased estimator. There are two important conditions under which GLS provides no efficiency gains over OLS:

1. $\sigma_{ij} = 0$ for all $i \neq j$. When all the contemporaneous correlations across equations equal zero, the equations are not linked in any fashion, and one can show that $\hat{\beta}_{GLS} = \hat{\beta}_{OLS}$.

2. $X_1 = X_2 = \cdots = X_M$. When the explanatory variables are identical across equations, $\hat{\beta}_{GLS} = \hat{\beta}_{OLS}$.

As a rule, the efficiency gains of GLS over OLS tend to be greater when the contemporaneous correlation in errors across equations ($\sigma_{ij}$) is greater and when there is less correlation between the $X$'s across equations.
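As a concrete illustration of the SUR-GLS formula, the following sketch (in Python with NumPy; the two-equation design, sample size, coefficients and error covariance are all illustrative assumptions, not from the notes) stacks a block-diagonal $X$ and computes $\hat{\beta} = [X'(\Sigma^{-1}\otimes I)X]^{-1}X'(\Sigma^{-1}\otimes I)Y$ with a known $\Sigma$:

```python
import numpy as np

# Hedged sketch: GLS for a 2-equation SUR system with known Sigma.
# All data-generating choices below are illustrative, not from the notes.
rng = np.random.default_rng(0)
T = 200
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])      # contemporaneous error covariance
L = np.linalg.cholesky(Sigma)
E = rng.normal(size=(T, 2)) @ L.T               # rows have covariance Sigma
y1 = X1 @ np.array([1.0, 2.0]) + E[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + E[:, 1]

# Stack: y = X beta + eps with block-diagonal X and Var(eps) = Sigma kron I_T
y = np.concatenate([y1, y2])
X = np.block([[X1, np.zeros_like(X2)], [np.zeros_like(X1), X2]])
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(T))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
```

With this simulated design, `beta_gls` recovers the stacked coefficient vector $(1, 2, -1, 0.5)'$ up to sampling error.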

Feasible Generalized Least Squares (FGLS)

Typically, $\Sigma$ is not known. Assuming a consistent estimator of $\Sigma$ is available, the feasible GLS estimator

$$\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}(X'\hat{\Omega}^{-1}Y) = \left[X'(\hat{\Sigma}^{-1} \otimes I)X\right]^{-1}\left[X'(\hat{\Sigma}^{-1} \otimes I)Y\right]$$

will be a consistent estimator of $\beta$. The typical estimator of $\Sigma$ has elements

$$\hat{\sigma}_{ij} = \frac{1}{T}\, e_i'e_j,$$

where $e_i$, $i = 1, \ldots, M$, are the OLS residuals from the $i$th equation. Degrees-of-freedom corrections for the elements of $\hat{\Sigma}$ are possible, but will not generally produce unbiasedness. It is also possible to iterate between the estimator of $\Sigma$ and the FGLS estimator of $\beta$ until convergence, which produces the maximum likelihood estimator under multivariate normal errors. In other words, $\hat{\beta}_{FGLS}$ and $\hat{\beta}_{ML}$ have the same limiting distribution,

$$\hat{\beta}_{ML,FGLS} \overset{asy}{\sim} N(\beta, V),$$

where $V$ is consistently estimated by $\hat{V} = \left[X'(\hat{\Sigma}^{-1} \otimes I)X\right]^{-1}$.

Maximum Likelihood

Although asymptotically equivalent, maximum likelihood is an alternative estimator to FGLS that will provide different answers in small samples. Begin by rewriting the model for the $t$th observation as

$$Y_t = \begin{bmatrix} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{M,t} \end{bmatrix} = \begin{bmatrix} \beta_1' x_t \\ \beta_2' x_t \\ \vdots \\ \beta_M' x_t \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ \vdots \\ \varepsilon_{M,t} \end{bmatrix},$$

where $x_t$ is the vector of all different explanatory variables in the system and each $\beta_i$ is the column vector of coefficients for the $i$th equation (unless each equation contains all explanatory variables, there will be zeros in $\beta_i$ to impose the exclusion restrictions). Assuming multivariate normally distributed errors, the log-likelihood function is

$$\log L = \sum_{t=1}^{T} \log L_t = -\frac{MT}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^{T} \varepsilon_t'\Sigma^{-1}\varepsilon_t,$$

where, as defined earlier, $\Sigma = E(\varepsilon_t\varepsilon_t')$. The maximum likelihood estimates are found by taking the derivatives of the log-likelihood with respect to $\beta$ and $\Sigma$, setting them equal to zero and solving.
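The iteration between $\hat{\Sigma}$ and $\hat{\beta}$ can be sketched as follows (Python/NumPy; the two-equation design and parameter values are illustrative assumptions). Starting from OLS residuals, the loop alternates between re-estimating $\Sigma$ and recomputing the FGLS estimator until the coefficients stop changing:

```python
import numpy as np

# Hedged sketch: iterated FGLS for a 2-equation SUR (equals ML under
# normality). The simulated design below is illustrative only.
rng = np.random.default_rng(1)
T = 300
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
E = rng.normal(size=(T, 2)) @ np.linalg.cholesky(np.array([[1.0, 0.6], [0.6, 1.0]])).T
y1 = X1 @ np.array([1.0, 2.0]) + E[:, 0]
y2 = X2 @ np.array([3.0, -1.0]) + E[:, 1]

y = np.concatenate([y1, y2])
X = np.block([[X1, np.zeros_like(X2)], [np.zeros_like(X1), X2]])

# Initialize with OLS, then alternate Sigma-hat and GLS until convergence.
beta = np.linalg.lstsq(X, y, rcond=None)[0]
for _ in range(50):
    resid = (y - X @ beta).reshape(2, T)     # row i = residuals of equation i
    Sigma_hat = resid @ resid.T / T          # sigma_ij = e_i'e_j / T
    W = np.kron(np.linalg.inv(Sigma_hat), np.eye(T))
    beta_new = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new
```

In this design the loop converges in a handful of iterations and `beta` approximates the true stacked coefficients $(1, 2, 3, -1)'$.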

Hypothesis Testing

We consider two types of tests: tests for contemporaneous correlation between errors, and tests for linear restrictions on the coefficients.

Contemporaneous Correlation

If there is no contemporaneous correlation between errors in different equations (i.e., $\Sigma$ is diagonal), then OLS equation by equation is fully efficient. Therefore, it is useful to test the restriction

$$H_0: \sigma_{ij} = 0 \ \ \forall \, i \neq j \qquad \text{vs.} \qquad H_A: H_0 \text{ false}.$$

Breusch and Pagan suggest the Lagrange multiplier test statistic

$$\lambda = T \sum_{i=2}^{M} \sum_{j=1}^{i-1} r_{ij}^2,$$

where $r_{ij}$ is calculated from the OLS residuals as

$$r_{ij} = \frac{e_i'e_j}{\sqrt{(e_i'e_i)(e_j'e_j)}}.$$

Under the null hypothesis, $\lambda$ is asymptotically chi-squared with $M(M-1)/2$ degrees of freedom.

Restrictions on Coefficients

The general F test presented earlier can be extended to the SUR system. However, since the statistic requires using $\hat{\Sigma}$, the test will only be valid asymptotically. Consider testing the following $J$ linear restrictions:

$$H_0: R\beta = q \qquad \text{vs.} \qquad H_A: H_0 \text{ false},$$

where $\beta = (\beta_1', \beta_2', \ldots, \beta_M')'$. Within the SUR framework, it is possible to test coefficient restrictions across equations. One possible test statistic is the Wald statistic

$$W = (R\hat{\beta}_{FGLS} - q)'\left[R\,\widehat{var}(\hat{\beta}_{FGLS})\,R'\right]^{-1}(R\hat{\beta}_{FGLS} - q),$$

which has an asymptotic chi-square distribution with $J$ degrees of freedom.
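The Breusch-Pagan LM statistic is simple to compute from equation-by-equation OLS residuals. The sketch below (Python/NumPy; the three-equation design with independent errors is an illustrative assumption, so $H_0$ is true by construction) builds $\lambda = T\sum_{i>j} r_{ij}^2$:

```python
import numpy as np

# Hedged sketch of the Breusch-Pagan LM test for a diagonal Sigma.
# Simulated design (independent errors across equations) is illustrative,
# so the null hypothesis holds here and lambda should look chi2(3).
rng = np.random.default_rng(2)
T, M = 500, 3
df = M * (M - 1) // 2                         # degrees of freedom under H0

X = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(M)]
y = [Xi @ np.array([1.0, 1.0]) + rng.normal(size=T) for Xi in X]

# OLS residuals, equation by equation
e = [yi - Xi @ np.linalg.lstsq(Xi, yi, rcond=None)[0] for Xi, yi in zip(X, y)]

lam = 0.0
for i in range(1, M):
    for j in range(i):
        r_ij = (e[i] @ e[j]) / np.sqrt((e[i] @ e[i]) * (e[j] @ e[j]))
        lam += r_ij ** 2
lam *= T
```

The statistic `lam` would be compared against a $\chi^2$ critical value with `df` $= M(M-1)/2 = 3$ degrees of freedom; a large value rejects the null of a diagonal $\Sigma$.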

Autocorrelation

Heteroscedasticity and autocorrelation are both possibilities within the SUR framework. I will focus on autocorrelation because SUR systems are often comprised of time-series observations for each equation. Assume the errors follow

$$\varepsilon_{i,t} = \rho_i \varepsilon_{i,t-1} + u_{i,t},$$

where $u_{i,t}$ is white noise. The overall error structure is now

$$E(\varepsilon\varepsilon') = \begin{bmatrix} \sigma_{11}\Omega_{11} & \cdots & \sigma_{1M}\Omega_{1M} \\ \vdots & \ddots & \vdots \\ \sigma_{M1}\Omega_{M1} & \cdots & \sigma_{MM}\Omega_{MM} \end{bmatrix},$$

where standard AR(1) algebra gives

$$\Omega_{ij} = \frac{1}{1 - \rho_i\rho_j}\begin{bmatrix} 1 & \rho_j & \rho_j^2 & \cdots & \rho_j^{T-1} \\ \rho_i & 1 & \rho_j & \cdots & \rho_j^{T-2} \\ \rho_i^2 & \rho_i & 1 & \cdots & \rho_j^{T-3} \\ \vdots & & & \ddots & \vdots \\ \rho_i^{T-1} & \rho_i^{T-2} & \rho_i^{T-3} & \cdots & 1 \end{bmatrix}.$$

The following three-step approach is recommended:

1. Run OLS equation by equation and compute a consistent estimate of each $\rho_i$, e.g., $\hat{\rho}_i = \left(\sum_{t=2}^{T} e_{i,t}e_{i,t-1}\right) \big/ \left(\sum_{t=2}^{T} e_{i,t-1}^2\right)$.

2. Transform the data, using either Prais-Winsten or Cochrane-Orcutt, to remove the autocorrelation, and estimate $\Sigma$ from the transformed data as before.

3. Use $\hat{\Sigma}$ and the FGLS formula to calculate the FGLS estimates.

SUR Gauss Application

Consider the data taken from Wooldridge. The model attempts to explain wages and fringe benefits for workers:

$$Wages_t = X_t\beta_1 + \varepsilon_{1,t}$$
$$Benefits_t = X_t\beta_2 + \varepsilon_{2,t},$$

where the explanatory variables are identical across the two equations, so that OLS and FGLS produce equivalent results. Although OLS and FGLS are equivalent here, one advantage of FGLS within a SUR framework is that it allows you to test coefficient restrictions across equations; doing OLS equation by equation would not allow such tests. The variables are defined as follows:

Dependent Variables
- Wages: hourly earnings, in 1999 dollars per hour
- Benefits: hourly benefits (vacation, sick leave, insurance and pension), in 1999 dollars per hour

Explanatory Variables
- Education: years of schooling
- Experience: years of work experience
- Tenure: years with current employer
- Union: one if union member, zero otherwise
- South: one if lives in the South, zero otherwise
- Northeast: one if lives in the Northeast, zero otherwise
- Northcentral: one if lives in the North Central region, zero otherwise
- Married: one if married, zero otherwise
- White: one if white, zero otherwise
- Male: one if male, zero otherwise

See the Gauss example for further details.

The Simultaneous Equation Model

The simultaneous system can be written as

$$Y\Gamma + XB = E,$$

where the variable matrices are

$$Y = \begin{bmatrix} Y_{11} & Y_{12} & \cdots & Y_{1M} \\ Y_{21} & Y_{22} & \cdots & Y_{2M} \\ \vdots & & & \vdots \\ Y_{T1} & Y_{T2} & \cdots & Y_{TM} \end{bmatrix}_{T \times M}, \qquad X = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1K} \\ X_{21} & X_{22} & \cdots & X_{2K} \\ \vdots & & & \vdots \\ X_{T1} & X_{T2} & \cdots & X_{TK} \end{bmatrix}_{T \times K}, \qquad E = \begin{bmatrix} \varepsilon_{11} & \cdots & \varepsilon_{1M} \\ \vdots & & \vdots \\ \varepsilon_{T1} & \cdots & \varepsilon_{TM} \end{bmatrix}_{T \times M}$$

and the coefficient matrices are

$$\Gamma = \begin{bmatrix} \gamma_{11} & \gamma_{12} & \cdots & \gamma_{1M} \\ \gamma_{21} & \gamma_{22} & \cdots & \gamma_{2M} \\ \vdots & & & \vdots \\ \gamma_{M1} & \gamma_{M2} & \cdots & \gamma_{MM} \end{bmatrix}_{M \times M}, \qquad B = \begin{bmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1M} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2M} \\ \vdots & & & \vdots \\ \beta_{K1} & \beta_{K2} & \cdots & \beta_{KM} \end{bmatrix}_{K \times M}.$$

Some definitions: $Y_{t,j}$ is the $j$th endogenous variable and $X_{t,j}$ is the $j$th exogenous or predetermined variable. The equations in $Y\Gamma + XB = E$ are referred to as structural equations, and $\Gamma$ and $B$ are the structural parameters. To examine the assumptions about the error terms, rewrite the $E$ matrix as

$$\tilde{E} = vec(E) = (\varepsilon_{11}, \varepsilon_{21}, \ldots, \varepsilon_{T1}, \varepsilon_{12}, \varepsilon_{22}, \ldots, \varepsilon_{T2}, \ldots, \varepsilon_{1M}, \varepsilon_{2M}, \ldots, \varepsilon_{TM})'.$$

We assume

$$E(\tilde{E}) = 0, \qquad E(\tilde{E}\tilde{E}') = \Sigma \otimes I_T,$$

where the variance-covariance matrix of $\varepsilon_t = (\varepsilon_{t1}, \varepsilon_{t2}, \ldots, \varepsilon_{tM})'$ is

$$\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1M} \\ \vdots & \ddots & \vdots \\ \sigma_{M1} & \cdots & \sigma_{MM} \end{bmatrix}.$$

Reduced Form

The reduced-form solution to the structural system is

$$Y = -XB\Gamma^{-1} + E\Gamma^{-1} = X\Pi + V,$$

where $\Pi = -B\Gamma^{-1}$, $V = E\Gamma^{-1}$, and the error vector $\tilde{V} = vec(V)$ satisfies

$$E(\tilde{V}) = 0, \qquad E(\tilde{V}\tilde{V}') = (\Gamma^{-1\prime} \otimes I_T)(\Sigma \otimes I_T)(\Gamma^{-1} \otimes I_T) = \Omega \otimes I_T,$$

where $\Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}$.

Demand and Supply Example

Consider the following demand and supply equations:

$$Q_t^s = \alpha_0 + \alpha_1 P_t + \alpha_2 W_t + \alpha_3 Z_t + \varepsilon_t^s$$
$$Q_t^d = \beta_0 + \beta_1 P_t + \beta_2 Z_t + \varepsilon_t^d$$
$$Q_t^s = Q_t^d,$$

where $Q_t^s$, $Q_t^d$ and $P_t$ are endogenous variables and $W_t$ and $Z_t$ are exogenous variables. Let $Q_t = Q_t^s = Q_t^d$. In matrix form, the system $Y\Gamma + XB = E$ has

$$Y = \begin{bmatrix} Q_1 & P_1 \\ \vdots & \vdots \\ Q_T & P_T \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & W_1 & Z_1 \\ \vdots & \vdots & \vdots \\ 1 & W_T & Z_T \end{bmatrix}, \qquad E = \begin{bmatrix} \varepsilon_1^s & \varepsilon_1^d \\ \vdots & \vdots \\ \varepsilon_T^s & \varepsilon_T^d \end{bmatrix},$$

and

$$\Gamma = \begin{bmatrix} 1 & 1 \\ -\alpha_1 & -\beta_1 \end{bmatrix}, \qquad B = \begin{bmatrix} -\alpha_0 & -\beta_0 \\ -\alpha_2 & 0 \\ -\alpha_3 & -\beta_2 \end{bmatrix}.$$

Identification

The identification question: given data on $X$ and $Y$, can we identify $\Gamma$, $B$ and $\Sigma$?

Estimation of $\Pi$ and $\Omega$

Begin by making the standard assumptions about the reduced form $Y = X\Pi + V$:

$$plim\left(\tfrac{1}{T}X'X\right) = Q, \qquad plim\left(\tfrac{1}{T}X'V\right) = 0, \qquad plim\left(\tfrac{1}{T}V'V\right) = \Omega.$$

These assumptions imply that the equation-by-equation OLS estimates of $\Pi$ and $\Omega$ are consistent.

Relationship Between $(\Pi, \Omega)$ and $(\Gamma, B, \Sigma)$

With these estimates ($\hat{\Pi}$ and $\hat{\Omega}$) in hand, the question is whether we can map back to $\Gamma$, $B$ and $\Sigma$. We know the following relations:

$$\Pi\Gamma = -B \qquad \text{and} \qquad \Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}.$$

To see whether identification is possible, we can count the number of known elements on the left-hand side and compare with the number of unknown elements on the right-hand side.

Number of known elements: $KM$ elements in $\Pi$ plus $M(M+1)/2$ distinct elements in $\Omega$, for a total of $M\left(K + \tfrac{M+1}{2}\right)$.

Number of unknown elements: $M^2$ elements in $\Gamma$, $M(M+1)/2$ distinct elements in $\Sigma$, and $KM$ elements in $B$, for a total of $M\left(M + K + \tfrac{M+1}{2}\right)$.

Therefore, we are $M^2$ pieces of information shy of identifying the structural parameters. In other words, there is more than one set of structural parameters consistent with the reduced form; we say the model is underidentified.

Identification Conditions

There are several possibilities for obtaining identification:

1. Normalization (i.e., set $\gamma_{ii} = 1$ for $i = 1, \ldots, M$).
2. Identities (e.g., the national income accounting identity).
3. Exclusion restrictions (e.g., demand and supply shift factors).
4. Other linear (and nonlinear) restrictions (e.g., the Blanchard-Quah long-run restriction).

Rank and Order Conditions

Begin by rewriting the $i$th equation of $\Pi\Gamma = -B$ in matrix form as

$$\begin{bmatrix} \Pi & I_K \end{bmatrix}\begin{bmatrix} \Gamma_i \\ B_i \end{bmatrix} = 0,$$

where $\Gamma_i$ and $B_i$ represent the $i$th columns of $\Gamma$ and $B$, respectively. Since the rank of $[\Pi \ \ I_K]$ equals $K$, this represents a system of $K$ equations in $M + K - 1$ unknowns (after normalization). To achieve identification, we introduce linear restrictions as follows:

$$R_i\begin{bmatrix} \Gamma_i \\ B_i \end{bmatrix} = 0,$$

where $Rank(R_i) = J$. Putting the two sets of equations together and defining $\delta_i = (\Gamma_i', B_i')'$ gives

$$\begin{bmatrix} [\Pi \ \ I_K] \\ R_i \end{bmatrix}\delta_i = 0.$$

From this discussion, it is clear that $R_i$ must provide at least $M - 1$ new pieces of information. Here are the formal rank and order conditions.

Order Condition. The order condition states that $Rank(R_i) = J \geq M - 1$ is a necessary but not sufficient condition for identification. The order condition can fail to be sufficient when the coefficient vector of some other equation $j$ also satisfies the restrictions, i.e., $R_i\delta_j = 0$.

Rank Condition. The rank condition states that $Rank(R_i\Delta) = M - 1$, where $\Delta = [\Gamma', B']'$ stacks the full set of structural coefficients, is a necessary and sufficient condition for identification.

We can now summarize all possible identification outcomes:

- Underidentification: either $Rank(R_i) < M - 1$ or $Rank(R_i\Delta) < M - 1$; the $i$th equation is underidentified.
- Exact identification: $Rank(R_i) = M - 1$ and $Rank(R_i\Delta) = M - 1$; the $i$th equation is exactly identified.
- Overidentification: $Rank(R_i) > M - 1$ and $Rank(R_i\Delta) = M - 1$; the $i$th equation is overidentified.

Identification Conditions in the Demand and Supply Example

Begin with supply, and note that $M - 1 = 1$. The order condition is simple: since all the variables appear in the supply equation, there is no restriction matrix $R_s$, so $Rank(R_s) = 0 < 1$. The supply equation is underidentified, and there is no need to check the rank condition.

Next, consider demand. The demand equation excludes $W_t$, so with $\delta_d$ stacking the demand column of $\Gamma$ and $B$, the restriction matrix is the single row that picks out the coefficient on $W_t$:

$$R_d = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \end{bmatrix},$$

for which the order condition is clearly satisfied (i.e., $Rank(R_d) = 1 = M - 1$). For the rank condition, we need the rank of $R_d\Delta$, which picks out the row of coefficients on $W_t$ in the two equations:

$$R_d\Delta = \begin{bmatrix} -\alpha_2 & 0 \end{bmatrix}.$$

Provided $\alpha_2 \neq 0$, $Rank(R_d\Delta) = 1 = M - 1$, so the demand equation is exactly identified.

Limited-Information Estimation

We will consider five different limited-information estimation techniques: OLS, indirect least squares (ILS), instrumental variable (IV) estimation, two-stage least squares (2SLS) and limited-information maximum likelihood (LIML). The term "limited information" refers to equation-by-equation estimation, as opposed to full-information estimation, which exploits the linkages among the different equations.

Begin by writing the $i$th equation as

$$y_i = Y_i\gamma_i + \bar{Y}_i\bar{\gamma}_i + X_i\beta_i + \bar{X}_i\bar{\beta}_i + \varepsilon_i,$$

where $Y_i$ represents the matrix of endogenous variables (other than $y_i$) included in the $i$th equation, $\bar{Y}_i$ represents the matrix of endogenous variables excluded from the $i$th equation, and similarly for $X_i$ and $\bar{X}_i$. The exclusion restrictions impose $\bar{\gamma}_i = 0$ and $\bar{\beta}_i = 0$, so that

$$y_i = Y_i\gamma_i + X_i\beta_i + \varepsilon_i = \begin{bmatrix} Y_i & X_i \end{bmatrix}\begin{bmatrix} \gamma_i \\ \beta_i \end{bmatrix} + \varepsilon_i = Z_i\delta_i + \varepsilon_i.$$

Ordinary Least Squares (OLS)

The OLS estimator of $\delta_i$ is

$$\hat{\delta}_i^{OLS} = (Z_i'Z_i)^{-1}(Z_i'y_i),$$

and its expected value is

$$E(\hat{\delta}_i^{OLS}) = \delta_i + E\left[(Z_i'Z_i)^{-1}Z_i'\varepsilon_i\right].$$

However, since $y_i$ and $Y_i$ are jointly determined (recall $Z_i$ contains $Y_i$), we cannot expect that $E(Z_i'\varepsilon_i) = 0$ or $plim\left(\tfrac{1}{T}Z_i'\varepsilon_i\right) = 0$. Therefore, OLS estimates will be biased and inconsistent. This is commonly known as simultaneity or endogeneity bias.

Indirect Least Squares (ILS)

The indirect least squares estimator simply uses the consistent reduced-form estimates ($\hat{\Pi}$ and $\hat{\Omega}$) and the relations $\Pi\Gamma = -B$ and $\Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}$ to solve for $\Gamma$, $B$ and $\Sigma$. The ILS estimator is only feasible if the system is exactly identified. To see this, write the relation $\Pi\Gamma_i = -B_i$ for the $i$th equation using $\hat{\Pi} = (X'X)^{-1}X'Y$. Substituting and applying the normalization and exclusion restrictions gives

$$(X'X)^{-1}X'\left(y_i - Y_i\hat{\gamma}_i\right) = \begin{bmatrix} \hat{\beta}_i \\ 0 \end{bmatrix}.$$

Multiplying through by $X'X$ gives

$$X'y_i - X'Y_i\hat{\gamma}_i - X'X_i\hat{\beta}_i = 0 \quad \Longrightarrow \quad X'y_i = X'Z_i\hat{\delta}_i.$$

Therefore, the ILS estimator for the $i$th equation can be written as

$$\hat{\delta}_i^{ILS} = (X'Z_i)^{-1}(X'y_i).$$

There are three cases:

1. If the $i$th equation is exactly identified, then $X'Z_i$ is square and invertible.
2. If the $i$th equation is underidentified, then $X'Z_i$ is not square (more columns than rows).
3. If the $i$th equation is overidentified, then $X'Z_i$ is not square (more rows than columns), although a subset of the rows could be used to obtain consistent, albeit inefficient, estimates of $\delta_i$.

Instrumental Variable (IV) Estimation

Let $W_i$ be an instrument matrix of dimension $T \times (K_i + M_i)$, the same dimension as $Z_i$, satisfying

1. $plim\left(\tfrac{1}{T}W_i'Z_i\right) = \Sigma_{wz}$, a finite invertible matrix;
2. $plim\left(\tfrac{1}{T}W_i'W_i\right) = \Sigma_{ww}$, a finite positive-definite matrix;
3. $plim\left(\tfrac{1}{T}W_i'\varepsilon_i\right) = 0$.

Since $plim\left(\tfrac{1}{T}Z_i'\varepsilon_i\right) \neq 0$, we can instead examine

$$plim\left(\tfrac{1}{T}W_i'y_i\right) = plim\left(\tfrac{1}{T}W_i'Z_i\right)\delta_i + plim\left(\tfrac{1}{T}W_i'\varepsilon_i\right) = plim\left(\tfrac{1}{T}W_i'Z_i\right)\delta_i.$$

Naturally, the instrumental variable estimator is

$$\hat{\delta}_i^{IV} = (W_i'Z_i)^{-1}(W_i'y_i),$$

which is consistent and has asymptotic variance-covariance matrix

$$asy.var(\hat{\delta}_i^{IV}) = \frac{\sigma_{ii}}{T}\left[\Sigma_{wz}^{-1}\Sigma_{ww}\Sigma_{zw}^{-1}\right].$$

This can be estimated by

$$est.asy.var(\hat{\delta}_i^{IV}) = \hat{\sigma}_{ii}(W_i'Z_i)^{-1}W_i'W_i(Z_i'W_i)^{-1}, \qquad \hat{\sigma}_{ii} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{IV})'(y_i - Z_i\hat{\delta}_i^{IV}).$$
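A small simulation makes the simultaneity bias, and its IV fix, concrete. The sketch below (Python/NumPy; the toy demand/supply coefficients are illustrative assumptions, not from the notes) uses demand $q = -p + u$ and supply $q = p + z + v$, so that equilibrium gives $p = (u - v - z)/2$: the price is correlated with the demand error $u$, OLS on the demand equation has plim $-1/3$ rather than $-1$, while the excluded supply shifter $z$ is a valid instrument:

```python
import numpy as np

# Hedged sketch: simultaneity bias of OLS and its IV fix in a toy
# demand/supply system (slopes -1 and +1 are illustrative choices).
#   demand: q = -1*p + u     supply: q = 1*p + 1*z + v
# Equilibrium: p = (u - v - z)/2, so Cov(p, u) != 0 and OLS on the
# demand equation is inconsistent; z shifts supply only, so it is
# a valid instrument for p in the demand equation.
rng = np.random.default_rng(3)
n = 100_000
u, v, z = rng.normal(size=(3, n))
p = (u - v - z) / 2
q = -p + u                          # demand equation holds by construction

b_ols = (p @ q) / (p @ p)           # plim = Cov(p,q)/Var(p) = -1/3
b_iv = (z @ q) / (z @ p)            # plim = Cov(z,q)/Cov(z,p) = -1
```

Here `b_ols` converges to $-1/3$, not the true demand slope $-1$, while the simple IV ratio `b_iv` is consistent, mirroring the $plim\left(\tfrac{1}{T}W_i'\varepsilon_i\right) = 0$ condition above.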

A degrees-of-freedom correction is optional. Notice that ILS is a special case of IV estimation for an exactly identified equation, with $W_i = X$.

Two-Stage Least Squares (2SLS)

When an equation in the system is overidentified (i.e., $rows(X'Z_i) > cols(X'Z_i)$), a convenient and intuitive IV estimator is the two-stage least squares estimator. The 2SLS estimator works as follows:

Stage 1. Regress $Y_i$ on $X$ and form $\hat{Y}_i = X\hat{\Pi}_i^{OLS}$.

Stage 2. Estimate $\delta_i$ by an OLS regression of $y_i$ on $\hat{Y}_i$ and $X_i$.

More formally, let $\hat{Z}_i = (\hat{Y}_i, X_i)$. The 2SLS estimator is given by

$$\hat{\delta}_i^{2SLS} = (\hat{Z}_i'\hat{Z}_i)^{-1}(\hat{Z}_i'y_i),$$

where the asymptotic variance-covariance matrix of $\hat{\delta}_i^{2SLS}$ can be estimated consistently by

$$est.asy.var(\hat{\delta}_i^{2SLS}) = \hat{\sigma}_{ii}(\hat{Z}_i'\hat{Z}_i)^{-1}, \qquad \hat{\sigma}_{ii} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{2SLS})'(y_i - Z_i\hat{\delta}_i^{2SLS}).$$

Limited-Information Maximum Likelihood (LIML)

Limited-information maximum likelihood estimation refers to ML estimation of a single equation in the system. For example, if we assume normally distributed errors, then we can form the joint probability distribution function of $(y_i, Y_i)$ and maximize it by choosing $\delta_i$ and the appropriate elements of $\Sigma$. Since the LIML estimator is more complex than, but asymptotically equivalent to, the 2SLS estimator, it is not widely used.

Note: when the $i$th equation is exactly identified, $\hat{\delta}_i^{ILS} = \hat{\delta}_i^{IV} = \hat{\delta}_i^{2SLS} = \hat{\delta}_i^{LIML}$.

Full-Information Estimation

The five estimators mentioned above are not fully efficient because they ignore cross-equation relationships between the error terms and any omitted endogenous variables. We consider two fully efficient estimators below.
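The two-stage recipe can be sketched directly (Python/NumPy; the single-equation design with one endogenous regressor and two excluded instruments is an illustrative assumption). Stage 1 projects the endogenous regressor on all the instruments; Stage 2 runs OLS on the fitted values:

```python
import numpy as np

# Hedged sketch of 2SLS for one overidentified equation.
# DGP (illustrative): x is endogenous because it loads on the error u;
# z1 and z2 are excluded instruments.
rng = np.random.default_rng(4)
n = 50_000
z1, z2, u, v = rng.normal(size=(4, n))
x = z1 + 0.5 * z2 + v + 0.5 * u               # correlated with u -> OLS biased
y = 2.0 * x + u                               # true slope is 2.0

X_inst = np.column_stack([np.ones(n), z1, z2])  # instrument set (with constant)

# Stage 1: project the endogenous regressor on all instruments.
xhat = X_inst @ np.linalg.lstsq(X_inst, x, rcond=None)[0]

# Stage 2: OLS of y on the fitted values (plus the constant).
Zhat = np.column_stack([np.ones(n), xhat])
b_2sls = np.linalg.solve(Zhat.T @ Zhat, Zhat.T @ y)
```

With two excluded instruments for one endogenous regressor, the equation is overidentified, and `b_2sls[1]` recovers the true slope 2.0 up to sampling error while plain OLS of `y` on `x` would not.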

Three-Stage Least Squares (3SLS)

Begin by writing the full system in stacked form as

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} Z_1 & 0 & \cdots & 0 \\ 0 & Z_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & Z_M \end{bmatrix}\begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_M \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{bmatrix}, \qquad \text{or} \qquad y = Z\delta + \varepsilon,$$

where $\varepsilon = \tilde{E}$ and $E(\varepsilon\varepsilon') = \Sigma \otimes I_T$. Then, applying the principle from SUR estimation, the fully efficient estimator is

$$\hat{\delta} = \left[W'(\Sigma^{-1} \otimes I)Z\right]^{-1}\left[W'(\Sigma^{-1} \otimes I)y\right],$$

where $W$ indicates an instrumental variable matrix in the form of $Z$. Zellner and Theil (1962) suggest the following three-stage procedure for estimating $\delta$:

Stage 1. Calculate $\hat{Y}_i$ for each equation ($i = 1, \ldots, M$) using OLS and the reduced form.

Stage 2. Use $\hat{Y}_i$ to calculate $\hat{\delta}_i^{2SLS}$ and

$$\hat{\sigma}_{ij} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{2SLS})'(y_j - Z_j\hat{\delta}_j^{2SLS}).$$

Stage 3. Calculate the IV-GLS estimator

$$\hat{\delta}^{3SLS} = \left[Z'(\hat{\Sigma}^{-1} \otimes X(X'X)^{-1}X')Z\right]^{-1}\left[Z'(\hat{\Sigma}^{-1} \otimes X(X'X)^{-1}X')y\right] = \left[\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)\hat{Z}\right]^{-1}\left[\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)y\right].$$

The asymptotic variance-covariance matrix can be estimated by

$$est.asy.var(\hat{\delta}^{3SLS}) = \left[\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)\hat{Z}\right]^{-1}.$$

Full-Information Maximum Likelihood (FIML)

The full-information maximum likelihood estimator is asymptotically efficient. Assuming multivariate normally distributed errors, we maximize

$$\ln L(\delta, \Sigma \mid y, Z) = -\frac{MT}{2}\ln(2\pi) - \frac{T}{2}\ln|\Sigma| + T\ln|\det\Gamma| - \frac{1}{2}(y - Z\delta)'(\Sigma^{-1} \otimes I_T)(y - Z\delta)$$
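The three stages above can be sketched end to end (Python/NumPy; the two-equation system, coefficients and error correlation are illustrative assumptions, not from the notes). Stage 1-2 is equation-by-equation 2SLS plus $\hat{\Sigma}$ from the 2SLS residuals; Stage 3 is GLS on the stacked system using the instrumented regressors:

```python
import numpy as np

# Hedged sketch of 3SLS for the illustrative two-equation system
#   y1 = 0.5*y2 + 1.0*x1 + e1
#   y2 = 0.2*y1 + 1.0*x2 + 1.0*x3 + e2,  with Corr(e1, e2) = 0.5.
rng = np.random.default_rng(5)
n = 1000
x1, x2, x3 = rng.normal(size=(3, n))
E = rng.normal(size=(n, 2)) @ np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]])).T
A = np.array([[1.0, -0.5], [-0.2, 1.0]])          # structural coefficient matrix
rhs = np.column_stack([x1 + E[:, 0], x2 + x3 + E[:, 1]])
Y = rhs @ np.linalg.inv(A).T                      # solve A @ (y1,y2)' = rhs per obs
y1, y2 = Y[:, 0], Y[:, 1]

X = np.column_stack([x1, x2, x3])                 # all exogenous variables
proj = lambda Z: X @ np.linalg.lstsq(X, Z, rcond=None)[0]
Z1, Z2 = np.column_stack([y2, x1]), np.column_stack([y1, x2, x3])
Zh1, Zh2 = proj(Z1), proj(Z2)                     # Stage 1: fitted regressors

# Stage 2: equation-by-equation 2SLS and the residual covariance matrix
d1 = np.linalg.solve(Zh1.T @ Zh1, Zh1.T @ y1)
d2 = np.linalg.solve(Zh2.T @ Zh2, Zh2.T @ y2)
res = np.column_stack([y1 - Z1 @ d1, y2 - Z2 @ d2])
Sigma = res.T @ res / n

# Stage 3: GLS on the stacked system with the instrumented regressors
y = np.concatenate([y1, y2])
Zh = np.block([[Zh1, np.zeros_like(Zh2)], [np.zeros_like(Zh1), Zh2]])
W = np.kron(np.linalg.inv(Sigma), np.eye(n))
d_3sls = np.linalg.solve(Zh.T @ W @ Zh, Zh.T @ W @ y)
```

In this design `d_3sls` stacks the two equations' coefficients, $(0.5, 1.0)$ and $(0.2, 1.0, 1.0)$, and the Stage 3 GLS step exploits the cross-equation error correlation that equation-by-equation 2SLS ignores.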

by choosing $\Gamma$, $B$ and $\Sigma$. The FIML estimator can be computationally burdensome and has the same asymptotic distribution as the 3SLS estimator. As a result, most researchers use 3SLS.

Simultaneous Equations Gauss Application

Consider estimating a traditional Keynesian consumption function using quarterly data. The simultaneous system is

$$C_t = \alpha_0 + \alpha_1 DI_t + \varepsilon_t^c \qquad (8)$$
$$DI_t = \beta_0 + C_t + I_t + G_t + NX_t + \varepsilon_t^y \qquad (9)$$

where the variables are defined as follows:

Endogenous variables: $C_t$ (consumption), $DI_t$ (disposable income).
Exogenous variables: $I_t$ (investment), $G_t$ (government spending), $NX_t$ (net exports).

The conditions for identification of (8), the more interesting equation to be estimated, are shown below. Begin by writing the system in the form $Y\Gamma + XB = E$ with $Y_t = (C_t, DI_t)$ and $X_t = (1, I_t, G_t, NX_t)$:

$$\Gamma = \begin{bmatrix} 1 & -1 \\ -\alpha_1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} -\alpha_0 & -\beta_0 \\ 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{bmatrix}.$$

The order condition depends on the rank of the restriction matrix for (8), which imposes the exclusion of $I_t$, $G_t$ and $NX_t$:

$$R_1 = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix},$$

which obviously has $Rank(R_1) = 3 > M - 1 = 1$. Therefore, equation (8) is overidentified provided the rank condition is satisfied (i.e., $Rank(R_1\Delta) = M - 1 = 1$). The relevant matrix for the rank condition is

$$R_1\Delta = R_1\begin{bmatrix} 1 & -1 \\ -\alpha_1 & 1 \\ -\alpha_0 & -\beta_0 \\ 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{bmatrix},$$

which has $Rank(R_1\Delta) = 1$. Therefore, equation (8) is overidentified.

Another way to see that (8) is overidentified is to solve for the reduced-form representation of $C_t$:

$$C_t = \alpha_0 + \alpha_1\left[\beta_0 + C_t + I_t + G_t + NX_t + \varepsilon_t^y\right] + \varepsilon_t^c = \frac{1}{1-\alpha_1}\left[(\alpha_0 + \alpha_1\beta_0) + \alpha_1 I_t + \alpha_1 G_t + \alpha_1 NX_t + (\varepsilon_t^c + \alpha_1\varepsilon_t^y)\right],$$

or

$$C_t = \pi_0 + \pi_1 I_t + \pi_2 G_t + \pi_3 NX_t + v_t. \qquad (10)$$

Since $\pi_1 = \pi_2 = \pi_3 = \alpha_1/(1-\alpha_1)$, unrestricted estimation of (10) is likely to produce three distinct estimates of $\alpha_1$, which reflects the overidentification problem. See the Gauss examples for OLS and 2SLS estimation of (8).