ECON 0 Class Notes

Systems of Equations

Introduction

Here, we consider two types of systems of equations.

The first type is a system of Seemingly Unrelated Regressions (SUR), introduced by Zellner (1962). Sets of equations with distinct dependent and independent variables are often linked together by some common unmeasurable factor. Examples include systems of factor demands by a particular firm, agricultural supply-response equations, and capital-asset pricing models. The methods presented here can also be thought of as an alternative estimation framework for panel-data models.

The second type is a Simultaneous Equations System, which involves the interaction of multiple endogenous variables within a system of equations. Estimating the parameters of such a system is typically not as simple as doing OLS equation-by-equation. Issues such as identification (whether the parameters are even estimable) and endogeneity bias are the primary topics in this section.

Seemingly Unrelated Regressions (SUR) Model

Consider the following set of equations
$$y_i = X_i \beta_i + \varepsilon_i \tag{1}$$
for $i = 1, \ldots, M$, where $y_i$, $X_i$ and $\beta_i$ are of dimension $(T \times 1)$, $(T \times K_i)$ and $(K_i \times 1)$, respectively. (Although each equation typically represents a separate time series, it is possible that $T$ instead denotes the number of cross sections within an equation.) The stacked system in matrix form is
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & X_M \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_M \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{pmatrix},$$
or $y = X\beta + \varepsilon$. Although each of the $M$ equations may seem unrelated (i.e., each has potentially distinct coefficient vectors, dependent variables and explanatory variables), the equations in (1) are linked through their (mean-zero)
error structure
$$\Omega = E(\varepsilon\varepsilon') = \begin{pmatrix} \sigma_{11} I_T & \sigma_{12} I_T & \cdots & \sigma_{1M} I_T \\ \sigma_{21} I_T & \sigma_{22} I_T & \cdots & \sigma_{2M} I_T \\ \vdots & & \ddots & \vdots \\ \sigma_{M1} I_T & \sigma_{M2} I_T & \cdots & \sigma_{MM} I_T \end{pmatrix} = \Sigma \otimes I_T,$$
which is of dimension $(MT \times MT)$, where the $(M \times M)$ matrix $\Sigma$ is the variance-covariance matrix of the error vector $\varepsilon_t = (\varepsilon_{t1}, \ldots, \varepsilon_{tM})'$ for each $t = 1, \ldots, T$.

Generalized Least Squares (GLS)

The system resembles the one we studied in the earlier chapter on nonspherical disturbances. The efficient estimator in this context is the GLS estimator
$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}y) = [X'(\Sigma^{-1} \otimes I)X]^{-1}[X'(\Sigma^{-1} \otimes I)y].$$
Assuming all the classical assumptions hold (other than that of spherical disturbances), GLS is the best linear unbiased estimator. There are two important conditions under which GLS does not provide any efficiency gains over OLS:

1. $\sigma_{ij} = 0$ for all $i \neq j$. When all the contemporaneous correlations across equations equal zero, the equations are not linked in any fashion and GLS does not provide any efficiency gains. In fact, one can show that $b = \hat{\beta}$, where $b$ is the equation-by-equation OLS estimator.

2. $X_1 = X_2 = \cdots = X_M$. When the explanatory variables are identical across equations, $b = \hat{\beta}$.

As a rule, the efficiency gains of GLS over OLS tend to be greater when the contemporaneous correlation in errors across equations ($\sigma_{ij}$) is greater and there is less correlation between the explanatory variables across equations.

Feasible Generalized Least Squares (FGLS)

Typically, $\Sigma$ is not known. Assuming that a consistent estimator of $\Sigma$ is available, the feasible GLS estimator
$$\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}(X'\hat{\Omega}^{-1}y) = [X'(\hat{\Sigma}^{-1} \otimes I)X]^{-1}[X'(\hat{\Sigma}^{-1} \otimes I)y] \tag{2}$$
will be a consistent estimator of $\beta$. The typical estimator of $\Sigma$ is
$$\hat{\Sigma} = \begin{pmatrix} \hat{\sigma}_{11} & \cdots & \hat{\sigma}_{1M} \\ \vdots & \ddots & \vdots \\ \hat{\sigma}_{M1} & \cdots & \hat{\sigma}_{MM} \end{pmatrix} = \begin{pmatrix} \frac{e_1'e_1}{T} & \cdots & \frac{e_1'e_M}{T} \\ \vdots & \ddots & \vdots \\ \frac{e_M'e_1}{T} & \cdots & \frac{e_M'e_M}{T} \end{pmatrix}, \tag{3}$$
where $e_i$, $i = 1, \ldots, M$, represent the OLS residuals. Degrees of freedom corrections for the elements in $\hat{\Sigma}$ are possible, but will not generally produce unbiasedness. It is also possible to iterate on (2) and (3) until convergence, which will produce the maximum likelihood estimator under multivariate normal errors. In other words, $\hat{\beta}_{FGLS}$ and $\hat{\beta}_{ML}$ will have the same limiting distributions, such that
$$\hat{\beta}_{ML,FGLS} \overset{asy}{\sim} N(\beta, V_\beta),$$
where $V_\beta$ is consistently estimated by $\hat{V}_\beta = [X'(\hat{\Sigma}^{-1} \otimes I)X]^{-1}$.

Maximum Likelihood

Although asymptotically equivalent, maximum likelihood is an alternative estimator to FGLS that will provide different answers in small samples. Begin by rewriting the model for the $t$-th observation as
$$Y_t = \begin{pmatrix} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{M,t} \end{pmatrix} = \begin{pmatrix} \beta_1' x_t \\ \beta_2' x_t \\ \vdots \\ \beta_M' x_t \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ \vdots \\ \varepsilon_{M,t} \end{pmatrix},$$
where $x_t$ is the vector of all the different explanatory variables in the system and each $\beta_i$ is the column vector of coefficients for the $i$-th equation (unless each equation contains all explanatory variables, there will be zeros in $\beta_i$ to allow for exclusion restrictions). Assuming multivariate normally distributed errors, the log likelihood function is
$$\log L = \sum_{t=1}^T \log L_t = -\frac{MT}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma| - \frac{1}{2}\sum_{t=1}^T \varepsilon_t' \Sigma^{-1} \varepsilon_t, \tag{4}$$
where, as defined earlier, $\Sigma = E(\varepsilon_t \varepsilon_t')$. The maximum likelihood estimates are found by taking the derivatives of (4) with respect to $\beta$ and $\Sigma$, setting them equal to zero and solving.
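The FGLS recipe in (2) and (3) amounts to a few lines of linear algebra. The sketch below is illustrative only (not the course's Gauss code): it simulates a hypothetical two-equation SUR system with invented parameter values, runs OLS equation-by-equation to build $\hat{\Sigma}$ as in (3), and then applies (2) using the Kronecker identity $\hat{\Omega}^{-1} = \hat{\Sigma}^{-1} \otimes I_T$.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200  # invented sample size; two equations (M = 2)

# Simulate two equations whose errors are contemporaneously correlated
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
Sigma_true = np.array([[1.0, 0.7], [0.7, 1.5]])
E = rng.multivariate_normal([0.0, 0.0], Sigma_true, size=T)
y1 = X1 @ np.array([1.0, 2.0]) + E[:, 0]
y2 = X2 @ np.array([-0.5, 1.0]) + E[:, 1]

def ols(y, X):
    """Equation-by-equation OLS coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: OLS residuals for each equation
e1 = y1 - X1 @ ols(y1, X1)
e2 = y2 - X2 @ ols(y2, X2)

# Step 2: Sigma-hat with elements e_i'e_j / T, as in (3)
Sigma_hat = np.array([[e1 @ e1, e1 @ e2],
                      [e2 @ e1, e2 @ e2]]) / T

# Step 3: FGLS on the stacked (block-diagonal) system, as in (2)
X = np.block([[X1, np.zeros_like(X2)],
              [np.zeros_like(X1), X2]])
y = np.concatenate([y1, y2])
Omega_inv = np.kron(np.linalg.inv(Sigma_hat), np.eye(T))
beta_fgls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
```

For larger systems one would exploit the Kronecker structure rather than form the $MT \times MT$ matrix explicitly, but the direct version keeps the correspondence with (2) transparent.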
Hypothesis Testing

We consider two types of tests: tests for contemporaneous correlation between errors, and tests for linear restrictions on the coefficients.

Contemporaneous Correlation

If there is no contemporaneous correlation between errors in different equations (i.e., $\Sigma$ is diagonal), then OLS equation-by-equation is fully efficient. Therefore, it is useful to test the following restriction:
$$H_0: \sigma_{ij} = 0 \;\; \forall i \neq j \qquad\qquad H_A: H_0 \text{ false.}$$
Breusch and Pagan suggest using the Lagrange multiplier test statistic
$$\lambda = T \sum_{i=2}^{M} \sum_{j=1}^{i-1} r_{ij}^2,$$
where $r_{ij}$ is calculated using the OLS residuals as follows:
$$r_{ij} = \frac{e_i'e_j}{\sqrt{(e_i'e_i)(e_j'e_j)}}.$$
Under the null hypothesis, $\lambda$ is asymptotically chi-squared with $M(M-1)/2$ degrees of freedom.

Restrictions on Coefficients

The general F test presented in the earlier chapter can be extended to the SUR system. However, since the statistic requires using $\hat{\Sigma}$, the test will only be valid asymptotically. Consider testing the following $J$ linear restrictions:
$$H_0: R\beta = q \qquad\qquad H_A: H_0 \text{ false,}$$
where $\beta = (\beta_1, \beta_2, \ldots, \beta_M)'$. Within the SUR framework, it is possible to test coefficient restrictions across equations. One possible test statistic is
$$W = (R\hat{\beta}_{FGLS} - q)'[R\,\widehat{var}(\hat{\beta}_{FGLS})\,R']^{-1}(R\hat{\beta}_{FGLS} - q),$$
which has an asymptotic chi-square distribution with $J$ degrees of freedom.
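The Breusch-Pagan LM statistic is computable from the equation-by-equation OLS residuals alone. A minimal sketch (the function name and the simulated inputs are my own, for illustration):

```python
import numpy as np

def breusch_pagan_lm(resids):
    """LM test of H0: Sigma is diagonal in a SUR system.
    resids: (T, M) array of equation-by-equation OLS residuals.
    Returns (statistic, degrees of freedom)."""
    T, M = resids.shape
    lam = 0.0
    for i in range(1, M):
        for j in range(i):
            ei, ej = resids[:, i], resids[:, j]
            r_ij = (ei @ ej) / np.sqrt((ei @ ei) * (ej @ ej))
            lam += r_ij ** 2
    # Under H0: asymptotically chi-square with M(M-1)/2 df
    return T * lam, M * (M - 1) // 2

# Illustrative use: with independent residuals the statistic should be small
rng = np.random.default_rng(1)
stat, df = breusch_pagan_lm(rng.normal(size=(500, 3)))
```

One would reject $H_0$ when the statistic exceeds the chi-square critical value with $M(M-1)/2$ degrees of freedom, and then prefer FGLS over equation-by-equation OLS.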
Autocorrelation

Heteroscedasticity and autocorrelation are possibilities within the SUR framework. I will focus on autocorrelation, because SUR systems are often comprised of time-series observations for each equation. Assume the errors follow
$$\varepsilon_{i,t} = \rho_i \varepsilon_{i,t-1} + u_{it},$$
where $u_{it}$ is white noise. The overall error structure will now be
$$E(\varepsilon\varepsilon') = \begin{pmatrix} \sigma_{11}\Omega_{11} & \sigma_{12}\Omega_{12} & \cdots & \sigma_{1M}\Omega_{1M} \\ \vdots & & \ddots & \vdots \\ \sigma_{M1}\Omega_{M1} & \sigma_{M2}\Omega_{M2} & \cdots & \sigma_{MM}\Omega_{MM} \end{pmatrix}_{MT \times MT},$$
where
$$\Omega_{ij} = \begin{pmatrix} 1 & \rho_j & \rho_j^2 & \cdots & \rho_j^{T-1} \\ \rho_i & 1 & \rho_j & \cdots & \rho_j^{T-2} \\ \vdots & & \ddots & & \vdots \\ \rho_i^{T-1} & \rho_i^{T-2} & \cdots & \rho_i & 1 \end{pmatrix}.$$
The following three-step approach is recommended:

1. Run OLS equation-by-equation.
2. Compute a consistent estimate of each $\rho_i$ (e.g., $\hat{\rho}_i = (\sum_{t=2}^T e_{i,t}e_{i,t-1})/(\sum_{t=2}^T e_{i,t-1}^2)$) and transform the data, using either Prais-Winsten or Cochrane-Orcutt, to remove the autocorrelation.
3. Estimate $\hat{\Sigma}$ using the transformed data as suggested in (3), then use $\hat{\Sigma}$ and equation (2) to calculate the FGLS estimates.

SUR Gauss Application

Consider the data taken from Wooldridge (2002). The model attempts to explain wages and fringe benefits for workers:
$$Wages_t = X_{1t}'\beta_1 + \varepsilon_{1t}$$
$$Benefits_t = X_{2t}'\beta_2 + \varepsilon_{2t},$$
where $X_{1t} = X_{2t}$, so that OLS and FGLS will produce equivalent results. Although OLS and FGLS are equivalent, one advantage of FGLS within a SUR framework is that it allows you to test coefficient restrictions
across equations. Doing OLS equation-by-equation would not allow such tests. The variables are defined as follows:

Dependent Variables
  Wages         Hourly earnings in 1999 dollars
  Benefits      Hourly benefits (vacation, sick leave, insurance and pension) in 1999 dollars

Explanatory Variables
  Education     Years of schooling
  Experience    Years of work experience
  Tenure        Years with current employer
  Union         One if union member, zero otherwise
  South         One if live in south, zero otherwise
  Northeast     One if live in northeast, zero otherwise
  Northcentral  One if live in northcentral, zero otherwise
  Married       One if married, zero otherwise
  White         One if white, zero otherwise
  Male          One if male, zero otherwise

See Gauss example 9 for further details.

The Simultaneous Equation Model

The simultaneous system can be written as
$$Y\Gamma + XB = E, \tag{5}$$
where the variable matrices are
$$Y_{T \times M} = \begin{pmatrix} Y_{11} & Y_{12} & \cdots & Y_{1M} \\ Y_{21} & Y_{22} & \cdots & Y_{2M} \\ \vdots & & & \vdots \\ Y_{T1} & Y_{T2} & \cdots & Y_{TM} \end{pmatrix}, \quad X_{T \times K} = \begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1K} \\ X_{21} & X_{22} & \cdots & X_{2K} \\ \vdots & & & \vdots \\ X_{T1} & X_{T2} & \cdots & X_{TK} \end{pmatrix}, \quad E_{T \times M} = \begin{pmatrix} \varepsilon_{11} & \cdots & \varepsilon_{1M} \\ \vdots & & \vdots \\ \varepsilon_{T1} & \cdots & \varepsilon_{TM} \end{pmatrix}$$
and the coefficient matrices are
$$\Gamma_{M \times M} = \begin{pmatrix} \gamma_{11} & \gamma_{12} & \cdots & \gamma_{1M} \\ \gamma_{21} & \gamma_{22} & \cdots & \gamma_{2M} \\ \vdots & & & \vdots \\ \gamma_{M1} & \gamma_{M2} & \cdots & \gamma_{MM} \end{pmatrix}, \qquad B_{K \times M} = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1M} \\ b_{21} & b_{22} & \cdots & b_{2M} \\ \vdots & & & \vdots \\ b_{K1} & b_{K2} & \cdots & b_{KM} \end{pmatrix}.$$

Some definitions:

- $Y_{t,j}$ is the $j$th endogenous variable.
- $X_{t,j}$ is the $j$th exogenous or predetermined variable.
- Equations (5) are referred to as structural equations, and $\Gamma$ and $B$ are the structural parameters.

To examine the assumptions about the error terms, rewrite the $E$ matrix as
$$\tilde{E} = vec(E) = (\varepsilon_{11}, \varepsilon_{21}, \ldots, \varepsilon_{T1}, \varepsilon_{12}, \varepsilon_{22}, \ldots, \varepsilon_{T2}, \ldots, \varepsilon_{1M}, \varepsilon_{2M}, \ldots, \varepsilon_{TM})'.$$
We assume
$$E(\tilde{E}) = 0 \qquad\qquad E(\tilde{E}\tilde{E}') = \Sigma \otimes I_T,$$
where the variance-covariance matrix for $\varepsilon_t = (\varepsilon_{t1}, \varepsilon_{t2}, \ldots, \varepsilon_{tM})'$ is
$$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1M} \\ \vdots & \ddots & \vdots \\ \sigma_{M1} & \cdots & \sigma_{MM} \end{pmatrix}.$$

Reduced Form

The reduced-form solution to (5) is
$$Y = -XB\Gamma^{-1} + E\Gamma^{-1} = X\Pi + V,$$
where $\Pi = -B\Gamma^{-1}$, $V = E\Gamma^{-1}$, and the error vector $\tilde{V} = vec(V)$ satisfies
$$E(\tilde{V}) = 0 \qquad\qquad E(\tilde{V}\tilde{V}') = (\Gamma^{-1\prime}\Sigma\Gamma^{-1}) \otimes I_T = \Omega \otimes I_T,$$
where $\Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}$.

Demand and Supply Example

Consider the following demand and supply equations:
$$Q_t^s = \alpha_0 + \alpha_1 P_t + \alpha_2 W_t + \alpha_3 Z_t + \varepsilon_t^s$$
$$Q_t^d = \beta_0 + \beta_1 P_t + \beta_2 Z_t + \varepsilon_t^d$$
$$Q_t^s = Q_t^d,$$
where $Q_t^s$, $Q_t^d$ and $P_t$ are endogenous variables and $W_t$ and $Z_t$ are exogenous variables. Let $Q_t = Q_t^s = Q_t^d$. In matrix form, the system can be written as $Y\Gamma + XB = E$ with
$$Y = \begin{pmatrix} Q_1 & P_1 \\ \vdots & \vdots \\ Q_T & P_T \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & W_1 & Z_1 \\ \vdots & \vdots & \vdots \\ 1 & W_T & Z_T \end{pmatrix}, \qquad E = \begin{pmatrix} \varepsilon_1^s & \varepsilon_1^d \\ \vdots & \vdots \\ \varepsilon_T^s & \varepsilon_T^d \end{pmatrix}$$
and
$$\Gamma = \begin{pmatrix} 1 & 1 \\ -\alpha_1 & -\beta_1 \end{pmatrix}, \qquad B = \begin{pmatrix} -\alpha_0 & -\beta_0 \\ -\alpha_2 & 0 \\ -\alpha_3 & -\beta_2 \end{pmatrix}.$$

Identification

Identification Question: Given data on $X$ and $Y$, can we identify $\Gamma$, $B$ and $\Sigma$?

Estimation of $\Pi$ and $\Omega$

Begin by making the standard assumptions about the reduced form $Y = X\Pi + V$:

1. $plim(\frac{1}{T} X'X) = Q$, a finite positive-definite matrix
2. $plim(\frac{1}{T} X'V) = 0$
3. $plim(\frac{1}{T} V'V) = \Omega$.

These assumptions imply that the equation-by-equation OLS estimates of $\Pi$ and $\Omega$ will be consistent.

Relationship Between $(\Pi, \Omega)$ and $(\Gamma, B, \Sigma)$

With these estimates ($\hat{\Pi}$ and $\hat{\Omega}$) in hand, the question is whether we can map back to $\Gamma$, $B$ and $\Sigma$. We know the following:
$$\Pi\Gamma = -B \qquad \text{and} \qquad \Gamma'\Omega\Gamma = \Sigma.$$
To see if identification is possible, we can count the number of known elements on the left-hand side and compare with the number of unknown elements on the right-hand side.

Number of known elements:
- $KM$ elements in $\Pi$
- $M(M+1)/2$ elements in $\Omega$
- Total: $M(K + (M+1)/2)$.

Number of unknown elements:
- $M^2$ elements in $\Gamma$
- $KM$ elements in $B$
- $M(M+1)/2$ elements in $\Sigma$
- Total: $M(M + K + (M+1)/2)$.

Therefore, we are $M^2$ pieces of information shy of identifying the structural parameters. In other words, there is more than one set of structural parameters that is consistent with the reduced form. We say the model is underidentified.

Identification Conditions

There are several possibilities for obtaining identification:

- Normalization (i.e., set $\gamma_{ii} = 1$ in $\Gamma$ for $i = 1, \ldots, M$)
- Identities (e.g., the national income accounting identity)
- Exclusion restrictions (e.g., demand and supply shift factors)
- Other linear (and nonlinear) restrictions (e.g., the Blanchard-Quah long-run restriction)
Rank and Order Conditions

Begin by rewriting the $i$-th equation from $\Pi\Gamma = -B$ in matrix form as
$$\begin{pmatrix} \Pi & I_K \end{pmatrix} \begin{pmatrix} \Gamma_i \\ B_i \end{pmatrix} = 0, \tag{6}$$
where $\Gamma_i$ and $B_i$ represent the $i$-th columns of $\Gamma$ and $B$, respectively. Since the rank of $(\Pi \;\; I_K)$ equals $K$, (6) represents a system of $K$ equations in $M + K - 1$ unknowns (after normalization). In achieving identification, we will introduce linear restrictions as follows:
$$R_i \begin{pmatrix} \Gamma_i \\ B_i \end{pmatrix} = 0, \tag{7}$$
where $Rank(R_i) = J$. Putting equations (6) and (7) together and redefining $\delta_i = (\Gamma_i', B_i')'$ gives
$$\begin{pmatrix} \Pi & I_K \\ & R_i \end{pmatrix} \delta_i = 0.$$
From this discussion, it is clear that $R_i$ must provide at least $M - 1$ new pieces of information. Here are the formal rank and order conditions; let $\Delta = (\Gamma', B')'$ denote the $(M+K) \times M$ matrix of all structural coefficients.

Order Condition: The order condition states that $Rank(R_i) = J \geq M - 1$ is a necessary but not sufficient condition for identification. A situation where the order condition is not sufficient is when $R_i\delta_j = 0$ for some other equation $j$, so the restrictions fail to distinguish equation $i$ from equation $j$. More details on the order condition below.

Rank Condition: The rank condition states that $Rank(R_i\Delta) = M - 1$ is a necessary and sufficient condition for identification.

We can now summarize all possible identification outcomes:

- Under Identification: If either $Rank(R_i) < M - 1$ or $Rank(R_i\Delta) < M - 1$, the $i$-th equation is underidentified.
- Exact Identification: If $Rank(R_i) = M - 1$ and $Rank(R_i\Delta) = M - 1$, the $i$-th equation is exactly identified.
- Over Identification: If $Rank(R_i) > M - 1$ and $Rank(R_i\Delta) = M - 1$, the $i$-th equation is overidentified.
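Once $\Gamma$ and $B$ are written down, both conditions are mechanical to check numerically. A small sketch for the demand and supply example (all coefficient values below are invented for illustration):

```python
import numpy as np

# Invented structural parameter values for the demand/supply example
a0, a1, a2, a3 = 1.0, 0.9, 0.5, 0.3   # supply: intercept, price, W, Z
b0, b1, b2 = 5.0, -1.1, 0.4           # demand: intercept, price, Z
M = 2                                 # number of endogenous variables

Gamma = np.array([[1.0, 1.0],
                  [-a1, -b1]])
B = np.array([[-a0, -b0],
              [-a2, 0.0],
              [-a3, -b2]])
Delta = np.vstack([Gamma, B])         # all structural coefficients, (M+K) x M

# Demand excludes W_t: one restriction on delta_d = (g1, g2, b_const, b_W, b_Z)'
R_d = np.array([[0.0, 0.0, 0.0, 1.0, 0.0]])

order_ok = np.linalg.matrix_rank(R_d) >= M - 1          # order condition
rank_ok = np.linalg.matrix_rank(R_d @ Delta) == M - 1   # rank condition
```

Here $R_d\Delta$ picks out the $W_t$ row of $B$, so the rank condition holds as long as $W_t$ actually enters the supply equation ($\alpha_2 \neq 0$).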
Identification Conditions in the Demand and Supply Example

Begin with supply and note that $M - 1 = 1$. The order condition is simple: since all the variables appear in the supply equation, there is no restriction matrix $R_s$, so that $Rank(R_s) = 0 < 1$. The supply equation is underidentified. There is no need to look at the rank condition.

Next, consider demand. The only variable excluded from the demand equation is $W_t$, so with $\delta_d = (\gamma_{1d}, \gamma_{2d}, b_{1d}, b_{2d}, b_{3d})'$ the relevant matrix equations are
$$\begin{pmatrix} \Pi & I_3 \\ & R_d \end{pmatrix} \delta_d = 0, \qquad R_d = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \end{pmatrix},$$
for which the order condition is clearly satisfied (i.e., $Rank(R_d) = 1 = M - 1$). For the rank condition, we need to find the rank of
$$R_d\Delta = R_d \begin{pmatrix} 1 & 1 \\ -\alpha_1 & -\beta_1 \\ -\alpha_0 & -\beta_0 \\ -\alpha_2 & 0 \\ -\alpha_3 & -\beta_2 \end{pmatrix} = \begin{pmatrix} -\alpha_2 & 0 \end{pmatrix}.$$
Clearly, $Rank(R_d\Delta) = 1 = M - 1$ (provided $\alpha_2 \neq 0$), so that the demand equation is exactly identified.

Limited-Information Estimation

We will consider five different limited-information estimation techniques: OLS, indirect least squares (ILS), instrumental variable (IV) estimation, two-stage least squares (2SLS) and limited-information maximum likelihood (LIML). The term "limited information" refers to equation-by-equation estimation, as opposed to full-information estimation, which uses the linkages among the different equations.

Begin by writing the $i$-th equation as
$$Y\Gamma_i + XB_i = \varepsilon_i$$
$$y_i = Y_i\gamma_i + \bar{Y}_i\bar{\gamma}_i + X_i\beta_i + \bar{X}_i\bar{\beta}_i + \varepsilon_i,$$
where $Y_i$ represents the matrix of endogenous variables (other than $y_i$) included in the $i$-th equation, $\bar{Y}_i$ represents the matrix of endogenous variables excluded from the $i$-th equation, and similarly for $X_i$ and $\bar{X}_i$. Therefore, $\bar{\gamma}_i = 0$ and
$\bar{\beta}_i = 0$, so that
$$y_i = Y_i\gamma_i + X_i\beta_i + \varepsilon_i = \begin{pmatrix} Y_i & X_i \end{pmatrix}\begin{pmatrix} \gamma_i \\ \beta_i \end{pmatrix} + \varepsilon_i = Z_i\delta_i + \varepsilon_i.$$

Ordinary Least Squares (OLS)

The OLS estimator of $\delta_i$ is
$$\hat{\delta}_i^{OLS} = (Z_i'Z_i)^{-1}(Z_i'y_i).$$
The expected value of $\hat{\delta}_i^{OLS}$ is
$$E(\hat{\delta}_i^{OLS}) = \delta_i + E[(Z_i'Z_i)^{-1}Z_i'\varepsilon_i].$$
However, since $y_i$ and $Y_i$ are jointly determined (recall $Z_i$ contains $Y_i$), we cannot expect that $E(Z_i'\varepsilon_i) = 0$ or $plim(\frac{1}{T}Z_i'\varepsilon_i) = 0$. Therefore, OLS estimates will be biased and inconsistent. This is commonly known as simultaneity or endogeneity bias.

Indirect Least Squares (ILS)

The indirect least squares estimator simply uses the consistent reduced-form estimates ($\hat{\Pi}$ and $\hat{\Omega}$) and the relations $\Pi\Gamma = -B$ and $\Gamma'\Omega\Gamma = \Sigma$ to solve for $\Gamma$, $B$ and $\Sigma$. The ILS estimator is only feasible if the system is exactly identified. To see this, consider the $i$-th equation as given in (6),
$$\hat{\Pi}\Gamma_i = -B_i,$$
where $\hat{\Pi} = (X'X)^{-1}X'Y$. Substitution gives
$$(X'X)^{-1}X'\begin{pmatrix} y_i & Y_i \end{pmatrix}\begin{pmatrix} 1 \\ -\hat{\gamma}_i \end{pmatrix} = \begin{pmatrix} \hat{\beta}_i \\ 0 \end{pmatrix}.$$
Multiplying through by $X'X$ gives
$$X'y_i - X'Y_i\hat{\gamma}_i = X'X\begin{pmatrix} \hat{\beta}_i \\ 0 \end{pmatrix} \;\Rightarrow\; X'y_i = X'Y_i\hat{\gamma}_i + X'X_i\hat{\beta}_i = X'Z_i\hat{\delta}_i.$$
Therefore, the ILS estimator for the $i$-th equation can be written as
$$\hat{\delta}_i^{ILS} = (X'Z_i)^{-1}(X'y_i).$$
There are three cases:

1. If the $i$-th equation is exactly identified, then $X'Z_i$ is square and invertible.
2. If the $i$-th equation is underidentified, then $X'Z_i$ is not square (fewer rows than columns).
3. If the $i$-th equation is overidentified, then $X'Z_i$ is not square (more rows than columns), although a subset could be used to obtain consistent, albeit inefficient, estimates of $\delta_i$.

Instrumental Variable (IV) Estimation

Let $W_i$ be an instrument matrix (of dimension $T \times (M_i + K_i)$, the same as $Z_i$) satisfying

1. $plim(\frac{1}{T}W_i'Z_i) = \Sigma_{wz}$, a finite invertible matrix
2. $plim(\frac{1}{T}W_i'W_i) = \Sigma_{ww}$, a finite positive-definite matrix
3. $plim(\frac{1}{T}W_i'\varepsilon_i) = 0.$

Since $plim(\frac{1}{T}Z_i'\varepsilon_i) \neq 0$, we can instead examine
$$plim(\tfrac{1}{T}W_i'y_i) = plim(\tfrac{1}{T}W_i'Z_i)\,\delta_i + plim(\tfrac{1}{T}W_i'\varepsilon_i) = plim(\tfrac{1}{T}W_i'Z_i)\,\delta_i.$$
Naturally, the instrumental variable estimator is
$$\hat{\delta}_i^{IV} = (W_i'Z_i)^{-1}(W_i'y_i),$$
which is consistent and has asymptotic variance-covariance matrix
$$asy.var.(\hat{\delta}_i^{IV}) = \frac{\sigma_{ii}}{T}\left[\Sigma_{wz}^{-1}\Sigma_{ww}\Sigma_{zw}^{-1}\right].$$
This can be estimated by
$$est.asy.var.(\hat{\delta}_i^{IV}) = \hat{\sigma}_{ii}(W_i'Z_i)^{-1}W_i'W_i(Z_i'W_i)^{-1}, \qquad \hat{\sigma}_{ii} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{IV})'(y_i - Z_i\hat{\delta}_i^{IV}).$$
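The simultaneity bias of OLS and its correction by the IV formula can be seen in a small simulation. The setup below is invented for illustration (one endogenous regressor, one instrument), not data from the course:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
w = rng.normal(size=T)                        # instrument: correlated with x, not with u
u = rng.normal(size=T)                        # structural error
x = 0.8 * w + 0.5 * u + rng.normal(size=T)    # endogenous regressor: cov(x, u) != 0
y = 1.5 * x + u                               # true coefficient is 1.5

Z, W = x[:, None], w[:, None]
beta_ols = (x @ y) / (x @ x)                  # inconsistent: plim = 1.5 + cov(x,u)/var(x)
beta_iv = np.linalg.solve(W.T @ Z, W.T @ y)   # consistent: (W'Z)^{-1} W'y

# Estimated asymptotic variance: sigma_ii (W'Z)^{-1} W'W (Z'W)^{-1}
e = y - Z @ beta_iv
sigma_ii = (e @ e) / T
WZ_inv = np.linalg.inv(W.T @ Z)
est_var = sigma_ii * WZ_inv @ (W.T @ W) @ WZ_inv.T
```

With this design the OLS estimate settles near $1.5 + 0.5/var(x) > 1.5$, while the IV estimate settles near the true value.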
A degrees of freedom correction is optional. Notice that ILS is a special case of IV estimation for an exactly identified equation where $W_i = X$.

Two-Stage Least Squares (2SLS)

When an equation in the system is overidentified (i.e., $rows(X'Z_i) > cols(X'Z_i)$), a convenient and intuitive IV estimator is the two-stage least squares estimator. The 2SLS estimator works as follows:

Stage #1: Regress $Y_i$ on $X$ and form $\hat{Y}_i = X\hat{\Pi}_i^{OLS}$.
Stage #2: Estimate $\delta_i$ by an OLS regression of $y_i$ on $\hat{Y}_i$ and $X_i$.

More formally, let $\hat{Z}_i = (\hat{Y}_i, X_i)$. The 2SLS estimator is given by
$$\hat{\delta}_i^{2SLS} = (\hat{Z}_i'\hat{Z}_i)^{-1}(\hat{Z}_i'y_i),$$
where the asymptotic variance-covariance matrix for $\hat{\delta}_i^{2SLS}$ can be estimated consistently by
$$est.asy.var(\hat{\delta}_i^{2SLS}) = \hat{\sigma}_{ii}(\hat{Z}_i'\hat{Z}_i)^{-1}, \qquad \hat{\sigma}_{ii} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{2SLS})'(y_i - Z_i\hat{\delta}_i^{2SLS}).$$

Limited-Information Maximum Likelihood (LIML)

Limited-information maximum likelihood estimation refers to ML estimation of a single equation in the system. For example, if we assume normally distributed errors, then we can form the joint probability distribution function of $(y_i, Y_i)$ and maximize it by choosing $\delta_i$ and the appropriate elements of $\Sigma$. Since the LIML estimator is more complex but asymptotically equivalent to the 2SLS estimator, it is not widely used.

Note: When the $i$-th equation is exactly identified, $\hat{\delta}_i^{ILS} = \hat{\delta}_i^{IV} = \hat{\delta}_i^{2SLS} = \hat{\delta}_i^{LIML}$.

Full-Information Estimation

The five estimators mentioned above are not fully efficient because they ignore cross-equation relationships between error terms and any omitted endogenous variables. We consider two fully efficient estimators below.
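Before turning to the full-information methods, the two stages above can be sketched as one routine. The helper name and the simulated overidentified equation are my own, for illustration:

```python
import numpy as np

def tsls(y, Y_incl, X_incl, X_all):
    """Two-stage least squares for a single equation.
    Y_incl: included endogenous regressors (T x M_i)
    X_incl: included exogenous regressors (T x K_i)
    X_all:  all exogenous variables in the system (the instruments)."""
    # Stage 1: fitted values from regressing Y_incl on all exogenous variables
    Y_hat = X_all @ np.linalg.lstsq(X_all, Y_incl, rcond=None)[0]
    # Stage 2: OLS of y on (Y_hat, X_incl)
    Z_hat = np.column_stack([Y_hat, X_incl])
    return np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y)

# Simulated overidentified equation: y2 endogenous; x1, x2 excluded exogenous
rng = np.random.default_rng(3)
T = 4000
x1, x2 = rng.normal(size=(2, T))
u = rng.normal(size=T)
y2 = x1 + x2 + 0.5 * u + rng.normal(size=T)   # correlated with u
y1 = 2.0 + 1.0 * y2 + u                        # structural equation of interest
const = np.ones(T)
delta = tsls(y1, y2[:, None], const[:, None], np.column_stack([const, x1, x2]))
# delta[0] estimates the coefficient on y2 (true 1.0), delta[1] the intercept (true 2.0)
```

Because two excluded exogenous variables instrument one endogenous regressor, the equation is overidentified, and 2SLS combines both instruments rather than forcing a choice between them.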
Three-Stage Least Squares (3SLS)

Begin by writing the system (5) as
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} Z_1 & 0 & \cdots & 0 \\ 0 & Z_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & Z_M \end{pmatrix}\begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_M \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{pmatrix} \;\Rightarrow\; y = Z\delta + \varepsilon,$$
where $\varepsilon = \tilde{E}$ and $E(\varepsilon\varepsilon') = \Sigma \otimes I_T$. Then, applying the principle from SUR estimation, the fully efficient estimator is
$$\hat{\delta} = [W'(\Sigma^{-1} \otimes I)Z]^{-1}[W'(\Sigma^{-1} \otimes I)y],$$
where $W$ indicates an instrumental variable matrix in the form of $Z$. Zellner and Theil (1962) suggest the following three-stage procedure for estimating $\delta$:

Stage #1: Calculate $\hat{Y}_i$ for each equation ($i = 1, \ldots, M$) using OLS and the reduced form.
Stage #2: Use the $\hat{Y}_i$ to calculate $\hat{\delta}_i^{2SLS}$ and $\hat{\sigma}_{ij} = \frac{1}{T}(y_i - Z_i\hat{\delta}_i^{2SLS})'(y_j - Z_j\hat{\delta}_j^{2SLS})$.
Stage #3: Calculate the IV-GLS estimator
$$\hat{\delta}^{3SLS} = [Z'(\hat{\Sigma}^{-1} \otimes X(X'X)^{-1}X')Z]^{-1}[Z'(\hat{\Sigma}^{-1} \otimes X(X'X)^{-1}X')y] = [\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)\hat{Z}]^{-1}[\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)y].$$
The asymptotic variance-covariance matrix can be estimated by
$$est.asy.var(\hat{\delta}^{3SLS}) = [\hat{Z}'(\hat{\Sigma}^{-1} \otimes I)\hat{Z}]^{-1}.$$

Full-Information Maximum Likelihood (FIML)

The full-information maximum likelihood estimator is asymptotically efficient. Assuming multivariate normally distributed errors, we maximize
$$\ln L(\delta, \Sigma \,|\, y, Z) = -\frac{MT}{2}\ln(2\pi) + T\ln|\Gamma| - \frac{T}{2}\ln|\Sigma| - \frac{1}{2}(y - Z\delta)'(\Sigma^{-1} \otimes I_T)(y - Z\delta)$$
by choosing $\Gamma$, $B$ and $\Sigma$. The FIML estimator can be computationally burdensome and has the same asymptotic distribution as the 3SLS estimator. As a result, most researchers use 3SLS.

Simultaneous Equations Gauss Application

Consider estimating a traditional Keynesian consumption function using quarterly data between 9 and 00. The simultaneous system is
$$C_t = \alpha_0 + \alpha_1 DI_t + \varepsilon_t^c \tag{8}$$
$$DI_t = C_t + I_t + G_t + NX_t + \varepsilon_t^y \tag{9}$$
where the variables are defined as follows:

Endogenous Variables
  $C_t$     Consumption
  $DI_t$    Disposable Income

Exogenous Variables
  $I_t$     Investment
  $G_t$     Government Spending
  $NX_t$    Net Exports

The conditions for identification of (8), the more interesting equation to be estimated, are shown below. Begin by writing the system as $Y\Gamma + XB = E$ with $Y_t = (C_t, DI_t)$, $X_t = (1, I_t, G_t, NX_t)$,
$$\Gamma = \begin{pmatrix} 1 & -1 \\ -\alpha_1 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} -\alpha_0 & 0 \\ 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{pmatrix}.$$
The order condition depends on the rank of the restriction matrix for (8), which imposes the exclusion of $I_t$, $G_t$ and $NX_t$:
$$R_1 = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix},$$
which obviously has $Rank(R_1) = 3 > M - 1 = 1$. Therefore, the model is overidentified if the rank condition is satisfied (i.e., $Rank(R_1\Delta) = M - 1 = 1$). The relevant matrix for the rank condition is
$$R_1\Delta = R_1\begin{pmatrix} 1 & -1 \\ -\alpha_1 & 1 \\ -\alpha_0 & 0 \\ 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 0 & -1 \\ 0 & -1 \end{pmatrix},$$
which has $Rank(R_1\Delta) = 1$. Therefore, equation (8) is overidentified.

Another way to see that (8) is overidentified is to solve for the reduced-form representation of $C_t$:
$$C_t = \alpha_0 + \alpha_1[C_t + I_t + G_t + NX_t + \varepsilon_t^y] + \varepsilon_t^c = \frac{1}{1 - \alpha_1}\left[\alpha_0 + \alpha_1 I_t + \alpha_1 G_t + \alpha_1 NX_t + (\alpha_1\varepsilon_t^y + \varepsilon_t^c)\right]$$
$$\Rightarrow\; C_t = \pi_0 + \pi_1 I_t + \pi_2 G_t + \pi_3 NX_t + v_t. \tag{10}$$
Clearly, OLS estimation of (10) is likely to produce three distinct estimates of $\alpha_1$, since $\pi_1 = \pi_2 = \pi_3 = \alpha_1/(1 - \alpha_1)$, which reflects the overidentification problem.

See the Gauss examples for OLS and 2SLS estimation of (8).
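This can be checked numerically: simulate the system with an invented $\alpha_1$, estimate (10) by OLS, and back out $\alpha_1$ from each slope via $\alpha_1 = \pi_k/(1 + \pi_k)$. All parameter values below are invented; this is an illustration, not the course's Gauss code:

```python
import numpy as np

rng = np.random.default_rng(6)
T = 500
a0, a1 = 1.0, 0.6                      # invented structural parameters
I_t, G_t, NX_t = rng.normal(2.0, 0.5, size=(3, T))
e_c, e_y = rng.normal(0.0, 0.3, size=(2, T))

# C_t implied by solving (8)-(9) for the reduced form
C = (a0 + a1 * (I_t + G_t + NX_t + e_y) + e_c) / (1 - a1)

# OLS on the reduced form (10)
X = np.column_stack([np.ones(T), I_t, G_t, NX_t])
pi_hat = np.linalg.lstsq(X, C, rcond=None)[0]

# Each slope satisfies pi_k = a1/(1 - a1), so each implies a1 = pi_k/(1 + pi_k):
# three numerically distinct estimates of the single structural parameter a1
a1_implied = pi_hat[1:] / (1 + pi_hat[1:])
```

In a finite sample the three implied estimates differ; 2SLS (or 3SLS) resolves the overidentification by combining the three instruments into a single efficient estimate of $\alpha_1$.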