The Review of Economic Studies, Ltd.


 August Walters
 3 years ago
 Views:
Transcription
1 The Review of Economic Studies, Ltd. Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations Author(s): Manuel Arellano and Stephen Bond Reviewed work(s): Source: The Review of Economic Studies, Vol. 8, No. (Apr., 99), pp Published by: Oxford University Press Stable URL: Accessed: 6// 7:44 Your use of JSTOR archive indicates your acceptance of Terms & Conditions of Use, available at. JSTOR is a notforprofit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact Oxford University Press and The Review of Economic Studies, Ltd. are collaborating with JSTOR to digitize, preserve and extend access to The Review of Economic Studies.
2 Review of Economic Studies (99) 8, /9/877$.? 99 The Review of Economic Studies Limited Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Equations MANUEL ARELLANO London School of Economics and STEPHEN BOND University of Oxford Employment First version received May 988; final version accepted July 99 (Eds.) This paper presents specification tests that are applicable after estimating a dynamic model from panel data by generalized method of moments (GMM), and studies practical performance of se procedures using both generated and real data. Our GMM estimator optimally exploits all linear moment restrictions that follow from assumption of no serial correlation in errors, in an equation which contains individual effects, lagged dependent variables and no strictly exogenous variables. We propose a test of serial correlation based on GMM residuals and compare this with Sargan tests of overidentifying restrictions and Hausman specification tests.. INTRODUCTION The purpose of this paper is to present specification tests that are applicable after estimating a dynamic model from panel data by generalized method of moments (GMM) and to study practical performance of se procedures using both generated and real data. Previous work concerning dynamic equations from panel data (e.g. Chamberlain (984), Bhargava and Sargan (983)) has emphasized case where model with an arbitrary intertemporal covariance matrix of errors is identified. The fundamental identification condition for this model is strict exogeneity of some of explanatory variables (or availability of strictly exogenous instrumental variables) conditional on unobservable individual effects. In practice, this allows one to use past, present and future values of strictly exogenous variables to construct instruments for lagged dependent variables and or nonexogenous variables once permanent effects have been differenced out. Bhargava and Sargan (983) and Arellano (99) considered estimation and inference imposing restrictions on autocovariances, but assumption that model with unrestricted covariance matrix is identified was never removed. However, sometimes one is less willing to assume strict exogeneity of an explanatory variable than to restrict serial correlation structure of errors, in which case 77
3 78 REVIEW OF ECONOMIC STUDIES different identification arrangements become available. Uncorrelated errors arise in a number of environments. These include rational expectations models where disturbance is a surprise term, errorcorrection models and vector autoregressions. Moreover, if re are a priori reasons to expect autoregressive errors in a regression model, se can be represented as a dynamic regression with nonlinear common factor restrictions and uncorrelated disturbances (e.g. Sargan (98)). In se cases and also in models with movingaverage errors, lagged values of dependent variable itself become valid instruments in differenced equations corresponding to later periods. Simple estimators of this type were first proposed by Anderson and Hsiao (98, 98). Griliches and Hausman (986) have developed estimators for errorsinvariables models whose identification relies on assumptions of lack of (or limited) serial correlation in measurement errors. HoltzEakin, Newey and Rosen (988) have also considered estimators of this type for vector autoregressions which are similar to ones we employ in this paper. An estimator that uses lags as instruments under assumption of white noise errors would lose its consistency if in fact errors were serially correlated. It is refore essential to satisfy oneself that this is not case by reporting test statistics of validity of instrumental variables (i.e. tests of lack of serial correlation) toger with parameter estimates. In this paper we consider three such tests: a direct test on tne secondorder residual serial correlation coefficient, a Sargan test of overidentifying restrictions and a Hausman specification test. The operating characteristics of se tests are different as well as ir number of degrees of freedom. In addition, depending on alternative auxiliary distributional assumptions concerning stationarity and heterogeneity, different forms of each of tests are available. These alternative versions of a given test are asymptotically equivalent under less general set of auxiliary assumptions but y still may perform quite differently in finite samples. We have refore produced a number of Monte Carlo experiments to investigate relative performance of various tests. Finally, as an empirical illustration we report some estimated employment equations using Datastream panel of quoted U.K. companies. The paper is organized as follows. Section presents model and estimators. For a fixed number of time periods in sample, model specifies a finite number of moment restrictions and refore an asymptotically efficient GMM estimator is readily available. The discussion is kept as simple as possible by concentrating initially on a firstorder autoregression with a fixed effect. Exogenous variables and unbalanced panel considerations are subsequently introduced. Section 3 presents various tests of serial correlation and ir asymptotic distributions. Section 4 reports simulation results. Section contains application to employment equations and Section 6 concludes.. ESTIMATION The simplest model without strictly exogenous variables is an autoregressive specification of form Yi, ayi(tl)+ +,i + Vit, lal <. () We assume that a random sample of N individual time series (yi,...., YiT) is available. T is small and N is large. The vi, are assumed to have finite moments and in particular E(vit) E(vitvij) for t $ s. That is, we assume lack of serial correlation but not necessarily independence over time. With se assumptions, values of y lagged two periods or more are valid instruments in equations in first differences. Namely, for
4 ARELLANO & BOND SPECIFICATION TESTS 79 Ti_3 model implies following m(t)(t)/ linear moment restrictions E[(yit  ay(t_))yj(t_j) (j,..., (t ); t 3,..., T) () where for simplicity Yi Yi Yi(ti) We wish to obtain optimal estimator of a as N  co for fixed T on basis of se moment restrictions alone. That is, in absence of any or knowledge concerning initial conditions or distributions of vi, and qj. Note that our assumptions also imply quadratic moment restrictions, for example E( eiti(t)), which however we shall not exploit in order to avoid iterative procedures. This estimation problem is an example of those analyzed by Hansen (98) and White (98), and an optimal GMM or twostage instrumental variables estimator should be available. The moment equations in () can be conveniently written in vector form as E(Z'ibi) where vi (vi3... lbit)' and Zi is a (T) x m block diagonal matrix whose sth block is given by (yi *... Yis) The GMM estimator a' is based on sample moments N' Z'i3 N'Z'i where v3yajll(iu',..., I'N)' is a N( T)x vector and Z(Z,..., Z'N)' is a N(T  ) x m matrix. a is given by a argmina (ViZ)AN(Z') lzanz (3) Y_,ZANZ'Yl' Multivariate standard CLT implies that VPN"ZZ'I is asymptotically standard normal where VN N 'E E(Z'ifAiZj) is average covariance matrix of Ziui. Under our assumptions, VN can be replaced by VN N Zivi3Z, Ii where are residuals from a preliminary consistent estimator at. The onestep estimator ca is obtained by setting AN (N Ei ZHZi)l where H is a (T) square matrix which has twos in main diagonal, minus ones in first subdiagonals and zeroes orwise. A consistent estimate of avar (c) for arbitrary AN is given by aa (a) NY lzanvnanz'y avar( an  ZN,p_) 4 The optimal choice for AN is VN (e.g. see Hansen (98)) which produces a twostep A A A estimator a. a, and a will be asymptotically equivalent if vi, are independent and homoskedastic both across units and over time. It is useful to relate se estimators to AndersonHsiao (AH) estimator which is commonly used in practice. Anderson and Hsiao (98) proposed to estimate a by regressing y on Y using eir Y or Y as instruments. Since both Y and Y are linear combinations of Z resulting estimators will be inefficient. Note that under stationarity, namely when E(YitYi(tk)) Cik for all t, estimator that uses Z+ diag (yit) (t,..., T  ) is asymptotically equivalent to one based on stacked vector Y, whose computation is much simpler (since AN becomes irrelevant). However neir of two is asymptotically equivalent to a, or a, not even under stationarity. The extension of previous results to case where a limited amount of serial correlation is allowed in vit is straightforward. Suppose that vit is MA (q) in. In this paper we represent this type of matrix by Zi diag (yil,. Yis), (s... i T).. An alternative choice of AN is (N ZfAlZi)l with A N S The resulting estimator does not depend on data fourthorder moments and is asymptotically equivalent to ' provided vi, are serially independent. Note that in this case E(Z!ti3ivZi) E(Z!fliZi) and limn o N, E[Z!(fli fn)zi] O (see White (98), p. 49).
5 8 REVIEW OF ECONOMIC STUDIES sense that E(vivi(,_k)) $ O for k _ q and zero orwise. In this case a is just identified with T q + 3 and re are mq T  q  )( T  q  )/ restrictions available. Models with exogenous variables We now turn to consider an extended version of equation () where (k ) independent explanatory variables have been included Yit ayi(t,l)+ xu* + 'qi + Vit 8'xi, + 7Ri + vi, () where xit (Yi(t) x*')' is k x and vit are not serially correlated. Suppose initially that x* are all correlated with qi. In this context form of optimal matrix of instruments depends on wher x* are predetermined or strictly exogenous variables. If x* are predetermined, in sense that E(xi*vis) $ for s < t and zero orwise, n only x*,..., x*4s) are valid instruments in differenced equation for period s so that optimal Zi is a (T)x(T)[(k)(T+)+(T)]/ matrix of form Zi diag (yi,... yisx*l... x*( +)), (s,..., T  ). On or hand, if x* are strictly exogenous, i.e. E(x viis) for all t, s, n all x*'s are valid instruments for all equations and Zi takes form Zi diag (yi, * yis x*x' * X** ), (s,..., T ). Clearly, x* may also include a combination of both predetermined and strictly exogenous variables. In eir case, form of GMM estimator of k x coefficient vector 8 is (X'ZANZ'X) X'ZANZ'y (6) where X is a stacked (T ) N x k matrix of observations on xit, and y and Z are as above for appropriate choice of Zi. As before, alternative choices of AN will produce onestep or twostep estimators.3 Turning now to case where x* can be partitioned into (x*t,x*x,) and x*, is uncorrelated with qi, additional moment restrictions exploiting this lack of correlation in levels equations become available. For example, if x*, is predetermined and letting uit qi + vi,, we have T extra restrictions. Namely, E(uiXIi4) and E (ui,xi,i), (t,..., T). Note that all remaining restrictions from levels equations are redundant given those previously exploited for equations in first differences. Define ui (Ui.. UiTY), let v+ be [(T)+(T)]xl vector v+(i3u')' and let v+ (V+'... v+' Y+  X+8. The optimal matrix of instruments Z+ is block diagonal and consists of two blocks: Zi which is as in predetermined x* case above (assuming that x* t is also predetermined), and Za which is itself block diagonal with (x*lx*4) in first block and x*', s 3,..., T in remaining blocks. The twostep estimator is of same form as (6) with X+, y+ and Z+ replacing X, y and Z respectively, and AN [N Eji Z+A+A+t'ZP+]. On or hand, if x4*t is strictly exogenous, observations for all periods become valid instruments in levels equations. However, given those previously exploited in first differences we only have T extra restrictions which in this case can be conveniently expressed as E(T,i uisxit) (t,..., T). Thus, twostep estimator would just combine ( T ) first difference equations and average level equation.4 3. Note that if E(vjvt) is unrestricted (i.e. vi, is MA(q) with T< q+3) but x* is strictly exogenous, model is identified with Z, diag (*... *x), * (s,..., T ), in which case twostep estimator coincides with generalized threestage least squares estimator proposed by Chamberlain (984). 4. Note that when x*, variables are available it may be possible to identify and estimate coefficients for timeinvariant variables on lines suggested by Hausman and Taylor (98), Bhargava and Sargan (983) and Amemiya and MaCurdy (986).
6 ARELLANO & BOND SPECIFICATION TESTS 8 Models from unbalanced panel data By unbalanced panel data we refer to a sample in which consecutive observations on individual units are available, but number of time periods available may vary from unit to unit as well as historical points to which observations correspond. This type of sample is very common particularly with firm data which is context of application reported below. Aside from often allowing one to exploit a much larger sample or to pool more than one panel, use of unbalanced panels may lessen impact of self selection of firms in sample. In fact nothing fundamental changes in econometric methods provided a minimal number of continuous time periods are available on each unit, and one assumes that if periodspecific parameters are present number of observations on se periods tend to infinity. Of course, essential assumption is that observations in initial crosssection are independently distributed and that subsequent additions and deletions take place at random (see Hsiao (986), Chapter 8). The previous notation can accommodate unbalanced panels with minor changes. We now have Ti timeseries observations on ith unit, and re are N individual units in sample. The matrices X and Z, and vectors y and v are made of N rowblocks, ith block containing (Ti  ) rows. Note that now number of nonzero columns in each Zi may vary across units. For example, in firstorder autoregressive specification above, number of columns in Zi is p (r )(r  )/ where r is total number of periods on which observations are available for some individuals in sample, and Zi diag (yi,.., Yis), (s,..., r), only if r observations are available on i. For individuals with Ti < r, rows of diag (yi,..., yis) corresponding to missing equations are deleted and missing values of Yit in remaining rows are replaced by zeroes. The twostep GMM estimator of a for this choice of instruments: is same as in (3) using AN(N'E'Z'ivZiZl) where Zi is (Ti)xp and vi is (Ti ) x. 3. TESTING THE SPECIFICATION In order to keep notation simple we now drop bars from variables in first differences, so that firstdifference equation for unbalanced panel is now yx 8+ v (7) nxl nxk kxl nxl where n i (Ti ). We also assume that x* are all potentially correlated with mi. The n x vector of residuals is given by AyX8 vx(88) where 8 can be any estimator of form (6) for a particular choice of Z and AN. Let v be vector of residuals lagged twice, of order q i (Ti 4) and let v* be a q x vector of trimmed v to match v and similarly for X*. Since vit are first differences of serially uncorrelated errors, E(vitvi(tl)) need not be zero, but consistency of GMM estimators above hinges heavily upon assumption that E(vitvi(t)). In an. This is optimal choice amongst estimators that can be obtained by stacking equations for all periods and individuals. An alternative estimator would minimize sum of GMM criteria for each of balanced subpanels in sample. Although latter is strictly more efficient when number of units in all subsamples tend to infinity, it may have a poorer finitesample performance when various subsample sizes are not sufficiently large.
7 8 REVIEW OF ECONOMIC STUDIES unbalanced panel (r4) such covariances can be estimated in total, in principle with varying number of sample observations to estimate each of covariances. Provided one assumes that all subsamples tend to infinity, a (r4) degrees of freedom test can be constructed of hyposis that secondorder autocovariances for all periods in sample are zero. However, a considerably simpler procedure will look at average covariances Xi  ()V These averages are independent random variables across units with zero mean under null although with unequal variances in general. So a straightforward one degree of freedom test statistic can be constructed to test wher E(4i) is zero or not. The test statistic for secondorder serial correlation based on residuals from firstdifference equation takes form M 'A/* di N(, ) (8) V under E(vitvi(t)), where v is given by V Z, V ) Vi* V* V'i() 'X*(X'ZANZXY)XPZAN(N Zviv i*vi()) + v' X avar (8)X *V. (9) Note that m is only defined if min Ti'. A proof of asymptotic normality result is sketched in Appendix. It is interesting to notice that m criterion is rar flexible, in that it can be defined in terms of any consistent GMM estimator, not necessarily in terms of efficient estimators, eir in sense of using optimal Z or AN or both. However, asymptotic power of m test will depend on efficiency of estimators used. The m statistic tests for lack of secondorder serial correlation in firstdifference residuals. This will certainly be case if errors in model in levels are not serially correlated, but also if errors in levels follow a randomwalk process. One may attempt to discriminate between two situations by calculating an ml statistic, on same lines as M, to test for lack of firstorder serial correlation in differenced residuals. Alternatively, notice that if errors in levels follow a random walk, n both OLS and GMM estimates in firstdifference model are consistent which suggests a Hausman test based on difference between two estimators. We now turn to consider two or tests of specification which are applicable in same context. One is a Sargan test of overidentifying restrictions (cf. Sargan (98, 988), Hansen (98)) given by SV Z(Zl ZiviZZi) YZVa Xpk () where v y  X8, and 8 is a twostep estimator of 8 for a given Z. Notice that Z need not be optimal set of instruments; here p just refers to number of columns in Z provided p > k. Also notice that while we are able to produce a version of serial correlation test based upon a onestep estimator of 8 which remained asymptotically normal on null under more general distributional assumptions, no robust chisquare Sargan test based on onestep estimates is available. Under null a statistic of form lav Z(EI V ZHiHZi) Z where v are onestep residuals, will only have a limiting chisquare distribution if errors are indeed i.i.d. over time and individuals. In general, asymptotic distribution
8 ARELLANO & BOND SPECIFICATION TESTS 83 of s is a quadratic form in standard normal variables. Critical values can still be calculated by numerical integration but this clearly leads to a burdensome test procedure. On or hand, re may be circumstances where serial correlation test is not defined while Sargan test can still be computed. As a simple example, take firstorder autoregressive equation at beginning of Section with T 4; in this case Sargan statistic tests two linear combinations of three moment restrictions available, namely E(Ui3yi) E(lbi4yi) E(ti4yi), but no differenced residuals two periods apart are available to construct an m test. A furr possibility is to use Sargan difference tests to discriminate between nested hyposes concerning serial correlation in a sequential way. For example, let Z, be a n x p, matrix containing columns of Z which remain valid instrumental variables when errors in levels are firstorder moving average, and let 8, be a twostep estimator of 8 based on Z, with associated residuals v^, n S VI Z,( I viiii IV a I k if errors in levels are MA () or MA (). In addition ds s_siax'p l) if errors in levels are not serially correlated. Moreover ds is asymptotically independent of s, (see Appendix). A closely related alternative is to construct a Hausman test based on difference (8, 8) (cf. Hausman (978) and Hausman and Taylor (98)). This type of test has been proposed by Griliches and Hausman (986) in context of movingaverage measurement errors. A test is based on statistic h, (?  )'[av'r (?SI)  ava'r (3)](?s ?) a x() A A where r rank avar (3, 8) and ( ) indicates a generalized inverse. The value of r will depend on number of columns of X which are maintained to be strictly exogenous. In particular, if only nonexogenous variable is lagged dependent variable n r. 4. EXPERIMENTAL EVIDENCE A limited simulation was carried out to study performance of estimation and testing procedures discussed above in samples of a size likely to be encountered in practice. In all experiments dependent variable yi, was generated from a model of form Yit ayi(,i) + oxit,q7i+ vi, (i 9...., N; t,...., T+ ) Vit 7it(eit + 4ei(t)), it + Xit (3) where q, i i.i.d. N(O, o ), {it  i.i.d. N(, ) and yio. The first ten crosssections were discarded so that actual samples contain NT observations. With regard to xi,, we considered following generating equation xit pxi(,,) + 6it (4) with Eit i.i.d. N(O, (J) independent of qi and vi, for all t, s and kept observations on xi, fixed over replications. As an alternative choice of regressors we used total sales from a sample of quoted U.K. firms where large variations across units and outliers are likely to be present. In both cases, xi, is strictly exogenous and uncorrelated with individual effects. However, since we are interested in performance of estimators that
9 84 REVIEW OF ECONOMIC STUDIES rely on lags of yi, for identification of a and,3, overidentifying restrictions arising from strict exogeneity of xi, and lack of correlation with qi were not used in simulated GMM estimator. Thus, we chose6 Zi  [diag(yi,.. **Yis)*C(X3 ** ity ] (S I.. T) which is a valid instrument set provided 4?. In base design, sample size is N and T 7, vi, are independent over time and homoskedastic:, 4?, a. (o, /, p 8 and or 9. Tables and summarize results for a ,, 8 obtained from replications. Results for or variants of this design were calculated (N, T 6, r.,,, p ], and are available from authors on request. However conclusions are same as for results reported here. Table reports sample means and standard deviations for onestep and twostep GMM estimators (GMM and GMM respectively), OLS in levels, withingroups, and TABLE Biases in estimates Robust Within Onestep Onestep Twostep GMM GMM OLS groups AHd AHI ASE ASE ASE a,, Coefficient: a Mean St. Dev Coefficient:,3 Mean St. Dev a ,,3 Coefficient: a Mean 937 d St. Dev Coefficient: /8 Mean St. Dev a 8,/3 Coefficient: a Mean St. Dev Coefficient:,3 Mean St. Dev Notes. (i) N, T 7, replications, o a. (ii) Exogenous variable is first order autoregressive with p 8 and or 9. (iii) GMMI and GMM are respectively one step and two step differenceiv estimators of type described in Section. Both GMM use Z, [diag (y,,... Yi) (Xj3 * * *,T)'] (S,. * *, T). (iv) AHd and AHl are AndersonHsiao stackediv estimators of equation in first differences that use AYi(,) and Yi(t) as an instrument for Ayi(,) respectively. (v) One Step ASE and Robust One Step ASE are estimates of asymptotic standard errors of GMM. The former are only valid for i.i.d. errors while latter are robust to general heteroskedasticity over individuals and over time. Two step ASE is a robust estimate of asymptotic standard errors of GMM. 6. The optimal instrument set for system of first difference equations would be zi diag (yi, *.* Yis, Xi,  , XiT)
10 ARELLANO & BOND SPECIFICATION TESTS 8 two AH estimators. The AHd estimator is given by ( ^) (i Et4 AZitw I)t t 4 AZit Ayit () where wi (yi(t_), xit)' and zi, (yi(t), xi,)'. The AHI estimator replaces Azi, with (yi(t), Axit)' and summation goes from t 3 to T The next two columns report sample means and standard deviations for two alternative estimates of asymptotic standard errors of one step GMM estimator. The first one is only valid for i.i.d. errors while second is robust to heteroskedasticity of arbitrary form. The last column corresponds to estimates of asymptotic standard errors of twostep GMM estimator. The tabulated results show a small downward finitesample bias in GMM estimators of a (of about to 3%). Not surprisingly, OLS and withingroup (WG) estimators of a exhibit large biases in opposite directions (i.e. upward bias in OLS whose size depends on (J; downward bias in WG whose size depends on T). The behaviour of AH estimators is more surprising. Concerning AHd, re is evidence of lack of identification for a and negligible biases, though coupled with large variances, for a  and 8. On or hand, standard deviation of AHl is small for a  and a but it more than doubles that of AHd for a 8. These results are consistent with calculations of asymptotic variance matrices for AH estimators reported elsewhere (cf. Arellano (989)). As explained in that note, in a model containing an exogenous variable in addition to lagged dependent variable, re are values of a and p between and for which re is no correlation between AYi(,t) and AYi(,), in which case AYi(t) is not a valid instrument and AHd is not identified. In our first experiment, AHd is close to such a singularity which explains result. In contrast, AHl has no singularities for stationary values of a and p but can neverless be even less precise than AHd for large values of a. An interesting result is that standard deviation of GMM estimators of a is about three times smaller than that of AHd for a  and 8 and between four and five times smaller than that of AHl for a 8 (although standard deviation of AHl for a  and a  is of a similar magnitude as for GMM estimators). This suggests that re may be significant efficiency gains in practice by using GMM as opposed to AH, aside from overcoming potential singularities as in our first experiment. Concerning GMM, two alternative estimators of ir asymptotic standard errors behave in a similar way, although robust alternatives always have a bigger standard deviation. Their sample mean is always very close to finitesample standard deviation in column one, suggesting that asymptotic approximation is quite accurate for simulated designs. On or hand, estimator of asymptotic standard errors of GMM in last column shows a downward bias of around percent relative to finitesample standard deviations reported in second column. Table reports number of rejections toger with sample means and variances for test statistics discussed in Section 3. The first three columns contain two alternative versions of one step m statistic and two step m statistic (see notes to table). The Sargan tests are tests of overidentifying restrictions based on minimized criteria of GMM estimators of Table. The differencesargan tests are based on difference between minimized GMM criteria and restricted versions of se that remain valid when errors are MA (). The Hausman statistics test distance between GMM and restricted GMM estimates of a. With only replications we cannot hope to provide accurate estimates of tail probabilities associated with test statistics; our results can only be suggestive. The
11 86 REVIEW OF ECONOMIC STUDIES TABLE Sizes of Test Statistics, Number of rejections out of cases Onestep Twostep Robust Onestep Twostep difference difference Onestep Twostep Onestep onestep Twostep Sargan Sargan Sargan Sargan Hausman Hausman M M M (df 4) (df 4) (df ) (df ) (df ) (df ) a Mean Variance a Mean Variance a Mean Variance Notes. (i) N, T 7, replications, U U p. (ii) Exogenous variable is first order autoregressive with p 8 and o 9 (iii) All tests are based on one and twostep GMM estimates reported in Table as well as on restricted versions of those which only use columns of Zi that remain valid instruments when errors are MA (). The onestep m statistic is described in Appendix. (iv) Results on an extended set of experiments are available from authors on request.
12 ARELLANO & BOND SPECIFICATION TESTS 87 TABLE 3 Power of Test Statistics, Number of rejections out of cases Onestep Twostep Robust Onestep Twostep difference difference Onestep Twostep Onestep onestep Twostep Sargan Sargan Sargan Sargan Hausman Hausman M M M (df 4) (df 4) (df ) (df ) (df ) (df ) Serial correlation: Corr (v,v,) Mean Variance Serial correlation: Corr (v,v,l) Mean * Variance Heteroskedasticity: Var (vi,) xi,, Xit AR () data Mean Variance Heteroskedasticity: Var (Vi,) xi,, : U.K. Sales Mean * Variance * *73 Notes. (i) N, T 7, replications, a, /3, o. (ii) In serial correlation designs vi, has been generated as MA () with standard normal random errors. (iii) In first three designs xi, is AR () with p 8 and T. 9. (iv) In fourth design, xi, are total sales from a sample of quoted U.K. companies scaled to have Var (Axi,). x,, behaves as a random walk process with large differences in mean and variance across firms. (v) In heteroskedastic designs re is no serial correlation in vi,.
13 88 REVIEW OF ECONOMIC STUDIES robust m statistics, which depend on fourthorder moments of data, both tend to reject too often at % level, suggesting that y have a slower convergence to normality by comparison with or test, but y are still to be recommended when heteroskedasticity is suspected. Overall all three m tests seem to be quite well approximated by ir asymptotic distributions under null, with no obvious indications of need for systematic finitesample size corrections. The same is true for Sargan, differencesargan and one step Hausman tests. However, twostep Hausman statistic appears to overreject consistently in se experiments. Table 3 repeats exercise for two models with MA () serial correlation (4 9 and.333) and two or experiments with heteroskedastic errors. The m statistic will reject null more than half time at per cent level when correlation between vi, and vi(,i) is only. However when autocorrelation rises to 3, null will be rejected in 9% of cases. The Hausman test has considerably less power than differencesargan test or m statistics, and with increasing autocorrelation difference in power becomes wider. The last two panels of Table 3 investigate effects of departures from homoskedasticity of error distribution on probabilities of rejection of tests. Both experiments have and. In first, xi, are generated AR () data as in previous experiments, while in second xi, are U.K. sales data. This has a dramatic effect on onestep tests which are not robust to heteroskedasticity. On or hand, robust m statistics and twostep differencesargan test show no serious departures from ir nominal size. The twostep Sargan test tends to underreject and twostep Hausman test overrejects, especially in last experiment where variance of xi, is much greater. We suspect that twostep Hausman statistic is very sensitive to presence of outliers.. AN APPLICATION TO EMPLOYMENT EQUATIONS In this section we apply strategy for estimation and testing outlined earlier to a model of employment, using panel data for a sample of U.K. companies. We consider a dynamic employment equation of form nit a, ni(tl) + ani(t) + i'(l)xit + At + qi + vit. (6) Here ni, is logarithm of U.K. employment in company i at end of year t,7 vector xi, contains a set of explanatory variables and 3(L) is a vector of polynomials in lag operator. The specification also contains a time effect At that is common to all companies,8 a permanent but unobservable firmspecific effect 'qi and an error term vi,. Equation (6) will admit more than one oretical interpretation. Suppose first that, in absence of adjustment costs, a pricesetting firm facing a constant elasticity demand curve would choose to set employment according to a loglinear labour demand equation (see, for example, Layard and Nickell (986)) n* hyo+ Yl wi, + 7k + Ce, + 7(7) where YI <, Y > and Y3?. Here wi, is log of real product wage, kit is log 7. Note that time period is taken to be month period covered in company's accounts ("accounting year") and so differs across companies in sample. 8. These time effects relate to calendar years and a company's accounting year is allocated to calendar year in which it ends.
14 ARELLANO & BOND SPECIFICATION TESTS 89 of gross capital, a'e is a measure of expected demand for firm's product relative to potential output, and intercept may contain a firmspecific component q'. If employment adjustment is costly n actual employment will deviate from n* in short run. This suggests a dynamic labour demand model of form of (6), where xit contains k, w, and o, and unrestricted lag structures are included to model this sluggish adjustment. We include log of industry output (ysi,) to capture industry demand shocks, and aggregate demand shocks are also included through time dummies. The resulting employment equation is a skeleton version of those estimated on U.K. time series data by Layard and Nickell (986) and on micro data by Nickell and Wadhwani (989). The shortrun dynamics will compound influences from adjustment costs, expectations formation and decision processes. Alternatively, if adjustment costs take standard additivelyseparable quadratic form (/a)(ni,  Ni(t_l)), where Nit denotes level rar than logarithm of company employment, n Dolado (987), following Nickell (984), derives a loglinear approximation to Euler equation for a firm maximising present discounted value of profits as E.l(nit) 8+ (+ r)ni(t,l)( + r)ni(t)+ a( + r)[n(l)  ni(t,l)]. (8) Here r is a real discount rate, assumed constant, and n* is given by (7). Replacing conditional expectation by its realisation and introducing an expectational error vit yields a model with form of (6), though with strong restrictions on dynamic structure in this case. In particular rational expectations hyposis suggests a oretical motivation for assumption of seriallyuncorrelated errors in this kind of model. The principal data source used is published accounts of 4 quoted companies whose main activity is manufacturing and for which we have seven or more continuous observations during period The panel is unbalanced both in sense that we have more observations on some firms than on ors, and because se observations correspond to different points in historical time. We allocate each of our companies to one of nine broad subsectors of manufacturing according to ir main product by sales, and use valueadded in that sector as our measure of industry output. Our wage variable is a measure of average remuneration per employee in company, which we deflate using a valueadded price deflator at industry level. Finally we use an inflationadjusted estimate of company's gross capital stock. More information about sample and construction of se variables is provided in Data Appendix. In Table 4 we report GMM estimates of se dynamic employment equations.9 We begin by including currentdated variables and unrestricted lag structures. Columns (al) and (a) present onestep and twostep results respectively for most general dynamic specification that we considered. Three crosssections are lost in constructing lags and taking first differences, so that estimation period is , with 6 useable observations. Here all variables or than lagged dependent variables are assumed to be strictly exogenous, although none of overidentifying restrictions that follow from this assumption are exploited. Comparing columns (al) and (a) shows that estimated coefficients are quite similar in all cases. Both models are well determined and have sensible longrun properties for a labour demand equation. However asymptotic standard errors associated with twostep estimates are generally around 3% lower than those associated with 9. Estimation was performed using DPD program written in GAUSS, described in Arellano and Bond (988a) and available from authors on request.
15 9 REVIEW OF ECONOMIC STUDIES Dependent variable: In (Employment)i, TABLE 4 Employment equations GMM estimates (all variables in first differences) Sample Period: (4 companies) Independent Instrumenting wages and capital* variables (al) (a) (b) (c) (d) ni(,_l) 686 ( 4) 69 ( 9) 474 ( 8) 8 (.48) 8 (.6) ni(t) 8 ( 6) 6 ( 7) 3 (7) 6 ()  74 (.) Wit *68 (78) 6 (4) 3 (*49) 64 (.4) wi(t) *393 (68) 3 (94)  ( 8) *64 (66) 43 (.76) ki, *37 (.9) 78 (.4) 93 ( 39)  () k,(,_l)  8 (*73) 4 ( 3) 77 ( 4) ki(t)  (33) 4 (.6) ysi, *68 (.7) 9 (6) 6 ( 9) 89 ( 98) ysi(t)  7 (.3) 66 (4) () 87 () () YSi(t) 6 ( 4)  (3) 96 ( 9) M Sargan test 6.8 () 3 4() 3 () 63() 683 () DifferenceSargan 49 (6) 4 (6)  (6) 86 () 36 () Hausman 8 () 44 () 34 ()  () 9 () Wald test 483 () 667 () 37 (7) (7) 639 (6) No. of observations * A subset of valid moment restrictions involving lagged wages and capital are exploitedsee note (vi). Additional instruments used are stacked levels and first differences of (firm real sales)(,) and (firm real stocks)(t) Notes (i) Time dummies are included in all equations. (ii) Asymptotic standard errors robust to general crosssection and time series heteroskedasticity are reported in parenses. (iii) The GMM estimates reported are all two step except column (al). (iv) The M, Sargan, differencesargan and Hausman statistics are all two step versions of se tests except in column (al). In column (al) m and Hausman statistics are asymptotically robust to general heteroskedasticity, whilst Sargan and differencesargan tests are only valid in case of i.i.d. errors. All Hausman statistics test only coefficient on ni(,t). Degrees of freedom for x statistics are reported in parenses. (v) The Wald statistic is a test of joint significance of independent variables asymptotically distributed as X under null of no relationship, where k is number of coefficients estimated (excluding time dummies). (vi) The basic instrument set used in columns (al), (a) and (b) is of form nil ni... O... O *&Xlj4] 979 l nil ni ni3 AX* 98 ZiI. O * nil... n7. &x'9] 984 where x, is vector of exogenous variables included in equation. For example, equation for 979 in first differences can be written as An,4 :alani3 + aani + Ax'4f3 + Avi4 For companies on which less than 9 observations are available, rows of Z; corresponding to missing equations are deleted and missing values of n in remaining rows are replaced by zeroes. In columns (c) and (d) Zi is modified to take form Zi [diag (ni I.. niswi(s_l)wski(sl)kis): (Ax'4... Axi9)'] (s,..., 7) where xi, is now vector of explanatory variables excluding wages and capital but including stacked lagged sales and stocks.
16 ARELLANO & BOND SPECIFICATION TESTS 9 onestep estimates, with discrepancy being even larger in some cases. We suspect that most of this apparent gain in precision may reflect a downward finitesample bias in estimates of twostep standard errors as indicated by simulation results in Table, suggesting that caution would be advisable in making inferences based on twostep estimator alone in samples of this size. Turning to test statistics, neir of robust m statistics nor twostep Sargan test provide evidence to suggest that assumption of serially uncorrelated errors is inappropriate in this example. The onestep Sargan and differencesargan statistics do reject overidentifying restrictions but our simulation results showed a strong tendency for those tests to reject too often in presence of heteroskedasticity. The twostep differencesargan test is more marginal but does reject at per cent significance level. Both Hausman tests also reject but se too show a tendency to overreject in our simulation experiments. One possibility is that this instability across different instrument sets reflects failure of strict exogeneity assumption for wages and capital, rar than serial correlation per se. In Table we present some alternative estimates of this same model for comparison. Columns (e) and (f) report two instrumental variable estimates of differenced equation using simpler instrument sets of AH type. In column (e) we use difference Ani(t3) to instrument Ani(t_), losing one furr crosssection, whilst in column (f) we use level ni(t3) as instrument. In both cases coefficient estimates are poorly determined, indicating a massive loss in efficiency compared to eir GMM estimator in this application. Using both Afni(t3) and ni(t3) as instruments (not reported) helped a little, but estimates remained very imprecise. In column (g) we report OLS estimates of employment equation in levels. In this case 978 crosssection is available and longer estimation period has been used here. Compared to GMM estimates re is a serious upward bias on lagged dependent variable, which suggests presence of firmspecific effects. Column (h) reports withingroups estimates, which are close to GMM in this example. In fact WG estimate of firstorder autoregressive coefficient is bigger than corresponding GMM estimates, although comparison between WG and GMM in this case is obscured by likely endogeneity of wages and capital. Returning to GMM estimates in Table 4, column (b) omits insignificant dynamics with little change in longrun properties of previous model. In columns (b)(d) we report only twostep estimates though onestep coefficient estimates were invariably similar; In column (b) twostep differencesargan test now marginally accepts hyposis of no serial correlation, but twostep Hausman statistic remains an outlier. In column (c) we relax assumption that real wage and capital stock are strictly exogenous and instead treat m as being endogenous. We refore use lags of w and k dated (t ) and earlier as instruments for wit, wi(tl) and kit. We also use lagged values of company's real sales and real stocks as additional instruments. Given size of our sample, not all available moment restrictions were used. The precise form of instrument matrix is described in note (vi) to Table 4. The results in column (c) suggest that it is inappropriate to treat wages and capital as strictly exogenous in this model. In this case none of test statistics indicate presence of misspecification. The coefficient estimates for our preferred specification in column (c) suggest a longrun wage elasticity of 4 (standard error 8) and a longrun elasticity with respect to capital of 7 (S.E. 4). There is a strong suggestion that industry output enters model in changes rar than levels, which is appealing since o' in (7) measures demand shocks relative to potential output. Layard and Nickell (986) interpret
17 9 REVIEW OF ECONOMIC STUDIES Dependent variable: In (Employment)i, TABLE Employment equations Alternative estimates Independent (e) (f) (g) (h) variables AHd AHI OLS Withingroups P (.) () (*) (.8) ni(t) (*8) (7) (48) (.77) Wit (.3) (83) (7) () Wi(t,l) ( 768) (8) (69) (43) k,, () ( ) (48) (6) (.386) (*37) (64) ( 3) ki(t) (3) (4) (3) (.4) ysi, (3) (338) (76) (93) ysi(,_) (88) (.99) ( 48) ( 8) YSi(t) (46) (.4) (3) (39) M Wald test 993 () () R Number of observations Notes. (i) Time dummies are included in all equations. (ii) Asymptotic standard errors robust to general crosssection and time series heteroskedasticity are reported in parenses. (iii) The m and Wald tests are asymptotically robust to general heteroskedasticity. (iv) Columns (e) and (f) report AndersonHsiaotype estimates of equation in first differences:,&n,(,l) is treated as endogenous and additional instruments used are An,(,3) in (e), and ni(,3) in (f), so that one furr crosssection is lost in (e) and effective sample period becomes (v) Column (g) reports OLS estimates of equation in levels, where effective sample period becomes (vi) Column (h) reports withingroups estimates. These are OLS estimates of equation in deviations from time means. shortrun effect of product demand fluctuations on labour demand as reflecting practice of normal cost pricing. Finally in column (d) we report estimates of Euler equation model given in (8). Here again we treat wages and capital as endogenous variables. Although tests for serial correlation remain below ir critical values, coefficient estimates are not favourable to Euler equation interpretation. The coefficients on capital and industry output are poorly determined, whilst those on lagged dependent variable imply a real discount rate of around %. Very similar results were obtained for versions of Euler equation model allowing for an MA () error process and estimating in levels as opposed to logs. It appears that process of employment adjustment is not well described by this model.
18 ARELLANO & BOND SPECIFICATION TESTS 93 The results of this empirical application are generally in agreement with those of our Monte Carlo simulations. The GMM estimator offers significant efficiency gains compared to simpler IV alternatives, and produces estimates that are welldetermined in dynamic panel data models. The tendency for nonrobust test statistics to overreject is confirmed. The robust m statistics perform satisfactorily as do twostep Sargan and differencesargan tests, but twostep Hausman test must be considered suspect in samples of this size. 6. CONCLUSION In this paper we have discussed estimation of dynamic panel data models by generalized method of moments. The estimators we consider exploit optimally all linear moment restrictions that follow from particular specifications, and are extended to cover case of unbalanced panel data. We focus on models with predetermined but not strictly exogenous explanatory variables in which identification results from lack of serial correlation in errors. A test of serial correlation based on GMM residuals is proposed and compared with Sargan tests of overidentifying restrictions and Hausman specification tests. To study practical performance of se procedures we performed a Monte Carlo simulation for units, seven timeperiods and two parameters. The results indicate negligible finite sample biases in GMM estimators and substantially smaller variances than those associated with simpler IV estimators of kind introduced by Anderson and Hsiao (98). We also find that distributions of serialcorrelation tests are wellapproximated by ir asymptotic counterparts. We applied se methods to estimate employment equations using an unbalanced panel of 4 quoted U.K. companies for period The GMM estimators and serial correlation tests performed well in this application. A potentially serious problem, suggested by both experimental evidence and application, concerns estimates of standard errors for twostep GMM estimator which we find to be downward biased in our samples. Furr results on alternative estimators of se standard errors would be very useful. A. The asymptotic normality of m statistic APPENDIX Following notation of Section 3, under assumption that (X v*/ N) op () we have and also where Nd^' 6 Nvv  (v' X*/N)N(88)+op() as N*oo N/A/ N V_U* A~ N N/vLI/gNZ v_ g* gn V + is( op(), vx*(x'zanz'xv)'x'zan. Then a multivariate central limit orem for independent observations ensures
19 94 REVIEW OF ECONOMIC STUDIES where WN is average covariance matrix of (4i, ct)': and {i Zvi. Therefore where N I iei~ _'N ipn\ WN N E( lifi ifi ) N VN) N INlIVV* Lv +d N(O, ) (Al) or VN )N g'n + gnvngn N (~A VN i) N [J, E(vi()vi*vs*vl()) (VLX*)(XZANZX) X ZANLi E(Zviv*vi()) with +(v' X*) avar (X*V)] avar () N(X'ZANZ'X)(X'ZANVNANZ'X)(X'ZANZ'X)Y A consistent estimate of VN can be obtained by replacing population average expectations of errors by sample averages of residuals. Finally, noticing that under our assumptions (Al) remains valid after this replacement, result follows. For a onestep?, we can also consider an alternative m criterion ("one step M") which relies on more restrictive auxiliary distributional assumptions. Assuming that errors in model in levels are independent and identically distributed across individuals and time, we have N OJN  N a, E(V!()H,*Vi()) where Hi* is a ( Ti  4) square matrix which has twos in main diagonal, minus ones in first subdiagonals and zeros orwise. Moreover where Hit is a ( Ti) x ( Ti4) matrix with N N N tvi()) N VH,*J and Hi.. is a x (T 4) matrix with minus one in (, ) position and zeroes elsewhere. Under such conditions v can be replaced by V ff Lil Vi()Hi. AX*(X ZANZ X) X ZAN (Si z (Hit ))a + v^'x* avar (8)X *V where is an estimate of a* B. The Sargan difference test Note that s in () can be rewritten as iyz* *j 6*iN zvi A S AzI N \I (N N 'vivizi '' IN` where Z* ZH and H is a p x p linear transformation matrix such that Z*(ZiIZl) with N Z V^V^'Z * O
20 ARELLANO & BOND SPECIFICATION TESTS 9 Letting where CN is p x p nonsingular, and noting that ( zn z' IZ z> CNC'N Z*tv z*x Xz* z*x 'xlz* z*v Z/ IP Z XXZ CNC'N ] CNC'N} v we have where vpz* s CN[IP  G(G'G) IG']C'N Z ZN G CN N( Following usual argument one can show that s de'me M X k, where E  N(O, IP) and M is of form IP D(D'D) D' with rank p  k. On same lines we can write SI NI CIN[IP G(G' GI) G']C' Z'v where G C'N (Z X) and ( N z i l) CINC IN Let G* contain top p, rows of G. Notice that G*  G, P. Therefore Ml ds dsss~~deme~p' S SI  E ME E ) O)E where Ml IP DI (D'DY) D and rank (MI) p,  k with D' (D'I D'). Finally notice that is symmetric and idempotent with rank p  p, and also from which () follows. [M Q??)] /Ml Ml o\ / (a) Sample DATA APPENDIX The principal data source used is company accounts from Datastream International which provide accounts records of employment and remuneration (i.e. wage bill) for all U.K. quoted companies from 976 onwards. We have used a sample of 4 companies with operations mainly in U.K., whose main activity is manufacturing and for which we have at least 7 continuous observations during period Where more than 7 observations are available we have exploited this additional information, so that our sample has unbalanced structure described in Table Al.
SYSTEMS OF REGRESSION EQUATIONS
SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations
More informationChapter 2. Dynamic panel data models
Chapter 2. Dynamic panel data models Master of Science in Economics  University of Geneva Christophe Hurlin, Université d Orléans Université d Orléans April 2010 Introduction De nition We now consider
More informationDISCUSSION PAPER SERIES
DISCUSSION PAPER SERIES No. 3048 GMM ESTIMATION OF EMPIRICAL GROWTH MODELS Stephen R Bond, Anke Hoeffler and Jonathan Temple INTERNATIONAL MACROECONOMICS ZZZFHSURUJ Available online at: www.cepr.org/pubs/dps/dp3048.asp
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationEconometric Analysis of Cross Section and Panel Data Second Edition. Jeffrey M. Wooldridge. The MIT Press Cambridge, Massachusetts London, England
Econometric Analysis of Cross Section and Panel Data Second Edition Jeffrey M. Wooldridge The MIT Press Cambridge, Massachusetts London, England Preface Acknowledgments xxi xxix I INTRODUCTION AND BACKGROUND
More informationFIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS
FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS Jeffrey M. Wooldridge Department of Economics Michigan State University East Lansing, MI 488241038
More informationPanel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS15/16
1 / 63 Panel Data Models Chapter 5 Financial Econometrics Michael Hauser WS15/16 2 / 63 Content Data structures: Times series, cross sectional, panel data, pooled data Static linear panel data models:
More informationON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS
ON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS Jeffrey M. Wooldridge THE INSTITUTE FOR FISCAL STUDIES DEPARTMENT OF ECONOMICS, UCL cemmap working
More information1 Short Introduction to Time Series
ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The
More informationA SubsetContinuousUpdating Transformation on GMM Estimators for Dynamic Panel Data Models
Article A SubsetContinuousUpdating Transformation on GMM Estimators for Dynamic Panel Data Models Richard A. Ashley 1, and Xiaojin Sun 2,, 1 Department of Economics, Virginia Tech, Blacksburg, VA 24060;
More informationRandom Effects Models for Longitudinal Survey Data
Analysis of Survey Data. Edited by R. L. Chambers and C. J. Skinner Copyright 2003 John Wiley & Sons, Ltd. ISBN: 0471899879 CHAPTER 14 Random Effects Models for Longitudinal Survey Data C. J. Skinner
More informationOnline Appendices to the Corporate Propensity to Save
Online Appendices to the Corporate Propensity to Save Appendix A: Monte Carlo Experiments In order to allay skepticism of empirical results that have been produced by unusual estimators on fairly small
More informationThe Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables. Kathleen M. Lang* Boston College.
The Loss in Efficiency from Using Grouped Data to Estimate Coefficients of Group Level Variables Kathleen M. Lang* Boston College and Peter Gottschalk Boston College Abstract We derive the efficiency loss
More information1. The Classical Linear Regression Model: The Bivariate Case
Business School, Brunel University MSc. EC5501/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 018956584) Lecture Notes 3 1.
More informationNormalization and Mixed Degrees of Integration in Cointegrated Time Series Systems
Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems Robert J. Rossana Department of Economics, 04 F/AB, Wayne State University, Detroit MI 480 EMail: r.j.rossana@wayne.edu
More informationTesting for serial correlation in linear paneldata models
The Stata Journal (2003) 3, Number 2, pp. 168 177 Testing for serial correlation in linear paneldata models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear paneldata
More informationAPPROXIMATING THE BIAS OF THE LSDV ESTIMATOR FOR DYNAMIC UNBALANCED PANEL DATA MODELS. Giovanni S.F. Bruno EEA 20041
ISTITUTO DI ECONOMIA POLITICA Studi e quaderni APPROXIMATING THE BIAS OF THE LSDV ESTIMATOR FOR DYNAMIC UNBALANCED PANEL DATA MODELS Giovanni S.F. Bruno EEA 20041 Serie di econometria ed economia applicata
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationChapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem
Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become
More informationEstimation and Inference in Cointegration Models Economics 582
Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there
More informationChapter 10: Basic Linear Unobserved Effects Panel Data. Models:
Chapter 10: Basic Linear Unobserved Effects Panel Data Models: Microeconomic Econometrics I Spring 2010 10.1 Motivation: The Omitted Variables Problem We are interested in the partial effects of the observable
More information2 Testing under Normality Assumption
1 Hypothesis testing A statistical test is a method of making a decision about one hypothesis (the null hypothesis in comparison with another one (the alternative using a sample of observations of known
More informationCHAPTER 6. SIMULTANEOUS EQUATIONS
Economics 24B Daniel McFadden 1999 1. INTRODUCTION CHAPTER 6. SIMULTANEOUS EQUATIONS Economic systems are usually described in terms of the behavior of various economic agents, and the equilibrium that
More informationWhy Is Micro Evidence on the Effects of Uncertainty Not Replicated In Macro Data?
WP/05/158 Why Is Micro Evidence on the Effects of Uncertainty Not Replicated In Macro Data? Stephen Bond and Domenico Lombardi 2005 International Monetary Fund WP/05/158 IMF Working Paper Office of the
More informationThe VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.
Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions
Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially
More informationCorrelated Random Effects Panel Data Models
INTRODUCTION AND LINEAR MODELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 1319, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. The Linear
More informationDo Supplemental Online Recorded Lectures Help Students Learn Microeconomics?*
Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?* Jennjou Chen and TsuiFang Lin Abstract With the increasing popularity of information technology in higher education, it has
More informationWooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares
Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit
More informationTesting for Granger causality between stock prices and economic growth
MPRA Munich Personal RePEc Archive Testing for Granger causality between stock prices and economic growth Pasquale Foresti 2006 Online at http://mpra.ub.unimuenchen.de/2962/ MPRA Paper No. 2962, posted
More informationPanel Data Econometrics
Panel Data Econometrics Master of Science in Economics  University of Geneva Christophe Hurlin, Université d Orléans University of Orléans January 2010 De nition A longitudinal, or panel, data set is
More informationHypothesis Testing in the Classical Regression Model
LECTURE 5 Hypothesis Testing in the Classical Regression Model The Normal Distribution and the Sampling Distributions It is often appropriate to assume that the elements of the disturbance vector ε within
More informationImplementations of tests on the exogeneity of selected. variables and their Performance in practice ACADEMISCH PROEFSCHRIFT
Implementations of tests on the exogeneity of selected variables and their Performance in practice ACADEMISCH PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Universiteit van Amsterdam op gezag
More informationAuxiliary Variables in Mixture Modeling: 3Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationTesting The Quantity Theory of Money in Greece: A Note
ERC Working Paper in Economic 03/10 November 2003 Testing The Quantity Theory of Money in Greece: A Note Erdal Özmen Department of Economics Middle East Technical University Ankara 06531, Turkey ozmen@metu.edu.tr
More informationChapter 6. Econometrics. 6.1 Introduction. 6.2 Univariate techniques Transforming data
Chapter 6 Econometrics 6.1 Introduction We re going to use a few tools to characterize the time series properties of macro variables. Today, we will take a relatively atheoretical approach to this task,
More information2. What are the theoretical and practical consequences of autocorrelation?
Lecture 10 Serial Correlation In this lecture, you will learn the following: 1. What is the nature of autocorrelation? 2. What are the theoretical and practical consequences of autocorrelation? 3. Since
More informationMarketing Mix Modelling and Big Data P. M Cain
1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored
More informationproblem arises when only a nonrandom sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a nonrandom
More informationClustering in the Linear Model
Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple
More informationStock Markets, Banks, and Growth: Correlation or Causality
Stock Markets, Banks, and Growth: Correlation or Causality Thorsten Beck and Ross Levine July 2001 Abstract: This paper investigates the impact of stock markets and banks on economic growth using a panel
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulationbased method for estimating the parameters of economic models. Its
More informationInstrumental Variables Regression. Instrumental Variables (IV) estimation is used when the model has endogenous s.
Instrumental Variables Regression Instrumental Variables (IV) estimation is used when the model has endogenous s. IV can thus be used to address the following important threats to internal validity: Omitted
More informationEstimating the effect of state dependence in workrelated training participation among British employees. Panos Sousounis
Estimating the effect of state dependence in workrelated training participation among British employees. Panos Sousounis University of the West of England Abstract Despite the extensive empirical literature
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationhttp://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions
A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.
More informationConditional guidance as a response to supply uncertainty
1 Conditional guidance as a response to supply uncertainty Appendix to the speech given by Ben Broadbent, External Member of the Monetary Policy Committee, Bank of England At the London Business School,
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More informationInstrumental Variables & 2SLS
Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20  Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationC: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)}
C: LEVEL 800 {MASTERS OF ECONOMICS( ECONOMETRICS)} 1. EES 800: Econometrics I Simple linear regression and correlation analysis. Specification and estimation of a regression model. Interpretation of regression
More informationEconometrics Regression Analysis with Time Series Data
Econometrics Regression Analysis with Time Series Data João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, May 2011
More informationOLS in Matrix Form. Let y be an n 1 vector of observations on the dependent variable.
OLS in Matrix Form 1 The True Model Let X be an n k matrix where we have observations on k independent variables for n observations Since our model will usually contain a constant term, one of the columns
More informationSimultaneous Equation Models As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the
Simultaneous Equation Models As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the explanatory variables is jointly determined with the dependent
More informationEconometric Methods fo Panel Data Part II
Econometric Methods fo Panel Data Part II Robert M. Kunst University of Vienna April 2009 1 Tests in panel models Whereas restriction tests within a specific panel model follow the usual principles, based
More informationFrom Education to Democracy?
From Education to Democracy? By DARON ACEMOGLU, SIMON JOHNSON, JAMES A. ROBINSON, AND PIERRE YARED* The conventional wisdom, since at least the writings of John Dewey (1916), views high levels of educational
More informationState Space Time Series Analysis
State Space Time Series Analysis p. 1 State Space Time Series Analysis Siem Jan Koopman http://staff.feweb.vu.nl/koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2011 State
More information5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
More informationPractical. I conometrics. data collection, analysis, and application. Christiana E. Hilmer. Michael J. Hilmer San Diego State University
Practical I conometrics data collection, analysis, and application Christiana E. Hilmer Michael J. Hilmer San Diego State University Mi Table of Contents PART ONE THE BASICS 1 Chapter 1 An Introduction
More informationEconometric Methods for Panel Data
Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies
More information4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4
4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Nonlinear functional forms Regression
More informationNote 2 to Computer class: Standard misspecification tests
Note 2 to Computer class: Standard misspecification tests Ragnar Nymoen September 2, 2013 1 Why misspecification testing of econometric models? As econometricians we must relate to the fact that the
More informationECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
More information1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
More informationInequality, Mobility and Income Distribution Comparisons
Fiscal Studies (1997) vol. 18, no. 3, pp. 93 30 Inequality, Mobility and Income Distribution Comparisons JOHN CREEDY * Abstract his paper examines the relationship between the crosssectional and lifetime
More informationChapter 6: Multivariate Cointegration Analysis
Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration
More information3.1 Least squares in matrix form
118 3 Multiple Regression 3.1 Least squares in matrix form E Uses Appendix A.2 A.4, A.6, A.7. 3.1.1 Introduction More than one explanatory variable In the foregoing chapter we considered the simple regression
More informationw 2i«97 CEKT^CIRCUUTlWB^rA^ ?\ " You^y be d a minim*" The w~*g&&zt^ borrowed rene sponsible for its the library from wb * * previous due date.
CEKT^CIRCUUTlWB^rA^ The w~*g&&zt^ to sponsible for its rene the library from wb * * ^ borrowed?\ " You^y be d a minim*" for disciplinary action and may the Unlver.lty. CENTER, 333M00 w 2i«97 previous
More informationInstrumental Variables & 2SLS
Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20  Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental
More informationTEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND
I J A B E R, Vol. 13, No. 4, (2015): 15251534 TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND Komain Jiranyakul * Abstract: This study
More informationTESTING THE ONEPART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWOPART MODEL
TESTING THE ONEPART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWOPART MODEL HARALD OBERHOFER AND MICHAEL PFAFFERMAYR WORKING PAPER NO. 201101 Testing the OnePart Fractional Response Model against
More informationThe information content of lagged equity and bond yields
Economics Letters 68 (2000) 179 184 www.elsevier.com/ locate/ econbase The information content of lagged equity and bond yields Richard D.F. Harris *, Rene SanchezValle School of Business and Economics,
More informationCHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.
CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In
More informationRegression analysis in practice with GRETL
Regression analysis in practice with GRETL Prerequisites You will need the GNU econometrics software GRETL installed on your computer (http://gretl.sourceforge.net/), together with the sample files that
More informationLimitations of regression analysis
Limitations of regression analysis Ragnar Nymoen Department of Economics, UiO 8 February 2009 Overview What are the limitations to regression? Simultaneous equations bias Measurement errors in explanatory
More information1. THE LINEAR MODEL WITH CLUSTER EFFECTS
What s New in Econometrics? NBER, Summer 2007 Lecture 8, Tuesday, July 31st, 2.003.00 pm Cluster and Stratified Sampling These notes consider estimation and inference with cluster samples and samples
More informationStandardization and Estimation of the Number of Factors for Panel Data
Journal of Economic Theory and Econometrics, Vol. 23, No. 2, Jun. 2012, 79 88 Standardization and Estimation of the Number of Factors for Panel Data Ryan GreenawayMcGrevy Chirok Han Donggyu Sul Abstract
More informationThe marginal cost of rail infrastructure maintenance; does more data make a difference?
The marginal cost of rail infrastructure maintenance; does more data make a difference? Kristofer Odolinski, JanEric Nilsson, Åsa Wikberg Swedish National Road and Transport Research Institute, Department
More informationEC327: Advanced Econometrics, Spring 2007
EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Appendix D: Summary of matrix algebra Basic definitions A matrix is a rectangular array of numbers, with m
More informationDEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests
DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also
More informationWhich null hypothesis do overidentification restrictions actually test? Abstract
Which null hypothesis do overidentification restrictions actually test? Rembert De Blander EcRu, U.C.Louvain and CES, K.U.Leuven Abstract In this note I investigate which alternatives are detected by overidentifying
More informationTime Series Analysis
Time Series Analysis Identifying possible ARIMA models Andrés M. Alonso Carolina GarcíaMartos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and GarcíaMartos
More informationAdvanced timeseries analysis
UCL DEPARTMENT OF SECURITY AND CRIME SCIENCE Advanced timeseries analysis Lisa Tompson Research Associate UCL Jill Dando Institute of Crime Science l.tompson@ucl.ac.uk Overview Fundamental principles
More informationAdvanced Forecasting Techniques and Models: ARIMA
Advanced Forecasting Techniques and Models: ARIMA Short Examples Series using Risk Simulator For more information please visit: www.realoptionsvaluation.com or contact us at: admin@realoptionsvaluation.com
More informationINVENTORY INVESTMENT AND LOAN SUPPLY SHOCKS. EVIDENCE FROM UKRAINE. Dzhumyga Maryna. MA in Economics. Kyiv School of Economics
INVENTORY INVESTMENT AND LOAN SUPPLY SHOCKS. EVIDENCE FROM UKRAINE by Dzhumyga Maryna A thesis submitted in partial fulfillment of the requirements for the degree of MA in Economics Kyiv School of Economics
More informationProblems with OLS Considering :
Problems with OLS Considering : we assume Y i X i u i E u i 0 E u i or var u i E u i u j 0orcov u i,u j 0 We have seen that we have to make very specific assumptions about u i in order to get OLS estimates
More informationEstimating the marginal cost of rail infrastructure maintenance. using static and dynamic models; does more data make a
Estimating the marginal cost of rail infrastructure maintenance using static and dynamic models; does more data make a difference? Kristofer Odolinski and JanEric Nilsson Address for correspondence: The
More informationTHE IMPACT OF EXCHANGE RATE VOLATILITY ON BRAZILIAN MANUFACTURED EXPORTS
THE IMPACT OF EXCHANGE RATE VOLATILITY ON BRAZILIAN MANUFACTURED EXPORTS ANTONIO AGUIRRE UFMG / Department of Economics CEPE (Centre for Research in International Economics) Rua Curitiba, 832 Belo Horizonte
More informationWooldridge, Introductory Econometrics, 4th ed. Multiple regression analysis:
Wooldridge, Introductory Econometrics, 4th ed. Chapter 4: Inference Multiple regression analysis: We have discussed the conditions under which OLS estimators are unbiased, and derived the variances of
More informationECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE
ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE YUAN TIAN This synopsis is designed merely for keep a record of the materials covered in lectures. Please refer to your own lecture notes for all proofs.
More informationTURUN YLIOPISTO UNIVERSITY OF TURKU TALOUSTIEDE DEPARTMENT OF ECONOMICS RESEARCH REPORTS. A nonlinear moving average test as a robust test for ARCH
TURUN YLIOPISTO UNIVERSITY OF TURKU TALOUSTIEDE DEPARTMENT OF ECONOMICS RESEARCH REPORTS ISSN 0786 656 ISBN 951 9 1450 6 A nonlinear moving average test as a robust test for ARCH Jussi Tolvi No 81 May
More informationCentre for Central Banking Studies
Centre for Central Banking Studies Technical Handbook No. 4 Applied Bayesian econometrics for central bankers Andrew Blake and Haroon Mumtaz CCBS Technical Handbook No. 4 Applied Bayesian econometrics
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More informationPanel Data: Linear Models
Panel Data: Linear Models Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Panel Data: Linear Models 1 / 45 Introduction Outline What
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationGeneralized Method of Moments Estimation
Generalized Method of Moments Estimation Lars Peter Hansen 1 Department of Economics University of Chicago email: lhansen@uchicago.edu June 17, 2007 1 I greatly appreciate comments from Lionel Melin,
More informationFORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits
Technical Paper Series Congressional Budget Office Washington, DC FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits Albert D. Metz Microeconomic and Financial Studies
More informationVariance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers
Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications
More informationLONG TERM FOREIGN CURRENCY EXCHANGE RATE PREDICTIONS
LONG TERM FOREIGN CURRENCY EXCHANGE RATE PREDICTIONS The motivation of this work is to predict foreign currency exchange rates between countries using the long term economic performance of the respective
More information