DISCUSSION PAPER SERIES

IZA DP No. 2756

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models

Cheng Hsiao
M. Hashem Pesaran
Andreas Pick

April 2007

Forschungsinstitut zur Zukunft der Arbeit
Institute for the Study of Labor
Cheng Hsiao
University of Southern California

M. Hashem Pesaran
CIMF, Cambridge University, University of Southern California and IZA

Andreas Pick
CIMF, Cambridge University

IZA
P.O. Box 7240
53072 Bonn
Germany
Phone: +49-228-3894-0
Fax: +49-228-3894-180
E-mail: iza@iza.org

Any opinions expressed here are those of the author(s) and not those of the institute. Research disseminated by IZA may include views on policy, but the institute itself takes no institutional policy positions.

The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent nonprofit company supported by Deutsche Post World Net. The center is associated with the University of Bonn and offers a stimulating research environment through its research networks, research support, and visitors and doctoral programs. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public.

IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author.
ABSTRACT

Diagnostic Tests of Cross Section Independence for Nonlinear Panel Data Models*

In this paper we discuss tests for residual cross section dependence in nonlinear panel data models. The tests are based on average pair-wise residual correlation coefficients. In nonlinear models, the definition of the residual is ambiguous and we consider two approaches: deviations of the observed dependent variable from its expected value and generalized residuals. We show the asymptotic consistency of the cross section dependence (CD) test of Pesaran (2004). In Monte Carlo experiments it emerges that the CD test has the correct size for any combination of $N$ and $T$, whereas the LM test relies on $T$ being large relative to $N$. We then analyze the roll-call votes of the 104th U.S. Congress and find considerable dependence between the votes of the members of Congress.

JEL Classification: C12, C33, C35

Keywords: cross-section dependence, nonlinear panel data model

Corresponding author:
Hashem Pesaran
Faculty of Economics
University of Cambridge
Sidgwick Avenue
Cambridge, CB3 9DD
United Kingdom
E-mail: hashem.pesaran@econ.cam.ac.uk

* The research for this paper began when the third author was post-doctoral research fellow at De Nederlandsche Bank (DNB). He would like to thank DNB for its hospitality. We would like to thank Qi Li for helpful comments.
1 Introduction

Many panel data models assume that observations across individuals are independent. However, there could be common shocks that affect all individuals. Often economic theories also predict that agents take actions that lead to interdependence among themselves. For example, the prediction that risk-averse agents will make insurance contracts allowing them to smooth idiosyncratic shocks implies dependence in consumption across individuals. If observations are dependent across individuals, estimators that are based on the assumption of cross sectional independence may be inconsistent. Since, in contrast to time series data, there is no natural ordering for the cross sectional indices, $i$, appropriate modeling and estimation of cross sectional dependence can be difficult, in particular if the dimension of cross sectional observations, $N$, is large and the time series dimension, $T$, is small. Therefore, it is appealing to first test for cross sectional dependence before one attempts to incorporate cross sectional dependence into a model.

There are essentially two approaches to test for cross sectional dependence. One is to postulate a connection or spatial matrix and then test whether the coefficient of this spatial matrix is zero, e.g. Moran (1948), Kelejian and Prucha (2001). Although under the null of no cross-correlations the coefficient of the spatial matrix is zero no matter how this matrix is postulated, the power of this kind of test presumably will depend on the choice of the spatial matrix. Moreover, the computation of the spatial regression model is quite complicated, see Kelejian and Prucha (1999) and Lee (2002).

Another approach to testing cross sectional dependence is to directly test whether the cross-correlations of the errors are zero. For example, Breusch and Pagan's (1980) Lagrangian multiplier (LM) test is based on the average of the squared pair-wise correlation coefficients. However, the mean of the squared correlation coefficients is not correctly centered when $T$ is small.
When $N$ is large, the incorrect centering of the squared correlation coefficients is likely to be accentuated, resulting in significant size distortions. Pesaran, Ullah and Yamagata (2006) have proposed a bias-adjusted normal approximation version of the LM test for linear regression models with strictly exogenous regressors and normal errors. Small sample evidence based on their Monte Carlo experiments suggests that the bias-adjusted LM tests successfully control the size. However, if the model is nonlinear, it does not appear feasible to derive the exact mean and variance based, for example, on the work of Ullah (2004).

As an alternative to the test based on the square of the error correlation coefficients, Pesaran (2004) proposes to use the simple average of all pair-wise correlation coefficients of the least squares residuals from the individual linear regressions in the panel, which is closely related to the $C_{AVE}$ statistic of Frees (1995). The advantage of Pesaran's cross section dependence test (CD test) is that it is correctly centered for fixed $N$ and $T$ under the null of cross section independence, assuming that the errors are symmetrically
distributed. In a recent paper, Ng (2006) employs spacing variance ratio statistics to test the severity of cross section correlation in panels by partitioning the pair-wise cross-correlations into groups from high to low. The proposed statistics are intended as agnostic tools for identifying and characterizing correlations across groups. However, they cannot be used as diagnostic tests of the cross section independence that underlies the standard analysis of panel data. Such tests are important as parameter estimates may be inconsistent if cross section correlation of errors is not accounted for at the estimation stage. It is, therefore, important to establish whether cross section error independence can be maintained prior to estimation and inference.

This paper explores the use of the LM and CD tests for nonlinear panel data models. In such models, the calculation of the errors is not as straightforward as in linear models. We consider two approaches to estimate the errors. The first estimate of the errors is the deviation of the observed dependent variable from its expected value, and the second is the prediction of the errors conditional on the observed dependent variable, the so-called generalized residual, see e.g. Gourieroux, Monfort and Trognon (1985). Based on the estimated residuals we introduce the LM and CD test statistics for nonlinear panel data models and, using Monte Carlo experiments, we examine the small sample performance of the tests for the probit and the Tobit model. Using data on voting in the U.S. Congress and campaign contributions by political lobby groups previously analyzed by Wawro (2001), we demonstrate the application of the test.

The next section introduces the nonlinear panel data model. Section 3 discusses the estimation of the residuals and the tests for cross section dependence. The small sample performance is evaluated using Monte Carlo experiments in Section 4, and the tests are applied to the data on voting in the U.S. Congress in Section 5. Finally, Section 6 provides some concluding remarks.
Technical details are provided in Appendices A and B, and a bootstrap procedure to approximate the finite sample distribution of the CD test is discussed in Appendix C.

2 The nonlinear panel data model

Suppose that the latent variable, $y^*_{it}$, is generated by the following nonlinear panel data model,
\[ f(y^*_{it}, x_{it}, \theta_i) = \varepsilon_{it}, \quad \text{for } i = 1, 2, \ldots, N, \; t = 1, 2, \ldots, T, \tag{1} \]
where $x_{it}$ is a $k \times 1$ vector of exogenous variables, $\theta_i$ is a $q \times 1$ vector of parameters, $\varepsilon_{it}$ is a scalar disturbance, $N$ is the number of cross section observations, and $T$ is the number of observations in time. The variable $y_{it}$ is observed, and is related to the latent variable via the link function $g(\cdot)$,
\[ y_{it} = g(y^*_{it}). \tag{2} \]
This general model encompasses many econometric models. Examples include binary choice models where
\[ f(y^*_{it}, x_{it}, \beta_i) = y^*_{it} - \beta_i' x_{it}, \tag{3} \]
and
\[ g(y^*_{it}) = I(y^*_{it}), \tag{4} \]
where $I(A)$ is the indicator function, which is unity if $A > 0$ and zero otherwise. If the distribution of $\varepsilon_{it}$ is logistic, then this constitutes the logit model. If $\varepsilon_{it}$ is standard normally distributed, then this is the probit model. The Tobit model is obtained if the latent model is that of equation (3), the errors are normal and the link function is
\[ g(y^*_{it}) = y^*_{it} I(y^*_{it}). \tag{5} \]

The disturbances are assumed to be potentially contemporaneously correlated, $\varepsilon_t \sim (0, \Sigma)$, where $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{Nt})'$. This paper focuses on testing $\Sigma = D$ against $\Sigma \neq D$, where $D$ is a diagonal matrix. Such an error structure could arise, for example, from the presence of unobserved common factors, $\varepsilon_{it} = \gamma_i' f_t + e_{it}$, where $\gamma_i$ is the vector of factor loadings, $f_t \sim iid(0, \Sigma_f)$, and $e_{it} \sim iid(0, \sigma_e^2)$.

3 Testing for cross section independence

Pesaran (2004) has suggested two approaches to test for cross section dependence using the pair-wise correlation coefficients of the residuals in the regression equations of the $i$-th and $j$-th unit, $\tilde\rho_{ij}$. One is the LM test of Breusch and Pagan (1980),
\[ LM = \sqrt{\frac{1}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left( T \tilde\rho_{ij}^2 - 1 \right). \tag{6} \]
The other is the CD test,
\[ CD = \sqrt{\frac{2T}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \tilde\rho_{ij}. \tag{7} \]

It is clear that, in contrast to the LM test, the CD test requires the cross section correlation to be different from zero on average to detect deviations from cross-section independence. While we believe that this is not a restrictive assumption for most real life situations, this limitation should be borne in mind when applying the CD test.
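The two statistics in (6) and (7) can be sketched in a few lines. The block below is only an illustration under stated assumptions: the function names `pairwise_corr` and `cd_and_lm` are hypothetical, the panel is taken to be balanced, and the pair-wise correlations are computed as ordinary mean-corrected sample correlations of whatever residual series are supplied.

```python
import math


def pairwise_corr(u, v):
    """Mean-corrected sample correlation of two equal-length residual series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def cd_and_lm(resid):
    """Return (CD, LM) for a balanced panel.

    resid is a list of N residual series, each a list of length T
    (hypothetical input layout chosen for this sketch).
    """
    n_units, n_periods = len(resid), len(resid[0])
    sum_rho, sum_sq = 0.0, 0.0
    for i in range(n_units - 1):
        for j in range(i + 1, n_units):
            rho = pairwise_corr(resid[i], resid[j])
            sum_rho += rho                        # summand of the CD statistic (7)
            sum_sq += n_periods * rho ** 2 - 1.0  # summand of the LM statistic (6)
    scale = n_units * (n_units - 1)
    cd = math.sqrt(2.0 * n_periods / scale) * sum_rho
    lm = math.sqrt(1.0 / scale) * sum_sq
    return cd, lm


if __name__ == "__main__":
    demo = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
    print(cd_and_lm(demo))
```

In the small demo the three pair-wise correlations are 1, -1 and -1, so they partly cancel in the CD sum while each contributes fully to the LM sum. This is the limitation of the CD test noted in the text: correlations that average to zero cannot be detected by (7).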
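For linear models, residuals are estimated directly from the underlying linear regressions. For nonlinear models the concept of a residual is ambiguous and can be defined in a number of different ways. One possibility would be to define the disturbances of the nonlinear models analogously to the linear case as the deviation of the observed dependent variable from its expected value,
\[ u_{it} = y_{it} - E(y_{it} \mid x_{it}, \theta_i), \tag{8} \]
with an estimated counterpart, the residual, given by
\[ \tilde u_{it} = y_{it} - E(y_{it} \mid x_{it}, \tilde\theta_i), \tag{9} \]
where $\tilde\theta_i$ is a consistent estimator of $\theta_i$ under the null of cross section independence. For the probit model, for example, this deviation is given by
\[ \tilde u_{it} = y_{it} - \Phi(\tilde\beta_i' x_{it}), \tag{10} \]
and for the Tobit model it is
\[ \tilde u_{it} = y_{it} - \left( \tilde\beta_i' x_{it} + \tilde\sigma_i \tilde\lambda_{it} \right) \Phi\!\left( \frac{\tilde\beta_i' x_{it}}{\tilde\sigma_i} \right), \tag{11} \]
where
\[ \tilde\lambda_{it} = \phi\!\left( \frac{\tilde\beta_i' x_{it}}{\tilde\sigma_i} \right) \left[ \Phi\!\left( \frac{\tilde\beta_i' x_{it}}{\tilde\sigma_i} \right) \right]^{-1}, \]
$\phi(\cdot)$ is the standard normal probability density function (pdf), and $\Phi(\cdot)$ is the standard normal cumulative distribution function (cdf).

For many models, the residuals will be heteroskedastic. One can transform them into homoskedastic residuals by dividing the estimated residuals by their estimated standard errors. In the case of the probit model the standardized residual is defined as
\[ \tilde u_{it} = \frac{y_{it} - \Phi(\tilde\beta_i' x_{it})}{\sqrt{\Phi(\tilde\beta_i' x_{it}) \left[ 1 - \Phi(\tilde\beta_i' x_{it}) \right]}}, \tag{12} \]
and, in the case of the Tobit model, it is the residual in (11) divided by its estimated standard deviation,
\[ \tilde u_{it} / \tilde\omega_{it}, \tag{13} \]
where
\[ \tilde\omega_{it}^2 = \left[ (\tilde\beta_i' x_{it})^2 + \tilde\beta_i' x_{it}\, \tilde\sigma_i \tilde\lambda_{it} + \tilde\sigma_i^2 \right] \Phi\!\left( \frac{\tilde\beta_i' x_{it}}{\tilde\sigma_i} \right) - \left[ \left( \tilde\beta_i' x_{it} + \tilde\sigma_i \tilde\lambda_{it} \right) \Phi\!\left( \frac{\tilde\beta_i' x_{it}}{\tilde\sigma_i} \right) \right]^2. \]
The derivations of the residuals and their variances are given in Appendix A.

Alternatively, the computation of $\tilde\rho_{ij}$ can be based on the generalized residual proposed by Gourieroux, Monfort, and Trognon (1985), Gourieroux,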
Monfort, Renault, and Trognon (1987), and Chesher and Irish (1987). The generalized residuals are defined as
\[ u^g_{it} = E_0\!\left( f(y^*_{it}, x_{it}, \theta_i) \mid y_{it} \right) = \psi_0(y_{it}, x_{it}, \theta_i), \tag{14} \]
where $E_0(\cdot)$ is the expectation operator under the null hypothesis of no cross section dependence. In contrast to the residual (8), the expectation of the generalized residuals in (14) is conditional on the observed dependent variable, $y_{it}$. An estimator of $\psi_0(y_{it}, x_{it}, \theta_i)$ is given by
\[ \tilde u^g_{it} = \psi_0(y_{it}, x_{it}, \tilde\theta_i). \tag{15} \]
Using equation (15), the generalized residual for the probit model is given by
\[ \tilde u^g_{it} = \frac{\phi(\tilde\beta_i' x_{it})}{\Phi(\tilde\beta_i' x_{it}) \left[ 1 - \Phi(\tilde\beta_i' x_{it}) \right]} \left[ y_{it} - \Phi(\tilde\beta_i' x_{it}) \right]. \tag{16} \]
For the Tobit model we have
\[ \tilde u^g_{it} = (y_{it} - \tilde\beta_i' x_{it})\, I(y_{it}) - \tilde\sigma_i \frac{\phi(\tilde\beta_i' x_{it} / \tilde\sigma_i)}{\Phi(-\tilde\beta_i' x_{it} / \tilde\sigma_i)} \left[ 1 - I(y_{it}) \right], \tag{17} \]
where $\tilde\sigma_i$ is the estimated standard deviation of the error term (Chesher and Irish (1987)). The generalized residual is also heteroskedastic for many models. In Appendix A we give the variances for the probit and the Tobit model, which could be used to obtain standardized versions of the generalized residuals. However, in line with the literature we will use the residuals (16) and (17) below.

In general, $\tilde\rho_{ij}$ can be estimated using any of the residuals defined above. For example, using the residuals $\tilde u_{it}$, we have
\[ \tilde\rho_{ij} = \frac{\sum_{t=1}^{T} (\tilde u_{it} - \tilde u_i)(\tilde u_{jt} - \tilde u_j)}{\left( \sum_{t=1}^{T} (\tilde u_{it} - \tilde u_i)^2 \right)^{1/2} \left( \sum_{t=1}^{T} (\tilde u_{jt} - \tilde u_j)^2 \right)^{1/2}}, \tag{18} \]
where $\tilde u_i = T^{-1} \sum_{t=1}^{T} \tilde u_{it}$. The estimated correlation coefficient, $\tilde\rho_{ij}$, is then used in equations (6) and (7) to obtain the LM and CD test statistics. For large $T$, $\tilde u_i$ will tend to zero and could be ignored, but for better small sample performance the mean correction might be desirable.

Under the null hypothesis and for sufficiently large $T$, $\tilde\rho_{ij} \xrightarrow{p} 0$ for each $i$ and $j$. However, the probability limit of $\tilde\rho_{ij}$ will differ from zero in the presence of cross section dependence. Under the null of cross section independence and for sufficiently large $N$ and $T$, the CD statistic tends to a standard normal variate. See Appendix B for a proof and precise mathematical conditions.
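To make the residual definitions concrete, the following sketch evaluates the standardized probit residual (12), the generalized probit residual (16), the standardized Tobit residual (13) and the generalized Tobit residual (17) for a single observation. It assumes the fitted index $\tilde\beta_i' x_{it}$ (called `xb` here) and $\tilde\sigma_i$ (`sigma`) have already been estimated; all function names are hypothetical and chosen only for this illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # .pdf is phi(.), .cdf is Phi(.)


def probit_standardized(y, xb):
    """Equation (12): (y - Phi(xb)) / sqrt(Phi(xb) * (1 - Phi(xb)))."""
    p = _STD_NORMAL.cdf(xb)
    return (y - p) / math.sqrt(p * (1.0 - p))


def probit_generalized(y, xb):
    """Equation (16): phi(xb) * (y - Phi(xb)) / (Phi(xb) * (1 - Phi(xb)))."""
    p = _STD_NORMAL.cdf(xb)
    return _STD_NORMAL.pdf(xb) * (y - p) / (p * (1.0 - p))


def tobit_standardized(y, xb, sigma):
    """Equations (11) and (13): deviation from E(y) divided by its standard deviation."""
    z = xb / sigma
    big_phi = _STD_NORMAL.cdf(z)
    lam = _STD_NORMAL.pdf(z) / big_phi              # inverse Mills ratio
    mean = (xb + sigma * lam) * big_phi             # E(y | x) for censoring at zero
    var = (xb ** 2 + xb * sigma * lam + sigma ** 2) * big_phi - mean ** 2
    return (y - mean) / math.sqrt(var)


def tobit_generalized(y, xb, sigma):
    """Equation (17): y - xb if uncensored, -sigma * phi(z) / Phi(-z) if censored."""
    z = xb / sigma
    if y > 0:
        return y - xb
    return -sigma * _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(-z)


if __name__ == "__main__":
    print(probit_standardized(1, 0.3), probit_generalized(1, 0.3))
    print(tobit_standardized(0.0, 0.3, 1.5), tobit_generalized(0.0, 0.3, 1.5))
```

For the probit model the two residuals differ only by the observation-specific factor $\phi/\sqrt{\Phi(1-\Phi)}$, which may help explain why the choice of residual matters little for the tests. The sketch does not guard against fitted probabilities numerically equal to 0 or 1.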
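4 Small sample properties: Monte Carlo evidence

4.1 Data generating process

The Monte Carlo experiments are based on the following data generating process (DGP) for the latent variable,
\[ y^{*(r)}_{it} = \alpha^{(r)}_i + \beta x^{(r)}_{it} + \varepsilon^{(r)}_{it}, \tag{19} \]
where $i = 1, 2, \ldots, N$, $t = 1, 2, \ldots, T$, and $r$, $r = 1, 2, \ldots, R$, denotes the replication index in the Monte Carlo experiments with $R = 1000$, and $\beta = 1$. The regressors are generated as
\[ x^{(r)}_{it} = \delta f^{(r)}_t + \eta^{(r)}_{it}, \qquad \eta^{(r)}_{it} = \lambda \eta^{(r)}_{i,t-1} + \zeta^{(r)}_{it}, \]
where $\zeta^{(r)}_{it} \sim iid\, N(0, 1)$ and $f^{(r)}_t \sim iid\, N(0, 1)$. We set $\delta = 1$ and $\lambda = 0.5$. Finally,
\[ \alpha^{(r)}_i = \bar x^{(r)}_i + \hat S^{(r)}_x \nu^{(r)}_i, \]
where $\bar x^{(r)}_i = \sum_{t=1}^{T} x^{(r)}_{it} / T$,
\[ \hat S^{(r)}_x = \left[ (N-1)^{-1} \sum_{i=1}^{N} \left( \bar x^{(r)}_i - \bar x^{(r)} \right)^2 \right]^{1/2}, \]
$\bar x^{(r)} = \sum_{i=1}^{N} \bar x^{(r)}_i / N$, and $\nu^{(r)}_i \sim iid\, N(0, 1)$. Hence, the setup covers the case where the individual specific effects are allowed to be correlated with the explanatory variables. This is an important consideration in the analysis of micro panels, as noted, for example, by Chamberlain (1980).

The estimation of $\beta$ under a probit specification only makes use of $y^{(r)}_{it} = I(y^{*(r)}_{it})$, and under the Tobit specification of $y^{(r)}_{it} = y^{*(r)}_{it} I(y^{*(r)}_{it})$. Hence, without loss of generality the variance of the error term may be set equal to unity. To allow for correlation across the errors of different cross section units we adopt the following standardized one-factor structure,
\[ \varepsilon^{(r)}_{it} = \frac{\gamma^{(r)}_i f^{(r)}_t + e^{(r)}_{it}}{\sqrt{1 + \gamma^{(r)2}_i}}, \]
where $\gamma^{(r)}_i$ is a scalar, $f^{(r)}_t \sim iid\, N(0, 1)$, and $e^{(r)}_{it} \sim iid\, N(0, 1)$. Under these assumptions we have $E(\varepsilon^{(r)}_{it}) = 0$ and $Var(\varepsilon^{(r)}_{it}) = 1$. The pair-wise correlation coefficient of the errors is given by
\[ Corr\!\left( \varepsilon^{(r)}_{it}, \varepsilon^{(r)}_{jt} \right) = \frac{\gamma^{(r)}_i \gamma^{(r)}_j}{\sqrt{\left( 1 + \gamma^{(r)2}_i \right)\left( 1 + \gamma^{(r)2}_j \right)}}. \]
In the experiments reported below we use $\gamma^{(r)}_i = 0$ for all $i$, $\gamma^{(r)}_i \sim iid\, U(0.1, 0.3)$, and $\gamma^{(r)}_i \sim iid\, U(-0.2, 0.6)$, where $U(a, b)$ denotes the uniform distribution with lower bound $a$ and upper bound $b$.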
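The standardized one-factor error structure can be simulated directly. The sketch below is a minimal illustration, not the code used for the experiments: the function names are hypothetical, only the error process and the two link functions are covered (the regressor and fixed-effect design is omitted), and Python's `random` module stands in for whatever generator the authors used.

```python
import math
import random


def one_factor_errors(gammas, n_periods, rng):
    """Draw eps[i][t] = (gamma_i * f_t + e_it) / sqrt(1 + gamma_i**2)."""
    factor = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]  # common factor f_t
    errors = []
    for g in gammas:
        scale = math.sqrt(1.0 + g * g)  # standardization giving unit variance
        errors.append([(g * f_t + rng.gauss(0.0, 1.0)) / scale for f_t in factor])
    return errors


def implied_correlation(g_i, g_j):
    """Population correlation of eps_it and eps_jt implied by the loadings."""
    return g_i * g_j / math.sqrt((1.0 + g_i ** 2) * (1.0 + g_j ** 2))


def observe(y_star, model):
    """Apply the link function: probit keeps the indicator, Tobit censors at zero."""
    if model == "probit":
        return [1 if v > 0 else 0 for v in y_star]
    if model == "tobit":
        return [v if v > 0 else 0.0 for v in y_star]
    raise ValueError("model must be 'probit' or 'tobit'")


if __name__ == "__main__":
    rng = random.Random(0)
    gammas = [rng.uniform(0.1, 0.3) for _ in range(5)]
    eps = one_factor_errors(gammas, 20, rng)
    print(implied_correlation(gammas[0], gammas[1]))
    print(observe(eps[0], "probit"))
```

With loadings drawn from $U(0.1, 0.3)$ the implied pair-wise correlations are at most about 0.08, so the alternatives considered are fairly mild departures from independence.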
Using the artificial data, $\beta_i$ (and $\sigma_i$ in the case of the Tobit model) are estimated under the assumption of cross section independence by maximum likelihood for each $i$ separately. Then, $\tilde\rho_{ij}$ is computed using the two alternative residuals, $\tilde u_{it}$ and $\tilde u^g_{it}$, and the CD and LM test statistics are calculated with the mean corrections given in (18).

4.2 Results

Table 1 presents the size and power of the CD and LM tests for the probit model, and Table 2 presents the size and power of the CD and LM tests for the Tobit model. The results in these tables suggest the following. (i) There are substantial size distortions for the LM test even when $N$ or $T$ or both are large. (ii) The empirical size is close to the nominal size for the CD test even for $N$ and $T$ as small as 10. This result holds generally and does not require the fixed effects and the regressors to be uncorrelated. (iii) The power of the CD test improves as either $N$ or $T$ increases. However, the power improves much faster when $N$ increases than when $T$ increases. When $T = 20$ and $N = 100$, the power is about 0.9. On the other hand, when $N = 20$ and $T = 100$, the power of the CD test is about 0.6-0.7. (iv) The test results are robust to the way residuals from the nonlinear models are computed. (v) Even when the LM test has the correct size, as in the case where $T = 100$ and $N = 10$, the CD test continues to exhibit a higher power.

5 Application to an analysis of campaign contributions and roll-call votes

We illustrate the use of the CD test by re-analyzing the data on voting and campaign contributions of Wawro (2001). Using a panel probit model, Wawro analyzes the influence of campaign contributions of a business lobby group, the US Chamber of Commerce (USCC), and a labor lobby group, the American Federation of Labor-Congress of Industrial Organizations (AFL-CIO), on the voting behaviour of members of the US Congress, with the unemployment rate in the constituency of each member of Congress as an additional explanatory variable. The data set, which is available from Prof.
Wawro's web page (http://www.columbia.edu/~gjw10/panelprobit.html), contains data for a selection of the roll-call votes for each session of the 102nd, 103rd and 104th Congress, where the selected roll-call votes are those that
Table 1: Size and power of CD and LM tests: The probit model

         Standardized residuals, $\tilde u_{it}$       Generalized residuals, $\tilde u^g_{it}$
T\N      10     20     30     50     100        10     20     30     50     100

$\gamma^{(r)}_i = 0$ for all $i$
CD test
10       0.064  0.059  0.077  0.064  0.068      0.066  0.063  0.078  0.057  0.054
20       0.057  0.056  0.059  0.058  0.075      0.056  0.055  0.051  0.059  0.072
30       0.054  0.052  0.063  0.048  0.049      0.054  0.053  0.062  0.047  0.051
50       0.050  0.045  0.061  0.059  0.061      0.048  0.045  0.060  0.056  0.062
100      0.046  0.051  0.060  0.067  0.062      0.048  0.057  0.060  0.066  0.061
LM test
10       0.197  0.459  0.698  0.974  1.000      0.192  0.456  0.712  0.975  1.000
20       0.085  0.203  0.356  0.656  0.985      0.093  0.224  0.419  0.734  0.996
30       0.077  0.132  0.235  0.394  0.858      0.074  0.160  0.290  0.527  0.940
50       0.070  0.087  0.099  0.196  0.491      0.076  0.113  0.152  0.346  0.784
100      0.060  0.083  0.067  0.083  0.195      0.063  0.091  0.095  0.186  0.534

$\gamma^{(r)}_i \sim U(0.1, 0.3)$
CD test
10       0.097  0.153  0.235  0.381  0.702      0.098  0.141  0.231  0.365  0.674
20       0.128  0.220  0.361  0.596  0.903      0.136  0.235  0.371  0.596  0.897
30       0.140  0.281  0.473  0.736  0.973      0.149  0.295  0.487  0.738  0.978
50       0.190  0.404  0.639  0.908  0.998      0.188  0.424  0.657  0.909  0.999
100      0.246  0.634  0.891  0.993  1.000      0.266  0.660  0.898  0.994  1.000
LM test
10       0.193  0.475  0.728  0.971  1.000      0.186  0.485  0.733  0.973  1.000
20       0.126  0.244  0.383  0.759  0.991      0.130  0.264  0.429  0.821  0.995
30       0.104  0.175  0.261  0.559  0.943      0.105  0.191  0.311  0.652  0.974
50       0.097  0.142  0.240  0.437  0.868      0.092  0.154  0.283  0.548  0.952
100      0.095  0.145  0.238  0.440  0.880      0.102  0.158  0.290  0.548  0.957

$\gamma^{(r)}_i \sim U(-0.2, 0.6)$
CD test
10       0.104  0.126  0.206  0.347  0.642      0.101  0.129  0.200  0.329  0.615
20       0.124  0.195  0.298  0.545  0.838      0.132  0.203  0.295  0.543  0.836
30       0.136  0.270  0.394  0.661  0.929      0.141  0.280  0.396  0.667  0.934
50       0.170  0.358  0.550  0.808  0.992      0.181  0.366  0.561  0.818  0.993
100      0.243  0.501  0.766  0.938  1.000      0.251  0.502  0.773  0.941  1.000
LM test
10       0.206  0.450  0.730  0.974  1.000      0.208  0.465  0.737  0.974  1.000
20       0.147  0.286  0.466  0.822  0.996      0.145  0.306  0.518  0.871  0.996
30       0.114  0.222  0.402  0.681  0.988      0.114  0.268  0.457  0.774  0.996
50       0.136  0.252  0.403  0.681  0.975      0.140  0.281  0.481  0.786  0.996
100      0.176  0.373  0.613  0.854  0.999      0.198  0.427  0.687  0.919  1.000

The table reports the rejection frequency of the null of no cross-section dependence for the CD statistic (7) and the LM statistic (6) from the standardized and the generalized residuals at the 5% significance level for 1000 repetitions of the experiment.
Table 2: Size and power of CD and LM tests: The Tobit model

         Standardized residuals, $\tilde u_{it}$       Generalized residuals, $\tilde u^g_{it}$
T\N      10     20     30     50     100        10     20     30     50     100

$\gamma^{(r)}_i = 0$ for all $i$
CD test
10       0.063  0.080  0.060  0.066  0.072      0.062  0.048  0.068  0.054  0.057
20       0.059  0.055  0.047  0.055  0.066      0.048  0.048  0.070  0.053  0.056
30       0.069  0.066  0.063  0.067  0.059      0.049  0.064  0.059  0.055  0.047
50       0.048  0.052  0.052  0.043  0.064      0.050  0.055  0.045  0.057  0.060
100      0.038  0.046  0.062  0.060  0.053      0.045  0.065  0.048  0.038  0.046
LM test
10       0.184  0.461  0.737  0.984  1.000      0.201  0.498  0.765  0.982  1.000
20       0.122  0.253  0.406  0.722  0.985      0.137  0.288  0.501  0.822  0.996
30       0.114  0.160  0.260  0.467  0.893      0.102  0.214  0.374  0.668  0.983
50       0.069  0.129  0.153  0.245  0.589      0.080  0.126  0.246  0.537  0.941
100      0.067  0.078  0.092  0.134  0.255      0.084  0.070  0.182  0.362  0.882

$\gamma^{(r)}_i \sim U(0.1, 0.3)$
CD test
10       0.144  0.267  0.388  0.600  0.881      0.133  0.261  0.391  0.613  0.884
20       0.173  0.379  0.545  0.786  0.971      0.200  0.371  0.588  0.808  0.978
30       0.174  0.474  0.674  0.892  0.999      0.221  0.497  0.710  0.924  0.996
50       0.261  0.620  0.846  0.984  1.000      0.301  0.682  0.883  0.990  1.000
100      0.402  0.841  0.983  1.000  1.000      0.481  0.894  0.989  1.000  1.000
LM test
10       0.225  0.542  0.784  0.986  1.000      0.254  0.550  0.783  0.994  1.000
20       0.193  0.368  0.596  0.871  0.999      0.173  0.391  0.591  0.894  1.000
30       0.132  0.316  0.467  0.788  0.991      0.147  0.327  0.521  0.845  0.996
50       0.149  0.298  0.492  0.761  0.985      0.138  0.316  0.525  0.851  0.997
100      0.189  0.367  0.557  0.848  1.000      0.195  0.403  0.653  0.931  1.000

$\gamma^{(r)}_i \sim U(-0.2, 0.6)$
CD test
10       0.144  0.243  0.356  0.574  0.821      0.143  0.264  0.360  0.572  0.845
20       0.151  0.326  0.448  0.725  0.945      0.180  0.351  0.527  0.745  0.964
30       0.178  0.397  0.569  0.821  0.988      0.205  0.431  0.628  0.865  0.995
50       0.264  0.511  0.707  0.975  0.999      0.285  0.557  0.765  0.935  0.999
100      0.361  0.693  0.874  0.989  1.000      0.429  0.730  0.907  0.992  1.000
LM test
10       0.228  0.581  0.829  0.991  1.000      0.230  0.559  0.838  0.989  1.000
20       0.183  0.443  0.675  0.925  1.000      0.209  0.474  0.714  0.954  0.999
30       0.216  0.410  0.648  0.914  1.000      0.224  0.464  0.746  0.956  1.000
50       0.267  0.498  0.708  0.913  1.000      0.274  0.572  0.820  0.979  1.000
100      0.345  0.685  0.878  0.985  1.000      0.438  0.822  0.964  0.999  1.000

The specification of the observed dependent variable is given in (5). Otherwise see the footnote of Table 1.
the lobby groups themselves deemed important. We test for cross section independence in the sessions of the 104th Congress for the roll-call votes suggested by the USCC, which has the largest number of votes of the data sets. We exclude members of Congress that do not change their vote more than three times, as the estimation of the regression equation for each individual separately would otherwise be computationally unstable. This leads to two data sets with $M = 19$ each, and $N = 139$ for the first session and $N = 145$ for the second session, where $M$ is the number of motions that are put before Congress and are recorded in Prof. Wawro's data set. As the timing of the votes is not obvious, we refer to the different motions for which the roll-call votes were recorded as $m = 1, 2, \ldots, M$ instead of $t$.

Wawro (2001, p. 570) includes a motion-specific intercept to account for the particular political context around the roll-call votes. We do not include such a motion-specific intercept. However, if the political context around a motion influences the voting behaviour, then omitting a motion-specific intercept is tantamount to introducing cross-correlation among the residuals. The CD test can therefore be interpreted as a test for the necessity to include a motion-specific intercept.

We proceed as follows. We estimate the parameters of the probit model by maximizing the likelihood
\[ L_i = \prod_{m=1}^{M} \Phi(\beta_i' x_{im})^{y_{im}} \left( 1 - \Phi(\beta_i' x_{im}) \right)^{(1 - y_{im})}, \quad \text{for } i = 1, 2, \ldots, N, \tag{20} \]
where $y_{im}$ is a binary indicator for the votes of the member of Congress ("aye" or "nay"), and $x_{im}$ contains an intercept, the contributions of the USCC, the contributions of the AFL-CIO, and the unemployment rate of the constituency of the member of Congress. Note that this model is more general than the random effects model estimated by Wawro (2001) as we also allow the slope parameters to vary between individuals. Using the parameter vector $\tilde\beta_i$, we calculate the conditional generalized residual as given in equation (16) and the unconditional residual of equation (12). From these we obtain the pairwise correlation coefficients.
The average correlation coefficients for the conditional residuals are 0.283 for the first session and 0.310 for the second session. For the unconditional residuals, they are 0.286 and 0.310. Using the pairwise correlation coefficients we can calculate the CD test statistic. However, both panels are unbalanced as some observations are missing and, therefore, the statistic has to be adjusted to
\[ CD = \sqrt{\frac{2}{N(N-1)}} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \sqrt{M_{ij}}\; \tilde\rho_{ij}, \tag{21} \]
where $M_{ij}$ is the number of motions where observations on votes are available for both $i$ and $j$.
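A minimal sketch of the unbalanced statistic (21) follows. Several details are assumptions of this illustration rather than statements about the authors' computations: missing votes are marked with `None`, each correlation (and its mean correction) is computed over the motions observed for both members, and pairs with fewer than two common motions or with a constant residual series are skipped.

```python
import math


def cd_unbalanced(resid):
    """CD statistic (21) for residual series that may contain None for missing votes."""
    n_units = len(resid)
    total = 0.0
    for i in range(n_units - 1):
        for j in range(i + 1, n_units):
            # Keep only motions observed for both unit i and unit j.
            pairs = [(a, b) for a, b in zip(resid[i], resid[j])
                     if a is not None and b is not None]
            m_ij = len(pairs)
            if m_ij < 2:
                continue  # assumption: correlation undefined, pair skipped
            mean_a = sum(a for a, _ in pairs) / m_ij
            mean_b = sum(b for _, b in pairs) / m_ij
            s_ab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
            s_aa = sum((a - mean_a) ** 2 for a, _ in pairs)
            s_bb = sum((b - mean_b) ** 2 for _, b in pairs)
            if s_aa == 0.0 or s_bb == 0.0:
                continue  # assumption: constant series, pair skipped
            total += math.sqrt(m_ij) * s_ab / math.sqrt(s_aa * s_bb)
    return math.sqrt(2.0 / (n_units * (n_units - 1))) * total


if __name__ == "__main__":
    demo = [[1.0, 2.0, 3.0, None], [2.0, 4.0, 6.0, 8.0], [None, 3.0, 2.0, 1.0]]
    print(cd_unbalanced(demo))
```

When no observations are missing, $M_{ij} = T$ for every pair and the expression reduces to the balanced statistic (7).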
Table 3: CD test for roll-call votes in the 104th Congress

                      $\tilde u^g_{it}$   Bootstrap 5% crit. val.    $\tilde u_{it}$   Bootstrap 5% crit. val.
All motions
  1st session         151.488    [-1.670  2.545]            151.470    [-1.677  2.539]
  2nd session         169.300    [-1.655  2.392]            169.310    [-1.658  2.393]
Subset of motions
  1st session         104.488    [-1.727  2.092]            104.496    [-1.726  2.092]
  2nd session         118.742    [-1.733  2.078]            118.839    [-1.726  2.078]

The results for the subset of motions only use motions with less than 90% unanimity, which excludes two motions of each session of Congress. The bootstrap critical values were computed using 1000 iterations. The first number gives the 2.5% lower critical value and the second number the 2.5% upper critical value.

While the Monte Carlo results suggest that the CD test has the correct size for all combinations of $N$ and $T$, it might be worthwhile to double check the results for possible departures from the asymptotic test results. Hence, we also calculate critical values of the CD test using a bootstrap procedure, the details of which are given in Appendix C.

The upper half of Table 3 reports the results for the CD test for the entire sample. The test statistics are very large and clearly reject the null of cross section independence. The values in brackets are the 5% critical bootstrap values. The bootstrap test results are generally in line with the asymptotic test results, and confirm the existence of statistically significant evidence of cross section error correlations in Wawro's application.

In order to address the question of whether our test results are driven by a few motions with near unanimity of the votes, we eliminated the motions where more than 90% of the votes are in agreement. This reduces the number of motions being considered, $M$, from 19 to 17 for each of the two sessions. We apply the CD tests to this subset and report the results in the lower half of Table 3. While the test statistics are reduced in size, they remain statistically highly significant. The CD test therefore still rejects the null hypothesis of cross section error independence in this empirical application.
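The model-based bootstrap used for the critical values can be sketched as follows. To keep the example short and runnable it makes a strong simplifying assumption that is not in the paper: each unit follows an intercept-only probit, for which the maximum likelihood estimate has the closed form $\Phi(\hat\beta_i) = \bar y_i$. All function names, the clipping constant and the convention of treating the correlation with a constant series as zero are likewise choices made for this illustration only.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def corr(u, v):
    """Mean-corrected correlation; returns 0.0 for a constant series (assumption)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    s_uu = sum((a - mu) ** 2 for a in u)
    s_vv = sum((b - mv) ** 2 for b in v)
    if s_uu == 0.0 or s_vv == 0.0:
        return 0.0
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / math.sqrt(s_uu * s_vv)


def cd_statistic(resid):
    n_units, n_periods = len(resid), len(resid[0])
    s = sum(corr(resid[i], resid[j])
            for i in range(n_units - 1) for j in range(i + 1, n_units))
    return math.sqrt(2.0 * n_periods / (n_units * (n_units - 1))) * s


def fit_intercept_probit(y):
    """ML estimate for Pr(y = 1) = Phi(beta): Phi(beta_hat) = mean(y), clipped off 0 and 1."""
    p = min(max(sum(y) / len(y), 1e-6), 1.0 - 1e-6)
    return _STD_NORMAL.inv_cdf(p)


def standardized_residuals(y, beta):
    p = _STD_NORMAL.cdf(beta)
    return [(y_t - p) / math.sqrt(p * (1.0 - p)) for y_t in y]


def bootstrap_cd(data, n_boot=200, seed=0):
    """Return (observed CD, lower 2.5% and upper 2.5% bootstrap critical values)."""
    rng = random.Random(seed)
    n_periods = len(data[0])
    betas = [fit_intercept_probit(y) for y in data]                 # estimate on observed data
    cd_obs = cd_statistic([standardized_residuals(y, b) for y, b in zip(data, betas)])
    draws = []
    for _ in range(n_boot):                                         # repeat B times
        # Independent N(0, 1) errors, latent variable, then the probit link.
        sample = [[1 if b + rng.gauss(0.0, 1.0) > 0 else 0 for _ in range(n_periods)]
                  for b in betas]
        betas_b = [fit_intercept_probit(y) for y in sample]         # re-estimate
        draws.append(cd_statistic(
            [standardized_residuals(y, b) for y, b in zip(sample, betas_b)]))
    draws.sort()                                                    # percentile critical values
    lower = draws[int(0.025 * n_boot)]
    upper = draws[int(0.975 * n_boot) - 1]
    return cd_obs, lower, upper


if __name__ == "__main__":
    rng = random.Random(1)
    votes = [[1 if rng.random() < 0.6 else 0 for _ in range(19)] for _ in range(10)]
    print(bootstrap_cd(votes, n_boot=200, seed=2))
```

Because the bootstrap errors are drawn independently across units, the resampled statistics approximate the finite sample distribution of the CD test under the null. An observed statistic outside the two critical values is then evidence against cross section independence.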
6 Conclusion

In this paper, we have generalized Pesaran's (2004) CD test for cross section independence to nonlinear models. Our Monte Carlo studies show that there are substantial size distortions of the Lagrangian multiplier type tests. On the other hand, CD tests perform well even in small $N$ and $T$ cases. The
empirical size of the CD test is close to the nominal size. The test also has good power, in particular when $N$ is large, even when $T$ is relatively small.

The CD test is simple to implement. As is well known in the panel data literature, when $T$ is small and $N$ is large, the presence of individual-specific effects introduces the classical incidental parameter problem (Neyman and Scott 1948). The estimation of structural parameters is often entangled with the estimation of incidental parameters. To obtain a consistent estimator of structural parameters, one often has to impose stringent conditions on the data, and the estimation becomes complicated (see e.g. Hsiao 2003). The problem can only become more unwieldy if there exists cross section dependence. A nice feature of Pesaran's CD test is that one can estimate model parameters under cross section independence, and the presence of individual-specific effects (possibly correlated with the regressors) does not affect the performance of the test because each cross sectional unit's parameters are estimated using that unit's time series observations alone.

In cases, such as the application in this paper, where cross section error independence is rejected, one may wish to investigate the nature of the dependence, possibly along the lines of Ng (2006). Also, the estimation of the structural parameters in the model will need to take the cross section dependence into account. While these two topics are beyond the scope of the current paper and are left for future research, this paper proposes a simple yet powerful test for the detection of cross section error dependence, which is the starting point for any such endeavor.
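Appendix A: Derivations of the residuals

The unconditional residual for the probit model
\begin{align*}
u_{it} &= y_{it} - E(y_{it}) \\
&= y_{it} - E(y_{it} \mid y_{it} = 1) \Pr(y_{it} = 1) - E(y_{it} \mid y_{it} = 0) \Pr(y_{it} = 0) \\
&= y_{it} - \Phi(\beta_i' x_{it}),
\end{align*}
where for notational convenience the fact that the moments are conditional on $x_{it}$ and the parameters is not stated explicitly. The variance is
\begin{align*}
Var(u_{it}) &= E(y_{it}^2) - \Phi(\beta_i' x_{it})^2 \\
&= E(y_{it}^2 \mid y_{it} = 1) \Pr(y_{it} = 1) + E(y_{it}^2 \mid y_{it} = 0) \Pr(y_{it} = 0) - \Phi(\beta_i' x_{it})^2 \\
&= \Phi(\beta_i' x_{it}) \left[ 1 - \Phi(\beta_i' x_{it}) \right].
\end{align*}

The generalized residual for the probit model
\begin{align*}
u^g_{it} &= E(u_{it} \mid y_{it}) \\
&= E(u_{it} \mid y_{it} = 1)\, y_{it} + E(u_{it} \mid y_{it} = 0)(1 - y_{it}) \\
&= E(u_{it} \mid u_{it} > -\beta_i' x_{it})\, y_{it} + E(u_{it} \mid u_{it} \le -\beta_i' x_{it})(1 - y_{it}) \\
&= \frac{\phi(\beta_i' x_{it})}{\Phi(\beta_i' x_{it})}\, y_{it} - \frac{\phi(\beta_i' x_{it})}{1 - \Phi(\beta_i' x_{it})}\, (1 - y_{it}) \\
&= \frac{\phi(\beta_i' x_{it})}{\Phi(\beta_i' x_{it}) \left[ 1 - \Phi(\beta_i' x_{it}) \right]} \left[ y_{it} - \Phi(\beta_i' x_{it}) \right].
\end{align*}
The variance of the generalized residual is
\begin{align*}
Var(u^g_{it}) &= \frac{\phi(\beta_i' x_{it})^2}{\Phi(\beta_i' x_{it})^2 \left[ 1 - \Phi(\beta_i' x_{it}) \right]^2}\, E\!\left[ (y_{it} - \Phi(\beta_i' x_{it}))^2 \right] \\
&= \frac{\phi(\beta_i' x_{it})^2}{\Phi(\beta_i' x_{it}) \left[ 1 - \Phi(\beta_i' x_{it}) \right]}.
\end{align*}

The unconditional residual for the Tobit model
\begin{align*}
u_{it} &= y_{it} - E(y_{it}) \\
&= y_{it} - E(y_{it} \mid y_{it} > 0) \Pr(y_{it} > 0) - E(y_{it} \mid y_{it} = 0) \Pr(y_{it} = 0) \\
&= y_{it} - E(\beta_i' x_{it} + u_{it} \mid u_{it} > -\beta_i' x_{it})\, \Phi(\beta_i' x_{it} / \sigma_i) \\
&= y_{it} - (\beta_i' x_{it} + \sigma_i \lambda_{it})\, \Phi(\beta_i' x_{it} / \sigma_i),
\end{align*}
where $\lambda_{it} = \phi(\beta_i' x_{it} / \sigma_i) / \Phi(\beta_i' x_{it} / \sigma_i)$ is the inverse Mills ratio with argument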
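$\beta_i' x_{it} / \sigma_i$. The variance is
\begin{align*}
Var(u_{it}) &= E(y_{it}^2) - \left[ (\beta_i' x_{it} + \sigma_i \lambda_{it})\, \Phi(\beta_i' x_{it} / \sigma_i) \right]^2 \\
&= E(y_{it}^2 \mid u_{it} > -\beta_i' x_{it}) \Pr(y_{it} > 0) - \left[ (\beta_i' x_{it} + \sigma_i \lambda_{it})\, \Phi(\beta_i' x_{it} / \sigma_i) \right]^2 \\
&= \sigma_i^2 \left\{ E\!\left( (\alpha_i' x_{it} + v_{it})^2 \mid v_{it} > -\alpha_i' x_{it} \right) \Phi(\alpha_i' x_{it}) - \left[ (\alpha_i' x_{it} + \lambda_{it})\, \Phi(\alpha_i' x_{it}) \right]^2 \right\} \\
&= \sigma_i^2 \left\{ \left[ (\alpha_i' x_{it})^2 + \alpha_i' x_{it} \lambda_{it} + 1 \right] \Phi(\alpha_i' x_{it}) - \left[ (\alpha_i' x_{it} + \lambda_{it})\, \Phi(\alpha_i' x_{it}) \right]^2 \right\}, \tag{22}
\end{align*}
where $\alpha_i = \beta_i / \sigma_i$, $v_{it} = u_{it} / \sigma_i \sim N(0, 1)$, and we used the fact that
\begin{align*}
E(v_{it}^2 \mid v_{it} > -\alpha_i' x_{it}) &= Var(v_{it} \mid v_{it} > -\alpha_i' x_{it}) + E(v_{it} \mid v_{it} > -\alpha_i' x_{it})^2 \\
&= \left[ 1 - \lambda_{it} (\lambda_{it} + \alpha_i' x_{it}) \right] + \lambda_{it}^2 \\
&= 1 - \lambda_{it}\, \alpha_i' x_{it}.
\end{align*}
Substituting $\alpha_i$ into (22) gives the variance in (13).

The generalized residual for the Tobit model
\begin{align*}
u^g_{it} &= E(u_{it} \mid y_{it}) \\
&= E(u_{it} \mid y_{it} > 0)\, I(y_{it}) + E(u_{it} \mid y_{it} = 0) \left[ 1 - I(y_{it}) \right] \\
&= (y_{it} - \beta_i' x_{it})\, I(y_{it}) + E(u_{it} \mid u_{it} < -\beta_i' x_{it}) \left[ 1 - I(y_{it}) \right] \\
&= (y_{it} - \beta_i' x_{it})\, I(y_{it}) - \sigma_i \frac{\phi(\beta_i' x_{it} / \sigma_i)}{\Phi(-\beta_i' x_{it} / \sigma_i)} \left[ 1 - I(y_{it}) \right].
\end{align*}
For the variance we have
\begin{align*}
Var(u^g_{it}) &= E(u_{it}^2 \mid u_{it} > -\beta_i' x_{it})\, \Phi(\beta_i' x_{it} / \sigma_i) + \sigma_i^2 \frac{\phi(\beta_i' x_{it} / \sigma_i)^2}{\Phi(-\beta_i' x_{it} / \sigma_i)^2} \left[ 1 - \Phi(\beta_i' x_{it} / \sigma_i) \right] \\
&= \sigma_i^2 \left[ (1 - \lambda_{it}\, \beta_i' x_{it} / \sigma_i)\, \Phi(\beta_i' x_{it} / \sigma_i) + \frac{\phi(\beta_i' x_{it} / \sigma_i)^2}{\Phi(-\beta_i' x_{it} / \sigma_i)} \right].
\end{align*}

Appendix B: Limiting distribution of CD test for nonlinear models

In this appendix we show that the limiting distribution of Pesaran's (2004) CD test holds for nonlinear panel data models. We consider a nonlinear model of the form $y_{it} = f(x_{it}, \theta_i) + u_{it}$, where $\theta_i$ is a $p \times 1$ vector of unknown parameters for cross section unit $i$. We denote the true value of $\theta_i$ by $\theta_{i0}$, and make the following assumptions.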
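A1: For each $i$, the disturbances, $u_{it}$, are serially independent with zero means and variances, $\sigma_i^2$, such that $0 < \sigma_i^2 < \infty$.

A2: Under the null hypothesis defined by $H_0: u_{it} = \sigma_i \epsilon_{it}$, with $\epsilon_{it} \sim iid(0, 1)$ for all $i$ and $t$, the disturbances, $\epsilon_{it}$, are symmetrically distributed around 0.

A3: The $k \times 1$ explanatory variables, $x_{it}$, are strictly exogenous such that $E(u_{it} \mid x_i) = 0$ for all $i$ and $t$, where $x_i = (x_{i1}, \ldots, x_{iT})'$, such that
\[ \frac{1}{T} \sum_{t=1}^{T} u_{it} \left. \frac{\partial f_{jt}}{\partial \theta_j} \right|_{\theta_{j0}} \xrightarrow{p} 0, \quad \text{for all } i, j, \text{ and } t. \]

A4: Let $\Theta_i$ be an open neighborhood of $\theta_{i0}$ and $f_{it} = f(x_{it}, \theta_i)$; $f_{it}$ is continuous in $\theta_i \in \Theta_i$ uniformly in $t$.

A5: $\partial f_{it} / \partial \theta_i$ exists and is continuous on $\Theta_i$, and
\[ \frac{1}{T} \sum_{t=1}^{T} \frac{\partial f_{it}}{\partial \theta_i} \frac{\partial f_{it}}{\partial \theta_i'} \xrightarrow{p} \Omega_i \quad \text{and} \quad \frac{1}{T} \sum_{t=1}^{T} \frac{\partial f_{it}}{\partial \theta_i} \frac{\partial f_{jt}}{\partial \theta_j'} \xrightarrow{p} \Omega_{ij}, \]
where $\Omega_i$ and $\Omega_{ij}$ are finite, non-stochastic matrices, and convergence is uniform for all $\theta_i \in \Theta_i$.

A6: $\partial^2 f_{it} / \partial \theta_i \partial \theta_i'$ is continuous in $\theta_i \in \Theta_i$ uniformly in $t$, and $T^{-1} \sum_{t=1}^{T} \partial^2 f_{it} / \partial \theta_i \partial \theta_i'$ converges to a finite nonsingular matrix.

A7: $\sqrt{T} (\hat\theta_i - \theta_{i0}) \overset{a}{\sim} N(0, \Sigma_i)$, where $\Sigma_i$ is a positive definite matrix.

Let $L_i$ denote the log-likelihood function of the $i$-th cross section unit with the joint pdf $l_i = l(u_{i1}, \ldots, u_{iT}) = \prod_{t=1}^{T} l(u_{it})$, and denote by $\hat\theta_i$ the maximum likelihood estimator of $\theta_i$,
\[ \left. \frac{\partial L_i}{\partial \theta_i} \right|_{\hat\theta_i} = 0. \]
Then as $T \to \infty$, $\hat\theta_i$ is consistent and asymptotically normally distributed with
\[ (\hat\theta_i - \theta_{i0}) = - \left( \left. \frac{\partial^2 L_i}{\partial \theta_i \partial \theta_i'} \right|_{\theta_{i0}} \right)^{-1} \left( \left. \frac{\partial L_i}{\partial \theta_i} \right|_{\theta_{i0}} \right) + O_p\!\left( T^{-1} \right). \]
The estimated residuals, $\hat u_{it}$, satisfy
\[ \hat u_{it} = y_{it} - f(x_{it}, \hat\theta_i) = y_{it} - f(x_{it}, \theta_{i0}) - \left. \frac{\partial f_{it}}{\partial \theta_i'} \right|_{\theta_{i0}} (\hat\theta_i - \theta_{i0}) + O_p\!\left( T^{-1} \right). \]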
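Hence,
\begin{align*}
\frac{1}{T} \sum_{t=1}^{T} \hat u_{it} \hat u_{jt} &= \frac{1}{T} \sum_{t=1}^{T} u_{it} u_{jt} - \frac{1}{T} \sum_{t=1}^{T} (\hat\theta_i - \theta_{i0})' \left. \frac{\partial f_{it}}{\partial \theta_i} \right|_{\theta_{i0}} u_{jt} - \frac{1}{T} \sum_{t=1}^{T} u_{it} \left. \frac{\partial f_{jt}}{\partial \theta_j'} \right|_{\theta_{j0}} (\hat\theta_j - \theta_{j0}) \\
&\quad + \frac{1}{T} \sum_{t=1}^{T} (\hat\theta_i - \theta_{i0})' \left. \frac{\partial f_{it}}{\partial \theta_i} \right|_{\theta_{i0}} \left. \frac{\partial f_{jt}}{\partial \theta_j'} \right|_{\theta_{j0}} (\hat\theta_j - \theta_{j0}) + O_p\!\left( T^{-1} \right) \\
&= \frac{1}{T} \sum_{t=1}^{T} u_{it} u_{jt} - (\hat\theta_i - \theta_{i0})' \left( \frac{1}{T} \sum_{t=1}^{T} \left. \frac{\partial f_{it}}{\partial \theta_i} \right|_{\theta_{i0}} u_{jt} \right) - \left( \frac{1}{T} \sum_{t=1}^{T} u_{it} \left. \frac{\partial f_{jt}}{\partial \theta_j'} \right|_{\theta_{j0}} \right) (\hat\theta_j - \theta_{j0}) \\
&\quad + (\hat\theta_i - \theta_{i0})' \left( \frac{1}{T} \sum_{t=1}^{T} \left. \frac{\partial f_{it}}{\partial \theta_i} \right|_{\theta_{i0}} \left. \frac{\partial f_{jt}}{\partial \theta_j'} \right|_{\theta_{j0}} \right) (\hat\theta_j - \theta_{j0}) + O_p\!\left( T^{-1} \right).
\end{align*}
Hence, using assumptions A3, A5, and A6 it follows that
\[ \frac{1}{T} \sum_{t=1}^{T} \hat u_{it} \hat u_{jt} = \frac{1}{T} \sum_{t=1}^{T} u_{it} u_{jt} + O_p\!\left( T^{-1} \right). \]
Therefore, following the same argument as Pesaran (2004), we can show that the limiting distribution of the CD test continues to hold in the case of nonlinear panel data models as well.

Appendix C: Bootstrap procedure

A bootstrap approximation might be used to improve the finite sample approximation of the distribution of the CD test. The bootstrap procedure we suggest has previously been employed in different contexts in the literature. Härdle, Mammen and Proença (2001) use the bootstrap approximation to improve the size of the Horowitz-Härdle test for the specification of the link function, $g(\cdot)$, in equation (2). Dikta, Kvesic and Schmidt (2006) call the procedure a model based resampling scheme and use it to test for the functional form of the underlying regression model. For the test at hand the bootstrap procedure works as follows.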
1. Using the observed data $y_{it}$ and $x_{it}$, estimate the parameters of the model and obtain $\tilde{\theta}_i$ for each $i = 1, 2, \ldots, N$.
2. Sample $\hat{\varepsilon}_{it} \sim iid\, F(0, \sigma_i^2)$ for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, where $F(\cdot)$ is the distribution of the error term implied by the maintained model.
3. Construct $\hat{y}_{it}$ using the model $f(\hat{y}_{it}^{*}, x_{it}, \tilde{\theta}_i) = \hat{\varepsilon}_{it}$, and $\hat{y}_{it} = g(\hat{y}_{it}^{*})$.
4. Using $\hat{y}_{it}$ and $x_{it}$, estimate the parameters of the model and obtain $\hat{\tilde{\theta}}_i$ for each $i = 1, 2, \ldots, N$. Construct the CD test statistic using $x_{it}$ and $\hat{\tilde{\theta}}_i$.
5. Repeat steps 2–4 $B$ times.
6. The $B$ samples of the test statistic are then used to calculate the critical values against which the test statistic obtained from the data is evaluated. The critical values are, say, the 2.5% lowest and the 2.5% highest values in the sample of the $B$ bootstrap test statistics.

Given that nonlinear models are typically estimated via maximum likelihood, this bootstrap procedure entails considerable computational costs. Härdle, Mammen and Proença (2001) suggest setting the starting values in the estimation of $\hat{\tilde{\theta}}_i$ to $\tilde{\theta}_i$ and using only one iteration to obtain the estimates. In the application in Section 5, however, we let the maximization algorithm run to convergence.

References

Breusch, T. S. and Adrian R. Pagan (1980) The Lagrange multiplier test and its application to model specifications in Econometrics. Review of Economic Studies 47, 239–253
Chamberlain, Gary (1980) Analysis of covariance with qualitative data. Review of Economic Studies 47, 225–238
Chesher, Andrew, and Margaret Irish (1987) Numerical and graphical residual analysis in the grouped and censored normal linear model. Journal of Econometrics 34, 33–61
Dikta, Gerhard, Marsel Kvesic, and Christian Schmidt (2006) Bootstrap approximation in model checks with binary data. Journal of the American Statistical Association 101, 521–530
Frees, Edward W. (1995) Assessing cross sectional correlation in panel data. Journal of Econometrics 69, 393–414
Gourieroux, Christian, Alain Monfort, and Alain Trognon (1985) A general approach to serial correlation. Econometric Theory 1, 315–340
Gourieroux, Christian, Alain Monfort, Eric Renault, and Alain Trognon (1987) Generalised residuals. Journal of Econometrics 34, 5–32
Härdle, Wolfgang, Enno Mammen, and Isabel Proença (2001) A bootstrap test for single index models. Statistics 35, 427–452
Hsiao, Cheng (2003) Analysis of Panel Data, 2nd ed., Cambridge: Cambridge University Press
Kelejian, Harry H. and Ingmar R. Prucha (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review 40, 509–533
Kelejian, Harry H. and Ingmar R. Prucha (2001) On the asymptotic distribution of the Moran I test statistic with applications. Journal of Econometrics 104, 219–257
Lee, Lung-Fei (2002) Consistency and efficiency of least squares estimation for mixed regressive, spatial autoregressive models. Econometric Theory 18, 252–277
Moran, P. A. P. (1948) The interpretation of statistical maps. Biometrika 35, 255–260
Neyman, J. and Elizabeth R. Scott (1948) Consistent estimates based on partially consistent observations. Econometrica
Ng, Serena (2006) Testing cross section correlation in panel data using spacing. Journal of Business and Economic Statistics 24, 12–23
Pesaran, M. Hashem (2004) General diagnostic tests for cross section dependence in panels. Cambridge Working Paper in Economics 0435
Pesaran, M. Hashem, Aman Ullah and Takashi Yamagata (2006) A bias adjusted LM test of error cross section independence. Cambridge Working Paper in Economics 0641
Ullah, Aman (2004) Finite Sample Econometrics, New York: Oxford University Press
Wawro, Gregory (2001) A panel probit analysis of campaign contributions and roll-call votes. American Journal of Political Science 45, 563–579
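
Illustrative sketch of the bootstrap procedure

The six bootstrap steps of Appendix C can be sketched in code. The sketch below is not the implementation used in the paper; it rests on several simplifying assumptions. The maintained model is a toy intercept-only probit, $y_{it} = 1[\alpha_i + \varepsilon_{it} > 0]$ with $\varepsilon_{it} \sim N(0, 1)$, so that the maximum likelihood estimate has a closed form and no regressors $x_{it}$ appear. The residual is taken to be the deviation of $y_{it}$ from its fitted expected value. The CD statistic is written as $\sqrt{2T/(N(N-1))}\sum_{i<j}\hat{\rho}_{ij}$ with uncentred pairwise correlations, which is our reading of Pesaran (2004). All function names are hypothetical, and the clipping of the sample mean is an ad hoc guard rather than part of the procedure.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit error variance: the usual probit normalisation


def fit_probit_intercept(y):
    """MLE of alpha in the toy model y_t = 1[alpha + eps_t > 0] (assumed model)."""
    T = len(y)
    # Ad hoc guard: keep the sample mean inside (0, 1) so inv_cdf is defined.
    p = min(max(sum(y) / T, 0.5 / T), 1.0 - 0.5 / T)
    return STD_NORMAL.inv_cdf(p)


def residuals(y, alpha):
    """Deviation of y_t from its fitted expected value Phi(alpha)."""
    p = STD_NORMAL.cdf(alpha)
    return [yt - p for yt in y]


def cd_statistic(resid):
    """CD = sqrt(2T / (N(N-1))) * sum_{i<j} rho_ij, uncentred correlations."""
    N, T = len(resid), len(resid[0])
    norms = [math.sqrt(sum(u * u for u in row)) for row in resid]
    total = 0.0
    for i in range(N - 1):
        for j in range(i + 1, N):
            cross = sum(a * b for a, b in zip(resid[i], resid[j]))
            total += cross / (norms[i] * norms[j])
    return math.sqrt(2.0 * T / (N * (N - 1))) * total


def cd_from_data(Y):
    """Estimate each unit separately, then compute CD from the residuals."""
    return cd_statistic([residuals(y, fit_probit_intercept(y)) for y in Y])


def bootstrap_cd(Y, B=199, seed=0):
    """Return (CD on the data, lower critical value, upper critical value)."""
    rng = random.Random(seed)
    T = len(Y[0])
    theta = [fit_probit_intercept(y) for y in Y]       # step 1
    cd_obs = cd_from_data(Y)
    draws = []
    for b in range(B):                                 # step 5
        # Steps 2-3: draw errors from the maintained model, rebuild outcomes.
        Y_star = [[1 if a + rng.gauss(0.0, 1.0) > 0 else 0 for t in range(T)]
                  for a in theta]
        draws.append(cd_from_data(Y_star))             # step 4
    draws.sort()                                       # step 6
    lower = draws[int(math.floor(0.025 * B))]
    upper = draws[int(math.ceil(0.975 * B)) - 1]
    return cd_obs, lower, upper
```

Given an $N \times T$ list of binary outcomes `Y`, a call such as `cd_obs, lower, upper = bootstrap_cd(Y, B=199)` would reject cross section independence at roughly the 5% level when `cd_obs` falls outside `[lower, upper]`. Each bootstrap replication here re-estimates the model in full, in line with the choice made in Section 5; a one-step update in the spirit of Härdle, Mammen and Proença (2001) would require an iterative estimator that this closed-form toy model does not have.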