Multilevel Analysis (ver. 1.0)

Oscar Torres-Reyna
Data Consultant
otorres@princeton.edu
http://dss.princeton.edu/training/

Motivation

Use a multilevel model whenever your data is grouped (or nested) in more than one category (for example, states, countries, etc.).

Multilevel models allow you to:
- Study effects that vary by entity (or group)
- Estimate group-level averages

Some advantages:
- Regular regression ignores the average variation between entities.
- Individual regression may face sample problems and lack of generalization.

Variation between entities

    use http://dss.princeton.edu/training/schools.dta
    bysort school: egen y_mean=mean(y)
    twoway scatter y school, msize(tiny) || connected y_mean school, connect(l) clwidth(thick) clcolor(black) mcolor(black) msymbol(none) || , ytitle(y)

[Figure: scatter plot of y (Score) by school, with the school mean y_mean overlaid as a connected black line; x-axis: school, y-axis: y.]

Individual regressions (no-pooling approach)

    statsby inter=_b[_cons] slope=_b[x1], by(school) saving(ols, replace): regress y x1
    sort school
    merge school using ols
    drop _merge
    gen yhat_ols = inter + slope*x1
    sort school x1
    separate y, by(school)
    separate yhat_ols, by(school)
    twoway connected yhat_ols1-yhat_ols65 x1 || lfit y x1, clwidth(thick) clcolor(black) legend(off) ytitle(y)

[Figure: one fitted OLS line per school (yhat_ols1 to yhat_ols65) plus the overall linear fit as a thick black line; x-axis: Reading test, y-axis: y.]

Varying-intercept model (null)

$y_i = \alpha_{j[i]} + \varepsilon_i$

    . xtmixed y || school:, mle nolog

    Mixed-effects ML regression                     Number of obs      =      4059
    Group variable: school                          Number of groups   =        65
                                                    Obs per group: min =         2
                                                                   avg =      62.4
                                                                   max =       198
                                                    Wald chi2(0)       =         .
    Log likelihood = -14851.502                     Prob > chi2        =         .

               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |  -.1317104   .5362722    -0.25   0.806   -1.182784    .9193634

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    school: Identity            |
                      sd(_cons) |   4.106553   .3999163    3.392995    4.970174
                   sd(residual) |   9.207357   .1030214    9.007636    9.411505

    LR test vs. linear regression: chibar2(01) =   498.72 Prob >= chibar2 = 0.0000

Reading the output:
- _cons is the mean of the school-level intercepts.
- sd(_cons) is the standard deviation at the school level (level 2).
- sd(residual) is the standard deviation at the individual level (level 1).
- The LR test has Ho: random effects = 0.

$\text{Intraclass correlation} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = \frac{sd(\_cons)^2}{sd(\_cons)^2 + sd(residual)^2} = \frac{4.11^2}{4.11^2 + 9.21^2} = 0.17$

If the intraclass correlation (ICC) approaches 0, then grouping by school (or any other entity) is of no use and you may as well run a simple regression. If the ICC approaches 1, then there is no variance to explain at the individual level: everybody within a group is the same. An intraclass correlation tells you about the correlation of the observations (cases) within a cluster (http://www.ats.ucla.edu/stat/stata/library/cpsu.htm).
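As a quick arithmetic check, the intraclass correlation for the null model can be recomputed in Stata from the two standard deviations in the output. This is only a sketch: the scalar names sd_u, sd_e and icc are illustrative, and the values are typed in by hand from the table rather than read from stored results.

```stata
* Values copied by hand from the xtmixed output of the null model
scalar sd_u = 4.106553                    // sd(_cons): school level (level 2)
scalar sd_e = 9.207357                    // sd(residual): individual level (level 1)

* ICC = school-level variance / total variance
scalar icc = sd_u^2/(sd_u^2 + sd_e^2)
display "ICC = " %4.2f scalar(icc)
```

Rounded to two decimals this gives 0.17, so about 17% of the variance in y lies between schools.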

Varying-intercept model (one level-1 predictor)

$y_i = \alpha_{j[i]} + \beta x_i + \varepsilon_i$

    . xtmixed y x1 || school:, mle nolog

    Mixed-effects ML regression                     Number of obs      =      4059
    Group variable: school                          Number of groups   =        65
                                                    Obs per group: min =         2
                                                                   avg =      62.4
                                                                   max =       198
                                                    Wald chi2(1)       =   2042.57
    Log likelihood = -14024.799                     Prob > chi2        =    0.0000

               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .5633697   .0124654    45.19   0.000    .5389381    .5878014
           _cons |   .0238706   .4002258     0.06   0.952   -.7605576    .8082987

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    school: Identity            |
                      sd(_cons) |   3.035271   .3052516    2.492262    3.696592
                   sd(residual) |   7.521481   .0841759    7.358295    7.688285

    LR test vs. linear regression: chibar2(01) =   403.27 Prob >= chibar2 = 0.0000

Reading the output:
- _cons is the mean of the school-level intercepts.
- sd(_cons) is the standard deviation at the school level (level 2).
- sd(residual) is the standard deviation at the individual level (level 1).
- The LR test has Ho: random effects = 0.

$\text{Intraclass correlation} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = \frac{sd(\_cons)^2}{sd(\_cons)^2 + sd(residual)^2} = \frac{3.03^2}{3.03^2 + 7.52^2} = 0.14$

As in the null model, an intraclass correlation near 0 means the grouping is of no use, and a value near 1 means there is no variance left to explain at the individual level (http://www.ats.ucla.edu/stat/stata/library/cpsu.htm).
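The intraclass correlation can also be computed from the stored estimates instead of being read off the table. The sketch below rests on an assumption that is worth verifying with `matrix list e(b)` on your Stata version: that xtmixed stores the log standard deviations under the equation names lns1_1_1 (random intercept) and lnsig_e (residual). The scalar names are illustrative.

```stata
* Sketch: ICC from stored estimates (equation names are an assumption)
use http://dss.princeton.edu/training/schools.dta, clear
quietly xtmixed y x1 || school:, mle nolog

* Back-transform the log standard deviations kept in e(b)
scalar sd_u = exp(_b[lns1_1_1:_cons])     // school-level sd
scalar sd_e = exp(_b[lnsig_e:_cons])      // individual-level sd

scalar icc = sd_u^2/(sd_u^2 + sd_e^2)
display "ICC = " %5.3f scalar(icc)
```

If the equation names differ in your version, `matrix list e(b)` shows the names to substitute.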

Varying-intercept, varying-coefficient model

$y_i = \alpha_{j[i]} + \beta_{j[i]} x_i + \varepsilon_i$

    . xtmixed y x1 || school: x1, mle nolog covariance(unstructure)

    Mixed-effects ML regression                     Number of obs      =      4059
    Group variable: school                          Number of groups   =        65
                                                    Obs per group: min =         2
                                                                   avg =      62.4
                                                                   max =       198
                                                    Wald chi2(1)       =    779.80
    Log likelihood = -14004.613                     Prob > chi2        =    0.0000

               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .5567291   .0199367    27.92   0.000    .5176539    .5958043
           _cons |  -.1150841   .3978336    -0.29   0.772   -.8948236    .6646554

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    school: Unstructured        |
                         sd(x1) |   .1205631   .0189827    .0885508    .1641483
                      sd(_cons) |   3.007436   .3044138    2.466252    3.667375
                 corr(x1,_cons) |   .4975474   .1487416    .1572843    .7322131
                   sd(residual) |   7.440788   .0839482    7.278059    7.607157

    LR test vs. linear regression:       chi2(3) =   443.64   Prob > chi2 = 0.0000
    Note: LR test is conservative and provided only for reference.

Reading the output:
- _cons is the mean of the school-level intercepts.
- sd(x1) and sd(_cons) are the standard deviations at the school level (level 2).
- sd(residual) is the standard deviation at the individual level (level 1).
- The LR test has Ho: random effects = 0.

$\text{Intraclass correlation} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = \frac{sd(\_cons)^2 + sd(x1)^2}{sd(\_cons)^2 + sd(x1)^2 + sd(residual)^2} = \frac{3.01^2 + 0.12^2}{3.01^2 + 0.12^2 + 7.44^2} = 0.14$

Varying-slope model

$y_i = \alpha + \beta_{j[i]} x_i + \varepsilon_i$

    . xtmixed y x1 || _all: R.x1, mle nolog

    Mixed-effects ML regression                     Number of obs      =      4059
    Group variable: _all                            Number of groups   =         1
                                                    Obs per group: min =      4059
                                                                   avg =    4059.0
                                                                   max =      4059
                                                    Wald chi2(1)       =   2186.09
    Log likelihood = -14226.433                     Prob > chi2        =    0.0000

               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .5950551   .0127269    46.76   0.000    .5701108    .6199995
           _cons |   -.011948   .1263914    -0.09   0.925   -.2596706    .2357746

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    _all: Identity              |
                      sd(R.x1) |   .0003388   .1806391           0           .
                   sd(residual) |   8.052417    .089372    7.879142    8.229502

    LR test vs. linear regression: chibar2(01) =     0.00 Prob >= chibar2 = 1.0000

Reading the output:
- _cons is the mean intercept.
- sd(R.x1) is the standard deviation of the varying slope.
- sd(residual) is the standard deviation at the individual level (level 1).

Postestimation

Comparing models using the likelihood-ratio test

Use the likelihood-ratio test (lrtest) to compare models fitted by maximum likelihood. This test compares the log likelihoods (shown in the output) of two models and tests whether they are significantly different.

    /*Fitting random intercepts and storing results*/
    quietly xtmixed y x1 || school:, mle nolog
    estimates store ri

    /*Fitting random coefficients and storing results*/
    quietly xtmixed y x1 || school: x1, mle nolog covariance(unstructure)
    estimates store rc

    /*Running the likelihood-ratio test to compare*/
    lrtest ri rc

    . lrtest ri rc

    Likelihood-ratio test                                 LR chi2(2)  =     40.37
    (Assumption: ri nested in rc)                         Prob > chi2 =    0.0000

    Note: LR test is conservative

The null hypothesis is that there is no significant difference between the two models. If Prob > chi2 is below 0.05, you may reject the null and conclude that there is a statistically significant difference between the models. In the example above we reject the null and conclude that the random coefficients model provides a better fit: it has the higher log likelihood (-14004.613 versus -14024.799).
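The test statistic reported by lrtest is twice the difference between the two log likelihoods, which can be reproduced by hand. A minimal sketch, with the log likelihoods typed in from the two xtmixed outputs and illustrative scalar names:

```stata
* Log likelihoods copied by hand from the two xtmixed outputs
scalar ll_ri = -14024.799                 // random intercept model
scalar ll_rc = -14004.613                 // random coefficient model

* LR statistic and p-value; the models differ by 2 parameters:
* sd(x1) and corr(x1,_cons)
scalar lr = 2*(scalar(ll_rc) - scalar(ll_ri))
scalar p  = chi2tail(2, scalar(lr))
display "LR chi2(2) = " %6.2f scalar(lr) "   Prob > chi2 = " %6.4f scalar(p)
```

The statistic comes out at 40.37, matching the lrtest output. The note "LR test is conservative" still applies to this hand calculation.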

Varying-intercept, varying-coefficient model: postestimation

Adding the variance option reports variances and covariances instead of standard deviations and correlations.

    . xtmixed y x1 || school: x1, mle nolog covariance(unstructure) variance

    Mixed-effects ML regression                     Number of obs      =      4059
    Group variable: school                          Number of groups   =        65
                                                    Obs per group: min =         2
                                                                   avg =      62.4
                                                                   max =       198
                                                    Wald chi2(1)       =    779.80
    Log likelihood = -14004.613                     Prob > chi2        =    0.0000

               y |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .5567291   .0199367    27.92   0.000    .5176539    .5958043
           _cons |  -.1150841   .3978336    -0.29   0.772   -.8948236    .6646554

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    school: Unstructured        |
                        var(x1) |   .0145355   .0045772    .0078412    .0269446
                     var(_cons) |    9.04467    1.83101    6.082398    13.44964
                  cov(x1,_cons) |   .1804036   .0691515    .0448692    .3159382
                  var(residual) |   55.36533   1.249282    52.97014    57.86883

    LR test vs. linear regression:       chi2(3) =   443.64   Prob > chi2 = 0.0000
    Note: LR test is conservative and provided only for reference.

Reading the output:
- _cons is the mean of the school-level intercepts.
- var(x1) and var(_cons) are the variances at the school level (level 2).
- var(residual) is the variance at the individual level (level 1).

$\text{Intraclass correlation} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = \frac{var(\_cons) + var(x1)}{var(\_cons) + var(x1) + var(residual)} = \frac{9.045 + 0.014}{9.045 + 0.014 + 55.365} = 0.14$

Postestimation: variance-covariance matrix

    . xtmixed y x1 || school: x1, mle nolog covariance(unstructure) variance

      Random-effects Parameters |   Estimate   Std. Err.    [95% Conf. Interval]
    ----------------------------+-----------------------------------------------
    school: Unstructured        |
                        var(x1) |   .0145355   .0045772    .0078412    .0269446
                     var(_cons) |    9.04467    1.83101    6.082398    13.44964
                  cov(x1,_cons) |   .1804036   .0691515    .0448692    .3159382
                  var(residual) |   55.36533   1.249282    52.97014    57.86883

    LR test vs. linear regression:       chi2(3) =   443.64   Prob > chi2 = 0.0000
    Note: LR test is conservative and provided only for reference.

Variance-covariance matrix:

    . estat recovariance

    Random-effects covariance matrix for level school

                 |        x1      _cons
    -------------+----------------------
              x1 |  .0145355
           _cons |  .1804036    9.04467

Correlation matrix:

    . estat recovariance, correlation

    Random-effects correlation matrix for level school

                 |        x1      _cons
    -------------+----------------------
              x1 |         1
           _cons |  .4975474          1

The correlation of about 0.50 between the random intercept and the random slope on x1 shows a close relationship between the two: schools with a higher average y tend to have a steeper x1 slope.
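The correlation matrix is the covariance matrix rescaled by the two standard deviations, so the reported correlation can be checked by hand. A minimal sketch, with values typed in from the estat recovariance output and illustrative scalar names:

```stata
* Values copied by hand from the random-effects covariance matrix
scalar v_x1   = .0145355                  // var(x1)
scalar v_cons = 9.04467                   // var(_cons)
scalar c_xc   = .1804036                  // cov(x1,_cons)

* corr = cov / (sd_x1 * sd_cons)
scalar r = scalar(c_xc)/sqrt(scalar(v_x1)*scalar(v_cons))
display "corr(x1,_cons) = " %6.4f scalar(r)
```

The result agrees with the .4975474 reported by estat recovariance, correlation, up to rounding in the typed-in values.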

Postestimation: estimating random effects (group-level errors)

$y_i = \alpha_{j[i]} + \beta_{j[i]} x_i + \varepsilon_i$

$y_i = \underbrace{\alpha + \beta x_i}_{\text{fixed effects}} + \underbrace{u_{\alpha} + u_{\beta} x_i}_{\text{random effects}} + \varepsilon_i$

To estimate the random effects u, use the command predict with the option reffects. This gives you the best linear unbiased predictions (BLUPs) of the random effects, which show how far each group's intercept and slope(s) deviate from the overall estimates.

After running xtmixed, type:

    predict u*, reffects

Two new variables are created:

    u1    "BLUP r.e. for school: x1"       /* u_beta  */
    u2    "BLUP r.e. for school: _cons"    /* u_alpha */

Postestimation: estimating random effects (group-level errors)

Fixed effects only: $y = -0.12 + 0.56 x_1$

Fixed and random effects: $y = -0.12 + 0.56 x_1 + u_{\alpha} + u_{\beta} x_1$

To explore some results type:

    bysort school: generate groups=(_n==1)   /*_n==1 selects the first case of each group */
    list school u2 u1 if school<=10 & groups

    . list school u2 u1 if school<=10 & groups

          | school          u2          u1 |
       1. |      1    3.749336    .1249755 |
      74. |      2    4.702129    .1647261 |
     129. |      3     4.79768    .0808666 |
     181. |      4    .3502505    .1271821 |
     260. |      5    2.462805    .0720576 |
     295. |      6    5.183809    .0586242 |
     375. |      7    3.640942   -.1488697 |
     463. |      8    -.121886    .0068855 |
     565. |      9   -1.767982   -.0886194 |
     599. |     10   -3.139076   -.1360763 |

Here u2 and u1 are the group-level errors for the intercept and the slope respectively. For the first school the equation would be:

$y_1 = -0.12 + 0.56 x_1 + 3.75 + 0.12 x_1 = (-0.12 + 3.75) + (0.56 + 0.12) x_1 = 3.63 + 0.68 x_1$

Postestimation: estimating intercept/slope

$y_1 = -0.12 + 0.56 x_1 + 3.75 + 0.12 x_1 = (-0.12 + 3.75) + (0.56 + 0.12) x_1 = 3.63 + 0.68 x_1$

To estimate intercepts and slopes per school type:

    gen intercept = _b[_cons] + u2
    gen slope = _b[x1] + u1
    list school intercept slope if school<=10 & groups

Compare the result for school 1 with the coefficients above.

    . list school intercept slope if school<=10 & groups

          | school   intercept       slope |
       1. |      1    3.634251    .6817045 |
      74. |      2    4.587045    .7214552 |
     129. |      3    4.682596    .6375957 |
     181. |      4    .2351664    .6839111 |
     260. |      5    2.347721    .6287867 |
     295. |      6    5.068725    .6153533 |
     375. |      7    3.525858    .4078594 |
     463. |      8   -.2369701    .5636145 |
     565. |      9   -1.883067    .4681097 |
     599. |     10   -3.254161    .4206528 |

Postestimation: fitted values

Using intercept and slope you can estimate yhat, type:

    gen yhat = intercept + (slope*x1)

Or, after xtmixed type:

    predict yhat_fit, fitted

Both give the same values:

    . list school yhat yhat_fit if school<=10 & groups

          | school        yhat    yhat_fit |
       1. |      1   -12.42943   -12.42943 |
      74. |      2    -15.3951    -15.3951 |
     129. |      3   -7.179871   -7.179871 |
     181. |      4   -15.88052   -15.88052 |
     260. |      5   -5.193317   -5.193318 |
     295. |      6   -3.836668   -3.836667 |
     375. |      7   -6.084939   -6.084939 |
     463. |      8   -13.98353   -13.98353 |
     565. |      9   -15.62209   -15.62209 |
     599. |     10   -9.341847   -9.341847 |
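Rather than comparing the two columns by eye for ten schools, the agreement can be checked over all observations. The following is a sketch under the assumption that the schools.dta file is still available at the URL used earlier; the tolerance of 1e-5 is an arbitrary choice that allows for float rounding.

```stata
* Refit the varying-intercept, varying-coefficient model
use http://dss.princeton.edu/training/schools.dta, clear
quietly xtmixed y x1 || school: x1, mle nolog covariance(unstructure)

* BLUPs: u1 is the slope deviation, u2 the intercept deviation
predict u*, reffects

* Fitted values built by hand and by predict
gen yhat = (_b[_cons] + u2) + (_b[x1] + u1)*x1
predict yhat_fit, fitted

* Stops with an error if any observation disagrees beyond rounding
assert reldif(yhat, yhat_fit) < 1e-5
```

If the assert passes silently, the hand-built yhat and the fitted option agree for every observation.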

Postestimation: fitted values (graph)

You can plot individual regressions, type:

    twoway connected yhat_fit x1 if school<=10, connect(l)

[Figure: fitted lines for schools 1 to 10; x-axis: Reading test, y-axis: Fitted values: xb + Zu.]

Postestimation: residuals

After xtmixed you can get the residuals by typing:

    predict resid, residuals
    predict resid_std, rstandard   /* residuals/sd(residual) */

A quick check for normality in the residuals:

    qnorm resid_std

[Figure: normal quantile plot; x-axis: Inverse Normal, y-axis: Standardized residuals.]
