Econometrics
Final Exam

João Valle e Azevedo
Erica Marujo

June 15, 2010
Time for completion: 2h

Give your answers in the space provided. Use draft paper to plan your answers before writing them on the exam paper. Unless otherwise stated, use a 5% significance level.

Name:
Number:

Group I (9 points, 1.5 for each question)

Give a very concise answer to the following questions. Conciseness will be valued; avoid unnecessary details.

1. The acronym BLUE stands for what in Econometrics?

2. In one phrase, describe the meaning of contemporaneous exogeneity in a multiple linear regression (for time series data) context.

3. Explain why, in general, you would want to transform series that are not weakly dependent before using them in a multiple linear regression model for time series data. In which cases is the transformation not needed?

4. Write in matrix form the expression for the variance of the OLS estimator of the parameters of a multiple linear regression model for cross-sectional data, assuming the homoskedasticity assumption does not hold. (Assume the variance of the error term is of known form and that the necessary assumptions hold.)

5. Describe succinctly a very parsimonious (i.e., with few variables) test aimed at detecting heteroskedasticity in the error term of a multiple linear regression model. Assume all the necessary assumptions hold.

6. Write a model aimed at testing whether an extraordinary (more than 10%) growth in public investment in the current quarter has any effect on next quarter's GDP growth, controlling for other factors. Describe one limitation of your model.

Group II (10 points)

1. Consider the following output of the model which describes grade point averages (GPA) for college athletes:

Dependent Variable: CUMGPA
Method: Least Squares
Sample: 1 732
Included observations: 732

Variable     Coefficient   Std. Error   t-statistic   Prob.
C              -1.046971     0.483458     -2.165588   0.0307
SAT             0.000964     0.000205      4.691868   0.0000
HSPERC         -0.006849     0.001550     -4.419186   0.0000
TOTHRS          0.010177     0.000998     10.19751    0.0000
CRSGPA          0.727651     0.157487      4.620393   0.0000

R-squared            0.257227    Mean dependent var      2.080861
Adjusted R-squared   0.253140    S.D. dependent var      0.989617
S.E. of regression   0.855237    Akaike info criterion   2.531932
Sum squared resid    531.7503    Schwarz criterion       2.563324
Log likelihood      -921.6870    Hannan-Quinn criter.    2.544041
F-statistic          62.94110    Durbin-Watson stat      2.026332
Prob(F-statistic)    0.000000

where CUMGPA is cumulative grade point average (GPA prior to the current semester); SAT is SAT score, measured in points; HSPERC is graduating percentile in high school class; TOTHRS is total credit hours prior to the semester; and CRSGPA is a weighted average of overall GPA in courses taken by each student.

a) Interpret the coefficient estimates of the model. What can you say about the overall quality of this model? Be complete in your answer. (1 point)
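
As a quick consistency check on the full-sample output, the sketch below recomputes two reported quantities from the others: each t-statistic is the coefficient divided by its standard error, and the overall F-statistic follows from the R-squared, the number of slope coefficients (4) and the sample size (732). The variable names are illustrative, and small discrepancies are expected because the printed figures are rounded.

```python
# Figures copied from the full-sample CUMGPA output; all names are illustrative.
coefs = {
    "C":      (-1.046971, 0.483458),
    "SAT":    (0.000964, 0.000205),
    "HSPERC": (-0.006849, 0.001550),
    "TOTHRS": (0.010177, 0.000998),
    "CRSGPA": (0.727651, 0.157487),
}
n_obs = 732        # included observations
n_slopes = 4       # regressors excluding the constant
r_squared = 0.257227

# t-statistic for H0: coefficient = 0 is estimate / standard error.
t_stats = {name: b / se for name, (b, se) in coefs.items()}

# Overall significance F-statistic written in terms of R-squared.
f_stat = (r_squared / n_slopes) / ((1 - r_squared) / (n_obs - n_slopes - 1))

for name, t in t_stats.items():
    print(f"{name:<7} t = {t:8.3f}")
print(f"F = {f_stat:.2f}")
```

The recomputed values for SAT and HSPERC differ slightly from the printed ones because their coefficients and standard errors are shown with few significant digits.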

Now consider the following two outputs:

Dependent Variable: CUMGPA
Method: Least Squares
Sample: 1 732 IF FEMALE=1
Included observations: 180

Variable     Coefficient   Std. Error   t-statistic   Prob.
C              -1.226599     1.025719     -1.195843   0.2334
SAT             0.001835     0.000468      3.917441   0.0001
HSPERC         -0.007056     0.003954     -1.784380   0.0761
TOTHRS          0.013945     0.002260      6.170078   0.0000
CRSGPA          0.477304     0.326148      1.463459   0.1451

R-squared            0.375130    Mean dependent var      2.268611
Adjusted R-squared   0.360848    S.D. dependent var      1.126549
S.E. of regression   0.900643    Akaike info criterion   2.655968
Sum squared resid    141.9525    Schwarz criterion       2.744661
Log likelihood      -234.0371    Hannan-Quinn criter.    2.691929
F-statistic          26.26461    Durbin-Watson stat      2.240975
Prob(F-statistic)    0.000000

Dependent Variable: CUMGPA
Method: Least Squares
Sample: 1 732 IF FEMALE=0
Included observations: 552

Variable     Coefficient   Std. Error   t-statistic   Prob.
C              -0.826778     0.563308     -1.467719   0.1428
SAT             0.000661     0.000228      2.897074   0.0039
HSPERC         -0.006438     0.001725     -3.731255   0.0002
TOTHRS          0.008809     0.001121      7.860264   0.0000
CRSGPA          0.754003     0.185269      4.069772   0.0001

R-squared            0.210642    Mean dependent var      2.019638
Adjusted R-squared   0.204869    S.D. dependent var      0.933655
S.E. of regression   0.832541    Akaike info criterion   2.480348
Sum squared resid    379.1392    Schwarz criterion       2.519420
Log likelihood      -679.5761    Hannan-Quinn criter.    2.495615
F-statistic          36.49198    Durbin-Watson stat      2.009564
Prob(F-statistic)    0.000000

and consider the following model, where FEMALE is a dummy variable equal to 1 if female and zero otherwise.

b) Using the information from the outputs above, compute estimates for the coefficients of this model. (2 points)

c) Test if the expected cumulative GPA for men is statistically different from the expected cumulative GPA for women. Be precise in your answer and indicate all the necessary steps to perform that test. (2 points)
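
One way to compare the two groups using only the three outputs is a Chow-type F statistic built from the sums of squared residuals. The sketch below is an illustration under stated assumptions, not necessarily the intended answer: it assumes the unrestricted model lets the intercept and all four slopes differ by gender (so that its SSR is the sum of the SSRs of the female and male regressions) and that the error variance is the same in both groups. All variable names are hypothetical.

```python
# Sums of squared residuals copied from the three outputs above.
ssr_pooled = 531.7503   # all 732 athletes, no gender terms (restricted model)
ssr_female = 141.9525   # FEMALE=1 subsample, 180 observations
ssr_male = 379.1392     # FEMALE=0 subsample, 552 observations

n_obs = 732
k = 5                   # parameters per group: constant plus 4 slopes

# Assumption: a fully interacted model has SSR equal to the sum of group SSRs.
ssr_unrestricted = ssr_female + ssr_male

df_num = k              # number of restrictions (all gender differences are zero)
df_den = n_obs - 2 * k  # residual degrees of freedom of the unrestricted model

chow_f = ((ssr_pooled - ssr_unrestricted) / df_num) / (ssr_unrestricted / df_den)
print(f"Chow F({df_num}, {df_den}) = {chow_f:.4f}")
```

The statistic would then be compared with the 5% critical value of the F distribution with 5 and 722 degrees of freedom, taken from a table or from statistical software.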

2. Consider the following output of the model:

$$\log(invpc_t) = \beta_0 + \beta_1 \log(price_t) + \beta_2 \log(price_{t-1}) + \beta_3 \log(pop_t) + \beta_4 t + u_t$$

Dependent Variable: LINVPC
Method: Least Squares
Sample (adjusted): 2 42
Included observations: 41 after adjustments

Variable     Coefficient   Std. Error   t-statistic   Prob.
C               33.20187     12.86664      2.580462   0.0141
LPRICE          2.591738     0.924113      2.804567   0.0081
LPRICE_1       -4.835751     0.897678     -5.386956   0.0000
LPOP           -2.895495     1.086491     -2.664996   0.0115
T               0.054741     0.015734      3.479188   0.0013

R-squared            0.635898    Mean dependent var     -0.658846
Adjusted R-squared   0.595442    S.D. dependent var      0.167975
S.E. of regression   0.106840    Akaike info criterion  -1.521114
Sum squared resid    0.410935    Schwarz criterion      -1.312141
Log likelihood       36.18283    Hannan-Quinn criter.   -1.445017
F-statistic          15.71833    Durbin-Watson stat      1.123386
Prob(F-statistic)    0.000000

where invpc is real per capita housing investment (in thousands of dollars), price denotes a housing price index (equal to 1 in 1982), and pop denotes total population in the United States, in thousands. The data are annual observations in the United States for 1947 through 1988.

a) Interpret each one of the coefficient estimates of the model, $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$. Are they statistically significant? Justify. The included trend is of which type? (1 point)

b) Given the information in the output above, can you conclude whether the errors of this model suffer from any type of serial correlation? In order to draw this conclusion, which Gauss-Markov assumption needs to be satisfied? What are the consequences for your answer in a) if this problem is present? (2 points)

c) Suppose that you know for sure that:

$$\mathrm{Corr}(u_t, \log(price_t)) = \mathrm{Corr}(u_t, \log(price_{t-1})) = \mathrm{Corr}(u_t, \log(pop_t)) = \mathrm{Corr}(u_t, t) = 0$$

and that:

$$\mathrm{Corr}(u_t, \log(price_{t-2})) = 0.5$$

Does this change your conclusions from the previous question? Why? Are the OLS estimators unbiased and consistent in this case? Why? Is it possible to test for serial correlation in this case? How? (2 points)
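
For question b), a standard approximation links the Durbin-Watson statistic to the first-order autocorrelation of the residuals: DW is roughly 2(1 - rho), so a value near 2 points to no first-order serial correlation and a value well below 2 points to positive serial correlation. The sketch below applies this approximation to the statistic reported in the housing investment output. It is only a rough indication, and the variable names are illustrative; a formal conclusion requires comparing the statistic with the tabulated lower and upper bounds for 41 observations and 4 regressors.

```python
# Durbin-Watson statistic copied from the LINVPC output.
dw = 1.123386

# Approximation DW ~ 2 * (1 - rho_hat), solved for rho_hat.
rho_hat = 1 - dw / 2

print(f"Implied first-order autocorrelation of the residuals: {rho_hat:.3f}")
if rho_hat > 0:
    print("DW below 2: the residuals look positively autocorrelated.")
elif rho_hat < 0:
    print("DW above 2: the residuals look negatively autocorrelated.")
else:
    print("DW equal to 2: no sign of first-order autocorrelation.")
```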

Group III (1 point)

1. Give an example of a time series process with 2 observations (you can consider more if you want) that is covariance-stationary and at the same time nonstationary.