Econometrics I: Econometric Methods

Jürgen Meinecke
Research School of Economics, Australian National University
24 May 2016

Housekeeping
Assignment 2 is now history.
The ps tute this week will go through selected exercises from this year's midterm exam.
There will be regular computer tutes this week as well.

Consultation times, weeks 14 and 15
Exclusively on the following days, at the following times and locations:
May 31, 10-12, Arndt 1022 (Juergen)
June 6, 10-12, Arndt 1022 (Juergen)
June 7, 10-12, CBE (26c) tutorial room 1
June 10, 10-12, CBE (26c) tutorial room 1
Note: no other consultation is offered!

Assignment 2 and Participation Marks
We will publish your marks for assignment 2 and your tutorial participation on Wattle before the final exam.
How can you get your assignment back? After we post the assignment 2 marks on Wattle, you can collect your assignment from the same place where you dropped it off: the EMET2007 assignment box on the first floor of the Arndt building (only this time the little door will be unlocked!).

Student Course Evaluations
Please take the time to submit course evaluations for EMET2007. Any feedback is appreciated!
It would be great if you could give constructive feedback (even if you did not enjoy the course very much):
What did you like about the course?
What did you not like about the course?
What did you think of the ps tutes and the computer tutes?
How could I improve the course? What should I do differently?
Your feedback DOES matter!

After EMET2007, we offer four more econometrics courses for undergraduate students:
two of these cover micro-econometrics (like we did during weeks 1-10);
the other two cover macro-econometrics (like we did during weeks 11-13).

Roadmap
Introduction
Econometrics Pathways
  Microeconometrics
  Macroeconometrics/Time Series Econometrics
Time Series Regression and Forecasting
  Recap
  Practical Guidelines for Time Series Analysis
  Example: CPI and Inflation
  Pseudo Out-of-Sample Forecasts
One Last Thing
  Last Slide

EMET3004: Econometric Modelling (S2)
  formerly EMET2008
  continuation of EMET2007
  takes a deeper look at issues such as causality, omitted variable bias and functional form specification
EMET3006: Applied Microeconometrics (S1)
  studies methods and models with a focus on applications
You do NOT need to take EMET3004 to take EMET3006 (or vice versa)!

EMET3007: Business and Economic Forecasting (S2)
  starts with data cleaning exercises such as smoothing and detrending
  takes a closer look at autoregressive models
  studies different methods of forecasting
EMET3008: Applied Macro and Financial Econometrics (S2)
  models several time series simultaneously in system estimation
  develops particular types of time series models used to study macroeconomic equilibria
  uses macro and financial data to fit models and analyze forecast performance
Both of these courses will (likely) NOT use Stata but some other programming environment (for example, EViews or Matlab).

After taking EMET2007, EMET3004 and EMET3006 you would have a solid understanding of the methods and models used in micro-econometrics.
After taking EMET2007, EMET3007 and EMET3008 you would have a solid understanding of the methods and models used in macro and time series econometrics.

We have studied the AR(p) model
$$Y_t = \beta_0 + \alpha t + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + u_t$$
and we have studied the ADL(p,r) model
$$Y_t = \beta_0 + \alpha t + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \gamma_1 X_{t-1} + \gamma_2 X_{t-2} + \dots + \gamma_r X_{t-r} + u_t$$
which is a straightforward extension of the AR(p) model: all we do is add the lags of another regressor, $X_t$.

We would like to use the two models to produce forecasts for the time series $Y_t$.
But we need to be careful: you can only estimate AR(p) and ADL(p,r) models when the time series $Y_t$ and $X_t$ are stationary.
If they are not stationary, we run into the problem of spurious correlation.
Or do you really believe in the stork?

Stationary time series have probability distributions that do not change over time.
Intuitively, this means that you can use past observations of the time series to make predictions for the future.
If a time series is not stationary, it does not make sense to pretend that past observations can be used to predict the future.

When are time series non-stationary?
Trends: persistent, long-term movement or tendency.
Two types of trend lead to non-stationary time series:
  deterministic (relatively easy to spot in a time series graph; example: real GDP as seen during the computer tutorials)
  stochastic (almost impossible to spot in a time series graph; see next slide)

Four realizations from the same random walk process
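To get a feeling for why stochastic trends are hard to spot, here is a minimal Python sketch that simulates four realizations of the same random walk. The function name, the seeds, the sample length of 200 and the standard normal shocks are all assumptions made for illustration; they are not taken from the figure on this slide.

```python
import random

def simulate_random_walk(n_periods, seed):
    """One realization of Y_t = Y_{t-1} + u_t with Y_0 = 0 and u_t ~ N(0, 1)."""
    rng = random.Random(seed)  # own generator per path, so results are reproducible
    path = [0.0]
    for _ in range(n_periods):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path

# Four realizations of the same process; only the random shocks differ.
paths = [simulate_random_walk(n_periods=200, seed=s) for s in range(4)]
for i, path in enumerate(paths, start=1):
    print(f"realization {i}: final value {path[-1]:8.2f}, "
          f"min {min(path):8.2f}, max {max(path):8.2f}")
```

Plotting the four paths (with any plotting tool you like) typically shows some that drift up, some that drift down and some that wander around zero, even though the data-generating process is identical.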

Random walks follow stochastic trends, but they do not always look like they are following an obvious trend.
Luckily, Dickey and Fuller came up with a reasonably easy statistical test for stochastic trends.
To test whether a time series $Y_t$ follows a stochastic trend, we only need to test the coefficient $\delta$ in the model
$$\Delta Y_t = \beta_0 + \alpha t + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t$$
Recall that $\delta \le 0$.
If the estimate of $\delta$ is sufficiently far away from zero (as measured relative to the critical values provided by Dickey and Fuller), then we reject the null hypothesis of a stochastic trend.
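The mechanics of the test can be sketched in a few lines of Python. Note that this is a deliberately simplified version: it drops the time trend and the lagged differences, so it is the plain Dickey-Fuller regression $\Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t$, not the augmented one. The critical value of -2.86 is the large-sample 5% value for this intercept-only case and is an assumption added here; the function name and the simulated series are illustrative as well.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on delta in dY_t = b0 + delta * Y_{t-1} + u_t (no trend, no lags)."""
    x = y[:-1]                                   # Y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # dY_t = Y_t - Y_{t-1}
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    delta = sxy / sxx                            # OLS slope
    b0 = dy_bar - delta * x_bar                  # OLS intercept
    ssr = sum((di - b0 - delta * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)          # homoskedasticity-only standard error
    return delta / se

rng = random.Random(1)
shocks = [rng.gauss(0.0, 1.0) for _ in range(1000)]

white_noise = shocks                             # stationary by construction
random_walk = [0.0]
for u in shocks:
    random_walk.append(random_walk[-1] + u)      # stochastic trend by construction

CRITICAL_5PCT = -2.86  # assumed large-sample DF critical value, intercept-only case
for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    t = dickey_fuller_t(series)
    verdict = "reject unit root" if t < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

The important detail is that the t-statistic is compared with Dickey-Fuller critical values, not with the usual normal ones, and that the test is one-sided: only large negative values count as evidence against a unit root.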

We use the following terms interchangeably: stochastic trend, unit root, non-stationary.

Suppose I give you a time series $Y_t$ and ask you to produce forecasts. What should you do?
Let's try to tie up all loose ends and come up with some practical advice.
Given the time series $Y_t$, here's what you should do to produce the best forecast...

Graph it.
  Sometimes you can spot deterministic trends in graphs (e.g., GDP).
  A graph will definitely not help you spot a stochastic trend.
Run an ADF unit root test on $Y_t$ (possibly use AIC to set the optimal lag length).

If you reject the null hypothesis of a unit root:
Estimate the AR(p) model for $Y_t$:
$$Y_t = \beta_0 + \alpha t + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + u_t$$
(It's always safe to include a deterministic time trend.)
You can easily estimate this in Stata, and it is safe to take the Stata output of standard errors, t-statistics, p-values and confidence intervals at face value.
Use the AR(p) model for $Y_t$ to produce your forecasts.

If you cannot reject the null hypothesis of a unit root:
Generate first differences $\Delta Y_t$ of the data and run a unit root test on $\Delta Y_t$.
In 90% of practical situations, you will reject the null hypothesis of a unit root in $\Delta Y_t$ (in other words, first differencing effectively removes the unit root).
This suggests that you should use an AR(p) model for $\Delta Y_t$:
$$\Delta Y_t = \beta_0 + \alpha t + \beta_1 \Delta Y_{t-1} + \beta_2 \Delta Y_{t-2} + \dots + \beta_p \Delta Y_{t-p} + u_t$$
Since $\Delta Y_t$ is stationary, you can easily estimate this in Stata, and it is safe to take the Stata output of standard errors, t-statistics, p-values and confidence intervals at face value.
Use the AR(p) model for $\Delta Y_t$ to produce your forecasts.

During the last two weeks I have used inflation as the running example to illustrate important concepts.
Recall that we defined inflation to be
$$\mathit{infl}_t := 400 \times (\log(\mathit{cpi}_t) - \log(\mathit{cpi}_{t-1}))$$
(this is the annualized version of quarter-to-quarter inflation; but it would work just the same if we hadn't used the annualized version).
Our goal: come up with the best AR(p) model to forecast inflation.
Let's go through the steps.
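As a quick illustration of the inflation formula, here is a small Python sketch. The CPI numbers are made up for the example and are not the lecture data; the function name is hypothetical too.

```python
import math

def annualized_inflation(cpi):
    """infl_t = 400 * (log(cpi_t) - log(cpi_{t-1})) for consecutive quarterly CPI values."""
    return [400.0 * (math.log(curr) - math.log(prev))
            for prev, curr in zip(cpi[:-1], cpi[1:])]

cpi = [100.0, 101.0, 101.5, 103.0]  # hypothetical quarterly index values
infl = annualized_inflation(cpi)
for quarter, value in enumerate(infl, start=2):
    print(f"quarter {quarter}: annualized inflation = {value:.2f}%")
```

A quarter-to-quarter CPI increase of 1% therefore shows up as roughly 4% annualized inflation. One observation is lost at the start of the sample because the first quarter has no predecessor.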

Graphing the data
Deterministic trend: not obvious
Stochastic trend: never obvious

Test for unit root
dfuller infl, lags(2) trend
Augmented Dickey-Fuller test for unit root         Number of obs   =       224
                               ---------- Interpolated Dickey-Fuller ---------
                  Test         1% Critical       5% Critical      10% Critical
               Statistic           Value             Value             Value
------------------------------------------------------------------------------
 Z(t)             -3.086            -3.999            -3.433            -3.133
------------------------------------------------------------------------------
MacKinnon approximate p-value for Z(t) = 0.1097
(I have determined that the optimal lag length is 2.)
The Dickey-Fuller critical value is -3.41 (allowing for a deterministic time trend here).
We cannot reject the null hypothesis of a unit root here.

We cannot estimate an AR(p) model for $\mathit{infl}_t$.
Let's take first differences to remove the unit root:
$$\Delta\mathit{infl}_t := \mathit{infl}_t - \mathit{infl}_{t-1}$$
Let's convince ourselves that $\Delta\mathit{infl}_t$ indeed has no unit root...

Test for unit root
dfuller Dinfl, lags(4) trend
Augmented Dickey-Fuller test for unit root         Number of obs   =       221
                               ---------- Interpolated Dickey-Fuller ---------
                  Test         1% Critical       5% Critical      10% Critical
               Statistic           Value             Value             Value
------------------------------------------------------------------------------
 Z(t)             -8.931            -4.000            -3.434            -3.134
------------------------------------------------------------------------------
MacKinnon approximate p-value for Z(t) = 0.0000
(I have determined that the optimal lag length is 4.)
The Dickey-Fuller critical value is -3.41 (allowing for a deterministic time trend here).
We do reject the null hypothesis of a unit root here.
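The decision rule behind the two unit root tests is one-sided, which is easy to get wrong. A tiny Python sketch, using the test statistics reported by Stata (-3.086 for the level, -8.931 for the first difference) and the -3.41 critical value quoted on the slides; the helper function is hypothetical:

```python
def adf_decision(test_statistic, critical_value):
    """Reject the unit-root null only if the statistic lies below the critical value."""
    return "reject" if test_statistic < critical_value else "cannot reject"

CRITICAL_5PCT_TREND = -3.41  # value quoted in the lecture for the case with a time trend

print("infl: ", adf_decision(-3.086, CRITICAL_5PCT_TREND))  # level of inflation
print("Dinfl:", adf_decision(-8.931, CRITICAL_5PCT_TREND))  # first difference of inflation
```

The statistic for the level is negative but not negative enough, so the unit root stays; the statistic for the first difference is far below the critical value.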

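Therefore, our best time series model is this one:
$$\Delta\mathit{infl}_t = \beta_0 + \alpha t + \beta_1 \Delta\mathit{infl}_{t-1} + \beta_2 \Delta\mathit{infl}_{t-2} + \beta_3 \Delta\mathit{infl}_{t-3} + \beta_4 \Delta\mathit{infl}_{t-4} + u_t$$
Let's estimate it...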

regress L(0/4).Dinfl yearqrt, robust
Linear regression                               Number of obs     =        222
                                                F(5, 216)         =      11.03
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2092
                                                Root MSE          =     1.8898
------------------------------------------------------------------------------
             |               Robust
       Dinfl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       Dinfl |
         L1. |  -.3692247   .0672725    -5.49   0.000    -.5018193   -.2366302
         L2. |  -.4021147    .065175    -6.17   0.000    -.5305751   -.2736543
         L3. |  -.0538849    .101832    -0.53   0.597    -.2545965    .1468268
         L4. |   -.158153    .075199    -2.10   0.037    -.3063708   -.0099352
             |
     yearqrt |  -.0008346    .002153    -0.39   0.699    -.0050782     .003409
       _cons |   .0704911   .2096899     0.34   0.737    -.3428092    .4837913
------------------------------------------------------------------------------

What should we do with these estimates now?
Remember that our main goal is to produce forecasts.
The data set runs from 1957Q1 to 2013Q4.
Let's produce an inflation forecast for 2014Q1.
All we need to do is use a bit of common sense.

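In the data we get the following values:
$\Delta\mathit{infl}_{2013Q1} = -0.738$
$\Delta\mathit{infl}_{2013Q2} = -1.453$
$\Delta\mathit{infl}_{2013Q3} = 2.622$
$\Delta\mathit{infl}_{2013Q4} = -1.747$
Obtain the forecast by plugging into the estimated model (the trend variable takes the value 229 in 2014Q1):
$$\widehat{\Delta\mathit{infl}}_{2014Q1|2013Q4} = 0.070 - 0.001 \times 229 - 0.369 \times (-1.747) - 0.402 \times (2.622) - 0.054 \times (-1.453) - 0.158 \times (-0.738) = -0.373$$
The forecast for the change in the annualized quarterly inflation rate for 2014Q1 is -0.373.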

If we are not interested in the change but instead want to learn about the level of inflation in 2014Q1:
From the data: $\mathit{infl}_{2013Q4} = 0.847$
Therefore
$$\widehat{\mathit{infl}}_{2014Q1|2013Q4} = 0.847 + \widehat{\Delta\mathit{infl}}_{2014Q1|2013Q4} = 0.847 - 0.373 = 0.474$$
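The plug-in arithmetic for the 2014Q1 forecast can be checked with a few lines of Python. The coefficients are the rounded regression estimates used on the slides, the four most recent changes in inflation carry the signs that reproduce the stated result of -0.373, and 229 is the value of the trend variable used on the slides for 2014Q1. The variable names are my own.

```python
# Rounded coefficients from the regression output: constant, trend, lags 1 to 4.
b0, alpha = 0.070, -0.001
beta = [-0.369, -0.402, -0.054, -0.158]

t_2014q1 = 229  # trend value used in the lecture for 2014Q1
# Most recent changes in inflation, newest first: 2013Q4, 2013Q3, 2013Q2, 2013Q1.
d_infl_recent = [-1.747, 2.622, -1.453, -0.738]
infl_2013q4 = 0.847  # level of inflation in the last quarter of the sample

# Forecast of the change, then add it to the last observed level.
d_forecast = b0 + alpha * t_2014q1 + sum(b * d for b, d in zip(beta, d_infl_recent))
level_forecast = infl_2013q4 + d_forecast

print(f"forecast change in inflation, 2014Q1: {d_forecast:.3f}")
print(f"forecast level of inflation, 2014Q1:  {level_forecast:.3f}")
```

Keep in mind that the first lag coefficient multiplies the most recent observation (2013Q4), the second lag coefficient the one before, and so on. Using the unrounded coefficients from the Stata output would give a slightly different number.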

The inflation data run from 1957Q1 to 2013Q4.
To measure forecast performance, we could follow this little algorithm:
1. pretend that your data already end in s = 2005Q1
2. estimate an AR(4) model for $\Delta\mathit{infl}_t$ for the time frame 1957Q1 to s
3. calculate a forecast for period s + 1: $\widehat{\mathit{infl}}_{s+1|s}$
4. compare the forecast to the actual realization $\mathit{infl}_{s+1}$; the difference is the forecast error: $\mathit{infl}_{s+1} - \widehat{\mathit{infl}}_{s+1|s}$
5. shift s up by one period and jump back to step 2 (repeat until s reaches the end of your data set, T)
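The algorithm above can be sketched in plain Python. This is only an illustration under several assumptions: the series is simulated (a stationary AR(1) standing in for the change in inflation), the deterministic time trend is left out to keep the code short, OLS is computed by solving the normal equations by hand, and all function names are hypothetical. It is not the Stata implementation used for the lecture.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ar(y, p):
    """OLS estimates [b0, b1, ..., bp] for y_t = b0 + b1*y_{t-1} + ... + bp*y_{t-p} + u_t."""
    rows = [[1.0] + [y[t - k] for k in range(1, p + 1)] for t in range(p, len(y))]
    targets = y[p:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def forecast_next(y, coefs):
    """One-step-ahead forecast from the last p observations of y."""
    p = len(coefs) - 1
    return coefs[0] + sum(coefs[k] * y[-k] for k in range(1, p + 1))

def pseudo_out_of_sample(y, p, first_s):
    """Forecast errors y_{s+1} - yhat_{s+1|s} for s = first_s, ..., len(y) - 2."""
    errors = []
    for s in range(first_s, len(y) - 1):
        sample = y[: s + 1]        # step 1: pretend the data end in period s
        coefs = fit_ar(sample, p)  # step 2: re-estimate on the shortened sample
        errors.append(y[s + 1] - forecast_next(sample, coefs))  # steps 3 and 4
    return errors                  # step 5 is the loop itself

# Simulated stationary series (assumption: AR(1) with coefficient -0.4).
rng = random.Random(0)
y = [0.0]
for _ in range(199):
    y.append(-0.4 * y[-1] + rng.gauss(0.0, 1.0))

errors = pseudo_out_of_sample(y, p=4, first_s=149)
rmsfe = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(f"{len(errors)} forecasts, root mean squared forecast error = {rmsfe:.3f}")
```

The key point is that the model is re-estimated inside the loop using only data up to period s, so each forecast uses no information that would have been unavailable at the time. The root mean squared forecast error at the end is one common way to summarize the errors; it is an addition here, not something computed on the slides.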

Doing this in Stata is not easy.
In fact, Stata is not really made for this: Stata is strong for micro-econometrics but not for macro-econometrics.
Nevertheless, I have implemented this algorithm in Stata and run it over the inflation data.
Here's the output in two pictures...

Inflation: actual versus forecast

Forecast error

Is this a good model or a bad model?
Hard to tell on its own.
Could try different AR(p) models.
Even better: try an ADL(p,r) model.
Bottom line: there are a lot of possible time series models out there.
This simple AR(4) model already does a reasonable job.
And it took you only three weeks to understand!

Last Slide (yey!)
I hope you...
... enjoyed it a bit (didn't hate it too much!?)
... learned something useful
... consider doing more econometrics
Feel free to come by my office for a chat anytime!
Good luck with all your exams!