ARIMA forecasts

Open the usa.dta data set (1984q1-2009q4), create the dates, and declare the data as a time series. Save the data so you won't have to do this step again. (The date variable name was lost in transcription; I use date below.)

use usa, clear
* ---------------------------------------
* Create dates and declare time-series
* ---------------------------------------
generate date = q(1984q1) + _n - 1
format date %tq
tsset date

Here, we plot real GDP, its difference, its natural log, and the log difference.

qui gen lg = ln(gdp)
qui tsline gdp, name(g, replace)
qui tsline D.gdp, name(dg, replace)
qui tsline lg, name(lg, replace)
qui tsline D.lg, name(dlg, replace)
graph combine g dg lg dlg
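As a quick sanity check on the generate line: Stata stores quarterly dates as integer counts of quarters since 1960q1 (which is coded 0), so the first observation should land exactly on 1984q1:

```latex
% Stata %tq encoding: 1960q1 = 0, one unit per quarter
\texttt{q(1984q1)} = 24 \text{ years} \times 4 = 96,
\qquad
\texttt{date}_n = 96 + (n-1) \;\Rightarrow\; \texttt{date}_1 = 96 = \text{1984q1}.
```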

[Graph: four panels from graph combine — real US gross domestic product (levels), its first difference, ln(gdp), and D.ln(gdp)]

Looks like there is a trend in the level (perhaps exponential). The difference (upper right) may show a slight upward trend until the bottom dropped out in late 2008. Still, I see no reason to use logs, so I won't. Others might disagree.

dfgls gdp
dfgls D.gdp, notrend

. dfgls gdp

DF-GLS for gdp                                        Number of obs =    91
Maxlag = 12 chosen by Schwert criterion

               DF-GLS tau      1% Critical    5% Critical    10% Critical
  [lags]     Test Statistic        Value          Value          Value
------------------------------------------------------------------------------
    12           -0.996            -3.575         -2.753         -2.479
    11           -1.147            -3.575         -2.783         -2.508
    10           -1.571            -3.575         -2.813         -2.537
     9           -1.707            -3.575         -2.842         -2.565
     8           -1.147            -3.575         -2.870         -2.591
     7           -1.131            -3.575         -2.898         -2.617
     6           -1.256            -3.575         -2.924         -2.641
     5           -1.402            -3.575         -2.949         -2.664
     4           -1.371            -3.575         -2.972         -2.686
     3           -1.193            -3.575         -2.994         -2.706
     2           -1.324            -3.575         -3.014         -2.723
     1           -1.155            -3.575         -3.031         -2.739

Opt Lag (Ng-Perron seq t) = 11 with RMSE 55.07213
Min SC   = 8.324652 at lag 1 with RMSE 61.11491
Min MAIC = 8.277655 at lag 1 with RMSE 61.11491

That's not too good: the test statistic falls in the "do not reject" region at every lag, so the level is nonstationary. Now the differences, with the notrend option:

. dfgls D.gdp, notrend

DF-GLS for D.gdp                                      Number of obs =    90
Maxlag = 12 chosen by Schwert criterion

               DF-GLS mu       1% Critical    5% Critical    10% Critical
  [lags]     Test Statistic        Value          Value          Value
------------------------------------------------------------------------------
    12           -2.644            -2.600         -1.971         -1.672
    11           -2.942            -2.600         -1.986         -1.687
    10           -2.727            -2.600         -2.001         -1.701
     9           -2.140            -2.600         -2.016         -1.716
     8           -2.033            -2.600         -2.031         -1.731
     7           -2.769            -2.600         -2.046         -1.745
     6           -2.930            -2.600         -2.061         -1.759
     5           -2.872            -2.600         -2.075         -1.772
     4           -2.870            -2.600         -2.088         -1.785
     3           -3.241            -2.600         -2.101         -1.797
     2           -3.924            -2.600         -2.113         -1.808
     1           -3.894            -2.600         -2.124         -1.817

Opt Lag (Ng-Perron seq t) = 10 with RMSE 55.81506
Min SC   = 8.33664  at lag 1 with RMSE 61.45603
Min MAIC = 8.700253 at lag 1 with RMSE 61.45603

The statistic is significant at every lag, so go with the differences. Removing the trend has no substantive effect in this case. I think the DF-GLS test is the way to go rather than the usual DF or ADF test (it is more powerful than the ADF), so I'll use it. This test in Stata is also useful in helping to select the number of lags to include. First, I'll run the autoregressions manually using the regress command, testing the residuals for autocorrelation after each:

reg D.gdp L.D.gdp
estat bgodfrey
reg D.gdp L(1/2).D.gdp
estat bgodfrey
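In equation form, the two autoregressions just listed are (standard notation, not taken from the Stata output):

```latex
\text{AR(1):}\quad \Delta gdp_t = \alpha + \beta_1\,\Delta gdp_{t-1} + \varepsilon_t
\qquad
\text{AR(2):}\quad \Delta gdp_t = \alpha + \beta_1\,\Delta gdp_{t-1} + \beta_2\,\Delta gdp_{t-2} + \varepsilon_t
```

The estat bgodfrey run after each regression tests whether the residuals are serially correlated; an insignificant chi-squared statistic means the included lags have soaked up the autocorrelation.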

. reg D.gdp L.D.gdp

      Source |       SS       df       MS              Number of obs =     102
-------------+------------------------------           F(  1,   100) =   47.92
       Model |  168398.972     1  168398.972           Prob > F      =  0.0000
    Residual |  351387.432   100  3513.87432           R-squared     =  0.3240
-------------+------------------------------           Adj R-squared =  0.3172
       Total |  519786.404   101  5146.40003           Root MSE      =  59.278

------------------------------------------------------------------------------
       D.gdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |
         LD. |   .5712509   .0825183     6.92   0.000      .407537    .7349649
       _cons |   43.95044   10.19719     4.31   0.000     23.71952    64.18137
------------------------------------------------------------------------------

. estat bgodfrey

Breusch-Godfrey LM test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          1.692               1                   0.1933
---------------------------------------------------------------------------
                        H0: no serial correlation

. reg D.gdp L(1/2).D.gdp

      Source |       SS       df       MS              Number of obs =     101
-------------+------------------------------           F(  2,    98) =   24.76
       Model |  174120.822     2   87060.411           Prob > F      =  0.0000
    Residual |  344632.963    98  3516.66289           R-squared     =  0.3357
-------------+------------------------------           Adj R-squared =  0.3221
       Total |  518753.785   100  5187.53785           Root MSE      =  59.301

------------------------------------------------------------------------------
       D.gdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |
         LD. |   .4968111   .1008118     4.93   0.000     .2967534    .6968689
        L2D. |   .1295186   .1008543     1.28   0.202    -.0706234    .3296606
       _cons |    38.6639   11.11212     3.48   0.001     16.61225    60.71554
------------------------------------------------------------------------------

. estat bgodfrey

Breusch-Godfrey LM test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          0.415               1                   0.5196
---------------------------------------------------------------------------
                        H0: no serial correlation

I estimated AR(1) and AR(2) models on the differenced series. AR(1) is probably the best choice, but I continue the example with AR(2) just for fun. The arima command is very convenient: it can take differences, add autoregressive terms, add other regressors and their lags, and add autocorrelated errors (moving-average terms) to the model. Here is the syntax:
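Rounding the coefficients from the second regression, the fitted AR(2) is:

```latex
\widehat{\Delta gdp}_t = 38.66 + 0.497\,\Delta gdp_{t-1} + 0.130\,\Delta gdp_{t-2}
```

with the second lag insignificant (p = 0.202), which is part of why AR(1) is probably the better choice.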

Title

    [TS] arima -- ARIMA, ARMAX, and other dynamic regression models

Syntax

    Basic syntax for a regression model with ARMA disturbances
        arima depvar [indepvars], ar(numlist) ma(numlist)

    Basic syntax for an ARIMA(p,d,q) model
        arima depvar, arima(#p,#d,#q)

    options                     description
    -------------------------------------------------------------------------
    Model
      noconstant                suppress constant term
      arima(#p,#d,#q)           specify ARIMA(p,d,q) model for dependent variable
      ar(numlist)               autoregressive terms of the structural model disturbance
      ma(numlist)               moving-average terms of the structural model disturbance
      constraints(constraints)  apply specified linear constraints
      collinear                 keep collinear variables

I want 2 autoregressive terms and to take the first difference of real GDP. That is done with

arima gdp, arima(2,1,0)

. arima gdp, arima(2,1,0)

(setting optimization to BHHH)
Iteration 0:   log likelihood = -564.46367
Iteration 1:   log likelihood = -564.45944
Iteration 2:   log likelihood = -564.45779
Iteration 3:   log likelihood = -564.45655
Iteration 4:   log likelihood = -564.45556
(switching optimization to BFGS)
Iteration 5:   log likelihood = -564.4548
Iteration 6:   log likelihood = -564.45308
Iteration 7:   log likelihood = -564.4521
Iteration 8:   log likelihood = -564.45206

ARIMA regression

Sample:  2 - 104                                Number of obs   =        103
                                                Wald chi2(2)    =      34.79
Log likelihood = -564.4521                      Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |                 OPG
       D.gdp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gdp          |
       _cons |   102.3637   17.97174     5.70   0.000     67.13974    137.5877
-------------+----------------------------------------------------------------
ARMA         |
          ar |
         L1. |   .4920216   .1077538     4.57   0.000     .2808281    .7032151
         L2. |   .1274014   .0886153     1.44   0.151    -.0462814    .3010841
-------------+----------------------------------------------------------------
      /sigma |    57.9252   2.452609    23.62   0.000     53.11817    62.73223
------------------------------------------------------------------------------

The results for the AR terms are very close to those from least squares; maximum likelihood is not making much of a difference in estimating the parameters. Compare the standard errors, though. To generate a series of 1-step-ahead forecasts, simply use
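Note that the constants are not directly comparable: arima reports the unconditional mean mu of the differenced series, while regress reports the intercept alpha of the autoregression. The two are linked by alpha = mu(1 - rho1 - rho2), which roughly reconciles the estimates:

```latex
\hat{\alpha} = \hat{\mu}\,(1 - \hat{\rho}_1 - \hat{\rho}_2)
             = 102.36 \times (1 - 0.492 - 0.127) \approx 39.0,
```

close to the 38.66 intercept from the least-squares AR(2).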

predict ghat, y

Dynamic forecasts can be generated as well. These use actual values of gdp up to a point and then use forecasted values for all subsequent periods, so they will be quite smooth.

predict ghatdy, dynamic(tq(2004q1)) y
tsline gdp ghatdy ghat if tin(2004q1,)

[Graph: real US gross domestic product plotted against the one-step prediction ("y prediction, one-step") and the dynamic prediction ("y prediction, dyn(tq(2004q1))"), 2004q1-2010q1]

You can see that the 1-step forecasts never deviate very far from the actual series (since they use actual values of gdp at each step). The dynamic forecast is smoother, and the deviations between predicted and actual gdp are fairly large (at least for a while).
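The difference between the two predict calls can be sketched as a recursion in the differences, using the arima estimates above: the one-step forecast conditions on actual data every period, while the dynamic forecast feeds its own predictions back in after 2004q1.

```latex
\text{one-step:}\quad \widehat{\Delta gdp}_t
  = \hat{\mu} + \hat{\rho}_1\bigl(\Delta gdp_{t-1} - \hat{\mu}\bigr)
             + \hat{\rho}_2\bigl(\Delta gdp_{t-2} - \hat{\mu}\bigr)
\qquad
\text{dynamic:}\quad \widehat{\Delta gdp}_t
  = \hat{\mu} + \hat{\rho}_1\bigl(\widehat{\Delta gdp}_{t-1} - \hat{\mu}\bigr)
             + \hat{\rho}_2\bigl(\widehat{\Delta gdp}_{t-2} - \hat{\mu}\bigr)
```

Because rho1-hat + rho2-hat < 1, the dynamic forecast of the difference decays toward the estimated mean as the horizon grows, which is why the dynamic gdp path looks so smooth.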