Econometrics 2, Spring 25
Univariate Time Series Analysis; ARIMA Models
Heino Bohn Nielsen

Outline of the Lecture
(1) Introduction to univariate time series analysis.
(2) Stationarity.
(3) Characterizing time dependence: ACF and PACF.
(4) Modelling time dependence: the ARMA(p,q) model.
(5) Examples: AR(1), AR(2), MA(1).
(6) Lag operators, lag polynomials, and invertibility.
(7) Model selection.
(8) Estimation.
(9) Forecasting.
Univariate Time Series Analysis
We consider a single time series, y_1, y_2, ..., y_T. We want to construct simple models for y_t as a function of the past: E[y_t | history].
Univariate models are useful for:
(1) Analyzing the dynamic properties of a time series. What is the dynamic adjustment after a shock? Do shocks have transitory or permanent effects (presence of unit roots)?
(2) Forecasting. A model for E[y_t | x_t] is only useful for forecasting y_{t+1} if we know (or can forecast) x_{t+1}.
(3) Introducing the tools necessary for analyzing more complicated models.

Stationarity
A time series, y_1, y_2, ..., y_t, ..., y_T, is strictly stationary if the joint distributions of (y_{t_1}, y_{t_2}, ..., y_{t_n}) and (y_{t_1+h}, y_{t_2+h}, ..., y_{t_n+h}) are the same for all h.
A time series is called weakly stationary or covariance stationary if
E[y_t] = µ
V[y_t] = E[(y_t − µ)²] = γ_0
Cov[y_t, y_{t−k}] = E[(y_t − µ)(y_{t−k} − µ)] = γ_k  for k = 1, 2, ...
Often µ, γ_0, and γ_k are assumed finite.
On these slides we consider only stationary processes. Later we consider (unit-root) non-stationary processes.
Stationarity, Non-Stationarity, and Trends
[Figure: simulated examples of (i) a stationary process; (ii) a stationary process with a deterministic trend, together with the deviations from an estimated trend; (iii) a process with a stochastic trend (unit root); and (iv) its first difference, which is a stationary process.]

The Autocorrelation Function (ACF)
For a stationary time series we define the autocorrelation function (ACF) as
ρ_k = Corr(y_t, y_{t−k}) = γ_k/γ_0 = Cov(y_t, y_{t−k})/V(y_t) = Cov(y_t, y_{t−k}) / √(V(y_t) · V(y_{t−k})).
Note that |ρ_k| ≤ 1, ρ_0 = 1, and ρ_k = ρ_{−k}.
Recall that the ACF can (e.g.) be estimated by OLS in the regression model
y_t = c + ρ_k y_{t−k} + residual.
If ρ_1 = ρ_2 = ... = 0, it holds that V(ρ̂_k) = T^{−1}, and a 95% confidence band is given by ±2/√T.
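A minimal Python sketch of this regression-based ACF estimate and the ±2/√T band (the simulated AR(1) series, the seed, and the helper name acf_ols are illustrative choices, not part of the slides):

import numpy as np

rng = np.random.default_rng(0)
T = 200
y = np.zeros(T)
for t in range(1, T):                    # simulate a stationary AR(1), theta = 0.8
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

def acf_ols(y, k):
    # OLS in y_t = c + rho_k * y_{t-k} + residual; return the slope estimate
    X = np.column_stack([np.ones(len(y) - k), y[:-k]])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[1]

band = 2 / np.sqrt(T)                    # 95% band under rho_1 = rho_2 = ... = 0
for k in range(1, 6):
    print(k, round(acf_ols(y, k), 3), "band:", round(band, 3))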
The Partial Autocorrelation Function (PACF)
An alternative measure is the partial autocorrelation function (PACF), which is the correlation conditional on the intermediate values, i.e.
Corr(y_t, y_{t−k} | y_{t−1}, ..., y_{t−k+1}).
The PACF can be estimated as the OLS estimator θ̂_k in the regression where the intermediate lags are included:
y_t = c + θ_1 y_{t−1} + ... + θ_k y_{t−k} + residual.
If θ_1 = θ_2 = ... = 0, it holds that V(θ̂_k) = T^{−1}, and a 95% confidence band is given by ±2/√T.

Example: Danish GDP
[Figure: Danish GDP in logs; the deviation from trend; and the first difference; each with the corresponding ACF and PACF.]
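The corresponding regression estimate of the PACF, as a companion sketch to the ACF code above (the helper name pacf_ols is again illustrative):

import numpy as np

def pacf_ols(y, k):
    # OLS in y_t = c + theta_1 y_{t-1} + ... + theta_k y_{t-k} + residual;
    # the PACF estimate at lag k is the coefficient on y_{t-k}
    T = len(y)
    lags = [y[k - j:T - j] for j in range(1, k + 1)]
    X = np.column_stack([np.ones(T - k)] + lags)
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[-1]

For the AR(1) series simulated above, pacf_ols(y, 1) should be close to 0.8, while the estimates for k ≥ 2 should fall inside the ±2/√T band.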
The ARMA(p,q) Model
We consider two simple models for y_t: the autoregressive AR(p) model and the moving average MA(q) model.
First define a white noise process, ε_t ∼ i.i.d.(0, σ²).
The AR(p) model is defined as
y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ... + θ_p y_{t−p} + ε_t.
The systematic part of y_t is a linear function of p lagged values.
The MA(q) model is defined as
y_t = ε_t + α_1 ε_{t−1} + α_2 ε_{t−2} + ... + α_q ε_{t−q}.
Here y_t is a moving average of past shocks to the process.
They can be combined into the ARMA(p,q) model
y_t = θ_1 y_{t−1} + ... + θ_p y_{t−p} + ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}.

Dynamic Properties of an AR(1) Model
Consider the AR(1) model
Y_t = δ + θ Y_{t−1} + ε_t.
Assume for a moment that the process is stationary. As we will see later, this requires |θ| < 1.
First we want to find the expectation. Stationarity implies that E[Y_t] = E[Y_{t−1}] = µ. We find
E[Y_t] = E[δ + θ Y_{t−1} + ε_t]
E[Y_t] = δ + θ E[Y_{t−1}] + E[ε_t]
(1 − θ) µ = δ
µ = δ/(1 − θ).
Note the following:
(1) The effect of the constant term, δ, depends on the autoregressive parameter, θ.
(2) µ is not defined if θ = 1. This is excluded for a stationary process.
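The ARMA processes defined above are straightforward to simulate; a sketch using statsmodels (note its sign convention: the ar argument holds the lag-polynomial coefficients (1, −θ_1, ..., −θ_p), while ma holds (1, α_1, ..., α_q); the parameter values here are illustrative):

import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(42)
ar = [1, -0.5]     # lag polynomial 1 - 0.5L, i.e. theta_1 = 0.5
ma = [1, 0.4]      # lag polynomial 1 + 0.4L, i.e. alpha_1 = 0.4
y = arma_generate_sample(ar, ma, nsample=500, burnin=100)  # burn-in removes start-up effects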
Next we want to calculate the variance and the autocovariances. It is convenient to define the deviation from the mean, y_t = Y_t − µ, so that
Y_t = δ + θ Y_{t−1} + ε_t
Y_t = (1 − θ) µ + θ Y_{t−1} + ε_t
Y_t − µ = θ (Y_{t−1} − µ) + ε_t
y_t = θ y_{t−1} + ε_t.
We note that γ_0 = V[Y_t] = V[y_t]. We find:
V[y_t] = E[y_t²] = E[(θ y_{t−1} + ε_t)²]
= E[θ² y_{t−1}² + ε_t² + 2 θ y_{t−1} ε_t]
= θ² E[y_{t−1}²] + E[ε_t²] + 2 θ E[y_{t−1} ε_t]
= θ² V[y_{t−1}] + V[ε_t] + 0.
Using stationarity, γ_0 = V[y_t] = V[y_{t−1}], we get
γ_0 (1 − θ²) = σ²  or  γ_0 = σ²/(1 − θ²).

The covariances are given by
γ_1 = Cov[y_t, y_{t−1}] = E[y_t y_{t−1}]
= E[(θ y_{t−1} + ε_t) y_{t−1}]
= θ E[y_{t−1}²] + E[y_{t−1} ε_t]
= θ σ²/(1 − θ²) = θ γ_0
γ_2 = Cov[y_t, y_{t−2}] = E[y_t y_{t−2}]
= E[(θ y_{t−1} + ε_t) y_{t−2}]
= E[(θ (θ y_{t−2} + ε_{t−1}) + ε_t) y_{t−2}]
= E[θ² y_{t−2}² + θ y_{t−2} ε_{t−1} + y_{t−2} ε_t]
= θ² E[y_{t−2}²] + θ E[y_{t−2} ε_{t−1}] + E[y_{t−2} ε_t] = θ² σ²/(1 − θ²) = θ² γ_0
γ_k = Cov[y_t, y_{t−k}] = θ^k γ_0.
The ACF is given by
ρ_k = γ_k/γ_0 = θ^k γ_0/γ_0 = θ^k.
The PACF is simply the autoregressive coefficient followed by zeros: θ, 0, 0, ...
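A quick numerical check of these AR(1) results, assuming σ² = 1 (sample size and seed are arbitrary): the sample variance should approach σ²/(1 − θ²) and the sample ACF should approach θ^k.

import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
theta, T = 0.8, 5000
y = np.zeros(T)
for t in range(1, T):
    y[t] = theta * y[t - 1] + rng.standard_normal()

print("gamma_0:", round(y.var(), 3), "theory:", round(1 / (1 - theta**2), 3))
sample_acf = acf(y, nlags=5)
for k in range(1, 6):
    print(k, round(sample_acf[k], 3), "theory:", round(theta**k, 3))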
Examples of Stationary AR(1) Models
[Figure: simulated series with ACF and PACF for y_t = ε_t, y_t = 0.8 y_{t−1} + ε_t, and y_t = −0.8 y_{t−1} + ε_t.]

Examples of AR(1) Models
[Figure: simulated series for y_t = 0.5 y_{t−1} + ε_t, y_t = 0.95 y_{t−1} + ε_t, y_t = y_{t−1} + ε_t (unit root), and y_t = 1.05 y_{t−1} + ε_t (explosive).]
Dynamic Properties of an AR(2) Model
Consider the AR(2) model given by
Y_t = δ + θ_1 Y_{t−1} + θ_2 Y_{t−2} + ε_t.
Again we find the mean under stationarity:
E[Y_t] = δ + θ_1 E[Y_{t−1}] + θ_2 E[Y_{t−2}] + E[ε_t]
E[Y_t] = δ/(1 − θ_1 − θ_2) = µ.
We then define the process y_t = Y_t − µ, for which it holds that
y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ε_t.

Multiplying both sides by y_t and taking expectations yields
E[y_t²] = θ_1 E[y_{t−1} y_t] + θ_2 E[y_{t−2} y_t] + E[ε_t y_t]
γ_0 = θ_1 γ_1 + θ_2 γ_2 + σ².
Multiplying instead by y_{t−1} yields
E[y_t y_{t−1}] = θ_1 E[y_{t−1} y_{t−1}] + θ_2 E[y_{t−2} y_{t−1}] + E[ε_t y_{t−1}]
γ_1 = θ_1 γ_0 + θ_2 γ_1.
Multiplying instead by y_{t−2} yields
E[y_t y_{t−2}] = θ_1 E[y_{t−1} y_{t−2}] + θ_2 E[y_{t−2} y_{t−2}] + E[ε_t y_{t−2}]
γ_2 = θ_1 γ_1 + θ_2 γ_0.
Multiplying instead by y_{t−3} yields
E[y_t y_{t−3}] = θ_1 E[y_{t−1} y_{t−3}] + θ_2 E[y_{t−2} y_{t−3}] + E[ε_t y_{t−3}]
γ_3 = θ_1 γ_2 + θ_2 γ_1.
These are the so-called Yule-Walker equations.
To find the variance we can solve for γ_1 and γ_2 and substitute into the equation for γ_0. This is, however, a bit tedious.
We can find the autocorrelations, ρ_k = γ_k/γ_0, as
ρ_1 = θ_1 + θ_2 ρ_1
ρ_2 = θ_1 ρ_1 + θ_2
ρ_k = θ_1 ρ_{k−1} + θ_2 ρ_{k−2},  k ≥ 3,
or alternatively
ρ_1 = θ_1/(1 − θ_2)
ρ_2 = θ_2 + θ_1²/(1 − θ_2)
ρ_k = θ_1 ρ_{k−1} + θ_2 ρ_{k−2},  k ≥ 3.

Examples of AR(2) Models
[Figure: simulated series with ACF and PACF for y_t = 0.5 y_{t−1} + 0.4 y_{t−2} + ε_t, y_t = −0.8 y_{t−2} + ε_t, and y_t = −0.3 y_{t−1} − 0.8 y_{t−2} + ε_t.]
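The recursion above is easy to evaluate numerically; a small sketch (the parameter values echo the first panel and are otherwise arbitrary):

def ar2_acf(theta1, theta2, kmax):
    # rho_0 = 1, rho_1 = theta_1/(1 - theta_2), then the Yule-Walker recursion
    rho = [1.0, theta1 / (1 - theta2)]
    for k in range(2, kmax + 1):
        rho.append(theta1 * rho[k - 1] + theta2 * rho[k - 2])
    return rho

print(ar2_acf(0.5, 0.4, 6))   # ACF of y_t = 0.5 y_{t-1} + 0.4 y_{t-2} + eps_t

Note that at k = 2 the recursion gives θ_1 ρ_1 + θ_2 ρ_0 = θ_1 ρ_1 + θ_2, matching the formula above.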
Dynamic Properties of an MA(1) Model
Consider the MA(1) model
Y_t = µ + ε_t + α ε_{t−1}.
The mean is given by
E[Y_t] = E[µ + ε_t + α ε_{t−1}] = µ,
which is here identical to the constant term.
Next we find the variance:
V[Y_t] = E[(Y_t − µ)²] = E[(ε_t + α ε_{t−1})²]
= E[ε_t²] + E[α² ε_{t−1}²] + E[2 α ε_t ε_{t−1}]
= (1 + α²) σ².

The covariances are given by
γ_1 = Cov[Y_t, Y_{t−1}] = E[(Y_t − µ)(Y_{t−1} − µ)]
= E[(ε_t + α ε_{t−1})(ε_{t−1} + α ε_{t−2})]
= E[ε_t ε_{t−1} + α ε_t ε_{t−2} + α ε_{t−1}² + α² ε_{t−1} ε_{t−2}]
= α σ²
γ_2 = Cov[Y_t, Y_{t−2}] = E[(Y_t − µ)(Y_{t−2} − µ)]
= E[(ε_t + α ε_{t−1})(ε_{t−2} + α ε_{t−3})]
= E[ε_t ε_{t−2} + α ε_t ε_{t−3} + α ε_{t−1} ε_{t−2} + α² ε_{t−1} ε_{t−3}]
= 0
γ_k = 0  for k = 3, 4, ...
The ACF is given by
ρ_1 = γ_1/γ_0 = α σ²/((1 + α²) σ²) = α/(1 + α²)
ρ_k = 0,  k ≥ 2.
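A simulation check of ρ_1 = α/(1 + α²) (sample size and seed are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
alpha, n = 0.9, 100_000
eps = rng.standard_normal(n + 1)
y = eps[1:] + alpha * eps[:-1]         # Y_t = eps_t + alpha * eps_{t-1}, with mu = 0

rho1_hat = np.corrcoef(y[1:], y[:-1])[0, 1]
print(rho1_hat, "theory:", alpha / (1 + alpha**2))   # both close to 0.497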
Examples of MA Models
[Figure: simulated series with ACF and PACF for y_t = ε_t − 0.9 ε_{t−1}, y_t = ε_t + 0.9 ε_{t−1}, and y_t = ε_t − 0.6 ε_{t−1} + 0.5 ε_{t−2} + 0.5 ε_{t−3}.]

The Lag and Difference Operators
Now we introduce an important tool called the lag operator, L. It has the property that
L y_t = y_{t−1},
and, for example,
L² y_t = L(L y_t) = L y_{t−1} = y_{t−2}.
Also define the first difference operator, Δ = 1 − L, such that
Δ y_t = (1 − L) y_t = y_t − L y_t = y_t − y_{t−1}.
The operators L and Δ are not functions, but they can be used in calculations.
Lag Polynomials
Consider as an example the AR(2) model
y_t = θ_1 y_{t−1} + θ_2 y_{t−2} + ε_t.
That can be written as
y_t − θ_1 y_{t−1} − θ_2 y_{t−2} = ε_t
y_t − θ_1 L y_t − θ_2 L² y_t = ε_t
(1 − θ_1 L − θ_2 L²) y_t = ε_t
θ(L) y_t = ε_t,
where θ(L) = 1 − θ_1 L − θ_2 L² is a polynomial in L, denoted a lag polynomial.
Standard rules for calculating with polynomials also hold for polynomials in L.

Characteristic Equations and Roots
For a model
y_t − θ_1 y_{t−1} − θ_2 y_{t−2} = ε_t,  i.e.  θ(L) y_t = ε_t,
we define the characteristic equation as
θ(z) = 1 − θ_1 z − θ_2 z² = 0.
The solutions, z_1 and z_2, are denoted characteristic roots. An AR(p) model has p roots. Some of them may be complex, h ± v·i, where i = √(−1).
Recall that the roots can be used for factorizing the polynomial:
θ(z) = 1 − θ_1 z − θ_2 z² = (1 − φ_1 z)(1 − φ_2 z),
where φ_1 = z_1^{−1} and φ_2 = z_2^{−1} are the inverse roots.
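The roots can be computed numerically; a sketch with numpy (the parameter values are illustrative):

import numpy as np

theta1, theta2 = 0.5, 0.4
# np.roots expects coefficients ordered from the highest power down:
# theta(z) = 1 - theta_1 z - theta_2 z^2  ->  [-theta_2, -theta_1, 1]
z = np.roots([-theta2, -theta1, 1])
print("characteristic roots:", z)            # approx 1.075 and -2.325
print("inverse roots:", 1 / z)
print("stationary:", np.all(np.abs(z) > 1))  # True: all roots outside the unit circle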
Invertibility of Polynomials
Define the inverse of a polynomial, θ^{−1}(L) of θ(L), so that
θ^{−1}(L) θ(L) = 1.
Consider the AR(1) case, θ(L) = 1 − θL, and look at the product
(1 − θL)(1 + θL + θ²L² + θ³L³ + ... + θ^k L^k)
= (1 − θL) + (θL − θ²L²) + (θ²L² − θ³L³) + (θ³L³ − θ⁴L⁴) + ... = 1 − θ^{k+1} L^{k+1}.
If |θ| < 1, it holds that θ^{k+1} L^{k+1} → 0 as k → ∞, implying that
θ^{−1}(L) = (1 − θL)^{−1} = 1/(1 − θL) = 1 + θL + θ²L² + θ³L³ + ... = Σ_{i=0}^{∞} θ^i L^i.
If θ(L) is a finite polynomial, the inverse polynomial, θ^{−1}(L), is infinite.

ARMA Models in AR and MA Form
Using lag polynomials we can rewrite the stationary ARMA(p,q) model as
y_t − θ_1 y_{t−1} − ... − θ_p y_{t−p} = ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}   (1)
θ(L) y_t = α(L) ε_t,
where θ(L) and α(L) are finite polynomials.
If θ(L) is invertible, (1) can be written as the infinite MA(∞) model
y_t = θ^{−1}(L) α(L) ε_t
y_t = ε_t + γ_1 ε_{t−1} + γ_2 ε_{t−2} + ...
This is called the MA representation.
If α(L) is invertible, (1) can be written as an infinite AR(∞) model
α^{−1}(L) θ(L) y_t = ε_t
y_t − γ_1 y_{t−1} − γ_2 y_{t−2} − ... = ε_t
(the coefficients γ_i differ between the two representations). This is called the AR representation.
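The MA(∞) coefficients can be computed numerically; a sketch using statsmodels' arma2ma for the AR(1) case θ(L) = 1 − 0.8L, where the coefficients should equal 0.8^i as derived above:

from statsmodels.tsa.arima_process import arma2ma

psi = arma2ma(ar=[1, -0.8], ma=[1], lags=6)
print(psi)   # [1.0, 0.8, 0.64, 0.512, 0.4096, 0.32768]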
Invertibility and Stationarity
A finite-order MA process is stationary by construction: it is a linear combination of stationary white-noise terms. Invertibility is sometimes convenient for estimation and prediction.
An infinite MA process is stationary if the coefficients, α_i, converge to zero. We require that Σ_{i=0}^{∞} α_i² < ∞.
An AR process is stationary if θ(L) is invertible. This is important for interpretation and inference. In the case of a root at unity, standard results no longer hold. We return to unit roots later.

Consider again the AR(2) polynomial
θ(z) = 1 − θ_1 z − θ_2 z² = (1 − φ_1 z)(1 − φ_2 z).
The polynomial is invertible if the factors (1 − φ_i L) are invertible, i.e. if |φ_1| < 1 and |φ_2| < 1.
In general a polynomial, θ(L), is invertible if the characteristic roots, z_1, ..., z_p, are larger than one in absolute value. In complex cases, this corresponds to the roots being outside the complex unit circle (modulus larger than one).
[Figure: the complex unit circle, with real and imaginary axes.]
ARMA Models and Common Roots
Consider the stationary ARMA(p,q) model
y_t − θ_1 y_{t−1} − ... − θ_p y_{t−p} = ε_t + α_1 ε_{t−1} + ... + α_q ε_{t−q}
θ(L) y_t = α(L) ε_t
(1 − φ_1 L)(1 − φ_2 L)···(1 − φ_p L) y_t = (1 − ξ_1 L)(1 − ξ_2 L)···(1 − ξ_q L) ε_t.
If φ_i = ξ_j for some i, j, they are denoted common roots or canceling roots. The ARMA(p,q) model is then equivalent to an ARMA(p−1,q−1) model.
As an example, consider
y_t − y_{t−1} + 0.25 y_{t−2} = ε_t − 0.5 ε_{t−1}
(1 − L + 0.25 L²) y_t = (1 − 0.5 L) ε_t
(1 − 0.5 L)(1 − 0.5 L) y_t = (1 − 0.5 L) ε_t
(1 − 0.5 L) y_t = ε_t.

Unit Roots and ARIMA Models
A root at one is denoted a unit root, and it has important consequences for the analysis. We consider tests for unit roots and unit-root econometrics later.
Consider an ARMA(p,q) model, θ(L) y_t = α(L) ε_t. If there is a unit root in the AR polynomial, we can factorize it as
θ(L) = (1 − L)(1 − φ_2 L)···(1 − φ_p L) = (1 − L) θ*(L),
and we can write the model as
θ*(L)(1 − L) y_t = α(L) ε_t
θ*(L) Δy_t = α(L) ε_t.
An ARMA(p,q) model for Δ^d y_t is denoted an ARIMA(p,d,q) model for y_t.
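The cancellation in the example above can be verified numerically (a sketch: np.polymul multiplies the lag polynomials, and both polynomials turn out to share the characteristic root z = 2, i.e. the inverse root 0.5):

import numpy as np

ma_poly = [1, -0.5]                      # 1 - 0.5L
print(np.polymul(ma_poly, ma_poly))      # [1, -1, 0.25]: the AR polynomial 1 - L + 0.25L^2

print(np.roots([0.25, -1, 1]))           # roots of theta(z) = 1 - z + 0.25z^2: [2, 2]
print(np.roots([-0.5, 1]))               # root of alpha(z) = 1 - 0.5z: [2]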
Example: Danish Real House Prices
Consider the Danish real house prices in logs, p_t. An AR(2) model yields
p_t = 1.55 p_{t−1} − 0.5734 p_{t−2} + 0.3426
      (2.7)        (−7.56)        (0.3)
(t-values in parentheses). The lag polynomial is given by
θ(L) = 1 − 1.55 L + 0.5734 L²,
with inverse roots given by 0.9422 and 0.6086.
One root is close to unity, and we estimate an ARIMA(2,1,0) model for p_t:
Δp_t = 1.323 Δp_{t−1} − 0.4853 Δp_{t−2} + 0.9959
       (6.6)          (−6.2)           (0.333)
The lag polynomial is given by
θ(L) = 1 − 1.323 L + 0.4853 L²,
with complex (inverse) roots given by
0.664 ± 0.2879·i,  where i = √(−1).

ARIMA(p,d,q) Model Selection
Find a transformation of the process that is stationary, e.g. Δ^d Y_t.
Recall that for the stationary AR(p) model:
The ACF is infinite but convergent.
The PACF is zero for lags larger than p.
For the MA(q) model:
The ACF is zero for lags larger than q.
The PACF is infinite but convergent.
The ACF and PACF thus contain information on p and q, and can be used to select relevant models.
If alternative models are nested, they can be tested. Model selection can also be based on information criteria,
IC = log σ̂² + penalty(T, #parameters),
where log σ̂² measures the likelihood and the second term is a penalty for the number of parameters. The information criteria should be minimized!
Three important criteria:
AIC = log σ̂² + 2k/T
HQ = log σ̂² + 2k·log(log(T))/T
BIC = log σ̂² + k·log(T)/T,
where k is the number of estimated parameters, e.g. k = p + q.

Example: Consumption-Income Ratio
[Figure: (A) consumption (c) and income (y) in logs; (B) the consumption-income ratio in logs; (C) ACF of the series in (B); (D) PACF of the series in (B).]
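A direct implementation of the three criteria (a sketch; note that software packages often report rescaled versions, e.g. based on −2·log L/T, so the levels are not comparable across programs, only the rankings within one program):

import numpy as np

def info_criteria(sigma2_hat, T, k):
    # AIC, HQ, BIC as defined above; all to be minimized
    base = np.log(sigma2_hat)
    return (base + 2 * k / T,
            base + 2 * k * np.log(np.log(T)) / T,
            base + k * np.log(T) / T)

print(info_criteria(sigma2_hat=0.000576, T=130, k=3))   # values close to the ARMA(1,1) fit below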
Model       T    p   log-lik      SC        HQ        AIC
ARMA(2,2)  130   5   301.825     -4.448    -4.563    -4.55
ARMA(2,1)  130   4   301.39537   -4.477    -4.524    -4.5599
ARMA(2,0)  130   3   301.38984   -4.59     -4.5483   -4.5752
ARMA(1,2)  130   4   301.42756   -4.4722   -4.5246   -4.564
ARMA(1,1)  130   3   299.99333   -4.503    -4.5422   -4.569
ARMA(1,0)  130   2   296.7449    -4.486    -4.578    -4.5258
ARMA(0,0)  130   1   249.8264    -3.86     -3.89     -3.828

---- Maximum likelihood estimation of ARFIMA(1,0,1) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

             Coefficient   Std.Error   t-value   t-prob
AR-1            0.85736      0.0565      15.2     0.000
MA-1           -0.30082      0.09825     -3.06    0.003
Constant       -0.0934       0.009898    -9.44    0.000

log-likelihood     299.993327
sigma              0.0239986
sigma^2            0.000575934

---- Maximum likelihood estimation of ARFIMA(2,0,0) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

             Coefficient   Std.Error   t-value   t-prob
AR-1            0.536083     0.08428      6.36    0.000
AR-2            0.250548     0.08479      2.95    0.004
Constant       -0.093547     0.00948     -9.87    0.000

log-likelihood     301.38984
sigma              0.0239238
sigma^2            0.000572349
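A model-selection loop in the spirit of the table above can be run in statsmodels (a sketch on a simulated stand-in series, not the original OxMetrics/PcGive data; the ARMA parameters of the stand-in are arbitrary):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(0)
y = arma_generate_sample([1, -0.85], [1, -0.3], nsample=130, burnin=100)

for p in range(3):
    for q in range(3):
        res = ARIMA(y, order=(p, 0, q), trend="c").fit()
        print(f"ARMA({p},{q})  log-lik {res.llf:9.3f}  AIC {res.aic:9.3f}  "
              f"HQ {res.hqic:9.3f}  BIC {res.bic:9.3f}")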
Estimation of ARMA Models
The natural estimator is maximum likelihood. With normal errors,
log L(θ, α, σ²) = −(T/2)·log(2πσ²) − Σ_{t=1}^{T} ε_t²/(2σ²),
where ε_t is the residual.
For an AR(1) model we can write the residual as
ε_t = Y_t − δ − θ Y_{t−1},
and OLS coincides with ML.
It is usual to condition on the initial values. Alternatively, we can postulate a distribution for the first observation, e.g.
Y_1 ∼ N(δ/(1 − θ), σ²/(1 − θ²)),
where the mean and variance are chosen as implied by the model for the rest of the observations. We say that Y_1 is chosen from the invariant distribution.

For the MA(1) model
Y_t = µ + ε_t + α ε_{t−1},
the residuals can be found recursively as a function of the parameters:
ε_1 = Y_1 − µ
ε_2 = Y_2 − µ − α ε_1
ε_3 = Y_3 − µ − α ε_2
...
Here, the initial value is ε_0 = 0, but that could be relaxed if required by using the invariant distribution.
The likelihood function can be maximized w.r.t. α and µ.
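A sketch of the resulting conditional log-likelihood for the MA(1), with the residuals built recursively from ε_0 = 0 as above (the function and argument names are illustrative):

import numpy as np

def ma1_loglik(mu, alpha, sigma2, Y):
    # residuals built recursively with eps_0 = 0, then the normal log-likelihood
    eps = np.zeros(len(Y))
    for t in range(len(Y)):
        eps[t] = Y[t] - mu - alpha * (eps[t - 1] if t > 0 else 0.0)
    T = len(Y)
    return -T / 2 * np.log(2 * np.pi * sigma2) - np.sum(eps**2) / (2 * sigma2)

In practice one would maximize this over (µ, α, σ²), e.g. by passing the negative of the function to scipy.optimize.minimize.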
Forecasting
It is easy to forecast with ARMA models. The main drawback is that there is no economic insight.
We want to predict y_{T+k} given all information up to time T, i.e. given the information set I_T = {y_1, ..., y_{T−1}, y_T}.
The optimal predictor is the conditional expectation
y_{T+k|T} = E[y_{T+k} | I_T].

Consider the ARMA(1,1) model
y_t = θ y_{t−1} + ε_t + α ε_{t−1},  t = 1, 2, ..., T.
To forecast we:
(1) Substitute the estimated parameters for the true ones.
(2) Use estimated residuals up to time T. Hereafter, the best forecast is zero.
The optimal forecasts will be
y_{T+1|T} = E[θ y_T + ε_{T+1} + α ε_T | I_T] = θ̂ y_T + α̂ ε̂_T
y_{T+2|T} = E[θ y_{T+1} + ε_{T+2} + α ε_{T+1} | I_T] = θ̂ y_{T+1|T}.
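The forecast recursion can be written out directly (a sketch; the inputs y_T, the last estimated residual, and the parameter estimates are assumed given):

def arma11_forecast(y_T, eps_T_hat, theta_hat, alpha_hat, horizon):
    # one-step forecast uses the last residual; beyond that the MA term drops out
    f = [theta_hat * y_T + alpha_hat * eps_T_hat]
    for _ in range(horizon - 1):
        f.append(theta_hat * f[-1])
    return f

print(arma11_forecast(y_T=1.0, eps_T_hat=0.2, theta_hat=0.85, alpha_hat=-0.3, horizon=4))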
[Figure: forecasts of the consumption-income ratio with confidence bands, plotted against the actual values, 1990-2008.]