Lecture 12: Time Series Analysis III
MIT 18.S096, Dr. Kempthorne, Fall 2013
Cointegration

An m-dimensional stochastic process $\{X_t\} = \{\ldots, X_{t-1}, X_t, \ldots\}$ is I(d), integrated of order d, if the d-differenced process $\nabla^d X_t = (1-L)^d X_t$ is stationary.

If $\{X_t\}$ has a VAR(p) representation, i.e., $\Phi(L) X_t = \epsilon_t$, where
$\Phi(L) = I - A_1 L - A_2 L^2 - \cdots - A_p L^p,$
then $\Phi(L) = (1-L)^d \Phi^*(L)$, where $\Phi^*(L) = I - A_1^* L - A_2^* L^2 - \cdots - A_{p^*}^* L^{p^*}$ specifies the stationary VAR($p^*$) process $\{\nabla^d X_t\}$ with $p^* = p - d$.

Issue: every component series of $\{X_t\}$ may be I(1), but the process need not be jointly integrated: linear combinations of the component series (without any differencing) may be stationary! If so, the multivariate time series $\{X_t\}$ is cointegrated.
Consider $\{X_t\}$ where $X_t = (x_{1,t}, x_{2,t}, \ldots, x_{m,t})'$ is an m-vector of component time series, each I(1), integrated of order 1.

If $\{X_t\}$ is cointegrated, then there exists an m-vector $\beta = (\beta_1, \beta_2, \ldots, \beta_m)'$ such that
$\beta' X_t = \beta_1 x_{1,t} + \beta_2 x_{2,t} + \cdots + \beta_m x_{m,t} \sim I(0)$, a stationary process.

The cointegrating vector $\beta$ can be scaled arbitrarily, so assume a normalization: $\beta = (1, \beta_2, \ldots, \beta_m)'$.

The expression $\beta' X_t = u_t$, where $\{u_t\} \sim I(0)$, is equivalent to
$x_{1,t} = -(\beta_2 x_{2,t} + \cdots + \beta_m x_{m,t}) + u_t,$
where $x_{1,t} = -(\beta_2 x_{2,t} + \cdots + \beta_m x_{m,t})$ is the long-run equilibrium relationship and $u_t = \beta' X_t$ is the disequilibrium error / cointegrating residual.
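To make the definition concrete, here is a minimal simulation sketch in Python: two I(1) series built from a common random-walk trend, whose combination $x_{1,t} - 2x_{2,t}$ is I(0). The parameter values and the use of the ADF test as an informal stationarity check are illustrative assumptions, not from the lecture.

```python
# Sketch: simulate a cointegrated pair and check beta'X_t is stationary.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
T = 1000
trend = np.cumsum(rng.normal(size=T))           # common random walk (I(1))
x2 = trend + rng.normal(scale=0.5, size=T)      # I(1) component series
x1 = 2.0 * trend + rng.normal(scale=0.5, size=T)

# Each series is I(1): the ADF test fails to reject a unit root.
print("ADF p-value, x1:", adfuller(x1)[1])
# The combination beta'X_t = x1 - 2*x2 removes the trend, so it is I(0).
u = x1 - 2.0 * x2
print("ADF p-value, x1 - 2*x2:", adfuller(u)[1])
```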
Examples of Cointegration

- Term structure of interest rates: expectations hypothesis.
- Purchasing power parity in foreign exchange: cointegration among the exchange rate and foreign and domestic prices.
- Money demand: cointegration among money, income, prices, and interest rates.
- Covered interest rate parity: cointegration among forward and spot exchange rates.
- Law of one price: cointegration among identical/equivalent assets that must be valued identically to limit arbitrage.
  - Spot and futures prices.
  - Prices of the same asset on different trading venues.
VAR(p) Model

The m-dimensional multivariate time series $\{X_t\}$ follows the VAR(p) model with autoregressive order p if
$X_t = C + \Phi_1 X_{t-1} + \Phi_2 X_{t-2} + \cdots + \Phi_p X_{t-p} + \eta_t$
where
- $C = (c_1, c_2, \ldots, c_m)'$ is an m-vector of constants,
- $\Phi_1, \Phi_2, \ldots, \Phi_p$ are (m x m) matrices of coefficients,
- $\{\eta_t\}$ is multivariate white noise, $MWN(0_m, \Sigma)$.

The VAR(p) model is covariance stationary if
$\det\left[I_m - (\Phi_1 z + \Phi_2 z^2 + \cdots + \Phi_p z^p)\right] = 0$
has all roots outside the unit circle $|z| \le 1$ for complex z.

Suppose instead that $\{X_t\}$ is I(1), integrated of order 1. We develop a Vector Error Correction Model representation of this model by successive modifications of the model equation:
VAR(p) Model Equation
$X_t = C + \Phi_1 X_{t-1} + \Phi_2 X_{t-2} + \cdots + \Phi_p X_{t-p} + \eta_t$

Subtract $X_{t-1}$ from both sides:
$\Delta X_t = X_t - X_{t-1} = C + (\Phi_1 - I_m) X_{t-1} + \Phi_2 X_{t-2} + \cdots + \Phi_p X_{t-p} + \eta_t$

Subtract and add $(\Phi_1 - I_m) X_{t-2}$ on the right-hand side:
$\Delta X_t = C + (\Phi_1 - I_m) \Delta X_{t-1} + (\Phi_2 + \Phi_1 - I_m) X_{t-2} + \cdots + \Phi_p X_{t-p} + \eta_t$

Subtract and add $(\Phi_2 + \Phi_1 - I_m) X_{t-3}$ on the right-hand side, and continue:
$\Delta X_t = C + (\Phi_1 - I_m)\Delta X_{t-1} + (\Phi_2 + \Phi_1 - I_m)\Delta X_{t-2} + (\Phi_3 + \Phi_2 + \Phi_1 - I_m) X_{t-3} + \cdots$
$= C + (\Phi_1 - I_m)\Delta X_{t-1} + (\Phi_2 + \Phi_1 - I_m)\Delta X_{t-2} + (\Phi_3 + \Phi_2 + \Phi_1 - I_m)\Delta X_{t-3} + \cdots + (\Phi_{p-1} + \cdots + \Phi_2 + \Phi_1 - I_m)\Delta X_{t-(p-1)} + (\Phi_p + \cdots + \Phi_2 + \Phi_1 - I_m) X_{t-p} + \eta_t$

Reversing the order in which the $\Delta$-terms are incorporated, we can instead derive
$\Delta X_t = C + \Pi X_{t-1} + \Gamma_1 \Delta X_{t-1} + \cdots + \Gamma_{p-1} \Delta X_{t-(p-1)} + \eta_t$
where $\Pi = (\Phi_1 + \Phi_2 + \cdots + \Phi_p - I_m)$ and $\Gamma_k = -\sum_{j=k+1}^{p} \Phi_j$, for $k = 1, 2, \ldots, p-1$.
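The reparametrization above is purely algebraic, so it can be checked mechanically. A small sketch computing $\Pi$ and the $\Gamma_k$ from given VAR coefficients; the coefficient matrices here are hypothetical placeholders chosen so that $\Pi$ has reduced rank.

```python
# Sketch: VAR(p) -> VECM coefficient conversion,
#   Pi = Phi_1 + ... + Phi_p - I_m,   Gamma_k = -sum_{j=k+1}^p Phi_j.
import numpy as np

def var_to_vecm(Phi):
    """Phi: list of (m x m) VAR coefficient matrices [Phi_1, ..., Phi_p]."""
    m = Phi[0].shape[0]
    Pi = sum(Phi) - np.eye(m)
    Gamma = [-sum(Phi[j] for j in range(k + 1, len(Phi)))
             for k in range(len(Phi) - 1)]      # Gamma_1, ..., Gamma_{p-1}
    return Pi, Gamma

Phi1 = np.array([[0.3, 0.4], [0.2, 0.65]])      # hypothetical coefficients
Phi2 = np.array([[0.2, 0.1], [0.05, 0.1]])
Pi, Gamma = var_to_vecm([Phi1, Phi2])
print("Pi =\n", Pi)
print("rank(Pi) =", np.linalg.matrix_rank(Pi))  # rank < m signals cointegration
```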
Vector Error Correction Model (VECM)

The VAR(p) model for $\{X_t\}$ is a VECM model for $\{\Delta X_t\}$:
$\Delta X_t = C + \Pi X_{t-1} + \Gamma_1 \Delta X_{t-1} + \cdots + \Gamma_{p-1} \Delta X_{t-(p-1)} + \eta_t$

By assumption, the VAR(p) process $\{X_t\}$ is I(1), so the VECM model for $\{\Delta X_t\}$ is I(0):
- The left-hand side $\Delta X_t$ is stationary / I(0).
- The terms $\Delta X_{t-j}$, $j = 1, 2, \ldots, p-1$, on the right-hand side are stationary / I(0).
- The term $\Pi X_{t-1}$ must therefore be stationary / I(0).

The term $\Pi X_{t-1}$ contains any cointegrating terms of $\{X_t\}$. Given that the VAR(p) process has unit roots, $\Pi$ must be singular, i.e., the linear transformation $\Pi$ eliminates the unit roots.
The matrix $\Pi$ is of reduced rank $r < m$, and either
- $\mathrm{rank}(\Pi) = 0$, so $\Pi = 0$ and there are no cointegrating relationships, or
- $\mathrm{rank}(\Pi) > 0$ and $\Pi$ defines the cointegrating relationships.

If cointegrating relationships exist, then $\mathrm{rank}(\Pi) = r$ with $0 < r < m$, and we can write
$\Pi = \alpha \beta',$
where $\alpha$ and $\beta$ are each (m x r) matrices of full rank r. The columns of $\beta$ define r linearly independent vectors that cointegrate $X_t$.

The decomposition of $\Pi$ is not unique: for any invertible (r x r) matrix G, $\Pi = \tilde\alpha \tilde\beta'$ where $\tilde\alpha = \alpha G$ and $\tilde\beta' = G^{-1} \beta'$.
Unrestricted Least Squares Estimation

Sims, Stock, and Watson (1990) and Park and Phillips (1989) prove that, in estimation of cointegrated VAR(p) models, the least-squares estimator of the original model yields parameter estimates that:
- are consistent;
- have asymptotic distributions identical to those of maximum-likelihood estimators;
- satisfy the constraints on parameters due to cointegration (i.e., the reduced rank of $\Pi$) asymptotically.

Maximum Likelihood Estimation*
Banerjee and Hendry (1992): apply the method of concentrated likelihood to solve for maximum-likelihood estimates.
* Advanced topic for optional reading/study.
Maximum Likelihood Estimation (continued)

Johansen (1991) develops a reduced-rank regression methodology for maximum-likelihood estimation of VECM models. This methodology provides likelihood-ratio tests for the number of cointegrating vectors:
- Johansen's Trace Statistic (based on the sum of the eigenvalues of $\hat\Pi$).
- Johansen's Maximum-Eigenvalue Statistic (based on the largest eigenvalue of $\hat\Pi$).
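A usage sketch of these tests via statsmodels' `coint_johansen`, run on simulated data; the `det_order` (0 = constant term) and `k_ar_diff` (number of lagged differences) choices are illustrative assumptions.

```python
# Sketch: Johansen trace and maximum-eigenvalue tests on a simulated pair.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(1)
trend = np.cumsum(rng.normal(size=500))          # shared I(1) trend
data = np.column_stack([trend + rng.normal(size=500),
                        0.5 * trend + rng.normal(size=500)])

res = coint_johansen(data, det_order=0, k_ar_diff=1)
print("trace statistics:    ", res.lr1)    # H0: rank <= r, for r = 0, 1, ...
print("trace 95% crit vals: ", res.cvt[:, 1])
print("max-eig statistics:  ", res.lr2)
print("estimated cointegrating vectors (columns):\n", res.evec)
```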
Linear State-Space Model

General State-Space Formulation
- $y_t$: (k x 1) observation vector at time t
- $s_t$: (m x 1) state vector at time t
- $\epsilon_t$: (k x 1) observation-error vector at time t
- $\eta_t$: (n x 1) state-transition innovation/error vector

State Equation / Transition Equation
$s_{t+1} = T_t s_t + R_t \eta_t$
where
- $T_t$: (m x m) transition coefficients matrix
- $R_t$: (m x n) fixed matrix, often consisting of columns of the identity matrix
- $\eta_t$: i.i.d. $N(0_n, Q_t)$, where $Q_t$ (n x n) is positive definite.
Linear State-Space Model Formulation

Observation Equation / Measurement Equation
$y_t = Z_t s_t + \epsilon_t$
where
- $Z_t$: (k x m) observation coefficients matrix
- $\epsilon_t$: i.i.d. $N(0_k, H_t)$, where $H_t$ (k x k) is positive definite.

Joint Equation
$\begin{bmatrix} s_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} T_t \\ Z_t \end{bmatrix} s_t + \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix} = \Phi_t s_t + u_t$
where $u_t = \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix} \sim N(0, \Omega)$, with $\Omega = \begin{bmatrix} R_t Q_t R_t' & 0 \\ 0 & H_t \end{bmatrix}$.

Note: often the model is time-invariant ($T_t, R_t, Z_t, Q_t, H_t$ constant).
CAPM Model with Time-Varying Betas

Consider the CAPM model with time-varying parameters:
$r_t = \alpha_t + \beta_t r_{m,t} + \epsilon_t, \quad \epsilon_t \sim N(0, \sigma_\epsilon^2)$
$\alpha_{t+1} = \alpha_t + \nu_t, \quad \nu_t \sim N(0, \sigma_\nu^2)$
$\beta_{t+1} = \beta_t + \xi_t, \quad \xi_t \sim N(0, \sigma_\xi^2)$
where
- $r_t$ is the excess return of a given asset,
- $r_{m,t}$ is the excess return of the market portfolio,
- $\{\epsilon_t\}, \{\nu_t\}, \{\xi_t\}$ are mutually independent processes.

Note: $\{\alpha_t\}$ is a random walk with i.i.d. steps $N(0, \sigma_\nu^2)$, and $\{\beta_t\}$ is a random walk with i.i.d. steps $N(0, \sigma_\xi^2)$ (mutually independent processes).
Time-Varying CAPM Model: Linear State-Space Model

State Equation
$\begin{bmatrix} \alpha_{t+1} \\ \beta_{t+1} \end{bmatrix} = \begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix} + \begin{bmatrix} \nu_t \\ \xi_t \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \nu_t \\ \xi_t \end{bmatrix}$

Equivalently, $s_{t+1} = T_t s_t + R_t \eta_t$, where
$s_t = \begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix}, \quad T_t = R_t = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \eta_t = \begin{bmatrix} \nu_t \\ \xi_t \end{bmatrix} \sim N_2(0_2, Q_t), \text{ with } Q_t = \begin{bmatrix} \sigma_\nu^2 & 0 \\ 0 & \sigma_\xi^2 \end{bmatrix}$

Terms: state vector $s_t$, transition coefficients $T_t$, transition white noise $\eta_t$.
Time-Varying CAPM Model: Linear State-Space Model

Observation Equation / Measurement Equation
$r_t = \begin{bmatrix} 1 & r_{m,t} \end{bmatrix} \begin{bmatrix} \alpha_t \\ \beta_t \end{bmatrix} + \epsilon_t$
Equivalently, $r_t = Z_t s_t + \epsilon_t$, where
- $Z_t = [1 \;\; r_{m,t}]$ is the observation coefficients matrix,
- $\epsilon_t \sim N(0, H_t)$ is the observation white noise, with $H_t = \sigma_\epsilon^2$.

Joint System of Equations
$\begin{bmatrix} s_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} T_t \\ Z_t \end{bmatrix} s_t + \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix} = \Phi_t s_t + u_t, \quad \text{with } \mathrm{Cov}(u_t) = \Omega_t = \begin{bmatrix} R_t Q_t R_t' & 0 \\ 0 & H_t \end{bmatrix}$
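A minimal sketch of filtering this model by hand: simulate the time-varying CAPM and run the Kalman prediction/correction recursions (covered later in this lecture). All variance values and the initial conditions here are illustrative assumptions.

```python
# Sketch: Kalman filter for the time-varying CAPM, state s_t = (alpha_t, beta_t)'.
import numpy as np

rng = np.random.default_rng(2)
T = 500
r_m = rng.normal(scale=0.01, size=T)              # market excess returns
alpha, beta = 0.0, 1.0
sig_eps, sig_nu, sig_xi = 0.02, 1e-4, 1e-3        # illustrative std devs

r = np.empty(T)
for t in range(T):                                # simulate the model
    r[t] = alpha + beta * r_m[t] + rng.normal(scale=sig_eps)
    alpha += rng.normal(scale=sig_nu)
    beta += rng.normal(scale=sig_xi)

# Filter: T_t = R_t = I_2, Z_t = [1, r_{m,t}], H_t = sig_eps^2.
Q = np.diag([sig_nu**2, sig_xi**2])
s = np.zeros(2)                                   # s_{1|0}
P = np.eye(2)                                     # Omega_s(1|0), vague prior
beta_path = np.empty(T)
for t in range(T):
    Z = np.array([1.0, r_m[t]])
    F = Z @ P @ Z + sig_eps**2                    # Omega_y(t|t-1), scalar here
    G = P @ Z / F                                 # filter gain G_t
    s = s + G * (r[t] - Z @ s)                    # correction step
    P = P - np.outer(G, Z @ P)
    beta_path[t] = s[1]                           # filtered beta_t
    P = P + Q                                     # prediction for t+1 (T_t = I)

print("final filtered (alpha, beta):", s)
```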
Linear Regression Model with Time-Varying $\beta$

Consider a normal linear regression model with time-varying regression coefficients:
$y_t = x_t' \beta_t + \epsilon_t$, where $\epsilon_t$ are i.i.d. $N(0, \sigma_\epsilon^2)$,
- $x_t = (x_{1,t}, x_{2,t}, \ldots, x_{p,t})'$, the p-vector of explanatory variables,
- $\beta_t = (\beta_{1,t}, \beta_{2,t}, \ldots, \beta_{p,t})'$, the regression parameter vector,
and for each parameter component $j = 1, \ldots, p$,
$\beta_{j,t+1} = \beta_{j,t} + \eta_{j,t}$, with $\{\eta_{j,t}, t = 1, 2, \ldots\}$ i.i.d. $N(0, \sigma_j^2)$,
i.e., each coefficient is a random walk with i.i.d. steps $N(0, \sigma_j^2)$.

Joint State-Space Equations
$\begin{bmatrix} s_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} I_p \\ x_t' \end{bmatrix} s_t + \begin{bmatrix} \eta_t \\ \epsilon_t \end{bmatrix} = \begin{bmatrix} T_t \\ Z_t \end{bmatrix} s_t + \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix}$, with state vector $s_t = \beta_t$
(Time-Varying) Linear Regression as a State-Space Model

where
- $\eta_t \sim N(0, Q_t)$, with $Q_t = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_p^2)$,
- $\epsilon_t \sim N(0, H_t)$, with $H_t = \sigma_\epsilon^2$.

Special case $\sigma_j^2 \equiv 0$: the normal linear regression model. Successive estimation of the state-space model parameters for $t = p+1, p+2, \ldots$ yields a recursive updating algorithm for the linear time-series regression model, as sketched below.
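For intuition, setting $Q_t = 0$ collapses the CAPM-style filter above to recursive least squares for the fixed-coefficient regression. A minimal sketch with hypothetical inputs `X` (T x p) and `y`; the diffuse initialization and $H_t = 1$ normalization are assumptions for illustration.

```python
# Sketch: recursive least squares as a Kalman filter with Q = 0.
import numpy as np

def recursive_ls(X, y, P0_scale=1e6):
    p = X.shape[1]
    b = np.zeros(p)                   # state s_t = beta (constant over time)
    P = P0_scale * np.eye(p)          # diffuse initial covariance
    for x_t, y_t in zip(X, y):
        F = x_t @ P @ x_t + 1.0       # innovation variance (H_t normalized to 1)
        G = P @ x_t / F               # gain
        b = b + G * (y_t - x_t @ b)   # correction; no prediction step since Q = 0
        P = P - np.outer(G, x_t @ P)
    return b                          # approaches the OLS estimate as P0 grows
```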
Autoregressive Model AR(p)

Consider the AR(p) model $\phi(L) y_t = \epsilon_t$, where $\phi(L) = 1 - \sum_{j=1}^{p} \phi_j L^j$ and $\{\epsilon_t\}$ i.i.d. $N(0, \sigma_\epsilon^2)$, so
$y_{t+1} = \sum_{j=1}^{p} \phi_j y_{t+1-j} + \epsilon_{t+1}$

Define the state vector $s_t = (y_t, y_{t-1}, \ldots, y_{t-p+1})'$. Then
$s_{t+1} = \begin{bmatrix} y_{t+1} \\ y_t \\ y_{t-1} \\ \vdots \\ y_{t-(p-2)} \end{bmatrix} = \begin{bmatrix} \phi_1 & \phi_2 & \cdots & \phi_{p-1} & \phi_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} y_t \\ y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-(p-1)} \end{bmatrix} + \begin{bmatrix} \epsilon_{t+1} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$
State-Space Model for AR(p)

State Equation
$s_{t+1} = T_t s_t + R_t \eta_t$, where
$T_t = \begin{bmatrix} \phi_1 & \phi_2 & \cdots & \phi_{p-1} & \phi_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}, \quad R_t = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$
and $\{\eta_t = \epsilon_{t+1}\}$ i.i.d. $N(0, \sigma_\epsilon^2)$.

Observation Equation / Measurement Equation
$y_t = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} s_t = Z_t s_t$ (no measurement error)
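A short sketch assembling these companion-form matrices in numpy; the helper name `ar_state_space` and the example coefficients are hypothetical.

```python
# Sketch: build (T, R, Z) for the AR(p) companion-form state-space model.
import numpy as np

def ar_state_space(phi):
    """phi: sequence (phi_1, ..., phi_p)."""
    p = len(phi)
    T = np.zeros((p, p))
    T[0, :] = phi                          # first row: phi_1, ..., phi_p
    T[1:, :-1] = np.eye(p - 1)             # subdiagonal identity shifts the state
    R = np.zeros((p, 1)); R[0, 0] = 1.0    # noise enters the first component
    Z = np.zeros((1, p)); Z[0, 0] = 1.0    # y_t = first state component
    return T, R, Z

T, R, Z = ar_state_space([0.5, -0.3, 0.2])
print(T)
```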
Moving Average Model MA(q)

Consider the MA(q) model $y_t = \theta(L) \epsilon_t$, where $\theta(L) = 1 + \sum_{j=1}^{q} \theta_j L^j$ and $\{\epsilon_t\}$ i.i.d. $N(0, \sigma_\epsilon^2)$, so
$y_{t+1} = \epsilon_{t+1} + \sum_{j=1}^{q} \theta_j \epsilon_{t+1-j}$

Define the state vector $s_t = (\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q})'$. Then
$s_{t+1} = \begin{bmatrix} \epsilon_t \\ \epsilon_{t-1} \\ \epsilon_{t-2} \\ \vdots \\ \epsilon_{t-(q-1)} \end{bmatrix} = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \epsilon_{t-1} \\ \epsilon_{t-2} \\ \epsilon_{t-3} \\ \vdots \\ \epsilon_{t-q} \end{bmatrix} + \begin{bmatrix} \epsilon_t \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$
State-Space Model for MA(q)

State Equation
$s_{t+1} = T_t s_t + R_t \eta_t$, where
$T_t = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}, \quad R_t = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$
and $\{\eta_t = \epsilon_t\}$ i.i.d. $N(0, \sigma_\epsilon^2)$.

Observation Equation / Measurement Equation
$y_t = \begin{bmatrix} \theta_1 & \theta_2 & \cdots & \theta_{q-1} & \theta_q \end{bmatrix} s_t + \epsilon_t = Z_t s_t + \epsilon_t$
Autoregressive Moving-Average Model ARMA(p, q)

Consider the ARMA(p, q) model $\phi(L) y_t = \theta(L) \epsilon_t$, where
$\phi(L) = 1 - \sum_{j=1}^{p} \phi_j L^j, \quad \theta(L) = 1 + \sum_{j=1}^{q} \theta_j L^j,$
and $\{\epsilon_t\}$ i.i.d. $N(0, \sigma_\epsilon^2)$, so
$y_{t+1} = \sum_{j=1}^{p} \phi_j y_{t+1-j} + \epsilon_{t+1} + \sum_{j=1}^{q} \theta_j \epsilon_{t+1-j}$

Set $m = \max(p, q+1)$ and define padded coefficients
$\{\bar\phi_1, \ldots, \bar\phi_m\}$: $\phi(L) = 1 - \sum_{j=1}^{m} \bar\phi_j L^j$
$\{\bar\theta_1, \ldots, \bar\theta_m\}$: $\theta(L) = 1 + \sum_{j=1}^{m} \bar\theta_j L^j$
i.e., $\bar\phi_j = 0$ if $p < j \le m$ and $\bar\theta_j = 0$ if $q < j \le m$.
So $\{y_t\} \sim$ ARMA(m, m-1).
State-Space Model for ARMA(p, q)

Harvey (1993) State-Space Specification

Define the state vector $s_t = (s_{1,t}, s_{2,t}, \ldots, s_{m,t})'$, where $m = \max(p, q+1)$, recursively, starting with $s_{1,t} = y_t$. Use this definition and the main model equation to define $s_{2,t}$ and $\eta_t$:
$y_{t+1} = \sum_{j=1}^{p} \phi_j y_{t+1-j} + \epsilon_{t+1} + \sum_{j=1}^{q} \theta_j \epsilon_{t+1-j}$
$s_{1,t+1} = \bar\phi_1 s_{1,t} + 1 \cdot s_{2,t} + \eta_t$
where
$s_{2,t} = \sum_{i=2}^{m} \bar\phi_i y_{t+1-i} + \sum_{j=1}^{m-1} \bar\theta_j \epsilon_{t+1-j}, \quad \eta_t = \epsilon_{t+1}$
Use s 1,t = y t, s 2,t, and η t = ɛ t+1 : to define s 3,t m m 1 s 2,t+1 = i=2 φ iy t+2 j + j=1 θ jɛ t+2 j m m 1 = φ 2 y t + 1 [ i=3 φ iy t+2 j + j=2 θ j ɛ t+2 j ] + (θ 1 ɛ t+1 ) = φ 2 s 1,t + 1 [s 3,t ] + R 2,1 η t where m m 1 s 3,t = i=3 φ iy t+2 j + j=2 θjɛt+2 j R 2,1 = θ 1 η t = ɛ t+1 Use s 1,t = y t, s 3,t, and η t = ɛ t+1 : to define s 4,t m m 1 s 3,t+1 = i=3 φ iy t+3 j + j=2 θ jɛ t+3 j m m 1 = φ 3 y t + 1 [ i=4 φ iy t+3 j + j=3 θ j ɛ t+3 j ] + (θ 2 ɛ t+1 ) = φ 3 s 1,t + 1 [s 4,t ] + R 3,1 η t where m m 1 s 4,t = i=4 φ iy t+3 j + j=3 θjɛt+3 j R 3,1 = θ 2 η t = ɛ t+1 MIT 18.S096 Time Series Analysis III 28
Continuing until
$s_{m,t} = \sum_{i=m}^{m} \bar\phi_i y_{t+m-1-i} + \sum_{j=m-1}^{m-1} \bar\theta_j \epsilon_{t+m-1-j} = \bar\phi_m y_{t-1} + \bar\theta_{m-1} \epsilon_t$
which gives
$s_{m,t+1} = \bar\phi_m y_t + \bar\theta_{m-1} \epsilon_{t+1} = \bar\phi_m s_{1,t} + R_{m,1} \eta_t$
where $R_{m,1} = \bar\theta_{m-1}$ and $\eta_t = \epsilon_{t+1}$.

All the equations can be written together:
$s_{t+1} = T s_t + R \eta_t$
$y_t = Z s_t$ (no measurement-error term)
where
$T = \begin{bmatrix} \bar\phi_1 & 1 & 0 & \cdots & 0 \\ \bar\phi_2 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \\ \bar\phi_{m-1} & 0 & 0 & \cdots & 1 \\ \bar\phi_m & 0 & 0 & \cdots & 0 \end{bmatrix}, \quad R = \begin{bmatrix} 1 \\ \bar\theta_1 \\ \vdots \\ \bar\theta_{m-2} \\ \bar\theta_{m-1} \end{bmatrix}$
$\eta_t$ i.i.d. $N(0, \sigma_\epsilon^2)$, and $Z = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}$ (1 x m).
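A sketch assembling Harvey's matrices for given $(\phi, \theta)$, with zero-padding to $m = \max(p, q+1)$ as derived above; the helper name `harvey_arma_state_space` is hypothetical.

```python
# Sketch: Harvey (1993) state-space matrices for ARMA(p, q).
import numpy as np

def harvey_arma_state_space(phi, theta):
    """phi: (phi_1,...,phi_p); theta: (theta_1,...,theta_q)."""
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    phi_bar = np.zeros(m); phi_bar[:p] = phi            # pad AR coefficients
    theta_bar = np.zeros(m - 1); theta_bar[:q] = theta  # pad MA coefficients
    T = np.zeros((m, m))
    T[:, 0] = phi_bar                                   # first column: phi_bar
    T[:-1, 1:] = np.eye(m - 1)                          # superdiagonal identity
    R = np.concatenate(([1.0], theta_bar)).reshape(m, 1)
    Z = np.zeros((1, m)); Z[0, 0] = 1.0                 # y_t = s_{1,t}, no meas. error
    return T, R, Z

T, R, Z = harvey_arma_state_space(phi=[0.7, -0.2], theta=[0.4])
print("T =\n", T, "\nR =\n", R)
```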
Kalman Filter

Linear State-Space Model: Joint Equation
$\begin{bmatrix} s_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} T_t \\ Z_t \end{bmatrix} s_t + \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix} = \Phi_t s_t + u_t$
where $\{\eta_t\}$ i.i.d. $N_n(0, Q_t)$ and $\{\epsilon_t\}$ i.i.d. $N_k(0, H_t)$, so
$u_t = \begin{bmatrix} R_t \eta_t \\ \epsilon_t \end{bmatrix} \sim N_{m+k}(0_{m+k}, \Omega_t), \quad \text{with } \Omega_t = \begin{bmatrix} R_t Q_t R_t' & 0 \\ 0 & H_t \end{bmatrix}$

For $\mathcal{F}_t = \{y_1, y_2, \ldots, y_t\}$, the observations up to time t, the Kalman Filter is the recursive computation of the probability density functions:
- $p(s_{t+1} \mid \mathcal{F}_t)$, $t = 1, 2, \ldots$
- $p(s_{t+1}, y_{t+1} \mid \mathcal{F}_t)$, $t = 1, 2, \ldots$
- $p(y_{t+1} \mid \mathcal{F}_t)$, $t = 1, 2, \ldots$

Define $\Theta = \{$ all parameters in $T_t, Z_t, R_t, Q_t, H_t \}$.
Kalman Filter: Notation

Conditional means:
- $s_{t|t} = E(s_t \mid \mathcal{F}_t)$
- $s_{t|t-1} = E(s_t \mid \mathcal{F}_{t-1})$
- $y_{t|t-1} = E(y_t \mid \mathcal{F}_{t-1})$

Conditional covariances / mean-squared errors:
- $\Omega_s(t|t) = \mathrm{Cov}(s_t \mid \mathcal{F}_t) = E[(s_t - s_{t|t})(s_t - s_{t|t})']$
- $\Omega_s(t|t-1) = \mathrm{Cov}(s_t \mid \mathcal{F}_{t-1}) = E[(s_t - s_{t|t-1})(s_t - s_{t|t-1})']$
- $\Omega_y(t|t-1) = \mathrm{Cov}(y_t \mid \mathcal{F}_{t-1}) = E[(y_t - y_{t|t-1})(y_t - y_{t|t-1})']$

Observation innovations / residuals:
$\tilde\epsilon_t = y_t - y_{t|t-1} = y_t - Z_t s_{t|t-1}$
Kalman Filter: Four Steps

(1) Prediction Step: predict the state vector and observation vector at time t given $\mathcal{F}_{t-1}$:
$s_{t|t-1} = T_{t-1} s_{t-1|t-1}$
$y_{t|t-1} = Z_t s_{t|t-1}$
The predictions are conditional means, with mean-squared errors (MSEs):
$\Omega_s(t|t-1) = \mathrm{Cov}(s_t \mid \mathcal{F}_{t-1}) = T_{t-1} \Omega_s(t-1|t-1) T_{t-1}' + R_{t-1} Q_{t-1} R_{t-1}'$
$\Omega_y(t|t-1) = \mathrm{Cov}(y_t \mid \mathcal{F}_{t-1}) = Z_t \Omega_s(t|t-1) Z_t' + H_t$
(2) Correction / Filtering Step: update the prediction of the state vector and its MSE given the observation at time t:
$s_{t|t} = s_{t|t-1} + G_t (y_t - y_{t|t-1})$
$\Omega_s(t|t) = \Omega_s(t|t-1) - G_t \Omega_y(t|t-1) G_t'$
where $G_t = \Omega_s(t|t-1) Z_t' [\Omega_y(t|t-1)]^{-1}$ is the Filter Gain matrix.

(3) Forecasting Step: for times $t' > t$, with t the present, use the following recursion equations for $t' = t+1, t+2, \ldots$:
$s_{t'|t} = T_{t'-1} s_{t'-1|t}$
$\Omega_s(t'|t) = T_{t'-1} \Omega_s(t'-1|t) T_{t'-1}' + R_{t'-1} Q_{t'-1} R_{t'-1}'$
$y_{t'|t} = Z_{t'} s_{t'|t}$
$\Omega_y(t'|t) = Z_{t'} \Omega_s(t'|t) Z_{t'}' + H_{t'}$
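The prediction and correction steps translate directly into code. A generic one-cycle sketch in the notation above, assuming time-invariant matrices for brevity; the function name `kf_step` is hypothetical.

```python
# Sketch: one Kalman prediction + correction cycle.
import numpy as np

def kf_step(s_prev, P_prev, y_t, T, R, Q, Z, H):
    """s_prev, P_prev: s_{t-1|t-1} and Omega_s(t-1|t-1); y_t: k-vector."""
    # (1) Prediction step
    s_pred = T @ s_prev                       # s_{t|t-1}
    P_pred = T @ P_prev @ T.T + R @ Q @ R.T   # Omega_s(t|t-1)
    y_pred = Z @ s_pred                       # y_{t|t-1}
    F = Z @ P_pred @ Z.T + H                  # Omega_y(t|t-1)
    # (2) Correction / filtering step
    G = P_pred @ Z.T @ np.linalg.inv(F)       # filter gain G_t
    s_filt = s_pred + G @ (y_t - y_pred)      # s_{t|t}
    P_filt = P_pred - G @ F @ G.T             # Omega_s(t|t)
    return s_filt, P_filt, y_pred, F
```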
(4) Smoothing Step: update the predictions and MSEs for times $t' < t$ to use all the information in $\mathcal{F}_t$ rather than just $\mathcal{F}_{t'}$. Use the following recursion equations for $t' = t-1, t-2, \ldots$:
$s_{t'|t} = s_{t'|t'} + S_{t'} (s_{t'+1|t} - s_{t'+1|t'})$
$\Omega_s(t'|t) = \Omega_s(t'|t') - S_{t'} [\Omega_s(t'+1|t') - \Omega_s(t'+1|t)] S_{t'}'$
where $S_{t'} = \Omega_s(t'|t') T_{t'}' [\Omega_s(t'+1|t')]^{-1}$ is the Kalman Smoothing matrix.
Kalman Filter: Maximum Likelihood

Likelihood Function
Given $\theta = \{$ all parameters in $T_t, Z_t, R_t, Q_t, H_t \}$, we can write the likelihood function as
$L(\theta) = p(y_1, \ldots, y_T; \theta) = p(y_1; \theta) \, p(y_2 \mid y_1; \theta) \cdots p(y_T \mid y_1, \ldots, y_{T-1}; \theta)$

Assuming the transition errors $(\eta_t)$ and observation errors $(\epsilon_t)$ are Gaussian, the observations $y_t$ have the following conditional normal distributions:
$[y_t \mid \mathcal{F}_{t-1}; \theta] \sim N[y_{t|t-1}, \Omega_y(t|t-1)]$

The log likelihood is
$l(\theta) = \log p(y_1, \ldots, y_T; \theta) = \sum_{t=1}^{T} \log p(y_t \mid \mathcal{F}_{t-1}; \theta)$
$= -\frac{kT}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \log |\Omega_y(t|t-1)| - \frac{1}{2} \sum_{t=1}^{T} (y_t - y_{t|t-1})' [\Omega_y(t|t-1)]^{-1} (y_t - y_{t|t-1})$
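Combining the filter recursion with this formula gives a likelihood evaluator. A sketch reusing the hypothetical `kf_step` from the earlier sketch; `Y`, `s0`, and `P0` are assumed inputs.

```python
# Sketch: evaluate l(theta) by accumulating the innovation terms.
import numpy as np

def kalman_loglik(Y, s0, P0, T, R, Q, Z, H):
    """Y: (T_obs x k) observations; returns the Gaussian log likelihood."""
    s, P = s0, P0
    k = Y.shape[1]
    ll = 0.0
    for y_t in Y:
        s, P, y_pred, F = kf_step(s, P, y_t, T, R, Q, Z, H)
        e = y_t - y_pred                       # innovation y_t - y_{t|t-1}
        ll += -0.5 * (k * np.log(2 * np.pi)
                      + np.log(np.linalg.det(F))
                      + e @ np.linalg.solve(F, e))
    return ll

# Maximizing ll over theta (e.g., scipy.optimize.minimize on -ll)
# yields the ML estimates described on the next slide.
```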
Kalman Filter: Maximum Likelihood

Computing ML Estimates of $\theta$
The Kalman-filter algorithm provides all terms necessary to compute the likelihood function for any $\theta$. Methods for maximizing the log likelihood as a function of $\theta$:
- EM algorithm; see Dempster, Laird, and Rubin (1977).
- Nonlinear optimization methods; e.g., Newton-type methods.

As $T \to \infty$, the MLE $\hat\theta_T$ is:
- Consistent: $\hat\theta_T \to \theta$, the true parameter.
- Asymptotically normally distributed: $\hat\theta_T - \theta \xrightarrow{D} N(0, \mathcal{I}_T^{-1})$
where
$\mathcal{I}_T = E\left[ \left( \frac{\partial}{\partial \theta} \log L(\theta) \right) \left( \frac{\partial}{\partial \theta} \log L(\theta) \right)' \right] = (-1) \, E\left[ \frac{\partial^2 \log L(\theta)}{\partial \theta \, \partial \theta'} \right]$
is the Fisher Information matrix for $\theta$.
Notes

- Under Gaussian assumptions, all state variables and observation variables are jointly Gaussian, so the Kalman-filter recursions provide a complete specification of the model.
- The initial state vector $s_1$ is modeled as $N(\mu_{s_1}, \Omega_s(1))$, where the mean and covariance parameters are pre-specified. Choices depend on the application and can reflect diffuse (uncertain) initial information, or ergodic information (i.e., representing the long-run stationary distribution of the state variables).
- Under covariance-stationarity assumptions for the $\{\eta_t\}$ and $\{\epsilon_t\}$ processes, the recursion expressions remain valid for the conditional means/covariances.
MIT OpenCourseWare
http://ocw.mit.edu
18.S096 Topics in Mathematics with Applications in Finance, Fall 2013
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms