Segmentation and Cointegration on Yield Curve Predictability




Caio Almeida, Kym Ardison, Daniela Kubudi, Axel Simonsen, José Vicente

November 12, 2014

Abstract

We propose a parametric interest rate model that splits the term structure into segments. A class of such models is derived and compared, based on a sequence of out-of-sample forecasting exercises, to successful term structure benchmarks. We show that introducing cointegrated spreads in the latent factor dynamics (of all analyzed models) significantly improves out-of-sample U.S. Treasury yield forecasts when compared to autoregressive alternatives. More specifically, we identify that while local shocks improve longer horizon term structure forecasts, the combination of Error Correction Model dynamics and segmentation strongly improves short-maturity yield forecasts. We also identify that all analyzed models (except for the Gaussian Affine) lose power during the recent zero lower bound period.

Keywords: Preferred-Habitat Theory, Error Correction Models, Model Selection, Exponential Splines, Local Shocks.

We would like to thank Francis Diebold, Allan Timmermann, Michel van der Wel (discussant) and participants at the 2010 Workshop on Yield Curve Modeling and Forecasting at Erasmus Univ., the 2011 Brazilian Econometric Society Meeting, the 2012 SoFiE Annual Conference at Oxford, and the 2012 Brazilian Meeting of Finance for useful comments and suggestions. This paper previously circulated with the title "Forecasting Bond Yields with Segmented Term Structure Models". The views expressed are the authors' and do not necessarily reflect those of the Central Bank of Brazil. Caio Almeida and José Vicente gratefully acknowledge financial support from CNPq.

E-mail: calmeida@fgv.br, EPGE-FGV, Rio de Janeiro, Brazil.
E-mail: kym.ardison@fgvmail.br, EPGE-FGV, Rio de Janeiro, Brazil.
E-mail: dkubudi@gmail.com, EPGE-FGV, Rio de Janeiro, Brazil.
E-mail: axel.simonsen@behaviorcap.com.br, Behavior Capital Management.
E-mail: jose.valentim@bcb.gov.br, Ibmec Business School and Central Bank of Brazil.

1 Introduction

The search for the driving forces of bond yields has produced a large number of models and methods, explored in different empirical applications. Despite the considerable effort devoted to this subject, the literature still lacks completely satisfactory results, especially with regard to forecasting future yields, where most models struggle to outperform the Random Walk. In this paper, motivated by this continuing search for a better understanding of the yield curve and for stronger forecasting models, we propose a class of exponential-spline models that segment the yield curve into different regions of maturities. The model is based on ideas from three main sources: the preferred-habitat theory of the term structure (Modigliani and Sutch, 1966), new methods of functional time series analysis applied to the yield curve (Bowsher and Meeks, 2008), and the exponential Dynamic Nelson and Siegel model (Diebold and Li, 2006).

The preferred-habitat theory of the term structure advocates that interest rates for each maturity may be influenced by local shocks. Empirical evidence related to this theory reveals that the supply of and demand for Treasury bonds have non-negligible effects on yield spreads, term structure movements, and bond risk premia. [1] Inspired by this theory and the related empirical findings, we separate the yield curve into segments that are interconnected, representing a whole curve, but that at the same time have their own local shocks.

[1] See Krishnamurthy (2002), Greenwood and Vayanos (2010, 2014), and Krishnamurthy and Vissing-Jorgensen (2012).

Bowsher and Meeks (2008) assume that the yield curve is a perturbed version of a polynomial cubic spline function. In their methodology, the latent variables are yields with maturities coinciding with the knots of the spline, while observed yields are assumed to be contaminated with error. They show that treating the term structure as a perturbed spline is very useful for producing good short-term forecasts of future yields, especially when there is a large cross-section of correlated yields.

We expand their perturbed spline idea in two dimensions. First, we consider exponential-type functions as an alternative to the cubic polynomials that appear within knots in the Bowsher and Meeks spline formulation. [2] Those exponential splines naturally generalize Diebold and Li (2006) type models, yielding segmented versions of four-factor exponential models. Second, we look at these spline functions from two different perspectives: similarly to Bowsher and Meeks, as perturbed splines with latent yields at the knots, and alternatively as models with sets of local factors that drive the dynamics of specific segments of the yield curve. In this sense, we connect the functional time series idea of Bowsher and Meeks to the preferred-habitat theory that segments the yield curve. In fact, we use the recently formalized preferred-habitat model of Vayanos and Vila (2009) to bring important insights to our reduced-form version of segmentation, which is inspired by their structural model (see Section 2).

The proposed segmented models split the original set of maturities into a group of segments. Each shock to a local factor affects the maturities specific to its segment. The model is completed by imposing spline smoothing conditions on the yield curve that interconnect all local factors as part of a unique global system, a restriction that resembles the role of arbitrageurs in the preferred-habitat model of Section 2.

In the empirical section, we compare out-of-sample forecasts of segmented models against successful competitors in the term structure literature: the Random Walk, Diebold and Li, the Dynamic Svensson Model [3], and a three-factor Gaussian Affine model with an essentially affine market price of risk. Note that Diebold and Li and its variation are parsimonious exponential models that have outperformed the Random Walk in previous studies. The essentially affine model is included due to its good forecasting performance when compared to other versions of Gaussian Affine models (see Duffee, 2002).
The choice of the Random Walk as our main benchmark was motivated by Carriero, Kapetanios, and Marcellino (2012), who showed that it outperforms a large number of term structure models in out-of-sample forecasting exercises.

[2] Later we show that the exponential spline models outperform polynomial spline models in out-of-sample forecasting exercises.
[3] The Dynamic Svensson Model adds a second curvature to the Diebold and Li model, and its forecasting ability was analyzed in Almeida et al. (2009).

An important aspect of our contribution regards the choice of dynamics for the latent factors in term structure models. While the vast majority of papers in term structure modeling adopt autoregressive processes [4], we show that adopting Error Correction Models (ECM) for the factor dynamics strongly improves the forecasting performance of a number of models. In particular, our segmented models, Diebold and Li, and the Dynamic Svensson Model produce much stronger out-of-sample forecasting results when estimated with ECM factor dynamics instead of the corresponding autoregressive dynamics. Moreover, all these models with ECM dynamics outperform the Random Walk in out-of-sample forecasts of short-maturity yields for different forecasting horizons (1 month, 6 months and 1 year). [5]

[4] Most dynamic Affine term structure models are vector autoregressive (VAR) models of order one in continuous time (see Duffee, 2002). In addition, the Diebold and Li model and most of its variations also adopt VAR dynamics (see Diebold and Rudebusch, 2013). An important exception appears in Bowsher and Meeks (2008), which inspires our use of ECM latent factor dynamics.
[5] See also van Dijk et al. (2014), who show that incorporating slowly time-varying means into the autoregressive processes of the Diebold and Li factor dynamics strongly improves out-of-sample forecasting results, especially for long-maturity yields and long-horizon forecasts in the US market.

We decompose our empirical exercise into two parts. First, all models are estimated with AR/VAR dynamics, and their forecasting ability is compared across models and against the Random Walk. In a second step, for each model (except for the Affine), we change the factor dynamics in the estimation process from VAR to ECM and perform the model comparisons again.

Although most of the empirical results are carefully described in the empirical section, we highlight here the comparison between exponential segmented models with ECM dynamics and the Random Walk. In this context, exponential segmented models have considerably smaller RMSEs for short maturities at all forecasting horizons. For intermediate and long maturities, the results are mixed and depend on the forecasting horizon. Segmented models consistently outperform the Random Walk at the 12-month forecasting horizon for all maturities. However, they do slightly worse at the 6-month forecasting horizon for most intermediate and long maturities. [6] The improvement in the forecasting performance of segmented models against the Random Walk is statistically significant according to Diebold and Mariano tests (see Table 3).

[6] All the forecasting results in the paper are confirmed in a number of robustness tests that appear in a supplementary online appendix. There we switch from FRED bootstrapped yields to unsmoothed and smoothed Fama-Bliss yields.

In summary, our results indicate that the yield curve is affected by local shocks on its short-term segment. In addition, another important effect observed is that segmentation implies an improvement of longer horizon forecasts for all maturities analyzed. Those results can be related to recent research identifying that intermediation frictions, manifesting either through a funding liquidity factor (Fontaine and Garcia, 2012) or a slow ability to move capital (Duffie, 2010), explain part of the US Treasury bond market dynamics.

Fontaine and Garcia (2012) identify a liquidity factor based on bonds with similar cash flows but different ages, which is negatively correlated with bond risk premia. Such a factor suggests that US Treasury bonds work as hedges against funding liquidity shocks. In this sense, part of the improvement in forecasting short-term maturities with our segmented models might be related to capturing the dynamics of some liquidity issues and frictions in the repo markets, as suggested by Fontaine and Garcia (2012). Some (weak) empirical evidence in this direction is revealed in one of our five-factor exponential segmented models (NS4). Within this model, two (local) factors associated with short-maturity movements present correlations of around 30% with Fontaine and Garcia's liquidity factor, while the remaining three factors, which may be more associated with traditional term structure movements, present only half this correlation with the same liquidity factor. [7]

[7] Similar results for the correlation between the liquidity factor and the latent factors in the extended segmented exponential model NS4E are also obtained.

In a related issue, Duffie (2010) shows that patterns of price responses to supply and demand shocks involve an initial sharp reaction followed by a subsequent slower reversal in different markets, including the US Treasury market. The speed of this reversal depends on the amount of intermediary capital that needs to be raised. This theoretical result can be related to the improvement obtained for longer horizon (one-year) forecasts with our segmented models. In fact, our maturity segmentation combined with ECM factor dynamics could be picking up, as part of its improvement in forecasting power, slower moving responses to supply and demand shocks happening in the US Treasury market. Although we are aware that such reversals of risk premia certainly have short-term cycles (of weeks at most), we conjecture that their existence could be enough to impact the dynamics of the factors in the segmented term structure models and affect longer horizon forecasting. Our conjecture is in line with Adrian, Moench and Shin (2010), who find that intermediary balance sheet aggregates contain strong predictive power for 3-month horizon excess returns on Treasury instruments.

2 Economic Motivation

In this section, we motivate our strategy to segment the term structure based on the Vayanos and Vila (2009) preferred-habitat model. In their model, there are only two types of agents: investors and arbitrageurs. Investors represent all possible types of clienteles and have preferences with demands for specific bond maturities. [8] In contrast, arbitrageurs are able to invest in bonds of any maturity and aim at maximizing a mean-variance utility function.

The demand at time $t$ for a bond with maturity $\tau$ is assumed to be a linear function of the bond yield $R_{t,\tau}$, given by: [9]

$$ y_{t,\tau} = \alpha(\tau)\,\tau\,\left(R_{t,\tau} - \beta_{t,\tau}\right) \tag{1} $$

where $\alpha(\tau)$ is a positive constant, and $\beta_{t,\tau}$ is a function that possibly embeds multiple demand risk factors:

$$ \beta_{t,\tau} = \bar{\beta} + \sum_{k=1}^{K} \theta_k(\tau)\,\beta_{k,t} \tag{2} $$

In Equation (2), each $\{\beta_{k,t}\}_{k=1,\dots,K}$ represents a distinct factor that captures demand risk, while $\{\theta_k(\tau)\}_{k=1,\dots,K}$ characterizes how each demand factor impacts the demand for bonds with maturity $\tau$.

[8] For instance, pension funds might demand long-term bonds while speculators might have stronger demand for short-term bonds, as in Greenwood and Vayanos (2010).
[9] Vayanos and Vila (2009) prove, based on an overlapping generations model with an infinite number of risk-averse agents, that it is possible to generate this kind of demand.
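As a rough numerical illustration of Equations (1) and (2), the Python sketch below evaluates the demand intercept and the investors' demand at two maturities. The Gaussian-bump shape of `theta`, all parameter values, and all function names are hypothetical choices made only for this example; the text does not specify them.

```python
import numpy as np

def theta(tau, center, width):
    """Hypothetical single-peaked loading theta_k(tau): a Gaussian bump (an assumption)."""
    return np.exp(-0.5 * ((tau - center) / width) ** 2)

def demand_intercept(tau, beta_bar, centers, widths, beta_t):
    """Equation (2): beta_{t,tau} = beta_bar + sum_k theta_k(tau) * beta_{k,t}."""
    return beta_bar + sum(
        theta(tau, c, w) * b for c, w, b in zip(centers, widths, beta_t)
    )

def demand(tau, bond_yield, alpha, beta_t_tau):
    """Equation (1): y_{t,tau} = alpha(tau) * tau * (R_{t,tau} - beta_{t,tau})."""
    return alpha * tau * (bond_yield - beta_t_tau)

# Illustrative values only: two demand factors peaked at 2 and 10 years.
centers, widths = [2.0, 10.0], [1.0, 2.0]
beta_t = [0.01, -0.005]

beta_2y = demand_intercept(2.0, 0.03, centers, widths, beta_t)
beta_10y = demand_intercept(10.0, 0.03, centers, widths, beta_t)
y_2y = demand(2.0, 0.045, alpha=1.5, beta_t_tau=beta_2y)
```

With loadings this sharply peaked, the factor centered at 2 years leaves the 10-year intercept essentially unchanged, which is the sense in which such factors act as local demand shocks.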

The model is completed with stochastic differential equations for the short-term rate $r_t$ and for the demand risk factors $\beta_{k,t}$, all following mean-reverting Gaussian processes:

$$ dr_t = \kappa_r\,(\bar{r} - r_t)\,dt + \sigma_r\,dB_{r,t} \tag{3} $$

$$ d\beta_{k,t} = -\kappa_{\beta_k}\,\beta_{k,t}\,dt + \sigma_{\beta_k}\,dB_{\beta_k,t} \tag{4} $$

Vayanos and Vila (2009) show that with a demand function that is linear in yields, arbitrageurs that maximize a mean-variance utility, market clearing conditions, and Gaussian Affine dynamics for the short rate and the demand risk factors, the yield of a bond with maturity $\tau$ is an affine function of the short rate and the demand risk factors: [10]

$$ R_{t,\tau} = a_r(\tau)\,r_t + \sum_{k=1}^{K} a_{\beta_k}(\tau)\,\beta_{k,t} \tag{5} $$

[10] Moreover, market prices of risk in this economy are also affine functions of the short rate and the demand risk factors.

An important aspect of their model is that the original loadings of the demand risk factors $\{\theta_k(\tau)\}_{k=1,\dots,K}$ appearing in the demand function play an important role in determining the new loadings $\{a_{\beta_k}(\tau)\}_{k=1,\dots,K}$ that relate the yield $R_{t,\tau}$ to the demand factors $\beta_{k,t}$. This is what connects (possibly local) demand shocks to the cross-section of yields.

The determination of the loadings $a_{\beta_k}(\tau)$ as a function of the original demand loadings $\{\theta_k(\tau)\}$ is strongly affected by the existence of arbitrageurs. For instance, in the limiting case where there are no arbitrageurs, the yield $R_{t,\tau}$ would be directly given by the demand function for $\tau$-maturity bonds: $R_{t,\tau} = \bar{\beta} + \sum_{k=1}^{K} \theta_k(\tau)\,\beta_{k,t}$. This could, for instance, represent an extreme case of segmentation if the functions $\{\theta_k(\tau)\}_{k=1,\dots,K}$ were chosen to be single-peaked around specific maturities. In that case, $\{\beta_{k,t}\}_{k=1,\dots,K}$ would represent local demand shocks.

Note that arbitrageurs then play two crucial roles in the Vayanos and Vila model. First, they generate the link between demand factors and bond yields, as presented above. Second, they smooth the yield curve, guaranteeing that bond yields at close maturities have close values. Without them, in theory, yields could be completely different for different maturities and arbitrage opportunities could become available. Therefore, the existence of arbitrageurs guarantees a certain smoothness of the yield curve and ensures that the no-arbitrage condition holds.

In our reduced-form segmented model, we incorporate the Vayanos and Vila clientele effect by allowing the term structure of interest rates to have local shocks. Arbitrageurs are paralleled by a spline second-order smoothness condition imposed on the model. In our model, the smoothness condition distributes information from local shocks to other parts of the term structure, much as arbitrageurs in Vayanos and Vila (2009) transform the original demand loadings $\{\theta_k(\tau)\}$ into the yield loadings $a_{\beta_k}(\tau)$, propagating local demand information to the whole curve. It is important to highlight, however, that we do not impose no-arbitrage conditions in our model. Note that Joslin, Singleton and Zhou (2011) have shown that neither cross-sectional nor dynamic no-arbitrage conditions help to improve forecasting ability in Gaussian Affine models. [11]

[11] The fact that our model is Gaussian but differs from classical affine factor models due to spline segmentation could motivate an analysis of whether no-arbitrage improves forecasting ability, but this is beyond the scope of this paper.

Summarizing, there are two important features in Vayanos and Vila (2009) that inspire our segmented spline model: the (local) clientele effect and the smoothing condition implied by the presence of arbitrageurs in the market.

3 Segmented Loading Model

3.1 The General Model

Our model builds on the work of Bowsher and Meeks (2008), who define the yield curve as a piecewise polynomial function plus an error term. Letting $[T_m, T_M]$ be the interval of maturities of the yields, consider the following partition, fixed over time:

$$ \phi = \{T_m = \tau_0 < \tau_1 < \dots < \tau_k = T_M\} \tag{6} $$

The $\tau_i$'s are called the knots of the model, and $k$ is the exogenously chosen number of segments in the yield curve. The yield of a bond with maturity $\tau \in [T_m, T_M]$ at time $t$ will be expressed by:

$$ y_t(\tau) = \sum_{i=1}^{k} f_t^i(\tau)\,I(A_i) \tag{7} $$

where $I(A_i)$ is the indicator function of the set $A_i$ [12], and $A_i$ is defined by:

$$ A_i = \{\tau \in \mathbb{R} : \tau_{i-1} \le \tau \le \tau_i\}, \quad i = 1, \dots, k \tag{8} $$

[12] $I(A_i) = 1$ if $\tau \in A_i$, and $0$ otherwise.

The shapes of the local loadings $f_t^i(\tau)$ at each segment of the yield curve should be seen as important ingredients of the model. Generalizing the idea of a polynomial cubic spline, we allow for four different types of local movements within each segment. To that end, we let the functions $1$, $g^i$, $h^i$ and $z^i$ represent the time-invariant loadings at segment $i$. Paralleling the preferred-habitat model of Section 2, they resemble the loadings $\theta_k(\tau)$ of the demand risk factors in cases where the $\theta_k(\tau)$ are chosen to be single-peaked around specific maturities representing a segment of the yield curve. With the formulation above, at each segment, $f^i$ is given by:

$$ f_t^i(\tau) = a_t^i + b_t^i\,g^i(\tau) + c_t^i\,h^i(\tau) + d_t^i\,z^i(\tau), \quad i = 1, \dots, k \tag{9} $$

where $[a_t^i \; b_t^i \; c_t^i \; d_t^i]$ represent the time-varying local yield curve factors of segment $i$.

In order to complete the model, we also want to ensure smoothness of the yield curve. In the habitat model, this is achieved by introducing arbitrageurs. Here we do not impose no-arbitrage restrictions but only require the functions $f^i$ to be of class $C^2$ in the closure of $A_i$ and to satisfy the spline constraints, i.e., $f_t^{i-1}$ and $f_t^i$ share, at each knot, common values for the function and its derivatives of order one and two.
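Equations (7) to (9) can be sketched in a few lines of Python. This is only a minimal illustration, assuming cubic loadings $g(\tau)=\tau$, $h(\tau)=\tau^2$, $z(\tau)=\tau^3$ and two segments; the knots, the factor values, and the names `segment_index`, `local_curve` and `yield_at` are hypothetical. The second segment's coefficients are picked by hand so that the two pieces agree in value, slope and curvature at the interior knot.

```python
import numpy as np

def segment_index(tau, knots):
    """0-based index of the segment containing tau (a knot is assigned to the segment on its right)."""
    k = len(knots) - 1
    i = int(np.searchsorted(knots, tau, side="right")) - 1
    return min(max(i, 0), k - 1)

def local_curve(tau, factors, loadings):
    """Equation (9): f^i(tau) = a + b*g(tau) + c*h(tau) + d*z(tau)."""
    a, b, c, d = factors
    g, h, z = loadings
    return a + b * g(tau) + c * h(tau) + d * z(tau)

def yield_at(tau, knots, factors_by_segment, loadings):
    """Equation (7): only the segment whose indicator is 1 contributes."""
    i = segment_index(tau, knots)
    return local_curve(tau, factors_by_segment[i], loadings)

# Cubic loadings (the polynomial special case); values below are illustrative.
cubic = (lambda t: t, lambda t: t**2, lambda t: t**3)
knots = [0.0, 2.0, 10.0]
d2 = -1e-5  # cubic term switched on only in the second segment
factors = [
    (0.02, 0.002, 0.0, 0.0),
    # expansion of 0.02 + 0.002*tau + d2*(tau - 2)**3, so value and the
    # first two derivatives match the first segment at tau = 2
    (0.02 - 8 * d2, 0.002 + 12 * d2, -6 * d2, d2),
]

y_1y = yield_at(1.0, knots, factors, cubic)
y_5y = yield_at(5.0, knots, factors, cubic)
```

Because the example factors satisfy the spline constraints, evaluating either segment's local curve at the knot $\tau = 2$ gives the same yield, so the convention used to assign knots to segments does not matter here.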

At this point, we proceed by developing the model in matrix form to increase its computational tractability. To that end, let $\beta_t = [\beta_t^1 \; \cdots \; \beta_t^k]' = [a_t^1 \; b_t^1 \; c_t^1 \; d_t^1 \; \cdots \; a_t^k \; b_t^k \; c_t^k \; d_t^k]'$ be the vector of parameters representing the local factors, and let $Y_t(\bar{\tau})$ and $\bar{\tau}$ be vectors containing the $m$ observed yields at time $t$ and their respective maturities. Let also $\bar{\phi}$ be the vector of knots excluding $\tau_0$. In matrix form, the dynamic model can be represented by:

$$ Y_t(\bar{\tau}) = W(\bar{\tau})\,\beta_t + \varepsilon_t(\bar{\tau}) \quad \text{s.t.} \quad R(\bar{\phi})\,\beta_t = 0 \tag{10} $$

where $W$ and $R$ are $m \times 4k$ and $(3k-1) \times 4k$ matrices, respectively, $\varepsilon$ is a residual term, $4k$ is the total number of factors, and $3k-1$ is the number of spline constraints. This is what we call the restricted version of the model with unrestricted or untransformed loadings.

Each row of $W$ defines the spline function for a specific maturity in $\bar{\tau}$. That is, for each $\bar{\tau}_j \in \bar{\tau}$, there exists $i$ such that $\bar{\tau}_j \in A_i$, implying that the $j$-th row of $W(\bar{\tau})$ contains only the local loading functions of segment $i$:

$$ W(\bar{\tau})(j,:) = \begin{bmatrix} 0_{1\times 4} & \cdots & 0_{1\times 4} & 1 & g^i(\bar{\tau}_j) & h^i(\bar{\tau}_j) & z^i(\bar{\tau}_j) & 0_{1\times 4} & \cdots & 0_{1\times 4} \end{bmatrix}. $$

Interestingly, while the matrix $W$ is built from the vector $\bar{\tau}$ of maturities of the observed yields, the condition $R(\bar{\phi})\,\beta_t = 0$ represents the spline restrictions that should be satisfied at the knots in $\bar{\phi}$. It guarantees that even though coefficients and functional loadings may change across segments, the term structure is smooth over the domain of maturities $[T_m, T_M]$. To illustrate how these restrictions work, consider the first row of $R$:

$$ \begin{bmatrix} [1,\, g^1(\tau_1),\, h^1(\tau_1),\, z^1(\tau_1)] & -[1,\, g^2(\tau_1),\, h^2(\tau_1),\, z^2(\tau_1)] & 0_{1\times 4} & \cdots & 0_{1\times 4} \end{bmatrix} \tag{11} $$

If we multiply the first row of $R$ by $\beta_t$ we obtain:

$$ [1,\, g^1(\tau_1),\, h^1(\tau_1),\, z^1(\tau_1)]\,\beta_t^1 - [1,\, g^2(\tau_1),\, h^2(\tau_1),\, z^2(\tau_1)]\,\beta_t^2 \tag{12} $$

Setting (12) equal to zero gives the restriction that, at $\tau_1$, $f^1(\tau_1)$ should be equal to $f^2(\tau_1)$.
All the other rows of $R$ follow the same logic, imposing equalities for the value of either $f$ or its first or second derivative on the two segments that meet at each knot $\tau_i \in \bar{\phi}$, $i = 1, 2, \dots, k$.
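The construction of $W$ and $R$ in (10) and (11) can be sketched as follows for the cubic (polynomial) case. Two points are assumptions of this sketch rather than statements from the text: the two constraints beyond the $3(k-1)$ interior-knot equalities are taken to be natural end conditions (zero second derivative at $\tau_0$ and $\tau_k$), which is one way to reach the stated count of $3k-1$; and the knots, maturities and function names are hypothetical.

```python
import numpy as np

def basis(tau):
    """Cubic local loadings [1, tau, tau^2, tau^3] and their first two derivatives."""
    f0 = np.array([1.0, tau, tau**2, tau**3])
    f1 = np.array([0.0, 1.0, 2 * tau, 3 * tau**2])
    f2 = np.array([0.0, 0.0, 2.0, 6 * tau])
    return f0, f1, f2

def build_W(maturities, knots):
    """m x 4k matrix: row j holds the loadings of the segment containing maturity j."""
    k = len(knots) - 1
    W = np.zeros((len(maturities), 4 * k))
    for j, tau in enumerate(maturities):
        i = int(np.searchsorted(knots, tau, side="right")) - 1
        i = min(max(i, 0), k - 1)
        W[j, 4 * i:4 * i + 4] = basis(tau)[0]
    return W

def build_R(knots):
    """(3k-1) x 4k constraint matrix; the last two rows are ASSUMED natural end conditions."""
    k = len(knots) - 1
    rows = []
    for i in range(1, k):              # interior knots tau_1, ..., tau_{k-1}
        for d in range(3):             # value, first and second derivative
            row = np.zeros(4 * k)
            row[4 * (i - 1):4 * i] = basis(knots[i])[d]
            row[4 * i:4 * (i + 1)] = -basis(knots[i])[d]
            rows.append(row)
    first = np.zeros(4 * k)
    first[:4] = basis(knots[0])[2]     # f''(tau_0) = 0 (assumption)
    last = np.zeros(4 * k)
    last[4 * (k - 1):] = basis(knots[-1])[2]  # f''(tau_k) = 0 (assumption)
    return np.vstack(rows + [first, last])

# Hypothetical knots (four segments) and observed maturities, in years.
knots = [0.25, 1.0, 3.0, 6.0, 10.0]
maturities = [0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0]
W = build_W(maturities, knots)
R = build_R(knots)
```

With four segments, `W` has 16 columns and `R` has 11 rows, and the first row of `R` reproduces the pattern in (11): the loadings of segment 1 at $\tau_1$, minus those of segment 2, followed by zeros.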

Since all restrictions take the form of equalities, we are able to rewrite Equation (10) in an unconstrained form, reducing the dimensionality of the factors of the model from $4k$ to $k+1$. This is accomplished by noting that the matrix $R$ is block diagonal, allowing us to rewrite this restriction in two separate terms. Performing a similar separation of $W(\bar{\tau})$, we get the following version of the model:

$$ Y(\bar{\tau}) = W_1(\bar{\tau})\,\theta + W_2(\bar{\tau})\,\hat{\theta} + \varepsilon(\bar{\tau}), \qquad R_1\hat{\theta} + R_2\theta = 0 \tag{13} $$

where $R_1$ is a full-rank square matrix, $R_2$ is a complementary matrix, and $\theta$, $\hat{\theta}$ are the corresponding vectors of parameters. Since $R_1$ can be inverted, setting $Z(\bar{\tau}) = W_1(\bar{\tau}) - W_2(\bar{\tau})\,R_1^{-1}R_2$ we get an unrestricted form for the model:

$$ Y_t(\bar{\tau}) = Z(\bar{\tau})\,\theta_t + \varepsilon_t(\bar{\tau}) \tag{14} $$

The dimensions of $Z$ and $\theta_t$ are $m \times (k+1)$ and $(k+1) \times 1$, respectively. Therefore our model can be specified in a restricted form as in (10) or in an unrestricted form as in (14).

Figure 3 helps to develop some intuition for the particular case adopted later in the empirical section, where we have four segments of the yield curve. Its top panel shows, for each segmented model, sixteen local factors (four on each segment, separated by black vertical lines). Its bottom panel contains the transformed loadings after the smoothing conditions are imposed. Basically, instead of dealing with sixteen unrestricted loadings, four within each segment, that appear in the restricted form (10), the unrestricted version of the model (14) uses the smoothing conditions to rotate the original factors and reduce their dimension from sixteen to five.

It is important to emphasize that the spline smoothing conditions are specifically responsible for linking the $4k$ untransformed local spline functions $1$, $g^i$, $h^i$ and $z^i$ appearing within each segment $i$, and represented in matrix form by $W(\bar{\tau})$ in (10), to the final $k+1$ transformed loadings $Z(\bar{\tau})$ in (14).
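The elimination step from (13) to (14) is plain linear algebra and can be sketched as below. This is a sketch under stated assumptions: the matrices are random stand-ins with the right shapes for $k = 2$, not actual spline matrices, and the choice of which columns are kept as free factors (`free`) is hypothetical, since the text does not say which partition is used.

```python
import numpy as np

def reduce_loadings(W, R, free):
    """Eliminate the constrained factors: Z = W1 - W2 @ inv(R1) @ R2.

    `free` lists the columns kept as theta; the remaining columns form
    theta_hat and must make R1 square and invertible.
    """
    n = R.shape[1]
    free = list(free)
    dep = [j for j in range(n) if j not in free]
    R1, R2 = R[:, dep], R[:, free]
    W1, W2 = W[:, free], W[:, dep]
    A = -np.linalg.solve(R1, R2)   # theta_hat = A @ theta, from R1 theta_hat + R2 theta = 0
    Z = W1 + W2 @ A                # equals W1 - W2 R1^{-1} R2
    return Z, A, dep

# Random stand-ins with the shapes of the k = 2 case: R is (3k-1) x 4k, W is m x 4k.
rng = np.random.default_rng(0)
k = 2
R = rng.normal(size=(3 * k - 1, 4 * k))
W = rng.normal(size=(6, 4 * k))
free = [0, 4, 7]                   # hypothetical choice of k + 1 free factors

Z, A, dep = reduce_loadings(W, R, free)

# Rebuild the full 4k-vector beta_t from an arbitrary theta_t and check consistency.
theta_t = rng.normal(size=k + 1)
beta_t = np.zeros(4 * k)
beta_t[free] = theta_t
beta_t[dep] = A @ theta_t
```

By construction the rebuilt `beta_t` satisfies `R @ beta_t = 0`, and `W @ beta_t` coincides with `Z @ theta_t`, which is the sense in which the restricted form (10) and the unrestricted form (14) describe the same curve.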
This is exactly what happens in the Vayanos and Vila model where arbitrageurs connect the (local) demand 11

loadings $\{\theta_k(\tau)\}$ to the transformed yield's loadings a βk (τ).$^{13}$ Note that equation (14) should be valid for any pair of maturities and yields. In particular, since in the model the yields with maturities corresponding to the knots in $\phi$ are assumed not to contain an error term, it must be true that $Y_t(\phi) = Z(\phi)\theta_t$.$^{14}$ Using the fact that the matrix $Z(\phi)$ is square and invertible, we obtain a relation between $Y_t(\tau)$ and $Y_t(\phi)$:

$$Y_t(\tau) = Z(\tau)\,(Z(\phi))^{-1}\,Y_t(\phi) + \varepsilon_t(\tau) \;\Longrightarrow\; Y_t(\tau) = \Pi(\tau,\phi)\,Y_t(\phi) + \varepsilon_t(\tau). \qquad (15)$$

Therefore, the latent yields can be estimated in the cross-section by OLS:

$$\hat{Y}_t(\phi) = Z(\phi)\,\big(Z(\tau)'Z(\tau)\big)^{-1}Z(\tau)'\,Y_t(\tau). \qquad (16)$$

Finally, if we set $g_i(\tau) = \tau$, $h_i(\tau) = \tau^2$ and $z_i(\tau) = \tau^3$ for $i = 1, 2, \dots, k$, we have the cross-section specification of the Bowsher and Meeks model.

$^{13}$ An important word on notation: note that the restricted form of the model in (10) presents unrestricted loadings, since the restriction is written explicitly in the model formulation but not applied. On the other hand, the unrestricted form of the model in (14) presents restricted loadings, since the original restriction was applied to the model.
$^{14}$ Note that although in general the yields $Y_t(\phi)$ are latent, they don't have to be, for instance if we choose a knot equal to the maturity of an observed yield. In such cases, we assume that the yield at that knot is observed with error.

3.2 Specification of Loadings in Segmentation

We distinguish two forms of segmentation, conveniently denominated weak and strong. Under the former, only the local factors ($a_i$, $b_i$, $c_i$ and $d_i$) change across segments, while the loadings ($g_i$, $h_i$ and $z_i$) are the same and are therefore independent of $i$. Under the strong form, not only the factor dynamics vary across segments but also the functional form of the loadings; that is, both the coefficients $a_i$, $b_i$, $c_i$ and $d_i$ and the functions $g_i$, $h_i$ and $z_i$ depend on $i$. We start by developing the strong form of segmentation and then write the weak model as a special case of the strong segmented model.

The strong segmentation factor loadings are based on the Nelson and Siegel (1987), Svensson (1994) and Diebold and Li models. However, we allow different functional forms for $g_i$, $h_i$ and $z_i$ within each segment. It is important to note that despite possible discontinuities in the loading functions at the knots, the smoothing restrictions guarantee that the yield curve remains continuous and smooth. Under this model, each segment has its own dynamics and functional loadings, while smoothing constraints connect local to global dynamics across maturities, reinforcing again the analogy with the preferred-habitat theory. We propose the following form for the local loading functions:

$$g_i(\tau) = \frac{1 - e^{-\lambda_1 \Lambda_i(\tau)}}{\lambda_1 \Lambda_i(\tau)}, \qquad (17)$$

$$h_i(\tau) = \frac{1 - e^{-\lambda_1 \Lambda_i(\tau)}}{\lambda_1 \Lambda_i(\tau)} - e^{-\lambda_1 \Lambda_i(\tau)}, \qquad (18)$$

$$z_i(\tau) = \frac{1 - e^{-\lambda_2 \tau}}{\lambda_2 \tau} - e^{-\lambda_2 \tau}, \qquad (19)$$

where the $\Lambda_i$ are functions that introduce discontinuities in the loadings at the knots, $i = 1, \dots, k$.$^{15}$ Although there are many possibilities for $\Lambda$, as a first assessment of how this kind of segmentation could affect forecasting, we adopt the following linear functional form in the empirical section:

$$\Lambda_i(\tau) = \tau - \tau_{i-1}(1 - p), \qquad \tau \in A_i,\ \tau_{i-1} \in \phi \ \text{and}\ p \in [0, 1]. \qquad (20)$$

It is important to observe that while the parameters $\lambda_1$ and $\lambda_2$ control the decay rates of the exponential loadings, the parameter $p$ has a different role: it controls the degree of loading segmentation. To go from the strong segmentation model to the weak version we just need to take $\Lambda_i(\tau) = \tau$ for all $i$.

$^{15}$ The choice of how to introduce a discontinuity in the loadings, and in which loadings to introduce it, is completely arbitrary. For illustration purposes we choose to segment only the slope and the first curvature, keeping the level and the second curvature intact.

Finally, note that it is possible to classify the segmented-loading models in two different ways. First, with respect to the parametric family of functions that define local loadings, there are polynomial versions (Bowsher and Meeks

model) and exponential versions (given above). Second, with respect to the degree of segmentation of the loading functions, models may present strong or weak segmentation.

At this point, it is also useful to introduce some notation. We refer to the polynomial version of the model as the Bowsher and Meeks (BM) model. The exponential versions are denominated the exponential segmented model (NS4, for Nelson and Siegel four-factor model) under the weak segmentation form, and the extended segmented model (NS4E) under the strong segmentation form.

3.3 Latent Factor Dynamics

A final step to conclude the model specification consists in choosing the dynamics for the latent factors. To disentangle the role of imposing cross-sectional segmentation from the role played by the kind of dynamics chosen for the latent factors, we analyze two important types of latent factor dynamics previously adopted in the term structure literature: autoregressive versus error correction models. First, following Diebold and Li and the affine literature, we make use of a classical autoregressive form of factor dynamics. In this case, we adopt VAR/AR processes for the latent yields, and the model can be represented in the following state-space form:

$$Y_t(\tau) = \Pi(\tau, \phi)\,Y_t(\phi) + \varepsilon_t(\tau), \qquad (21)$$

$$Y_{t+1}(\phi) = c + \gamma\,Y_t(\phi) + \nu_t, \qquad (22)$$

for $t = 1, 2, \dots$, where $c$ is a $(k+1)$-vector and $\gamma$ is a $(k+1) \times (k+1)$ matrix that is either a diagonal or a triangular-diagonal matrix. The error terms $\varepsilon$ and $\nu$ are such that $E(\varepsilon_t(\tau)\varepsilon_t(\tau)') = \Sigma_\varepsilon$, $E(\nu_t \nu_t') = \Sigma_\nu$ and $E(\varepsilon_t(\tau)\nu_t') = 0$.

It is interesting to note, however, that while the factors that appear in Diebold and Li and in dynamic affine models are usually rotations of principal components of yields, the latent factors in our model are latent yields that inherit nonstationarity from observed yields.$^{16}$ Therefore, to control for such potential nonstationarity, in the second approach we follow the cointegration-based yield curve literature (Hall, Anderson and Granger, 1992, and Bowsher and Meeks, 2008) and propose an Error Correction Model for the dynamics of the latent yields. In this case, the state-space form of the model is given by:

$$Y_t(\tau) = \Pi(\tau, \phi)\,Y_t(\phi) + \varepsilon_t(\tau), \qquad (23)$$

$$\Delta Y_{t+1}(\phi) = \alpha\big(\rho' Y_t(\phi) - \mu_s\big) + \Psi\,\Delta Y_t(\phi) + \nu_t, \qquad (24)$$

for $t = 1, 2, \dots$, where $\alpha$ and $\rho$ are $(k+1) \times k$ matrices, $\Psi$ is a $(k+1) \times (k+1)$ matrix, and $\mu_s$ is a k + 1 vector. $\rho$ is a cointegration matrix that is fixed here such that $\rho' Y_t(\phi) - \mu_s$ is a stationary mean-zero vector of cointegrating relations. More specifically, $\rho' Y_t(\phi)$ are the $k$ spreads between the knot yields.$^{17}$ The error terms $\varepsilon$ and $\nu$ are such that $E(\varepsilon_t(\tau)\varepsilon_t(\tau)') = \Sigma_\varepsilon$, $E(\nu_t \nu_t') = \Sigma_\nu$ and $E(\varepsilon_t(\tau)\nu_t') = 0$. For more details on this specification we refer to Bowsher and Meeks (2008).

$^{16}$ We are aware that some principal components might also be nonstationary, as is usually the case for a level factor; in general, however, slope and curvature are stationary factors.
$^{17}$ The $i$-th column of $\rho$ contains $-1$ in the $i$-th row, $1$ in the $(i+1)$-th row, and zero in all other entries.

Both systems, (21)-(22) and (23)-(24), can be estimated with a Kalman filter (Bowsher and Meeks, 2008). Alternatively, Diebold and Li (2006) suggest a simpler two-step estimation procedure that in general produces strong out-of-sample forecasting results. In addition, according to Diebold and Rudebusch (2013), "little is lost in practice by using two-step estimation because there is typically enough cross-sectional variation...." Privileging simplicity, we adopt the two-step procedure proposed by Diebold and Li: we first estimate the latent yields by running an OLS regression (see (15) and (16)), and then estimate the VAR/AR(1) in (22) and the ECM in (24) to obtain the parameters $\{c, \gamma, \Sigma_\varepsilon, \Sigma_\nu\}$ and $\{\alpha, \mu_s, \Psi, \Sigma_\varepsilon, \Sigma_\nu\}$, respectively.
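The two-step procedure can be illustrated with a minimal sketch. The first function applies the cross-sectional OLS of (16) date by date, and the second fits a univariate AR(1) as in (22) to one latent yield. The function names, the quadratic `basis` and the toy data are assumptions made only for illustration; they are not the restricted loadings of Section 3.1 nor the estimation code behind the reported results.

```python
import numpy as np


def latent_yields_ols(Y_obs, Z_obs, Z_knots):
    """First step, equation (16): cross-sectional OLS, date by date.

    Y_obs   : (T, n) panel of observed yields
    Z_obs   : (n, k+1) loadings evaluated at the observed maturities
    Z_knots : (k+1, k+1) loadings evaluated at the knots
    """
    theta, *_ = np.linalg.lstsq(Z_obs, Y_obs.T, rcond=None)  # (k+1, T)
    return (Z_knots @ theta).T                                # (T, k+1)


def fit_ar1(series):
    """Second step: OLS fit of y[t+1] = c + gamma * y[t] + nu[t] for one factor."""
    X = np.column_stack([np.ones(len(series) - 1), series[:-1]])
    (c, gamma), *_ = np.linalg.lstsq(X, series[1:], rcond=None)
    return c, gamma


def basis(maturities):
    """Hypothetical quadratic basis standing in for the loading matrix Z."""
    x = np.asarray(maturities, dtype=float) / 120.0
    return np.column_stack([np.ones_like(x), x, x**2])


# Toy data: noise-free yields generated from arbitrary factor paths.
maturities = [3, 6, 12, 24, 36, 60, 84, 120]   # observed maturities (months)
knots = [1, 60, 120]                            # k + 1 = 3 illustrative knots
rng = np.random.default_rng(0)
theta_true = rng.normal(size=(50, 3))
Y_obs = theta_true @ basis(maturities).T
Y_knots_hat = latent_yields_ols(Y_obs, basis(maturities), basis(knots))

# One AR(1) per latent yield, i.e. the diagonal-gamma case of (22).
ar_params = [fit_ar1(Y_knots_hat[:, j]) for j in range(Y_knots_hat.shape[1])]
```

With noise-free data the first step recovers the knot yields exactly; with measurement error it returns the least-squares fit. Fitting each latent yield separately corresponds to a diagonal $\gamma$; a full VAR would regress each latent yield on all lagged latent yields instead.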

4 Empirical Results

4.1 Data

Our goal in this section is to perform empirical tests of segmented term structure models. Because there are different methodologies for constructing zero-coupon yield curves, for robustness we adopt three datasets in our forecasting exercise. First, we consider the St. Louis Fed's constant-maturity U.S. Treasury yields on a monthly basis, from January 1985 to October 2012. This dataset consists of par yields, from which we obtain zero-coupon yields by applying a bootstrap procedure with flat-forward interpolation. The two other datasets consist of smoothed and unsmoothed Fama-Bliss yields, also observed monthly, from January 1985 to December 2009. Due to limitations on the size of the paper, we report only results obtained with one of the datasets (FRED), reserving an online appendix to report results based on the other two. We choose to report the FRED dataset since it contains recent data encompassing the zero lower bound period; differences across the three datasets are discussed in detail in Section 5.

Figure 1 plots the time evolution of bootstrapped zero-coupon yields with maturities of 3, 6, 12, 24, 36, 60, 84 and 120 months. At a lower (annual) frequency, yields decline over time, while on a monthly basis there are many periods where yields oscillate around a fixed region. The yields reach a maximum of 11.6% (for the 84-month maturity yield) at the beginning of the sample and a minimum of 0.04% (for the 3-month maturity yield) at the end of the sample. Along the observed period, yield curves frequently have a downward sloping shape. Nevertheless, at some points in time, the term structure is increasing, hump-shaped, and inverted, as can be observed in Figure 2. This figure presents both observed and model-based yield curves at different points in time, illustrating the various shapes that the term structure can assume.

4.2 Competing Models

We have two main goals in the empirical section. First, we verify whether segmented models provide better out-of-sample forecasts than competing models. Second, for all competing models, we compare results obtained when latent factors follow autoregressive dynamics to those obtained with the corresponding ECM dynamics. In principle, we could choose from a large pool of models to accomplish our goals. However, we prefer to concentrate on a smaller number of well-established benchmarks that have appeared in the term structure literature. We choose as competitors the Random Walk, Diebold and Li, the Dynamic Svensson Model, and a three-factor Gaussian Affine model with essentially affine market prices of risk. Note that Diebold and Li and its variation are parsimonious exponential models that have outperformed the Random Walk in previous studies, as shown, for instance, in Diebold and Li (2006) and De Pooter (2007). The essentially affine model is included due to its good forecasting performance documented by Duffee (2002). The choice of the Random Walk as our main benchmark was motivated by the work of Carriero, Kapetanios, and Marcellino (2012), who showed that it outperforms a large number of term structure models in out-of-sample forecasting exercises.

As mentioned before, we are interested in investigating the importance of adopting ECM dynamics for the latent factors and, in particular, in trying to separate the role of ECM from that of segmentation. To that end, we include in our analysis modified versions of the Diebold and Li and the Dynamic Svensson models that have ECM latent factor dynamics. In the remaining subsections of Section 4, we discuss estimation and forecasting performance of the different models.

4.3 Model Estimation

There are two important characteristics to take into account when detailing the analyzed models: the choice of the number of latent factors, and the types of parameters that enter each model.

There is no consensus in the literature on how to choose the number of latent factors. Litterman and Scheinkman (1991) showed that three factors describe more than 95% of the variability of the U.S. term structure of interest rates. Building on their result, most term structure models adopt only three dynamic factors. On the other hand, Cochrane and Piazzesi (2005) find that more than three factors are necessary to forecast bond risk premia.$^{18}$ In addition, Duffee (2011) applies a five-factor Gaussian Affine filtered model to monthly U.S. Treasury yields. He finds that a latent factor with almost no effect on cross-sectional fitting explains around 30% of the total variance in expected bond returns. Therefore, the optimal number of latent factors apparently depends on the kind of application desired. It appears reasonable to search in a range from three to five factors when designing a term structure model.

Building on the literature on four-factor exponential models like Svensson and its variations,$^{19}$ all our segmented models present four local factors within each segment (see Section 3.1). We also know that the number of final dynamic factors, after imposing smoothing restrictions, should be equal to the number of segments plus one. In this paper, aiming to capture part of the dynamics of the yield curve with factors that differ from the traditional level, slope and curvature, we choose to have five dynamic factors in the analyzed segmented models, precisely as in Duffee (2011). Our choice automatically imposes a couple of important restrictions on these models: each model must have four segments (five knots) and sixteen unrestricted loadings, four per segment.

In order to choose the knot positions, we follow Bowsher and Meeks (2008) and fix an in-sample training period (January 1985 to January 1994). The in-sample stage for knot selection is based on assessing the cross-sectional fit for both segmented models (NS4 and NS4E).
Any candidate has extreme knots at 1 and 120 months and excludes combinations in which the distance between neighboring knots is less than 12 months, to avoid clustering. The best knot vector is the one that minimizes the root mean square error (RMSE) of the panel of yields within the training period, subject to the mentioned constraints. The optimal knot vector for the weak segmented model is {1, 16, 55, 108, 120}, and for its strong extended version, {1, 13, 39, 108, 120}, with all maturities measured in months. Those knots define four segments: two short-maturity, one medium-maturity, and one long-maturity.

$^{18}$ In particular, Cochrane and Piazzesi (2005) find that a linear combination of forward rates that has an important component unrelated to the traditional level, slope and curvature movements has strong forecasting power for expected excess returns on bonds.
$^{19}$ For a few dynamic versions of exponential models with more than three factors see Koopman, Mallee, and Van der Wel (2010), De Pooter (2007), and Almeida et al. (2009).

Regarding the different types of parameters in exponential models (segmented models, Diebold and Li, or Svensson models), we must choose values for $\lambda_1$ and/or $\lambda_2$, the parameters that govern the decay speeds of the exponential loadings. Following Diebold and Li, we fix $\lambda_1 = 0.0609$, making the first curvature loading related to medium-term maturities, with a maximum at 30 months. For the second curvature parameter $\lambda_2$, we follow Almeida et al. (2009) and fix $\lambda_2 > \lambda_1$, aiming for a maximum value at the short end of the yield curve.$^{20}$ Varying only $\lambda_2$, we minimize the in-sample RMSE within the training period, finding a value of $\lambda_2 = 0.24$. With this value, the second curvature achieves a maximum around 7 months, ending up locally driving short-term movements at each segment. In our empirical analysis, this short-maturity peak for the second curvature is responsible for an improvement in forecasts at the short end of the yield curve.

When considering the extended segmented-loading model (NS4E), we must also choose the value of $p$, the parameter that controls the segmentation across loadings. Note that $p = 0$ corresponds to maximum loading segmentation, while $p = 1$ corresponds to no segmentation. We set a grid of values for $p$, ranging from 0 to 0.95 with increments of 0.05, to verify the sensitivity of forecasting results (within the training period) to changes in this parameter. For values of $p$ close to 0.95, the model produces slightly worse forecasts than for $p = 0.5$, an intermediate value.
However, for values of $p$ close to 0, results are mixed when compared to $p = 0.5$.$^{21}$ Apparently, a strong degree of loading segmentation behaves slightly better in terms of forecasting. Despite this fact, we keep the moderate segmentation in our analysis and present all the results for $p = 0.5$.

$^{20}$ Almeida et al. (2009) adopt a dynamic Svensson model and estimate the two decay parameters (lambdas) that minimize in-sample RMSEs. They find that a second curvature factor controlling short-term movements is very important in out-of-sample forecasting exercises.
$^{21}$ For instance, while $p = 0$ produces better long-horizon forecasts than $p = 0.5$, with a decrease of 2 to 3% on average RMSEs, it produces worse results for short-horizon forecasts, with an increase of 1% on average RMSEs.

4.4 Factor Loadings and Impulse Responses

In this section, we discuss the factor loadings and factor dynamics of segmented term structure models. Figure 3 shows the unrestricted (or unsmoothed) loadings in the top panel and the restricted (or transformed) loadings in the bottom panel. For the unsmoothed loadings, vertical black lines indicate the positions of the knots within each model. Each of the three models contains a set of four loading functions within each segment. For the Bowsher and Meeks and the NS4 segmented models, the sixteen unsmoothed loadings that appear when observing the whole set of maturities are visually equivalent to the four loadings (level, slope, curvature, and second curvature) observed in traditional term structure models. The difference, however, lies in the yield curve response to shocks in the factors attached to each of these loadings: while in the segmented models (Bowsher and Meeks and NS4) the response is potentially local to each segment, in traditional models the response to a shock in any factor spreads through the whole set of maturities. In contrast, the extended model (NS4E) presents discontinuities in the unsmoothed loadings of slope and first curvature, implying that the responses to shocks in factors not only depend on a particular segment of the term structure but may also affect segments with intensities that differ from the corresponding loadings of traditional four-factor exponential models. When considering the restricted loadings in Figure 3, note that instead of sixteen ($4k$) loadings we have only five ($k + 1$). This reduction in dimension is due to the smoothing restrictions discussed in Section 3.1.

All models present a level (red) with loadings very similar to a traditional level factor in the spirit of Litterman and Scheinkman (1991). The same is true for the slope factor of the Bowsher and Meeks and segmented NS4 models (cyan). Nevertheless, the slope for the extended model NS4E (magenta) is non-linear,

being approximated by two linear functions, one steeper for short maturities up to twenty months and the other almost flat, picking up longer maturities. For each model, there is a set of traditional factors as in Litterman and Scheinkman (1991) and another set of more specific local factors. Local factors are those whose loadings, after a range of maturities, quickly decay to zero and stay there for any longer maturity. This is the case with the blue and green factors of the Bowsher and Meeks and the two exponential segmented models, and with the cyan factor of the extended segmented model. These local loadings are important only for certain subsets of maturities. We will see later that part of the differences in the forecasting ability of the models comes from the existence of local factors and from differences across these factors. They help to capture additional yield dynamics that are not identified by traditional non-segmented factor models.

The most interesting differences between the three segmented models come from these local factors. Under the exponential models, the loadings of two local factors (blue and green) decay faster than in the polynomial BM model. This fast decay is related to the role that the second curvature factor plays at the short end of the yield curve. While the green loading reaches zero at a maturity of 60 months under the Bowsher and Meeks model, it reaches zero around 45 months under the exponential models. In addition, as observed before, the extended model has three local factors (loadings in blue, green and cyan) and only two traditional factors (level and curvature). The existence of a third local factor improves forecasting results at the short end of the yield curve over the already strong results obtained by the two other segmented models, as we will see in Section 4.5 (Table 2).

We have just provided a static view of the implications of factor loadings for the cross-section of yields in segmented models.
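To make the unsmoothed loadings concrete, the following sketch evaluates the local loading functions (17)-(19) with the linear form (20), using the values quoted in the text ($\lambda_1 = 0.0609$, $\lambda_2 = 0.24$, the NS4E knot vector and $p = 0.5$). The helper names are hypothetical, maturities are assumed to lie between 1 and 120 months, and each segment is assumed to be closed on the left, a convention the text does not specify.

```python
import numpy as np

# Values quoted in the text; all function names below are illustrative only.
LAMBDA_1, LAMBDA_2 = 0.0609, 0.24
KNOTS_NS4E = [1, 13, 39, 108, 120]  # strong-segmentation knot vector (months)


def segment_start(tau, knots):
    """Left knot tau_{i-1} of the segment A_i that contains maturity tau."""
    idx = np.searchsorted(knots, tau, side="right") - 1
    idx = np.clip(idx, 0, len(knots) - 2)  # the last knot belongs to the last segment
    return np.asarray(knots, dtype=float)[idx]


def slope_shape(x):
    """(1 - exp(-x)) / x, using the limit value 1 at x = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x == 0.0, 1.0, x)
    return np.where(x == 0.0, 1.0, -np.expm1(-safe) / safe)


def local_loadings(tau, knots=KNOTS_NS4E, p=0.5):
    """Unsmoothed slope g, first curvature h and second curvature z at tau."""
    tau = np.asarray(tau, dtype=float)
    lam_arg = tau - segment_start(tau, knots) * (1.0 - p)       # equation (20)
    g = slope_shape(LAMBDA_1 * lam_arg)                         # equation (17)
    h = g - np.exp(-LAMBDA_1 * lam_arg)                         # equation (18)
    z = slope_shape(LAMBDA_2 * tau) - np.exp(-LAMBDA_2 * tau)   # equation (19)
    return g, h, z


# Loadings just below and at the 13-month knot, with and without segmentation.
around_knot = np.array([13.0 - 1e-9, 13.0])
g_strong, h_strong, _ = local_loadings(around_knot, p=0.5)
g_weak, h_weak, _ = local_loadings(around_knot, p=1.0)
```

With $p = 1$ the slope and first curvature are continuous at the knot, while with $p = 0.5$ they jump there; the second curvature does not depend on $p$. Evaluated on a fine grid with $p = 1$, the first curvature peaks near 29.4 months and the second curvature near 7.5 months, in line with the maxima at 30 and around 7 months mentioned in Section 4.3.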
To complement this view, we close this section with an impulse response exercise applied to the restricted factors of the exponential segmented model (NS4) with ECM dynamics. In particular, we show how observed yields at different segments of maturities react to shocks applied to the transformed factors of this model.$^{22}$

$^{22}$ Results for the extended exponential segmented model are similar and available upon request.

Using the whole sample available, we estimate the dynamics in (23) and (24), apply a one-standard-deviation shock to each restricted dynamic factor, and track the system response for twenty periods. Then, using Equation (23), we translate the factor responses into responses of observable yields. The results are presented in Figure 4.

Let us first restrict our attention to the first two panels, which present shocks to the two local factors of the NS4 model. Those are the factors attached to the blue and green loadings appearing in the central picture at the bottom of Figure 3. In line with the discussion in previous paragraphs, a shock applied to the dynamic factor attached to the blue loading affects short-maturity yields more severely than the other segments in the short run. This effect dissipates over time and inverts, leading in the long run to a smaller response for the short-maturity segment relative to the other segments. The effect of the shock on longer maturities is smaller in the short run and stronger in the long run, indicating clearly different responses to this shock across segments of the term structure. For a shock to the dynamic factor attached to the green loading, the most affected yields are the longer-maturity ones, both in the short and in the long run of the impulse response. Although all yields respond to the shock with a similar shape, the intensities and decay speeds are clearly different between short-maturity and longer-maturity yields.

In contrast, at the bottom of Figure 4 we present shocks to the traditional level, slope and curvature dynamic factors, respectively attached to the red, cyan and magenta loadings of the NS4 model at the bottom of Figure 3. Notably, movements of the term structure in response to a shock in any of these factors are very similar, with yields moving much closer to each other than in the top panel.
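The mechanics of such an impulse response exercise can be sketched as follows, under strong simplifying assumptions. The spread matrix follows the description of $\rho$ in Section 3.3, but the values of `alpha` and `psi` are hypothetical rather than estimates, $k = 2$ is chosen only for brevity, and a unit shock replaces the one-standard-deviation shock used for Figure 4.

```python
import numpy as np


def spread_matrix(k):
    """(k+1) x k matrix rho: column i has -1 in row i and +1 in row i+1."""
    rho = np.zeros((k + 1, k))
    for i in range(k):
        rho[i, i] = -1.0
        rho[i + 1, i] = 1.0
    return rho


def ecm_impulse_response(alpha, rho, psi, shock, horizon=20):
    """Response of the knot yields to a one-off shock under the ECM in (24).

    The constant mu_s drops out because only deviations from the
    unshocked path are tracked.
    """
    n = len(shock)
    path = np.zeros((horizon + 1, n))
    path[0] = shock                  # impact period
    previous = np.zeros(n)           # response before the shock
    for s in range(horizon):
        delta = alpha @ (rho.T @ path[s]) + psi @ (path[s] - previous)
        previous = path[s]
        path[s + 1] = path[s] + delta
    return path


# Hypothetical parameters for k = 2 (three knot yields).
rho = spread_matrix(2)
alpha = np.array([[0.1, 0.0],
                  [-0.1, 0.1],
                  [0.0, -0.1]])      # yields adjust toward their neighbors
psi = np.zeros((3, 3))               # no short-run dynamics in this toy case
response = ecm_impulse_response(alpha, rho, psi, np.array([1.0, 0.0, 0.0]))
```

Responses of observed yields would follow from (23) by multiplying each row of `response` by the transpose of $\Pi(\tau, \phi)$. In this toy parameterization a parallel shock to all knot yields leaves the spreads unchanged and is therefore permanent, whereas a shock to a single knot yield is gradually shared across the other knots.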
4.5 Forecasting

When evaluating the econometric performance of interest rate models, researchers in general work with a single out-of-sample window, sometimes considering a small number of subsamples for robustness purposes.$^{23}$ Alternatively, in this paper, we compare models based on multiple forecasting exercises.

Although we propose the use of multiple out-of-sample windows, we still have to choose between two usual approaches, the rolling-window and the recursive method (see MacCracken, 2007). The rolling-window method re-estimates the model with a fixed-size window, while the recursive method re-estimates models with cumulative data. Giacomini and White (2006) show that the rolling-window method produces substantial forecasting accuracy gains on important economic series and forecasting models. Based on their findings, we adopt the rolling-window method for each of the 143 out-of-sample forecasting exercises in our dataset.

In a recent paper, Hansen and Timmermann (2011) observe how important it is to control the split point of a dataset into estimation and evaluation periods in out-of-sample tests. Based on this observation, they introduce a test statistic that is robust to mining over the starting point of the out-of-sample period. On a related issue, Rossi and Inoue (2012) propose an approach that is robust to data snooping across the length of the estimation window in rolling-window forecasting evaluations. In this paper, we examine neither variations of the window size nor mining over the sample split. Instead, we vary the datasets, creating different forecasting exercises while keeping both the window size and the sample split constant. Keeping the numbers proposed by Diebold and Li (2006), our in-sample window size is equal to $108 - h + 1$ months (where $h$ is the forecasting horizon) and our out-of-sample window size is equal to 84 months.

4.6 Multiple Forecasting Exercises

A forecasting exercise consists of a sequence of 84 out-of-sample monthly forecasts constructed with the rolling-window method.
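The rolling-window bookkeeping can be sketched in a few lines. The sketch assumes that each exercise is identified by its last forecast month, that the in-sample window ends $h$ months before the month being forecast, and that it contains $108 - h + 1$ months; the function names and the year-month labels are illustrative choices.

```python
def month_index(year, month):
    """Map a calendar month to a consecutive integer index."""
    return year * 12 + (month - 1)


def month_label(index):
    """Inverse of month_index, formatted as YYYY-MM."""
    year, month0 = divmod(index, 12)
    return f"{year:04d}-{month0 + 1:02d}"


def rolling_windows(last_forecast, horizon, n_forecasts=84, base_window=108):
    """Yield (in-sample start, in-sample end, target month) for one exercise."""
    window = base_window - horizon + 1             # in-sample size in months
    for i in range(1, n_forecasts + 1):
        target = last_forecast - n_forecasts + i   # month being forecast
        end = target - horizon                     # last in-sample month
        start = end - window + 1                   # first in-sample month
        yield month_label(start), month_label(end), month_label(target)


# One exercise: last forecast in January 2001, six-month horizon.
windows = list(rolling_windows(month_index(2001, 1), horizon=6))

# Number of exercises when the last forecast month runs from 2000-12 to 2012-10.
n_exercises = month_index(2012, 10) - month_index(2000, 12) + 1
```

For a last forecast month of January 2001 and $h = 6$, the first window runs from February 1985 to August 1993 and targets February 1994, and the last window runs from January 1992 to July 2000 and targets January 2001. Moving the last forecast month from December 2000 to October 2012 gives 143 exercises.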
For each exercise and forecasting horizon $h$, we define the RMSE of the $m$-maturity yield by:

$$\mathrm{RMSE}(T_E, h, m) = \sqrt{\frac{\sum_{i=1}^{84}\left(\hat{Y}^{h}_{T_E-84+i}(m) - Y_{T_E-84+i}(m)\right)^2}{84}}, \qquad (25)$$

where $T_E$ is the last prediction date and $\hat{Y}^{h}_{T_E-84+i}(m) - Y_{T_E-84+i}(m)$ is the prediction error, since $\hat{Y}^{h}_{T_E-84+i}(m) = E_{T_E-84+i-h}\big(Y_{T_E-84+i}(m)\big)$ is the model conditional expectation at time $T_E - 84 + i - h$ for the $m$-maturity yield at time $T_E - 84 + i$, $h$ months in the future. To calculate this conditional expected value, the model is estimated based on data between $T_E - 192 + i$ and $T_E - 84 + i - h$ (in-sample period). For example, suppose $T_E$ = January 2001 and $h = 6$. In this forecasting exercise, the first prediction is made for February 1994, with an in-sample period from February 1985 to August 1993. Similarly, the eighty-fourth (last) prediction is made for January 2001, with an in-sample period from January 1992 to July 2000. The set of all 143 different forecasting exercises is obtained by changing $T_E$ monthly between December 2000 and October 2012.$^{24}$

We are particularly interested in making three kinds of comparisons: segmented models versus traditional models (DL, DSM and Gaussian Affine), segmented models versus the Random Walk, and exponential versus polynomial segmented models. First, following Diebold and Li, we estimate all models (except for the Affine) with AR dynamics for each latent factor.$^{25}$ The three-factor Gaussian Affine model with essentially affine market prices of risk (Duffee, 2002) is estimated with a Maximum Likelihood method described in the online appendix.$^{26}$

$^{23}$ Papers that adopt subsample robustness tests include De Pooter (2007), Jungbaker, Koopman and van der Wel (2013), Koopman and van der Wel (2013).
$^{24}$ Our methodology based on the analysis of multiple datasets, although less formal from a statistical viewpoint, is related to Giacomini and Rossi (2010). They develop statistical tests to examine the stability over time of the out-of-sample relative forecasting performance of a pair of models in the presence of an unstable environment.
$^{25}$ The differences in average RMSEs obtained with VAR dynamics versus univariate independent AR dynamics for the latent factors are presented in Table 1 in the online Appendix. Negative numbers indicate that the VAR methodology outperforms the AR, while positive numbers indicate the opposite. We observe that in 70% of the cases the AR methodology outperforms the VAR, which encourages us to keep the results with AR dynamics in the main body of the paper.
$^{26}$ The forecasting performance (average RMSEs) of a four-factor Gaussian Affine model