Common Factors in Commodity Futures Curves *


Dennis Karstanje, Michel van der Wel, and Dick van Dijk

May 28, 2013

Preliminary and Incomplete Draft
Please do not distribute without the authors' permission

Abstract

Commodities, such as oil, gold and metals, trade on the futures markets with a broad array of exercise dates. This paper examines the comovement of the factors driving individual commodity futures curves. We adopt the framework of the dynamic Nelson-Siegel model to decompose commodity futures curves into level, slope and curvature factors. We extend the model by (i) including a seasonal term to account for periodic behavior of commodity futures prices, and (ii) allowing for comovement across commodities by decomposing the factors into global, sector and idiosyncratic components. Our empirical results, based on 24 commodities over the period 1995-2012, document that the factors of the individual commodity futures curves are driven by common components. Around 75% of the individual level factor variation can be explained by global and sector components. For both the slope and curvature factors, this fraction is around 60%. These results give more insight into the market dynamics and can help in the construction of commodity futures portfolios and in hedging decisions.

Keywords: Commodity futures prices; comovement; term structure; global dynamic Nelson-Siegel model.

* Karstanje is from Erasmus University Rotterdam and the Tinbergen Institute. Van Dijk is from Erasmus University Rotterdam, Tinbergen Institute and ERIM. Van der Wel is from Erasmus University Rotterdam, CREATES, Tinbergen Institute and ERIM. Michel van der Wel is grateful to the Netherlands Organisation for Scientific Research (NWO) for a Veni grant, and acknowledges support from CREATES, funded by the Danish National Research Foundation. We are responsible for all errors.

Corresponding author. Address: Econometric Institute, H8-11, Erasmus University Rotterdam, PO Box 1738, Rotterdam, 3000DR, The Netherlands. E-mail: karstanje@ese.eur.nl. Other authors' e-mail addresses: vanderwel@ese.eur.nl (van der Wel), djvandijk@ese.eur.nl (van Dijk).

1 Introduction

Commodities have become a popular asset class among investors in recent years, as they offer interesting diversification opportunities in the context of broader investment portfolios as well as a useful hedge against inflation.¹ While commodity characteristics vary, we still witness comovements in their prices, as seen during the price boom before, and the downturn in, the second half of 2008. As commodity prices are important from an economic, sociological, and political perspective, so are their determinants and comovements. This makes it interesting to examine whether commodity price comovements can be characterized in terms of common factors.

In this paper we examine the comovement across commodities of the factors driving their entire futures curve, i.e. the collection of all available futures contracts of a particular commodity. To the best of our knowledge, we are the first to look at this issue. By including futures information we can examine not only comovement in the price level of different commodities but also comovement in the slope and curvature of their curves. This additional analysis of the shape of the futures curve sheds light on the beliefs of market participants about the price fundamentals. For example, the Theory of Storage states that the slope of the futures curve is related to the level of inventories (Kaldor, 1939; Working, 1949; Fama and French, 1987). Furthermore, using the entire futures curve can help to account for seasonal effects that might otherwise disturb the analysis of comovement in the price level.

A common approach to modeling the commodity futures curve is to express it in terms of unobserved factors, or so-called state variables (e.g. Schwartz, 1997). By specifying the functional form of these state variables and linking them to observed market factors, one can estimate the model using observed futures prices.
However, the parameter estimates depend critically on the choice of the stochastic processes for the state variables. Furthermore, this approach is quite restrictive because it needs many assumptions on the market factors and it has difficulty incorporating seasonal patterns (West, 2012).

¹ See e.g. Gorton and Rouwenhorst (2006). Note that the usefulness of commodities as a hedging instrument varies across individual commodities because of their heterogeneous nature, see Erb and Harvey (2006), Brooks and Prokopczuk (2011), and Daskalaki, Kostakis, and Skiadopoulos (2012), among others.

Taking into account the above, we take a different approach and use the Nelson and Siegel (1987) model. In particular, its parsimonious nature and its generality make it a natural model choice for our analysis. West (2012) uses it to fit a commodity futures curve and to obtain commodity price estimates for OTC forward contracts beyond the longest available maturity of exchange-traded futures contracts. Furthermore, Diebold and Li (2006) show that the Nelson-Siegel model can be used to model the term structure of interest rates. As the statistical features of commodity futures curves resemble those of the term structure of interest rates, this supports our choice of the Nelson-Siegel model.

We use an enhanced version of the Nelson and Siegel (1987) model to extract the factors that drive the individual commodity futures curves. The use of the Nelson-Siegel model is inspired by Diebold, Li, and Yue (2008), who model the comovement in the level and slope of government bond yield curves of multiple countries. They model the individual country curves as a function of a common global curve and country-specific idiosyncratic components. To adapt the framework to commodity futures curves, we extend the model of Diebold, Li, and Yue (2008) by including additional factors. Since commodity futures curves have both a larger variety of shapes and more pronounced seasonal patterns than yield curves, we add a curvature factor and account for seasonality.

Besides including additional factors, we also adapt the decomposition set-up to commodity-specific features. We investigate the comovement across commodities by decomposing each individual level, slope and curvature factor into a global, a sector, and an idiosyncratic component. Here we propose another extension to the approach of Diebold, Li, and Yue (2008), i.e. we consider a sector component besides global and idiosyncratic components.
This sector component is a natural addition, since individual commodities can be grouped into sectors in an obvious way. It may be important to distinguish between factors that capture comovement within a particular sector and global factors that affect all commodities. A related set-up is used in the business cycle literature, see e.g. Kose, Otrok, and Whiteman (2003). They use a decomposition into global, regional, and idiosyncratic components, but their setting has no futures curve dimension (as they only investigate macroeconomic variables).

We investigate commodities that are part of the Goldman Sachs Commodity Index (GSCI). The 24 commodities we study can be split into 5 sectors: energy, metals, softs, grains, and meats. As a large part of these commodities are also included in the Dow Jones-UBS Commodity Index (DJ-UBSCI),² and as we include at each point in time all available futures contracts, our data covers a large part of total exchange-related commodity trading.

Our results show that our enhanced version of the Nelson-Siegel model is suited to modeling commodity futures curves. Furthermore, using our global model we find that there is comovement in the common factors of commodity futures curves. For the level factor, the comovement is mostly due to a global level component (around 50%), while around 25% is due to sector components. The individual slope and curvature factors are for more than 60% driven by common components. For the slope factors this is slightly more due to the sector component, while for the curvature factors it is more due to the global component.

The dynamic behavior of commodity prices is an interesting topic from various perspectives. Since commodities are an important input for many production processes, their prices directly affect a large part of developed, industrialized economies. In addition, price fluctuations may severely affect developing countries that rely heavily upon revenues from exports of natural resources. Furthermore, policy makers are concerned by increasing commodity prices, particularly of oil and related products, because these can lead to higher inflation. Finally, rising prices of food and energy products can lead to all kinds of humanitarian and political unrest.

There is a large existing literature on commodity prices and their comovement. One strand focuses on the excess comovement of commodities.
Pindyck and Rotemberg (1990) state in their excess comovement hypothesis that seemingly unrelated commodities comove more than expected after correcting for macroeconomic influences. Other papers find weaker evidence or reject this excess comovement hypothesis after accounting for model misspecification, conditional heteroskedasticity, and non-normality (Deb, Trivedi, and Varangis, 1996) or after incorporating inventory and harvest information (Ai, Chatrath, and Song, 2006). Another strand documents common unobserved factors among individual commodity prices, e.g. Vansteenkiste (2009) and Byrne, Fazio, and Fiess (2012). Using various techniques they extract a latent factor that drives commodity prices and then link this factor to observed economic variables, such as exchange rates or real interest rates.

² This index was formerly known as the Dow Jones-AIG Commodity Index (DJ-AIGCI). Together, the GSCI and DJ-AIGCI are the two commodity indexes that have emerged as industry benchmarks (Stoll and Whaley, 2010).

We extend the work of Vansteenkiste (2009). She investigates a common factor in 32 non-fuel commodities and also allows for a sector-specific and a global factor. We differ by investigating the comovement of the entire futures curve of commodity prices instead of one price series per commodity. By applying the Nelson-Siegel model we have, by construction, a clear interpretation of our latent factors. Hence we focus more on the differences and time variation in the loadings of the factors on the global, sector, and idiosyncratic components than on linking these factors to observed variables.

Closest to our paper is the work of Ohana (2010). He models the joint evolution of the U.S. natural gas and heating oil futures curves by decomposing their returns into short-term and long-term shocks. Although our idea is related, our approach differs in that we include a curvature factor and investigate the comovement across 24 commodities instead of 2.

The rest of this paper is organized as follows. In the next section we present the methodology, followed by Section 3 where we present the data and the descriptive statistics. In Sections 4 and 5 we discuss the results, and in Section 6 we conclude.

2 Methodology

In this section we show how we decompose the futures curves and how we model commonality. We start by discussing our model step by step. In the second subsection, we discuss the applied estimation procedure.

2.1 Model

We model a collection of futures prices for $N$ different commodities. We denote by $f_{i,t}(\tau)$ the futures price of commodity $i$ at time $t$ with time to maturity $\tau$. We start from the dynamic version of the Nelson and Siegel (1987) model, as introduced in Diebold and Li (2006), to model the futures curve of each individual commodity $i$, for $i = 1, 2, \ldots, N$, as

$$
f_{i,t}(\tau) = l_{i,t} + s_{i,t}\left(\frac{1 - e^{-\lambda_i \tau}}{\lambda_i \tau}\right) + c_{i,t}\left(\frac{1 - e^{-\lambda_i \tau}}{\lambda_i \tau} - e^{-\lambda_i \tau}\right) + \nu_{i,t}(\tau), \tag{1}
$$

where $l_{i,t}$, $s_{i,t}$, and $c_{i,t}$ are allowed to vary over time and are interpreted as unobserved factors, the decay parameter $\lambda_i$ is assumed to be constant over time but commodity specific, and $\nu_{i,t}(\tau)$ is a disturbance term.

The interpretation of the unobserved factors $l_{i,t}$, $s_{i,t}$, and $c_{i,t}$ is determined by their loadings. The loading on the first factor is a constant, such that $l_{i,t}$ affects all futures prices in the same way irrespective of their maturity, hence the name level factor. The loading on the second factor is a decreasing function of the futures contract maturity $\tau$, and $s_{i,t}$ can therefore be seen as the slope of the futures curve. The loading on the third factor is a concave function of $\tau$, which allows hump-shaped term structures to be fitted, and $c_{i,t}$ can thus be interpreted as a curvature factor.

Commodity futures curves experience pronounced seasonal effects due to seasonal supply, e.g. harvest, or seasonal demand, e.g. cold weather (see e.g. Milonas, 1991). Hence, we enhance the model with a term that accounts for seasonality by including a trigonometric function that depends on the expiry month $g_i(t, \tau)$ of the contract.³ Trigonometric functions are often used to model seasonality, see for instance Sorensen (2002). The model enhanced with the seasonal term is given by (2).

³ The mathematical expression for the expiry month is $g_i(t, \tau) = t + \tau - S \lfloor (t + \tau)/S \rfloor$, with $S = 12$ the number of distinct seasons, where $\lfloor x \rfloor$ denotes the largest integer not greater than $x$. Therefore, $g_i(t, \tau)$ takes values in the integers $\{0, 1, \ldots, 11\}$. Our sample starts in January with $t = 0$, such that the integers $\{0, 1, \ldots, 11\}$ represent the expiry months January, February, ..., December.
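As a minimal sketch of the loadings in (1) and the expiry-month index of footnote 3, the following Python functions evaluate the three Nelson-Siegel loadings and $g(t, \tau)$. The function names and the example decay value are illustrative assumptions and are not taken from the paper; $t$ and $\tau$ are assumed to be measured in whole months.

```python
import math

def ns_loadings(tau, lam):
    """Level, slope and curvature loadings of equation (1) at maturity tau > 0."""
    x = lam * tau
    slope = (1.0 - math.exp(-x)) / x      # decreasing in tau
    curvature = slope - math.exp(-x)      # rises from zero, then decays (hump)
    return 1.0, slope, curvature

def expiry_month(t, tau, seasons=12):
    """g(t, tau) = t + tau - S * floor((t + tau) / S).

    With t = 0 in a January, 0 stands for January and 11 for December.
    """
    total = t + tau
    return total - seasons * (total // seasons)

# Hypothetical decay parameter, chosen only for illustration.
lam = 0.1
for tau in (1, 6, 12, 24):
    level, slope, curv = ns_loadings(tau, lam)
    print(tau, level, round(slope, 3), round(curv, 3), expiry_month(0, tau))
```

The level loading is constant, the slope loading falls with maturity, and the curvature loading first rises and then decays, which is what gives the three factors their names.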

$$
f_{i,t}(\tau) = l_{i,t} + s_{i,t}\left(\frac{1 - e^{-\lambda_i \tau}}{\lambda_i \tau}\right) + c_{i,t}\left(\frac{1 - e^{-\lambda_i \tau}}{\lambda_i \tau} - e^{-\lambda_i \tau}\right) + \kappa_i \cos\left(\omega g_i(t, \tau) - \omega \theta_i\right) + \nu_{i,t}(\tau), \tag{2}
$$

where the parameter $\kappa_i$ determines the commodity-specific exposure to the seasonal, the constant $\omega$ determines the cycle length, and the parameter $\theta_i$ indicates the peak of the seasonal.

We investigate the comovement of commodity prices by linking the futures curves of the individual commodities. This link across commodities is accomplished by letting the latent level, slope, and curvature factors depend on global, sector, and idiosyncratic components. If, for example, the commodities' individual level factors load mostly on a global component, there will be strong comovement in the level of the futures curves. We define the factor decompositions

$$
\begin{aligned}
l_{i,t} &= \alpha^L_i + \beta^L_i L_{\text{global},t} + \gamma^L_i L_{\text{sector},t} + \varepsilon^L_{i,t}, \\
s_{i,t} &= \alpha^S_i + \beta^S_i S_{\text{global},t} + \gamma^S_i S_{\text{sector},t} + \varepsilon^S_{i,t}, \\
c_{i,t} &= \alpha^C_i + \beta^C_i C_{\text{global},t} + \gamma^C_i C_{\text{sector},t} + \varepsilon^C_{i,t},
\end{aligned} \tag{3}
$$

where $\{\alpha^L_i, \alpha^S_i, \alpha^C_i\}$ are constant terms, $\{\beta^L_i, \beta^S_i, \beta^C_i\}$ are loadings on global components, $\{\gamma^L_i, \gamma^S_i, \gamma^C_i\}$ are loadings on sector components, and $\{\varepsilon^L_{i,t}, \varepsilon^S_{i,t}, \varepsilon^C_{i,t}\}$ are idiosyncratic components. We include sector components besides the global components because we may expect commodities in the same sector to be more closely related than commodities in different sectors. The three-way decomposition of the futures curve factors is in line with Kose, Otrok, and Whiteman (2003), who let business cycles depend on a global, a regional, and an idiosyncratic part. In contrast, Diebold, Li, and Yue (2008) decompose the country yield factors only into a global and an idiosyncratic component.

The global, sector, and idiosyncratic components are assumed to have first-order autoregressive dynamics:

$$
\begin{pmatrix} L_{x,t} \\ S_{x,t} \\ C_{x,t} \end{pmatrix}
=
\begin{pmatrix} \phi^x_{11} & \phi^x_{12} & \phi^x_{13} \\ \phi^x_{21} & \phi^x_{22} & \phi^x_{23} \\ \phi^x_{31} & \phi^x_{32} & \phi^x_{33} \end{pmatrix}
\begin{pmatrix} L_{x,t-1} \\ S_{x,t-1} \\ C_{x,t-1} \end{pmatrix}
+
\begin{pmatrix} \eta^L_{x,t} \\ \eta^S_{x,t} \\ \eta^C_{x,t} \end{pmatrix}, \tag{4}
$$

where $x = \{\text{global}, \text{sector}\}$, and the disturbances $\eta_{x,t} = (\eta^L_{x,t}, \eta^S_{x,t}, \eta^C_{x,t})$ are normally distributed with covariance matrix $\Sigma_{\eta_x}$, plus

$$
\begin{pmatrix} \varepsilon^L_{i,t} \\ \varepsilon^S_{i,t} \\ \varepsilon^C_{i,t} \end{pmatrix}
=
\begin{pmatrix} \phi^{\text{idio}}_{11} & \phi^{\text{idio}}_{12} & \phi^{\text{idio}}_{13} \\ \phi^{\text{idio}}_{21} & \phi^{\text{idio}}_{22} & \phi^{\text{idio}}_{23} \\ \phi^{\text{idio}}_{31} & \phi^{\text{idio}}_{32} & \phi^{\text{idio}}_{33} \end{pmatrix}
\begin{pmatrix} \varepsilon^L_{i,t-1} \\ \varepsilon^S_{i,t-1} \\ \varepsilon^C_{i,t-1} \end{pmatrix}
+
\begin{pmatrix} u^L_{i,t} \\ u^S_{i,t} \\ u^C_{i,t} \end{pmatrix}, \tag{5}
$$

where the shocks $u_{i,t} = (u^L_{i,t}, u^S_{i,t}, u^C_{i,t})$ are normally distributed with covariance matrix $\Sigma_{u_i}$. We do not allow for cross-correlation of the shocks $u_{i,t}$ across commodities and also assume they are not correlated with the shocks $\eta_{x,t}$ to the global and sector components.

We have two related identification issues, as neither the signs nor the scales of the global and sector factors and their factor loadings are separately identified. We follow Sargent and Sims (1977) and Stock and Watson (1989) and identify the scales by assuming that each disturbance variance is equal to a constant, i.e. $\Sigma_{\eta_x}(j, j) = .1$ for $j = 1, 2, 3$.⁴ We identify the factor signs by restricting one of the loadings for each of the global and sector components to be positive. Besides the assumptions we need for identification, we make additional assumptions to facilitate tractable estimation by reducing the number of parameters. That is, both the disturbance matrices $\Sigma_{\eta_x}$ and $\Sigma_{u_i}$ and the autoregressive matrices in (4) and (5) are assumed to be diagonal.

⁴ The value of .1 is in line with the estimated variance of the idiosyncratic factors, as will become clear in the results section.

2.2 Estimation

The model as given by (2)-(5) can be estimated using either a two-step approach (see e.g. Diebold et al., 2008) or a one-step approach (see e.g. Diebold et al., 2006). In the two-step approach one first extracts the latent factors $l_{i,t}$, $s_{i,t}$, and $c_{i,t}$ in (2) at each point in time for each commodity, and in the second step decomposes the extracted factors into the global, sector and idiosyncratic components. We use the one-step approach, where we cast the complete model in a state space representation and use the Kalman filter to estimate all parameters as well as the latent factors simultaneously. The advantage of this procedure is that the estimation uncertainty in the extracted factors is taken into account when they are decomposed into components. Furthermore, we can use both time series and cross-sectional observations to accurately estimate the parameters. To initialize the one-step approach we can use estimation results of smaller versions of our full model, e.g. a variant without global or sector components.

The state space representation follows naturally from the model given by (2)-(5). The measurement equation in (6) is a combination of (2) and (3). Note that the individual latent level $l_{i,t}$, slope $s_{i,t}$, and curvature $c_{i,t}$ factors do not appear in the measurement equation, as we can link the observed futures prices $f_{i,t}(\tau)$ directly to the unobserved global, sector and idiosyncratic components. The transition equations of the latent states are given by (4) and (5).

$$
\begin{pmatrix} f_{1,t}(\tau_1) \\ f_{1,t}(\tau_2) \\ \vdots \\ f_{1,t}(\tau_{J_1}) \\ \vdots \\ f_{N,t}(\tau_{J_N}) \end{pmatrix}
= A \begin{pmatrix} \alpha^L_1 \\ \alpha^S_1 \\ \alpha^C_1 \\ \vdots \\ \alpha^C_N \end{pmatrix}
+ B \begin{pmatrix} L_{\text{global},t} \\ S_{\text{global},t} \\ C_{\text{global},t} \end{pmatrix}
+ C \begin{pmatrix} L_{\text{Energy},t} \\ S_{\text{Energy},t} \\ C_{\text{Energy},t} \\ \vdots \\ L_{\text{Meats},t} \\ S_{\text{Meats},t} \\ C_{\text{Meats},t} \end{pmatrix}
+ A \begin{pmatrix} \varepsilon^L_{1,t} \\ \varepsilon^S_{1,t} \\ \varepsilon^C_{1,t} \\ \vdots \\ \varepsilon^C_{N,t} \end{pmatrix}
+ D \begin{pmatrix} \kappa_1 \\ \kappa_2 \\ \vdots \\ \kappa_N \end{pmatrix}
+ \begin{pmatrix} \nu_{1,t}(\tau_1) \\ \nu_{1,t}(\tau_2) \\ \vdots \\ \nu_{1,t}(\tau_{J_1}) \\ \vdots \\ \nu_{N,t}(\tau_{J_N}) \end{pmatrix}, \tag{6}
$$

where $J_i$ is the number of available contracts of commodity $i$, and entries of the matrices below that are not shown are zero:

$$
A = \begin{pmatrix}
1 & \frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1} & \frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1} - e^{-\lambda_1 \tau_1} & & & & \\
1 & \frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2} & \frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2} - e^{-\lambda_1 \tau_2} & & & & \\
\vdots & \vdots & \vdots & & & & \\
& & & \ddots & & & \\
& & & & 1 & \frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}} & \frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}} - e^{-\lambda_N \tau_{J_N}}
\end{pmatrix},
$$

$$
B = \begin{pmatrix}
\beta^L_1 & \beta^S_1 \left(\frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1}\right) & \beta^C_1 \left(\frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1} - e^{-\lambda_1 \tau_1}\right) \\
\beta^L_1 & \beta^S_1 \left(\frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2}\right) & \beta^C_1 \left(\frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2} - e^{-\lambda_1 \tau_2}\right) \\
\vdots & \vdots & \vdots \\
\beta^L_N & \beta^S_N \left(\frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}}\right) & \beta^C_N \left(\frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}} - e^{-\lambda_N \tau_{J_N}}\right)
\end{pmatrix},
$$

$$
C = \begin{pmatrix}
\gamma^L_1 & \gamma^S_1 \left(\frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1}\right) & \gamma^C_1 \left(\frac{1 - e^{-\lambda_1 \tau_1}}{\lambda_1 \tau_1} - e^{-\lambda_1 \tau_1}\right) & & & & \\
\gamma^L_1 & \gamma^S_1 \left(\frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2}\right) & \gamma^C_1 \left(\frac{1 - e^{-\lambda_1 \tau_2}}{\lambda_1 \tau_2} - e^{-\lambda_1 \tau_2}\right) & & & & \\
\vdots & \vdots & \vdots & & & & \\
& & & \ddots & & & \\
& & & & \gamma^L_N & \gamma^S_N \left(\frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}}\right) & \gamma^C_N \left(\frac{1 - e^{-\lambda_N \tau_{J_N}}}{\lambda_N \tau_{J_N}} - e^{-\lambda_N \tau_{J_N}}\right)
\end{pmatrix},
$$

where the nonzero entries in each row of $C$ sit in the three columns of the sector to which the commodity belongs, and

$$
D = \begin{pmatrix}
\cos\left(\omega g_1(t, \tau_1) - \omega \theta_1\right) & & \\
\cos\left(\omega g_1(t, \tau_2) - \omega \theta_1\right) & & \\
\vdots & & \\
\cos\left(\omega g_1(t, \tau_{J_1}) - \omega \theta_1\right) & & \\
& \ddots & \\
& & \cos\left(\omega g_N(t, \tau_{J_N}) - \omega \theta_N\right)
\end{pmatrix}.
$$

We present our model in stacked form because we treat the multivariate series as univariate series, following Koopman and Durbin (2000). We can consider the futures prices separately since we assume that the variance of $\nu_{i,t}(\tau_j)$ is the same for all contracts of a particular commodity $i$, and that there is no cross-correlation between different commodities (all commonality is absorbed by the factors). In other words, the variance matrix of $\nu_{i,t}(\tau_j)$ can be written as an $N$ by $N$ diagonal matrix. The univariate treatment not only gives us computational gains but also allows the number of term-structure observations $J_i$ to be time-varying.⁵

The parameters in (6) ($\lambda$, $\alpha$, $\beta$, $\gamma$, $\kappa$, $\theta$), the VAR coefficient matrices $\Phi^x$, and the variance matrices $\Sigma_{\eta_x}$ and $\Sigma_{u_i}$ are treated as unknown coefficients that are collected in the parameter vector $\Psi$. Estimation of $\Psi$ is based on numerical maximization of the log-likelihood function, which is constructed via the prediction error decomposition.

⁵ Note that for readability reasons we keep writing $J_i$ instead of $J_{i,t}$.
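To illustrate the univariate treatment and the prediction error decomposition, the sketch below filters a deliberately simplified model: one scalar AR(1) state observed through a varying number of contracts per period. It is not the 24-commodity model of (6); the function name, the tuple layout of the observations, and all numerical values are assumptions made for this example.

```python
import math

def kalman_loglik(observations, phi, q, h, a0, p0):
    """Gaussian log-likelihood via the prediction error decomposition.

    observations: list over time; each element is a list of (y, z, d)
        tuples for y = d + z * state + noise, with noise variance h.
        The inner lists may differ in length (time-varying J).
    phi, q: AR(1) coefficient and shock variance of the scalar state.
    a0, p0: mean and variance of the state before the first update.
    """
    a, p = a0, p0
    loglik = 0.0
    for obs_t in observations:
        # Univariate treatment: update with one contract at a time,
        # which is exact when the measurement errors are uncorrelated.
        for y, z, d in obs_t:
            v = y - d - z * a            # prediction error
            f = z * p * z + h            # prediction error variance
            k = p * z / f                # Kalman gain
            a = a + k * v
            p = p - k * z * p
            loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        # Transition to the next period.
        a = phi * a
        p = phi * p * phi + q
    return loglik

# Two periods; the second has one more contract than the first.
data = [[(0.3, 1.0, 0.0), (-0.2, 0.5, 0.0)],
        [(0.1, 1.0, 0.0), (0.0, 0.5, 0.0), (0.2, 0.25, 0.0)]]
print(kalman_loglik(data, phi=0.9, q=0.1, h=0.1, a0=0.0, p0=1.0))
```

In a full implementation the scalar state would be replaced by the stacked vector of global, sector and idiosyncratic components, and a numerical optimizer would maximize this function over the parameters.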

3 Data

We study futures curves for 24 commodity contracts. Our commodity futures data are provided by Datastream. We consider the period January 1995 to September 2012 and use all individual contracts that expire from January 1995 onwards. Our commodity selection is based on the composition of the S&P Goldman Sachs Commodity Index (GSCI), which includes commodities such as oil and gold. The commodities can be split into 5 sectors: energy, metals, softs, grains, and meats. An overview of the data is given in Table 1.

All our analyses are done at the monthly frequency, for which we use month-end log prices. Furthermore, we standardize these prices since the price scales of the commodities are quite diverse. To avoid liquidity issues close to the expiration date, we do not consider price information in the month in which the future expires.

[insert Table 1 here]

Table 1 shows that the number of cross-sectional observations varies per commodity. For the energy and industrial metal commodities there is an expiring futures contract each month, while for the other commodities there are fewer than 12 expiry months a year. In general, agricultural commodities have a small number of active futures, partly because of the lower number of distinct expiry months per year. The energy and industrial metal commodities have active futures with a much longer time to maturity, such that investors can hedge positions much further into the future. The variation in the number of contracts and the maximum time to maturity indicates that it is important to use a commodity-specific decay parameter $\lambda_i$ in (2).

[insert Figure 1 here]

Figure 1 gives some insight into the data we use by showing the complete set of available futures prices for natural gas and coffee.
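The preprocessing described earlier in this section (log prices, standardization, and dropping prices in the expiry month) can be sketched as follows. The text does not spell out how prices are standardized, so the example assumes a z-score of log prices per commodity across all retained quotes; the data layout and the toy numbers are hypothetical.

```python
import math
import statistics

def standardized_log_prices(quotes):
    """quotes: list of (months_to_expiry, price) pairs for one commodity.

    Drops quotes in the expiry month (months_to_expiry == 0), takes logs,
    and rescales to zero mean and unit standard deviation.
    """
    kept = [(m, math.log(p)) for m, p in quotes if m > 0]
    logs = [lp for _, lp in kept]
    mean = statistics.mean(logs)
    sd = statistics.pstdev(logs)
    return [(m, (lp - mean) / sd) for m, lp in kept]

# Toy month-end quotes: (months to expiry, price).
quotes = [(0, 101.0), (1, 100.0), (2, 102.5), (3, 104.0), (6, 108.0)]
for months, value in standardized_log_prices(quotes):
    print(months, round(value, 3))
```

Standardizing per commodity puts all 24 price series on a comparable scale, at the cost of making fitted errors hard to compare across commodities.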
For both natural gas and coffee we observe that the shape of the futures curve varies substantially over time, with alternating periods of pronounced contango and backwardation, especially for natural gas.⁷ Both futures curves also clearly show a large increase in the general price level during the sample period. A notable difference between these commodities is that the futures curve of natural gas displays a strong periodic pattern, with spikes occurring for expiry months during the winter, while the curve of coffee does not show any signs of seasonality. Finally, Figure 1 illustrates a data feature that needs to be taken into account in the state space methodology, namely that the number of available contracts varies over time. For both natural gas and coffee (and in fact also for most other commodities), contracts with longer maturities have only become available in the most recent years of our sample period.

⁶ The start of the sample period is based on the availability of the metal commodities traded on LME, which are only available from July 1993 onwards.

⁷ An upward sloping commodity futures curve is said to be in contango, while a downward sloping curve is in backwardation.

The summary statistics in Table 1 show that there are large differences both across commodities and along their futures curves. The returns of the contracts range from 2% to 18.8% and are more extreme for the first nearby contract. The volatility of the returns confirms this, as in all cases the returns of the fifth nearby contract are less volatile than the returns of the first nearby contract, which is known as the Samuelson (1965) effect.

4 Individual Commodity Results

Before we estimate our full global state space model, we start by analyzing the individual commodities separately. For this we still use the Kalman filter but leave out the global and sector components in (6). First, we present the extracted factors and their parameters, and decide on the number of factors to include for each commodity. Second, we investigate the commonality in the extracted factors across commodities using principal component analysis (PCA).

4.1 Estimation Results Individual Factors

So far, we have presented the model in general form, where all commodity curves are built up from a level, slope, and curvature factor combined with a seasonal term. However, not all commodity curves show dynamics for which we need the flexibility of three factors. Furthermore, not all commodities in our sample display periodic behavior that we need to account for. Based on commodity-specific data only, we decide on the number of factors to include and whether to include the seasonal term. Furthermore, we estimate the decay parameter $\lambda$.

[insert Figure 2 and Table 2 here]

We first provide evidence that our models are appropriate for commodity futures curve data. Figure 2 shows four examples of the fit of our models. Subfigures A and B correspond to the natural gas futures curve, while Subfigures C and D correspond to the futures curve of coffee. Note that the presented figures are snapshots at one particular point in time. The natural gas futures curve displays a pronounced seasonal pattern. In general our fitted values are close to the real prices, with some exceptions at the short end of the curve. The inclusion of the seasonal term seems an appropriate way to model the periodic behavior. The futures curves of coffee do not show strong seasonal patterns. Our fitted values (dashed) are again close to the true prices.

The left-hand side of Table 2 shows the model fit in terms of error descriptives, where the errors are defined as the difference between the fitted and the observed data. The first three columns give insight into the errors per commodity for the individual models; the last three columns present the same statistics for the joint model we discuss in Section 5. The average errors are close to zero and much smaller than their volatility, presented in the second column. Due to the standardization and transformation of the data, the exact magnitude of the errors is difficult to interpret. This also implies that the numbers are not suitable for cross-commodity comparison.

[insert Figures 3, 4, and 5 here]

We now turn to the extracted level, slope, and curvature factors of each commodity.
Figures 3, 4, and 5 show the extracted level, slope, and curvature factors per commodity sector. In general, we see a similar pattern in all level factors. Until 2004 they are relatively constant, then they increase until they peak in 2008, after which they again remain constant. The level factors within the energy and the softs sectors seem to comove the most. The slope factors in Figure 4 show some peaks and troughs. Especially for the energy commodities we see a sharp decline in 2008 and a gradual increase thereafter. As the Nelson-Siegel loading on the slope factor in (2) is a decreasing function of maturity, a negative factor estimate signifies an upward sloping (i.e. contangoed) futures curve. This implies that in 2008 all the backwardated energy futures curves quickly went into contango, and only gradually returned to backwardation. Finally, the curvature factors in Figure 5 again show some degree of comovement. Note that the metal commodities are missing, as we find that two factors are enough to capture their curve dynamics (see more details below). Of the four sectors, the energy commodities have the most strongly comoving curvature factors.

[insert Table 3 here]

Table 3 presents the estimated parameter values and the final model choice for each of the commodities. The decay parameter $\lambda$ varies substantially across commodities, ranging from .5 for metals to .625 for live cattle. The value of .5 is the lowest value we allow, to prevent the loadings of the level and slope factors from becoming too similar. The variation in $\lambda$ is due both to differences in curve shapes and to differences in the number of available futures contracts. Examples of commodities with a large term-structure dimension are Brent crude oil, WTI crude oil, natural gas, and the industrial metals. As expected, these commodities indeed have a relatively low value for $\lambda$. The effect of different futures curve shapes on $\lambda$ becomes clear when we compare results for commodities with a similar number of contracts. Gasoil, heating oil, gasoline, gold, silver, and soybeans all have between 2 to 4 available contracts. Still, gold and silver have an estimated $\lambda$-value of .5 while the others range between .148 and .277.
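The reason for bounding $\lambda$ from below can be seen numerically: as $\lambda$ shrinks, the slope loading approaches one at every maturity and so becomes hard to distinguish from the constant level loading. The sketch below uses hypothetical $\lambda$ values and an assumed grid of monthly maturities up to two years; neither is taken from Table 3.

```python
import math

def slope_loading(tau, lam):
    """Nelson-Siegel slope loading (1 - exp(-lam * tau)) / (lam * tau)."""
    x = lam * tau
    return (1.0 - math.exp(-x)) / x

def max_gap_to_level(lam, maturities):
    """Largest distance between the slope loading and the level loading of 1."""
    return max(1.0 - slope_loading(tau, lam) for tau in maturities)

maturities = range(1, 25)          # 1 to 24 months, an assumed grid
for lam in (0.005, 0.05, 0.5):     # hypothetical decay parameters
    print(lam, round(max_gap_to_level(lam, maturities), 3))
```

A small gap means the level and slope loadings are nearly collinear over the observed maturities, so the two factors cannot be told apart reliably.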
These differences in $\lambda$ can be explained by the barely curved futures curves of gold and silver versus the hump-shaped curves of the other commodities. From now onwards, we exclude the curvature factor for the metal commodities. Given their estimates for $\lambda$ and the number of available contracts, the loadings on the slope and curvature factors are highly correlated, which leads to identification and estimation issues.

The estimates in Table 3 related to the seasonal correction term are the exposure $\kappa$ and the location parameter $\theta$. When $\kappa$ is close to zero, $\theta$ cannot be identified, which is indicated by "-" in the table. We find that most of the agricultural commodities display seasonal patterns. Furthermore, in line with our expectations, we find that natural gas and, to a lesser extent, heating oil and gasoline show seasonal behavior. For all commodities where $\kappa$ is not close to zero and $\theta$ is identifiable, we subtract the seasonal term before feeding the price data into the global model.

4.2 Commonality Across Individual Factors

Based on the plots of the individual factors, there seems to be commonality across commodities. To investigate this further, we apply PCA both to all commodities and to subgroups that correspond to the commodity sectors.

We first analyze the percentage of variation that a global component explains. The global level component is approximated by the first principal component when PCA is applied to all 24 extracted level factors. For each commodity we can compute the fraction of individual variance that is explained by this first principal component. We then take the average over all commodities in a particular sector. The same analysis is also applied to all slope factors and to all curvature factors.

Panel A in Table 4 shows that there seems to be a global component that drives the level factors, as the first principal component explains 81% of the variation in the individual level factors. The energy, metals, and grains level factors in particular comove with the global level component, as more than 85% of their variation is explained. For the other two commodity sectors we observe that around 65% of their variation is explained by the global level component. The global slope component shows less comovement on average, as indicated by the explained variation of 26.2%.
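The first-principal-component measure used in this subsection can be sketched in plain Python. The example assumes that PCA is applied to the correlation matrix of the extracted factors (the text does not say whether the series are rescaled first) and finds the leading eigenvector by power iteration; all function names and the toy series are illustrative.

```python
import math

def standardize(series):
    """Rescale a series to zero mean and unit (population) variance."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / sd for v in series]

def explained_by_first_pc(factors, iterations=500):
    """Fraction of each series' variance explained by the first principal component."""
    z = [standardize(s) for s in factors]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)]
            for i in range(k)]
    vec = [1.0] * k                      # start vector for power iteration
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    # For standardized series the explained share is eigenvalue * loading^2.
    return [eigenvalue * v * v for v in vec]

# Toy "level factors": the first two move together, the third is unrelated.
level_factors = [[-2, -1, 0, 1, 2], [-4, -2, 0, 2, 4], [2, -1, -2, -1, 2]]
shares = explained_by_first_pc(level_factors)
print([round(s, 3) for s in shares], round(sum(shares) / len(shares), 3))
```

Averaging the shares over the commodities of a sector gives numbers of the kind reported in Panel A, and running the same function on the commodities of one sector only corresponds to the Panel B set-up.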
The decomposition into sectors shows that the global slope component still explains half of the variation in the energy slope factors, while it hardly explains any variation for the softs and grains sectors. The global curvature component shows similar results to the global slope component. In general, the global curvature component explains 21.1% of the variation in the individual commodity curvature factors. For energy the percentage of explained variation is much higher (46.1%), while for the softs and grains sectors it explains less than 5% of the variation.

[insert Table 4 here]

Panel B in Table 4 shows how much variation the global and sector components jointly explain. The results are based on PCA applied only to the extracted factors of commodities in a particular sector. Hence, we can see the percentage of variation that is explained by common comovement, which can be due either to the global or to the sector-specific component. Roughly stated, all differences in percentages between Panel A and Panel B can be attributed to the sector-specific component. In the case of the level factors we see that the percentages in Panel B are close to the percentages in Panel A. Hence only about 1% of the variation explained can be attributed to the energy sector level component. The results for the slope and curvature factors are different. Here we see clear differences between Panel A and Panel B, in particular for the softs and grains sectors. This implies that up to 7% of the variation in individual slope or curvature factors can be explained by a sector-specific component.

5 Joint Model for Collection of Commodity Curves

In this section we estimate the global state space model given by (4)-(6). First, we discuss the dynamics of the unobserved factors and the loadings of the commodity factors on the common components. Thereafter, we investigate the importance of the various components using variance decompositions.

5.1 Estimation Results Joint Model

When we estimate our full model, we fix the $\lambda$, $\kappa$, and $\theta$ parameters at their estimates based on commodity-specific data, in order to reduce the computational burden somewhat.

All other parameters are collected in the parameter vector Ψ and estimated using the Kalman filter and maximum likelihood.

[insert Table 2 and Table 5 here]

We first provide evidence that our models are appropriate for commodity futures curve data. Figure 2 shows four examples of the fit of our models. Subfigures A and B correspond to the natural gas futures curve, while Subfigures C and D correspond to the futures curve of coffee. Note that the figures are snapshots at one particular point in time. The natural gas futures curve displays a pronounced seasonal pattern. In general our fitted values are close to the real prices, with some exceptions at the short end of the curve. The inclusion of the seasonal term seems an appropriate way to model the periodic behavior. The futures curves of coffee do not show strong seasonal patterns. The dashed fitted values are again close to the true prices.

We start by comparing the model fit of the individual models from the previous section with the fit of the joint model. The left-hand side of Table 2 shows the error statistics of the individual models, while the right-hand side shows those of the joint model. The average errors of the joint model are all positive and in general slightly larger than their individual-model counterparts, but they are not significantly different from zero. The error volatilities of the joint model are smaller than the corresponding numbers on the left-hand side of the table; hence the joint model seems to introduce a slight upward bias but overall provides a better fit. The better fit is also reflected in the mean squared errors in the last column, which are slightly lower than the numbers in column 3. The inclusion of common factors has some effect on the fit of the individual futures curves, but overall the joint model still seems appropriate for our data.

Table 5 shows the estimates related to the state equations (4) and (5).
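To make the estimation step more concrete, the following is a minimal sketch of how a Gaussian log-likelihood can be evaluated with the Kalman filter via the prediction error decomposition. It is not the authors' implementation. It assumes a standard dynamic Nelson-Siegel measurement equation without the seasonal term, diagonal autoregressive and disturbance covariance matrices, and a single measurement error variance. The function names `nelson_siegel_loadings` and `kalman_loglik` and all numerical values are hypothetical and chosen for illustration only.

```python
import numpy as np

def nelson_siegel_loadings(tau, lam):
    """Standard Nelson-Siegel loadings (level, slope, curvature) for maturities tau.
    Assumption: the seasonal term used in the paper is omitted here."""
    tau = np.asarray(tau, dtype=float)
    slope = (1.0 - np.exp(-lam * tau)) / (lam * tau)
    curvature = slope - np.exp(-lam * tau)
    return np.column_stack([np.ones_like(tau), slope, curvature])

def kalman_loglik(y, Z, phi, q, h):
    """Log-likelihood of y (T x N) under the illustrative model
        y_t     = Z x_t + e_t,            e_t ~ N(0, h * I)
        x_{t+1} = diag(phi) x_t + u_t,    u_t ~ N(0, diag(q)),
    with |phi| < 1 so that the state has a stationary distribution."""
    y = np.asarray(y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    phi = np.asarray(phi, dtype=float)
    q = np.asarray(q, dtype=float)
    T, N = y.shape
    Phi, Q, H = np.diag(phi), np.diag(q), h * np.eye(N)
    x = np.zeros(len(phi))                  # unconditional state mean
    P = np.diag(q / (1.0 - phi ** 2))       # unconditional state variance
    loglik = 0.0
    for t in range(T):
        v = y[t] - Z @ x                    # one-step-ahead prediction error
        F = Z @ P @ Z.T + H                 # prediction error variance
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (N * np.log(2.0 * np.pi) + logdet
                          + v @ np.linalg.solve(F, v))
        K = P @ Z.T @ np.linalg.inv(F)      # Kalman gain
        x_filt = x + K @ v                  # filtered state
        P_filt = P - K @ Z @ P
        x = Phi @ x_filt                    # predict next state
        P = Phi @ P_filt @ Phi.T + Q
    return loglik

# Illustrative use on simulated data (all values are made up).
rng = np.random.default_rng(0)
tau = np.array([1.0, 3.0, 6.0, 12.0])       # hypothetical maturities in months
Z = nelson_siegel_loadings(tau, lam=0.5)
phi = np.array([0.99, 0.90, 0.90])
q = np.array([0.1, 0.1, 0.1])
h = 0.01
x = rng.normal(scale=np.sqrt(q / (1.0 - phi ** 2)))  # draw from stationary law
y = np.empty((200, len(tau)))
for t in range(200):
    y[t] = Z @ x + rng.normal(scale=np.sqrt(h), size=len(tau))
    x = phi * x + rng.normal(scale=np.sqrt(q))

print(kalman_loglik(y, Z, phi, q, h))                          # at the simulation values
print(kalman_loglik(y, Z, np.array([0.5, 0.5, 0.5]), q, h))    # at other values
```

In a maximum likelihood routine, a numerical optimizer would call a function like `kalman_loglik` repeatedly while varying the free parameters, which is why fixing some parameters in advance reduces the computational burden.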
Note that we have assumed that both the autoregressive coefficient matrices and the covariance matrices are diagonal. The first three columns show the autoregressive coefficients and the last three columns show the variances of the factor disturbances. All level factors are highly persistent, with first-order autoregressive coefficients all above 0.9. The global factor and the energy sector factor are close to non-stationary, with values of 0.99. Given the rise in commodity prices over the last 15 years this was to be expected. The slope factors are slightly less persistent. The lowest value is 0.76; however, most values are again above 0.9. The curvature factors show coefficients similar to those of the slope factors. Overall, most of our unobserved factors are persistent.

The estimated variances of the factor disturbances are in general comparable across factors. Noteworthy exceptions are the estimates for the natural gas factors. Hence, for natural gas our three-factor structure with seasonal correction term still seems insufficient to capture all price dynamics. On the other hand, our goal is not to explain all commodity prices in full detail but to extract the most relevant dynamics and see whether these are related across commodities.

[insert Figures 6, 7, and 8 here]

To get a better grasp of the unobserved level, slope, and curvature factors and their different components, we show them in Figures 6, 7, and 8. In subplot (a) of each figure we show the global component together with the five sector components. In subplots (b)-(f) we show the sector components together with the corresponding idiosyncratic commodity components.

[insert Table 6 here]

So far, we have discussed the individual dynamics of the unobserved factors. Now, we turn to the commonality across commodities as expressed by their loadings on global and sector components. Each commodity level, slope, and curvature factor can load on a global, sector and idiosyncratic component, see also (3). Table 6 shows for each commodity level, slope, and curvature factor the estimated constant, α, the estimated loading on the global component, β, and the estimated loading on the sector component, γ. The α-parameters make sure that the idiosyncratic components have mean zero. Most values are comparable and close to zero because we already scale the log prices.
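The loading structure described above can be illustrated with a small simulation. This is only a sketch under assumptions: it takes the decomposition to have the linear form implied by the variance expressions in (7), that is, a constant α plus β times the global component plus γ times the sector component plus an idiosyncratic component, and it treats every component as a stationary AR(1) process. The helper names `simulate_ar1` and `commodity_factor` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_ar1(phi, sigma, T, rng):
    """Simulate a zero-mean stationary AR(1) series of length T (requires |phi| < 1)."""
    x = np.empty(T)
    x[0] = rng.normal(scale=sigma / np.sqrt(1.0 - phi ** 2))  # stationary start
    for t in range(1, T):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return x

def commodity_factor(alpha, beta, gamma, global_comp, sector_comp, idio_comp):
    """Assumed form: constant + loading * global + loading * sector + idiosyncratic."""
    return alpha + beta * global_comp + gamma * sector_comp + idio_comp

# Illustrative values only, not estimates from the paper.
rng = np.random.default_rng(42)
T = 213                                                 # hypothetical number of months
level_global = simulate_ar1(0.99, 1.0, T, rng)          # very persistent global level
level_sector = simulate_ar1(0.95, 1.0, T, rng)
level_idio = simulate_ar1(0.90, 0.5, T, rng)            # zero-mean idiosyncratic part

level_i = commodity_factor(alpha=0.1, beta=0.8, gamma=-0.3,
                           global_comp=level_global,
                           sector_comp=level_sector,
                           idio_comp=level_idio)
print(level_i[:5])
```

Because the global and sector disturbances share the same variance in this sketch, the sizes of `beta` and `gamma` can be compared directly, which mirrors the normalization the text relies on when comparing loadings.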
Since the variance of the global and sector component disturbances is fixed to the same value, the magnitudes of their loadings are comparable. All loadings on the global level component are positive, which indicates that there exists a link between the levels of different commodity prices. The global slope component shows different results, as this factor seems mostly dominated by the energy commodities. Furthermore, there are six commodities that load negatively on the global slope component. The global curvature component again seems to be dominated by energy commodities, especially the two crude oil commodities.

The loadings on the sector components give more insight into intra-sector commonality. Within the energy sector all commodities load similarly on the sector level, slope, and curvature factors, except for WTI crude oil on the curvature factor. We also note that the loadings of Brent crude oil on the slope and curvature components are close to zero. Within the metals sector we see a split between the precious and the industrial metals. For both the level and slope components, gold and silver have loadings close to zero, while the industrial metals all have negative coefficients of the same magnitude. For the softs sector, cocoa loads differently on all three sector components than the other three commodities in the same sector. Within the grains sector, there is somewhat more commonality, at least for the sector level and curvature components. However, corn has a loading close to zero on both the slope and curvature sector components. Finally, in the meats sector, feeder cattle has a loading close to zero on the level and slope sector components, while the other two commodities have negative coefficients. The loadings on the curvature sector component are all positive, but their magnitudes vary considerably.

The last column of Table 6 shows the variance estimates of the measurement equation errors. We assume that these variances are commodity-specific but, within each commodity, the same for all contract maturities.
We believe this assumption is appropriate because the factor structure can already account for volatility differences across the term-structure dimension through the time-to-maturity dependent factor loadings. Almost all estimated variances are well below the variances of the factor disturbances. The only exceptions are natural gas and lean hogs, with 1.44 and 1.38, respectively. Natural gas and lean hogs are the two most challenging commodities in our sample in terms of modeling, in large part because of their pronounced seasonal patterns (see also Table 3). It is therefore not surprising that they have the largest measurement error variances.

5.2 Variance Decompositions

To combine the factor dynamics and factor loadings results, we look at variance decompositions. We decompose the variation in commodity level, slope, and curvature factors into parts driven by the global, sector, and idiosyncratic components. As mentioned in Kose, Otrok, and Whiteman (2003) and Diebold, Li, and Yue (2008), the global, sector, and commodity-specific components may be correlated as they are extracted from a finite sample. Hence we orthogonalize the extracted components using a Cholesky decomposition to ensure that the variance contributions add up.⁸ Then, we can use (3) to write

$$
\begin{aligned}
\operatorname{var}(l_{i,t}) &= (\beta_i^L)^2 \operatorname{var}(L_{global,t}) + (\gamma_i^L)^2 \operatorname{var}(L_{sector,t}) + \operatorname{var}(\varepsilon_{i,t}^L), \\
\operatorname{var}(s_{i,t}) &= (\beta_i^S)^2 \operatorname{var}(S_{global,t}) + (\gamma_i^S)^2 \operatorname{var}(S_{sector,t}) + \operatorname{var}(\varepsilon_{i,t}^S), \\
\operatorname{var}(c_{i,t}) &= (\beta_i^C)^2 \operatorname{var}(C_{global,t}) + (\gamma_i^C)^2 \operatorname{var}(C_{sector,t}) + \operatorname{var}(\varepsilon_{i,t}^C).
\end{aligned}
\tag{7}
$$

The fraction of, e.g., the commodity level factor variance driven by the global component is given by

$$
\frac{(\beta_i^L)^2 \operatorname{var}(L_{global,t})}{\operatorname{var}(l_{i,t})}.
$$

The fractions of explained variance per component are shown in Table 7. The global level component explains on average 50.1% of the variance of the commodity level factors, while the idiosyncratic and sector components account for 23.7% and 26.2%, respectively. However, the differences across commodities are large. For example, in the case of Brent crude oil the global component explains nothing of its level variation while the sector component explains 81% of its variation. In contrast, the global component explains 94% of the variation in the level factor of silver. Comparing the different sectors, we find that the global component explains most variation for the metals, softs, and grains commodities. For the energy-related commodities the sector component explains most variation, while for the meats it is the idiosyncratic component.

[insert Table 7 here]

The variance decomposition results for the slope factor are again diverse. The global, sector, and idiosyncratic components explain on average 25.8%, 36.8%, and 37.4%, respectively, of the variation in the commodity slope factors. The global component explains more than 75% of the slope variation of the energy commodities, with the exception of natural gas. Most variation for the precious metals is explained by the idiosyncratic component, while for the industrial metals it is the sector component. Finally, the meats sector component explains most variation for lean hogs but hardly any variation for feeder and live cattle. The global, sector, and idiosyncratic curvature components each explain, on average, a similar portion of the individual commodity factors. Within the energy and grains sectors most variation is explained by the global component, for the softs it is mostly the idiosyncratic component, and the results for the meat commodities are mixed.

⁸ We put the global component first, followed by the sector component, and last the idiosyncratic component.

6 Conclusion

We use an enhanced version of the Nelson and Siegel (1987) model and extend the framework of Diebold, Li, and Yue (2008) to extract the factors that drive the individual commodity futures curves. Using a monthly dataset of 24 commodities that are part of the S&P Goldman Sachs Commodity Index (GSCI), we investigate comovement across commodities by examining the commonality in the level, slope and curvature factors. We achieve this by decomposing each individual factor into a global, sector, and idiosyncratic component.

Our state space results show that there is comovement in the factors of commodity futures curves, due either to a global or to a sector component. For the individual commodity level factors, 50% of the variation is driven by a global component and almost 25% by sector components. For individual slope and curvature factors, more than 60% of the variation is related to common components. For the slope factors it is slightly more due to the sector component, while for the curvature factors it is more due to the global component.

The current findings are insightful for portfolio construction, risk management and hedging purposes using commodity futures. Further analysis is needed to better understand these global and sector components and to see whether the exposure to these components is time-varying.
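As a numerical illustration of the variance decomposition in (7) from Section 5.2, the sketch below computes variance shares after a Cholesky orthogonalization with the ordering global, sector, idiosyncratic. It is one possible reading of that step and not necessarily the authors' exact procedure: the three scaled terms are stacked, their sample covariance matrix is factored as a lower-triangular Cholesky factor, and each share is the squared column sum of that factor divided by the total variance. The function name `variance_shares` and the simulated inputs are hypothetical.

```python
import numpy as np

def variance_shares(beta, gamma, global_comp, sector_comp, idio_comp):
    """Shares of var(beta*G + gamma*S + E) attributed to G, S and E, in that order.

    The sample covariance of the three terms is factored as Sigma = L L' with L
    lower triangular, so correlation between components is attributed to the
    component that comes first in the ordering. The shares sum to one."""
    terms = np.vstack([beta * np.asarray(global_comp, dtype=float),
                       gamma * np.asarray(sector_comp, dtype=float),
                       np.asarray(idio_comp, dtype=float)])
    sigma = np.cov(terms)                 # 3 x 3 sample covariance, rows are terms
    chol = np.linalg.cholesky(sigma)      # lower-triangular factor
    col_sums = chol.sum(axis=0)           # weight of each orthogonal shock in the sum
    return col_sums ** 2 / np.sum(col_sums ** 2)

# Illustrative use with mildly correlated simulated components (values are made up).
rng = np.random.default_rng(7)
T = 213
G = rng.normal(size=T)
S = 0.2 * G + rng.normal(size=T)          # sector component correlated with global
E = 0.5 * rng.normal(size=T)

shares = variance_shares(beta=0.8, gamma=0.6, global_comp=G, sector_comp=S, idio_comp=E)
print(dict(zip(["global", "sector", "idiosyncratic"], shares.round(3))))
```

When the three components are exactly uncorrelated in the sample, the Cholesky factor is diagonal and the global share reduces to the ratio shown below (7), the squared loading times the variance of the global component divided by the variance of the commodity factor.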

References

Ai, C., A. Chatrath, and F. Song (2006). On the comovement of commodity prices. American Journal of Agricultural Economics, 88(3).
Brooks, C. and M. Prokopczuk (2011). The dynamics of commodity prices. Working paper.
Byrne, J. P., G. Fazio, and N. Fiess (2012). Primary commodity prices: Co-movements, common factors and fundamentals. Journal of Development Economics, 101.
Daskalaki, C., A. Kostakis, and G. S. Skiadopoulos (2012). Are there common factors in commodity futures returns? Working paper.
Deb, P., P. Trivedi, and P. Varangis (1996). The excess comovement of commodity prices reconsidered. Journal of Applied Econometrics, 11.
Diebold, F. X. and C. Li (2006). Forecasting the term structure of government bond yields. Journal of Econometrics, 130.
Diebold, F. X., C. Li, and V. Z. Yue (2008). Global yield curve dynamics and interactions: A dynamic Nelson-Siegel approach. Journal of Econometrics, 146.
Diebold, F. X., G. D. Rudebusch, and S. B. Aruoba (2006). The macroeconomy and the yield curve: A dynamic latent factor approach. Journal of Econometrics, 131(1-2).
Durbin, J. and S. Koopman (2012). Time Series Analysis by State Space Methods. Oxford University Press.
Erb, C. and C. Harvey (2006). The tactical and strategic value of commodity futures. Financial Analysts Journal, 62(2).
Fama, E. and K. French (1987). Commodity futures prices: Some evidence on forecast power, premiums and the theory of storage. Journal of Business, 60.
Gorton, G. and K. G. Rouwenhorst (2006). Facts and fantasies about commodity futures. Financial Analysts Journal, 62(2).
Kaldor, N. (1939). Speculation and economic stability. The Review of Economic Studies, 7.
Koopman, S. and J. Durbin (2000). Fast filtering and smoothing for multivariate state space models. Journal of Time Series Analysis, 21.
Kose, M. A., C. Otrok, and C. H. Whiteman (2003). International business cycles: World, region, and country-specific factors. The American Economic Review, 93(4).
Milonas, N. T. (1991). Measuring seasonalities in commodity markets and the half-month effect. Journal of Futures Markets, 11(3).
Nelson, C. and A. Siegel (1987). Parsimonious modeling of yield curves. Journal of Business, 60.
Ohana, S. (2010). Modeling global and local dependence in a pair of commodity forward curves with an application to the US natural gas and heating oil markets. Energy Economics, 32.
Pindyck, R. and J. Rotemberg (1990). The excess co-movement of commodity prices. Economic Journal, 100.
Samuelson, P. (1965). Proof that properly anticipated prices fluctuate randomly. Industrial Management Review, 2.
Sargent, T. J. and C. A. Sims (1977). Business cycle modeling without pretending to have too much a priori economic theory. Tech. rep.
Schwartz, E. S. (1997). The stochastic behavior of commodity prices: Implications for valuation and hedging. Journal of Finance, 52(3).
Sorensen, C. (2002). Modeling seasonality in agricultural commodity futures. Journal of Futures Markets, 22(5).
Stock, J. H. and M. W. Watson (1989). New indexes of coincident and leading economic indicators. In NBER Macroeconomics Annual 1989, Volume 4, MIT Press.
Stoll, H. R. and R. E. Whaley (2010). Commodity index investing and commodity futures prices. Journal of Applied Finance, 20(1).
Vansteenkiste, I. (2009). How important are common factors in driving non-fuel commodity prices? A dynamic factor analysis. Working paper.
West, J. (2012). Long-dated agricultural futures price estimates using the seasonal Nelson-Siegel model. International Journal of Business and Management, 7(3).
Working, H. (1949). The theory of price storage. American Economic Review, 39.

Table 1 Commodity data overview

The table presents an overview of the 24 commodity futures series that are all present in the S&P Goldman Sachs Commodity Index (GSCI). We consider the period January 1995 to September 2012. We show the number of cross-sectional contracts that are available in the last year of our dataset. Furthermore, we show the average annualized return and volatility of both the first maturing futures contract and the fifth nearby futures contract.

Sector / Commodity / # contracts / 1st nearby contract: Avg. return, Volatility / 5th nearby contract: Avg. return, Volatility

Energy
Brent crude oil % 32.3% 13.8% 26.5%
WTI crude oil 72 9.% 32.8% 12.6% 26.8%
Gasoil % 32.8% 1.5% 27.3%
Heating oil % 34.8% 11.5% 27.6%
Natural gas % 53.1% 3.8% 34.1%
Gasoline (RBOB) % 39.2% 14.7% 27.9%

Metals
Gold % 16.4% 5.5% 16.4%
Silver % 3.6% 8.3% 3.3%
Aluminum % 19.9% 1.8% 18.9%
Copper % 28.% 9.4% 26.6%
Lead % 3.2% 7.8% 28.3%
Nickel % 36.4% 6.7% 35.3%
Zinc % 27.6%.9% 26.4%

Softs
Cocoa 1.6% 31.4% 1.% 27.4%
Coffee % 36.7% 5.3% 29.8%
Cotton % 29.9% 2.5% 22.%
Sugar 12.5% 38.2% 3.9% 22.5%

Grains
Corn % 29.% 1.5% 22.4%
Soybeans 2 7.3% 26.6% 5.9% 23.3%
Chicago wheat % 3.3%.4% 23.%
Kansas wheat 1.2% 29.3% 2.2% 23.1%

Meats
Feeder cattle 8 1.5% 14.5% 4.7% 1.9%
Lean hogs % 28.% 5.2% 14.7%
Live cattle 9.5% 14.9% 2.2% 8.9%