Small Sample Properties of Forecasts from Autoregressive Models under Structural Breaks


M. Hashem Pesaran, University of Cambridge and USC
Allan Timmermann, University of California, San Diego
May 2003, this version February 2004

Abstract

This paper develops a theoretical framework for the analysis of small-sample properties of forecasts from general autoregressive models under structural breaks. Finite-sample results for the mean squared forecast error of one-step ahead forecasts are derived, both conditionally and unconditionally, and numerical results for different types of break specifications are presented. It is established that forecast errors are unconditionally unbiased even in the presence of breaks in the autoregressive coefficients and/or error variances so long as the unconditional mean of the process remains unchanged. Insights from the theoretical analysis are demonstrated in Monte Carlo simulations and on a range of macroeconomic time series from G7 countries. The results are used to draw practical recommendations for the choice of estimation window when forecasting from autoregressive models subject to breaks.

JEL Classifications: C22, C53.
Key Words: Small sample properties of forecasts, MSFE, structural breaks, autoregression, rolling window estimator.

We are grateful to the editor, four referees, and seminar participants at Cass Business School (London) for helpful comments on an earlier version of the paper. We would also like to thank Mutita Akusuwan for excellent research assistance.
1. Introduction

Autoregressive models are used extensively in forecasting throughout economics and finance and have proved so successful and difficult to outperform in practice that they are frequently used as benchmarks in forecast competitions. Due in large part to their relatively parsimonious form, autoregressive models are frequently found to produce smaller forecast errors than those associated with models that allow for more complicated nonlinear dynamics or additional predictor variables, cf. Stock and Watson (1999) and Giacomini (2002). Despite their empirical success, there is now mounting evidence that the parameters of autoregressive (AR) models fitted to many economic time series are unstable and subject to structural breaks. For example, Stock and Watson (1996) undertake a systematic study of a wide variety of economic time series and find that the majority of these are subject to structural breaks. Alogoskoufis and Smith (1991) and Garcia and Perron (1996) are other examples of studies that document instability related to the autoregressive terms in forecasting models. Clements and Hendry (1998) view structural instability as a key determinant of forecasting performance. This suggests a need to study the behaviour of the parameter estimates of AR models as well as their forecasting performance when these models undergo breaks. Despite the interest in econometric models subject to structural breaks, little is known about the small sample properties of AR models that undergo discrete changes. In view of the widespread use of AR models in forecasting, this is clearly an important area to investigate. The presence of breaks makes the focus on small sample properties more relevant: even if the combined pre- and post-break sample is very large, the occurrence of a structural break means that the post-break sample will often be quite small, so that asymptotic approximations may not be nearly as accurate as is normally the case.
A key question that arises in the presence of breaks is how much data to use to estimate the parameters of the forecasting model so as to minimize a loss function such as root mean squared forecast error (RMSFE). We show that the RMSFE-minimizing estimation window crucially depends on the size of the break as well as its direction (i.e., does the break lead to higher or lower persistence) and which parameters it affects (i.e., the mean, variance or autoregressive slope parameters). In some situations the optimal estimation window trades off an increased bias
introduced by using pre-break data against a reduction in forecast error variance resulting from using a longer window of the data. However, in other situations the small sample bias in the autoregressive coefficients may in fact be reduced after introducing pre-break data, if the size of the break is small, or even when the break is large, provided that it is in the right direction (e.g., when persistence declines). In the presence of parameter instability it is common to use a rolling window estimator that makes use of a fixed number of the most recent data points, although the size of the rolling window is typically based on pragmatic considerations rather than an empirical analysis of the underlying time series process. Another possibility would be to test for breaks in the parameters and/or error variances and only use data after the most recent break, assuming a break is in fact detected. Alternatively, if no statistically significant break is found, an expanding window estimator could be used. Our theoretical analysis allows us to better understand when each of these procedures is likely to work well and why it is generally best to use pre-break data when forecasting from autoregressive models. First, breaks in the autoregressive parameters need not introduce bias in the forecasts (at least unconditionally). This tends to happen when an autoregressive coefficient declines after a break or when the break only occurs in the intercept or variance parameter. Including pre-break data in such cases will tend to lead to a decline in RMSFE due to both a smaller squared bias and a reduction in the variance of the parameter estimate. Furthermore, in practice there is likely to be considerable error in detecting and estimating the point of the break of the autoregressive model. This leads to worse performance of a post-break estimation procedure, but also makes determination of the length of a rolling window more difficult.
Several practical recommendations emerge from our analysis regarding the choice of estimation window when forecasting from autoregressive models. First, for the macroeconomic data examined here, in general it appears to be difficult in practice to outperform expanding or long rolling window estimation methods. Unlike the case with exogenous regressors, forecasts from autoregressive models can be seriously biased even if only post-break observations are used. Including pre-break data in estimation of autoregressive models can simultaneously reduce the bias and the variance of the forecast errors. In most applications where breaks are not too large, expanding window methods or rolling window procedures with relatively large window sizes are likely to perform well. This conclusion may not of course
carry over to longer data sets, e.g. high frequency financial data with thousands of observations, where estimation uncertainty can be reduced more effectively than with the relatively short macroeconomic data considered here. The main contributions of this paper are as follows. First, we present a new procedure for computing the exact small sample properties of the parameters of AR models of arbitrary order, thus extending the existing literature that has focused on the AR(1) model. Our approach allows for fixed or random starting points and considers stationary AR models as well as models with unit root dynamics. We allow for the possibility of the AR model switching from a unit root process to a stationary one and vice versa. Such regime switches could be particularly relevant to the analysis of inflation in a number of OECD countries since the first oil price shock in the early 1970s. In addition to considering properties such as bias in the parameters, we also consider the RMSFE in finite samples. Second, we extend existing results on exact small sample properties of AR models to allow for a break in the underlying data generating process. We establish that one-step ahead forecast errors from AR models are unconditionally unbiased even in the presence of breaks in the autoregressive coefficients and in the error variances, so long as the unconditional mean of the process remains unchanged. Our results also apply to models with unit roots. This extends Fuller's (1996) result obtained for AR models with fixed parameters, and generalizes a related finding due to Clements and Hendry (1999, pp. 39-42). Third, we present extensive numerical results quantifying the effect of the sizes of the pre-break and post-break data windows on parameter bias and RMSFE. Fourth, we undertake an empirical analysis for a range of macroeconomic time series from the G7 countries that compares the forecasting performance of expanding window, rolling window and post-break estimators.
This analysis, which allows for multiple breaks at unknown times, confirms that, at least for macroeconomic time series such as those considered here, it is generally best to use pre-break data in estimation of the forecasting model. The outline of the paper is as follows. Section 2 provides a brief overview of the small sample properties of the first-order autoregressive model that has been extensively studied in the extant literature. Theoretical results allowing us to characterize the small sample distribution of the parameters and forecast errors of autoregressive models are introduced in Section 3. Section 4 presents numerical results for AR models subject to breaks and Section 5 presents empirical results
for a range of macroeconomic time series. Section 6 concludes with a summary and a discussion of possible extensions to our work.

2. Small Sample Properties of Forecasts from Autoregressive Models

A large literature has studied the small sample properties of estimates of the parameters of autoregressive models. The majority of studies has concentrated on deriving either exact or approximate small sample results for the distribution of α̂_T and β̂_T, the Ordinary Least Squares (OLS) estimators of α and β in the first-order autoregressive (AR(1)) model

y_t = α + β y_{t-1} + σ ε_t,   t = 1, 2, ..., T,   ε_t ~ iid(0, 1).   (1)

Analysis of the small sample bias of β̂_T dates back to at least Bartlett (1946). Early studies focus on the stationary AR(1) model without an intercept (α = 0, |β| < 1) but have been extended to higher order models with intercepts (Sawa (1978)) and exogenous regressors (Grubb and Symons (1987), Kiviet and Phillips (1993, 2003a)). Assuming stationarity (|β| < 1), β̂_T has been shown to have an asymptotic normal distribution and its finite-sample distribution has been studied by Phillips (1977) and Evans and Savin (1981). The case with a unit root, β = 1, has been studied by, inter alia, Banerjee, Dolado, Hendry and Smith (1986), Phillips (1987), Stock (1987), Abadir (1993) and Kiviet and Phillips (2003b). To a forecaster, the bias in α̂_T and β̂_T is of direct interest only to the extent that it might adversely influence the forecasting performance. Ullah (2003) provides an extensive discussion and survey of the properties of forecasts from the AR(1) model. Box and Jenkins (1970) characterized the asymptotic mean squared forecast error (MSFE) for a stationary first-order autoregressive process, considering both the single-period and multi-period horizon. Assuming a stationary process, Copas (1966) used Monte Carlo methods to study the MSFE of least-squares and maximum likelihood estimators under Gaussian innovations.
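The small-sample bias of β̂_T discussed above is easy to reproduce by simulation. The following sketch (our own illustration, not code from the paper; function names are ours) estimates the bias of the OLS slope in a stationary AR(1) with intercept, which for β > 0 is negative and, by the standard approximation, roughly -(1 + 3β)/T:

```python
import numpy as np

def simulate_ar1(alpha, beta, sigma, T, rng, burn=100):
    """Simulate T observations from y_t = alpha + beta*y_{t-1} + sigma*eps_t,
    discarding a burn-in so the sample is close to the stationary distribution."""
    y = alpha / (1.0 - beta)  # start at the unconditional mean
    path = np.empty(T + burn)
    for t in range(T + burn):
        y = alpha + beta * y + sigma * rng.standard_normal()
        path[t] = y
    return path[burn:]

def ols_ar1(y):
    """OLS estimates (alpha_hat, beta_hat) in y_t = alpha + beta*y_{t-1} + e_t."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0], coef[1]

rng = np.random.default_rng(0)
beta, T, reps = 0.9, 50, 4000
betas = [ols_ar1(simulate_ar1(1.0, beta, 1.0, T, rng))[1] for _ in range(reps)]
bias = np.mean(betas) - beta
print(f"mean bias of beta_hat at T={T}: {bias:.3f}")  # negative, close to -(1+3*beta)/T
```

With β = 0.9 and T = 50 the approximation gives -(1 + 3β)/T ≈ -0.074, which the simulation recovers up to Monte Carlo error.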
In practice, the conditional forecast error is of more interest than the unconditional error, since the data needed to compute conditional forecasts are always available. A comprehensive asymptotic analysis for the stationary AR(p) model is provided in Fuller and Hasza (1981) and Fuller (1996). Using a theorem in
Fuller (1996) it is easily seen that, conditional on y_T,

MSFE(ŷ_{T+1} | y_T) = E[(y_{T+1} - ŷ_{T+1})² | y_T] = σ²(1 + 1/T) + ((1 - β²)/T)(y_T - α/(1 - β))² + O(T^{-3/2}).

This yields the more familiar unconditional result

MSFE(ŷ_{T+1}) = E[(y_{T+1} - ŷ_{T+1})²] = σ²(1 + 2/T) + O(T^{-3/2}).

Generalizations to AR(p) and multi-step forecasts are also provided in Fuller (1996), where it is established that the forecast error, y_{T+1} - ŷ_{T+1}, is unbiased in small samples assuming that ε_t has a symmetric distribution and that E|ŷ_{T+1}| < ∞. This is particularly noteworthy considering the often large small sample bias associated with estimates of the autoregressive parameters.

3. AR(p) Model in the Presence of Structural Breaks

In parallel with the work on the small sample properties of estimates of autoregressive models, important progress has been made in testing for and estimating both the time and the size of breakpoints, as witnessed by the recent work of Andrews (1993), Andrews and Ploberger (1996), Bai and Perron (1998, 2003), Banerjee, Lumsdaine and Stock (1992), Chu, Stinchcombe and White (1996), Chong (2001), Elliott and Muller (2002), Hansen (1992), Inclan and Tiao (1994) and Ploberger, Kramer and Kontrus (1989). Building on this work we consider the small sample problem of estimation and forecasting with AR(p) models in the presence of structural breaks. For this purpose, we consider the following AR(p) model, defined over the period t = 1, 2, ..., T and assumed to have been subject to a single structural break at time T_1:

y_t = α_1 + β_11 y_{t-1} + β_12 y_{t-2} + ... + β_1p y_{t-p} + σ_1 ε_t,   for t ≤ T_1,
y_t = α_2 + β_21 y_{t-1} + β_22 y_{t-2} + ... + β_2p y_{t-p} + σ_2 ε_t,   for t > T_1.   (2)

As before, ε_t ~ iid(0, 1) for all t. For the analysis of the unit root case it is also convenient to consider the following parameterization of the intercept terms, α_i:

α_i = µ_i (1 - β̄_i),   i = 1, 2,   (3)
where β̄_i = Σ_{j=1}^p β_ij = τ_p′β_i, β_i = (β_i1, β_i2, ..., β_ip)′ and τ_p is a p × 1 unit vector. Note that -(1 - β̄_i) is the coefficient of y_{t-1} in the error correction representation of (2). This specification is quite general and allows for intercept and slope shifts, as well as a change in error variances immediately after t = T_1. It is also possible for the y_t process to contain a unit root (or be integrated of order 1) in one or both of the regimes. The integration property of y_t under the two regimes is governed by whether β̄_i = 1 or β̄_i < 1. More specifically, we shall assume that the roots of

Σ_{j=1}^p β_ij λ^j - 1 = 0,   for i = 1, 2,   (4)

lie on or outside the unit circle.¹ As µ_i is allowed to vary freely, the intercepts α_i = µ_i(1 - β̄_i) are unrestricted when the underlying AR processes are stationary. However, to avoid the possibility of generating linear trends in the y_t process, the intercepts are restricted (α_i = 0) in the presence of unit roots. In the stationary case µ_i represents the unconditional mean of y_t in regime i. In the unit root case µ_i is not identified and we have E(Δy_t) = 0.

Analysis of forecast errors from AR models subject to structural change has recently been addressed by Clements and Hendry (1998, 1999). However, these authors either abstract from the problem of parameter uncertainty, or only allow for it assuming that the parameters of the AR model remain unchanged during the estimation period. Consider first the analysis provided in Clements and Hendry (1998), where it is assumed that parameters are known and the break takes place immediately prior to the forecasting period. In this case the one-step ahead forecast error is given by

y_{T+1} - ỹ_{T+1} = µ_2(1 - β̄_2) - µ_1(1 - β̄_1) + x_T′(β_2 - β_1) + σ_2 ε_{T+1},

where x_T = (y_T, y_{T-1}, ..., y_{T-p+1})′, (µ_1, β_1) are the parameters prior to the forecast period, and (µ_2, β_2) are the parameters during the forecast period, here T + 1.
Following Clements and Hendry, and noting that β̄_i = τ_p′β_i, it is easily verified that

y_{T+1} - ỹ_{T+1} = (µ_2 - µ_1)(1 - β̄_2) + (β_2 - β_1)′(x_T - µ_1τ_p) + σ_2 ε_{T+1},

Footnote 1: Our analysis can also allow for the possibility of y_t being integrated of order two in one or both of the two regimes. But in this paper we shall only consider the unit root case explicitly.
and

E(y_{T+1} - ỹ_{T+1}) = (µ_2 - µ_1)(1 - β̄_2) + (β_2 - β_1)′ E(x_T - µ_1τ_p).

In the case where y_t is stationary we have E(x_T - µ_1τ_p) = 0, and

E(y_{T+1} - ỹ_{T+1}) = (µ_2 - µ_1)(1 - β̄_2),

which does not depend on the size of the break in the slope coefficients, β_2 - β_1, and will be zero when µ_2 = µ_1. This is an interesting theoretical result, but its relevance is limited in practice, where estimates of (µ_1, β_1) based on past observations need to be used. One of the contributions of this paper might be viewed as identifying the circumstances under which the above result is likely to hold in the presence of estimation uncertainty. In a related contribution, Clements and Hendry (1999) consider the effect of estimation uncertainty on the forecast error decomposition using a first-order vector autoregressive model, and conclude that estimation uncertainty is relatively unimportant. However, their analysis assumes that the estimation is carried out immediately prior to the break, based on a correctly specified model which is not subject to any breaks. The assumption that parameters have been stable prior to forecasting is clearly restrictive, and it is therefore important that a more general framework is considered where the effect of estimation uncertainty can be analysed even in the presence of multiple breaks in the parameters (slope coefficients as well as error variances) over the estimation period. In this paper we provide such a framework in the case of AR(p) models subject to a single break point over the estimation period. But it should become clear that the analysis readily extends to two or more break points.² In particular, our interest in this paper lies in the point (or probability) forecast of y_{T+1} conditional on Ω_T = {y_1, y_2, ..., y_T} in the context of the break point specification (2).
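To make the break-point specification (2) concrete, the following sketch (our own helper, not code from the paper) simulates the AR(1) special case of (2) under the mean parameterization (3), with a single break in persistence and innovation variance at t = T_1:

```python
import numpy as np

def simulate_break_ar1(mu1, beta1, sigma1, mu2, beta2, sigma2, T1, T, rng):
    """AR(1) case of model (2): regime 1 for t <= T1, regime 2 for t > T1.
    Intercepts follow the parameterization (3): alpha_i = mu_i * (1 - beta_i)."""
    y = np.empty(T)
    prev = mu1  # initialize at the regime-1 unconditional mean
    for t in range(T):
        mu, beta, sigma = (mu1, beta1, sigma1) if t < T1 else (mu2, beta2, sigma2)
        prev = mu * (1.0 - beta) + beta * prev + sigma * rng.standard_normal()
        y[t] = prev
    return y

rng = np.random.default_rng(1)
# break in persistence (0.9 -> 0.3) and volatility (1.0 -> 0.5); common mean mu = 1
y = simulate_break_ar1(1.0, 0.9, 1.0, 1.0, 0.3, 0.5, T1=100, T=200, rng=rng)
print(y.shape)
```

Because µ_1 = µ_2 in this example, it falls in exactly the class of breaks for which the paper shows that one-step ahead forecast errors remain unconditionally unbiased.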
In the case where the post-break window size, v_2 = T - T_1, is sufficiently large (v_2 → ∞), the structural break is relatively unimportant and the forecast of y_{T+1} can be based exclusively on the post-break observations. However, when v_2 is small it might be worthwhile to base the forecasting model on pre-break

Footnote 2: Explicitly allowing for breaks and parameter uncertainty prior to forecasting also raises the issue of the choice of observation window discussed in related papers by Pesaran and Timmermann (2002, 2003).
as well as post-break observations. The number of pre-break observations, which we denote by v_1, then becomes a choice parameter. In what follows we assume T_1 is known but consider forecasting y_{T+1} using the past T - m + p + 1 observations, m - p being the starting point of the estimation window,

y_T(m - p) = (y_{m-p}, y_{m-p+1}, ..., y_{T_1}, y_{T_1+1}, ..., y_T)′,   (5)

with the p observations y_{m-p}, y_{m-p+1}, ..., y_{m-1} treated as given initial values.³ The length of the pre-break window is then given by v_1 = T_1 - m + 1, and the number of time periods used in estimation is therefore v = v_1 + v_2 = T - m + 1. To simplify the notations we shall consider values of v_1 ≥ p, or m ≤ T_1 - p + 1. The point forecast of y_{T+1} conditional on y_T(m - p) is given by

ŷ_{T+1}(m) = α̂_T(m) + x_T′ β̂_T(m),

where x_T = (y_T, y_{T-1}, ..., y_{T-p+1})′, β̂_T(m) = (β̂_1T(m), β̂_2T(m), ..., β̂_pT(m))′, τ_v is a v × 1 vector of ones, M_τ = I_v - τ_v(τ_v′τ_v)^{-1}τ_v′, and

X_T(m) = (y_{T-1}(m - 1), y_{T-2}(m - 2), ..., y_{T-p}(m - p)),

so that

β̂_T(m) = [X_T′(m)M_τX_T(m)]^{-1} X_T′(m)M_τ y_T(m),   (6)

α̂_T(m) = v^{-1}[τ_v′y_T(m) - τ_v′X_T(m)β̂_T(m)].   (7)

The one-step ahead forecast error is

e_{T+1}(m) = y_{T+1} - ŷ_{T+1}(m) = σ_2 ε_{T+1} - ξ_T(m),   (8)

where

ξ_T(m) = [α̂_T(m) - α_2] + x_T′[β̂_T(m) - β_2],   (9)

β_2 = (β_21, β_22, ..., β_2p)′ and α_2 = µ_2(1 - τ_p′β_2). We consider both the unconditional and the conditional mean squared forecast error, given by E_ε[e²_{T+1}(m)] and E_ε[e²_{T+1}(m) | Ω_T], respectively, where the expectations operator E_ε(·) is defined with respect to the distribution of the innovations, ε_t. To see how the MSFE

Footnote 3: Throughout the paper we shall use the notation y_T(k) = (y_k, ..., y_T)′.
depends on the starting point of the estimation window, m, note that ε_{T+1} and ξ_T(m) are independently distributed and

E_ε[e²_{T+1}(m) | Ω_T] = σ_2² + E_ε[ξ_T²(m) | Ω_T].   (10)

To carry out the necessary computations, an explicit expression for ξ_T(m) in terms of the ε_t's is required. This is complicated and depends on the state of the process just before the first observation used for estimation. For a given choice of m > p and a finite sample size T, the joint distribution of β̂_T(m) and α̂_T(m) depends on the distribution of the initial values

y_{m-1}(m - p) = (y_{m-p}, y_{m-p+1}, ..., y_{m-1})′.   (11)

We distinguish between the two important cases where the pre-break process is stationary and where it contains a unit root.

3.1. Pre-Break Process is Stationary

In the case where the pre-break regime is stationary and has been in operation for a sufficiently long time, the distribution of y_{m-1}(m - p) does not depend on m and is given by

y_{m-1}(m - p) ~ N(µ_1τ_p, σ_1²V_p),   (12)

where V_p is defined in terms of the pre-break parameters. For example, for p = 1, V_1 = 1/(1 - β_11²), and for p = 2

V_2 = (1/[(1 + β_12)((1 - β_12)² - β_11²)]) [ 1 - β_12   β_11
                                              β_11       1 - β_12 ].

3.2. Pre-Break Process is I(1)

If the pre-break process contains a unit root, the covariance of y_{m-1}(m - p) is no longer given by σ_1²V_p and in general depends on m. Under a pre-break unit root, β̄_1 = 1 and the pre-break process is given by

Δy_t = Σ_{j=1}^{p-1} δ_1j Δy_{t-j} + σ_1 ε_t,   for t ≤ T_1,   (13)

where δ_1j = -Σ_{ℓ=j+1}^p β_1ℓ. The distribution of initial values can now be specified in terms of the stationary distribution of the first differences, (Δy_2, Δy_3, ..., Δy_p), and
an assumption concerning the first observation in the sample, y_1. In what follows we assume that y_1 is given by

y_1 = µ_1 + ω ε_1,   (14)

where ω will be treated as a free parameter, and ε_1 ~ N(0, 1). Using (13) and (14) it is now possible to derive the distribution of the initial values, y_{m-1}(m - p) = (y_{m-p}, y_{m-p+1}, ..., y_{m-1})′, noting that

y_{m-i} = y_1 + Δy_2 + ... + Δy_{m-i},   for i = 1, 2, ..., p.

In the AR(1) case we have

Δy_t = σ_1 ε_t,   for t = 2, 3, ..., T_1,

and in conjunction with (14) we have

y_{m-1} = y_1 + Δy_2 + ... + Δy_{m-1} = µ_1 + ωε_1 + σ_1(ε_2 + ε_3 + ... + ε_{m-1}),

and hence y_{m-1} ~ N(µ_1, V_{1,m}), where

V_{1,m} = ω² + (m - 2)σ_1².   (15)

For the AR(2) specification we have y_{m-1}(m - 2) = (y_{m-2}, y_{m-1})′ ~ N(µ_1τ_2, V_{2,m}), where V_{2,m} is derived in Appendix A.

3.3. OLS Estimates

Using (12) and (2) for t = m, m + 1, ..., T, in matrix notation we have

B y_T(m - p) = d + Dε,   (16)

where

D = σ_1 diag( ψ_p, I_{v_1}, (σ_2/σ_1)I_{v_2} ),   d = µ_1 ( τ_p′, (1 - β̄_1)τ_{v_1}′, (µ_2/µ_1)(1 - β̄_2)τ_{v_2}′ )′,   (17)

B = [ I_p    0      0
      B_21   B_22   0
      0      B_32   B_33 ].   (18)
The submatrices B_ij depend only on the slope coefficients, β_1 and β_2, and are defined in Appendix B. I_{v_1} and I_{v_2} are identity matrices of order v_1 and v_2, respectively, and ε = (ε_{m-p}, ε_{m-p+1}, ..., ε_T)′ ~ N(0, I_{v+p}). The form of ψ_p depends on whether the pre-break process is stationary or contains a unit root. Under the former, ψ_p is a lower triangular Cholesky factor of V_p, namely V_p = ψ_pψ_p′, where V_p is the covariance matrix of y_{m-1}(m - p). Appropriate expressions for V_p in the case of p = 1 and 2 are already provided in Section 3.1. When the pre-break process has a unit root, ψ_p is given by the lower triangular Cholesky factor of V_{p,m}, which is given by (15) above for p = 1 and in Appendix A by (38) for p = 2. Using (40) derived in Appendix B, in general we have

y_{T-i}(m - i) = G_i(c + Hε),   for i = 0, 1, ..., p,   (19)

where G_i are v × (v + p) selection matrices defined by G_i = (0_{v×(p-i)} ⋮ I_v ⋮ 0_{v×i}), H = B^{-1}D, and c = B^{-1}d. In particular, y_T(m) = G_0(c + Hε), and

X_T(m) = [G_1(c + Hε), G_2(c + Hε), ..., G_p(c + Hε)].

Therefore, in general the (i, j) element of the product moment matrix, X_T′(m)M_τX_T(m), is given by (c + Hε)′G_i′M_τG_j(c + Hε), for i, j = 1, 2, ..., p, and the j-th element of X_T′(m)M_τy_T(m) is given by (c + Hε)′G_j′M_τG_0(c + Hε), for j = 1, 2, ..., p. Hence, β̂_T(m) = (β̂_1T(m), β̂_2T(m), ..., β̂_pT(m))′ is a nonlinear function of the quadratic forms (c + Hε)′G_i′M_τG_j(c + Hε), for i = 1, 2, ..., p and j = 0, 1, ..., p, with known matrices H, G_i, c, and ε ~ N(0, I_{v+p}). Similarly, using (7) we have

α̂_T(m) = v^{-1}τ_v′G_0(c + Hε) - v^{-1}τ_v′ Σ_{i=1}^p G_i(c + Hε) β̂_iT(m).   (20)

In the AR(1) case these results simplify to

β̂_T(m) = [(c + Hε)′G_1′M_τG_0(c + Hε)] / [(c + Hε)′G_1′M_τG_1(c + Hε)],   (21)

and

α̂_T(m) = v^{-1}τ_v′G_0(c + Hε) - v^{-1}τ_v′G_1(c + Hε) β̂_T(m).   (22)
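In practice the estimators (6) and (7) are simply OLS with an intercept applied to the chosen observation window, so they can be computed by direct regression rather than through the (c + Hε) representation. A minimal numpy sketch (our own illustration; function names are not from the paper) that fits an AR(p) on the window starting at m and produces the point forecast ŷ_{T+1}(m):

```python
import numpy as np

def ar_window_fit_forecast(y, m, p):
    """OLS fit of an AR(p) on the window y_m,...,y_T (1-based, as in (5)),
    with y_{m-p},...,y_{m-1} as given initial values.  Returns (alpha_hat,
    beta_hat, y_hat_{T+1}); y is a 0-based array whose first element plays
    the role of y_1, so v = T - m + 1 periods enter estimation."""
    T = len(y)                               # y[T-1] is y_T
    v = T - m + 1
    yv = y[m - 1:]                           # regressand: y_m, ..., y_T
    lags = [y[m - 1 - j:T - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(v)] + lags)
    theta, *_ = np.linalg.lstsq(X, yv, rcond=None)
    alpha_hat, beta_hat = theta[0], theta[1:]
    x_T = y[::-1][:p]                        # x_T = (y_T, y_{T-1}, ..., y_{T-p+1})'
    return alpha_hat, beta_hat, alpha_hat + beta_hat @ x_T

# toy data: expanding window (m = p + 1) versus a short rolling window
rng = np.random.default_rng(0)
y = rng.standard_normal(60).cumsum() * 0.1 + rng.standard_normal(60)
a1, b1, f_expand = ar_window_fit_forecast(y, m=3, p=2)
a2, b2, f_roll = ar_window_fit_forecast(y, m=41, p=2)
print(f_expand, f_roll)
```

Varying m then traces out exactly the trade-off studied in the paper: a smaller m (more pre-break data) lowers estimation variance but can add bias after a break.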
Using the above results in (6), it is now easily seen that in general β̂_T(m) depends on the ratios µ_1/σ_1, σ_1/σ_2 and µ_1/µ_2 (or µ_2/µ_1), whilst α̂_T(m) depends on all four coefficients, µ_1, µ_2, σ_1 and σ_2, individually. Two cases of special interest arise: when there is no mean shift in the model, and when the post-break process contains a unit root. In both cases, as shown in Appendix B, G_i c = κτ_v, where κ = µ when there is no mean shift (i.e. µ_1 = µ_2 = µ), and κ = µ_1 if there is a mean shift but β̄_2 = 1. Under either of these two special cases we have M_τG_i c = 0, for all i, and β̂_T(m) will be a function of the quadratic terms ε′H′G_i′M_τG_jHε, which depend only on the ratio of the error variances, σ_1/σ_2. These results also establish the following proposition:

Proposition 1: Under µ_1 = µ_2, or if β̄_1 < 1 and β̄_2 = 1, β̂_T(m) defined by (6) does not depend on the scale of the error variances (σ_1², σ_2²) or the unconditional means, µ_1, µ_2, and is an even function of ε.

This proposition plays a key role in the analysis of prediction errors below. It is also worth noting that β̂_T(m) will continue to be an even function of the errors in the more general case where the slope coefficients and/or the error variances are subject to multiple breaks, so long as the mean of the process remains unchanged. This proposition does not, however, extend to the OLS estimate of the intercept, α̂_T(m). To see this, using (20) and noting that under µ_1 = µ_2, or if β̄_2 = 1, G_i c = µ_1τ_v, we have

α̂_T(m) = µ_1[1 - τ_p′β̂_T(m)] + v^{-1}τ_v′G_0Hε - Σ_{i=1}^p (v^{-1}τ_v′G_iHε) β̂_iT(m),   (23)

where τ_p′β̂_T(m) = Σ_{i=1}^p β̂_iT(m) is the sum of the estimated slope coefficients. It is clear that in this case α̂_T(m) - µ_1[1 - τ_p′β̂_T(m)] is an odd function of ε, and that α̂_T(m) depends on σ_1, σ_2 and µ_1 individually.
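Proposition 1 is straightforward to verify numerically. In the sketch below (our own check, not from the paper; it uses a fixed start at the unconditional mean rather than a draw from the stationary distribution, under which the invariance also holds exactly, because the centred path simply rescales), the same shock sequence ε is passed through two parameterizations that share the common mean µ_1 = µ_2 and the ratio σ_1/σ_2, and the OLS slope estimate is identical:

```python
import numpy as np

def simulate_fixed_eps(mu, beta1, sigma1, beta2, sigma2, T1, eps):
    """Model (2) with p = 1 and common mean mu, driven by a fixed shock path eps."""
    y = np.empty(len(eps))
    prev = mu  # fixed start at the unconditional mean
    for t, e in enumerate(eps):
        beta, sigma = (beta1, sigma1) if t < T1 else (beta2, sigma2)
        prev = mu * (1.0 - beta) + beta * prev + sigma * e
        y[t] = prev
    return y

def beta_hat(y):
    """OLS slope in y_t = alpha + beta*y_{t-1} + e_t."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    return np.linalg.lstsq(X, y[1:], rcond=None)[0][1]

eps = np.random.default_rng(3).standard_normal(80)
b0 = beta_hat(simulate_fixed_eps(0.0, 0.9, 1.0, 0.5, 2.0, 40, eps))
# mean shifted to 5 and both error scales tripled: sigma1/sigma2 is preserved
b1 = beta_hat(simulate_fixed_eps(5.0, 0.9, 3.0, 0.5, 6.0, 40, eps))
print(abs(b0 - b1))  # ~0: beta_hat unchanged when mu and sigma1/sigma2 are preserved
```

The invariance is exact because the centred series in the second run is just three times the first, and the OLS slope with an intercept is unaffected by a common affine transformation of regressand and regressor.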
3.4. Forecast Error Decomposition

Using (20) and (9) in (8), and recalling that α_2 = µ_2(1 - τ_p′β_2), after some algebra the forecast error, e_{T+1}(m), can be decomposed as

e_{T+1}(m) = σ_2 ε_{T+1} - X_1T(m) - X_2T(m) - X_3T(m),   (24)

where

X_1T(m) = [v^{-1}τ_v′G_0 c - µ_2] - Σ_{i=1}^p [v^{-1}τ_v′G_i c - µ_2] β̂_iT(m),   (25)
and

X_2T(m) = v^{-1}τ_v′G_0Hε - Σ_{i=1}^p (v^{-1}τ_v′G_iHε) β̂_iT(m),   (26)

X_3T(m) = (x_T - µ_2τ_p)′[β̂_T(m) - β_2].   (27)

The first term in this decomposition refers to future uncertainty, which is distributed independently of the other terms. The second term, X_1T(m), is due to the mean shift and disappears under µ_1 = µ_2 = µ; recall that in this case v^{-1}τ_v′G_i c = µ for all i.⁴ The third term, X_2T(m), captures the uncertainty associated with the unconditional mean of the process and reduces to zero if µ_1 = µ_2 = 0. The last term represents the slope uncertainty and depends on whether the analysis is carried out unconditionally, or conditionally on x_T = (y_T, y_{T-1}, ..., y_{T-p+1})′, in which case the extent of the bias will generally depend on the size of the gap x_T - µ_2τ_p.

3.5. Unconditional MSFE

To obtain the unconditional form of e_{T+1}(m), we first note that x_T can be written as S_p y_T(m), where S_p = (0_{p×(v-p)} ⋮ J_p) and J_p is the p × p matrix with unit entries on its anti-diagonal and zeros elsewhere, so that J_p reverses the order of the last p elements of y_T(m). Therefore, using (19) we have

x_T - µ_2τ_p = (S_pG_0 c - µ_2τ_p) + S_pG_0Hε,

and X_3T(m), defined by (27), decomposes further as

X_3T(m) = (S_pG_0 c - µ_2τ_p)′[β̂_T(m) - β_2] + (S_pG_0Hε)′[β̂_T(m) - β_2].

Footnote 4: See the last section of Appendix B. Note also that X_1T(m) does not disappear if β̄_2 = τ_p′β_2 = 1, so long as µ_1 ≠ µ_2. However, under β̄_2 = 1 it simplifies to X_1T(m) = (µ_1 - µ_2)[1 - τ_p′β̂_T(m)].
However, under µ_1 = µ_2 = µ the first term of X_3T(m) vanishes and we have⁵

e_{T+1}(m) = σ_2 ε_{T+1} - X_2T(m) - [β̂_T(m) - β_2]′S_pG_0Hε.   (28)

Also under µ_1 = µ_2 = µ, e_{T+1}(m), and hence E_ε[e²_{T+1}(m)], do not depend on the unconditional mean of the autoregressive process. The computation of E_ε[e²_{T+1}(m)] can be carried out via stochastic simulations. We have

Ê_R[e²_{T+1}(m)] = σ_2² + R^{-1} Σ_{r=1}^R [X_1T^(r)(m) + X_2T^(r)(m) + X_3T^(r)(m)]²,

where the terms X_iT^(r)(m), i = 1, 2, 3, can be computed using random draws from ε ~ N(0, I_{v+p}), which we denote by ε^(r), r = 1, 2, ..., R. In particular,

X_1T^(r)(m) = [v^{-1}τ_v′G_0 c - µ_2] - Σ_{i=1}^p [v^{-1}τ_v′G_i c - µ_2] β̂_iT^(r)(m),   (29)

X_2T^(r)(m) = v^{-1}τ_v′G_0Hε^(r) - Σ_{i=1}^p (v^{-1}τ_v′G_iHε^(r)) β̂_iT^(r)(m),   (30)

X_3T^(r)(m) = (S_pG_0 c - µ_2τ_p)′[β̂_T^(r)(m) - β_2] + (S_pG_0Hε^(r))′[β̂_T^(r)(m) - β_2],   (31)

where β̂_iT^(r)(m) denotes the estimate of β_i based on ε^(r). Assuming that E_ε[e²_{T+1}(m)] exists, then due to the independence of ε^(r) across r, and the fact that the X_iT^(r)(m) are also independently and identically distributed across r, we have (as R → ∞)

Ê_R[e²_{T+1}(m)] →_p E_ε[e²_{T+1}(m)].

The following proposition generalizes a theorem in Fuller (1996, p. 445) to the case where estimation has been based on an AR(p) model which has been subject to breaks in the slope coefficients and/or error variances.

Proposition 2: The one-step ahead forecast errors, e_{T+1}(m), defined by (8) from the AR(p) model (2), subject to a break in the AR coefficients (β_1 ≠ β_2) or a break in the innovation variance (σ_1² ≠ σ_2²), are unbiased provided that:

(i) the probability distribution of ε* = (ε′, ε_{T+1})′ is symmetric around E(ε*) = 0, and its first and second order moments exist;

Footnote 5: Note that in this case S_pG_0 c = µS_pτ_v = µτ_p.
(ii) the first-order moments of the estimated slope coefficients, β̂_iT(m), exist, namely E|β̂_iT(m)| < ∞, for i = 1, 2, ..., p;

(iii) there is no break in the mean of the process, µ_1 = µ_2.

Proof: Under µ_1 = µ_2, using (26) and (28), the prediction error can be written as

e_{T+1}(m) = σ_2 ε_{T+1} - [β̂_T(m) - β_2]′S_pG_0Hε - [v^{-1}τ_v′G_0Hε - Σ_{i=1}^p (v^{-1}τ_v′G_iHε) β̂_iT(m)].

It is clear that under assumption (i) the terms σ_2ε_{T+1}, β_2′S_pG_0Hε and τ_v′G_0Hε, which are linear functions of ε*, have mean zero, and we have

E_ε[e_{T+1}(m)] = -E_ε[β̂_T′(m)S_pG_0Hε] + Σ_{i=1}^p E_ε[(v^{-1}τ_v′G_iHε) β̂_iT(m)].

Also, under µ_1 = µ_2 and by Proposition 1, β̂_T(m) is an even function of ε. Hence, β̂_T′(m)S_pG_0Hε and (τ_v′G_iHε)β̂_iT(m), for i = 1, 2, ..., p, are odd functions of ε, and under assumptions (i) and (ii) their expectations exist and are equal to zero by the symmetry assumption. Therefore, E_ε[e_{T+1}(m)] = 0. In the case where µ_1 ≠ µ_2, β̂_jT(m) is not an even function of ε, the term X_1T(m) defined by (25) does not vanish, and the prediction error given by (24) is no longer an odd function of ε, so it will, in general, not have a zero mean.

Remark: Conditions under which the moments of β̂_iT(m) exist in the case of AR(1) models with fixed coefficients have been investigated in the literature and readily extend to AR(1) models subject to breaks. For the AR(1) model under µ_1 = µ_2 we have [see (21)]

β̂_T(m) = (ε′H′G_1′M_τG_0Hε) / (ε′H′G_1′M_τG_1Hε).

Assuming that ε is normally distributed and applying a lemma due to Smith (1988) to (ε′H′G_1′M_τG_1Hε)^{-1}, it is easily established that the r-th moment of
β̂_T(m) exists if rank(H′G_1′M_τG_1H) = v - 1 = T - m > 2r.⁶ Hence, β̂_T(m) has a first-order moment if T > m + 2. To our knowledge no such conditions are known for higher order AR processes, even with fixed coefficients. Proposition 2 has important implications for the trade-off that exists in the estimation bias of the slope and intercept coefficients in AR models, even in the presence of breaks, so long as µ_1 = µ_2 = µ. To see this, notice from (23) that

E[α̂_T(m) - α_2] = -µ E[τ_p′β̂_T(m) - β̄_2].

This provides an interesting relationship between the small sample bias of the estimator of the intercept term, E[α̂_T(m) - α_2], and the small sample bias of the long-run coefficient, E[τ_p′β̂_T(m) - β̄_2]. The estimator of the intercept term, α̂_T(m), is unbiased only if the mean of the process, µ, is zero. But in general there is a spillover effect from the bias of the slope coefficient to that of the intercept term. For the AR(1) model the results simplify further and we have

E[α̂_T(m) - α_2] = -µ E[β̂_T(m) - β_2].   (32)

Since E[β̂_T(m) - β_2] < 0, it therefore follows that E[α̂_T(m) - α_2] > 0 if µ > 0, and E[α̂_T(m) - α_2] ≤ 0 if µ ≤ 0. Once again these results hold irrespective of whether β_1 = β_2 or not.

3.6. Conditional MSFE

As before we have e_{T+1}(m) = σ_2ε_{T+1} - X_1T(m) - X_2T(m) - X_3T(m), where the X_iT(m), i = 1, 2, 3, are defined by (25), (26) and (27). In computing the conditional MSFE, defined by E_ε[e²_{T+1}(m) | Ω_T], we fix x_T and integrate with respect to the distribution of ε. Recall that β̂_T(m) and α̂_T(m), as defined in (6) and (7), are only functions of ε and are hence not constrained by the terminal value,

Footnote 6: Note that H is full rank, rank(G_i) = v, and rank(M_τ) = v - 1.
18 x T. 7 To investigate the effect of parameter estimation uncertainty we therefore draw values of ε independently of x T. Once again the results simplify when µ 1 = µ 2 = µ. In this case X 1T (m) = 0, X 2T (m) is an odd function of ε, and assuming that the distribution of ε is symmetric we have E ε [e T +1 (m) Ω T ]= (x T µτ p ) 0 E ε ³ˆβT (m) β 2. Suppose p =1, so that it is easy to characterize when x T is above or below the mean. Then E ε [e T +1 (m) Ω T ]= (y T µ) E ε ³ˆβT (m) β 2. (33) Since, E ε ³ˆβT (m) β 2 < 0, E ε [e T +1 (m) Ω T ]= ( > 0 if y T >µ 0 if y T µ, (34) and the estimated model underpredicts if the last observation is above the unconditional mean (y T >µ), while conversely it overpredicts if the last observation is below the unconditional mean (y T <µ). Therefore, conditional predictions tend to be biased towards the unconditional mean of the process. As with the unconditional MSFE, the computation of the conditional MSFE can also be carried out by stochastic simulations. In general, for a given value of x T, and using draws from ε N(0, I ν+p )wehave Ê R e 2 T +1 (m) Ω T = σ R RX h i 2 X (r) 1T (m)+x(r) (r) 2T (m)+ X 3T (m), (35) r=1 where X (r) 1T term, X(r) 3T (m) andx(r) 2T (m) are given by (29) and (30), as before, with the third (m), now defined by X (r) 3T (m) =(x 0 (r) T µ 2 τ p ) ³ˆβ T (m) β 2. (36) Once again as R, we would expect ÊR e 2 T +1 (m) Ω T p Eε e 2 T +1 (m) Ω T. 7 This is consistent with the approach taken in calculating asymptotic results, c.f. Fuller (1996). If we literally condition on the full path of yvalues in Ω T,thenˆβ T (m) andˆα T (m) areofcourse nonrandom (fixed) constants and no estimation uncertainty arises. 17
4. Numerical Results

Our approach is quite general and allows us to study the small-sample properties of AR models in some detail. The existing literature has focused on the AR(1) model without a break, where the key parameters affecting the properties of the OLS estimators, α̂_T(m) and β̂_T(m), are the sample size and the persistence parameter, β_1. In our setting there are many more parameters to consider. In the absence of a break there are now p autoregressive parameters plus the intercept, α, and the innovation variance, σ². Under a single break, we need to consider both the pre- and post-break parameters, i.e. the AR coefficients (β_1, β_2), the intercepts (α_1, α_2) and the innovation variances (σ²_1, σ²_2). Furthermore, how the total sample divides into pre- and post-break periods (v_1 and v_2) is now crucial to the bias in the post-break parameter estimates and to the bias and variance of the forecast error.

To ensure that our results are comparable to the existing literature, our benchmark model is the AR(1) specification without a break (experiment 1 in Table 1). We study breaks in the autoregressive parameter in the form of both moderately sized (0.3) and large (0.6) breaks in either direction (experiments 2-4), as well as a unit root process in the post-break (experiment 5) or pre-break (experiment 9) period. We also consider pure breaks in the innovation variance (experiments 6 and 7), where σ changes between values of 1/4 and 1 or 4 and 1, and in the mean (experiment 8), where μ changes between 1 and 2. For convenience the parameter values assumed in each of the experiments are summarized in Table 1. Since our focus is on the effect of breaks on the bias and forecasting performance of AR models, results are presented as a function of the pre-break window size (v_1) and the post-break window size (v_2).
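The unconditional unbiasedness result established above (forecast errors have zero mean under a slope break so long as μ_1 = μ_2) can also be verified numerically before turning to the tables. The sketch below is a minimal check with illustrative parameter values, not those of Table 1: it pools v_1 pre-break and v_2 post-break observations in a single OLS regression, with the intercepts chosen so that both regimes share the same unconditional mean.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_forecast_error(beta1, beta2, mu, sigma, v1, v2, n_sims=10000):
    """Unconditional mean of the one-step-ahead OLS forecast error for an
    AR(1) whose slope breaks from beta1 to beta2 after observation v1,
    while the unconditional mean stays at mu in both regimes."""
    a1, a2 = mu * (1 - beta1), mu * (1 - beta2)  # regime intercepts
    n = v1 + v2
    errs = np.empty(n_sims)
    for s in range(n_sims):
        y = np.empty(n + 1)
        y[0] = mu + sigma * rng.standard_normal() / np.sqrt(1 - beta1**2)
        for t in range(1, n + 1):
            a, b = (a1, beta1) if t <= v1 else (a2, beta2)
            y[t] = a + b * y[t - 1] + sigma * rng.standard_normal()
        X = np.column_stack([np.ones(n), y[:-1]])  # pooled OLS estimation window
        a_hat, b_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
        y_next = a2 + beta2 * y[-1] + sigma * rng.standard_normal()
        errs[s] = y_next - (a_hat + b_hat * y[-1])
    return errs.mean()

m = mean_forecast_error(beta1=0.3, beta2=0.6, mu=1.0, sigma=1.0, v1=30, v2=30)
print(m)  # close to zero despite the break in the slope
```

The individual parameter estimates are biased, but with μ_1 = μ_2 the intercept and slope biases offset on average in the forecast error, which is the trade-off noted after Proposition 2.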
We vary v_1 from zero (no pre-break information) through 1, 2, 3, 4, 5, 10, 20, 30, 50 and 100, while the post-break window, v_2, is set at 10, 20, 30, 50 and 100. Simulation results are presented in Tables 2-5. Results are based on 50,000 Monte Carlo simulations with innovations drawn from an IID Gaussian distribution.⁸ Table 2 shows the bias in β̂_1 while Table 3 shows the conditional bias in the forecast for a situation where y_T is above its mean, i.e., y_T = μ_2 + σ_2.⁹ To measure forecasting performance, Table 4 reports the unconditional RMSFE while Table 5 shows the RMSFE conditional on y_T = μ_2 + σ_2, as functions of the pre-break (v_1) and post-break (v_2) window sizes. We condition on this particular value since if y_T = μ_2 the conditional bias is zero, while if y_T = μ_2 − σ_2 the conditional bias takes the same value but with the sign reversed, c.f. (33).

Footnote 8: We also considered an AR(2) specification to study the effect of higher-order dynamics. Results were very similar to those reported below and are available from the authors' web site.

Footnote 9: Estimated values are computed as averages across Monte Carlo simulations relative to the true post-break values. To ensure comparability across the experiments they are based on the same random numbers.

Bias Results

First consider the bias in β̂_1. In the absence of a break, β̂_1 is downward biased, with a bias that disappears as v_1 and v_2 increase and becomes quite small when the combined sample v = v_1 + v_2 is sufficiently large.¹⁰ Notice the symmetry of the results in v_1 and v_2, which follows since (under no break) only v_1 + v_2 matters for the bias.¹¹

Footnote 10: The bias estimates are in line with the well-known Kendall (1954) approximation formula E(β̂_1) − β_1 = −(1 + 3β_1)/v + O(v^{−3/2}), v = v_1 + v_2.

Footnote 11: Recall from (32) that in the case of Gaussian errors the bias in α̂_T(m) can be exactly inferred from the bias of β̂_T(m) when there is no break in the mean. For this reason we focus our analysis on the bias in β̂_T(m).

Once a break is introduced in the AR parameter, the bias in β̂_1 continues to decline in v_2 but need no longer decline monotonically as a function of v_1. The reason for this is simple: including pre-break data generated by a different (less persistent) process introduces a new bias term in β̂_1. It is only to the extent that this term is offset by a reduction in the small-sample bias of the AR estimate that inclusion of pre-break data will lead to a bias reduction. Thus, when v_2 is very large (e.g., 50 or 100 post-break observations) the small-sample bias in β̂_1 based purely on post-break observations is already quite small. In this situation, inclusion of pre-break data will not lower the bias in β̂_1. Conversely, when the post-break sample is small (i.e., v_2 = 10-20 observations), the small-sample bias in β̂_1 is very large and including up to 30 pre-break observations will actually reduce the bias under a moderately sized break. Naturally, if the break size is large (experiment 4), this effect is reduced, since the true bias due to including pre-break observations in the estimation window dominates any reduction in the small-sample bias in β̂_1 based solely on post-break data for all but the smallest post-break window sizes.

Interestingly, when the break is in the reverse direction (experiment 3), so that the true value of β_1 declines, including a small number of pre-break data points leads to a reduction in the bias in β̂_1 even for very large post-break windows. For example, the bias in β̂_1 is minimized by including 3 pre-break observations even when v_2 = 100. The reason is again related to the direction of the small-sample bias in β̂_1. Since β̂_1 is downward biased, when the break is from high to low persistence, the (upward) bias introduced by inclusion of the more persistent pre-break data works in the opposite direction of the small-sample (downward) bias in β̂_1. For this reason the biases under a decline in β_1 tend to be smaller than the biases observed when β_1 increases at the time of the break.

Under a post-break unit root (experiment 5), the bias-minimizing pre-break window size is around 20 observations. Under a pre-break unit root (experiment 9), bias is smallest for either v_1 = 0 or v_1 = 1. When a break occurs in the innovation variance (experiments 6 and 7), the smallest bias is always achieved by the longest pre-break windows. The only difference to the case without a break is that the bias is no longer a symmetric function of v_1 and v_2. Allowing for a break in the mean (experiment 8), the forecast error is no longer unbiased unconditionally and the optimal pre-break window size rises to 100 irrespective of the value of v_2.

Turning next to the conditional bias in the forecast, Table 3 shows that, in the absence of a break, the bias is positive when the prediction is made conditional on y_T = μ_2 + σ_2, a value above the mean of the process. This is, of course, consistent with (34) and with the sign of the bias in β̂_1. In general, the results for the conditional bias in the forecast error mirror those of the bias in β̂_1, except for the case with a break in the mean.
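The bias trade-off just described can be reproduced in a stripped-down experiment (our own sketch, using illustrative parameter values rather than those of Table 1): with a short post-break sample the purely post-break OLS estimate carries a large downward small-sample bias, while under a large upward break pooling in a long, less persistent pre-break sample adds a contamination bias of its own.

```python
import numpy as np

rng = np.random.default_rng(2)

def slope_bias(beta1, beta2, v1, v2, mu=1.0, sigma=1.0, n_sims=5000):
    """Average bias of the pooled OLS slope estimate relative to the
    post-break value beta2, using v1 pre-break (slope beta1) and v2
    post-break (slope beta2) observations; the mean mu is unchanged."""
    a1, a2 = mu * (1 - beta1), mu * (1 - beta2)
    n = v1 + v2
    biases = np.empty(n_sims)
    for s in range(n_sims):
        y = np.empty(n + 1)
        y[0] = mu + sigma * rng.standard_normal() / np.sqrt(1 - beta1**2)
        for t in range(1, n + 1):
            a, b = (a1, beta1) if t <= v1 else (a2, beta2)
            y[t] = a + b * y[t - 1] + sigma * rng.standard_normal()
        X = np.column_stack([np.ones(n), y[:-1]])
        _, b_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
        biases[s] = b_hat - beta2
    return biases.mean()

b_post_only = slope_bias(beta1=0.3, beta2=0.9, v1=0, v2=20)   # small-sample bias only
b_long_pre = slope_bias(beta1=0.3, beta2=0.9, v1=100, v2=20)  # plus pre-break contamination
print(b_post_only, b_long_pre)
```

Both biases are negative in this configuration; the question studied in Table 2 is whether a modest amount of pre-break data reduces the net bias, which depends on the size of the break and on v_2.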
Whereas the bias in β̂_1 was reduced the larger the value of v_1 when the mean increases at the time of the break, the bias in the forecast error is smallest when v_1 = 0 and the mean increases, assuming a large post-break sample (v_2 = 50 or 100).

Forecasting Performance

To measure forecasting performance for the AR(1) model, unconditional and conditional RMSFE values are shown in Tables 4 and 5. Under no break the unconditional RMSFE is 1.15 for the smallest combined sample (v_1 = 0, v_2 = 10) and it declines symmetrically as a function of v_1 and v_2. In the presence of a moderate
break in the AR coefficient, the unconditional RMSFE continues to decline as a function of v_2 but it no longer declines monotonically in v_1, the pre-break window. Furthermore, the unconditional RMSFE no longer converges to one, its theoretical value in the absence of parameter estimation uncertainty, provided the ratio v_1/v_2 does not go to zero. For example, when v_1 = v_2 = 100, the unconditional RMSFE under a moderate break in β_1 is close to 1.02, as opposed to the value observed in the case without a break. This difference is due to the squared bias in the AR parameters introduced by including pre-break data points.

Generally, the windows that minimize the unconditional RMSFE tend to be longer than the windows that minimize the bias. Increasing the window size beyond the point that produces the smallest bias may be acceptable if it reduces the forecast error variance by more than the associated increase in the squared bias. A moderately sized break in β_1 implies that the optimal pre-break window size declines to a small number of observations under the unconditional RMSFE criterion, although it remains much longer under the conditional RMSFE criterion. In both cases, the optimal value of v_1 is smaller, the larger the value of v_2 and the larger the size of the break in β_1, as can be seen by comparing the results from experiments 2 and 4.

Somewhat different patterns emerge when the AR model switches from having a unit root process to being stationary and vice versa. Under a post-break unit root the conditional RMSFE is minimized for rather large values of v_1, whereas the unconditional RMSFE is minimized at much smaller values of v_1, typically below 10 observations. But, under the pre-break unit root scenario, the smallest unconditional and conditional RMSFE values are produced by at most including one or two pre-break observations.
When the post-break innovation variance is higher, it is optimal to set the pre-break window as long as possible, since this maximizes the length of the less noisy data and thus brings down the forecast error variance without introducing a bias in the forecast. In contrast, when the innovation variance declines at the time of the break, the optimal pre-break window size is only long provided the post-break window, v_2, is rather short, and it declines to zero for larger values of v_2. Notice how the performance of the forecast can deteriorate badly upon the inclusion of a single pre-break data point even with quite long post-break windows. This is due to the extra noise introduced by using pre-break data for parameter estimation.

Under a break to the mean (experiment 8), the lowest conditional and unconditional RMSFE values are observed for the longer pre-break windows. This is an interesting finding and holds despite the fact that additional bias is introduced into the forecast. For example, in Table 4 the RMSFE is systematically reduced by increasing the pre-break window, v_1. In practice, breaks are likely to involve the means as well as the slope coefficients. In such situations our results suggest that, at least for breaks of similar size to those assumed here, it is difficult to outperform the forecasting performance generated by a model based on an expanding window of the data.

Forecasting Performance of Rolling, Expanding and Post-break Windows

To shed light on the practical implications of our results, we next consider the out-of-sample forecasting performance of a range of widely used estimation windows. One way to deal with parameter instability is to use a rolling observation window. The size of the rolling window is often decided by a priori considerations. Here we consider a short rolling window using the most recent 25 observations and a relatively long rolling window based on the most recent 50 observations. If parameter instability is believed to be due to the presence of rare structural breaks, another possibility is to use only post-break data. In some cases the timing of the break may be known, but in most cases both the timing and the number of breaks must be estimated. We therefore use the Bai-Perron (1998) method to test for the presence of structural breaks and determine their timing, allowing up to three breaks and selecting the number of breaks by the Schwarz information criterion. If one or more breaks is identified at time t, this procedure uses data after the most recent break date to produce a forecast for period t+1. If no break is identified, an expanding data window is used to generate the forecast. Finally, as a third option an expanding window is considered.
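A minimal version of this window comparison can be sketched in simulation. The sketch below is our own construction with illustrative parameters only, restricted to the expanding window versus the short (25-observation) rolling window and omitting the estimated-break-date method; it places a slope break at observation 101 and evaluates one-step-ahead forecasts over observations 111-150, the same design as the exercise described next.

```python
import numpy as np

rng = np.random.default_rng(3)

def window_rmsfe(beta_pre, beta_post, mu=1.0, sigma=1.0, n_sims=500):
    """One-step-ahead RMSFE of expanding vs. 25-observation rolling OLS
    windows for an AR(1) whose slope breaks from beta_pre to beta_post at
    observation 101; forecasts are evaluated over observations 111-150."""
    a_pre, a_post = mu * (1 - beta_pre), mu * (1 - beta_post)
    sse = {"expanding": 0.0, "rolling25": 0.0}
    n_forecasts = 0
    for s in range(n_sims):
        y = np.empty(151)
        y[0] = mu + sigma * rng.standard_normal() / np.sqrt(1 - beta_pre**2)
        for t in range(1, 151):
            a, b = (a_pre, beta_pre) if t <= 100 else (a_post, beta_post)
            y[t] = a + b * y[t - 1] + sigma * rng.standard_normal()
        for T in range(110, 150):  # forecast y[T+1] using data up to y[T]
            for name, win in (("expanding", T), ("rolling25", 25)):
                yw = y[T - win:T + 1]
                X = np.column_stack([np.ones(win), yw[:-1]])
                a_hat, b_hat = np.linalg.lstsq(X, yw[1:], rcond=None)[0]
                sse[name] += (y[T + 1] - (a_hat + b_hat * y[T])) ** 2
            n_forecasts += 1
    return {k: np.sqrt(v / n_forecasts) for k, v in sse.items()}

res_break = window_rmsfe(beta_pre=0.3, beta_post=0.9)  # large slope break
res_none = window_rmsfe(beta_pre=0.9, beta_post=0.9)   # no break
print(res_break, res_none)
```

Consistent with the pattern reported for Panel A of Table 6, the short rolling window wins under a large break in the slope, while the expanding window wins when no break is present.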
The expanding window is the most efficient estimation method in the absence of breaks and provides a natural benchmark.

We initially undertook the following simulation exercise. For each of the original AR(1) experiments we assume a break has taken place at observation 101. Our post-break forecast evaluation period runs from observations 111 to 150. For this period we computed RMSFEs of the one-step ahead forecasts obtained under different estimation windows by Monte Carlo simulation.

Panel A of Table 6 reports the results under a single break. As expected, when a break is not present the expanding window method produces the lowest RMSFE
values. The expanding window also performs well when the break only affects the volatility or the mean parameter. The fact that the expanding window performs best even when the pre-break volatility is higher than the post-break volatility can be explained by the reduction in the variance of the parameter estimation error due to using a very long estimation window. The finding for a break in the mean is consistent with the simulation results in Table 4. In the experiments with a very large change in the autoregressive parameter (experiments 4-5), the short rolling window method produces the best performance, while the long rolling window works best for smaller breaks (experiments 2-3), which generate a lower squared bias.

Interestingly, the use of a post-break window with an estimated break point does not produce the lowest RMSFE performance in any of experiments 1-8. A possible explanation of this finding lies in the modest power of break point tests to detect changes in autoregressive parameters, as documented by Banerjee, Lumsdaine and Stock (1992). The only case where the post-break window method results in the lowest RMSFE is under a pre-break unit root (experiment 9). For this case the expanding window method performs quite poorly. This is consistent with our simulation results, which showed that the conditional and unconditional RMSFE performance was best for very small, frequently zero, pre-break windows under a pre-break unit root.

We also modified the simulation with the pre-break unit root to ensure that the point towards which the post-break process mean-reverts is the terminal point of the pre-break unit root process (experiment 10), rather than simply μ_2. This is likely to generate sample paths more similar to those observed in practice, c.f. Banerjee, Lumsdaine and Stock (1992).
The results show that although the expanding window method performs relatively better, it still does not produce the lowest RMSFE.

Multiple Breaks

So far we have focused on the case with a single structural break, but in practice the time series process under consideration may be subject to multiple breaks. Our procedure can readily be generalized to account for this possibility. Accordingly, we extended our simulation experiments to allow for two breaks occurring after 50 and 100 observations, respectively. The presence of multiple breaks raises questions concerning the process generating the breaks. Barring a general theory, we consider
More informationSchooling, Political Participation, and the Economy. (Online Supplementary Appendix: Not for Publication)
Schooling, Political Participation, and the Economy Online Supplementary Appendix: Not for Publication) Filipe R. Campante Davin Chor July 200 Abstract In this online appendix, we present the proofs for
More informationChapter 6: Multivariate Cointegration Analysis
Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration
More informationSF2940: Probability theory Lecture 8: Multivariate Normal Distribution
SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,
More informationNonStationary Time Series andunitroottests
Econometrics 2 Fall 2005 NonStationary Time Series andunitroottests Heino Bohn Nielsen 1of25 Introduction Many economic time series are trending. Important to distinguish between two important cases:
More informationWorking Papers. Cointegration Based Trading Strategy For Soft Commodities Market. Piotr Arendarski Łukasz Postek. No. 2/2012 (68)
Working Papers No. 2/2012 (68) Piotr Arendarski Łukasz Postek Cointegration Based Trading Strategy For Soft Commodities Market Warsaw 2012 Cointegration Based Trading Strategy For Soft Commodities Market
More informationEffects of electricity price volatility and covariance on the firm s investment decisions and longrun demand for electricity
Effects of electricity price volatility and covariance on the firm s investment decisions and longrun demand for electricity C. Brandon (cbrandon@andrew.cmu.edu) Carnegie Mellon University, Pittsburgh,
More informationEstimating the Degree of Activity of jumps in High Frequency Financial Data. joint with Yacine AïtSahalia
Estimating the Degree of Activity of jumps in High Frequency Financial Data joint with Yacine AïtSahalia Aim and setting An underlying process X = (X t ) t 0, observed at equally spaced discrete times
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationTesting for Cointegrating Relationships with NearIntegrated Data
Political Analysis, 8:1 Testing for Cointegrating Relationships with NearIntegrated Data Suzanna De Boef Pennsylvania State University and Harvard University Jim Granato Michigan State University Testing
More informationStatistical Tests for Multiple Forecast Comparison
Statistical Tests for Multiple Forecast Comparison Roberto S. Mariano (Singapore Management University & University of Pennsylvania) Daniel Preve (Uppsala University) June 67, 2008 T.W. Anderson Conference,
More informationInvestment Statistics: Definitions & Formulas
Investment Statistics: Definitions & Formulas The following are brief descriptions and formulas for the various statistics and calculations available within the ease Analytics system. Unless stated otherwise,
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More information2. What are the theoretical and practical consequences of autocorrelation?
Lecture 10 Serial Correlation In this lecture, you will learn the following: 1. What is the nature of autocorrelation? 2. What are the theoretical and practical consequences of autocorrelation? 3. Since
More informationTime Series Analysis
Time Series Analysis Time series and stochastic processes Andrés M. Alonso Carolina GarcíaMartos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and GarcíaMartos
More informationCHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.
CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In
More informationUnderstanding the Impact of Weights Constraints in Portfolio Theory
Understanding the Impact of Weights Constraints in Portfolio Theory Thierry Roncalli Research & Development Lyxor Asset Management, Paris thierry.roncalli@lyxor.com January 2010 Abstract In this article,
More information11. Time series and dynamic linear models
11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd
More informationα = u v. In other words, Orthogonal Projection
Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v
More informationRegression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture  2 Simple Linear Regression
Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture  2 Simple Linear Regression Hi, this is my second lecture in module one and on simple
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:
More informationCFA Examination PORTFOLIO MANAGEMENT Page 1 of 6
PORTFOLIO MANAGEMENT A. INTRODUCTION RETURN AS A RANDOM VARIABLE E(R) = the return around which the probability distribution is centered: the expected value or mean of the probability distribution of possible
More informationPersuasion by Cheap Talk  Online Appendix
Persuasion by Cheap Talk  Online Appendix By ARCHISHMAN CHAKRABORTY AND RICK HARBAUGH Online appendix to Persuasion by Cheap Talk, American Economic Review Our results in the main text concern the case
More informationFULLY MODIFIED OLS FOR HETEROGENEOUS COINTEGRATED PANELS
FULLY MODIFIED OLS FOR HEEROGENEOUS COINEGRAED PANELS Peter Pedroni ABSRAC his chapter uses fully modified OLS principles to develop new methods for estimating and testing hypotheses for cointegrating
More informationELECE8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems
Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed
More informationThe cyclical component factor model
The cyclical component factor model Christian M. Dahl Department of Economics Purdue University John Smidt The Danish Economic Council August 28, 2005 Henrik Hansen Department of Economics University of
More informationSpectral Measure of Large Random Toeplitz Matrices
Spectral Measure of Large Random Toeplitz Matrices Yongwhan Lim June 5, 2012 Definition (Toepliz Matrix) The symmetric Toeplitz matrix is defined to be [X i j ] where 1 i, j n; that is, X 0 X 1 X 2 X n
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
More informationDOWNSIDE RISK IMPLICATIONS FOR FINANCIAL MANAGEMENT ROBERT ENGLE PRAGUE MARCH 2005
DOWNSIDE RISK IMPLICATIONS FOR FINANCIAL MANAGEMENT ROBERT ENGLE PRAGUE MARCH 2005 RISK AND RETURN THE TRADEOFF BETWEEN RISK AND RETURN IS THE CENTRAL PARADIGM OF FINANCE. HOW MUCH RISK AM I TAKING? HOW
More informationTopic 5: Stochastic Growth and Real Business Cycles
Topic 5: Stochastic Growth and Real Business Cycles Yulei Luo SEF of HKU October 1, 2015 Luo, Y. (SEF of HKU) Macro Theory October 1, 2015 1 / 45 Lag Operators The lag operator (L) is de ned as Similar
More informationConditional guidance as a response to supply uncertainty
1 Conditional guidance as a response to supply uncertainty Appendix to the speech given by Ben Broadbent, External Member of the Monetary Policy Committee, Bank of England At the London Business School,
More informationLecture 2: ARMA(p,q) models (part 3)
Lecture 2: ARMA(p,q) models (part 3) Florian Pelgrin University of Lausanne, École des HEC Department of mathematics (IMEANice) Sept. 2011  Jan. 2012 Florian Pelgrin (HEC) Univariate time series Sept.
More informationUnivariate Time Series Analysis; ARIMA Models
Econometrics 2 Fall 25 Univariate Time Series Analysis; ARIMA Models Heino Bohn Nielsen of4 Univariate Time Series Analysis We consider a single time series, y,y 2,..., y T. We want to construct simple
More informationTesting Linearity against Nonlinearity and Detecting Common Nonlinear Components for Industrial Production of Sweden and Finland
Testing Linearity against Nonlinearity and Detecting Common Nonlinear Components for Industrial Production of Sweden and Finland Feng Li Supervisor: Changli He Master thesis in statistics School of Economics
More information85 Quantifying the Impact of Oil Prices on Inflation
85 Quantifying the Impact of Oil Prices on Inflation By Colin Bermingham* Abstract The substantial increase in the volatility of oil prices over the past six or seven years has provoked considerable comment
More informationReview of Fundamental Mathematics
Review of Fundamental Mathematics As explained in the Preface and in Chapter 1 of your textbook, managerial economics applies microeconomic theory to business decision making. The decisionmaking tools
More informationStandardization and Estimation of the Number of Factors for Panel Data
Journal of Economic Theory and Econometrics, Vol. 23, No. 2, Jun. 2012, 79 88 Standardization and Estimation of the Number of Factors for Panel Data Ryan GreenawayMcGrevy Chirok Han Donggyu Sul Abstract
More informationNumerical Summarization of Data OPRE 6301
Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting
More informationDoes the interest rate for business loans respond asymmetrically to changes in the cash rate?
University of Wollongong Research Online Faculty of Commerce  Papers (Archive) Faculty of Business 2013 Does the interest rate for business loans respond asymmetrically to changes in the cash rate? Abbas
More information