Time Series Analysis Exercises

Universität Potsdam
Time Series Analysis Exercises
Hans Gerhard Strohe
Potsdam 2005

I Typical exercises and solutions

1. For theoretical modelling of scenarios of the economic development of a national economy, the following models for the GDP increment are analysed:

a. Y_t = Y_{t-1} + a_t
b. Y_t = 1.097 Y_{t-1} − 0.97 Y_{t-2} + a_t,

where
Y – GDP increment,
a – white noise with zero mean and constant variance σ² = 100,
t – time (quarters starting with Q1, 1993).

Check by an algebraic criterion which one is stationary.

Answer:

a) The process can be written φ(L)Y_t = a_t with the lag polynomial φ(L) = 1 − L, where L is the lag or backshift operator, L Y_t = Y_{t-1}. From this follows the characteristic equation φ(z) = 1 − z = 0. Its root, i.e. its solution, is z = 1. It is not greater than 1 in modulus, i.e. it is not placed outside the unit circle. The process is therefore not stationary: it is a unit root process, or more specifically a random walk.

b) The process can be written φ(L)Y_t = a_t with the lag polynomial φ(L) = 1 − 1.097L + 0.97L². From this follows the characteristic equation φ(z) = 1 − 1.097z + 0.97z² = 0. Its roots (complex numbers) are

z₁ = 0.57 + 0.84i and z₂ = 0.57 − 0.84i, with |z₁| = |z₂| = √(0.57² + 0.84²) = 1.015 > 1.

Both roots lie outside the unit circle, which is a sufficient condition for the stationarity of the process.
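The same criterion can be checked numerically. Below is a minimal sketch in Python (assuming NumPy; the coefficients are those of the exercise) that computes the roots of both characteristic polynomials:

```python
# A minimal sketch, assuming NumPy: stationarity check via the roots of the
# characteristic polynomial phi(z).
import numpy as np

# Model b: phi(z) = 1 - 1.097 z + 0.97 z^2.
# numpy.roots expects coefficients from the highest power downwards.
roots_b = np.roots([0.97, -1.097, 1.0])
print(roots_b)           # approx. 0.57 +/- 0.84i
print(np.abs(roots_b))   # approx. 1.015 -> outside the unit circle: stationary

# Model a: phi(z) = 1 - z has the root z = 1 (on the unit circle): random walk.
print(np.roots([-1.0, 1.0]))  # [1.]
```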

2. The variable l (= labour) in the file employees.dat denotes the quantity of labour force, i.e. the number of employees, in a big German company from January 1995 till December 2004.

i. Display the graph of the time series l_t.
ii. What are the characteristics of a stationary time series? Is the time series l_t likely to be stationary? Check it first by the naked eye.
iii. Test the stationarity of l_t by a suitable procedure. Determine the degree of integration.
iv. Estimate the correlation and the partial correlation function of the process. Give a description and an interpretation.
v. What type of basic model could fit the time series l_t? Why?
vi. Estimate the parameters of the model assumed in question v.
vii. Estimate alternative models and compare them by suitable indicators.
viii. Forecast the time series for 2005.

ix. A particular analysis method applied to the time series l_t results in the following graph.

[Fig. 1: A special diagnostic function – three estimated curves (Bartlett, Tukey, Parzen) plotted against the circular frequency]

What is the name of this particular analysis method? What can you derive from the curve? What should be the typical shape of the function estimated for a process of the type assumed in question v?

Answers:

i. The graph of the time series l_t:

[Fig. 2: Graph of the employees time series l (monthly, 1995–2004, values roughly between 4000 and 6000)]

ii. The main characteristics of a stationary time series are that the mean μ_t and the variance σ²_t of the stochastic process generating this special time series are independent of time t,

μ_t = μ = const,  σ²_t = σ² = const,

and that the autocovariances γ_{t1,t2} depend only on the time difference τ:

γ_{t1,t2} = γ_{t1−t2} = γ_τ  with τ = t1 − t2 (lag).

A check of the graph by the naked eye gives no reason to assume that the time series is not stationary. At first glance there appears to be no trend and no relevant development of the variance.

iii. Test of stationarity by the DF test. The following Dickey-Fuller regression of Δl_t on l_{t-1} produces a t-value of −3.63. Application of an augmented Dickey-Fuller regression is not necessary, because the augmented model would have higher values of the Schwarz criterion (SIC, in the version based on the error variance).

Table 1: Dickey-Fuller Test Equation
Dependent Variable: Δl_t;  Method: Least Squares
Sample (adjusted): 1995M02 2004M12; observations: 119

Variable    Coefficient   Std. Error   t-Statistic   Prob.
l_{t-1}     -0.200468     0.055277     -3.626566     0.0004
C           1096.460      301.8879     3.632011      0.0004

R-squared 0.101051            Mean dependent var 2.20380
Adjusted R-squared 0.093368   S.D. dependent var 92.83931
S.E. of regression 88.39903   Akaike info criterion (σ) 11.82186
Sum squared resid 914283.5    Schwarz criterion (σ) 11.86497

The critical value of the t-statistic for a model with intercept C is −2.89 at the 5 % level:

Table 2: Null Hypothesis: l has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=12)

Test critical values:   1% level   -3.486064
                        5% level   -2.885863

The measured t-value exceeds the critical value downwards. That means the null hypothesis of non-stationarity, i.e. of a unit root, can be rejected at the 5 % level, and the time series is regarded as stationary. As l_t is stationary, its degree of integration is 0: l_t is I(0).
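The same test is packaged in most statistics software. A minimal sketch in Python with statsmodels (the file name and the one-column layout are assumptions following the exercise) could look like this:

```python
# A minimal sketch, assuming pandas/statsmodels and a one-column file
# employees.dat with the monthly values of l.
import pandas as pd
from statsmodels.tsa.stattools import adfuller

l = pd.read_csv("employees.dat", header=None).squeeze("columns")

# ADF test with an intercept; the lag order is chosen automatically by the
# Schwarz/Bayesian criterion, mirroring the SIC-based choice in the text.
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(l, regression="c", autolag="BIC")
print(f"ADF statistic {stat:.3f}, p-value {pvalue:.4f}")
print("critical values:", crit)  # keys '1%', '5%', '10%'
# A statistic below the 5% critical value rejects the unit root: l is I(0).
```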

iv. Sample correlation and partial correlation functions of the process. The sample autocorrelation function (AC) for a short series can be calculated using the formula for the sample autocovariance (9.7),

c_τ = (1 / (T − τ)) Σ_{t=1}^{T−τ} (x_t − x̄)(x_{t+τ} − x̄),

and the sample autocorrelation (9.8),

r_τ = c_τ / s²_x,

with x̄ and s_x being the average and the standard deviation of the time series, respectively. The partial sample autocorrelation function (PAC) can be obtained by linear OLS regression of x_t on x_{t-1}, x_{t-2}, …, x_{t-τ}, corresponding to formula (9.9):

x_t = φ_{0τ} + φ_{1τ} x_{t-1} + … + φ_{ττ} x_{t-τ}

The partial correlation coefficient ρ_part(τ) is the coefficient φ_{ττ} of the highest-order lagged variable x_{t-τ}. Calculating the first 5 regressions in this way, we obtain the following highest-order coefficients:

φ₁₁ = 0.799532;  φ₂₂ = −0.037588;  φ₃₃ = 0.0757;  φ₄₄ = 0.008910;  φ₅₅ = 0.115970

We can compare these manually calculated coefficients with the partial autocorrelation function displayed in the following table and figure, together with the sample autocorrelation function, delivered as a whole in one step by EViews. The differences between the regression results and the EViews output are probably caused by different estimation methods.
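In Python the packaged estimates and the manual regression approach can be put side by side; a minimal sketch (continuing with the series l read above):

```python
# A minimal sketch, continuing from the series l above: packaged ACF/PACF
# versus the "manual" PACF obtained by OLS regression on lagged values.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

print(acf(l, nlags=24)[1], pacf(l, nlags=24, method="ols")[1])  # both near 0.8

# Manual phi_22: regress x_t on x_{t-1} and x_{t-2} with a constant; the
# coefficient of the highest-order lag is the partial autocorrelation at lag 2.
x = np.asarray(l, dtype=float)
T = len(x)
X = np.column_stack([np.ones(T - 2), x[1:T-1], x[0:T-2]])  # const, lag 1, lag 2
beta = np.linalg.lstsq(X, x[2:], rcond=None)[0]
print(beta[-1])  # phi_22
```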

Table 3:

τ    AC      PAC     |  τ    AC      PAC     |  τ    AC      PAC
1    0.794   0.794   |  9    0.039   -0.008  |  17   -0.157  -0.066
2    0.616   -0.04   |  10   0.014   0.049   |  18   -0.154  0.038
3    0.495   0.05    |  11   -0.004  -0.07   |  19   -0.15   0.04
4    0.390   -0.08   |  12   -0.050  -0.09   |  20   -0.096  0.038
5    0.348   0.118   |  13   -0.061  0.069   |  21   -0.098  -0.083
6    0.298   -0.038  |  14   -0.100  -0.08   |  22   -0.14   -0.17
7    0.205   -0.115  |  15   -0.139  -0.04   |  23   -0.150  0.046
8    0.109   -0.083  |  16   -0.137  0.013   |  24   -0.183  -0.134

[Fig. 3: Correlogram – sample autocorrelation (AC, bold) and partial autocorrelation (PAC, thin) against the lag τ = 0, …, 25]

The bold curve shows the sample autocorrelation function; the thin one displays the partial autocorrelation, both depending on the lag τ. While the smooth autocorrelation decreases continuously from 1 towards zero and slightly below, the partial autocorrelation starts at the value 1 as well and equals the autocorrelation at τ = 1, but then drops down to zero and remains there, oscillating between −0.1 and 0.1.

v. Possible type of basic model fitting the time series l.

The basic model could be a first-order autoregressive process, because the AC is exponentially decreasing and the PAC drops off after τ = 1. The shape of the AC is typical of the autocorrelation of an AR process with positive coefficients. The partial autocorrelation drops down to, and remains close to, zero after the lag τ = 1, which indicates an AR(1) process.

vi. Estimated parameters of the model assumed in answer v.

Table 4:
Dependent Variable: l_t;  Method: Least Squares
Sample: 1995M02 2004M12; observations: 119

Variable    Coefficient   Std. Error   t-Statistic   Prob.
CONST       1096.460      301.8879     3.632011      0.0004
l_{t-1}     0.799532      0.055277     14.46398      0.0000

R-squared 0.64133             Mean dependent var 5461.386
Adjusted R-squared 0.638266   S.D. dependent var 146.978
S.E. of regression 88.39903   Akaike info criterion 11.82186
Sum squared resid 914283.5    Schwarz criterion 11.86497

Thus we get the empirical model:

l_t = 1096.5 + 0.7995 l_{t-1} + a_t

Because the t-statistics of both coefficients exceed the 5 % critical value 1.96, both coefficients are significant at least on this level. The Prob. values behind them indicate that they are significant even on the 1 % level.

Another way of estimation is to take advantage of the special ARMA estimation procedure of EViews. Here, first the mean of the variable is estimated and then the AR(1) coefficient of an AR model of the deviations from the mean:

Table 5:
Dependent Variable: l;  Method: Least Squares
Sample: 1995M02 2004M12; observations: 119
Convergence achieved after 3 iterations

Variable    Coefficient   Std. Error   t-Statistic   Prob.
CONST       5469.515      40.5205      134.983       0.0000
AR(1)       0.799532      0.055277     14.46398      0.0000

R-squared 0.64133             Mean dependent var 5461.386
Adjusted R-squared 0.638266   S.D. dependent var 146.978
S.E. of regression 88.39906   Akaike info criterion (σ) 11.82186
Sum squared resid 914284.2    Schwarz criterion (σ) 11.86497

The equivalent model obtained this way can be written:

(l_t − 5469.5) = 0.7995 (l_{t-1} − 5469.5) + a_t
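A minimal sketch of the same fit in Python/statsmodels (continuing with the series l; statsmodels likewise parameterises the constant as the mean of the process):

```python
# A minimal sketch, continuing from l above: AR(1) in mean-deviation form.
from statsmodels.tsa.arima.model import ARIMA

ar1 = ARIMA(l, order=(1, 0, 0), trend="c").fit()
print(ar1.params)        # 'const' is the estimated mean, 'ar.L1' the AR(1) term
print(ar1.aic, ar1.bic)  # kept for later model comparison
```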

vii. As an alternative, an ARMA(2,1) model would fit the data. It can be found by trial and error over several different models, with the aim of obtaining significant coefficients, and compared by the Schwarz criterion. As the model contains an MA term, ordinary least squares is not practicable for estimating the coefficients; therefore the iterative nonlinear least squares procedure of EViews is used again:

Table 6:
Dependent Variable: l;  Method: Least Squares
Sample: 1995M03 2004M12; observations: 118
Convergence achieved after 7 iterations; Backcast: 1994M12

Variable    Coefficient   Std. Error   t-Statistic   Prob.
CONST       5469.200      37.73310     144.9444      0.0000
AR(2)       0.600156      0.093632     6.409735      0.0000
MA(1)       0.856271      0.060133     14.23960      0.0000

R-squared 0.64698             Mean dependent var 5462.352
Adjusted R-squared 0.640787   S.D. dependent var 147.53
S.E. of regression 88.3855    Akaike info criterion (σ) 11.8306
Sum squared resid 895394.8    Schwarz criterion (σ) 11.89350

Again it has to be considered that the constant is the mean and the coefficients belong to a model of deviations from it. Thus the ARMA(2,1) model is to be written:

(l_t − 5469.2) = 0.6002 (l_{t-2} − 5469.2) + 0.8563 a_{t-1} + a_t

In order to have the model in explicit form, the mean is multiplied out:

l_t = 5469.2 + 0.6002 (l_{t-2} − 5469.2) + 0.8563 a_{t-1} + a_t
    = 5469.2 − 0.6002 · 5469.2 + 0.6002 l_{t-2} + 0.8563 a_{t-1} + a_t,

and at last

l_t = 2186.6 + 0.6002 l_{t-2} + 0.8563 a_{t-1} + a_t

Compared by the Akaike and Schwarz criteria, this model does not show an improvement over the AR(1) estimated earlier: it has slightly higher values. EViews provides these criteria on the basis of the error variance, which is to be minimised (in contrast to likelihood-based versions, such as in Microfit, which are to be maximised).
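A minimal sketch of the corresponding restricted fit in Python (statsmodels accepts a list of lags in place of the AR order, so only the lag-2 AR term and the lag-1 MA term are estimated):

```python
# A minimal sketch, continuing from l and ar1 above: ARMA with AR lag {2}
# and MA lag {1} only, then comparison by information criteria.
from statsmodels.tsa.arima.model import ARIMA

arma21 = ARIMA(l, order=([2], 0, 1), trend="c").fit()
print(arma21.summary())
print("AR(1):     AIC", ar1.aic, " BIC", ar1.bic)      # smaller is better
print("ARMA(2,1): AIC", arma21.aic, " BIC", arma21.bic)
```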

viii. Forecast of the number of employees for 2005:

The somewhat unwieldy mean-deviation form of both models,

(l_t − 5469.5) = 0.7995 (l_{t-1} − 5469.5) + a_t and
(l_t − 5469.2) = 0.6002 (l_{t-2} − 5469.2) + 0.8563 a_{t-1} + a_t,

gives us the advantage of easily forecasting the process in the long run, because the stochastic limit for time tending to infinity is given by what we called the means, 5469.5 and 5469.2, respectively:

plim (l_t − 5469.5) = plim (0.7995 (l_{t-1} − 5469.5) + a_t)
                    = 0.7995 plim (l_{t-1} − 5469.5) + plim a_t = 0 + 0 = 0.

That means plim l_t = 5469.5, and we can take this constant as a suitable forecast for the farther future. In the same way we obtain from the second model the forecast 5469.2, which does not differ very much.

The following graph displays the forecast for the years 2004 and 2005. In 2004 we obtain an irregular curve, because the one-month-ahead forecast for every month can be calculated on the basis of the varying values of the previous months. But in 2005 there is a smooth exponential curve, because the forecasts can be calculated only on the basis of the previous forecasts instead of the real data.
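A minimal sketch of the dynamic forecast in Python (continuing from the AR(1) fit ar1 above); the forecasts converge towards the estimated mean, exactly as derived analytically:

```python
# A minimal sketch, continuing from ar1 above: 12 months ahead, i.e. 2005.
fc = ar1.get_forecast(steps=12)
print(fc.predicted_mean)   # approaches the 'const' parameter (the mean)
print(fc.conf_int())       # the interval widens towards a constant band
```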

[Fig. 4: Forecast of the number of employees, 2004M01–2005M12]

ix. Spectral analysis is the name of this method. Basically, the main peaks of the spectral density occur at the circular frequencies

ω₁ = 0.71 and ω₂ = 1.86.

These correspond to the frequencies f₁ = 0.114 and f₂ = 0.295, and to periods of average length p₁ = 8.8 and p₂ = 3.4 months, respectively. But these periodicities are not of any practical importance. They superimpose to the point of being unrecognizable; they are not visible in the original time series and the correlogram. Taking the standard errors of the spectral estimates into consideration, the peaks and troughs of the spectral density curve do not differ significantly. The typical shape of the spectral density function estimated for an AR(1) process is similar to the following:

[Fig. 5: Estimates of the spectral density of an AR(1) process – Bartlett, Tukey and Parzen window estimates plotted against the circular frequency]

In case of an additional seasonal component, the spectral density would show a peak over the frequency f = 1/12, i.e. the circular frequency ω = 2π/12 = 0.52.
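For an AR(1) process with coefficient φ and innovation variance σ², the theoretical spectral density is f(ω) = (σ²/2π) / (1 − 2φ cos ω + φ²), which for φ > 0 falls monotonically from its maximum at ω = 0, as in Fig. 5. A minimal sketch:

```python
# A minimal sketch: theoretical spectral density of an AR(1) process.
import numpy as np

def ar1_spectral_density(omega, phi, sigma2=1.0):
    # f(w) = (sigma^2 / 2 pi) / (1 - 2 phi cos w + phi^2)
    return sigma2 / (2 * np.pi) / (1.0 - 2.0 * phi * np.cos(omega) + phi**2)

omega = np.linspace(0.0, np.pi, 200)
dens = ar1_spectral_density(omega, phi=0.7995)
print(dens[0], dens[-1])   # largest at frequency zero, smallest at pi
```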

3. The file dax_j95_a04.txt contains the daily closing data of the main German stock price index DAX from January 1995 to August 2004, i.e. 2498 values.

i. Display the graph of the time series dax.
ii. Determine the order of integration of dax.
iii. Generate the series of the growth rate, or rate of return, r in the direct way and in the logarithmic way. Display both graphs of r. Compare the r data.
iv. Characterize the general patterns of both graphs of r.
v. Check whether or not r is stationary.
vi. Estimate the correlation function and the partial correlation function up to a lag of 20. Characterize generally the extent of correlation in this series.
vii. Estimate an AR(8) model for r that contains only coefficients significant at least on the five percent level.
viii. Estimate an ARIMA(8,d,8) model for r that contains only coefficients significant at least on the one percent level.
ix. Test the residuals of this model for autoregressive conditional heteroscedasticity by a rough elementary procedure.
x. Make conditional heteroscedasticity visible by estimating a 25-day moving variance of the residuals.
xi. Estimate an ARCH(1) model on the basis of the ARIMA(8,d,8) model estimated in viii. You may change the ARMA part in order to keep only coefficients significant on the one percent level.
xii. Estimate a GARCH(1,1) model on the basis of the simple model r = const.

Answers:

i. The graph of dax.

[Fig. 6: The graph of the German share price index DAX, 2498 daily observations, values roughly between 1000 and 9000]

This graph indicates a non-stationary process, perhaps a random walk. It is characterized by changing stochastic trends and increasing variance.

ii. The order of integration of dax.

For testing whether or not dax is stationary, first the original data are tested by the Dickey-Fuller test of the levels. The following table shows the estimation of the Dickey-Fuller test regression of Δdax_t on dax_{t-1}:

Table 7: Dickey-Fuller Test Equation
Dependent Variable: Δdax_t;  Method: Least Squares
Sample (adjusted): 2 2498; observations: 2497

Variable     Coefficient   Std. Error   t-Statistic   Prob.
dax_{t-1}    -0.001453     0.000919     -1.580936     0.1140
C            6.970114      4.21383      1.654107      0.0982

R-squared 0.001001            Mean dependent var 0.701418
Adjusted R-squared 0.000600   S.D. dependent var 71.28183
S.E. of regression 71.26043   Akaike info criterion (σ) 11.372136
Sum squared resid 12669731    Schwarz criterion (σ) 11.3760

The t-value of the slope coefficient is −1.580936. As the following table demonstrates, it is not less than the 5 % critical t-value for a model with a constant, i.e. −2.862497.

Table 8: Null Hypothesis: DAX has a unit root
Exogenous: Constant
Lag Length: 0 (Automatic based on SIC, MAXLAG=26)

                                         t-Statistic   Prob.*
Augmented Dickey-Fuller test statistic   -1.580936     0.49
Test critical values:   1% level         -3.432775
                        5% level         -2.862497

That means that dax is not stationary at the 5 % level. The next step is to check the first differences of dax in the same way.

The Dickey-Fuller test equation for the first differences is a regression of the second differences Δ²dax_t on the lagged first differences Δdax_{t-1}.

Table 9: Augmented Dickey-Fuller Test Equation
Dependent Variable: D(DAX,2) = Δ²dax_t;  Method: Least Squares
Sample: 3 2498; observations: 2496

Variable                    Coefficient   Std. Error   t-Statistic   Prob.
D(DAX(-1)) = Δdax_{t-1}     -0.996234     0.020026     -49.74669     0.0000
C                           0.70290       1.427408     0.49243       0.6225

R-squared 0.498061            Mean dependent var -0.017540
Adjusted R-squared 0.497860   S.D. dependent var 100.6319
S.E. of regression 71.30959   Akaike info criterion (σ) 11.3774
Sum squared resid 12681233    Schwarz criterion (σ) 11.37741

The t-value −49.74669 obviously exceeds all conceivable critical values in absolute value. Thus the first differences are stationary, and dax itself is integrated of order one, I(1).

iii. The growth rate, or rate of return, r can be generated in the direct way as the ratio

r_t = (dax_t − dax_{t-1}) / dax_{t-1},

which is displayed in the following graph:

[Fig. 7: Graph of the growth rate of DAX (R_RATIO), values roughly between −0.12 and +0.08]

Another way of calculating the rate of return, mostly used in financial market analysis, is the logarithmic way:

r_t = ln dax_t − ln dax_{t-1}

The next figure shows the graph of this logarithmically generated rate of return r.

[Fig. 8: Graph of the logarithmic return rate of DAX (R_LOG)]
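Both variants take one line each in Python; a minimal sketch (assuming pandas and a one-column file named as in the exercise):

```python
# A minimal sketch, assuming a one-column file dax_j95_a04.txt.
import numpy as np
import pandas as pd

dax = pd.read_csv("dax_j95_a04.txt", header=None).squeeze("columns")

r_ratio = dax.pct_change()        # (dax_t - dax_{t-1}) / dax_{t-1}
r_log = np.log(dax).diff()        # ln dax_t - ln dax_{t-1}

# Since ln(1+x) <= x, the ratio always exceeds the logarithmic version.
print((r_ratio - r_log).describe())
```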

iv. Both curves are very similar to each other and can hardly be distinguished by the naked eye. An enlarged display of the differences, as shown in the next figure, demonstrates that the ratio always slightly exceeds the logarithmic approximation. In the following analysis the latter will be used.

[Fig. 9: Graph of the differences between both ways of calculating return rates (DIFF_R), values between 0 and about 0.005]

Both graphs of r show certain common general patterns: there are distinct clusters of high variability, or volatility, separated by quieter periods. This changing behaviour of the variance is typical of ARCH or GARCH processes.

v. Stationarity test for r.

Particularly for the later fitting of an ARCH or GARCH model, the time series should be stationary. The following Dickey-Fuller regression of Δr_t on r_{t-1} produces a negative t-value of −50.377 that lies below all possible critical values: the return rate r proves stationary.

Table 10: Dickey-Fuller Test Equation
Dependent Variable: Δr_t;  Method: Least Squares
Sample (adjusted): 3 2498; observations: 2496

Variable    Coefficient   Std. Error   t-Statistic   Prob.
r_{t-1}     -1.008850     0.020026     -50.37696     0.0000
C           0.000250      0.000320     0.780708      0.4350

R-squared 0.504356            Mean dependent var -3.68E-06
Adjusted R-squared 0.504157   S.D. dependent var 0.022675
S.E. of regression 0.015967   Akaike info criterion (σ) -5.435790
Sum squared resid 0.63589     Schwarz criterion (σ) -5.431125

vi. Sample correlation function and partial correlation function of r

[Fig. 10: Correlogram of r – sample autocorrelation (AC) and partial autocorrelation (PAC) against the lag; both stay between about −0.1 and 0.1 for all lags beyond 0]

Table 11:

τ    AC      PAC     |  τ    AC      PAC     |  τ    AC      PAC
1    -0.009  -0.009  |  13   -0.017  -0.015  |  25   0.005   0.004
2    -0.005  -0.005  |  14   0.071   0.076   |  26   -0.048  -0.048
3    -0.03   -0.03   |  15   0.07    0.08    |  27   -0.046  -0.036
4    0.07    0.06    |  16   -0.04   -0.06   |  28   0.01    0.014
5    -0.041  -0.040  |  17   -0.038  -0.030  |  29   0.056   0.048
6    -0.04   -0.043  |  18   0.001   -0.001  |  30   0.05    0.033
7    -0.006  -0.006  |  19   -0.048  -0.048  |  31   0.00    0.030
8    0.04    0.038   |  20   0.03    0.09    |  32   0.010   0.001
9    -0.005  -0.005  |  21   0.05    0.031   |  33   -0.001  -0.001
10   -0.007  -0.007  |  22   0.005   -0.008  |  34   -0.014  -0.006
11   0.01    0.011   |  23   -0.00   -0.004  |  35   -0.01   -0.0
12   0.014   0.010   |  24   0.033   0.033   |  36   0.019   0.017

Let us try a general characterisation of the extent of correlation in this series: the sample correlation and partial correlation functions do not differ very much; the graphs of both of them coincide. There is no significant correlation value at all, so these could be the functions of a white noise. Anyway, because of the little peaks at the lags 5, 6 and 8, it could be sensible to try to fit an AR(8) model.

vii. Example: AR(8) model with coefficients significant on the five percent level.

Several trials with AR coefficients up to the order 8 finally resulted in three significant coefficients and no constant. The criterion for rejecting the other variants was non-significance of coefficients.

Table 12:
Dependent Variable: r;  Method: Least Squares
Sample (adjusted): 10 2498; observations: 2489

Variable    Coefficient   Std. Error   t-Statistic   Prob.
r_{t-5}     -0.039842     0.020023     -1.989799     0.0467
r_{t-6}     -0.041799     0.020013     -2.088571     0.0368
r_{t-8}     0.039960      0.020028     1.995155      0.0461

Mean dependent var 0.000244   S.D. dependent var 0.01598
S.E. of regression 0.015951   Akaike info criterion -5.437444
Sum squared resid 0.632487    Schwarz criterion -5.43049

Thus we obtain by OLS regression the AR(8) model

r_t = −0.039842 r_{t-5} − 0.041799 r_{t-6} + 0.039960 r_{t-8} + e_t,

where e_t is the error term of the estimated process. We get exactly the same results if we consider this AR model as a special case of an ARMA model estimated by iterative nonlinear least squares:

Table 13:
Dependent Variable: r;  Method: Least Squares
Sample (adjusted): 10 2498; observations: 2489
Convergence achieved after 3 iterations

Variable    Coefficient   Std. Error   t-Statistic   Prob.
AR(5)       -0.039842     0.020023     -1.989799     0.0467
AR(6)       -0.041799     0.020013     -2.088571     0.0368
AR(8)       0.039960      0.020028     1.995155      0.0461

Mean dependent var 0.000244   S.D. dependent var 0.01598
S.E. of regression 0.015951   Akaike info criterion (σ) -5.437444
Sum squared resid 0.632487    Schwarz criterion (σ) -5.43049

viii. Example: ARIMA(8,d,8) model for r that contains only coefficients significant at least on the one percent level.

Now we try to improve the model by including moving average terms. In this case we must use the iterative nonlinear least squares method, and the aim now is to obtain coefficients significant on the 1 % level. Again, after a series of trials with varying ARMA(8,8) models, we finally got the following results by omitting non-significant coefficients:
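In Python, an AR model restricted to selected lags can be fitted directly; a minimal sketch (continuing with the logarithmic return series from above):

```python
# A minimal sketch, continuing from r_log above: AR with lags 5, 6, 8 only
# and no constant, mirroring the specification found by trial and error.
from statsmodels.tsa.ar_model import AutoReg

r = r_log.dropna()
ar8 = AutoReg(r, lags=[5, 6, 8], trend="n").fit()
print(ar8.params)        # coefficients of r_{t-5}, r_{t-6}, r_{t-8}
print(ar8.aic, ar8.bic)
```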

Table 14:
Dependent Variable: r;  Method: Least Squares
Sample (adjusted): 10 2498; observations: 2489
Convergence achieved after 15 iterations

Variable    Coefficient   Std. Error   t-Statistic   Prob.
AR(3)       -0.339258     0.025338     -13.38956     0.0000
AR(8)       -0.613633     0.025849     -23.73905     0.0000
MA(3)       0.296335      0.027198     10.89544      0.0000
MA(6)       -0.075689     0.013945     -5.427549     0.0000
MA(8)       0.654681      0.020088     32.59034      0.0000

Mean dependent var 0.000244   S.D. dependent var 0.01598
S.E. of regression 0.015874   Akaike info criterion (σ) -5.4466
Sum squared resid 0.625950    Schwarz criterion (σ) -5.434536

The model resulting from this estimation is the following:

r_t = −0.339 r_{t-3} − 0.614 r_{t-8} + e_t + 0.296 e_{t-3} − 0.076 e_{t-6} + 0.655 e_{t-8},

where e_t is the error term of the estimated ARMA process.

Now the goodness of fit of both models should be compared. Besides considering the significance levels of the coefficients in both models, a powerful means of comparison are the Akaike and Schwarz criteria, which are to be minimised. Here both criteria are very close to each other, but the slightly smaller (negative!) values of both criteria in the second case give a certain preference to the ARMA model. Because we know from v. that r is stationary, i.e. integrated of order 0, this model is at the same time an ARIMA(8,0,8) for r.
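A minimal sketch of the corresponding restricted ARMA fit in Python (lists of lags work for the MA part as well):

```python
# A minimal sketch, continuing from r above: ARMA with AR lags {3, 8} and
# MA lags {3, 6, 8} only, no constant.
from statsmodels.tsa.arima.model import ARIMA

arma88 = ARIMA(r, order=([3, 8], 0, [3, 6, 8]), trend="n").fit()
print(arma88.summary())
print("AIC:", arma88.aic, " BIC:", arma88.bic)  # compare with the AR(8) fit
```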

ix. Rough and preliminary test of the residuals for autoregressive conditional heteroscedasticity by OLS regression.

A rough check for first-order autoregressive conditional heteroscedasticity (ARCH(1)) is the estimation of a regression of the squared residuals e²_t on e²_{t-1}:

Table 15:
Dependent Variable: e²_t;  Method: Least Squares
Sample (adjusted): 11 2498; observations: 2488

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.000208      1.18E-05     17.5938       0.0000
e²_{t-1}    0.175073      0.019746     8.86635       0.0000

R-squared 0.03065             Mean dependent var 0.000252
Adjusted R-squared 0.03063    S.D. dependent var 0.000543
S.E. of regression 0.000535   Akaike info criterion (σ) -12.2961
Sum squared resid 0.000710    Schwarz criterion (σ) -12.2493

For testing this dependence we cannot simply use the usual t-test, because the estimated t-values do not follow an exact Student distribution. Anyway, because here the t-values immensely exceed the 5-percent and 1-percent critical values, we can assume with great practical confidence that there exists a highly significant relationship between e²_t and e²_{t-1}, i.e. strong conditional heteroscedasticity.

x. Visualisation of conditional heteroscedasticity by a moving 25-day residual variance.

The existence of conditional heteroscedasticity can be visualised by smoothing the series of squared residuals, that means by moving averages of the e²_t, or moving variances.
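The same rough check, plus the packaged ARCH-LM test, in Python (continuing from the ARMA fit arma88 above):

```python
# A minimal sketch, continuing from arma88 above: regress squared residuals
# on their first lag, and cross-check with statsmodels' ARCH-LM test.
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_arch

e2 = arma88.resid ** 2
ols = sm.OLS(e2[1:], sm.add_constant(e2.shift(1)[1:])).fit()
print(ols.tvalues)   # a large t-value on the lagged square indicates ARCH

lm_stat, lm_pvalue, f_stat, f_pvalue = het_arch(arma88.resid, nlags=1)
print(lm_stat, lm_pvalue)
```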

[Fig. 11: 25-day moving residual variance, values between 0 and about 0.004]

In the figure you can see typical clusters of higher variance, i.e. volatility, alternating with intervals of lower variance. The similarity of neighbouring variances is one more indicator of conditional heteroscedasticity: on the basis of knowing the variance at time t (i.e. conditionally), you can forecast the variance at time t+1.

xi. ARCH(1) model on the basis of the AR(8) model.

Because of the latter analytical results it would be worthwhile to estimate an ARCH(1) model. There the conditional variance of the error e_t is

h_t = var(e_t | e_{t-1}) = E(e²_t | e_{t-1}) = λ₀ + λ₁ e²_{t-1},

where practically h_t is estimated by e²_t. In the software used, this is a special case of the more general GARCH model; an ARCH(1) corresponds there to a GARCH(0,1) model. We assume here that the time series follows an AR(8) process as estimated earlier without ARCH:
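A minimal sketch of the visualisation in Python (continuing from arma88; plotting assumes matplotlib is installed):

```python
# A minimal sketch, continuing from arma88 above: a 25-day moving variance
# of the residuals makes the volatility clusters visible.
rolling_var = arma88.resid.rolling(window=25).var()
rolling_var.plot(title="25-day moving residual variance")
```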

Table 16:
Dependent Variable: r;  Method: ML ARCH
Sample (adjusted): 10 2498; included observations: 2489 after adjustments
GARCH = C(4) + C(5)*RESID(-1)^2

Variable    Coefficient   Std. Error   z-Statistic   Prob.
AR(5)       -0.04570      0.013304     -3.43516      0.0006
AR(6)       -0.046538     0.014090     -3.303005     0.0010
AR(8)       0.044716      0.013813     3.237367      0.0012

Variance Equation
C             0.000194    4.67E-06     41.55146      0.0000
RESID(-1)^2   0.243443    0.026156     9.307385      0.0000

R-squared 0.004705            Mean dependent var 0.000244
Adjusted R-squared 0.00310    S.D. dependent var 0.01598
S.E. of regression 0.015958   Akaike info criterion (σ) -5.492354
Sum squared resid 0.632539    Schwarz criterion (σ) -5.481833

The coefficients differ from the earlier estimated ones because here they are estimated simultaneously with the ARCH term. The newly estimated model is

r_t = −0.0457 r_{t-5} − 0.0465 r_{t-6} + 0.0447 r_{t-8} + e_t,

with the error process

e²_t = 0.000194 + 0.2434 e²_{t-1} + a_t,

where a_t should be a pure random series. The AR representation of the squared error process gives the opportunity to forecast the volatility.
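A minimal sketch of a jointly estimated AR mean equation with ARCH(1) errors in Python, using the arch package (continuing from r; the package recommends per-cent scaling for returns this small):

```python
# A minimal sketch, continuing from r above and assuming the `arch` package.
from arch import arch_model

am = arch_model(100 * r, mean="AR", lags=[5, 6, 8], vol="ARCH", p=1)
res = am.fit(disp="off")
print(res.summary())   # mean coefficients and the ARCH(1) variance equation
```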

xii. GARCH(1,1) model on the basis of the simple model r = const.

Here the return rate should be modelled as a real GARCH process. The generalized autoregressive conditional heteroscedasticity model GARCH(p,q) describes a process where the conditional error variance given all information Ω_{t-1} available at time t−1,

h_t = var(u_t | Ω_{t-1}),

is assumed to obey an ARMA(p,q)-type equation

h_t = α₀ + α₁ h_{t-1} + … + α_p h_{t-p} + β₁ u²_{t-1} + … + β_q u²_{t-q},

with u_t being the error process.

Table 17:
Dependent Variable: r;  Method: ML - ARCH (Marquardt) - Normal distribution
Sample (adjusted): 10 2498; observations: 2489
Convergence achieved after 12 iterations
GARCH = C(5) + C(6)*RESID(-1)^2 + C(7)*GARCH(-1)

Variable    Coefficient   Std. Error   z-Statistic   Prob.
C           0.000710      0.000246     2.8834        0.0039
r_{t-5}     -0.029571     0.020063     -1.473904     0.1405
r_{t-6}     -0.036495     0.020360     -1.792478     0.0731
r_{t-8}     0.012575      0.020570     0.611309      0.5410

Variance Equation
C             2.01E-06    4.03E-07     4.980196      0.0000
RESID(-1)^2   0.082160    0.009593     8.56498       0.0000
GARCH(-1)     0.911339    0.009476     96.1717       0.0000

R-squared 0.003340            Mean dependent var 0.000244
Adjusted R-squared 0.000930   S.D. dependent var 0.01598
S.E. of regression 0.015975   Akaike info criterion (σ) -5.75055
Sum squared resid 0.633406    Schwarz criterion (σ) -5.735688

While in the original AR(8) model the constant could be omitted, here the constant is the only significant term in the regression part of the model. Therefore all lagged r terms can now be omitted:

Table 18:
Dependent Variable: r;  Method: ML ARCH
Sample (adjusted): 2 2498; observations: 2497
Convergence achieved after 12 iterations
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1)

Variable    Coefficient   Std. Error   z-Statistic   Prob.
C           0.00066       0.000240     2.753134      0.0059

Variance Equation
C             2.08E-06    4.06E-07     5.121631      0.0000
RESID(-1)^2   0.08319     0.00960      8.664398      0.0000
GARCH(-1)     0.910103    0.009445     96.35969      0.0000

R-squared -0.000681            Mean dependent var 0.000245
Adjusted R-squared -0.001886   S.D. dependent var 0.015961
S.E. of regression 0.015977    Akaike info criterion (σ) -5.75684
Sum squared resid 0.636337     Schwarz criterion (σ) -5.747496

The model proves very simple: the DAX return equals the constant 0.00066 (i.e. 0.066 % per day on average), with a sort of ARMA(1,1) conditional variance that describes the development of volatility or risk:

h_t = 2.08·10⁻⁶ + 0.9101 h_{t-1} + 0.08319 e²_{t-1},

with h_t being the conditional variance of r_t on the basis of the information up to time t−1, and e_t being the deviation of r from its mean in this model.

In the sense of the Akaike and Schwarz criteria, this model fits better than all considered before: the values of these criteria are the lowest, even though the model is extremely simple. The model shows that the conditional variance, as a measure of volatility and investment risk, is highly determined by the variance of the last day, i.e. rather by the more theoretical conditional variance h_{t-1} than by the directly measurable deviation e_{t-1}.
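A minimal sketch of the constant-mean GARCH(1,1) in Python with the arch package (continuing from r, again in per-cent scale):

```python
# A minimal sketch, continuing from r above: constant mean + GARCH(1,1).
from arch import arch_model

garch = arch_model(100 * r, mean="Constant", vol="GARCH", p=1, q=1)
res = garch.fit(disp="off")
print(res.params)              # mu, omega, alpha[1], beta[1]

fc = res.forecast(horizon=5)   # conditional variance forecast = volatility outlook
print(fc.variance.iloc[-1])
```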