Time Series Analysis using R: In a Nutshell
dr. J.J.M. Rijpkema
Eindhoven University of Technology, dep. Mathematics & Computer Science
P.O.Box 513, 5600 MB Eindhoven, NL
2012
j.j.m.rijpkema@tue.nl

Overview
Introduction & Preliminaries
Exploratory Data Analysis
  Time Sequence Plot
  (Partial) Autocorrelation Function
Exponential Smoothing Methods
  Simple Exponential Smoothing
  Holt and Holt-Winters Models
ARMA Models
  The Bare Essentials
  Model Fitting
More Time Series Modeling

2IS55 TSA with R
Introduction & Preliminaries

Stochastic process: $X_1, X_2, \ldots, X_t, \ldots$
Individual stochasts (random variables) that might be dependent.

Modeling cycle:
Data Collection
Model Fit: Model Identification, Parameter Estimation, Model Validation
Model Use

[Figure: realization of a stochastic process, laborforce data (x 10^4) against year, 1940-2010]

Relevant models? 2IS55 restriction. Exploratory Data Analysis?

References:
http://a-little-book-of-r-for-time-series.readthedocs.org/
Coghlan, Avril, A Little Book of R for Time Series
Cowpertwait, Paul S.P. et al., Introductory Time Series with R, chapters 1 & 2
Software: R, http://www.r-project.org/
Download and install R from a CRAN mirror: http://cran-mirror.cs.uu.nl/
Start the R console.
Alternative: RStudio, http://rstudio.org/
Install package forecast (& tseries) from the CRAN mirror Netherlands (Utrecht).
Load package forecast (& tseries):
> library(forecast)
Exploratory Data Analysis

Example: Age of Death of 42 Successive Kings of England
> load("d:/.../kings.rdata")

General properties:
> class(kings)
[1] "numeric"
> kings
 [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43
[23] 69 59 48 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

Conversion to time series object:
> kings.ts <- ts(kings)
> kings.ts
Time Series:
Start = 1
End = 42
Frequency = 1
 [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43
[23] 69 59 48 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

Time Sequence Plot:
> plot(kings.ts)
[Figure: time sequence plot of kings.ts against Time]

Main properties to look for:
Trend? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?
Autocorrelation Function: are successive observations related?

Scatterplot and sample correlation coefficient of the pairs $\{(x_t, x_{t-1})\}$:
> lag.plot(kings.ts, do.lines=FALSE)
[Figure: lag plot of kings.ts at lag 1]

Autocorrelation coefficient at time-lag 1 (approximation!):
$r_1 \approx \frac{\sum_{t=1}^{N-1} (x_t - \bar{x})(x_{t+1} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2}$
For the kings data: r = 0.4.

Generalisation to time-lag k:
$r_k = \frac{\sum_{t=1}^{N-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2}$
> lag.plot(kings.ts, lags=12, do.lines=FALSE)
[Figure: lag plots of kings.ts for lag 1 to lag 12]
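The lag-k autocorrelation coefficient can be checked by hand. The sketch below is only an illustration: `sample_acf` is a hypothetical helper (not part of any package), and the data vector is typed in from the kings listing above. It computes $r_k$ from the sum formula and prints it next to the value reported by `acf()` from base R.

```r
# Ages at death of the 42 kings, typed in from the listing on the slides
kings <- c(60, 43, 67, 50, 56, 42, 50, 65, 68, 43, 65, 34, 47, 34, 49, 41,
           13, 35, 53, 56, 16, 43, 69, 59, 48, 59, 86, 55, 68, 51, 33, 49,
           67, 77, 81, 67, 71, 81, 68, 70, 77, 56)

# Hypothetical helper: sample autocorrelation at lag k,
# using the overall mean and the overall sum of squares
sample_acf <- function(x, k) {
  n <- length(x)
  xbar <- mean(x)
  num <- sum((x[1:(n - k)] - xbar) * (x[(1 + k):n] - xbar))
  den <- sum((x - xbar)^2)
  num / den
}

r_hand <- sapply(1:5, function(k) sample_acf(kings, k))
r_acf  <- acf(kings, lag.max = 5, plot = FALSE)$acf[2:6]  # drop lag 0

print(cbind(lag = 1:5, r_hand, r_acf))
```

Both columns should agree, since `acf()` uses the same overall mean and the same denominator for every lag.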
Autocorrelation Function:
> acf(kings.ts)$acf
              [,1]
 [1,]  1.000000000
 [2,]  0.400584149
 [3,]  0.238081338
 [4,]  0.259547258
 [5,]  0.347780525
 [6,]  0.160923687
 [7,]  0.031201729
 [8,]  0.115411950
 [9,]  0.078385052
[10,] -0.036221926
[11,] -0.001138564
[12,]  0.110712654
[13,] -0.010138296
[14,] -0.116699180
[15,]  0.034735266
[16,]  0.025070163
[17,] -0.119576411
> acf(kings.ts)
[Figure: ACF of kings.ts, lags 0 to 15]

Partial Autocorrelation:
Direct pathway: $x_{t-k} \rightarrow x_t$
Indirect pathway: $x_{t-k} \rightarrow x_{t-k+1} \rightarrow x_{t-k+2} \rightarrow \cdots \rightarrow x_{t-2} \rightarrow x_{t-1} \rightarrow x_t$

Autocorrelation Function:
Measure for the overall correlation between $\{x_{t-k}\}$ and $\{x_t\}$, both through the direct and the indirect pathway!

Partial Autocorrelation Function:
Measure for the direct correlation between $\{x_{t-k}\}$ and $\{x_t\}$, with the indirect pathway corrected for!
(Partial) Autocorrelation Function:
> pacf(kings.ts)
[Figure: Partial ACF of kings.ts]
> acf(kings.ts)
[Figure: ACF of kings.ts]

EDA-Summary:
> tsdisplay(kings.ts)
[Figure: time sequence plot of kings.ts with ACF and PACF panels]
Example: Births per Month in New York City
> load("d:/.../births.rdata")
> births.ts <- ts(births, start=c(1946,1), frequency=12)
> plot(births.ts)
[Figure: time sequence plot of births.ts, 1946-1959]

Exploratory Data Analysis, main properties:
Trend? Seasonal variation? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(births.ts)
[Figure: time sequence plot of births.ts with ACF and PACF panels]
Example: Monthly Sales for Australian Souvenir Shop
> load("d:/.../souvenir.rdata")
> souvenir.ts <- ts(souvenir, start=c(1987,1), frequency=12)
> plot(souvenir.ts)
[Figure: time sequence plot of souvenir.ts, 1987-1993]

Exploratory Data Analysis, main properties:
Trend? Seasonal variation? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(souvenir.ts)
[Figure: time sequence plot of souvenir.ts with ACF and PACF panels]
Time Sequence Plot after Log-Transformation:
> souvenir.log <- log(souvenir.ts)
> plot(souvenir.log)
[Figure: time sequence plot of souvenir.log, 1987-1993]

Exploratory Data Analysis, main properties:
Trend? Seasonal variation? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(souvenir.log)
[Figure: time sequence plot of souvenir.log with ACF and PACF panels]
Example: Annual Rainfall in London
> load("d:/.../rain.rdata")
> rain.ts <- ts(rain, start=1813)
> plot(rain.ts)
[Figure: time sequence plot of rain.ts, 1813-1912]

Exploratory Data Analysis, main properties:
Trend? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(rain.ts)
[Figure: time sequence plot of rain.ts with ACF and PACF panels]
Example: Annual Diameter of Women's Skirts
> load("d:/.../skirts.rdata")
> skirts.ts <- ts(skirts, start=1866)
> plot(skirts.ts)
[Figure: time sequence plot of skirts.ts, 1866-1911]

Exploratory Data Analysis, main properties:
Trend? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(skirts.ts)
[Figure: time sequence plot of skirts.ts with ACF and PACF panels]
Example: Volcanic Dust Veil Index
> load("d:/.../volcano.rdata")
> volcano.ts <- ts(volcano, start=1500)
> plot(volcano.ts)
[Figure: time sequence plot of volcano.ts, 1500-1969]

Exploratory Data Analysis, main properties:
Trend? Cyclic variation? Irregular variation? Sudden changes? Outliers? Missing values?

EDA-Summary:
> tsdisplay(volcano.ts)
[Figure: time sequence plot of volcano.ts with ACF and PACF panels]
Types of Variation:

Trend
  Long term changes in mean
  Can be estimated & modeled
  Can be corrected for
Seasonal Variation
  Periodic variations over time
  Can be estimated & modeled
  Can be corrected for
Stationary Time Series
  No systematic change in mean and variance
  No strictly periodic variations
Other Cyclic Variation
  Business cycle (~7 year)
Other Irregular Fluctuations
  Random or with structure?

Simple Exponential Smoothing
Series with no trend and no seasonality

Example: Annual Rainfall in London
[Figure: time sequence plot of rain.ts, 1813-1912]
Stationary series: no trend (??), no seasonality

Basic idea: one-step ahead prediction as a weighted sum of past observations
$\hat{x}_{T+1} = c_0 x_T + c_1 x_{T-1} + c_2 x_{T-2} + \cdots$
Geometric weights: $c_i = \alpha (1-\alpha)^i$

Remark: mainly used for short-term forecasts!
Interpretation:
$\hat{x}_{T+1} = \alpha x_T + \alpha(1-\alpha) x_{T-1} + \alpha(1-\alpha)^2 x_{T-2} + \cdots = \alpha x_T + (1-\alpha)\hat{x}_T$
Weighted average of past and present:
$\alpha = 0$: past only; $\alpha = 1$: present only.

Overview: Simple Exponential Smoothing
Forecast: $\hat{x}_{t+1} = \hat{L}_t$
Level update: $\hat{L}_t = \alpha x_t + (1-\alpha)\hat{L}_{t-1}$

Remarks:
Simple Exponential Smoothing allows for updates of level estimates.
An adequate initial value for the level estimate is needed: $\hat{L}_1 = x_1$

Example: fixed parameter $\alpha = 0.2$
> rain.ses1 <- HoltWinters(rain.ts, alpha=0.2, beta=FALSE, gamma=FALSE)

Available information:
> names(rain.ses1)
[1] "fitted"       "x"            "alpha"        "beta"         "gamma"
[6] "coefficients" "seasonal"     "SSE"          "call"

Smoothing parameters:
> rain.ses1
Holt-Winters exponential smoothing without trend and without seasonal component.
Smoothing parameters:
 alpha: 0.2
 beta : FALSE
 gamma: FALSE

Final component estimate:
Coefficients:
      [,1]
a 25.30941
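The level recursion of simple exponential smoothing is short enough to run by hand. The sketch below makes some assumptions: the series `x` is synthetic (a stand-in for a series without trend, since the rain data file is not included here), and `ses_levels` is a hypothetical helper. It starts the level at the first observation and compares the resulting one-step forecasts with those of `HoltWinters()` from base R.

```r
set.seed(1)
x <- ts(25 + rnorm(60, sd = 4))   # synthetic series without trend (assumption)

# Hypothetical helper: running level estimates of simple exponential smoothing
ses_levels <- function(x, alpha) {
  x <- as.numeric(x)
  level <- numeric(length(x))
  level[1] <- x[1]                                  # initial level: first observation
  for (t in 2:length(x)) {
    level[t] <- alpha * x[t] + (1 - alpha) * level[t - 1]
  }
  level
}

alpha <- 0.2
level <- ses_levels(x, alpha)
xhat_hand <- level[-length(level)]                  # forecast for t+1 is the level at t

fit <- HoltWinters(x, alpha = alpha, beta = FALSE, gamma = FALSE)
xhat_R <- as.numeric(fit$fitted[, "xhat"])

print(head(cbind(xhat_hand, xhat_R)))
print(c(final_level_hand = level[length(level)],
        final_level_R = as.numeric(fit$coefficients["a"])))
```

The last level computed by hand plays the role of the coefficient `a` reported by `HoltWinters()`, which is also the point forecast for every future period.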
Running component estimates: $\hat{x}_{t+1} = \hat{L}_t$, with $\alpha = 0.2$
> rain.ses1$fitted
         xhat    level
1814 23.56000 23.56000
1815 24.06200 24.06200
1816 23.62160 23.62160
1817 25.14528 25.14528
1818 24.84622 24.84622
1819 24.65298 24.65298
1820 25.00438 25.00438
1821 24.53751 24.53751
1822 25.96801 25.96801
1823 25.54640 25.54640
...
> plot(rain.ses1$fitted)
[Figure: panels xhat and level against time, 1814-1912]

In-sample prediction vs. realization:
> plot(rain.ses1)
[Figure: observed and fitted values, 1813-1912]

Running SSE:
$\mathrm{SSE} = \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$
> rain.ses1$SSE
[1] 1972.197

Remark: the in-sample comparison uses the same data for fitting and validation. An independent hold-out sample for validation would be preferred!
Goodness-of-Fit Measures:

Bias (the closer to 0, the better):
Mean Error: $\mathrm{ME} = \frac{1}{n}\sum_{t=1}^{n} (x_t - \hat{x}_t)$
Mean Percentage Error: $\mathrm{MPE} = \frac{100}{n}\sum_{t=1}^{n} \frac{x_t - \hat{x}_t}{x_t}$

Variability:
Mean Absolute Error: $\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |x_t - \hat{x}_t|$
Mean Absolute Percentage Error: $\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{x_t - \hat{x}_t}{x_t}\right|$
Root Mean Squared Error: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (x_t - \hat{x}_t)^2}$
Mean Absolute Scaled Error: $\mathrm{MASE} = \frac{1}{n}\sum_{t=1}^{n} \frac{|x_t - \hat{x}_t|}{q}$, with $q$ a scaling constant

In-sample accuracy:
> accuracy(rain.ses1)
        ME       RMSE        MAE        MPE       MAPE       MASE
0.08835385 4.46331492 3.41468131 -2.41358228 13.79285103 0.70334023

Holt-Winters Forecasts:
> rain.ses1.fore <- forecast(rain.ses1, h=10)
> rain.ses1.fore
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
1913       25.30941 19.56146 31.05736 16.51867 34.10014
1914       25.30941 19.44762 31.17119 16.34458 34.27423
1915       25.30941 19.33596 31.28285 16.17381 34.44500
1916       25.30941 19.22635 31.39247 16.00617 34.61264
1917       25.30941 19.11867 31.50014 15.84150 34.77732
1918       25.30941 19.01284 31.60597 15.67964 34.93917
1919       25.30941 18.90876 31.71005 15.52046 35.09835
1920       25.30941 18.80634 31.81247 15.36383 35.25498
1921       25.30941 18.70551 31.91330 15.20962 35.40919
1922       25.30941 18.60620 32.01261 15.05774 35.56107
> plot(rain.ses1.fore)
[Figure: Forecasts from HoltWinters, 1813-1922, with 80% and 95% prediction intervals]
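The bias and variability measures are simple averages of the forecast errors, so they are easy to reproduce. The sketch below uses a hypothetical helper `fit_measures` and three made-up observations and forecasts, purely for illustration; MASE is left out because it needs a scaling constant.

```r
# Hypothetical helper: goodness-of-fit measures from observations and forecasts
fit_measures <- function(x, xhat) {
  e <- x - xhat                        # forecast errors
  c(ME   = mean(e),                    # bias
    RMSE = sqrt(mean(e^2)),            # variability, penalises large errors
    MAE  = mean(abs(e)),               # variability
    MPE  = 100 * mean(e / x),          # bias, in percent
    MAPE = 100 * mean(abs(e / x)))     # variability, in percent
}

x    <- c(10, 20, 40)                  # made-up observations
xhat <- c(8, 22, 40)                   # made-up one-step forecasts
m <- fit_measures(x, xhat)
print(round(m, 3))
```

Note that the percentage measures divide by the observations, so they break down when a series contains zeros.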
Diagnostics: In-Sample Forecast Errors

Constant over time?
> tsdisplay(rain.ses1.fore$residuals)
[Figure: residuals against time, 1813-1912, with ACF and PACF panels]

Non-zero autocorrelations? Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
> Box.test(rain.ses1.fore$residuals, lag=20, type="Ljung-Box")
        Box-Ljung test
data:  rain.ses1.fore$residuals
X-squared = 22.5621, df = 20, p-value = 0.3108

Example: optimal $\alpha$ estimated
Principle: the optimal value for the parameter $\alpha$ is determined from running 1-step ahead prediction:
$\min_{\alpha} \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$ (or related, e.g. AIC or BIC)
> rain.ses2 <- HoltWinters(rain.ts, beta=FALSE, gamma=FALSE)

Smoothing parameters:
> rain.ses2
Smoothing parameters:
 alpha: 0.02412151
 beta : FALSE
 gamma: FALSE
$\hat{x}_{T+1} = \alpha x_T + (1-\alpha)\hat{x}_T$: more emphasis on the past!

Final component estimate:
Coefficients:
      [,1]
a 24.67819
Running component estimates: $\hat{x}_{t+1} = \hat{L}_t$, with optimal $\alpha$
> rain.ses2$fitted
         xhat    level
1814 23.56000 23.56000
1815 23.62054 23.62054
1816 23.57808 23.57808
1817 23.76290 23.76290
1818 23.76017 23.76017
1819 23.76306 23.76306
1820 23.82691 23.82691
1821 23.79900 23.79900
1822 23.98935 23.98935
1823 23.98623 23.98623
...
> plot(rain.ses2$fitted)
[Figure: panels xhat and level against time, 1814-1912]

In-sample prediction vs. realization:
> plot(rain.ses2)
[Figure: observed and fitted values, 1813-1912]

Running SSE:
$\mathrm{SSE} = \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$
> rain.ses2$SSE
[1] 1828.855

In-sample accuracy:
> accuracy(rain.ses2)
       ME      RMSE       MAE        MPE       MAPE      MASE
0.4682496 4.2980556 3.3501092 -0.8688727 13.4015953 0.6900400
Holt-Winters Forecasts:
> rain.ses2.fore <- forecast(rain.ses2, h=10)
> rain.ses2.fore
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
1913       24.67819 19.17493 30.18145 16.26169 33.09470
1914       24.67819 19.17333 30.18305 16.25924 33.09715
1915       24.67819 19.17173 30.18465 16.25679 33.09960
1916       24.67819 19.17013 30.18625 16.25434 33.10204
1917       24.67819 19.16853 30.18785 16.25190 33.10449
1918       24.67819 19.16694 30.18945 16.24945 33.10694
1919       24.67819 19.16534 30.19105 16.24701 33.10938
1920       24.67819 19.16374 30.19265 16.24456 33.11182
1921       24.67819 19.16214 30.19425 16.24212 33.11427
1922       24.67819 19.16054 30.19584 16.23968 33.11671
> plot(rain.ses2.fore)
[Figure: Forecasts from HoltWinters, 1813-1922, with 80% and 95% prediction intervals]

Diagnostics: In-Sample Forecast Errors

Constant over time?
> tsdisplay(rain.ses2.fore$residuals)
[Figure: residuals against time, 1813-1912, with ACF and PACF panels]

Non-zero autocorrelations? Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
> Box.test(rain.ses2.fore$residuals, lag=20, type="Ljung-Box")
        Box-Ljung test
data:  rain.ses2.fore$residuals
X-squared = 17.4008, df = 20, p-value = 0.6268
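The Ljung-Box statistic can be reproduced from the sample autocorrelations. The sketch below assumes synthetic white-noise "residuals" `e` (not the rain residuals, which are not available here) and no fitted parameters, so the reference distribution has k degrees of freedom. It compares the hand computation with `Box.test()` from base R.

```r
set.seed(42)
e <- rnorm(100)                                  # synthetic residuals (assumption)
k <- 20                                          # number of lags tested
N <- length(e)

r <- acf(e, lag.max = k, plot = FALSE)$acf[-1]   # r_1, ..., r_k (lag 0 dropped)
Q_hand <- N * (N + 2) * sum(r^2 / (N - 1:k))     # Ljung-Box statistic
p_hand <- pchisq(Q_hand, df = k, lower.tail = FALSE)

bt <- Box.test(e, lag = k, type = "Ljung-Box")
print(c(Q_hand = Q_hand, Q_R = unname(bt$statistic)))
print(c(p_hand = p_hand, p_R = bt$p.value))
```

A large p-value means there is no evidence of remaining autocorrelation in the residuals, as in the rainfall example.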
Holt's Exponential Smoothing
Series with trend but no seasonality

Example: Annual Diameter of Women's Skirts
[Figure: time sequence plot of skirts.ts, 1866-1911]
No seasonality (annual data!)

Principle: Holt's Exponential Smoothing
Forecast: $\hat{x}_{t+1} = \hat{L}_t + \hat{T}_t$
Level update: $\hat{L}_t = \alpha x_t + (1-\alpha)(\hat{L}_{t-1} + \hat{T}_{t-1})$
Trend update: $\hat{T}_t = \beta(\hat{L}_t - \hat{L}_{t-1}) + (1-\beta)\hat{T}_{t-1}$

Remarks:
Holt's ES allows for updates of level and trend estimates.
Two-parameter version of Exponential Smoothing.
Special: Brown's Exponential Smoothing, where both parameters are equal: $\alpha = \beta$.
Similar to ARIMA(0,2,1) model (to be discussed).

Practical:
Adequate initial values for the level and trend estimates are needed:
$\hat{L}_1 = x_1$, $\hat{T}_1 = x_2 - x_1$
Optimal values for the parameters $\alpha$ and $\beta$ are determined from running 1-step ahead prediction:
$\min_{\alpha,\beta} \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$ (or related, e.g. AIC or BIC)
Example: optimal $\alpha$ and $\beta$ estimated
> skirts.hes <- HoltWinters(x=skirts.ts, gamma=FALSE)

Smoothing parameters:
> skirts.hes
Smoothing parameters:
 alpha: 0.8383481
 beta : 1
 gamma: FALSE
$\hat{L}_t = \alpha x_t + (1-\alpha)(\hat{L}_{t-1} + \hat{T}_{t-1})$
$\hat{T}_t = \beta(\hat{L}_t - \hat{L}_{t-1}) + (1-\beta)\hat{T}_{t-1}$
Main emphasis on recent values!

Final component estimates:
Coefficients:
        [,1]
a 529.308585
b   5.690464

Running component estimates: $\hat{x}_{t+1} = \hat{L}_t + \hat{T}_t$, with optimal $\alpha$ and $\beta$
> skirts.hes$fitted
         xhat    level      trend
1868 626.0000 617.0000  9.0000000
1869 633.3233 625.1617  8.1616519
1870 645.9730 635.5673 10.4056551
1871 674.8676 655.2175 19.6501514
1872 721.5669 688.3922 33.1747101
1873 765.5280 726.9601 38.5679053
...
> plot(skirts.hes$fitted)
[Figure: panels xhat, level and trend against time, 1868-1911]
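Holt's level and trend updates can be traced step by step. The sketch below is an illustration under stated assumptions: the trending series `x` is synthetic, the smoothing parameters are fixed by hand rather than optimised, and `holt_forecasts` is a hypothetical helper. It starts from level $x_2$ and trend $x_2 - x_1$, which is understood to be the default start of `HoltWinters()` when there is no seasonal component, and then compares the one-step forecasts.

```r
set.seed(7)
x <- ts(100 + 5 * (1:40) + rnorm(40, sd = 10))    # synthetic trending series (assumption)

# Hypothetical helper: one-step forecasts from Holt's exponential smoothing
holt_forecasts <- function(x, alpha, beta) {
  x <- as.numeric(x)
  n <- length(x)
  level <- x[2]                                   # assumed start value for the level
  trend <- x[2] - x[1]                            # assumed start value for the trend
  xhat <- rep(NA_real_, n)
  for (t in 3:n) {
    xhat[t] <- level + trend                      # forecast made at t-1 for time t
    new_level <- alpha * x[t] + (1 - alpha) * (level + trend)
    trend <- beta * (new_level - level) + (1 - beta) * trend
    level <- new_level
  }
  list(xhat = xhat, level = level, trend = trend)
}

h <- holt_forecasts(x, alpha = 0.8, beta = 0.5)
fit <- HoltWinters(x, alpha = 0.8, beta = 0.5, gamma = FALSE)

print(head(cbind(hand = h$xhat[3:40], R = as.numeric(fit$fitted[, "xhat"]))))
print(rbind(hand = c(a = h$level, b = h$trend), R = fit$coefficients))
```

With a large $\alpha$ and $\beta$, as in the skirts example, both the level and the trend follow the most recent observations closely.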
In-sample prediction vs. realization:
> plot(skirts.hes)
[Figure: observed and fitted values, 1866-1911]

Running SSE:
$\mathrm{SSE} = \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$
> skirts.hes$SSE
[1] 16954.18

In-sample accuracy:
> accuracy(skirts.hes)
       ME      RMSE       MAE      MPE     MAPE     MASE
255.25079 454.78440 269.86059 33.42056 35.34162 11.21304

Holt-Winters Forecasts:
> skirts.hes.fore <- forecast(skirts.hes, h=19)
> skirts.hes.fore
     Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
1912       534.9990 509.55210 560.4460 496.08130 573.9168
1913       540.6895 491.01052 590.3685 464.71204 616.6670
1914       546.3800 465.36129 627.3987 422.47258 670.2874
1915 ...
> plot(skirts.hes.fore)
[Figure: forecasts 1912-1930 with prediction intervals on a vertical scale from -1000 to 2000, marked "??"]
Diagnostics: In-Sample Forecast Errors

Constant over time?
> tsdisplay(skirts.hes.fore$residuals)
[Figure: residuals against time, 1868-1911, with ACF and PACF panels]

Non-zero autocorrelations? Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
> Box.test(skirts.hes.fore$residuals, lag=20, type="Ljung-Box")
        Box-Ljung test
data:  skirts.hes.fore$residuals
X-squared = 19.7312, df = 20, p-value = 0.4749

Holt-Winters Exponential Smoothing (Additive Seasonality)

Example: Births per Month in New York City
[Figure: time sequence plot of births.ts, 1946-1959]
Trend? Additive seasonality?

Principle: Additive Holt-Winters ES, with seasonal period $s$
Forecast: $\hat{x}_{t+1} = \hat{L}_t + \hat{T}_t + \hat{I}_{t+1-s}$
Level update: $\hat{L}_t = \alpha(x_t - \hat{I}_{t-s}) + (1-\alpha)(\hat{L}_{t-1} + \hat{T}_{t-1})$
Trend update: $\hat{T}_t = \beta(\hat{L}_t - \hat{L}_{t-1}) + (1-\beta)\hat{T}_{t-1}$
Seasonal update: $\hat{I}_t = \gamma(x_t - \hat{L}_t) + (1-\gamma)\hat{I}_{t-s}$
Remarks:
Holt-Winters ES allows for updates of level, trend and seasonality estimates.
Three-parameter version of Exponential Smoothing.
Similar to SARIMA(0,1,1)x(0,1,1)_s model (to be discussed).

Special: Simple Additive Seasonal ES
No trend ($T_t = 0$), only seasonality!
Forecast: $\hat{x}_{t+1} = \hat{L}_t + \hat{I}_{t+1-s}$
Level update: $\hat{L}_t = \alpha(x_t - \hat{I}_{t-s}) + (1-\alpha)\hat{L}_{t-1}$
Seasonal update: $\hat{I}_t = \gamma(x_t - \hat{L}_t) + (1-\gamma)\hat{I}_{t-s}$

Practical:
Adequate initial values for the level, trend and seasonality estimates are needed.
Optimal values for the parameters $\alpha$, $\beta$ and $\gamma$ are determined from:
$\min_{\alpha,\beta,\gamma} \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$ (or related, e.g. AIC or BIC)

Example: optimal $\alpha$, $\beta$ and $\gamma$ estimated
> births.hw <- HoltWinters(births.ts, seasonal="additive")

Smoothing parameters:
> births.hw
Smoothing parameters:
 alpha: 0.4823655
 beta : 0.02988495
 gamma: 0.563186
Emphasis for trend on the past!

Final component estimates:
Coefficients:
           [,1]
a   28.04366357
b    0.04199921
s1  -0.78546221
s2  -2.19944507
s3   0.87813012
s4  -0.65164728
s5   0.63427267
s6   0.21182821
s7   2.23177191
s8   2.17167733
s9   1.52077678
s10  1.16900861
s11 -0.97500043
s12 -0.18636055
Running component estimates: $\hat{x}_{t+1} = \hat{L}_t + \hat{T}_t + \hat{I}_{t+1-s}$, with optimal $\alpha$, $\beta$ and $\gamma$
> births.hw$fitted
             xhat    level         trend      season
Jan 1947 23.13579 23.81055 -0.1567618007 -0.51798958
Feb 1947 21.83089 22.83531 -0.1812218860 -0.82319792
Mar 1947 23.90724 22.29623 -0.1919165635  1.80292708
Apr 1947 21.58463 22.00869 -0.1947742244 -0.22928125
May 1947 21.51602 21.85461 -0.1935580066 -0.14503125
Jun 1947 20.43661 21.77488 -0.1901562399 -1.14811458
Jul 1947 22.44490 21.74120 -0.1854799895  0.88917708
Aug 1947 22.51935 22.05453 -0.1705728887  0.63538542
Sep 1947 22.50969 22.51328 -0.1517657072  0.14817708
Oct 1947 22.96787 22.64867 -0.1431840736  0.46238542
Nov 1947 21.63717 22.57404 -0.1411352421 -0.79573958
Dec 1947 22.07360 22.49168 -0.1393790022 -0.27869792
Jan 1948 21.19997 22.35201 -0.1393876391 -1.01264664
Feb 1948 ...
> plot(births.hw$fitted)
[Figure: panels xhat, level, trend and season against time, 1947-1959]

In-sample prediction vs. realization:
> plot(births.hw)
[Figure: observed and fitted values, 1946-1959]

Running SSE:
$\mathrm{SSE} = \sum_{t=0}^{T-1} (x_{t+1} - \hat{x}_{t+1})^2$
> births.hw$SSE
[1] 90.94058

In-sample accuracy:
> accuracy(births.hw)
      ME     RMSE      MAE      MPE     MAPE     MASE
12.60682 17.77135 12.93276 50.15378 51.52523 10.81909
Holt-Winters Forecasts:
> births.hw.fore <- forecast(births.hw, h=48)
> births.hw.fore
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 1960       27.30020 26.32517 28.27523 25.80902 28.79139
Feb 1960       25.92822 24.83950 27.01694 24.26316 27.59327
Mar 1960       29.04779 27.85040 30.24518 27.21654 30.87905
Apr 1960       27.56001 26.25756 28.86247 25.56808 29.55195
May 1960       28.88793 27.48307 30.29280 26.73938 31.03649
Jun 1960       28.50749 27.00220 30.01278 26.20535 30.80963
Jul 1960       30.56943 28.96521 32.17365 28.11598 33.02288
Aug 1960       30.55133 28.84929 32.25338 27.94828 33.15439
Sep 1960       29.94243 28.14338 31.74148 27.19102 32.69384
Oct 1960       29.63266 27.73720 31.52813 26.73380 32.53153
Nov 1960       27.53065 25.53918 29.52213 24.48496 30.57635
Dec 1960       28.36129 26.27407 30.44852 25.16916 31.55342
Jan 1961       27.80419 25.52190 30.08648 24.31373 31.29466
Feb 1961       26.4322...
> plot(births.hw.fore)
[Figure: forecasts 1960-1963 with 80% and 95% prediction intervals]

Diagnostics: In-Sample Forecast Errors

Constant over time?
> tsdisplay(births.hw.fore$residuals)
[Figure: residuals against time, 1947-1959, with ACF and PACF panels]

Non-zero autocorrelations? Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
> Box.test(births.hw.fore$residuals, lag=35, type="Ljung-Box")
        Box-Ljung test
data:  births.hw.fore$residuals
X-squared = 70.9555, df = 35, p-value = 0.0003082
ARMA Models: the Bare Essentials

Stochastic process: $X_1, X_2, \ldots, X_t, \ldots$
Individual stochasts (random variables) that might be dependent.
Model Fit: Model Identification, Parameter Estimation, Model Validation
[Figure: realization of a stochastic process, laborforce data (x 10^4) against year, 1940-2010]

Purely Random Processes
Moving Average Processes: MA(q)
Autoregressive Processes: AR(p)
ARMA(p,q) & ARIMA(p,d,q) Processes

Purely Random Process (white noise):
$Z_1, Z_2, \ldots, Z_t, \ldots$ with $Z_t \sim N(0, \sigma_Z^2)$ and mutually independent
Basic building block

Simulation:
> ts.sim <- arima.sim(list(order=c(0,0,0)), n=100)
> plot(ts.sim)
[Figure: time sequence plot of ts.sim, 100 observations]

Exploratory Data Analysis, main properties:
Trend? Seasonal variation? Cyclic variation? Irregular variation?
EDA-Summary:
> tsdisplay(ts.sim)
[Figure: time sequence plot of ts.sim with ACF and PACF panels]

Moving Average Process: MA(q)
$X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$

Interpretation:
Present value = moving average of past disturbances (= shocks).
The process is mainly influenced by random events from the past: Economics??
MA-models are often used to model time series which show short term dependencies between successive observations.

Operator notation:
$B X_t = X_{t-1}$ (backshift operator)
$X_t = \theta_q(B) Z_t$ with $\theta_q(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$
Example: MA(2) Process
$X_t = Z_t - 0.3 Z_{t-1} - 0.4 Z_{t-2}$

Operator notation:
$X_t = \theta_q(B) Z_t$ with $\theta_q(B) = 1 - 0.3B - 0.4B^2$

Simulation:
> ts.sim <- arima.sim(list(order=c(0,0,2), ma=c(-0.3,-0.4)), n=100)
> plot(ts.sim)
[Figure: time sequence plot of ts.sim, 100 observations]

EDA-Summary:
> tsdisplay(ts.sim)
[Figure: time sequence plot of ts.sim with ACF panel (MA-specific) and PACF panel]
Autoregressive Process: AR(p)
$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t$

Interpretation:
Present value = weighted sum of past values + disturbance.
The process is mainly influenced by past values of the process!
AR-models are often used to model time series which show longer term dependencies between successive observations.

Operator notation:
$B X_t = X_{t-1}$ (backshift operator)
$\phi_p(B) X_t = Z_t$ with $\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$

Example: AR(2) Process
$X_t = 0.3 X_{t-1} + 0.4 X_{t-2} + Z_t$

Operator notation:
$\phi_p(B) X_t = Z_t$ with $\phi_p(B) = 1 - 0.3B - 0.4B^2$

Simulation:
> ts.sim <- arima.sim(list(order=c(2,0,0), ar=c(0.3,0.4)), n=100)
> plot(ts.sim)
[Figure: time sequence plot of ts.sim, 100 observations]
EDA-Summary:
> tsdisplay(ts.sim)
[Figure: time sequence plot of ts.sim with ACF panel and PACF panel (AR-specific)]

Autoregressive Moving Average Process: ARMA(p,q)
$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$

Interpretation:
The process is influenced both by levels and by disturbances from the past!

Operator notation:
$\phi_p(B) X_t = \theta_q(B) Z_t$, with backshift operator $B$ and
$\phi_p(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$
$\theta_q(B) = 1 + \theta_1 B + \cdots + \theta_q B^q$
Example: ARMA(1,1) Process
$X_t = 0.6 X_{t-1} + Z_t + 0.4 Z_{t-1}$

Operator notation:
$\phi_p(B) X_t = \theta_q(B) Z_t$ with $\phi_p(B) = 1 - 0.6B$ and $\theta_q(B) = 1 + 0.4B$

Simulation:
> ts.sim <- arima.sim(list(order=c(1,0,1), ar=c(0.6), ma=c(0.4)), n=100)
> plot(ts.sim)
[Figure: time sequence plot of ts.sim, 100 observations]

EDA-Summary:
> tsdisplay(ts.sim)
[Figure: time sequence plot of ts.sim with ACF panel (MA-specific?) and PACF panel (AR-specific?)]
ARMA Model Fitting: stationary series without seasonality

Stochastic process: $X_1, X_2, \ldots, X_t, \ldots$
Individual stochasts (random variables) that might be dependent.
Model Fit: Model Identification, Parameter Estimation, Model Validation
[Figure: realization of a stochastic process, laborforce data (x 10^4) against year, 1940-2010]

Theoretical finger print of the model: Autocorrelation, Partial Autocorrelation
EDA finger print of the data: Autocorrelation, Partial Autocorrelation

Example: Volcanic Dust Veil Index
> plot(volcano.ts); tsdisplay(volcano.ts)
[Figure: time sequence plot of volcano.ts, 1500-1969, with ACF and PACF panels]
Stationary?
From the ACF: MA(3)? From the PACF: AR(2)? Or ARMA(1,1)?
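The theoretical finger prints of the example processes can be computed without simulation. The sketch below uses `ARMAacf()` from base R with the MA(2) coefficients (-0.3, -0.4) and the AR(2) coefficients (0.3, 0.4) from the simulation examples; the variable names are only illustrative.

```r
# Theoretical ACF and PACF of the MA(2) example: X_t = Z_t - 0.3 Z_{t-1} - 0.4 Z_{t-2}
ma2_acf  <- ARMAacf(ma = c(-0.3, -0.4), lag.max = 5)
ma2_pacf <- ARMAacf(ma = c(-0.3, -0.4), lag.max = 5, pacf = TRUE)

# Theoretical ACF and PACF of the AR(2) example: X_t = 0.3 X_{t-1} + 0.4 X_{t-2} + Z_t
ar2_acf  <- ARMAacf(ar = c(0.3, 0.4), lag.max = 5)
ar2_pacf <- ARMAacf(ar = c(0.3, 0.4), lag.max = 5, pacf = TRUE)

print(round(ma2_acf, 3))    # ACF: lags 0 to 5, zero beyond lag 2
print(round(ma2_pacf, 3))   # PACF: lags 1 to 5, decays gradually
print(round(ar2_acf, 3))    # ACF: lags 0 to 5, decays gradually
print(round(ar2_pacf, 3))   # PACF: lags 1 to 5, zero beyond lag 2
```

This is the pattern used for identification: an ACF that cuts off after lag q points to MA(q), a PACF that cuts off after lag p points to AR(p), and gradual decay in both suggests a mixed ARMA model.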
Conjecture 1: MA(3)-model?
> volcano.arma.0.3 <- Arima(volcano.ts, order=c(0,0,3))
> volcano.arma.0.3
Series: volcano.ts
ARIMA(0,0,3) with non-zero mean
Coefficients:
         ma1     ma2     ma3  intercept
      0.7438  0.4513  0.1916    57.4559
s.e.  0.0455  0.0502  0.0442     7.6534
sigma^2 estimated as 4852: log likelihood=-2661.69
AIC=5333.39 AICc=5333.52 BIC=5354.15

Fitted model, with $\tilde{X}_t = X_t - 57.5$:
$\tilde{X}_t = Z_t + 0.74 Z_{t-1} + 0.45 Z_{t-2} + 0.19 Z_{t-3}$

Model forecasts:
> volcano.arma.0.3.fore <- forecast(volcano.arma.0.3, h=19)
> plot(volcano.arma.0.3.fore)
[Figure: volcano.ts with forecasts and prediction intervals, 1500-2000]

In-sample accuracy:
> accuracy(volcano.arma.0.3)
        ME       RMSE        MAE  MPE MAPE      MASE
-0.2205417 69.7825327 37.8638920 -Inf  Inf 0.9926308

In-sample diagnostics:
> tsdiag(volcano.arma.0.3)
[Figure: Standardized Residuals against Time (constant over time?); ACF of Residuals (non-zero autocorrelations?); p values for Ljung-Box statistic against lag]
Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
Conjecture 2: AR(2)-model?
> volcano.arma.2.0 <- Arima(volcano.ts, order=c(2,0,0))
> volcano.arma.2.0
Series: volcano.ts
ARIMA(2,0,0) with non-zero mean
Coefficients:
         ar1      ar2  intercept
      0.7533  -0.1268    57.5274
s.e.  0.0457   0.0458     8.5958
sigma^2 estimated as 4870: log likelihood=-2662.54
AIC=5333.09 AICc=5333.17 BIC=5349.7

Fitted model, with $\tilde{X}_t = X_t - 57.5$:
$\tilde{X}_t = 0.75 \tilde{X}_{t-1} - 0.13 \tilde{X}_{t-2} + Z_t$

Model forecasts:
> volcano.arma.2.0.fore <- forecast(volcano.arma.2.0, h=19)
> plot(volcano.arma.2.0.fore)
[Figure: volcano.ts with forecasts and prediction intervals, 1500-2000]

In-sample accuracy:
> accuracy(volcano.arma.2.0)
        ME       RMSE        MAE  MPE MAPE      MASE
-0.2205417 69.7825327 37.8638920 -Inf  Inf 0.9926308

In-sample diagnostics:
> tsdiag(volcano.arma.2.0)
[Figure: Standardized Residuals against Time (constant over time?); ACF of Residuals (non-zero autocorrelations?); p values for Ljung-Box statistic against lag]
Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
Conjecture 3: ARMA(1,1)-model?
> volcano.arma.1.1 <- Arima(volcano.ts, order=c(1,0,1))
> volcano.arma.1.1
Series: volcano.ts
ARIMA(1,0,1) with non-zero mean
Coefficients:
         ar1     ma1  intercept
      0.5848  0.1555    57.5473
s.e.  0.0526  0.0613     8.9412
sigma^2 estimated as 4883: log likelihood=-2663.21
AIC=5334.42 AICc=5334.5 BIC=5351.03

Fitted model, with $\tilde{X}_t = X_t - 57.5$:
$\tilde{X}_t = 0.58 \tilde{X}_{t-1} + Z_t + 0.16 Z_{t-1}$

Model forecasts:
> volcano.arma.1.1.fore <- forecast(volcano.arma.1.1, h=19)
> plot(volcano.arma.1.1.fore)
[Figure: volcano.ts with forecasts and prediction intervals, 1500-2000]

In-sample accuracy:
> accuracy(volcano.arma.1.1)
        ME       RMSE        MAE  MPE MAPE      MASE
-0.2374427 69.8819979 37.7767429 -Inf  Inf 0.9903461

In-sample diagnostics:
> tsdiag(volcano.arma.1.1)
[Figure: Standardized Residuals against Time (constant over time?); ACF of Residuals (non-zero autocorrelations?); p values for Ljung-Box statistic against lag]
Ljung-Box statistic:
$Q_{LB} = N(N+2)\sum_{i=1}^{k} \frac{r_i^2}{N-i} \sim \chi^2_{k - \#p}$
ARIMA Model Fitting: non-stationary series

[Figure: time sequence plots of skirts.ts (1866-1911) and kings.ts (observations 1-42)]
Stationary? Level / Variation?

Remedy: remove the dominant non-stationarity through finite differencing:
$\nabla X_t = X_t - X_{t-1} = (1 - B) X_t$
If successful, then fit ARMA(p,q) on $\nabla X_t$, which is ARIMA(p,1,q) on $X_t$.

ARMA(p,q) model on $W_t$:
$\phi_p(B) W_t = \theta_q(B) Z_t$ with $W_t = \nabla^d X_t$

ARIMA(p,d,q)-model on $X_t$:
$\phi_p(B) (1 - B)^d X_t = \theta_q(B) Z_t$

[Figure: laborforce data (x 10^4), piecewise detrended series and AR(aic) noise against year, 1940-2010]
Example: Annual Diameter of Women's Skirts
> plot(skirts.ts)
[Figure: time sequence plot of skirts.ts, 1866-1911]
> skirts.d1 <- diff(skirts.ts, differences=1)
> plot(skirts.d1)
[Figure: time sequence plot of the first differences, 1867-1911]
> skirts.d2 <- diff(skirts.ts, differences=2)
> plot(skirts.d2)
[Figure: time sequence plot of the second differences, 1868-1911]

EDA-Summary:
> tsdisplay(skirts.ts)
[Figure: ACF and PACF panels of skirts.ts]
> tsdisplay(skirts.d1)
[Figure: ACF and PACF panels of skirts.d1]
> tsdisplay(skirts.d2)
[Figure: ACF and PACF panels of skirts.d2]
Model:
$W_t = \nabla^2 X_t = \nabla(X_t - X_{t-1}) = X_t - 2X_{t-1} + X_{t-2}$
> tsdisplay(skirts.d2)
[Figure: ACF panel (MA(?)) and PACF panel (AR(?)) of skirts.d2]
ARMA(p,q) on $W_t$ is ARIMA(p,2,q) on $X_t$.

Conjecture: ARIMA(1,2,1)-model?
> skirts.arima.1.2.1 <- Arima(skirts.ts, order=c(1,2,1))
> skirts.arima.1.2.1
Series: skirts.ts
ARIMA(1,2,1)
Coefficients:
          ar1     ma1
      -0.3123  0.0139
s.e.   0.4265  0.4449
sigma^2 estimated as 388.7: log likelihood=-193.66
AIC=393.33 AICc=393.93 BIC=398.68

Fitted model, with $W_t = \nabla^2 X_t$:
$W_t = -0.31 W_{t-1} + Z_t + 0.01 Z_{t-1}$

Model forecasts:
> skirts.arima.1.2.1.fore <- forecast(skirts.arima.1.2.1, h=19)
> plot(skirts.arima.1.2.1.fore)
[Figure: skirts.ts with forecasts and prediction intervals, 1866-1930]
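Second-order differencing can be verified directly against its expanded form. The sketch below uses a synthetic series `x` (an assumption, built as a double cumulative sum of noise so that two differences are needed) and compares `diff()` with the expression $x_t - 2x_{t-1} + x_{t-2}$.

```r
set.seed(3)
x <- cumsum(cumsum(rnorm(30)))                 # synthetic series, integrated twice
n <- length(x)

w_diff <- diff(x, differences = 2)             # second differences via diff()
w_hand <- x[3:n] - 2 * x[2:(n - 1)] + x[1:(n - 2)]

print(head(cbind(w_diff, w_hand)))
print(length(w_diff))                          # two observations are lost
```

Each round of differencing costs one observation, so the differenced series starts two time points later than the original.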
SARIMA Model Fitting

[Figure: time sequence plots of births.ts (1946-1959) and souvenir.ts (1987-1993)]
Stationary? Level / Variation? Seasonality? Additive / Multiplicative?

Remedy:
Remove the dominant non-stationarity through finite differencing:
$\nabla X_t = X_t - X_{t-1} = (1 - B) X_t$
Remove the dominant seasonality through seasonal differencing:
$\nabla_s X_t = X_t - X_{t-s} = (1 - B^s) X_t$

Combined adjustment:
$W_t = \nabla^d \nabla_s^D X_t = (1 - B)^d (1 - B^s)^D X_t$

ARMA(p,q) for $W_t$:
$\phi_p(B) W_t = \theta_q(B) Z_t$

Generalisation to account for changes in seasonality:
$\phi_p(B)\, \Phi_P(B^s)\, W_t = \theta_q(B)\, \Theta_Q(B^s)\, Z_t$

[Figure: laborforce data, aperiodic component and AR(aic) noise against year, 1940-2010]
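The combined regular and seasonal differencing expands to $(1 - B)(1 - B^{12}) X_t = X_t - X_{t-1} - X_{t-12} + X_{t-13}$ for monthly data. The sketch below checks this on a synthetic monthly series `x` (an assumption: a random walk plus a fixed seasonal pattern), using the same nested `diff()` calls as for the births data.

```r
set.seed(5)
seasonal <- rep(sin(2 * pi * (1:12) / 12), times = 5)        # fixed yearly pattern
x <- ts(cumsum(rnorm(60)) + seasonal, frequency = 12)        # synthetic monthly series

# Regular difference followed by a seasonal difference at lag 12
w <- diff(diff(x, differences = 1), lag = 12, differences = 1)

# The same quantity written out term by term
xv <- as.numeric(x)
n <- length(xv)
t <- 14:n
w_hand <- xv[t] - xv[t - 1] - xv[t - 12] + xv[t - 13]

print(head(cbind(w = as.numeric(w), w_hand)))
print(length(w))                                             # 13 observations are lost
```

The order of the two differencing steps does not matter, since the operators $(1 - B)$ and $(1 - B^{12})$ commute.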
Equivalent:
$\phi_p(B)\, \Phi_P(B^s)\, W_t = \theta_q(B)\, \Theta_Q(B^s)\, Z_t$ with $W_t = \nabla^d \nabla_s^D X_t$

SARIMA(p,d,q)x(P,D,Q)_s-model:
$\phi_p(B)\, \Phi_P(B^s)\, \nabla^d \nabla_s^D X_t = \theta_q(B)\, \Theta_Q(B^s)\, Z_t$

[Figure: laborforce data (x 10^4), piecewise detrended series and AR(aic) noise against year, 1940-2010]

Example: Births per Month in New York City
> plot(births.ts); tsdisplay(births.ts)
[Figure: time sequence plot of births.ts, 1946-1959, with ACF and PACF panels]
> births.d1.sd1 <- diff(diff(births.ts, differences=1), lag=12, differences=1)
> plot(births.d1.sd1); tsdisplay(births.d1.sd1)
[Figure: time sequence plot of births.d1.sd1, 1947-1959, with ACF and PACF panels]
Model:
$W_t = \nabla \nabla_{12} X_t = (1 - B)(1 - B^{12}) X_t$
> tsdisplay(births.d1.sd1, lag.max=40)
[Figure: ACF panel (MA(?), MA_s(?)) and PACF panel (AR(?), AR_s(?)) of births.d1.sd1, lags 0 to 40]
ARMA(p,q) on $W_t$ is SARIMA(p,1,q)x(P,1,Q)_12 on $X_t$.

Conjecture: SARIMA(3,1,3)x(1,1,0)_12-model?
> births.sarima.3.1.3.1.1.0 <- Arima(births.ts, order=c(3,1,3), seasonal=list(order=c(1,1,0), period=12))
> births.sarima.3.1.3.1.1.0
Series: births.ts
ARIMA(3,1,3)(1,1,0)[12]
Coefficients:
          ar1      ar2     ar3     ma1      ma2      ma3     sar1
      -1.0567  -0.0302  0.3607  0.9311  -0.4072  -0.7804  -0.5139
s.e.   0.1441   0.2055  0.1497  0.1211   0.1565   0.1288   0.0767
sigma^2 estimated as 0.5566: log likelihood=-178.26
AIC=372.51 AICc=373.5 BIC=396.86

Model forecasts:
> births.sarima.3.1.3.1.1.0.fore <- forecast(births.sarima.3.1.3.1.1.0, h=12)
> plot(births.sarima.3.1.3.1.1.0.fore)
[Figure: births.ts with 12-month forecasts and prediction intervals, 1946-1960]
More Time Series Modeling

Paul S.P. Cowpertwait et al., Introductory Time Series with R, ISBN 978-0387886978
Journal of Statistical Software, July 2008, Volume 27, Issue 3
http://www.youtube.com/watch?v=1lh1hlbuf8k
Rob J. Hyndman et al., Forecasting with Exponential Smoothing, ISBN 978-3540719168
Robert H. Shumway et al., Time Series Analysis and Its Applications with R Examples, ISBN 978-1441978646
Jonathan D. Cryer et al., Time Series Analysis with Applications in R, ISBN 978-1441926135

Suggestions for improvements of TSA with R are welcomed:
j.j.m.rijpkema@tue.nl