Forecasting the Direction and Strength of Stock Market Movement

Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems n the world, and t has connecton wth almost every part n our lfe. In ths paper, we appled dfferent Machne Learnng methods to predct the drecton and strength of the market ndex prce movement. Specfcally, we looked at Multnomal Logstc Regresson, K- nearest neghbors algorthm, Lnear Dscrmnant Analyss, Quadratc Dscrmnant Analyss, and Multclass Support Vector Machne, and compared ther performance results based on test accuracy, robustness, and run- tme effcency.. Introducton Stock market movement predcton s a challengng task because of the hgh data ntensty, nose, hdden structures, and the hgh correlaton wth the whole world. In addton to forecastng the movement predcton, we also tred to predct the movement strength of stock market at the same tme. We started wth a bref descrpton of the response and the predctors, whch we thnk are relevant, and dd feature selecton based on the correlaton coeffcents between them. Fnally, we appled dfferent Machne Learnng methods to tran the data and make predctons. The data s from Yahoo! Fnance and Federal Reserve Bank of St. Lous. 2. Dataset The response n our models s based on the hstorcal data of S&P 5 Index. We dvded the returns of S&P 5 Index nto four categores based on our segment threshold δ as followng: (, δ],( δ,],(,δ],(δ,+ ). We marked them wth Class, 2, 3 and 4, so that we can ntegrate the nformaton about stock market movement and ts strength together. Our predctors nclude the lagged change rates of followng ndces - S&P 5 Index, US Dollar Index, Volatlty Index (VIX), FTSE Index, Nkke 225 Index, Hang Seng Index, Shangha Stock Exchange Composte Index, S&P 5 Index Volume, S&P 5 Index hghest prce change rate durng the perod, S&P 5 Index lowest prce change rate, 3-Month T-Blls Rate, -Year T-Notes Rate and Intal Jobless Clams. The range of the data s from 2/28/99 to //23. Therefore, there are 5758 observatons n the daly data and 58 observatons n the weekly data. 3. Evaluaton Metrc We set asde the last 2 observatons n our data sets as the test data because we only care about the predctablty of our models n the recent perod and the best model mght change as tme fles. We traned our models on the data based on dfferent tranng dataset sze and dfferent segment thresholds and then appled the coeffcents we obtaned to make predctons on the test data. We use test precson defned as () () () () () {y pred = y actual }/ {y actual }, where y and pred y are the -th predcted response and the - actual th actual response. Also, we looked at the test precson based on the probablty of Class 4 occurrng when we predct the response s 4,.e., P(Y actual = 4 Y pred = 4) () {y pred () = y actual () {y pred The realzed dstrbutons of our test set on the 4 categores are shown below. Realzed Dstrbuton of Test Sample (Daly) = 4}. = 4} Realzed Dstrbuton of Test Sample (Weekly).8.8.6.6 Dstrbuton.4.2 Dstrbuton.4.2.5..5 4 3 2 Category..2.3 4 3 Category 2

4. Multnomal Logstc Regresson Multnomal Logstc Model wth four categores has three logt functons: 𝐿𝑜𝑔𝑖𝑡 𝐹𝑢𝑛𝑐𝑡𝑖𝑜𝑛 𝑓𝑜𝑟 𝑌 = 𝑖 𝑟𝑒𝑙𝑎𝑡𝑖𝑣𝑒 𝑡𝑜 𝑙𝑜𝑔𝑖𝑡 𝑓𝑢𝑛𝑐𝑡𝑖𝑜𝑛 𝑓𝑜𝑟 𝑌 = 4, 𝑖 =,2,3 whle Class Y = 4 s the reference group. So we can wrte the logt functons as the followng: 𝑙𝑜𝑔 𝑔 𝑌 = 𝑖 = 𝐴! + 𝐵!! 𝑋! + + 𝐵!" 𝑋!, 𝑙𝑜𝑔 𝑔 𝑌 = 4 = 𝑙𝑜𝑔 =, 𝑖 =,2,3 where 𝐴! and 𝐵!" are the regresson coeffcents. The Multnomal Logstc Functons are then defned as: 𝑔 𝑌=𝑖, 𝑔 𝑌 = +𝑔 𝑌 =2 +𝑔 𝑌 =3 + 𝑓 𝑌=4 =. 𝑔 = +𝑔 𝑌 =2 +𝑔 𝑌 =3 + 𝑓 𝑌=𝑖 = 𝑖 =,2,3 We then only need to compare the four Logstc Functons and pck the greatest one as the predcted category. The out-of-sample category predcton results based on daly and weekly data are shown below. MLR: Overall (Weekly).7.7.6.6 MLR: Overall (Daly).5.4.5.4.3.3.2 3.2 8.5 2.3 6. 4.5. 2 MLR: 4th Class (Weekly) MLR: 4th Class (Daly).8.8.6.6.2.4.2 3.4.2.5 2. 8.3 6 4.5.2. 2 The frst row of graphs presents the overall category predcton results based on our MLR model. As the segment threshold ncreases (above.5% for daly and % for weekly), both daly and weekly data perform better; however, the performance of weekly data s less volatle than the daly data, and weekly data generally yelds a better performance wth respect to the same level of realzed dstrbuton of the test sample we use.

The stuaton n the second row s smlar to that of the frst row. In terms of the 4 th Class (.e. Y = 4 or Extreme Postve), weekly-data-based model sgnfcantly outperforms the one based on daly data wth respect to both accuracy rate and relablty. Therefore, weekly-data-based MLR model s more prosperous and relable for real-world real-tme predcton. 5. Lnear Dscrmnant Analyss We appled both lnear dscrmnant analyss (LDA) and quadratc dscrmnant analyss (QDA) to model the data. QDA assumes that each class has ts own covarance matrx. Consequently, LDA has lower varance, whch leads to mproved predcton performance. As we performed both models, we found out that LDA substantally outperforms QDA. Thus, we wll only present the LDA model here. Wth dfferent values of tranng sze and segment threshold, the overall and 4 th Class test precsons wth weekly and daly data are shown as below. LDA: overall (daly) LDA: overall (weekly).5.5.4.3.2.45.4.35.3.25. 3 2.5..5.2 5..2.3 LDA: 4th class (daly) LDA: 4th class (weekly).8.7.6.6.4.2.5.4.3 3 2.5..5.2 5..2.3 As can be seen from the fgures above, weekly data performs better than daly data and the test precson does not change sgnfcantly wth dfferent values of tranng szes. However, t severely depends on the segment threshold value. It s worthy to note that, when threshold becomes larger, the observaton n Class 3 wll ncrease and our model tends to predct Class 3, thus the overall precson ncreases whle the 4 th Class precson drops. 6. K-Nearest Neghbors Algorthm We performed senstvty analyss wth dfferent values of K and t turned out that K= would be a reasonable choce. Wth dfferent values of tranng sze and segment threshold, the overall and 4th Class test precsons wth weekly and daly data are shown below.

KNN: overall (daly) KNN: overall (weekly).7.6.6.5.5.4.3.4.3.2 3 2.5..5.2 5..2.3 KNN: 4th class (daly) KNN: 4th class (weekly).8.7.6.6.4.2.5.4.3 3 2.5..5.2 5..2.3 We can see that KNN model underperforms LDA model and shows bad results for medum segment threshold values, nor s ts predcton accuracy for 4 th class satsfyng. Besdes, the runnng speed s also much lower than LDA. Therefore, we recommend LDA to model data. 7. Multclass Support Vector Machne The last model we used s the Multclass SVM Model, whch conssts of the precedng observatons of the test pont. Because we have 2 test ponts, so we should tran our models for 2 tmes and make predctons separately. We appled the same segment thresholds to categorze the returns of S&P 5 Index, except that we changed the range and the sze of step. We traned our models on the scaled and non-scaled daly data and weekly data, and we also appled dfferent kernels to the model, ncludng lnear kernel, polynomal kernel, radal bass functon kernel, and sgmod kernel. After tryng out all the kernels above, however, we dscovered that the lnear kernel and the polynomal kernel do not apply to ths problem, and sgmod kernel behaves smlar to radal bass functon kernel. Moreover, the outcome of daly data s smlar to that of weekly data. Therefore, we only show the results of our SVM models usng radal bass functon kernel on daly data. The results of our Multclass SVM are shown below.

Multclass SVM: Overall (nonscaled) Multclass SVM: overall (scaled).7.7.6.5.4.3.6.5.4.2.5.5 5.5. 5.5. We can see that scaled predctors perform more robust than non-scaled predctors because the range of test precsons s narrower on scaled predctors. Also, we can conclude as before that the tranng sze has lttle effect on the test precson whle segment threshold affects the test precson sgnfcantly. In addton, we can acheve more than 6% test precson f we choose a proper segment threshold no matter whether we scale the predctors. 8. Concluson Based on the results from dfferent machne learnng methods, we can see that Multnomal Logstc Regresson performs the best n all the models n terms of the model robustness, predcton precson, and run-tme effcency. All the models perform better on the weekly data than on the daly data. For Multclass SVM, scaled predctors can mprove the robustness of the model n the predcton. Reference G.James, D.Wtten, T.Haste, R.Tbshran (29). An Introducton to Statstcal Learnng. Sprnger. R.Choudhry, K.Grag (28). A Hybrd Machne Learnng System for Stock Market Forecastng. World Academy of Scence, Engneerng and Technology, 5. W.Huang, Y.Nakamor, S.Wang (25). Forecastng stock market movement drecton wth support vector machne. Computers & Operatons Research, 32, pp. 253 2522.