A Hybrid Method for Forecasting Stock Market Trend Using Soft-Thresholding De-noise Model and SVM

Xueshen Sui, Qinghua Hu, Daren Yu, Zongxia Xie, and Zhongying Qi

Harbin Institute of Technology, Harbin 150001, China
Suixueshen@Gmail.com

Abstract. Stock market time series are inherently noisy. Although the support vector machine (SVM) has a noise-tolerant property, noisy data still affect the accuracy of classification. Unlike other studies, which only classify the movements of the stock market into up-trend and down-trend and do not consider the noisy data, this study uses a wavelet soft-threshold de-noising model to classify the noisy data as a stochastic trend. In the experiment, we remove the stochastic trend data from the SSE Composite Index and obtain de-noised training data for the SVM. We then use the de-noised data to train the SVM and to forecast the testing data. The hit ratio is 60.12%. Compared with the 54.25% hit ratio obtained by an SVM trained on noisy data, the forecasting performance is enhanced.

Keywords: Soft-thresholding, De-noise, SVM, Stock market, Financial time series.

1 Introduction

Stock market trend forecasting gives information on the corresponding risk of investments and also influences trading behavior. Stock market time series are inherently noisy, non-stationary, and deterministically chaotic [1]. It has been shown that data extrapolated from stock markets are largely corrupted by noise, and it appears that no useful information can be extracted from such data. Modeling such noisy and non-stationary time series is expected to be a challenging task [2].

In recent years, numerous studies have demonstrated that neural networks are a more effective method for describing the dynamics of non-stationary time series due to their unique non-parametric, non-assumable, noise-tolerant and adaptive properties [3]. However, neural networks still have several limitations. SVM originates from Vapnik's statistical learning theory. Unlike most traditional methods, which implement the empirical risk minimization principle, SVM implements the structural risk minimization principle, which seeks to minimize an upper bound of the generalization error rather than the training error [4].

Many applications of the SVM to forecast financial time series have been reported.
Cao and Tay used the theory of SVM in regression to forecast the S&P 500 Daily Index in the Chicago Mercantile. They measured the degree of accuracy and the acceptability of certain forecasts by the deviations of the estimates from the observed values [3].

A. An et al. (Eds.): RSFDGrC 2007, LNAI 4482, pp. 387-394, 2007. © Springer-Verlag Berlin Heidelberg 2007

Kim forecasted the direction of the change in the daily Korea composite stock
price index (KOSPI) with the theory of SVM in classification. The best prediction performance for the holdout data is 57.83% [5]. Tony Van Gestel designed the LS-SVM time series model in the evidence framework to predict the daily closing price return of the German DAX30 index (Deutscher Aktien Index) [6].

Many previous studies have compared the performance of SVM with the BP neural network, case-based reasoning (CBR) and so on. All of the results show that the general performance of SVM is better than that of the traditional methods. Many studies selected optimum parameters of the SVM in order to enhance the forecasting performance.

This study proposes dealing with the noise of the stock market in order to enhance the forecasting performance of SVM. According to the wavelet de-noising model of soft-thresholding, we classify the short-term stock market trend into up-trend, stochastic trend and down-trend. We remove the stochastic trend data from the original index data and take the remaining data, which belong to the up-trend and down-trend, as the training data. Then we use the trained SVM to forecast the stock market trends.

2 Theoretical Backgrounds

2.1 Soft-Thresholding De-noise Model

Suppose $f(i)$ is the original signal, $s(i)$ is the polluted image signal, and $e(i)$ is the noise signal. Then the model of the noised image is

$$ s(i) = f(i) + \sigma e(i) \tag{1} $$

where $\sigma$ denotes the noise level and $e(i)$ is Gaussian white noise. Figure 1 is the block diagram of signal de-noising with wavelet transformation. The three blocks in Figure 1 represent the three basic steps of de-noising, respectively.

Fig. 1. The block diagram of wavelet de-noising

Wavelet decomposition is the first step: selecting the wavelet and the decomposition level, and calculating the coefficients of the transformation from $s(i)$ to layer $J$. The second step is the threshold manipulation: selecting the threshold and processing the coefficients according to the soft-threshold de-noising function

$$ d'_{j,k} = \begin{cases} \operatorname{sgn}(d_{j,k})\,(|d_{j,k}| - t), & |d_{j,k}| \ge t \\ 0, & |d_{j,k}| < t \end{cases} \tag{2} $$
where $d_{j,k}$ denotes the coefficient of the transformation, $d'_{j,k}$ denotes the coefficient after the threshold manipulation, $t = \sigma\sqrt{2\log N}$ is the threshold, and $N$ is the total number of image pixels. The final step is the reconstruction step: reconstructing the image from the coefficients $d'_{j,k}$ by the inverse wavelet transformation [7, 8, 9].

2.2 Support Vector Machine in Classification

In this section, we only briefly introduce the final classification function. For the detailed theory of SVM in classification, please refer to [10, 11, 12]. The final classification function is

$$ f(x) = \operatorname{Sgn}\left( \sum_{i=1}^{N} \alpha_i y_i \varphi(x)^T \varphi(x_i) + \frac{1}{N_s} \sum_{0 < \alpha_j < C} \left( y_j - \sum_{i=1}^{N} \alpha_i y_i \varphi(x_i)^T \varphi(x_j) \right) \right) \tag{3} $$

If there is a kernel function such that $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$, it is usually unnecessary to know explicitly what $\varphi(x)$ is, and we only need to work with the kernel function in the training algorithm. The non-linear classification function is

$$ f(x) = \operatorname{Sgn}\left( \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + \frac{1}{N_s} \sum_{0 < \alpha_j < C} \left( y_j - \sum_{i=1}^{N} \alpha_i y_i K(x_i, x_j) \right) \right) \tag{4} $$

Different kernels generate the inner products used to construct machines with different types of nonlinear decision surfaces in the input space. Among the different kernels, one chooses as the best model the one that minimizes the estimate. Common examples of the kernel function are the polynomial kernel $K(x, y) = (x \cdot y + 1)^d$ and the Gaussian radial basis function $K(x, y) = \exp\left(-\frac{1}{\sigma^2}(x - y)^2\right)$, where $d$ is the degree of the polynomial kernel and $\sigma$ is the bandwidth of the Gaussian radial basis function kernel. It has been shown that the upper bound $C$ and the kernel parameter $\sigma^2$ play an important role in the performance of SVM.

3 Experiment Design

3.1 Data Collection and Preprocessing by De-noise Model

In our empirical analysis, we set out to examine the five-day moving trend of the Shanghai Stock Exchange (SSE) Composite Index. The original data points cover the time period from 28/04/1997 to 12/09/2006, which gives 2261 data points. We select 1920
data from the 2261 data as training data and take the remaining 341 data as testing data. The 1920 training data are illustrated in Fig. 2. Figure 3 illustrates the 1920 smoothed SSE Composite Index data, which have been de-noised by soft-thresholding.

Fig. 2. SSE composite index

Fig. 3. Smoothed SSE composite index
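
The three de-noising steps of Sect. 2.1 (decomposition, soft-thresholding, reconstruction) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the paper does not name the wavelet or the decomposition level, so a one-level Haar transform is assumed here, the noise level `sigma` is assumed to be known, and all function names are hypothetical.

```python
import math


def haar_forward(signal):
    """One-level Haar transform (an assumed wavelet, not the paper's choice)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[k] + signal[k + 1]) * s for k in range(0, len(signal), 2)]
    detail = [(signal[k] - signal[k + 1]) * s for k in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed signal."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


def soft_threshold(d, t):
    """Soft-thresholding: zero small coefficients, shrink the rest by t."""
    if abs(d) < t:
        return 0.0
    return math.copysign(abs(d) - t, d)


def universal_threshold(sigma, n):
    """Threshold t = sigma * sqrt(2 * log(N))."""
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise(signal, sigma):
    """Decompose, soft-threshold the detail coefficients, reconstruct."""
    if len(signal) % 2 != 0:
        raise ValueError("this sketch expects an even number of samples")
    approx, detail = haar_forward(signal)
    t = universal_threshold(sigma, len(signal))
    shrunk = [soft_threshold(d, t) for d in detail]
    return haar_inverse(approx, shrunk)


# Hypothetical index values, not taken from the SSE data set.
example = [10.0, 10.2, 12.0, 11.9, 15.0, 15.1, 14.0, 14.2]
print(denoise(example, sigma=0.1))
```

In this toy input every detail coefficient falls below the threshold, so each adjacent pair is replaced by its mean. A multi-level decomposition with a smoother wavelet would normally be preferred for real index data.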
Based on the soft-thresholds determined by the wavelet de-noise model, we classify the stock market into up-trend, stochastic trend and down-trend, which contain 357, 1171 and 392 data, respectively. Figure 4 illustrates in detail the residual of the SSE Composite Index and the soft-thresholds [13].

Fig. 4. Residual of SSE composite index and soft-thresholds

We define the noisy data whose five-day SSE Composite Index change value lies between the upper and lower soft-thresholds as the stochastic trend. Consequently, data whose value lies above the upper soft-threshold are defined as the up-trend, and data whose value lies below the lower soft-threshold are defined as the down-trend. Then we remove the 1171 stochastic trend data from the original SSE Composite Index and take the remaining 749 data, which belong to the up-trend and down-trend, as the training data.

3.2 The Input Data of SVM

The input data used in this study are technical indicators and the direction of change in the five-day SSE Composite Index. The 12 selected technical indicators are the initial attributes, which are presented in Table 1. The notes to Table 1 are as follows: 1) $C_t$, $h_t$ and $l_t$ are the closing, highest and lowest prices at time $t$, and $V_t$ is the trading volume at time $t$; 2) AU (AD) is the 14-day average up (down) change of $C$; 3) EMA is the exponential moving average; 4) HH and LL mean the highest high and the lowest low.

The forecasting performance $P$ is evaluated using the following equation:

$$ P = \frac{1}{m} \sum_{i=1}^{m} D_i \quad (i = 1, 2, \ldots, m) \tag{5} $$

where $D_i$ is the forecasting result for the $i$th up-trend or down-trend trading day; stochastic trading days are not included. It is defined by
$$ D_i = \begin{cases} 1, & \text{if } PO_i = AO_i \\ 0, & \text{otherwise} \end{cases} \tag{6} $$

where $PO_i$ is the forecasting output from the model for the $i$th trading day, $AO_i$ is the actual output for the $i$th trading day, and $m$ is the number of test examples [5].

Table 1. Selected technical indicators and their formulas

Technical indicator                                  Formula
ALF (Alexander's Filter)                             $(C_t / C_{t-5} - 1) \times 100$ 1)
RS (Relative Strength)                               $AU / AD$ 2)
RSI (Relative Strength Index)                        $100 - 100 \times AD / (AD + AU)$
MFI (Money Flow Index)                               $(+MF) / ((+MF) + (-MF)) \times 100$, with $+(-)MF = EMA(U(D), n)$
%BB (Bollinger's Band)                               $(C_t - LB) / (UB - LB) \times 100$, with $UB(LB) = MA +(-)\, 2V$
V (Volatility)                                       $SD(Y_t) \times 100\%$, with $Y_t = \ln(C_t / C_{t-1})$
VB (Volatility Band)                                 $UB - LB$
CHO (Chaikin Oscillator)                             $MA(A, 5) - MA(A, 10)$, with $A = (C_t / ((h_t + l_t)/2) - 1) \times V_t$
MACD (Moving Average Convergence/Divergence)         $EMA(C, 12) - EMA(C, 26)$ 3)
%K                                                   $(C_t - LL_{t-n}) / (HH_{t-n} - LL_{t-n}) \times 100$ 4)
A/D Osc (Accumulation and distribution oscillator)   $(H_t - C_{t-1}) / (H_t - L_t)$
Williams %R                                          $(H_n - C_t) / (H_n - L_n) \times 100$

4 Experiment Result

In this study, we select the data points covering the time period from 28/04/1997 to 12/09/2006, which gives 2261 data points of the Shanghai Stock Exchange (SSE) Composite Index. We use the first 1920 of the 2261 original data as the training set and take the remaining 341 data as testing data. We use the wavelet soft-threshold de-noise model to de-noise the 1920 training data. Figs. 2-4 illustrate the details of the de-noising process. As a result of the soft-threshold de-noising, we obtain 1171 noisy data, which are classified as the stochastic trend data. We remove the 1171 stochastic trend data from the original SSE Composite Index and take the remaining 749 data, which belong to the up-trend and down-trend, as the training set for the SVM.

The Gaussian radial basis function is used as the kernel function of the SVM. We conduct the experiment with respect to various kernel parameters and the upper bound C. The range for the kernel parameter is between 1 and 100 and the range for C
is between 1 and 100. We use the 749 data mentioned above to train the SVM and apply the SVM to classify the 341 test data. For comparison, we also use the 1920 data mentioned above to train an SVM and employ it to classify the same 341 test data. The forecasting results of the two methods are shown in Table 2.

Table 2. Best forecasting results of two methods

SVM        Testing/training data   C    σ     Hit ratio
De-noise   Testing data            90   20    60.12%
           Training data           30   10    99.87%
Noisy      Testing data            10   100   54.25%
           Training data           50   10    99.95%

The results in Table 2 show that the best hit ratio of the de-noise SVM is 60.12%, which is better than the best hit ratio of 54.25% of the noisy SVM.

5 Conclusion

Many applications of SVM to forecast financial time series have been reported. Most of these studies only paid attention to selecting optimum parameters of the SVM in order to enhance forecasting performance. However, because SVM has a noise-tolerant property, few studies discuss preprocessing the noisy input data to enhance the forecasting performance. In this study, on the condition of selecting optimum parameters of the SVM, we use soft-thresholding to de-noise the training data and obtain a better optimal hyperplane than the one learned with noisy training data. Consequently, compared with the 54.25% hit ratio of the noisy SVM, the de-noised SVM achieves a 60.12% hit ratio. This hit ratio is also better than Kim's 57.83%, which is the best prediction performance in forecasting the trend of the Korea composite stock price index (KOSPI) with SVM [5]. This study indicates that de-noising the training data can effectively enhance the forecasting performance of SVM.

References

1. Deboeck, G.J.: Trading on the Edge: Neural, Genetic, and Fuzzy Systems for Chaotic Financial Markets. Wiley, New York (1994)
2. Cao, L.J., Tay, F.E.H.: Support Vector Machine with Adaptive Parameters in Financial Time Series Forecasting. IEEE Transactions on Neural Networks, 14 (2003) 33-52
3. Cao, L.J., Tay, F.E.H.: Financial Forecasting Using Support Vector Machines. Neural Comput. (2001) 184-192
4. Vapnik, V.: The Nature of Statistical Learning Theory. Springer-Verlag, New York (1995)
5. Kim, Kyoung-jae: Financial time series forecasting using support vector machines. Neurocomputing (2003) 307-319
6. Gestel, T.V., Suykens, J.A.K., Baestaens, D.E., Lambrechts, A., Lanckriet, G., Vandaele, B., Moor, B.D., Vandewalle, J.: Financial Time Series Prediction Using Least Squares Support Vector Machines within the Evidence Framework. IEEE Transactions on Neural Networks, 12 (2001) 809-821
7. Donoho, D.L.: Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data. Proceedings of Symposia in Applied Mathematics, 47 (1993) 173-205
8. Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. on Information Theory, 41 (1995) 613-627
9. Kecman, V.: Learning and Soft Computing, Support Vector Machines, Neural Networks and Fuzzy Logic Models. The MIT Press, Cambridge, MA (2001)
10. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
11. Wang, L.P., Fu, X.J.: Data Mining with Computational Intelligence. Springer, Berlin (2005)
12. Han, M., Xi, J., Xu, S., Yin, F.L.: Prediction of chaotic time series based on the recurrent predictor neural network. IEEE Trans. Signal Processing, 52 (2004) 3409-3416
13. Teo, K.K., Wang, L.P., Lin, Z.: Wavelet packet multi-layer perceptron for chaotic time series prediction: effects of weight initialization. Lecture Notes in Computer Science, 2074 (2001) 310-317
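
As an illustration of the trend labelling in Sect. 3.1 and the hit ratio of Eqs. (5) and (6), the following minimal Python sketch may help. It rests on assumptions: the threshold values and index changes are hypothetical, the function names are invented for this example, and values exactly equal to a threshold are treated as stochastic, which the paper does not specify.

```python
def label_trend(change, lower, upper):
    """Label a five-day index change: 1 up-trend, -1 down-trend, 0 stochastic.

    Values exactly on a threshold are treated as stochastic (an assumption).
    """
    if change > upper:
        return 1
    if change < lower:
        return -1
    return 0


def hit_ratio(predicted, actual):
    """Hit ratio P = (1/m) * sum(D_i), following Eqs. (5) and (6).

    D_i is 1 when the predicted output equals the actual output, else 0.
    Days whose actual label is stochastic (0) are excluded from m.
    """
    pairs = [(p, a) for p, a in zip(predicted, actual) if a != 0]
    if not pairs:
        raise ValueError("no up-trend or down-trend days to evaluate")
    hits = sum(1 for p, a in pairs if p == a)
    return hits / len(pairs)


# Hypothetical five-day changes and soft-thresholds, not the paper's values.
changes = [35.0, -4.0, -50.0, 10.0, 60.0]
actual = [label_trend(c, lower=-20.0, upper=20.0) for c in changes]

# Hypothetical classifier outputs for the same five days.
predicted = [1, 1, 1, -1, 1]
print(actual, hit_ratio(predicted, actual))
```

Only the three non-stochastic days enter the ratio here, which mirrors the paper's choice to evaluate up-trend and down-trend trading days only.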