Simulaeous opimizaio mehod of feaure rasformaio ad weighig for arificial eural eworks usig geeic algorihm : Applicaio o Korea sock marke Kyoug-jae Kim * ad Igoo Ha Absrac I his paper, we propose a ew hybrid model of arificial eural eworks (ANNs) ad geeic algorihm (GA) o opimal feaure rasformaio ad feaure weighig. Previous research proposed several varias of hybrid ANNs ad GA models icludig feaure weighig, feaure subse selecio ad ework srucure opimizaio. Amog he vas majoriy of hese sudies, however, ANNs did o lear he paers of daa well, because hey employed GA for simple use. I his sudy, we icorporae GA i a simulaeous maer o improve he learig ad geeralizaio abiliy of ANNs. I his sudy, GA plays role o opimize feaure weighig ad feaure rasformaio simulaeously. Globally opimized feaure weighig overcome he well-kow limiaios of gradie desce algorihm ad globally opimized feaure rasformaio also reduce he dimesioaliy of he feaure space ad elimiae irreleva facors i modelig ANNs. By his procedure, we ca improve he performace ad ehace he geeralisabiliy of ANNs. Key words: simulaeous opimizaio mehod, feaure rasformaio, feaure weighig, arificial eural eworks, geeic algorihm, sock marke predicio 1. Iroducio ANNs o predic accurae sock idex ad direcio of chage. I has bee widely acceped ha mos fiacial variables are o-liear. Recely, arificial eural eworks (ANNs) are applied o he problem of fiace rapidly, such as sock marke predicio, bakrupcy predicio, corporae bod raig, ec. Several sudies o sock marke predicio usig arificial ielligece (AI) echiques were performed durig pas decades. Sock marke predicio was ypical problem of fiacial imeseries predicio. Prior sudies used various ypes of Oe of he earlies sudies, Kimoo e al. (1990) used several learig algorihms ad predicio mehods for he Tokyo sock exchage prices idex (TOPIX) predicio sysem. Their sysem used modular eural ework o lear he relaioships amog various facors. Kamijo ad Taikawa (1990) used recurre eural ework ad Ahmadi (1990) used backpropagaio eural ework wih geeralized dela rule o predic he sock marke. Yoo ad Swales (1991) also performed Graduae School of Maageme Korea Advaced Isiue of Sciece ad Techology
predicio usig qualiaive ad quaiaive daa. Some researchers ivesigaed he issue of predicig sock idex fuures marke. Trippi ad DeSieo (1992) ad Choi e al. (1995) prediced daily direcio of chage i S&P 500 idex fuures usig ANNs. Duke ad Log (1993) execued daily predicio of Germa goverme bod fuures usig feedforward backpropagaio eural ework. Rece research eds o iclude ovel facors ad o hybridize several AI echiques. Hiemsra (1995) proposed fuzzy exper sysems o predic sock marke reurs. He suggesed ha ANNs ad fuzzy logic could capure he complexiies of he fucioal mappig because hey do o require he fucioal specificaio of he fucio o approximae. A more rece sudy of Kohara e al. (1997) icorporaed prior kowledge o improve he performace of sock marke predicio. Tsaih e al. (1998) iegraed he rule-based echique ad he ANNs o predic he direcio of he S&P 500 sock idex fuures o a daily basis. They, however, did o produce ousadig predicio accuracy parly because remedous oise ad o-saioary characerisics i sock marke daa. Traiig ANNs ed o be difficul wih high oisy daa he he ework fall io a aive soluio such as always predicig he mos commo oupu (Lawrece e al., 1996). For his reaso, several sudies proposed various kids of hybrid models improve he learig abiliy of ANNs. Pas research proposed several learig ad search echiques o hybridize wih ANNs such as MDA (Lee e al., 1996), ID3 (Lee e al., 1996), geeic algorihm (GA) (Harp ad Samad, 1991; Schaffer e al., 1992; Park e al., 1994; Ores ad Sklasky, 1997; Yag e al., 1998; Sexo e al., 1998a). Lee e al. (1996) also used selforgaizig feaure map ad Sexo e al. (1998b) used abu search as a hybridizig mehod. I his paper, we propose a ew hybrid model of ANNs ad GA for opimal feaure rasformaio ad feaure weighig. Previous researches proposed several varias of ANNs ad GA hybrid models such as feaure weighig, feaure subse selecio ad ework srucure opimizaio. The vas majoriy of hese sudies, however, ANNs did o lear he paers of daa well because hey employed GA for simple use. GA approach, however, ca poeially be used o opimize muliple facors of he learig process. I his sudy, we use GA o opimize muliple facors of learig process o improve he geeralisabiliy of ANNs. Firs, GA play a role o opimize feaure weighig of ANNs. Feaure weighig icludes he oios of feaure subse selecio ad he opimizaio of ework srucure. The majoriy of ANNs rely o a gradie desce algorihm o opimize he coecio weighs of ework. Gradie desce algorihm, however, ofe did o produce geeralized model because of wellkow limiaios. Secod, his sudy adops feaure rasformaio based o GA. Because daa preprocessig is a esseial sep for kowledge discovery ad elimiae some irreleva ad reduda feaures, may researchers i sociey of daa miig have a broad ieres i feaure rasformaio (Liu ad Mooda, 1998). I may applicaios, he size of daa is so large ha learig of paer may o work as well. Reducig ad rasformig he irreleva or reduda feaures shores he ruig ime of a learig model ad yields more geeralized resuls (Dash ad Liu, 1997). Feaure rasformaio, i his sudy, is o rasform coiuous values io discree oes i accordace wih opimized hreshold hrough geeic search. This approach effecively filers daa, rais he classifier, ad exracs he rules easier from he classifier. I addiio, i reduces he dimesioaliy of he feaure space he o oly decreases he cos ad imes i he operaio of he classifier bu also ehaces he geeralisabiliy of classifier. The res of he paper is orgaized io five
secios. The ex secio reviews feaure weighig ad feaure rasformaio mehodologies. I he hird secio, we propose simulaeous opimizaio mehod for feaure rasformaio ad feaure weighig i ANNs usig GA ad describe he beefis of proposed approach. I he fourh secio, we describe he desig of his research ad execue experimes. I he fifh secio, empirical resuls are summarized ad discussed. I he followig secio, coclusios ad research implicaios are preseed wih he assessme of our approach. 2. Feaure weighig ad feaure rasformaio for ANNs For a log ime, here have bee much research ieress o predic fuure. Amog hem, several amou of research o predic fuure usig daa miig echiques icludig ANNs. ANNs have preemie learig abiliy, however, ofe cofro wih icosise ad upredicable performace. As meioed earlier, because daa preprocessig is a esseial sep for kowledge discovery ad elimiae some irreleva ad reduda feaures, may researchers i sociey of daa miig have a broad ieres i feaure rasformaio ad subse selecio (Liu ad Mooda, 1998). I is especially impora o opimize ework srucure of ANNs. Feaure subse selecio ad ework srucure opimizaio is parly refleced by he feaure weighig i he process of modelig ANNs. Feaure weighig meas, i his sudy, opimizig he coecio weigh vecors of ANNs. The vas majoriy of ANNs sudies rely o a gradie desce algorihm o ge he weigh vecor of he model. Sexo e al. (1998a) poied ou he fac ha he gradie desce algorihm, however, applied o complex oliear opimizaio problems ofe resuled i icosise ad upredicable performace. Their idicaio sems from he fac ha backpropagaio is a local search algorihm ad may ed o become fell io local miimum. Several research have aemped o address his problem, Sexo e al. (1998a) saed as he use of he momeum, resarig raiig a may radom pois, resrucurig he ework archiecure, ad applyig sigifica cosrais o he permissible forms ca fix i. They suggesed ha oe of he more promisig direcios is usig global search algorihms o search he weigh vecor of ework isead of local search algorihm icludig backpropagaio. Some ANNs research advocaed global search ca improve performace. Sexo e al. (1998a) employed GA firs o search he weigh vecor of ANNs. They compared backpropagaio wih GA ad resuled each GA derived soluio was superior o he correspodig backpropagaio soluio. Sexo e al. (1998b) also used abu search o opimize he ework, abu search derived soluios were sigificaly superior o hose of backpropagaio soluios for all es daa i he resulig compariso. I aoher paper, Sexo e al. (1999) agai icorporaed simulaed aealig, oe of he global search algorihms, o opimize he ework. They compared wih he soluio derived by GA ad simulaed aealig ad cocluded soluio wih GA ouperformed ha wih simulaed aealig. Alhough he effor of Sexo ad his colleagues, Shi e al. (1998) cocluded ha ANNs are raied by gradie desce algorihm ouperform GA i heir applicaio of bakrupcy predicio. I his paper, we resul ha GA soluio cao always guaraee he superior performace ha ANNs are raied wih gradie desce algorihm. Feaure rasformaio is he process of creaig a ew se of feaures (Liu ad Mooda, 1998). I differs from feaure subse selecio i ha he laer does o geerae ew feaures ad i selecs a subse of origial feaures
(Blum ad Lagley, 1997; Dash ad Liu, 1997). Feaure rasformaio mehods are classified as edogeous (usupervised) versus exogeous (supervised), local versus global, parameerized versus o-parameerized, ad hard versus fuzzy (Sco e al., 1997; Susmaga, 1997). Edogeous (usupervised) mehods do o ake io cosideraio of he value of he decisio aribue while exogeous (supervised) do. Local mehods discreize oe aribue a oce while he global oes discreize all aribues simulaeously. Parameerized mehods specify he maximal umber of iervals geeraed i advace while o-parameerized mehods deermie i auomaically. Hard mehods discreize he iervals a he cuig poi exacly while fuzzy mehods discreize i by overlappig bouds (Susmaga, 1997). The mehods of edogeous feaure rasformaio iclude discreizaio usig a self-orgaizig map (Lawrece e al., 1996), perceile mehod (Sco e al., 1997; Buhlma, 1998), cluserig mehod (Sco e al., 1997; Kokae e al., 1997). Basak e al. (1998) also proposed euro-fuzzy approach usig feaure evaluaio idex ad Piramuhu e al. (1998) suggesed decisio-ree based approach as a edogeous mehod. These mehods have he advaage of simpliciy i rasformaio process. While hey do o give cosideraio o he correlaio amog each idepede ad depede variables. Predicio performace, however, is ehaced by he abiliy of discrimiaio from o oly sigle variable bu also he associaio amog variables. For he reaso of above limiaio, edogeous mehods do o provide a effecive way of formig caegories (Sco e al., 1997). O he oher had, he mehods of exogeous feaure rasformaio iclude maximizig he saisical sigificace of Cramer s V bewee oher dichoomized variable (Sco e al., 1997), eropy miimizaio heurisic i iducive learig ad k-eares eighbor mehod (Fayyad ad Irai, 1993; Tig, 1997; Mares e al., 1998). Exogeous mehods also iclude fucioal liks foud by geeic algorihm (GA) for ANNs ad C4.5 (Harig e al., 1997; Vafaie ad De Jog, 1998). These mehods rasform a idepede variable o maximize is associaio wih he values of depede ad oher idepede variables. 3. Feaure rasformaio by GA for ANNs Daa aalysis usig saisical mehod or AI echique icludes red predicio ad paer classificaio. Tred predicio usually reas sigle or muliple ime-series daa wih coiuous ype as ipu variables. I aims o capure emporal paers bewee he ime lag of hisorical daa. The examples of red predicio are he predicio of sock price, ieres rae, ad ecoomic idices. They were radiioally aalyzed by liear regressio or ime-series aalysis such as auoregressive iegraed movig average process (ARIMA). Paer classificaio such as bod raig ad credi evaluaio, however, usually employs muliple cross-secioal daa wih discree ad coiuous ype as ipu variable. I aims o grasp he causaliy bewee he daa simulaeously. As meioed above, i is very impora o cosider he emporal paers bewee daa o ime lag whe aalyzig he ime-series daa usig ANNs. A emporal paer, however, is difficul o rai because he muli-layer percepro has he risk of learig he uecessary radom correlaio ad oise, because i has a ousadig abiliy of fiig. Weiged e al. (1991) used weigh-elimiaio ad Jhee ad Lee (1993) used recurre eural ework o preve he overfiig problem. I addiio, ime-series predicio requires a log compuaioal ime because i uses a large umber of complex relaioships.
Because may fud maagers ad ivesors i sock marke geerally accep ad use crierio i Table 1 as he sigal of fuure marke red. Therefore, hey ierpre echical idicaor o by coiuous measure bu by qualiaive erm. Eve if a aribue represes a coiuous measure, he expers usually ierpre he values i qualiaive erms as bullish ad bearish or low, medium ad high. For sochasic %k, he value of abou 75 is basically acceped by sock marke aalyss as srog sigal, whe he value exceeds abou 75, he marke is regarded as a overbough siuaio or bullish marke. O he oher had, if i drop below 25, i is cosidered as oversold siuaio or he sigal of bearish marke. Whe he value of sochasic %k is placed bewee abou 25 ad 75, i is regarded as he sigal of eural marke (Edwards ad Magee, 1997). Table 1 reviews ierpreaio hreshold for echical idicaor i sock marke. The opimum ierpreaio hreshold, however, vary depedig o he securiy beig aalyzed ad overall marke codiio (Achelis, 1995). Because each marke has heir specific hreshold for ierpreig, we do o have geeral guidelies for discreizig hreshold. We may opimize he hreshold for discreizig coiuous measure io qualiaive orm o capure domai specific kowledge. Alhogh several research suggesed various mehods of rasformig feaures, we propose discreizaio of coiuous ime-series daa usig GA as a mehod of feaure rasformaio. This approach effecively filers daa, rais he classifier. I addiio, i reduces he dimesioaliy of he feaure space he o oly decreases he cos ad ime i he operaio bu also ehaces he geeralisabiliy of classifier. I also reflecs domai specific kowledge of each feaure from opimized ierpreig hreshold. This mehod discreize search space io discree caegories. Because feaure rasformaio usig GA is classified as exogeous, global, Achelis Gifford (1995) (1995) Sochasic %K 20/80 30/70 Edwards ad Magee (1997) 20~25 /75~80 Murphy Chag e Choi (1986) al. (1996) (1995) 30/70 25/75 20/80 parameerized, ad hard mehod, i may fid ear-opimal hreshold of discreizaio for maximum predicio performace. 20~25 Sochasic %D 20/80 30/70 30/70 25/75 20/80 /75~80 Sochasic 20~25 20/80 30/70 30/70 25/75 20/80 slow %D /75~80 Momeum -6.5/+6.5 0 4. Simulaeous opimizaio mehod usig GA for ANNs ROC -6.5/+6.5 100 LW %R 20/80 20/80 20/80 10/90 20/80 AD OSC 0.5 or 0.2/0.8 Dispariy 5 days 100% Dispariy 10 days 100% OSCP 0 0 CCI -100/+100-100/+100-100/+100 0 or 0 or 100/+100 100/+100 20~30 RSI 30/70 30/70 30/70 30/70 30/70 /70~80 <Table 1> Ierpreaio hreshold (Murphy, 1986; Achelis, 1995; Gifford, 1995; Chag e al., 1996; Edwards ad Magee, 1997; Choi, 1995) The overall framework of simulaeous opimizaio mehod is show i Figure 1. The process of opimizig he feaure weighs ad hreshold for feaure rasformaio cosiss of hree sages. For he firs sage, we search opimal coecio weighs ad hresholds for feaure rasformaio. I search process of GA, he parameers for searchig mus be ecoded o chromosomes as decisio variables. GA is a search algorihm based o survival of he fies amog srig srucures o form a search algorihm (Goldberg, 1989). For soluio of opimizaio problems, GA has
bee ivesigaed recely ad show o be effecive a explorig a complex space i a adapive way, guided by he biological evoluio mechaisms of reproducio, crossover, ad muaio (Adeli ad Hug, 1995). Crossover Muaio Assig he ipu weigh vecor Opimized weigh vecor Assig he hidde weigh vecor Fiess fucio GA Evaluaio Trasform he ipu vecor ANNs Figure 1. Simulaeous opimizaio mehod I order o improve he performace of AI echiques, GA is usually employed (Harp ad Samad, 1991; Schaffer, e al., 1992; Park e al., 1994). For ANNs, GA is usually employed o selec eural ework opology such as opimizig releva feaure subse, deermiig he opimal umber of hidde layer ad odes, ec. Opimized hreshold Crossover Muaio This sudy eeds hree vecors of parameers, firs is coecio weigh vecors bewee ipu ad hidde layer of ework ad secod se is coecio weigh vecors bewee hidde ad oupu layer. The hird se is he hreshold for feaure rasformaio of each feaure. Ecoded coecio weighs ad hresholds are opimized o maximize he fiess fucio. Fiess fucio is he average predicio accuracy rae of he es daa se. The parameers are opimized usig oly he iformaio abou raiig daa. Derived parameers from opimizaio The secod sage is he process of feedforward compuaio i ANNs. I his process, sigmoid fucio is used as acivaio fucio. This fucio is a popular aacivaio fucio o model ANNs, because i ca easily be differeiaed. Liear fucio is used as combiaio fucio for feedforward compuaio wih derived coecio weigh from firs sage. I he hird sage, derived coecio weighs ad hresholds of feaure rasformaio are applied o ou-of-sample daa. Table 2 summarizes he algorihm of simulaeous opimizaio for feaure rasformaio ad weighig. Sep 0 Iiialize he populaios (coecio weighs ad hresholds for feaure rasformaio). (Se o small radom values bewee 0.0 ad 1.0) Sep 1 While soppig codiio is false, do Seps 2 9. Sep 2 Do Seps 3 8. Sep 3 Sep 4 Sep 5 Sep 6 Sep 7 Sep 8 Sep 9 Each ipu processig eleme receives ipu sigal x i ad forwards his sigal o all processig elemes i hidde layer. Each processig eleme i hidde layer sums is weighed ipu sigals ad applies sigmoid acivaio fucio o compue is oupu sigal of hidde processig eleme ad forwards i o all processig elemes i oupu layer. Each processig eleme i oupu layer sums is weighed sigals from hidde layer ad applies sigmoid acivaio fucio o compue is oupu sigal of oupu processig eleme ad compues he differece bewee oupu sigal of oupu processig eleme ad arge value. Calculae fiess. Selec idividuals o become pares of he ex geeraio. Creae a secod geeraio from he pare pool. (Perform crossover ad muaio.) Tes he sop codiio. <Table 2> The algorihm of simulaeous opimizaio for feaure rasformaio ad weighig 5. Research desig ad experimes process are applied o ou-of-sample daa. Because ANNs have emie abiliy of learig he kow daa, he model may fall io overfiig wih he raiig daa. Therefore, he average predicio accuracy rae of es daa is used as fiess fucio o avoid i. I his sage, GA operaes he process of crossover ad muaio o iiial chromosomes ad ieraes uil he soppig codiios are saisfied. The research daa used i his sudy are echical idicaors ad correspodig direcio of chage i daily Korea sock idex (KOSPI). The oal umber of samples icludes 2928 radig days from Jauary, 1989 o December, 1998. The direcio of daily chage i sock idex are caegorized as 0 or 1, 0 meas ex day s
idex dimiish ha oday s idex, ad 1 represe ex day s idex icrease ha oday s idex. We seleced 12 echical idicaors as feaure subse by he review of domai expers ad pas research. Table 3 gives seleced feaures ad heir formulas. The backpropagaio algorihm ad sigmoid fucio are used i his model. Secod mehod rais ANNs wih weighs are opimized by GA isead of gradie desce algorihm. Ulike firs mehod, GA searches amog he several ses of weigh vecors simulaeously. I he hird model, GA simulaeously opimizes he weighs of ANNs Feaure Names of feaure Formulas X1 Sochasic %K 1 X2 Sochasic %D % K X3 Sochasic slow %D C H i = 0 1 i = 0 L L % D X4 Momeum C C 4 X5 ROC (rae of chage) 100 C 100 C X6 LW %R H C H L 100 X7 X8 X9 X10 A/D Oscillaor Dispariy 5 days Dispariy 10 days OSCP (price oscillaor) H H C MA C 5 MA 1 C L 10 100 100 MA MA 5 10 MA X11 CCI (commodiy chael idex) ( M SM ) ( 0. 015 D ) X12 RSI (relaive sregh idex) 100 100 1 Up i i= 0 1 + 1 Dw i i = 0 5 i i ad he hresholds of feaure rasformaio. Name of feaure Max Mi Mea Sadard Deviaio Sochasic %K 100.007 0.000 45.407 33.637 Sochasic %D 100.000 0.000 45.409 28.518 Sochasic slow %D 99.370 0.423 45.397 26.505 Momeum 102.900-108.780-0.458 21.317 ROC 119.337 81.992 99.994 3.449 LW %R 100.000-0.107 54.593 33.637 AD OSC 3.730-0.157 0.447 0.334 Dispariy 5 days 110.003 90.077 99.974 1.866 Dispariy 10 days 115.682 87.959 99.949 2.682 OSCP 5.975-7.461-0.052 1.330 CCI 226.273-221.448-5.945 80.731 RSI 100.000 0.000 47.598 29.531 <Table 4> Summary saisics <Table 3> Techical idicaors ( Achelis, 1995; Gifford, 1995; Chag e al., 1996; Edwards ad Magee, 1997; Choi, 1995) Noe) C: Closig price, L: Low price, H: High price, MA: Movig average of price, M : ( H L + C ) SM : i = 1 M i + 1, D : i = 1 M i + +, 3 1 SM, Up: Upward price chage, Dw: Dowward price chage Table 4 preses summary saisics for each feaure. I order o compare he effeciveess of simulaeous opimizaio mehod wih oher compeiive mehod, we orgaize hree differe ses of mehod accordig o feaure weighig ad rasformaio mehod. Firs mehod rais ANNs wih gradie desce algorihm. For he corollig parameers of GA search, he populaio size is se o 100 orgaisms ad he crossover ad muaio raes are chaged o preve fallig io he local miimum. The rage of crossover rae is se bewee 0.5 ad 0.7 ad muaio rae is raged from 0.05 o 0.1 i his sudy. As a soppig codiio, oly 5000 rials are permied. The rages of search space of coecio weighs are se from 5.0 o 5.0. The rage of hresholds for feaure rasformaio is permied bewee maximum ad miimum value of each feaure. The daa used i his sudy are spli io hree ses of daa. The firs se is raiig se. This se is used o develop he model ad o deermie he coecio weighs of eworks. The es se is he secod oe, his se measures how well he model ierpolaes usig he derived coecio weighs hrough he learig process of raiig se. The validaio se is he hird oe, his se
used o validae he geeralisabiliy of he model for usee daa. The process of exracig he es se from he raiig se is paricularly impora i he developme of effecive model (Klimasuaskas, 1994). I his sudy, he es ses are exraced by radom samplig. The umber of cases i each se is show i Table 5. ha here is a shade of differece of predicio accuracy bewee i-sample daa (raiig sample ad es sample) ad ou-of-sample daa for simulaeous opimizaio mehod. There is, however, a wide differece of predicio accuracy bewee i-sample ad ou-of-sample daa for oher wo mehods. Se 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 Toal Traiig 162 163 163 165 165 165 164 164 163 163 1637 Tes 70 70 71 71 72 72 71 71 71 71 710 Validaio 57 58 58 58 59 59 58 58 58 58 581 <Table 5> Number of cases i each daa se 6. Experimeal Resuls Three ses of model are compared accordig o feaure weighig ad feaure rasformaio mehod. Table 6 describes he average predicio accuracy of each model. The firs se (such se called Coveioal mehod i Table 6) assigs coecio weighs of ANNs usig gradie desce mehod ad rasforms feaure space by liear scalig o he rage bewee 0.0 ad 1.0. The firs se is coveioal mehod o model ANNs. The secod se (such se called Simple opimizaio mehod i Table 6) assigs coecio weighs by opimizaio process of GA ad rasforms feaure space usig liear scalig o he rage from 0.0 o 1.0. This model was proposed by he works of Sexo e al. (1998a) ad Shi e al. (1998). I he hird se (such se called Simulaeous opimizaio mehod i his sudy), GA simulaeously assigs coecio weighs ad he hresholds of feaure rasformaio. I Table 6, simulaeous opimizaio mehod has higher predicio accuracy ha he oher wo mehods by 12% ~ 13% a ou-of-sample daa. The predicio accuracy of coveioal mehod ad simple opimizaio mehod is similar o each oher. I is worh givig aeio o he fac Simulaeous Coveioal Simple opimizaio opimizaio mehod mehod mehod Year Isamplsamplsamplsamplsamplsample Ou-of- I- Ou-of- I- Ou-of- 1989 59.05 48.28 57.33 49.12 56.04 63.16 1990 62.23 49.15 59.23 56.90 63.95 62.07 1991 58.97 53.45 53.42 50.00 60.26 63.79 1992 61.02 51.72 60.17 44.83 62.29 56.90 1993 54.01 44.07 54.43 44.07 59.07 62.71 1994 62.45 64.41 61.18 59.32 64.98 66.10 1995 63.83 44.83 63.83 53.45 65.96 67.24 1996 61.28 60.35 61.70 50.00 65.11 68.97 1997 46.15 50.00 50.43 50.00 62.82 63.79 1998 55.98 51.72 56.84 48.28 60.68 63.79 Toal 58.50 % 51.81 % 57.86 % 50.60 % 62.12 % 63.86 % <Table 6> Average predicio accuracy (%) I Table 6, simulaeous opimizaio mehod has higher predicio accuracy ha he oher wo mehods by 12% ~ 13% a ou-of-sample daa. The predicio accuracy of coveioal mehod ad simple opimizaio mehod is similar o each oher. I is worh givig aeio o he fac ha here is a shade of differece of predicio accuracy bewee i-sample daa (raiig sample ad es sample) ad ou-of-sample daa for simulaeous opimizaio mehod. There is, however, a wide differece of predicio accuracy bewee i-sample ad ou-of-sample daa for oher wo mehods. We use he McNemar ess o examie wheher he performace of simulaeous opimizaio mehod is sigificaly higher ha ha of oher wo mehods. The McNemar es is oparameric es for wo relaed
Name of feaure Firs hreshold Secod hreshold samples. This es may be used wih omial daa ad is paricularly useful wih before-afer measureme of he same subjecs (Cooper ad Emory, 1995). Table 7 shows he resuls of McNemar ess o compare he performace of hree mehods for ou-of-sample daa. Coveioal mehod Simple opimizaio mehod Simulaeous opimizaio mehod Coveioal mehod 0.227 (0.634) 19.347* (0.000) Sochasic %K 32.1346 67.7126 Sochasic %D 34.7457 66.2435 Sochasic slow %D 28.8129 64.5794 Momeum -12.4254 17.1770 ROC 98.4868 103.6160 LW %R 33.4142 81.9412 AD OSC 0.3333 0.7132 Dispariy 5 days 98.9056 102.6326 Dispariy 10 days 98.4584 102.2129 OSCP -0.9347 0.7172 CCI -38.7281 90.9651 RSI 33.3030 70.8495 Simple opimizaio mehod 31.912* (0.000) <Table 8> Average hresholds for feaure rasformaio <Table 7> McNemar values (P values) for he pairwise compariso of performace amog mehods (* sigifica a he 1% level ) As show i Table 7, he performace of simulaeous opimizaio mehod performs sigificaly beer ha ha of oher wo mehods a a 1% level. Also Table 7 shows ha he oher wo mehods, coveioal mehod ad simple opimizaio mehod, are o sigificaly ouperforms wih each oher. Table 8 shows he opimized hresholds for feaure rasformaio. Each hreshold is used as crieria for dicreizig he coiuous daa. 7. Coclusios ad research implicaios Previous sudy such as Sexo e al. (1998a, 1998b, 1999) ad Shi e al. (1998) had ried o opimize he corollig parameers of ANNs. Their sudies oly focus o he opimizaio of coecio weighs for ANNs. GA approach, however, ca poeially be used o opimize oher specific facors of ANNs. I his paper, we prese he simulaeous opimizaio mehod for feaure rasformaio ad weighig of coecios i ANNs usig GA o predic he paer of sock marke reds. This mehod discreizes he origial daa accordig o opimal or ear-opimal hresholds of feaure rasformaio ad assigs opimal or ear-opimal coecio weighs simulaeously. We show ha simulaeous opimizaio mehod effecively filers daa, rais he classifier, opimizes coecio weighs. I addiio, we coclude simulaeous opimizaio mehod reduces he dimesioaliy of he feaure space he ehaces he geeralisabiliy of classifier from he empirical resuls. These resuls suppor followig fidigs. Firs, he resul of experime wih he simulaeous opimizaio mehod sigificaly ouperforms ha wih simple opimizaio mehod i his sudy. I appears ha simulaeous opimizaio mehod allows beer o lear oisy paers ha simple opimizaio mehod. The implicaios of his resul parly suppor ha he more facors of he process of ANNs are opimized by GA simulaeously, he higher he predicio performace. The simulaeous opimizaio mehod ca ehace he geeralisabiliy of models. Secod, alhough Sexo e al. (1998a) suggesed
simple opimizaio mehod for coecio weighs usig GA ouperform coveioal back-propagaio eural eworks wih gradie desce algorihm, his sudy does o fid a evidece o suppor heir coclusio. We show ha he resul of coveioal mehod ad simple opimizaio mehod does o have sigifica differece bewee each oher. The reasos of hese disappoiig resul are summarized as wo facors. The oe is geeric limiaio of GA. I oher work of Sexo e al. (1999), performace wih he coecio weighs of ANNs are opimized by simulaed aealig, oe of he global search algorihm, did o ouperform ha wih back-propagaio eural eworks wih gradie desce algorihm. This resul is also suppored by he work of Shi e al. (1998). They cocluded he reaso of disappoiig resuls comes from he fac ha GA is less compee i local search. As meioed earlier, global search is more desirable for learig ANNs, however, some imes local search is also eeded. The oher facor is maybe a curse of dimesioaliy. GA is a global search algorihm, however, fiacial daa icludig he daa of sock marke is oo complex o search a oce. Therefore, i is eeded o reduce he dimesioaliy of daa ad irreleva facors before searchig. I is suppored by he resuls of Table 6, simple opimizaio mehod ad coveioal learig mehod are o geeralized well, however, simulaeous opimizaio mehod is geeralized well for ou-of-sample daa. These resuls show global search algorihm may be a aleraive mehod for he learig i ANNs. I does o geeralize he fac ha global search always provide he opimal coecio weighs. Third, from he Table 1 ad Table 8, we fid opimized hresholds for feaure rasformaio is approximae some of domai kowledge i sock marke. As meioed earlier, aalyss i sock marke do o udersad he echical idicaors as coiuous form bu ierpre as qualiaive orm such as low, medium, ad high. The majoriy of hresholds for feaure rasformaio i Table 8 is coicide wih domai kowledge of sock marke are show i Table 1 excep for some idicaors icludig Momeum, CCI, ad OSCP. Therefore, feaure rasformaio ca o oly reduces oise ad irreleva facors bu also provides domai kowledge o learig process. Simulaeous opimizaio mehod i his sudy has several implicaios. Feaures i modelig ANNs coribue he value of ework oupus hrough o oly sole bu also syergisic way. The coecio weigh of specific coecio i ANNs reflecs he imporace of specific coecio. I is compued by akig io cosideraio of he value of depede ad he associaio of oher idepede feaures. The coribuio of each feaure o oupu value is ca o be fully refleced by coecio weighs. Coecio weighs provide syergisic effec of he associaio amog several feaures, however, may do o reflec he pure embeddig kowledge of each feaure. The pure embeddig kowledge of each feaure ca be refleced by qualiaive orm as meioed above. I is closely aki o he process of huma hikig. This sudy has some limiaios. Firs, he umber of caegories for feaure rasformaio of each feaure is limied o hree caegories. The umber is varied wih he aure of each feaure. This sudy limis he umber because he compuaioal burde of ulimied caegories is oo heavy o be efficiely execued by persoal compuer. The secod limiaio is he objecs for opimizaio are focused oly wo facors of he process of ANNs. Simulaeous opimizaio mehod i his sudy produces valid resuls, however, GA ca poeially be used o opimize several facors of he process of ANNs icludig feaure subse selecio, ework srucure opimizaio, learig parameer opimizaio. We also believe ha here is grea poeial for furher research
wih simulaeous opimizaio mehod usig GA for oher AI echiques icludig case-based reasoig ad decisio ree. Refereces Achelis, S. B., Techical Aalysis from A o Z, Probus Publishig, 1995. Adeli, H. ad Hug, S., Machie learig: Neural eworks, geeic algorihms, ad fuzzy sysems, Joh-Wiley & Sos, Ic, 1995. Ahmadi, H., Tesabiliy of he arbirage pricig heory by eural eworks, Proceedigs of he Ieraioal Coferece o Neural Neworks, 1990, pp. 385-393. Basak, J., De, R. K. ad Pal, S. K., Usupervised feaure selecio usig a euro-fuzzy approach, Paer Recogiio Leers, 1998, pp.997-1006. Blum, A. L. ad Lagley, P., Selecio of releva feaures ad examples i machie learig, Arificial Ielligece, Vol. 97, No. 1-2, 1997, pp.245-271. Buhlma, P., Exreme eves from he reur-volume process: a discreizaio approach for complexiy reducio, Applied Fiacial Ecoomics, Vol.8, 1998, pp.267-278. Chag, J., Jug, Y., Yeo, K., Ju, J., Shi, D. ad Kim, H., Techical Idicaors ad Aalysis Mehods, Seoul, Korea, Jiriamgu Publishig, 1996. Choi, J., Techical Idicaor, Seoul, Korea, Jiriamgu Publishig, 1995. Choi, J. H., Lee, M. K. ad Rhee, M. W., Tradig S&P 500 sock idex fuures usig a eural ework, The Third Aual Ieraioal Coferece o Arificial Ielligece Applicaios o Wall Sree, 1995, pp. 63-72. Cooper, D. R. ad Emory, C. W., Busiess research mehods, Fifh ediio, Irwi, Ic., 1995, pp. 455-457. Dash, M. ad Liu, H., Feaure selecio mehods for classificaios, Iellige Daa Aalysis: A Ieraioal Joural, Vol. 1, No. 3, 1997, hp://wwweas.elsevier.com/ida/browse/0103/ida00013/aricl e.hm. Duke, L. S., ad Log, J. A., Neural ework fuures radig - A feasibiliy sudy, Adapive Iellige Sysems, Elsevier Sciece Publishers, 1993, pp. 121-132. Edwards, R. D. ad Magee, J., Techical aalysis of sock reds, Chicago, Illiois, Joh Magee Ic., 1997. Fayyad, U. M. ad Irai, K. B., Muli-ierval discreizaio of coiuous-valued aribues for classificaio learig, Proceedigs of he 13 h Ieraioal Joi Coferece o Arificial Ielligece, 1993. pp. 1022-1027. Gifford, E., Ivesor s Guide o Techical Aalysis: Predicig Price Acio i he Markes, Lodo, Pima Publishig, 1995. Goldberg, D. E., Geeic Algorihms i Search, Opimizaio, ad Machie Learig, Addiso- Wesley Publishig Compay, Ic., 1989. Harig, S., Kok, J. N. ad va Wezel, M. C., Feaure selecio for eural eworks hrough fucioal liks foud by evoluioary compuaio, Advaces i Iellige Daa Aalysis, 1997. Harp, S. A. ad Samad, T., Opimizig eural eworks wih geeic algorihms, Proceedigs of he America Power Coferece, Chicago, pp.1138-1143. Hiemsra, Y., Modelig srucured oliear kowledge o predic sock marke reurs, I Trippi, R. R. (ed.), Chaos & Noliear Dyamics i he Fiacial Markes: Theory, Evidece ad
Applicaios, Irwi, 1995, pp. 163-175. Jhee, W. C. ad Lee, J. K., "Performace of Neural Neworks i Maagerial Forecasig", Ieraioal Joural of Iellige Sysems i Accouig, Fiace ad Maageme, Vol. 2, pp. 55 ~ 71, 1993. Kamijo, K. ad Taigawa, T., Sock price paer recogiio: A recurre eural ework approach, Proceedigs of he Ieraioal Joi Coferece o Neural Neworks, 1990, pp. 215-221. Kimoo, T., Asakawa, K., Yoda, M., ad Takeoka, M., Sock marke predicio sysem wih modular eural ework, Proceedigs of he Ieraioal Joi Coferece o Neural Neworks, 1990, pp. 1-6. Klimasauskas, C. C., Neural ework echiques, I Deboeck, G. J. (ed.), Tradig o he Edge, New York: Joh Wiley, 1994, pp.3-26. Kohara, K., Ishikawa, T., Fukuhara, Y. ad Nakamura, Y., Sock price predicio usig prior kowledge ad eural eworks, Ieraioal Joural of Iellige Sysems i Accouig, Fiace ad Maageme, Vol. 6, 1997, pp. 11-22. Kokae, P., Myllymaki, P., Silader, T. ad Tirri, H., A Bayesia approach o discreizaio, Proceedigs of he Europea Symposium o Iellige Techiques, 1997, pp.265-268. Lawrece, S., Tsoi, A. C. ad Giles, C. L., Noisy ime series predicio usig symbolic represeaio ad recurre eural ework grammaical iferece, Techical Repor UMIACS-TR-96-27 ad CS-TR-3625, Isiue for Advaced compuer Sudies, Uiversiy of Marylad, 1996. Lee, K. C., Ha, I. ad Kwo, Y., Hybrid eural ework models for bakrupcy predicios, Decisio Suppor Sysems, Vol. 18, 1996, pp. 63-72. Liu, H. ad Mooda, H., Feaure rasformaio ad subse selecio, IEEE Iellige Sysems ad Their Applicaios, Vol. 13, No. 2, 1998, pp. 26-28. Mares, J., Wes, G., Vahiee, J. ad Mues, C., A iiial compariso of a fuzzy eural classifier ad a decisio ree based classifier, Exper Sysems wih Applicaios, Vol.15, 1998, pp.375-381. Murphy, J. J., Techical Aalysis of he Fuures Markes: A Comprehesive Guide o Tradig Mehods ad Applicaios, New York, New York Isiue of Fiace, A Preice-Hall Compay, 1986. Piramuhu, S., Ragava, H. ad Shaw, M. J., Usig feaure cosrucio o improve he performace of eural eworks, Maageme Sciece, Vol. 44, No. 3, 1998, pp. 416-430. Ores, C. ad Sklaski, J., A eural ework ha explais as well as predics fiacial marke behavior, Proceedigs of he IEEE/IAFE, 1997, pp.43-49. Park, D., Kadel, A. ad Lagholz, Geeic-based ew fuzzy reasoig models wih applicaio o fuzzy corol, IEEE Trasacios o Sysems, Ma ad Cybereics, Vol.24, No.1, 1994, pp.39-41. Schaffer, J. D., Whiley, D. ad Eshelma, L. J., Combiaios of geeic algorihms ad eural eworks: a survey of he sae of he ar, Proceedigs of he Ieraioal Workshop o Combiaios of Geeic Algorihms ad Neural Neworks, Balimore, 1992, pp.1-37. Sco, P. D., williams, K. M. ad Ho, K. M., Formig caegories i exploraory daa aalysis ad daa miig, Advaces i Iellige Daa Aalysis, 1997, Spriger-Verlag, pp.235-246. Sexo, R. S., Dorsey, R. E. ad Johso, J. D., Toward global opimizaio of eural eworks: A compariso of he geeic algorihm ad backpropagaio, Decisio Suppor Sysems, Vol. 22, 1998a, pp. 171-185. Sexo, R. S., Alidaee, B., Dorsey, R. E. ad Johso, J.
D., Global opimizaio for arificial eural eworks: A abu search applicaio, Europea Joural of Operaioal Research, Vol. 106, 1998b, pp. 570-584. Sexo, R. S., Dorsey, R. E. ad Johso, J. D., Opimizaio of eural eworks: A comparaive aalysis of he geeic algorihm ad simulaed aealig, Europea Joural of Operaioal research, Vol. 114, 1999, pp. 589-601. Shi, K., Shi, T. ad Ha, I., Neuro-geeic approach for bakrupcy predicio: A compariso o backpropagaio algorihms, Proceedigs of he Korea Sociey of Maageme Iformaio Sysems 98 Ieraioal Coferece, Seoul, Korea, 1998, pp. 585-597. Susmaga, R., Aaalyzig discreizaios of coiuous aribues give a moooic discrimiaio fucio, Iellige Daa Aalysis: A Ieraioal Joural, Vol.1, No.3, 1997, hp://wwweas.elsevier.com/ida/browse/0103/ida00012/aricl e.hm. Tig, K.A., Discreizaio i lazy learig algorihms, Trippi, R. R., ad DeSieo, D., Tradig equiy idex fuures wih a eural ework, The Joural of Porfolio Maageme, 1992, pp. 27-33. Tsaih, R., Hsu, Y. ad Lai, C. C., Forecasig S&P 500 sock idex fuures wih a hybrid AI sysem, Decisio Suppor Sysems, 1998, pp. 161-174. Vafaie, H. ad De Jog, K., Feaure space rasformaio usig geeic algorihms, IEEE Iellige Sysems ad Their Applicaios, Vol.13, No.2, 1998, pp.57-65. Weiged, A. S., Rumelhar, D. E., ad Huberma, B. A., Geeralizaio by weigh elimiaio applied o currecy exchage rae predicio, Proceedigs of he Ieraioal Joi Coferece o Neural Neworks, 1991, pp. 837-841. Yag, J. ad Hoavar, V., Feaure subse selecio usig a geeic algorihm, IEEE Iellige Sysems ad Their Applicaios, Vol.13, No.2, 1998, pp.44-49. Yoo, Y. ad Swales, G., Predicig sock price performace: A eural ework approach, Proceedigs of he 24 h Aual Hawaii Ieraioal Coferece o Sysem Scieces, 1991, pp. 156-162. Arificial Ielligece Review, Vol.11, 1997, pp.157-174.