International Journal of Computer Information Systems and Industrial Management Applications
ISSN 2150-7988 Volume 4 (2012) pp. 659-670
MIR Labs, www.mirlabs.net/ijcisim/index.html

Foreign Exchange Rate Prediction using Computational Intelligence Methods

V. Ravi*, Ramanuj Lal and N. Raj Kiran

Institute for Development and Research in Banking Technology, Castle Hills, Road #1, Masab Tank, Hyderabad 500 057 (AP), India
*Corresponding Author: ravi_padma@yahoo.com; nrajkiran@gmail.com

Department of Physics and Meteorology, Indian Institute of Technology, Kharagpur 721302, West Bengal, India
ramanujlal@gmail.com

Abstract: This paper presents the application of six nonlinear ensemble architectures to forecasting foreign exchange rates in the computational intelligence paradigm. Intelligent techniques such as the backpropagation neural network (BPNN), wavelet neural network (WNN), multivariate adaptive regression splines (MARS), support vector regression (SVR), dynamic evolving neuro-fuzzy inference system (DENFIS), group method of data handling (GMDH) and genetic programming (GP) constitute the ensembles. Exchange-rate data for the US dollar (USD) against the Deutsche Mark (DEM), Japanese Yen (JPY) and British Pound (GBP) is used to test the effectiveness of the ensembles. To account for the autoregressive nature of the time-series problem, we considered lagged variables in the experimental design. All the techniques are compared using normalized root mean squared error (NRMSE) and the directional statistic (D_stat) as performance measures. The results indicate that the GMDH- and GP-based ensembles yielded the best results consistently over all the currencies. GP-based ensembling emerged as the clear winner based on its consistency with respect to both D_stat and NRMSE, although GMDH outperforms it on one of the currencies (DEM). Based on the numerical experiments conducted, it is inferred that using suitably sophisticated ensembling methods in the computational intelligence paradigm can improve on the results obtained by extant techniques for forecasting foreign exchange rates.

Keywords: Foreign exchange rate forecasting,
Computational intelligence, Ensemble, Intelligent techniques, Market risk

I. Introduction

Foreign Exchange Rate (Forex) markets are among the most liquid markets in the world. Liquidity implies the ability to be easily converted through an act of buying or selling without causing a significant movement in the price and with minimum loss of value, because at any given time there are a large number of buyers and sellers in the market. A crucial factor in maintaining this liquidity is the presence of three types of market players: investors (those who are looking to invest in a currency for long-term gains), arbitrageurs (those who wish to make risk-free profits by exploiting any price mismatch due to market inefficiencies), and speculators (who take bets on the direction of price movements). Typically, financial institutions engage in all three activities, either on behalf of their clients or on their own account. Although the percentage price movements, and hence marginal gains, in Forex markets are very low, the principal amount (also called the nominal value) of trading runs into trillions of dollars, resulting in high absolute profits (or losses!).
Daily turnover was reported to be over US$3.2 trillion in April 2007 by the Bank for International Settlements [1]. Since then, the market has continued to grow. According to Euromoney's annual FX Poll, volumes grew a further 41% between 2007 and 2008 [2]. In such a situation, the profitability of the trader depends upon his ability to predict future rate movements correctly. For large multinational firms, including banks, which conduct substantial currency transfers in the course of business, being able to accurately forecast movements of currency exchange rates can result in a substantial improvement in the overall profitability of the firm.

Forecasting has been dominated by linear statistical methods for several decades. Although linear models possess many advantages in implementation and interpretation, they have serious limitations in that they cannot capture the nonlinear relationships in the data which are common in many complex real-world problems [3]. Approximating complicated nonlinear forecasting problems by linear models is often not satisfactory. In the early 1980s, Makridakis [4] organized a large-scale forecasting competition (M-competition) in which the majority of commonly used linear forecasting methods were tested using real time-series data. The results showed that no single forecasting method is globally the best. According to Zhang et al. [5], one of the major reasons for this conclusion is that there is a varying degree of nonlinearity in the data, which cannot be handled properly by linear statistical methods. The popularity of Artificial Neural Networks (ANNs) and other computationally intelligent methods derives from the fact that they are generalized nonlinear forecasting models. Palit and Popovic [6] highlight the advantages of the computationally intelligent methods as follows: (i) general nonlinear mapping between a subset of the past time-series
values and the future time-series values, (ii) the capability of capturing essential functional relationships in the data, which is valuable when such relationships are not known a priori or are very difficult to describe mathematically and/or when the collected data are corrupted by noise, (iii) a universal function approximation capability that enables modeling of arbitrary nonlinear continuous functions to any degree of accuracy, and (iv) the capability of learning and generalization from examples using a data-driven, self-adaptive approach.

Nevertheless, predicting exchange rate movements is still a problematic task. Most conventional econometric models are not able to forecast exchange rates with significantly higher accuracy. In recent years, there has been a growing interest in adopting state-of-the-art artificial intelligence technologies to solve the problem. One stream of these advanced techniques focuses on the use of artificial neural networks (ANN) to analyze the historical data and provide predictions of future movements in the foreign exchange market.

In this study, we apply different ensemble-based techniques to predicting monthly exchange rates of the US dollar with respect to three major foreign currencies: the German mark (DEM), British pound (GBP) and Japanese yen (JPY). Six different nonlinear ensembles are designed and tested, where the backpropagation neural network (BPNN), wavelet neural network (WNN), multivariate adaptive regression splines (MARS), support vector regression (SVR), dynamic evolving neuro-fuzzy inference system (DENFIS), group method of data handling (GMDH) and genetic programming (GP) constitute the ensembles.

The remainder of this paper is organized as follows. A review of the literature is given in Section 2. In Section 3, we give an overview of the different intelligent techniques applied to exchange rate prediction. A description of the ensemble techniques developed in this paper is provided in Section 4. In Section 5, the experimental methodology is presented. Section 6 discusses the results obtained. The paper is concluded in Section 7.

II. Literature Review
Many research studies have been carried out in the area of exchange rate prediction in recent years. De Matos [7], as part of his work, compared the strength of a multi-layer feed-forward network (MLFN) with that of a recurrent network, based on forecasting Japanese yen futures. Kuan and Liu [8] provided a comparative evaluation of the performance of the MLFN and a recurrent network on the prediction of an array of commonly traded exchange rates. Hsu et al. [9] developed a clustering neural network model to predict the direction of movements in the USD/DEM exchange rate. Their experimental results suggested that their proposed model achieved better forecasting performance relative to other indicators. Tenti [10] proposed the use of recurrent neural networks to forecast foreign exchange rates; three recurrent architectures were compared in terms of the prediction accuracy of futures forecasts for the Deutsche mark. Muhammad and King [11] presented an evolutionary fuzzy network method for prediction in foreign exchange markets. Fuzzy systems not only provided the mechanism to integrate human linguistic knowledge into a logical framework but also provided the means to extract fuzzy rules from an observed data set; genetic algorithms were used to adapt the parameters of the fuzzy network in order to obtain the best performance. Shazly and Shazly [12] designed a hybrid system combining neural networks and genetic training, applied to the 3-month spot rate of exchange for four currencies: the British pound, the German mark, the Japanese yen and the Swiss franc. The empirical results revealed that the networks' forecasts outperformed predictions made by both the forward and futures rates in terms of accuracy and the direction of change in the exchange rate movement. Leung et al. [13] compared the forecasting accuracy of the MLFN with that of the general regression neural network (GRNN). Their study showed that the GRNN possessed greater forecasting strength relative to the MLFN with respect to a variety of currency exchanges. Zhang and Berardi [14] investigated the
use of neural network combining methods to improve the time-series forecasting performance of the traditional single keep-the-best (KTB) model. Instead of using a single network architecture, their research investigated the use of ensemble methods in exchange rate forecasting. Two general approaches to combining neural networks were proposed and examined in predicting the exchange rate between the British pound and the US dollar. Essentially, the study proposed using systematic and serial partitioning methods to build ensemble models consisting of different neural network structures. Results indicated that the ensemble network could consistently outperform a single network design. Walczak [15] examined the effects of different sizes of training sample sets on forecasting currency exchange rates. It was shown that neural networks given an appropriate amount of historical knowledge can forecast future currency exchange rates with 60 percent directional accuracy, while neural networks trained on a larger training set had worse forecasting performance. In addition to higher-quality forecasts, the reduced training set sizes reduced development cost and time. Hu et al. [16] applied a sequential learning neural network, named the Minimal Resource Allocating Network (MRAN), to forecast monthly exchange rates between the US dollar and various other currencies, and found the neural network's performance to be better in terms of both forecast and direction accuracy. Yu et al. [17] proposed a nonlinear ensemble forecasting model integrating generalized linear autoregression (GLAR) with ANN and obtained accurate prediction results and forecasting performances. The proposed model's performance was compared with the individual forecasting methods, as well as the hybrid model and linear combination models, and the empirical results showed that the prediction results using the nonlinear ensemble model were better than those obtained using the other models. Recently, more hybrid forecasting models have been developed that integrate neural network techniques with conventional forecasting methods
such as econometric models and time-series models to improve prediction accuracy. Wedding II and Cios [18] constructed a model combining radial basis function (RBF) networks, certainty factors, and the Box-Jenkins model. Their experimental results showed that the combination approach improves the overall reliability of time-series forecasting. They discussed three different
methods in which the two forecasts can be combined into one hybrid forecast. Similarly, Zhang [19] proposed a hybrid methodology combining autoregressive integrated moving average (ARIMA) and ANN models, taking advantage of their unique strengths in linear and nonlinear modeling respectively. Their experimental results with real data sets indicated that the combined model could be an effective way to improve on the forecasting accuracy achieved by either of the models used separately. Chen and Leung [20] proposed an adaptive forecasting approach that combined the strengths of neural networks and multivariate econometric models. Their hybrid approach contained two forecasting stages: in the first stage, a time-series model generates estimates of the exchange rates; in the second stage, a general regression neural network is used to correct the errors of the estimates. Both empirical and trading simulation experiments suggested that the proposed hybrid approach not only produces better exchange rate forecasts but also results in higher investment returns than the single-stage models. Ince and Trafalis [21] proposed a two-stage forecasting model that incorporated both parametric techniques such as ARIMA and non-parametric techniques such as support vector regression (SVR) and ANN. Their findings showed that input selection is very important and that the SVR technique outperformed the ANN for the input selection methods considered. Lee and Wong [22] investigated the predictive performance of a hybrid multivariate model using multiple macroeconomic and foreign exchange market microstructure variables. Conceptually, the proposed system combined and exploited the merits of adaptive-learning artificial neural network (ANN) and intuitive reasoning (fuzzy-logic inference) tools. An ANN was employed to forecast a foreign exchange rate movement, followed by intuitive reasoning over multi-period foreign currency returns using multi-valued fuzzy logic for foreign currency risk management decision-making. Empirical tests with statistical and
machine learning criteria revealed plausible performance of its predictive capability.

III. Overview of the Intelligent Techniques

The following techniques are applied to predict foreign exchange rates of the US dollar in terms of the German mark, the British pound, and the Japanese yen: (i) backpropagation neural network (BPNN), (ii) wavelet neural network (WNN), (iii) dynamic evolving neuro-fuzzy inference system (DENFIS), (iv) multivariate adaptive regression splines (MARS), (v) support vector regression (SVR), (vi) group method of data handling (GMDH) and (vii) genetic programming (GP). As the BPNN is very popular, it is not discussed here. All the remaining constituents of the ensembles are described briefly in the subsequent subsections.

A. Wavelet Neural Network

The word wavelet is due to Morlet and Grossmann [23] in the early 1980s. Wavelets are a class of functions used to localize a given function in both space and scaling. A family of wavelets can be constructed from a function ψ(x), sometimes known as a "mother wavelet," which is confined to a finite interval [a, b]. "Daughter wavelets" ψ_{a,b}(x) are then formed by translation (b) and dilation (a). Wavelets are especially useful for compressing image data, since a wavelet transform is in some ways superior to a conventional Fourier transform. An individual wavelet can be defined by [24]:

ψ_{a,b}(x) = |a|^(-1/2) ψ((x - b) / a)

Recently, owing to the similarity between the discrete inverse wavelet transform and a one-hidden-layer neural network, the idea of combining wavelets and neural networks has been proposed. This has resulted in the wavelet neural network (WNN), a feedforward neural network with one hidden layer of nodes whose basis functions are drawn from a family of orthonormal wavelets. The WNN alleviates the conventional problem of poor convergence, or even divergence, encountered in other kinds of neural networks, and can dramatically increase convergence speed [25]. Wavelets, in addition to forming an orthogonal basis, are capable of explicitly representing the behavior of a function at various resolutions of the input variables. Consequently, a wavelet network is first trained to
learn the mapping at the coarsest resolution level. In subsequent stages, the network is trained to incorporate elements of the mapping at higher and higher resolutions. Wavelet networks employ activation functions that are dilated and translated versions of a single function ψ: R^d → R, where d is the input dimension [26], [27]. This function, called the mother wavelet, is localized in both the space and frequency domains [24]. The WNN consists of three layers: an input layer, a hidden layer and an output layer. All the units in each layer are fully connected to the nodes in the next layer. The output layer contains a single unit. The WNN implemented here uses the Gaussian function f(t) = exp(-t^2) as the wavelet activation function. The WNN was implemented in ANSI C using Visual Studio 6 in a Windows environment on a Pentium 4 machine with 256 MB RAM.

B. Dynamic Evolving Neuro-Fuzzy Inference System (DENFIS)

DENFIS was introduced by Kasabov [28]. DENFIS evolves through incremental, hybrid (supervised/unsupervised) learning and accommodates new input data, including new features, new classes, etc., through local element tuning. New fuzzy rules are created and updated during the operation of the system. At each time moment, the output of DENFIS is calculated through a fuzzy inference system based on the m most activated fuzzy rules, which are dynamically chosen from a fuzzy rule set. A set of fuzzy rules can be inserted into DENFIS before or during its learning process, and fuzzy rules can also be extracted during or after the learning process. DENFIS, available in the student version of the NeuCom tool obtained from (http://www.aut.ac.nz/research/research_institutes/kedri/resea
rch_centres/centre_for_data_mining_and_decision_support_systems/neucom.htm), was used in this paper.

C. Multivariate Adaptive Regression Splines (MARS)

Multivariate adaptive regression splines (MARS) was introduced by Friedman [29]. MARS is an innovative and flexible modeling tool that automates the building of accurate predictive models for continuous and binary dependent variables. It excels at finding optimal variable transformations and interactions, the complex data structure that often hides in high-dimensional data. In doing so, MARS effectively uncovers important data patterns and relationships that are difficult for other methods to reveal. MARS, available at (http://salford-systems.com/), was used in this paper.

D. Support Vector Regression (SVR)

SVR is a powerful learning algorithm based on recent advances in statistical learning theory. SVR is a learning system that uses a hypothesis space of linear functions in a high-dimensional space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory [30]. SVR has recently become one of the popular tools for machine learning and data mining and can perform both classification and regression. SVR uses a linear model to implement nonlinear class boundaries by mapping input vectors nonlinearly into a high-dimensional feature space using kernels. The training examples that are closest to the maximum-margin hyperplane are called support vectors; all other training examples are irrelevant for defining the binary class boundaries. The support vectors are then used to construct an optimal linear separating hyperplane (in the case of pattern recognition) or a linear regression function (in the case of regression) in this feature space. The support vectors are conventionally determined by solving a quadratic programming (QP) problem. SVR has the following advantages: (i) it is able to generalize well even if trained with a small number of examples; (ii) it does not assume prior knowledge of the probability distribution of the underlying data set; and (iii) it is simple enough to be analyzed mathematically. In fact,
SVR may serve as a sound alternative combining the advantages of conventional statistical methods, which are more theory-driven and easy to analyze, and machine learning methods, which are more data-driven, distribution-free and robust. Recently, SVR has been used in financial applications such as credit rating, time-series prediction and insurance claim fraud detection.

E. Group Method of Data Handling (GMDH)

GMDH is a family of inductive algorithms for the mathematical modeling of multi-parametric datasets that features fully automatic structural and parametric optimization of models. GMDH can find relations in data in order to select the optimal structure of a model or network, or to increase the accuracy of existing algorithms. This self-organizing approach is different from commonly used deductive modeling: it is inductive in that the best solution is found by sorting out possible variants, and the algorithm itself finds the structure of the model and the laws of the system (http://en.wikipedia.org/wiki/GMDH). GMDH algorithms inductively sort out gradually more complicated polynomial models and select the best solution by means of an external criterion. A GMDH model with multiple inputs and one output is a subset of components of the base function

Y(x_1, ..., x_n) = a_0 + Σ_{i=1}^{m} a_i f_i

where the f_i are elementary functions dependent on different sets of inputs, the a_i are coefficients and m is the number of base function components. The GMDH algorithm considers various component subsets of the base function, called partial models, and the coefficients of these models are estimated by the least squares method. The number of partial model components is gradually increased to find a model structure of optimal complexity, indicated by the minimum value of an external criterion. This process is called self-organization of models. GMDH is also known as polynomial neural networks and statistical learning networks, owing to the implementation of the corresponding algorithms in several commercial software products (http://en.wikipedia.org/wiki/GMDH).

F. Genetic Programming

Genetic programming (GP) is a biologically inspired evolutionary algorithm for finding computer programs that perform a given task.
It is like genetic algorithms (GA), but here each individual is a computer program. It optimizes a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task. Being computationally intensive, in the 1990s GP was mainly used to solve relatively simple problems. But thanks to improvements in GP algorithms and to the exponential growth in CPU power, GP has become more prevalent and has produced many novel and outstanding results in areas such as quantum computing, electronic design, game playing, sorting and searching (http://en.wikipedia.org/wiki/Genetic_programming). GP evolves computer programs, traditionally represented in memory as tree structures which can be easily evaluated recursively (http://en.wikipedia.org/wiki/Genetic_programming). Every internal node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. The main operators used in GP are crossover and mutation. Crossover is applied to an individual by switching one of its nodes with a node from another individual in the population. With a tree-based representation, replacing a node means replacing the whole branch, which gives greater effectiveness to the crossover operator; the child expressions resulting from crossover can be very different from their initial parents. Mutation affects a single individual in the population: it can replace a whole node in the selected individual, or it can replace just the node's information. The simulations were run with Discipulus, obtained from http://www.rmltech.com/.
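As a concrete illustration of the tree representation and the crossover operator described above, the following is a minimal, self-contained Python sketch. It is our own toy code, not the Discipulus system; all names, the toy expressions and the random seed are illustrative assumptions.

```python
# Minimal illustrative sketch of GP expression trees (hypothetical code,
# not the Discipulus system): trees are nested lists [op, left, right],
# terminals are variable names or constants, and crossover swaps
# randomly chosen subtrees between two parents.
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(node, env):
    # Recursive evaluation: internal nodes apply an operator to their
    # evaluated children; leaves are variables (looked up) or constants.
    if isinstance(node, list):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[node] if isinstance(node, str) else node

def subtrees(node, path=()):
    # Enumerate every node together with the path leading to it.
    yield path, node
    if isinstance(node, list):
        for k in (1, 2):
            yield from subtrees(node[k], path + (k,))

def replace(node, path, new):
    # Return a copy of the tree with the subtree at `path` replaced.
    if not path:
        return new
    out = list(node)
    out[path[0]] = replace(node[path[0]], path[1:], new)
    return out

def crossover(parent_a, parent_b, rng):
    # Swapping a node means swapping the whole branch rooted there.
    path_a, sub_a = rng.choice(list(subtrees(parent_a)))
    path_b, sub_b = rng.choice(list(subtrees(parent_b)))
    return replace(parent_a, path_a, sub_b), replace(parent_b, path_b, sub_a)

rng = random.Random(3)
a = ["+", ["*", "x", "x"], 1]   # represents x*x + 1
b = ["-", "x", ["*", 2, "x"]]   # represents x - 2*x
print(evaluate(a, {"x": 3}))    # 10
child, _ = crossover(a, b, rng)
print(evaluate(child, {"x": 3}))
```

A mutation operator would similarly pick a random path and substitute a freshly generated subtree there; both operators leave the parents intact because `replace` copies the nodes it touches.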
IV. Intelligent Nonlinear Ensembles

The idea behind ensemble systems is to exploit each constituent model's unique features to capture different patterns that exist in the dataset. Both theoretical and empirical work indicates that ensembling can be an effective and efficient way to improve accuracy. Bates and Granger [31], in their seminal work, showed that a linear combination of different techniques gives a smaller error variance than any of the individual techniques working in stand-alone mode. Since then, many researchers have worked on ensembling, or combined forecasts. Makridakis et al. [4] reported that combining several single models has become common practice in improving forecasting accuracy. Pelikan et al. [32] proposed combining several feed-forward neural networks to improve time-series forecasting accuracy. Ensemble techniques for prediction problems with a continuous dependent variable include linear ensembles, e.g., the simple average (Benediktsson et al. [33]), the weighted average (Perrone and Cooper [34]) and stacked regression (Breiman [35]), and nonlinear ensembles, e.g., the neural-network-based nonlinear ensemble (Yu et al. [17]). Hansen et al. [36] reported that the generalization ability of a neural network system can be significantly improved by using an ensemble of neural networks, the purpose being improved overall accuracy on the production data. In general, for classification problems, an ensemble system combines the individual classification decisions in some way, typically by majority voting, to classify new examples. The basic idea is to train a set of models (experts) and allow them to vote. In the majority voting scheme, all the individual models are given equal importance. Another way of combining the models is via weighted voting, wherein the individual models are treated as unequally important: weights are attached to the predictions given by the individual models before combining them. Olmeda and Fernandez [37] presented a genetic-algorithm-based ensemble system, where a GA determines
the optimal combination of the individual models so that accuracy is maximized. Zhou et al. [38] carried out a detailed study on ensembling neural networks and proposed that using a subset of the available neural networks to form an ensemble can be better than using all of them. They proposed an approach for selecting the neural networks that become part of the ensemble from the available set, with a genetic algorithm used to assign weights to the constituent networks.

It is generally the case that for a given dataset one kind of intelligent technique outperforms another, and the results can be entirely opposite when a different dataset is used. In order not to lose any generality, and also to combine the advantages of the individual intelligent techniques, an ensemble uses the outputs of all the stand-alone intelligent techniques, with each being assigned a certain priority level, and produces its output with the help of an arbitrator. The ensemble takes the outputs obtained from the individual constituents as its inputs, and the data is processed according to the design of the arbitrator. Six different variants of ensembles are designed and employed, as shown in Figure 1. These include (a) a nonlinear ensemble based on BPNN, (b) a nonlinear ensemble based on WNN, (c) a nonlinear ensemble based on DENFIS, (d) a nonlinear ensemble based on MARS, (e) a nonlinear ensemble based on GMDH and (f) a nonlinear ensemble based on GP.

Figure 1. Generic Design of the Ensemble

V. Experimental Design

The foreign exchange data used in our study are obtained from the Pacific Exchange Rate Service (http://fx.sauder.ubc.ca/) provided by Prof. W. Antweiler, University of British Columbia, Vancouver, Canada. They consist of monthly US dollar exchange rates with respect to three major currencies: DEM, GBP and JPY. The monthly data from January 1971 to December 2000 (360 observations) is used as the training sample for training the different intelligent techniques, and the monthly data from January 2001 to December 2003 (36 observations) is used as the test sample for comparing the performance of the different intelligent
techniques. Since foreign exchange rate forecasting has only one dependent variable and no explanatory variables in the strict sense, and since we have a time series, we followed the general time-series forecasting model in conducting our experiments, which is represented in the following form:

X_t = f_{BPNN/WNN/DENFIS/MARS/GMDH/GP}(X')

where X' is the vector of lagged variables {x_{t-1}, x_{t-2}, ..., x_{t-p}}. Hence the key to solving the forecasting problem is to approximate the function f. This can be done by iteratively adjusting the weights in the modeling process. In their pioneering study of weak-form efficiency in markets, Cornell and Dietrich [39] were the first to use lagged values of the same time series to predict future currency price movements. An illustration of how training patterns can be
designed in the neural network modeling process is provided in Figure 2 (Xu et al. [40]).

Figure 2. Design of training patterns from a time series (after Xu et al. [40])

In this figure, p denotes the number of lagged variables and (t - p) denotes the total number of training samples. In this representation, X is a set of (t - p) vectors of dimension p, and Y is a vector of dimension (t - p). Thus, in the transformed data set, X and Y represent the explanatory variables and the dependent variable respectively.

SPSS 14 (obtained from http://www.spss.com) was used to find the optimal lag for the given time-series data. We performed the autocorrelation function and partial autocorrelation function tests prescribed by the Box-Jenkins methodology for time-series forecasting on the data set, and found that lag 1 was sufficient for DEM and JPY, while GBP required lag 2. However, we wanted to investigate whether the NRMSE values would improve further with higher lags, so we also tested lags 5 to 7, as prescribed by Yu et al. [17]. In view of the foregoing discussion on generating lagged data sets out of the original time series, we created four datasets corresponding to each exchange rate: lags 1, 5, 6 and 7 respectively for DEM and JPY, and lags 2, 5, 6 and 7 for GBP. Since this is time-series data, performing 10-fold cross-validation does not make sense, as it involves randomly assigning samples to the folds, whereby the time aspect of the data is obscured and overlooked. 10-fold cross-validation is extremely powerful and useful in assessing the performance of a model, provided we do not deal with time-series or spatial-series data. Hence, we carried out the hold-out method of testing, viz., splitting the data set into 360 training samples and 36 testing samples respectively. In fact, this check is included in many popular commercial data mining / statistical tools. The training data is used to identify the optimal parameters for the model that satisfy the given error criteria, and those parameters are then used to forecast values on the test set. The value of the Normalized Root Mean
Square Error (NRMSE) is used as the measurement criterion:

NRMSE = sqrt( Σ_{i=1}^{n} (y_i - ŷ_i)^2 / Σ_{i=1}^{n} (y_i - ȳ)^2 )

where n is the number of forecasting observations, y_i is the actual value at period i, ŷ_i is the forecasted value at period i, and ȳ is the mean of the actual values.

Clearly, accuracy is one of the most important criteria for forecasting models; but for business practitioners, the aim of forecasting is to support or improve decisions so as to make money. In exchange rate forecasting, improved decisions often depend on correctly forecasting the direction of change of the exchange rate movement between the actual and predicted values in the test set (expressed in percentages). The ability to forecast movement direction can be measured by the directional change statistic (D_stat) developed by Yao and Tan [41], expressed as:

D_stat = (1/N) Σ_{i=1}^{N} a_i × 100%

where a_i = 1 if (y_{i+1} - y_i)(ŷ_{i+1} - y_i) ≥ 0 and a_i = 0 otherwise; y_i is the actual value at period i and ŷ_i is the forecasted value at period i.

VI. Results and Discussion

For each technique, the appropriate parameters, as specified by the algorithm, are tuned to obtain optimal results. Figures 4-9 depict graphical representations of the forecasting performance achieved by the various methods for the exchange rates of the US dollar against DEM, GBP and JPY using different models over different lags. In Tables 1-6, the results for all the methods over all the different lags are presented. In the tables, the second to fifth columns show results for the different lags for the various methods; the figures in bold in each column denote the best performance among all the methods for that particular lag, and the sixth column presents the best performance among all the lags for the corresponding method.

Figure 3. Legend for all the graphs (BPNN, WNN, DENFIS, MARS, SVM, GMDH, GP, MB ensemble, BPNN ensemble, WNN ensemble, MARS ensemble, GMDH ensemble, GP ensemble)
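The experimental pipeline described above, turning the series into lagged training patterns and scoring forecasts by NRMSE and D_stat, can be sketched in a few lines of Python. This is an illustrative sketch under our own naming and with made-up numbers, not the authors' code; `make_lagged`, `nrmse` and `d_stat` are hypothetical helper names.

```python
# Illustrative sketch (not the authors' code): build lagged training
# patterns from a univariate series and compute the two performance
# measures defined above. All names and numbers are made up.

def make_lagged(series, p):
    # Each input pattern holds the p lagged values {x_{t-p}, ..., x_{t-1}};
    # the target is x_t, as in the X / Y representation of Figure 2.
    X = [series[t - p:t] for t in range(p, len(series))]
    y = series[p:]
    return X, y

def nrmse(actual, predicted):
    # Root of squared forecast error normalized by variability about the mean.
    mean = sum(actual) / len(actual)
    num = sum((a - f) ** 2 for a, f in zip(actual, predicted))
    den = sum((a - mean) ** 2 for a in actual)
    return (num / den) ** 0.5

def d_stat(actual, predicted):
    # Percentage of periods where (y_{i+1} - y_i)(yhat_{i+1} - y_i) >= 0,
    # i.e. the forecast moves in the same direction as the series.
    hits = sum(1 for i in range(len(actual) - 1)
               if (actual[i + 1] - actual[i]) * (predicted[i + 1] - actual[i]) >= 0)
    return 100.0 * hits / (len(actual) - 1)

rates = [1.00, 1.20, 1.10, 1.30, 1.25, 1.40]
X, y = make_lagged(rates, p=2)
print(len(X), len(X[0]), len(y))           # 4 2 4

actual    = [1.00, 1.20, 1.10, 1.30, 1.25]
predicted = [1.00, 1.15, 1.18, 1.28, 1.20]
print(round(nrmse(actual, predicted), 3))  # 0.451
print(d_stat(actual, predicted))           # 100.0
```

In the setting of the paper, each intelligent technique would supply `predicted` for the 36-observation hold-out set, and the two measures would then be tabulated per method and per lag.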
Figure 4. Deviation of D_stat values of various methods for DEM from the mean D_stat

Figure 5. Deviation of NRMSE values of various methods for DEM from the mean NRMSE

Figure 6. Deviation of D_stat values of various methods for GBP from the mean D_stat

Figure 7. Deviation of NRMSE values of various methods for GBP from the mean NRMSE

Figure 8. Deviation of D_stat values of various methods for JPY from the mean D_stat

Figure 9. Deviation of NRMSE values of various methods for JPY from the mean NRMSE

Tables 1-3 show the forecasting performance of the different techniques in terms of the D_stat values for the three currencies DEM, GBP and JPY respectively over different lags. Tables 4-6 show the forecasting performance of the techniques in terms of the NRMSE values. Results from ARIMA models have also been added for each currency for the sake of
comparison. Interesting observations can be drawn from the tables. Firstly, there seems to be a correlation between the lag number and the corresponding NRMSE value: in general, the NRMSE values decrease as the lag number increases. However, this does not hold for the BPNN system, for which going to higher lags worsens both the D_stat and NRMSE values. This property is in line with the Time Series Recency effect propounded by Walczak [15] for backpropagation networks, where adding extra lags as input worsens the network performance. Importantly, the Time Series Recency effect was not observed for the other methods. Secondly, the ensemble-based techniques clearly outperformed their stand-alone counterparts in terms of both NRMSE and D_stat.

Table 1. A comparison of D_stat values between different techniques for DEM over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 6574 6574 6857 6574 4857 6574
WNN: 6574 68574 5748 68574 68574
DENFIS: 6 6574 7485 68574 7485
MARS: 4857 6857 6574 6574 6574
SVM: 4857 34857 54857 5748 5748
GMDH: 6937 7485 7748 88 88
GP: 684 68574 7485 756 7485
MB-ensemble: 6574 5748 6574 68574 68574
BPNN-ensemble: 6574 68574 6574 7485 7485
WNN-ensemble: 6574 74857 74857 74857 74857
DENFIS-ensemble: 4857 68574 7485 6574 7485
MARS-ensemble: 34857 6574 68574 68574 68574
GMDH-ensemble: 645 74857 7748 845 845
GP-ensemble: 74857 68574 74857 68574 74857

Table 2. A comparison of NRMSE values between different techniques for DEM over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 8 36 64 36
WNN: 8 4 87 74 68 8 74
DENFIS: 35 5 4
MARS: 37 96 5 39 39
SVM: 38 43 35 34 35
GMDH: 55 83 74 439 439
GP: 95 47 7 9 7
MB-ensemble: 77 369 6 47 47
BPNN-ensemble: 65 45 4 4
WNN-ensemble: 85 964 6 983 964
DENFIS-ensemble: 463 5 79 4 4
MARS-ensemble: 375 43 7
GMDH-ensemble: 65 935 85 85
GP-ensemble: 38 98 97 75 97

Table 3. A comparison of D_stat values between different techniques for GBP over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 6574 4857 54857 54857 4857 6574
WNN: 6 6857 6 6 6857
DENFIS: 6 68574 6574 6857 68574
MARS: 6574 6 6 6 6574
SVM: 6 6 4574 54857 6
GMDH: 53846 6574 6574 6563 6574
GP: 63578 68574 7485 657894 7485
MB-ensemble: 5748 6574 6 6 6574
BPNN-ensemble: 6 6574 6857 6857 6574
WNN-ensemble: 5748 68574 6857 6 68574
DENFIS-ensemble: 68574 6 5748 6857 68574
MARS-ensemble: 5748 68574 6857 6 68574
GMDH-ensemble: 6563 6 5748 63578 63578
GP-ensemble: 74857 68574 7748 6 7748
Table 4. A comparison of NRMSE values between different techniques for GBP over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 748 949 36 7559
WNN: 6883 77 75 7746 6543 748 6883
DENFIS: 734 754 785 76 785
MARS: 8 89 8564 8565 8
SVM: 749 739 8565 75 739
GMDH: 386 5968 584 754 584
GP: 7687 583 486 9346 486
MB-ensemble: 774 795 73 7 73
BPNN-ensemble: 689 6997 688 6779 6779
WNN-ensemble: 6886 7 6876 6738 6738
DENFIS-ensemble: 697 783 75 7993 697
MARS-ensemble: 7333 7548 7445 8565 7333
GMDH-ensemble: 778 693 663 6953 663
GP-ensemble: 6438 6846 6 75 6

Table 5. A comparison of D_stat values between different techniques for JPY over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 5485 3748 4857 4857
WNN: 5748 4574 6 48574 6 4574 5485
DENFIS: 5485 48574 54857 4857 54857
MARS: 5485 5485 6857 6 6857
SVM: 54857 5485 4574 4574 54857
GMDH: 564 54857 6574 578947 6574
GP: 5635 6857 6857 63578 63578
MB-ensemble: 4574 4857 5485 5485 5485
BPNN-ensemble: 4574 4 5485 5748 5748
WNN-ensemble: 48573 54857 6 6 6
DENFIS-ensemble: 4857 4574 54857 6 6
MARS-ensemble: 5485 5485 54857 6 6
GMDH-ensemble: 589743 6574 6574 684 684
GP-ensemble: 7485 74857 7485 8857 8857

Table 6. A comparison of NRMSE values between different techniques for JPY over different lags (columns: lag, lag 5, lag 6, lag 7, Best):
ARIMA (,,):
BPNN: 4933 4975 48 576
WNN: 487 484 465 484 655 48 465
DENFIS: 4953 4736 475 483 4736
MARS: 499 47 4733 473 47
SVM: 4936 496 4969 56 496
GMDH: 43 4569 446 3888 3888
GP: 44 443 46 46 46
MB-ensemble: 488 477 4574 459 4574
BPNN-ensemble: 4994 468 464 47 464
WNN-ensemble: 483 4596 446 4384 4384
DENFIS-ensemble: 496 4695 475 469 469
MARS-ensemble: 499 474 475 473 473
GMDH-ensemble: 3974 4459 435 3553 3553
GP-ensemble: 4793 3 3968 4 4466 3968

The best performance of all the networks over all lags is depicted in the form of bar charts in Figures 4-9. Figure 3 depicts the common legend followed in Figures 4-9. For the sake of clarity and better interpretability we have plotted the deviations of various
methods from the mean performance in all the figures. This way it is clearer which methods yield above-average results and which ones lag behind. While interpreting the figures it must be borne in mind that for the D_stat charts taller bars mean higher directional accuracy and hence better performance, whereas for the NRMSE charts lower values mean better performance. Thirdly, we can see from the charts that in most cases the GMDH and GP based ensembles performed better than the other ensembles. Fourthly, WNN beats BPNN in most of the cases. Fifthly, we see that the sophisticated nonlinear ensembles consistently outperformed the simple mean-based ensemble (MB-ensemble). Finally, and most importantly, the GP and GMDH based ensembles outperformed all the other techniques in most of the cases, except DEM, in terms of both D_stat and NRMSE.

From the figures we can see that, except for a few cases where stand-alone methods perform very well, in most cases the ensembling methods give better results. This can be inferred from the observation that most of the D_stat figures (Figures 4, 6, 8) show dips in the first half (corresponding to stand-alone methods) and taller bars in the second half (corresponding to ensemble methods). Expectedly, the opposite holds true for the NRMSE figures (Figures 5, 7, 9). Based on the D_stat measure, we can comprehensively conclude that ensembling yields better results than stand-alone techniques. We also observe that ensembling is more time consuming than using the intelligent methods in their stand-alone mode because, in general, an ensemble can be constructed only after the results of its constituents are available.
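To make the distinction between the simple mean-based ensemble (MB-ensemble) and a trained combiner concrete, here is a minimal Python sketch. For brevity the trained combiner shown is a regularized linear regression fitted on the constituents' out-of-sample forecasts; the paper's GMDH and GP ensembles play this combiner role with nonlinear models, so this is only an illustrative stand-in, not the authors' exact method:

```python
import numpy as np

def mean_ensemble(preds):
    """MB-ensemble: average the constituent forecasts.
    preds: array of shape (n_samples, n_models)."""
    return np.asarray(preds, dtype=float).mean(axis=1)

def fit_combiner(preds, y, lam=1e-6):
    """Fit a trainable combiner on the constituents' forecasts.
    Here: ridge regression with an intercept; the paper instead trains
    nonlinear models (e.g. GMDH or GP) on the same inputs."""
    P = np.asarray(preds, dtype=float)
    X = np.column_stack([np.ones(len(P)), P])  # add intercept column
    y = np.asarray(y, dtype=float)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def apply_combiner(preds, w):
    """Combine new constituent forecasts with the fitted weights."""
    P = np.asarray(preds, dtype=float)
    return np.column_stack([np.ones(len(P)), P]) @ w
```

As the text observes, the combiner can only be fitted after all constituent forecasts are available, which is why ensembling is slower than running any single model.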
However, it is observed that the gains accrued in the form of improved accuracy more than offset the time lost in ensembling. Further, we point out that when exchange rate prediction is to be made accurately in an offline manner, time is no constraint and nonlinear ensembles should be preferred. However, when time is a constraint, online methods like DENFIS should be preferred, as they need only one pass, or one iteration, to give predictions.

It should be noted that Yu et al. [17] used the same data sets used in this paper while designing an ANN-based ensemble with the following constituents: (i) a generalized linear autoregression model (GLAR), (ii) an artificial neural network (ANN), (iii) a GLAR-ANN hybrid, where the time series is modeled by GLAR and the errors are modeled by ANN, (iv) a mean-based GLAR-ANN hybrid and (v) a weighted-mean-based GLAR-ANN hybrid. Even though Yu et al. [17] reported excellent results, our results cannot be compared with theirs because they did not specify which lag was used in the experimental design. Further, it was not made clear in the aforementioned paper what data pre-processing steps were followed. It was precisely for this reason that we could not reproduce their results. Further, we claim that our method of ensembling is simpler, more diverse and superior because of the varied types of intelligent techniques used as constituents.

VII Conclusions

Six nonlinear ensemble architectures are developed to forecast foreign exchange rates in the computational intelligence paradigm. BPNN, WNN, MARS, SVR, DENFIS, GMDH and GP are chosen as the members of the ensembles. The data of exchange rates of the US dollar (USD) with respect to the Deutsche Mark (DEM), Japanese Yen (JPY) and British Pound (GBP) is used for testing and comparing the performance of the ensembles. Lagged variables are considered throughout the study in order to account for the autoregressive nature of the time series problem. Six different ensembles based on BPNN, WNN, MARS, DENFIS, GMDH and GP were developed. All the techniques are compared with normalized root mean squared error (NRMSE) and directional
statistics (D_stat) as the performance criteria.

The results indicate that, in terms of both D_stat and NRMSE, the GP and GMDH based ensembles consistently outperformed the other models over all the currencies. Of these two, although GMDH produced outstanding results for JPY, GP turned out to be the more consistent when the results of all three currencies and both performance measures are considered. We note that, though not all ensembling methods are as consistent as GMDH and GP, based on the D_stat values, ensembles consistently outperform stand-alone techniques for all three currencies. This gain in performance has to be weighed against the additional computational complexity of building the ensemble. Based on the results, it is inferred that ensembling in the computational intelligence paradigm is a sound alternative to the extant techniques for forecasting foreign exchange rates.

References

[1] Triennial Central Bank Survey (December 2007), Bank for International Settlements.
[2] Annual FX Poll (May 2008), Euromoney.
[3] C. W. J. Granger, T. Teräsvirta. Modelling Nonlinear Economic Relationships, Oxford: Oxford University Press, 1993.
[4] S. Makridakis, A. Andersen, R. Carbone, R. Fildes, M. Hibon, R. Lewandowski, J. Newton, E. Parzen, R. Winkler. The accuracy of extrapolation (time series) methods: results of a forecasting competition, Journal of Forecasting, 1, pp. 111-153, 1982.
[5] G. Peter Zhang, B. Eddy Patuwo, Michael Y. Hu. A simulation study of artificial neural networks for nonlinear time-series forecasting, Computers & Operations Research, 28, pp. 381-396, 2001.
[6] A. K. Palit, D. Popovic. Computational Intelligence in Time Series Forecasting: Theory and Engineering Applications, Springer-Verlag, 2005.
[7] G. De Matos. Neural networks for forecasting exchange rates, M.Sc. Thesis, The University of Manitoba, Canada, 1994.
[8] C. M. Kuan, T. Liu. Forecasting exchange rates using feed-forward and recurrent neural networks, Journal of Applied Econometrics, 10, pp. 347-364, 1995.
[9] W. Hsu, L. S. Hsu, M. F. Tenorio. A neural network procedure for selecting predictive indicators in currency trading, In: Refenes A. N., editor, Neural Networks in the Capital
Markets, New York: Wiley, 1995, pp. 45-57.
[10] P. Tenti. Forecasting foreign exchange rates using recurrent neural networks, Applied Artificial Intelligence, 10, pp. 567-581, 1996.
[11] A. Muhammed, G. A. King. Foreign exchange market forecasting using evolutionary fuzzy networks, IEEE/IAFE Conference on Computational Intelligence for Financial Engineering, pp. 3-7, 1996.
[12] M. R. El Shazly, H. E. El Shazly. Forecasting currency prices using genetically evolved neural network architectures, International Review of Financial Analysis, 8, pp. 67-8, 1999.
[13] M. T. Leung, A. S. Chen, H. Daouk. Forecasting exchange rates using general regression neural networks, Computers & Operations Research, 27, pp. 1093-1110, 2000.
[14] G. P. Zhang, V. L. Berardi. Time series forecasting with neural network ensembles: an application for exchange rate prediction, Journal of the Operational Research Society, 52, pp. 652-664, 2001.
[15] S. Walczak. An empirical analysis of data requirements for financial forecasting with neural networks, Journal of Management Information Systems, 17(4), pp. 203-222, 2001.
[16] M. Hu, P. Saratchandran, S. Narasimhan. A sequential learning neural network for foreign exchange rate
forecasting, Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 4, pp. 3963-3968, 2003.
[17] L. Yu, S. Wang, K. K. Lai. A novel nonlinear ensemble forecasting model incorporating GLAR and ANN for foreign exchange rates, Computers & Operations Research, 32, pp. 2523-2541, 2005.
[18] D. K. Wedding II, K. J. Cios. Time series forecasting by combining RBF networks, certainty factors, and the Box-Jenkins model, Neurocomputing, 10, pp. 149-168, 1996.
[19] G. P. Zhang. Time series forecasting using a hybrid ARIMA and neural network model, Neurocomputing, 50, pp. 159-175, 2003.
[20] A. S. Chen, M. T. Leung. Regression neural network for error correction in foreign exchange forecasting and trading, Computers and Operations Research, 31(7), pp. 1049-1068, 2004.
[21] H. Ince, T. B. Trafalis. A hybrid model for exchange rate prediction, Decision Support Systems, 42, pp. 1054-1062, 2006.
[22] V. C. S. Lee, H. T. Wong. A multivariate neuro-fuzzy system for foreign currency risk management decision making, Neurocomputing, 70, pp. 942-951, 2007.
[23] A. Grossmann, J. Morlet. Decomposition of Hardy functions into square integrable wavelets of constant shape, SIAM Journal on Mathematical Analysis, 15(4), pp. 723-736, 1984.
[24] V. M. Becerra, R. K. H. Galvão, M. Abou-Seada. Neural and wavelet network models for financial distress classification, Data Mining and Knowledge Discovery, 11, pp. 35-55, 2005.
[25] X. Zhang, J. Qi, R. Zhang, M. Li, Z. Hu, H. Xue, B. Fan. Prediction of programmed-temperature retention values of naphthas by wavelet neural networks, Computers and Chemistry, 25, pp. 125-133, 2001.
[26] Q. Zhang, A. Benveniste. Wavelet networks, IEEE Transactions on Neural Networks, 3(6), pp. 889-898, 1992.
[27] Q. Zhang. Using wavelet network in nonparametric estimation, IEEE Transactions on Neural Networks, 8(2), pp. 227-236, 1997.
[28] N. K. Kasabov. DENFIS: Dynamic Evolving Neural-Fuzzy Inference System and its application for time-series prediction, IEEE Transactions on Fuzzy Systems, 10(2), pp. 144-154, 2002.
[29] J. H. Friedman. Stochastic gradient boosting, Computational Statistics and Data Analysis, 38(4), pp. 367-378, 2002.
[30] N. Cristianini, J. Shawe-Taylor. An Introduction to Support
Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, 2000.
[31] J. M. Bates, C. W. J. Granger. The combination of forecasts, Operational Research Quarterly, 20, pp. 451-468, 1969.
[32] E. Pelikan, C. de Groot, D. Würtz. Power consumption in West-Bohemia: improved forecasts with decorrelating connectionist networks, Neural Network World, 2, pp. 701-712, 1992.
[33] J. A. Benediktsson, J. R. Sveinsson, O. K. Ersoy, P. H. Swain. Parallel consensual neural networks, IEEE Transactions on Neural Networks, 8, pp. 54-64, 1997.
[34] M. P. Perrone, L. N. Cooper. When networks disagree: ensemble methods for hybrid neural networks, In: Mammone R. J. (Ed.), Neural Networks for Speech and Image Processing, Chapman Hall, pp. 126-142, 1993.
[35] L. Breiman. Bagging predictors, Machine Learning, 24, pp. 123-140, 1996.
[36] J. Hansen, J. McDonald, J. Stice. Artificial intelligence and generalized qualitative response models: an empirical test on two audit decision-making domains, Decision Sciences, 23, pp. 708-723, 1992.
[37] I. Olmeda, E. Fernandez. Hybrid classifiers for financial multicriteria decision making: the case of bankruptcy prediction, Computational Economics, 10, pp. 317-335, 1997.
[38] Z. H. Zhou, J. Wu, W. Tang. Ensembling neural networks: many could be better than all, Artificial Intelligence, 137, pp. 239-263, 2002.
[39] W. B. Cornell, J. K. Dietrich. The efficiency of the market for foreign exchange under floating exchange rates, The Review of Economics and Statistics, 60(1), pp. 111-120, 1978.
[40] K. Xu, M. Xie, L. C. Tang, S. L. Ho. Application of neural networks in forecasting engine systems reliability, Applied Soft Computing, 2, pp. 255-268, 2003.
[41] J. T. Yao, C. L. Tan. A case study on using neural networks to perform technical forecasting of forex, Neurocomputing, 34, pp. 79-98, 2000.

Author Biographies

Vadlamani Ravi (V. Ravi) has been an Associate Professor at the Institute for Development and Research in Banking Technology (IDRBT), Hyderabad since February. He received his PhD in Soft Computing from Osmania University, Hyderabad and RWTH Aachen, Germany. Earlier, he was an Assistant Professor at IDRBT from 5- and Faculty at NUS, Singapore from -5. Prior to that, he was a Scientist in a CSIR research institute for 4
years. During the last 3 years, he has published 6 papers in refereed journals/conferences and invited book chapters. He edited Advances in Banking Technology and Management: Impacts of ICT and CRM, published by IGI Global, USA. He is a referee for several international journals and is on the Editorial Boards of IJIDS, IJDATS, IJISSS, IJWS and IJITPM. He is listed in the Marquis Who's Who in the World in 2009, and in Who's Who in Engineering.
Ramanuj Lal holds an Integrated M.Sc. in Physics from the Indian Institute of Technology, Kharagpur. His research interests include computational finance, algorithmic trading and intelligent techniques. He works for Deep Value Technology Pvt. Ltd., India.

Nampally Raj Kiran holds an M.Tech. (IT) from the University of Hyderabad and IDRBT. His research interests include software reliability prediction and neural networks. He works as a Software Engineer for Emerson, India.