Application of GA with SVM for Stock Price Prediction in Financial Market


International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064
Impact Factor (2012): 3.358
Volume 3 Issue 10, October 2014, www.ijsr.net, Licensed Under Creative Commons Attribution CC BY, Paper ID: OCT445

Om Prakash Jena, Dr. Sudarsan Padhy
Department of Computer Science and Applications, Utkal University, Odisha, India

Abstract: Time series forecasting is receiving remarkable attention from the research community, which uses data mining techniques to analyze extensive historical datasets for solving prediction problems. Forecasting of this type requires indicators derived from the relevant time series. In stock price forecasting in the financial sector, more than 100 indicators have been developed to understand stock market behavior, so identifying the right indicators is a challenging problem. In such a case, optimized computer algorithms need to be investigated and applied to identify the indicators that are really necessary. Among the machine learning techniques available, one recently investigated for time series forecasting is Support Vector Regression (SVR), or the Support Vector Machine (SVM) [7]. This study applies GA-SVM to predict the stock price index. In addition, it examines the feasibility of applying GA-SVM in financial forecasting by comparing it with the Support Vector Machine (SVM) and case-based reasoning. The experimental results show that SVM with GA provides a more optimized and promising alternative for stock market prediction.

Keywords: SVM, GA, Prediction, Stock price

1. Introduction

Machine learning methods are being used by several researchers to predict prices of financial instruments from the financial time series data of different markets. Support vector machines (SVMs) are promising methods for the prediction of financial time series because they use a risk function consisting of the empirical error and a regularized term derived from the structural risk minimization principle. Excellent performance of SVR applications has been reported during the recent decade [5][6]. These applications of SVR to time series forecasting are based on indicators derived from the relevant time series. To improve the prediction accuracy, one first needs to identify the important indicators. However, in certain circumstances, e.g. stock price forecasting in the financial sector, more than 100 indicators have been developed to understand stock market behavior, and thus the identification of the right indicators is a challenging problem.

In this paper, we have used GAs to choose the best inputs for the forecasting SVM model from a given set of inputs. Genetic algorithms offer a heuristic, population-based, evolutionary optimization method in which defined populations evolve over generations using the Darwinian principle of survival of the fittest. GAs are well suited to optimization problems in which the required characteristics are strongly present [2].

As the nature of markets differs from region to region, in this paper two machine learning techniques, the Support Vector Machine technique (SVM) and the Support Vector Machine technique with Genetic Algorithm (GA-SVM), have been used to predict futures prices traded in the Indian stock market. The performances of these techniques are compared, and it is observed that GA-SVM provides better performance than the SVM technique.

In our work, optimization is achieved by using GA-SVM to identify and remove the inputs that are not necessary for prediction, which avoids the over-fitting vulnerability that usually occurs in models with too many parameters. This has led to an improvement in the speed of SVM processing and has enhanced the prediction accuracy.

2. SVM for Regression

Given a set of training data $\{(x_1, y_1), \ldots, (x_l, y_l)\}$, where each $x_i \in X \subseteq R^n$ ($X$ denotes the input space of the sample) has a corresponding target value $y_i \in R$ for $i = 1, \ldots, l$ (where $l$ is the size of the training data), the objective of the regression problem is to determine a function that can approximate the value of $y$ for an $x$ not in the training set.

The estimating function $f$ is taken in the form:

$$f(x) = (w \cdot \Phi(x)) + b \tag{1}$$

where $w \in R^m$, $b \in R$ is the bias and $\Phi$ denotes a non-linear function from $R^n$ to a high dimensional space $R^m$ ($m > n$). The objective is to find the values of $w$ and $b$ such that the values of $f(x)$ can be determined by minimizing the risk:

$$R_{reg}(f) = C \sum_{i=1}^{l} L_\epsilon(y_i, f(x_i)) + \frac{1}{2}\lVert w \rVert^2 \tag{2}$$

where $L_\epsilon$ is the extension of the $\epsilon$-insensitive loss function originally proposed by Vapnik and defined as:

$$L_\epsilon(y, z) = \begin{cases} |y - z| - \epsilon, & |y - z| \ge \epsilon \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

Introducing the slack variables $\xi_i$ and $\xi_i'$, the above problem may be reformulated as

$$\text{(P)} \quad \text{Minimize} \quad C \sum_{i=1}^{l} (\xi_i + \xi_i') + \frac{1}{2}\lVert w \rVert^2$$

subject to the following constraints:

$$y_i - w \cdot \Phi(x_i) - b \le \epsilon + \xi_i, \qquad w \cdot \Phi(x_i) + b - y_i \le \epsilon + \xi_i', \qquad \xi_i \ge 0, \quad \xi_i' \ge 0$$

for $i = 1, 2, \ldots, l$, where $C$ is a user specified constant.

Problem (P) is solved using the primal-dual method, by determining the Lagrange multipliers $\alpha_i$ and $\alpha_i^*$ that maximize the objective function

$$Q(\alpha, \alpha^*) = \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) - \epsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j)$$

subject to the following conditions:

$$\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i \le C, \qquad 0 \le \alpha_i^* \le C \tag{4}$$

for all $i = 1, 2, \ldots, l$, where $C$ is a user specified constant and $K: X \times X \to R$ is the Mercer kernel defined by:

$$K(x, z) = \Phi(x) \cdot \Phi(z) \tag{5}$$

The solution of the primal yields

$$w = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \Phi(x_i) \tag{6}$$

and $b$ is calculated using the Karush-Kuhn-Tucker (KKT) conditions

$$\alpha_i (\epsilon + \xi_i - y_i + w \cdot \Phi(x_i) + b) = 0, \qquad \alpha_i^* (\epsilon + \xi_i' + y_i - w \cdot \Phi(x_i) - b) = 0 \tag{7}$$

$$(C - \alpha_i)\, \xi_i = 0 \quad \text{and} \quad (C - \alpha_i^*)\, \xi_i' = 0 \quad \text{for } i = 1, 2, 3, \ldots, l \tag{8}$$

Since $\alpha_i \alpha_i^* = 0$, and $\xi_i = 0$ (respectively $\xi_i' = 0$) whenever $\alpha_i \in (0, C)$ (respectively $\alpha_i^* \in (0, C)$), $b$ can be computed as follows:

$$b = y_i - w \cdot \Phi(x_i) - \epsilon \ \text{ for } \alpha_i \in (0, C), \qquad b = y_i - w \cdot \Phi(x_i) + \epsilon \ \text{ for } \alpha_i^* \in (0, C) \tag{9}$$

The $x_i$ corresponding to $\alpha_i \in (0, C)$ or $\alpha_i^* \in (0, C)$ are called support vectors. Using the expressions for $w$ and $b$ from equations (6) and (9), $f(x)$ is computed as

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) (\Phi(x_i) \cdot \Phi(x)) + b = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b \tag{10}$$

Note that the function $\Phi$ is not required to compute $f(x)$, which is one of the advantages of using the kernel.

3. Genetic Algorithm

Genetic algorithms (GAs) are search methods based on principles of natural selection and genetics (Fraser, 1957; Holland, 1975) [1][3]. GAs encode the decision variables of a search problem into finite-length strings of alphabets of a certain cardinality. The strings, which are candidate solutions to the search problem, are referred to as chromosomes, the alphabets are referred to as genes, and the values of genes are called alleles. A GA consists of the following steps.

1. Initialization: The initial population of candidate solutions is usually generated randomly across the search space.
2. Evaluation: Once the population is initialized or an offspring population is created, the fitness values of the candidate solutions are evaluated.
3. Selection: Selection allocates more copies to those solutions with higher fitness values and thus imposes the survival-of-the-fittest mechanism on the candidate solutions.
4. Recombination: Recombination combines parts of two or more parental solutions to create new, possibly better solutions (i.e. offspring).
5. Mutation: While recombination operates on two or more parental chromosomes, mutation locally but randomly modifies a solution. In other words, mutation performs a random walk in the vicinity of a candidate solution.
6. Replacement: The offspring population created by selection, recombination, and mutation replaces the original parental population.
7. Repeat steps 2-6 until a terminating condition is met.

The structure of the SVM with GA is shown in Figures 1 and 2.

Figure 1: The architecture of the SVMG model

Figure 2: Flow chart for selecting optimal parameters by genetic algorithm
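The seven GA steps listed in Section 3 can be sketched in Python roughly as follows. This is only an illustrative sketch and not the authors' implementation: the real-valued two-gene chromosome, binary tournament selection, arithmetic crossover, Gaussian mutation, the elitism step, the names `run_ga` and `toy_fitness`, and every rate and bound are assumptions made for brevity. In the paper's setting the fitness would presumably come from the validation performance of an SVM trained with the candidate parameters; a simple stand-in function is used here so that the block runs on its own.

```python
import random


def run_ga(fitness, bounds, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Maximize `fitness` over real-valued chromosomes (hypothetical helper)."""
    rng = random.Random(seed)

    def select(population, scores):
        # Step 3 (assumed binary tournament): the fitter of two random picks wins.
        a = rng.randrange(len(population))
        b = rng.randrange(len(population))
        return population[a] if scores[a] >= scores[b] else population[b]

    def recombine(p1, p2):
        # Step 4 (assumed arithmetic crossover): blend the two parents gene by gene.
        if rng.random() > crossover_rate:
            return list(p1)
        w = rng.random()
        return [w * g1 + (1.0 - w) * g2 for g1, g2 in zip(p1, p2)]

    def mutate(chromosome):
        # Step 5: small random walk around the candidate, clipped to the bounds.
        mutated = []
        for gene, (low, high) in zip(chromosome, bounds):
            if mutation_rate > rng.random():
                gene += rng.gauss(0.0, 0.1 * (high - low))
            mutated.append(min(high, max(low, gene)))
        return mutated

    # Step 1: random initial population across the search space.
    population = [[rng.uniform(low, high) for low, high in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):  # Step 7: repeat steps 2-6.
        scores = [fitness(c) for c in population]  # Step 2: evaluation.
        elite = population[max(range(pop_size), key=scores.__getitem__)]
        # Assumption: keep the best parent (elitism) so the best score never drops.
        offspring = [list(elite)]
        for _ in range(pop_size - 1):
            child = recombine(select(population, scores),
                              select(population, scores))
            offspring.append(mutate(child))
        population = offspring  # Step 6: replacement.
    scores = [fitness(c) for c in population]
    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]


def toy_fitness(params):
    """Stand-in for a validation score; peaks at the made-up point C=10, sigma=2."""
    c, sigma = params
    return -((c - 10.0) ** 2 + (sigma - 2.0) ** 2)


if __name__ == "__main__":
    best_params, best_score = run_ga(toy_fitness, [(0.1, 100.0), (0.1, 10.0)])
    print(best_params, best_score)
```

Keeping the best parent in each generation is a design choice added here; step 6 as stated replaces the whole parental population, in which case the best score may fluctuate between generations.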

4. Problem and Proposed Methodology

Previously, financial time series forecasting was carried out by Kyoung-jae Kim of South Korea using SVM [4]. In that SVM, specific values of the two parameters C and σ were chosen: to find the better prediction, he tried values of C and σ within some range in a probabilistic manner. Here, a GA-SVM hybrid technique is used to improve the accuracy of the prediction by making an optimal choice of the parameters C and σ. To set up an experiment that demonstrates the prediction accuracy of the proposed methodology, we have taken time series data of some Indian firms and predicted their share prices using SVM. The parameters used in the SVM are then passed to the proposed genetic algorithm to obtain their optimal values. The same data are also run through SVM-GA so that its share price prediction accuracy can be compared with that of SVM.

4.1 Data Sets

The forecasting model is used with the following real index futures data collected from the National Stock Exchange (NSE) of India Limited: S&P CNX NIFTY, S&P BSE FMCG INDEX, S&P INFOTECH 500, S&P BSE MIDCAP INDEX, and S&P BSE OIL & GAS INDEX. We have taken 650 samples for each of the futures contracts mentioned above. The time period for each contract is from 1st January, 00 to 04 August, 2013. The data collected consist of the daily previous closing price, open price, high price, low price, traded volume, and traded value. The daily closing prices are used as the data sets. Table 1 shows the high price, low price, mean, median and standard deviation of the five futures prices collected for our experiment.

Table 1: Description of Data Sets

INDEX FUTURES | HIGH | LOW | MEAN | MEDIAN | SD
S&P CNX NIFTY | 63.45 | 54. | 467.85 | 473.8 | 93.08
S&P BSE FMCG INDEX | 60.3 | 646.3 | 3880.67 | 389.53 | 666.9697
S&P INFOTECH 500 | 69.4 | 373.8 | 5766.86 | 5755.53 | 46.5086
S&P BSE MIDCAP INDEX | 799.37 | 5073.5 | 6439.59 | 6397.05 | 50.567
S&P BSE OIL & GAS INDEX | 70.45 | 6835.94 | 950.007 | 936.9 | 03.73

4.2 Preprocessing

The first step in financial forecasting is to choose a suitable forecasting horizon. From the prediction aspect, the forecasting horizon should be short enough, as the persistence of financial time series is of limited duration. A forecasting horizon of five days is a suitable choice for the daily data. The input variables are determined from four lagged RDP (relative difference in percentage of price) values based on five-day periods (RDP-5, RDP-10, RDP-15, and RDP-20), and one transformed closing price is obtained by subtracting a 15-day exponential moving average (EMA15) from the closing price. The subtraction is performed to eliminate the trend in price, as the maximum value and the minimum value are in the ratio of about 2:1 in all five data sets. EMA15 is used to retain as much of the information contained in the original closing price as possible, since applying the RDP transform to the original closing price may remove some useful information. The output variable RDP+5 is obtained by first smoothing the closing price with a three-day exponential moving average, because applying a smoothing transform to the dependent variable generally enhances the prediction performance of neural networks. The calculations for all the indicators are given in Table 2, where $\mathrm{EMA}_n(i)$ is the $n$-day exponential moving average of the $i$-th day and $p(i)$ is the closing price of the $i$-th day.

Since outliers may make it difficult or time-consuming to arrive at an effective solution for the SVMs, RDP values beyond the limits of ±2 standard deviations are selected as outliers and replaced with the closest marginal values. All five data sets are partitioned into three parts according to the time sequence.

Table 2: Performance Indicators

Indicator | Calculation
Input variables |
EMA15 | $p(i) - \mathrm{EMA}_{15}(i)$
RDP-5 | $(p(i) - p(i-5)) / p(i-5) \times 100$
RDP-10 | $(p(i) - p(i-10)) / p(i-10) \times 100$
RDP-15 | $(p(i) - p(i-15)) / p(i-15) \times 100$
RDP-20 | $(p(i) - p(i-20)) / p(i-20) \times 100$
Output variable |
RDP+5 | $(\bar{p}(i+5) - \bar{p}(i)) / \bar{p}(i) \times 100$, with $\bar{p}(i) = \mathrm{EMA}_3(i)$

4.3 Performance Criteria

The prediction performance is evaluated using the following statistical metrics: the normalized mean squared error (NMSE), the mean absolute error (MAE) and the directional symmetry (DS). The definitions of these criteria are given in Table 3. NMSE and MAE measure the deviation between the actual and predicted values: the smaller the values of NMSE and MAE, the closer the predicted time series values are to the actual values. DS indicates the correctness of the predicted direction of RDP+5, given as a percentage (a large value suggests a better predictor).

Table 3: Performance Metrics and Their Calculation

Metric | Calculation
NMSE | $\frac{1}{\delta^2 N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, with $\delta^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^2$
MAE | $\frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|$
DS | $\frac{100}{N} \sum_{i=1}^{N} d_i$, with $d_i = 1$ if $(y_i - y_{i-1})(\hat{y}_i - \hat{y}_{i-1}) \ge 0$ and $d_i = 0$ otherwise

Here $N$ is the total number of data patterns, and $y_i$ and $\hat{y}_i$ represent the actual and predicted output values.

4.4 Computation Techniques

We have applied Vapnik's SVM for regression using the LS-SVM toolbox. The typical kernel functions used in SVRs are the polynomial kernel $k(x, y) = (x \cdot y + 1)^d$ and the Gaussian kernel $k(x, y) = \exp(-\lVert x - y \rVert^2 / \delta^2)$, where $d$ is the degree of the polynomial kernel and $\delta$ is the bandwidth of the Gaussian kernel. We have chosen the Gaussian kernel function because it performs well under general smoothness assumptions. The polynomial kernel gives inferior results compared to the Gaussian kernel and takes longer to train SVMs. In the SVR, the values of $\delta$ and $\epsilon$ that produce the best result on the validation set of our data are taken. To select more nearly optimal values of $\delta$ and $\epsilon$, the genetic algorithm is then used as shown in the diagram of the GA section, so that the prediction becomes more accurate.

5. Results and Discussion

After training with Support Vector Machine regression and Support Vector Machine with GA regression, the forecasted price and the actual price for the test data are exhibited in Figures 3, 4, 5, 6 and 7.

Figure 3: Comparison of SVM and SVM-GA for S&P BSE FMCG Index Futures Contract

Figure 4: Comparison of SVM and SVM-GA for S&P CNX NIFTY Futures Contract

Figure 5: Comparison of SVM and SVM-GA for S&P BSE Oil & Gas Index Futures Contract

Figure 6: Comparison of SVM and SVM-GA for S&P BSE MIDCAP Index Futures Contract
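The three criteria of Section 4.3 (NMSE, MAE and DS) can be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the inputs are plain Python lists of actual and predicted RDP+5 values, and DS is averaged over the N-1 consecutive pairs that can actually be compared rather than over N.

```python
def nmse(actual, predicted):
    """Normalized mean squared error: squared error scaled by the sample variance."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Sample variance of the actual series (the delta^2 term of Table 3).
    variance = sum((y - mean_actual) ** 2 for y in actual) / (n - 1)
    squared_error = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return squared_error / (variance * n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def directional_symmetry(actual, predicted):
    """Percentage of steps where actual and predicted series move the same way."""
    pairs = len(actual) - 1  # assumption: average over comparable pairs only
    hits = sum(
        1
        for i in range(1, len(actual))
        if (actual[i] - actual[i - 1]) * (predicted[i] - predicted[i - 1]) >= 0
    )
    return 100.0 * hits / pairs


if __name__ == "__main__":
    # Made-up numbers purely for illustration; they are not from the paper.
    y_true = [1.0, 2.0, 3.0, 2.0]
    y_pred = [1.0, 2.5, 2.0, 2.5]
    print(nmse(y_true, y_pred), mae(y_true, y_pred),
          directional_symmetry(y_true, y_pred))
```

Lower NMSE and MAE and a higher DS indicate a better predictor, which is the sense in which the two models are compared on the test set.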

Figure 7: Comparison of SVM and SVM-GA for S&P INFOTECH Index Futures Contract

5.1 Comparison of Results

The forecasting results of SVM and SVM-GA for the test set are collected in Table 4, which shows that SVM-GA outperforms SVM in most of the cases: SVM-GA provides a smaller NMSE and MAE and a larger DS than SVM. The performance criteria set for our experiment show very good agreement between the predicted price and the actual price when the SVM-GA method is used in comparison to SVM. The NMSE for all the futures stock indices considered falls in the range of 0.799 to .73. The MAE falls in the range of 0.3 to 0.38, and the last criterion, DS, ranges from 85.97 to 93.377.

Table 4: Comparison of the Results of SVM & SVM-GA

6. Conclusion

In this research work, we have examined the feasibility of applying two machine learning models, Support Vector Machines (SVM) and Support Vector Machines with Genetic Algorithm (SVM-GA), to financial time series forecasting for futures trading in Indian derivative markets. The experiments demonstrated that SVM-GA provides a promising alternative tool to Support Vector Machines for financial time series forecasting, as it adopts the Structural Risk Minimization Principle, eventually leading to better generalization than that of the conventional technique. For future work, we intend to optimize the kernel function, parameters and feature subset simultaneously. We would also like to extend this model to instance selection problems.

References

[1] Fraser, A. S. (1957). Simulation of genetic systems by automatic digital computers. II: Effects of linkage on rates under selection. Austral. J. Biol. Sci., 10: 492-499.
[2] Gen, M. and Cheng, R. (2000). Genetic algorithms and engineering optimization. John Wiley, New York.
[3] Holland, J. H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.
[4] Kyoung-jae Kim (2003). Financial time series forecasting using support vector machines. Neurocomputing, 55: 307-319.
[5] Tay FEH, Cao LJ (2001a). Application of support vector machines in financial time series forecasting. Omega, 29: 309-317.
[6] Tay FEH, Cao LJ (2001b). Improved financial time series forecasting by combining support vector machines with self-organizing feature map. Intell. Data Anal., 5: 339-354.
[7] Vapnik VN (1999). An overview of statistical learning theory. IEEE Trans. Neural Networks, 10: 988-999.

Author Profile

Om Prakash Jena received his M.Tech in Computer Science from Utkal University in 2013 and has worked as a research scholar in the Department of Computer Science at Utkal University since 2014.