
MODELING CHAOTIC BEHAVIOR OF STOCK INDICES USING INTELLIGENT PARADIGMS

Ajith Abraham
Department of Computer Science, Oklahoma State University, Tulsa, Oklahoma 74106, USA. Email: ajith.abraham@ieee.org

Ninan Sajith Philip
Department of Physics, Cochin University of Science and Technology, India. Email: nsp@stthom.ernet.in

P. Saratchandran
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798. Email: epsarat@ntu.edu.sg

Abstract. The use of intelligent systems for stock market prediction has been widely established. In this paper, we investigate how the seemingly chaotic behavior of stock markets could be well represented using several connectionist paradigms and soft computing techniques. To demonstrate the different techniques, we considered the Nasdaq-100 index of the Nasdaq Stock Market and the S&P CNX NIFTY stock index. We analyzed 7 years of Nasdaq-100 main index values and 4 years of NIFTY index values. This paper investigates the development of a reliable and efficient technique to model the seemingly chaotic behavior of stock markets. We considered an artificial neural network trained using the Levenberg-Marquardt algorithm, a Support Vector Machine (SVM), a Takagi-Sugeno neuro-fuzzy model and a Difference Boosting Neural Network (DBNN). The paper briefly explains how the different connectionist paradigms can be formulated using different learning methods, and then investigates whether they can provide a level of performance that is sufficiently good and robust to yield a reliable forecast model for stock market indices. Experiment results reveal that all the connectionist paradigms considered could represent the behavior of the stock indices very accurately.

Keywords: connectionist paradigm, support vector machine, neural network, difference boosting, neuro-fuzzy, stock market.

1. INTRODUCTION

Prediction of stocks is generally believed to be a very difficult task, as the process behaves more like a random walk and is time varying. The obvious complexity of the problem paves the way for the importance of intelligent prediction paradigms. During the last decade, stocks and futures traders have come to rely upon various types of intelligent systems to make trading decisions [1][3][7][10][18][19][16][23][28], and several intelligent systems have in recent years been developed for modelling expertise, decision support, complicated automation tasks and the like [8][9][5][15][24][26][29][4][17]. In this paper, we analyse the seemingly chaotic behaviour of two well-known stock indices, namely the Nasdaq-100 index of the Nasdaq Stock Market [21] and the S&P CNX NIFTY stock index [22].

The Nasdaq-100 index reflects Nasdaq's largest companies across major industry groups, including computer hardware and software, telecommunications, retail/wholesale trade and biotechnology [21]. It is a modified capitalization-weighted index, designed to limit domination of the index by a few large stocks while generally retaining the capitalization ranking of companies. Through an investment in the Nasdaq-100 index tracking stock, investors can participate in the collective performance of many of the Nasdaq stocks that are often in the news or have become household names. Similarly, the S&P CNX NIFTY is a well-diversified 50-stock index accounting for 25 sectors of the economy [22]. It is used for a variety of purposes such as benchmarking fund portfolios, index-based derivatives and index funds. The CNX indices are computed using a market capitalisation weighted method, wherein the level of the index reflects the total market value of all the stocks in the index relative to a particular base period. The method also takes into account constituent changes in the index and, importantly, corporate actions such as stock splits, rights issues, etc., without affecting the index value.

Figure 1. Training and test data sets for the Nasdaq-100 index

Our research investigates the performance of four different connectionist paradigms for modelling the Nasdaq-100 and NIFTY stock market indices: an artificial neural network trained using the Levenberg-Marquardt algorithm [6], a support vector machine [27], a difference boosting neural network [25] and a Takagi-Sugeno fuzzy inference system learned using a neural network algorithm (neuro-fuzzy model) [13]. Neural networks are excellent forecasting tools and can learn from scratch by adjusting the interconnections between layers. Support vector machines offer excellent learning capability based on statistical learning theory. Fuzzy inference systems are excellent for decision making under uncertainty, and neuro-fuzzy computing is a popular framework wherein neural network training algorithms are used to fine-tune the parameters of fuzzy inference systems. We analysed the Nasdaq-100 index values from 11 January 1995 to 11 January 2002 [21] and the NIFTY index values from 01 January 1998 to 03 December 2001 [22]. For both indices, we divided the entire data into almost two equal parts. No special rules were used to select the training set other than ensuring a reasonable representation of the parameter space of the problem domain. The complexity of the training and test data sets for both indices is depicted in Figures 1 and 2 respectively. In Section 2 we briefly describe the different connectionist paradigms, followed by the experimentation setup and results in Section 3. Some conclusions are provided towards the end.

Figure 2. Training and test data sets for the NIFTY index
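
For illustration, a minimal NumPy sketch of this preparation step: the raw series is scaled, paired day by day with the next day's index value (the input variables are described in Section 3), and split chronologically into two nearly equal halves. The array names, the [0, 1] scaling and the synthetic stand-in data are our own assumptions, not the authors' code.

    import numpy as np

    def make_datasets(daily_values, target_col=0):
        """Scale each column to [0, 1], build (today's values -> tomorrow's
        index value) pairs, and split chronologically into two halves,
        mirroring the paper's "almost two equal parts" division."""
        v = np.asarray(daily_values, dtype=float)
        v = (v - v.min(axis=0)) / (v.max(axis=0) - v.min(axis=0))
        X, y = v[:-1], v[1:, target_col]
        mid = len(X) // 2
        return (X[:mid], y[:mid]), (X[mid:], y[mid:])

    # Synthetic stand-in for a daily [open, low, high] history:
    rng = np.random.default_rng(0)
    series = 100.0 + np.cumsum(rng.normal(size=(1000, 3)), axis=0)
    (train_X, train_y), (test_X, test_y) = make_datasets(series)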

2. INTELLIGENT SYSTEMS: A CONNECTIONIST MODEL APPROACH

Connectionist models learn by adjusting the interconnections between layers. When the network is adequately trained, it is able to generalize the relevant output for a set of input data. Learning typically occurs by example through training, where the training algorithm iteratively adjusts the connection weights (synapses).

2.1 ARTIFICIAL NEURAL NETWORKS

The artificial neural network (ANN) methodology enables us to design useful nonlinear systems accepting large numbers of inputs, with the design based solely on instances of input-output relationships. For a training set {(x_i, t_i) : i = 1, ..., n} consisting of n argument-value pairs, a d-dimensional argument x_i and an associated target value t_i are to be approximated by the neural network output. In most applications the training set is considered to be noisy, and our goal is not to reproduce it exactly but rather to construct a network function that generalizes well to new function values. We address the problem of selecting the weights w to learn the training set.

The notion of closeness on the training set is typically formalized through an error function of the form

    ε_T(w) = Σ_{i=1}^{n} (y_i − t_i)²    (1)

where y_i is the network output. Our target is to find a neural network η such that the output y_i = η(x_i, w) is close to the desired output t_i for the input x_i, where w denotes the strengths of the synaptic connections. The error ε_T = ε_T(w) is a function of w because y = η(x, w) depends upon the parameters w defining the selected network η. The objective function ε_T(w) for a neural network with many parameters defines a highly irregular surface with many local minima, large regions of little slope, and symmetries. The common node functions (tanh, sigmoidal, logistic, etc.) are differentiable to arbitrary order through the chain rule of differentiation, which implies that the error is also differentiable to arbitrary order. Hence we are able to make a Taylor series expansion in w for ε_T. We shall first discuss the algorithms for minimizing ε_T by assuming that we can truncate a Taylor series expansion about a point w₀ that is possibly a local minimum. The gradient (first partial derivative) vector is

    g(w) = ∇ε_T(w) = [∂ε_T/∂w_i]    (2)

The gradient vector points in the direction of steepest increase of ε_T, and its negative points in the direction of steepest decrease. The second partial derivative, also known as the Hessian matrix, is

    H(w):  H_ij(w) = ∂²ε_T(w) / ∂w_i ∂w_j    (3)

The Taylor series for ε_T, assumed twice continuously differentiable about w₀, can now be given as

    ε_T(w) = ε_T(w₀) + g(w₀)ᵀ(w − w₀) + ½ (w − w₀)ᵀ H(w₀) (w − w₀) + O(‖w − w₀‖²)    (4)

where O(δ) denotes a term of zero order in small δ, i.e. lim_{δ→0} O(δ)/δ = 0. If, for example, there is a continuous derivative at w₀, then the remainder term is of order ‖w − w₀‖³ and we can reduce (4) to the quadratic model

    m(w) = ε_T(w₀) + g(w₀)ᵀ(w − w₀) + ½ (w − w₀)ᵀ H(w₀) (w − w₀)    (5)

Taking the gradient in the quadratic model (5) yields

    ∇m(w) = g(w₀) + H(w₀)(w − w₀)    (6)

Setting the gradient to zero and solving for the minimizing w* yields

    w* = w₀ − H⁻¹ g    (7)

The model m can now be expressed in terms of its minimum value at w*:

    m(w) = m(w*) + ½ (w − w*)ᵀ H(w₀) (w − w*)    (8)

a result that follows from (5) by completing the square, or by recognizing that g(w*) = 0. Hence, starting from any initial value of the weight vector, in the quadratic case we can move in one step to the minimizing value when it exists. This is known as Newton's approach, and it can be used in the non-quadratic case when the Hessian H is positive definite.

2.1.1 LEVENBERG-MARQUARDT ALGORITHM

The Levenberg-Marquardt (LM) algorithm [6] exploits the fact that the error function is a sum of squares, as given in (1). Introduce the following notation for the error vector e and its Jacobian with respect to the network parameters w:

    J = [J_ij],  J_ij = ∂e_j/∂w_i,  i = 1, ..., p,  j = 1, ..., n    (9)

The Jacobian is a large p × n matrix, all of whose elements are calculated directly by the backpropagation technique. The p-dimensional gradient g for the quadratic error function can be expressed as

    g(w) = 2 Σ_{k=1}^{n} e_k ∇e_k = 2 J e    (10)

and the Hessian matrix by

    H_ij = 2 Σ_{k=1}^{n} ( ∂e_k/∂w_i · ∂e_k/∂w_j + e_k · ∂²e_k/∂w_i ∂w_j )

Hence, defining D = Σ_{k=1}^{n} e_k ∇²e_k yields the expression

    H = 2 (J Jᵀ + D)    (11)

The key to the LM algorithm is to approximate this expression for the Hessian by replacing the matrix D, which involves second derivatives, by the much simpler positively scaled unit matrix λI. LM is a descent algorithm using this approximation in the form

    M_k = [J Jᵀ + λ_k I]⁻¹,   w_{k+1} = w_k − α_k M_k g(w_k)    (12)

Successful use of LM requires an approximate line search to determine the rate α_k. The matrix J Jᵀ is automatically symmetric and non-negative definite; the typically large size of J may necessitate careful memory management in evaluating the product J Jᵀ. Any positive λ ensures that M_k is positive definite, as required by the descent condition. The performance of the algorithm thus depends on the choice of λ. When the scalar λ is zero, this is just Newton's method using the approximate Hessian matrix; when λ is large, it becomes gradient descent with a small step size. As Newton's method is more accurate, λ is decreased after each successful step (reduction in the performance function) and is increased only when a tentative step would increase the performance function. In this way, the performance function is reduced at each iteration of the algorithm.
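
As a concrete illustration of (10)-(12), here is a minimal NumPy sketch of one LM step for a generic least-squares model. It builds the Jacobian by finite differences instead of backpropagation and uses a fixed rate in place of the line search on α_k; it is our own simplification, not the MATLAB implementation used in the experiments of Section 3.

    import numpy as np

    def lm_step(residual_fn, w, lam):
        """One Levenberg-Marquardt update, eq. (12): the Hessian 2(J J^T + D)
        is approximated by replacing D with the scaled unit matrix lam*I."""
        e = residual_fn(w)
        eps = 1e-6
        J = np.empty((w.size, e.size))           # J[i, j] = d e_j / d w_i, eq. (9)
        for i in range(w.size):
            w_p = w.copy()
            w_p[i] += eps
            J[i] = (residual_fn(w_p) - e) / eps
        g = 2.0 * J @ e                          # gradient, eq. (10)
        M = np.linalg.inv(J @ J.T + lam * np.eye(w.size))
        alpha = 0.5                              # fixed rate standing in for the line search
        return w - alpha * (M @ g), e @ e        # proposed weights, current SSE

    # Toy usage: fit y = w0 * exp(w1 * x); lam falls after good steps, rises otherwise.
    x = np.linspace(0.0, 1.0, 20)
    y_obs = 2.0 * np.exp(-1.5 * x)
    residual = lambda w: w[0] * np.exp(w[1] * x) - y_obs
    w, lam = np.array([1.0, 0.0]), 1e-2
    for _ in range(100):
        w_new, sse = lm_step(residual, w, lam)
        if np.sum(residual(w_new) ** 2) < sse:
            w, lam = w_new, lam * 0.7            # successful step: toward Newton
        else:
            lam *= 2.0                           # failed step: toward gradient descent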

2.2 SUPPORT VECTOR MACHINES (SVM)

Support Vector Machines (SVMs) [27] combine several techniques from statistics, machine learning and neural networks. SVMs perform structural risk minimization: they create a classifier with minimized VC (Vapnik-Chervonenkis) dimension. If the VC dimension is low, the expected probability of error is low as well, which means good generalization. The SVM separates the classes with a linear separating hyperplane; however, many problems cannot be linearly separated in the original input space. The SVM then uses one of its most important ingredients, kernels, i.e. the concept of transforming a linear algorithm into a nonlinear one via a map into a feature space. Figures 3 and 4 illustrate two categories of data marked by Y+ and Y− symbols, together with the separating hyperplane y = w·x − b = 0 and the margin hyperplanes y = w·x − b = ±1.

Figure 3: The linearly separable case.
Figure 4: The linearly inseparable case.

2.2.1 LINEAR SVM

We consider N training data points {(x₁, y₁), (x₂, y₂), ..., (x_N, y_N)}, where x_i ∈ R^d and y_i ∈ {+1, −1}. We would like to learn a linear separating hyperplane classifier

    f(x) = sgn(w·x − b)    (13)

Furthermore, we want this hyperplane to have the maximum separating margin with respect to the two classes. Specifically, we want to find the hyperplane HP: y = w·x − b = 0 and two hyperplanes parallel to it and with equal distances to it,

    HP1: y = w·x − b = +1   and   HP2: y = w·x − b = −1    (14)

with the condition that there are no data points between HP1 and HP2 and that the distance between HP1 and HP2 is maximized. For any separating plane HP and the corresponding HP1 and HP2, we can always normalize the coefficient vector w so that HP1 is w·x − b = +1 and HP2 is w·x − b = −1. Our aim is then to maximize the distance between HP1 and HP2, so there will be some positive examples on HP1 and some negative examples on HP2. These examples are called support vectors because only they participate in the definition of the separating hyperplane; the other examples can be removed or moved around as long as they do not cross the planes HP1 and HP2. Recall that in two dimensions the distance from a point (x₀, y₀) to the line Ax + By + C = 0 is |Ax₀ + By₀ + C| / √(A² + B²). Similarly, the distance of a point on HP1 to HP (w·x − b = 0) is |w·x − b| / ‖w‖ = 1/‖w‖, so the distance between HP1 and HP2 is 2/‖w‖. Hence, to maximize the distance we should minimize ‖w‖, with the condition that there are no data points between HP1 and HP2:

    w·x_i − b ≥ +1  for positive examples (y_i = +1)
    w·x_i − b ≤ −1  for negative examples (y_i = −1)

These two conditions can be combined into y_i(w·x_i − b) ≥ 1, and the problem can be formulated as

    min_{w,b} ½‖w‖²   subject to   y_i(w·x_i − b) ≥ 1    (15)

This is a convex quadratic programming problem (in w, b) over a convex set. Introducing Lagrange multipliers α₁, ..., α_N ≥ 0, we obtain the Lagrangian

    L(w, b, α) = ½‖w‖² − Σ_{i=1}^{N} α_i y_i (w·x_i − b) + Σ_{i=1}^{N} α_i    (16)
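
In practice (15) is handed to a QP solver. Purely for illustration, the sketch below instead minimizes the closely related soft-margin form ½‖w‖² + C Σ_i max(0, 1 − y_i(w·x_i − b)) by subgradient descent; the substitution of a penalty term for the hard constraints, and all parameter values, are our own choices, not part of the paper.

    import numpy as np

    def linear_svm(X, y, C=10.0, lr=1e-3, epochs=200):
        """Subgradient descent on the soft-margin SVM objective.
        X: (N, d) inputs; y: (N,) labels in {-1, +1}.
        Returns (w, b) for the classifier f(x) = sgn(w.x - b) of eq. (13)."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            viol = y * (X @ w - b) < 1.0                 # points violating the margin
            grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
            grad_b = C * y[viol].sum()
            w -= lr * grad_w
            b -= lr * grad_b
        return w, b

    # Two separable blobs; the support vectors end up nearest the hyperplane.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
    y = np.hstack([-np.ones(50), np.ones(50)])
    w, b = linear_svm(X, y)
    print((np.sign(X @ w - b) == y).mean())              # training accuracy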

2.2.2 NON-LINEAR SVM

When the two classes are not linearly separable, the SVM can transform the data points into another, high dimensional space in which they become linearly separable. Let the transformation be Φ(·). In the high dimensional space we solve

    L_D = Σ_i α_i − ½ Σ_{i,j} α_i α_j y_i y_j Φ(x_i)·Φ(x_j)    (17)

Suppose, in addition, that Φ(x_i)·Φ(x_j) = k(x_i, x_j); that is, the dot product in the high dimensional space is equivalent to a kernel function of the input space. Then we need not be explicit about the transformation Φ(·), as long as we know that the kernel function k(x_i, x_j) is equivalent to the dot product of some other high dimensional space. Mercer's condition can be used to determine whether a function can be used as a kernel: there exists a mapping Φ and an expansion

    K(x, y) = Σ_i Φ(x)_i Φ(y)_i    (18)

if and only if, for any g(x) such that ∫ g(x)² dx is finite,

    ∫∫ K(x, y) g(x) g(y) dx dy ≥ 0    (19)

The foundations of SVMs were developed by Vapnik [27], and they are gaining popularity due to many attractive features and promising empirical performance. The possibility of using different kernels allows learning methods like the Radial Basis Function Neural Network (RBFNN) or multi-layer Artificial Neural Networks (ANNs) to be viewed as particular cases of the SVM, despite the fact that the optimized criteria are not the same [14]. While ANNs and RBFNNs minimize the mean squared error, which depends on the distribution of all the data, the SVM optimizes a geometrical criterion, the margin, and is sensitive only to the extreme values and not to the distribution of the data in the feature space. The SVM approach transforms the data into a feature space F that usually has a huge dimension; it is interesting to note that SVM generalization depends on the geometrical characteristics of the training data, not on the dimension of the input space. Training a support vector machine leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Vapnik [27] shows how training an SVM for the pattern recognition problem leads to the quadratic optimization problem

    Minimize:  W(α) = −Σ_{i=1}^{l} α_i + ½ Σ_{i=1}^{l} Σ_{j=1}^{l} y_i y_j α_i α_j k(x_i, x_j)    (20)

    Subject to:  Σ_{i=1}^{l} y_i α_i = 0,   0 ≤ α_i ≤ C for all i    (21)

where l is the number of training examples and α is a vector of l variables, each component α_i corresponding to a training example (x_i, y_i). The solution of (20) is the vector α* for which (20) is minimized and (21) is fulfilled. We used SVMTorch [10] to simulate the SVM learning algorithm.
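
As a quick numerical check of the kernel idea, the sketch below builds the Gaussian kernel matrix K_ij = exp(−‖x_i − x_j‖² / (2σ²)), the kernel family used in Section 3, on random sample points and confirms that it is positive semi-definite, which is the finite-sample face of Mercer's condition (19). The sample data and the value σ = 3 are illustrative.

    import numpy as np

    def gaussian_kernel_matrix(X, sigma=3.0):
        """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for the rows of X."""
        sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dists / (2.0 * sigma ** 2))

    rng = np.random.default_rng(2)
    X = rng.normal(size=(40, 3))
    eigs = np.linalg.eigvalsh(gaussian_kernel_matrix(X))
    print(eigs.min() >= -1e-10)   # PSD up to round-off, as Mercer's condition requires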

2.3 NEURO-FUZZY SYSTEM

Neuro-fuzzy (NF) computing is a popular framework for solving complex problems [2]. If we have knowledge expressed in linguistic rules, we can build a Fuzzy Inference System (FIS) [8]; if we have data, or can learn from a simulation (training), then we can use ANNs. For building a FIS we have to specify the fuzzy sets, the fuzzy operators and the knowledge base; similarly, for constructing an ANN the user needs to specify the architecture and the learning algorithm. An analysis reveals that the drawbacks pertaining to these approaches are complementary, and it is therefore natural to consider building an integrated system combining the two concepts: the learning capability is an advantage from the viewpoint of the FIS, while the formation of a linguistic rule base is an advantage from the viewpoint of the ANN. Figure 5 depicts the 6-layered architecture of a multiple-output ANFIS; the functionality of each layer is as follows.

Layer 1. Every node in this layer has a node function O¹_i = μ_{A_i}(x) for i = 1, 2, or O¹_i = μ_{B_{i−2}}(y) for i = 3, 4. O¹_i is the membership grade of a fuzzy set (A₁, A₂, B₁ or B₂) and specifies the degree to which the given input x (or y) satisfies the quantifier. Usually the node function can be any parameterized function; a Gaussian membership function is specified by two parameters, the center c and the width σ:

    gaussian(x, c, σ) = exp(−½ ((x − c)/σ)²)

The parameters in this layer are referred to as premise parameters.

Layer 2. Every node in this layer multiplies the incoming signals and sends the product out; each node output represents the firing strength of a rule: O²_i = w_i = μ_{A_i}(x) μ_{B_i}(y), i = 1, 2, .... In general, any T-norm operator that performs fuzzy AND can be used as the node function in this layer.

Layer 3. The rule consequent parameters are determined in this layer: O³_i = f_i = p_i x + q_i y + r_i, where p_i, q_i and r_i are the rule consequent parameters.

Layer 4. Every node i in this layer has the node function O⁴_i = w_i f_i = w_i (p_i x + q_i y + r_i), where w_i is the output of Layer 2.

Layer 5. Every node in this layer aggregates the firing strengths of the rules:

    O⁵ = Σ_i w_i    (22)

Layer 6. Every i-th node in this layer calculates the individual output:

    Output_i = O⁶_i = Σ_i w_i f_i / Σ_i w_i,  i = 1, 2, ...    (23)

Figure 5. Architecture of ANFIS with multiple outputs
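
To make the layer functions concrete, here is a minimal single-output forward pass for a two-input, two-rule Sugeno system with Gaussian memberships. The rule set, all parameter values and the function names are invented for illustration; they are not taken from the paper's models.

    import numpy as np

    def gaussian_mf(x, c, s):
        """Layer 1: membership grade for center c and width s."""
        return np.exp(-0.5 * ((x - c) / s) ** 2)

    def anfis_forward(x, y, premise, consequent):
        mu_a = np.array([gaussian_mf(x, c, s) for c, s in premise["A"]])
        mu_b = np.array([gaussian_mf(y, c, s) for c, s in premise["B"]])
        w = mu_a * mu_b                                              # Layer 2: firing strengths
        f = np.array([p * x + q * y + r for p, q, r in consequent])  # Layers 3-4: rule outputs
        return (w * f).sum() / w.sum()                               # Layers 5-6, eqs. (22)-(23)

    premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
    consequent = [(0.5, 0.2, 0.0), (-0.3, 0.8, 1.0)]
    print(anfis_forward(0.4, 0.7, premise, consequent))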

ANFIS uses a hybrid learning rule combining gradient descent and least squares estimation [13]. Assume a single-output ANFIS represented by

    output = F(I, S)    (24)

where I is the set of input variables and S is the set of parameters. If there exists a function H such that the composite function H∘F is linear in some of the elements of S, then these elements can be identified by the least squares method [13]. More formally, the parameter set S can be decomposed into two sets

    S = S₁ ⊕ S₂  (where ⊕ denotes direct sum)    (25)

such that H∘F is linear in the elements of S₂. Applying H to (24), we have

    H(output) = H∘F(I, S)    (26)

which is linear in the elements of S₂. Given values for the elements of S₁, we can plug P training data sets into (26) and obtain the matrix equation

    A X = B    (27)

where X is an unknown vector whose elements are the parameters in S₂. If |S₂| = M (M = number of linear parameters), then the dimensions of A, X and B are P × M, M × 1 and P × 1 respectively. Since P is always greater than M, there is no exact solution to (27).

Instead, a least squares estimate (LSE) of X, denoted X*, is sought that minimizes the squared error ‖AX − B‖². X* is computed using the pseudo-inverse of A:

    X* = (Aᵀ A)⁻¹ Aᵀ B    (28)

where Aᵀ is the transpose of A and (Aᵀ A)⁻¹ Aᵀ is the pseudo-inverse of A when Aᵀ A is non-singular. Because of the computational complexity of (28), ANFIS deploys a sequential method: let aᵢᵀ be the i-th row vector of the matrix A defined in (27) and bᵢᵀ the i-th element of B; then X can be calculated iteratively using the sequential formulae

    X_{i+1} = X_i + S_{i+1} a_{i+1} (b_{i+1}ᵀ − a_{i+1}ᵀ X_i)
    S_{i+1} = S_i − (S_i a_{i+1} a_{i+1}ᵀ S_i) / (1 + a_{i+1}ᵀ S_i a_{i+1}),   i = 0, 1, ..., P−1    (29)

where S_i is often called the covariance matrix, and the least squares estimate X* is equal to X_P. The initial conditions to bootstrap (29) are X₀ = 0 and S₀ = γI, where γ is a large positive number and I is the identity matrix of dimension M × M. For a multiple-output ANFIS, (29) is still applicable except that the output F(I, S) becomes a column vector. Each epoch of this hybrid learning procedure is composed of a forward pass and a backward pass. In the forward pass we supply the input data, and the functional signals go forward to calculate each node output until the matrices A and B in (27) are obtained; the parameters in S₂ are then identified by the sequential least squares formulae (29). After the parameters in S₂ have been identified, the functional signals keep going forward until the error measure is calculated. In the backward pass, the error rates propagate from the output layer back to the input layers, and the parameters in S₁ are updated by the gradient method

    Δα = −η ∂E/∂α    (30)

where α is a generic parameter, η is the learning rate and E is the error measure. For given fixed values of the parameters in S₁, the parameters in S₂ thus found are guaranteed to be the global optimum point in the S₂ parameter space, thanks to the choice of the squared error measure. The procedure described above is mainly the offline learning version; it can be modified for an online version by formulating the squared error measure as a weighted version that gives higher weighting factors to more recent data pairs. This amounts to adding a forgetting factor λ to (29):

    X_{i+1} = X_i + S_{i+1} a_{i+1} (b_{i+1}ᵀ − a_{i+1}ᵀ X_i)
    S_{i+1} = (1/λ) [ S_i − (S_i a_{i+1} a_{i+1}ᵀ S_i) / (λ + a_{i+1}ᵀ S_i a_{i+1}) ],   i = 0, 1, ..., P−1    (31)

The value of λ lies between 0 and 1: the smaller λ is, the faster the effects of old data decay. However, a small λ sometimes causes numerical instability and should be avoided.
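
The sequential formulae (29) and (31) are ordinary recursive least squares and translate directly into NumPy. The sketch below follows the notation of the text (A, B, S, γ, λ); the synthetic check data are our own.

    import numpy as np

    def sequential_lse(A, B, gamma=1e6, lam=1.0):
        """Recursive least squares for A X = B, eq. (29); setting lam < 1
        adds the forgetting factor of the online variant, eq. (31)."""
        P, M = A.shape
        X = np.zeros(M)
        S = gamma * np.eye(M)                    # S_0 = gamma * I, gamma large
        for i in range(P):
            a, b = A[i], B[i]
            Sa = S @ a
            S = (S - np.outer(Sa, Sa) / (lam + a @ Sa)) / lam
            X = X + (S @ a) * (b - a @ X)        # S here is already S_{i+1}
        return X

    # The sequential estimate matches the batch pseudo-inverse solution, eq. (28):
    rng = np.random.default_rng(3)
    A = rng.normal(size=(100, 4))
    B = A @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.01 * rng.normal(size=100)
    print(np.allclose(sequential_lse(A, B), np.linalg.pinv(A) @ B, atol=1e-3))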

2.4 DIFFERENCE BOOSTING NEURAL NETWORK (DBNN)

The DBNN is based on the Bayes principle: it assumes clustering of the attribute values while boosting the attribute differences [25]. Boosting is an iterative process by which the network places emphasis on misclassified examples in the training set until they are correctly classified. The method considers the error produced by each example in the training set in turn and updates the connection weights associated with the probability P(U_m | C_k) of each attribute of that example, where U_m is the attribute value and C_k a particular class among the k different classes in the dataset. In this process, the probability density of identical attribute values flattens out and the differences get boosted up. Instead of the serial classifiers used in the AdaBoost algorithm, the DBNN approach uses the same classifier throughout the training process. An error function is defined for each misclassified example based on its distance from the computed probability of its nearest rival, and the enhancement of the attributes is done such that the error produced by each example decides the correction applied to its associated weights. Since more than one class is likely to share at least some of the same attribute values, this leads to a competitive update of their attribute weights, until either the classifier figures out the correct class or the number of iterations is completed. The net effect is that the classifier becomes more and more dependent on the differences between the examples rather than on their similarities. The DBNN is basically a classification algorithm: it assigns output state labels to input patterns with some degree of confidence acquired from the training set. We modified the algorithm for time series prediction by approximating the time series by a class of slope predictions. A major limitation of this revision is that the possible output states are limited to the number of output states seen in the training set, because the classification algorithm restricts the possible classes (slope values) to those it encountered during the training period. These limitations will be apparent in the DBNN outputs.
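
The following toy sketch renders the mechanism just described in NumPy: attributes are discretized into bins, class-conditional frequencies stand in for P(U_m | C_k), and a multiplicative weight on each (attribute value, class) pair is raised whenever an example is misclassified, in proportion to its distance from the nearest rival's score. This is a loose illustration of the boosting idea only, not Philip and Joseph's actual algorithm [25].

    import numpy as np

    def train_dbnn_toy(X_bins, y, n_classes, n_bins, epochs=10, step=0.2):
        """X_bins: (N, d) integer-binned attributes; y: (N,) class labels."""
        N, d = X_bins.shape
        rows = np.arange(d)
        prob = np.ones((d, n_bins, n_classes))           # P(U_m | C_k), add-one counts
        for m in range(d):
            for b in range(n_bins):
                for k in range(n_classes):
                    prob[m, b, k] += np.sum((X_bins[:, m] == b) & (y == k))
        prob /= prob.sum(axis=1, keepdims=True)
        w = np.ones_like(prob)                           # connection weights to be boosted
        for _ in range(epochs):
            for i in range(N):
                scores = (w[rows, X_bins[i]] * prob[rows, X_bins[i]]).prod(axis=0)
                pred = scores.argmax()
                if pred != y[i]:                         # emphasize the misclassified example
                    w[rows, X_bins[i], y[i]] += step * (scores[pred] - scores[y[i]])
        return prob, w

    rng = np.random.default_rng(4)
    X_bins = rng.integers(0, 4, size=(200, 3))
    y = (X_bins.sum(axis=1) > 4).astype(int)
    prob, w = train_dbnn_toy(X_bins, y, n_classes=2, n_bins=4)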

3. EXPERIMENTATION SETUP AND RESULTS

We considered 7 years' stock data for the Nasdaq-100 index and 4 years' for the NIFTY index. Our target is to develop efficient forecast models that can predict the index value of the following trade day based on the opening, closing and maximum values on a given day. The training and test patterns for both indices (scaled values) are illustrated in Figures 1 and 2. For the Nasdaq-100 index the data sets were represented by the opening, low and high values; the NIFTY index data sets were represented by the opening, low, high and closing values. We used the same training and test data sets to evaluate the different connectionist models; more details are reported in the following paragraphs. Experiments were carried out on a Pentium IV machine with 256 MB RAM, and the codes were executed using MATLAB (ANN, ANFIS) and C++ (SVM, DBNN). Test data were presented to the trained connectionist network, and the output from the network was compared with the actual index values in the time series. The prediction performance of the different connectionist paradigms was assessed by quantifying the predictions obtained on an independent data set. The maximum absolute percentage error (MAP) and the mean absolute percentage error (MAPE) were used to study the performance of the trained forecasting models on the test data. MAP is defined as

    MAP = max_i ( |P_actual,i − P_predicted,i| / P_predicted,i × 100 )

where P_actual,i is the actual index value on day i and P_predicted,i is the forecast value of the index on that day. Similarly, MAPE is given as

    MAPE = (1/N) Σ_{i=1}^{N} |(P_actual,i − P_predicted,i) / P_actual,i| × 100

where N represents the total number of days.

ANN-LM algorithm. We used a feedforward neural network with 4 input nodes and a single hidden layer consisting of 26 neurons, with the tanh-sigmoidal activation function for the hidden neurons. Training was terminated after 2500 epochs, and it took about 4 seconds to train each dataset.

Neuro-fuzzy training. We used 3 triangular membership functions for each input variable; 27 if-then fuzzy rules were learned for the Nasdaq-100 index and 81 if-then fuzzy rules for the NIFTY index. Training was terminated after 12 epochs, and it took about 3 seconds to train each dataset.

Support Vector Machines and Difference Boosting Neural Network. Both the SVM (Gaussian kernel with σ = 3) and the DBNN took less than one second to learn each of the two data sets.
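
The two error measures translate directly into code. Note that, as defined above, MAP is normalized by the predicted value while MAPE is normalized by the actual value; the helper below follows that convention (array names are illustrative):

    import numpy as np

    def map_mape(actual, predicted):
        """Return (MAP, MAPE) in percent for aligned 1-D arrays of index values."""
        actual = np.asarray(actual, dtype=float)
        predicted = np.asarray(predicted, dtype=float)
        map_ = np.max(np.abs(actual - predicted) / predicted) * 100.0   # worst day
        mape = np.mean(np.abs((actual - predicted) / actual)) * 100.0   # average day
        return map_, mape

    print(map_mape([100.0, 102.0, 105.0], [101.0, 101.5, 104.0]))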

Performance and Results Achieved

Table 1 summarizes the training and test RMSE values achieved for the two stock indices using the four different approaches, and Table 2 gives the correlation coefficients, MAP and MAPE values on the test data. Figures 6 and 7 depict the test results for the one-day-ahead prediction of the Nasdaq-100 index and the NIFTY index respectively.

Figure 6. Test results showing the performance of the different methods for modeling the Nasdaq-100 index

Figure 7. Test results showing the performance of the different methods for modeling the NIFTY index

Table 1: Empirical comparison (training and test) of the four learning methods

                                SVM      Neuro-Fuzzy   ANN-LM    DBNN
  Training results (RMSE)
    Nasdaq-100                  .6       .             .9        .99
    NIFTY                       .734     .5            .435      .74
  Testing results (RMSE)
    Nasdaq-100                  .84      .83           .844      .864
    NIFTY                       .495     .7            .7        .5

Table 2: Statistical analysis of the four learning methods (test data)

                                SVM      Neuro-Fuzzy   ANN-LM    DBNN
  Nasdaq-100
    Correlation coefficient     0.9977   0.9976        0.9955    0.994
    MAP                         48.5     5.84          48.77     6.987
    MAPE                        7.7      7.65          9.3       9.49
  NIFTY
    Correlation coefficient     0.9968   0.9967        0.9968    0.989
    MAP                         7.53     4.37          73.94     37.99
    MAPE                        4.46     3.3           3.353     5.86

4. CONCLUSIONS

In this paper we have demonstrated how the chaotic behavior of stock indices can be well represented by connectionist paradigms. Empirical results on the two data sets using four different models clearly reveal the efficiency of the considered techniques. In terms of RMSE values, for the Nasdaq-100 index the SVM performed marginally better than the other models, and for the NIFTY index the ANN-LM approach gave the lowest generalization RMSE. For both data sets, the SVM had the lowest training time. For the Nasdaq-100 index the SVM has the highest correlation coefficient and the lowest MAPE, but the lowest MAP value was obtained by the DBNN. The highest correlation coefficient for the NIFTY index was shared by the SVM and ANN-LM approaches, while the lowest MAPE was obtained by the neuro-fuzzy approach. It is interesting to note that for both indices the DBNN has the lowest MAP value. A low MAP value with the DBNN is a crucial indicator for evaluating the stability of a market under unforeseen fluctuations: in the present example, the predictability assures that a decrease in trade is only a temporary cyclic variation that is perfectly under control. In contrast, a chaotic fluctuation in the market would result in a larger MAP value and wider disagreement with the DBNN prediction. Although the same would hold for the other networks, it is more apparent for the DBNN because the DBNN directly correlates its performance with Bayesian probability estimates. Our research was to predict the share price for the following trade day based on the opening, closing and maximum values of the same on a given day, and our experimentation results indicate that the most prominent parameters affecting share prices are their immediate opening and closing values. The fluctuations in the share market are chaotic in the sense that they depend heavily on the values of their immediate forerunning fluctuations. Long-term trends exist, but they are slow variations, and this information is useful for long-term investment strategies. Our study focuses on short-term, on-floor trades, in which the risk is higher. However, the results show that, even in the seemingly random fluctuations, there is an underlying deterministic feature that is directly enciphered in the opening, closing and maximum values of the index on any day, making predictability possible. The empirical results also show that the different techniques considered have various advantages and disadvantages. Our future research will be oriented towards determining the optimal way to combine the different intelligent systems through an ensemble approach [12], so as to complement the advantages and disadvantages of the individual paradigms.

REFERENCES

[1] Abraham A., Nath B. and Mahanti P.K., Hybrid Intelligent Systems for Stock Market Analysis, in: Computational Science, Vassil N. Alexandrov et al. (Eds.), Springer-Verlag, Germany, pp. 337-345, May 2001.

[2] Abraham A., Neuro-Fuzzy Systems: State-of-the-Art Modeling Techniques, in: Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence, Jose Mira and Alberto Prieto (Eds.), Springer-Verlag, Germany, Granada, Spain, pp. 269-276, 2001.

[3] Abraham A., Philip N.S., Nath B. and Saratchandran P., Performance Analysis of Connectionist Paradigms for Modeling Chaotic Behavior of Stock Indices, Second International Workshop on Intelligent Systems Design and Applications, Computational Intelligence and Applications, Dynamic Publishers Inc., USA, pp. 181-186, 2002.

[4] Berkeley A.R., Nasdaq's technology floor: its president takes stock, IEEE Spectrum, Volume 34, Issue 2, pp. 66-67, 1997.

[5] Bischi G.I. and Valori V., Nonlinear effects in a discrete-time dynamic model of a stock market, Chaos, Solitons & Fractals, 11(13): 2103-2121, 2000.

[6] Bishop C.M., Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.

[7] Chan W.S. and Liu W.N., Diagnosing shocks in stock markets of southeast Asia, Australia, and New Zealand, Mathematics and Computers in Simulation, 59(1-3): 223-232, 2002.

[8] Cherkassky V., Fuzzy Inference Systems: A Critical Review, in: Computational Intelligence: Soft Computing and Fuzzy-Neuro Integration with Applications, Kaynak O., Zadeh L.A. et al. (Eds.), Springer, pp. 177-197, 1998.

[9] Cios K.J., Data Mining in Finance: Advances in Relational and Hybrid Methods, Neurocomputing, 36(1-4): 245-246, 2001.

[10] Collobert R. and Bengio S., SVMTorch: Support Vector Machines for Large-Scale Regression Problems, Journal of Machine Learning Research, Volume 1, pp. 143-160, 2001.

[11] Tay F.E.H. and Cao L.J., Modified support vector machines in financial time series forecasting, Neurocomputing, 48(1-4): 847-861, 2002.

[12] Hashem S., Optimal Linear Combination of Neural Networks, Neural Networks, No. 3, pp. 79-994, 1995.

[13] Jang J.S.R., Sun C.T. and Mizutani E., Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall, USA, 1997.

[14] Joachims T., Making Large-Scale SVM Learning Practical, in: Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges and A. Smola (Eds.), MIT Press, 1999.

[15] Kim K.J. and Han I., Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index, Expert Systems with Applications, 19(2): 125-132, 2000.

[16] Koulouriotis D.E., Diakoulakis I.E. and Emiris D.M., A fuzzy cognitive map-based stock market model: synthesis, analysis and experimental results, Proceedings of the 10th IEEE International Conference on Fuzzy Systems, Volume 1, pp. 465-468, 2001.

[17] LeBaron B., Empirical regularities from interacting long- and short-memory investors in an agent-based stock market, IEEE Transactions on Evolutionary Computation, Volume 5, Issue 5, pp. 442-455, 2001.

[18] Leigh W., Modani N., Purvis R. and Roberts T., Stock market trading rule discovery using technical charting heuristics, Expert Systems with Applications, 23(2): 155-159, 2002.

[19] Leigh W., Purvis R. and Ragusa J.M., Forecasting the NYSE composite index with technical analysis, pattern recognizer, neural network, and genetic algorithm: a case study in romantic decision support, Decision Support Systems, 32(4): 361-377, 2002.

[20] Masters T., Advanced Algorithms for Neural Networks: A C++ Sourcebook, Wiley, New York, 1995.

[21] Nasdaq Stock Market: http://www.nasdaq.com

[22] National Stock Exchange of India Limited: http://www.nse-india.com

[23] Oh K.J. and Kim K.J., Analyzing stock market tick data using piecewise nonlinear model, Expert Systems with Applications, 22(3): 249-255, 2002.

[24] Palma-dos-Reis A. and Zahedi F., Designing personalized intelligent financial decision support systems, Decision Support Systems, 26(1): 31-47, 1999.

[25] Philip N.S. and Joseph K.B., Boosting the Differences: A Fast Bayesian Classifier Neural Network, Intelligent Data Analysis, IOS Press, Netherlands, Volume 4, pp. 463-473, 2000.

[26] Quah T.S. and Srinivasan B., Improving returns on stock investment through neural network selection, Expert Systems with Applications, 17(4): 295-301, 1999.

[27] Vapnik V., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.

[28] Wang Y.F., Mining stock price using fuzzy rough set system, Expert Systems with Applications, 24(1): 13-23, 2002.

[29] Wuthrich B., Cho V., Leung S., Permunetilleke D., Sankaran K. and Zhang J., Daily stock market forecast from textual web data, IEEE International Conference on Systems, Man, and Cybernetics, Volume 3, pp. 2720-2725, 1998.

[] Masters., Advanced Algorthms for Neural Netorks: a C++ sourcebook, Wley, Ne York, 995. [] Nasdaq Stock Market SM : http://.nasdaq.com [] Natonal Stock Exchange of Inda Lmted: http://.nse-nda.com [3] Oh K.J. and Km K.J., Analyzng stock market tck data usng pecese nonlnear model, Expert Systems th Applcatons (3: 49-55,. [4] Palma-dos-Res A. and Zahed F., Desgnng personalzed ntellgent fnancal decson support systems, Decson Support Systems 6(: 3-47, 999. [5] Phlp N.S. and Joseph K.B., Boostng the Dfferences: A Fast Bayesan classfer neural netork, Intellgent Data Analyss, IOS press, Netherlands, Volume 4, pp. 463-473,. [6] Quah.S. and Srnvasan B., Improvng returns on stock nvestment through neural netork selecton, Expert Systems th Applcatons 7(4: 95-3, 999. [7] Vapnk V., he Nature of Statstcal Learnng heory. Sprnger-Verlag, Ne York, 995. [8] Wang Y.F., Mnng stock prce usng fuzzy rough set system, Expert Systems th Applcatons 4(: 3-3,. [9] Wuthrch, B., Cho, V., Leung, S., Permunetlleke, D., Sankaran, K., Zhang, J., Daly stock market forecast from textual eb data, IEEE Internatonal Conference on Systems, Man, and Cybernetcs, Volume: 3, Page(s: 7 75, 998.