Proceedings of the World Congress on Engineering and Computer Science 2008
WCECS 2008, October 22-24, 2008, San Francisco, USA

A Prediction System Based on Fuzzy Logic

Vaidehi V., Monica S., Mohamed Sheik Safeer S., Deepika M., Sangeetha S.

ABSTRACT: The main objective of the paper is to build a prediction system to predict the future occurrence of an event. Among the various available Artificial Intelligence techniques, fuzzy logic emerges as an advantageous technique for predicting future events. Subjective and objective modeling are two types of fuzzy modeling. Objective type fuzzy modeling is used to build the prediction system. It is a combination of a clustering algorithm and fuzzy system identification, which proves effective in improving the efficiency of the prediction. To train the prediction system, historical data is obtained from the web. Data specific to the desired application is obtained and recorded. This recorded information is subjectively reasoned to retain only the necessary inputs to the prediction system. The subtractive clustering algorithm is used for its computational advantages, and fuzzy rules are formed using a system identification technique. Stock markets are excellent examples where this prediction system can be applied, and the possibility of a rise or a fall in the market prices is predicted. The entire prediction system is realized using Java.

Keywords: Prediction, Data modeling, Subtractive clustering, System identification, Fuzzy logic.

I. INTRODUCTION

Prediction of an event relies on vague, imperfect and uncertain knowledge [9]. Complexity is an intrinsic characteristic of a prediction system. Various Artificial Intelligence (AI) techniques have been utilized in realising a prediction system [2]. The AI based prediction models can be classified into four groups: models based on neural networks, fuzzy logic, genetic algorithms and expert systems. Such prediction systems play important roles in several organisational decisions, of which the stock market is a vivid example. Rules which determine market behaviour have been elicited from raw data by AI methods.
As stock market prediction involves imprecise concepts and imprecise reasoning, fuzzy logic is a natural choice for knowledge representation [2]. The Web, with its boundless information, acts as a source of historical happenings of events. Relevant data concerning the application is parsed, filtered and used to train the prediction system.

Manuscript received Jul 2, 2008. V. Vaidehi is a faculty member and Monica S., Mohamed Sheik Safeer S., Deepika M. and Sangeetha S. are students of the Department of Electronics Engineering, Madras Institute of Technology, Anna University, Chennai, Tamil Nadu, India (email id: vaidehivijay@gmail.com).

Earlier, prediction systems were built with rules formed manually. The rules became complicated as the number of inputs increased, and predicting an event grew tedious. Engineering helps build a prediction system that can adapt to the increasing number of inputs and frame rules accordingly. The accuracy and speed obtained are superior to those of manual prediction schemes.

Objective fuzzy modelling [1], which is used to build the prediction system, requires numerical inputs. The IF-THEN rules formed have vague predicates in their antecedent part, while the consequent part is a linear or quadratic combination of the antecedent variables. Since the consequent parts of the rules are crisp values rather than vague and fuzzy ones, there is no need to defuzzify the output. This characteristic favours the objective fuzzy modeling technique over several other fuzzy modeling techniques and is utilized to improve the prediction efficiency. Objective type fuzzy modelling has excellent learning capabilities and requires less computational effort.

The subtractive clustering technique used operates on raw numerical data. Increasing the number of inputs affects the prediction system only to a small extent. Further, this clustering technique provides a similar degree of accuracy and robustness together with lower computational complexity compared to various other clustering techniques. These advantages, along with the fact that no separate defuzzification is required, make this prediction faster than several previous systems.
Section II presents an overview of the prediction system and the inherent concepts. Section III discusses the fuzzy modeling technique in detail with the underlying mathematical foundations. Section IV discusses the implementation results of the technique used and analyses the results. Section V gives a brief conclusion.

II. PREDICTION SYSTEM

Prediction is usually done by learning from the past, for which historical data is obtained and analyzed to study the resulting pattern in the market [3]. The architecture of the prediction system based on fuzzy logic is given in Fig 1. Predicting any event requires knowledge about past performance. Data from the past is used mainly to learn the patterns that existed.

ISBN: 978-988-98671-0-2 WCECS 2008

Historical data provides information on the specific pattern of
learning the data. Learning from the past provides knowledge about the future to some extent.

Fig 1 Block Diagram for Fuzzy Based Prediction System (blocks: Historical Data, 2-D Mapping, Density Function, Data Clustering, Recent Data, System Identification, Predicted Output)

Web feeds provide users with frequently updated content. The block diagram to obtain inputs to the prediction system from web feeds is shown in Fig 2. Web pages relating to the specific application are identified; in this case, all stock market related web pages are identified. The payload from the chosen web pages is obtained using a feed reader. The payload could be obtained in any desired format.

Fig 2 Block diagram showing web inputs to prediction system (blocks: Web Feeds, Feed Reader, Parser, Database, Historical Data, Recent Data, Prediction System, Predicted Output)

The XML format is widely used to create most web pages [10]. The payload could hence be obtained in the XML format. The Document Object Model (DOM) is the foundation of Extensible Mark-up Language, or XML. XML documents have a hierarchy of informational units called nodes [11]. The XML DOM defines a standard way of accessing and manipulating XML documents. The DOM presents an XML document as a tree structure, with elements, attributes, and text as nodes. Information from the web pages is obtained by parsing the XML document. A database is formed with the parsed data.

An analysis of the database provides a picture of the variations in the market due to the numerous available factors, ranging from economic to political factors. All these factors can finally be distilled into one market variable, the stock market price. Stock prices for a day are of various categories such as opening price, high, low and closing price [7]. Of these, the open and close prices are considered and used to produce a 2-D mapping [6]. The past values of the open and close prices of a particular stock are recorded for a sequence of days and stored in a database to train the prediction system.
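The parsing and 2-D mapping steps described above can be sketched as follows. This is a minimal illustration in Python using the standard library DOM implementation, not the authors' Java code. The payload layout (the `quote`, `date`, `open` and `close` element names) and the sample prices are assumptions made up for the example, since the paper does not show the feed format.

```python
from xml.dom import minidom

# Hypothetical payload: element names and prices are illustrative
# assumptions, not taken from the paper or from any real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<quotes>
  <quote><date>2007-01-02</date><open>100.0</open><close>102.5</close></quote>
  <quote><date>2007-01-03</date><open>102.0</open><close>101.0</close></quote>
</quotes>"""


def text_of(node, tag):
    """Return the stripped text of the first <tag> element below node."""
    element = node.getElementsByTagName(tag)[0]
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    ).strip()


def parse_records(xml_text):
    """Walk the DOM tree and return a list of (date, open, close) tuples."""
    document = minidom.parseString(xml_text)
    records = []
    for quote in document.getElementsByTagName("quote"):
        records.append((
            text_of(quote, "date"),
            float(text_of(quote, "open")),
            float(text_of(quote, "close")),
        ))
    return records


def to_2d_points(records):
    """Keep only the (open, close) pair of each record: the 2-D mapping."""
    return [(open_price, close_price) for _, open_price, close_price in records]


records = parse_records(SAMPLE_FEED)
points = to_2d_points(records)
print(points)
```

Each resulting pair is one point in the open-close plane; a sequence of such points for consecutive days is what would be stored for training.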
A definite number of such pairs of values, corresponding to a set of consecutive days, trains the system to learn the pattern of behaviour of the market over a definite period.

III. FUZZY MODELING

The problem of fuzzy system identification is the problem of eliciting IF-THEN rules from raw input and output data. This proceeds through two steps:
1) Clustering
2) Specification of input-output relations (IF-THEN rules)

Clustering algorithms are used extensively not only to organize and categorize data, but also for data compression and model construction [4]. By finding similarities in data, one can represent similar data with fewer symbols.

The density function for a data point is defined as the measure of potential for that data point. It is estimated based on the distance of this data point from all other data points. Therefore, a data point lying in a heap of other data points will have a high chance of being a cluster centre, while a data point located in an area of diffused, sparse data points will have a low chance of being a cluster centre.

The process of cluster centre determination involves determining the potential of every data point considered. The data point with the highest potential is chosen as the first cluster centre. The subsequent cluster centres are found by first revising the potential of the data points to cancel the effect of the cluster centres found previously. This process is self-terminating: when the revised potential of no remaining data point is sufficient for it to become a cluster centre, the cluster centre determination terminates. In our case, a set of two clusters is formed, one to denote the higher range of price values and the other to denote the lower range of prices. The system is now said to be trained.

If a set of recent data values is now presented to this system, the pattern is studied and the possibility of a rise or a fall is predicted along with the next possible value of the market variable, the price. The recent data values are initially processed and their membership with each of the clusters formed earlier is determined.
Each of the recently received data points is placed in the cluster with which its membership is the highest.
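The assignment of recent data points to clusters can be sketched as below. This is a hedged illustration in Python, not the authors' implementation. It assumes an exponential membership function in the squared distance to each cluster centre, inputs scaled to the unit square, and hypothetical cluster names `low` and `high`; the majority vote and the handling of a tie (reported as `UNDECIDED`) are likewise assumptions for the example.

```python
import math


def membership(point, centre, alpha):
    """Exponential membership of a point in the cluster around centre."""
    squared_distance = sum((p - c) ** 2 for p, c in zip(point, centre))
    return math.exp(-alpha * squared_distance)


def count_assignments(points, centres, alpha):
    """Place each point in the cluster of highest membership and count them.

    centres maps a cluster name to its centre. If every membership
    underflows to 0.0 (unscaled data), max() silently picks the first
    name, which is why scaled inputs are assumed here.
    """
    counts = {name: 0 for name in centres}
    for point in points:
        best = max(centres, key=lambda name: membership(point, centres[name], alpha))
        counts[best] += 1
    return counts


def predict_direction(counts):
    """Majority vote between the higher-range and lower-range clusters."""
    if counts["high"] > counts["low"]:
        return "RISE"
    if counts["low"] > counts["high"]:
        return "FALL"
    return "UNDECIDED"  # tie handling is an assumption of this sketch


# Illustrative values only, scaled to [0, 1].
centres = {"low": (0.2, 0.2), "high": (0.8, 0.8)}
alpha = 4.0 / 0.5 ** 2  # alpha = 4 / r_a^2 with an assumed radius r_a = 0.5
recent = [(0.7, 0.9), (0.75, 0.8), (0.3, 0.25)]

counts = count_assignments(recent, centres, alpha)
print(counts, predict_direction(counts))
```

Two of the three illustrative points lie closer to the `high` centre, so the vote in this example goes to a rise.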
Fig 3 Block Diagram for System Identification (blocks: Recent Data, Cluster Centre, Membership Function, Fuzzy IF-THEN rules, Output)

Each cluster centre corresponds to a fuzzy rule, and the cluster identified by the exponential membership function represents the antecedent of this rule. The rule checks whether the input matches the exponential membership function of cluster i and, if so, the output is set to the linear or quadratic combination of the input variables. A higher number of data points placed in one cluster increases the favourability of that cluster; that is, if a set of recent inputs has had a larger number of prices in the higher range, then the prices are likely to rise in the near future. The next immediate value of the price is found using equation 7.

1. Subtractive Clustering

Clustering is a process in which data are placed into groups or clusters, such that data in a given cluster tend to be similar to each other, and data in different clusters tend to be dissimilar. When the clustering estimation is applied to a set of input-output data, each cluster centre can be considered as a fuzzy rule that describes the characteristic behaviour of the system. Each cluster centre corresponds to a fuzzy rule, and the cluster identified represents the antecedent of this rule. This step forms the system identification.

Subtractive clustering is a technique for automatically generating fuzzy inference systems by detecting clusters in input-output training data. Unlike mountain clustering, which considers the intersections of grid lines, subtractive clustering considers each data point as a potential cluster centre. The measure of potential for a data point is estimated based on the distance of this data point from all other data points. Therefore, a data point lying in a heap of other data points will have a high chance of being a cluster centre, while a data point located in an area of diffused, sparse data points will have a low chance of being a cluster centre.

After measuring the potential of every data point, the data point with the greatest potential value is selected as the first cluster centre. To find the next cluster centre, the potentials of the data points must be revised. For each data point, an amount that decays with its distance from the first cluster centre is subtracted. This reduces the chance of a data point near the first cluster centre being selected as the next cluster centre. After revising the potential of all data points, the data point with the maximum potential is selected as the next cluster centre.

The potential of the data points in the first step is measured as [8]:

$$P_i = \sum_{j=1}^{n} e^{-\alpha \lVert x_i - x_j \rVert^2} \qquad (1)$$

where $\alpha = 4 / r_a^2$, $x_i$ is the ith data point, and $r_a$ is a vector which consists of positive constants and represents the hypersphere cluster radius in data space. The constant $r_a$ is effectively the radius defining a neighbourhood; data points outside this radius have little influence on the potential. The potential calculated through Equation 1 for a given point is a function of that point's distance to all other points, and the data point which corresponds to the maximum potential value is the first cluster centre. Let $P_1^*$ denote the maximum potential and $x_1^*$ the first cluster centre corresponding to $P_1^*$, where

$$P_1^* = \max_{1 \le i \le n} P_i \qquad (2)$$

is the maximum of all the $P_i$.

To revise the potential values and select the next cluster centre, the following formula is suggested:

$$P_i \leftarrow P_i - P_1^* \, e^{-\beta \lVert x_i - x_1^* \rVert^2} \qquad (3)$$

where $\beta = 4 / r_b^2$ and $r_b$ is a vector which consists of positive constants and is called the hypersphere penalty radius. For later centres, the most recently found centre and its potential take the place of $x_1^*$ and $P_1^*$. The constant $r_b$ is effectively the radius defining the neighbourhood which will have measurable reductions in potential. To avoid cluster centres being close to each other, $r_b$ must be greater than $r_a$. A desirable relation is as follows [8]:

$$r_b = 1.5 \, r_a \qquad (4)$$
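Equations 1 to 4 can be put together in a short sketch. The Python code below is an illustration under stated assumptions rather than the authors' Java implementation: the radius `r_a` is a single scalar instead of a vector, the data are assumed to be scaled to the unit hypercube, and the stopping rule (stop once the best remaining potential drops below `stop_ratio` times the first peak) is a simplified stand-in for the termination test, which the text describes only qualitatively.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def subtractive_clustering(points, r_a=0.5, stop_ratio=0.15):
    """Return cluster centres chosen among points by subtractive clustering.

    r_a and stop_ratio are illustrative defaults (assumptions).
    """
    if not points:
        return []
    alpha = 4.0 / r_a ** 2          # Equation 1
    r_b = 1.5 * r_a                 # Equation 4
    beta = 4.0 / r_b ** 2           # Equation 3

    # Equation 1: potential of every point from its distance to all points.
    potentials = [
        sum(math.exp(-alpha * squared_distance(p, q)) for q in points)
        for p in points
    ]
    first_peak = max(potentials)    # Equation 2

    centres = []
    while True:
        best = max(range(len(points)), key=lambda i: potentials[i])
        peak = potentials[best]
        # Simplified self-termination: remaining potential is too small.
        if centres and peak < stop_ratio * first_peak:
            break
        centre = points[best]
        centres.append(centre)
        # Equation 3: subtract the influence of the centre just found.
        potentials = [
            value - peak * math.exp(-beta * squared_distance(p, centre))
            for value, p in zip(potentials, points)
        ]
    return centres


# Illustrative data: two tight groups in the unit square.
data = [
    (0.10, 0.10), (0.12, 0.10), (0.10, 0.13),
    (0.90, 0.90), (0.88, 0.90), (0.90, 0.87),
]
centres = subtractive_clustering(data)
print(sorted(centres))
```

The loop always ends: each pass sets the potential of the chosen point to zero and never raises any other potential, so after at most as many passes as there are points the best remaining potential falls below the threshold.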
Subtractive clustering can be used as a standalone approximate clustering algorithm in order to estimate the number of clusters and their locations.

2. Fuzzy Rule Formation

When the clustering estimation is applied to a set of input-output data, each cluster centre can be considered as a fuzzy rule that describes the characteristic behaviour of the system. Theoretically, a system with multiple inputs and multiple outputs (MIMO) can be reduced to several multiple-input, single-output (MISO) systems. Therefore, the fuzzy rules of a MIMO system can also be presented as a set of rules with multiple antecedents and a single consequent, such that for a system with n outputs, each multi-consequent rule is broken into n single-consequent rules.

Considering data in an n-dimensional space, where the first k dimensions correspond to input variables and the remaining (n-k) dimensions correspond to output variables, the clustering estimation on this data space divides the data into fuzzy clusters that overlap with each other. The dependency of each data vector on a cluster can be defined by a membership grade in [0, 1]. The data vector with a membership grade of one is called the cluster centre. The membership grade of each data vector is defined as follows:

$$\mu_i(x) = e^{-\alpha \lVert x - c_i \rVert^2} \qquad (5)$$

where $x$ is the input vector. Each cluster centre $c_i$ corresponds to fuzzy rule i, and the cluster identified above by the exponential membership function represents the antecedent of this rule. If $A_i$ denotes the exponential membership function of cluster i, then rule i can be represented as:

IF X is $A_i$ THEN $Y_i$ is $B_i$

where X is the input variables vector, $Y_i$ is the ith output variable and $B_i$ is a singleton defined as a linear or quadratic combination of the input variables. When $B_i$ is defined as a linear combination, the model is called a first order model, and when $B_i$ is a quadratic combination, the model is called a second order model. For the first order model that we are concerned with in this work, $B_i$ is given as follows:

$$B_i = \sum_{j=1}^{N} p_{ij} x_j + p_{i0} \qquad (6)$$

where $p_{ij}$ is the coefficient of $x_j$ in rule i.
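The membership grade of Equation 5 and the first order consequent of Equation 6 can be sketched as follows, together with a membership-weighted average of the rule outputs, which is the usual inference step for models of this kind. This Python sketch is illustrative only: the rule centres and the coefficients `p_i0, p_i1, ...` are made-up values, and how the coefficients are estimated from training data is not covered here.

```python
import math


def membership(x, centre, alpha):
    """Equation 5: exponential membership grade of x in a cluster."""
    return math.exp(-alpha * sum((a - c) ** 2 for a, c in zip(x, centre)))


def consequent(x, coefficients):
    """Equation 6: linear consequent; coefficients = [p_i0, p_i1, ..., p_iN]."""
    return coefficients[0] + sum(p * a for p, a in zip(coefficients[1:], x))


def model_output(x, rules, alpha):
    """Membership-weighted average of the rule consequents.

    rules is a list of (centre, coefficients) pairs, one per cluster.
    """
    weights = [membership(x, centre, alpha) for centre, _ in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("input lies too far from every cluster centre")
    weighted = sum(
        w * consequent(x, coefficients)
        for w, (_, coefficients) in zip(weights, rules)
    )
    return weighted / total


# Hypothetical one-input model with two rules (values are assumptions).
rules = [
    ((0.2,), [0.1, 1.0]),   # near 0.2: y = 0.1 + 1.0 * x
    ((0.8,), [0.3, 0.5]),   # near 0.8: y = 0.3 + 0.5 * x
]
alpha = 4.0 / 0.5 ** 2
print(model_output((0.5,), rules, alpha))
```

At the midpoint between the two centres both rules fire almost equally, so the output is close to the plain average of the two consequents; nearer to one centre, that rule dominates.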
The fuzzy IF-THEN rules for the first order model would generically be as follows:

IF X is $A_i$ THEN $Y_i(X) = \sum_{j=1}^{N} p_{ij} x_j + p_{i0}$

For a given input $X_0$, the output of the model, $y_0$, is computed as:

$$y_0 = \frac{\sum_{i=1}^{s} \mu_i(X_0) \, Y_i(X_0)}{\sum_{i=1}^{s} \mu_i(X_0)} \qquad (7)$$

where the sums run over the s rules. The system is thus formed, and the predicted output of the system is given by equation 7.

IV. RESULTS

Stock prices for five organisations are considered and the prediction system is applied to each organisation. The predictions made for the organisations have an accuracy of about 80%. The results for one of the organisations considered are elaborated. The XML file obtained from the feed reader is given as input to the DOM parser. Historical data comprising the opening price and closing price of a specific organization, as collected from the data sheet of the company for a specific period, is shown in Table 1.

Fig 4 Historical Data (closing price versus opening price)

Cluster centres are calculated using the subtractive clustering algorithm. The two cluster centres that were obtained for the past values considered are shown in Fig 5. The data point at the lower range in Fig 5 indicates the cluster centre for a FALL in the price values.
Table 1 Data Sheet

Date        Open      Close      Date        Open      Close
9/26/2007   3779.3    3878.5     8/9/2007    3652.33   3270.68
9/25/2007   3757.84   3778.65    8/8/2007    3497.23   3657.86
9/24/2007   382.57    3759.06    8/7/2007    3467.72   3504.3
9/21/2007   3768.33   3820.9     8/6/2007    383.3     3468.78
9/20/2007   383.52    3766.7     8/3/2007    3462.25   38.9
9/19/2007   3740.6    385.56     8/2/2007    3357.82   3463.33
9/18/2007   3403.8    3739.39    8/1/2007    32.09     3362.37
9/17/2007   344.95    3403.42    7/31/2007   3360.66   32.99
9/14/2007   342.39    3442.52    7/30/2007   3266.2    3358.3
9/13/2007   3292.38   3424.88    7/27/2007   3472.68   3265.47
9/12/2007   3298.3    329.65     7/26/2007   3783.2    3473.57
9/11/2007   329.4     3308.39    7/25/2007   382.4     3785.79
9/10/2007   36.39     327.85     7/24/2007   3940.9    376.95
9/7/2007    3360.74   33.38      7/23/2007   385.73    3943.42
9/6/2007    3306.44   3363.35    7/20/2007   4000.73   385.08
9/5/2007    3442.85   3305.47    7/19/2007   398.79    4000.4
9/4/2007    3358.39   3448.86    7/18/2007   3955.05   398.22
8/31/2007   3240.84   3357.74    7/17/2007   395.96    397.55
8/30/2007   3287.9    3238.73    7/16/2007   3907.09   3950.98
8/29/2007   3043.07   3289.29    7/13/2007   3859.86   3907.25
8/28/2007   338.43    304.85     7/12/2007   3579.33   386.73
8/27/2007   3377.6    3322.3     7/11/2007   3500.4    3577.87
8/24/2007   323.78    3378.87    7/10/2007   3648.59   350.7
8/23/2007   3237.27   3235.88    7/9/2007    362.66    3649.97
8/22/2007   3088.26   3236.3     7/6/2007    3559.0    36.68
8/21/2007   320.05    3090.86    7/5/2007    3576.24   3565.84
8/20/2007   3078.5    32.35      7/3/2007    3556.87   3577.3
8/17/2007   2848.05   3079.08    7/2/2007    3409.6    3535.43
8/16/2007   2859.52   2845.78    6/29/2007   3422.6    3408.62
8/15/2007   302.93    286.47     6/28/2007   3427.48   3422.28
8/14/2007   3235.72   3028.92    6/27/2007   3336.93   3427.73
8/13/2007   3238.24   3236.53    6/26/2007   3352.37   3337.66

Fig 5 Cluster Centres (x-axis: opening price)

Fig 6 Actual Vs Predicted Curves (closing price versus day; curves: ACTUAL, PREDICTED)

Similarly, the data point at the higher range indicates the cluster
centre for a RISE in the price values. We have considered 64 values of opening and closing prices and have plotted the data points as shown in Fig 4. The recent data given to the trained system produces a predicted output, which is indicated as the PREDICTED curve in Fig 6. This curve is plotted in comparison with the actual output, indicated by the ACTUAL curve in Fig 6.

V. CONCLUSION

A subtractive clustering based fuzzy system identification method is used to successfully model a general prediction system that can predict future events by taking samples of past events. Historical data is obtained and used to train the prediction system. Recent data are given as input to the prediction system. All data are specific to the application at hand. The system, which is developed using Java, is tested in one of the many areas where prediction plays an important role, the stock market. Prices from previous sessions of the market are taken as the potential inputs. When recent data are given to the trained system, it predicts the possibility of a rise or a fall along with the next possible value of the data. The accuracy obtained is about 80%.

The prediction model that we have designed is trained with daily market price data. It can also be used as a weekly or a monthly predictor. This serves as one possible area of
future work. Further, the inputs from the XML document and the fuzzy rules can be integrated to serve a real time application.

REFERENCES

[1] Alaa Sheta, "Software Effort Estimation and Stock Market Prediction Using Takagi-Sugeno Fuzzy Models", IEEE International Conference on Fuzzy Systems, pp. 171-178, July 2006.
[2] Chu S.C. and Kim H.S., "Automatic knowledge generation from the stock market data", Proceedings of 93 Korea Japan joint conference on expert systems, pp. 93-208, 1993.
[3] Justin Wolfers, Eric Zitzewitz, "Prediction markets in theory and practice", National Bureau of Economic Research, March 2006.
[4] Khaled Hammouda, Prof. Fakhreddine Karray, "A comparative study of data clustering techniques", March 2006.
[5] R. Babuska, J. A. Roubos, and H. B. Verbruggen, "Identification of MIMO systems by input-output TS fuzzy models", in Proceedings of Fuzzy-IEEE 98, Anchorage, Alaska, 1998.
[6] R. J. Van Eyden, "Application of Neural Networks in the Forecasting of Share Prices", Finance and Technology Publishing, 1996.
[7] Ypke Hiemstra, "A Stock Market Forecasting Support System Based on Fuzzy Logic", Proceedings of the Twenty-Seventh Annual Hawaii International Conference on System Sciences, IEEE, pp. 281-287, 1994.
[8] Chiu, S. L., 1994, "Fuzzy model identification based on cluster estimation", Journal of Intelligent and Fuzzy Systems, 2, John Wiley & Sons, pp. 267-278.
[9] Available online (Process modelling): http://www.itl.nist.gov/div898/handbook/pmd/pmd.htm
[10] Available online: www.wc3.com
[11] Available online: www.ibm.com/developerworks