Dynamic Fuzzy Pattern Recognition


Dissertation approved by the Faculty of Economics (Fakultät für Wirtschaftswissenschaften) of the Rheinisch-Westfälische Technische Hochschule Aachen for the award of the academic degree of Doctor of Economics and Social Sciences (Doktor der Wirtschafts- und Sozialwissenschaften)

submitted by Diplom-Ingenieurin (UA) Larisa Angstenberger, née Mikenina, Magister des Operations Research (M.O.R.), from Kiev, Ukraine

Referees: Univ.-Prof. Dr. rer. pol. Dr. h.c. mult. Hans-Jürgen Zimmermann and Univ.-Prof. Dr. rer. pol. habil. Michael Bastian

Date of the oral examination: 9 May 2000

D 82 (Diss. RWTH Aachen)

This dissertation is available online on the web pages of the university library (Hochschulbibliothek).

Omnia mutantur, nihil interit. (Everything is changing, nothing is getting lost.)
Roman poet Ovid, Metamorphoses 15, 165

All the real knowledge which we possess depends on methods by which we distinguish the similar from the dissimilar.
Swedish naturalist Linnaeus, 1737

Preface

In 1995 I was awarded a one-year grant from the German Academic Exchange Service (DAAD) for research in Germany during my PhD studies. I was very interested in investigating the field of fuzzy logic and fuzzy data analysis, and very happy when I received the invitation from Prof. Dr. Dr. h.c. H.-J. Zimmermann to study at his chair of Operations Research at the RWTH Aachen and to write my PhD thesis under his scientific supervision. Thanks to four years of financial support from the DAAD I managed to graduate from the master course in Operations Research and to carry out the research in the area of fuzzy pattern recognition which has resulted in this thesis.

I am very grateful to Prof. Zimmermann for allowing me to share his knowledge and scientific experience, for his support and encouragement, as well as for his attention to and interest in my research and his scientific suggestions concerning the research direction, possible applications and thesis structure. It was a great pleasure to work with Prof. Zimmermann and to be a part of the team of young researchers at the Institute of Operations Research where, thanks to his particular personality and humour, a very exhilarating and creative atmosphere had developed over the years. I am also much obliged to my co-referee Prof. Bastian for his critical remarks and suggestions for improvements to this thesis.

I would like to thank all my colleagues at the chair of Operations Research for their mutual understanding, for the exchange of knowledge and experience, and for a lot of fun. The competent advice of Dr. Uwe Bath and Dr. Tore Grünert and the technical support of Gilberto von Spar during my work on this thesis were especially important to me. I appreciate the numerous discussions about fuzzy clustering and the philosophical discussions about life with my colleague Dr. Peter Hofmeister. His steady optimism and energy helped me to stay in high spirits despite the hard work.

During my studies at the RWTH Aachen I had the opportunity to work at MIT - Management Intelligenter Technologien GmbH - where I participated in different projects in the area of intelligent data analysis. From 1997 I worked for three years on a new scientific project on dynamic fuzzy data analysis sponsored by the DFG (German Research Foundation). This project motivated me to investigate this new area in greater detail and resulted in a number of publications which served as a basis for my further research. Helpful discussions with my project partner Arno Joentgen positively influenced my research and brought me many new ideas. Thanks to the technical support of Sebastian Greguletz, who provided me with special software for network monitoring, it was possible to gather enough data for the technical application of the dynamic pattern recognition methods introduced in this thesis. I also want to thank one of my best friends, Jens Junker, who helped me to better understand the process of data transmission in computer networks, to identify practical goals in computer network optimisation and to interpret the results of classification.

I am very grateful to Thomas MacFarlane for his very intensive proof-reading of this manuscript, which considerably improved the language and the style of this thesis.

I would like to express my great gratitude to my friend Joachim Angstenberger for his unwavering support and numerous fruitful discussions during my almost two years as a PhD student. I greatly appreciate that he dedicated so much time to reading the first and all subsequent versions of this thesis, since his valuable suggestions and corrections have significantly improved my manuscript. I am very happy that he showed so much patience and understanding for my work even when it left me too little time on weekends.

This thesis is the result of many years of study and research, in which I was constantly supported and motivated by my parents. They showed me the path to knowledge and always helped me to find the most efficient means of reaching my goals. I hope that the results of my work meet the expectations of my parents, my friends and my teachers.

Aachen, June 2000
Larisa Mikenina

Contents

1 INTRODUCTION
  1.1 GOALS AND TASKS OF THE THESIS
  1.2 STRUCTURE OF THE THESIS
2 GENERAL FRAMEWORK OF DYNAMIC PATTERN RECOGNITION
  2.1 THE KNOWLEDGE DISCOVERY PROCESS
  2.2 THE PROBLEM OF PATTERN RECOGNITION
    The process of pattern recognition
    Classification of pattern recognition methods
    Fuzzy pattern recognition
  2.3 THE PROBLEM OF DYNAMIC PATTERN RECOGNITION
    Mathematical description and modelling of dynamic systems
    Terminology of dynamic pattern recognition
    Goals and tasks of dynamic pattern recognition
3 STAGES OF THE DYNAMIC PATTERN RECOGNITION PROCESS
  3.1 THE MONITORING PROCESS
    Shewhart quality control charts
    Fuzzy techniques for the monitoring process
    Fuzzy quality control charts
    Reject options in fuzzy pattern recognition
    Parametric concept of a membership function for a dynamic classifier
  3.2 THE ADAPTATION PROCESS
    Re-learning of the classifier
    Incremental updating of the classifier
    Adaptation of the classifier
    Learning from statistics approach
    Learning with a moving time window
    Learning with a template set
    Learning with a record of usefulness
    Evaluation of approaches for the adaptation of a classifier
4 DYNAMIC FUZZY CLASSIFIER DESIGN WITH POINT-PROTOTYPE BASED CLUSTERING ALGORITHMS
  4.1 FORMULATION OF THE PROBLEM OF DYNAMIC CLUSTERING
  4.2 REQUIREMENTS FOR A CLUSTERING ALGORITHM USED FOR DYNAMIC CLUSTERING AND CLASSIFICATION
  4.3 DETECTION OF NEW CLUSTERS
    Criteria for the detection of new clusters
    Algorithm for the detection of new clusters
  4.4 MERGING OF CLUSTERS
    Criteria for merging of clusters
    Criteria for merging of ellipsoidal clusters
    Criteria and algorithm for merging spherical and ellipsoidal clusters
  4.5 SPLITTING OF CLUSTERS
    Criteria for splitting of clusters
    Search for a characteristic pattern in the histogram
    Algorithm for the detection of heterogeneous clusters to be split
  4.6 DETECTION OF GRADUAL CHANGES IN THE CLUSTER STRUCTURE
  4.7 ADAPTATION PROCEDURE
  4.8 UPDATING THE TEMPLATE SET OF OBJECTS
    Updating the template set after gradual changes in the cluster structure
    Updating the template set after abrupt changes in the cluster structure
  4.9 CLUSTER VALIDITY MEASURES FOR DYNAMIC CLASSIFIERS
  4.10 SUMMARY OF THE ALGORITHM FOR DYNAMIC FUZZY CLASSIFIER DESIGN AND CLASSIFICATION
5 SIMILARITY CONCEPTS FOR DYNAMIC OBJECTS IN PATTERN RECOGNITION
  5.1 EXTRACTION OF CHARACTERISTIC VALUES FROM TRAJECTORIES
  5.2 THE SIMILARITY NOTION FOR TRAJECTORIES
    Pointwise similarity measures
    Choice of the membership function for the definition of pointwise similarity
    Choice of the aggregation operator for the definition of pointwise similarity
    Structural similarity measures
    Similarity model using transformation functions
    Similarity measures based on wavelet decomposition
    Statistical measures of similarity
    Smoothing of trajectories before the analysis of their temporal behaviour
    Similarity measures based on characteristics of trajectories
  5.3 EXTENSION OF FUZZY PATTERN RECOGNITION METHODS BY APPLYING SIMILARITY MEASURES FOR TRAJECTORIES
6 APPLICATIONS OF DYNAMIC PATTERN RECOGNITION METHODS
  6.1 BANK CUSTOMER SEGMENTATION BASED ON CUSTOMER BEHAVIOUR
    Description of the credit data of bank customers
    Goals of bank customer analysis
    Parameter settings for dynamic classifier design and bank customer classification
    Clustering of bank customers in Group Y based on the whole temporal history of 24 months and using the pointwise similarity measure
    Clustering of bank customers in Group N based on the whole temporal history of 24 months and using the pointwise similarity measure
    Segmentation of bank customers in Group Y based on the partial temporal history and using the pointwise similarity measure
    Clustering of bank customers in Group N based on partial temporal history and using the pointwise similarity measure
    Comparison of clustering results for customers in Groups Y and N
  6.2 COMPUTER NETWORK OPTIMISATION BASED ON DYNAMIC NETWORK LOAD CLASSIFICATION
    Data transmission in computer networks
    Data acquisition and pre-processing for network analysis
    Goals of the analysis of load in a computer network
    Parameter settings for dynamic classifier design and classification of data traffic
    Recognition of typical load states in a computer network using the pointwise similarity measure
    Recognition of typical load states in computer network using the structural similarity measure
7 CONCLUSIONS AND FURTHER RESEARCH DIRECTIONS
REFERENCES
APPENDIX
  UNSUPERVISED OPTIMAL FUZZY CLUSTERING ALGORITHM OF GATH AND GEVA
  DESCRIPTION OF IMPLEMENTED SOFTWARE

List of Figures

Figure 2-1: An overview of the steps of the KDD process
Figure 2-2: Basic scheme of the pattern recognition process
Figure 2-3: A taxonomy of pattern recognition methods
Figure 2-4: Current states of dynamic objects from a static viewpoint
Figure 2-5: Projections of three-dimensional trajectories into two-dimensional feature space
Figure 2-6: Formation of new clusters
Figure 2-7: Changes of the dynamic cluster structure
Figure 2-8: Typical scenarios of future temporal development of oil price
Figure 2-9: Changing clusters of typical system states
Figure 2-10: Moving time windows of constant length
Figure 2-11: The process of dynamic pattern recognition
Figure 3-1: Shewhart quality control chart for the characteristic f
Figure 3-2: Fuzzy set "good quality" defined for characteristic f
Figure 3-3: Fuzzy quality control chart
Figure 3-4: Processing of fuzzy quality control charts
Figure 3-5: Ambiguity reject (AR) and distance reject (DR) options in pattern recognition
Figure 3-6: A parametric membership function
Figure 3-7: Adaptation of the classifier based on learning from statistics approach
Figure 3-8: Adaptation of the classifier based on learning with a template set approach
Figure 3-9: Adaptation of the classifier based on the record of usefulness
Figure 4-1: 3-dimensional matrix representation of a dynamic data set
Figure 4-2: Flow chart of an algorithm for the detection of new clusters
Figure 4-3: Two clusters with absorbed objects (circles) and a group of free objects (crosses)
Figure 4-4: Projections of the membership functions obtained for a group of free objects on the feature space
Figure 4-5: Detection of stray objects
Figure 4-6: Merging of fuzzy clusters based on their intersection
Figure 4-7: Merging of fuzzy clusters based on the degree of overlapping of α-cuts
Figure 4-8: Merging of fuzzy clusters based on their standard deviation
Figure 4-9: Illustration of criteria for merging ellipsoidal clusters
Figure 4-10: Membership functions of fuzzy sets "close to 1" and "close to zero"
Figure 4-11: Application of the merging condition in order to avoid impermissible merging
Figure 4-12: The relevance of criterion of parallelism for merging clusters
Figure 4-13: Situations in which elliptical clusters can be merged
Figure 4-14: Situations in which elliptical clusters cannot be merged
Figure 4-15: A flow chart of an algorithm for detecting similar clusters to be merged
Figure 4-16: A heterogeneous cluster in the two-dimensional feature space
Figure 4-17: A heterogeneous cluster in the space of the two first principal components
Figure 4-18: Histogram of objects density with respect to the first principal component
Figure 4-19: Illustration of thresholds of size and distance between centres of density areas
Figure 4-20: Illustration of the search procedure for a density histogram
Figure 4-21: Examples of patterns in the histogram
Figure 4-22: Verification of criteria for splitting a cluster based on the density histogram
Figure 4-23: The effect of smoothing the density histogram
Figure 4-24: A flow chart of an algorithm for detecting heterogeneous clusters to be split
Figure 4-25: The structure of the template set
Figure 4-26: Illustration of the updating procedure of the template set
Figure 4-27: Adaptation of a classifier requires splitting clusters
Figure 4-28: Adaptation of a classifier requires cluster merging
Figure 4-29: The process of dynamic fuzzy classifier design and classification
Figure 5-1: Transformation of a feature vector containing trajectories into a conventional feature vector
Figure 5-2: Illustration of the definition of pointwise similarity between trajectories
Figure 5-3: Triangular and trapezoidal membership functions
Figure 5-4: Exponential and non-linear membership functions
Figure 5-5: Logistic S-shaped membership function
Figure 5-6: An example of two trajectories x(t) and y(t)
Figure 5-7: The sequence of differences between trajectories x(t) and y(t)
Figure 5-8: The sequence of pointwise similarities of trajectories x(t) and y(t)
Figure 5-9: A trajectory x(t) before and after smoothing
Figure 5-10: Structural similarity based on the trend of trajectories
Figure 5-11: Structural similarity based on curvature of trajectories
Figure 5-12: Structural similarity based on the smoothness of trajectories
Figure 5-13: Segmentation of a temporal pattern of a trajectory according to elementary trends
Figure 5-14: Qualitative (left) and quantitative (right) temporal features obtained by segmentation
Figure 5-15: Similarity measure based on peaks
Figure 5-16: The structure of dynamic clustering algorithms based on similarity measures for trajectories
Figure 6-1: Distribution of data with respect to Feature
Figure 6-2: Cluster centres with respect to each feature obtained for customers of Group Y based on the whole temporal history and pointwise similarity between trajectories
Figure 6-3: Degrees of separation between clusters obtained for customers in Group Y based on the whole temporal history
Figure 6-4: Degrees of compactness of clusters obtained for customers in Group Y based on the whole temporal history
Figure 6-5: Cluster centres with respect to each feature obtained for customers in Group N based on the whole temporal history and pointwise similarity between trajectories
Figure 6-6: Degrees of separation between clusters obtained for each customer in Group N based on the whole temporal history
Figure 6-7: Degrees of compactness of clusters for each customer in Group N based on the whole temporal history
Figure 6-8: Cluster centres with respect to each feature obtained for customers in Group Y in the first time window and based on pointwise similarity between trajectories
Figure 6-9: Degrees of separation between clusters obtained for customers in Group Y in the first time window
Figure 6-10: Degrees of compactness of clusters calculated for customers in Group Y in the first time window
Figure 6-11: Cluster centres with respect to each feature obtained for customers in Group N in the first time window and based on pointwise similarity between trajectories
Figure 6-12: Degrees of separation between clusters obtained for customers in Group N in the first time window
Figure 6-13: Degree of compactness of clusters calculated for customers in Group N in the first time window
Figure 6-14: The OSI basic reference model
Figure 6-15: Dependence between the number of collisions and the network load
Figure 6-16: Density distributions of features (left) and 6 (right)
Figure 6-17: Six typical states of the data traffic described by six packet sizes and obtained using the pointwise similarity measure
Figure 6-18: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the pointwise similarity measure
Figure 6-19: Temporal development of average partition density obtained using the pointwise similarity measure
Figure 6-20: Assignment of parts of trajectories to six clusters representing data traffic states and obtained based on the pointwise similarity measure
Figure 6-21: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the structural similarity measure
Figure 6-22: Temporal development of average partition density obtained using the structural similarity measure
Figure 6-23: Assignment of parts of trajectories to six clusters obtained based on the structural similarity
Figure 6-24: Assignment of a part of trajectory from the time interval [6200, 8000] to clusters obtained using the structural similarity measure

List of Tables

Table 4-1: Parameter settings for the detection of new clusters in Example
Table 4-2: Partition densities of new and existing clusters in Example
Table 4-3: Partition densities of new and existing clusters in Example
Table 4-4: Validity measures for a fuzzy partition before and after splitting clusters
Table 4-5: Validity measures for a fuzzy partition before and after merging clusters
Table 5-1: The overall similarity obtained with different aggregation operators
Table 6-1: Dynamic features describing bank customers
Table 6-2: The value range and main quantiles of each feature of Data Group Y
Table 6-3: Main statistics of each feature of the Data Group Y
Table 6-4: The value range and main quantiles of each feature of Data Group N
Table 6-5: Main statistics of each feature of Data Group N
Table 6-6: Scope of the analysis of bank customers
Table 6-7: Parameter settings for the detection of new clusters during customer segmentation
Table 6-8: Number of absorbed and free objects of Data Group Y
Table 6-9: Validity measures for fuzzy partition with two clusters for Data Group Y
Table 6-10: Number of stray objects and validity measures for different fuzzy partitions of Data Group Y
Table 6-11: The number of absorbed and free objects of Data Group N
Table 6-12: Validity measures for fuzzy partition with two clusters for Data Group N
Table 6-13: Number of stray objects and validity measures for different fuzzy partitions of Data Group N
Table 6-14: Number of Customers Y assigned to two clusters in four time windows
Table 6-15: Partition densities of clusters, fuzzy separation and compactness indexes obtained for Customers Y in four time windows
Table 6-16: Temporal change of assignment of customers in Group Y to clusters
Table 6-17: Number of customers N assigned to two clusters in four time windows
Table 6-18: Partition densities of clusters, fuzzy separation and compactness indexes obtained for customers in Group N in four time windows
Table 6-19: Temporal changes of assignment of customers in Group N to clusters
Table 6-20: Dynamic features describing data transmission in computer network
Table 6-21: The value ranges and main quantiles of each feature characterising network data
Table 6-22: Main statistics of each feature of the network data
Table 6-23: Parameter settings for three algorithms of the monitoring procedure used during the network analysis
Table 6-24: Results of dynamic clustering and classification of the network data traffic based on the pointwise similarity measure
Table 6-25: Results of dynamic clustering and classification of the network data traffic based on the structural similarity measure
Table 6-26: Cluster centres representing data traffic states obtained based on the structural similarity measure

1 Introduction

The phenomenal improvements in data collection due to the automation and computerisation of many operational systems and processes in business, technical and scientific environments, together with advances in data storage technologies, have over the last decade led to large amounts of data being stored in databases. Analysing and extracting valuable information from these data has become an important issue in recent research and has attracted considerable attention from companies of all kinds. The use of data mining and data analysis techniques has been recognised as necessary to maintain competitiveness in today's business world, to increase business opportunities and to improve service. A data mining endeavour can be defined as the process of discovering meaningful new correlations, patterns and trends by examining large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques.

Pattern recognition is the research area which provides the majority of methods for data mining and aims at supporting humans in analysing complex data structures automatically. Pattern recognition systems can act as substitutes when human experts are scarce in specialised areas such as medical diagnosis, or in dangerous situations such as fault detection and automatic error discovery in nuclear power plants. Automated pattern recognition can provide valuable support in process and quality control and function continuously with consistent performance. Finally, automated perceptual tasks such as speech and image recognition enable the development of more natural and convenient human-computer interfaces.

Important additional benefits for a wide field of highly complex applications lie in the use of intelligent techniques such as fuzzy logic and neural networks in pattern recognition methods. These techniques have permitted the development of methods and algorithms that can perform tasks normally associated with intelligent human behaviour. For instance, the primary advantage of fuzzy pattern recognition compared to the classical methods is the ability of a system to classify patterns in a non-dichotomous way, as humans do, and to handle vague information. Methods of fuzzy pattern recognition are steadily gaining ground in practice. Some of the fields where intelligent pattern recognition has recently obtained the greatest level of endorsement and success are database marketing, risk management and credit-card fraud detection.

The advent of data warehousing technology has provided companies with the possibility to gather vast amounts of historical data describing the temporal behaviour of a system under study and allows a new type of analysis for improved decision support. Amongst other applications in which objects must be analysed in the process of their motion or temporal development, the following are worth mentioning:

- Monitoring of patients in medicine, e.g. during narcosis, when the development rather than the status of the patient's condition is essential;
- The analysis of data concerning buyers of new cars or other articles in order to determine the customer portfolio;
- The analysis of monthly unemployment rates;
- The analysis of the development of share prices and other characteristics to predict stock markets;
- The analysis of the payment behaviour of bank customers to distinguish between good and bad customers and to detect fraud;
- Technical diagnosis and state-dependent machine maintenance.

The common characteristic of all these applications is that, in the course of time, the objects under study change from one state to another. The order of state changes, or just the collection of states an object has taken, determines the membership of an object to a certain pattern or class. In other words, the history of the temporal development of an object has a strong effect on the result of the recognition process. Such objects, which represent observations of a dynamic system or process and contain a history of their temporal development, are called dynamic. In contrast to static objects, they are represented by a sequence of numerical vectors collected over time.

Conventional methods of statistical and intelligent pattern recognition are, however, of limited benefit in problems in which a dynamic viewpoint is desirable, since they consider objects at a fixed moment of time and do not take into account their temporal behaviour. Therefore there is an urgent need for a new generation of computational techniques and tools to assist humans in extracting knowledge from temporal historical data. The development of such techniques and methods constitutes the focus of this thesis.

1.1 Goals and Tasks of the Thesis

Dynamic pattern recognition is concerned with the recognition of clusters of dynamic objects, i.e. the recognition of typical states in the dynamic behaviour of a system under consideration. The goal of this thesis is to investigate this new field of dynamic pattern recognition and to develop new methods for clustering and classification in a dynamic environment. Due to the changing properties of dynamic objects, the partitioning of objects, i.e. the cluster structure, is obviously not constant over time. The appearance of new observations can lead to gradual or abrupt changes in the cluster structure such as, for instance, the formation of new clusters, or the merging or splitting of existing clusters. In order to follow temporal changes of the cluster structure and to preserve the desired performance, a classifier must possess adaptive capabilities, i.e. a classifier must be automatically adjusted over time according to detected changes in the data structure. Therefore, the development of methods for dynamic pattern recognition consists of two tasks:

- Development of a method for dynamic classifier design that enables the design of an adaptive classifier which can automatically recognise new cluster structures as time passes;
- Development of new similarity measures for trajectories of dynamic objects.

The procedure of dynamic classifier design must incorporate, in addition to the usual static steps, special procedures that allow the application of a classifier in a dynamic environment. These procedures, representing dynamic steps, are concerned with detecting temporal changes in the cluster structure and updating the classifier according to these changes. In order to carry out these steps the design procedure must have at its disposal a monitoring procedure, which supervises the classifier performance, and a mechanism for updating the classifier depending on the results of the monitoring procedure. The different methods suggested in the literature for establishing the monitoring process are based on the observation and analysis of some characteristic values describing the performance of a classifier. If the classifier performance deteriorates they are able to detect changes, but they cannot recognise what kind of temporal changes have taken place. Most of the updating procedures proposed for dynamic classifiers are based either on recursive updating of classifier parameters or on re-learning from scratch using new objects, whereas the old objects are discarded as being irrelevant.

In this thesis a new algorithm for dynamic fuzzy classifier design is proposed, which is based partly on the ideas presented in the literature but also uses a number of novel criteria to establish the monitoring procedure. The proposed algorithm is introduced in the framework of unsupervised learning and allows the design of an adaptive classifier capable of automatically recognising gradual and abrupt changes in the cluster structure as time passes and of adjusting its structure to detected changes. The adaptation laws for updating the classifier and the template data set are coupled with the results of the monitoring procedure and characterised by additional features that should guarantee a more reliable and efficient classifier.

Another important problem arising in the context of dynamic pattern recognition is the choice of a relevant similarity measure for dynamic objects, which is used, for instance, for the definition of the clustering criterion. Most pattern recognition methods use the pairwise distance between objects as a dissimilarity measure from which the degree of membership of objects to cluster prototypes is calculated. As mentioned above, dynamic objects are represented by a temporal sequence of observations and described by multidimensional trajectories in the feature space, or vector-valued functions. Since the distance between vector-valued functions is not defined, classical clustering methods and pattern recognition methods in general are not suited for processing dynamic objects.
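To make the role of the pairwise distance concrete, the following minimal sketch shows how a point-prototype method can turn distances into degrees of membership. It assumes the standard fuzzy c-means membership formula with a fuzzifier m; the function names `fcm_memberships` and `euclidean` and the example values are illustrative assumptions and are not taken from the thesis.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two static feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def fcm_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships of one object to c prototypes.

    distances: non-negative distances from the object to each prototype.
    m: fuzzifier (> 1); m = 2 is a common default (assumption).
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    # An object lying exactly on one or more prototypes belongs fully to them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for d_i in distances:
        denominator = sum((d_i / d_j) ** exponent for d_j in distances)
        memberships.append(1.0 / denominator)
    return memberships


# Static case: each object is a single vector, so a distance is available.
prototypes = [[0.0, 0.0], [4.0, 0.0]]
obj = [1.0, 0.0]
u = fcm_memberships([euclidean(obj, p) for p in prototypes])
```

For a dynamic object, `obj` would be a whole sequence of such vectors, and `euclidean` would no longer apply directly; this is the gap that a similarity measure for trajectories has to fill.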

One approach used in most applications for handling dynamic objects in pattern recognition is to transform trajectories into conventional feature vectors during pre-processing. The alternative approach addressed in this thesis requires the definition of a similarity measure for trajectories that takes into account their dynamic behaviour. For this approach it is important to determine a specific criterion for similarity. Depending on the application, this may require either the best match of trajectories, obtained by minimising the pointwise distance, or a similar form of trajectories independent of their location relative to each other. The similarity measure for trajectories can be applied instead of the distance measure to modify classical pattern recognition methods.

The combination of a new method for dynamic classifier design, which can be applied and modified for different types of classifiers, with a set of similarity measures for trajectories leads to a new class of methods for dynamic pattern recognition.

1.2 Structure of the Thesis

This thesis is organised as follows. In Chapter 2, a general view of the pattern recognition problem is given. Starting with the main principles of the knowledge discovery process, the role of pattern recognition in this process will be discussed. This will be followed by the formulation of the classical (static) problem of pattern recognition and the classification of methods in this area with respect to different criteria. Particular attention will be given to fuzzy techniques, which constitute the focus of this thesis. Since dynamic pattern recognition represents a relatively new research area and a standard terminology does not yet exist, the main notions and definitions used in this thesis will be introduced and the main problems and tasks of dynamic pattern recognition will be considered.

Chapter 3 provides an overview of techniques that can be used as components for designing dynamic pattern recognition systems. The advantages and drawbacks of different statistical and fuzzy approaches for the monitoring procedure will be discussed, followed by the analysis of updating strategies for a dynamic classifier. It will be shown that the classifier design cannot be separated temporally from the phase of its application to the classification of new objects and is carried out in a closed learning-and-working cycle. The adaptive capacity of a classifier depends crucially on the chosen updating strategy and on the ability of the monitoring procedure to detect temporal changes.

Based on the results of Chapters 2 and 3, a new method for dynamic fuzzy classifier design will be developed in Chapter 4. The design procedure consists of three main components: the monitoring procedure, the adaptation procedure for the classifier and the adaptation procedure for the training data set used to learn a classifier. New heuristic algorithms proposed for the monitoring procedure in the framework of unsupervised learning facilitate the recognition of gradual and abrupt temporal changes in the cluster structure, based on the analysis of membership functions of fuzzy clusters and the density of objects within clusters. The adaptation law of the classifier is a flexible combination of two updating strategies, each depending on the result of the monitoring procedure, and provides a mechanism to adjust the parameters of the classifier to detected changes in the course of time. The efficiency of the dynamic classifier is guaranteed by a set of validity measures controlling the adaptation procedure.

The problem of handling dynamic objects in pattern recognition is addressed in Chapter 5. After considering different types of similarity and introducing different similarity models for trajectories, a number of definitions of specific similarity measures will be proposed. They are based on a set of characteristics describing the temporal behaviour of trajectories and representing different context-dependent meanings of similarity.

The efficiency of the proposed method for dynamic classifier design, combined with new similarity measures for trajectories, is examined in Chapter 6 using two application examples based on real data. The first application is concerned with load optimisation in a computer network based on on-line monitoring and dynamic recognition of current load states. The second application, from the credit industry, concerns the problem of segmenting bank customers based on their behavioural data. The analysis allows one to recognise tendencies and temporal changes in the customer structure and to follow transitions of single customers between segments.

Finally, Chapter 7 summarises the results and their practical implications and outlines new directions for future research.
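The two similarity criteria distinguished in Section 1.1, best match by pointwise distance versus similar form independent of relative location, can be illustrated with a small sketch. It is only an illustration under stated assumptions: trajectories are univariate and equally sampled, the exponential membership function, the arithmetic mean as aggregation operator and the use of first differences as a description of form are all assumptions, and the names `pointwise_similarity` and `shape_similarity` are hypothetical rather than the measures defined later in the thesis.

```python
import math


def pointwise_similarity(x, y, scale=1.0):
    """Mean over time of exp(-|x(t) - y(t)| / scale); 1.0 means identical."""
    if len(x) != len(y) or not x:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.exp(-abs(a - b) / scale) for a, b in zip(x, y)) / len(x)


def shape_similarity(x, y, scale=1.0):
    """Pointwise similarity of the first differences (local trends).

    A constant vertical offset between x and y cancels out, so only the
    form of the two trajectories is compared.
    """
    if len(x) < 2 or len(y) < 2:
        raise ValueError("at least two observations are needed")
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    return pointwise_similarity(dx, dy, scale)


# Two trajectories with the same form but a different location.
x = [0.0, 1.0, 3.0, 2.0]
y = [v + 5.0 for v in x]

far_apart = pointwise_similarity(x, y)   # small: the curves never come close
same_form = shape_similarity(x, y)       # 1.0: the local trends are identical
```

Which of the two values is the relevant one depends on the application, which is why the criterion for similarity has to be fixed before a clustering method is modified to use it.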


2 General Framework of Dynamic Pattern Recognition

Most methods of pattern recognition consider objects at a fixed moment in time without taking into account their temporal development. However, there are many applications in which the order of state changes of an object over time determines its membership to a certain pattern, or class. In these cases, for the correct recognition of objects it is very important not only to consider properties of objects at a certain moment in time but also to analyse properties characterising their temporal development. This means that the history of temporal development of an object has a strong effect on the result of the recognition process. Classical methods of pattern recognition are not suitable for processing objects described by temporal sequences of observations. In order to deal with problems in which a dynamic viewpoint is desirable, methods of dynamic pattern recognition must be applied.

The field of pattern recognition is a rapidly growing research area within the broader field of machine intelligence. The increasing scientific interest in this area and the numerous efforts at solving pattern recognition problems are motivated by the challenge of this problem and its potential applications. The primary intention of pattern recognition is to automatically assist humans in analysing the vast amount of available data and extracting useful knowledge from it.

In order to understand the mechanism of extracting knowledge from data and the role of pattern recognition in fulfilling this task, the main principles of the knowledge discovery process will be described in Section 2.1. Then the problem of classical (static) pattern recognition will be formulated in Section 2.2, which should provide a general framework for the investigations in this thesis. Since the main topic of this thesis centres on the relatively new area of dynamic pattern recognition, the main notions and definitions used in this area will be presented in Section 2.3. Finally, the goal and basic steps of the dynamic pattern recognition process will be summarised.
2.1 The Knowledge Discovery Process

The task of finding useful patterns in raw data is known in the literature under various names including knowledge discovery in databases, data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. The term knowledge discovery in databases (KDD), which appeared in 1989, refers to the process of finding knowledge in data by applying particular data mining methods at a high level. In the literature KDD is often used as a synonym of data mining since the goal of both processes is to mine for pieces of knowledge in data. In [Fayyad et al., 1996, p. 2] it is, however, argued that KDD is related to the overall process of discovering useful knowledge from data while data mining is concerned with the application of algorithms for extracting patterns from data without the additional steps of the KDD process, such as considering relevant prior knowledge and a proper interpretation of the results.

The burgeoning interest in this research area is due to a rapid growth of many scientific, business and industrial databases in the last decade. Advances in data collection in science and industry, the widespread usage of bar codes for almost all commercial products, and the computerisation of many business and government transactions have produced a flood of data which has been transformed into mountains of stored data using modern data storage technologies. According to [Frawley et al., 1992], the amount of information in the world doubles every 20 months. The desire to discover important knowledge in the vast amount of existing data makes it necessary to look for a new generation of techniques and tools with the ability to intelligently and automatically assist humans in analysing the mountains of data for nuggets of useful knowledge [Fayyad et al., 1996, p. 2]. These techniques and tools are the object of the field of knowledge discovery in databases.

In [Frawley et al., 1991] the following definition of the KDD process is proposed: "Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data." The interpretation of the different terms in this definition is as follows.

Process: KDD represents a multi-step process, which involves data preparation, search for patterns and knowledge evaluation, and allows return to previous steps for refinement of results. It is assumed that this process has some degree of search autonomy, i.e. it can investigate by itself complex dependencies in data and present only interesting results to the user. Thus, the process is considered to be non-trivial.

Validity: The discovered patterns should be valid not only on the given data set, but on new data with some degree of certainty. For the evaluation of certainty, a certainty measure function can be applied.
Novelty: The discovered patterns should be new (at least to the system). A degree of novelty can be measured based on changes in data or knowledge by comparing current values or a new finding to previous ones.

Potentially useful: The discovered patterns should be potentially usable and relevant to a concrete application problem and should lead to some useful actions. A degree of usefulness can be measured by some utility function.

Ultimately understandable: The patterns should be easily understandable to humans. For this purpose, they should be formulated in an understandable language or represented graphically. In order to estimate this property, different simplicity measures can be used which take into account either the size of patterns (syntactic measures) or the meaning of patterns (semantic measures).

In order to evaluate an overall measure of pattern value combining validity, novelty, usefulness and simplicity, a function of significance is usually defined. If the value of significance for a discovered pattern exceeds a user-specified threshold, then this pattern can be considered as knowledge by the KDD process.

Figure 2-1: An overview of the steps of the KDD process [Fayyad et al., 1996, p. 10] (flow: Data → Selection → Target data → Preprocessing → Preprocessed data → Transformation → Transformed data → Data Mining → Patterns → Interpretation/Evaluation → Knowledge)

Figure 2-1 provides an overview of the main steps/stages of the KDD process, emphasising its iterative and interactive nature [Fayyad et al., 1996, p. 10]. The prerequisite of the KDD process is an understanding of the application domain, the relevant prior knowledge, and the goals of the user. The process starts with the raw data and finishes with the extracted knowledge acquired during the following six stages:

1. Selection: Selecting a target data set (according to some criteria) on which discovery will be performed.
2. Pre-processing: Applying basic operations of data cleaning, including the removal of noise, outliers and irrelevant information, deciding on strategies for handling missing data values, and analysing information contained in time series. If the data has been drawn from different sources and has inconsistent formats, the data is reconfigured to a consistent format.

3. Transformation: Selecting relevant features to represent the data using dimensionality reduction, projection or transformation techniques. The data are reduced to a smaller set of representative usable data.
4. Data Mining: Extracting patterns from data in a particular representational form relating to the chosen data mining algorithm. First, the task of data mining (classification, clustering, regression, etc.) is defined according to the goal of the KDD process. Then the appropriate data mining algorithms are selected. Finally, the chosen algorithms are applied to the data to find patterns of interest.
5. Interpretation/evaluation: Translating discovered patterns into knowledge that can be used to support the human decision-making process. If the discovered patterns do not satisfy the goals of the KDD process, it may be necessary to return to any of the previous steps for further iteration.
6. Consolidating discovered knowledge: Incorporating knowledge into the performance system or reporting it to the end-user. This step also includes checking for potential conflicts with previously believed or extracted knowledge.

As can be seen, the KDD process may involve a significant number of iterations and contain loops between any two steps. Although the KDD process must be autonomous, the key role of interactions between a human and the data during the discovery process must be emphasised. A prerequisite of a successful discovery process is that the human user is ultimately involved in many, if not all, steps of the process. In [Brachman, Anand, 1996] the exact nature of these interactions between a human and the data is investigated from a practical point of view.

It is important to note that Step 4 of the flow chart depicted in Figure 2-1 is concerned with the application of data mining algorithms to a concrete problem.
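The stages listed above can be sketched as a chain of small functions. The following is only an illustrative toy, not a method from the cited literature: the records, the field names, the mean imputation, the threshold "pattern" and all function names are assumptions made for this example.

```python
from statistics import mean

# Hypothetical raw records: (customer id, age, balance); None marks a missing value.
RAW = [
    ("c1", 34, 1200.0),
    ("c2", 51, None),
    ("c3", 29, 300.0),
    ("c4", 45, 5100.0),
    ("c5", 38, 900.0),
]

def select(records):
    """Stage 1: keep only the fields relevant to the discovery goal."""
    return [(age, balance) for _, age, balance in records]

def preprocess(rows):
    """Stage 2: handle missing balances by imputing the mean of the known ones."""
    fill = mean(b for _, b in rows if b is not None)
    return [(age, b if b is not None else fill) for age, b in rows]

def transform(rows):
    """Stage 3: reduce each record to a single feature (the balance)."""
    return [b for _, b in rows]

def mine(values, threshold):
    """Stage 4: a deliberately trivial 'pattern', a split of the values at a threshold."""
    return {
        "low": [v for v in values if v < threshold],
        "high": [v for v in values if v >= threshold],
    }

def evaluate(pattern, min_group_size=1):
    """Stage 5: accept the pattern only if every group is populated."""
    return all(len(group) >= min_group_size for group in pattern.values())

def kdd(records, threshold=1000.0):
    """Run stages 1 to 5; stage 6 is represented by returning the accepted pattern."""
    pattern = mine(transform(preprocess(select(records))), threshold)
    return pattern if evaluate(pattern) else None

result = kdd(RAW)
```

If `evaluate` rejects the pattern, `kdd` returns `None`, which stands in for the return to an earlier stage with different settings.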
When data mining or data analysis is itself regarded as a process, however, it usually involves the same steps described above, which emphasises the need for pre-processing and transformation procedures in order to obtain successful results from the analysis. This understanding explains the fact that data mining, data analysis and knowledge discovery processes are often used as equivalent notions in the literature ([Angstenberger, 1997], [Dilly, 1995], [Petrak, 1997]).

KDD overlaps with various research areas including machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, and data visualisation, and uses methods and techniques from these diverse fields [Fayyad et al., 1996, p. 4-5]. For instance, the common goals of machine learning, pattern recognition and KDD lie in the development of algorithms for extracting patterns and models from data (data mining methods). But KDD is additionally concerned with the extension of these algorithms to problems with very large real-world databases, while machine learning typically works with smaller data sets. Thus, KDD can be viewed as part of the broader fields of machine learning and pattern recognition, which include not only learning from examples but also reinforcement learning, learning with a teacher, etc. [Dilly, 1995]. KDD often makes use of statistics, particularly exploratory data analysis, for modelling data and handling noisy data.

In this thesis the attention will be focussed on pattern recognition, which represents one of the largest fields in KDD and provides the large majority of methods and techniques for data mining.

2.2 The Problem of Pattern Recognition

The concept of patterns has a universal importance in intelligence and discovery. Most instances of the world are represented as patterns containing knowledge, if only one could discover it. Pattern recognition theory involves learning similarities and differences of patterns that are abstractions of objects in a population of non-identical objects. The associations and relationships between objects make it possible to discover patterns in data and to build up knowledge. The most frequently observed types of patterns are as follows: (rule-based) relationships between objects, temporal sequences, spatial patterns, groups of similar objects, mathematical laws, deviations from statistical distributions, exceptions and striking objects [Petrak, 1997].

A human's perceptive power seems to be well adapted to the pattern-processing task. Humans are able to recognise printed characters and words as well as handwritten characters, speech utterances, favourite melodies, the faces of friends in a crowd, different types of weave, scenes in images, contextual meanings of word phrases, and so forth. Humans are also capable of retrieving information on the basis of associated clues including only a part of the pattern. Humans learn from experience by accumulating rules in various forms such as associations, tables, relationships, inequalities, equations, data structures, logical implications, and so on. A desire to understand the basis for these powers in humans is the reason for the growing interest in investigating the pattern recognition process.
The subject area of pattern recognition belongs to the broader field of machine learning, whose primary task is the study of how to make machines learn and reason as humans do in order to make decisions [Looney, 1997, p. 4]. In this context learning refers to the construction of rules based on observations of environmental states and transitions. Machine learning algorithms examine the input data set with its accompanying information and the results of the learning process given in the form of statements, and learn to reproduce these and to make generalisations about new observations. Ever since computers were first designed, the intention of researchers has been that of making them intelligent and giving them the same information-processing capabilities that humans possess. This would make computers more efficient in handling real-world tasks and would make them more compatible with the way in which humans behave.

In the following section the pattern recognition process will be considered and its main steps examined. This will be followed by a classification of pattern recognition methods regarding different criteria. Finally, the characteristics of a special class of fuzzy pattern recognition methods will be discussed along with the advantages that they afford dynamic pattern recognition.

2.2.1 The process of pattern recognition

Pattern recognition is one of the research areas that tries to explore mathematical and technical aspects of perception, a human's ability to receive, evaluate, and interpret information about his or her environment, and to support humans in carrying out this task automatically. The goal of pattern recognition is to classify objects of interest into one of a number of categories or classes [Therrien, 1989, p. 1]. This process can be viewed as a mapping of an object from the observation space into the class-membership space [Zadeh, 1977] or a search for structure in data [Bezdek, 1981, p. 1]. Objects of interest may be any physical process or phenomenon. The basic scheme of the pattern recognition process is illustrated in Figure 2-2.

Figure 2-2: Basic scheme of the pattern recognition process [adapted from Therrien, 1989, p. 2] (pattern recognition system: observation vector y in the observation space → feature extraction → feature vector x in the feature space → classifier → one of the classes in the decision space, i.e. the output decision)

Information about the object coming from different measurement devices is summarised in the observation vector. The observation space is usually of a high dimension and is transformed into a feature space. The purpose of this transformation is to extract the smallest possible set of distinguishing features that lead to the best possible classification results. In other words, it is advantageous to select features in such a way that feature vectors, or patterns, belonging to different classes occupy different regions of the feature space [Jain, 1986, p. 2]. The resulting feature space is of a much lower dimension than the observation space.
A procedure of selecting a set of sufficient features from a set of available features is called feature extraction. It may be based on intuition or knowledge of the physical characteristics of the problem, or it may be a mathematical technique that reduces the dimensionality of the observation space in a prescribed way.

The next step is a transformation of the feature space into a decision space, which is defined by a (finite) set of classes. A classifier, which is a device or algorithm, generates a partitioning of the feature space into a number of decision regions. The classifier is designed either using some set of objects, the individual classes of which are already known, or by learning classes based on similarities between objects. Once the classifier is designed and a desired level of performance is achieved, it can be used to classify new objects. This means that the classifier assigns every feature vector in the feature space to a class in the decision space.

Depending on the information available for classifier design, one can distinguish between supervised and unsupervised pattern recognition ([Therrien, 1989, p. 2], [Jain, 1986, p. 6-7]). In the first case there exists a set of labelled objects with a known class membership. A part of this set is extracted and used to derive a classifier. These objects build the training set. The remaining objects, whose correct class assignments are also known, are referred to as the test set and used to validate the classifier's performance. Based on the test results, suitable modifications of the classifier's parameters can be carried out. Thus, the goal of supervised learning, also called classification, is to find the underlying structure in the training set and to learn a set of rules that allows the classification of new objects into one of the existing classes.

The problem of unsupervised pattern recognition, also called clustering, arises if cluster memberships of available objects, and perhaps even the number of clusters, are unknown.
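The supervised setting described above (a labelled training set to derive the classifier, a labelled test set to validate it) can be illustrated with a very small sketch. The nearest-prototype rule used here is only one simple choice of classifier and is not prescribed by the text; the data and all names are hypothetical.

```python
import math

def class_means(training_set):
    """Derive the classifier: one prototype (mean feature vector) per class label."""
    sums, counts = {}, {}
    for features, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(prototypes, features):
    """Assign a feature vector to the class whose prototype is nearest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], features))

def accuracy(prototypes, test_set):
    """Validate the classifier: fraction of test objects assigned to their known class."""
    hits = sum(classify(prototypes, f) == label for f, label in test_set)
    return hits / len(test_set)

# Hypothetical labelled objects: (feature vector, class label).
TRAINING = [([1.0, 1.0], "A"), ([1.0, 2.0], "A"), ([5.0, 5.0], "B"), ([6.0, 5.0], "B")]
TEST = [([1.5, 1.0], "A"), ([5.5, 6.0], "B"), ([2.0, 2.0], "A")]

prototypes = class_means(TRAINING)
test_accuracy = accuracy(prototypes, TEST)
```

A low `test_accuracy` would be the signal to modify the classifier's parameters, for instance by choosing different features or a different decision rule.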
In the unsupervised case, a classifier is designed based on similar properties of objects: objects belonging to the same cluster should be as similar as possible (homogeneity within clusters) and objects belonging to different clusters should be clearly distinguishable (heterogeneity between clusters). The notion of similarity is either prescribed by a classification algorithm, or has to be defined depending on the application. If objects are real-valued feature vectors, then the Euclidean distance between feature vectors is usually used as a measure of dissimilarity of objects. Hence, the goal of clustering is to partition a given set of objects into clusters, or groups, which possess the properties of homogeneity and heterogeneity. It is obvious that unsupervised learning of the classifier is much more difficult than supervised learning; nevertheless, efficient algorithms in this area do exist. Classification and clustering represent the primary tasks of pattern recognition.

2.2.2 Classification of pattern recognition methods

Over the past two decades a lot of methods have been developed to solve pattern recognition problems. These methods can be grouped into two approaches ([Fu, 1982b, p. 2], [Bunke, 1986, p. 367]): the decision-theoretic and the syntactic approach. The decision-theoretic approach, described in the previous section, is the most common one. The origin of this approach is related to the development of statistical pattern recognition ([Duda, Hart, 1973], [Devijver, Kittler, 1982], [Therrien, 1989]). The goal of statistical methods is to derive class boundaries from statistical properties of feature vectors through procedures known in statistics as hypothesis testing. The hypotheses in this case are that a given object belongs to one of the possible classes. The validity measure used in the decision rule is the probability of making an incorrect decision, or the probability of error. The decision rule is optimal if it minimises the probability of error or another quantity closely related to it.

Decision-theoretic methods can be classified, depending on the representation form of information describing objects and on their application area, into three groups: algorithmic, neural network-based and rule-based methods [Zimmermann, 1996, p. 244]. Among algorithmic methods one can distinguish between statistical (described above), clustering and fuzzy pattern matching methods. Clustering methods represent a large class of algorithms for unsupervised learning of structure in data ([Bezdek, 1981], [Gustafson, Kessel, 1979], [Gath, Geva, 1989], [Krishnapuram et al., 1993], [Krishnapuram, Keller, 1993]). They aim at grouping objects into homogeneous clusters that are defined so that the intra-cluster distances are minimised while the inter-cluster distances are maximised. A fuzzy pattern matching technique proposed by [Cayrol et al., 1980, 1982] is based on possibility and necessity measures and aims to estimate the compatibility between an object and prototype values of a class. The fuzzy pattern matching approach was extended by [Dubois et al., 1988] and its general framework summarised by [Grabisch et al., 1997]. The idea of this group of methods is to build fuzzy prototypes of classes in the form of fuzzy sets and, during classification, to match a new object with all class prototypes and select the class with the highest matching degree.
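The clustering criterion mentioned above (small intra-cluster distances, measured with the Euclidean distance) can be made concrete with a crisp clustering sketch. The alternating scheme below is the well-known hard c-means (k-means) iteration, chosen here only as an illustration; the data, the initial centres and all names are assumptions for this example.

```python
import math

def assign(objects, centres):
    """Crisp assignment: each object gets the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda k: math.dist(x, centres[k]))
            for x in objects]

def update(objects, labels, centres):
    """Move each centre to the mean of its objects; an empty cluster keeps its centre."""
    new_centres = []
    for k, old in enumerate(centres):
        members = [x for x, lab in zip(objects, labels) if lab == k]
        if members:
            new_centres.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centres.append(list(old))
    return new_centres

def hard_c_means(objects, centres, max_iter=100):
    """Alternate centre update and reassignment until the partition stops changing."""
    labels = assign(objects, centres)
    for _ in range(max_iter):
        centres = update(objects, labels, centres)
        new_labels = assign(objects, centres)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centres

def within_cluster_sse(objects, labels, centres):
    """Homogeneity measure: sum of squared distances of objects to their own centre."""
    return sum(math.dist(x, centres[lab]) ** 2 for x, lab in zip(objects, labels))

OBJECTS = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [8.0, 8.0], [9.0, 8.0], [8.0, 9.0]]
labels, centres = hard_c_means(OBJECTS, centres=[[0.0, 0.0], [1.0, 0.0]])
sse = within_cluster_sse(OBJECTS, labels, centres)
```

Even with both initial centres placed in the same group, the iteration separates the two groups here; in general the result depends on the initialisation.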
Rule-based classification methods ([Zimmermann, 1993], [Kastner, Hong, 1984], [Boose, 1989]) are based on principles of expert systems. The first step of these methods consists of the knowledge acquisition corresponding to a classifier design. Knowledge about causal dependencies between feature vectors and decisions (classes) is formulated in the form of if-then rules, which are obtained either using experts' decisions or generated automatically from a training data set. The resulting rule base allows a mapping of a set of objects into a set of classes imitating human decision behaviour. Classification of a new object is performed in the inference engine of the expert system. The result of inference (a membership function or a singleton) is matched with class descriptions by determining the similarity of the result with class descriptions, and an object is assigned to the class with the highest degree of similarity.

Pattern recognition methods based on artificial neural networks have proven to be a very powerful and efficient tool for pattern recognition. They can be categorised into two general groups: feed-forward neural networks ([Rosenblatt, 1958], [Minsky, Papert, 1988], [Hornik et al., 1989]) used for supervised classification and self-organising networks ([Kohonen, 1988]) enabling unsupervised clustering of data. Neural networks are able to learn a highly non-linear mapping of feature vectors in the input space into class vectors in the output decision space without applying any mathematical model. The desired mapping is achieved by appropriately adjusting the weights of neurones during supervised training or self-organisation.

Hybrid neuro-fuzzy methods combine the advantages of neural networks and fuzzy sets and compensate for their disadvantages. Neural network techniques enable neuro-fuzzy systems to learn new information from a given training data set in a supervised or unsupervised mode, but the behaviour of neural networks cannot be interpreted easily. On the other hand, fuzzy systems are interpretable, plausible and transparent rule-based systems, which in general cannot learn. The knowledge has first to be acquired and provided to the system in the form of if-then rules. Furthermore, fuzzy set theory enables neuro-fuzzy systems to present the learned information in a humanly understandable form. The most important works on this subject are [Lee, Lee, 1975], [Huntsberger, Ajjimarangsee, 1990], [Kosko, 1992] and [Nauck et al., 1994].

With the syntactic pattern recognition approach ([Fu, 1974], [Fu, 1982a]), objects are represented by sentences, in contrast to feature vectors in decision-theoretic pattern recognition. Each object is described by structural features corresponding to the letters of an alphabet. Recognition of a class assignment of an object is usually done by hierarchically parsing the object structure according to a given set of syntax rules. One way to parse the sentences is to use a finite state machine or finite automaton [Kohavi, 1978]. In this approach an analogy between the object structure and the syntax of a language can be seen.

A taxonomy of pattern recognition methods is summarised in Figure 2-3.
Considering the taxonomy of pattern recognition methods, an additional general distinction of all different techniques of the decision-theoretic approach into two classes can be drawn: classical (crisp) and fuzzy methods. Crisp algorithms for pattern recognition generate partitions such that an object belongs to exactly one cluster. In many cases, however, there is no distinct cluster structure allowing the clear assignment of an object to strictly one cluster. For instance, there may be a situation where objects are located between clusters, building bridges, or where clusters overlap to some degree, so that objects belong to several clusters. Another example is given by outliers belonging to none of the clusters or to all of them to the same degree. Such situations can be handled well by a human, who has an ability to classify in a non-dichotomous way [Zimmermann, 1996, p. 242]. However, traditional pattern recognition methods cannot provide an adequate solution to such problems, since they do not take into consideration the similarity of objects to cluster representatives. Fuzzy methods can avoid this information loss by generating a degree of membership of objects to clusters.
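The difference between a crisp assignment and a degree of membership can be seen on a single object lying between two clusters. The sketch below uses the standard fuzzy c-means membership formula with fuzzifier m (an assumption here, since the text above does not yet fix a particular algorithm); the cluster centres and all names are hypothetical.

```python
import math

def fuzzy_memberships(x, centres, m=2.0):
    """Fuzzy c-means style membership degrees of object x to each cluster centre.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the Euclidean
    distance of x to centre i. The degrees sum to one.
    """
    distances = [math.dist(x, c) for c in centres]
    # An object coinciding with one or more centres belongs fully to them.
    if any(d == 0.0 for d in distances):
        hits = [d == 0.0 for d in distances]
        return [1.0 / sum(hits) if h else 0.0 for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
            for d_i in distances]

def crisp_label(x, centres):
    """Crisp assignment: index of the nearest centre (ties go to the lower index)."""
    return min(range(len(centres)), key=lambda k: math.dist(x, centres[k]))

CENTRES = [[0.0], [10.0]]          # two hypothetical one-dimensional cluster centres

bridge = fuzzy_memberships([5.0], CENTRES)    # object halfway between the clusters
typical = fuzzy_memberships([1.0], CENTRES)   # object close to the first centre
```

For the bridging object the crisp rule still returns a single cluster index and hides the ambiguity, whereas the membership degrees show that it belongs equally to both clusters.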

Figure 2-3: A taxonomy of pattern recognition methods (node labels: Pattern Recognition Methods; Decision-theoretic; Structural/syntactic; Algorithmic; Rule-based; Neural networks; Automata; Statistical; Parametric; Nonparametric; Bayesian; Estimation; Clustering; Hierarchical; Objective-function methods; Fuzzy pattern matching; Binary logical rules; Fuzzy logical rules; Feedforward neural networks; Multiple layered perceptrons; Functional link nets; Radial basis functions; Self-organising algorithms; Self-organising feature maps; Adaptive resonance theory; Hybrid neuro-fuzzy methods; Fuzzy self-organising maps; Fuzzy functional link nets; Deterministic; Stochastic; Hopfield recurrent neural networks; Bifurcational associative maps)

In this thesis the attention will be mainly focussed on clustering methods, which can be categorised according to the underlying techniques and the type of clustering criterion used into the following main (non-disjoint) groups [Höppner et al., 1996, p. 8-9]:

Incomplete clustering methods: This group of methods is represented by geometric methods and projection techniques. The goal of these methods is to reduce the dimensionality of a given set of multidimensional objects (using e.g. principal component analysis) in order to represent objects graphically in two- or three-dimensional space. Clustering of objects is performed manually by visual examination of the data structure.

Deterministic crisp clustering methods: Using these methods, each object is assigned to exactly one cluster so that the resulting cluster structure defines a clear partition of objects.

Overlapping clustering methods: Each object is assigned to at least one cluster, but it may as well belong to several clusters at the same time.

Probabilistic clustering methods: For each object, a probability distribution over all clusters is determined, which gives the probability of an object belonging to a certain cluster.

Fuzzy clustering methods: This group of methods generates degrees of membership of objects to clusters, which indicate to what extent an object belongs to clusters. Degrees of membership of an object to all clusters are usually normalised to one.

Possibilistic clustering methods: This group represents a sub-class of fuzzy clustering methods that produce possibilistic memberships of objects to clusters. The probabilistic constraint that the memberships of an object across clusters must sum to one is dropped in possibilistic methods. The resulting degrees of membership are interpreted as degrees of compatibility, or typicality, of objects to clusters.

Hierarchical clustering methods: These methods generate a hierarchy of partitions by means of successive merging (agglomerative algorithms) or splitting (divisive algorithms) of clusters. This hierarchy is usually represented by a dendrogram, which might be used to estimate an appropriate number of clusters. These methods are not iterative, i.e. the assignment of objects to clusters made on preceding levels cannot be changed.

Objective-function based clustering methods: These methods use the formulation of the clustering criterion in the form of an objective function to be optimised. The objective function provides an estimate of the partition quality for each cluster partition. In order to find a partition with the best value of the objective function, the optimisation problem must be solved.

The considerations in this work will be primarily limited to fuzzy and possibilistic clustering methods with an objective function.

2.2.3 Fuzzy pattern recognition

The desire to fill the gap between traditional pattern recognition methods and human behaviour has led to the development of fuzzy set theory, introduced by L.A. Zadeh in 1965. The fundamental role of fuzzy sets in pattern recognition, as stated by [Zadeh, 1977], is to make the opaque classification schemes usually used by a human transparent by developing a formal, computer-realisable framework. In other words, fuzzy sets help to transfer qualitative knowledge regarding a classification task into the relevant algorithmic structure. The basic tool used for this interface is the membership function. Its meaning can be interpreted differently depending on the application area of fuzzy sets.
Three semantics of a membership grade can be distinguished, according to [Dubois, Prade, 1997], in terms of similarity, uncertainty, or preference, respectively. A view as a degree of uncertainty is usually used in expert systems and artificial intelligence, and the interpretation as a degree of preference is concerned with fuzzy optimisation and decision analysis. Pattern recognition works with the first semantic, which can be formulated as follows: Consider a fuzzy set $\tilde{A}$, defined on the universe of discourse $X$, and the degree of membership $u_{\tilde{A}}(x)$ of an element $x$ in the fuzzy set $\tilde{A}$. Then $u_{\tilde{A}}(x)$ is the degree of proximity of $x$ to prototype elements of $\tilde{A}$ and is interpreted as a degree of similarity.

This view, besides giving a meaning to the semantic, shows the distinction between a membership grade and the different interpretations of a probability value. Consider a pattern $x$ and a class $A$. On observing $x$, the prior probability $P(x \in A) = 0.95$, expressing that the pattern $x$ belongs to the class $A$, becomes a posterior probability: either $P(x \in A \mid x) = 1$ or $P(x \in A \mid x) = 0$. However, a degree $u_{\tilde{A}}(x) = 0.95$ to which the pattern $x$ is similar to patterns of the class $A$ remains unchanged after observation [Bezdek, 1981, p. 2].

Fuzzy set theory provides a suitable framework for pattern recognition due to its ability to deal with uncertainties of the non-probabilistic type. In pattern recognition uncertainty may arise from a lack of information, imprecise measurements, random occurrences, vague descriptions, or conflicting or ambiguous information ([Zimmermann, 1997], [Bezdek, 1981]) and can appear in different circumstances, for instance in definitions of features and, accordingly, objects, or in definitions of classes. Different methods process uncertainty in various ways. Statistical methods based on probability theory assume features of objects to be random variables and require numerical information. Feature vectors having an imprecise or incomplete representation are usually ignored or discarded from the classification process. In contrast, fuzzy set theory can be applied for handling non-statistical uncertainty, or fuzziness, at various levels. Together with possibility theory, as developed in [Dubois, Prade, 1988], it can be used to represent fuzzy objects and fuzzy classes. Objects are considered to be fuzzy if at least one feature is described fuzzily, i.e. feature values are imprecise or represented as linguistic information. Classes are considered to be fuzzy if their decision boundaries are fuzzy with gradual class membership.
The combination of two representation forms of information (crisp and fuzzy) with the two basic elements of pattern recognition (object and class) induces four categories of problems in pattern recognition [Zimmermann, 1995]:

Crisp objects and crisp classes;
Crisp objects and fuzzy classes;
Fuzzy objects and crisp classes;
Fuzzy objects and fuzzy classes.

The first category involves the problem of classical pattern recognition, whereas the latter three categories are concerned with fuzzy pattern recognition. In this thesis the attention is focused on methods dealing with the second category of problems: crisp objects and fuzzy classes.

It is obvious that the concept of fuzzy sets enriches the basic ideas of pattern recognition and gives rise to completely new concepts. The main reasons for the application of fuzzy set theory in pattern recognition can be summarised in the following way [Pedrycz, 1997]:

1. Fuzzy sets provide an interface between linguistically formulated features and quantitative measurements. Features are represented as an array of membership values denoting the degree of possession of certain properties. Classifiers designed in such a way are often logic-oriented and reflect the conceptual layout of classification problems.
2. Class memberships of an object take their values in the interval [0, 1] and can be regarded as a fuzzy set defined on a set of classes. Thus, it is possible that an object belongs to more than one class, and a degree of membership of an object to a class expresses a similarity of this object with typical objects belonging to this specific class. Using a gradual degree of membership, the most unclear objects can be identified.
3. Membership functions provide an estimate of missing or incomplete knowledge.
4. The traditional distinction between supervised and unsupervised pattern recognition is enriched by admitting implicit rather than explicit object labelling or allowing for a portion of objects to be labelled. In the case of implicitly supervised classification, objects are arranged in pairs according to their similarity levels.

Fuzzy set theory has given rise to a lot of new methods of pattern recognition, some of which are extensions of classical algorithms and others completely original techniques. The major groups of fuzzy methods are represented by fuzzy clustering, fuzzy rule-based methods, fuzzy pattern matching methods and methods based on fuzzy relations. Since the focus of this work is on algorithmic methods, only fuzzy clustering methods will be discussed in the following sections.

Fuzzy techniques seem to be particularly suitable for dynamic pattern recognition when it is necessary to recognise gradual temporal changes in an object's states. Considering the temporal development of objects, it is often difficult to assign objects to classes crisply and precisely. One can imagine, for instance, a system with two possible states as classes: proper operation and faulty operation. When the state of the system is changing, measurement values express that the system operation is no longer proper, but there is no error in operation yet.
This means that the observed objects do not belong to either of these classes, or belong to a small degree to both classes. The use of fuzzy set theory makes it possible to produce fuzzy decision boundaries between classes and allows a gradual (temporally changing) membership of objects to classes. This primary advantage of fuzzy set theory is crucial for pattern recognition in general and for the dynamic approach in particular, because of the possible temporal transition of objects between classes and changes of the classes themselves. Therefore, the development of methods for dynamic pattern recognition in this thesis will be based on fuzzy techniques.

2.3 The Problem of Dynamic Pattern Recognition

Since a standard terminology in the area of dynamic pattern recognition does not yet exist, the main notions employed in this thesis must first be defined. And as the topic of this thesis is

concerned with the development of pattern recognition methods for dynamic systems, the basic principles of analysis and modelling of dynamic systems will first be presented. This will be followed by the formulation of the main definitions used in the area of dynamic pattern recognition. Finally, specific problems arising in dynamic pattern recognition and the ways to handle them will be discussed, which will then lead to the formulation of the goal and basic steps of the dynamic pattern recognition process.

2.3.1 Mathematical description and modelling of dynamic systems

This section is intended to provide a general overview of the description of dynamic systems used in different research areas and of the tools used for their analysis. In economics, medicine, biology, control theory and many other areas, a system or a process is called dynamic if one of the variables is time-dependent ([Stöppler, 1980], [Rosenberg, 1986]). The dynamic behaviour of the system demonstrates how it performs with respect to time. Time-domain analysis and the design of dynamic systems use the concept of the states of a system. A system's dynamics is usually modelled by a system of differential equations. These equations are normally formulated based on physical laws describing the dynamic behaviour of a system. A dynamic system, or process, is defined by input variables $u_1(t), \ldots, u_p(t)$, which are used to influence the system, and output variables $y_1(t), \ldots, y_q(t)$, whose dynamic behaviour is of major interest [Föllinger, Franke, 1982]. In order to obtain a system of normal differential equations representing the relationship between input and output variables, intermediate variables, which are called state variables $x_1(t), \ldots, x_n(t)$, are used. A set of state variables $x_1(t_0), \ldots, x_n(t_0)$ at any time $t_0$ determines the state of the system at this time. If the present state of a system and the values of the input variables for $t > t_0$ are given, the behaviour of the system for $t > t_0$ is uniquely determined.
Hence, the state of a system is a set of real numbers such that the knowledge of these numbers and the input variables will provide the future state and the output of the system, with the equations describing the system's dynamics. The state variables determine the future behaviour of a system when the present state of a system and the values of input variables are known [Dorf, 1992]. The multidimensional space induced by the state variables is called the state space. The solution to a system of differential equations can be represented by a vector $x(t)$. It corresponds to a point in the state space at a given moment in time. This point moves in the state space as time passes. The trace, or path, of this point in the state space is called a trajectory of the system. For a given initial state $x_0 = x(t_0)$ and a given end state $x_e = x(t_e)$,

there is an infinite number of input vectors and corresponding trajectories with the same start and end points. On the other hand, for any point of the state space there is exactly one trajectory containing this point [Föllinger, Franke, 1982].

Considering dynamic systems in control theory, great attention is dedicated to adaptive control. The primary reason for introducing this research area was to obtain controllers that could adapt their parameters to changes in process dynamics and disturbance characteristics. [Åström, Wittenmark, 1995] propose the following definition: An adaptive controller is a controller with adjustable parameters and a mechanism for adjusting the parameters. During extensive research carried out in the last two decades it was found that adaptive control is strongly related to ideas of learning that are emerging in the field of computer science. In this thesis some definitions of control theory are reformulated and applied to the area of pattern recognition in dynamic systems.

2.3.2 Terminology of dynamic pattern recognition

Consider a complex dynamic system that can assume different states in the course of time. Each state of the system at a moment in time represents an object for classification. As stated in Section 2.3.1, a dynamic system is described by a set of state variables characterising its dynamic behaviour. These variables can be voltages, currents, pressures, temperatures, unemployment rates, share prices, etc., depending on the type of system. They are called features in pattern recognition. If a dynamic system is observed over time, its feature values vary, constituting time-dependent functions. Therefore, each object is described not only by a feature vector at the current moment but also by a history of the feature values' temporal development. Objects are called dynamic if they represent measurements or observations of a dynamic system and contain a history of their temporal development.
In other words, each dynamic object is a temporal sequence of observations and is described by a discrete function of time. This time-dependent function is called a trajectory of an object. Thus, in contrast to static objects, dynamic objects are represented not by points but by multidimensional trajectories in the feature space extended by an additional dimension "time". The components of the feature vector representing a dynamic object are not real numbers but vectors describing the development of the corresponding feature over time.
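The difference between a static and a dynamic object can be made concrete with a minimal sketch. The array layout (one row per observation time, one column per feature) and all numerical values are assumptions chosen for illustration only; they do not come from the thesis.

```python
import numpy as np

# Hypothetical dynamic object: two features observed at five moments in time.
time = np.array([0, 25, 50, 75, 100])
trajectory = np.array([
    [0.0, 1.0],
    [0.5, 1.5],
    [1.0, 2.5],
    [1.5, 2.0],
    [2.0, 3.0],
])  # shape (T, n_features): one row per observation

# Static view: only the feature vector at the current moment (last row).
static_view = trajectory[-1]

# Dynamic view: each feature is itself a vector describing its history.
feature_histories = trajectory.T  # shape (n_features, T)
```

A static method would receive only `static_view`, whereas a method working on dynamic objects would receive the whole `trajectory`.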

Dynamic pattern recognition is concerned with the recognition of classes, or clusters, of dynamic objects, i.e. recognition of typical states in the dynamic behaviour of a system under consideration. To illustrate the difference between static and dynamic pattern recognition, consider a set of dynamic objects described by two features $X_1$ and $X_2$ that were observed over the time interval [0, 100]. From a static viewpoint, the current states of objects are represented by the momentary snapshot of objects at the current moment t = 100 and can be seen in the cut of the three-dimensional feature space at this moment (Figure 2-4). Using the spatial closeness of points as a criterion of object similarity (a typical criterion in static pattern recognition), two clusters of objects can easily be distinguished in this plane (squares and circles). The way in which objects arrive at the current state is, however, irrelevant to the recognition process.

[Figure 2-4: Current states of dynamic objects from a static viewpoint (axes $X_1$ and $X_2$, snapshot at t = 100)]

From a dynamic viewpoint the states of objects are characterised not only by their momentary location but also by the history of their temporal development, which is represented by a trace, or trajectory, of each object from its initial state to its current state in the three-dimensional feature space. Figure 2-5 shows the projections of three-dimensional trajectories of objects into the two-dimensional feature space (without the dimension "time"). If the form of trajectories is chosen as a criterion of similarity between trajectories, then three clusters of dynamic objects can be distinguished: {A, C}, {B, D, E, G}, and {F, H}. Obviously, these clusters are different from the ones recognised for static objects at the current moment. If the form and the orientation of trajectories in the feature space are chosen as the criterion for similarity, then dynamic objects B, D, E and G can no longer be considered similar and are separated into two clusters: {B, D} and {E, G}.
Thus, based on such a similarity criterion four clusters of trajectories are obtained. In the third case, if the form and orientation of the trajectories are irrelevant but their spatial pointwise closeness is the basis for a similarity definition, then another four clusters can be recognised: {A, B}, {C, D}, {E, F}, and {G, H}.
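How the chosen criterion changes the result can be illustrated with a minimal sketch. Both functions and the three example trajectories are hypothetical and are not the similarity measures developed in this thesis: pointwise closeness is approximated by the mean Euclidean distance between corresponding points, and similarity of form by the mean distance between the increments of the trajectories, which ignores a constant offset in the feature space.

```python
import numpy as np

def pointwise_distance(a, b):
    """Mean Euclidean distance between corresponding points of two trajectories."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def shape_distance(a, b):
    """Mean Euclidean distance between the increments (first differences)
    of two trajectories; a constant translation does not affect it."""
    da = np.diff(np.asarray(a, dtype=float), axis=0)
    db = np.diff(np.asarray(b, dtype=float), axis=0)
    return float(np.mean(np.linalg.norm(da - db, axis=1)))

# Illustrative trajectories with rows = time steps, columns = features.
a = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
b = a + np.array([0, 5])                        # same form as a, but far away
c = np.array([[0, 1], [1, 0], [2, 3], [3, 2]])  # close to a, but zigzagging

print(pointwise_distance(a, b), pointwise_distance(a, c))
print(shape_distance(a, b), shape_distance(a, c))
```

Under the pointwise criterion `c` is the nearer neighbour of `a`, while under the form-based criterion `b` is, so a clustering algorithm would group the same objects differently depending on the measure it is given.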

[Figure 2-5: Projections of three-dimensional trajectories into two-dimensional feature space (axes $X_1$ and $X_2$, trajectories A to H)]

This example illustrates the limitation of static pattern recognition methods in a dynamic environment: they do not take into account the temporal behaviour of the system under study. Classical (static) methods are usually applied in order to recognise a (predefined) number of a system's states. However, there are a number of applications, in particular diagnosis problems, in which gradual transitions from one state to another are observed over time. In order to follow such slow changes of a system's state and to be able to anticipate the occurrence of new states, it is important to consider explicitly the temporal development of a system. In general, there are two possibilities for dealing with dynamic objects in pattern recognition:
1. To pre-process trajectories of dynamic objects by extracting some characteristic values (temporal features, trends) that can represent components of conventional feature vectors. The latter can be used as valid inputs for static methods of pattern recognition.
2. To modify classical (static) methods or to develop new dynamic methods that can process trajectories directly.

In the first case, methods for the extraction of relevant temporal features from trajectories are required. This way of dealing with dynamics is often chosen in practice. The second case is of primary interest because it is concerned with a new research area. Since most methods of pattern recognition use a distance, or dissimilarity, measure to classify objects, dealing with trajectories in pattern recognition requires a definition of a similarity measure for trajectories. As the example above has shown, different definitions of similarity, leading to different classification results, are possible. However, two general viewpoints on similarity between trajectories can be distinguished from among all these possible definitions [Joentgen, Mikenina et al., 1999b, p. 83]:
1.
Structural similarity: Two trajectories are the more similar, the better they match in form/evolution/characteristics,

2. Pointwise similarity: Two trajectories are the more similar, the smaller their pointwise distance in feature space is.

Structural similarity relates to the similar behaviour of trajectories over time. It is concerned with a variety of aspects of the trajectories (in general, functions) under consideration, such as the form, evolution, size or orientation of trajectories in the feature space. The choice of relevant aspects for a description of similar trajectories depends on the concrete application. Depending on the chosen aspect, different criteria can be used to define a similarity, for instance slope, curvature, position and values of inflection points, smoothness or some other characteristics of trajectories representing trends in trajectories. In order to detect similar behaviour of trajectories it is important that a measure of structural similarity be invariant to such changes as scaling, translation, addition or removal of some values, and to incorrect values due to measurement errors, called outliers.

Pointwise similarity relates to the closeness of trajectories in the feature space. This type of similarity can be defined based directly on the functions' values without taking into consideration special characteristics of the functions. In this case the behaviour of trajectories is irrelevant, and some variations in form, such as fluctuations or outliers, are tolerated.

The example in Figure 2-5 illustrates the difference between structural and pointwise similarity. In terms of structural similarity, especially if the form of trajectories is relevant, three clusters can be recognised: {A, C}, {B, D, E, G}, and {F, H}. In terms of pointwise similarity, four clusters {A, B}, {C, D}, {E, F}, and {G, H} seem to be more natural. Thus, the definition of a similarity measure is a crucial point in pattern recognition and a nontrivial task in the case of dynamic objects because of its strong dependence on the application at hand.

Due to the non-stationary character of objects in dynamic pattern recognition, the partitioning of objects is not necessarily fixed over time.
The number of clusters and the location of cluster centres at a moment in time constitute the cluster structure. It can be either static or dynamic. If the number of clusters is fixed and only the location of cluster centres is changing slightly as time passes, the cluster structure can be considered static. If in the course of time the number of clusters and the location of cluster centres vary, then one has to deal with a dynamic cluster structure. Its temporal development is represented by trajectories of the cluster centres in the feature space. Changes in the cluster structure correspond to changes of the state, or behaviour, of the system under study and will also be referred to as structural changes. This term comes from econometrics, where it is used to describe changes in a regression model either at an unknown time point or at a possible change point. Due to the arrival of new objects and the discarding of old data that have become irrelevant, the following changes in the dynamic cluster structure can appear [Mann, 1983]:

1. Formation of new clusters: If new data cannot be clearly assigned to existing clusters, one or more new clusters may be formed either subsequently (one after another) or in parallel (from the first cluster to many different clusters) (Figure 2-6). This situation can appear if the degrees of membership of new objects to all fuzzy clusters are approximately equal or very low (e.g. in the case of possibilistic c-means they need not sum to one as in probabilistic c-means, and can be very low). If new data cannot be assigned to existing clusters and their number is not large enough to form a new cluster, these new data are recognised as being stray. It should be noted that Figure 2-6 shows the projections of discrete trajectories into the feature space without the time axis, where points correspond to observations of dynamic objects at different moments in time.
2. Merging of clusters: Two or more clusters may be merged into one cluster (Figure 2-7, a). If, for example, a large number of new data have equally high degrees of membership (> 0.5) to two clusters, these two clusters can no longer be considered heterogeneous; instead they are considered similar and must be merged.
3. Splitting of clusters: One cluster may appear as two or more clusters (Figure 2-7, b). If a large number of new data have been absorbed, distinct groups of objects with high density may be formed within a cluster, whereas the cluster centre may be located in an area of very low density. Alternatively, it may happen that no new data are absorbed into a cluster and old data are discarded as irrelevant. Due to the discarding of old data, an area that has a very low density of data, or is even empty, may appear around the cluster centre and along some direction, separating the data into two or more groups. The cluster can no longer be considered homogeneous and must be split to find a better partition.
4.
Destruction of clusters: One or more clusters may disappear if there are no new data being assigned to these clusters and old data are discarded. The cluster centre must, however, be saved in order to preserve the discovered knowledge. It may happen that this cluster appears once again in the future. In this case a new cluster can be recognised and identified faster if the knowledge about this cluster already exists and does not need to be learned anew.
5. Drift of clusters: The location of clusters (cluster centres) in the feature space may change slightly over time.
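Two of the indications named in the list above can be written down as a minimal sketch: very low memberships to all existing clusters (candidates for a new cluster or stray data) and a large share of new objects with high membership to two clusters at once (candidates for merging). The function name, the thresholds `low` and `min_share` and the example matrix are assumptions for illustration; only the value 0.5 for a high membership is taken from the text. The check for approximately equal memberships and the criteria for splitting, destruction and drift are deliberately left out.

```python
import numpy as np

def screen_new_objects(U, low=0.2, high=0.5, min_share=0.5):
    """Screen memberships of new objects for hints of structural changes.

    U is an (n_new, c) array with the degrees of membership of n_new new
    objects to the c existing clusters (e.g. possibilistic memberships,
    which need not sum to one). All thresholds are illustrative.
    """
    U = np.asarray(U, dtype=float)
    n_clusters = U.shape[1]
    # Objects whose membership to every existing cluster is very low.
    unassigned = np.flatnonzero(U.max(axis=1) < low).tolist()
    # Pairs of clusters to which a large share of the new objects
    # belongs with a high degree of membership at the same time.
    merge_candidates = []
    for i in range(n_clusters):
        for j in range(i + 1, n_clusters):
            share = np.mean((U[:, i] > high) & (U[:, j] > high))
            if share >= min_share:
                merge_candidates.append((i, j))
    return unassigned, merge_candidates

# Hypothetical memberships of four new objects to three clusters.
U_new = [[0.90, 0.80, 0.10],
         [0.70, 0.60, 0.00],
         [0.05, 0.10, 0.10],
         [0.80, 0.90, 0.20]]
unassigned, merge_candidates = screen_new_objects(U_new)
```

In this example the third object fits none of the clusters, and the first two clusters become merge candidates because three of the four new objects belong to both of them with a degree above 0.5.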

[Figure 2-6: Formation of new clusters. a) At time t=0 there is one cluster A; b) At time t=50 two new clusters, B and C, are formed]

[Figure 2-7: Changes of the dynamic cluster structure. a) Merging clusters B and C; b) Splitting cluster B]

The first four types of structural changes represent abrupt (serious) changes in the cluster structure. In many applications abrupt changes are associated with faults in the operational behaviour of a system, which have to be detected as early as possible to avoid dangerous consequences (e.g. fatal damage to equipment or faults in chemical or power plants). The fifth type of structural change is referred to as gradual change, which can be useful for predicting the occurrence of abrupt changes. Gradual changes are usually very small and rather difficult to detect, but can be of major relevance in some applications.

The combination of two types of objects in pattern recognition with two types of cluster structure with respect to time leads to four categories of problems in pattern recognition:

1. Static objects, static cluster structure: This is the classical problem in pattern recognition concerned with partitioning objects, represented by feature vectors, into a number of clusters. The classifier is designed using the training set of static objects and remains unchanged over time. The classifier may be changed/re-learned at a later instant due to decreasing performance, but it does not follow an adaptation law. New objects do not imply a change of the classifier.
2. Static objects, dynamic cluster structure: In this case objects represent measurements of a dynamic system, which are sufficient for a description of the system's behaviour. For instance, the data can come from an environmental monitoring application and consist of daily single-point measurements of water quality parameters [Denoeux, Govaert, 1996]. At the beginning of observations two clusters were detected. After two years of monitoring one new cluster has appeared and one existing cluster has grown significantly. The dynamics of this situation is exhibited by the changing cluster structure (see Figure 2-6 and Figure 2-7). Thus, the problem of dynamic pattern recognition in this situation lies in recognising clusters of typical system states based on static objects in order to detect and follow temporal changes of the cluster structure by adjusting the structure of the classifier, and to try to predict future system states.
3. Dynamic objects, static cluster structure: This situation is concerned with the recognition of typical states of a single system, or clusters of systems with similar behaviour, based on the analysis of the temporal development of a system. Objects are represented by multidimensional trajectories, or time series, describing the behaviour of dynamic systems. The problem of dynamic pattern recognition in this situation is to recognise clusters based on similar behaviour of trajectories.
There is a large number of applications in which clustering and classification of dynamic objects is of primary importance whereas the cluster structure remains unchanged over time, e.g. classification of share prices or other market characteristics, analysis of customer behaviour or recognition of typical scenarios in scenario analysis. For instance, stock market buy/sell decisions are usually made based on past values of share prices showing specific (predefined) patterns. If the behaviour of share prices over a period of time corresponds to a certain pattern, then shares are sold or bought. It is not sufficient to consider only current share values to make a correct decision. Clustering of dynamic objects can also be applied in scenario analysis for complexity reduction. In order to find typical scenarios of the future development of economic characteristics for strategic planning, it is reasonable to consider the characteristics' temporal behaviour instead of their final values only. Figure 2-8 shows three typical scenarios found after clustering 50 different scenarios [Hofmeister, 1999]. Thus, by clustering trajectories (scenarios) it is possible to reduce a large set of raw scenarios to a few typical scenarios, which can be used to make strategic decisions, and to avoid the significant loss of information that would result from considering only the final values of scenarios.

[Figure 2-8: Typical scenarios of future temporal development of oil price (oil price against time in quarters of a year, Scenarios 1 to 3)]

4. Dynamic objects, dynamic cluster structure: This situation is a combination of cases 2 and 3. The task of dynamic pattern recognition in this case is to recognise typical system states based on the analysis of the temporal behaviour of a system, to detect and follow changes in the cluster structure, and to try to predict future states. In other words, in the case of a single system the problem is to recognise whether the current trajectory contains changes in temporal behaviour indicating a new system state or remains unchanged in its behaviour. In the case of several systems the aim is to recognise different types of systems and to detect whether clusters and the assignment of systems to these clusters change over time. This problem can appear in real-time diagnosis in different application areas such as medicine, biology, chemical and industrial engineering, etc. A typical example is preventive machine maintenance where several systems are monitored simultaneously (Figure 2-9). Comparing system dynamics during different periods of operation, different clusters of systems can be distinguished: for example, in the 1st period Systems 1 and 2 are assigned to cluster A; in the 2nd period System 1 is assigned to a new cluster B and System 2 is assigned to another new cluster C; after the 2nd period both systems are assigned to cluster C. Hence, the dynamics of this situation shows itself in the changing cluster structure and in the transition of dynamic objects (represented by trajectories) between clusters.

[Figure 2-9: Changing clusters of typical system states (system parameter against time for System 1 and System 2, periods delimited by $t_1$ and $t_2$)]

The first of the cases described above is obviously the subject of static pattern recognition, whereas the last three cases represent problems of dynamic pattern recognition and will therefore be treated in this thesis. A classifier which is able to deal with dynamic objects can be called dynamic with respect to objects. A classifier which is able to deal with a dynamic cluster structure can be called dynamic according to its design principle. A dynamic fuzzy classifier is defined by time-dependent cluster centres and by time-dependent degrees of membership of objects to clusters.

Since, in the course of time, cluster centres and membership functions of clusters can change due to the arrival of new objects, the dynamic classifier must be updated to preserve its performance. There are two strategies for updating the classifier with respect to a set of objects: one involves using the complete history given by the whole set of objects obtained from the beginning of observation until the current moment, and the other involves using only the most recently obtained objects. Since old data may become irrelevant over time for representing the current situation and may have negative effects on the clustering procedure, it is not reasonable to preserve the constantly growing set of objects for clustering. A better strategy is to update the classifier within a rolling horizon defined as a moving time window. The idea of using time windows is that only objects of the current time window are considered for clustering. Preceding ("old") objects are not taken into consideration because they are deemed irrelevant. Time windows are defined as subsequent (overlapping) time intervals of constant or variable length, which are shifted along the time axis (Figure 2-10).

[Figure 2-10: Moving time windows of constant length (window length $t_w$, shift $t_s$)]

Moving time windows are characterised by two parameters: the length of the window $t_w$ and the length of the shift $t_s$. The length of the time window determines the number of objects considered for clustering. If the window is too large, then the changes in a system's state may be recognised too late and updating of the classifier may be too slow. If the window is too small, then it may not contain enough information to design a reliable classifier. Thus the optimal length of the window must be chosen carefully depending on the application. The length of the shift of the time window determines the number of new objects that are taken into consideration for the update of the classifier. In other words, the length of the shift defines the frequency of the update of the classifier. If the length of the shift is too large, then the updating of the classifier to the changes of a system's state may be too slow. If the length of the shift is too small, then the classifier may be updated much more frequently than changes appear in the state of a system. For instance, the shift of the time window by one time unit corresponds to the update of the classifier at each moment in time. Such a frequent adaptation of a classifier may be very costly and time-consuming. While choosing the optimal length of the shift it should be taken into account that only a "sufficient" number of new objects can cause such changes in the cluster structure as merging or splitting of clusters. An exception to this rule may be provided by applications where significant changes, such as the formation of new clusters, must be recognised as early as possible based on just a few objects (e.g. early fault detection).

A dynamic classifier must have an adaptive capacity in order to be able to follow temporal changes in the data and to preserve its performance.
Based on the notion of an adaptive controller, an adaptive classifier can be defined as a classifier that can modify its structure in response to changes in the dynamics of the system/process under consideration. An adaptive classifier is characterised by adjustable parameters and a mechanism for adjusting the parameters. For instance, in the case of point-prototype based fuzzy clustering algorithms the parameters that can be adjusted in the course of time are the number of clusters, the location of cluster centres and the fuzzy sets describing clusters.
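The moving time windows introduced earlier in this section can be sketched in a few lines, assuming windows of constant length $t_w$ that are shifted by $t_s$ along the time axis. The function name and the example values are hypothetical.

```python
def moving_windows(t_start, t_end, t_w, t_s):
    """Yield (start, end) bounds of time windows of constant length t_w,
    shifted by t_s, that lie completely inside [t_start, t_end]."""
    if t_w <= 0 or t_s <= 0:
        raise ValueError("window length and shift must be positive")
    start = t_start
    while start + t_w <= t_end:
        yield (start, start + t_w)
        start += t_s

# Windows of length 40 shifted by 20 over the observation interval [0, 100].
windows = list(moving_windows(0, 100, t_w=40, t_s=20))
print(windows)
```

Consecutive windows overlap whenever the shift is smaller than the window length. The classifier would be updated once per window, so a smaller `t_s` means more frequent updates based on fewer new objects.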

2.3.3 Goals and tasks of dynamic pattern recognition

There are essentially two problems associated with dynamic pattern recognition (in addition to those arising in static pattern recognition [Taylor et al., 1997]):
1. To detect any change in the cluster structure,
2. To react to any detected change.

The first problem can be solved during the monitoring process. Its task is to monitor the performance of the classifier by observing and evaluating some calculated measures based on the results of classification of new objects. By doing this, the monitoring process tries to check how well new objects arriving at each moment fit into the existing cluster structure. In particular, if new objects do not fit the existing cluster structure and the performance of the classifier is rather poor, it is assumed that changes have taken place. There can be two reasons for the change of the cluster structure in the course of time [Kunisch, 1996]:
1. The number of features changes. The number of relevant features describing objects can decrease or increase over time. The number of features may decrease because some features have to be dropped, since they become irrelevant for representing objects due to fundamental changes in the properties of objects (e.g. changes in the operating conditions of a system), or because it is no longer possible to observe some feature. On the other hand, more features might become available for describing objects as time passes. If they contain relevant information for the recognition process they have to be added to the feature set to improve the performance of the classifier. In both cases of a quantitative change of the feature set it seems reasonable to re-learn the classifier from scratch.
2. The values of features change. The number of features remains the same but the values of some features change their range or interpretation. The following types of changes can be distinguished:
Drift of the feature values: The range of feature values changes slowly and gradually.
Shift in the feature values: An abrupt change of the range of feature values is observed. The interpretation of features may, however, remain the same.
Semantic change: Although the feature values are not changed, their meaning may become completely different. This case is very hard to deal with during the classifier design.

The problem of a variable feature set can present real difficulties for the recognition process if new relevant features are not known a priori. This problem is related to feature selection and feature generation rather than classifier design. The selection of relevant features is a nontrivial task anyway, but the detection of features whose relevance to pattern recognition is

time-dependent can be viewed as a challenging new research area and requires the development of new methods for time-dependent feature selection/generation. For solving this problem two cases can be distinguished: 1) an initial set of features is given, and the goal is to select a subset of features relevant at each moment in time (or within a period of time); 2) relevant features at any given moment are combinations of available features such as a product, a ratio, a linear combination, etc. In the first case, one possible approach is the repeated application of some feature selection procedure, which is, admittedly, computationally expensive. In the second case, approaches for feature generation depending on the current feature values are needed.

The problem of changing feature values is related to the detection of changes in the cluster structure. These changes depend to some degree on the number of changed features and their importance for the recognition process. The importance of features is determined by the contribution of features to the recognition process. As stated above, the following types of changes can take place in the cluster structure due to the arrival of new objects: drift of clusters, formation of new clusters, merging of clusters, and splitting of clusters. In order to detect these structural changes new sophisticated procedures must be developed. The following chapters will focus on this problem.

The monitoring process can be either comparative static, based on the analysis of changes caused by new static objects (without the history) at the current moment, or dynamic, based on the consideration of the temporal development of objects during a certain period of time (trajectories). The first approach can employ classical (static) methods for classifier design and classification, whereas the second approach requires the development of new methods for dealing with trajectories of dynamic objects.
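A comparative static monitoring step could, for example, compute the degrees of membership of new static objects to the existing cluster centres and report the share of objects that fit none of the clusters clearly. The sketch below uses the standard membership formula of probabilistic fuzzy c-means, $u_{ik} = 1 / \sum_j (d_{ik}/d_{jk})^{2/(m-1)}$, with Euclidean distances; the monitoring measure, its threshold and the example data are assumptions for illustration and are not the procedures developed in the following chapters.

```python
import numpy as np

def fcm_memberships(X, centres, m=2.0):
    """Probabilistic fuzzy c-means memberships of the objects X (n, p)
    to fixed cluster centres (c, p); each row of the result sums to one."""
    X = np.asarray(X, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Euclidean distance of every object to every centre, shape (n, c).
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # guard against division by zero at a centre
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def poorly_fitting_share(U, threshold=0.7):
    """Share of objects whose highest membership stays below the
    (illustrative) threshold, i.e. objects that fit no cluster clearly."""
    return float(np.mean(np.asarray(U).max(axis=1) < threshold))

# Hypothetical existing centres and three newly arrived static objects.
centres = [[0.0, 0.0], [10.0, 0.0]]
new_objects = [[0.0, 0.0], [10.0, 0.0], [5.0, 0.0]]
U = fcm_memberships(new_objects, centres)
share = poorly_fitting_share(U)
```

The third object lies halfway between the two centres and receives approximately equal memberships, so it counts as poorly fitting. A growing value of `share` over consecutive time windows would be one possible hint that the cluster structure has changed.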
After the temporal changes in the cluster structure are recognised by the monitoring process, the problem of reacting to these changes arises. It means that the pattern recognition process must contain a mechanism to update the existing classifier according to current changes in order to preserve its validity over time. One extreme strategy for solving this problem is to re-learn a classifier from scratch. It does not require any adaptive mechanism; therefore, classical methods for classifier design can be used. The other extreme solution is to ignore changes and to improve the current classifier using, for example, incremental updating based on new objects. The optimal approach lies between these two extremes and is concerned with the adaptation of the existing classifier using the results of the monitoring process to fit the changed cluster structure [Nakhaeizadeh et al., 1997]. The strategy chosen for the updating process is crucial for the performance of the whole pattern recognition system with respect to the accuracy of results and time consumption.

To be able to make a diagnosis about a current system's state, the information concerning detected structural changes must be interpreted correctly. In contrast to static pattern

recognition, the interpretation of results includes not only the description of the current situation but also a description of the temporal development of the situation. The diagnostic results can be used to make a short-term prognosis about a tendency of the future development of the current situation or to formulate a control action to modify the dynamics of a system. For instance, one possible diagnosis may be: "A system's state has moved away from cluster A and is approaching cluster B." Thus, compared with static pattern recognition, the process of dynamic pattern recognition must incorporate three additional steps:
1. Monitoring process: comparative static or dynamic.
2. Updating process: re-learning, incremental updating, or adaptation.
3. Diagnostics: description and interpretation of the current cluster structure.

The general scheme of the process of dynamic pattern recognition can be represented as follows:

[Figure 2-11: The process of dynamic pattern recognition. Static steps: observation vector y (observation space), feature extraction, feature vector x (feature space), initial classifier. Dynamic steps: new feature vector at time t, classifier at time (t-1), monitoring, changes in cluster structure, updating, save updated classifier, diagnostics, output decision at time t (decision space)]

The goal of dynamic pattern recognition is to detect and follow changes in the cluster structure and to adapt the classifier to these changes in the course of time. Since dynamic objects are represented by multidimensional trajectories, methods of dynamic pattern recognition must be based on a similarity measure for trajectories. All the aforementioned considerations about dynamic pattern recognition can be summarised in a taxonomy of this field. The taxonomy is an extended version of the one presented in

([Joentgen, Mikenina et al., 1998], [Joentgen, Mikenina et al., 1999a]) and contains different criteria which are suitable for structuring this field. The criteria themselves are divided into two groups: problem related criteria and method related criteria. Problem related criteria are concerned with the structure and type of system observed and the character of observations. Method related criteria provide a basis for the distinction of methods and techniques used in classifier design and classification. Essentially they include criteria for selecting and processing relevant features. The field of dynamic pattern recognition can be structured using the following taxonomy:

A. Problem related criteria:
1. Classification object:
   a) Classification of several systems
   b) Classification of the states of one system
2. Type of cluster structure:
   a) Static cluster structure
   b) Dynamic cluster structure
3. Observation period:
   a) Moment in time
   b) Rolling horizon
   c) Complete history

B. Method related criteria:
1. Handling of dynamics (trajectories):
   a) During pre-processing
   b) Within the data analysis method
2. Type of the trajectories used:
   a) Directly measured trajectories (no pre-processing)
   b) Aggregated trajectories
3. Time dependence of the feature set:
   a) Constant feature set, variable feature values
   b) Variable feature set
4. Existence of prior information:
   a) Training data for each cluster (in the form of points or trajectories) are given and their cluster memberships are known (supervised methods)
   b) The cluster membership of training data and the number of clusters are unknown (unsupervised methods)
5. Detection of temporal changes in cluster structure:
   a) Comparative static
   b) Dynamic

6. Type of design of dynamic classifier:
   a) Re-learning
   b) Incremental updating
   c) Adaptation

In the next chapter the analysis of dynamic steps in the pattern recognition process is carried out and different approaches for the realisation of these steps in a dynamic classifier are considered.


3 Stages of the Dynamic Pattern Recognition Process

As stated in the previous chapter, the design procedure of a dynamic classifier must couple the results of the monitoring process with a mechanism for updating the classifier according to detected changes in the cluster structure. Different approaches used for establishing the monitoring process are usually based on the observation and the analysis of some characteristic values describing the performance of a classifier or the cluster structure. The temporal change of these characteristics points to some structural changes in the underlying cluster structure and to the need for adapting a classifier. According to the nature of the monitored characteristics, one can distinguish between statistical and fuzzy techniques for the monitoring process, the most important of which will be discussed in Section 3.1. In order to preserve the performance of a dynamic classifier over time, it must be adapted to temporal changes detected by the monitoring process. The updating strategies of a dynamic classifier presented in Section 3.2 depend on the type of temporal changes in the cluster structure (gradual or abrupt) and can require either the adjustment of classifier parameters or complete re-learning of a classifier. As will be shown, the most flexible adaptation law of a dynamic classifier must combine both these techniques and include additional mechanisms supporting the intelligent design of a dynamic classifier. An updating strategy represents a crucial component of a dynamic pattern recognition system since it determines the adaptive capacity of a dynamic classifier and its ability to follow temporal changes in the cluster structure.

3.1 The Monitoring Process

In many application areas it has become increasingly important to monitor the behaviour of dynamic systems based on multiple measurements. According to [Denoeux et al., 1997] the task of a monitoring system is to detect the departure of a process from normal conditions, to characterise the new process state, and to prescribe appropriate actions.
The purpose of this thesis is to solve this task by means of a dynamic pattern recognition system. It must consist of different components, one of which is the monitoring process. The aim of the monitoring process in this context, as stated in section 2.3.3, consists in detecting gradual and abrupt changes in the cluster structure, where clusters represent typical states, or the typical behaviour, of a system under consideration. In order to detect changes in the cluster structure, different characteristic values for the evaluation of the performance of the classifier have to be monitored regularly. The following statistical measures are usually used for monitoring ([Kunisch, 1996], [Lanquillon, 1997]):

1. Accuracy of the approximation: Many classifiers respond with approximated output values instead of class labels. If the true output values for all clusters become known, the accuracy

of the approximation in the current time window, defined as the difference between the output values and the expected values corresponding to each cluster, is compared to the accuracy of the approximation in previous time windows. If the difference increases, this means a deterioration of the approximation and the classifier performance.

2. The number of misclassified objects: If the true class labels of previously (crisply) classified objects become known, the error rate of classification in the current time window can be compared to the average error rate of previous windows. This is one of the most common characteristics used to evaluate the performance of a classifier. Error rates which are less than or approximately equal to the average provide evidence of a stable cluster structure and satisfactory performance of the classifier. If error rates are greater than the average, changes in the cluster structure can be assumed.

3. Unambiguity of the classification: If, in a classifier, a binary 1-out-of-c coding has been selected to represent the existing c clusters, the approximated output with the highest value usually determines the class label (the so-called "winner takes all" principle). Under certain conditions, the output values can be considered as probabilities (or proportional to probabilities) for each cluster. Then, if the difference between the two maximum output values is very small, the classifier is somewhat indifferent between the corresponding clusters, i.e. its discriminating ability is rather poor. In the most uncertain case, where each cluster is equally likely, all outputs take the value 1/c.

4. Class distributions: The relative number of objects (the percentage of the total number) assigned to each cluster during the current time window can be compared to the average relative number of objects assigned to each cluster in previous time windows. The true class labels of previously classified objects are only required for the previous time windows.
The number of objects in each cluster represents the class distribution describing a certain structure of the data. If the class distribution obtained after classification of new objects in the current time window differs from the average, this can indicate a change in the underlying cluster structure.

5. Means and variances of features: For each feature and class, the mean and the variance of its values in the current time window can be compared to the corresponding statistics in the previous time windows. Based on these measures, a drift or shift in the feature values can easily be detected. These measures require true class labels in order to be able to calculate the measures separately for each class.

In the case of fuzzy clusters an additional measure can be formulated:

6. Unambiguity of fuzzy classification: Monitoring the performance of the fuzzy classifier can be based on the analysis of membership functions representing fuzzy clusters. In order to evaluate the quality of fuzzy cluster assignment, the difference between the two

maximum degrees of membership of each object to the clusters can be considered. If the value of this difference is large, an object can be assigned clearly to one of the clusters. If the value of this difference is small, an object belongs to both clusters to the same degree and its assignment is very ambiguous. The measure of unambiguity can be estimated for each time window using, for instance, the maximum and average values of the difference between the two largest membership degrees over all objects. If the maximum value of the difference over all objects in the current time window is much smaller than the one in previous time windows, then the assignment of objects to clusters becomes very ambiguous. Depending on the absolute degrees of membership, two situations of ambiguous assignment can be distinguished. Equally low degrees of membership of new objects to all clusters indicate that objects do not belong to any of the existing clusters. Equally high degrees of membership indicate that objects belong to more than one cluster to the same degree, which signals the overlapping of clusters. These cases correspond to different kinds of changes in the cluster structure and lead to a decrease in classifier performance. Measures 1, 2 and 5, which require knowledge of the true class memberships of new objects, can only be used in methods of supervised learning. Since the true class labels can only become known a certain period of time after classification, changes can only be detected with a certain delay. Moreover, the true class labels are needed for updating the classifier by means of supervised learning, thus a delay cannot be avoided [Lanquillon, 1997]. In contrast, measures 3, 4 and 6, which do not require any information concerning true class labels, can be used in methods of unsupervised learning. They rely solely on the values provided by new objects and on new classification results. Using these measures it is possible to avoid delays and to recognise changes in the cluster structure immediately.
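
Measure 6 can be illustrated with a minimal Python sketch. It assumes that each object is given as a list of membership degrees, one per cluster, and that the degrees are not normalised across clusters (otherwise "equally high" degrees cannot occur). The function names and the thresholds `diff_limit`, `low` and `high` are hypothetical choices made for illustration only; they are not prescribed by the text.

```python
def unambiguity(memberships):
    """Difference between the two largest membership degrees of one object."""
    top = sorted(memberships, reverse=True)
    return top[0] - top[1]


def window_unambiguity(window):
    """Maximum and average unambiguity over all objects of one time window."""
    diffs = [unambiguity(u) for u in window]
    return max(diffs), sum(diffs) / len(diffs)


def kind_of_ambiguity(memberships, diff_limit=0.1, low=0.3, high=0.7):
    """Distinguish the two ambiguous situations by the absolute degrees.

    diff_limit, low and high are illustrative thresholds (assumptions).
    """
    top = sorted(memberships, reverse=True)
    if top[0] - top[1] > diff_limit:
        return "clear"
    if top[0] <= low:
        # equally low degrees: the object fits none of the existing clusters
        return "outside all clusters"
    if top[1] >= high:
        # equally high degrees: at least two clusters overlap
        return "overlapping clusters"
    return "ambiguous"
```

Comparing the pair returned by `window_unambiguity` for the current window with the values of earlier windows gives the monitoring signal described above, without requiring true class labels.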
The true class labels are only required for the interpretation of the changes. In the following section some methods for the monitoring process are presented, which use some of the described measures for the evaluation of the performance of the classifier.

3.1.1 Shewhart quality control charts

From a statistical point of view, the sequence of observed characteristic values represents a discrete random process that is characterised by some inherent variations. They are caused by stochastic fluctuations and are usually considered to be noise. Thus, the key problem of the monitoring process is to distinguish between variations due to stochastic perturbations and variations caused by unexpected changes in a system's state. If the sequence of observations is noisy, it may contain some inconsistent observations or measurement errors (outliers) that are random and may never appear again. Therefore, it is reasonable to monitor a system and to process observations within time windows in order to average and reduce noise. Moreover,

the information about possible structural changes within time windows can be interpreted and processed more easily. As a result, a more reliable classifier update can be achieved using monitoring within time windows. One of the most popular statistical methods for monitoring the quality of a product, a process or a system's state is the Shewhart quality control chart, introduced in [Shewhart, 1931] as a change detection method. The idea of this method is to observe over time some characteristic values describing a process and to check whether they remain within predefined limits. If they do, then the process is stable. Otherwise, it can be assumed that some structural change in the process has occurred. A control chart is defined in statistics [Hogg, Ledolter, 1992] as a plot of a certain characteristic of a process obtained from samples (observations) sequentially in time and the characteristic's corresponding values. These values can be defined, for example, as the mean value of a characteristic and a measure of variability such as the standard deviation. Besides the characteristic's values plotted as time series, control charts also include bounds, called control limits, which are used to determine whether an observation is within acceptable limits of random variations. Control limits are employed in control charts to distinguish between variations caused by expected stochastic fluctuations and those caused by unexpected changes in a process or a system's state. If the plotted characteristic moves outside the control limits, this can indicate that something has happened to the process leading to a new behaviour and to a new possible state. The use of control charts is illustrated in Figure 3-1.

[Figure 3-1: Shewhart quality control chart for the characteristic f. Values of f are plotted over time t, with horizontal lines at µ, µ±σ and µ±3σ.]

Suppose that characteristic f of the process derived from observations within the current time window is monitored over time. Let the mean value µ, calculated during previous time windows, be the expected value for characteristic f with a standard deviation of σ.
The values of f are plotted in Figure 3-1 in time-series fashion versus time. So-called warning limits are used to determine slight departures of the process from its stable state and are usually defined in the literature at µ±2σ. In [Lanquillon, 1997] the warning limits are chosen as µ±σ to provide a wider range in which the classifier must be updated. Action limits are equivalent to control limits and are usually set at µ±3σ. They are suitable for detecting serious changes in process behaviour. The reasons for this choice of action limits can be found in statistics. If it is supposed that the values of f are normally distributed with mean µ and variance σ², then the probability that any observed value of f falls between the limits µ−3σ and µ+3σ is approximately 0.9973. Normality assumptions should not be considered an obstacle, since for a long time window (i.e. a large number of observations) the distribution of values of f can be considered approximately normal according to the central limit theorem. Therefore, if the process is stable it is rather unlikely that any values of f would be outside the action limits. Consequently, if values of f fall outside the action limits, this indicates instability of the process, which is most likely caused by a serious change of the process state. The monitoring process based on the quality control chart is carried out as follows. If the current values of characteristic f are within the warning limits, the monitoring process responds with the status "okay". The process seems to be under control and no modifications are required. If the current values of f fall outside the action limits, the situation is labelled with an "action" status because the process seems to be out of control and changes in the classifier are required. Otherwise, if the values of f are between the warning and the action limits, the monitoring process provides a "warning" status. This situation could indicate a trend towards a change in process behaviour, and slight changes in the classifier would be desirable.
It should be noted that quality control charts take into consideration only the magnitude of characteristic f for the detection of structural changes, provided that the frequency of this characteristic remains unchanged. If a change of the frequency of characteristic f has occurred, this can indicate a serious change in process behaviour and the classifier has to be designed anew. The main problem in using Shewhart quality control charts is that the values of µ and σ are generally unknown. They can only be estimated on the basis of observations in previous time windows. However, if structural changes have occurred in the past, the estimation of µ and σ based on all previous observations may not provide appropriate values [Lanquillon, 1997]. This problem can be solved in two ways. The first possibility is to exclude from future calculations values of µ obtained in time windows where changes were detected. This approach is suitable for situations in which values of the considered characteristic do not change in magnitude due to structural changes. Another possibility is to ignore values of µ obtained in all previous time windows up until the time window where the last change was detected. This strategy can be applied to characteristics whose values may change in magnitude after structural changes.
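
The three-valued monitoring status can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the warning limit defaults to µ±σ as in [Lanquillon, 1997] (pass `warn=2.0` for the more common µ±2σ), and µ and σ are estimated from the window values observed since the last detected change, i.e. the second of the two estimation strategies mentioned above.

```python
from statistics import mean, stdev


def estimate_limits(previous_values, last_change=0):
    """Estimate mu and sigma from the window values since the last change.

    previous_values holds one value of f per earlier time window;
    last_change is the index of the window in which the last change
    was detected (an assumed bookkeeping convention).
    """
    history = previous_values[last_change:]
    return mean(history), stdev(history)


def monitoring_status(f, mu, sigma, warn=1.0, action=3.0):
    """Return 'okay', 'warning' or 'action' for the current value of f."""
    deviation = abs(f - mu)
    if deviation > action * sigma:
        return "action"      # outside the action limits mu +/- 3 sigma
    if deviation > warn * sigma:
        return "warning"     # between warning and action limits
    return "okay"            # within the warning limits
```

For example, with µ = 10 and σ = 1 a value of 10.5 yields "okay", 12 yields "warning" and 14 yields "action".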

In [Taylor et al., 1997] the idea of quality control charts was used with a slight modification of warning limits to monitor the error rates of a classifier with supervised learning. In order to evaluate the current classification error rate, the expected error rate e (a weighted mean of the error rates of the previous time windows) and its standard deviation σ are estimated. If an error rate falls below the expected error rate, the classifier performance is satisfactory. Thus, it is sufficient to consider only upper limits during the monitoring process, i.e. the warning and the action limits are set at e + σ and e + 3σ, respectively. Since it is not possible to estimate the expected error rate from the previous error rates in the first time window or after the classifier is completely re-learned based on the most recent observations, cross-validation can be applied for estimating the expected error rate e based only on the training data set. The idea of the cross-validation test, which is also called k-fold cross-validation ([Stone, 1974], [Geisser, 1975]), is to divide the training set into k distinct subsets and to use (k-1) subsets for training a classifier and one subset for classifier validation. This procedure is repeated for all k combinations of (k-1) subsets, leading to k different classifiers and k error rates. The overall error rate is estimated as the mean of these error rates and represents a rather unbiased estimate. In the literature there are some other methods to interpret the results of Shewhart quality control charts. For instance, in [Smith, 1994] quality control charts are used in a neural network setting to detect changes. Shewhart quality control charts have the following drawbacks:

1. Small and persistent changes remain undetected. However, this is not crucial as long as the error rate remains within the warning limits.
2. They are able to detect changes of the characteristic's values but not those of the behaviour of a process or system.
3. Due to crisp control limits they are not able to detect gradual changes, i.e. to detect the degree to which a change in a system's state has occurred.
4. They are not able to detect the kind of changes that have occurred.

Drawbacks 2 and 4 can be avoided by the proper choice of characteristic f, which can represent a derived feature of a quality of a process or a measure of compactness or separability of clusters, respectively. The two other drawbacks can be avoided by applying fuzzy techniques to monitor the performance of a classifier and to detect gradual and abrupt changes in its cluster structure, as will be described in the next section.
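
The k-fold estimate of the expected error rate e and the one-sided limits e + σ and e + 3σ can be sketched as follows. The code is a minimal illustration, not the procedure of [Taylor et al., 1997]: the fold assignment (every k-th object), the toy nearest-mean classifier on one-dimensional features and all function names are assumptions made for the example, and the training set is assumed to contain at least k objects.

```python
def k_fold_error(objects, labels, train, predict, k=5):
    """Mean error rate over k folds; fold i holds out objects i, i+k, i+2k, ..."""
    n = len(objects)
    rates = []
    for i in range(k):
        held_out = set(range(i, n, k))
        train_x = [objects[j] for j in range(n) if j not in held_out]
        train_y = [labels[j] for j in range(n) if j not in held_out]
        model = train(train_x, train_y)
        errors = sum(predict(model, objects[j]) != labels[j] for j in held_out)
        rates.append(errors / len(held_out))
    return sum(rates) / k


def train_nearest_mean(xs, ys):
    """Toy classifier (assumption): store the mean feature value of each class."""
    sums = {}
    for x, y in zip(xs, ys):
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def predict_nearest_mean(model, x):
    """Assign x to the class whose stored mean is closest."""
    return min(model, key=lambda y: abs(model[y] - x))


def error_rate_status(rate, e, sigma):
    """One-sided chart: only error rates above the expected rate e matter."""
    if rate > e + 3 * sigma:
        return "action"
    if rate > e + sigma:
        return "warning"
    return "okay"
```

A call such as `k_fold_error(xs, ys, train_nearest_mean, predict_nearest_mean, k=5)` then provides a starting value for e in the first time window or directly after complete re-learning.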

Fuzzy techniques for the monitoring process

The potential of fuzzy modelling in system diagnosis was recognised very soon after the introduction of fuzzy set theory in pattern recognition. Since then, methods of fuzzy clustering and classification have been successfully applied to a number of real-world problems such as, for example, tool wear monitoring [Zieba, Dubuisson, 1994], human car driver performance monitoring [Peltier, Dubuisson, 1993], or the supervision of state evolution of telephone networks [Boutleux, Dubuisson, 1996]. The success of the fuzzy approach is due to the possibility of modelling clusters by fuzzy sets in the feature space. The membership function of a fuzzy set gives the degree to which an arbitrary object may be considered as a representative of a cluster. Therefore, each object is assumed to belong to each cluster with a certain degree of membership. Due to this interpretation and the continuous nature of membership functions, the fuzzy framework seems to be well-adapted to dynamic pattern recognition. Using the concept of membership functions it is possible to model the temporal development of a system and a transition of objects between clusters in the course of time.

Fuzzy quality control charts

As stated in the previous section, one of the drawbacks of statistical quality control charts is that it is not possible to recognise to what degree a system state has changed. By monitoring characteristic f of a system, or process, it is only possible to detect whether the behaviour of a process is stable, has a tendency to change or has become poor. Control limits provide a crisp discrimination between good and bad behaviour or between good and bad quality of the process under consideration. As long as the values of characteristic f are within the so-called range of tolerance defined by the control limits, the behaviour or the quality of the process can be considered as good. Otherwise, the quality is characterised as bad. However, the transition from good to bad quality is often continuous and gradual.
In order to be able to represent a gradual change in the quality of a process, a fuzzy set 'good quality' with membership function u(f) over the domain of the monitored characteristic f can be defined (Figure 3-2). The further the actual value of characteristic f deviates from its expected value µ, the smaller its degree of membership to the fuzzy set 'good quality'. Such a representation of the range of tolerance corresponds to a definition of fuzzy control limits [Schleicher, 1994, p. 8].

[Figure 3-2: Fuzzy set 'good quality' defined for characteristic f. Membership function u(f) plotted over f, centred at µ, with the range of tolerance extending from µ−3σ to µ+3σ.]

When characteristic f is monitored over time, the fuzzy set 'good quality' is defined as either constant or time-variable for characteristic f. As a result, a fuzzy quality control chart is obtained as a combination of crisp characteristic values of f with fuzzy control limits (Figure 3-3).

[Figure 3-3: Fuzzy quality control chart [adapted from Schleicher, 1994, p. 8]. Values of f plotted over time t with lines at µ, µ±σ and µ±3σ, together with the degree of membership to the fuzzy set 'good quality'.]

The advantage of using fuzzy quality control charts is that many intermediate states between good and bad quality can be distinguished, allowing a flexible reaction of the classifier. Moreover, the fuzzy representation of the range of tolerance provides a possibility to handle qualitative characteristics (e.g. good, bad) in the same way as quantitative ones (represented as real numbers). The information processing in the monitoring procedure based on fuzzy control charts is illustrated in Figure 3-4.
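
A membership function for the fuzzy set 'good quality' can be sketched as follows. The text only requires that membership decreases as f deviates from µ; the triangular shape, the support of µ±3σ and the function name `good_quality` are assumptions made for this illustration.

```python
def good_quality(f, mu, sigma, width=3.0):
    """Degree of membership of value f in the fuzzy set 'good quality'.

    Assumed triangular shape: 1 at f == mu, falling linearly to 0 at
    mu +/- width * sigma. The source does not prescribe this shape.
    """
    distance = abs(f - mu)
    return max(0.0, 1.0 - distance / (width * sigma))


def fuzzy_chart(values, mu, sigma):
    """Membership degree for every monitored value of f (one per time window)."""
    return [good_quality(f, mu, sigma) for f in values]
```

With µ = 10 and σ = 1, a value of 10 has membership 1.0, a value of 11.5 has membership 0.5 and any value at or beyond 13 has membership 0.0, so intermediate states between good and bad quality are graded rather than cut off at a crisp limit.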

[Figure 3-4: Processing of fuzzy quality control charts [adapted from Schleicher, 1994, p. 9]. Panels: measured values of characteristics $f_1, \dots, f_n$ over time t; degrees of membership $u(f_1), \dots, u(f_n)$ to the fuzzy set 'good quality' over time; aggregated fuzzy control chart $u(f_{aggr})$ with warning limit and action limit.]

Suppose that several characteristics $f_1, \dots, f_n$ are to be monitored. In the first step, degrees of membership of values of a monitored characteristic $f_i$, $i = 1, \dots, n$, to the fuzzy set 'good quality' are calculated for each moment of time. The result of this transformation are functions $u(f_1), \dots, u(f_n)$ representing a pointwise membership of monitored characteristics to the fuzzy set 'good quality' over time. In the second step, these functions have to be aggregated to a single function expressing an overall degree of membership using one of the aggregation operators (e.g. in the case of discrete functions the γ-operator, the arithmetic mean or the fuzzy integral can be used, while in the case of continuous functions maximum or minimum operators can be applied). The result of the third step is an aggregated fuzzy control chart where the warning and action limits are defined on the interval [0, 1] independently of characteristics $f_1, \dots, f_n$. The warning limit determines a limiting degree of membership down to which values of characteristic $f_{aggr}$ can still be considered as belonging to the fuzzy set 'good quality', but their membership is rather poor. The action limit can be interpreted as a minimum degree of membership (e.g. 0.2) below which values of characteristic $f_{aggr}$ cannot be considered as belonging to the fuzzy set 'good quality' any more. This final fuzzy control

chart is used to determine the current monitoring status of a process and to derive an appropriate reaction of the classifier according to detected changes. Fuzzy control charts represent an improvement over statistical quality control charts due to their ability to recognise gradual changes in a system's state. Both types of quality control charts are based on monitoring a general characteristic that describes the performance of the classifier (e.g. classification error rate). Another possibility to conduct the monitoring process is to consider the cluster structure itself and to evaluate the results of the classification of new objects, which must enable the unambiguous assignment of objects to clusters. Ambiguous information about cluster membership indicates a poor performance of a classifier and must be rejected in order to avoid misclassification errors. Furthermore, this result can indicate the need to improve the current classifier.

Reject options in fuzzy pattern recognition

Although fuzzy methods provide a powerful framework for pattern recognition due to their ability to generate gradual memberships of objects to clusters, a number of rules have been proposed to defuzzify the classification results in order to be able to make a final (crisp) decision about a system state. This can be relevant for the design of automatic pattern recognition systems where a part of the fuzzy information that is not useful should be ignored. By managing reject options in fuzzy classification, errors of crisp assignment of objects to clusters in unclear situations can be avoided. Reject options were introduced into the framework of statistical pattern recognition in order to decrease the misclassification risk [Chow, 1970]. Two types of rejection can be considered [Dubuisson, Masson, 1993]:

- Distance or membership reject;
- Ambiguity reject.

The idea of the first type of rejection is to avoid the assignment of an object with a very low degree of membership to one of the existing clusters.
For instance, in Figure 3-5 objects in the upper right-hand corner should be rejected for the cluster assignment since their distances to both clusters are equally large and their memberships to both clusters are equally low. Therefore, the objects belong to none of the clusters. It should be noted that this reject option could be handled well by possibilistic clustering algorithms such as possibilistic c-means. These algorithms do not contain a probabilistic constraint of normalisation of memberships of an object across all clusters. As a result, possibilistic degrees of membership can be interpreted as typicality of objects to clusters, or absolute degrees of membership, and in the case shown in Figure 3-5 they would be very low for objects in the upper right-hand corner.

On the contrary, probabilistic fuzzy algorithms (for instance, fuzzy c-means) would provide equal degrees of membership of about 0.5 for these objects. The second type of rejection deals with the case when the clusters cannot be clearly distinguished. The aim of this reject option is to avoid the assignment of an object to one of the clusters when degrees of membership contain ambiguous information. For instance, a group of objects between the two clusters in Figure 3-5 belongs to both clusters to the same degree of approximately 0.5. It is not possible to separate these objects clearly into two clusters, therefore they should be ambiguity rejected. This case can be handled by possibilistic, as well as probabilistic, clustering algorithms.

[Figure 3-5: Ambiguity reject (AR) and distance reject (DR) options in pattern recognition. Feature space with axes $X_1$ and $X_2$, clusters $C_1$ and $C_2$, an ambiguity-rejected object x (AR) between the clusters and a distance-rejected object x (DR) far from both.]

The straightforward approach to incorporate reject options in pattern recognition is to threshold degrees of membership of an object to be classified. In the following, several decision rules concerning the definition of reject options are presented and discussed. Let us interpret an exclusive assignment to cluster $C_i$ as an action $a_i$ in decision theory ([Denoeux et al., 1997], [Zimmermann, 1992, p. 2]). Suppose that a set of actions corresponding to possible assignments to c clusters is given by $A = \{a_1, a_2, \dots, a_c\}$ and denote the action related to an object x by a(x). The most usual rule for a hard assignment of an object to one of the clusters consists in choosing the action corresponding to the highest degree of membership [Pal, 1977]:

\[
a(x) = a_i \quad \text{if} \quad u_i(x) = \max_{j=1,\dots,c} u_j(x), \tag{3.1}
\]

where $u_j(x)$ denotes the degree of membership of object x to cluster $C_j$, $j = 1, \dots, c$. In order to include reject options in the decision process, this rule must additionally incorporate thresholds of membership which are either arbitrarily fixed or calculated from the learning set of objects as follows:

\[
u_{oi} = \min_{k \in C_i} u_i(x_k), \quad i = 1, \dots, c, \tag{3.2}
\]

where $x_k$, $k \in C_i$, is the learning set of objects belonging to class $C_i$. Suppose that an extended set $A = \{a_0, a_d, a_1, \dots, a_c\}$ of possible actions is considered, where $a_0$ and $a_d$ represent ambiguity and distance rejection, respectively. Let J(x) be the set of candidate clusters for object x that is defined according to the above rule as:

\[
J(x) = \{\, i \in \{1, \dots, c\} \mid u_i(x) > u_{oi} \,\}. \tag{3.3}
\]

A decision rule concerning the assignment of object x to one of these clusters can be formulated as follows [Denoeux et al., 1997]:

\[
a(x) =
\begin{cases}
a_i & \text{if } J(x) = \{i\}, \\
a_d & \text{if } J(x) = \emptyset, \\
a_0 & \text{if } |J(x)| > 1.
\end{cases}
\tag{3.4}
\]

The interpretation of this rule is simple. If the set of candidate clusters includes only one element, then the assignment of object x to cluster $C_i$ is unambiguous. If set J(x) is an empty set, then the assignment of an object to one of the clusters is rejected because its degree of membership to all clusters is too low. If the set J(x) contains more than one element, this indicates that several clusters appear to be equally likely and the assignment of an object is rejected due to the ambiguity of the situation. A drawback of this decision rule is that the same membership threshold is used for ambiguity and distance reject, which can lead to undesirable results. To avoid this problem, the membership ratio rule was introduced in ([Frélicot, 1992], [Frélicot et al., 1995]) in addition to rule (3.4) to deal with ambiguity rejection. This ratio is defined by:

\[
R(x) = \frac{u_{m_2}(x)}{u_{m_1}(x)}, \tag{3.5}
\]

where $u_{m_1}(x) = \max_{i \in J(x)} u_i(x)$ and $u_{m_2}(x) = \max_{i \in J(x) \setminus \{m_1\}} u_i(x)$.

If R(x) is close to zero, then the degree $u_{m_1}(x)$ is much higher than all other degrees of membership and x does not have to be ambiguity rejected. Action $a_{m_1}$ can then be selected with high confidence. If R(x) is close to one, then the assignment of an object to at least two clusters is equally likely and x is ambiguity rejected. To be able to make a decision it is convenient to compare R(x) to a predefined ambiguity threshold $R_o$. If $R(x) \geq R_o$, then an object is rejected, otherwise action $a_{m_1}$ is selected. This approach makes it possible to adjust distance and ambiguity reject rates independently and provides reliable decisions.
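
The decision rules above can be combined in a short Python sketch. It is an illustration under assumptions, not the implementation of the cited authors: membership degrees are passed as plain lists indexed by cluster, the ratio rule (3.5) is applied only when more than one candidate cluster passes its threshold, the default value of `ratio_limit` (the ambiguity threshold $R_o$) is arbitrary, and all function names are hypothetical.

```python
def learn_thresholds(train_memberships):
    """Eq. (3.2): per cluster, the lowest membership of its own learning objects.

    train_memberships[i] lists u_i(x_k) for the learning objects x_k of cluster i.
    """
    return [min(values) for values in train_memberships]


def decide(u, thresholds, ratio_limit=0.8):
    """Return a cluster index, 'distance reject' or 'ambiguity reject'.

    u[i] is the membership of the object in cluster i; ratio_limit plays
    the role of R_o (its default value is an arbitrary illustration).
    """
    # Eq. (3.3): candidate clusters whose threshold is exceeded
    candidates = [i for i, ui in enumerate(u) if ui > thresholds[i]]
    if not candidates:
        return "distance reject"          # J(x) is empty
    if len(candidates) == 1:
        return candidates[0]              # unambiguous assignment
    # Eq. (3.5): ratio of second-largest to largest candidate membership
    ranked = sorted(candidates, key=lambda i: u[i], reverse=True)
    m1, m2 = ranked[0], ranked[1]
    ratio = u[m2] / u[m1]
    return "ambiguity reject" if ratio >= ratio_limit else m1
```

With thresholds `[0.3, 0.3]`, the membership vector `[0.9, 0.1]` is assigned to cluster 0, `[0.1, 0.2]` is distance rejected, `[0.5, 0.48]` is ambiguity rejected, and `[0.9, 0.4]` is still assigned to cluster 0 because the ratio 0.4/0.9 stays below the ambiguity threshold. Because the membership thresholds and `ratio_limit` are separate parameters, the two reject rates can be tuned independently.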

This method for determining membership and ambiguity rejection can be used in dynamic pattern recognition to detect the inadequacy of the current cluster structure. If too many objects are membership rejected, this may indicate that the system is in a new state and the classifier must be re-learned with an increased number of clusters. If there are too many objects that are ambiguity rejected, provided that threshold u_o in equation (3.6) is fixed relatively high, this may indicate that at least two clusters cannot be considered as distinct any more but as similar and should probably be merged to form a single cluster.

Parametric concept of a membership function for a dynamic classifier

Using fuzzy clustering methods, temporal changes in the cluster structure can be recognised by monitoring membership functions representing fuzzy clusters and evaluating the degrees of membership of new objects to clusters (see section 3.1, point 6). The monitoring procedure and the corresponding adaptation law of a classifier depend crucially on the type and structure of the chosen classifier. In [Bocklisch, 1981] a fuzzy clustering algorithm based on a special parametric concept of a membership function suitable for dynamic classification was proposed. The idea of this method is to describe objects by elementary membership functions and to obtain cluster membership functions by aggregating fuzzy objects using the union operation. The aggregation procedure is controlled by a threshold value, which is used to determine similar objects and correspondingly objects belonging to the same cluster. This hierarchical procedure of classifier design is repeated for several threshold values and the best cluster configuration is chosen using the evaluation of some validity measures and expert knowledge. Due to these repeated evaluations, the method can also be characterised as iterative. After the best cluster configuration is found, a fuzzy description of clusters is derived with the help of the parametric concept of the membership function.
In contrast to the non-parametric concept, where a degree of membership is assigned to each element $x$ of the universe of discourse $X$ independently of the neighbouring elements (type A membership function), the parametric concept provides an analytical model describing a dependency between degrees of membership and elements $x$ (type B membership function) [Zimmermann, Zysno, 1985, pp ]. The judgement of membership in the latter case is based on the comparison of element $x$ with an ideal, which results in a distance between the element and the ideal. Thus, membership is defined as a function of this distance, specified by a number of parameters. In this way, the parametric concept makes it possible to change from the set-theoretic consideration of membership functions to the consideration of crisp parameters of membership functions. The parameters of a membership function are usually defined depending on the application. The parametric concept proposed in [Bocklisch, 1981]

aims to design a membership function that represents a given data structure and whose parameters are easy to interpret in the context of the classification problem. The proposed membership function is defined by the following properties:

1. The same type of membership function is used to model elementary events (objects) as well as global events (clusters). These membership functions are denoted by $u_e$ and $u_g$, respectively.
2. The membership function $u(x; p)$ is described by a functional relationship $u(\cdot)$ and by a parameter vector $p$, which consists of two sub-vectors $p_1$ and $p_2$ characterising the location and the fuzziness of the membership function, respectively. The membership function is asymmetric and has a single maximum.
3. The parameters of the membership function have a clear physical meaning and can be explicitly interpreted. They include two location parameters $x_0$ and $a$, and three parameters of fuzziness $b$, $c$, and $d$:

Parameter $x_0$: the location of the membership function. This value corresponds to the element with the maximum degree of membership (e.g. the best representative of a fuzzy cluster). It can be determined as the arithmetic mean or the centre of gravity of the elements belonging to this fuzzy set, or defined subjectively.

Parameter $a$: the maximum value of the membership function. This value does not have to be equal to one, as is usual for normalised membership functions. For cluster membership functions the value of parameter $a$ can be proportional to the current importance of a cluster, determined by the number of objects currently belonging to this cluster. In addition, the value of $a$ is influenced by a forgetting function. Therefore, the value of parameter $a$ varies in the course of time: it increases as a cluster grows and decreases as a cluster becomes old and is no longer supported by new objects. Since the membership function is not normalised, the choice of aggregation operators is restricted to the class of operators that do not require normalised functions.
Parameter $c$: the support of the membership function. This parameter determines the set of elements for which the degree of membership is higher than a predefined marginal degree $b$.

Parameter $b$: the marginal degree of membership. This value is attained on the boundary of the support of the membership function defined by parameter $c$.

Parameter $d$: the slope of the membership function. This parameter determines the form of the membership function. In the case $d \to \infty$ one obtains the conventional rectangular characteristic function that can only take values from the set {0, 1}.

When deriving a membership function it is assumed that a degree of membership is equivalent to a similarity measure. In pattern recognition it is usual to choose the distance between vectors as a dissimilarity measure between them. Thus, considering a cluster with its centre of gravity in the origin of the co-ordinate system, it is assumed that the larger the distance of a vector $x$ from the origin, the smaller its membership to the cluster. For the proposed type of membership function a Euclidean distance measure is chosen. The parameters described above are used as weights (specific for each dimension of a vector $x$) to control the form of the membership function. An $M$-dimensional membership function with its maximum in the origin is defined according to [Bocklisch, 1981] as follows:

$$u(x) = a \left[ 1 + \sum_{i=1}^{M} \left( \frac{1}{b_i} - 1 \right) \left| \frac{x_i}{c_i} \right|^{d_i} \right]^{-1} \tag{3.7}$$

In order to increase the flexibility of the function, parameters $b$, $c$, and $d$ are defined differently for the left and the right side of each of the $M$ components of the function. The resulting membership function is represented graphically in Figure 3-6, where the one-dimensional case is shown for the sake of simplicity.

[Figure 3-6: A parametric membership function. Plot of µ(x) against x with maximum a at x_0, values a·b_l and a·b_r at x_0 − c_l and x_0 + c_r, and slopes d_l and d_r on the left and right side.]

This general form of a membership function is used to represent fuzzy clusters. Fuzzy objects are described by an elementary membership function obtained from the general model with the following parameter settings: $a = 1$, $b_l = b_r = 0.5$, $c_l = c_r = c_e$, $d_l = d_r = 2$. This corresponds to a normalised symmetric membership function. The parameter of fuzziness $c_e$ can be defined by an expert depending on the context. An application of union or intersection operations to the considered parametric elementary membership functions can lead to new types of functions that do not fit into the original concept (they can be characterised by multiple maxima and an extended parameter vector).
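As an illustration of the parametric concept, the following minimal sketch evaluates a one-dimensional membership function of the form a / (1 + (1/b − 1)·|(x − x_0)/c|^d) with separate left and right parameters. The function name, argument names, and default values are hypothetical and chosen only for this example; the defaults correspond to the elementary membership function described above with c_e = 1.

```python
def parametric_membership(x, x0=0.0, a=1.0,
                          b_left=0.5, b_right=0.5,
                          c_left=1.0, c_right=1.0,
                          d_left=2.0, d_right=2.0):
    """One-dimensional parametric membership function (illustrative sketch).

    x0 is the location, a the maximum value, b the marginal degree,
    c the support width and d the slope; b, c and d may differ on the
    left and on the right side of x0.
    """
    offset = x - x0
    # Select the parameter set of the side on which x lies.
    if offset < 0:
        b, c, d = b_left, c_left, d_left
    else:
        b, c, d = b_right, c_right, d_right
    return a / (1.0 + (1.0 / b - 1.0) * (abs(offset) / c) ** d)


# The maximum a is reached at x0, and the value a*b on the boundary x0 + c.
peak = parametric_membership(3.0, x0=3.0, a=0.8)
boundary = parametric_membership(5.0, x0=3.0, a=0.8, b_right=0.25, c_right=2.0)
```

With a large slope d the function approaches a rectangular shape: inside the support the values stay close to a, outside they fall quickly towards zero.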
In order to preserve the proposed parametric representation, a two-step aggregation procedure is proposed in [Bocklisch, 1981, pp ]: first, an aggregation operation is applied, then

the aggregated function is transformed into the original concept by computing a new parameter vector from the intermediate results of the aggregation step. In order to reduce the consideration of membership functions from an $M$-dimensional feature space to a one-dimensional case, a new co-ordinate system is built for each single cluster by shifting, or by shifting and rotating, the old one. The purpose of this transformation is to obtain a cluster-specific co-ordinate system in which the cluster membership function has its maximum in the origin. The $M$-dimensional membership function is projected on the new axes, and $M$ one-dimensional functions are considered using the parametric concept described above. In this case, the parameters of the transformation (length of shift and angles of rotation) are included in the parameter vector of a cluster membership function. Hence, the fuzzy classifier is designed by aggregating elementary membership functions of objects and by representing clusters by global parametric membership functions. The classification of a new object is carried out by computing a vector of degrees of membership of the object to all clusters, which Bocklisch calls a sympathy vector. The drawback of this classifier is, however, the problem of interpreting the resulting cluster structure: since clusters are defined in different co-ordinate systems, the analysis becomes rather complicated.

The proposed clustering algorithm provides a flexible cluster model and seems to have good adaptation capabilities due to its parametric representation. The parametric concept of the membership function was exploited in [Mann, 1984] to design a dynamic fuzzy classifier capable of following temporal changes in the cluster structure. In the case of dynamic classifiers, single observations are given as a temporal sequence of objects $x_k = x(t_k)$, $k = 1, ..., p$, which can represent different data structures in the course of time. The design procedure of a classifier cannot be limited to a certain moment in time as in the case of static classifiers.
In contrast, the learning and the working phases are repeated iteratively in the course of time within a closed cycle. Each new object is classified by the current fuzzy classifier and its membership vector is analysed with respect to the existing cluster structure. It is tested whether the object fits into the cluster structure well enough and, if not, which kind of change has occurred. These tests represent the monitoring procedure of the considered clustering algorithm. Its result is one of the following decisions: formation of a cluster, merging of clusters, splitting of clusters, or modification of clusters. The update of the classifier is carried out by recursively changing the parameter vectors of cluster membership functions and by applying set-theoretic operations to cluster membership functions, if so required by the result of the monitoring procedure. Because of the recursive calculation rule for cluster parameters, Mann calls this learning method recursive classification. Finally, the classifier is devaluated by a forgetting function, which is equivalent to an ageing of the classifier. Since, according to [Peschel, Mende, 1983], the temporal development of the classifier can be considered as an evolutionary process, a forgetting function can be chosen

from evolutionary models of the power-product type. The forgetting process is an important component of the classification method: it gives the classifier its learning ability and makes it possible to completely re-learn the classifier as time passes. The following algorithm describes the learning procedure, combining the design of a dynamic classifier with the adaptation procedure in one learning-and-working cycle.

Algorithm 1: Learning algorithm for dynamic classifier design [Mann, 1984, p. 35].

1. Fuzzy classification of a new object $x_k = x(t_k)$, $k = 1, ..., p$, i.e. determination of a sympathy vector $u_k = (u_{1k}, ..., u_{ck})$ by calculating the degrees of membership of object $x_k$ to all $c$ clusters.
2. Comparison of all components of the sympathy vector with the merging threshold $u_s$. Determination of the set of clusters for which the corresponding components of the sympathy vector exceed the given threshold: $Z = \{ i \mid u_{ik} \ge u_s \}$. If $Z$ is an empty set, go to Step 3. Otherwise go to Step 4.
3. Formation of a new cluster, since all components of the sympathy vector are smaller than the threshold. The membership function of the new cluster is defined by the elementary membership function of the new object: $c = c + 1$ and $u_c(x, p_c) = u_e(x_k, p_e)$. Go to Step 7.
4. Enlargement of a cluster $i \in Z$ with object $x_k$ by applying a union operation to combine the cluster membership function with the elementary membership function of the object. If the cardinality of set $Z$ is equal to 1 (there is only one cluster for which the corresponding component of the sympathy vector exceeds the merging threshold), go to Step 6. Otherwise, go to Step 5.
5. Merging of all clusters $r \in Z$, $r \ne i$, with cluster $i$ by applying a union operation to the cluster membership functions.
6. Test whether the membership function of cluster $i$ satisfies the parametric concept of a membership function (this can be recognised by considering the parameter vector $p_i$). If this is not the case, split off subclasses.
7. Application of a forgetting function to the classifier.
The learning and working cycle is completed and the next object can be considered: $k = k + 1$. Go to Step 1.

In this algorithm the decision concerning changes in the cluster structure is taken depending on the result of the comparison between the degree of membership of a new object and the merging threshold $u_s$. If this degree is smaller than the threshold for all clusters, then the object

cannot be assigned to any of the existing clusters and a new cluster is formed. If the degree of membership exceeds the threshold for more than one cluster, this indicates that the corresponding clusters overlap and can be merged. The merging of clusters is carried out by applying the union operation to cluster membership functions. This aggregation procedure first requires the definition of a joint aggregation space for the new cluster, since all clusters are defined in different cluster-specific co-ordinate systems. In the next step the clusters to be merged are transformed into the aggregation space, where the projections of the membership functions on the new axes are used to calculate new parameter vectors. An aggregation of the transformed membership functions by the union operator is then performed with respect to each dimension of the aggregation space as well as for each side (left and right) of the membership function.

With the help of the merging threshold $u_s$ it is possible to control the degree of fuzziness and the fineness of the cluster structure. On the one hand, the merging threshold represents the degree of membership that a single object must possess to be assigned to a cluster. Smaller values of $u_s$ lead to fuzzier clusters. On the other hand, the merging threshold influences the number of clusters. As the value of $u_s$ decreases, more clusters are merged and the number of clusters is reduced, leading to a coarser cluster structure. Thus, the value of the merging threshold is an important control parameter of the learning algorithm.

The splitting of clusters is performed if a cluster membership function does not fit into the underlying parametric concept. The reason for splitting a cluster is the dissolution of an old centre of concentration of objects and the formation of new groups of objects within the cluster, which are drifting away from the former centre of gravity of the cluster.
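The threshold comparison in Steps 2 to 5 of Algorithm 1 can be sketched as follows. This is only an illustration of the decision logic; the union of membership functions, the co-ordinate transformations, the split test, and the forgetting function are not modelled, and the function name and the returned labels are hypothetical.

```python
def monitoring_decision(sympathy, u_s):
    """Decide how a new object changes the cluster structure (sketch).

    sympathy: degrees of membership of the new object to all clusters.
    u_s:      merging threshold.
    Returns a label and the index set Z of clusters reaching the threshold.
    """
    z = [i for i, u in enumerate(sympathy) if u >= u_s]
    if not z:
        # No cluster accepts the object: a new cluster is formed.
        return "form_new_cluster", z
    if len(z) == 1:
        # Exactly one cluster accepts the object: it is enlarged.
        return "enlarge_cluster", z
    # Several clusters accept the object: enlarge one and merge the rest.
    return "enlarge_and_merge", z


# Lowering the merging threshold turns the same sympathy vector from
# "new cluster" into "enlarge" and finally into "merge".
vector = [0.15, 0.55, 0.40]
strict = monitoring_decision(vector, u_s=0.6)
medium = monitoring_decision(vector, u_s=0.5)
loose = monitoring_decision(vector, u_s=0.3)
```

The example mirrors the role of the merging threshold described above: smaller values of u_s cause more clusters to be merged.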
In order to detect such changes in the distribution of objects within a cluster, parameter $d$, which characterises the projections of the membership functions on the axes, is considered. If new objects are concentrated more and more on the boundary of a cluster, the value of $d$ grows and the form of the membership function changes from a unimodal shape to that of a uniform distribution. A cluster is split if the value of $d$ exceeds some predefined threshold.

The drift of clusters is not explicitly detected by the considered algorithm. The classifier takes these changes into account and follows them through the aggregation of new objects with cluster membership functions. In this way cluster membership functions move in the direction in which new objects arrive.

Although the proposed parametric concept of the membership function seems to be suitable for the design of adaptive classifiers, it is rather simple and correspondingly too restrictive to process objects and clusters in multidimensional feature spaces. The proposed clustering algorithm is rather complicated due to the need to process cluster-specific co-ordinate systems, the many transformations during aggregation procedures, and the consideration of single projections of membership functions. The goal of the clustering algorithm is similar to

the one of point-prototype based clustering algorithms; however, only membership functions are used to represent clusters, and these are moved into the origin of the co-ordinate system. Information concerning cluster centres is represented implicitly by the location parameter $x_0$ of the membership function. Nevertheless, the ideas of the monitoring procedure concerning formation, merging and splitting of clusters in the discussed clustering method of Bocklisch and Mann have a general character and can be used for the design of other types of dynamic classifiers.

3.2 The Adaptation Process

In dynamic pattern recognition systems the classifier design cannot be separated temporally from the phase of its application to the classification of new objects. The dynamic classifier must constantly be updated based on new objects in order to preserve its performance in the course of time. The update of the classifier means that its parameters are adjusted over time in order to represent the current cluster structure in the best possible way. In other words, a dynamic classifier must possess adaptive capabilities in order to follow temporal changes. The choice of parameters to be adapted depends on the type of the classifier (e.g. point-prototype based classifier, neural networks, or decision trees).

In this section different strategies for the update of a dynamic classifier are discussed. The simplest strategy, which updates the classifier without any changes to the conventional design procedure, is to re-learn the classifier periodically from scratch as time passes (Section 3.2.1). Although this procedure is not always economical with respect to computing time, it provides an opportunity to apply a static classifier in a dynamic environment. In order to give the classifier the ability to adapt, it can be incrementally updated with new objects as time passes (Section 3.2.2). This strategy requires a recursive representation of the learning rule so that classifier parameters can be recursively updated using new objects.
This adaptation law allows a classifier to follow gradual temporal changes but does not have a mechanism for adjusting a classifier to abrupt changes in the cluster structure, such as changes in the number of clusters. The most flexible but also the most complicated approach is the development of an adaptation law that depends on the temporal changes detected in the cluster structure by the monitoring process (Section 3.2.3). This approach provides a flexible combination of re-learning and incremental updating procedures, enhanced by special elements for the intelligent and efficient design of a dynamic classifier.

The notion "learning the classifier" is usually used in the framework of unsupervised pattern recognition, whereas "training the classifier" is applied in the framework of supervised pattern recognition. In this thesis both notions are subsumed under the term "learning", which refers to the process of classifier design.

In the following sections the main principles of these three strategies will be considered and their advantages and disadvantages will be discussed.

3.2.1 Re-learning of the classifier

As mentioned in Section 2.3.3, one strategy to adapt the originally static classifier to temporal changes in the cluster structure is to re-learn the classifier from scratch after each time window in order to preserve its performance over time. The classifier can be re-learned based either on the complete history of objects (complete learning approach) or on the partial history (re-learning approach).

In the first case the classifier is re-learned using a training set which includes all available objects from the past. This approach can be very time consuming because of the constantly growing number of objects. Besides, it may be inefficient always to discard the information learned in the previous phase and to learn it again, especially if adaptation is not required.

In the second case the training set for re-learning consists solely of the most recent objects obtained during the last time window. This approach avoids the problem of a constantly increasing training set by disregarding all objects that precede the current time window. However, old objects can contain relevant information that must be used to obtain a representative classifier. Disregarding them can lead to a drop in classifier performance, particularly if the new objects are not representative enough and contain a lot of noise. As explained in Section 2.3.2, the choice of the length of the time window has a big influence on the adaptation ability of the classifier.

As can be seen, both approaches have certain advantages and shortcomings and cannot guarantee an optimal learning strategy. The main drawback of the updating strategy based on re-learning the classifier is that structural changes cannot be recognised explicitly, i.e. exactly which clusters have been merged or split remains unknown.
The classifier learns a new cluster structure blindly based on the available information, and changes can be detected by comparing the current cluster structure with the preceding one. However, for unsupervised re-learning of the classifier certain information about temporal changes is required, namely the correct number of clusters at the current moment.

The most common technique to determine the optimal number of clusters relies on validity measures (e.g. partition entropy [Bezdek, 1981, p. ], proportion exponent [Windham, 1981], degree of compactness and separation [Xie, Beni, 1991] etc.). Validity measures are used in cluster analysis to quantify the separation and compactness of the clusters. If the number of clusters is chosen correctly, the clustering algorithm can identify well separated and compact groups within the data and the validity measure takes its maximum or minimum value (depending on the chosen measure). However, the definition of cluster separation and compactness is not unique and depends on the specific problem. Therefore, the mathematical formulation of the validity measure is extremely difficult according to [Bezdek, 1981, p. 98]. The main idea of the technique based on the validity measure is to repeat the clustering procedure several times with different numbers of clusters

(between predefined minimum and maximum values) and to choose as optimal the number at which the validity measure has a local minimum or a local maximum (depending on the measure). The interval for the search for an optimal number of clusters is determined depending on the application, where a priori information concerning the maximum possible number of clusters is given, or an expert can define a required upper limit taking into account the need for clear interpretability of results.

If the classifier is re-learned after each time window, the number of iterations with a different number of clusters can be reduced. The number of clusters in the current time window can be incrementally decreased or increased with respect to the number of clusters in the preceding time window until the extreme value of the validity measure is reached. Therefore, the number of iterations of the clustering procedure can vary between time windows and is determined by the behaviour of the validity measure. The reason for this strategy is the assumption that the number of clusters changes at a rather gradual rate over time.

One of the best representatives of this technique is the unsupervised optimal fuzzy clustering (UOFC) algorithm [Gath, Geva, 1989], which determines an optimal number of clusters automatically by maximising the average partition density criterion. The algorithm starts with a single cluster prototype and iterates for an increasing number of clusters, calculating a new partition of the data set and evaluating the validity measure in each iteration until the maximum predefined number of clusters is reached. The best partitioning is obtained for the number of clusters that maximises the average density criterion plus one or sometimes two. The results reported in [Geva, Kerem, 1998] show that the UOFC algorithm is suitable for an accurate and reliable identification of bioelectric brain states based on the analysis of the EEG time series.
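The idea of starting the search from the number of clusters found in the preceding time window can be sketched as a simple local search. This is a minimal illustration under stated assumptions: `validity` is a hypothetical callable that runs the clustering with c clusters and returns the value of the chosen validity measure, and the function name and return convention are invented for this example. As noted above, such a search only finds a local extreme value.

```python
def search_cluster_count(validity, c_prev, c_min, c_max, maximise=True):
    """Local search for the number of clusters around c_prev (sketch).

    Returns the selected number of clusters and the number of
    clustering runs (validity evaluations) that were needed.
    """
    sign = 1.0 if maximise else -1.0
    cache = {}

    def score(c):
        # Each distinct c is clustered and evaluated only once.
        if c not in cache:
            cache[c] = sign * validity(c)
        return cache[c]

    best = min(max(c_prev, c_min), c_max)
    while True:
        neighbours = [c for c in (best - 1, best + 1) if c_min <= c <= c_max]
        better = [c for c in neighbours if score(c) > score(best)]
        if not better:
            return best, len(cache)
        best = max(better, key=score)


# Toy validity measure with its maximum at five clusters.
toy_validity = lambda c: -(c - 5) ** 2
best_c, runs = search_cluster_count(toy_validity, c_prev=4, c_min=2, c_max=10)
```

In this toy case only four clustering runs are needed instead of the nine required by a full scan over 2 to 10 clusters, which reflects the assumption that the number of clusters changes gradually between time windows.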
This approach to determining the optimal number of clusters depends, however, to a high degree on the quality and reliability of the validity measures used. As already mentioned, it does not guarantee that the optimal number of clusters will always be found, since it is difficult to define a unique measure that takes into account the variability in cluster shape, density, and size. So far, none of the most frequently used validity measures provides a clear answer about the optimal number of clusters in all situations. Moreover, re-learning the classifier within each time window is computationally expensive because of the need to repeat clustering runs.

An alternative approach for unsupervised detection of the number of clusters during re-learning of the classifier is presented by clustering algorithms based on the principle of merging similar clusters. In the literature a number of algorithms have been proposed which start with an over-specified number of clusters and, by merging similar clusters, terminate with an optimal number of clusters. An example of this approach is the competitive agglomeration (CA) algorithm introduced in [Krishnapuram, 1997], which produces a sequence of partitions with a decreasing number of clusters. The update equation creates an environment in which

clusters compete for objects and only clusters with large cardinalities survive. The final partition has the optimal number of clusters from the point of view of the objective function.

In [Setnes, Kaymak, 1998] and [Stutz, 1998] extended versions of the fuzzy c-means (FCM) algorithm [Bezdek, 1981] with cluster merging were proposed. The main idea of these algorithms is the iterative clustering of objects and the evaluation of the similarity of all pairs of clusters at each iteration. If the similarity between two clusters exceeds a given threshold, they are merged and the number of clusters is decreased. The algorithms stop when there are no more clusters that can be merged. These two algorithms differ in the definition of the similarity measure for clusters. Their performance depends on the correct choice of the threshold for merging.

3.2.2 Incremental updating of the classifier

Another alternative for updating the classifier over time in order to follow temporal changes of the cluster structure is to improve the current classifier using incremental updating based on new objects. This idea was put into practice in [Marsili-Libelli, 1998], where the classical fuzzy c-means (FCM) algorithm [Bezdek, 1981] was enhanced with an updating feature enabling it to detect departures of a system's state from normal conditions. In the classical version of FCM the knowledge contained in the training set of objects is condensed into the cluster prototypes. They are defined as the fuzzy weighted centres of gravity of objects according to the following equation:

$$v_i = \frac{\sum_{j=1}^{N} (u_{ij})^q \, x_j}{\sum_{j=1}^{N} (u_{ij})^q}, \quad i = 1, ..., c \tag{3.8}$$

where $x_j$ is an $M$-dimensional feature vector representing the $j$-th object, $u_{ij}$ is the degree of membership of object $j$ to cluster $i$, $q \in (1, \infty)$ is the fuzzy weighting exponent, $N$ is the number of training objects, and $c$ is the predefined number of clusters. Cluster centres describe typical values of the corresponding clusters and usually represent the 'normal' state of the system and an appropriate number of faulty states.
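Degrees of membership $u_{ij}$, which are components of the membership matrix $U$, denote the extent to which object $x_j$ is similar to the corresponding cluster prototype and are calculated from the distances $d_{ij}$ between object $x_j$ and prototype $v_i$ as follows:

$$u_{ij} = \frac{1}{\sum_{r=1}^{c} \left( \dfrac{d_{ij}}{d_{rj}} \right)^{\frac{2}{q-1}}} \tag{3.9}$$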
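Equations (3.8) and (3.9) can be sketched in a few lines of standard-library Python. The function names, the row-wise layout U[i][j] of the membership matrix, the Euclidean distance, and the handling of an object that coincides with a prototype are assumptions made for this illustration.

```python
import math


def memberships(x, centres, q=2.0):
    """Degrees of membership of object x to all clusters, as in (3.9)."""
    dists = [math.dist(x, v) for v in centres]
    if any(d == 0.0 for d in dists):
        # Assumption: an object lying exactly on a prototype belongs
        # fully to that prototype (shared if several coincide).
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (q - 1.0)
    return [1.0 / sum((d_i / d_r) ** power for d_r in dists) for d_i in dists]


def prototypes(objects, U, q=2.0):
    """Fuzzy weighted centres of gravity, as in (3.8); U[i][j] = u_ij."""
    dim = len(objects[0])
    centres = []
    for row in U:
        weights = [u ** q for u in row]
        total = sum(weights)
        centres.append([
            sum(w * x[m] for w, x in zip(weights, objects)) / total
            for m in range(dim)
        ])
    return centres


# An object at distance 1 and 3 from two prototypes (q = 2).
u = memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
v = prototypes([(0.0, 0.0), (2.0, 0.0)], [[0.5, 0.5]])
```

In the example the memberships are 0.9 and 0.1, so they sum to one, and the single prototype computed from two equally weighted objects lies midway between them.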

During the training phase, the fuzzy c-means algorithm operates in an iterative mode, sequentially computing the membership matrix $U$ according to (3.9), the cluster centres $V$ according to (3.8) and the distances $d_{ij}$, $i = 1, ..., c$, $j = 1, ..., N$, until the membership matrix stabilises. In order to classify a new object $x_o$ into the existing clusters, the memberships of the object to all clusters are calculated by applying (3.9) only once. In classical fuzzy c-means the cluster prototypes remain unchanged during the classification of new objects.

In order to provide the classifier with some trend-following capabilities, it is proposed in [Marsili-Libelli, 1998] to update the location of cluster prototypes using the degrees of membership of new objects to clusters. It is assumed that all $N$ previous objects have already been classified and a new $(N+1)$-th object is considered. The cluster prototypes are computed according to the following recursive equation:

$$v_{i,N+1} = \frac{\sum_{j=1}^{N+1} (u_{ij})^q \, x_j}{\sum_{j=1}^{N+1} (u_{ij})^q} = \frac{\sum_{j=1}^{N} (u_{ij})^q \, x_j + (u_{i,N+1})^q \, x_{N+1}}{\sum_{j=1}^{N} (u_{ij})^q + (u_{i,N+1})^q} = \frac{vn_i(N) + (u_{i,N+1})^q \, x_{N+1}}{vd_i(N) + (u_{i,N+1})^q} \tag{3.10}$$

where

$$vn_i(N) = \sum_{j=1}^{N} (u_{ij})^q \, x_j \quad \text{and} \quad vd_i(N) = \sum_{j=1}^{N} (u_{ij})^q .$$

The quantities $vn_i(N)$ and $vd_i(N)$ are calculated recursively using the degrees of membership of the $N$ existing objects, and their current values are saved after each recursion. The adaptation of the cluster centres is provided by the additional terms corresponding to the new $(N+1)$-th object. The computational expense of the updating procedure is moderate due to the recursive calculation of the cluster centres. When the complete knowledge base is used, including the training data set as well as new data up to the current sample, there is no need to store the growing membership matrix $U$ to calculate cluster centres, since all the necessary information is contained in the accumulated quantities $vn_i(N)$ and $vd_i(N)$.
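The recursion in (3.10) only needs the two accumulated quantities per cluster. The following minimal sketch keeps them in a small class; the class and attribute names are hypothetical and the example assumes that the membership of each new object to the cluster has already been computed.

```python
class RecursivePrototype:
    """Cluster centre updated recursively as in (3.10) (sketch)."""

    def __init__(self, dim):
        self.vn = [0.0] * dim  # accumulated numerator, a vector
        self.vd = 0.0          # accumulated denominator, a scalar

    def update(self, x, u, q=2.0):
        """Add one object x with membership u and return the new centre."""
        w = u ** q
        self.vn = [n + w * xm for n, xm in zip(self.vn, x)]
        self.vd += w
        return self.centre()

    def centre(self):
        return [n / self.vd for n in self.vn]


# Three objects arrive one after another with memberships 1.0, 0.5, 0.5.
proto = RecursivePrototype(dim=2)
stream = [((0.0, 0.0), 1.0), ((2.0, 0.0), 0.5), ((4.0, 2.0), 0.5)]
for x, u in stream:
    centre = proto.update(x, u)

# Batch evaluation of (3.8) over the same objects for comparison.
weights = [u ** 2.0 for _, u in stream]
batch = [sum(w * x[m] for w, (x, _) in zip(weights, stream)) / sum(weights)
         for m in range(2)]
```

The recursively updated centre coincides with the batch result of (3.8), while the memory requirement stays constant regardless of the number of objects processed.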
The drawback of using the complete data set for the updating procedure is, firstly, that old objects may become irrelevant as time passes, which may have negative effects on the updating procedure, and secondly, that each new object can influence the location of cluster centres regardless of its relative importance.

The first drawback can be avoided by using a moving window and by replacing old data of the initial training set with new on-line data. This strategy implies the storage of the membership matrix $U$ for updating cluster centres. If the degrees of membership of each object are stored in columns, then the membership matrix corresponding to the current window is obtained by deleting the column corresponding to the oldest object and by adding a new column with the degrees of membership of the new object. In [Marsili-Libelli, 1998] the size of the moving

window was chosen to be equal to the number $N$ of training data records. Thus, the column dimension of the membership matrix remains fixed at $N$. The following algorithm was proposed for the updating procedure using a moving window:

Algorithm 2: Classifier update with a moving window [Marsili-Libelli, 1998].

1. Classify a new object $x_j$, $j > N$, by calculating the degrees of membership $u_{ij}$ of the object to all clusters $i = 1, ..., c$.
2. Evaluate the new membership matrix at step $j$ by dropping the leftmost column corresponding to the oldest object and by adding a rightmost column containing the degrees of membership of the new object:

$$U_j = [\, U_{j-1}(2, ..., N) \;\; u_j \,] \tag{3.11}$$

3. Update the matrix of cluster centres $V$ using the new matrix $U_j$ according to equation (3.8).

The updating procedure based on the moving window makes it possible to follow the temporal development of objects through the incremental displacement of cluster centres from their initial locations corresponding to the training data set. On the other hand, it allows each new object to influence the location of cluster centres, although not all objects can be considered representative enough to contribute to the knowledge base. Hence, the problem is to discriminate between good and bad objects depending on their significance and to decide whether the information provided by a new object is sufficient to be included in the knowledge base. In other words, it should be evaluated whether a new object can improve the existing partitioning by making it crisper.

Different criteria have been introduced in the literature to determine the degree of uncertainty of a fuzzy partition [Pal, Bezdek, 1994]. In [Marsili-Libelli, 1998] the partition entropy was selected as an evaluation criterion for the quality of the updated partition. The partition entropy is similar to the average information content of a source proposed by Shannon [Shannon, 1948] and is defined as follows [Bezdek, 1981, p. ]:

$$H_N = -\frac{1}{N} \sum_{j=1}^{N} \sum_{i=1}^{c} u_{ij} \ln(u_{ij}) \tag{3.12}$$

If the membership matrix $U$ provides a crisp partition, then one has the complete information at one's disposal and consequently the entropy is zero. If the degrees of membership of an object to all clusters are equal, the entropy takes its maximum value, which is $\ln c$ for the definition above and one after normalisation by $\ln c$. Thus, the smaller the entropy, the crisper the fuzzy partition.
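The moving-window step (3.11) and the partition entropy (3.12) can be sketched as follows. The function names and the row-wise storage U[i][j] (clusters in rows, objects in columns) are assumptions of this example; terms with zero membership are skipped, using the usual convention that u ln(u) tends to zero as u tends to zero.

```python
import math


def slide_window(U, u_new):
    """Drop the oldest column of U and append the memberships of a
    new object, as in (3.11); U[i][j] = u_ij."""
    return [row[1:] + [u_new[i]] for i, row in enumerate(U)]


def partition_entropy(U):
    """Partition entropy of a membership matrix, as in (3.12)."""
    n = len(U[0])
    total = 0.0
    for row in U:
        for u in row:
            if u > 0.0:
                total += u * math.log(u)
    return -total / n


crisp = [[1.0, 0.0], [0.0, 1.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
h_crisp = partition_entropy(crisp)
h_uniform = partition_entropy(uniform)

window = slide_window([[0.9, 0.8, 0.2], [0.1, 0.2, 0.8]], [0.3, 0.7])
```

For the crisp partition the entropy is zero, and for the uniform two-cluster partition it equals ln 2; dividing by ln c would rescale the latter to one.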

An evaluation of changes of the normalised partition entropy can be used in the algorithm for updating cluster centres in order to decide whether a new object improves the fuzzy partition. The algorithm consists of the following steps:

Algorithm 3: Conditional classifier update [Marsili-Libelli, 1998].

1. Classify a new object $x_j$, $j > N$, by calculating the degrees of membership $u_{ij}$ of the object to all clusters $i = 1, ..., c$.
2. Evaluate the new membership matrix at step $j$ as in Algorithm 2.
3. Compute the partition entropies $H_N(U_j)$ and $H_N(U_{j-1})$ of the partition at the current step $j$ and at the previous step $j-1$, respectively.
4. Calculate the variation of the partition entropy:

$$\Delta H_N(j) = \frac{H_N(U_j) - H_N(U_{j-1})}{H_N(U_{j-1})} \tag{3.13}$$

5. If the variation of the partition entropy is negative, or positive but not exceeding the predefined limit, update the matrix $V$ of cluster centres according to equation (3.10); otherwise discard the new matrix $U_j$ and keep the previous matrix $U_{j-1}$ and the corresponding cluster centres $V$ unchanged until the next step.

Allowing a small positive increase of the partition entropy has the objective of following the temporal development of objects, even if this does not always improve the partition. Negative variations always lead to an update of the cluster centres, since they indicate a crisper partition.

The updating procedure with the evaluation of the partition entropy at each step provides more control over the displacement of cluster centres. Bad objects and outliers do not affect the clustering results, and only good objects can influence the location of cluster centres. The classifier can follow slow gradual changes of the cluster structure over time, but it lacks a mechanism for adjusting to abrupt changes characterised by a change in the number of clusters.

3.2.3 Adaptation of the classifier

Compared to the two approaches presented above, a more flexible approach to designing a classifier so that it can automatically recognise temporal changes in the cluster structure is its adaptation according to the detected changes.
For this purpose, the adaptation process must be coupled with the result (status) of the monitoring process in order to obtain a flexible dynamic classifier.
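Before turning to status-driven adaptation, the acceptance test at the core of the entropy-controlled update (Algorithm 3 above) can be made concrete. The following is a minimal sketch, not the cited implementation: the function names and the treatment of a previous entropy of exactly zero are assumptions introduced here, and the membership and centre updates themselves are left out.

```python
def entropy_variation(h_prev, h_new):
    """Relative change of the partition entropy between two steps."""
    return (h_new - h_prev) / h_prev


def accept_update(h_prev, h_new, limit):
    """Decide whether the cluster centres should be updated.

    Negative variations (a crisper partition) are always accepted;
    positive ones only if they do not exceed `limit`.
    """
    if h_prev == 0.0:
        # Edge case not covered by the text: the previous partition was
        # already crisp, so (by assumption) accept only if it stays crisp.
        return h_new == 0.0
    return entropy_variation(h_prev, h_new) <= limit


# A small entropy increase is tolerated, a large one is rejected.
print(accept_update(0.50, 0.52, limit=0.05))
print(accept_update(0.50, 0.60, limit=0.05))
```

When `accept_update` returns False, the new membership matrix would be discarded and the previous centres kept until the next object arrives.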

In [Nakhaeizadeh et al., 1997] and [Lanquillon, 1997] a number of general models for adapting dynamic classifiers in supervised learning were proposed. Although they were implemented using statistical or neural network classifiers, most of the proposed ideas have a general character and could be applied in supervised as well as in unsupervised dynamic pattern recognition systems with some modifications. According to [Nakhaeizadeh et al., 1997], two approaches to the adaptation of a classifier can be distinguished:
1. adaptation by explicitly changing the classifier's structure, or
2. adaptation by changing the training data set that is used to design the classifier.

In the first approach the classifier is adapted based on the most recent objects according to the result of the monitoring process. Depending on the changes detected during the monitoring process, it could be necessary either to incrementally update the classifier or to re-learn the classifier using a new training data set. The main idea of this flexible approach is that the decision about the appropriate update of the classifier is controlled by the monitoring status and the parameters of the classifier depend on the current values of the monitored characteristics. In developing an incremental learning algorithm for the classifier, the goal is to moderately adapt the classifier to gradual changes based on the most recently observed objects in such a way that the previously learned information (old classifier) is reused. Also, if the classifier must be re-learned in the case of serious changes, it seems reasonable to reuse the old classifier in some way, for instance as the initialisation for the learning algorithm. Recall that the alternative simple approach for adapting the classifier to structural changes is to re-learn the classifier completely after each batch of new data. This approach will be referred to as the conventional updating approach in the following.
An advantage of the flexible incremental approach over the conventional one can be seen by comparing the performance of both approaches applied to different classifiers, as will be shown at the end of this section.

Another strategy for the adaptation of a classifier is concerned with the update of the training data set, which is sometimes referred to as a template set. This approach aims at obtaining a currently relevant training set that can be used to design a classifier. The requirements for the choice of the training data set are usually formulated as follows:
- it must be as small as possible to allow easy and fast classifier design,
- it must be representative, i.e. training data must contain good prototypes of each cluster so that the classifier designed using this training set has a good discriminating ability.

In static pattern recognition the training data set is chosen only once before the classifier design and remains unchanged during classification (the training set is not used any more). In dynamic pattern recognition it may be necessary to re-learn the classifier as time passes if its performance decreases due to structural changes. In this case the initial training set may not be representative any more since it was designed before structural changes occurred. Thus, it is important to update the training set over time by including new objects. A simple approach would be to include all new objects arriving over time into the training set. However, the training set would rapidly grow in this case, making the classifier design very time-consuming, and it would possibly contain many insignificant objects that cannot improve the classifier performance. Therefore, a better approach is to select carefully only representative objects for the training set. The problems that arise in choosing the training set in dynamic pattern recognition can be formulated as follows:
- to reduce the constantly growing set of objects,
- to distinguish between relevant and irrelevant objects at the current moment.

In order to control the size of the training set, the concept of the moving time window or the more general concept of the template set can be used. The most representative and currently relevant objects for the training set can be chosen by applying the concept of usefulness. These concepts represent the main elements for updating the training set. In the following, different adaptation procedures for updating the classifier or the training set will be presented and discussed based on the results of their application to different classifiers.

Learning from statistics approach

In Section 3.1.1 the monitoring process based on the modification of Shewhart quality control charts was described. The outcome of the monitoring procedure is one of three states: 'okay', 'warning' or 'action'. Suppose that new objects are provided and examined in groups, which are called batches. The size of a batch is defined by the number of objects in the batch and is supposed to be constant. The following adaptation procedure for updating the classifier depending on the current monitoring status was proposed in ([Kunisch, 1996], [Nakhaeizadeh et al., 1997]) for supervised classification algorithms and called 'learning from statistics':

Status 'okay': The state of the process or system is stable. Therefore, there is no need to change the classifier.
Status 'warning': Gradual changes of the system's state are assumed. In this situation different updating strategies can be applied to slightly adapt the classifier to suspected changes. In particular, procedures based on incremental learning are recommended where the classifier is updated based on the most recent objects. The specific procedure for incremental learning depends on the applied pattern recognition method. Some of them were discussed above. In [Nakhaeizadeh et al., 1997] it was proposed to establish an adapted classifier as a weighted linear combination of the parameters of the old classifier and a new one, which was learned based only on new objects of the current batch. This adaptation procedure was implemented within statistical classification algorithms where the error rate was monitored. In this case weights are chosen depending on the current error rate, the mean error rate and its standard deviation or on some function thereof. The incremental updating of classifiers based on neural networks is carried out by computing parameters of the network depending on the current monitoring status and the current values of the monitoring characteristic. For the multi-layer perceptron (MLP) implemented in [Lanquillon, 1997], two parameters must be updated for learning: the current learning rate and the current number of cycles for the backpropagation algorithm [Nauck et al., 1996]. They are calculated proportional to predefined maximum values using a piecewise linear weighting function of the current error rate, the mean error rate and its standard deviation. Moreover, to obtain a really incremental learning algorithm the connection weights of the current MLP are used as a starting solution for the backpropagation algorithm instead of random initialisation of the MLP.

Status 'action': A serious change in the system's state is detected. The best strategy is to re-learn the classifier from scratch based on the most recent objects, since the current classifier was designed based on objects which are not representative any more. However, according to [Nakhaeizadeh et al., 1997] the new classifier may not be very successful because a set of new objects might not be very representative, especially if this set is small and contains many outliers. It seems reasonable to re-learn the classifier based on the updated training set containing some old and new objects. Possible approaches for updating the training set are described below.

The dynamic pattern recognition process using the learning from statistics adaptation procedure is illustrated in Figure 3-7.
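The status-dependent update of a parametric classifier described above can be sketched as follows. This is only an illustration under stated assumptions: the names `blend_weight` and `adapt_parameters` are hypothetical, and the concrete weighting function (piecewise linear between a warning limit at µ + 2σ and an action limit at µ + 3σ) is an assumption, since the text only says that the weights depend on the current error rate, its mean and its standard deviation.

```python
def blend_weight(f_curr, mu, sigma, k_warn=2.0, k_action=3.0):
    """Hypothetical piecewise linear weight in [0, 1] for the new classifier.

    The weight is 0 at the assumed warning limit mu + k_warn * sigma and
    1 at the assumed action limit mu + k_action * sigma.
    """
    lower = mu + k_warn * sigma
    upper = mu + k_action * sigma
    if upper <= lower:
        return 1.0
    w = (f_curr - lower) / (upper - lower)
    return min(1.0, max(0.0, w))


def adapt_parameters(old, new, status, weight):
    """Combine old and newly learned parameter vectors by monitoring status."""
    if status == "okay":
        return list(old)       # stable system: keep the classifier
    if status == "action":
        return list(new)       # serious change: use the re-learned classifier
    # 'warning': weighted linear combination of old and new parameters
    return [(1.0 - weight) * o + weight * n for o, n in zip(old, new)]


w = blend_weight(f_curr=0.125, mu=0.10, sigma=0.01)
print(adapt_parameters([0.0, 2.0], [2.0, 4.0], "warning", w))
```

For a neural network the same weight could scale the learning rate and the number of training cycles relative to their predefined maxima instead of blending parameters.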

Figure 3-7: Adaptation of the classifier based on the learning from statistics approach [Lanquillon, 1997, p. 54]
[Flow diagram. Blocks: test data set, true class labels, initial training set, classifier, monitor, cross validation (estimated parameters). Status branches: 'okay' (no update), 'warning' (adapt), 'action' (relearn).]

The classifier is designed based on the initial training data set. The monitoring procedure is initialised with estimates for the expected value µ of the characteristic f and its standard deviation σ. These estimates are obtained using k-fold cross validation as described in Section 3.1.1 if the characteristic f to be monitored is based on the classification results (e.g. the error rates). If the characteristic f is based only on objects, it is sufficient to split the training set into k subsets and to evaluate the expected value µ and its standard deviation σ using the estimates of characteristic f on these subsets without learning the classifier. Since monitoring of the error rates was chosen in [Nakhaeizadeh et al., 1997] and [Lanquillon, 1997], the process of estimating the expected value µ and its standard deviation σ is referred to as cross validation.

After the classifier is designed based on the initial training set, it is applied to the classification of new objects which are presented in batches. The results of the classification of the current batch are analysed within the monitoring procedure, which is performed by the monitor. Using the true class labels of the current batch, the monitor determines the current value of characteristic f and provides the monitoring status. The updating of the classifier is carried out depending on this status. The cycle is completed by updating the expected value µ and its standard deviation σ depending on the current value of f. If there was an 'action' status, µ and σ are re-estimated by cross validation as for the first batch. The cycle, including classification of a current batch, monitoring and updating of the classifier, is called a working-and-learning cycle.
This flexible adaptation procedure for updating the classifier, combined with the monitoring of the error rates, can be considered as a general alternative to the conventional approach of re-learning the classifier after each batch.
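The working-and-learning cycle can be outlined in code. The sketch below is a simplified skeleton rather than the cited system: the callables `classify`, `update` and `estimate` are hypothetical placeholders, the monitored characteristic is taken to be the batch error rate, and the limits µ + 2σ ('warning') and µ + 3σ ('action') are an assumption in the spirit of Shewhart charts. The incremental update of µ and σ in the 'okay' and 'warning' states is omitted because its rule is not given here; only the re-estimation after 'action' is shown.

```python
def monitoring_status(f_curr, mu, sigma, k_warn=2.0, k_action=3.0):
    """Map the monitored characteristic to 'okay', 'warning' or 'action'."""
    if f_curr > mu + k_action * sigma:
        return "action"
    if f_curr > mu + k_warn * sigma:
        return "warning"
    return "okay"


def working_and_learning(batches, classify, update, estimate, mu, sigma):
    """Run the cycle over batches of (objects, true_labels).

    classify(objects) -> predicted labels
    update(status, objects, labels) adapts or re-learns the classifier
    estimate(objects, labels) -> (mu, sigma), e.g. by cross validation
    """
    history = []
    for objects, labels in batches:
        predicted = classify(objects)
        errors = sum(p != t for p, t in zip(predicted, labels))
        f_curr = errors / len(labels)            # batch error rate
        status = monitoring_status(f_curr, mu, sigma)
        update(status, objects, labels)          # no-op for status 'okay'
        if status == "action":
            mu, sigma = estimate(objects, labels)
        history.append((f_curr, status))
    return history
```

A caller would supply its own classifier behind `classify` and `update`; the returned history makes it easy to inspect when the monitor switched states.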

In the following, three adaptation procedures based on different concepts of updating the training data set are described in the framework of supervised learning. These general concepts can, however, be adjusted for unsupervised algorithms as well, since the classifier design (at the beginning of the pattern recognition process or due to re-learning during dynamic pattern recognition) always requires the training data set.

Learning with a moving time window

In the learning from statistics approach only the current batch of objects is used to update or re-learn the classifier. Alternatively one can apply a moving time window of constant or variable length, which is used for updating the classifier. As stated in Section 2.3.2, the performance of the classifier depends crucially on the length of the time window. If the window is too small, then relevant objects are discarded too early and the classifier may not be very reliable. If the window is too large, then the update of the classifier according to structural changes may be too slow. The optimal length of the window can be chosen depending on the classification problem under consideration and the underlying system dynamics. A more flexible approach is to apply a variable window length. This idea was introduced in [Widmer, Kubat, 1993, 1996] where a window adjustment heuristic was used to determine the optimal window size. Following this idea, [Lanquillon, 1997] proposed to control the window size by a heuristic depending on the current monitoring status in the pattern recognition system. Suppose that the length (or size) of the moving time window is defined by the number n of the most recent batches. Denote the minimum and the maximum window size by $n_{min}$ and $n_{max}$, respectively. The procedure for adjustment of the current window size $n_{curr}$ is as follows:

Status 'okay': The performance of the classifier is sufficient and no changes are suspected. If the current window size is smaller than the maximum size, $n_{curr} < n_{max}$, then $n_{curr}$ is increased by 1 in order to obtain a more representative training set.
Otherwise, the current window size remains unchanged.

Status 'warning': Gradual changes of the system's state are suspected. If the current window size is larger than the minimum size, $n_{curr} > n_{min}$, then $n_{curr}$ is decreased by 1 in order to allow faster adaptation of the classifier to suspected changes.

Status 'action': A serious change of the system's state is detected. The classifier has to be re-learned based on the most recent objects, whereas old, no longer representative objects have to be discarded. Therefore, the training set should be reduced by setting the current window size to its minimum, $n_{curr} = n_{min}$.

Summarising, the current window size is determined by the function:

$$n_{curr} = N(n_{curr}, n_{min}, n_{max}, \tau) = \begin{cases} \min(n_{curr} + 1,\ n_{max}) & \text{if } \tau = \text{'okay'} \\ \max(n_{curr} - 1,\ n_{min}) & \text{if } \tau = \text{'warning'} \\ n_{min} & \text{if } \tau = \text{'action'} \end{cases} \qquad (3.14)$$

In the case of $n_{min} = n_{max}$ a moving window of fixed size is obtained. The adaptation procedure based on the learning with a moving window approach can be improved by keeping all representative objects in the template set even if they become outdated compared to the most recent ones.

Learning with a template set

The concept of a template set [Gibb, Auslander, Griffin, 1994] is a generalisation of the moving time window approach. According to the latter, the training data set is composed of the n most recent batches, although not all objects may be considered as representative, e.g. some objects may be noisy and thus must not be accepted into the template set. A general template set is designed by the careful selection of representative examples and may contain any observed object. If the template set becomes too large, older or contradictory examples have to be discarded. A template set is characterised by two parameters: its size and a criterion for including examples in the template set. For choosing the size of the template set the same considerations as for the choice of the time window size can be applied (Section 2.3). Moreover, if the template set is too large and the criterion for including new examples is too strict, the adaptation of the classifier to structural changes will be very slow or even impossible. On the other hand, if the template set is too small and the criterion is too soft, then new objects have a strong effect on the adaptation procedure while old relevant objects are discarded too early and the classifier may not be very representative. The formulation of a suitable criterion for including new objects in the template set (or generally to adjust the template set) is very difficult since it is almost impossible to decide whether new objects are representative or not if structural changes have occurred.
A decision can be taken based on the accuracy of classification compared to a predefined threshold. For instance, in [Kunisch, 1996] the probability of membership to the true class determined by a statistical classifier is compared to a certain threshold value. This criterion allows only correctly classified objects to be included in the template set. The drawback of this criterion is, however, that objects that are representative for the template set but are classified wrongly by the current classifier have no chance to be selected. For instance, in the case of gradual changes in a system state misclassified objects are not added to the template set, although they can represent a tendency towards a structural change [Nakhaeizadeh et al., 1996]. Therefore, the adaptation to such changes will not be possible. To avoid this drawback, the adjustment of the template set can be coupled with the monitoring status ([Kunisch, 1996], [Lanquillon, 1997]) as follows:

Status 'okay': The performance of the classifier is sufficient and no changes are suspected. New objects can be included in the template set.

Status 'warning': Gradual changes in the system's state are suspected. This case is treated depending on the previous monitoring status (after classification of the previous batch of data).
- If the previous status was 'okay', the new status can be caused either by gradual changes in the system's state or by noise. Thus, it is preferable to adjust the template set as in the case of status 'okay'. If gradual changes have indeed occurred, this will be confirmed after classification of the next batch of data and then appropriate actions will be taken.
- If the previous status was 'warning', status 'warning' has been detected twice, confirming the assumption of gradual changes. The template set is adjusted according to status 'action' (explained below).
- If the previous status was 'action', the corresponding actions were applied and the template set consists of new objects from the previous batch. The current status 'warning' indicates decreasing oscillations after detected serious changes and reactions to them. There is no need to substitute the template set and therefore this case is treated as status 'okay'.

Status 'action': A serious change in the system's state is detected. The classifier has to be re-learned based on the most recent objects, since the current template set is not representative any more. Consequently, the complete template set is discarded and designed anew by including objects of the most recent batches.

The concept of the template set is more complicated than that of a moving time window but it can provide a better quality of the training data set for the adaptation of the classifier.
The adaptation of the classifier according to the learning with a template set approach is illustrated in Figure 3-8, where the initial training phase and the estimation of parameters for the monitoring process by cross validation are not depicted for the sake of simplicity.
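Both set-based schemes discussed so far are driven by the monitoring status. As a compact illustration, the window-size rule of the moving time window approach and the status handling for the template set could be written as below. The function names and the string encoding of the statuses are choices made for this sketch, not part of the cited procedures.

```python
def next_window_size(n_curr, n_min, n_max, status):
    """Window size (in batches) for the next cycle, given the status."""
    if status == "okay":
        return min(n_curr + 1, n_max)   # grow towards a more representative set
    if status == "warning":
        return max(n_curr - 1, n_min)   # shrink to adapt faster
    if status == "action":
        return n_min                    # keep only the most recent batches
    raise ValueError(f"unknown status: {status!r}")


def template_operation(status, previous_status):
    """Return 'reset' or 'filter' for the template set.

    'reset'  : discard the template set and rebuild it from recent batches
    'filter' : add the representative objects of the new batch
    """
    if status == "action":
        return "reset"
    if status == "warning" and previous_status == "warning":
        return "reset"                  # gradual change confirmed
    return "filter"                     # 'okay', or an unconfirmed 'warning'


n = 3
for status in ["okay", "okay", "warning", "action", "okay"]:
    n = next_window_size(n, n_min=2, n_max=5, status=status)
print(n)
```

Note that a 'warning' following an 'action' is deliberately treated like 'okay', which mirrors the damping of oscillations described in the text.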

Figure 3-8: Adaptation of the classifier based on the learning with a template set approach [Lanquillon, 1997, p. 59]
[Flow diagram. Blocks: test data set, true class labels, classifier, monitor, filter, template set. Branches: with status 'okay' or 'warning' the batch passes the filter, which adds good examples to the template set and discards bad examples; with status 'action' or a double 'warning' the template set is reset; old examples are dropped; learning parameters are updated.]

After classification of a new batch of data, the results are presented to the monitor where the current value of the monitored characteristic is evaluated and the current monitoring status is determined. If status 'action' is detected or status 'warning' has already appeared twice, the current template set is discarded and substituted by objects of the last $n_{min}$ batches (this operation is referred to as 'reset' in Figure 3-8). Otherwise, the new batch of objects is filtered to separate good and representative examples from bad and irrelevant ones. For each object the filter checks the criterion of inclusion into the template set and objects fulfilling this criterion are added to the template set. All other objects in the batch are discarded. If the size of the template set exceeds the maximum size defined by $n_{max}$ times the batch size, old examples of the template set are also discarded. The current classifier is then updated using the current template set as training data. The updating procedure (adaptation or re-learning) depends on the monitoring status and is performed in the same way as in the learning from statistics approach described above. However, the training set is carefully selected and contains only representative and currently relevant objects. In order to improve the approach based on the template set, the criterion for including new objects in the template set can be extended by the concept of usefulness described in the next section.
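The template set update just described (reset, filter, and removal of old examples) can be sketched in a few lines. The function below is illustrative only: its name and signature are hypothetical, the inclusion criterion is passed in as a predicate `is_representative`, and the template set is assumed to be stored oldest first so that trimming removes the oldest examples. Removal of contradictory examples is not modelled.

```python
def update_template_set(template, recent_batches, batch, operation,
                        is_representative, n_min, n_max, batch_size):
    """Return the new template set after one batch.

    template       : list of objects, oldest first
    recent_batches : list of recent batches (oldest first, including `batch`)
    operation      : 'reset' or 'filter', decided from the monitoring status
    """
    if operation == "reset":
        # Discard everything and rebuild from the last n_min batches.
        template = [obj for b in recent_batches[-n_min:] for obj in b]
    else:
        # Keep only the objects that satisfy the inclusion criterion.
        template = template + [obj for obj in batch if is_representative(obj)]
    max_size = n_max * batch_size
    if len(template) > max_size:
        template = template[-max_size:]   # drop the oldest examples
    return template


template = update_template_set(
    [1, 2, 3, 4], [[3, 4], [5, 6]], [5, 6], "filter",
    is_representative=lambda obj: obj % 2 == 0,   # toy criterion
    n_min=1, n_max=2, batch_size=2,
)
print(template)
```

In practice `is_representative` would encapsulate a criterion such as the thresholded class membership mentioned earlier.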

Learning with a record of usefulness

The idea of the concept of usefulness is to give each object a weight representing its usefulness. As time passes the weight is updated according to the usefulness of the object for classification. If the example is useful its weight is increased, otherwise it is decreased. If the weight falls below a given threshold the example is discarded. In dynamic pattern recognition it is supposed that older examples become less useful as time passes. This process is referred to as ageing of objects. Different approaches have been proposed to define the usefulness of an example.

In [Nakhaeizadeh et al., 1996] a dynamic 1-nearest neighbour classifier is developed based on the record of usefulness characterising the training data set. In this approach, the nominal weight of an example when it is first observed is set to 1. As time passes, objects age at a rate of λ. Weights of objects are adapted depending on newly classified objects according to the following procedure:
1. If a new object is correctly classified by its nearest neighbour but misclassified by its second nearest neighbour, then the weight of the nearest neighbour is increased by δ.
2. If a new object is misclassified by its nearest neighbour but correctly classified by its second nearest neighbour, then the weight of the nearest neighbour is decreased by ε.
3. Otherwise, the weight is left unchanged.

The drift of the weight of example x due to new objects is defined by the function:

$$\mathrm{Drift}(w(x)) = -\lambda + \delta\, p(x) - \varepsilon\, q(x), \qquad (3.15)$$

where p(x) is the probability that x will be the nearest neighbour in the first step of the procedure, and q(x) is the probability that x will be the nearest neighbour in the second step of the procedure. It is stated in [Nakhaeizadeh et al., 1996] that the weight of each example behaves like a Markov chain with an absorbing barrier at 0, whose properties can be studied in order to choose suitable values for λ, δ, and ε.
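The weight update for one newly classified object can be sketched as follows. This is an illustration under several assumptions that the text does not fix: examples are stored as dictionaries, squared Euclidean distance is used, ageing by λ is applied once per new object, examples are dropped when their weight is no longer above a threshold of 0, and the new object joins the set with weight 1. The function name `update_weights` is hypothetical.

```python
def update_weights(examples, new_point, new_label, lam, delta, eps, threshold=0.0):
    """Update example weights after observing one labelled object.

    examples: list of dicts with keys 'x' (tuple), 'label', 'weight';
    the dicts are modified in place. Returns the surviving examples plus
    the new object (weight 1).
    """
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    if len(examples) >= 2:
        ranked = sorted(examples, key=lambda e: dist2(e["x"], new_point))
        first, second = ranked[0], ranked[1]
        first_ok = first["label"] == new_label
        second_ok = second["label"] == new_label
        if first_ok and not second_ok:
            first["weight"] += delta    # nearest neighbour was decisive
        elif second_ok and not first_ok:
            first["weight"] -= eps      # nearest neighbour was misleading
    for e in examples:
        e["weight"] -= lam              # ageing of all stored examples
    survivors = [e for e in examples if e["weight"] > threshold]
    survivors.append({"x": new_point, "label": new_label, "weight": 1.0})
    return survivors


stored = [
    {"x": (0.0,), "label": "a", "weight": 1.0},
    {"x": (1.0,), "label": "b", "weight": 1.0},
    {"x": (5.0,), "label": "b", "weight": 0.05},
]
stored = update_weights(stored, (0.2,), "a", lam=0.1, delta=0.5, eps=0.5)
print([(e["x"], round(e["weight"], 2)) for e in stored])
```

In this toy run the nearest neighbour is rewarded, every stored example ages, and the example whose weight reaches the barrier is removed.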
Assuming the existence of an equilibrium state, the expected time until an example x is discarded and the expected number of examples in the template set at any point of time can be evaluated. This approach to determining the usefulness of an example is rather simple since there is no learning process in the classification algorithm and a new example is classified by comparing it to all examples of the template set.

If procedures for updating the training set and the classifier can be distinguished, the concept of usefulness can be applied in different ways to both cases. Since the definition of usefulness depends on the chosen classifier, it should be considered separately for each type of classifier. In the case of prototype-based classifiers the purpose of applying the concept of usefulness for updating the classifier is to assess the selected prototypes, in this way possibly detecting changes in the cluster structure. If the average usefulness of the set of prototypes is defined in some way, then a significant decrease of this value can indicate structural changes. For updating the training data set the problem is to evaluate the usefulness of each example.

In statistical classification algorithms, the usefulness of an observation is related to the frequency of its occurrence. The following definition of usefulness is proposed in [Nakhaeizadeh et al., 1998] for the case of a binary classification problem and on the assumption of a finite set of possible feature values. An observation is considered to be very useful for establishing a classifier if it frequently appears in one class because it is very likely to occur frequently in the future. However, if an observation has occurrences in both classes it is considered to be less useful, even though it may appear frequently. In the extreme case, if the observation appears equally frequently in both classes, its true class membership is very uncertain and this observation is rather useless. Thus, the more frequently an observation appears in only one class the more useful it is. The usefulness u(x) of an observation x is defined formally as the (absolute) difference between the frequencies of appearance of observation x in both classes:

$$u(x) = |n_1(x) - n_2(x)|, \qquad (3.16)$$

where $n_i(x)$ denotes the number of occurrences of observation x in class i, i = 1, 2, within a certain set of examples. This set could include either all available examples or it could be designed by applying the time window approach. Each observation x, which is considered as an example, is characterised by the label of the class in which it has its maximum number of occurrences and by its usefulness u(x). The list of examples sorted by u(x) is called the record of usefulness.
It can be employed to select the most representative examples for the template set by choosing either the first N examples from the record or examples whose usefulness u(x) exceeds a predefined threshold. To express the fact that examples in a dynamic classification system grow old and become less useful in the course of time, an ageing function (e.g. an exponential decay), or a forgetting factor, can be applied to the examples. If the usefulness of an example falls below a predefined threshold due to ageing, this example is discarded. The forgetting factor λ can be either constant or variable depending on the performance of the classification system. In [Nakhaeizadeh et al., 1997] the forgetting factor, as well as the record of usefulness, depends on the monitoring status as follows:

Status 'okay': The monitored process is under control. The ageing of examples seems unnecessary, therefore the forgetting factor is set to a minimum value $\lambda_{min}$. The current record of usefulness is aged by the factor $1 - \lambda_{min}$ ($0 \le \lambda_{min} \le 1$). Since the performance of the classifier is sufficient, the value of $\lambda_{min}$ should be rather small, keeping the record of usefulness almost unchanged.

Status 'warning': A gradual change of the system's state is suspected. Moderate ageing could be appropriate and the forgetting factor is set to a value between the minimum $\lambda_{min}$ and maximum $\lambda_{max}$ values such that $0 \le \lambda_{min} \le \lambda \le \lambda_{max} \le 1$. The current forgetting factor is defined as a linear function of the monitoring parameters (the current value of the monitored characteristic, its expected value and its standard deviation):

$$\lambda_{curr} = \lambda_{min} + g(f_{curr}, \mu, \sigma, \tau)\,(\lambda_{max} - \lambda_{min}). \qquad (3.17)$$

The record of usefulness is aged by the factor $1 - \lambda_{curr}$.

Status 'action': Serious changes in the process under consideration are assumed. The forgetting factor is set to the maximum value $\lambda_{max}$ or, alternatively, the current training set is completely discarded as being no longer representative. In this case a new record of usefulness is determined from the observations of the most recent batch.

In order to maintain a constant size of the training set, the old observations that are discarded must be replaced by an equal number of new observations. This equilibrium can be achieved by choosing the minimum forgetting factor depending on the maximum size of the training set:

$$\lambda_{min} = \frac{1}{n_{max} \cdot \mathit{batch\_size}}, \qquad (3.18)$$

where $n_{max}$ is the number of most recent batches used to update the training set and batch_size denotes the number of objects in one batch.

The procedure of adaptation of the classifier based on updating the training set with the record of usefulness is illustrated in Figure 3-9, where the initial training phase and the estimation of parameters for the monitoring process by cross validation are not depicted for the sake of simplicity. Also the initialisation of the current record of usefulness is not presented in the figure. The record of usefulness is established from the data set based on the frequencies of occurrences of single observations as described above.
Each example in the record of usefulness is characterised by a weight (value of usefulness) and a class label.
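A record of usefulness for a binary problem, its ageing, and the merging of an aged record with the record of a new batch (the step used in the working-and-learning cycle) can be sketched as below. All function names are hypothetical. The sketch assumes hashable observations, class labels 1 and 2, a record stored as a dictionary rather than a sorted list, and, where the text is silent, that a merged observation with conflicting labels keeps the label with the larger usefulness. The value `g` stands for the unspecified weighting function of the monitoring parameters and is simply passed in as a number in [0, 1].

```python
from collections import Counter


def build_record(observations):
    """observations: iterable of (x, label), label in {1, 2}.
    Returns {x: (majority label, usefulness)}."""
    counts = {1: Counter(), 2: Counter()}
    for x, label in observations:
        counts[label][x] += 1
    record = {}
    for x in set(counts[1]) | set(counts[2]):
        n1, n2 = counts[1][x], counts[2][x]
        record[x] = (1 if n1 >= n2 else 2, abs(n1 - n2))
    return record


def forgetting_factor(status, g, lam_min, lam_max):
    """Status-dependent forgetting factor; g in [0, 1] is assumed given."""
    if status == "okay":
        return lam_min
    if status == "action":
        return lam_max
    return lam_min + g * (lam_max - lam_min)


def age_record(record, lam, threshold=0.0):
    """Multiply every usefulness by (1 - lam) and drop faded examples."""
    aged = {x: (label, (1.0 - lam) * u) for x, (label, u) in record.items()}
    return {x: v for x, v in aged.items() if v[1] > threshold}


def merge_records(aged, new, threshold=0.0):
    """Merge an aged record with the record of the current batch."""
    merged = {}
    for x in set(aged) | set(new):
        if x in aged and x in new:
            (la, ua), (lb, ub) = aged[x], new[x]
            if la == lb:
                label, u = la, ua + ub              # same class: add up
            else:                                   # conflict: difference
                label, u = (la if ua >= ub else lb), abs(ua - ub)
            if u > threshold:
                merged[x] = (label, u)
        else:
            merged[x] = aged[x] if x in aged else new[x]
    return merged


record = build_record([("a", 1), ("a", 1), ("a", 2), ("b", 2)])
aged = age_record(record, lam=0.5)
merged = merge_records(aged, {"a": (2, 2), "c": (1, 1)})
print(sorted(merged.items()))
```

Sorting the dictionary items by usefulness would give the ordered record from which the first N examples, or all examples above a threshold, can be selected.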

Figure 3-9: Adaptation of the classifier based on the record of usefulness [Lanquillon, 1997, p. 6]
[Flow diagram. Blocks: test data set, true class labels, classifier, monitor, current record of usefulness, ageing, merging, new record of usefulness. Branches: with status 'action' the current record is replaced by the new record; with status 'okay' or 'warning' it is replaced by the aged record merged with the new record; learning parameters are updated.]

During the working-and-learning cycle, the results of classification of a new data batch are presented to the monitor, where the current value of the monitored characteristic is evaluated and the current monitoring status is determined. Additionally, a new record of usefulness is evaluated for the new batch. If status 'action' is detected, the current record of usefulness is discarded and replaced by the new record. Otherwise, it is replaced by the aged current record merged with the new record obtained from the current batch. Two records of usefulness are merged in a similar way as a new record of usefulness is obtained from the data set, but instead of the frequencies of occurrence in equation (3.16) the usefulness of an observation is used. That is, if an observation is present in the aged current record and in the new record with different class labels, its usefulness is determined as the absolute value of the difference of the values of usefulness in both records. If an observation appears in both records with the same class label, its usefulness is calculated as the sum of the values of usefulness in both records. In case the obtained usefulness of an observation does not exceed the predefined threshold, this observation is discarded. Observations that appear only in one of the two records are taken into the resulting record without any change.

The forgetting factor for ageing the record of usefulness depends on the monitoring status as described above, but this dependence is not shown in Figure 3-9 for the sake of simplicity. Before the merging process, the record of usefulness is aged by the factor $1 - \lambda_{curr}$, where the forgetting factor $\lambda_{curr} \in [0, 1]$. A forgetting factor equal to zero causes no ageing of the record of usefulness, whereas the record will be completely forgotten and discarded if the forgetting factor is set to one.

The classifier is updated according to one of the procedures described above: the flexible adaptation procedure based on the learning from statistics approach or the conventional procedure based on re-learning the classifier after each batch. As already mentioned, the updating procedure depends crucially on the chosen classification algorithm.

Evaluation of approaches for the adaptation of a classifier

The four approaches for adaptation presented above, which update either the classifier itself or the training data set, were implemented in dynamic classification systems based on multi-layer perceptron and on radial basis function (RBF) networks in [Lanquillon, 1997]. These approaches were investigated either separately or in combination. In the first case, the adaptation with the learning from statistics approach was based only on the most recent batch and the procedures for updating the training data set were applied together with the conventional approach for updating the classifier. Alternatively, the updating procedures for the training set were combined with flexible updating of the classifier based on the learning from statistics approach. The application of these different configurations of a dynamic classification system to real-world data from the credit industry has shown that these approaches applied separately improve the performance of both neural network classifiers compared to no-learn (static) or conventional re-learning approaches. However, combining the approach for updating the training set with the flexible updating procedure for the classifier does not increase the classifier performance. In particular, the methods for updating the training set perform better when they are combined with the conventional approach of re-learning the classifier after each batch [Lanquillon, 1997]. According to the author, the explanation for such behaviour can be seen in the fact that the older relevant information is processed twice.
On the one hand, the updated training set contains enough relevant objects (also older ones) to successfully learn the classifier. On the other hand, the older information is preserved by the classifier and only new objects are required for the flexible updating of the classifier. In this case, it is unnecessary to have a representative training set. Therefore, using the combination of approaches for updating the classifier and the training set gives older examples increased importance, and the performance of the classifier may decrease.

Two of the presented approaches for adapting the classifier, based on learning from statistics and on learning with a template set, were implemented in [Kunisch, 1996, pp. 2-27, 69-73] for the following statistical classifiers: normal-based linear discriminant rule (NLDR), normal-based quadratic discriminant rule (NQDR) and logistic discriminant rule. The performance of neural and statistical classifiers for the same parameter settings was compared in [Lanquillon, 1997], showing that neural classifiers outperform statistical ones in some cases. In particular, the RBF network which is based on the concept of usefulness provides the best results (in terms of error rate) among all tested neural and statistical approaches.

In the following chapter a new algorithm for dynamic fuzzy classifier design will be proposed, which is partly based on the ideas presented in this chapter but also uses a number of novel criteria to establish the monitoring procedure. The proposed algorithm is introduced into the framework of unsupervised learning and allows the design of an adaptive classifier capable of automatically recognising gradual and abrupt changes in the cluster structure as time passes and of adjusting its structure to detected changes. The adaptation law for updating the template set is extended by additional features that should guarantee a more reliable and efficient classifier.


4 Dynamic Fuzzy Classifier Design with Point-Prototype Based Clustering Algorithms

In this section a dynamic fuzzy clustering algorithm is developed, which makes it possible to design a dynamic classifier with an adaptive structure. The main property of a dynamic classifier is its ability to recognise temporal changes in the cluster structure caused by new objects and to adapt its structure over time according to the detected changes. The design of a dynamic fuzzy classifier consists of three main components: the monitoring procedure, the adaptation procedure for the classifier and the adaptation procedure for the training data set. The monitoring procedure consists of a number of heuristic algorithms, which allow the recognition of abrupt changes in the cluster structure such as the formation of a new cluster, the merging of several clusters or the splitting of a cluster into several new clusters. The outcome of the monitoring procedure is a new number of clusters and an estimation of the new cluster centres.

The adaptation law of the classifier depends on the result of the monitoring process. If abrupt changes were detected, the classifier is re-learned with its initialisation parameters obtained from the monitoring procedure. If only gradual changes were observed by the monitoring procedure, the classifier is incrementally updated with the new objects. The adaptation procedure is controlled by a validity measure, which is used as an indicator of the quality of fuzzy partitioning. This means that an adaptation of the classifier is carried out only if it leads to an improvement of the current partitioning. The improvement can be determined by comparing the value of a validity measure for the current partitioning (after re-learning) with its previous value (before re-learning). If the validity measure indicates an improvement of the partitioning, the new classifier is accepted; otherwise the previous classifier is preserved.
This chapter is organised as follows: first, the formulation of the problem of dynamic clustering is derived and the model of temporal development of the cluster structure is introduced. Thereafter, possible abrupt changes in the cluster structure are discussed, and different criteria for their recognition during the monitoring procedure are formulated and then summarised into corresponding heuristic algorithms (Sections ). The adaptation laws for updating the classifier and the template set in response to detected changes are proposed in Sections 4.7 and 4.8. This will be followed by the examination of different validity measures which can be relevant to control the adaptation of the classifier (Section 4.8.2). Finally, the entire algorithm for the dynamic fuzzy classifier design within a learning-and-working cycle is presented.

4.1 Formulation of the Problem of Dynamic Clustering

This section begins with the formulation of the classical (static) clustering problem, which will be used afterwards as a background to derive a formulation of a dynamic clustering problem. Consider a set of objects $X = \{x_1, \ldots, x_N\}$, where each object is represented as an M-dimensional feature vector $x_j = [x_{j1}, \ldots, x_{jM}]$, $j = 1, \ldots, N$. The task of fuzzy clustering is to partition such a set of objects into c clusters (c is an integer, $2 \le c < N$) and to estimate a set of cluster prototypes $V = \{v_1, \ldots, v_c\}$. A fuzzy c-partition is given by a matrix $U = [u_{ij}]$ which satisfies the following conditions [Bezdek, 1981, p. 26]:

$$
\begin{aligned}
&1.\quad u_{ij} \in [0, 1], \quad 1 \le i \le c,\ 1 \le j \le N \\
&2.\quad \sum_{i=1}^{c} u_{ij} = 1, \quad 1 \le j \le N \\
&3.\quad 0 < \sum_{j=1}^{N} u_{ij} < N, \quad 1 \le i \le c.
\end{aligned}
\tag{4.1}
$$

Elements $u_{ij}$ of matrix U denote the degree of membership of object j to cluster i. Due to Condition 1 each object can belong to several clusters with different degrees of membership. Condition 2 requires that the total membership of an object over all clusters be normalised to 1. Condition 3 forbids any cluster to be empty or to contain all objects.

Cluster prototypes, or cluster centres, represent the location of the clusters. The number of clusters is not usually known in advance and is normally determined using cluster validity measures [Zimmermann, 1996, p. 260]. Thus the problem is to determine the number of prototypes and their location so that the obtained partition fits the underlying data structure as precisely as possible. One of the most frequently used criteria in fuzzy clustering is the variance criterion, which for each c measures the dissimilarity between the objects in a cluster and its cluster centre by a distance measure. The variance criterion corresponds to minimising the overall within-group sum of squared errors (variances between objects and cluster prototypes) weighted by degrees of membership of objects to clusters. This clustering criterion is called a minimum variance objective [Bezdek, 1981, p. 47] and yields the following basic formulation of the fuzzy partitioning problem:

$$
\min J_q(U, v) = \sum_{j=1}^{N} \sum_{i=1}^{c} (u_{ij})^q \, \|x_j - v_i\|_A^2, \quad \text{such that } U \in L_{fc},\ v \in \mathbb{R}^{cM}, \tag{4.2}
$$

where

$$
d^2(x_j, v_i) = \|x_j - v_i\|_A^2 = (x_j - v_i)^T A \,(x_j - v_i) \tag{4.3}
$$

is the (squared) distance between each object $x_j$ and a fuzzy prototype $v_i$, A is an (M × M) symmetric positive-definite matrix, $\|\cdot\|_A$ is an inner product norm induced by A on $\mathbb{R}^M$, $L_{fc}$ is the set of all matrices U satisfying conditions (4.1) and $q \in [1, \infty)$ is the weighting exponent. The objective functional (4.2) may be globally minimised only if degrees of membership and cluster prototypes are given by:

$$
v_i = \frac{\sum_{j=1}^{N} (u_{ij})^q \, x_j}{\sum_{j=1}^{N} (u_{ij})^q}, \quad i = 1, \ldots, c \tag{4.4}
$$

$$
u_{ij} = \frac{1}{\sum_{r=1}^{c} \left( \dfrac{d_{ij}}{d_{rj}} \right)^{\frac{2}{q-1}}}, \quad i = 1, \ldots, c,\ j = 1, \ldots, N. \tag{4.5}
$$

Objective function algorithms of form (4.2), where A is the identity matrix and d is the Euclidean distance, are considered the most suitable for problems with hyperspherical clusters or clusters of roughly equal proportions. Varying matrix A makes it possible to obtain clusters of different shapes. Through a modification of the objective functional and equation (4.5) a possibilistic partitioning problem can be obtained.

Below, the static clustering problem will be used as a basis for the formulation of the dynamic clustering problem. Consider a set of dynamic objects $X(t) = \{x_1(t), \ldots, x_N(t)\}$, $t = 1, \ldots, t_p$, given as a temporal sequence of observations. The time interval of observations can in general be unlimited, $t \in [1, \infty)$. Each object is represented by an M-dimensional trajectory in the feature space, which contains a history of temporal development of each feature. Although the process or system variables observed can be continuous in nature, measurements of these variables are usually carried out discretely with a certain sampling rate. Therefore, supposing that dynamic objects are observed at discrete time instants, a trajectory can be given by a discrete vector-valued function of the form:

$$
x_j(t) = [x_j(t_1), x_j(t_2), \ldots, x_j(t_p)]^T, \quad j = 1, \ldots, N, \tag{4.6}
$$

where p is the number of observations in a trajectory and $x_j(t_k)$, $k = 1, \ldots, p$, is an observation of a feature vector at time instant $t_k$. Substituting an M-dimensional feature vector into the components of this function, a matrix representation of a trajectory of a dynamic object is obtained:

$$
x_j(t) =
\begin{bmatrix}
x_{j1}(t_1) & x_{j2}(t_1) & \ldots & x_{jM}(t_1) \\
x_{j1}(t_2) & x_{j2}(t_2) & \ldots & x_{jM}(t_2) \\
\vdots & \vdots & & \vdots \\
x_{j1}(t_p) & x_{j2}(t_p) & \ldots & x_{jM}(t_p)
\end{bmatrix},
\tag{4.7}
$$

where columns correspond to trajectories of single features and rows correspond to feature vectors at a certain time instant. A trajectory which explicitly contains time as an additional feature, so that single features are time-dependent, can be called a time series. Therefore, the problem of dynamic pattern recognition is sometimes referred to in the literature as time-series classification ([Petridis, Kehagias, 1997], [Schreiber, Schmitz, 1997], [Das et al., 1997], [Struzik, Siebes, 1998]). In general, trajectories can describe a dependence of features on another variable implicitly related to time. The dynamic data set X(t) can be viewed as a three-dimensional matrix whose dimensions are objects, features and time (Figure 4-1).

Figure 4-1: 3-dimensional matrix representation of a dynamic data set, with axes for objects $x_1, \ldots, x_N$, features $F_1, \ldots, F_M$ and time instants $t_1, \ldots, t_p$ [adapted from Sato et al., 1997, p. 4]

The task of dynamic fuzzy clustering is to determine the number of clusters c(t), to partition the data set into c(t) clusters and to estimate a set of cluster prototypes $V(t) = \{v_1(t), \ldots, v_{c(t)}(t)\}$ in order to approximate the data structure at time instant t, taking into account the history of temporal development of feature vectors. It should be stressed that the fuzzy partition matrix $U(t) = [u_{ij}(t)]$, as well as the cluster prototypes V(t), evolve temporally as new observations become available. The temporal development of cluster prototypes can be represented by the following model:

$$
V(t) = \alpha(t)\, \Gamma_1[V(t-1)] + (1 - \alpha(t))\, \Gamma_2[V(t-1)], \quad \alpha(t) \in [0, 1], \tag{4.8}
$$

where $\Gamma_1$ is a transformation due to abrupt changes in the cluster structure (formation, merging, splitting or destruction of clusters) and $\Gamma_2$ is a transformation due to gradual temporal changes in the cluster structure (Section 2.3.2).

Transformation $\Gamma_1$ consists of two further transformations: a transformation $\Gamma_3$ regarding a change in the number of clusters and a transformation $\Gamma_4$ regarding a change in the locations of the cluster prototypes. According to the four types of abrupt changes considered here, a change in the number of clusters can be modelled by a linear function:

$$
\Gamma_3[c(t)] = c(t-1) + \beta(t), \quad \beta(t) \in I. \tag{4.9}
$$

Transformation $\Gamma_4$ is a non-linear function depending on a set of unknown parameters which are difficult to estimate. A special case where $\Gamma_4$ is modelled by a linear function, whereas $\Gamma_3$ is not considered at all, is discussed in [Abrantes, Marques, 1998]. There, a set of unknown parameters is identified by the minimisation of a special type of objective function, but the convergence of the optimisation algorithm is not guaranteed. Therefore, a number of heuristic algorithms are proposed in this thesis which are suitable for learning $\Gamma_1(V(t-1))$ in the course of time instead of modelling transformations $\Gamma_3$ and $\Gamma_4$.

Transformation $\Gamma_2$ is obtained from the recursive equation for calculating the cluster prototypes. Suppose that cluster prototypes $V(t_k)$, $k = 1, \ldots, p$, are determined at each time instant $t_k$ according to the equation represented by a ratio:

$$
V(t_k) = \frac{VN(t_k)}{VD(t_k)}, \tag{4.10}
$$

where $VN(t_k)$ and $VD(t_k)$ are components calculated based on objects X(t), $t = 1, \ldots, t_k$, obtained until the time instant $t_k$.
These components can be calculated recursively for each time instant $t_k$ using the values of the components at the previous time instant:

$$
VN(t_k) = VN(t_{k-1}) + \delta(t_k), \tag{4.11}
$$

$$
VD(t_k) = VD(t_{k-1}) + \gamma(t_k), \tag{4.12}
$$

where $VN(t_{k-1})$ and $VD(t_{k-1})$ are components calculated based on objects X(t), $t = 1, \ldots, t_{k-1}$, obtained until the time instant $t_{k-1}$, and $\delta(t_k)$ and $\gamma(t_k)$ are parameter vectors whose calculation is based on the new observations $X(t_k) = [x_1(t_k), \ldots, x_N(t_k)]$ of objects at time instant $t_k$ and the degrees of membership $u_{ij}(t_k)$ of objects $X(t_k)$ to the clusters, and whose components are given by:

$$
\delta_i(t_k) = \sum_{j=1}^{N} u_{ij}^{q}(t_k)\, x_j(t_k), \tag{4.13}
$$

$$
\gamma_i(t_k) = \sum_{j=1}^{N} u_{ij}^{q}(t_k), \quad i = 1, \ldots, c. \tag{4.14}
$$

Thus transformation $\Gamma_2$ for calculating the cluster prototypes can be written as:

$$
\Gamma_2(V(t-1)) = \frac{VN(t-1) + \delta(t)}{VD(t-1) + \gamma(t)}, \quad t = 1, \ldots, t_p. \tag{4.15}
$$

In the following sections an algorithm for dynamic fuzzy clustering based on the adaptation process for estimating cluster prototypes V(t) over time is proposed. The adaptation law for cluster prototypes combines transformation $\Gamma_2$ for the case of gradual changes in the cluster structure with a number of heuristic techniques for the case of abrupt changes. The heuristic algorithms proposed look for indicators of structural re-organisation of the data, allowing the monitoring of temporal changes in the data structure. Since these algorithms do not provide an optimal solution to the dynamic fuzzy clustering problem, it is necessary to solve a static clustering problem at the current time instant after abrupt changes have appeared. In order to take into account the history of the temporal development of objects, a clustering algorithm is applied either to a set of most representative objects selected from the complete history or to trajectories of objects. In the latter case the clustering algorithm must be based on a dissimilarity measure for trajectories. At this point the problem arises of choosing an appropriate clustering algorithm which can be used during the adaptation process for solving the dynamic clustering problem; it will be addressed in the next section.

Based on the above considerations the following requirements concerning the performance of a dynamic fuzzy classifier can be formulated:
1. A dynamic classifier must be able to recognise gradual changes in the cluster structure and adapt its structure to these changes,
2. A dynamic classifier must be able to recognise abrupt changes in the cluster structure and adapt its structure to these changes,
3. Re-learning of a classifier and classification must take into account the history of temporal development of objects,
4. The training data set must be preserved over time and completed with the most representative new objects,
5. The adaptation of a classifier must lead to an improvement of the fuzzy partition.
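The gradual adaptation described by equations (4.10) to (4.15) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: memberships are computed with equation (4.5) for the Euclidean case (A is the identity matrix) against the prototypes valid at the start of the batch, and the names `fcm_memberships`, `RecursivePrototypes` and the parameter `initial_weight` (how strongly the initial prototypes count in VN and VD) are hypothetical.

```python
import math


def fcm_memberships(x, prototypes, q=2.0):
    """Probabilistic memberships of one object to all prototypes, eq. (4.5),
    assuming A = identity (Euclidean distance)."""
    d = [math.dist(x, v) for v in prototypes]
    # An object lying exactly on a prototype belongs fully to it.
    if any(di == 0.0 for di in d):
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    expo = 2.0 / (q - 1.0)
    return [1.0 / sum((di / dr) ** expo for dr in d) for di in d]


class RecursivePrototypes:
    """Stores VN and VD of eq. (4.10) and updates them by eqs. (4.11)-(4.14)."""

    def __init__(self, prototypes, q=2.0, initial_weight=1.0):
        self.q = q
        # Assumption: each initial prototype counts like `initial_weight` objects.
        self.vd = [initial_weight for _ in prototypes]
        self.vn = [[initial_weight * a for a in v] for v in prototypes]

    @property
    def prototypes(self):
        # V = VN / VD, eq. (4.10)
        return [[a / d for a in n] for n, d in zip(self.vn, self.vd)]

    def update(self, batch):
        """Gradual adaptation Gamma_2, eq. (4.15), for one batch X(t_k)."""
        current = self.prototypes  # memberships refer to the current classifier
        for x in batch:
            u = fcm_memberships(x, current, self.q)
            for i, u_ij in enumerate(u):
                w = u_ij ** self.q
                self.vd[i] += w  # contribution to gamma_i, eq. (4.14)
                # contribution to delta_i, eq. (4.13)
                self.vn[i] = [n + w * a for n, a in zip(self.vn[i], x)]
        return self.prototypes
```

With two prototypes at (0, 0) and (10, 10), feeding a single new object at (2, 0) pulls the first prototype towards the object while the second one barely moves, which is the behaviour expected for gradual changes.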

When developing a dynamic fuzzy clustering algorithm the following problems must be solved:
1. Choice of criteria for recognising abrupt changes,
2. Choice of cluster validity criteria to control the adaptation process,
3. Adaptation of the training data set based on the usefulness of objects,
4. Choice of the length of the time window or the size of the template set,
5. Similarity measure for trajectories to consider the history of temporal development of objects.

These problems will be discussed in detail in the following sections. First, the problem of choosing an appropriate clustering algorithm which can be used for dynamic clustering will be addressed in Section 4.2, and the necessary requirements on the properties of a clustering algorithm will be presented. This will be followed by the consideration of different criteria for recognising changes in the cluster structure and the formulation of corresponding algorithms based on some of these criteria. Then, an approach for adapting a dynamic classifier and the training data set will be introduced, followed by a general representation of an algorithm for dynamic classifier design. The problem of defining a similarity measure for trajectories used as a clustering criterion within dynamic classifiers will be addressed in Chapter .

4.2 Requirements for a Clustering Algorithm Used for Dynamic Clustering and Classification

The method for dynamic classifier design developed in the following sections provides a general technique that can be incorporated into dynamic pattern recognition. Nevertheless, it has to be adapted to the specific structure of the classifier used. Due to the huge number of existing clustering algorithms, it is not possible to discuss all their modifications with the purpose of their dynamisation. Therefore, the consideration is restricted to prototype-based fuzzy clustering algorithms, which seem to be particularly suited to dynamic clustering.
Depending on the structure and working principles of a dynamic fuzzy classifier, the following requirements for the choice of a clustering algorithm can be formulated:
1. It should be possible to apply an algorithm for possibilistic classification of objects,
2. The clustering algorithm must be able to recognise different shapes of clusters (spherical as well as elliptical clusters),
3. The clustering algorithm must be able to deal with different sizes and densities of clusters.

Although a method for dynamic classifier design can generally be combined with an arbitrary point-prototype based clustering algorithm, some of these algorithms have a number of limitations regarding their use for dynamic clustering. For instance, the fuzzy c-means (FCM) algorithm [Bezdek, 1981] can only deal with spherical clusters and will fail to recognise ellipsoidal clusters, which are very likely to appear in the course of time after merging a pair of clusters. Another shortcoming of this algorithm applied to dynamic clustering is that it cannot detect whether objects with an ambiguous assignment to clusters are located in the neighbourhood of existing clusters or far away from them so that they represent a separate group. As a result the FCM algorithm has problems detecting new clusters due to its probabilistic nature. This shortcoming can be avoided by the use of the possibilistic c-means (PCM) algorithm [Krishnapuram, Keller, 1993], which generates possibilistic membership degrees of objects to clusters representing the typicality of objects in the clusters. Consequently, objects located far away from clusters get low degrees of membership in all clusters. With the help of a membership reject option new clusters can be detected. However, since the PCM algorithm is a modification of the FCM algorithm, it also provides a partition into spherical clusters.

Recognition of ellipsoidal clusters of different sizes can be achieved by changing the distance norm in the clustering algorithm. This idea is realised in the algorithm of Gustafson and Kessel [Gustafson, Kessel, 1979], which determines a particular distance function for each cluster and is able to deal with different ellipsoidal forms of clusters. In order to treat different sizes of clusters, prior knowledge about the clusters is required to set values of a corresponding parameter. Although this algorithm represents an improvement compared to the FCM and PCM algorithms with respect to a larger flexibility of cluster forms, it still fails to recognise clusters with different density.
This is a common drawback of all three algorithms discussed here, which is expressed by an undesirable shift of cluster centres in the direction of clusters with higher densities. The unsupervised optimal fuzzy clustering (UOFC) algorithm of Gath and Geva, introduced in [Gath, Geva, 1989], is an extension of the algorithm of Gustafson and Kessel and is suitable for the recognition of spherical and ellipsoidal clusters of different sizes and densities. The ability of this algorithm to recognise clusters of different densities is its main advantage compared to the algorithms described above (the FCM, PCM, and Gustafson-Kessel algorithms) and is very important for clustering dynamic objects. Since changes in the dynamic cluster structure have to be recognised as early as possible, it is necessary for a classifier to be able to detect small clusters with high density at the moment of their appearance, and it is necessary that big clusters after merging or before splitting do not influence the remaining clusters by shifting their cluster centres.

The unsupervised optimal fuzzy clustering algorithm partitions objects by a combination of the fuzzy c-means and a modified maximum-likelihood estimation (MLE) algorithm. The purpose of this combination is to take advantage of the high speed of convergence of the FCM algorithm in computing the cluster centres and of the ability of the MLE algorithm to deal with unequal clusters. As a result, the UOFC algorithm can recognise spherical as well as ellipsoidal clusters with a large variability of cluster sizes and densities. The calculation scheme of the UOFC algorithm is presented in Appendix 9.1. Due to the high flexibility of the UOFC algorithm, it seems reasonable to apply it to dynamic pattern recognition.

In the following sections the UOFC algorithm will be used in a reduced version as a basic clustering algorithm in the process of dynamic classifier design. In the UOFC algorithm the optimal number of clusters is determined by evaluating cluster validity measures for different partitions and choosing the partition with the best value of the validity measures. If this algorithm is used for dynamic clustering, Steps 3 to 6 of the algorithm seem to be unnecessary, since during the dynamic classifier design an optimal partition is obtained by merging or splitting clusters or by the formation of new clusters. However, cluster validity measures are still used as an indicator for adapting a classifier to guarantee an improvement, or at least the same quality, of the partition over time.

4.3 Detection of New Clusters

The idea of the proposed procedure for detecting new clusters is to use a distance rejection option described in Section . When a new object is presented to the pattern recognition system, it is classified by the current fuzzy classifier designed in the previous time window. The object obtains degrees of membership to all existing clusters. For the analysis of the cluster structure, however, it may be useful to define crisp borders of clusters using α-cuts of the membership functions describing fuzzy clusters. Then, one can speak about absorption of objects into clusters. A chosen value for an α-cut is used as a threshold of absorption. If the maximum degree of membership of a new object over all clusters exceeds a given threshold, the object is absorbed by a cluster.
This absorption criterion can be formulated as follows:

$$
\begin{cases}
x \in C_m,\ C_m \in \{C_1, \ldots, C_c\} & \text{if } u_m(x) \ge u_o, \\
x \notin C_i,\ i = 1, \ldots, c & \text{otherwise},
\end{cases}
\tag{4.16}
$$

where $u_m$ is the maximum degree of membership of an object x over all clusters and $u_o$ is the threshold of absorption. If the maximum degree of membership is not unique (there is more than one degree with equal maximum values), then the object is assigned arbitrarily to one of these clusters. The ambiguity of such an assignment is irrelevant when considering the membership (distance) threshold criterion. If degree $u_m$ of an object is lower than absorption threshold $u_o$, the object is membership rejected for absorption. Objects that cannot be absorbed by any cluster due to their low degrees of membership are called free objects.
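The absorption criterion (4.16) amounts to a threshold test on the largest membership of each object. The following sketch shows one possible reading; the function name `absorb`, the data layout (one row of possibilistic memberships per object) and the tie-breaking rule (first cluster with the maximum value) are assumptions made for illustration.

```python
def absorb(memberships, u_o=0.3):
    """Apply absorption criterion (4.16).

    memberships: list of rows, one per object, holding its possibilistic
                 degrees of membership to clusters 0..c-1.
    Returns (assignments, free): a dict object index -> cluster index for
    absorbed objects, and the list of indices of free objects.
    """
    assignments, free = {}, []
    for j, row in enumerate(memberships):
        u_m = max(row)  # maximum degree of membership over all clusters
        if u_m >= u_o:
            # Ties are resolved arbitrarily; here the first maximum wins.
            assignments[j] = row.index(u_m)
        else:
            free.append(j)  # membership rejected: the object stays free
    return assignments, free
```

For example, with `u_o = 0.3` an object with memberships `[0.2, 0.25]` remains free, whereas `[0.3, 0.3]` is absorbed by the first cluster.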

In order to be able to detect free objects using the absorption criterion (4.16), objects must be classified by a possibilistic classification algorithm. In this case the degrees of membership of an object to all clusters do not have to sum up to one; therefore they can take low values, expressing that an object is atypical for all existing clusters. In the case of probabilistic clustering, by contrast, the degrees of membership must sum to one, so that an object which is equally atypical for all clusters still obtains a degree of membership of 1/c to each of them.

It should be noted that the absorption criterion is used as an intermediate decision rule for the detection of new clusters, but not as a final criterion for the assignment of objects to clusters for the purpose of semantic labelling of the latter. The absorption of objects is one of several procedures performed during the monitoring process in order to recognise changes in the current cluster structure. The final decision about the assignment of an object to a certain cluster can be made after the monitoring and adaptation processes are finished. In general, an object rejected for absorption in the current time window can be absorbed later on, when clusters have grown due to the absorption of other objects. The absorption procedure is repeated for one time window as long as there are new objects to be absorbed.

Criteria for the detection of new clusters

After classification and absorption of new objects, the problem is to decide whether free objects constitute a new cluster or stray data. In order to declare new clusters within free objects the following three criteria must be satisfied.

Criteria for the detection of new clusters:
1. Existence of free objects: there is a number of objects which cannot be assigned to any of the existing clusters due to their low degrees of membership,
2. Sufficient number of free objects: the number of free objects must be large enough compared to existing clusters,
3. Compactness of free objects: free objects must constitute compact groups.
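The reason why the first criterion requires possibilistic memberships can be made concrete with a small numerical sketch. It compares the probabilistic memberships of equation (4.5) with the possibilistic memberships of the PCM algorithm [Krishnapuram, Keller, 1993], $u_{ij} = 1 / (1 + (d_{ij}^2 / \eta_i)^{1/(q-1)})$, for an object far away from two clusters. The prototype positions, the scale parameters `eta` and the function names are illustrative assumptions, and the probabilistic function assumes that the object does not coincide with a prototype.

```python
import math


def probabilistic_memberships(x, prototypes, q=2.0):
    """FCM-type memberships, eq. (4.5); they always sum to one."""
    d = [math.dist(x, v) for v in prototypes]
    expo = 2.0 / (q - 1.0)
    return [1.0 / sum((di / dr) ** expo for dr in d) for di in d]


def possibilistic_memberships(x, prototypes, eta, q=2.0):
    """PCM-type typicalities; each value depends on one cluster only."""
    expo = 1.0 / (q - 1.0)
    return [1.0 / (1.0 + (math.dist(x, v) ** 2 / e) ** expo)
            for v, e in zip(prototypes, eta)]


# Illustrative setting (assumed values): two clusters and one distant object.
prototypes = [[0.0, 0.0], [4.0, 0.0]]
eta = [1.0, 1.0]
outlier = [2.0, 50.0]

u_prob = probabilistic_memberships(outlier, prototypes)
u_poss = possibilistic_memberships(outlier, prototypes, eta)
```

Here the distant object obtains a probabilistic membership of 1/c = 0.5 to each cluster and would be absorbed with a threshold of 0.3, while its possibilistic memberships are close to zero, so it is correctly treated as a free object.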
The first criterion is examined using the absorption procedure described above, which results in a number of free objects. In order to decide whether the number of free objects is enough to form a new cluster, the sizes of existing clusters must be considered. The size of a cluster is defined as the number of objects absorbed in this cluster. When defining a criterion for a minimum number of free objects which is sufficient to form a new cluster, two situations can be distinguished:

1. If all existing clusters are approximately equally large, i.e. the pairwise deviations of the cluster sizes lie within a predefined limit, the number of free objects is compared to an average cluster size. A threshold for a minimum number of free objects necessary to declare a new cluster is defined as a share of the average cluster size, $\alpha_{cs}\, n_{av}$, where coefficient $\alpha_{cs}$ is chosen from the interval [0, 1] and the average cluster size is given by:

$$
n_{av} = \frac{1}{c} \sum_{i=1}^{c} N_i^{u_o}, \tag{4.17}
$$

where $N_i^{u_o} = |\{x_j \mid u_{ij} \ge u_o\}|$ is the number of objects $x_j$ whose degrees of membership to class i exceed the absorption threshold $u_o$. The criterion of the desired number of free objects can be formulated as follows: if the condition

$$
n_{free} \ge \alpha_{cs}\, n_{av} \tag{4.18}
$$

is satisfied, the number of objects is sufficient to assume new clusters. Otherwise free objects are considered as stray data.

2. If cluster sizes are very different, the number of free objects is compared with the minimum size of a non-empty cluster. A threshold for a minimum number of free objects is defined as a share of the minimum cluster size, $\alpha_{cs}\, n_{min}$, $\alpha_{cs} \in [0, 1]$, where the minimum cluster size is given by:

$$
n_{min} = \min(N_1^{u_o}, \ldots, N_c^{u_o}) \quad \text{so that } N_i^{u_o} > 0,\ i = 1, \ldots, c. \tag{4.19}
$$

The criterion of the desired number of free objects is formulated as follows: if the condition

$$
n_{free} \ge \alpha_{cs}\, n_{min} \tag{4.20}
$$

is satisfied, the number of objects is enough to assume new clusters. Otherwise free objects are considered as stray data.

As a result of examining the second criterion, the number $c_t^{new}$ of new clusters that may possibly appear is calculated as follows:

$$
c_t^{new} = \left[ \frac{n_{free}}{\alpha_{cs}\, n_{av}} \right]_{rounded} \quad \text{in case of approximately equally sized clusters,} \tag{4.21}
$$

$$
c_t^{new} = \left[ \frac{n_{free}}{\alpha_{cs}\, n_{min}} \right]_{rounded} \quad \text{in case of unequally sized clusters.} \tag{4.22}
$$

If this number is larger than zero, then a maximum of $c_t^{new}$ new clusters can be assumed. Stray data are considered for classification in the next time window together with new objects.
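The second criterion, equations (4.17) to (4.22), can be sketched as a single function. Two points are assumptions that the text leaves open: the test for "approximately equally large" clusters is implemented here as a relative deviation between the largest and the smallest non-empty cluster (hypothetical parameter `size_tolerance`), and "rounded" is read as rounding down, so that the result is zero exactly when conditions (4.18) or (4.20) fail.

```python
import math


def number_of_new_clusters(cluster_sizes, n_free, alpha_cs=0.9,
                           size_tolerance=0.2):
    """Possible number of new clusters, eqs. (4.17)-(4.22).

    cluster_sizes: numbers of absorbed objects N_i of the existing clusters.
    n_free:        number of free objects.
    """
    if not 0.0 < alpha_cs <= 1.0:
        raise ValueError("alpha_cs must lie in (0, 1]")
    non_empty = [n for n in cluster_sizes if n > 0]
    if not non_empty or n_free <= 0:
        return 0
    largest, smallest = max(non_empty), min(non_empty)
    # Assumption: clusters count as equally large if the relative deviation
    # between the largest and the smallest one stays within the tolerance.
    if (largest - smallest) / largest <= size_tolerance:
        reference = sum(cluster_sizes) / len(cluster_sizes)  # n_av, eq. (4.17)
    else:
        reference = smallest                                 # n_min, eq. (4.19)
    threshold = alpha_cs * reference
    # Assumption: "rounded" is interpreted as rounding down.
    return math.floor(n_free / threshold)
```

With cluster sizes 162 and 160, 178 free objects and $\alpha_{cs} = 0.9$, the threshold is 0.9 · 161 = 144.9 and the function returns one possible new cluster.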

In order to examine the third criterion for forming new clusters, that is, to verify whether free objects represent compact groups, the partitioning of free objects into $c_t^{new}$ groups or clusters is proposed in order to evaluate some measure of compactness for each new cluster and to compare its values with those of existing clusters. Free objects can be partitioned into $c_t^{new}$ clusters using, for example, the fuzzy c-means algorithm [Bezdek, 1981, p. 65]. Since most of the measures of compactness take into account degrees of membership of objects to clusters, after clustering it seems reasonable to calculate possibilistic degrees of membership of free objects to the new cluster centres using the possibilistic c-means algorithm [Krishnapuram, Keller, 1993] in order to obtain degrees of typicality of free objects to these new clusters.

One of the most obvious measures of compactness is the partition density of clusters introduced in [Gath, Geva, 1989], which will be used here with some modifications. For each cluster, the partition density is calculated as the ratio between the number of good objects belonging to a cluster and its hypervolume:

$$
pd(i) = \frac{n_i^{good}}{h_i}, \quad i = 1, \ldots, c_t^{new}, \tag{4.23}
$$

where the hypervolume of cluster i is defined as $h_i = \sqrt{\det(F_i)}$ and $F_i$ is the fuzzy covariance matrix of cluster i given by:

$$
F_i = \frac{\sum_{j=1}^{N} u_{ij}\, (x_j - v_i)(x_j - v_i)^T}{\sum_{j=1}^{N} u_{ij}}. \tag{4.24}
$$

In [Gath, Geva, 1989] good objects are defined as those whose distance to the cluster centre does not exceed the standard deviation of features for this cluster, and the number of objects is determined by a sum of their degrees of membership to cluster i:

$$
n_i^{good} = \sum_{j=1}^{N} u_{ij} \quad \forall x_j \in \{x_j \mid (x_j - v_i)^T F_i^{-1} (x_j - v_i) < 1\}. \tag{4.25}
$$

In order to avoid time-consuming calculations due to the inversion of fuzzy covariance matrices and to provide a degree of freedom in the definition of good objects, they are chosen based on their degree of membership to cluster i:

$$
n_i^{good} = \sum_{j=1}^{N} u_{ij} \quad \forall x_j \in \{x_j \mid u_{ij} \ge \alpha_{good}\}. \tag{4.26}
$$

This means that an object is considered to be good for cluster i if its degree of membership to this cluster exceeds a predefined threshold $\alpha_{good}$. This threshold can either be equal to the absorption threshold or determined independently. Obviously, the higher threshold $\alpha_{good}$ is, the smaller the number of good objects will be.

It is important to note that good objects represent kernels of new clusters, which must also be the most dense regions in clusters. However, due to a number of stray objects that may be present, the cluster centres may be moved away from the centres of dense regions. In order to localise dense groups within free objects and to avoid the influence of stray objects, the re-clustering of good free objects with the number $c_t^{new}$ of clusters using the fuzzy c-means algorithm is proposed. Thus, the procedure for localising dense groups includes two steps: the choice of good free objects and the re-clustering of good free objects. These two steps are repeated iteratively until the deviation of cluster centres calculated in two successive iterations falls below a given threshold (i.e. the cluster centres stabilise). The localisation procedure leads to a movement of new cluster centres towards the centres of dense regions. After finishing this procedure, the partition density of new clusters of good free objects can be evaluated. It is to be expected that the density of new clusters detected using the localisation procedure will be higher than the density of new clusters detected by clustering free objects just once. The advantage of the localisation procedure for the detection of dense groups within free objects is illustrated in two examples below.

In order to check the criterion of compactness, or sufficient density, of new clusters, the partition density is calculated for all existing clusters and for each new cluster using equations (4.23), (4.24) and (4.26). If the partition density of a new cluster $pd_{new}(i)$, $i = 1, \ldots, c_t^{new}$, exceeds a predefined threshold defined as a share of the average partition density of the existing clusters, then a new cluster can be declared with a high degree of confidence.
Otherwise the assumed number of new clusters decreases by one. This criterion can be formulated as follows:

$$
\begin{cases}
c_t^{new} \text{ is unchanged} & \text{if } pd_{new}(i) \ge \alpha_{dens}\, pd_{av},\ i = 1, \ldots, c_t^{new}, \\
c_t^{new} = c_t^{new} - 1 & \text{otherwise},
\end{cases}
\tag{4.27}
$$

where coefficient $\alpha_{dens}$ is chosen from the interval [0, 1] and the average partition density is determined by:

$$
pd_{av} = \frac{1}{c} \sum_{i=1}^{c} pd(i). \tag{4.28}
$$

Algorithm for the detection of new clusters

The three criteria proposed for the detection of new clusters within free objects represent the main steps of the monitoring procedure used to recognise new clusters in the course of time. This monitoring procedure detects groups of objects that cannot be assigned to any of the existing clusters and whose size and density are comparable with those of existing clusters. Since a number of thresholds need to be defined by an expert, it is possible to adapt this procedure to the requirements of a concrete application. Low values of the thresholds of cluster size $\alpha_{cs}$ and cluster density $\alpha_{dens}$ make the monitoring procedure very sensitive to outliers and stray objects and lead to the early recognition of new clusters. Using high values for these thresholds in the monitoring procedure allows only new clusters similar to existing ones to be detected.

The described steps regarding the verification of the proposed criteria are summarised into an algorithm for detecting new clusters, which is used within the monitoring procedure during dynamic classifier design. The outcome of this algorithm is the number of new clusters and estimates of new cluster centres. The flow chart of this algorithm is presented in Figure 4-2.
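The density check of the third criterion, equations (4.23), (4.24), (4.26), (4.27) and (4.28), can be sketched as follows. This is an illustrative reading rather than the thesis implementation: it handles one cluster at a time, takes the hypervolume as the square root of the determinant of the fuzzy covariance matrix, and all function names are hypothetical. The determinant is computed by plain Gaussian elimination so that the example needs no external library.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for col in range(k, n):
                a[r][col] -= f * a[k][col]
    return det


def partition_density(objects, u, centre, alpha_good=0.3):
    """pd(i) = n_good / h for one cluster, eqs. (4.23), (4.24), (4.26).

    objects: feature vectors; u: membership of each object to the cluster.
    """
    m, total = len(centre), sum(u)
    # Fuzzy covariance matrix F_i, eq. (4.24)
    f = [[sum(uj * (x[r] - centre[r]) * (x[s] - centre[s])
              for x, uj in zip(objects, u)) / total
          for s in range(m)] for r in range(m)]
    det = determinant(f)
    if det <= 0.0:
        raise ValueError("degenerate cluster: hypervolume is zero")
    hypervolume = math.sqrt(det)
    # Good objects are selected by membership threshold, eq. (4.26)
    n_good = sum(uj for uj in u if uj >= alpha_good)
    return n_good / hypervolume


def surviving_new_clusters(pd_existing, pd_new, alpha_dens=0.5):
    """Criterion (4.27): drop every candidate that is not dense enough."""
    pd_av = sum(pd_existing) / len(pd_existing)  # eq. (4.28)
    c_new = len(pd_new)
    for pd in pd_new:
        if pd < alpha_dens * pd_av:
            c_new -= 1
    return c_new
```

Four objects with full membership placed at unit distance around the centre yield a higher partition density than the same four objects placed at distance two, which reflects the intended use of the measure as an indicator of compactness.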

Figure 4-2: Flow chart of an algorithm for the detection of new clusters. The chart comprises the following steps:
1. Absorption of new objects into existing clusters.
2. Are there free objects? If not, there are no new clusters.
3. Verification of the criterion of size of a new cluster. If $c^{new} > 0$ does not hold, there are no new clusters.
4. For $i = 1, \ldots, c$ with $c = c^{new}$: verification of the criterion of density of a new cluster. If the density condition for $pd_{new}(i)$ is not satisfied, set $c^{new} = c^{new} - 1$.
5. When $i = c$, the number of new clusters is $c^{new}$.

The following examples illustrate the efficiency of the proposed algorithm for detecting new clusters and show its capacity for distinguishing between new clusters and stray data.

Example 1. Detection of a new cluster due to the localisation of dense groups within free objects.

Suppose that during each time window 100 new objects described by two features $X_1$ and $X_2$ are observed. For the sake of simplicity and better visualisation suppose that the length of a temporal history of objects is equal to 1 time instant, i.e. objects are represented by two-dimensional feature vectors at the current moment. After four time windows (t = 4) the classifier with c = 2 clusters is designed and the cluster centres are determined at $v_1$ = (.95; .96) and $v_2$ = ( ; ). The number of absorbed objects so far, and correspondingly the sizes of the two clusters, are equal to $N_1^{0.3} = 162$ and $N_2^{0.3} = 160$, whereas the number of stray objects is equal to 78. In time window t = 5, 100 new objects are observed, which are rejected for absorption into either of the two clusters due to their low degrees of membership. Thus, the total number of free objects (including stray objects from the previous time window) is equal to $n_{free} = 178$. Figure 4-3 shows the current cluster structure for t = 5 in the two-dimensional feature space, where objects absorbed in one of the two clusters are represented by circles and free objects by crosses.

Figure 4-3: Two clusters with absorbed objects (circles) and a group of free objects (crosses)

For the monitoring procedure for the detection of new clusters the following parameter settings are chosen:

Table 4-1: Parameter settings for the detection of new clusters in Example 1
  Absorption threshold                              $u_o = 0.3$
  Share of the average cluster size                 $\alpha_{cs} = 0.9$
  Share of the average cluster density              $\alpha_{dens} = 0.5$
  Threshold for the choice of good free objects     $\alpha_{good} = u_o = 0.3$

According to the second criterion of the minimum number of free objects (4.8), the number of free objects is sufficient to declare a new cluster ($178 > 0.9 \cdot 161 = 144.9$) and $c^{new}_t = 1$. In order to verify the third criterion of compactness, free objects are clustered with the fuzzy c-means algorithm just once. As a result of clustering, the centre of a new cluster is identified at $v_{free} = (0.4;\ .5)$. The clustering results are shown in Figure 4-4, a, where free objects are represented by crosses, the cluster centre is given by a star in a circle and the membership function of the new cluster is projected onto the feature space in the form of contours, or α-cuts. The colour bar to the right of the figure represents the scale of α-cuts of the membership function. As can be seen, the new cluster centre is shifted away from the densest group of objects because of stray objects.

Figure 4-4: Projections of the membership functions obtained for a group of free objects on the feature space
a) After single clustering of free objects; b) After localisation of a dense group of free objects

The evaluation of the third criterion provides the following results:

Table 4-2: Partition densities of new and existing clusters in Example 1
  Partition densities of two existing clusters          pd(1) = [illegible] and pd(2) = [illegible]
  Average partition density                             $pd_{av}$ = [illegible]
  Density threshold                                     $\alpha_{dens} \cdot pd_{av}$ = [illegible]
  Partition density of the group of 178 free objects    $pd_{new}$ = [illegible]

Since the partition density of the assumed cluster does not exceed the density threshold, a new cluster cannot be formed and free objects are considered to be stray, which does not correspond to intuitive expectations. Using the localisation procedure for the detection of dense groups within free objects, a group of 20 good free objects is selected after 4 iterations of re-clustering with fuzzy c-means, and a centre of this group is identified at $v_{free} = (0.09;\ .9)$. As can be seen in Figure 4-4, b, the centre of the new cluster is located exactly in the centre of the densest group of objects. The membership function of the new cluster is much slimmer and most of the free objects have a high degree of membership to the new cluster. As a result the partition density of the group of 20 good free objects is equal to [illegible], which is 3.7 times as high as the partition density of the whole group of free objects. Since this partition density exceeds the predefined threshold of [illegible], a new cluster can be formed.

This example shows the importance of localising dense groups of free objects in the third step of the monitoring procedure in order to detect new clusters with a sufficient density, regardless of stray data.

Example 2. Stray objects between existing clusters.

Consider a problem similar to Example 1: during the first four time windows 400 objects appeared and were partitioned into two clusters. In the fifth time window 150 new objects are observed, which are scattered between the two existing clusters as shown in Figure 4-5, a. After the possibilistic classification and absorption of new objects into the existing clusters, their sizes are $N_1^{0.3} = 176$ and $N_2^{0.3} = 175$, whereas the number of free objects is equal to $n_{free} = 199$.
Suppose that the same parameter settings for the detection procedure of new clusters as shown in Table 4-1 are chosen. Since the number of free objects is sufficient compared to the sizes of the existing clusters ($199 > 0.9 \cdot 175.5 \approx 157.9$), a new cluster can be assumed ($c^{new}_t = 1$). For the verification of the third criterion free objects are clustered iteratively with the fuzzy c-means algorithm, applying the localisation procedure. The localisation and clustering of a group of [illegible] good free objects converges to a new cluster centre at $v_{free} = (.6;\ .4)$ and results in the membership function shown in Figure 4-5, b.

Figure 4-5: Detection of stray objects
a) Free objects (crosses) are scattered between two existing clusters; b) Projections of good free objects and the membership function obtained after clustering of free objects using the localisation procedure

The evaluation of the partition densities of the new and existing clusters provides the following results:

Table 4-3: Partition densities of new and existing clusters in Example 2
  Partition densities of two existing clusters                     pd(1) = [illegible] and pd(2) = [illegible]
  Average partition density                                        $pd_{av}$ = 8.23
  Density threshold                                                $\alpha_{dens} \cdot pd_{av}$ = [illegible]
  Partition density of the group of [illegible] good free objects  $pd_{new}$ = 58.2

Since the partition density of the assumed cluster lies considerably below the density threshold, free objects are considered to be stray. Thus, the verification of the third criterion does not permit the formation of a new cluster although the number of free objects is sufficient according to the second criterion.

The re-clustering of free objects and the localisation of dense groups of free objects can also be performed with the Gath-Geva algorithm. However, empirical research has shown that only one iteration of the localisation procedure is required to find a group of good free objects with the maximum density. Repeating the localisation procedure until convergence leads to a monotone decrease in the density of the groups. This can be explained by the fact that the Gath-Geva algorithm generates a much crisper partition than fuzzy c-means and determines the correct cluster size itself. Due to the selection of good free objects, the sizes of new clusters are significantly reduced in each iteration, resulting in a decreasing density. In an extreme case the localisation procedure may stop with a number of good free objects equal to zero.

The need for just one iteration of the localisation procedure based on the Gath-Geva algorithm represents a clear advantage with respect to time efficiency over the use of the fuzzy c-means algorithm. However, this can be coupled with a number of disadvantages. Firstly, the Gath-Geva algorithm is computationally more expensive since it requires initialisation with fuzzy c-means to obtain a good final partition of objects, and the inversion of fuzzy covariance matrices in each iteration. Secondly, if the number of free objects is rather low, statistical estimates such as fuzzy covariance matrices cannot provide reliable results. For these reasons, it seems reasonable to use the fuzzy c-means and the possibilistic c-means algorithms within the localisation procedure.

4.4 Merging of Clusters

In Section [illegible] it was stated that in order to obtain a clear partition and to avoid classification errors, objects with ambiguous degrees of membership must be rejected by a classifier for assignment to one of the clusters. This rejection approach is however not suitable for dynamic pattern recognition, whose purpose is to learn the correct cluster structure over time by adapting a classifier. On the contrary, the ambiguity of the membership of an object to existing clusters can be used to recognise changes in the cluster structure and can indicate two different situations. If a degree of membership is equally low for two or more clusters, this means that an object does not belong to any cluster. If alternatively a degree of membership of an object is equally high for two or more clusters, it means that an object belongs to several clusters. If the number of objects with high and ambiguous degrees of membership is rather large, it can be assumed that two or more clusters overlap, being very close to each other.
In this case clusters cannot be considered as heterogeneous any more and it seems reasonable to merge these clusters.

Criteria for merging of clusters

In order to detect close and overlapping clusters to be merged in the existing cluster structure the following criteria must be satisfied:

Criteria for merging of clusters.
1. Existence of objects with high and ambiguous degrees of membership to clusters,
2. A sufficient number of objects with high and ambiguous degrees of membership to clusters
OR
3. Closeness of cluster centres to each other.

These criteria are examined during the monitoring procedure in each time window after classification and absorption of new objects. To improve the efficiency of the monitoring procedure, it is sufficient to consider those clusters that have absorbed new objects in the current time window as candidates for merging with other clusters. This means that overlapping clusters can appear in the current time window if at least one cluster has grown due to the absorption of new objects. In order to verify this condition, cluster sizes in the current and in the previous time windows are compared. If a cluster size has grown, the cluster is included in the set of candidate clusters that will be considered for merging:

\[ \text{if } n^{cs}_{t}(i) > n^{cs}_{t-1}(i), \text{ then } i \in A^{merg}, \quad i = 1, \ldots, c \tag{4.29} \]

where $A^{merg} = \{1, \ldots, r\}$, $r \leq c$, is the set of clusters to be considered as candidates for merging and $n^{cs}_{t}(i)$ is the number of objects in cluster $i$ in time window $t$.

An exception to this rule should be made in the first time window, where there is no possibility of comparing results with those from the previous time window. The analysis of the cluster structure must involve all clusters in order to recognise the overlapping or almost empty clusters that can appear if the initial number of clusters is over-specified.

In the literature different measures for verifying criteria for merging two clusters have been introduced. Most of them are based on the definition of a similarity measure for clusters, which takes into account either the degree of overlapping of clusters or the distance between cluster centres. Until now these measures were used during static classifier design to determine the optimal number of clusters as an alternative to using validity measures. The general idea of the cluster merging approach is to repeat several iterations of clustering with a variable number of clusters, starting with a high over-specified number (an upper bound) and reducing the number gradually until an appropriate number is found.
In each iteration similar clusters are merged and the clustering of objects is performed with a decreased number of clusters. This procedure is repeated until no more clusters can be merged and finishes with the optimal number of clusters.

During dynamic classifier design, measures for the verification of merging criteria are applied within the monitoring procedure to recognise overlapping clusters that can appear due to temporal changes in the cluster structure. If this is the case, the classifier must be adapted to a new cluster structure by merging similar clusters. Below, different formulations of criteria for merging of clusters are presented, which can be used during dynamic clustering. The time index $t$ for cluster centres and membership functions is dropped since clusters are considered for merging at the same time instant.

Consider two fuzzy clusters $C_i$ and $C_j$ represented by their membership functions $u_i(x_k)$ and $u_j(x_k)$, $k = 1, \ldots, N$, where $N$ is the number of objects. In [Setnes, Kaymak, 1998] the similarity between two clusters is evaluated based on the inclusion measure, which is defined as the

ratio of the cardinality of the intersection area of the two fuzzy sets to the cardinality of each fuzzy set:

\[ I_{ij} = \frac{\sum_{k=1}^{N} \min(u_{ik}, u_{jk})}{\sum_{k=1}^{N} u_{ik}} \tag{4.30} \]

where $u_{ik}$ and $u_{jk}$ are degrees of membership of object $k$ to clusters $i$ and $j$, respectively. This definition is illustrated in Figure 4-6, where two intersecting one-dimensional fuzzy sets with membership functions $u_i$ and $u_j$ are shown.

Figure 4-6: Merging of fuzzy clusters based on their intersection

Since fuzzy sets generally have different supports, the inclusion measure is asymmetric, and therefore the similarity measure of two clusters is defined as the maximum of two inclusion measures:

\[ s_{ij} = \max(I_{ij}, I_{ji}) \tag{4.31} \]

According to [Setnes, Kaymak, 1998], clusters are merged if the similarity measure of two clusters $s_{ij}$ exceeds a predefined merging threshold $\lambda$. In general, decreasing values of $\lambda$ lead to increased cluster merging and vice versa. For $\lambda = 0$ all clusters will be merged, whereas for $\lambda = 1$ no clusters will be merged. The choice of this threshold depends on the requirements of a concrete application and requires tuning.

The drawback of this cluster similarity measure is that it does not require that objects in the intersection area of two clusters have high degrees of membership to both clusters. That is to say, this similarity measure takes into account only Criterion 2 for cluster merging, which requires just a sufficient number of objects in the intersection area. Because of this drawback, the results of cluster merging may sometimes be unsatisfactory. For instance, it is possible that using similarity measure (4.31) two clusters are merged although the cluster centres are rather far away from each other. If clusters are large and there is a sufficient number of objects with

relatively low degrees of membership to both clusters located in the intersection area of two fuzzy sets, the value of the similarity measure can become larger than a threshold, leading to the merging of these clusters.

In this thesis the use of a similarity measure for clusters based on the degree of intersection (overlapping) of two clusters, provided that these are relatively close to each other, is proposed. This similarity measure can be considered as a modification of similarity measure (4.31). In order to satisfy the requirement of closeness of clusters only α-cuts of fuzzy clusters are considered, i.e. only objects with degrees of membership higher than a predefined threshold $\alpha$. Obviously, high degrees of membership of an object to both clusters mean that it is located close to both cluster centres and this consequently implies the closeness of cluster centres. To define a similarity measure for two clusters, the degree of intersection of two α-cuts of fuzzy clusters is calculated as the relative cardinality of the intersection area with respect to the cardinality of the two clusters. This degree of intersection is asymmetric for pairs of clusters because of their different sizes and cardinalities; thus, the similarity measure of two clusters is defined as the maximum of two degrees of intersection:

\[ s_{ij} = \max\left( \frac{\mathrm{card}(H_\alpha(u_i) \cap H_\alpha(u_j))}{\mathrm{card}(H_\alpha(u_i))},\ \frac{\mathrm{card}(H_\alpha(u_i) \cap H_\alpha(u_j))}{\mathrm{card}(H_\alpha(u_j))} \right) \tag{4.32} \]

where $u_i$ and $u_j$ are membership functions of fuzzy clusters $C_i$ and $C_j$, $H_\alpha(u) = \{x \in X \mid u(x) \geq \alpha\}$ is the α-cut of a fuzzy set with membership function $u(x)$, and $\mathrm{card}(A)$ is the cardinality of a fuzzy set $A$ defined as the sum of the degrees of membership of objects belonging to fuzzy set $A$.

The criterion for merging two clusters can be formulated in terms of α-cuts of fuzzy clusters as follows. Two clusters can be merged if the number of objects in the intersection area of the α-cuts of these clusters is large compared to the number of objects in the corresponding α-cuts of the clusters (Figure 4-7).
In other words, clusters can be merged if the similarity measure (4.32) of two clusters exceeds a predefined merging threshold $\lambda$: $s_{ij} \geq \lambda$.
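The difference between the inclusion-based similarity of [Setnes, Kaymak] and the α-cut based similarity (4.32) can be illustrated with a small Python sketch. The function names and the membership values are hypothetical. As a simplifying assumption, the α-cuts are treated as crisp sets, so that their cardinality is the number of objects they contain, in line with the verbal formulation of the criterion above.

```python
def inclusion_similarity(u_i, u_j):
    """Maximum of the two inclusion measures of two fuzzy clusters."""
    inter = sum(min(a, b) for a, b in zip(u_i, u_j))
    return max(inter / sum(u_i), inter / sum(u_j))


def alpha_cut_similarity(u_i, u_j, alpha=0.5):
    """Maximum relative overlap of the alpha-cuts (counted as crisp sets)."""
    cut_i = {k for k, u in enumerate(u_i) if u >= alpha}
    cut_j = {k for k, u in enumerate(u_j) if u >= alpha}
    if not cut_i or not cut_j:
        return 0.0  # an empty alpha-cut cannot overlap with anything
    common = len(cut_i & cut_j)
    return max(common / len(cut_i), common / len(cut_j))


# Hypothetical memberships: many objects belong weakly to both clusters,
# but no object has a high degree of membership to both.
u_i = [0.6, 0.3, 0.3, 0.3, 0.3, 0.3, 0.1]
u_j = [0.1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.6]

print(inclusion_similarity(u_i, u_j))
print(alpha_cut_similarity(u_i, u_j, alpha=0.5))
```

For these values the inclusion-based measure is high although the overlap consists only of weak memberships, whereas the α-cut based measure is zero, which is the behaviour the modified criterion is meant to achieve.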

Figure 4-7: Merging of fuzzy clusters based on the degree of overlapping of α-cuts

Figure 4-7 shows the intersection of α-cuts of fuzzy clusters, where $\alpha = 0.7$ is chosen. Obviously, parameter $\alpha$ determines the strength of the merging criterion: the higher the chosen value of parameter $\alpha$, the higher the degree of intersection for merging and in general the lower the number of merged clusters. The choice of parameter $\alpha$ depends on the goals of a concrete application, but it seems reasonable to take values $\alpha \geq 0.5$.

Another merging criterion based on the closeness of cluster centres is introduced in [Stutz, 1998]. Assuming that the clusters are spherical, an estimate for the radius of spherical clusters in the form of the standard deviation (sd) of a fuzzy cluster is defined as:

\[ sd_i = \sqrt{\frac{\sum_{k=1}^{N} u_{ik} \, \|x_k - v_i\|^2}{\sum_{k=1}^{N} u_{ik}}} \tag{4.33} \]

where $v_i$ is the centre of cluster $C_i$ and $x_k$, $k = 1, \ldots, N$, are the considered objects. Clusters $C_i$ and $C_j$ may be merged if the following criterion is fulfilled [Stutz, 1998]:

\[ \|v_i - v_j\| < \begin{cases} k \cdot sd_i, & \text{if } sd_i \neq 0 \text{ and } sd_j = 0 \\ k \cdot sd_j, & \text{if } sd_i = 0 \text{ and } sd_j \neq 0 \\ k \cdot \overline{sd}, & \text{if } sd_i = 0 \text{ and } sd_j = 0 \\ k \cdot \min(sd_i, sd_j), & \text{otherwise} \end{cases} \tag{4.34} \]

where $k > 0$ is a coefficient for weighting the degree of closeness of two clusters and $\overline{sd}$ is the mean standard deviation of clusters. The conditions of criterion (4.34) can be interpreted as follows:

- Cluster $C_j$ is empty or a point located inside cluster $C_i$ ($k \leq 1$) or close to cluster $C_i$ ($k > 1$); the analogous formulation is given in the second condition for cluster $C_i$ being a point or empty (Figure 4-8, a).
- Clusters $C_i$ and $C_j$ are empty or points and the distance between their centres is smaller than the weighted mean radius of clusters (distance threshold). This case can occur if the number of clusters is over-specified.

- The distance between two cluster centres is smaller than the weighted minimum of the two standard deviations (Figure 4-8, b), i.e. each cluster centre is located inside the other cluster ($k \leq 1$). If only one centre or even neither of the centres lies inside the other cluster, then the two clusters can be merged, provided that the values of $k$ chosen are considerably higher than 1.

Figure 4-8: Merging of fuzzy clusters based on their standard deviation
a) Cluster $i$ is empty and inside cluster $j$ ($k \leq 1$); b) Each cluster centre is inside the other cluster ($k \leq 1$)

In the fourth condition it may be reasonable to use the sum of the two standard deviations instead of the minimum as the weaker criterion and to compare the distance between the two cluster centres with $k \cdot (sd_i + sd_j)$. This condition relies on the degree of overlapping of two clusters sufficient for merging. For $k \leq 1$ clusters must overlap, for $k > 1$ they may be separate (which is not reasonable for merging). This condition is better suited as the merging criterion than the fourth condition in criterion (4.34), in particular in cases where clusters have different sizes. As can be seen in Figure 4-8, the criterion of closeness of cluster centres is equivalent to the degree of intersection of fuzzy clusters.

Criterion (4.34) was proposed for use within the fuzzy c-means algorithm and therefore it is suitable only for spherical clusters of similar size. It cannot be used for ellipsoidal clusters or clusters of different sizes, since it leads to inconsistent or undesirable results in these situations.

Criteria for merging of ellipsoidal clusters

In order to define merging criteria for elliptical clusters, it is necessary to consider not only the distance between centres of clusters or their degree of intersection but also their relative orientation with respect to each other. In [Kaymak, Babuska, 1995] the compatibility criteria for merging ellipsoidal clusters, which are used to quantify the similarity of clusters, were introduced.
Consider two clusters $C_i$ and $C_j$ with centres at $v_i$ and $v_j$. Suppose that the eigenvalues of the covariance matrices of the two clusters are given by $[\lambda_{i(1)}, \ldots, \lambda_{i(m)}]$ and $[\lambda_{j(1)}, \ldots, \lambda_{j(m)}]$, whose components are arranged in descending order denoted by indexes $(1), \ldots, (m)$. Let the corresponding eigenvectors be [$e_{i(1)}$,

..., $e_{i(m)}$] and [$e_{j(1)}, \ldots, e_{j(m)}$]. The compatibility criteria for ellipsoidal clusters are defined as follows:

1. Parallelism: the smallest eigenvectors of two clusters must be sufficiently parallel:

\[ \xi^1_{ij} = \left| e_{i(m)} \cdot e_{j(m)} \right|, \quad \xi^1_{ij} \text{ close to } 1, \tag{4.35} \]

2. Closeness: the distance between cluster centres must be sufficiently small:

\[ \xi^2_{ij} = \| v_i - v_j \|, \quad \xi^2_{ij} \text{ close to } 0. \tag{4.36} \]

The idea of these criteria is illustrated in Figure 4-9. According to the first criterion, two clusters are compatible if they are parallel on the hyperplane. The second criterion requires that the cluster centres are sufficiently close.

Figure 4-9: Illustration of criteria for merging ellipsoidal clusters [adapted from Kaymak, 1998, p. 244]

In order to obtain the overall similarity measure for clusters, the measures of compatibility must be aggregated. [Kaymak, Babuska, 1995] suggest a decision-making step for aggregation, whose final goal is to determine which pairs of clusters can be merged. Possible pairs of clusters represent the decision alternatives and their number is given by $c(c-1)/2$ if $c$ clusters are considered. Compatibility criteria are evaluated for each pair of clusters, resulting in two matrices $\Xi^1$ and $\Xi^2$. The decision goals for each criterion are defined in the form of fuzzy sets. For criterion (4.35) of parallelism of clusters and for criterion (4.36) of closeness of clusters the fuzzy sets "close to 1" and "close to zero" are defined, respectively. The membership functions of these fuzzy sets are determined on the intervals $[0, 1]$ and $[0, \infty)$ for the fuzzy sets "close to 1" and "close to zero", respectively, as shown in Figure 4-10.
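The two compatibility measures (4.35) and (4.36) can be evaluated for all pairs of clusters as in the following Python sketch. The function names and the sample centres and eigenvectors are hypothetical. The smallest eigenvectors are assumed to be given, for instance from an eigendecomposition of the cluster covariance matrices, and the results are stored in dictionaries keyed by cluster pairs rather than in full matrices.

```python
import math
from itertools import combinations


def parallelism(e_i, e_j):
    """Absolute inner product of two eigenvectors, normalised to unit length."""
    dot = sum(a * b for a, b in zip(e_i, e_j))
    norm_i = math.sqrt(sum(a * a for a in e_i))
    norm_j = math.sqrt(sum(b * b for b in e_j))
    return abs(dot) / (norm_i * norm_j)


def centre_distance(v_i, v_j):
    """Euclidean distance between two cluster centres."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v_i, v_j)))


def compatibility_measures(centres, smallest_eigvecs):
    """Evaluate both measures for each of the c(c-1)/2 pairs of clusters."""
    xi1, xi2 = {}, {}
    for i, j in combinations(range(len(centres)), 2):
        xi1[(i, j)] = parallelism(smallest_eigvecs[i], smallest_eigvecs[j])
        xi2[(i, j)] = centre_distance(centres[i], centres[j])
    return xi1, xi2


# Hypothetical example with three clusters in two dimensions.
centres = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
eigvecs = [(0.0, 1.0), (0.0, -1.0), (1.0, 0.0)]
xi1, xi2 = compatibility_measures(centres, eigvecs)
print(xi1)
print(xi2)
```

Taking the absolute value of the inner product makes the measure independent of the sign of the eigenvectors, so that two anti-parallel eigenvectors are still treated as parallel.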

Figure 4-10: Membership functions of the fuzzy sets "close to 1" and "close to zero"

For the definition of these membership functions it is necessary to determine some parameters such as the limits of their support, denoted by variables $\nu_1$ and $\nu_2$. The values of these parameters can be obtained as average values of the measures of compatibility of all pairs of clusters, except the elements on the main diagonal of matrices $\Xi^1$ and $\Xi^2$, which are all ones for matrix $\Xi^1$ or zeros for matrix $\Xi^2$, as a cluster is always compatible with itself.

\[ \nu_1 = \frac{1}{c(c-1)} \sum_{i=1}^{c} \sum_{\substack{j=1 \\ j \neq i}}^{c} \xi^1_{ij} \tag{4.37} \]

\[ \nu_2 = \frac{1}{c(c-1)} \sum_{i=1}^{c} \sum_{\substack{j=1 \\ j \neq i}}^{c} \xi^2_{ij} \tag{4.38} \]

The variable values of the support of the membership functions provide the possibility of adapting the merging criteria to specific problems. After calculating the values of the measures of compatibility $\xi^1_{ij}$ and $\xi^2_{ij}$, the degree of parallelism $u^1_{ij}$ and the degree of closeness $u^2_{ij}$ are determined for each pair $C_i$ and $C_j$ of clusters using the membership functions "close to 1" and "close to zero". In order to obtain the overall degree of cluster compatibility, degrees $u^1_{ij}$ and $u^2_{ij}$ must be aggregated using one of the aggregation operators. It should be noted that the criteria of parallelism and closeness compensate each other, i.e. clusters that are very close but not sufficiently parallel can be merged and vice versa. Therefore, [Kaymak, 1998, p. 246] proposes using the generalised averaging operator as an aggregation operator. The decision-making procedure results in the overall compatibility matrix $S$, whose elements are given by:

\[ s_{ij} = \left[ \frac{(u^1_{ij})^q + (u^2_{ij})^q}{2} \right]^{1/q}, \quad q \in \mathbb{R}, \tag{4.39} \]

where $s_{ij}$ is the compatibility, or similarity, of clusters $C_i$ and $C_j$. By definition, compatibility matrix $S$ is symmetric with elements equal to one on its main diagonal. The choice of

parameter $q$ in the aggregation operator influences the merging behaviour in such a way that increasing values of $q$ lead to too much compensation and the over-merging of clusters. According to [Kaymak, 1998, p. 250], the value of $q = 0.5$ is empirically determined as the best suited for most applications.

The criterion for merging ellipsoidal clusters based on the compatibility matrix is formulated in the same way as the aforementioned merging criteria: clusters can be merged if the degree of compatibility exceeds the predefined threshold $\lambda$: $s_{ij} \geq \lambda$. In general, the choice of threshold $\lambda$ requires tuning depending on the requirements of a concrete application, but a value between 0.3 and 0.7 seems to be most appropriate. It should be noted that it is possible to merge more than two clusters during one merging step if there is more than one element $s_{ij}$ exceeding threshold $\lambda$. In this case, the computational time can be considerably reduced.

Depending on the choice of the aggregation operator and the value of threshold $\lambda$, the merging procedure based on the compatibility criteria may provide undesirable results. For instance, the aggregation procedure may lead to a compatibility matrix according to which two parallel clusters must be merged even though there is another cluster located between them. In order to avoid impermissible merging of clusters, an additional merging condition for compatible clusters was introduced in [Kaymak, 1998, p. 249]. The idea of this condition is to define the mutual neighbourhood as the region in the antecedent product space which is located within a certain distance of compatible cluster centres. The merging condition must prohibit the merging of compatible clusters if there is an incompatible cluster in their mutual neighbourhood.
This condition is formulated as follows:

\[ \min_{v_k \notin A^{co}} \ \max_{v_i \in A^{co}} d_{ik} \ > \ \max_{v_i \in A^{co}} \ \max_{v_j \in A^{co}} d_{ij}, \tag{4.40} \]

where $A^{co}$ is a group of compatible clusters, and

\[ d_{ij} = \| P(v_i) - P(v_j) \| \tag{4.41} \]

with $P(\cdot)$ denoting the projection of the cluster centres into the antecedent product space. The merging condition requires that the minimum distance between a cluster centre which does not belong to the group of compatible clusters and one of the compatible clusters is larger than the maximum distance between compatible cluster centres. Compatible clusters can be merged if this condition is satisfied.

The effect of the merging condition is illustrated in Figure 4-11. Suppose that after the verification of the compatibility criteria two clusters with centres $v_1$ and $v_2$ were recognised as compatible. In the case of the cluster structure shown in Figure 4-11, a, these compatible clusters cannot be merged since there is an incompatible cluster with centre $v_3$ between them and the merging condition (4.40) is violated: $d_{13} < d_{12}$. In contrast, compatible clusters with

centres $v_1$ and $v_2$ shown in Figure 4-11, b, can be merged since the incompatible cluster lies outside the mutual neighbourhood of the compatible clusters and the merging condition is satisfied: $d_{13} > d_{12}$.

Figure 4-11: Application of the merging condition in order to avoid impermissible merging
a) The merging condition is violated; b) The merging condition is satisfied

The aforementioned criteria for merging of ellipsoidal clusters and the corresponding similarity measure for ellipsoidal clusters have a number of drawbacks.

Value $\nu_2$ of the support of the fuzzy set "close to zero" is defined based on relative pairwise distances between centres of clusters without taking into consideration the absolute values of the distances between clusters. As a result, the smaller the distance between two clusters in comparison to the distances for other pairs of clusters, the higher the degree of closeness of these two clusters, independently of the real value of this distance. The consequence of this definition is that for all cluster structures with a similar relation of pairwise distances between cluster centres, similar degrees of closeness for pairs of clusters are obtained. For instance, if the pairwise distances for one group of clusters are $d^1_{12} = 2$, $d^1_{13} = 10$, $d^1_{23} = 12$ and the pairwise distances for another group of clusters are $d^2_{12} = 6$, $d^2_{13} = 30$, $d^2_{23} = 36$, then the degrees of closeness $u(d^1_{12})$ and $u(d^2_{12})$ for the two pairs of clusters will be approximately the same, depending on the chosen membership function of the fuzzy set "close to zero". In the case of the triangular membership function $u(d^1_{12}) = u(d^2_{12}) = 0.5$, and in the case of the exponential membership function $u(d^1_{12}) = 0.34$ and $u(d^2_{12}) = $ [illegible]. The degree of closeness for the other pairs of clusters will be around zero since the corresponding pairwise distances are greater than support $\nu_2$ of the fuzzy set: $d^1_{13}, d^1_{23} > \nu^1_2 = 4$ and $d^2_{13}, d^2_{23} > \nu^2_2 = $ [illegible].

Moreover, if only two clusters are considered for merging ($c = 2$), their degree of closeness will always be zero independently of the distance between cluster centres, since according to (4.38) the value of the support of the fuzzy set "close to zero" will be $\nu_2 = 0.5 \cdot d_{12}$.
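Both drawbacks can be reproduced with a short Python sketch. The function names are hypothetical, and two assumptions are made: the support is computed as the average of all off-diagonal entries of the distance matrix, which is one reading of (4.38), and the fuzzy set "close to zero" is modelled by a triangular membership function that falls linearly from one at distance zero to zero at the support limit. The absolute degrees of closeness depend on these choices, whereas the two properties shown do not.

```python
def support_nu(dist):
    """Average of the off-diagonal entries of a symmetric distance matrix."""
    c = len(dist)
    total = sum(dist[i][j] for i in range(c) for j in range(c) if i != j)
    return total / (c * (c - 1))


def close_to_zero(d, nu):
    """Triangular membership: 1 at d = 0, falling linearly to 0 at d = nu."""
    return max(0.0, 1.0 - d / nu)


def closeness_degrees(dist):
    """Degree of closeness for every pair (i, j) with i < j."""
    nu = support_nu(dist)
    c = len(dist)
    return {(i, j): close_to_zero(dist[i][j], nu)
            for i in range(c) for j in range(i + 1, c)}


# Two groups of clusters whose pairwise distances differ by a factor of 3.
group_1 = [[0, 2, 10], [2, 0, 12], [10, 12, 0]]
group_2 = [[0, 6, 30], [6, 0, 36], [30, 36, 0]]
print(closeness_degrees(group_1) == closeness_degrees(group_2))

# With only two clusters the degree of closeness vanishes for any distance.
for d in (0.1, 5.0, 100.0):
    print(closeness_degrees([[0, d], [d, 0]])[(0, 1)])
```

The first comparison shows that scaling all distances leaves the degrees of closeness unchanged, and the loop shows that two clusters are never regarded as close, however small the distance between their centres.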

Since the compatibility criteria are considered by [Kaymak, 1998, p. 246] to be compensatory, the merging of parallel but non-intersecting clusters is possible, which is neither intuitively clear nor desirable for some applications. In general the compatibility criterion based on the closeness of clusters does not take into account the size of clusters as criterion (4.34) does; neither does it verify whether the centre of a cluster is within another cluster or whether there is an intersection between clusters. This means that the criterion of closeness does not require that clusters overlap. The lack of this requirement can be considered as the reason for the possible impermissible merging of clusters. To circumvent undesirable merging results an additional merging condition is used [Kaymak, 1998, p. 249], which evaluates the mutual location of compatible and incompatible clusters.

In this section different criteria for merging spherical and elliptical clusters were considered. For the recognition of similar spherical clusters the similarity measure based on the degree of intersection of α-cuts of fuzzy clusters was proposed. For the detection of similar ellipsoidal clusters the similarity measure based on degrees of parallelism and closeness of clusters was introduced. In dynamic clustering problems it is necessary to deal with both types of clusters, which can appear in the course of time. Therefore, it seems to be advantageous to define general merging criteria independently of the cluster shape, which can be applied for recognising similar spherical as well as ellipsoidal clusters.

Criteria and algorithm for merging spherical and ellipsoidal clusters

It must be noted that applying the merging criterion (4.32) based on the intersection area of α-cuts to ellipsoidal clusters, instead of the compatibility criterion based on the closeness of cluster centres, avoids the disadvantages of the latter criterion and provides an intuitively clear reason for cluster merging.
The merging criterion based on the degree of intersection implies the closeness of cluster centres and automatically excludes the case of an incompatible cluster between compatible clusters. Obviously this criterion is independent of the form of clusters and relies only on degrees of membership of objects to fuzzy clusters. Therefore, the following general merging criteria for spherical and ellipsoidal clusters are proposed in this thesis:

Criteria for merging spherical and ellipsoidal clusters.
1. Intersection: there must be a sufficient number of objects in the intersection area of α-cuts of fuzzy clusters (4.32),
2. Parallelism: the smallest eigenvectors of two clusters must be sufficiently parallel (4.35).

In the case of spherical or almost spherical clusters the decision about merging clusters depends only on the first criterion since the second criterion is always satisfied. In the case of

elliptical clusters both criteria must be satisfied in order to make a decision about merging clusters. This means that the similarity measure with respect to the intersection of clusters, as well as the degree of parallelism of clusters, must exceed predefined thresholds: $s^{int}_{ij} \geq \lambda$, $u(\xi^1_{ij}) \geq \eta$, where $s^{int}_{ij}$ is the similarity measure defined by (4.32), $u(\xi^1_{ij})$ is the degree of parallelism obtained using the fuzzy set "close to 1", $\xi^1_{ij}$ is given by (4.35), and $\eta$ is the threshold for parallelism of clusters. The value of the support of the fuzzy set close to zero can be determined by setting the threshold for the maximum angle between the vectors of the minimum principal components of two clusters, for which clusters cannot be considered as parallel any more, and calculating the inner product of these vectors. Values for thresholds $\lambda$ and $\eta$ are defined by the expert. If both conditions are satisfied, a total similarity measure $s_{ij}$ between pairs of clusters can be calculated by aggregating the partial measures with respect to intersection and parallelism to the overall value using, for instance, (4.39) with $q = 0.5$. Elements $s_{ij}$ are then summarised in the similarity matrix $S$.

The problem of using the aforementioned merging criteria for different types of clusters is that the second criterion of parallelism is not always relevant. In practice there are seldom ideally spherical clusters with equal first and second principal components. It is more probable that there will be a slight difference between the principal components so that most spherical clusters can be considered as ellipses (in general, a sphere is a special case of an ellipse). Hence, when applying merging criteria it is formally necessary to consider to which degree a cluster can be assumed to have an ellipsoidal form. This problem is illustrated in Figure 4-12.
If two clusters with a clear ellipsoidal form (the ratio between the first and the second principal components is much greater than 1) are overlapping but their principal components are perpendicular (Figure 4-12, a), these clusters cannot be merged due to the criterion of parallelism. In contrast, if two clusters look similar to ellipses as well as to spheres (the ratio between the first and the second principal components is close to 1) and the intersection area is sufficient with respect to cluster sizes (Figure 4-12, b), whether their principal components are perpendicular to each other or not is insignificant. These clusters can be merged due to the first criterion and the second criterion can be ignored.

[Figure 4-12: The relevance of the criterion of parallelism for merging clusters. a) Ellipsoidal clusters with perpendicular principal components cannot be merged; b) ellipsoidal clusters looking similar to spheres can be merged although their principal components are perpendicular. Axes: $X_1$, $X_2$.]

In order to handle both situations, the definition of an additional threshold $\lambda_{max}$ for the similarity measure based on the intersection of clusters is proposed. If the value of this measure is sufficiently large, $s_{ij}^{int} \ge \lambda_{max}$, then there is no need to verify the second criterion of parallelism. The reason for this condition is that the satisfaction of the merging criterion based on the intersection of α-cuts of fuzzy clusters implies the closeness of the cluster centres as well as the parallelism of the clusters. For long, slim elliptical clusters the degree of intersection can be large if and only if the clusters are more or less parallel. If clusters are not parallel but the degree of intersection is high with respect to the cluster sizes, the clusters are not really elliptical and their parallelism does not have to be considered as a criterion for merging. Figure 4-13 shows different situations in which two elliptical clusters can be merged due to a high degree of intersection caused by their closeness and parallelism. In contrast, in the examples shown in Figure 4-14 the degree of intersection is not sufficient (it does not exceed the threshold $\lambda_{max}$) and both criteria for merging must be verified. Since these elliptical clusters are not parallel enough, they are rejected for merging.

[Figure 4-13: Situations in which elliptical clusters can be merged. Axes: $X_1$, $X_2$.]

[Figure 4-14: Situations in which elliptical clusters cannot be merged. Axes: $X_1$, $X_2$.]

Once the decision to merge clusters has been made, it is necessary to estimate a new cluster centre. In [Stutz, 1998] summing up the degrees of membership of objects belonging to the two clusters being merged is proposed. In order to guarantee that the new degrees of membership lie in the interval [0, 1], they must be normalised using the maximum degree obtained over all objects of both clusters. Suppose that clusters $C_i$ and $C_j$ have to be merged into a new cluster $C_{ij}$. New degrees of membership for objects belonging to cluster $C_{ij}$ are calculated according to the following equation:

$$u_{ij,k} = \frac{u_{ik} + u_{jk}}{\max_{l}\,(u_{il} + u_{jl})} \qquad (4.42)$$

where $u_{ik}$ and $u_{jk}$ are the degrees of membership of object k to clusters $C_i$ and $C_j$, respectively, and the maximum in the denominator is taken over all objects l of both clusters. Based on the new degrees of membership a new cluster centre is estimated as a weighted arithmetic mean of objects using the fuzzy c-means algorithm. The obtained cluster centre is then used as an initialisation parameter for re-learning the dynamic classifier during the adaptation procedure. In the following, an algorithm for detecting and merging similar clusters based on the proposed general merging criteria is summarised.
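The normalisation in (4.42) and the subsequent centre estimate can be illustrated with a minimal sketch. The function names, the sample data and the fuzzifier value `m = 2` are assumptions made for illustration; the text only states that the centre is a weighted arithmetic mean computed as in fuzzy c-means.

```python
def merge_memberships(u_i, u_j):
    """Equation (4.42): sum the memberships and normalise by the largest sum."""
    sums = [a + b for a, b in zip(u_i, u_j)]
    peak = max(sums)
    if peak == 0:
        raise ValueError("no object belongs to either cluster")
    return [s / peak for s in sums]


def weighted_centre(objects, memberships, m=2.0):
    """Weighted arithmetic mean of the objects (FCM-style update, fuzzifier m assumed)."""
    weights = [u ** m for u in memberships]
    total = sum(weights)
    dims = len(objects[0])
    return [sum(w * x[d] for w, x in zip(weights, objects)) / total
            for d in range(dims)]


# Hypothetical data: three objects on a line and their memberships to C_i and C_j.
objects = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
u_i = [0.8, 0.5, 0.1]
u_j = [0.1, 0.5, 0.7]

u_merged = merge_memberships(u_i, u_j)
centre = weighted_centre(objects, u_merged)
```

The resulting centre would only serve as a starting value, since the classifier is re-learned afterwards.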

Algorithm 4: Detection of similar clusters to be merged.

1. Set $c^{merg} = 0$. Find clusters that have absorbed a lot of new data since the previous time window by verifying condition (4.29) for each cluster, and determine the set $A^{merg}$ of candidate clusters for merging.
2. Calculate a similarity measure for each pair of clusters in set $A^{merg}$ using the merging criterion (4.32) based on the degree of intersection.
3. Choose a pair of clusters $C_i$ and $C_j$ with the maximum value of the similarity measure.
4. Check whether the maximum value of the similarity measure based on the intersection exceeds the predefined threshold $\lambda_{max}$. If this is so, proceed with Step 6.
5. Check whether the similarity measure with respect to the intersection of clusters given by (4.32) and the degree of parallelism of clusters given by (4.35) exceed the predefined thresholds: $s_{ij}^{int} \ge \lambda$, $u(\xi_{ij}) \ge \eta$. If at least one condition is violated, the clusters cannot be merged; proceed with Step 8.
6. Clusters $C_i$ and $C_j$ can be merged. The number of cluster pairs to be merged is increased by one: $c^{merg} = c^{merg} + 1$.
7. Estimate a new cluster centre for the initialisation of the re-learning procedure of the classifier.
8. If there are more pairs in set $A^{merg}$ to be examined, proceed with Step 3 of the algorithm.

Steps 3-7 are repeated until no more merging is possible. Using this algorithm, clusters are merged pairwise and iteratively. The outcome of the algorithm is the number of pairs to be merged and estimates of the new cluster centres. The flow chart of the algorithm is presented in Figure 4-15. The presented algorithm for detecting and merging similar clusters is integrated into the monitoring procedure used for the dynamic fuzzy classifier design.
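The decision logic of Steps 3 to 8 can be sketched as follows. This is only an illustration under stated assumptions: the similarity and parallelism values are taken as already computed (their definitions (4.32) and (4.35) are not reproduced here), the names `can_merge` and `detect_merges` are hypothetical, and skipping clusters that already belong to a merged pair is an assumption the text does not spell out.

```python
from itertools import combinations


def can_merge(s_int, parallelism, lam, lam_max, eta):
    """Merging decision for one pair of clusters (Steps 4 and 5)."""
    if s_int >= lam_max:
        # A very large intersection is decisive; parallelism is not checked.
        return True
    return s_int >= lam and parallelism >= eta


def detect_merges(candidates, s_int, parallelism, lam, lam_max, eta):
    """Examine candidate pairs in order of decreasing intersection similarity.

    s_int and parallelism are dicts keyed by (i, j) with i < j.
    Assumption: a cluster that is already part of a merged pair is skipped.
    """
    pairs = sorted(combinations(sorted(candidates), 2),
                   key=lambda pair: s_int[pair], reverse=True)
    merged, used = [], set()
    for i, j in pairs:
        if i in used or j in used:
            continue
        if can_merge(s_int[(i, j)], parallelism[(i, j)], lam, lam_max, eta):
            merged.append((i, j))
            used.update((i, j))
    return merged


# Hypothetical similarity and parallelism values for three candidate clusters.
s_int = {(0, 1): 0.9, (0, 2): 0.6, (1, 2): 0.3}
parallelism = {(0, 1): 0.1, (0, 2): 0.95, (1, 2): 0.99}

pairs_to_merge = detect_merges([0, 1, 2], s_int, parallelism,
                               lam=0.5, lam_max=0.8, eta=0.9)
c_merg = len(pairs_to_merge)
```

In this example the pair (0, 1) is merged on the strength of its intersection alone, even though its degree of parallelism is low.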

[Figure 4-15: A flow chart of an algorithm for detecting similar clusters to be merged]

4.5 Splitting of Clusters

Similar to the situation of merging clusters, the main reason for splitting clusters is the absorption of a lot of new data in the current time window, which may lead to heterogeneous structures within a cluster. In order to find clusters that have grown considerably since the previous time window, condition (4.29) must be verified and a set $A^{split}$ of clusters which can be considered for splitting is determined. Notice that it is not reasonable to split clusters which have been merged in this time window. Therefore, the condition of absorption of a sufficient number of new objects is verified only for clusters that have existed and remained structurally unchanged since the previous time window (new clusters that appeared due to new objects or as a result of merging are skipped):

if cluster $C_i^t$ satisfies the absorption condition (4.29), then $i \in A^{split}$, for every $i \le c$ such that cluster $C_i^t$ already existed unchanged in the previous time window, (4.43)

where $A^{split} = \{1, \dots, r\}$, $r \le c$, is the set of clusters to be considered as candidates for splitting. The monitoring procedure for detecting heterogeneous clusters which must be split is based on the estimation of the density distribution of objects within a cluster. The following considerations can be incorporated in order to derive an algorithm for the detection of heterogeneous clusters.

Criteria for splitting of clusters

A cluster can be split if the density within the cluster is not homogeneous and there are at least two distinct areas with a density that is much higher than the density between these areas (Figure 4-16). In order to recognise these areas, a histogram can be constructed with respect to a feature which is relevant for distinguishing them. The relevant feature is defined using the principal component transformation as the feature with the maximum variance, which is called the first principal component. Figure 4-17 shows the projection of objects into the space of the first and second principal components.
Considering a histogram projected on the first principal component (Figure 4-18), the goal is to recognise a specific pattern in the structure of the histogram, such as a local minimum value located between two local maximum values. This characteristic pattern gives a hint of the variation of the density within the cluster.

[Figure 4-16: A heterogeneous cluster in the two-dimensional feature space. Axes: $X_1$, $X_2$.]

[Figure 4-17: A heterogeneous cluster in the space of the two first principal components. Axes: first principal component, second principal component.]

[Figure 4-18: Histogram of the density of objects with respect to the first principal component. Axes: first principal component, density.]

The cluster can be split into two clusters if the following three criteria are satisfied for the characteristic pattern in the histogram:

Criteria for the detection of heterogeneous clusters.

1. Sufficient variation of density within a cluster: the difference between both maxima and the minimum in the histogram must be large enough to consider a cluster as heterogeneous,
2. Distinction of dense groups within a cluster: the distance between centres of dense groups must be large enough to consider these groups as distinct,
3. Sufficient size of dense groups within a cluster: the size of areas with high density must be large enough compared to the cluster size to consider them as sub-clusters.

The second and third criteria guarantee that clusters with a frequent variation of the density or with outliers (too low or too high density at some locations) are not split up. If the characteristic pattern 'maximum-minimum-maximum' is present in the histogram several times, the pattern with the maximum difference between the maximum and minimum values of the histogram is chosen for the analysis. Splitting into more than two clusters can be performed iteratively by considering newly appearing clusters for the splitting test. Notice that it is convenient for the analysis to construct a histogram with a relatively small number of bars (from 8 to 12) so that small variations of the density do not have a big influence on the shape of the histogram. The proposed criteria for the detection of heterogeneous clusters can be applied to spherical as well as elliptical clusters, since they consider objects in the space of principal components.

In order to verify these three criteria for splitting, a number of thresholds must be set. Let $r^{dens}$ be a coefficient expressing how many times the density of objects must differ within a cluster to be sufficient for splitting the cluster. The higher the value of this coefficient, the stronger the density criterion and the smaller the number of clusters which are split. The threshold $r^{reld}$ for the relative difference between the maximum value $h^{max}$ and the minimum value $h^{min}$ of the histogram can be derived using the relation $h^{max} = r^{dens} h^{min}$ in the following way:

$$r^{reld} = \frac{h^{max} - h^{min}}{h^{max}} = \frac{r^{dens} - 1}{r^{dens}}. \qquad (4.44)$$

In order to compare the size of dense groups of objects with the cluster size, consider the diameter of a dense group or a cluster, defined as the value range of the feature with the maximum variance (the first principal component). This means that the diameter of a cluster corresponds to the whole value range of the histogram shown in Figure 4-18, and the diameter of a dense group of objects is given by the value range of some bars of the histogram. Let $r^{diam}$ be a size threshold expressing the minimum diameter of dense groups of objects, as a share of the diameter of a cluster, which is sufficient to consider these groups as separate sub-clusters within a single cluster (Figure 4-19). This threshold must be defined in the interval [0, 0.5] to be able to recognise at least two new groups within a cluster. The smaller the value of $r^{diam}$, the weaker the size criterion for splitting and the smaller the size of dense groups which can be detected by the splitting procedure.

[Figure 4-19: Illustration of thresholds of size and distance between centres of density areas]

If the value of the size threshold is chosen smaller than 0.5 and the number of dense groups to be detected is assumed to be equal to two, then there can be a distance equal to $(1 - 2r^{diam})/3$ between the dense groups and between these groups and the borders of the cluster (i.e. there can be areas of low density along the border of the cluster, as illustrated in Figure 4-19). The distance criterion requires a sufficient distance between dense groups of objects so that they can be clearly distinguished and splitting in the case of random variation of density can be prevented. Denote the threshold for a sufficient distance between dense groups by $r^{dist}$, expressing the ratio of the distance between centres of dense groups to the size of a cluster. The value of this threshold can be obtained depending on the assumed number of dense groups within a cluster and using the size threshold for these dense groups in the following way:

$$r^{dist} = r^{diam} + \frac{1 - 2r^{diam}}{3} = \frac{1 + r^{diam}}{3}. \qquad (4.45)$$

The criteria for splitting are examined for each cluster from the set $A^{split}$ based on the histogram of the density of objects belonging to a cluster projected on the first principal component. For the verification of the first criterion the maximum and the minimum values of the histogram are evaluated and the relative difference of densities of objects within a cluster is calculated as:

$$dens = \frac{h^{max} - h^{min}}{h^{max}}. \qquad (4.46)$$

If $dens \ge r^{reld}$, the first criterion is satisfied and the cluster can be considered for splitting. Otherwise this cluster is rejected for splitting and there is no need to examine the other two criteria. In order to examine the second and the third criteria for splitting it is necessary to find the characteristic pattern maximum-minimum-maximum in the histogram.

Search for a characteristic pattern in the histogram

When looking for a characteristic pattern in the histogram, one has to find the number of extreme points (local minima and local maxima) in the histogram by calculating the differences between each two neighbouring bars of the histogram. A negative difference corresponds to an increasing trend in the histogram, a positive difference to a decreasing trend. The extreme points can be detected by considering changes of the sign of this difference. A change from minus to plus corresponds to a local maximum and a change from plus to minus corresponds to a local minimum in the histogram. The number of changes of the sign of the difference provides the total number of extreme points in the histogram.
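The threshold relations (4.44) to (4.46) are simple enough to check numerically. The following sketch uses hypothetical function names and sample values; it only restates the three formulas.

```python
def relative_difference_threshold(r_dens):
    """Equation (4.44): r_reld = (r_dens - 1) / r_dens, from h_max = r_dens * h_min."""
    return (r_dens - 1.0) / r_dens


def distance_threshold(r_diam):
    """Equation (4.45): r_dist = r_diam + (1 - 2 r_diam) / 3 = (1 + r_diam) / 3."""
    if not 0.0 <= r_diam <= 0.5:
        raise ValueError("r_diam must lie in [0, 0.5]")
    return (1.0 + r_diam) / 3.0


def relative_density_difference(hist):
    """Equation (4.46): relative difference between the largest and smallest bar."""
    h_max, h_min = max(hist), min(hist)
    return (h_max - h_min) / h_max


r_reld = relative_difference_threshold(2.0)   # density must differ by a factor of 2
r_dist = distance_threshold(0.25)
dens = relative_density_difference([2, 8, 4])
first_criterion_ok = dens >= r_reld
```

With $r^{dens} = 2$ the relative threshold is 0.5, so a histogram whose smallest bar is a quarter of its largest one passes the first criterion.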
If the first and/or the last extreme point is a minimum, it seems reasonable to add the first and/or the last point of the histogram to the list of extreme points as additional maxima. These border points of the histogram can also be centres of dense regions in a cluster satisfying the criteria for splitting (Figure 4-21, b). Therefore they must be considered along with the extreme points. Depending on the number of extreme points, three cases can be distinguished:

1. If there is only one extreme point, there is only one density area within the objects (if it is a maximum) or the objects are concentrated on the border (if it is a minimum). Thus, the cluster does not need to be split up.
2. If there are exactly three extreme points and their order corresponds to maximum-minimum-maximum (Figure 4-21, a), then they are labelled as $P_1$, $P_2$ and $P_3$ and the criteria for splitting must be verified. If the order of the extreme points corresponds to 'minimum-maximum-minimum', then there is only one density area within the objects, thus the cluster does not need to be split up.
3. If there are more than three extreme points, then the search for a characteristic pattern in the histogram is continued until the pattern maximum-minimum-maximum satisfying all three criteria is recognised or there are no more patterns to be investigated.

The idea of the proposed search procedure is to narrow the search interval iteratively by the extreme points of a detected pattern. For instance, consider the histogram shown in Figure 4-20. The search interval is limited by the outermost points of the histogram, which are included in the list of extreme points as maxima. Suppose that the first pattern characterised by the maximum density difference and detected by the search algorithm is $P_1$-$P_2$-$P_3$. Suppose that the thresholds for splitting are chosen so that the second criterion of distinction of dense groups and the third criterion of cluster size are not satisfied for this pattern. The search is continued to the left of the pattern and the search interval is limited by points $P_3$ and $P_4$. The next pattern found by the search procedure is $P_4$-$P_5$-$P_6$, for which the criteria for splitting are not satisfied either. The search interval is narrowed by the point $P_6$ and the search is continued on the interval between $P_3$ and $P_6$. The third pattern detected by the procedure is $P_3$-$P_7$-$P_6$, for which all three criteria for splitting are satisfied.

It must be noted that the search procedure looks for the maximum density difference in the interval, and therefore as soon as points $P_3$ and $P_6$ are detected as two maxima, the deepest local minimum between these points is looked for. As a result, point $P_7$ can be recognised despite the frequent small deviations of the density between points $P_3$ and $P_6$. Thus, in each iteration, if the characteristic pattern is detected, the search interval is divided into two intervals bounded by the extreme points of the detected pattern and the extreme points of patterns detected in previous iterations, and the search is carried out on each of the intervals. The search procedure is illustrated in Figure 4-20.

Possible patterns of a histogram are shown in Figure 4-21. The ideal characteristic pattern for splitting, maximum-minimum-maximum, for which all criteria for splitting are satisfied, is illustrated in Figure 4-21, a. Figure 4-21, b shows that by including the border points of the histogram in the set of extreme points a pattern for splitting can be recognised. Figure 4-21, c and d illustrate cases where the correct pattern for splitting can be detected by the search algorithm despite a high boundary deviation (Figure 4-21, c) and a small variation (Figure 4-21, d) of the density. Notice that the zigzag pattern between the two maxima cannot prevent the procedure from detecting the correct pattern for splitting as long as the difference of the density in the zigzag pattern does not exceed the corresponding values at the maximum points.

[Figure 4-20: Illustration of the search procedure for a density histogram, showing three nested search intervals and the extreme points $P_1$ to $P_7$]

[Figure 4-21: Examples of patterns in the histogram. a) ideal pattern maximum-minimum-maximum, b) pattern containing the leftmost and rightmost points of a histogram as additional extreme points, c) pattern containing a peak of density at the last point of a histogram, d) pattern containing the variation of the density between two maxima]
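The detection of extreme points by sign changes of neighbouring differences, including the border points added as extra maxima, might look like the sketch below. It covers only this first stage; the iterative narrowing of the search interval is not reproduced. The function name and the treatment of equal neighbouring bars (a zero difference is skipped) are assumptions.

```python
def extreme_points(hist):
    """Return (bar index, kind) pairs with kind 'max' or 'min', in histogram order.

    The difference h[i] - h[i + 1] is negative on an increasing trend and
    positive on a decreasing trend; a change from minus to plus marks a local
    maximum and a change from plus to minus marks a local minimum.
    Assumption: zero differences (equal neighbouring bars) are skipped.
    """
    points = []
    previous_sign = 0
    for i in range(len(hist) - 1):
        diff = hist[i] - hist[i + 1]
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue
        if previous_sign < 0 and sign > 0:
            points.append((i, "max"))
        elif previous_sign > 0 and sign < 0:
            points.append((i, "min"))
        previous_sign = sign
    # Border points of the histogram are added as additional maxima
    # when the first and/or last detected extreme point is a minimum.
    if points and points[0][1] == "min":
        points.insert(0, (0, "max"))
    if points and points[-1][1] == "min":
        points.append((len(hist) - 1, "max"))
    return points


inner_pattern = extreme_points([1, 4, 6, 3, 1, 2, 5, 7, 4])
border_pattern = extreme_points([5, 2, 1, 3, 6])
```

The second example corresponds to the situation in which both border bars of the histogram act as the maxima of the pattern.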

Algorithm for the detection of heterogeneous clusters to be split

If a characteristic pattern is detected in the histogram, the criteria for splitting are verified. In order to examine the first criterion of density, the difference $d_1$ between the two extreme points corresponding to the second largest value and the minimum value is calculated (Figure 4-22). The criterion can be formulated as follows: if the relative difference of the density exceeds the given threshold

$$dens = \frac{d_1}{h^{max2}} \ge r^{reld}, \qquad (4.47)$$

where $h^{max2}$ is the second largest value, then the criterion of density is satisfied for the cluster and the next criterion can be verified. Otherwise the cluster is rejected for splitting.

For the examination of the second criterion the distance $d_2$ between the two maxima $P_1$ and $P_3$ is evaluated as the distance between the mean values of the corresponding bars of the histogram (Figure 4-22). If the ratio of this distance to the whole domain $hd$ of the histogram exceeds the given threshold

$$\frac{d_2}{hd} = \frac{d(P_1, P_3)}{hd} \ge r^{dist}, \qquad (4.48)$$

then the criterion of distinction of dense groups is satisfied for the cluster and the next criterion can be examined. Otherwise the cluster is rejected for splitting.

When verifying the last criterion, the sizes $d_3$ and $d_4$ of the areas with high density around points $P_1$ and $P_3$ are determined (Figure 4-22). The size of an area is calculated as the product of the number of neighbouring bars exceeding a threshold $(1 - r^{reld})$ and the width of a bar. If both values expressing the relative sizes of the areas exceed the given threshold

$$\frac{d_3}{hd} \ge r^{diam} \quad \text{and} \quad \frac{d_4}{hd} \ge r^{diam}, \qquad (4.49)$$

then the considered cluster can be split up into two.
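A possible reading of the three checks (4.47) to (4.49) for one detected pattern is sketched below. Several points are assumptions rather than statements of the text: the bars are equally wide, the second largest value $h^{max2}$ is taken as the smaller of the two maxima of the pattern, and the level for the dense areas is interpreted as $(1 - r^{reld})$ times $h^{max2}$. All names and the sample histogram are hypothetical.

```python
def dense_area_width(hist, peak, level, bar_width):
    """Width of the run of neighbouring bars around `peak` that exceed `level`."""
    left = right = peak
    while left > 0 and hist[left - 1] > level:
        left -= 1
    while right < len(hist) - 1 and hist[right + 1] > level:
        right += 1
    return (right - left + 1) * bar_width


def can_split(hist, centres, p1, p2, p3, r_reld, r_dist, r_diam):
    """Check (4.47) to (4.49) for a pattern maximum p1, minimum p2, maximum p3."""
    bar_width = centres[1] - centres[0]
    hd = bar_width * len(hist)            # whole domain of the histogram
    h_max2 = min(hist[p1], hist[p3])      # assumption: smaller of the two maxima
    d1 = h_max2 - hist[p2]
    if d1 / h_max2 < r_reld:              # (4.47) density criterion
        return False
    d2 = abs(centres[p3] - centres[p1])
    if d2 / hd < r_dist:                  # (4.48) distance criterion
        return False
    level = (1.0 - r_reld) * h_max2       # assumption: level relative to h_max2
    d3 = dense_area_width(hist, p1, level, bar_width)
    d4 = dense_area_width(hist, p3, level, bar_width)
    return d3 / hd >= r_diam and d4 / hd >= r_diam   # (4.49) size criterion


hist = [1, 6, 8, 6, 2, 1, 2, 5, 7, 5, 1, 1]
centres = [i + 0.5 for i in range(len(hist))]

split_ok = can_split(hist, centres, 2, 5, 8, r_reld=0.5, r_dist=0.4, r_diam=0.2)
split_strict = can_split(hist, centres, 2, 5, 8, r_reld=0.5, r_dist=0.4, r_diam=0.3)
```

Raising the size threshold from 0.2 to 0.3 rejects the same pattern, because each dense area covers only a quarter of the histogram domain.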

[Figure 4-22: Verification of criteria for splitting a cluster based on the density histogram. Axes: first principal component, density. Marked quantities: $h^{max}$, $h^{max2}$, $r^{reld}$, $hd$, $d_1$ to $d_4$, $P_1$ to $P_3$.]

If the number of bars in the histogram is chosen relatively large compared to the number of objects, the histogram may contain a lot of patterns with small variations of the density. In order to simplify the search for a characteristic pattern and to avoid the risk of getting stuck in a pattern which is not the best with respect to the criteria for splitting, it is proposed to smooth the histogram in the same way as is usually done for time series. Consider a histogram with p bars $h_1, \dots, h_p$ such that the centre of bar $h_i$ is located at $x_i$, $i = 1, \dots, p$. Using the method of a moving average, the bar values $\tilde{h}_i$ of a new histogram are calculated as mean values over r neighbouring bar values in the following way:

if r is an odd number, (r-1)/2 left and (r-1)/2 right neighbouring values are considered

$$\tilde{h}_i = \frac{1}{r} \sum_{j = i - (r-1)/2}^{i + (r-1)/2} h_j, \qquad (4.50)$$

where $i = \frac{r+1}{2}, \dots, p - \frac{r-1}{2}$;

if r is an even number, r/2 left and r/2 right neighbouring values are considered

$$\tilde{h}_i = \frac{1}{r} \sum_{j = i - r/2}^{i + r/2} h_j, \qquad (4.51)$$

where $i = \frac{r}{2} + 1, \dots, p - \frac{r}{2}$.

The example in Figure 4-23 illustrates the effect of smoothing of histograms.
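For an odd window length the moving average (4.50) can be written in a few lines. The sketch below deliberately covers only the odd case; the function name and the sample histograms are illustrative.

```python
def smooth_histogram(hist, r):
    """Moving average of odd length r over the bar values, equation (4.50).

    Only bars with (r - 1) / 2 neighbours on each side are smoothed, so the
    result has len(hist) - (r - 1) values.
    """
    if r < 1 or r % 2 == 0:
        raise ValueError("this sketch covers odd window lengths only")
    half = (r - 1) // 2
    return [sum(hist[i - half:i + half + 1]) / r
            for i in range(half, len(hist) - half)]


smoothed_ramp = smooth_histogram([1, 2, 3, 4, 5], 3)
smoothed_zigzag = smooth_histogram([0, 3, 0, 3, 0, 3], 3)
```

The zigzag example shows how the amplitude of small fluctuations shrinks, which is the effect the smoothing step relies on.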

[Figure 4-23: The effect of smoothing the density histogram. Original histogram (above) and smoothed histogram (below). Axes: first principal component, density.]

The original histogram (above) consists of 100 bars and contains a significant amount of density fluctuation. The first characteristic pattern found by the search procedure, $P_1$-$P_3$-$P_2$, corresponding to the bars at points x=1, 59, 100, does not satisfy the criteria for splitting (the third criterion of size is violated). The second characteristic pattern detected in the histogram, $P_4$-$P_3$-$P_5$, corresponds to the bars at points x=35, 59, 85 and satisfies all criteria for splitting. Applying a moving average of length 5, a smooth histogram (below) with most of the fluctuations filtered out is obtained, so that a clear characteristic pattern $P_1$-$P_3$-$P_2$ (x=36, 58, 84) can be found in the first run of the search procedure. In general, the number of patterns which must be examined by the algorithm for the original histogram (without smoothing) can be considerably higher. Thus, smoothing the histogram before starting the search procedure allows the characteristic pattern to be found much faster. The value of r should be chosen depending on the number of bars in the histogram: the larger the number of bars, the higher the value of r. For a number of bars below 100 it is recommended to choose the value of r between 3 and 12.

After the decision about splitting the cluster is made, it is necessary to estimate the centres of the newly detected clusters. The easiest way is to apply the fuzzy c-means algorithm or the algorithm of Gath-Geva (if elliptical clusters must be detected) to the objects of the considered cluster in order to partition them into two clusters. The obtained cluster centres can be used as initialisation parameters for re-learning the dynamic classifier during the adaptation procedure. The proposed splitting procedure is summarised in the following algorithm.

Algorithm 5: Detection of heterogeneous clusters to be split.

1. Set $c^{split} = 0$. Find clusters that have absorbed a lot of new objects since the previous time window by verifying condition (4.43) for each cluster, and determine a set $A^{split}$ of candidate clusters for splitting.
2. Choose a cluster from the set $A^{split}$.
3. Calculate principal components to find the feature with maximum variance.
4. Calculate a histogram with respect to the first principal component and smooth it.
5. Set the threshold $r^{reld}$ for the relative difference between the maximum and the minimum value of the histogram. Set the size threshold $r^{diam}$ for the minimum diameter of dense groups of objects within a cluster. Calculate the threshold $r^{dist}$ for a sufficient distance between centres of dense groups of objects using equation (4.45).
6. Find the maximum and minimum values of the histogram and calculate their relative difference according to (4.46). If the relative difference between the maximum and minimum density within the objects is not sufficient, i.e. $r^{reld} > dens$, then this cluster should not be split up. Go to step 13.
7. Calculate the number of extreme points (local minima and local maxima) in the histogram.
7.1 If there is only one extreme point, then the cluster does not need to be split up. Go to step 13.
7.2 If there are exactly three extreme points and their order corresponds to maximum-minimum-maximum, then label them as $P_1$, $P_2$ and $P_3$ and investigate this characteristic pattern for splitting in the next step of the algorithm. If the order of the extreme points corresponds to 'minimum-maximum-minimum', then the cluster does not need to be split up; go to step 13.
7.3 If there are more than three extreme points, then the search for a characteristic pattern in the histogram is continued until the pattern maximum-minimum-maximum satisfying all three criteria is recognised or there are no more patterns to be investigated. If a characteristic pattern is detected, the following conditions are verified.
8. Verify the first criterion for splitting (4.47). If the criterion is satisfied, go to the next step. Otherwise this cluster is rejected for splitting; go to step 13.
9. Verify the second criterion for splitting (4.48). If the criterion is satisfied, go to the next step. Otherwise this cluster is rejected for splitting; go to step 13.
10. Verify the third criterion for splitting (4.49). If the criterion is satisfied, go to the next step. Otherwise this cluster is rejected for splitting; go to step 13.
11. The cluster can be split up. The number of clusters that can appear as a result of splitting is increased by one: $c^{split} = c^{split} + 1$.
12. Estimate two new centres in the cluster using the fuzzy c-means algorithm for the initialisation of the re-learning procedure of the classifier.
13. Go to step 2 of the algorithm and examine the next cluster from the set $A^{split}$, if any.

This algorithm allows heterogeneous clusters which can be split up into two to be detected iteratively. The outcome of the algorithm is the number of clusters to be split up and estimates of the new cluster centres. The flow chart of this algorithm is presented in Figure 4-24. The presented algorithm for the detection of heterogeneous clusters for splitting is integrated into the monitoring procedure used for the design of a dynamic fuzzy classifier.

[Figure 4-24: A flow chart of an algorithm for detecting heterogeneous clusters to be split]

4.6 Detection of Gradual Changes in the Cluster Structure

In the course of time the arrival of new objects may cause some clusters to drift, i.e. the locations of their cluster centres change. These changes in the cluster structure are usually relatively slow and gradual, representing the temporal development of clusters, but they may also be precursors of abrupt changes in the future. In order to detect these gradual changes and to recognise drifting clusters during the monitoring procedure, the change of the classifier performance can be observed and evaluated based on the analysis of the membership functions of fuzzy clusters.

Consider a set of objects $x_1(t), \dots, x_N(t)$, which is observed over time $t \in [t_1, t_p]$. Suppose that at a certain time instant $t_k$ these objects are partitioned into $c_t$ clusters. At time instant $t_{k+1}$ new observations of these objects are classified into the existing clusters. If a subset of the new observations of objects which were absorbed in cluster i at the previous moment has moved away from the cluster centre $v_i$, then obviously the degrees of membership of these observations are decreased compared to the previous time instant. Gradual temporal changes in the membership of observations to clusters can be recognised by considering the variation of the degrees of membership of observations between two time instants:

$$\Delta u_{ij}(t_k) = u_{ij}(t_k) - u_{ij}(t_{k-1}) \qquad (4.52)$$

where $u_{ij}(t_k)$ and $u_{ij}(t_{k-1})$ are the degrees of membership of object $x_j$, $j = 1, \dots, N$, to cluster i at times $t_k$ and $t_{k-1}$, respectively. A positive variation means that object $x_j$ has moved towards cluster i. A negative variation indicates that object $x_j$ has moved away from cluster i. If the number of observations of the objects that have moved away from cluster i is large, one can talk about the drift of cluster i.
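The variation (4.52) and the count of objects moving away from a cluster can be sketched as follows. The function names and the membership values are hypothetical, and how large the share of such objects must be before a drift is declared is not fixed by the text.

```python
def membership_variation(u_now, u_prev):
    """Equation (4.52): change of each object's membership to one cluster."""
    return [now - prev for now, prev in zip(u_now, u_prev)]


def moved_away(u_now, u_prev):
    """Indices of objects whose membership to the cluster has decreased."""
    deltas = membership_variation(u_now, u_prev)
    return [j for j, delta in enumerate(deltas) if delta < 0]


# Hypothetical memberships of four objects to cluster i at t_(k-1) and t_k.
u_prev = [0.9, 0.8, 0.7, 0.6]
u_now = [0.7, 0.85, 0.5, 0.4]

away = moved_away(u_now, u_prev)
share_away = len(away) / len(u_now)
```

Here three of the four objects have lost membership, which would be a hint of a drifting cluster.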
In [Grenier, 1984] a set of production rules for a qualitative description of the temporal development of the cluster structure is introduced, which is based on the analysis of variations in the membership of single objects given by (4.52). However, if the number of objects observed at one time instant is relatively high, it is rather difficult to consider the development of each single object. A total measure characterising the development of a group of new observations with respect to existing clusters and indicating the drift of clusters must be evaluated. In this context it seems reasonable to consider some validity measure for each cluster, taking into account the degrees of membership of the observations as well as the properties of the data structure. In [Bensaid et al., 1996] the compactness index for fuzzy clusters is proposed, which evaluates the correspondence of the data structure to the cluster structure. In contrast to the usual application of a validity measure to compare different partitions, the compactness measure is used during the monitoring procedure to compare the quality, or correctness, of the same cluster between two time instants. Therefore, for evaluating the current compactness of a cluster there is no need to consider all new observations, but only those that were absorbed in the corresponding cluster. These can be considered as good representatives of this cluster, and the intention is to check whether they are well matched by the current cluster centre. The compactness of fuzzy cluster i with respect to absorbed objects is defined in the following way:

$$\pi_i^{u^o} = \frac{\sum_{j=1}^{N_i^{u^o}} (u_{ij})^2 \, \|x_j - v_i\|_A^2}{n_i^{u^o}}, \qquad x_j \in \{x_j \mid u_{ij} \ge u^o\} \qquad (4.53)$$

where $N_i^{u^o}$ is the number of objects absorbed in cluster i, $n_i^{u^o} = \sum_{j=1}^{N_i^{u^o}} u_{ij}$ is the fuzzy cardinality of cluster i taking into account only objects absorbed by this cluster, and A is an arbitrary M × M symmetric positive definite matrix. If the compactness of a cluster is decreased compared to the previous time instant, a drift of the cluster can be assumed and the centre of this cluster must be updated based on the new observations. In other words, if the variation of the compactness between two time instants $t_k$ and $t_{k-1}$

$$\Delta \pi_i^{u^o}(t_k, t_{k-1}) = \pi_i^{u^o}(t_k) - \pi_i^{u^o}(t_{k-1}) \qquad (4.54)$$

is negative, a cluster drift is assumed and the cluster in question is added to a set $A^{gr}$ to be updated during the adaptation procedure.

4.7 Adaptation Procedure

In Sections 4.3, 4.4 and 4.5 three types of abrupt changes that can appear in the cluster structure in the course of time were described, and the corresponding algorithms for recognising these changes by the monitoring procedure were proposed. If abrupt changes have been detected in time window t within the monitoring procedure, the problem is to adapt the classifier to the detected changes in the cluster structure. If such abrupt changes take place, the most reasonable solution is to re-learn the classifier with a new number of clusters in order to identify a new partition.
Estimated centres of new clusters calculated during the monitoring procedure are used together with the existing unchanged cluster centres for the initialisation of the re-learning clustering procedure. Thus, when deriving an adaptation law for a dynamic classifier, the problem is to determine a new number of clusters summarising the results of the monitoring procedure.

Suppose that the number of clusters corresponding to the current classifier designed in the previous time window t-1 is given by $c_{t-1}$. If $c_t^{new}$ new clusters have been detected in the time window t, the number of existing clusters must be increased by $c_t^{new}$ before re-learning. If, due to a significant growth of some clusters in the current time window t and their overlapping with other clusters, $c_t^{merg}$ pairs of similar clusters to be merged are recognised by the monitoring procedure, then the number of existing clusters must be reduced by $c_t^{merg}$ before re-learning. If the absorption of a lot of new objects in the current time window t has led to the formation of $c_t^{split}$ heterogeneous clusters to be split, the number of existing clusters must be increased by $c_t^{split}$ before re-learning the classifier. Obviously, the three types of abrupt changes (formation of new clusters, similar clusters, heterogeneous clusters) can appear in seven combinations. The general adaptation law of a dynamic classifier when all abrupt changes are observed can be formulated as follows:

$$c_t = c_{t-1} + c_t^{new} - c_t^{merg} + c_t^{split} \qquad (4.55)$$

where $c_t$ is the new number of clusters that must be used to re-learn the classifier in the current time window t. The other six formulations of the adaptation law are obtained from (4.55) if one or two of the three last components are equal to zero. The re-learning of the classifier is carried out if at least one of the three components takes a non-zero value. For the initialisation of the classifier the old centres of the unchanged clusters and the estimates of the centres of the new clusters obtained during the monitoring procedure are used. After adapting a dynamic classifier to abrupt changes, the new cluster structure is evaluated by a validity measure. If the new classifier provides a better fuzzy partitioning it is accepted, otherwise the previous classifier is preserved. A variety of validity measures is considered in a separate section in order to choose the most suitable one for a dynamic classifier.
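The adaptation law (4.55), the re-learning condition and the count of seven combinations can be checked with a short sketch; the function names are illustrative only.

```python
from itertools import product


def new_cluster_count(c_prev, c_new=0, c_merg=0, c_split=0):
    """Equation (4.55): number of clusters used to re-learn the classifier."""
    return c_prev + c_new - c_merg + c_split


def needs_relearning(c_new, c_merg, c_split):
    """Re-learning is carried out if at least one component is non-zero."""
    return any(count != 0 for count in (c_new, c_merg, c_split))


# Each of the three abrupt changes is either observed or not; the case in
# which none is observed is excluded, which leaves seven combinations.
combinations_with_change = [flags for flags in product((0, 1), repeat=3)
                            if any(flags)]

c_t = new_cluster_count(5, c_new=1, c_merg=2, c_split=1)
```

In the example one new cluster and one split compensate for two merged pairs, so the number of clusters stays at 5 although re-learning is still required.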
If no abrupt changes are detected and only gradual changes are recognised in the cluster structure during the monitoring procedure described in Section 4.6, a classifier must be incrementally updated based on new objects. The general form of an updating rule for cluster centres was formulated in Section 4. in equation (4.5). The specific updating rule depends on the type of clustering algorithm applied. For the classifier of Gath and Geva the updating procedure involves the cluster prototypes and the fuzzy covariance matrices of all clusters. Assume that N previous objects have already been classified and a new (N+1)-th object is considered. The location of the i-th cluster centre, i = 1, ..., c, is updated in the same way as in the FCM algorithm according to the recursive equation (3.0). The recursive equation for updating the fuzzy covariance matrix of cluster i can be formulated as follows:

\[ F_i = \frac{\sum_{j=1}^{N+1} u_{ij} (x_j - v_i)(x_j - v_i)^T}{\sum_{j=1}^{N+1} u_{ij}} = \frac{\sum_{j=1}^{N} u_{ij} (x_j - v_i)(x_j - v_i)^T + u_{i,N+1} (x_{N+1} - v_i)(x_{N+1} - v_i)^T}{\sum_{j=1}^{N} u_{ij} + u_{i,N+1}} = \frac{FN_i(N) + u_{i,N+1} (x_{N+1} - v_i)(x_{N+1} - v_i)^T}{FD_i(N) + u_{i,N+1}} \qquad (4.56) \]

where

\[ FN_i(N) = \sum_{j=1}^{N} u_{ij} (x_j - v_i)(x_j - v_i)^T \quad \text{and} \quad FD_i(N) = \sum_{j=1}^{N} u_{ij}. \]

Terms $FN_i(N)$ and $FD_i(N)$ are calculated from the previous N objects and their values are saved after each recursion step. The update of the matrix is achieved through the additional terms corresponding to the new object N+1.

The main problem during incremental updating is to decide which new objects should be used to update a classifier. The most reasonable solution is to take into account only good objects. In [Marsili-Libelli, 1998] the use of a validity measure (in particular the entropy) to judge the quality of a new object was suggested: an object can be considered as good if it improves the fuzzy partition. The use of such a measure is, however, disadvantageous if it is assumed that gradual changes may lead to abrupt changes in the course of time. For instance, if two clusters are moving towards each other as time passes, there are a lot of objects with high and ambiguous degrees of membership and, according to a validity measure, the quality of the partition deteriorates, so that these objects are not selected for updating. But these objects are actually good representatives of both clusters that do indeed indicate a slow gradual deterioration of the cluster partition. If these objects are rejected for updating a classifier, the recognition of overlapping clusters and merging clusters will be impossible. Thus, it is suitable to define good objects as good representatives of clusters based on their degrees of membership, i.e. as objects whose degrees of membership to clusters exceed a predefined threshold $\alpha^{good}$ (as was proposed in equation (4.26)):

\[ X^{good} = \{ x_j \mid u_{ij} \ge \alpha^{good} \}. \qquad (4.57) \]

This means that one object can be considered as good for more than one cluster and consequently it can influence several clusters leading to gradual changes in the cluster

structure. On the other hand, a classifier can follow gradual changes that may lead to abrupt changes due to the update of cluster prototypes. Threshold $\alpha^{good}$ can be chosen equal to the absorption threshold so that all objects absorbed by a cluster can influence it. Obviously, the higher the threshold $\alpha^{good}$, the smaller the number of good objects. Bad objects rejected for updating a classifier can be considered as candidates for a new cluster.

After adapting a classifier to gradual and abrupt changes it is necessary to update the training data set, or template set, which contains the best cluster representatives and the most recent objects observed. On the one hand, the template set extended by new objects in each time window is considered during the monitoring procedure to recognise temporal changes in the dynamic cluster structure. On the other hand, it is used to re-learn a classifier if adaptation to abrupt changes is needed. Thus, the template set must combine the most up-to-date information with the most useful old information.

4.8 Updating the Template Set of Objects

Three different approaches for updating the training set of objects were discussed in an earlier section. Using a moving time window, the training set consists of the objects in the current time window, which are used during the monitoring procedure to recognise temporal changes in the cluster structure and to adapt a classifier. Using the concept of the template set, all representative objects can be included in the template set. As long as no changes, or only slight gradual changes, are observed, new objects of the current time window are added to the template set. As soon as abrupt changes appear, the template set is substituted by the set of the most recent objects. Using the concept of usefulness, the representative objects for the template set are selected from new objects based on the value of their usefulness, which can be defined depending on the type of the classifier.
If gradual changes are observed in the course of time, objects become old and less useful, and if their degree of usefulness falls below a certain threshold they are discarded. In the case of abrupt changes the template set is substituted by the most recent objects, for which a new record of usefulness is derived.

A common feature of all these approaches is that the training set, or template set, of objects includes only recent objects that are representative of, and useful for, the current window. In the case of abrupt changes in the cluster structure the old template set is discarded and substituted by a set of new objects arriving in the current time window. In this case only the current cluster structure is considered for classifier design and the information about the old cluster structure is lost. However, it is not advisable simply to discard and forget clusters that have already been learned. They may appear again in the future, and it is easier and much quicker to identify an object by classifying it to existing clusters than by detecting a new cluster.

Thus, by contrast to the aforementioned approaches, the main idea of updating the template set in the dynamic classifier design proposed in this thesis is to preserve all clusters detected in the course of time by including their best representatives in the template set. The best representatives for the template set are defined by choosing good, or useful, objects. The degree of usefulness of an object can be defined as the highest degree of membership of an object to one of the clusters:

\[ u(x_j) = \max_{i=1,\dots,c} u_{ij}, \quad j = 1, \dots, N \qquad (4.58) \]

In this way the template set must contain the best representatives of clusters ever detected as well as the most recent objects. These two types of objects can be considered as two sub-sets in the template set. If an existing cluster absorbs new objects over and over again, the sub-set of its best representatives in the template set is constantly updated to be able to follow possible gradual changes of the cluster. Otherwise, if a cluster is not up-to-date any more, the sub-set of its representatives remains unchanged as time passes. The sub-set of the most recent objects includes all new objects observed during the last ρ time windows. As stated in [Nakhaeizadeh et al., 1996], it is not sufficient to consider only good and useful objects, since in this case the classifier will always ignore bad objects with low degrees of membership to all clusters which may, for instance, build a new cluster in the course of time. Since the criteria for detection of new clusters require a sufficient number of free objects not already absorbed by existing clusters, it seems reasonable to keep these objects in the template set for a while so that, supplemented by new objects, they may lead to the formation of new clusters.

Thus, the structure of the template set can be represented as follows (Figure 4-25). Suppose that the current cluster structure consists of c clusters.
The template set includes a set of the most recent objects observed during the last ρ time windows and a set of the best representatives of existing clusters. The set of the most recent objects is organised in c sub-sets of objects absorbed by the corresponding clusters and a sub-set of free objects, shown as boxes with ρ layers in Figure 4-25. Each layer of c boxes contains $n_i^a(T(k))$, i = 1, ..., c, objects absorbed during time window T(k) by the corresponding cluster. The set of best representatives of clusters is also separated into c sub-sets for the c existing clusters, each containing $N_i$, i = 1, ..., c, objects selected over time according to their degrees of usefulness.
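The degree of usefulness (4.58) and the division of classified objects into absorbed and free objects can be illustrated by a minimal sketch. It assumes that every object is filed under the single cluster of its highest membership and that an object counts as absorbed when this membership reaches a threshold `alpha`; both the function names and this filing rule are assumptions made for the illustration.

```python
def usefulness(memberships):
    """Degree of usefulness of one object, eq. (4.58): its highest membership."""
    return max(memberships)


def split_absorbed_and_free(U, alpha):
    """Sort objects into per-cluster sub-sets and a sub-set of free objects.

    U[j] holds the degrees of membership of object j to the c clusters.
    Returns (absorbed, free): absorbed maps a cluster index to the indices of
    the objects filed under it, free lists the objects whose usefulness stays
    below the (assumed) absorption threshold alpha.
    """
    c = len(U[0])
    absorbed = {i: [] for i in range(c)}
    free = []
    for j, u_j in enumerate(U):
        best = max(range(c), key=u_j.__getitem__)  # cluster of highest membership
        if usefulness(u_j) >= alpha:
            absorbed[best].append(j)
        else:
            free.append(j)
    return absorbed, free


U = [[0.9, 0.2], [0.3, 0.8], [0.2, 0.1]]
print(split_absorbed_and_free(U, alpha=0.5))
```

The free objects are the ones kept in the template set for a while as candidates for new clusters.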

[Figure 4-25: The structure of the template set. New objects enter the set of the most recent objects, which keeps the objects of time windows T(k), ..., T(k-ρ) in sub-sets for clusters $C_1, \dots, C_c$ (sizes $n_1^a, \dots, n_c^a$) and for free objects (size $n^f$). Useful objects pass on to the sets of best representatives of clusters (sizes $N_1, \dots, N_c$); the remaining objects are discarded as waste.]

4.8.1 Updating the template set after gradual changes in the cluster structure

Based on the above considerations the following procedure for updating the template set is proposed. Suppose that the size of the template set is chosen ρ times larger than the size of the time window and is restricted by the maximum number $N_{max}$ of objects that can be contained in the set. This means that if the length of the time window is given by $N_{tw}$ objects, the current size of the template set $N_{ts}$ satisfies

\[ \rho N_{tw} \le N_{ts} \le N_{max}. \qquad (4.59) \]

If $N_{T(k)} = N_{tw}$ new objects arrive in the time window T(k) and after their classification only gradual changes can be observed, the new objects are saved in the template set in c+1 sets of the most recent objects according to their assignment to clusters. The oldest objects, which arrived in time window T(k-ρ), are discarded from these sets and the most useful of them (determined using (4.58)) are added to the sets of cluster representatives. Hence the updating of the template set follows the principle "first in, first out" of queue theory. Obviously, the sets of cluster representatives grow as time passes and the template set can become larger than $N_{max}$. In this case superfluous objects must be discarded from these sets according to the queue principle as well. The number of representatives for each cluster is reduced in such a way that the relation between the numbers of representatives for each cluster

remains unchanged. This is very important in order to preserve the information about the existing cluster structure, taking into consideration the different sizes of clusters, since this information is used during the monitoring procedure to recognise new clusters.

In order to calculate the number of objects that must be discarded from each set of cluster representatives, or inversely the number of objects that can remain in the sets, the principle of proportional allocation used in the stratified sampling approach can be applied [Pokropp, 1998, p. 56]. Consider the c sets of cluster representatives as a population $N_{rep}$ consisting of c clusters, or strata, $N_1, \dots, N_c$. The size of the population is given by

\[ N_{rep} = \sum_{i=1}^{c} N_i. \qquad (4.60) \]

If $N^{new}$ new objects have to be added to this population, it must be reduced beforehand to the size $N_{rep}^{new} = N_{rep} - N^{new}$, i.e. a sample of size $N_{rep}^{new}$ has to be selected from $N_{rep}$. This can be achieved by selecting a partial sample from each stratum $N_i$, i = 1, ..., c, so that the total sample size is equal to $N_{rep}^{new}$ and the relation of the partial samples is equal to the relation of the strata $N_i$. Using the rule of proportional allocation the size of the partial samples is determined in the following way:

\[ N_i^{new} = \mathrm{round}\left( \frac{N_i}{N_{rep}} \, N_{rep}^{new} \right), \quad i = 1, \dots, c \qquad (4.61) \]

Thus, if $N^{new}$ useful objects are added to the sets of best representatives of clusters, $(N_i - N_i^{new})$ objects are discarded from each cluster, keeping the relation between the sets of cluster representatives unchanged. Hence, only new, useful objects can influence and change this relation. Note that if there are no new, useful objects for some cluster, this cluster is not involved in the calculation of the population size (4.60) and no objects are discarded from this cluster. This condition prohibits the degradation of old clusters and guarantees that clusters that are no longer supported by new objects will be preserved anyway (i.e. the set of cluster representatives is "frozen").
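The proportional allocation rule, including the exclusion of clusters that receive no new useful objects, can be sketched as follows. The function name, the `updated` flags and the sizes in the usage line are illustrative assumptions; rounding is done half-up, whereas Python's built-in `round` rounds halves to the nearest even integer.

```python
import math


def proportional_allocation(sizes, n_new, updated):
    """Reduce the sets of cluster representatives before adding new objects.

    sizes   -- current numbers N_i of representatives per cluster
    n_new   -- total number of new useful objects to be added
    updated -- updated[i] is True if cluster i receives new useful objects;
               the other clusters are frozen and keep all representatives
    Returns the reduced sizes N_i^new.
    """
    if n_new == 0:
        return list(sizes)
    n_rep = sum(n for n, upd in zip(sizes, updated) if upd)  # population size
    target = n_rep - n_new                                   # reduced population
    if n_rep == 0 or target < 0:
        raise ValueError("not enough representatives in the updated clusters")
    reduced = []
    for n, upd in zip(sizes, updated):
        if upd:
            reduced.append(math.floor(n * target / n_rep + 0.5))  # round half up
        else:
            reduced.append(n)
    return reduced


# Two clusters receive 100 new useful objects in total, the third is frozen.
print(proportional_allocation([260, 90, 150], 100, [True, True, False]))
```

Because each partial sample is rounded separately, the sum of the reduced sizes may differ from the target size by a few objects in general; a production implementation would need a rule for distributing this remainder.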
Example 3: Illustration of the updating procedure for the template set.

The example shown in Figure 4-26 illustrates the principle of proportional allocation applied to the updating procedure of the template set. Consider the template set in time window T(k) and suppose that the maximum size of this set is equal to 1300. The size of the time window is restricted to 200 objects and the set of the most recent objects keeps the objects of the last four time windows. The sets of cluster representatives contain $N_1(T(k)) = 260$, $N_2(T(k)) = 90$ and $N_3(T(k)) = 150$ objects in the current time window.

[Figure 4-26: Illustration of the updating procedure of the template set. The set of the most recent objects holds the layers of time windows T(k+1), T(k), T(k-1), T(k-2) and T(k-3) for Cluster 1, Cluster 2, Cluster 3 and the free objects; the sets of best representatives of existing clusters contain $N_1 = 260$, $N_2 = 90$ and $N_3 = 150$ objects.]

If 200 new objects arrive during time window T(k+1), then the most recent objects of time window T(k-3), consisting of 30 objects absorbed in Cluster 1, 120 objects absorbed in Cluster 2 and 50 free objects, have to be discarded. Suppose that 20 and 80 useful objects are selected as the best representatives for Clusters 1 and 2 respectively and have to be added to the corresponding sets. This means that the number of objects in the sets of the best cluster representatives has to be reduced by $N^{new} = 100$ objects to $N_{rep}^{new} = 500 - 100 = 400$. It must be noted that the set of Cluster 3 is not extended by new useful objects and therefore remains unchanged. Hence, the size of the population considered is $N_{rep} = 350$ and only sets $N_1$ and $N_2$ have to be reduced, to 250 objects in total. The sizes of the partial samples for these two sets are calculated according to equation (4.61):

\[ N_1^{new} = \mathrm{round}\left( \frac{260}{350} \cdot 250 \right) = 186, \qquad N_2^{new} = \mathrm{round}\left( \frac{90}{350} \cdot 250 \right) = 64. \]

The relation of the cluster sizes remains unchanged: 260:90 = 186:64 = 2.9. After adding the new useful objects to the sets of cluster representatives, their sizes are $N_1(T(k+1)) = 206$, $N_2(T(k+1)) = 144$ and $N_3(T(k+1)) = 150$. As can be seen, the size relation between Clusters 1

and 2 has changed to 206:144 = 1.4 due to new objects, but the total number of objects contained in the sets of cluster representatives remains equal to 500.

4.8.2 Updating the template set after abrupt changes in the cluster structure

If abrupt changes are detected after classification of new objects in the current time window T(k), the template set must be structurally changed. Depending on the type of abrupt changes three situations can be distinguished:

1. Formation of new clusters: new sets containing objects absorbed by new clusters in the current window are defined and new sets for the best representatives of the new clusters are added to the template set.

2. Merging two clusters: the two sets containing the most recent objects of the corresponding clusters are combined in a single one filled with the objects (the most recent and new objects) absorbed by the new cluster in the current time window. The number of objects absorbed by this cluster in the previous ρ-1 time windows is initialised to zero. The best representatives of the two clusters are combined in a single set that is also initialised to zero.

3. Splitting a cluster: the set containing the most recent objects of the corresponding cluster is split into two sets containing the objects (the most recent and new objects) absorbed in the two new clusters in the current time window. The number of objects absorbed in these clusters in the previous ρ-1 time windows is initialised to zero. The best representatives of the corresponding cluster are also split into two sets initialised to zero.

The other sets of the most recent objects and cluster representatives are preserved and updated with new objects as described above for the case of gradual changes.
It must be noted that in the case of abrupt changes, after modification of the template set, most of the objects previously contained in the sub-sets of clusters that have been changed (merged or split) can be absorbed in the newly formed clusters and concentrated in one sub-set of time window T(k), whereas the other sub-sets of these clusters are empty. If new objects arrive during time window T(k+1), a part of the most recent objects has to be discarded and the most useful of them have to be added to the set of the best cluster representatives. In contrast to the updating procedure in the case of gradual changes, objects to be discarded from the sets corresponding to newly formed clusters are selected from time window T(k) (instead of time window T(k-ρ)), since they represent the oldest objects in these sets. Thus, it can generally be said that if the template set is getting too large, objects from the bottom of the set of the most recent objects, which may correspond to time windows between T(k-ρ) and T(k), are discarded.

Using the concept of usefulness of objects and an adaptive training set that is updated by including new and useful objects, the time window can be chosen to be of constant length. The choice of this parameter, as well as the number of time windows establishing the set of the most recent objects in the template set, must be defined depending on the application. If the size of the time window is relatively small, the number of time windows kept in the template set must be sufficiently large, and vice versa. Such a relation should guarantee a sufficient number of new and recent objects needed to recognise abrupt changes in the dynamic cluster structure.

4.9 Cluster Validity Measures for Dynamic Classifiers

During dynamic classifier design the validity measure is used to control the process of adaptation of a classifier to temporal changes in the cluster structure. The goal of a validity measure is to determine whether the adaptation of a classifier leads to an improvement of the partitioning and to make a decision regarding acceptance or rejection of modifications applied to a classifier (such as formation, merging or splitting of clusters). If, after having detected abrupt changes, a classifier has been re-learned, a validity measure is evaluated for the current partition and compared to the previous value of the validity measure before re-learning. If the validity measure indicates an improvement of the current partitioning compared to the previous one, the new classifier is retained, otherwise the previous classifier is restored.

Validity measures are generally used for the evaluation and comparison of the quality of fuzzy partitions with different numbers of clusters. In other words, validity measures evaluate how well the given partition of data into clusters reflects the actual structure of the data. The requirements for the definition of the optimal partition of data into clusters and the corresponding criteria for the definition of validity measures are usually formulated as follows [Gath, Geva, 1989]:

1. Clear separation between the resulting clusters (separability),
2. Minimum volume of the clusters (compactness),
3. Maximum number of objects concentrated around the cluster centres (density of objects within clusters).

Thus, although fuzzy methods for clustering are used, the aim of clustering fulfilling these requirements is to generate well-defined subgroups of objects leading to a harder partition of the data set.

In the literature a large number of validity measures have been proposed, which can be separated into three large classes depending on the properties of cluster partitions used: measures using properties of the degrees of membership, measures using properties of the data structure, and

measures based on both types of properties.

The first class of validity measures is represented by the partition coefficient [Bezdek, 1981, p. 100], the classification entropy [Bezdek, 1981, p. 111] and the proportion exponent [Windham, 1981]. These measures evaluate the degree of fuzziness of a cluster partition.

The second class of validity measures, based only on the data, aims at evaluating cluster properties such as compactness and separation. The compactness of clusters characterises the spread or variation of objects belonging to the same cluster. The separation of clusters means the isolation of clusters from each other. Most of the validity measures of this class ([Dunn, 1974], [Gunderson, 1978], [Davies, Bouldin, 1979]) were proposed based on properties of the crisp partition and depend on the topological structure induced by the distance metric used.

The largest class of validity measures includes measures based on both the properties of the degrees of membership and on the data structure. Here one can distinguish between measures based on the criteria of volume and density of fuzzy clusters and those based on compactness and separation.

The three best known validity measures of the first group were proposed in [Gath, Geva, 1989]. Suppose that $F_i$ is the fuzzy covariance matrix of cluster i defined by equation (4.24). The fuzzy hypervolume criterion is then calculated by

\[ v_{HV}(c) = \sum_{i=1}^{c} h_i, \qquad (4.62) \]

where the hypervolume of the i-th cluster is defined as $h_i = \sqrt{\det(F_i)}$. The partition density is calculated by

\[ v_{PD}(c) = \frac{\sum_{i=1}^{c} n_i^{good}}{\sum_{i=1}^{c} h_i}, \qquad (4.63) \]

where $n_i^{good}$ is the number of good objects in cluster i defined by (4.25). The average partition density is calculated by

\[ v_{APD}(c) = \frac{1}{c} \sum_{i=1}^{c} \frac{n_i^{good}}{h_i}. \qquad (4.64) \]

Density criteria are rather sensitive in cases of substantial overlapping between clusters and large variability in compactness of existing clusters. Since the average partition density is calculated as the average of the densities of single clusters, it indicates a good partition even if both dense and loose clusters are present.
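The three density-based criteria can be compared on a small numerical case. The sketch below assumes that the determinants of the fuzzy covariance matrices and the numbers of good objects are already available, and it takes the hypervolume of a cluster as the square root of the determinant, following the definition of Gath and Geva; the function names and input values are illustrative only.

```python
import math


def hypervolumes(dets):
    """Hypervolume h_i of each cluster from det(F_i)."""
    return [math.sqrt(d) for d in dets]


def fuzzy_hypervolume(dets):
    """Fuzzy hypervolume criterion: sum of the cluster hypervolumes."""
    return sum(hypervolumes(dets))


def partition_density(n_good, dets):
    """Partition density: all good objects divided by the total hypervolume."""
    return sum(n_good) / fuzzy_hypervolume(dets)


def average_partition_density(n_good, dets):
    """Average partition density: mean of the densities of single clusters."""
    h = hypervolumes(dets)
    return sum(n / h_i for n, h_i in zip(n_good, h)) / len(h)


# One dense cluster (40 good objects, small volume) and one loose cluster.
n_good, dets = [40, 10], [0.25, 4.0]
print(fuzzy_hypervolume(dets))                   # 2.5
print(partition_density(n_good, dets))           # 20.0
print(average_partition_density(n_good, dets))   # 42.5
```

In this case the average partition density is dominated by the dense cluster, whereas the partition density is pulled down by the large volume of the loose cluster.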
The partition density corresponds to the common

physical definition of density without taking into account the distribution of density over clusters.

Another group of validity measures is based on fuzzy compactness and separation. Based on the ideas introduced in [Xie, Beni, 1991] the global compactness and the fuzzy separation of fuzzy clusters are defined in [Bensaid et al., 1996] in the following way:

\[ \pi_i = \frac{\sum_{j=1}^{N} (u_{ij})^2 \, \| x_j - v_i \|_A^2}{n_i}, \quad i = 1, \dots, c \qquad (4.65) \]

\[ S_i = \sum_{r=1}^{c} \| v_i - v_r \|_A^2, \quad i = 1, \dots, c \qquad (4.66) \]

where $n_i = \sum_{j=1}^{N} u_{ij}$ is the fuzzy cardinality of cluster i and A is an arbitrary M × M symmetric positive definite matrix. The validity index for cluster i is given by the ratio between its fuzzy separation and its compactness, and the total validity measure is obtained by summing up this index over all clusters:

\[ SC_1(U, V, X) = \sum_{i=1}^{c} \frac{S_i}{\pi_i} = \sum_{i=1}^{c} \frac{n_i \sum_{r=1}^{c} \| v_i - v_r \|_A^2}{\sum_{j=1}^{N} (u_{ij})^2 \, \| x_j - v_i \|_A^2}. \qquad (4.67) \]

A larger value of $SC_1$ indicates a better partition, i.e. a fuzzy partition characterised by well-separated and compact fuzzy clusters.

In [Zahid et al., 1999] this definition of the validity measure was extended by considering the ratio between the fuzzy separation and compactness obtained only from the properties of fuzzy membership functions. The fuzzy compactness indicates how well objects are classified, or how close objects are located to cluster centres, by considering the objects' maximum degrees of membership. Compact clusters are obtained if all maximum degrees of membership of objects take high values. The fuzzy separation measures the pairwise intersection of fuzzy clusters by considering the minimum degrees of membership of an object to a pair of clusters. Clusters are well-separated if they do not intersect or if their intersection area is minimal.

\[ FC = \frac{\sum_{j=1}^{N} \left( \max_i u_{ij} \right)^2}{n_{max}} \qquad (4.68) \]

\[ FS = \sum_{i=1}^{c-1} \sum_{r=i+1}^{c} \frac{\sum_{j=1}^{N} \left( \min(u_{ij}, u_{rj}) \right)^2}{n_{ir}} \qquad (4.69) \]

where $n_{max} = \sum_{j=1}^{N} \max_i u_{ij}$ is the cardinality of the set of maximum degrees of membership and $n_{ir} = \sum_{j=1}^{N} \min(u_{ij}, u_{rj})$ is the cardinality of the set of pairwise minimum degrees of membership. The resulting validity measure based on membership functions is obtained as the ratio between the fuzzy separation and compactness:

\[ SC_2(U) = \frac{FS}{FC} = \frac{\sum_{i=1}^{c-1} \sum_{r=i+1}^{c} \frac{1}{n_{ir}} \sum_{j=1}^{N} \left( \min(u_{ij}, u_{rj}) \right)^2}{\frac{1}{n_{max}} \sum_{j=1}^{N} \left( \max_i u_{ij} \right)^2} \qquad (4.70) \]

A compact and well-separated cluster partition corresponds to a low value of $SC_2$. The overall validity measure is defined in [Zahid et al., 1999] as the degree of correspondence between the structure of the input data set and the fuzzy partition resulting from the fuzzy clustering algorithm:

\[ SC = SC_1(U, V, X) - SC_2(U). \qquad (4.71) \]

A larger value of SC indicates a better fuzzy partition. This validity measure indicates cohesive clusters with a small overlap between pairs of clusters.

All validity measures described above are usually used in combination with clustering algorithms to determine the optimal fuzzy c-partition, i.e. a clustering algorithm is applied to partition a data set into a number of clusters varying between 2 and $c_{max}$ and the partition with the best value of the validity measure is chosen as the optimal one. When validity measures are applied during dynamic classifier design the intention remains unchanged, but the comparison is carried out only for two partitions: before and after an adaptation of the classifier. It is assumed that the optimal number of clusters is determined by the monitoring procedure and that the task of the validity measure is to confirm the new partition obtained. In order to choose a suitable validity measure it is reasonable to take into account the types of temporal changes that can appear in the cluster structure and lead to the modification of the classifier.
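The membership-based quantities FC, FS and their ratio can be computed directly from a partition matrix, as in the following minimal sketch. The matrix layout (`U[i][j]` is the membership of object j in cluster i), the function names and the convention of skipping cluster pairs that do not intersect at all are assumptions made for this illustration.

```python
def fuzzy_compactness(U):
    """FC: squared maximum memberships divided by their cardinality n_max."""
    col_max = [max(col) for col in zip(*U)]  # max over clusters, per object
    return sum(m * m for m in col_max) / sum(col_max)


def fuzzy_separation(U):
    """FS: pairwise intersection of fuzzy clusters, summed over all pairs."""
    total = 0.0
    for i in range(len(U)):
        for r in range(i + 1, len(U)):
            mins = [min(a, b) for a, b in zip(U[i], U[r])]
            n_ir = sum(mins)
            if n_ir > 0:  # assumed convention: disjoint pairs contribute zero
                total += sum(m * m for m in mins) / n_ir
    return total


def sc2(U):
    """Ratio of fuzzy separation to fuzzy compactness; low values are better."""
    return fuzzy_separation(U) / fuzzy_compactness(U)


crisp = [[1.0, 0.0], [0.0, 1.0]]   # two objects, unambiguously assigned
vague = [[0.5, 0.5], [0.5, 0.5]]   # two objects shared equally by both clusters
print(sc2(crisp), sc2(vague))
```

For the crisp partition the clusters do not intersect and the ratio is zero, whereas the fully ambiguous partition yields a clearly larger value.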

Considering the criteria proposed to detect abrupt temporal changes, it can be seen that the decision to split a cluster or form a new one was based on the criteria of density and separation: a classifier must be changed if dense groups well-separated from existing clusters, or within a single cluster, are detected. The decision regarding the merging of clusters was based on the ambiguity property of a fuzzy partition due to the intersection of fuzzy clusters: a classifier must be changed if there are pairs of clusters with a significant overlap. Therefore, the purpose of the adaptation of a classifier is to achieve the most unambiguous hard partition with dense well-separated clusters.

It seems reasonable to consider validity measures based on the average partition density (4.64) as well as on fuzzy separation and compactness. Instead of evaluating the pairwise intersection of fuzzy clusters, a measure of fuzzy separation can be obtained by considering the difference between the highest and second highest degree of membership of each object to the clusters. This quantity can serve as a much better indicator for separation of clusters since it measures the degree of ambiguity of an object assignment when clusters overlap. A high value of this quantity is a sign of a rather hard assignment of objects. The total value of the validity measure based on the principle of ambiguity can be obtained by aggregating the membership differences over all objects using, for instance, the arithmetic mean operator and considering the ratio between the fuzzy separation and the fuzzy compactness given by (4.68):

\[ SC_3(U) = \frac{FSA}{FC} = \frac{\sum_{j=1}^{N} \left( u_j^{m_1} - u_j^{m_2} \right)}{\sum_{j=1}^{N} \max_i u_{ij}}, \qquad (4.72) \]

where FSA is the fuzzy separation based on ambiguity, $u_j^{m_1} = \max_{i \in \{1,\dots,c\}} u_{ij}$ is the highest degree of membership of object j and $u_j^{m_2} = \max_{i \in \{1,\dots,c\} \setminus \{m_1\}} u_{ij}$ is its second highest degree of membership. A larger value of $SC_3$ corresponds to a better partition.
The value of FSA equal to 1 corresponds to a hard unambiguous partition, whereas the value equal to zero corresponds to the most ambiguous partition.²

Using the validity measure as an indicator for adapting the dynamic classifier, it is necessary to compare the value of the validity measure after adaptation with the one before adaptation. A positive difference in the validity measure indicates an improvement of the quality of the dynamic classifier. A negative difference indicates a deterioration of classifier performance in the case of adaptation; the adaptation must then be rejected and the previous classifier must be preserved.

² Since the validity measure is used here to evaluate a possibilistic c-partition generated by a dynamic classifier, it cannot be stated that U = [1/c] if the partition is ambiguous, as is the case for a probabilistic fuzzy c-partition.
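The ambiguity-based measure can be sketched in a few lines. In this illustration both FSA and the compactness term are taken as arithmetic means over the objects, so that the factor 1/N cancels in the ratio; the matrix layout (`U[i][j]` is the membership of object j in cluster i) and the function names are assumptions, and at least two clusters are required.

```python
def fuzzy_separation_ambiguity(U):
    """FSA: mean difference between highest and second highest membership."""
    columns = list(zip(*U))  # one tuple of memberships per object
    total = 0.0
    for col in columns:
        first, second = sorted(col, reverse=True)[:2]
        total += first - second
    return total / len(columns)


def mean_max_membership(U):
    """Compactness term: mean of the maximum memberships of the objects."""
    columns = list(zip(*U))
    return sum(max(col) for col in columns) / len(columns)


def sc3(U):
    """Ratio FSA / FC; larger values indicate a less ambiguous partition."""
    return fuzzy_separation_ambiguity(U) / mean_max_membership(U)


hard = [[1.0, 0.0], [0.0, 1.0]]
ambiguous = [[0.5, 0.5], [0.5, 0.5]]
print(fuzzy_separation_ambiguity(hard), fuzzy_separation_ambiguity(ambiguous))
```

The two test partitions reproduce the limiting cases named in the text: FSA equals 1 for the hard partition and 0 for the fully ambiguous one.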

However, a problem may arise in which the validity measure $SC_3$ does not adequately reflect the improvement of the dynamic classifier. For instance, if during the adaptation of the classifier a cluster was split up, it is to be expected that the fuzzy compactness will increase much more than the fuzzy separation, which can even decrease due to a certain overlapping of the new clusters. Moreover, an increase in the average partition density after splitting clusters can be expected. In order to allow the adaptation of the classifier, the negative variation of the validity measure can be accepted as a trade-off for the significant increase of compactness and of the average partition density. Thus, in order to obtain relevant information about changes in the partition quality, it seems reasonable to consider the variation of three indexes: fuzzy separation, fuzzy compactness and average partition density.

The variation of the fuzzy separation measure between time instants t and t-1 is given by the following equation:

\[ \Delta FSA_{t,t-1} = \frac{1}{N} \sum_{j=1}^{N} \left( u_j^{m_1}(t) - u_j^{m_2}(t) \right) - \frac{1}{N} \sum_{j=1}^{N} \left( u_j^{m_1}(t-1) - u_j^{m_2}(t-1) \right) = \frac{1}{N} \sum_{j=1}^{N} \left[ \left( u_j^{m_1}(t) - u_j^{m_1}(t-1) \right) - \left( u_j^{m_2}(t) - u_j^{m_2}(t-1) \right) \right] \qquad (4.73) \]

The variation of the fuzzy compactness between time instants t and t-1 is defined as follows:

\[ \Delta FC_{t,t-1} = \frac{1}{N} \sum_{j=1}^{N} \max_i u_{ij}(t) - \frac{1}{N} \sum_{j=1}^{N} \max_i u_{ij}(t-1) = \frac{1}{N} \sum_{j=1}^{N} \left( \max_i u_{ij}(t) - \max_i u_{ij}(t-1) \right) \qquad (4.74) \]

The variation of the average partition density between time instants t and t-1 is obtained in the following way:

\[ \Delta v^{apd}_{t,t-1} = \frac{1}{c(t)} \sum_{i=1}^{c(t)} \frac{n_i^{good}(t)}{h_i(t)} - \frac{1}{c(t-1)} \sum_{i=1}^{c(t-1)} \frac{n_i^{good}(t-1)}{h_i(t-1)} \qquad (4.75) \]

If at least one of these three validity measures has a positive variation and the others have only a small negative variation, a new classifier can be accepted. The following examples show that different validity measures can be relevant for adapting the dynamic classifier depending on the changes taking place.
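The acceptance rule based on the three variations can be sketched as follows. The text does not quantify what a "small" negative variation is, so the relative tolerance used here is a purely hypothetical parameter, as are the function name, the dictionary keys and the numerical values in the usage lines.

```python
def accept_adaptation(before, after, tolerance=0.05):
    """Decide whether a re-learned classifier is accepted.

    before, after -- dicts with the keys 'FSA', 'FC' and 'APD' holding the
                     three indexes before and after the adaptation
    tolerance     -- hypothetical relative bound on a tolerated decrease
    Returns (accepted, deltas). The classifier is accepted if at least one
    index improves and no index drops by more than tolerance * |old value|.
    """
    keys = ("FSA", "FC", "APD")
    deltas = {k: after[k] - before[k] for k in keys}
    improved = any(deltas[k] > 0 for k in keys)
    tolerable = all(deltas[k] >= -tolerance * abs(before[k]) for k in keys)
    return improved and tolerable, deltas


before = {"FSA": 0.607, "FC": 0.683, "APD": 100.0}
after = {"FSA": 0.649, "FC": 0.654, "APD": 160.0}
accepted, deltas = accept_adaptation(before, after)
print(accepted)
```

A relative rather than absolute tolerance is used because the average partition density lives on a different scale from the two membership-based indexes.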

Example 4: Splitting clusters based on the average partition density measure.

Consider the cluster structure at a certain time instant t-1 shown in Figure 4-27. The current classifier is characterised by two clusters with centres at $v_1$ = (-0.02, -0.0) and $v_2$ = (.65; .99). Due to new objects absorbed by cluster $C_2$, the monitoring procedure has detected a heterogeneous cluster $C_2$ which must be split up. The adaptation of the classifier is carried out by re-learning the classifier with a number of clusters c = 3. The values of the validity measures before and after adaptation are summarised in Table 4-4. As can be seen, the fuzzy separation and compactness of the fuzzy partition have decreased but the average partition density has increased considerably (by more than 2 times). Thus, this result corresponds to the splitting criteria and the new classifier can be accepted.

[Figure 4-27: Adaptation of a classifier requires splitting clusters]

Table 4-4: Validity measures for a fuzzy partition before and after splitting clusters

Before splitting        After splitting
FSA = 0.68              FSA = 0.66
FC = 0.69               FC = 0.68
SC_3 = 0.988            SC_3 = 0.975
v_apd = 97.68           v_apd =

Example 5: Merging clusters based on the fuzzy separation and average partition density measures.

Figure 4-28 illustrates a cluster structure containing three clusters with centres at $v_1$ = (-0.02; 0), $v_2$ = (2.09; .99) and $v_3$ = (.09; 2.07). After classification of the new objects arriving in time window t, the monitoring procedure has detected two similar clusters, $v_2$ and $v_3$, which must be merged. After the adaptation of the classifier by re-learning with a new cluster number c = 2

a partition with better separated, denser but less compact clusters is obtained. The values of the validity measures before and after merging clusters are shown in Table 4-5. Since the partition is improved with respect to two measures, the new classifier is accepted.

[Figure 4-28: Adaptation of a classifier requires cluster merging]

Table 4-5: Validity measures for a fuzzy partition before and after merging clusters

Before merging          After merging
FSA = 0.607             FSA = 0.649
FC = 0.683              FC = 0.654
SC_3 = 0.889            SC_3 = 0.993
v_apd =                 v_apd = 6.0

The use of three different validity measures leads to a more complete judgement of changes in a fuzzy partition and establishes more reliable control of the adaptation of a dynamic classifier. The use of validity measures during classifier design and adaptation guarantees the preservation and improvement of classifier performance over time and concludes the learning-and-working cycle of the dynamic pattern recognition system.

4.10 Summary of the Algorithm for Dynamic Fuzzy Classifier Design and Classification

The overall algorithm for dynamic fuzzy classifier design and classification is summarised in Figure 4-29. As can be seen, the design of the dynamic classifier is combined with the classification of new objects in a single learning-and-working cycle in order to keep the classifier up-to-date and to adapt it to the temporal changes in the cluster structure.

Figure 4-29: The process of dynamic fuzzy classifier design and classification (flowchart: new objects in time window t are classified by the classifier at time t-1; the monitoring procedure distinguishes gradual changes, handled by incremental updating, from abrupt changes, handled by re-learning with the new cluster number $c_t = c_{t-1} + c_{new} - c_{merg} + c_{split}$; depending on the cluster validity, the new classifier is accepted or the old one is restored, and the template set is updated)

Consider a classifier designed in time window t-1. Suppose that new objects arrive in time window t. They are classified by the current classifier and the classification results are introduced into the monitoring procedure for analysis. The aim of the monitoring procedure is to decide whether new objects fit well into the current cluster structure and, if not, to recognise changes. In order to fulfil this task the monitoring procedure has four algorithms at its disposal.

Based on the algorithm for evaluating the performance of the classifier applied to new data (Section 4.6), gradual changes such as the drift of cluster centres can be detected. This decision of the monitoring procedure is introduced into the adaptation procedure, where incremental updating of the corresponding cluster centres is carried out as explained in Section 4.7.

Using the three algorithms for detection of new, similar and heterogeneous clusters, which were presented in Sections 4.3, 4.4 and 4.5, respectively, the monitoring procedure can detect abrupt temporal changes in the cluster structure and results in any combination of three possible decisions: formation of new clusters, merging of clusters, or splitting of clusters. Each of the three algorithms of the monitoring procedure delivers the number of clusters by which the current cluster number must be increased or decreased, as well as estimates of new cluster centres. These results are forwarded to the adaptation procedure, where a new number of clusters $c_t$ recognised in time window t is calculated by adding the results of the three algorithms, as described in Section 4.7.

The adaptation of the classifier is achieved by re-learning the classifier with the new cluster number, using the existing template set extended by new data. In order to evaluate the new classifier obtained in time window t, its performance is compared to that of the classifier in time window t-1 by applying a set of validity measures. If at least some of these measures indicate an improvement in classifier performance, the new classifier at time t is accepted and saved instead of the previous one. If all validity measures have deteriorated, the adaptation of the classifier was premature and cannot improve the cluster partition. In this case the new classifier is rejected and the old classifier at time t-1 is restored.

After adapting the classifier, the template data set is updated using one of the updating strategies described in Section 4.8. The choice of the relevant strategy depends on the result of the monitoring procedure. If only gradual changes were detected, the template set is updated with new data according to the queue principle. If abrupt changes were recognised and the new classifier was accepted, the template set must be structurally modified before extending it with new data. If the new classifier was rejected after adaptation to abrupt changes, the template data set is updated as in the case of gradual changes.

It must be noted that the results of the monitoring procedure can be used to make a diagnosis of the current cluster structure (i.e. the current system states). Besides the final results of the monitoring procedure concerning detected changes in the cluster structure, some additional parameters can be provided to the end user. For instance, cluster sizes calculated as the number of objects absorbed by the corresponding clusters (see equation (4.7)) or as the number of good objects (see equation (4.26)) can be interpreted as the usefulness of clusters in the current time window. Further parameters, which are calculated during the monitoring procedure and can be relevant for the diagnosis, are the number of free objects, the partition densities of existing clusters, the new locations of cluster centres that have been moved, and the values of the fuzzy separation and the fuzzy compactness for the current partition. Together, all these measures can give a good representation of the current situation.

The algorithm for dynamic fuzzy clustering developed in this chapter makes it possible to design an adaptive classifier capable of following temporal changes in the cluster structure and adjusting its parameters automatically to detected changes in the course of time.
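The adaptation step summarised above can be illustrated with a minimal sketch. It assumes that all three validity measures are oriented so that larger values are better, and it uses hypothetical names (`ValidityMeasures`, `new_cluster_number`, `accept_new_classifier`) and hypothetical numeric values; it is not the implementation used in this work.

```python
from dataclasses import dataclass


@dataclass
class ValidityMeasures:
    """Validity measures of a fuzzy partition (assumed: larger is better)."""
    fuzzy_separation: float
    fuzzy_compactness: float
    avg_partition_density: float


def new_cluster_number(c_prev, c_new, c_merg, c_split):
    """c_t = c_{t-1} + c_new - c_merg + c_split, as in Figure 4-29."""
    return c_prev + c_new - c_merg + c_split


def accept_new_classifier(old, new):
    """Accept if at least one measure improved; reject if all deteriorated."""
    improved = (
        new.fuzzy_separation > old.fuzzy_separation,
        new.fuzzy_compactness > old.fuzzy_compactness,
        new.avg_partition_density > old.avg_partition_density,
    )
    return any(improved)


# Hypothetical values for a merging step (three clusters, one merge).
old = ValidityMeasures(0.61, 0.68, 10.0)
new = ValidityMeasures(0.65, 0.65, 16.0)
c_t = new_cluster_number(c_prev=3, c_new=0, c_merg=1, c_split=0)
decision = "accept" if accept_new_classifier(old, new) else "restore old classifier"
print(c_t, decision)
```

If the decision is to reject, the classifier of time window t-1 is kept and the template set is updated as in the case of gradual changes.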

5 Similarity Concepts for Dynamic Objects in Pattern Recognition

In Sections 2.2.1 and 4.1 it was stated that the task of clustering methods is to partition a number of objects into a small number of homogeneous clusters so that objects belonging to any one of the clusters are as similar as possible and objects of different clusters are as dissimilar as possible. The most important problem arising in this context is the choice of a relevant similarity measure, which is then used for the definition of the clustering criterion. The notion of similarity depends on the mathematical properties of the data set (e.g. mutual distance, angle, curvature, symmetry, connectivity etc.) and on the semantic goal of a specific application.

According to [Bezdek, 1981, p. 44] there is no universally optimal clustering criterion or measure of similarity that can perform well for all possible data structures. This fact is explained by the wide variety of data structures, which can possess different shapes (spherical, elliptical), sizes (intensities, unequal numbers of observations), or geometry (linear, angular, curved) and can be translated, dilated, compressed, rotated, or reordered with respect to some dimensions of the feature space. Therefore, the choice of a particular clustering criterion and a particular measure of similarity is at least partially context dependent. It should be noted that similarity measures are the main components of clustering criteria and can be used either as criteria themselves or in combination with various clustering criteria to obtain different clustering models.

Many clustering methods (e.g. fuzzy c-means [Bezdek, 1981, p. 65], possibilistic c-means [Krishnapuram, Keller, 1993], unsupervised optimal fuzzy clustering [Gath, Geva, 1989], (fuzzy) Kohonen networks [Rumelhart, McClelland, 1988]) use clustering criteria based on the distance between objects $x_j$ and $x_l$, $x_j, x_l \in X$, which is defined by a symmetric positive-definite function of pairs of elements $d: X \times X \to \mathbb{R}^+$. If d additionally satisfies the triangle inequality, then d is a metric on $\mathbb{R}^M$, a property that is not necessarily required, however. One of the most frequently used measures is the Minkowski metric and its special case, the Euclidean metric [Bandemer, Näther, 1992]. If each feature vector (object) is considered as a point in the M-dimensional feature space, then the distance between these points can be interpreted as the dissimilarity of two feature vectors [Bezdek, 1981, p. 48].

A similarity measure can be derived from d by first normalising d to $d^* \in [0, 1]$ and then defining the similarity $s_{jl}$ according to one of the following equations ([Bezdek, 1981, p. 48], [Bandemer, Näther, 1992, p. 70-7]):

$s_{jl} = 1 - d^*_{jl}$  (5.1)

$s_{jl} = \frac{1}{1 + d^*_{jl}}$  (5.2)

$s_{jl} = e^{-d^*_{jl}}$  (5.3)

It is obvious that these similarity measures are also symmetric positive-definite functions, taking values in the interval [0, 1].

In Section 4.1 it was stated that dynamic objects are represented by a temporal sequence of observations and described by multidimensional trajectories in the feature space, which contain the history of the temporal development of each feature and are given by equation (4.7). Since the distance between vector-valued functions is not defined, classical clustering methods, and pattern recognition methods in general, are not suited for processing dynamic objects. There are two possibilities for handling dynamic objects represented by multidimensional trajectories in pattern recognition:

1. to pre-process trajectories in order to transform them into conventional feature vectors;
2. to define a distance or dissimilarity measure for trajectories.

The first approach is used in most applications regarding time series classification and will be described in Section 5.1. The second approach requires the definition of a similarity measure for trajectories that takes into account the dynamic behaviour of trajectories and will be considered in Section 5.2.

When defining a similarity measure for trajectories it is important to determine a specific criterion for similarity. Depending on the application, this criterion may require either the best matching of trajectories by minimising the pointwise distance or a similar form of trajectories independent of their relative location to each other. After considering different types of similarity and introducing different similarity models in Section 5.2, a number of definitions of specific similarity measures will be proposed. The use of similarity measures for trajectories for the modification of static pattern recognition methods will be explained in Section 5.3 based on the example of the modification of the fuzzy c-means algorithm.

5.1 Extraction of Characteristic Values from Trajectories

The first approach regarding the use of dynamic objects in pattern recognition is related to the pre-processing of dynamic objects in such a way that they become valid inputs for classical pattern recognition methods. Data pre-processing is one of the obvious steps usually performed before the actual pattern recognition process starts and is concerned with the preparation of data (e.g. scaling and filtering) and complexity reduction. The reasons for performing data pre-processing are the following [Famili et al., 1996]:

1. solving data problems that may prevent the use of any data analysis methods or may lead to unacceptable results (e.g. missing features or feature values, noisy data, different data formats, very large data volumes);
2. understanding the nature of data (through data visualisation and calculation of data characteristics in order to initialise the data analysis process), and
3. extracting more meaningful knowledge from a given data set (e.g. extraction of the most relevant features).

Data pre-processing includes such techniques as signal processing, correlation analysis, regression analysis and discriminant analysis, to name just a few. In some cases more than one form of data pre-processing is required; therefore the choice of a relevant method for data pre-processing is of crucial importance. Data pre-processing can be represented as a transformation Z [Famili et al., 1996] that maps a set of raw real-world feature vectors $x_{jh}$ into a set of new feature vectors $y_{jk}$:

$y_{jk} = Z(x_{jh}), \quad j = 1, \dots, N, \; h = 1, \dots, M_a, \; k = 1, \dots, M,$  (5.4)

so that Y preserves the valuable information in X, eliminates at least one of the data problems in X and is more useful than X. In (5.4) $M_a$ denotes the number of features before pre-processing (often called attributes) and M the number of features after pre-processing. Valuable information is characterised by four attributes [Fayyad et al., 1998, p. 6-8]: valid, novel, potentially useful, and ultimately understandable. Valuable information includes components of knowledge that must be discovered by pattern recognition methods and presented in a meaningful way.

When dealing with dynamic objects in pattern recognition, we are confronted with the first data problem: classical methods cannot be applied to data in the form of multidimensional trajectories. The goal of pre-processing in this case is to reduce trajectories of features to small sets of real numbers (vectors) by extracting characteristic values from the trajectories. The resulting real-valued vectors are then combined into a single one, so that a conventional feature vector is obtained. During this transformation vector-valued features are replaced by one or more real numbers, leading to real-valued feature vectors. This idea is illustrated in Figure 5-1.

Let $\mathbf{x}(t) = [x_1(t), x_2(t), \dots, x_M(t)]$ denote a vector-valued feature vector whose components are trajectories of single features defined on the same time interval $[t_1, \dots, t_p]$. Selecting a set of characteristic values $\{K_1(x_r), \dots, K_{L_r}(x_r)\}$ from the trajectory of each feature $x_r(t) = [x_r(t_1), x_r(t_2), \dots, x_r(t_p)]$, $r = 1, \dots, M$, and placing it in a vector together with the other sets, a conventional feature vector is obtained. The number $L_r$, $r = 1, \dots, M$, and the type of characteristic values may vary for single features.
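As an illustration of this transformation, the following sketch maps a list of feature trajectories to one conventional feature vector. The choice of characteristic values (mean, standard deviation and least-squares slope over the sample index) is an assumption made only for this example, and the function names are hypothetical; the text leaves the type and number of characteristic values open.

```python
from statistics import mean, pstdev


def least_squares_slope(values):
    """Slope of the least-squares line through (0, x_0), ..., (p-1, x_{p-1})."""
    p = len(values)
    t_mean = (p - 1) / 2
    x_mean = mean(values)
    num = sum((i - t_mean) * (x - x_mean) for i, x in enumerate(values))
    den = sum((i - t_mean) ** 2 for i in range(p))
    return num / den


def characteristic_values(trajectory):
    """Assumed set K_1, K_2, K_3 for one feature trajectory (needs p >= 2)."""
    return [mean(trajectory), pstdev(trajectory), least_squares_slope(trajectory)]


def to_feature_vector(trajectories):
    """Concatenate the characteristic values of all M feature trajectories."""
    features = []
    for x_r in trajectories:
        features.extend(characteristic_values(x_r))
    return features


# Two features observed at four time instants (illustrative data).
x = [[1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0]]
print(to_feature_vector(x))
```

Here every feature contributes the same three values, so the resulting vector has length 3M; in general the number of values may differ from feature to feature.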

Figure 5-1: Transformation of a feature vector containing trajectories into a conventional feature vector (the trajectories $x_1(t), \dots, x_M(t)$, sampled at $t_1, \dots, t_p$, are mapped to the characteristic values $K_1(x_1), \dots, K_{L_1}(x_1), K_1(x_2), \dots, K_{L_2}(x_2), \dots, K_1(x_M), \dots, K_{L_M}(x_M)$)

This method of data pre-processing produces the following transformation Z:

$Z: [x_1(t), x_2(t), \dots, x_M(t)] \to [K_1(x_1), \dots, K_{L_M}(x_M)] = [K_1, \dots, K_L],$  (5.5)

where $L = \sum_{r=1}^{M} L_r$ is the total number of characteristic values extracted from a feature vector $\mathbf{x}(t)$.

Since pre-processing of dynamic objects is independent of the pattern recognition method used for analysis, any classical method dealing with objects in the vector space can be applied in conjunction with this method of pre-processing for the purpose of dynamic pattern recognition. Pre-processing of dynamic objects is used in most applications of time-series classification, for instance in EEG / ECG diagnosis, speech recognition, radar classification, etc. In all these cases static feature vectors are extracted from time series during pre-processing and then used within conventional (static) pattern recognition methods.

One of the best illustrations of the applicability of the method of pre-processing of trajectories is given in [Geva, Kerem, 1998], where the problem of dynamic state recognition and event prediction in biomedical signal processing was considered. The goal of this application was to recognise brain states based on the background electroencephalogram (EEG) activity in order to form the basis for forecasting a generalised epileptic seizure, which is usually preceded by a pre-seizure state in the EEG. An EEG state can be defined by a portion of the time series dominated by one rhythm, a particular alternation of rhythms, or a frequent appearance of isolated events. In order to be able to use a conventional clustering algorithm, characteristic features of EEG time series were extracted by the fast wavelet transform [Mallat, Zhong, 1992].
The most relevant features were then selected and introduced into the unsupervised optimal fuzzy clustering (UOFC) algorithm [Gath, Geva, 1989], which was used to detect distinct groups in the data representing EEG states. During dynamic state recognition, feature selection and clustering were periodically applied to segments of the EEG signal obtained in overlapping time windows. In each cycle the classifier was re-learned based on the complete history of data available until the current moment. The results reported in [Geva, Kerem, 1998] show that the UOFC algorithm is suitable for an accurate and reliable identification of bioelectric brain states based on the analysis of the EEG time series.

The disadvantage of this approach for handling dynamic objects is that part of the important information contained in trajectories, including correlations across several time steps, is lost. In order to avoid this drawback, an alternative approach based on the definition of similarity for trajectories can be considered. The use of this approach for dynamic pattern recognition requires a modification of conventional methods.

5.2 The Similarity Notion for Trajectories

Trajectories, and in particular time series, constitute an important class of complex data. They appear in many financial, medical, technical and scientific applications. Although there is a lot of statistical literature dedicated to time series analysis, which is used for modelling and prediction tasks, the similarity notion for time series, which is of prime importance for data analysis applications, has not been studied enough.

The notion of similarity can be interpreted in different ways depending on the context. In everyday language, similarity is interpreted as "having characteristics in common" or "not different in shape but in size or position" [Setnes et al., 1998, p. 378]. This interpretation can be employed for defining the similarity between trajectories. Intuitively, two trajectories can be considered similar if they exhibit similar behaviour over a large part of their length.
In order to define similarity mathematically, it is essential to determine which mathematical properties of trajectories should be used to describe their behaviour and in what way the comparison (matching) of trajectories should be performed. When analysing trajectories it is assumed that trajectories can contain [Das et al., 1997]:

- outliers and noise due to measurement errors and random fluctuations;
- different scaling and translating factors and base lines, which can appear due to measurements on different devices or under different conditions.

When comparing trajectories it is necessary to omit outliers and noise from the analysis since they are not representative for describing the behaviour of trajectories. Scaling and translation are often irrelevant when searching for similar trajectories, but there can be applications where such transformations carry useful information about the behaviour of trajectories and cannot be ignored.

It was already stated earlier that, depending on the criterion chosen for the comparison of trajectories, two types of similarity between trajectories can be distinguished [Joentgen, Mikenina et al., 1999b, p. 83]:

1. structural similarity: the better two trajectories match in form / evolution / characteristics, the greater the similarity between these two trajectories;
2. pointwise similarity: the smaller the pointwise distance between two trajectories in the feature space, the greater the similarity between these two trajectories.

In order to determine structural similarity, relevant aspects of the behaviour of trajectories must be specified depending on the concrete application. Based on the chosen aspects, mathematical properties of trajectories (e.g. slope, curvature, position and values of inflection points, smoothness etc.) can be selected, which are then used as comparison criteria. Structural similarity is thus suited to situations in which one looks for particular patterns in trajectories that should be well matched.

Pointwise similarity expresses the closeness of trajectories in the feature space. In this case the behaviour of trajectories is not in the foreground and some variations in form are allowed as long as trajectories are spatially close. In contrast to structural similarity, the calculation of pointwise similarity does not require a formulation of characteristic properties of trajectories and is based directly on the values of trajectories.

Thus, the task is to define a similarity measure for trajectories that expresses a degree of matching of trajectories according to some predefined criteria and is invariant to some specific transformations (e.g. scaling, translation, missing values or appearance of incorrect values).
In the following sections different definitions of pointwise and structural similarity measures proposed in the literature, which use different similarity models and context-dependent interpretations of the similarity notion, will be considered. Afterwards, a number of definitions for the structural similarity measure, which differ in the underlying similarity criterion and take into account different properties of trajectories, will be proposed.

Pointwise similarity measures

Pointwise similarity determines the degree of closeness of two trajectories with respect to corresponding pairs of points of the trajectories. The closeness of points in the feature space can be determined using the Euclidean distance or some other distance measure, which is interpreted as the dissimilarity of points. The similarity between points is regarded as an inverse of their distance in the feature space. This classical definition of similarity is crisp: as soon as a threshold for the distance or similarity measure is defined, points are separated into two categories of similar and non-similar points. Such a definition in the manner of Boolean logic is very restrictive for a description of the concept of similarity and does not agree with the human perception of similarity, which is coupled with imprecision and uncertainty [Zimmermann, Zysno, 1985, p. 49]. In order to be able to model the inherent fuzziness of the concept of similarity, a similarity measure must allow a gradual transition between similar and non-similar [Binaghi et al., 1993, p. 767]. This property can be fulfilled by modelling similarity in a fuzzy logic framework.

In [Joentgen, Mikenina et al., 1999b] a method to determine the pointwise similarity between trajectories was presented, which is based on a linguistic description of similar trajectories. This linguistic description represents a subjective evaluation of similarity by a human observer and can vary depending on the context. The idea of this method is to consider the pointwise difference of two trajectories and to calculate a degree of similarity, or proximity, of this sequence of differences to the zero-function (a function equal to zero on the whole universe of discourse).

Consider two trajectories x(t) and y(t), $t \in T = [t_1, \dots, t_p]$, in the feature space $\mathbb{R}^M$. If the sequence of differences is given by $f(t) = x(t) - y(t) = [f(t_1), \dots, f(t_p)] = [f_1, \dots, f_p]$, then the similarity between trajectories x(t) and y(t) is defined according to the equivalence equation: s(x, y) = s(x-y, 0) = s(f, 0). For the sake of simplicity the case of one-dimensional trajectories in the feature space $\mathbb{R}$ is considered first. A generalisation to the case of multidimensional trajectories will be given afterwards. The following algorithm to determine a measure of pointwise similarity between an arbitrary sequence f(t), $t \in T$, and the zero-function was proposed.

Algorithm 6: Determination of pointwise similarity for trajectories [Joentgen, Mikenina et al., 1999b, p. 85].

1. A fuzzy set A "approximately zero" with a membership function u(f) is defined on the universe of discourse F (Figure 5-2, a).
2. The degree of membership u(f(t)) of the sequence f(t) to the fuzzy set A is calculated for each point $t \in T$. These degrees of membership can be interpreted as (pointwise) similarities $u(f(t_i)) = s_i$, $i = 1, \dots, p$, of the sequence f(t) to the zero-function (Figure 5-2, b).
3. The sequence $u(f(t)) = [s_1, \dots, s_p]$ is transformed by using specific transformations (e.g. minimum, maximum, arithmetic mean, γ-operator, integral) into a real number s(f, 0) expressing the overall degree of f(t) being zero.
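The three steps of Algorithm 6 can be sketched as follows for one-dimensional trajectories sampled at the same time instants. The bell-shaped membership function $e^{-a f^2}$ for the fuzzy set "approximately zero", the default parameter a = 1 and the arithmetic mean as default aggregation are assumptions chosen for illustration; the function names are hypothetical.

```python
import math


def approximately_zero(f, a=1.0):
    """Step 1: assumed membership function u(f) = exp(-a * f**2)."""
    return math.exp(-a * f * f)


def arithmetic_mean(values):
    return sum(values) / len(values)


def pointwise_similarity(x, y, membership=approximately_zero,
                         aggregate=arithmetic_mean):
    """Similarity s(x, y) = s(x - y, 0) of two sampled trajectories."""
    if len(x) != len(y):
        raise ValueError("trajectories must have the same number of samples")
    # Step 2: pointwise similarities s_i = u(f(t_i)) with f = x - y.
    similarities = [membership(xi - yi) for xi, yi in zip(x, y)]
    # Step 3: aggregate [s_1, ..., s_p] into one overall degree s(f, 0).
    return aggregate(similarities)


x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 2.0, 2.0, 4.0]
print(pointwise_similarity(x, y))                 # arithmetic mean
print(pointwise_similarity(x, y, aggregate=min))  # lower bound of similarity
```

Because only the differences x - y enter the calculation, adding the same sequence b(t) to both trajectories leaves the result unchanged.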

Figure 5-2: Illustration of the definition of pointwise similarity between trajectories: a) the fuzzy set "approximately zero" with u(f), the sequence of differences f(t) and the resulting pointwise similarity u(f(t)); b) projection of the pointwise similarity into the plane (t, u(f(t)))

All similarity measures obtained with the help of this algorithm are invariant to a translation transformation, i.e. invariant with respect to the addition of an arbitrary sequence b(t): s(x, y) = s(x+b, y+b) for all x(t), y(t) and b(t), $t \in T$. Since many classification methods use a distance measure as a dissimilarity criterion, it could be desirable to transform the similarity measure into a distance measure using e.g. the following relation [Bandemer, Näther, 1992, p. 70]:

$d(x, y) = \frac{1}{s(x, y)} - 1, \quad s(x, y) \in (0, 1].$  (5.6)

Algorithm 6 presented above is formulated to determine pointwise similarity between one-dimensional trajectories. The extension of this algorithm to M-dimensional trajectories x(t) and y(t) and to the corresponding M-dimensional sequence of differences $f(t) \in F_1 \times F_2 \times \dots \times F_M$ is straightforward and results in the following two modifications of the algorithm:

Algorithm 6a: Extension of Algorithm 6 for the multidimensional case.

1. Fuzzy sets $A_{F_i}$ "approximately zero" are defined on each sub-universe $F_i$, $i = 1, \dots, M$.
2. Similarity measures $s_{F_i}(f, 0)$, $i = 1, \dots, M$, are determined according to steps 2 and 3 of Algorithm 6 for the projections of f(t) on the sub-universes. The result is an M-dimensional vector of partial similarities $[s_{F_1}, \dots, s_{F_M}]$.
3. The vector of partial similarities can be transformed into an overall similarity measure s(x, y) = s(f, 0) or dissimilarity measure d(x, y) in two ways:

3.1 The distance measure is calculated according to (5.6) for the components of the M-dimensional vector $[s_{F_1}, \dots, s_{F_M}]$, resulting in the vector $[d_{F_1}, \dots, d_{F_M}]$. The latter is then transformed into an overall distance using e.g. the Euclidean norm:

$d(x, y) = \sqrt{\sum_{i=1}^{M} d_{F_i}^2}.$  (5.7)

3.2 The M-dimensional vector $[s_{F_1}, \dots, s_{F_M}]$ is transformed using specific transformations (e.g. minimum, maximum, arithmetic mean, γ-operator, integral) into an overall similarity s(f, 0). A distance measure is then calculated using (5.6).

The advantage of Algorithm 6a is that the definition of fuzzy sets A for each dimension (feature) allows easy interpretation, is technically simple and can be performed by an expert without any great effort. In the following algorithm a global view on fuzzy set A in a multidimensional space, rather than a consideration of single projections, is preferred.

Algorithm 6b: Extension of Algorithm 6 for the multidimensional case.

1. The M-dimensional fuzzy set A "approximately zero" is defined on $F_1 \times F_2 \times \dots \times F_M$.
2. The overall similarity measure $s_{F_1 \times \dots \times F_M}(f, 0)$ is obtained for the M-dimensional sequence of differences f(t) analogously to steps 2 and 3 of Algorithm 6.
3. The similarity measure $s_{F_1 \times \dots \times F_M}(f, 0)$ is transformed into a distance measure d(x, y) using (5.6).

The obtained distance measure between the M-dimensional trajectories x(t) and y(t) can be applied for the modification of classical pattern recognition methods to make them suitable for clustering and classification of multidimensional trajectories. This topic will be discussed in greater detail in a later section.

Choice of the membership function for the definition of pointwise similarity

The two design parameters involved in the definition of pointwise similarity in Algorithm 6 are the fuzzy set A "approximately zero" and the aggregation operator that transforms the vector of pointwise similarities into an overall degree of similarity. Fuzzy set A is defined by its membership function u(f), which represents the meaning of similarity for a certain system variable, i.e. fuzzy set A is context-dependent on the physical domain.
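Before the membership functions are discussed in detail, variant 3.1 of Algorithm 6a can be sketched as follows. The sketch assumes that relation (5.6) has the form d = 1/s - 1 and that the partial similarities per feature have already been computed; replacing the distance by a very large constant when a similarity is zero is an assumption of this example, and the function names are hypothetical.

```python
import math

LARGE_DISTANCE = 1e12  # assumed stand-in for an "infinite" distance


def similarity_to_distance(s, large=LARGE_DISTANCE):
    """Relation (5.6), assumed as d = 1/s - 1 for s in (0, 1]."""
    if s <= 0.0:
        return large  # the relation is undefined for s = 0
    return 1.0 / s - 1.0


def overall_distance(partial_similarities):
    """Variant 3.1: distances per feature, combined by the Euclidean norm (5.7)."""
    distances = [similarity_to_distance(s) for s in partial_similarities]
    return math.sqrt(sum(d * d for d in distances))


# Partial similarities for M = 3 features (illustrative values).
print(overall_distance([1.0, 0.5, 0.8]))
```

Variant 3.2 would instead aggregate the partial similarities first and apply `similarity_to_distance` once to the aggregated value.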
The most frequently chosen membership functions are triangular, trapezoidal, and bell-shaped functions [Driankov et al., 1993, p. 6]. The reason for these choices is the simple parametric, functional description of the membership function, which can be stored with minimum use of memory and used efficiently during calculations. The different types of membership functions are given by the following equations:

Triangular function [Driankov et al., 1993, p. 5], shown in Figure 5-3 (left):

$u_{TA}(f, \alpha, \beta) = \begin{cases} 0, & f < \alpha \text{ or } f > \beta \\ 1 - f/\alpha, & \alpha \le f < 0 \\ 1 - f/\beta, & 0 \le f \le \beta \end{cases}$  (5.8)

where the degree of membership equal to 1 (called the peak value of the fuzzy set) corresponds to f = 0.

Trapezoidal function [Driankov et al., 1993, p. 52], shown in Figure 5-3 (right):

$u_{TZ}(f, \alpha, \beta, \gamma, \delta) = \begin{cases} 0, & f < \alpha \text{ or } f > \delta \\ (f - \alpha)/(\beta - \alpha), & \alpha \le f < \beta \\ 1, & \beta \le f \le \gamma \\ (\delta - f)/(\delta - \gamma), & \gamma < f \le \delta \end{cases}$  (5.9)

where the top of the function is not one point but an interval containing f = 0.

Bell-shaped function defined as the exponential function [Dubois, Prade, 1988, p. 50], shown in Figure 5-4 (left):

$u_{exp}(f, a) = e^{-a f^2}$  (5.10)

Bell-shaped function defined as the quadratic function [Dubois, Prade, 1988, p. 50], shown in Figure 5-4 (right):

$u_{qdr}(f, a) = \frac{1}{1 + a f^2}$  (5.11)

Bell-shaped function defined as the logistic function [Zimmermann, Zysno, 1985, p. 53], which is the S-shaped function shown in Figure 5-5:

$u_S(f, a, b) = \frac{1}{1 + e^{a(f - b)}}$  (5.12)

Parameters a, b, α, β, γ, δ must be defined by an expert depending on the desired meaning of the fuzzy set and of similarity. These parameters influence the width of the fuzzy set and in this way determine the sharpness of the similarity. In the bell-shaped functions (5.10) and (5.11), parameter a has the following effect on the definition of similarity: the larger the value of parameter a, the narrower the fuzzy set and the stricter the definition of similarity (Figure 5-4). In the logistic function (5.12), parameters a and b have the opposite effect: the larger the value of parameter a, the harder the transition from similar to non-similar, and the larger the value of parameter b, the weaker the definition of similarity (Figure 5-5).

Figure 5-3: Triangular and trapezoidal membership functions

Figure 5-4: Exponential and non-linear membership functions (curves for $a_1 > a_2 > a_3$)

Figure 5-5: Logistic S-shaped membership function (left: $a_1 < a_2 < a_3$, b = const; right: $b_1 < b_2 < b_3$, a = const)

The choice of parameter a in bell-shaped functions depends on the concrete application and on the domain F of the sequence f(t). In order to find a proper value of parameter a, one can specify a value β of variable f which must have a degree of membership to fuzzy set A equal to α. Parameter a can then be obtained from these predefined values α and β using the following equations:

For the exponential function (5.10):

$a = -\frac{\ln \alpha}{\beta^2},$  (5.13)

For the non-linear function (5.11):

$a = \frac{1 - \alpha}{\alpha \beta^2},$  (5.14)

For the logistic function (5.12):

$a = -\frac{\ln(\alpha / (1 - \alpha))}{\beta - b}, \quad b = \text{const}.$  (5.15)

The bell-shaped functions always take strictly positive values, in contrast to membership functions with straight lines, which are equal to zero on part of their domain. This point is of crucial importance for the definition of similarity if the dissimilarity measure must be calculated from the similarity measure according to (5.6). To avoid numerical problems, the value of the dissimilarity measure can, for instance, be set to a very large number in the case of zero values of the similarity measure.

Comparing the three bell-shaped functions, it can be seen that the decrease of the quadratic function (5.11) is not as steep as that of the exponential function and that the quadratic function takes higher values on the largest part of its domain. Compared to the two other functions, the logistic function has the largest interval with high membership values, which makes it possible to define an equally high similarity within a certain threshold given by parameter b. Generally, the choice of the membership function depends on the specific application under consideration and on the subjective evaluation of the linguistic meaning of similarity.

Choice of the aggregation operator for the definition of pointwise similarity

The choice of the aggregation operator in Step 3 of Algorithm 6 for aggregating the pointwise similarities to the overall similarity measure over the whole time interval depends on the desired interpretation of the similarity between trajectories.
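Returning briefly to the membership functions, the calibration of parameter a can be checked numerically. The sketch below assumes the exponential and quadratic forms $e^{-a f^2}$ and $1/(1 + a f^2)$ for (5.10) and (5.11) and the corresponding solutions for a; the function names and the example values α = 0.5, β = 2 are hypothetical.

```python
import math


def u_exp(f, a):
    """Exponential bell-shaped membership function, assumed form of (5.10)."""
    return math.exp(-a * f * f)


def u_qdr(f, a):
    """Quadratic bell-shaped membership function, assumed form of (5.11)."""
    return 1.0 / (1.0 + a * f * f)


def a_for_exp(alpha, beta):
    """Solve u_exp(beta, a) = alpha for a, cf. (5.13)."""
    return -math.log(alpha) / beta ** 2


def a_for_qdr(alpha, beta):
    """Solve u_qdr(beta, a) = alpha for a, cf. (5.14)."""
    return (1.0 - alpha) / (alpha * beta ** 2)


# A difference of beta = 2 should still be "approximately zero" to degree 0.5.
alpha, beta = 0.5, 2.0
a_e = a_for_exp(alpha, beta)
a_q = a_for_qdr(alpha, beta)
print(a_e, u_exp(beta, a_e))
print(a_q, u_qdr(beta, a_q))
```

With both functions calibrated to the same point (β, α), the quadratic function lies above the exponential one for differences larger than β, which reflects its slower decrease.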
In general there are four groups of operators which can be applied for aggregation: t-norms, t-conorms, averaging operators and compensatory operators. Triangular norms (or t-norms) represent a general class of operators for the intersection of fuzzy sets, interpreted as the logical "and" ([Zimmermann, 1996, p. 29 ff.], [Dubois, Prade, 1988, p. 79 ff.]). Operators belonging to this class are e.g. the min, product, and bounded difference operators. Triangular conorms (or t-conorms) define a general class of aggregation operators for the union of fuzzy sets, interpreted as the logical "or". Some typical representatives of this class are the max, algebraic sum, and bounded sum operators.

The use of these operators for the aggregation of pointwise similarities leads to the following interpretation of results, considered here for the min and max operators as examples. If the min operator is chosen for aggregation, a lower bound of the overall similarity is obtained, since the worst value of pointwise similarity on the given time interval is considered. Choosing the max operator for aggregation yields an upper bound of the overall similarity, which provides the best pointwise similarity value over the given time interval. Both values can be used if the number of values to be aggregated is rather small. Otherwise these aggregation operators are too restrictive and the information loss during the aggregation is too high. Moreover, if outliers are present in the sequence of pointwise similarities, the aggregated value is no longer representative. Therefore, it seems more appropriate to take some average value for the aggregation.

The most frequently used operators in the class of non-parametric averaging operators are the arithmetic and geometric mean values of the sequence $s = [s_1, \dots, s_p]$ given by:

$\bar{s} = \frac{1}{p} \sum_{i=1}^{p} s_i,$  (5.16)

$s_G = \sqrt[p]{s_1 \cdots s_p}.$  (5.17)

In [Thole et al., 1979] it was shown that these operators provide an adequate model for human aggregation procedures in decision environments and perform quite well in empirical research. The geometric mean is usually interpreted as the mean growth rate of a sequence. The arithmetic mean value is considered the best estimator for the expected value of a sequence in the sense of least mean squares. It provides a representative average value of the sequence in the case of small variance. The larger the variance, the less useful the arithmetic mean value. It is used in statistics for the evaluation of unimodal, approximately symmetric distributions.
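The effect of the aggregation operators discussed so far can be compared on a small example. The sequence of pointwise similarities below is illustrative and contains one outlier; the function names are hypothetical.

```python
import math


def arithmetic_mean(s):
    """Arithmetic mean (5.16) of the pointwise similarities."""
    return sum(s) / len(s)


def geometric_mean(s):
    """Geometric mean (5.17): p-th root of the product of all s_i."""
    return math.prod(s) ** (1.0 / len(s))


# Pointwise similarities with one outlier (illustrative values).
s = [0.9, 0.8, 0.1, 0.9]

aggregated = {
    "min": min(s),            # lower bound: worst pointwise similarity
    "max": max(s),            # upper bound: best pointwise similarity
    "arithmetic mean": arithmetic_mean(s),
    "geometric mean": geometric_mean(s),
}
for name, value in aggregated.items():
    print(f"{name}: {value:.3f}")
```

The min operator is determined entirely by the single outlier, whereas both mean values lie between the two bounds; the geometric mean is pulled down more strongly by the small value than the arithmetic mean.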
In the case of a large variance in the data, another averaging operator called the median is often applied, which in contrast to the arithmetic mean gives a very robust evaluation. This operator is defined as follows:

$$\tilde{s} = \begin{cases} s_{(r+1)}, & p = 2r + 1 \\[4pt] \dfrac{s_{(r)} + s_{(r+1)}}{2}, & p = 2r \end{cases} \tag{5.18}$$

where $s_{or} = [s_{(1)}, \dots, s_{(p)}]$ is the increasingly ordered sequence of values of s. The median is defined as the middle value of the ordered sequence if the number of elements in the sequence is odd. If the number of elements in the sequence is even, there are two middle values, and the median is calculated as the arithmetic mean of both [Sachs, 1992, p. 55]. The median corresponds to the value of a sequence that halves the ordered

sequence. It means that 50% of the values of the sequence are above and 50% of the values are below the median. The median is frequently used in statistics for the evaluation of asymmetric distributions and when outliers are suspected. The median is a special case of a more general class of averaging operators in statistics called quantiles. The definition of quantiles is based on considerations similar to those for the median. The α-quantile is a value of the increasingly ordered sequence such that $100 \cdot \alpha\,\%$ of the values are below and $100 \cdot (1-\alpha)\,\%$ of the values are above it [Sachs, 1992, p. 57]. The ordinal numbers of α-quantiles are given by:

$$Q_\alpha = (p + 1) \cdot \alpha \tag{5.19}$$

and the corresponding value of the α-quantile of the ordered sequence is:

$$s_\alpha = s_{or}(Q_\alpha) = s_{(Q_\alpha)}. \tag{5.20}$$

When $Q_\alpha$ is not an integer, it is rounded to the closest integer. Special cases of quantiles are the lower quartile $Q_1$ with α = 0.25, the median with α = 0.5, the upper quartile $Q_2$ with α = 0.75, the l-th decile DZ with α = l/10 (l = 1, ..., 9), and the r-th percentile PZ with α = r/100 (r = 1, ..., 99). When applying quantile values for the aggregation of a sequence $s = [s_1, \dots, s_p]$ of pointwise similarities, it seems reasonable to use lower quantiles with $\alpha \in [0.1, 0.5]$, whose value can be determined by an expert. In this case the aggregated value of similarity can be interpreted as the lower bound of similarity for $100 \cdot (1-\alpha)\,\%$ of the points of the trajectories to be compared. In other words, $100 \cdot (1-\alpha)\,\%$ of the pairs of points are similar at least to a degree $s_\alpha$. An extension of the aforementioned non-weighted averaging operators, such as the arithmetic mean and order statistics, is the class of Ordered Weighted Averaging (OWA) operators introduced by [Yager, 1988]. They are defined as a weighted sum with ordered arguments, provided that the weights are normalised to one, in the following way. Let $w_1, \dots, w_p$ be a set of weights such that $\sum_{i=1}^{p} w_i = 1$.
The OWA operator on $[0, 1]^p$ is defined as

$$OWA_{w_1, \dots, w_p}(s_1, \dots, s_p) = \sum_{i=1}^{p} w_i\, s_{(i)}, \tag{5.21}$$

where $s_{(i)}$ is the increasingly ordered sequence of values of s. OWA operators are often applied in multi-criteria decision theory with weights expressing the importance of criteria. Considering OWA operators for the aggregation of partial values of pointwise similarity, it seems difficult to define weights for the different elements of the

sequence, or different points of time. Generally it is assumed that the values of similarity at all time instants are equally important, though some special situations are conceivable. The averaging operators provide a fixed compensation between the logical "and" and the logical "or". In order to be able to vary the degree of compensation, a more general operator called the "compensatory and" was introduced and empirically tested in [Zimmermann, Zysno, 1980], where the degree of compensation is expressed by a parameter γ. The "compensatory and", or γ-operator, is defined as follows:

$$s_\gamma = \left(\prod_{i=1}^{p} s_i\right)^{1-\gamma} \left(1 - \prod_{i=1}^{p} (1 - s_i)\right)^{\gamma}, \qquad \gamma \in [0, 1]. \tag{5.22}$$

The γ-operator represents a combination of the algebraic product and the algebraic sum. The parameter γ indicates where the operator is located between the logical "and" and "or", i.e. it reflects the weighting of both operators. For γ = 0 the γ-operator reduces to the algebraic product; for γ = 1 the algebraic sum is obtained. When the γ-operator is used for aggregating pointwise similarities, numerical problems may appear due to the product operation in (5.22). If the number of elements in the sequence s is rather large, each additional multiplication with a new element $s_i \in [0, 1]$ decreases the resulting aggregated value, which tends towards zero. Since it is often necessary to transform the overall similarity value into a distance measure, this behaviour of the aggregation operator seems undesirable. Therefore, the γ-operator is better suited for the aggregation of pointwise similarities over short time intervals. The results of aggregation with different operators are illustrated in the following example. Consider two one-dimensional trajectories x(t) and y(t) defined on the time interval [1, 100] and shown in Figure 5-6. The sequence of pointwise differences f(t) of these trajectories is illustrated in Figure 5-7. It can be seen that these trajectories are relatively close to each other on the whole time interval with the exception of one point t = 2, where the difference f(2) is equal to 20.
For the calculation of pointwise similarity, the fuzzy set A "approximately zero" is defined with the exponential membership function (5.) and with parameter a. The value of a is obtained using equation (5.3), where it was defined that the value β = 30 of the difference f(t) has a degree of membership to fuzzy set A equal to α = 0.1. The resulting sequence of pointwise similarities $u(f) = [s_1, \dots, s_{100}]$ is shown in Figure 5-8.
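The median (5.18), the α-quantile (5.19)-(5.20), the OWA operator (5.21) and the γ-operator (5.22) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sequence `s` is invented and is not the sequence of the example above, and non-integer ordinal numbers are rounded half up, which is one possible reading of "rounded to the closest integer".

```python
import math


def median(s):
    """Median according to (5.18) on the increasingly ordered sequence."""
    o = sorted(s)
    p = len(o)
    r = p // 2
    # Odd p = 2r + 1: element s_(r+1); even p = 2r: mean of s_(r) and s_(r+1).
    return o[r] if p % 2 == 1 else 0.5 * (o[r - 1] + o[r])


def quantile(s, alpha):
    """Alpha-quantile with ordinal number Q = (p + 1) * alpha, see (5.19)-(5.20)."""
    o = sorted(s)
    p = len(o)
    q = int(math.floor((p + 1) * alpha + 0.5))  # assumption: round half up
    q = min(max(q, 1), p)                       # keep the ordinal inside 1..p
    return o[q - 1]


def owa(s, w):
    """Ordered weighted average (5.21); the weights must sum to one."""
    if len(w) != len(s) or abs(sum(w) - 1.0) > 1e-9:
        raise ValueError("weights must match the sequence and sum to one")
    return sum(wi * si for wi, si in zip(w, sorted(s)))


def gamma_operator(s, gamma):
    """Compensatory 'and' (5.22) combining algebraic product and algebraic sum."""
    product = math.prod(s)
    algebraic_sum = 1.0 - math.prod(1.0 - v for v in s)
    return product ** (1.0 - gamma) * algebraic_sum ** gamma


# Invented sequence: nine high similarities and one value close to zero.
s = [0.9] * 9 + [0.001]
```

With this invented sequence the median and the 0.25-quantile remain at 0.9, whereas the γ-operator with γ = 0.5 falls below 0.05 because of the product term, which mirrors the sensitivity to a single low value discussed in the text.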

[Figure 5-6: An example of two trajectories x(t) and y(t)]

[Figure 5-7: The sequence of differences f(t) between trajectories x(t) and y(t)]

[Figure 5-8: The sequence of pointwise similarities u(f) of trajectories x(t) and y(t)]

The overall similarity s(f, 0) is calculated by aggregating the sequence u(f) of pointwise similarities with the help of the 6 aggregation operators considered above. The following results are obtained for the different operators:

Table 5-: The overall similarity obtained with different aggregation operators
  Minimum           $s_{\min}$ =
  Maximum           $s_{\max}$ =
  Arithmetic mean   $\bar{s}$ =
  Median            $\tilde{s}$ =
  α-quantile        $s_\alpha$ = 0.8
  γ-operator        $s_\gamma$ =

Due to a single, nearly zero value of pointwise similarity, the minimum and the γ-operator produce nearly zero values for the aggregated variable, which is not in accordance with expectations. The maximum operator provides a value which is typical for approximately 20% of the values. The median is higher than the arithmetic mean and can be regarded as a more reliable evaluation, since the variance of the data is rather large ($\sigma^2$ = 0.76). The value of the 0.25-quantile, equal to 0.8, provides a lower bound of similarity for 75% of the values, which is still pretty high. It can be seen that the final choice of the aggregation operator is a matter of interpretation and desired results.

Structural similarity measures

Searching for structural similarity between trajectories means searching for measures of similar behaviour of trajectories over time. The crucial point is, however, to determine the

meaning of similarity. It obviously depends on the underlying dynamic process and the intended usage of trajectories. Sometimes the global statistical behaviour is important, whereas in other cases local changes in behaviour are relevant. It is rather difficult to define one specific measure of similarity that would be useful in all applications. The different nature of trajectories requires a problem-specific selection of a suitable similarity measure. It is therefore more favourable to define a general framework for similarity measures and to determine in each specific case which aspects are important for the analysis of trajectories in order to formulate a precise criterion of similarity. Thus, structural similarity can be viewed as a degree of proximity of trajectories with respect to such aspects as form, evolution trend, size or orientation in $R^M$ representing their behaviour. Depending on the chosen aspect, different mathematical properties and characteristics of trajectories may be relevant to describe similarity, e.g. slope, curvature, position and values of extreme points, or other information like smoothness or monotony. A set of relevant characteristics has to be chosen in such a way that trajectories that are close to each other with respect to these characteristics exhibit similar behaviour. In the following sections, different approaches to determining a degree of similarity presented in the literature will be discussed. Afterwards, a general algorithm for the determination of structural similarity between trajectories will be considered. Finally, a number of specific definitions of structural similarity based on relevant characteristics of trajectories will be formulated and their properties will be discussed.

Similarity model using transformation functions

A general similarity model for time series was proposed in [Das et al., 1997], which is based on local transformations of time series and can avoid the influence of outliers and of different scaling and translation factors. This model can be generalised for trajectories.
Let G be a set of transformation functions mapping integers to integers. Set G could consist of either all linear or quadratic or monotone functions or the identity function. Consider two trajectories $x(t) = [x(t_1), \dots, x(t_p)]$ and $y(t) = [y(t_1), \dots, y(t_p)]$, which will be denoted for simplicity by the sequences $x = [x_1, \dots, x_p]$ and $y = [y_1, \dots, y_p]$, where the time index is dropped. According to an intuitive understanding of similar behaviour, two sequences x and y are considered G-similar if there is a function $g \in G$ such that a long sub-sequence x′ of x can be approximately mapped into a long sub-sequence y′ of y using the transformation g. It is not required that the sub-sequences x′ and y′ consist of consecutive points of x and y, respectively. On the contrary, sequences x and y are allowed to contain some missing or additional values that are not matched by the sub-sequences. This means that the points of x′ and y′ should only preserve the same relative order as in x and y, respectively.

The idea of using the longest common sub-sequence for the definition of a similarity measure between sequences of objects was first introduced in [Yazdani, Ozsoyoglu, 1996], but the model proposed could not deal with sequences modified by scaling and translation. A similar model was also introduced in [Agrawal et al., 1995], whose main disadvantage is that it does not allow outliers within time windows of a specified length and that the choice of linear functions used for mapping is restricted. Hence, the similarity model of [Das et al., 1997] will be discussed below. The notion of similar sequences can be mathematically formulated as follows.

Definition 1 [Das et al., 1997]. Given two trajectories $x = [x_1, \dots, x_p]$ and $y = [y_1, \dots, y_p]$ and numbers $0 < \gamma, \varepsilon < 1$. Sequences x and y are (G, γ, ε)-similar if and only if there exist a function $g \in G$ and sub-sequences $x_g = [x_{i_1}, \dots, x_{i_{\gamma p}}]$ and $y_g = [y_{j_1}, \dots, y_{j_{\gamma p}}]$, where $i_k < i_{k+1}$ and $j_k < j_{k+1}$ for all $k = 1, \dots, \gamma p - 1$, such that

$$\frac{y_{j_k}}{1 + \varepsilon} \le g(x_{i_k}) \le y_{j_k}(1 + \varepsilon), \qquad 1 \le k \le \gamma p. \tag{5.23}$$

Parameter γ denotes the relative length of the sub-sequence of x which can be mapped to y, i.e. the sub-sequence contains γp points. Parameter ε defines the precision of the mapping, i.e. how closely the sub-sequences should be matched. Based on this definition, the similarity measure for trajectories can be defined as the coefficient that maximises the length of sub-sequences satisfying condition (5.23).

Definition 2 [Das et al., 1997]. For given trajectories x and y, a set G and a number $0 < \varepsilon < 1$, the similarity of x and y is defined by:

$$s_{G,\varepsilon}(x, y) = \max\{\gamma \mid x, y \text{ are } (G, \gamma, \varepsilon)\text{-similar}\}. \tag{5.24}$$

This similarity measure takes its values in the interval [0, 1], higher values meaning greater similarity. In [Das et al., 1997] only the set of linear functions is considered:

$$G_{lin} = \{g(x) = ax + b \mid a, b \in R,\ a \ne 0\}, \tag{5.25}$$

which makes it possible to determine similarity between trajectories with different scaling and translation factors. If the transformation function $g \in G$ is known, then the problem of calculating the similarity measure is reduced to searching for the longest common sub-sequence between g(x) and y.
This problem can be solved by dynamic programming in $O(p^2)$ time. If the length of the longest common sub-sequence is at least p − h, $h \in \mathbb{N}$, then the computation time is reduced to $O(hp)$.
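For a known transformation g, the dynamic-programming search for the longest common sub-sequence under the tolerance condition (5.23) can be sketched as follows. This is a minimal illustration and not the implementation of [Das et al., 1997]: the name `lcs_similarity` is hypothetical, both sequences are assumed to contain positive values (otherwise the interval in (5.23) is not well ordered), and the result is normalised by the length of x.

```python
def lcs_similarity(x, y, g, eps):
    """Fraction of x that can be matched to y after applying g.

    Two points match if y_j / (1 + eps) <= g(x_i) <= y_j * (1 + eps).
    The table is filled in O(len(x) * len(y)) time.
    """
    gx = [g(v) for v in x]
    n, m = len(gx), len(y)
    # table[i][j] holds the length of the longest matching sub-sequence
    # between the first i points of g(x) and the first j points of y.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            lower = y[j - 1] / (1.0 + eps)
            upper = y[j - 1] * (1.0 + eps)
            if lower <= gx[i - 1] <= upper:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m] / n


# Invented data: y equals 2 * x + 1 except for one outlier (100).
x_seq = [1, 2, 3, 4, 5]
y_seq = [3, 5, 100, 9, 11]
similarity = lcs_similarity(x_seq, y_seq, lambda v: 2 * v + 1, eps=0.01)
```

In this invented example four of the five points can be matched, so the outlier lowers the similarity to 0.8 instead of destroying it, which is the robustness property the model aims for.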

In order to find the linear transformation g that maximises the length of the longest common sub-sequence between g(x) and y (with tolerance ε), it is necessary to identify all different linear functions specified by pairs (a, b) and to check each one. Since this procedure of finding the longest common sub-sequence for each pair (a, b) is very time-consuming (the computation time is $O(p^6)$) and is practically infeasible, Das et al. have proposed an approximation algorithm based on the use of methods from computational geometry. The main idea of this algorithm is to reduce the number of candidate pairs (a, b) representing different linear functions. This is done by computing bounds for possible values of a and b and by defining a grid to sample the area restricted by the bounds. For each point of the grid, the longest common sub-sequence between g(x) and y is then calculated and the best solution is chosen. The computational time of this algorithm is $O(z(1-\gamma)p^2)$, where z is the number of sampling points on the grid. The accuracy of the algorithm depends on the size of the sampling grid. The algorithm presented above provides a clear intuitive model for measuring similarity between trajectories. However, the problem of searching for a transformation that best matches two trajectories is restricted to the class of linear functions because of its complexity. The underlying similarity model uses the shape of trajectories as a basic criterion for comparison, allowing a number of local transformations such as scaling, translation and a time difference between similar sub-sequences of trajectories. For this reason, the model cannot deal with global transformations of trajectories, for instance a variation in the trend or orientation of trajectories in $R^M$.

Similarity measures based on wavelet decomposition

An alternative approach to using a similarity model is concerned with the consideration of statistical properties of time series and the underlying dynamics.
In this case the similarity is defined using these statistical properties rather than the shape of the time series. In [Struzik, Siebes, 1998] a similarity framework based on wavelet decomposition is introduced, which provides a variety of criteria for the evaluation of similarity between time series. Using a hierarchical representation of time series (up to a certain resolution), two classes of similarity measures are considered in this framework: global similarity based on scaling properties of time series, and local similarity using the scale-position bifurcation representation of time series. To define the global statistical similarity measure, it is essential that this remains unchanged for any arbitrary part of the time series, provided that the characteristics of the time series do not change in time (stationarity) or with the length of the considered part of the time series. A suitable parameter for indicating global similarity of the time series with its parts is the exponent that has to be used as a re-scaling factor for the height of the (sliding) time window

(through which a part of the time series is observed) in order to obtain a new time series similar to the original one. It means that time series can be considered to be statistically similar if they possess similar values of scaling parameters such as this exponent. Struzik and Siebes propose the use of the Hurst exponent for the comparison of time series, which was developed within the domain of fractal geometry and is broadly applicable in time series analysis. The values of the Hurst exponent [Falconer, 1990] give evidence about three possible types of behaviour of the time series: a long-range positive correlation in the time series, expressed visually by moderate jumps; a negative correlation (so-called anti-correlation), displayed by wilder behaviour and numerous intensive jumps; and Brownian motion, shown as random noise. Using the Hurst exponent as a criterion for similarity, it is possible to classify time series by their scaling behaviour, whereas the form of trajectories is irrelevant. The scaling parameters of functions can be successfully estimated by the wavelet transform, as shown in [Arneodo et al., 1995]. In particular, the Hurst exponent can be derived from the wavelet transform modulus maxima (WTMM) representation of time series introduced by [Mallat, Zhong, 1992]. In order to evaluate the local similarity measure between time series, Struzik and Siebes use the bifurcation representations of two time series and estimate the degree of similarity of these representations. Bifurcations [Struzik, 1995] form a set of points reflecting the landscape of the wavelet transform tree and capture the intricate structure of the time series. Each bifurcation can be represented by its position and scale co-ordinates and the corresponding value of the wavelet transform in the bifurcation point. The bifurcations as well as the wavelet transform itself can be evaluated for the time series up to a certain resolution, so that only coarse features of the time series are taken into account.
In order to determine a degree of similarity of two bifurcation representations, the correlation function between them is estimated as a normalised sum of correlation measures over all pairs of bifurcation points. The correlation measure is parameterised by the additional scale and position shift of the representations with respect to each other in order to find a better match between representations. The two types of similarity measures discussed are able to recognise statistically similar time series or similar parts within a time series in the presence of scaling, translation and polynomial bias, and to discover regularities in the time series. Using other scale-position localised features of the time series, dependent on the applied wavelet, instead of bifurcations, it is possible to design a variety of local similarity measures. For their practical use, efficient techniques for increasing the accuracy of the representation with compactly supported wavelets and for optimising algorithms are required. However, it must be noted that these measures take into account only certain statistical properties of the time series, which can be relevant for some specific applications, but ignore the form of the time series and their

temporal behaviour. Thus, this similarity framework considers only one very special aspect of similarity, which corresponds to a somewhat narrow viewpoint on the general notion of similarity between trajectories.

Statistical measures of similarity

In many applications the purpose of time series classification is to partition a collection of time series into groups, or clusters, of series with similar dynamics. It means that the notion of similarity is used to quantify the closeness of dynamic systems and their attractors rather than of individual time series. For dynamic systems with M degrees of freedom, attractors are defined as a subset of the M-dimensional phase space towards which almost all sufficiently close trajectories get attracted asymptotically [Grassberger, Procaccia, 1983]. One of the measures characterising the local structure of attractors is the correlation integral, describing the interdependence between points (observations) of time series. It measures the degree of randomness between subsequent points on the attractor by calculating the spatial correlation between pairs of points on the attractor. Consider two sets of points on the attractor, $x = [x_1, \dots, x_p]$ and $y = [y_1, \dots, y_p]$, obtained from two time series

$$x(t) = [x(t + \tau), \dots, x(t + p\tau)] = [x(t + k\tau)]_{k=1}^{p},$$

$$y(t) = [y(t + \tau), \dots, y(t + p\tau)] = [y(t + k\tau)]_{k=1}^{p},$$

with a fixed time increment τ between successive observations. The correlation integral is defined according to [Grassberger, Procaccia, 1983] by

$$C(\varepsilon) = \lim_{p \to \infty} \frac{1}{p^2}\,\bigl|\{(i, j) : \|x_i - y_j\| < \varepsilon\}\bigr| = \lim_{p \to \infty} \frac{1}{p^2} \sum_{i,j=1}^{p} \Theta(\varepsilon - \|x_i - y_j\|) = \int_0^{\varepsilon} c(\varepsilon')\, d^{M}\varepsilon', \tag{5.26}$$

where Θ(x) is the Heaviside function given by

$$\Theta(x - x_0) = \begin{cases} 1 & \text{for } x \ge x_0 \\ 0 & \text{for } x < x_0 \end{cases} \qquad (x_0 > 0) \tag{5.27}$$

and c(ε) is the standard correlation function. The correlation integral measures the degree to which points are grouped in the phase space by calculating the number of pairs $(x_i, y_j)$, i, j = 1, ..., p, for which the distance between $x_i$ and $y_j$ is less than ε. The correlation integral can be generalised to the cross-correlation sum [Kantz, 1994]:

$$C_{xy}(\varepsilon) = \frac{1}{p^2} \sum_{i,j=1}^{p} \Theta(\varepsilon - \|x_i - y_j\|) \tag{5.28}$$

and can be used directly or after normalisation

$$s(x, y) = \frac{C_{xy}}{\sqrt{C_{xx}\, C_{yy}}} \tag{5.29}$$

as a similarity measure between time series ([Manuca, Savit, 1996], [Schreiber, Schmitz, 1997]). Another possibility for defining the similarity measure is to use the cross-prediction error based on some time series model [Schreiber, 1997]. The usefulness and applicability of similarity measures based on the cross-correlation sum and the cross-prediction error was illustrated by some examples in [Schreiber, Schmitz, 1997], where clustering of time series was based on dissimilarities between time series defined as d(x(t), y(t)) = 1 − s(x(t), y(t)). It was shown that time series generated with different parameter settings can be clearly separated into distinct clusters. In order to visualise clustering results, an abstract space of dynamic properties of time series was introduced. Once c clusters are formed, they can be represented in the space of mutual dissimilarities of dynamic objects, i.e. each co-ordinate i, i = 1, ..., c, is the average dissimilarity of an object to the objects in cluster i. This representation allows a visual judgement of the structure present in dynamic objects based on their mutual dissimilarities.

Smoothing of trajectories before the analysis of their temporal behaviour

As stated in 5.2, trajectories may contain undesired deviations due to random fluctuations or measurement errors. In order to reduce their influence on the analysis results and to filter out the specific behaviour of trajectories, smoothing techniques are often applied before the actual analysis starts. One of the most frequently used methods for smoothing is a moving-average filter of length r, referred to in the following as a median filter. Consider a sequence of p measurements $x = [x_1, \dots, x_p]$. The values of a new sequence $\bar{x}$ are calculated as mean values over r values in the following way. If r is an odd number, (r−1)/2 left and (r−1)/2 right neighbouring values are considered:

$$\bar{x}_k = \frac{x_{k-(r-1)/2} + \dots + x_k + \dots + x_{k+(r-1)/2}}{r} = \frac{1}{r} \sum_{j=k-(r-1)/2}^{k+(r-1)/2} x_j, \tag{5.30}$$

where $k = \frac{r-1}{2} + 1, \dots, p - \frac{r-1}{2}$. If r is an even number, r/2 left and r/2 right neighbouring values are considered:

$$\bar{x}_k = \frac{x_{k-r/2} + \dots + x_k + \dots + x_{k+r/2}}{r + 1} = \frac{1}{r + 1} \sum_{j=k-r/2}^{k+r/2} x_j, \tag{5.31}$$

where $k = \frac{r}{2} + 1, \dots, p - \frac{r}{2}$. The example in Figure 5-9 illustrates the effect of smoothing of trajectories. The original trajectory x(t) contains a significant amount of random fluctuation. Applying a median filter of length 250, a smooth trajectory is obtained that preserves the temporal behaviour but is filtered from noise.

[Figure 5-9: A trajectory x(t) before and after smoothing]

Smoothing with equations (5.30) and (5.31) can be used for the off-line analysis of trajectories obtained during a period of time from different dynamical systems. If trajectories are monitored over time and parts of them are compared with each other during on-line analysis, these equations are not applicable in this form, since future values of the trajectories are not available. Therefore, in order to smooth a trajectory, the mean value is calculated over the last r values and is assigned to the current value of the trajectory. Mean values are updated along with new incoming values of the trajectory. A trajectory smoothed in this way has a particular property: it lies above the original trajectory if the trajectory exhibits a decreasing trend, and below it if the trajectory exhibits an increasing trend. This modified smoothing technique is broadly used in applications of time series analysis such as the technical analysis of share prices [Welcker, 1991, p. 48].
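The two smoothing variants described above, the centred mean for off-line analysis and the mean over the last r values for on-line analysis, can be sketched as follows. The function names and the sample trajectory are assumptions made for illustration.

```python
def smooth_centred(x, r):
    """Centred moving mean for off-line analysis.

    With h = r // 2 neighbours on each side, an odd r averages r values
    and an even r averages r + 1 values. Border points without a full
    window are dropped, so the result has len(x) - 2 * h elements.
    """
    h = r // 2
    window = 2 * h + 1
    return [sum(x[k - h:k + h + 1]) / window for k in range(h, len(x) - h)]


def smooth_trailing(x, r):
    """Mean of the last r values, usable on-line because no future values
    are needed. The first smoothed value belongs to the r-th measurement."""
    return [sum(x[k - r + 1:k + 1]) / r for k in range(r - 1, len(x))]


# Invented, strictly increasing trajectory.
trajectory = [1, 2, 3, 4, 5, 6]
centred = smooth_centred(trajectory, 3)    # assigned to the 2nd..5th points
trailing = smooth_trailing(trajectory, 3)  # assigned to the 3rd..6th points
```

For this increasing trajectory every trailing mean is smaller than the measurement it is assigned to, which illustrates the property that the on-line smoothed curve runs below a rising trajectory.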

The length r of the interval for smoothing is chosen depending on the length of the sequence x and on the desired accuracy. For instance, for the analysis and forecasting of share prices, trajectories (time series) obtained over 6 or 12 months on the basis of daily measurements are usually smoothed over an interval of 100 or 200 measurements, respectively [Welcker, 1991, p. 48]. For traffic control and traffic forecasts [Engels, Chadenas, 1997], the data is usually collected every minute during one day and then aggregated over five, ten or fifteen minute intervals. In fault diagnosis in anaesthesia [Vesterinen et al., 1997], measurement data of gas flow signals are gathered each second, and sequences of, for example, 200 measurements are sent to the fault diagnosis software for analysis. To eliminate deviations in gas flow signals, a median filter of length 3 is sufficient. If a trajectory has a wavy form, it can be represented by trigonometric polynomials. In order to obtain them, a trajectory is first smoothed by some median filter and then the discrete Fourier transform (DFT) is applied. The resulting function is a sum of waves with different periods weighted by the Fourier coefficients. Half of the coefficients are responsible for waves with higher frequencies and can be neglected. A smooth trajectory is obtained by applying the inverse Fourier transform to the remaining coefficients.

Similarity measures based on characteristics of trajectories

The matching approaches based on distance measures in the feature space or on functional transformations, and the statistical approaches based on cross-correlation measures or on specific parameters of time series presented in the previous sections, yield a quantitative evaluation of similarity between trajectories. In many cases the evaluation of similarity should, however, reproduce the judgement of an expert based mainly on qualitative and possibly subjective features.
Since in such problems similarity cannot be inferred by quantitative analysis, modelling similarity as a cognitive process simulating human decision making seems to be appropriate [Binaghi et al., 1993]. Among the various knowledge representation formalisms proposed for reasoning in the presence of uncertainty and imperfect knowledge typical for human cognitive processes [Graham, Jones, 1988], fuzzy set theory provides the most plausible tool for modelling cognitive processes, in particular a recognition process [Pedrycz, 1990]. Since the evaluation of similarity is one of the steps of the recognition process, it is natural to take advantage of the fuzzy framework in this context as well. With the help of fuzzy set theory, it is possible to model a gradual representation of similarity according to human judgement and in this way to fuzzify the difference between similar and non-similar dynamic objects. One of the first implementations of structural similarity was presented in [OMRON ELECTRONICS GmbH, 1991] in the area of signal analysis. The purpose of the designed system was to distinguish between different types of balls based on the analysis of oscillation

spectra after the impact of balls on the base. The identification of a specific oscillation pattern allows for the recognition of a certain ball type. For the definition of structural similarity between balls, four characteristics concerning the number of high and low frequency impacts, pulse frequency and oscillation duration were considered and modelled as linguistic variables. Using the fuzzy rule-based system, it was possible to recognise slight differences between impact sequences and to classify balls into three groups. This application example shows the advantages of using fuzzy structural similarity in the pattern recognition process, which is carried out in accordance with human subjective evaluation. A general algorithm to determine a measure of structural fuzzy similarity between trajectories based on relevant characteristics of these trajectories is proposed in [Joentgen, Mikenina et al., 1999b, p. 84]. The idea of this algorithm is to represent the behaviour of trajectories by a finite number of characteristics and to measure the similarity of trajectories with respect to each of these characteristics using a fuzzy definition of similarity. In this algorithm expert knowledge is incorporated at two points: by the choice of characteristics that must adequately describe the temporal structure of trajectories, and by the subjective evaluation of the admissible difference of a characteristic's values for two trajectories (from the viewpoint of their similarity). The subjective evaluations for each characteristic are given in the form of fuzzy sets, which are used as a basis for the similarity definition. By choosing different sets of characteristics, a number of specific definitions of similarity can be derived. For the sake of simplicity, the algorithm is formulated for the case of one-dimensional trajectories x(t) and y(t) and will be generalised to the multidimensional case afterwards.

Algorithm 7: Determination of structural similarity between trajectories [Joentgen, Mikenina et al., 1999b, p. 84].

1. A set of relevant characteristics $\{K_1, \dots, K_L\}$ used for the description of structural similarity is chosen.
2. For each characteristic $K_i$, i = 1, ..., L, a fuzzy set $A_i$ labelled "admissible difference for characteristic $K_i$" with membership function $u_i$ is defined.
3. All characteristics' values $K_i(x)$ for the trajectory x(t) and $K_i(y)$ for the trajectory y(t) are calculated.
4. For each characteristic $K_i$, i = 1, ..., L, the difference $\Delta K_i = K_i(x) - K_i(y)$ is calculated.
5. The degree of membership $s_i = u_i(\Delta K_i)$ of the difference $\Delta K_i$ to the fuzzy set $A_i$ is calculated for each characteristic $K_i$. These membership values can be interpreted as similarities between trajectories x(t) and y(t) with respect to the chosen characteristics.

6. Finally, the vector $[s_1, \dots, s_L]$ of partial similarities is transformed using specific transformations (e.g. γ-operator, fuzzy integral, minimum, maximum) into a real number s(x, y) expressing the overall degree of similarity: $s(x, y) = aggr(s_1, \dots, s_L)$.

Each value of the membership function $u_i$ of a fuzzy set $A_i$, i = 1, ..., L, expresses the degree to which a certain difference of two values of characteristic $K_i$ can be considered admissible and two trajectories can be considered similar with respect to this characteristic. Obviously, the maximum degree of the membership function corresponds to a difference value equal to zero, i.e. trajectories are definitely similar if their characteristics' values are equal. The shape and the support of the membership function should be defined context-dependently by an expert. Thus, fuzzy sets represent an expert's understanding and his/her subjective judgement as regards the meaning of similarity. The extension of the algorithm to the case of M-dimensional trajectories x(t) and y(t), $x, y \in Y_1 \times Y_2 \times \dots \times Y_M$, is straightforward and leads to the following two modifications of the algorithm:

Algorithm 7a: Extension of Algorithm 7 to the multidimensional case.

1. A set of relevant characteristics $\{K_1, \dots, K_L\}$ used for the description of structural similarity is chosen.
2. For each characteristic $K_i$, fuzzy sets $A_i^{Y_j}$ "admissible difference for characteristic $K_i$" are defined on each sub-universe $Y_j$, j = 1, ..., M.
3. Characteristics' values $K_i(x)$ for the trajectory x(t) and $K_i(y)$ for the trajectory y(t) are calculated with respect to each of the M dimensions, resulting in vectors $K_i(x) = [K_i(x_1), \dots, K_i(x_M)]$ and $K_i(y) = [K_i(y_1), \dots, K_i(y_M)]$.
4. For each characteristic $K_i$ an M-dimensional difference vector $\Delta K_i = K_i(x) - K_i(y)$ is calculated.
5. Partial similarity measures $s_i^{Y_j}$, j = 1, ..., M, with respect to characteristic $K_i$ are determined according to Algorithm 7 for each dimension. The result is the M-dimensional vector of partial similarities $[s_i^{Y_1}, \dots, s_i^{Y_M}]$.
This vector can be aggregated to a real number $s_i$ expressing a partial similarity measure with respect to characteristic $K_i$ in three ways:

5.1 The vector $[s_i^{Y_1}, \dots, s_i^{Y_M}]$ is transformed using specific transformations (e.g. minimum, maximum, arithmetic mean, γ-operator, integral) into a real number $s_i = aggr(s_i^{Y_1}, \dots, s_i^{Y_M})$. The resulting vector $[s_1, \dots, s_L]$ of partial similarities is transformed using one of the given specific transformations into a real number expressing the overall degree of similarity: $s_{Y_1 \times Y_2 \times \dots \times Y_M}(x, y) = aggr(s_1, \dots, s_L)$.

5.2 Components of the vector $[s_i^{Y_1}, \dots, s_i^{Y_M}]$ are transformed into distance measures expressing partial dissimilarities according to equation (5.6). The resulting vector $[d_i^{Y_1}, \dots, d_i^{Y_M}]$ is then transformed into the distance measure $d_i$ with respect to characteristic $K_i$ using e.g. the Euclidean norm (5.7). The aggregation of partial distance measures $d_i$ to an overall distance $d_{Y_1 \times Y_2 \times \dots \times Y_M}(x, y)$ is performed with the help of the Euclidean norm as well.

5.3 The vector $[s_i^{Y_1}, \dots, s_i^{Y_M}]$ is transformed using specific transformations into a real number $s_i = aggr(s_i^{Y_1}, \dots, s_i^{Y_M})$. Components of the resulting vector $[s_1, \dots, s_L]$ are transformed into distance measures expressing partial dissimilarities according to equation (5.6). The resulting vector $[d_1, \dots, d_L]$ is aggregated to an overall distance measure $d_{Y_1 \times Y_2 \times \dots \times Y_M}(x, y)$ using the Euclidean norm (5.7).

The advantage of Algorithm 7a is the separate definition of the fuzzy sets $A_i$ for each dimension, which, due to the better interpretability, can simplify the task for an expert. However, the algorithm requires an additional aggregation step for the transformation of M-dimensional vectors into real numbers.

Algorithm 7b: Extension of Algorithm 7 for the multidimensional case.

1. A set of relevant characteristics $\{K_1, \dots, K_L\}$ used for describing structural similarity is chosen.
2. For each characteristic $K_i$ the M-dimensional fuzzy set $A_i$ "admissible difference for characteristic $K_i$" is defined on $Y_1 \times Y_2 \times \dots \times Y_M$.
3., 4. These steps are the same as in Algorithm 7a.
5. Partial similarity measures $s_i = u_i(\Delta K_i)$, i = 1, ..., L, are calculated as degrees of membership of the M-dimensional difference vectors $\Delta K_i$ to the M-dimensional fuzzy set $A_i$.
6. Partial degrees of similarity are transformed, analogously to the one-dimensional case, into the overall degree of similarity $s_{Y_1 \times Y_2 \times \dots \times Y_M}(x, y)$.

In contrast to Algorithm 7a, all calculations in Algorithm 7b are performed for M-dimensional vectors and not for components of these vectors.
This means that one has to deal with multidimensional fuzzy sets, whose definition is usually more sophisticated than in the one-dimensional case.

The general definition of structural similarity according to Algorithm 7 can be used to derive a set of specific definitions that take into account different properties of trajectories. It should be emphasised that each particular definition of structural similarity considers a certain aspect of the similarity notion and is suitable only for a certain number of problems. In the following

section a number of definitions of structural similarity with respect to different properties of trajectories is introduced. This list can be extended by combinations of these definitions and by introducing other relevant properties of trajectories.

Definition of structural similarity based on the trend of trajectories.

The trend of a trajectory is defined as a slow but consistent unidirectional change of a trajectory [Vesterinen et al., 1997]. It describes a simple general behaviour of a trajectory. The form of a trend can be e.g. a slope, a step or a damped step. Trends that are shaped like a slope can be estimated by fitting a straight line to a given set of measurements (constituting a trajectory) with the least squares approximation. Given a set of measurements $(x_1, t_1), (x_2, t_2), \dots, (x_p, t_p)$, where $x_k$, $k = 1, \dots, p$, are known to be subject to measurement errors, the task of approximation is to estimate a regression line $F(t) = a_1 t + a_2$. Regression coefficients $a_1$ and $a_2$ are determined so that the values $\hat{x}_k = a_1 t_k + a_2$, $k = 1, \dots, p$, are as close as possible to the values $x_1, x_2, \dots, x_p$ in a least squares sense. This requires the minimisation of the sum of squared errors:

$$ e^2 = \sum_{k=1}^{p} (\hat{x}_k - x_k)^2. \qquad (5.32) $$

Regression coefficients can also be estimated using the covariance of x(t) and t [Rüger, 1989, p. 104]:

$$ a_1 = \frac{\mathrm{Cov}(x, t)}{\mathrm{Var}(t)}, \qquad a_2 = \bar{x} - a_1 \bar{t}, \qquad (5.33) $$

where

$$ \mathrm{Cov}(x, t) = \frac{1}{p} \sum_{k=1}^{p} (x_k - \bar{x})(t_k - \bar{t}), \qquad (5.34) $$

$$ \mathrm{Var}(t) = \frac{1}{p} \sum_{k=1}^{p} (t_k - \bar{t})^2 = \frac{1}{p} \sum_{k=1}^{p} t_k^2 - \bar{t}^2, \qquad (5.35) $$

and $\bar{t}$ and $\bar{x}$ are the mean values of t and x(t), respectively, over the time interval $[t_1, t_p]$.

Definition 3. Two trajectories x(t) and y(t) are considered similar with respect to their temporal trend if they are characterised by similar values of parameter $a_1$.

This definition of structural similarity is illustrated in Figure 5-10, where trajectories represent three classes of behaviour: increasing, constant and decreasing. Although these trajectories are rather similar with respect to their form, they are not considered to be similar with respect to their temporal trend.
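The covariance form of the regression coefficients in (5.33) to (5.35) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `trend_coefficients` and the plain-list representation of a trajectory are assumptions made here, not part of the original method description.

```python
def trend_coefficients(t, x):
    """Return (a1, a2) of the regression line F(t) = a1 * t + a2.

    Follows the covariance form (5.33)-(5.35); t and x are equally long
    sequences of time instants and measurements (hypothetical interface).
    """
    p = len(t)
    if p != len(x) or p < 2:
        raise ValueError("t and x must have the same length (at least 2)")
    t_mean = sum(t) / p
    x_mean = sum(x) / p
    # Cov(x, t) as in (5.34) and Var(t) as in (5.35), both with factor 1/p.
    cov = sum((xk - x_mean) * (tk - t_mean) for xk, tk in zip(x, t)) / p
    var = sum((tk - t_mean) ** 2 for tk in t) / p
    if var == 0:
        raise ValueError("all time instants are identical")
    a1 = cov / var
    a2 = x_mean - a1 * t_mean
    return a1, a2
```

For two trajectories, the slopes returned for each of them would then serve as the values of characteristic K whose difference is evaluated by the fuzzy set "admissible difference".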

Figure 5-10: Structural similarity based on the trend of trajectories (trajectories A, B and C plotted as x over t)

The corresponding similarity measure s(x, y) can be calculated with Algorithm 7, where parameter $a_1$ is used as characteristic K of trajectories and fuzzy set A denotes the admissible difference for the trend. This structural similarity can be applied to find clusters of trajectories with a similar temporal trend, where the translation of trajectories in the feature space along M dimensions or translation over time and a degree of their fluctuation are irrelevant. If the translation of trajectories is a relevant criterion for analysis, i.e. the location of trajectories in the feature space must be considered together with the trend, then Definition 3 of the similarity measure can be presented as follows.

Definition 4. Two trajectories x(t) and y(t) are considered similar with respect to their temporal trend and location if they are characterised by similar values of parameters $a_1$ and $a_2$.

Definition of structural similarity based on the curvature of trajectories.

The curvature of a trajectory at each point describes the degree to which a trajectory is bent at this point. It is evaluated by the coefficients of the second derivative of a trajectory at each point, which can be defined by the following equation (for a one-dimensional trajectory; see footnote 3):

$$ cv_k = x''_k = \frac{x'_k - x'_{k-1}}{t_k - t_{k-1}}, \qquad k = 3, \dots, p, \qquad (5.36) $$

where $x'_k$ denotes a coefficient of the first derivative at point $x_k$ given by

$$ x'_k = \frac{x_k - x_{k-1}}{t_k - t_{k-1}}, \qquad k = 2, \dots, p. \qquad (5.37) $$

Footnote 3: Equations (5.36) and (5.37) represent only one of the possibilities to calculate numerically the first and the second derivatives.
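As a hedged illustration of (5.36) and (5.37), the following Python sketch computes the backward-difference coefficients of the first and second derivative for a sampled one-dimensional trajectory. The function names and the list-based interface are assumptions introduced here for illustration only.

```python
def first_derivative(t, x):
    """Coefficients x'_k, k = 2..p, by backward differences as in (5.37)."""
    return [(x[k] - x[k - 1]) / (t[k] - t[k - 1]) for k in range(1, len(x))]


def curvature_coefficients(t, x):
    """Coefficients cv_k, k = 3..p, of the second derivative as in (5.36)."""
    d = first_derivative(t, x)  # d[j] holds x'_(j+2) in the 1-based notation
    return [
        (d[k - 1] - d[k - 2]) / (t[k] - t[k - 1])
        for k in range(2, len(x))  # k is the 0-based index of point x_(k+1)
    ]
```

A parabola sampled at unit steps yields a constant list of coefficients, whereas any straight line yields zeros, which matches the remark that linear functions have zero curvature at all points.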

Substituting expression (5.37) for the first derivative into equation (5.36), the following equation based on the values of the original trajectory is obtained for the coefficients of the second derivative:

$$ cv_k = x''_k = \frac{x_k - 2x_{k-1} + x_{k-2}}{(t_k - t_{k-1})^2}, \qquad k = 3, \dots, p. \qquad (5.38) $$

If trajectories possess local minima and maxima, which can be detected by looking for a sign change in the values of the first derivative, then it can be sufficient to consider the coefficients of the second derivative only at those specific points where the curvature is maximum. The most distinctive feature when considering the curvature is the sign of the coefficients of the second derivative. If the coefficient is positive over a certain time period, then a trajectory is convex on this interval (open to the top). If the coefficient is negative over a certain time period, a trajectory is concave (open to the bottom). If the coefficient is equal to zero at some point, which is then called an inflection point, there is no curvature at this point. Inflection points appear in oscillating trajectories and indicate the change of curvature from convex to concave or vice versa. All linear functions are characterised by zero curvature at all points.

Definition 5. Two trajectories x(t) and y(t) are considered similar with respect to their curvature if they are characterised by similar coefficients $cv_k$ of the second derivative.

This definition of structural similarity is illustrated in Figure 5-11, where three types of trajectories are represented: concave trajectory B, convex trajectory C and trajectory A with oscillating behaviour changing from concave to convex.

Figure 5-11: Structural similarity based on curvature of trajectories (trajectories A, B and C plotted as x over t)

The similarity measure s(x, y) based on the curvature can be calculated using Algorithm 7 or one of its extensions, where characteristic K is a vector of coefficients of the second derivative $K = [cv_3, \dots, cv_p]$. For multidimensional trajectories, components of vectors K(x) and K(y) are

also vectors consisting of partial second derivatives. Fuzzy set A denotes the admissible difference for the curvature. If a temporal translation is irrelevant to the process of recognising similar patterns in trajectories, then vectors K(x) and K(y) of characteristic K obtained for two trajectories can be cyclically shifted with respect to each other and the similarity measure is defined for each combination. In this way the maximum similarity corresponding to the best match of trajectories with respect to their curvature can be found. The similarity measure based on the curvature is particularly suitable for trajectories with a low number of fluctuations and a wavy form. This measure is, however, sensitive to scaling, i.e. a trajectory transformed by a scaling factor has a different curvature than the initial one.

Definition of structural similarity based on the smoothness of trajectories.

The smoothness of a trajectory describes the degree of oscillations in its behaviour, and can be characterised by the number of sign changes in its second derivative. The higher this number, the more oscillations the trajectory contains and the less smooth it is. In order to estimate the smoothness of a trajectory $x(t) = [x_1, \dots, x_p]$, consider a vector of coefficients of the second derivative of a trajectory and transform it to a binary vector h using the following coding rule: positive values are substituted by +1, negative values are substituted by -1 and zero values are dropped. The result is the vector $h = [h_1, \dots, h_{p-z-2}]$, where z is the number of zero values in the vector of second derivatives. Calculate a new vector w, whose elements $w_k$, $k = 1, \dots, p-z-3$, are sums of each pair of successive elements $h_k$ and $h_{k+1}$. Elements $w_k$ take their values from the set {-2, 0, 2}, where a zero value corresponds to a change of sign in the code vector and the vector of second derivatives, respectively.
Thus, calculating the number of zero values in vector w, the number sm of sign changes of the second derivative of a trajectory is obtained:

$$ sm = \left| \{\, w_k : w_k = 0 \,\} \right| = \left| \{\, k : h_k + h_{k+1} = 0,\ k = 1, \dots, p-z-3 \,\} \right|. \qquad (5.39) $$

Definition 6. Two trajectories x(t) and y(t) are considered similar with respect to their smoothness if they are characterised by similar values of parameter sm.

In Figure 5-12 trajectories with different degrees of smoothness are shown. Trajectories A and B are both characterised by oscillating behaviour, where A (sm = 6) is more oscillating than B (sm = 4). In contrast, trajectory C exhibits very smooth behaviour (sm = 0).
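The coding rule and the count in (5.39) can be illustrated with a short Python sketch. The function name `smoothness` and its input (a plain list of second-derivative coefficients) are assumptions chosen for this example.

```python
def smoothness(second_derivatives):
    """Number sm of sign changes in a vector of second-derivative coefficients.

    Coding rule from the text: positive -> +1, negative -> -1, zeros dropped;
    a zero sum of two successive codes marks one sign change.
    """
    h = [1 if c > 0 else -1 for c in second_derivatives if c != 0]
    w = [h[k] + h[k + 1] for k in range(len(h) - 1)]
    return sum(1 for wk in w if wk == 0)
```

Because zero coefficients are dropped before pairing, a sign change separated by an inflection point with exactly zero curvature is still counted once.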

Figure 5-12: Structural similarity based on the smoothness of trajectories (trajectories A, B and C plotted as x over t)

The similarity measure s(x, y) based on the smoothness of trajectories can be obtained using Algorithm 7, or one of its extensions, where characteristic K is chosen to be parameter sm of smoothness and fuzzy set A denotes the admissible difference for smoothness. This similarity measure can be applied to the comparison of trajectories where oscillating behaviour is assumed. It is more general than the similarity measure based on the pointwise curvature of trajectories since it considers only the total number of oscillations but not the degree of curvature of each separate wave. This similarity measure is suitable for trajectories for which scaling and translation transformations are irrelevant, but it is sensitive to outliers. To avoid the influence of outliers on the value of similarity, trajectories must be pre-processed and smoothed by a median filter according to (5.30) and (5.31) or using some other smoothing technique.

Definition of structural similarity based on specific temporal parameters of trajectories.

Similarity measures based on curvature and smoothness characterise the general behaviour of trajectories with respect to their form and oscillating character. For some problems, it may be important to consider concrete parameters of single waves appearing in a trajectory in order to identify similar temporal patterns in trajectories. Consider the trajectory shown in Figure 5-13. This temporal pattern in the trajectory can be decomposed into segments indicating local trends in such a way that each segment is bounded by inflection points or by an inflection point and an extreme value ([Bakshi et al., 1994], [Angstenberger et al., 1998]).

Figure 5-13: Segmentation of a temporal pattern of a trajectory according to elementary trends

Seven types of elementary segments (trends) can be distinguished in a trajectory, each of which is characterised by a constant sign of the first and second derivatives (Figure 5-14).

Figure 5-14: Qualitative (left) and quantitative (right) temporal features obtained by segmentation (left: elementary segment types A to G; right: a trajectory with points a to e, values x(a), x(b), slopes g(a), g(b) and time instants t(a) to t(e))

Such a triangular representation of trends provides qualitative features for a description of the segments. In order to derive quantitative information from the segments, these are described by the following set of temporal parameters (see segment a-b in Figure 5-14):

- t(a), t(b) are the start and end time of the segment, i.e. time instants of inflection points and local extreme values of the trajectory,
- x(a), x(b) are the values at the inflection points and the local extreme values of the trajectory,
- g(a), g(b) are the values of the start and end slope of the segment.

This set of features provides the necessary information about the temporal development of a trajectory in a piecewise manner, but it can be extended with some more parameters for a

more complete description of the trajectory. The following temporal features can be considered additionally:

- gg(b) is the value of curvature at extreme point b,
- wd is the duration of a pattern (e.g. a hill), which is defined with respect to inflection points (duration of two elementary segments, wd(b) = t(c) - t(a)) or with respect to the baseline of a pattern (duration of four coherent segments, wd(b) = t(e) - t(d)),
- $tn_1$ is the time interval until the first zero value of a trajectory ($tn_1$ = t(e)),
- $tn_2$ is the time interval until the second zero value of a trajectory,
- I is an integral of the part of a trajectory until its first zero value,
- CG is the centre of gravity of the part of a trajectory until its first zero value,
- Med is the median of a trajectory's values,
- RV is the range of values of a trajectory,
- lv is a plateau or the limiting value of a trajectory,
- statistical measures (e.g. mean value, standard deviation, correlation coefficients between segments of a trajectory).

All these temporal parameters may be calculated for the original trajectory as well as for any derived trajectory (e.g. derivatives, transformations, etc.). Furthermore, the definition of these parameters may be limited to parts of its time domain (e.g. centre of gravity of a trajectory on the domain $t \in [t_1, t_2]$). The set of temporal parameters describing a trajectory can be determined by an expert taking into consideration specific properties of the dynamic system under consideration. The choice of the set of relevant temporal parameters can be simplified if the shape of specific patterns in a trajectory is known. For instance, for the analysis of ECG waveforms for medical diagnosis, each pattern (ECG cycle) can be represented by five characteristic points including the starting and the termination points and three top points of the waves.
In order to describe the shape of each pattern, four parameters are proposed in [Nemirko et al., 1994]: the duration of the pattern, the amplitude of the pattern as a difference between extreme values, the area limited by the pattern (integral), and the mean of the amplitude range with respect to the baseline. These parameters are sufficient for the comparison of ECG patterns and thus can be used for the definition of similar trajectories.

Definition 7. Given a set of relevant temporal parameters, trajectories x(t) and y(t) can be considered similar if the values of temporal parameters describing elementary patterns in trajectories are similar.

In order to determine the similarity measure based on specific temporal parameters of trajectories, the parameters are summarised to a vector of relevant characteristics K and fuzzy

sets $A_i$ "admissible difference for a parameter $K_i$" are defined for each parameter $K_i$, $i = 1, \dots, L$. Similarity measure s(x, y) is calculated according to Algorithm 7 for one-dimensional trajectories or algorithms 8a or 8b for multidimensional ones. The parameters listed above allow a precise description of the shape of temporal patterns present in a trajectory. They take into account the number and size of hills, their slope and curvature, the moments of their appearance and their duration, where scaling and translation factors have an effect on the parameter values. This similarity measure is suitable for the recognition and comparison of specific patterns in trajectories.

Definition of structural similarity based on the peaks of trajectories.

In a number of applications, the behaviour of trajectories is characterised by significant increases in the values of a measured characteristic x during a short period of time. These abrupt increases are referred to as peaks. They contain information about the state and development of the underlying dynamic system and can be used as a criterion for the comparison of trajectories. In peak analysis the following parameters of peaks are usually used: height of a peak, position of a peak (time instant of peak appearance), peak limits, and the peak area [Eckardt et al., 1995]. If the number of peaks in a trajectory is too large, only those peaks that are higher than a predefined threshold are considered as being relevant.

Definition 8. Trajectories x(t) and y(t) are considered similar if they possess similar peaks in similar positions (Figure 5-15).

Figure 5-15: Similarity measure based on peaks (two trajectories plotted as x over t)

In order to calculate the similarity measure with respect to the peaks of trajectories, peaks are detected in each trajectory and are used as relevant characteristics $K_i$, $i = 1, \dots, L$, where L is the number of detected peaks. Parameters of each peak are summarised to a vector of

characteristic values $K_i = [K_{i,1}, \dots, K_{i,pn}]$, where pn is the number of peak parameters. In this context, peaks can be interpreted as pn-dimensional characteristics of trajectories. The resulting pn × L dimensional matrix $K = [K_{ij}]$, $i = 1, \dots, L$, $j = 1, \dots, pn$, contains all characteristic values of a trajectory. Fuzzy sets $A_j$, $j = 1, \dots, pn$, "the admissible difference of a peak parameter j" have to be defined for each peak parameter. Similarity measures are first determined for each peak with respect to peak parameters by applying algorithms 8 or 8a or 8b, and aggregated over peak parameters to partial similarities. The vector of partial similarities with respect to each peak is transformed into the overall degree of similarity s(x, y) between trajectories. The similarity measure with respect to peaks is suitable for the comparison of trajectories exhibiting a specific impulse behaviour. This similarity measure expresses a degree of matching of peaks. If the position of peaks is not considered as a parameter, then the obtained similarity measure is indifferent to temporal translations of peaks.

The above five definitions of structural similarity measures can be used in different combinations to obtain a more complete evaluation of the similarity of trajectories. In some cases structural similarity can be reduced to pointwise similarity. For instance, considering pointwise similarity for the first derivatives of the trajectories, the comparison of trajectories is performed with respect to their pointwise slope whereas scaling parameters are ignored. If pointwise similarity is determined for the second derivatives of trajectories, then the pointwise curvature of trajectories is considered as a relevant aspect for a comparison whereas scaling parameters and the slope of trajectories are irrelevant.

5.3 Extension of fuzzy pattern recognition methods by applying similarity measures for trajectories

As already stated, many pattern recognition methods (e.g.
fuzzy c-means [Bezdek, 1981, p. 65], possibilistic c-means [Krishnapuram, Keller, 1993], (fuzzy-) Kohonen networks [Rumelhart, McClelland, 1988]) use the distance between pairs of feature vectors describing objects as a measure of dissimilarity between these objects. Using one of the equations (5.1), (5.2) or (5.3), a distance d(x, y) between objects x and y can be transformed into a similarity measure s(x, y). Conversely, each strictly positive similarity measure defines a distance measure. All aforementioned pattern recognition methods use the distance between objects and cluster prototypes as a clustering criterion in order to calculate degrees of membership of objects to clusters, whereas the locations of cluster centres are obtained based on the locations of objects in the feature space weighted by their degrees of membership. Therefore, it is sufficient to provide a distance for pairs of objects and / or class representatives to be able to calculate

degrees of cluster membership. These considerations were used to develop a modified version of the fuzzy c-means algorithm, which is called the functional fuzzy c-means (FFCM) [Joentgen, Mikenina et al., 1999b, p. 88]. The main advantage of the FFCM algorithm is its ability to partition dynamic objects described by multidimensional trajectories. Instead of the Euclidean distance measure for real-valued feature vectors, the distance measure generated from pointwise similarity for trajectories was integrated into the calculation procedure of the FCM algorithm. The cluster centres obtained with the FFCM algorithm are multidimensional trajectories in the feature space, since the algorithm is applied to cluster objects whose features are represented by trajectories.

Pointwise and structural similarity measures for trajectories described in this chapter can generally be integrated into an arbitrary static or dynamic pattern recognition method using the distance or similarity measure as a clustering criterion. The main principle of this combination is illustrated in Figure 5-16. In the next chapter similarity measures for trajectories will be used in the algorithm of Gath and Geva applied for dynamic fuzzy classifier design and classification.

Figure 5-16: The structure of dynamic clustering algorithms based on similarity measures for trajectories (objects 1 to N with trajectories x(t) as features enter the dynamic clustering algorithm, which yields cluster centres 1 to c with trajectories v(t) as features)
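To make the principle concrete, the following Python sketch shows one iteration of a fuzzy c-means style update in which objects and prototypes are sampled trajectories. It is a sketch under stated assumptions rather than the FFCM implementation: the plain Euclidean distance between equally sampled trajectories stands in for the distance generated from pointwise similarity, and the function names and the fuzzifier default m = 2 are illustrative choices.

```python
import math


def pointwise_distance(x, v):
    """Assumed stand-in: Euclidean distance between equally sampled trajectories."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, v)))


def memberships(trajectories, prototypes, m=2.0):
    """Standard fuzzy c-means membership update from distances (requires m > 1)."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for x in trajectories:
        d = [pointwise_distance(x, v) for v in prototypes]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # The trajectory coincides with one or more prototypes.
            row = [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
        else:
            row = [1.0 / sum((di / dj) ** exponent for dj in d) for di in d]
        u.append(row)
    return u


def update_prototypes(trajectories, u, m=2.0):
    """Prototype trajectories as membership-weighted means of the objects."""
    n_points = len(trajectories[0])
    prototypes = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("cluster has no members")
        prototypes.append([
            sum(w * x[k] for w, x in zip(weights, trajectories)) / total
            for k in range(n_points)
        ])
    return prototypes
```

Only `pointwise_distance` depends on how trajectories are compared, so a different pointwise or structural distance could be substituted there without touching the membership update.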

6 Applications of Dynamic Pattern Recognition Methods

In order to demonstrate the practical relevance of the new method for dynamic fuzzy clustering developed in this thesis, two application examples are considered in this chapter. The first example, taken from the credit industry and presented in Section 6.1, is concerned with the problem of bank customer segmentation based on customers' behavioural data. After a description of the credit data of bank customers and the formulation of the goals of the analysis, two types of customer segmentation will be carried out, based in the first case on the whole temporal history covering two years and in the second case on a partial temporal history of half a year. The clustering results obtained in the first case represent the structure within the customer portfolio related to long-term payment behaviour, whereas the results generated in the second case provide customer segments based on short-term behaviour and the information about temporal changes in customer behaviour. This section contains a detailed description of the customer segments obtained during both types of analysis, an evaluation of the quality of the fuzzy partitions and a comparison of the different clustering results.

The second application example, presented in Section 6.2, is related to the analysis of data traffic in computer networks and allows the optimisation of the network load based on on-line monitoring and dynamic recognition of typical states of data traffic. Dynamic fuzzy classifier design and classification will be performed based on pointwise as well as structural similarity measures for trajectories, and the clustering results will be compared and evaluated. Both applications are carried out using the implementation of the method developed in this thesis with the software package MATLAB 5.1.
The list of implemented algorithms is given in the Appendix.

6.1 Bank Customer Segmentation based on Customer Behaviour

Customer segmentation is one of the most important applications of data mining methodologies used in marketing and customer relationship management. Clustering customers based on their behavioural data helps to recognise their buying behaviours and purchase patterns, to derive strategic marketing initiatives and to increase the response rates by addressing only the customers most likely to buy a specific product. Knowing the best customers is the key to developing targeted, personal and relevant marketing campaigns, allocating resources by segment profitability and retaining the most valuable customers. Creating customer segments enables companies to understand the customer portfolio and highlights obvious marketing opportunities.

In finance the success of business activities depends to a large degree on credit services. Banks are interested, on the one hand, in maximising the credit volumes and, on the other

hand, in minimising the insolvency rate, i.e. the number of customers unable to pay back their loan. By conducting a segmentation of bank customers based on their payment behaviour it is possible to distinguish between customers representing a good risk for the bank, to which special services can be offered, and customers representing a bad risk, which according to their bad payment behaviour have a high probability of becoming insolvent. Identifying typical segments within a customer portfolio allows a bank to recognise bad risk cases in time, to react by applying corresponding measures and to increase the profitability of a good risk segment.

Another goal of bank customer segmentation can be the recognition of typical groups of users of specific bank services. Usually there are at least two segments corresponding to active users, using the provided bank services to a full degree, and passive users or even non-users, using these services to a low degree. A knowledge of these segments can support a bank in developing a goal-oriented marketing strategy for its private and commercial customers.

It must be noted that the goal and results of the analysis depend on the provided data set. In this section the second goal of bank customer segmentation will be discussed, that is, to distinguish between different user groups among bank customers.

6.1.1 Description of the credit data of bank customers

In the application under consideration, 24,267 commercial bank customers ("objects") with revolving credit were observed over two years and the measurements of customer features were carried out monthly (i.e. the length of the temporal history is equal to 24 months). The bank customers are described by two static features, seven dynamic features and one categorical feature. The first two features are used as unique identification numbers of customers, such as the customer number and account number. The seven dynamic features characterise the state of an account each month and are represented by sequences of 24 measurements.
They are summarised in the following table:

Table 6-1: Dynamic features describing bank customers

Feature   Description
1         Overdraw limit on account
2         Current end-of-month balance
3         Maximum balance this month
4         Minimum balance this month
5         Average credit utilisation this month
6         Credit turnover this month
7         Number of bank-initiated payment reversals, i.e. returned cheques / cancelled direct debits in the current month

The first feature, "Overdraw limit on account", is defined for each customer individually by the bank, and its value usually remains constant for each customer over a long period of time or is changed on rare occasions. This feature can be used to evaluate a customer, but it does not describe his/her behaviour. Therefore, it will not be considered during customer segmentation, but only for the analysis of clustering results afterwards. The second feature, "Current month-end balance", is defined as a sum of all transactions in an account during the current month. The definition of the third feature, "Maximum balance this month", is formulated by the bank in the following way: if there is a negative balance in an account during a given month, then the maximum balance is the largest absolute value of all negative account balances during this month; otherwise the maximum balance is the lowest positive account balance during this month. Vice versa, the fourth feature, "Minimum balance this month", is defined by the bank as follows: if there is a positive balance in an account during a given month, then the minimum balance is the largest positive account balance during this month; otherwise the minimum balance is the lowest absolute value of all negative account balances during this month. The fifth feature, "Average credit utilisation this month", expresses how much of the credit was used on average by the customer during a given month. The sixth feature, "Credit turnover this month", corresponds to the sum of all positive entries in an account during a given month. The seventh feature, "Number of bank-initiated payment reversals", contains the number of failed debits since an account was overdrawn. As will be shown below, positive values of this feature are very rare among customers, thus it is better suited to the characterisation of customers than to their segmentation. Therefore, only features 2 to 6 will be used for clustering.
One of the categorical features provided for bank customers determines special account properties, which can be savings / time deposits / depots, and can take two values, "yes" or "no". According to bank experts, customers with or without these account properties must be treated separately since they may exhibit different payment behaviour. Therefore, the data set will be separated into two subsets according to this categorical feature. The first set of customers, characterised by feature value "yes" and possessing a savings account or depots, consists of 4,688 customers, while the other set includes 9,579 customers without the said properties of their accounts. These two sets of customers will be denoted hereinafter as groups Y and N, respectively, and the analysis of the customer structure will be performed separately for each group.

After a preliminary analysis of the data sets, including the calculation of the mean, minimum and maximum values of trajectories and their variances, it can be seen that the value ranges of the seven features are very large and different. The value ranges, some quantiles $s_\alpha$ and the main statistical characteristics of the data group Y are summarised in Table 6-2 and Table 6-3. The values of the chosen quantiles show that a large percentage of the data lies in a range of feature values considerably smaller than the entire value range. More information concerning

the value range is provided by values µ±3σ, where µ is the mean value and σ is the standard deviation. According to statistics, these values, given in Table 6-3, define the limits between which almost any observed value of a feature falls (with a probability of about 0.997 if the feature is normally distributed).

Table 6-2: The value range and main quantiles of each feature of Data Group Y
Columns: Features; Value range; $s_\alpha$ (α = 0.1); $s_\alpha$ (α = 0.2); $s_\alpha$ (α = 0.8); $s_\alpha$ (α = 0.9); $s_\alpha$ (α = 0.95)
Value ranges of features 1 to 7 as far as preserved: [0, ], [ , ], [ , ], [ , ], [0, ], [0, ], [0, 4]

Table 6-3: Main statistics of each feature of Data Group Y
Columns: Features; Mean value µ; Standard deviation σ; µ−3σ; µ+3σ

The value ranges, some quantiles and the principal statistical characteristics of data Group N are given in Table 6-4 and Table 6-5.

Table 6-4: The value range and main quantiles of each feature of Data Group N
Columns: Features; Value range; $s_\alpha$ (α = 0.1); $s_\alpha$ (α = 0.2); $s_\alpha$ (α = 0.8); $s_\alpha$ (α = 0.9); $s_\alpha$ (α = 0.95)
Value ranges of features 1 to 7 as far as preserved: [0, ], [ , ], [ , ], [ , ], [0, ], [0, ], [0, 6]

Table 6-5: Main statistics of each feature of Data Group N
Columns: Features; Mean value µ; Standard deviation σ; µ−3σ; µ+3σ

As can be seen in the four tables shown above, only 5% of the data takes values of Feature 7 larger than zero. A feature with such a skewed distribution is not very informative and, therefore, can be dropped. Due to the low number of data with extreme outlying values and a considerable amount of data within a similar value range, the features are characterised by a highly exponential distribution. Figure 6-1, a, shows the distribution of data with respect to Feature 6, where only mean values of each trajectory over the time interval [0, 24] are considered.

In order to reduce the effect of outliers and to improve the performance of clustering algorithms, it is necessary to pre-process the data using some standardisation and normalisation techniques. The most appropriate technique for an exponential distribution is a logarithmic transform. The data transformation is carried out in two steps. Firstly, the minimum value over all trajectories is determined for each feature, and if it is negative it is subtracted pointwise from all trajectories of this feature. Then the values of these trajectories are increased by one in order to achieve only positive values after the transformation. In the second step, all values of the trajectories are transformed logarithmically. As a result, the logarithmically transformed data lies in the value range [0, 20] and the distribution of data for all features is roughly normal. It should be noted that fuzzy clustering methods do not strictly require the data to be normally distributed, but the algorithms work better in many cases if this criterion is satisfied. Figure 6-1, b, illustrates the distribution of the mean values of trajectories of Feature 6 after logarithmic transformation.
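The two-step transformation described above can be sketched for a single feature as follows. This is a minimal illustration, not the original MATLAB implementation: the function name `log_transform`, the list-of-lists data layout and the use of the natural logarithm are assumptions (the base of the logarithm is not stated in the text).

```python
import math


def log_transform(trajectories):
    """Shift, offset by one and log-transform all trajectories of one feature.

    trajectories: list of equally long lists of measurements (hypothetical
    layout). The natural logarithm is an assumed choice of base.
    """
    minimum = min(min(tr) for tr in trajectories)
    # Step 1: subtract a negative minimum pointwise, then add one.
    shift = -minimum if minimum < 0 else 0.0
    # Step 2: logarithmic transformation of all values.
    return [[math.log(v + shift + 1.0) for v in tr] for tr in trajectories]
```

With this shift the smallest transformed value of a feature that contains negative measurements is exactly log(1) = 0, which is consistent with the lower bound of the value range mentioned above.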

Figure 6-1: Distribution of data with respect to Feature 6 (density over feature values): a) before logarithmic transformation, b) after logarithmic transformation

After pre-processing the data on bank customers using standardisation and logarithmic transformation, the analysis of the data can be started.

6.1.2 Goals of bank customer analysis

The goals of the dynamic analysis of bank customers can be formulated as follows:

1. To find segments of customers with similar payment behaviour based on the whole temporal history covering two years,
2. To find segments of customers with similar payment behaviour based on the temporal history of half a year, and to follow changes in the cluster structure and in the assignment of customers to the clusters over time.

The first goal can be achieved by clustering customers represented by trajectories of their features on the time interval of two years. The clustering results provide information about the structure within the customer portfolio appearing during this time interval until the current moment. These results are suitable for distinguishing between "good" and "bad" customers according to their long-term payment behaviour. The analysis of a long history is often carried out by banks to achieve reliable results, particularly for recognising bad credit customers. The drawback of this analysis is, however, that the classifier cannot be used to classify new observations of existing customers or observations of new customers for the next two years, since the cluster prototypes are described by trajectories with a length of 24 months and thus cannot be compared with shorter sequences of observations. Thus, the classification of new observations and updating the classifier (if necessary) can be repeated only every two years. In this case the design of the classifier is static, but the classifier is dynamic in nature since it is applied to dynamic objects.

A more applicable classifier can be designed by clustering sequences of observations over half a year, which is the second goal of the analysis conducted. This analysis allows one to recognise customer segments based on the short-term payment behaviour of customers and to detect temporal changes in the customer behaviour. The classification of new observations of existing or new customers can be repeated every six months, providing up-to-date information about the customers' states and their development. If changes in the customer structure are detected, the classifier is adapted according to the detected changes, which corresponds to an update of the customer segments and their descriptions. Therefore, this type of analysis is based on dynamic classifier design and classification applied to dynamic objects.

There are in general two viewpoints on the classifier design for customer segmentation. Some companies are interested in static classifier design if they are sure that they know their typical segments and do not want to change their description. In this case the classifier is designed just once (supervised or unsupervised), the cluster prototypes are frozen and the classification of new observations is repeated in the course of time in order to detect changes of customers with respect to the existing (static) clusters. Another viewpoint requires the adaptation of the classifier to temporal changes. Since conditions and properties of bank services can change in the course of time, a change of customer behaviour can also be expected. Thus, there is a need to update the prototypes of customer segments in order to represent a new environment appropriately.

According to the aforementioned goals of customer segmentation and the two types of customers, the analysis will include the following steps. Two types of customers will be clustered based on two different lengths of temporal history, leading to four clustering problems.
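The second setting clusters half-year sequences rather than the whole two-year history. As a minimal sketch, assuming each trajectory is stored as a list of 24 monthly observations, the division into four consecutive time windows could look as follows; the function name `split_into_windows` is hypothetical.

```python
def split_into_windows(trajectory, window_length=6):
    """Cut one trajectory into consecutive, non-overlapping time windows.

    `trajectory` is a list of monthly observations (assumed layout);
    with 24 months and window_length=6 this yields four windows.
    """
    if len(trajectory) % window_length != 0:
        raise ValueError("trajectory length must be a multiple of the window length")
    return [trajectory[start:start + window_length]
            for start in range(0, len(trajectory), window_length)]


if __name__ == "__main__":
    months = list(range(1, 25))  # months 1..24
    for number, window in enumerate(split_into_windows(months), start=1):
        print("Time Window", number, window)
```

Non-overlapping windows are assumed here because the text speaks of four time windows of half a year each over a history of 24 months.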
Regarding the choice of the appropriate type of similarity measure for trajectories, the following preliminary analysis has been carried out. Considering some of the trajectories of bank customers, it can be seen that their temporal behaviour has either a fluctuating or an almost constant character with occasional steps. Since the trajectories are relatively short (24 points of time if the whole temporal history is considered, or just 6 time points in a time window), it is difficult to recognise a specific behavioural pattern in the trajectories. The analysis of temporal characteristics of trajectories such as temporal trend, smoothness and pointwise derivatives has shown that the values of these characteristics are quite similar for most of the trajectories. For instance, the temporal trend of the trajectories of customers of group Y takes values in the interval [-0.1, 0.1] and more than 95% of the values are around zero. An analogous situation can be observed when calculating the average of pointwise derivatives over each time interval of six months. The smoothness of most of these trajectories takes values between 20 and 23, indicating a strongly fluctuating behaviour. Since the consideration of the fluctuating behaviour of the trajectories using the structural similarity measure is not very promising, it seems reasonable to analyse the structure of the customer portfolio based on the absolute values of trajectories instead of some temporal characteristics describing the behaviour of these trajectories. Fluctuating behaviour usually requires extensive smoothing, which would lead to a loss of relevant information, taking into account that the maximum length of a trajectory is only 24. Therefore, segmentation of bank customers will be carried out based on pointwise similarity between trajectories.

The tasks of the analysis of bank customers and the corresponding clustering problems which will be solved in the following sections are summarised in Table 6-6.

Table 6-6: Scope of the analysis of bank customers
I. Length of temporal history: the whole temporal history t=[1, 24]; time windows equal to half a year
II. Type of customers: customers of group Y; customers of group N
III. Type of similarity measure for trajectories: pointwise similarity

6.1.3 Parameter settings for dynamic classifier design and bank customer classification

Customer segmentation will be carried out using the algorithm for dynamic classifier design and classification developed in Chapter 4. For clustering customers in a certain time window the clustering algorithm of Gath and Geva is applied, which uses the fuzzy c-means (FCM) algorithm for initialisation. In order to be able to deal with dynamic objects represented by 5-dimensional trajectories, the distance measure generated from the pointwise similarity measure for trajectories is integrated into the FCM and the Gath-Geva algorithm. The pointwise similarity measure is obtained according to Algorithm 6a. For its definition the quadratic membership function given by equation (5.1) is chosen to represent the fuzzy set "approximately zero", which is defined with respect to each feature f_r, r=1,..., M=5. Parameter a(r) of each membership function is determined by equation (5.4), which requires the setting of a certain feature value β(r) and its membership degree α to the fuzzy set "approximately zero".
In order to take into account the value range of each feature and the range of maximal differences between trajectories of each feature, respectively, parameter β(r) can be evaluated as the average value on the domain where trajectories of feature r take their values (the values of pairwise differences between trajectories also lie in this domain). The value of parameter β(r) is calculated as the mean value of the maximal values of the trajectories of the corresponding feature:

$$\beta(r) = \frac{1}{N} \sum_{j=1}^{N} \max_{t} \left[ x_{jr}(t) \right], \quad r = 1, \ldots, M, \qquad (6.1)$$

where x_{jr}(t) is the j-th trajectory of feature r and N is the number of trajectories considered.
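Equation (6.1) translates directly into a few lines of code. The following is a minimal sketch under the assumption that the N trajectories of one feature r are held as a list of lists; the name `beta_for_feature` is hypothetical.

```python
def beta_for_feature(trajectories):
    """Equation (6.1): mean over all trajectories of each trajectory's maximum.

    `trajectories` holds the N trajectories x_jr(t) of ONE feature r,
    each as a list of values over time (assumed layout).
    """
    maxima = [max(traj) for traj in trajectories]  # max over t for each j
    return sum(maxima) / len(maxima)               # average over j = 1..N


if __name__ == "__main__":
    feature_r = [[1.0, 5.0, 3.0], [2.0, 2.0, 7.0]]
    print(beta_for_feature(feature_r))  # maxima are 5 and 7
```

The function would be called once per feature, r = 1,..., M, to obtain the full set of values β(r).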

The membership degree α of the feature value β(r) to the fuzzy set "approximately zero" can be chosen between 0 and 1, but it seems reasonable to choose this value between 0.1 and 0.6. The smaller the value of α, the larger the value of parameter a(r), the narrower the fuzzy set and the stronger the definition of similarity between trajectories. Empirical research has shown that the value α=0.5 is best suited for this application. Thus, parameter a(r) for the definition of the membership functions follows from equation (5.4) with α=0.5 and with β(r) in the denominator (equation (6.2)).

The second parameter needed for the definition of the pointwise similarity measure is the aggregation operator applied to transform the vector of pointwise similarities into the overall similarity measure on the time interval. The arithmetic mean is chosen, following the considerations given earlier.

The distance measure obtained from the pointwise similarity measure is used instead of the Euclidean distance measure in the calculation schemes of the FCM and the Gath-Geva algorithms. In both algorithms the distance between objects and cluster centres is involved in the calculation of the degrees of membership of objects to clusters (see Equation (4.5)). It is also used in the Gath-Geva algorithm to calculate the fuzzy covariance matrices of clusters given by (9.4) and an exponential distance function (9.2), which is finally applied for evaluating the degrees of membership. It is known that the use of an exponential function in the Gath-Geva algorithm can lead to numerical problems in many practical cases, since the distances take either extremely low or extremely high values. This problem was also observed during some test clustering runs applied to the data of bank customers, which are characterised by a considerable variance of values. In order to avoid these numerical problems, the exponential distance function is substituted by a quadratic function and modified in such a way that it provides only constant values above a certain high value.
The loss of precision due to such a modification is not significant because of the very low values of the resulting degrees of membership. For dynamic classifier design it is necessary to define a set of thresholds used within the monitoring procedure. For the algorithms for detecting new clusters or similar clusters to be merged, the following parameter settings are chosen:

Table 6-7: Parameter settings for the detection of new clusters during customer segmentation
Absorption threshold: u_o = 0.5
Share of the average cluster size: α_cs = 0.7
Share of the average cluster density: α_dens = 0.25
Threshold for the choice of good free objects: α_good = u_o = 0.5
Merging threshold: λ = 0.6

Clustering of bank customers in Group Y based on the whole temporal history of 24 months and using the pointwise similarity measure

Clustering of bank customers belonging to Group Y, based on their temporal behaviour over the time period of 24 months, is performed with the modified algorithm of Gath and Geva. The distance measure used in the algorithm is generated from the pointwise similarity measure for trajectories defined according to Algorithm 6a. Parameter a for the definition of the pointwise similarity measure is calculated according to equation (6.2) for each feature.

[Table: values of parameter a for Feature 1 to Feature 6]

Clustering starts with the number of clusters c=2, and the optimal number of clusters is determined by applying the algorithms of the monitoring procedure described in Sections 4.3, 4.4 and 4.5. After possibilistic classification and absorption of customers into the two detected clusters according to an absorption threshold of 0.5, the sizes of the clusters are equal to 3,081 and 1,367 and the number of free objects is equal to 240. For a better judgement of the cluster structure, the number of objects absorbed into the clusters with different absorption thresholds is summarised in Table 6-8. The densities of the clusters, the average partition density and the density of the group of free objects are given in Table 6-9, which also includes the values of validity measures such as fuzzy separation and compactness. According to the criterion of the minimal cluster size and the criterion of compactness of a group of free objects, no new cluster can be assumed and free objects are considered as stray data.
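The absorption step can be illustrated with a short sketch. It rests on assumptions not spelled out in this section: an object is taken to be absorbed into the cluster of its highest possibilistic membership degree if that degree reaches the absorption threshold u_o, and is treated as a free object otherwise. The function name `absorb` and the data layout are hypothetical.

```python
def absorb(memberships, threshold=0.5):
    """Split objects into absorbed objects per cluster and free objects.

    `memberships[k][i]` is the possibilistic membership degree of object k
    to cluster i (assumed layout). The rule "highest degree >= threshold"
    is an assumption made for this illustration.
    """
    n_clusters = len(memberships[0])
    clusters = [[] for _ in range(n_clusters)]
    free = []
    for index, degrees in enumerate(memberships):
        best = max(range(n_clusters), key=lambda i: degrees[i])
        if degrees[best] >= threshold:
            clusters[best].append(index)   # absorbed into its best cluster
        else:
            free.append(index)             # rejected for absorption
    return clusters, free


if __name__ == "__main__":
    u = [[0.9, 0.1], [0.2, 0.7], [0.3, 0.4]]
    absorbed, free_objects = absorb(u, threshold=0.5)
    print([len(c) for c in absorbed], len(free_objects))
```

Running the function for several thresholds gives the kind of overview of cluster sizes and free objects that the text summarises per value of u_o.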

Table 6-8: Number of absorbed and free objects of Data Group Y (columns: absorbed into C_1, absorbed into C_2, stray; one row per value of the absorption threshold u_o)

Validity measures for the generated fuzzy partition are given in the following table:

Table 6-9: Validity measures for the fuzzy partition with two clusters for Data Group Y (rows: partition density of Cluster 1, PD_1 = 84.3; partition density of Cluster 2, PD_2; average partition density, v_apd; fuzzy separation, FSA; fuzzy compactness, FC = 0.99; partition density of the group of free objects, PD_free)

In order to verify whether c=2 is really the correct number of clusters for the given group of customers, clustering was also performed with the number of clusters equal to 3 and 4. The validity measures for the three generated fuzzy partitions are given in Table 6-10. As can be seen, the number of stray objects and the average partition density for the partitions with 3 and 4 clusters exceed the corresponding values for the partition with 2 clusters. However, the values of fuzzy separation and compactness obtained with 3 and 4 clusters lie below the values obtained with 2 clusters.

Table 6-10: Number of stray objects and validity measures for different fuzzy partitions of Data Group Y (columns: c = 2, c = 3, c = 4; rows: N_free, v_apd, FSA, FC)

When the algorithm for the detection of similar clusters to be merged is applied to the partitions with 3 and 4 clusters, it can be stated that some pairs of clusters are highly overlapping. In the case of 3 clusters the similarity measure between Clusters 1 and 2 yields s(C_1, C_2)=0.77, which indicates that Clusters 1 and 2 can be merged if the merging threshold λ=0.6 is chosen. In the case of 4 clusters the following similarity measures between the pairs of Clusters 1 and 3 and Clusters 2 and 4 are obtained: s(C_1, C_3)=0.82 and s(C_2, C_4)=0.97. These pairs of clusters can be merged as well. Thus, it can be assumed that the optimal number of clusters for the segmentation of customers in Group Y is equal to 2.

The centres of the 2 customer segments obtained for this group of bank customers with respect to each feature using the modified Gath-Geva algorithm, based on the pointwise similarity measure for trajectories, are given below.

[Figure panels: Feature 1: Current end-of-month balance; Feature 2: Maximum balance this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]

[Figure panels: Feature 3: Minimum balance this month; Feature 4: Average credit utilisation this month; Feature 5: Credit turnover this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]
Figure 6-2: Cluster centres with respect to each feature obtained for customers of Group Y based on the whole temporal history and pointwise similarity between trajectories

The obtained customer segments can be interpreted as non-users and active users of credit. Customers of the first segment, non-users, are characterised by always positive end-of-month balances of between 20,000 DM and 30,000 DM. The account balance during a month varies between 10,000 DM and 40,000 DM and the credit turnover constitutes about 10,000 DM a month. This type of customer typically does not use credit, and the account seems to be used in a way rather similar to a regular checking account.

Customers of the second segment, active users, are characterised by a significant variation of their account balance between 20,000 DM and 10,000 DM and show a negative end-of-month balance between 7,000 DM and 2,000 DM. These customers have a relatively high credit turnover of about 50,000 DM and supposedly high expenses, since their average credit utilisation can reach 10,000 DM. This customer segment is of particular interest for the bank, since it represents profitable customers using their credit.

In order to evaluate the quality of the fuzzy partition obtained for customers in Group Y, the degrees of separation, defined as the differences between the highest and the second highest degrees of membership, and the degrees of compactness, defined as the highest degrees of membership of each customer to the clusters, are shown in Figure 6-3 and Figure 6-4, respectively. For the sake of a better visualisation, the 4,688 customers are sorted by their degrees of separation or compactness in ascending order.

[Figure: axes: Customers, Degree of separation]
Figure 6-3: Degrees of separation between clusters obtained for customers in Group Y based on the whole temporal history
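Both quality indicators follow directly from the membership degrees of each customer. A minimal sketch is given below, assuming at least two clusters and a list of membership degrees per customer; the function names and the idea of reporting the share of customers above given thresholds are illustrative choices, not taken from the thesis.

```python
def separation_and_compactness(degrees):
    """Return (degree of separation, degree of compactness) of one customer.

    Separation: highest minus second highest membership degree.
    Compactness: highest membership degree.
    """
    ordered = sorted(degrees, reverse=True)
    return ordered[0] - ordered[1], ordered[0]


def share_above(memberships, min_compactness, min_separation):
    """Share of customers exceeding BOTH thresholds (hypothetical helper)."""
    hits = 0
    for degrees in memberships:
        separation, compactness = separation_and_compactness(degrees)
        if compactness > min_compactness and separation > min_separation:
            hits += 1
    return hits / len(memberships)


if __name__ == "__main__":
    u = [[0.25, 0.75], [0.5, 0.5], [1.0, 0.0]]
    print(separation_and_compactness(u[0]))
    print(share_above(u, min_compactness=0.6, min_separation=0.4))
```

Sorting the per-customer values in ascending order, separately for each measure, reproduces the kind of curve plotted in the figures.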

[Figure: axes: Customers, Degree of compactness]
Figure 6-4: Degrees of compactness of clusters obtained for customers in Group Y based on the whole temporal history

The analysis of the figures shown above leads to the following considerations concerning the fuzzy partition obtained. Most of the customers are characterised by high values of the maximum degrees of membership, which allows a clear assignment of customers to one of the clusters. In particular, 4,370 customers (93%) have maximum degrees of membership larger than 0.6 and 3,992 customers (85%) are described by maximum degrees of membership larger than 0.9. The cluster assignment is unambiguous for 4,028 customers (86%), since their degrees of separation with respect to ambiguity, which are the differences between the highest and the second highest degree of membership, are larger than 0.5. For 3,788 customers (81%) the degrees of separation exceed 0.8, which corresponds to an almost hard assignment. Thus, it can be stated that the fuzzy partition of customers into two clusters is rather clear and unambiguous.

Clustering of bank customers in Group N based on the whole temporal history of 24 months and using the pointwise similarity measure

The analysis of bank customers in Group N is carried out analogously to the scheme applied for Data Group Y. Clustering is performed with the modified version of the Gath-Geva algorithm based on the pointwise similarity measure for trajectories, starting with c=2. For the definition of the membership function "approximately zero" with respect to each feature, the values of parameter a are calculated.

[Table: values of parameter a for Feature 1 to Feature 6]

After clustering and possibilistic classification of objects, the following cluster sizes, given by the number of objects absorbed into the clusters, and the following numbers of stray objects are obtained using different values of the absorption threshold:

Table 6-11: Number of absorbed and free objects of Data Group N (columns: absorbed into C_1, absorbed into C_2, stray; one row per value of the absorption threshold u_o)

The values of different validity measures for the generated fuzzy partition and the density of the group of free objects are presented in the following table.

Table 6-12: Validity measures for the fuzzy partition with two clusters for Data Group N (rows: partition density of Cluster 1, PD_1 = 3493; partition density of Cluster 2, PD_2 = 22720; average partition density, v_apd = 13107; fuzzy separation, FSA; fuzzy compactness, FC; partition density of the group of free objects, PD_free)

Since the number of free objects is not sufficient to declare a new cluster (443 is below the size threshold of 480) and the density of the group of free objects does not exceed the density threshold (0.003 < α_dens · v_apd = 0.25 · 13107 = 3276), no new cluster can be assumed and free data are considered as stray.

In order to verify the result of the monitoring procedure providing the partition with 2 clusters, clustering is also performed with the number of clusters equal to 3 and 4. The quality of these fuzzy partitions is evaluated using three validity measures whose values are given in Table 6-13. It can be seen that the number of stray objects grows as the number of clusters increases, which is a sign of a deterioration of the fuzzy partition. The average partition density for the partition with 4 clusters considerably exceeds the corresponding values for the partitions with 2 and 3 clusters. The increase of density can be explained by the fact that the number of objects absorbed into the four clusters is much smaller than the corresponding numbers for the two other partitions, which results in extremely small cluster volumes.
However, a high value of density cannot be considered as an improvement of the fuzzy partition, since the majority of objects cannot be absorbed into clusters. Considering the values of fuzzy compactness, it can be seen that the partition with 3 clusters has the best value, but its value of fuzzy separation is much lower than the one for 2 clusters. This means that although objects have high degrees of membership, some of the three clusters overlap considerably and cannot be clearly distinguished. The values of fuzzy separation and compactness obtained for the partition with 4 clusters lie below the corresponding values obtained for the two other partitions and also indicate the intersection of clusters.

Table 6-13: Number of stray objects and validity measures for different fuzzy partitions of Data Group N (columns: c = 2, c = 3, c = 4; rows: n_free, v_apd, FSA, FC)

When the algorithm for the detection of similar clusters is applied to the partitions with 3 and 4 clusters, the presence of highly overlapping clusters is confirmed. In the case of 3 clusters the similarity measure between Clusters 2 and 3 is equal to s(C_2, C_3)=0.699, which indicates that these clusters can be merged if the merging threshold λ=0.6 is chosen. In the case of 4 clusters the following similarity measures for some pairs of clusters are obtained: s(C_1, C_4)=0.67 and s(C_2, C_3)=0.62. According to the chosen merging threshold, these pairs of clusters can be merged. Thus, it can be assumed that the optimal number of clusters for fuzzy partitioning is equal to 2.

The centres of the 2 customer segments identified for bank customers of group N with respect to each feature using the modified Gath-Geva algorithm based on the pointwise similarity measure for trajectories are presented below.

[Figure panel: Feature 1: Current end-of-month balance; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]

[Figure panels: Feature 2: Maximum balance this month; Feature 3: Minimum balance this month; Feature 4: Average credit utilisation this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]

[Figure panel: Feature 5: Credit turnover this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]
Figure 6-5: Cluster centres with respect to each feature obtained for customers in Group N based on the whole temporal history and pointwise similarity between trajectories

The obtained customer segments can again be interpreted as non-users and active users of credit. Customers belonging to the first segment, non-users, are characterised by always positive end-of-month balances of between 10,000 DM and 15,000 DM. The account balance during a month varies between 10,000 DM and 20,000 DM and the credit turnover constitutes about 10,000 DM per month. As can be seen from the low values of the average credit utilisation, this type of customer does not use credit from the bank. These customers use their accounts in a way rather similar to a regular checking account.

Customers belonging to the second segment, active users, are described by a significant variation of their account balance between 40,000 DM and -10,000 DM during a given month and show a negative end-of-month balance of between 30,000 DM and 20,000 DM. These customers have a relatively high credit turnover of between 30,000 DM and 45,000 DM and supposedly high expenses, since their average credit utilisation reaches 40,000 DM. This customer segment represents profitable customers using credit and is similar to the corresponding segment for customers in Group Y, although the feature values vary in different ranges.

The quality of the fuzzy partition for customers in Group N is represented by the degrees of separation between clusters, corresponding to the degrees of ambiguity of object assignment, and by the degrees of compactness of clusters, expressing the maximum degrees of membership of objects to clusters, which are shown in Figure 6-6 and Figure 6-7, respectively. For the sake of a better visualisation, the 9,579 customers are sorted by their degrees of separation or compactness in ascending order.

[Figure: axes: Customers, Degrees of separation]
Figure 6-6: Degrees of separation between clusters obtained for each customer in Group N based on the whole temporal history

[Figure: axes: Customers, Degrees of compactness]
Figure 6-7: Degrees of compactness of clusters for each customer in Group N based on the whole temporal history

The above figures show that the assignment of customers to clusters is clear and unambiguous for about 65% of them, for which the maximum degrees of membership exceed 0.5 and the degree of separation exceeds 0.4. It should be noted that in Figure 6-6 and Figure 6-7 customers are sorted with respect to each single measure, independently of the other one, so that the order of customers on the x-axis does not correspond between the two figures. Therefore, the figures above express to what extent customers have high degrees of separation and high maximum degrees of membership. The number of customers whose degrees of separation and compactness exceed certain thresholds can be calculated from the original sequences sorted by the customer number, by choosing the customers that satisfy both conditions. For instance, the share of customers with a degree of compactness larger than 0.7 and a degree of separation larger than 0.5 amounts to 44%. As can be seen, the fuzzy partition of customers in Group N into two clusters is a little less clear and more ambiguous than the partition of customers in Group Y, this being due to the fact that both the clusters and the number of clients are much larger.

Segmentation of bank customers in Group Y based on the partial temporal history and using the pointwise similarity measure

In the first time window with a length of 6 months, t=[1, 6], bank customers are clustered by applying the Gath-Geva algorithm based on pointwise similarity between trajectories and with the number of clusters c=2. The appropriate number of clusters is determined using the algorithms of the monitoring procedure. After possibilistic classification and assignment of customers to clusters according to the absorption threshold, the monitoring procedure verifies whether or not the current cluster structure represents the customer structure in the best way. Firstly, the algorithm for detecting new clusters is applied, leading to the following results. The numbers of absorbed objects (customers) and, correspondingly, the sizes of the two clusters are equal to N_1^{0.5} = 1,237 and N_2^{0.5} = 3,052, whereas the number of free objects is equal to 399. Since the cluster sizes are very different, the size threshold is defined based on the minimal cluster size as α_cs · n_min. According to the second criterion, the minimal number of free objects, the number of free objects is not large enough to declare a new cluster. In order to verify this result, the third criterion, compactness, is evaluated. Free objects are clustered with the fuzzy c-means algorithm combined with the localisation procedure for detecting dense groups within free objects. The partition density of the group of free objects is equal to 1.66*10^-5, whereas the density threshold α_dens · pd_av is larger by several orders of magnitude.
Thus, the criterion of density of the group of free objects is not satisfied either, and free objects are considered as stray data. It can be assumed that the optimal number of clusters in the first time window is equal to two.

The classifier designed in the first time window is applied to classify new observations of customers in Time Windows 2 to 4. The monitoring procedure cannot detect any abrupt changes in the cluster structure over these three periods of time. The classifier fits the data structure well, which seems to be similar for all time windows. The results of clustering and classification for all four time windows are summarised in Table 6-14, which provides the number of customers absorbed into each cluster and the number of free objects rejected for absorption, obtained with different values of the absorption threshold. These results provide information about the size of the customer segments detected.

Table 6-14: Number of Customers Y assigned to two clusters in four time windows (columns for each of Time Windows 1 to 4: absorbed into C_1, absorbed into C_2, stray; one row per value of the absorption threshold u_o)

Table 6-15: Partition densities of clusters, fuzzy separation and compactness indexes obtained for Customers Y in four time windows (columns: Time Windows 1 to 4; rows: PD_1, PD_2, v_apd, FSA, FC, PD_free; PD_free = 1.66*10^-5 in Time Window 1)

The centres of the customer segments obtained for customers in Group Y in the first time window (the first half a year) are shown in Figure 6-8.

[Figure panel: Feature 1: Current end-of-month balance; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]

[Figure panels: Feature 2: Maximum balance this month; Feature 3: Minimum balance this month; Feature 4: Average utilisation rate this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]

[Figure panel: Feature 5: Credit turnover this month; axes: Time (months), Feature values; curves: Cluster 1, Cluster 2]
Figure 6-8: Cluster centres with respect to each feature obtained for customers in Group Y in the first time window and based on pointwise similarity between trajectories

The customer segments obtained in the first time window can be interpreted as non-users and active users of credit, similar to the customer segments recognised for customers in Group Y based on the whole temporal history of 24 months. Customers of the first segment, non-users, are described by always positive end-of-month balances of between 20,000 DM and 25,000 DM and, correspondingly, a zero value of the average credit utilisation. The account balance during a given month varies between 10,000 DM and 40,000 DM and the credit turnover constitutes about 10,000 DM per month. It is obvious that this type of customer does not use credit from the bank, but uses his/her account rather like a regular checking account.

The account balances of customers in the second segment, active users, exhibit a significant variation of between 5,000 DM and 10,000 DM during a given month and possess a negative end-of-month balance of about 5,000 DM. The credit turnover of these customers is relatively high, varying between 50,000 DM and 60,000 DM, while the average monthly credit utilisation by these customers reaches 8,000 DM. Thus, the accounts of these customers are characterised by a large number of transactions. This customer segment makes use of the available credit, providing the bank with a desirable profit. As can be seen, the description and the properties of these two segments are very similar to those of the segments obtained for customers in Group Y based on the whole temporal history.
This result shows that the two customer segments detected are rather stable over time and their properties change only insignificantly, which is also confirmed by the number of customers absorbed into the two clusters in the four time windows.

In order to illustrate the quality of the fuzzy partition obtained, the degree of separation, defined as the difference between the highest and the second highest degree of membership, and the degree of compactness, defined as the highest degree of membership of each customer to the clusters, are shown in Figure 6-9 and Figure 6-10, respectively. For the sake of a better visualisation, the 4,688 customers are sorted by their degrees of separation or compactness in ascending order.

[Figure: axes: Customers, Degree of separation]
Figure 6-9: Degrees of separation between clusters obtained for customers in Group Y in the first time window

[Figure: axes: Customers, Degree of compactness]
Figure 6-10: Degrees of compactness of clusters calculated for customers in Group Y in the first time window

As can be seen in Figure 6-9 and Figure 6-10, the fuzzy partition of customers in Group Y into two clusters in the first time window is clear and unambiguous for about 83% of customers, whose maximum degrees of membership exceed 0.6 and whose degrees of separation are larger than 0.5. Moreover, 3,769 customers (80%) are characterised by maximum degrees of membership larger than 0.8, whereas 2,369 of these customers possess a degree of separation above 0.6, which corresponds to a rather sharp assignment.

During the next step of the analysis, the temporal change of the customer assignment to clusters achieved using the absorption threshold u_o=0.5 (the number of customers assigned to each cluster is given in Table 6-14) is considered. The task is to determine for each pair of subsequent time windows whether customers have remained in the same cluster to which they were assigned in the previous time window or whether they have moved from one cluster into another due to changing properties of their new observations. The results of this analysis are summarised in Table 6-16, where the i-th time window is denoted as tw_i, i=1,..., 4. As can be seen, most of the customers assigned to Cluster (segment) C_1 or C_2 in the first time window remain in these clusters in the following time windows, and just a small percentage of customers has moved from one cluster to another.

Table 6-16: Temporal change of assignment of customers in Group Y to clusters (columns for each transition, from tw_1 to tw_2, from tw_2 to tw_3 and from tw_3 to tw_4: number of customers for C_1 and C_2; rows: remained in C_i, moved from C_1 into C_i, moved from C_2 into C_i, dropped out of C_i, appeared in C_i)

Comparing the results of Table 6-16 with those in Table 6-14, it can be noted that there is a small number of customers who were assigned to a certain cluster in one of the time windows but do not appear in either of the two clusters in the next time window. This number can be calculated as the difference between the total number of customers assigned to Cluster C_i, i=1, 2, and the sum of the customers that remained in Cluster C_i and those that moved to the other cluster. For instance, for the first pair of time windows the number of customers who dropped out of Cluster C_1 constitutes 1,237-(1,118+64)=55.
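The bookkeeping behind such transition counts can be sketched as follows. This is an illustration only, assuming that the assignment of every customer in each time window is stored in a dictionary mapping a customer identifier to a cluster label, with `None` for customers not absorbed into any cluster; the function name `transition_counts` and the labels are hypothetical.

```python
def transition_counts(previous, current, cluster):
    """Count assignment changes for one cluster between two time windows.

    `previous` and `current` map customer id -> cluster label, or None if
    the customer was not absorbed into any cluster (assumed layout).
    """
    counts = {"remained": 0, "moved_out": 0, "dropped_out": 0,
              "moved_in": 0, "appeared": 0}
    for customer, before in previous.items():
        after = current.get(customer)
        if before == cluster:
            if after == cluster:
                counts["remained"] += 1
            elif after is None:
                counts["dropped_out"] += 1   # no longer absorbed anywhere
            else:
                counts["moved_out"] += 1     # now in another cluster
        elif after == cluster:
            if before is None:
                counts["appeared"] += 1      # previously a free object
            else:
                counts["moved_in"] += 1      # came from another cluster
    return counts


if __name__ == "__main__":
    tw1 = {1: "C1", 2: "C1", 3: "C1", 4: "C2", 5: None}
    tw2 = {1: "C1", 2: "C2", 3: None, 4: "C1", 5: "C1"}
    print(transition_counts(tw1, tw2, "C1"))
```

By construction, the size of the cluster in the earlier window equals remained + moved_out + dropped_out, which is the identity used in the calculation 1,237-(1,118+64)=55 above.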
This fact can be explained by decreased degrees of membership of these customers to both clusters in time window 2, so that a cluster assignment according to the considered absorption threshold (here u_o = 0.5) is not possible. On the other hand, there are customers who were not assigned to either of the two clusters in one of the time windows but who appear in one of the two clusters in the next time window. This number can be obtained as the difference between the total number of customers assigned to Cluster C_i, i = 1, 2, in the second of the two time windows and the sum of the customers that remained in Cluster C_i and those that moved to this cluster from the other one. For instance, for the first pair of time windows the number of customers assigned to Cluster C_1 in the second time window but not present in either cluster in the first time window is equal

to 1,214 - (1,118 + 31) = 65. This change in cluster assignment is due to an increase in the degrees of membership of some customers in the second time window compared to those in the first time window. In short, it can be stated that the fuzzy partition with two clusters represents a good segmentation of customers in Group Y and fits the natural data structure well.

Clustering of bank customers in Group N based on partial temporal history and using the pointwise similarity measure

After clustering with a modified version of the Gath-Geva algorithm based on the pointwise similarity for trajectories and possibilistic classification of objects, the following cluster sizes, given by the number of objects absorbed into clusters, and the following numbers of free (stray) objects are obtained for each time window using different values of the absorption threshold:

Table 6-17: Number of customers in Group N assigned to two clusters in four time windows
Columns: for each of Time Window 1 to Time Window 4: Absorbed (C_1, C_2), Stray
Rows: one row per value of the absorption threshold u_o

Table 6-18: Partition densities of clusters, fuzzy separation and compactness indexes obtained for customers in Group N in four time windows
Columns: Time Window 1; Time Window 2; Time Window 3; Time Window 4
Rows: PD (one row per cluster); v_aPD; FSA; FC; PD_free

The centres of the customer segments obtained for customers in Group N in the first time window (the first half year) are shown in Figure 6-11.

[Figure 6-11, panels 1 to 3: Feature 1: Current end-of-month balance; Feature 2: Maximum balance this month; Feature 3: Minimum balance this month. Each panel plots the feature values of Cluster 1 and Cluster 2 against time (months).]

[Figure 6-11, panels 4 and 5: Feature 4: Average utilisation rate this month; Feature 5: Credit turnover this month. Each panel plots the feature values of Cluster 1 and Cluster 2 against time (months).]

Figure 6-11: Cluster centres with respect to each feature obtained for customers in Group N in the first time window and based on pointwise similarity between trajectories

The customer segments obtained in the first time window can be interpreted as non-users and active users of credit, similar to the customer segments recognised for Customer Group Y based on the partial temporal history.

Customers in the first segment, non-users, are characterised by always positive end-of-month balances of about 7,000 DM. The account balance of these customers varies between 5,000 DM and 10,000 DM during a given month and the credit turnover amounts to about [...],000 DM per month. Since the average credit utilisation by these customers is very low, it can be stated that this type of customer does not use credit from the bank. The account seems to be used in a way rather similar to a regular checking account.

Customers in the second segment, active users, are described by a significant variation of their account balance between 40,000 DM and -10,000 DM, and show a negative end-of-month balance of about 25,000 DM. These customers have a relatively high credit turnover

varying between 25,000 DM and 35,000 DM and a high average credit utilisation, which reaches 35,000 DM per month. Obviously, this customer segment uses the credit available from the bank and represents a profitable group of customers.

As can be seen, the description and properties of these two segments are very similar to those of the segments obtained for customers in Group N based on the whole temporal history, with the exception that the value ranges of the features in the first time window are slightly different, i.e. the account balance is somewhat smaller in absolute value, as are the credit turnover and the average credit utilisation. This result confirms that the two customer segments are rather stable over time and that their properties change only insignificantly. This can also be seen in Table 6-17, which presents the number of customers absorbed into the two segments in the four time windows.

The quality of the fuzzy partition for customers in Group N can be evaluated by considering the degrees of separation between the clusters, corresponding to the degrees of ambiguity of object assignment, and the degrees of compactness of the clusters, expressing the maximum degrees of membership of objects to the clusters. Both validity measures calculated for each customer are shown in Figure 6-12 and Figure 6-13, respectively. For better visualisation, the 9,579 customers are sorted by their degrees of separation or compactness in ascending order.

Figure 6-12: Degrees of separation between clusters obtained for customers in Group N in the first time window (vertical axis: degree of separation; horizontal axis: customers)

Figure 6-13: Degree of compactness of clusters calculated for customers in Group N in the first time window (vertical axis: degree of compactness; horizontal axis: customers)

The analysis of both sequences characterising the validity of the fuzzy partition shows that 64.4% of customers in Group N possess a maximum degree of membership larger than 0.5 and a degree of separation larger than 0.4. Considering stricter thresholds, it can be stated that 54.5% of customers have degrees of membership exceeding 0.7, but only 42.3% of them also have degrees of separation above 0.5. Thus, the fuzzy partition into two clusters in the first time window is similar in its characteristics to the one obtained for objects in Group N based on the whole time history, and allows a clear and unambiguous assignment of about 50% of customers.

In the next step of the analysis, temporal changes of customer assignment to the clusters achieved using the absorption threshold u_o = 0.5 are considered (the number of customers assigned to each cluster is given in Table 6-17). The analysis is carried out analogously to the case of customers in Group Y, with the purpose of detecting, for each pair of subsequent time windows, whether customers have remained in the cluster to which they were assigned in the previous time window or whether they have moved from one cluster to the other. The temporal changes of customer assignment are summarised in Table 6-19, where the i-th time window is denoted by tw_i, i = 1, ..., 4. These results show that most of the customers assigned to Segment C_1 or C_2 in the first time window remain in these segments in the following time windows, and only a small percentage of customers move from one cluster to the other between two time windows.
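The two per-customer validity measures and the shares quoted above can be illustrated with a small sketch. The degree of compactness follows the text (the maximum degree of membership). The degree of separation is taken here as the difference between the two highest membership degrees, which is an assumption made for this illustration, as are the function names and the toy partition.

```python
def validity(memberships):
    """Return (separation, compactness) for one object.
    compactness: highest degree of membership (as described in the text).
    separation: gap between the two highest memberships (assumed definition)."""
    ranked = sorted(memberships, reverse=True)
    return ranked[0] - ranked[1], ranked[0]

def share_of_clear_assignments(partition, min_membership, min_separation):
    """Fraction of objects whose compactness and separation both exceed
    the given thresholds."""
    clear = 0
    for memberships in partition:
        separation, compactness = validity(memberships)
        if compactness > min_membership and separation > min_separation:
            clear += 1
    return clear / len(partition)

# Toy fuzzy partition of four customers into two clusters.
partition = [(0.9, 0.1), (0.7, 0.4), (0.55, 0.5), (0.2, 0.85)]
share = share_of_clear_assignments(partition, min_membership=0.6, min_separation=0.5)
```

Sorting the per-customer values in ascending order, as done for the figures, then only requires `sorted(...)` on the list of separations or compactness values.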

Table 6-19: Temporal changes of assignment of customers in Group N to clusters
Columns: From tw_1 to tw_2 (C_1, C_2); From tw_2 to tw_3 (C_1, C_2); From tw_3 to tw_4 (C_1, C_2)
Rows (number of customers): Remained in C_i; Moved from C_1 into C_i; Moved from C_2 into C_i; Dropped out of C_i; Appeared in C_i

Analogously to the case of customers in Group Y, temporal changes of assignment of customers in Group N between two time windows are also characterised by a small number of customers who were assigned to a certain cluster in one of the time windows but who do not appear in either of the two clusters in the next time window, i.e. these customers are dropped from a cluster due to their decreased degrees of membership in the next time window. In addition, there are customers who appear in a certain cluster in the subsequent time window due to their increased degrees of membership compared to the previous time window. The number of these customers is given in Table 6-19 as well. In brief, it can be stated that the generated fuzzy partition with two clusters provides a good segmentation of customers in Group N and fits the natural data structure well.

Comparison of clustering results for customers in Groups Y and N

After conducting four types of analysis for different customer groups and for different lengths of the temporal history, it is necessary to compare the results obtained. It has already been stated that the customer segments recognised based on the whole temporal history and in the first time window are very similar; however, the feature values characterising the cluster centres in the first case are somewhat larger in absolute value than those in the second case. Comparing the results for customers in Groups Y and N, it can be seen that the values of the end-of-month balance of customers in Group Y exceed the corresponding values of customers in Group N and vary over a larger value range.
The credit turnover of the first customer group is approximately 10,000-20,000 DM larger than that of the other customer group, whereas the credit utilisation of the active users is 20,000-30,000 DM lower. Therefore, customers in Group Y belonging to the segment of active users have more entries in their accounts and higher monthly account statements, and use bank credit less actively than customers in Group N. Customers in the second segment, non-users, are similar in their behaviour in both groups of customers.

The results of the analysis conducted in this section can help a bank to better understand its customer portfolio and to distinguish between different groups of active users and non-users, in order to develop a particular marketing strategy, for instance offering especially favourable services to the group of the most active users. Bank customer segmentation was carried out in this thesis based on dynamic data representing the customers' temporal behaviour and by applying the dynamic fuzzy clustering algorithm. The dynamic analysis makes it possible to take into consideration the payment behaviour of customers over a period of time, which characterises customers much better than a single observation. Until now, in most applications related to customer segmentation and described in the literature, a static analysis of customers was performed based on measurements at a certain moment in time. The results of such an analysis are obviously not very reliable, since the clusters, or customer segments, obtained from it can often change due to significant fluctuations of account feature values, which requires periodic re-clustering. In contrast, dynamic fuzzy clustering helps to save time and effort and can provide more reliable and complete results.

6.2 Computer Network Optimisation based on Dynamic Network Load Classification

Computer communication is one of the most rapidly developing technologies. Its purpose is to make it possible to transport data between users' computers, terminals, and application programs and in this way to support the exchange of information. The introduction of the techniques and equipment necessary for high-speed data transmission in the late 1960s led to the development of computer networks. In the 1970s government and corporate computer networks became popular as organisations realised the advantages offered in efficiency and profitability. In the mid 1970s extensive computer networks were adopted by financial institutions, such as banks, since they recognised the opportunity to increase their profitability and the need to remain efficient and competitive. The use of local area networks to interconnect computers and terminals within a building or a group of buildings has become popular since 1980 and has made distributed computing a reality due to high-capacity low-cost communication. In 1983 the notion of the Internet appeared, denoting the general data communication network, and it soon became the world's largest computer network. The Internet has grown from 6,000 users in 1986 to more than 600,000 users in 1991 [Hunt, 1995, p. x]. At present the number of Internet users worldwide is estimated at almost 200 million people. According to the International Data Corporation (IDC) this number is expected to rise to 500 million people in [...]. This explosive growth in the number of Internet users shows the enormous need for network services.

The availability of information supplied by computer networks is now crucial in many organisations, particularly information from many locations which must be co-ordinated both accurately and quickly. Obviously the high demand for computer communication leads to a high utilisation of computer networks, which varies over time. All computer networks have certain limits on the speed of data transmission.
If the volume of data traffic grows, the network utilisation increases rapidly, and above a certain critical threshold the data throughput rate decreases and time delays in data transmission appear. In order to avoid this problem, it can be useful to analyse the network load over time, with the aim of identifying specific patterns, or typical states, of network load characteristic of certain months, week days or time intervals during a day. Knowledge of the network load distribution over time can be used to optimise the network resources, so that additional resources are allocated at peak times of network utilisation and the regular resources are reduced in periods of low utilisation. This kind of network optimisation is technically possible due to modern techniques and equipment which can provide a dynamically changing bandwidth for data transmission according to current demand. The optimisation of network resources leads to considerable savings for companies and organisations using network services, since the bandwidth for data

transmission is usually rented from a network provider. On the other hand, the information about the distribution of network load over time can be used by network providers to define prices for network use. It must be noted that the analysis of network load is relevant not only for computer networks but also for telecommunication networks.

In this section it will be shown how the dynamic fuzzy clustering algorithm can be applied to the on-line recognition of different states of network load and to following state changes. This problem requires the design of a dynamic classifier with adaptive capabilities, which is applied to the clustering and classification of trajectories representing the development of network load. The problem of identifying typical patterns of network load depending on week days and/or seasons of the year is not considered in this application, since it is comparable to the problem of customer segmentation presented in sections 6.1.4 and 6.1.5 and is concerned with static classifier design.

Data transmission in computer networks

A communication network is defined as a shared resource used to exchange information between users [Freer, 1988, p. 0]. A computer network is a distributed collection of computers which is viewed by the user as one large computer system allocating jobs without user intervention. Hence a computer communication network is viewed as a collection of several computer systems from which the user can select the service required and communicate with any computer as a local user. According to the definition of the International Standards Organisation (ISO), a network is an interconnected group of nodes or stations. Nodes are defined as the intersection of two or more transmission paths and perform traffic switching, whereas stations may be attached to a single transmission path in a network. The logical manner in which nodes are linked together by channels to form a network is denoted as the network topology (for example, star networks, ring networks, bus networks).
The way in which the cables providing channels between stations are positioned is called the topography. Most topologies generally broadcast all signals to all connected stations. A receiving station or its communication equipment must select only the signals addressed to itself from all the transmissions on the network.

In order to allow the communication of computers from different manufacturers and the coexistence of different application programs on the network, a set of rules is needed which governs the connection and interaction of the network components. This set of rules is called the network architecture, which includes the data formats, protocols and logical structures for the functions providing effective communication between data processing systems connected to the network [Freer, 1988, p. 33]. In the late 1970s a reference model for Open System

Interconnection (OSI) was developed by ISO to describe the structure of communication protocols. This model forms a basis for the co-ordination of developments in layered network communication standards. The OSI reference model consists of 7 layers, as shown in Figure 6-14. Each of these layers defines certain functions required for data transfer between computers, so that the communication process is represented by a hierarchy of function layers which are stacked on each other ([Freer, 1988], [Hunt, 1995, p. 5-7]). Each layer is based on the services of the lower layer and offers services to the higher layer. When data is transmitted between application programs on two different computers, it passes from layer 7 to layer 1 at the sending system and is transmitted over the interconnection media to the receiving system, where it travels through layers 1 to 7. The data path through the layers is illustrated by continuous arrows in Figure 6-14. Data does not pass directly between corresponding layers of the two systems except at the physical layer; instead, layer interfaces are provided to define the services of corresponding layers.

[Figure 6-14 shows two identical seven-layer stacks (7. Application Layer, 6. Presentation Layer, 5. Session Layer, 4. Transport Layer, 3. Network Layer, 2. Data Link Layer, 1. Physical Layer), linked layer by layer through protocols and connected at the bottom by the physical interconnection media.]

Figure 6-14: The OSI basic reference model [adapted from Black, 1989, p. 285]

The communication of each layer with the same layer of another system requires the definition of communication rules and data formats. Such rules, which ensure the proper exchange of information between two or more parties, are called protocols. The implementation of the protocols for each layer is independent of the other layers. This minimises the influence of single technical changes in one layer on the whole protocol family. During data transmission each layer adds its own protocol control information to the data passed down from the higher layer.
This protocol information is ignored by the lower layers and is interpreted only by the corresponding layer of the receiving system.
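The layered exchange described above can be made concrete with a toy sketch. It is not an implementation of any real protocol stack: headers are plain text tags, the function names are hypothetical, and only layers 7 to 2 add a header here, while the physical layer is assumed simply to carry the resulting frame.

```python
# Layers that add protocol control information, from layer 7 down to layer 2.
HEADER_LAYERS = ["application", "presentation", "session",
                 "transport", "network", "data link"]

def send(payload: str) -> str:
    """Pass data down the stack: each layer prepends its own header and
    treats everything it received from the layer above as opaque data."""
    frame = payload
    for layer in HEADER_LAYERS:
        frame = f"[{layer}]" + frame
    return frame

def receive(frame: str) -> str:
    """Pass data up the stack: each layer strips and interprets only the
    header written by its peer layer on the sending system."""
    for layer in reversed(HEADER_LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame

frame = send("hello")
data = receive(frame)
```

The outermost header of `frame` belongs to the data link layer, which is the first layer to inspect the frame on the receiving system; the application header is innermost and is read last.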

Layers 1 to 3 are network related and provide low-level protocols for physical data transfer, which are mostly implemented in hardware or by dedicated controllers. Layers 4 to 7 provide application protocols, which are responsible for adjusting data to the corresponding end systems and are implemented in software on the host computer. The meaning of the layers is briefly summarised in the following [Black, 1989].

The functions within the physical layer are responsible for activating, maintaining, and deactivating a physical circuit between communication devices and for the specification of physical signals and cabling/wiring.

The data link layer is responsible for the reliable transfer of data over the physical media. The most popular protocols used on this layer are Ethernet, Token Ring, and FDDI.

The network layer provides the interface of the user into the network as well as the interface of two computers with each other through a network. This layer defines network switching, routing and the communication between networks (internetworking). Examples of communication protocols used in this layer are the Internet Protocol (IP) and IPX.

The transport layer provides the transparent interface between the data communication network and the upper three layers and offers these layers several options of data transfer with predefined quality features. The best-known communication protocols used in this layer are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).

The session layer manages the interactions (sessions) between application programs (users), providing different services to co-ordinate the exchange of data between applications.

The presentation layer is responsible for the description of data structure and representation and for negotiation with the corresponding layer of receiving applications concerning the data format.

The application layer is concerned with the support of the end-user applications.
For a better understanding of the data used in the application example in the following sections, consider the functions of the network layer in more detail. This layer allows two network users to exchange data with each other through one network connection or multiple network connections. Switching techniques define the method of data transmission between nodes. The task of the routing function is to find the optimal path (route) for data blocks through the network from the sending to the receiving station. For this purpose the addresses of the network layer contain information concerning the location of stations in the network. The route is chosen based on a variety of criteria such as bandwidth, utilisation rate or least-cost criteria.

There are primarily three methods of switching data from one node to another over a network [Freer, 1988, p. 0]. Circuit switching provides a direct connection between two nodes and permits an end-to-end transmission between the two end users by utilising the facility within bandwidth and tariff limitations. Many telephone networks are circuit switching systems. Message switching is a store-and-forward technology. The switch is used to receive and acknowledge the message from the sending station and to store it temporarily until appropriate

circuits are available to forward the message to the receiving station. Private and military teleprinter networks usually employ this technology. Packet switching is derived from message switching, but messages are broken down into smaller pieces known as packets, which are interleaved with packets from other virtual channels during transmission through the network. Packets are provided with protocol control information (headers), which enables them to be routed through the network as independent entities and the complete message to be reassembled by the receiving station. The default user data field length in a data packet is 128 bytes, but other packet lengths (sizes) are also available: 16, 32, 64, 256, 512, 1024, 2048, and 4096 bytes. Due to a number of advantages, such as fast response time for all users of the facility, high availability of the network to all users, distribution of the risk of failure over more than one switch, and sharing of resources, packet switching has become the prevalent technology for switched data networks.

Data transmission over the network requires a method of controlling the access of individual users to the transmission medium. The most popular local area network, Ethernet, represents the shared-medium concept, i.e. the transmission medium is shared by all stations. If all stations work simultaneously, then the task of the access control method is to decide which station may transmit data at what time. The most common method used by Ethernet is Carrier Sense Multiple Access with Collision Detection (CSMA/CD). All stations listen on the network and know whether the medium is free or data are being transmitted. If the medium is free, each station can transmit data to another station. If two stations attempt to transmit data simultaneously, a collision occurs, which is recognised by the other stations. After a short pause the network is free again for data transmission. In order to avoid a new collision, stations wait a pseudo-random time interval before attempting retransmission.
If a new collision nevertheless occurs, the waiting interval is doubled. After 16 mistrials an error is reported.

This method of access control and collision detection determines a certain relation between the number of collisions and the data traffic volume [Kel, 1996, p. 60]. As long as the network load is not too large, there are no problems with data transmission. If the data traffic (the number of transmitted packets) increases, the number of simultaneous accesses to the network, and correspondingly the number of collisions, grows. The situation becomes more and more critical due to the need for retransmission. Above a certain threshold of traffic volume the bandwidth provided for each station as well as the total data throughput rate decrease drastically, as illustrated in Figure 6-15. Therefore the theoretical Ethernet throughput rate of 10 Mbit/s is rarely reached; it is usually about one or two thirds of this rate if many stations work on the network.
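The retry rule described above (random wait, doubling of the waiting interval after each further collision, error after 16 failed attempts) can be sketched as follows. This is a simplified illustration, not a model of real Ethernet hardware: waiting times are counted in abstract slot times, the function names and the `collides` callback are hypothetical, and the cap of the doubling at 2^10 slots is taken from the usual description of truncated binary exponential backoff rather than from the text.

```python
import random

MAX_ATTEMPTS = 16   # after 16 unsuccessful attempts an error is reported
BACKOFF_CAP = 10    # assumption: the waiting range stops doubling at 2**10 slots

class TransmissionError(Exception):
    """Raised when every attempt ended in a collision."""

def backoff_slots(collisions, rng):
    """Pseudo-random waiting time in slot times after the given number of
    consecutive collisions; the upper bound doubles with every collision."""
    return rng.randrange(2 ** min(collisions, BACKOFF_CAP))

def transmit(collides, rng):
    """Try to send one frame. `collides` is a callable returning True when
    an attempt collides. Returns (attempts used, total slots waited)."""
    waited = 0
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if not collides():
            return attempt, waited
        if attempt < MAX_ATTEMPTS:
            waited += backoff_slots(attempt, rng)
    raise TransmissionError("too many collisions")

# Example: the first three attempts collide, the fourth succeeds.
rng = random.Random(0)
outcomes = iter([True, True, True, False])
attempts, waited = transmit(lambda: next(outcomes), rng)
```

Because the waiting range grows with every collision, a heavily loaded segment spends an increasing share of its time waiting, which is consistent with the drop in throughput described in the text.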

Figure 6-15: Dependence between the number of collisions and the network load [adapted from Kel, 1996, p. 60] (curves: data throughput rate and collisions; horizontal axis: data traffic volume)

If the collision rate on more than one computer in the network permanently exceeds 5%, it can be reasonable either to divide the network to reduce the load or to optimise the network resources dynamically by providing a larger bandwidth for some stations when the traffic volume is too high.

Data acquisition and pre-processing for network analysis

Monitoring is an ideal way to manage overall network performance as well as to analyse the influence of new applications and technologies on network services. There are a large number of software tools, called traffic or network analysers, which make it possible to capture and simultaneously view an instant snapshot of network activity for real-time assessment. These data snapshots can be displayed in graphic and numerical formats and stored for future impact analysis and historical reporting. Network analysers are commonly used by companies to increase management productivity by delivering real-time information that reduces the number of dedicated resources required to evaluate network performance.

In order to acquire the data needed for network load analysis, the software tool NetXRay was used in the application under consideration. The NetXRay Protocol Analyser and Network Monitor is a software-based fault and performance management tool that captures data, monitors network traffic and collects key network statistics. The NetXRay Analyser was developed specifically for Windows 95 or NT and provides a state-of-the-art graphical user interface. This 32-bit software-based tool takes full advantage of the Windows operating environment. NetXRay enables network managers to monitor every network segment and to extract and review the vital and detailed information needed to effectively troubleshoot, manage and migrate complex network environments.
NetXRay provides the following features for quick troubleshooting of network problems, real-time network traffic analysis, and planning and forecasting of network growth:
- An intuitive, consistent user interface allows for quick capture and display of data to monitor key network performance criteria;

- NetXRay supports all major LAN topologies and decodes over [...]0 protocols from major protocol suites;
- NetXRay's Traffic Map and traffic matrix show who is talking to whom on the network;
- Protocol distribution information shows which protocols are in use, which IP applications are running, and which IPX transport processes are being utilised;
- History reports enable understanding of network usage over long periods of time;
- After establishing the network baseline, day-to-day utilisation can be tracked to predict when changes will need to be made to accommodate increases in network traffic.

During its operation NetXRay can simultaneously capture data from one or more network interfaces, display previously captured data, generate traffic, generate alarms as traffic thresholds are exceeded, automatically discover network hosts, and monitor and store key network statistics. All of this can take place while other Windows applications are running on the same machine, i.e. dedicated hardware is not required. Unfortunately it is not possible to automatically store the history of the protocol distribution in the network. The current protocol distribution can only be monitored and some snapshots can be saved to disk.

The application under consideration takes advantage of the ability of the NetXRay tool to record network activities over a period of time. The program supports the monitoring of ten network activities concurrently, all of which characterise the data traffic volume, network utilisation and collisions. For the purpose of network load analysis, six history statistics concerning the data traffic were selected and collected over a period of 70 days (from the middle of April until the middle of June 1999) in the computer network of the dormitory of the Technical University of Aachen (Germany). The standard LAN protocol used in the dormitory is Ethernet with a bandwidth of 10 Mbit/s and the standard communication protocol is TCP/IP.
The six network statistics correspond to six packet sizes used by TCP/IP for data transmission through the network. The number of packets of each category transmitted at the current moment, multiplied by the corresponding packet size, constitutes the current volume of the data traffic and represents a certain degree of network utilisation. The larger the packet size, the larger the influence of the number of these packets on the network utilisation. These six packet sizes characterising the data transmission on the network are considered as dynamic features for the pattern recognition process and are given in Table 6-20.
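The relation between packet counts and traffic volume stated above can be written out as a short sketch. The text only says that each count is multiplied by "the corresponding packet size"; since the six features are size ranges, the sketch assumes one nominal size per range, taken here as the upper bound of each range in Table 6-20 (64 for the first range). The names `NOMINAL_SIZES` and `traffic_volume` are hypothetical.

```python
# Assumed nominal packet size (bytes) for each of the six size categories,
# taken as the upper bound of the corresponding range in Table 6-20.
NOMINAL_SIZES = [64, 127, 255, 511, 1023, 1518]

def traffic_volume(packet_counts, sizes=NOMINAL_SIZES):
    """Approximate data traffic volume at one time instant: the number of
    packets in each category multiplied by the category's nominal size."""
    if len(packet_counts) != len(sizes):
        raise ValueError("one count per packet size category is required")
    return sum(count * size for count, size in zip(packet_counts, sizes))

# Two measurements with the same number of packets but different sizes.
many_small = traffic_volume([100, 0, 0, 0, 0, 0])
many_large = traffic_volume([0, 0, 0, 0, 0, 100])
```

With these assumed sizes, 100 packets of the largest category contribute more than twenty times the volume of 100 packets of the smallest category, which illustrates why large packets have a stronger influence on network utilisation.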

Table 6-20: Dynamic features describing data transmission in computer network

Feature   Description
1         Packet size under 64 B/s
2         Packet size between 65 and 127 B/s
3         Packet size between 128 and 255 B/s
4         Packet size between 256 and 511 B/s
5         Packet size between 512 and 1023 B/s
6         Packet size between 1024 and 1518 B/s

Network statistics were monitored each day during the 15 hours of the most active network utilisation: from 9 a.m. until midnight. The measurements were recorded with a sampling interval of 30 s. The data traffic and network utilisation during the night are always very low, so that these data are not of particular interest. For the analysis of network activities only the data of 4 days in May were used to demonstrate the abilities of the dynamic fuzzy clustering algorithm applied to network load analysis. This results in a temporal sequence of measurements for each feature, or a 6-dimensional trajectory of a length of [...] time instants.

The main statistical characteristics of these data with respect to each feature are summarised in Table 6-21 and Table 6-22. Comparing the value ranges and the main quantiles with the limit µ+3σ, where µ is the mean value and σ is the standard deviation, it can be seen that the feature values contain a low percentage of outliers. Due to this fact the features are characterised by rather skewed distributions. As an example, two density distributions of the mean values of trajectories of features 1 and 6 are illustrated in Figure 6-16.

Table 6-21: The value ranges and main quantiles of each feature characterising network data

Feature   Value range   s_α (α=0.25)   s_α (α=0.5)   s_α (α=0.75)
1         [0, 374]
2         [0, 390]
3         [0, 85]
4         [0, 53]
5         [0, 87]
6         [0, 65]
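The comparison with the limit µ+3σ mentioned above can be illustrated with a short sketch. It assumes that an outlier is simply a value above µ+3σ of the same feature and uses the population standard deviation; the function name `outlier_share` and the sample values are made up for the illustration and are not data from the thesis.

```python
from statistics import mean, pstdev

def outlier_share(values, k=3.0):
    """Fraction of values lying above mean + k * standard deviation.
    Assumption: population standard deviation, upper limit only."""
    limit = mean(values) + k * pstdev(values)
    return sum(v > limit for v in values) / len(values)

# A strongly skewed toy feature: mostly small values and one large peak.
skewed = [1] * 99 + [100]
share = outlier_share(skewed)
```

A small share of very large values, as in this toy feature, is what stretches the value range far beyond the main quantiles and produces the skewed distributions mentioned in the text.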

Table 6-22: Main statistics of each feature of the network data
Columns: Feature; Mean value µ; Standard deviation σ; µ-3σ; µ+3σ

Figure 6-16: Density distributions of features 1 (left) and 6 (right) (vertical axis: density; horizontal axis: feature values)

In order to be able to use the recorded temporal data for the analysis, the data have to be pre-processed to smooth the strongly fluctuating behaviour of the trajectories and to normalise all features to the same interval [0, 1]. Trajectories of features are smoothed over 200 values, leading to trajectories with a clear temporal behaviour from which random fluctuations have been filtered out.

Goals of the analysis of load in a computer network

The primary goal of the computer network analysis conducted in this application is to recognise typical load states in a computer network based on the data traffic volume, in order to be able to optimise network resources and to avoid collisions in the network at peak times. The load states, or data traffic volumes, are assumed to be time-dependent: a low or high data traffic volume is related to certain time intervals during a day. It is possible that the time intervals of typical network states change over time, therefore there is a need to follow temporal changes of load states. Considering different packet sizes as features makes it possible to recognise the distribution of packets, which gives additional information about the number of users on the network. If a lot of large packets are transmitted through the network, then it can be

243 Applcatons of Dynamc Pattern Recognton Methods 225 assumed that the number of users workng smultaneously s not large so that they can use the bandwdth to a full degree. If the number of small packets s very hgh, t can be assumed that many users are workng on the network smultaneously and n order to mnmse the tme for transmsson and to avod collsons the data s send n small packets. In ths case the bandwdth provded to each user s lmted. Although n most networks the data s broken down nto packets of certan szes automatcally, there s also equpment whch allows an operator to manage ths process and to adapt t to the current stuaton. From the vewpont of techncal analyss the goal of network analyss conssts n the on-lne detecton of typcal states n a 6-dmensonal trajectory representng the temporal development of data traffc n the network. Ths goal can be acheved by applyng the algorthm for dynamc fuzzy classfer desgn and classfcaton whch allows to adapt the classfer to changng network states. The results of such a network analyss can be used to adjust the network resources provded to the user. As soon as typcal patterns of network actvtes characterstc for specfc days are learned, they can be used to derve the basc optmsaton strategy of network resources. It should be noted that t s possble to obtan equvalent results by clusterng the whole trajectores representng data traffc durng one day whch are collected over a long perod of tme (for nstance, half a year). Ths knd of analyss requres, however, that the correspondng data base would be avalable n advance. 
The application under consideration shows that, using the dynamic fuzzy clustering algorithm, it is possible to start with just a few data records and to learn typical patterns of data traffic during the adaptation process.

Parameter settings for dynamic classifier design and classification of data traffic

Clustering of multidimensional trajectories describing the data traffic in a computer network will be performed based on pointwise as well as structural similarity. For the definition of pointwise similarity the quadratic membership function introduced in Chapter 5 is chosen to model the fuzzy set "approximately zero", which is defined with respect to each feature f_r, r=1,..., M=6. The parameter a(r) of each membership function is determined by equation (5.4) and based on the considerations presented earlier. The value of parameter β(r) is evaluated as the mean value of the maximal values of the trajectories of the corresponding feature according to equation (6.1), and parameter a(r) is then calculated according to equation (6.2).

For the definition of structural similarity using Algorithm 7a, parameter a must be defined with respect to each temporal characteristic as well as each feature. Suppose that the set of characteristics chosen to describe the temporal behaviour of trajectories is given by {K_1,..., K_L}, and denote the value of characteristic K_i obtained from the j-th trajectory x_jr(t) of feature r by K_i(x_jr), i=1,..., L, j=1,..., N. The value of parameter β(i, r) for characteristic K_i with respect to feature r and the corresponding value of parameter a(i, r) can be calculated in the following way:

\[ \beta(i, r) = \frac{1}{N} \sum_{j=1}^{N} K_i(x_{jr}), \quad i = 1, \dots, L, \; r = 1, \dots, M \tag{6.3} \]

\[ a(i, r) = \frac{\beta(i, r)}{2} \tag{6.4} \]

The aggregation of partial similarity measures over all features and over all characteristics is carried out according to Step 5.2 of Algorithm 7a: partial similarities are transformed to distance measures, which are then aggregated to an overall distance in two steps using the Euclidean norm.

The distance measure obtained from pointwise or structural similarity is used instead of the Euclidean distance measure in the calculation schemes of the functional fuzzy c-means and possibilistic c-means algorithms, the latter of which is used only for possibilistic classification of dynamic objects. In both algorithms the distance between objects and cluster centres is involved in the calculation of the degrees of membership of objects to clusters (see equation (4.5)). In this application the FFCM algorithm is used instead of the Gath-Geva algorithm: if the classifier is designed based on a small number of objects (for example 30 objects), some statistical characteristics, such as the fuzzy covariance matrices of clusters used in the Gath-Geva algorithm, cannot be considered significant. A greater number of objects is required for their evaluation.

For dynamic classifier design it is necessary to define a set of thresholds which are used within the monitoring procedure. The following parameter settings are chosen within the algorithms for detecting new clusters, similar clusters to be merged and heterogeneous clusters to be split:

Table 6-23: Parameter settings for three algorithms of the monitoring procedure used during the network analysis

Absorption threshold                              u_o = 0.5
Share of the average cluster size                 α_cs = 0.6
Share of the average cluster density              α_dens = 0.
Threshold for the choice of good free objects     α_good = u_o = 0.5
Merging threshold                                 λ = 0.6
Number of bars in the density histogram           N_bars = 0
Density threshold for splitting                   r_dens = 0
Size threshold for dense groups of objects        r_diam = 0
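The parameter calculation and the two-step aggregation described above can be illustrated with a minimal sketch. Several points are assumptions rather than the thesis's method: the function names are hypothetical, equation (6.4) is read as a = β/2, and the transformation from partial similarities to partial distances is not reproduced here, so the sketch starts from partial distances that are already given.

```python
import math


def beta_and_a(characteristic_values):
    """characteristic_values[j] holds K_i(x_jr) for trajectory j.
    Returns (beta, a) for one characteristic i and one feature r."""
    beta = sum(characteristic_values) / len(characteristic_values)  # mean over the N trajectories
    a = beta / 2  # assumption: equation (6.4) read as a = beta / 2
    return beta, a


def overall_distance(partial):
    """partial[r][i] is the partial distance for feature r and characteristic i.
    Step 1: Euclidean norm over the characteristics of each feature.
    Step 2: Euclidean norm over the resulting per-feature distances."""
    per_feature = [math.sqrt(sum(d * d for d in row)) for row in partial]
    return math.sqrt(sum(d * d for d in per_feature))


# Hypothetical values of one characteristic for three trajectories.
print(beta_and_a([10, 20, 30]))
# Hypothetical partial distances for two features and two characteristics.
print(overall_distance([[3, 4], [0, 0]]))
```

With the Euclidean norm in both steps, aggregating first over characteristics and then over features gives the same overall distance as the reverse order, so the sketch does not depend on which order the algorithm actually uses.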

Recognition of typical load states in a computer network using the pointwise similarity measure

For the recognition of typical states in a 6-dimensional trajectory representing the data traffic, suppose that new dynamic objects arrive every 200 time instants, which corresponds to 100 minutes at a sampling interval of 30 s. Each dynamic object is observed during a time window of length 200 and represented by a 6-dimensional trajectory (temporal history) of length 200. As stated above, the data traffic was monitored during 14 days, that is 14*15=210 hours, during which the measurements were made. Using the chosen length of the time window, the total time interval of observation [1, 25000] is broken down into 125 time windows, which are denoted as tw. For the definition of the membership function "approximately zero" used to calculate the pointwise similarity between trajectories, the values of parameter a are obtained for each feature:

     Feature 1   Feature 2   Feature 3   Feature 4   Feature 5   Feature 6
a

A dynamic fuzzy classifier with the number of clusters c=2 is designed after 31 time windows, i.e. based on 31 dynamic objects obtained during 3 days 6 hours and 40 minutes of observation, by applying the functional fuzzy c-means algorithm based on the pointwise similarity measure for trajectories. After possibilistic classification and assignment of dynamic objects to clusters according to the absorption threshold, the sizes of the clusters are equal to N_1^{0.5} = 2 and N_2^{0.5} = 8, and a number n_free of objects remain free. The algorithm for detecting new clusters provides the result that a new cluster can be declared (both criteria, the size and the compactness of the group of free objects, are satisfied). Therefore, the objects are re-clustered with a new number of clusters c=3. The absorption procedure results in cluster sizes N_1^{0.5} = 4, N_2^{0.5} = 4 and N_3^{0.5}, and the number of free objects is equal to n_free = 3. The quality of the new partition is evaluated during the monitoring procedure. According to the algorithm for detecting heterogeneous clusters, the second cluster can be split.
Hence, the objects are re-clustered with a new number of clusters c=4. The new classifier is characterised by clusters of sizes N_1^{0.5} = 8, N_2^{0.5}, N_3^{0.5} = 3 and N_4^{0.5} = 3, and n_free = 6 objects are left free. Since this fuzzy partition is better than the previous one with 3 clusters according to the validity measures and no more changes can be detected by the monitoring procedure, the classifier is accepted. In the next time windows new objects are classified and some of them are assigned to the existing clusters.

In time window tw=34 (after 3 days 11 hours and 40 minutes of observation), when the number of free objects has increased to n_free = 8, two new clusters are detected by the monitoring procedure. The classifier is designed anew with the number of clusters c=6. After absorption of objects into the new clusters the following cluster sizes are obtained: N_1^{0.5} = 9, N_2^{0.5} = 5, N_3^{0.5}, N_4^{0.5}, N_5^{0.5} = 2 and N_6^{0.5} = 3, whereas only n_free = 1 object cannot be assigned because of its low degrees of membership to all clusters. This classifier is accepted and applied for the classification of new objects in the subsequent time windows.

In time window tw=64, that is after 7 days and 100 minutes of observation, the number of free objects has increased to n_free = 3 and a new cluster is detected by the monitoring procedure. The three clusters C_4, C_5 and C_6 have remained unchanged in size and very small since the classifier was initially designed, while the three other clusters have grown to N_1^{0.5} = 8, N_2^{0.5} = 23 and N_3^{0.5} = 4. It can be assumed that the cluster structure learned after the first 3.5 days does not fit the data structure after 7 days very well, which can be explained by the fact that the data traffic during days 6 and 7, corresponding to the weekend, differs considerably from the traffic during the week. Thus the classifier is re-learned with a new number of clusters c=7, resulting in clusters of sizes N_1^{0.5} = 7, N_2^{0.5} = 9, N_3^{0.5} = 0, N_4^{0.5} = 2, N_5^{0.5} = 3, N_6^{0.5} = 4 and N_7^{0.5}, and no free objects. According to the validity measures the new partition represents an improvement compared to the previous one with 6 clusters; however, the monitoring procedure is able to detect two similar clusters C_1 and C_2. Since their similarity measure s(C_1, C_2)=0.756 exceeds the merging threshold, these clusters can be merged. The objects are re-clustered with the number of clusters c=6, resulting in the following cluster partition: N_1^{0.5} = 27, N_2^{0.5} = 4, N_3^{0.5} = 4, N_4^{0.5} = 3, N_5^{0.5} = 4 and N_6^{0.5} = 2, with n_free = 2. The new improved classifier is accepted and applied for the classification of new objects in the following time windows. Until the last time window tw=125 no further changes in the cluster structure have been detected, and the clusters have grown to N_1^{0.5} = 53, N_2^{0.5} = 27, N_3^{0.5} = 5, N_4^{0.5} = 2, N_5^{0.5} and N_6^{0.5} = 4, with n_free = 6.
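The bookkeeping behind the cluster sizes and the number of free objects quoted above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's absorption procedure: the function name `absorb` is hypothetical, each object is assumed to be absorbed only into the cluster with its highest possibilistic membership, and the comparison with the absorption threshold is assumed to be "greater than or equal".

```python
def absorb(memberships, u_o=0.5):
    """memberships[j][k] is the possibilistic membership of object j in cluster k.
    Returns (cluster sizes, number of free objects)."""
    c = len(memberships[0])
    sizes = [0] * c
    free = 0
    for row in memberships:
        best = max(range(c), key=lambda k: row[k])  # cluster with the highest membership
        if row[best] >= u_o:   # assumption: threshold test is inclusive
            sizes[best] += 1
        else:                  # low membership to all clusters: object stays free
            free += 1
    return sizes, free


# Hypothetical memberships of four objects in two clusters.
u = [[0.9, 0.1], [0.2, 0.7], [0.3, 0.4], [0.6, 0.55]]
sizes, free = absorb(u)
print(sizes, free)
```

In the monitoring procedure described in the text, the size and compactness of the group of free objects are then the basis for deciding whether a new cluster is declared.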
It can be assumed that the classifier has learned all typical patterns in the data traffic during the 14 days of observation. The results of dynamic clustering and classification over this time period of 125 time windows are summarised in Table 6-24, where v_apd stands for the average partition density, FSA denotes the fuzzy separation index with respect to ambiguity, and FC is the fuzzy compactness index. The last column contains the result of the monitoring procedure, corresponding to the changes detected in the cluster structure in the current time window. Time windows where objects were re-clustered due to detected temporal changes in the cluster structure, and the corresponding re-clustering results, are marked in the table by a light grey shade.
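The observation times quoted for the individual time windows follow from simple arithmetic on the window length, the sampling interval and the 15 observed hours per day. A minimal sketch of this conversion is given below; the function name `elapsed_observation` and its parameter names are hypothetical.

```python
def elapsed_observation(tw, window_len=200, sample_s=30, hours_per_day=15):
    """Convert a number of completed time windows into (days, hours, minutes)
    of observation, counting only the hours observed per day."""
    minutes = tw * window_len * sample_s // 60
    days, rest = divmod(minutes, hours_per_day * 60)
    hours, mins = divmod(rest, 60)
    return days, hours, mins


for tw in (31, 34, 64):
    days, hours, mins = elapsed_observation(tw)
    print(f"tw={tw}: {days} days {hours} hours {mins} minutes")
```

For window 64 the sketch gives 7 days, 1 hour and 40 minutes, that is 7 days and 100 minutes of observation, and for window 34 it gives 3 days, 11 hours and 40 minutes.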

Table 6-24: Results of dynamic clustering and classification of the network data traffic based on the pointwise similarity measure (columns: N(C_i), n_free, v_apd, FSA, FC, Changes)

Time window, c    N(C_i)                 Changes
TW=31, c=2        2,                     C_new = 1
TW=31, c=3        4, 4,                  Split C_2
TW=31, c=4        8, , 3,                No
TW=34, c=4        8, 2, 3,               C_new = 2
TW=34, c=6        9, 5, 3, , 2,          No
TW=64, c=6        8, 23, 4, , 2,         C_new = 1
TW=64, c=7        7, 9, 0, 2, 3, 4,      Merge C_1 and C_2
TW=64, c=6        27, 4, 3, 3, 4,        No

The resulting cluster centres representing typical states of data traffic are shown in Figure 6-17. The typical states recognised in the data traffic can be interpreted as follows.

Cluster 1 represents a relatively low data traffic with a decreasing tendency: the number of large packets (1024-1518 B/s) is decreasing from a constant level of about 7 to 0 while the number of small packets (under 64 B/s) is slowly increasing to 7. The number of other packet types is rather small, varying in the interval [6, 8] for packet sizes 65-127 B/s and below 2 for the three other packet sizes. This state characterises a rather small load of the network.

Cluster 2 describes an average data traffic with a clearly decreasing tendency. The number of small packets under 64 B/s and between 65 and 127 B/s is falling from about 60 to 5, whereas the number of large packets (1024-1518 B/s) stays at an average level of about 20. The three other packet types are also decreasing, but in a much smaller value range.

Cluster 3 corresponds to a rise from an average to a large data traffic. The number of large packets between 1024 and 1518 B/s is growing drastically from 30 to 100, along with the small packets under 64 B/s, whose number is increasing from 0 to 40. The number of medium-sized packets (including 65-127 B/s) varies in the range between 20 and 60.

Cluster 4 represents a state with a high data traffic through the network leading to a high network load. The number of large packets remains at a high level of about 100 and contains a short increase up to 130. At the end of the time interval the traffic starts to decrease. The same pattern characterises the behaviour of small packets under 64 B/s, but their number stays at a level of about 80. The next packet size (65-127 B/s) is also transmitted to a high degree and its number varies between 40 and 50. The number of other packet types is about 0. Thus, the total data volume represented by all packets describing this cluster significantly exceeds that of the other clusters.

Figure 6-17: Six typical states of the data traffic described by six packet sizes and obtained using the pointwise similarity measure (six panels; axes: time (s) and number of packets, one curve per packet size)

Cluster 5 characterises the average data traffic with a slightly increasing tendency. The number of small packets of two size classes (under 64 B/s and one further class) is about 20 and 30, respectively, while the number of packets between 65 and 127 B/s fluctuates around 80. The number of large packets is at an average level, slowly increasing from 30 to 60.

The last cluster, Cluster 6, corresponds to a stable average data traffic. The numbers of small (under 64 B/s) and large (1024-1518 B/s) packets are at a similar level, whereas the number of packets of one further size class varies about 5. The number of other packet sizes is rather small and insignificant for the total transmitted data volume.

It can be noticed that the number of packets with sizes 128-255, 256-511 and 512-1023 B/s remains rather small in all cases and therefore has a limited influence on the total data volume.

The number of objects assigned to the six detected clusters shows that Cluster 1, with a small data traffic volume, is the most frequent state in the network. Cluster 2, with an average level of data traffic and a tendency to decrease, is the second most frequent state. The third place is taken by Cluster 6 with the stable average behaviour of data traffic, which can be observed almost as frequently as Cluster 4 with large data traffic. Clusters 3 and 5, with a specific pattern characterising the increasing behaviour of data traffic, are obviously less frequent, since changes of the data traffic volume usually happen much faster.

In order to evaluate the quality of the fuzzy partition in each time window and to control the process of classifier adaptation, three validity measures are calculated: the fuzzy separation index based on the ambiguity of the partition, the fuzzy compactness and the average partition density. The first two indexes are shown in Figure 6-18, whereas the third measure is presented in Figure 6-19.

Figure 6-18: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the pointwise similarity measure (axes: time windows; degree of separation, degree of compactness)

Figure 6-19: Temporal development of average partition density obtained using the pointwise similarity measure (axes: time windows; average partition density)

As can be seen, after the adaptation of the classifier in time window 31 (the classifier with 2 clusters was re-learned with 3 and then with 4 clusters) the values of the separation and compactness indexes have decreased while the average partition density has considerably increased. After the next adaptation of the classifier in time window 34 (the classifier containing 4 clusters was re-learned with 6 clusters) all validity measures have increased slightly. In time window 64 the values of the separation and compactness indexes have increased due to the adaptation (the classifier was re-learned twice, with 7 and then with 6 clusters), whereas the average partition density has decreased. However, in the subsequent time windows the density has grown significantly, demonstrating the improvement of the fuzzy partition over time (compared to the interval between time windows 34 and 64, where the density decreased constantly due to the absorption of new objects into clusters). The two other validity measures, which characterise the unambiguity of the objects' assignment, have decreased insignificantly after time window 64. This leads to the conclusion that the generated fuzzy partition with six clusters can represent the data structure well.

The assignment of parts of trajectories observed in time windows of length 200 to the six detected clusters is illustrated in Figure 6-20. This figure shows as an example only the three trajectories of features 1, 2 and 6, since these packet sizes are the most frequently transmitted and their number has the highest influence on the total data traffic, as stated in the description of the clusters. Based on this figure it is possible to determine at what points of time different states of data traffic (clusters) usually appear. Consider some time periods characterised by low data traffic (Cluster 1), an average data traffic with a decreasing tendency (Cluster 2), and a high data traffic (Cluster 4). Choose, for instance, the intervals [8800, 10600], [16200, 18000] and [21600, 23200], where the trajectory is assigned to Cluster 1. After the transformation into time of observation, it can be concluded that the data traffic was low in the first week from Friday at 22:20 until Saturday at 22:20, in the second week on Wednesday the whole day, and on Saturday from 9:00 until 22:20. Notice that the data traffic at night was not observed. Analogously one can determine that the data traffic was at the average level, for example, in the first week on Sunday from 9:00 until 12:20, in the second week from Thursday at 20:40 until Friday at 10:40, and from Saturday at 22:20 until Sunday at 10:40. The data traffic was large in the first week and in the second week on Thursday from 17:20 until 20:40, as well as on Tuesday in the second week from 9:00 until 22:20. Thus it can be assumed that during the week the data traffic is often large in the evening.

The clustering results presented in this section show that the dynamic fuzzy clustering algorithm based on the pointwise similarity measure between trajectories is able to recognise typical states in the behaviour of the data traffic in a computer network on-line. Applying this algorithm to the trajectory of the data traffic observed over a much longer time period could yield more reliable statements concerning the typical states and patterns of the data traffic on specific week days.
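The transformation of time instants into time of observation used in the discussion above can be sketched as follows. The sketch assumes 15 observed hours per day starting at 9:00 and a sampling interval of 30 s, as stated in the text; that the first observed day is a Monday is an assumption inferred from the quoted example in which instant 8800 falls on Friday at 22:20. The function name `instant_to_clock` is hypothetical.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def instant_to_clock(t, sample_s=30, hours_per_day=15, start_hour=9):
    """Map a time instant to (week number, weekday, clock time HH:MM).
    Nights are not observed, so each day covers hours_per_day hours only."""
    per_day = hours_per_day * 3600 // sample_s       # instants per observed day
    day_index, offset = divmod(t, per_day)
    minutes = start_hour * 60 + offset * sample_s // 60
    week = day_index // 7 + 1
    weekday = WEEKDAYS[day_index % 7]                # assumption: day 1 is a Monday
    return week, weekday, f"{minutes // 60:02d}:{minutes % 60:02d}"


for t in (8800, 21600, 23200):
    print(t, instant_to_clock(t))
```

An instant that is an exact multiple of the daily count (for example 18000) is mapped to 9:00 of the following day, which coincides with the end of the preceding observed day because the night is skipped.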

Figure 6-20: Assignment of parts of trajectories to six clusters representing data traffic states and obtained based on the pointwise similarity measure (three panels for features 1, 2 and 6; axes: time (s), number of packets and cluster assignment)

Recognition of typical load states in a computer network using the structural similarity measure

In this section dynamic fuzzy clustering of the data traffic of a computer network will be carried out based on the structural similarity measure for trajectories. The comparison of the clustering results with those obtained based on the pointwise similarity measure will show, firstly, what influence the type of similarity measure can have on the clustering results and, secondly, what type of similarity measure is best suited for network load analysis.

For the definition of the structural similarity measure, the set of temporal characteristics describing the behaviour of the trajectories of features must first be defined. A preliminary analysis of the trajectories under consideration shows that they are generally characterised by a fluctuating behaviour with hills of different amplitudes and durations and a base line at different levels. In order to describe such a behaviour of parts of trajectories in time windows of length 200, the following characteristics were implemented: the mean value, the standard deviation, the range of values of a trajectory in the current time window, the maximum value of a trajectory and its position (time instant of appearance), the minimum value and its position, the temporal trend, the degree of smoothness, the maximum length of the interval with a positive slope (i.e. increasing behaviour) and the start time of this interval, the maximum length of the interval with a negative slope (i.e. decreasing behaviour) and the start time of this interval, the maximum length of the interval with a zero derivative (i.e. constant behaviour) and the start time of this interval, and the two largest and the two smallest local extreme values of a trajectory together with the time instants of their appearance.
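A few of the characteristics listed above can be computed for one time window as in the following sketch. It covers only a subset (mean, standard deviation, range, extreme values with their positions, and trend); the function name is hypothetical, the trend is taken here as the least-squares slope over the window, and the population standard deviation is used, both of which are assumptions since the text does not give the exact definitions. The degree of smoothness is omitted for the same reason.

```python
from statistics import mean, pstdev


def temporal_characteristics(window):
    """Compute simple temporal characteristics of one trajectory window
    (a list of at least two feature values ordered in time)."""
    n = len(window)
    x_max, x_min = max(window), min(window)
    t_mean = (n - 1) / 2
    x_mean = mean(window)
    # Least-squares slope of the values against the time index (assumed trend definition).
    num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return {
        "mean": x_mean,
        "std": pstdev(window),
        "range": x_max - x_min,
        "max": x_max, "t_max": window.index(x_max),  # first occurrence of the maximum
        "min": x_min, "t_min": window.index(x_min),  # first occurrence of the minimum
        "trend": num / den,
    }


# Hypothetical window of packet counts.
print(temporal_characteristics([3, 5, 9, 7, 4, 2]))
```

Each window is thereby reduced to a short vector of characteristics, which is what the structural similarity measure compares instead of the raw values.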
When the structural similarity measure between trajectories was calculated based on different subsets of these characteristics and the trajectories were clustered based on the resulting similarity measure, it was observed that some subsets of these characteristics possess a very low discriminating ability, leading to similar cluster centres. After investigation of different subsets and their contribution to the recognition process, the following temporal characteristics have been chosen to describe the temporal behaviour of the network load trajectories: the mean value, the range of values, the degree of smoothness of the trajectories, and four local extreme values together with the time instants of their occurrence. In order to define the structural similarity measure, parameter a for the fuzzy set "admissible difference for characteristic K_i", which determines the admissible deviation of the characteristic's values from the desired value, is calculated according to (6.4) with respect to each temporal characteristic and each feature, and yields the following values:

Characteristics   Feature 1   Feature 2   Feature 3   Feature 4   Feature 5   Feature 6

The procedure of classifier design and classification over 125 time windows is carried out in the same manner as in the case of the pointwise similarity measure. A dynamic fuzzy classifier with the number of clusters c=2 is designed after 31 time windows by applying the functional fuzzy c-means algorithm based on the structural similarity measure for trajectories. After possibilistic classification and absorption of dynamic objects into the clusters, the sizes of the clusters are equal to N_1^{0.5} = 6 and N_2^{0.5} = 0 and the number of free objects is n_free = 5. According to the monitoring procedure two new clusters can be formed. Therefore, the objects are re-clustered with a new number of clusters c=4. The absorption procedure provides clusters of sizes N_1^{0.5} = 7, N_2^{0.5} = 2, N_3^{0.5} = 3 and N_4^{0.5} = 3, whereas the number of free objects is equal to n_free = 6. Since this fuzzy partition is better than the previous one with 2 clusters according to the validity measures (the average partition density and the fuzzy compactness index have increased) and no more changes can be detected by the monitoring procedure, this classifier is accepted. In the next time windows new objects are classified by this classifier and some of them are absorbed into the existing clusters.

In time window tw=34 (after 3 days 11 hours and 40 minutes of observation) the number of free objects has increased to n_free = 8, and the monitoring procedure declares a new cluster. The classifier is designed anew with a number of clusters c=5, leading to clusters of sizes N_1^{0.5} = 0, N_2^{0.5}, N_3^{0.5} = 4, N_4^{0.5} = 5 and N_5^{0.5} = 2, and n_free = 1 free object which cannot be assigned to clusters because of its low degree of membership to all clusters. This classifier is accepted, as the fuzzy separation and fuzzy compactness indexes have increased, and is applied for the classification of new objects in the subsequent time windows.

Temporal changes in the cluster structure are then detected in time window tw=62, that is after 7 days and 100 minutes of observation. Since the number of free objects n_free = 8 and the density of this group of objects are sufficiently high compared to the existing clusters, a new cluster is declared by the monitoring procedure. As a result the classifier has to be re-learned, using the existing cluster centres and an estimated new centre for initialisation, in order to fit the new cluster structure representing the data traffic after 7 days. The new classifier designed with c=6 clusters is characterised by clusters of sizes N_1^{0.5} = 2, N_2^{0.5} = 6, N_3^{0.5} = 4, N_4^{0.5} = 6, N_5^{0.5} and N_6^{0.5} = 5, and no free objects. According to the validity measures the new partition represents an improvement compared to the previous one with 5 clusters (the average partition density and the fuzzy separation and fuzzy compactness indexes have increased). Therefore, the new improved classifier is accepted and applied for the classification of new objects in the following time windows. Until the last time window tw=125 no further changes in the cluster structure have been detected, and the clusters have grown to N_1^{0.5} = 23, N_2^{0.5} = 28, N_3^{0.5} = 3, N_4^{0.5} = 3, N_5^{0.5} = 6 and N_6^{0.5} = 4, with n_free = 0. Thus, it can be assumed that the number of typical states of data traffic during the 14 days of observation is equal to six and that the classifier fits the cluster structure well.

The results of dynamic clustering and classification over this time period of 125 time windows are summarised in Table 6-25. Time windows where the classifier was re-learned due to detected temporal changes in the cluster structure, and the corresponding re-clustering results, are marked in the table by a light grey shade.
Table 6-25: Results of dynamic clustering and classification of the network data traffic based on the structural similarity measure (columns: N(C_i), n_free, v_apd, FSA, FC, Changes)

Time window, c    N(C_i)                 Changes
TW=31, c=2        6,                     c_new = 2
TW=31, c=4        7, 2, 3,               No
TW=34, c=4        8, 2, 3,               c_new = 1
TW=34, c=5        0, , 4, 5,             No
TW=62, c=5        , 9, 3, 5,             c_new = 1
TW=62, c=6        2, 6, 4, 6, 9,         No

Cluster centres generated by the dynamic fuzzy clustering algorithm based on the structural similarity measure are not represented by trajectories, as in the case of pointwise similarity, but by temporal characteristics of the trajectories of features. These characteristics describe the temporal behaviour of trajectories, or the structure of the typical patterns of data traffic. The six detected cluster centres are given in Table 6-26 with respect to each feature, where sm denotes the degree of smoothness of a trajectory, T_i, i=1,..., 4, is the time instant of the occurrence of the i-th extreme value, and X_i, i=1,..., 4, is the i-th extreme value of a trajectory. Considering the values of the characteristics, these cluster centres can be interpreted as follows.

Cluster 1 represents an average data traffic with a high degree of fluctuations (the degree of smoothness is about 9) in a small value range, so that it can be considered stable. The number of small packets varies around 23 (under 64 B/s) and 43 (65-127 B/s), whereas the number of large packets is about 32. The number of other packet sizes is rather small.

Cluster 2 describes the data traffic at an average level with an increasing tendency for small packets and a decreasing tendency for large packets. The mean value of the number of packets under 64 B/s is similar to that of Cluster 1 (around 24), and that of a second packet size is around 102, with a larger value range in both cases compared to Cluster 1. This part of the data traffic is characterised by a rather small degree of fluctuations. The number of large packets fluctuates more strongly and lies around 30.

Cluster 3 corresponds to a stable state of the data traffic at a low level with strong fluctuations (the degree of smoothness is about 6) in a small value range. The number of small packets as well as large packets lies under 20, whereas the number of other packet types is very low.

Cluster 4 represents the data traffic increasing from a small to an average level and characterised by a low degree of fluctuations for most types of packets. The numbers of small packets under 64 B/s and between 65 and 127 B/s vary around 26 and 45 and rise to 30 and 80, respectively, whereas the number of large packets grows from 3 to 33. The other packet sizes remain at a low level.

Cluster 5 is characterised by an almost monotonously increasing tendency from a small to an average level, where the number of small packets is lower (around 22 for packet sizes between 65 and 127 B/s) and the number of large packets is somewhat larger (around 34, rising to 45, for packet size 1024-1518 B/s) compared to those of Cluster 4.

Cluster 6 describes large data traffic with an increasing tendency. The number of small packets (under 64 B/s) increases almost monotonously and is characterised by a mean value of about 54, whereas the number of large packets exhibits the same behaviour and varies around 78 with a value range of about 8. The number of other packet sizes exceeds those of the other clusters as well.

Table 6-26: Cluster centres representing data traffic states obtained based on the structural similarity measure (for each of Features 1 to 6, one block with rows C_1 to C_6 and columns Mean, Range, Sm, T_1, X_1, T_2, X_2, T_3, X_3, T_4, X_4)

In order to evaluate the quality of the fuzzy partition in each time window and to control the process of classifier adaptation, three validity measures are calculated: the fuzzy separation index based on the ambiguity of the partition, the fuzzy compactness and the average partition density. The temporal development of the first two is shown in Figure 6-21 and that of the third in Figure 6-22.

Figure 6-21: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the structural similarity measure (axes: time windows; degree of separation, degree of compactness)

Figure 6-22: Temporal development of average partition density obtained using the structural similarity measure (axes: time windows; average partition density)

As can be seen, after the adaptation of the classifier in time window 31 (the classifier with 2 clusters was re-learned with 4 clusters) the values of the compactness index and the average partition density have increased while the separation index has decreased. After the next adaptation of the classifier in time window 34 (the classifier containing 4 clusters was re-learned with 5 clusters) the separation and compactness indexes have increased; however, the average partition density has decreased. During classification and absorption of new objects into the 5 existing clusters in the following time windows, all validity measures have slowly decreased, indicating a deterioration of the classifier over time. In time window 62 the values of all validity measures have considerably increased due to the adaptation (the classifier was re-learned with 6 clusters). In the subsequent time windows the density has first slightly decreased and then started to grow, demonstrating the improvement of the fuzzy partition over time. The two other validity measures, which characterise the unambiguity of the objects' assignment, have partly remained on the same level or decreased insignificantly after time window 64. Thus it can be concluded that the generated fuzzy partition with six clusters can represent the data structure well.

The assignment of parts of trajectories observed in time windows of length 200 to the six detected clusters is illustrated in Figure 6-23, where only the trajectories of features 1, 2 and 6 are shown.

Figure 6-23: Assignment of parts of trajectories to six clusters obtained based on the structural similarity (three panels for features 1, 2 and 6; axes: time (s), number of packets and cluster assignment)

Comparing the assignment of parts of trajectories to clusters obtained using the pointwise similarity measure with the assignment to clusters generated using the structural similarity measure, it can be seen that in the second case the clusters, or typical states of data traffic, succeed each other much more frequently in the course of time, and the duration of each state is mostly shorter than in the first case. The reason for this behaviour is obviously the different criteria used to define the similarity measure, i.e. different clustering criteria. Clustering based on structural similarity primarily takes into consideration the pattern of the temporal development of trajectories, and the absolute values of trajectories are less important (they are considered in the calculation of the mean value and the value range). Since the temporal behaviour of data traffic has a fluctuating character with hills of different sizes, changes of the typical states are frequently observed.

In order to compare the results obtained with the two types of similarity measures, consider the same time intervals and the assignment of parts of a trajectory on these intervals to clusters. For instance, data traffic in the interval [16200, 18000] was assigned by the classifier based on the pointwise similarity measure to Cluster 1, representing low data traffic. Using the classifier based on the structural similarity measure, parts of a trajectory during this time period are assigned to the following three clusters (Figure 6-24).

[Figure panel: number of packets under 64 B/s over time (s), with the cluster assignment]

Figure 6-24: Assignment of a part of a trajectory from the time interval [6200, 8000] to clusters obtained using the structural similarity measure

On the interval [6200, 7600] the trajectory is assigned to Cluster 3, representing low data traffic with strong fluctuations. On the interval [7601, 7800] the trajectory is assigned to Cluster 2, describing average data traffic with an increasing behaviour of small packets, a decreasing behaviour of large packets and a small degree of fluctuations. Finally, on the interval [7801, 8000] the trajectory is assigned to Cluster 1, where data traffic is at an average level with strong fluctuations in a small range. Thus, the typical states of data traffic obtained using the structural similarity measure are related much more strongly to the behaviour of a trajectory than to its absolute values.

In general it can be seen that the descriptions of clusters obtained based on the pointwise and structural similarity measures are comparable and allow a clear identification of low, average and high levels of data traffic with increasing or decreasing tendencies. In the case of structural similarity the distinction is, however, more comprehensive.

It can be concluded that the choice of the similarity measure best suited for the analysis of load in computer networks requires the consideration of the technical aspects of network optimisation and of the available equipment. It should be decided what is more important for the optimisation: to react to changes in data traffic behaviour, which can be used for short-term forecasting, or to react to changes in data traffic volume, which represent the current network load. In both cases the company can gain considerable advantages from network optimisation using on-line monitoring and dynamic fuzzy pattern recognition techniques.
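The difference between the two kinds of similarity discussed above can be illustrated with a minimal sketch. It is not the pair of measures defined in this thesis: the function names, the mapping 1/(1 + d) from a distance to a similarity degree, and the use of first-order increments as a stand-in for the "pattern of temporal development" are all assumptions made for illustration only.

```python
def pointwise_similarity(a, b, scale=1.0):
    """Similarity in (0, 1] derived from the mean absolute pointwise
    difference of two equally long trajectories (illustrative choice)."""
    mean_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_diff / scale)


def increments(traj):
    """First-order differences, a crude description of the temporal pattern."""
    return [traj[k + 1] - traj[k] for k in range(len(traj) - 1)]


def structural_similarity(a, b, scale=1.0):
    """Similarity in (0, 1] derived from the mean absolute difference of
    the increments, so that a constant offset has no influence."""
    da, db = increments(a), increments(b)
    mean_diff = sum(abs(x - y) for x, y in zip(da, db)) / len(da)
    return 1.0 / (1.0 + mean_diff / scale)


# Hypothetical traffic readings in one time window.
low = [10, 12, 9, 13, 10]            # low level, fluctuating
shifted = [x + 50 for x in low]      # same fluctuation pattern, higher level
flat = [11, 11, 11, 11, 11]          # same level as `low`, no fluctuation

print("pointwise :", pointwise_similarity(low, shifted), pointwise_similarity(low, flat))
print("structural:", structural_similarity(low, shifted), structural_similarity(low, flat))
```

Under these assumptions the pointwise measure groups `low` with `flat` (similar volume), whereas the structural measure groups `low` with `shifted` (similar behaviour), which mirrors the trade-off between reacting to traffic volume and reacting to traffic behaviour.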


7 Conclusions and Further Research Directions

The research area of dynamic fuzzy pattern recognition is new and challenging from both the theoretical and the practical point of view. There is a huge number of applications in which a correct recognition of structure in data requires the temporal development of objects to be taken into account. So far only a limited number of algorithms for clustering and classification in dynamic environments have been developed. They can be separated into two main groups. The first group is represented by algorithms which consider static objects at a fixed moment of time. During the dynamic process of pattern recognition these algorithms try to recognise typical (static) states in the behaviour of a system/process, given as points in the feature space at a certain time instant, and follow their temporal changes as time passes using updating techniques. Most of these algorithms are able to detect only gradual changes in the data structure by monitoring characteristics of the classifier performance. The second group includes algorithms which are capable of processing dynamic objects represented by temporal sequences of observations, or trajectories, in the feature space. These algorithms result in cluster prototypes which are given by trajectories and describe dynamic typical states of a system over a certain period of time. In this case the order of states of an object, or the history of its temporal development, determines the membership of an object in a certain pattern, or cluster. Most of the algorithms in this area use statistical properties of trajectories to determine their similarity.

This thesis has suggested a new algorithm for dynamic classifier design and classification which can be applied to static as well as dynamic objects and combined with different types of fuzzy classifiers. The main property of this algorithm is its ability to automatically recognise changes of the cluster structure in the course of time and to adapt the classifier to these changes.
The algorithm takes advantage of fuzzy set theory, which provides a unique mechanism for the gradual assignment of objects to clusters and makes it possible to detect a gradual temporal transition of objects between clusters. The proposed algorithm consists of four algorithms constituting the monitoring procedure, which are based primarily on properties of the membership function, and two algorithms for the adaptation procedure. This algorithm is unique in that it has been developed for the dynamic unsupervised design of point-prototype-based fuzzy classifiers and is capable of recognising gradual as well as abrupt changes in the cluster structure, possessing a flexible mechanism for the adaptation of the classifier.

In order to be able to apply the new algorithm to the clustering and classification of dynamic objects, a number of similarity measures for trajectories were proposed which take into consideration either the pointwise closeness of trajectories in the feature space or the best match of trajectories with respect to their shape. The definition of structural similarity

measures is of primary importance for the comparison of trajectories based on their temporal behaviour. It depends on the meaning of similarity in the given application and is formulated with respect to specific mathematical properties and characteristics of trajectories. The different definitions of fuzzy similarity between trajectories introduced in this thesis make it possible to model a gradual representation of similarity according to human judgement, which provides the most plausible tool for the recognition process.

The adaptive fuzzy clustering algorithm for dynamic objects developed in this thesis should help the practitioner to design systems for the on-line monitoring of system states and for the recognition of the typical behaviour of a system under consideration, taking into account the history of the temporal development of objects. This kind of problem arises in many application areas such as medicine, marketing, finance, or technical diagnosis.

In order to demonstrate the practical relevance of the algorithm developed, it was applied to two different problems. The first, economic, problem is concerned with bank customer segmentation based on the analysis of the customers' behaviour. In contrast to many similar tasks solved by companies until now, the analysis is based on the consideration of the temporal development of customers over a long period of time or in time windows. Clustering and classification in time windows allow a bank to recognise typical customer segments, to follow their possible temporal changes, which may appear due to changing bank services or economic circumstances, and to detect changes in the assignment of customers to typical segments. The analysis results can help a bank to increase its profit by preparing special offers for the best users of bank services and to adapt its products and services to recognised changes in the customer portfolio. The second, technical, problem is related to the recognition of typical states in computer network load.
The knowledge of time-specific network states with particularly high or low load gives companies the possibility to optimise their network resources by purchasing different bandwidths for data transmission depending on the time of day and in this way to achieve savings. The problem of load optimisation is also relevant for the area of telecommunication, where the prices for the use of telecommunication services can be determined depending on the different load states.

The performance of the algorithm for dynamic fuzzy classifier design and classification depends on the clustering algorithm chosen for classifier design and on the correct choice of several threshold values used within the monitoring procedure. Improvements of the algorithm can be attained by introducing more powerful clustering algorithms capable of recognising clusters of different form, size and density. Although appropriate values of thresholds can be determined by an expert depending on the specific application, it seems reasonable to try to develop automatic procedures or to find dependencies between these values and data properties in order to simplify the choice of the correct threshold values. Further enhancement

of the performance of dynamic pattern recognition systems can be expected from applying new methods for time-dependent feature selection / generation, which is also an important aspect in static pattern recognition. On the one hand, a variable set of features can make the analysis of the clustering results more difficult, but on the other hand it can improve clustering results by representing only the most relevant information as time passes.

The design of an adaptive fuzzy classifier relies to a high degree on the validity measure used to control the adaptation process. Most of the existing validity measures depend on the number of clusters chosen for the fuzzy partition of objects, and none of them can guarantee the correct evaluation of the partition quality in all cases. Thus, there is a need for the definition of new, improved validity measures which can provide a more reliable statement regarding the partition quality achieved with a dynamic fuzzy classifier.

Considering the modification of static fuzzy clustering algorithms by using a similarity measure for trajectories, extensions of the list of proposed similarity measures are possible. Structural similarity measures in particular should be a subject of further investigation, since they can describe a context-dependent similarity of trajectories with respect to their form and temporal development. In particular, it should be studied which temporal characteristics of trajectories are the most relevant for certain classes of applications or for clustering problems based on fuzzy numbers as a special case of trajectories.

One can conclude that dynamic fuzzy pattern recognition represents a new and promising research area with a wide field of potential applications. The results presented in this thesis show that the adaptive fuzzy clustering algorithm for dynamic objects can be successfully applied to solving the pattern recognition problem in a dynamic environment.
The main power of the algorithm lies in the combination of a new method for dynamic classifier design based on the principles of adaptation with a set of fuzzy similarity measures for trajectories. Further modifications of the algorithm for different types of classifiers and an extension of the set of similarity measures will lead to an expansion of this new class of methods for dynamic pattern recognition.


8 References

Abrantes, A.J., Marques, J.S. (1998) A Method for Dynamic Clustering of Data. Proceedings of the 9th British Conference, University of Southampton, UK, 1998, p.
Agrawal, R., Lin, K.-I., Sawhney, H.S., Shim, K. (1995) Fast Similarity Search in the Presence of Noise, Scaling and Translation in Time-Series Databases. In: Proceedings of the 21st International Conference on Very Large Data Bases (VLDB 95), Zurich, Switzerland, 1995, p.
Angstenberger, J. (1997) Data Mining in Business Anwendungen. Anwendersymposium zu Fuzzy Technologien und Neuronalen Netzen, November 1997, Dortmund, Transferstelle Mikroelektronik / Fuzzy-Technologien, c/o MIT GmbH, Promenade 9, D Aachen, 1997, p.
Angstenberger, J., Nelke, M., Schrötter, T. (1998) Data Mining for Process Analysis and Optimization. CE Expo, Houston, Texas, 1998
Arneodo, A., Bacry, E., Muzy, J.F. (1995) The Thermodynamics of Fractals Revisited with Wavelets. Physica A, Vol. 213, 1995, p.
Åström, K.J., Wittenmark, B. (1995) Adaptive Control. Addison-Wesley Publishing Company, Inc., 1995
Bakshi, B.R., Locher, G., Stephanopoulos, G., Stephanopoulos, G. (1994) Analysis of Operating Data for Evaluation, Diagnosis and Control of Batch Operations. Journal of Process Control, Vol. 4 (4), 1994, Butterworth-Heinemann, p.
Bandemer, H., Näther, W. (1992) Fuzzy Data Analysis. Kluwer Academic Publishers, Dordrecht, Boston, London, 1992
Bensaid, A.M., Hall, L.O., Bezdek, J.C., Clarke, L.P., Silbiger, M.L., Arrington, J.A., Murtagh, R.F. (1996) Validity-Guided (Re)Clustering with Applications to Image Segmentation. IEEE Transactions on Fuzzy Systems, Vol. 4 (2), 1996, p.
Bezdek, J.C. (1981) Pattern Recognition with Fuzzy Objective Function Algorithms. New York, Plenum, 1981

Binaghi, E., Della Ventura, A., Rampini, A., Schettini, R. (1993) Fuzzy Reasoning Approach to Similarity Evaluation in Image Analysis. International Journal of Intelligent Systems, Vol. 8, 1993, p.
Black, U. (1989) Data Networks. Concepts, Theory, and Practice. Prentice Hall, Englewood Cliffs, New Jersey 07632, 1989
Bock, H.H. (1974) Automatische Klassifikation: theoretische und praktische Methoden zur Gruppierung und Strukturierung von Daten (Cluster-Analyse). Vandenhoeck & Ruprecht, Göttingen, 1974
Bocklisch, S. (1981) Experimentelle Prozeßanalyse mit unscharfer Klassifikation. Dissertation, Technische Hochschule Karl-Marx-Stadt, 1981
Bocklisch, S. (1987) Hybrid Methods in Pattern Recognition. In: Devijver, P., Kittler, J. Pattern Recognition Theory and Applications. Springer-Verlag, Berlin, Heidelberg, 1987, p.
Boose, J.H. (1989) A Survey of Knowledge Acquisition Techniques and Tools. Knowledge Acquisition, 1, 1989, p.
Boutleux, E., Dubuisson, B. (1996) Fuzzy Pattern Recognition to Characterize Evolutionary Complex Systems. Application to the French Telephone Network. Proc. IEEE Int. Conference on Fuzzy Systems, New Orleans, LA, September 1996
Brachman, R.J., Anand, T. (1996) The Process of Knowledge Discovery in Databases. In: Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. Advances in Knowledge Discovery and Data Mining. AAAI Press / The MIT Press, Menlo Park, California, 1996, p.
Bunke, H. (1987) Hybrid Methods in Pattern Recognition. In: Devijver, P., Kittler, J. Pattern Recognition Theory and Applications. Springer-Verlag, Berlin, Heidelberg, 1987, p.
Cayrol, M., Farreny, H., Prade, H. (1980) Possibility and Necessity in a Pattern-Matching Process. Proceedings of IXth International Congress on Cybernetics, Namur, Belgium, September, 1980, p.
Cayrol, M., Farreny, H., Prade, H. (1982) Fuzzy Pattern Matching. Kybernetes, 11, 1982, p. 103-116

Chow, C.K. (1970) On Optimum Recognition Error and Reject Tradeoff. IEEE Transactions on Information Theory, IT-16, 1970, p.
Das, G., Gunopulos, D., Mannila, H. (1997) Finding Similar Time Series. In: Komorowski, J., Zytkow, J. (Eds.) Principles of Data Mining and Knowledge Discovery. Proceedings of the First European Symposium PKDD 97, Trondheim, Norway, June 1997, Springer, 1997, p.
Davies, D.L., Bouldin, D.W. (1979) A Cluster Separation Measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-1, 1979, p.
Denoeux, T., Masson, M., Dubuisson, B. (1997) Advanced Pattern Recognition Techniques for System Monitoring and Diagnosis: a Survey. Journal Europeen des Systemes Automatises, Vol. 31, Nr. 9/10, 1997, p.
Devijver, P., Kittler, J. (1982) Pattern Recognition: A Statistical Approach. Prentice Hall, 1982
Dilly, R. (1995) Data Mining: An Introduction. Queens University Belfast, December 1995
Dorf, R.C. (1992) Modern Control Systems. Addison-Wesley, Reading, Mass., 1992
Dubois, D., Prade, H. (1988) Possibility Theory - An Approach to Computerized Processing of Uncertainty. Plenum Press, New York, 1988
Dubois, D., Prade, H., Testemale, C. (1988) Weighted Fuzzy Pattern Matching. Fuzzy Sets and Systems, 28, 1988, p.
Duda, R.O., Hart, P.E. (1973) Pattern Classification and Scene Analysis. Wiley, New York, 1973
Dunn, J.C. (1974) Well Separated Clusters and Optimal Fuzzy Partitions. Journal of Cybernetics, Vol. 4 (3), 1974, p.
Eckardt, H., Haupt, J., Wernstedt, J. (1995) Fuzzygestützte Substanzerkennung in der Umweltanalytik. Anwendersymposium zu industriellen Anwendungen der Neuro-Fuzzy Technologien, Lutherstadt Wittenberg, March 4-5, 1995, p. 05-

Engels, J., Chadenas, C. (1997) Learning Historic Traffic Patterns as Basis for Traffic Prediction. Proceedings of the 5th European Conference on Intelligent Techniques and Soft Computing (EUFIT 97), Aachen, Germany, September 8-11, 1997, p.
Falconer, K. (1990) Fractal Geometry - Mathematical Foundations and Applications. John Wiley, 1990
Famili, A., Shen, W.-M., Weber, R., Simoudis, E. (1997) Data Pre-processing and Intelligent Data Analysis. Intelligent Data Analysis, Vol. 1 (1), 1997
Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. (1996) From Data Mining to Knowledge Discovery: An Overview. In: Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. Advances in Knowledge Discovery and Data Mining. AAAI Press / The MIT Press, Menlo Park, California, 1996, p. 1-34
Föllinger, O., Franke, D. (1982) Einführung in die Zustandsbeschreibung dynamischer Systeme. Oldenbourg Verlag, München, 1982
Frawley, W.J., Piatetsky-Shapiro, G., Matheus, C.J. (1991) Knowledge Discovery in Databases: An Overview. In: Piatetsky-Shapiro, G., Frawley, W.J. (Ed.) Knowledge Discovery in Databases. Cambridge, Mass: AAAI Press / The MIT Press, 1991, p. 1-27
Frawley, W.J., Piatetsky-Shapiro, G., Matheus, C.J. (1992) Knowledge Discovery in Databases: An Overview. AI Magazine, Vol. 4 (3), 1992, p.
Freer, J.R. (1988) Computer Communications and Networks. Pitman Publishing, Computer Systems Series, London, 1988
Frélicot, C. (1992) Un système adaptatif de diagnostic prédictif par reconnaissance des formes floues. PhD thesis, Université de Technologie de Compiègne, Compiègne, 1992
Frélicot, C., Masson, M.H., Dubuisson, B. (1995) Reject Options in Fuzzy Pattern Classification Rules. Proceedings of the 3rd European Conference on Intelligent Techniques and Soft Computing (EUFIT 95), Aachen, Germany, August 28-31, 1995, p.
Frigui, H., Krishnapuram, R. (1997) Clustering by Competitive Agglomeration. Pattern Recognition, Vol. 30 (7), 1997, p. 1109-1119

Fu, K.S. (1974) Syntactic Methods in Pattern Recognition. Academic Press, New York, 1974
Fu, K.S. (1982a) Syntactic Pattern Recognition with Applications. Prentice-Hall, Englewood Cliffs, NJ, 1982
Fu, K.S. (Ed.) (1982b) Applications of Pattern Recognition. CRC Press, Boca Raton, FL, 1982
Gath, I., Geva, A.B. (1989) Unsupervised Optimal Fuzzy Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 1989, p.
Geisser, S. (1975) The Predictive Sample Reuse Method with Applications. Journal of the American Statistical Association, 70, 1975, p.
Geva, A.B., Kerem, D.H. (1998) Forecasting Generalized Epileptic Seizures from the EEG Signal by Wavelet Analysis and Dynamic Unsupervised Fuzzy Clustering. IEEE Transactions on Biomedical Engineering, 45 (10), 1998, p.
Gibb, W.J., Auslander, D.M., Griffin, J.C. (1994) Adaptive Classification of Myocardial Electrogram Waveforms. IEEE Transactions on Biomedical Engineering, 41, 1994, p.
Grabisch, M., Bienvenu, G., Grandin, J.F., Lemer, A., Moruzzi, M. (1997) A Formal Comparison of Probabilistic and Possibilistic Frameworks for Classification. Proceedings of 7th IFSA World Congress, Prague 1997, p.
Graham, I., Jones, P.L. (1988) Expert Systems, Knowledge Uncertainty and Decision. London, Chapman and Hall Computing, 1988
Grenier, D. (1984) Méthode de détection d'évolution. Application a l'instrumentation nucléaire. Thèse de docteur-ingénieur. Université de Technologie de Compiègne, 1984
Gunderson, R. (1978) Application of Fuzzy ISODATA Algorithms to Star Tracker Pointing Systems. Proceedings of the 7th Triennial World IFAC Congress, Helsinki, Finland, 1978, p.
Gustafson, D., Kessel, W. (1979) Fuzzy Clustering with a Fuzzy Covariance Matrix. Proceedings of IEEE CDC, San Diego, California. IEEE Press, Piscataway, New Jersey, 1979, p.

Hofmeister, P. (1999) Evolutionäre Szenarien - dynamische Konstruktion alternativer Zukunftsbilder mit unscharfen Regelbasen. Dissertation RWTH Aachen, Lehrstuhl für Unternehmensforschung, 1999
Hofmeister, P., Joentgen, A., Mikenina, L., Weber, R., Zimmermann, H.-J. (1999) Reduction of Complexity in Scenario Analysis by Means of Dynamic Fuzzy Data Analysis. OR Spektrum, (to appear)
Hogg, R.V., Ledolter, J. (1992) Applied Statistics for Engineers and Physical Scientists. Macmillan, New York, 1992
Höppner, F., Klawonn, F., Kruse, R. (1996) Fuzzy Clusteranalyse: Verfahren für die Bilderkennung, Klassifikation und Datenanalyse. Vieweg, Braunschweig, 1996
Hornik, K., Stinchcombe, M., White, H. (1989) Multilayer Feedforward Networks are Universal Approximators. Neural Networks, 2 (5), 1989, p.
Hunt, C. (1995) TCP/IP Netzwerk Administration. O'Reilly International Thomson Verlag, Bonn, 1995
Huntsberger, T.L., Ajjimarangsee, P. (1990) Parallel Self-organizing Feature Maps for Unsupervised Pattern Recognition. International Journal of General Systems, 16, 1990, p.
Jain, A.K. (1987) Advances in Statistical Pattern Recognition. In: Devijver, P., Kittler, J. Pattern Recognition Theory and Applications. Springer-Verlag, Berlin, Heidelberg, 1987, p. 1-19
Joentgen, A., Mikenina, L., Weber, R., Zimmermann, H.-J. (1998) Dynamic Fuzzy Data Analysis: Similarity between Trajectories. In: Brauer, W. (Ed.) Fuzzy-Neuro Systems 98 - Computational Intelligence. Infix, Sankt Augustin, p.
Joentgen, A., Mikenina, L., Weber, R., Zimmermann, H.-J. (1999a) Theorie und Methodologie der dynamischen unscharfen Datenanalyse. Report DFG-Zi 104/27-1, 1999
Joentgen, A., Mikenina, L., Weber, R., Zimmermann, H.-J. (1999b) Dynamic Fuzzy Data Analysis based on Similarity between Functions. Fuzzy Sets and Systems, 105 (1), 1999, p.
Joentgen, A., Mikenina, L., Weber, R., Zeugner, A., Zimmermann, H.-J. (1999) Automatic Fault Detection in Gearboxes by Dynamic Fuzzy Data Analysis. Fuzzy Sets and Systems, 105 (1), 1999, p.

Kastner, J.K., Hong, S.J. (1984) A Review of Expert Systems. European Journal of Operational Research, 18, 1984, p.
Kiel, H.-U. (1996) Integration studentischer Wohnanlagen in die Datenkommunikationsstruktur einer Hochschule. Diplomarbeit, Institut für Informatik und Rechenzentrum, Technische Universität Clausthal, Germany, 1996
Kohavi, Z. (1978) Switching and Finite Automata Theory. 2nd edition, McGraw-Hill, New York, 1978
Kohonen, T. (1988) Self-Organization and Associative Memory. Springer-Verlag, New York, 1988
Kosko, B. (1992) Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs, NJ, 1992
Krishnapuram, R., Nasraoui, O., Frigui, H. (1992) The Fuzzy C Spherical Shells Algorithms: A New Approach. IEEE Transactions on Neural Networks, 3, 1992, p.
Krishnapuram, R., Keller, J. (1993) A Possibilistic Approach to Clustering. IEEE Transactions on Fuzzy Systems, 1, 1993, p.
Kunisch, G. (1996) Anpassung und Evaluierung statistischer Lernverfahren zur Behandlung dynamischer Aspekte im Data Mining. Master's Thesis, Fakultät für Mathematik und Wirtschaftswissenschaften, Universität Ulm, Germany, 1996
Lanquillon, C. (1997) Dynamic Neural Classification. Diplomarbeit, Institut für Betriebssysteme und Rechnerverbund, Abteilung Fuzzy-Systeme und Soft-Computing, Universität Braunschweig (TU), Germany, 1997
Lee, S.C. (1975) Fuzzy Neural Networks. Mathematical Biosciences, 23, 1975, p.
Looney, C.G. (1997) Pattern Recognition Using Neural Networks: Theory and Algorithms for Engineers and Scientists. Oxford University Press, Oxford, New York, 1997
Mallat, S.G., Zhong, S. (1992) Complete Signal Representation with Multiscale Edges. IEEE Transactions PAMI, Vol. 14, 1992, p.

Mann, S. (1983) Ein Lernverfahren zur Modellierung zeitvarianter Systeme mittels unscharfer Klassifikation. Dissertation, Technische Hochschule Karl-Marx-Stadt, Germany, 1983
Marsili-Libelli, S. (1998) Adaptive Fuzzy Monitoring and Fault Detection. International Journal of COMADEM, 1 (3), 1998, p.
Minsky, M.L., Papert, S.A. (1988) Perceptrons. Expanded Edition, MIT Press, Cambridge, MA, 1988
Nakhaeizadeh, G., Taylor, C., Kunisch, G. (1997) Dynamic Supervised Learning. Some Basic Issues and Application Aspects. In: R. Klar, O. Opitz (Eds.) Classification and Knowledge Organization. Springer Verlag, Berlin, Heidelberg, 1997, p.
Nauck, D., Klawonn, F., Kruse, R. (1996) Neuronale Netze und Fuzzy Systeme. (2. Aufl.), Vieweg, Braunschweig, 1996
Nemirko, A.P., Manilo, L.A., Kalinichenko, A.N. (1995) Waveform Classification for Dynamic Analysis of ECG. Pattern Recognition and Image Analysis, Vol. 5 (1), 1995, p.
OMRON ELECTRONICS GmbH (1991) OMRON: Fuzzy Logic - Ideen für die Industrieautomation. Hannover-Messe, 1991
Pal, S.K. (1977) Fuzzy Sets and Decision Making Approaches in Vowel and Speaker Recognition. IEEE Transactions on Man, Machine, and Cybernetics, 1977, p.
Pal, R.N., Bezdek, J.C. (1994) Measuring Fuzzy Uncertainty. IEEE Transactions on Fuzzy Systems, Vol. 2, 1994, p.
Pedrycz, W. (1990) Fuzzy Sets in Pattern Recognition: Methodology and Methods. Pattern Recognition, Vol. 23, 1990, p.
Pedrycz, W. (1997) Fuzzy Sets in Pattern Recognition: Accomplishments and Challenges. Fuzzy Sets and Systems, 90, 1997, p.
Peltier, M.A., Dubuisson, B. (1993) A Human State Detection System Based on a Fuzzy Approach. Tooldiag 93, International Conference on Fault Diagnosis, Toulouse, April, 1993, p.

Peschel, M., Mende, W. (1983) Leben wir in einer Volterra-Welt? Akademie-Verlag, Berlin, 1983
Petrak, J. (1997) Data Mining - Methoden und Anwendungen. Austrian Research Institute for Artificial Intelligence (ÖFAI), Vienna, TR-97-5, 1997
Pokropp, F. (1996) Stichproben: Theorie und Verfahren. R. Oldenbourg Verlag, München, Wien, 1996
Rosenberg, J.M. (1986) Dictionary of Artificial Intelligence and Robotics, 1986
Rosenblatt, F. (1958) The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 654, 1958, p.
Rüger, B. (1989) Induktive Statistik: Einführung für Wirtschafts- und Sozialwissenschaftler. R. Oldenbourg Verlag, München, Wien, 1989
Sato, M., Sato, Y., Jain, L.C. (1997) Fuzzy Clustering Models and Applications. Physica-Verlag, 1997
Schleicher, U. (1994) Vergleich und Erweiterung von c-means Verfahren und Kohonen Netzen zur Fuzzy Clusterung: eine Untersuchung unter dem Aspekt der Anwendung in der Qualitätskontrolle. Magisterarbeit, Lehrstuhl für Unternehmensforschung, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Germany, 1994
Schreiber, T., Schmitz, A. (1997) Classification of Time Series Data with Nonlinear Similarity Measures. Physical Review Letters, Vol. 79 (8), 1997, p.
Setnes, M., Kaymak, U. (1998) Extended Fuzzy c-means with Volume Prototypes and Cluster Merging. Proceedings of the 6th European Conference on Intelligent Techniques and Soft Computing (EUFIT 98), Aachen, Germany, September 7-10, 1998, p.
Shewhart, W.A. (1931) Economic Control of Quality of Manufactured Product. Princeton, NJ, D. Van Nostrand Reinhold, 1931

Smith, A.E. (1994) X-bar and R-control Chart Interpretation using Neural Computing. International Journal of Production Research, 32, 1994, p.
Stone, M. (1974) Cross-validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society B, 36, 1974, p. 111-147
Stöppler, S. (Hrsg.) (1980) Dynamische Ökonomische Systeme: Analyse und Steuerung. Gabler GmbH, Wiesbaden, 1980
Struzik, Z.R. (1995) The Wavelet Transform in the Solution to the Inverse Fractal Problem. Fractals, Vol. 3 (2), 1995, p.
Struzik, Z.R., Siebes, A. (1998) Wavelet Transformation in Similarity Paradigm. In: Wu, X., Kotagiri, R., Korb, K.B. (Eds.) Research and Development in Knowledge Discovery and Data Mining. Proceedings of the Second Pacific-Asia Conference PAKDD-98, Melbourne, Australia, April 1998, Springer, 1998, p.
Stutz, C. (1998) Partially Supervised Fuzzy c-means Clustering with Cluster Merging. Proceedings of the 6th European Conference on Intelligent Techniques and Soft Computing (EUFIT 98), Aachen, Germany, September 7-10, 1998, p.
Taylor, C., Nakhaeizadeh, G. (1997) Learning in Dynamically Changing Domains: Theory Revision and Context Dependence Issues. In: M. Someren, G. Widmer (Eds.) Machine Learning: 9th European Conference on Machine Learning (ECML 97), Prague, Czech Republic, p.
Taylor, C., Nakhaeizadeh, G., Lanquillon, C. (1997) Structural Change and Classification. In: G. Nakhaeizadeh, I. Bruha, C. Taylor (Eds.) Workshop Notes on Dynamically Changing Domains: Theory Revision and Context Dependence Issues, 9th European Conference on Machine Learning (ECML 97), Prague, Czech Republic, p.
Therrien, C.W. (1989) Decision Estimation and Classification: An Introduction to Pattern Recognition and Related Topics. John Wiley & Sons, New York, 1989
Thole, U., Zimmermann, H.-J., Zysno, P. (1979) On the Suitability of Minimum and Product Operators for the Intersection of Fuzzy Sets. Fuzzy Sets and Systems, Vol. 2, 1979, p.

Vesterinen, A., Särelä, A., Vuorimaa, P. (1997) Pre-processing for Fuzzy Anaesthesia Machine Fault Diagnosis. Proceedings of the 5th European Conference on Intelligent Techniques and Soft Computing (EUFIT 97), Aachen, Germany, September 8-11, 1997, p.
Welcker, J. (1991) Technische Aktienanalyse: die Methoden der technischen Analyse mit Chart-Übungen. Verlag Moderne Industrie, Zürich, 1991
Widmer, G., Kubat, M. (1993) Effective Learning in Dynamic Environments by Explicit Context Tracking. In: P. Brazdil (Ed.) Machine Learning: Proceedings of the 6th European Conference on Machine Learning (ECML 93), Vienna, Austria, April 1993. Lecture Notes in Artificial Intelligence (667), Springer-Verlag, Berlin, Germany, p.
Widmer, G., Kubat, M. (1996) Learning in the Presence of Concept Drift and Hidden Contexts. Machine Learning, 23, 1996, p.
Windham, M.P. (1981) Cluster Validity for Fuzzy Clustering Algorithms. Fuzzy Sets and Systems, Vol. 5, 1981, p.
Xie, X.L., Beni, G. (1991) A Validity Measure for Fuzzy Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, 1991, p.
Yager, R.R. (1988) On Ordered Weighted Averaging Aggregation Operators in Multicriteria Decision Making. IEEE Transactions on Systems, Man & Cybernetics, Vol. 18, 1988, p.
Yazdani, N., Ozsoyoglu, Z.M. (1996) Sequence Matching of Images. In: Proceedings of the 8th International Conference on Scientific and Statistical Database Management, Stockholm, 1996, p.
Zadeh, L.A. (1977) Fuzzy Sets and their Application to Classification and Clustering. In: J. Van Ryzin (Ed.) Classification and Clustering. Academic Press, New York, 1977
Zahid, N., Abouelala, O., Limouri, M., Essaid, A. (1999) Unsupervised Fuzzy Clustering. Pattern Recognition Letters, 20, 1999, p.
Zieba, S., Dubuisson, B. (1994) Tool Wear Monitoring and Diagnosis in Milling using Vibration Signals. SafeProcess 94, Espoo, Finland, June 1994, p.

Zimmermann, H.-J. (1992) Methoden und Modelle des Operations Research. Vieweg, Braunschweig, 1992
Zimmermann, H.-J. (Hrsg.) (1993) Fuzzy Technologien: Prinzipien, Werkzeuge, Potentiale. VDI Verlag, Düsseldorf, 1993
Zimmermann, H.-J. (Hrsg.) (1995) Datenanalyse. Düsseldorf, 1995
Zimmermann, H.-J. (1996) Fuzzy Set Theory - and its Applications. Third edition, Boston, Dordrecht, London, 1996
Zimmermann, H.-J. (1997) A Fresh Perspective on Uncertainty Modeling: Uncertainty vs. Uncertainty Modeling. In: M. Ayyub, M.M. Gupta (Eds.) Uncertainty Analysis in Engineering and Sciences: Fuzzy Logic, Statistics, and Neural Approach. Kluwer Academic Publishers, Boston, 1997, p.
Zimmermann, H.-J., Zysno, P. (1980) Latent Connectives in Human Decision Making. Fuzzy Sets and Systems, Vol. 4, 1980, p.
Zimmermann, H.-J., Zysno, P. (1985) Quantifying Vagueness in Decision Models. European Journal of Operational Research, Vol. 22, 1985, p.

9 Appendix

9.1 Unsupervised Optimal Fuzzy Clustering Algorithm of Gath and Geva

The advantage of the UOFC algorithm is the unsupervised initialisation of cluster prototypes and the ability to detect automatically the correct number of clusters based on performance measures for cluster validity which use fuzzy hypervolume and density functions. The idea of the algorithm consists in iterating the basic calculation scheme of the UOFC with an increasing number of clusters, evaluating the cluster validity measure in each iteration and choosing the optimal number of clusters based on the validity criterion.

The calculation scheme of the UOFC algorithm is closely related to probability theory. Objects are interpreted as observations of an M-dimensional normally distributed random variable. Assignments of N observations $\mathbf{x}_j$, $j = 1, \ldots, N$, to c normal distributions $f_i$, $i = 1, \ldots, c$, correspond to hard memberships $u_{ij} \in \{0, 1\}$. Expected values of the normal distributions are interpreted as cluster prototypes. The distance function in the algorithm is chosen inversely proportional to the a posteriori probability (likelihood) with which a certain object $\mathbf{x}_j$ is an observation of a normal distribution $f_i$. This choice is motivated by the fact that a smaller distance of an object to a cluster prototype corresponds to a larger probability of membership of the object in this cluster and vice versa. Generalising these results to the case of fuzzy memberships of objects in clusters (observations in normal distributions), the final calculation scheme of the UOFC algorithm is obtained. The main steps of the UOFC algorithm are summarised in the following [Gath, Geva, 1989].

Algorithm 8: The unsupervised optimal fuzzy clustering (UOFC) algorithm.

1. Choose the initial cluster prototype as the mean location of all objects.

2. Calculate a new partition of objects by performing the following two phases:

2.1 apply the FCM algorithm with a Euclidean distance function:

\[ d^2(\mathbf{x}_j, \mathbf{v}_i) = (\mathbf{x}_j - \mathbf{v}_i)^T (\mathbf{x}_j - \mathbf{v}_i) \qquad (9.1) \]

where $\mathbf{x}_j$ is the j-th M-dimensional feature vector, $j = 1, \ldots, N$, and $\mathbf{v}_i$ is the centre of the i-th cluster, $i = 1, \ldots, c$.
2.2 apply the FCM algorithm with an exponential distance function (a fuzzy modification of the MLE algorithm):

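\[ d^2(\mathbf{x}_j, \mathbf{v}_i) = \frac{\sqrt{\det(\mathbf{F}_i)}}{P_i} \exp\left( \frac{1}{2} (\mathbf{x}_j - \mathbf{v}_i)^T \mathbf{F}_i^{-1} (\mathbf{x}_j - \mathbf{v}_i) \right) \qquad (9.2) \]

where $P_i$ is the a priori probability of selecting cluster i, which is determined as the normalised sum of membership degrees of objects to cluster i:

\[ P_i = \frac{1}{N} \sum_{j=1}^{N} u_{ij}, \qquad (9.3) \]

and $\mathbf{F}_i$ is the fuzzy covariance matrix of cluster i given by:

\[ \mathbf{F}_i = \frac{\sum_{j=1}^{N} u_{ij} (\mathbf{x}_j - \mathbf{v}_i)(\mathbf{x}_j - \mathbf{v}_i)^T}{\sum_{j=1}^{N} u_{ij}}. \qquad (9.4) \]

3. Calculate the following cluster validity measures:

3.1 The fuzzy hypervolume criterion is calculated by:

\[ v_{HV}(c) = \sum_{i=1}^{c} h_i \qquad (9.5) \]

where the hypervolume of the i-th cluster is defined as $h_i = \sqrt{\det(\mathbf{F}_i)}$.

3.2 The partition density is calculated by:

\[ v_{PD}(c) = \frac{\sum_{i=1}^{c} S_i}{\sum_{i=1}^{c} h_i} \qquad (9.6) \]

where $S_i$ is the sum of "good" objects in cluster i whose distance to the cluster centre does not exceed the standard deviation of features for this cluster:

\[ S_i = \sum_{j=1}^{N} u_{ij} \quad \forall\, \mathbf{x}_j \in \left\{ \mathbf{x}_j \mid (\mathbf{x}_j - \mathbf{v}_i)^T \mathbf{F}_i^{-1} (\mathbf{x}_j - \mathbf{v}_i) < 1 \right\} \]

3.3 The average partition density is calculated by:

\[ v_{APD}(c) = \frac{1}{c} \sum_{i=1}^{c} \frac{S_i}{h_i} \qquad (9.7) \]

4. Add another cluster prototype equidistant (with large values of standard deviations) from all objects.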

5. If the current number of clusters is smaller than a predefined maximum number, then go to step 2.
6. Else, stop and choose the optimal partition using the validity measure criteria.

In contrast to FCM, PCM and the algorithm of Gustafson and Kessel, the equations to determine cluster prototypes are not derived from an objective function with a modified distance function, but are obtained by fuzzification of the statistical equations for the expected value and the covariance matrix. Thus, the proposed equations represent a suitable heuristic which cannot guarantee a globally optimal solution. Due to the exponential distance function, the UOFC algorithm searches for an optimum in a narrow local region. It therefore requires good initial cluster prototypes, which are calculated by the FCM in the first phase. These initial prototypes are refined in the second phase by the fuzzy modification of the MLE algorithm, and the partition is adapted to different cluster shapes, sizes and densities.
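The quantities used in steps 2.2 and 3 can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this thesis: the function names are hypothetical, the data are restricted to M = 2 features so that the determinant and inverse of the covariance matrix can be written in closed form, and the covariance matrix is assumed to be non-singular. The hypervolume is taken as the square root of the determinant, following Gath and Geva.

```python
import math

def cluster_stats(X, u, v):
    """Prior P_i (eq. 9.3) and fuzzy covariance F_i (eq. 9.4) of one cluster.

    X: list of 2-D points, u: membership degrees of the points to the
    cluster, v: cluster centre. Restricted to 2-D data (assumption).
    """
    total = sum(u)
    P = total / len(X)
    F = [[0.0, 0.0], [0.0, 0.0]]
    for (x1, x2), w in zip(X, u):
        d = (x1 - v[0], x2 - v[1])
        for a in range(2):
            for b in range(2):
                F[a][b] += w * d[a] * d[b] / total
    return P, F

def det2(F):
    """Determinant of a 2x2 matrix."""
    return F[0][0] * F[1][1] - F[0][1] * F[1][0]

def quad_form(x, v, F):
    """(x - v)^T F^{-1} (x - v), using the closed-form 2x2 inverse."""
    d1, d2 = x[0] - v[0], x[1] - v[1]
    num = F[1][1] * d1 * d1 - (F[0][1] + F[1][0]) * d1 * d2 + F[0][0] * d2 * d2
    return num / det2(F)

def exp_distance(x, v, F, P):
    """Exponential distance of eq. (9.2)."""
    return math.sqrt(det2(F)) / P * math.exp(quad_form(x, v, F) / 2.0)

def validity_measures(X, U, V):
    """Return (v_HV, v_PD, v_APD) of eqs. (9.5)-(9.7).

    U[i][j] is the membership of point j to cluster i, V[i] its centre.
    """
    h, S = [], []
    for u, v in zip(U, V):
        _, F = cluster_stats(X, u, v)
        h.append(math.sqrt(det2(F)))  # hypervolume h_i
        # S_i: memberships of the "good" objects close to the centre
        S.append(sum(w for x, w in zip(X, u) if quad_form(x, v, F) < 1.0))
    v_hv = sum(h)
    v_pd = sum(S) / v_hv
    v_apd = sum(s / hi for s, hi in zip(S, h)) / len(h)
    return v_hv, v_pd, v_apd
```

In the outer loop of the algorithm these measures would be evaluated once per candidate number of clusters. In the criteria of Gath and Geva, a small fuzzy hypervolume and large density values indicate a good partition.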

9.2 Description of Implemented Software

All algorithms developed in this thesis and the clustering algorithms used for classifier design and classification were implemented using the software package MATLAB 5.1.

Adp_fpcm_traject: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects represented by multidimensional trajectories, which uses the functional fuzzy c-means algorithm for classifier design and functional possibilistic c-means for classification and is based on the pointwise similarity measure between trajectories.

Adp_ggpos_traject: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects represented by multidimensional trajectories, which uses the modified Gath-Geva algorithm for classifier design and its possibilistic version for classification and is based on the pointwise similarity measure between trajectories.

FCM_traject: Functional fuzzy c-means algorithm for clustering multidimensional trajectories based on the pointwise similarity measure.

PCM_class_traject: Possibilistic c-means algorithm used for classification of multidimensional trajectories based on the pointwise similarity measure.

GG_traject: The modified Gath-Geva algorithm for clustering multidimensional trajectories based on the pointwise similarity measure.

GG_possclass_traject: The modified Gath-Geva algorithm for possibilistic classification of multidimensional trajectories based on the pointwise similarity measure.

Density_clusters_tra: Calculation of densities of existing clusters and the average partition density, where the pointwise similarity measure between trajectories is used to obtain the covariance matrices of clusters.

Density_free_tra: This program partitions new free objects (trajectories) into c_new clusters using functional fuzzy c-means and possibilistic c-means based on the pointwise similarity measure for trajectories. Densities of new clusters and the average partition density of free objects are calculated using the localisation procedure for detecting compact dense groups within free objects.

Point_sim: Calculation of the distance measure based on the pointwise similarity measure between trajectories; the program includes the choice between different shapes of the membership function and different types of aggregation operators.

Adp_fpcm_traject_whole: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects applied to the whole temporal history of objects. The classifier design is static and the monitoring and adaptation procedures of the algorithm are used to determine the correct number of clusters. The algorithm uses the functional fuzzy c-means algorithm for classifier design and the functional possibilistic c-means algorithm for classification, which are based on the pointwise similarity measure between trajectories.

Adp_ggpos_traject_whole: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects applied to the whole temporal history of objects. The classifier design is static and the monitoring and adaptation procedures of the algorithm are used to determine the correct number of clusters. The algorithm uses the modified Gath-Geva algorithm for classifier design and its possibilistic version for classification and is based on the pointwise similarity measure between trajectories.

Adp_fpcm_traject_str: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects represented by multidimensional trajectories, which uses the functional fuzzy c-means algorithm for classifier design and functional possibilistic c-means for classification and is based on the structural similarity measure between trajectories.

Adp_ggpos_traject_str: Algorithm for adaptive fuzzy classifier design and classification of dynamic objects represented by multidimensional trajectories, which uses the modified Gath-Geva algorithm for classifier design and its possibilistic version for classification and is based on the structural similarity measure between trajectories.

FCM_traject_str: Functional fuzzy c-means algorithm for clustering multidimensional trajectories based on the structural similarity measure.

PCM_class_traject_str: Possibilistic c-means algorithm used for classification of multidimensional trajectories based on the structural similarity measure.

GG_traject_str: The modified Gath-Geva algorithm for clustering multidimensional trajectories based on the structural similarity measure.

GG_possclass_traject_str: The modified Gath-Geva algorithm for possibilistic classification of multidimensional trajectories based on the structural similarity measure.

Density_clusters_tra_str: Calculation of densities of existing clusters and the average partition density, where the structural similarity measure between trajectories is used to obtain the covariance matrices of clusters.

Density_free_tra_str: This program partitions new free objects (trajectories) into c_new clusters using functional fuzzy c-means and possibilistic c-means based on the structural similarity measure for trajectories. Densities of new clusters and the average partition density of free objects are calculated using the localisation procedure for detecting compact dense groups within free objects.

Struct_sim: Calculation of the distance measure based on the structural similarity measure between trajectories; the program includes the choice between different shapes of the membership function and different ways of aggregating partial similarity measures.

Temp_characteristics: Calculation of 8 temporal characteristics of trajectories, which can be chosen in different combinations depending on the application.

Absorp_traject: Absorption procedure and calculation of the number of new clusters within free objects.

Merging_proced: Algorithm for detection of similar clusters and merging procedure.

Splitting_proced: Algorithm for detection of heterogeneous clusters and splitting procedure.

Curriculum Vitae

Personal Data

Name: Larisa G. Angstenberger, maiden name Mikenina
Academic degree: Dipl.-Ing., M.O.R.
Date of birth:
Place of birth: Kiev, Ukraine
Address: Geschwister-Scholl-Str. 77, 2025 Hamburg, Germany
E-mail:
Marital status: married
Citizenship: Ukrainian

Education

09/1978-06/1988 School Nr. 63, Kiev, Ukraine
09/1988-02/1994 Studies of automatic system control at the National Technical University of Ukraine (NTUU). Degree: Master of Science (Engineering).
06/1994-07/1994 Course of Business Administration of the University of Western Ontario (Canada) in Kiev.
09/1994-07/1995 PhD student at the NTUU.
10/1994-12/1994 Exchange student at the University for Engineering, Economics and Culture in Leipzig, Germany. Research topics: Fuzzy Set Theory and Fuzzy Control.
08/1995-07/1999 Scholarship from the German Academic Exchange Service (DAAD) for PhD studies in Germany.
04/1996-02/1998 Studies of Operations Research at the Aachen University of Technology (RWTH Aachen). Degree: Master of Operations Research (M.O.R.). Title of the master thesis: Application of a pattern recognition method based on the fuzzy integral to the area of acoustic quality control.
03/1998-0/2000 PhD studies at the department of Operations Research at the RWTH Aachen (Chairman: Prof. Dr. Dr. h.c. H.-J. Zimmermann).

Work Experience

06/1993-09/1993 Exchange student for practical work as a sales representative at Direct Net Telecommunications, California, USA.
10/1993-12/1993, 03/1994-05/1994 Sales representative at Direct Net Telecommunications, Moscow, Russia.
0/1995-03/1996 Guest researcher at the ELITE - Foundation (European Laboratory for Intelligent Techniques Engineering), Aachen, Germany. Research topics: Fuzzy Control and Fuzzy Data Analysis.
02/1997-12/1999 Scientific assistant at the department of Operations Research at the RWTH Aachen.
02/1997-12/1999 Member of the research project Dynamic Fuzzy Data Analysis sponsored by DFG (German Research Society). Co-operation with MIT - Management Intelligenter Technologien GmbH, Aachen.

Publications

08/1996 Fuzzy Liquid Level Control in LabVIEW. Proceedings NIWeek 96, August 6-8, 1996, Austin, Texas.
09/1996 Liquid Level Control with Fuzzy Techniques using DataEngine V.i. Proceedings of the 4th European Congress on Intelligent Techniques and Soft Computing (EUFIT 96), September 2-5, 1996, Aachen, Germany.
1998 Dynamic Data Analysis: Problem Description and Solution Approaches. In: Brauer, W. (ed.): Fuzzy-Neuro Systems 98 Computational Intelligence. Infix, Sankt Augustin, 1998.
1998 Dynamic Data Analysis: Similarity Between Trajectories. In: Brauer, W. (ed.): Fuzzy-Neuro Systems 98 Computational Intelligence. Infix, Sankt Augustin, 1998.
1998 Similarity of Functions as a Base for Dynamic Fuzzy Data Analysis: New Approaches and Variants. Proceedings of the 6th European Congress on Intelligent Techniques and Soft Computing (EUFIT 98), September 7-10, 1998, Aachen, Germany.
1999 Fuzzy Dynamic Data Analysis based on Similarity between Functions. Fuzzy Sets and Systems, Vol. 105 (1), 1999.
1999 Automatic Fault Detection in Gearboxes by Dynamic Fuzzy Data Analysis. Fuzzy Sets and Systems, Vol. 105 (1), 1999.

10/1999 Improved Feature Selection and Classification by the 2-additive Fuzzy Measure. Fuzzy Sets and Systems, Vol. 107 (2).
2000 Reduction of Complexity in Scenario Analysis by Means of Dynamic Fuzzy Data Analysis. OR Spektrum, Vol. 22 (3).

Hamburg, May 2001


More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

Hollinger Canadian Publishing Holdings Co. ( HCPH ) proceeding under the Companies Creditors Arrangement Act ( CCAA )

Hollinger Canadian Publishing Holdings Co. ( HCPH ) proceeding under the Companies Creditors Arrangement Act ( CCAA ) February 17, 2011 Andrew J. Hatnay [email protected] Dear Sr/Madam: Re: Re: Hollnger Canadan Publshng Holdngs Co. ( HCPH ) proceedng under the Companes Credtors Arrangement Act ( CCAA ) Update on CCAA Proceedngs

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES In ths chapter, we wll learn how to descrbe the relatonshp between two quanttatve varables. Remember (from Chapter 2) that the terms quanttatve varable

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa [email protected], [email protected], [email protected],

More information

8 Algorithm for Binary Searching in Trees

8 Algorithm for Binary Searching in Trees 8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

Abstract. 260 Business Intelligence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING

Abstract. 260 Business Intelligence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING 260 Busness Intellgence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING Murphy Choy Mchelle L.F. Cheong School of Informaton Systems, Sngapore

More information

MACHINE VISION SYSTEM FOR SPECULAR SURFACE INSPECTION: USE OF SIMULATION PROCESS AS A TOOL FOR DESIGN AND OPTIMIZATION

MACHINE VISION SYSTEM FOR SPECULAR SURFACE INSPECTION: USE OF SIMULATION PROCESS AS A TOOL FOR DESIGN AND OPTIMIZATION MACHINE VISION SYSTEM FOR SPECULAR SURFACE INSPECTION: USE OF SIMULATION PROCESS AS A TOOL FOR DESIGN AND OPTIMIZATION R. SEULIN, F. MERIENNE and P. GORRIA Laboratore Le2, CNRS FRE2309, EA 242, Unversté

More information

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications Methodology to Determne Relatonshps between Performance Factors n Hadoop Cloud Computng Applcatons Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng and

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng

More information

A Data Mining-Based OLAP Aggregation of. Complex Data: Application on XML Documents

A Data Mining-Based OLAP Aggregation of. Complex Data: Application on XML Documents 1 Runnng head: A DATA MINING-BASED OLAP AGGREGATION A Data Mnng-Based OLAP Aggregaton of Complex Data: Applcaton on XML Documents Radh Ben Messaoud, Omar Boussad, Sabne Loudcher Rabaséda {rbenmessaoud

More information

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and POLYSA: A Polynomal Algorthm for Non-bnary Constrant Satsfacton Problems wth and Mguel A. Saldo, Federco Barber Dpto. Sstemas Informátcos y Computacón Unversdad Poltécnca de Valenca, Camno de Vera s/n

More information

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

A powerful tool designed to enhance innovation and business performance

A powerful tool designed to enhance innovation and business performance A powerful tool desgned to enhance nnovaton and busness performance The LEGO Foundaton has taken over the responsblty for the LEGO SERIOUS PLAY method. Ths change wll help create the platform for the contnued

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information