Dynamic Fuzzy Pattern Recognition


Dynamic Fuzzy Pattern Recognition

Dissertation approved by the Faculty of Economics of the Rheinisch-Westfälische Technische Hochschule Aachen for the award of the academic degree of Doctor of Economics and Social Sciences, submitted by Diplom-Ingenieurin (UA) Larisa Angstenberger, née Mikenina, Magister des Operations Research (M.O.R.), from Kiev, Ukraine.

This dissertation is available online on the website of the university library.


Dynamic Fuzzy Pattern Recognition

Dissertation approved by the Faculty of Economics of the Rheinisch-Westfälische Technische Hochschule Aachen for the award of the academic degree of Doctor of Economics and Social Sciences, submitted by Diplom-Ingenieurin (UA) Larisa Angstenberger, née Mikenina, Magister des Operations Research (M.O.R.), from Kiev, Ukraine.

Reviewers:
Univ.-Prof. Dr. rer. pol. Dr. h.c. mult. Hans-Jürgen Zimmermann
Univ.-Prof. Dr. rer. pol. habil. Michael Bastian

Date of the oral examination: 9 May 2000

D 82 (Diss. RWTH Aachen)

This dissertation is available online on the website of the university library.


Omnia mutantur, nihil interit. (Everything is changing, nothing is getting lost.)
Philosopher Ovid, Metamorphoses 15, 165

All the real knowledge which we possess depends on methods by which we distinguish the similar from the dissimilar.
Swedish naturalist Linnaeus, 1737


Preface

In 1995 I was awarded a one-year grant from the German Academic Exchange Service (DAAD) for research in Germany during my PhD studies. I was very interested in investigating the field of fuzzy logic and fuzzy data analysis, and very happy when I received the invitation from Prof. Dr. Dr. h.c. H.-J. Zimmermann to study at his chair of Operations Research at the RWTH Aachen and to write my PhD thesis under his scientific supervision. Thanks to four years of financial support from DAAD I managed to graduate from the master's course in Operations Research and to perform the research in the area of fuzzy pattern recognition which has resulted in this thesis.

I am very grateful to Prof. Zimmermann for allowing me to share his knowledge and scientific experience, for his support and encouragement, as well as for his attention to and interest in my research and his scientific suggestions concerning the research direction, possible applications, and thesis structure. It was a great pleasure to work with Prof. Zimmermann and to be a part of the team of young researchers at the Institute of Operations Research where, thanks to his particular personality and humour, a very exhilarating and creative atmosphere had developed over the years. I am also much obliged to my co-referent Prof. Bastian for his critical remarks and suggestions for improvements to this thesis.

I would like to thank all my colleagues at the chair of Operations Research for their mutual understanding, for the exchange of knowledge and experience, and for a lot of fun. The competent advice from Dr. Uwe Bath and Dr. Tore Grünert and the technical support of Gilberto von Spar during my work on this thesis were especially important to me. I appreciate the numerous discussions about fuzzy clustering and the philosophical discussions about life with my colleague Dr. Peter Hofmeister. His steady optimism and energy helped me to stay in high spirits despite the hard work.

During my studies at the RWTH Aachen I had the opportunity to work at MIT - Management Intelligenter Technologien GmbH - where I participated in different projects in the area of intelligent data analysis. From 1997 I worked for three years on a new scientific project on dynamic fuzzy data analysis sponsored by the DFG (German Research Society). This project motivated me to investigate this new area in greater detail and resulted in a number of publications which served as a basis for my further research. Helpful discussions with my project partner Arno Joentgen positively influenced my research and brought me many new ideas. Thanks to the technical support of Sebastian Greguletz, who provided me with special software for network monitoring, it was possible to gather enough data for the technical application of the dynamic pattern recognition methods introduced in this thesis. I also want to thank one of my best friends, Jens Junker, who helped me to better understand the process of data transmission in computer networks, to identify practical goals in computer network optimisation and to interpret the results of classification.

I am very grateful to Thomas MacFarlane for his very intensive proof-reading of this manuscript, which considerably improved the language and the style of this thesis. I would like to express my great gratitude to my friend Joachim Angstenberger for his unwavering support and numerous fruitful discussions during my almost two years as a PhD student. I appreciate a lot that he dedicated so much time to reading the first and all subsequent versions of this thesis, since his valuable suggestions and corrections have significantly improved my manuscript. I am very happy that he showed so much patience and understanding for my work even though I did not have enough time on weekends.

This thesis is a result of many years of studies and research, in which I was constantly supported and motivated by my parents. They showed me the path to knowledge and always helped me to find the most efficient means to reach my goals. I hope that the results of my work meet the expectations of my parents, my friends and my teachers.

Aachen, June 2000
Larisa Mikenina

Contents

1 INTRODUCTION
  1.1 GOALS AND TASKS OF THE THESIS
  1.2 STRUCTURE OF THE THESIS
2 GENERAL FRAMEWORK OF DYNAMIC PATTERN RECOGNITION
  2.1 THE KNOWLEDGE DISCOVERY PROCESS
  2.2 THE PROBLEM OF PATTERN RECOGNITION
    The process of pattern recognition
    Classification of pattern recognition methods
    Fuzzy pattern recognition
  2.3 THE PROBLEM OF DYNAMIC PATTERN RECOGNITION
    Mathematical description and modelling of dynamic systems
    Terminology of dynamic pattern recognition
    Goals and tasks of dynamic pattern recognition
3 STAGES OF THE DYNAMIC PATTERN RECOGNITION PROCESS
  3.1 THE MONITORING PROCESS
    Shewhart quality control charts
    Fuzzy techniques for the monitoring process
    Fuzzy quality control charts
    Reject options in fuzzy pattern recognition
    Parametric concept of a membership function for a dynamic classifier
  3.2 THE ADAPTATION PROCESS
    Re-learning of the classifier
    Incremental updating of the classifier
    Adaptation of the classifier
    Learning from statistics approach
    Learning with a moving time window
    Learning with a template set
    Learning with a record of usefulness
    Evaluation of approaches for the adaptation of a classifier
4 DYNAMIC FUZZY CLASSIFIER DESIGN WITH POINT-PROTOTYPE BASED CLUSTERING ALGORITHMS
  4.1 FORMULATION OF THE PROBLEM OF DYNAMIC CLUSTERING
  4.2 REQUIREMENTS FOR A CLUSTERING ALGORITHM USED FOR DYNAMIC CLUSTERING AND CLASSIFICATION
  4.3 DETECTION OF NEW CLUSTERS
    Criteria for the detection of new clusters
    Algorithm for the detection of new clusters
  4.4 MERGING OF CLUSTERS
    Criteria for merging of clusters
    Criteria for merging of ellipsoidal clusters
    Criteria and algorithm for merging spherical and ellipsoidal clusters
  4.5 SPLITTING OF CLUSTERS
    Criteria for splitting of clusters
    Search for a characteristic pattern in the histogram
    Algorithm for the detection of heterogeneous clusters to be split
  4.6 DETECTION OF GRADUAL CHANGES IN THE CLUSTER STRUCTURE
  4.7 ADAPTATION PROCEDURE
  4.8 UPDATING THE TEMPLATE SET OF OBJECTS
    Updating the template set after gradual changes in the cluster structure
    Updating the template set after abrupt changes in the cluster structure
  4.9 CLUSTER VALIDITY MEASURES FOR DYNAMIC CLASSIFIERS
  4.10 SUMMARY OF THE ALGORITHM FOR DYNAMIC FUZZY CLASSIFIER DESIGN AND CLASSIFICATION
5 SIMILARITY CONCEPTS FOR DYNAMIC OBJECTS IN PATTERN RECOGNITION
  5.1 EXTRACTION OF CHARACTERISTIC VALUES FROM TRAJECTORIES
  5.2 THE SIMILARITY NOTION FOR TRAJECTORIES
    Pointwise similarity measures
    Choice of the membership function for the definition of pointwise similarity
    Choice of the aggregation operator for the definition of pointwise similarity
    Structural similarity measures
    Similarity model using transformation functions
    Similarity measures based on wavelet decomposition
    Statistical measures of similarity
    Smoothing of trajectories before the analysis of their temporal behaviour
    Similarity measures based on characteristics of trajectories
  5.3 EXTENSION OF FUZZY PATTERN RECOGNITION METHODS BY APPLYING SIMILARITY MEASURES FOR TRAJECTORIES
6 APPLICATIONS OF DYNAMIC PATTERN RECOGNITION METHODS
  6.1 BANK CUSTOMER SEGMENTATION BASED ON CUSTOMER BEHAVIOUR
    Description of the credit data of bank customers
    Goals of bank customer analysis
    Parameter settings for dynamic classifier design and bank customer classification
    Clustering of bank customers in Group Y based on the whole temporal history of 24 months and using the pointwise similarity measure
    Clustering of bank customers in Group N based on the whole temporal history of 24 months and using the pointwise similarity measure
    Segmentation of bank customers in Group Y based on the partial temporal history and using the pointwise similarity measure
    Clustering of bank customers in Group N based on partial temporal history and using the pointwise similarity measure
    Comparison of clustering results for customers in Groups Y and N
  6.2 COMPUTER NETWORK OPTIMISATION BASED ON DYNAMIC NETWORK LOAD CLASSIFICATION
    Data transmission in computer networks
    Data acquisition and pre-processing for network analysis
    Goals of the analysis of load in a computer network
    Parameter settings for dynamic classifier design and classification of data traffic
    Recognition of typical load states in a computer network using the pointwise similarity measure
    Recognition of typical load states in computer network using the structural similarity measure
7 CONCLUSIONS AND FURTHER RESEARCH DIRECTIONS
REFERENCES
APPENDIX
  UNSUPERVISED OPTIMAL FUZZY CLUSTERING ALGORITHM OF GATH AND GEVA
  DESCRIPTION OF IMPLEMENTED SOFTWARE


List of Figures

Figure 2-1: An overview of the steps of the KDD process
Figure 2-2: Basic scheme of the pattern recognition process
Figure 2-3: A taxonomy of pattern recognition methods
Figure 2-4: Current states of dynamic objects from a static viewpoint
Figure 2-5: Projections of three-dimensional trajectories into two-dimensional feature space
Figure 2-6: Formation of new clusters
Figure 2-7: Changes of the dynamic cluster structure
Figure 2-8: Typical scenarios of future temporal development of oil price
Figure 2-9: Changing clusters of typical system states
Figure 2-10: Moving time windows of constant length
Figure 2-11: The process of dynamic pattern recognition
Figure 3-1: Shewhart quality control chart for the characteristic f
Figure 3-2: Fuzzy set "good quality" defined for characteristic f
Figure 3-3: Fuzzy quality control chart
Figure 3-4: Processing of fuzzy quality control charts
Figure 3-5: Ambiguity reject (AR) and distance reject (DR) options in pattern recognition
Figure 3-6: A parametric membership function
Figure 3-7: Adaptation of the classifier based on learning from statistics approach
Figure 3-8: Adaptation of the classifier based on learning with a template set approach
Figure 3-9: Adaptation of the classifier based on the record of usefulness
Figure 4-1: 3-dimensional matrix representation of a dynamic data set
Figure 4-2: Flow chart of an algorithm for the detection of new clusters
Figure 4-3: Two clusters with absorbed objects (circles) and a group of free objects (crosses)
Figure 4-4: Projections of the membership functions obtained for a group of free objects on the feature space
Figure 4-5: Detection of stray objects
Figure 4-6: Merging of fuzzy clusters based on their intersection
Figure 4-7: Merging of fuzzy clusters based on the degree of overlapping of α-cuts
Figure 4-8: Merging of fuzzy clusters based on their standard deviation
Figure 4-9: Illustration of criteria for merging ellipsoidal clusters
Figure 4-10: Membership functions of fuzzy sets "close to 1" and "close to zero"
Figure 4-11: Application of the merging condition in order to avoid impermissible merging
Figure 4-12: The relevance of criterion of parallelism for merging clusters
Figure 4-13: Situations in which elliptical clusters can be merged
Figure 4-14: Situations in which elliptical clusters cannot be merged
Figure 4-15: A flow chart of an algorithm for detecting similar clusters to be merged
Figure 4-16: A heterogeneous cluster in the two-dimensional feature space
Figure 4-17: A heterogeneous cluster in the space of the two first principal components
Figure 4-18: Histogram of objects density with respect to the first principal component
Figure 4-19: Illustration of thresholds of size and distance between centres of density areas
Figure 4-20: Illustration of the search procedure for a density histogram
Figure 4-21: Examples of patterns in the histogram
Figure 4-22: Verification of criteria for splitting a cluster based on the density histogram
Figure 4-23: The effect of smoothing the density histogram
Figure 4-24: A flow chart of an algorithm for detecting heterogeneous clusters to be split
Figure 4-25: The structure of the template set
Figure 4-26: Illustration of the updating procedure of the template set
Figure 4-27: Adaptation of a classifier requires splitting clusters
Figure 4-28: Adaptation of a classifier requires cluster merging
Figure 4-29: The process of dynamic fuzzy classifier design and classification
Figure 5-1: Transformation of a feature vector containing trajectories into a conventional feature vector
Figure 5-2: Illustration of the definition of pointwise similarity between trajectories
Figure 5-3: Triangular and trapezoidal membership functions
Figure 5-4: Exponential and non-linear membership functions
Figure 5-5: Logistic S-shaped membership function
Figure 5-6: An example of two trajectories x(t) and y(t)
Figure 5-7: The sequence of differences between trajectories x(t) and y(t)
Figure 5-8: The sequence of pointwise similarities of trajectories x(t) and y(t)
Figure 5-9: A trajectory x(t) before and after smoothing
Figure 5-10: Structural similarity based on the trend of trajectories
Figure 5-11: Structural similarity based on curvature of trajectories
Figure 5-12: Structural similarity based on the smoothness of trajectories
Figure 5-13: Segmentation of a temporal pattern of a trajectory according to elementary trends
Figure 5-14: Qualitative (left) and quantitative (right) temporal features obtained by segmentation
Figure 5-15: Similarity measure based on peaks
Figure 5-16: The structure of dynamic clustering algorithms based on similarity measures for trajectories
Figure 6-1: Distribution of data with respect to Feature
Figure 6-2: Cluster centres with respect to each feature obtained for customers of Group Y based on the whole temporal history and pointwise similarity between trajectories
Figure 6-3: Degrees of separation between clusters obtained for customers in Group Y based on the whole temporal history
Figure 6-4: Degrees of compactness of clusters obtained for customers in Group Y based on the whole temporal history
Figure 6-5: Cluster centres with respect to each feature obtained for customers in Group N based on the whole temporal history and pointwise similarity between trajectories
Figure 6-6: Degrees of separation between clusters obtained for each customer in Group N based on the whole temporal history
Figure 6-7: Degrees of compactness of clusters for each customer in Group N based on the whole temporal history
Figure 6-8: Cluster centres with respect to each feature obtained for customers in Group Y in the first time window and based on pointwise similarity between trajectories
Figure 6-9: Degrees of separation between clusters obtained for customers in Group Y in the first time window
Figure 6-10: Degrees of compactness of clusters calculated for customers in Group Y in the first time window
Figure 6-11: Cluster centres with respect to each feature obtained for customers in Group N in the first time window and based on pointwise similarity between trajectories
Figure 6-12: Degrees of separation between clusters obtained for customers in Group N in the first time window
Figure 6-13: Degree of compactness of clusters calculated for customers in Group N in the first time window
Figure 6-14: The OSI basic reference model
Figure 6-15: Dependence between the number of collisions and the network load
Figure 6-16: Density distributions of features (left) and 6 (right)
Figure 6-17: Six typical states of the data traffic described by six packet sizes and obtained using the pointwise similarity measure
Figure 6-18: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the pointwise similarity measure
Figure 6-19: Temporal development of average partition density obtained using the pointwise similarity measure
Figure 6-20: Assignment of parts of trajectories to six clusters representing data traffic states and obtained based on the pointwise similarity measure
Figure 6-21: Temporal development of fuzzy separation and fuzzy compactness indexes of fuzzy partitions obtained using the structural similarity measure
Figure 6-22: Temporal development of average partition density obtained using the structural similarity measure
Figure 6-23: Assignment of parts of trajectories to six clusters obtained based on the structural similarity
Figure 6-24: Assignment of a part of trajectory from the time interval [6200, 8000] to clusters obtained using the structural similarity measure

List of Tables

Table 4-1: Parameter settings for the detection of new clusters in Example
Table 4-2: Partition densities of new and existing clusters in Example
Table 4-3: Partition densities of new and existing clusters in Example
Table 4-4: Validity measures for a fuzzy partition before and after splitting clusters
Table 4-5: Validity measures for a fuzzy partition before and after merging clusters
Table 5-1: The overall similarity obtained with different aggregation operators
Table 6-1: Dynamic features describing bank customers
Table 6-2: The value range and main quantiles of each feature of Data Group Y
Table 6-3: Main statistics of each feature of the Data Group Y
Table 6-4: The value range and main quantiles of each feature of Data Group N
Table 6-5: Main statistics of each feature of Data Group N
Table 6-6: Scope of the analysis of bank customers
Table 6-7: Parameter settings for the detection of new clusters during customer segmentation
Table 6-8: Number of absorbed and free objects of Data Group Y
Table 6-9: Validity measures for fuzzy partition with two clusters for Data Group Y
Table 6-10: Number of stray objects and validity measures for different fuzzy partitions of Data Group Y
Table 6-11: The number of absorbed and free objects of Data Group N
Table 6-12: Validity measures for fuzzy partition with two clusters for Data Group N
Table 6-13: Number of stray objects and validity measures for different fuzzy partitions of Data Group N
Table 6-14: Number of Customers Y assigned to two clusters in four time windows
Table 6-15: Partition densities of clusters, fuzzy separation and compactness indexes obtained for Customers Y in four time windows
Table 6-16: Temporal change of assignment of customers in Group Y to clusters
Table 6-17: Number of customers N assigned to two clusters in four time windows
Table 6-18: Partition densities of clusters, fuzzy separation and compactness indexes obtained for customers in Group N in four time windows
Table 6-19: Temporal changes of assignment of customers in Group N to clusters
Table 6-20: Dynamic features describing data transmission in computer network
Table 6-21: The value ranges and main quantiles of each feature characterising network data
Table 6-22: Main statistics of each feature of the network data
Table 6-23: Parameter settings for three algorithms of the monitoring procedure used during the network analysis
Table 6-24: Results of dynamic clustering and classification of the network data traffic based on the pointwise similarity measure
Table 6-25: Results of dynamic clustering and classification of the network data traffic based on the structural similarity measure
Table 6-26: Cluster centres representing data traffic states obtained based on the structural similarity measure

1 Introduction

The phenomenal improvements in data collection due to the automation and computerisation of many operational systems and processes in business, technical and scientific environments, as well as advances in data storage technologies, have over the last decade led to large amounts of data being stored in databases. Analysing and extracting valuable information from these data has become an important issue in recent research and has attracted the attention of all kinds of companies. The use of data mining and data analysis techniques has been recognised as necessary to maintain competitiveness in today's business world, to increase business opportunities and to improve service. A data mining endeavour can be defined as the process of discovering meaningful new correlations, patterns and trends by examining large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques. Pattern recognition is the research area which provides the majority of methods for data mining and aims at supporting humans in analysing complex data structures automatically. Pattern recognition systems can act as substitutes when human experts are scarce in specialised areas such as medical diagnosis, or in dangerous situations such as fault detection and automatic error discovery in nuclear power plants. Automated pattern recognition can provide valuable support in process and quality control and function continuously with consistent performance. Finally, automated perceptual tasks such as speech and image recognition enable the development of more natural and convenient human-computer interfaces.

Important additional benefits for a wide field of highly complex applications lie in the use of intelligent techniques such as fuzzy logic and neural networks in pattern recognition methods. These techniques have permitted the development of methods and algorithms that can perform tasks normally associated with intelligent human behaviour. For instance, the primary advantage of fuzzy pattern recognition compared to the classical methods is the ability of a system to classify patterns in a non-dichotomous way, as humans do, and to handle vague information. Methods of fuzzy pattern recognition are constantly gaining ground in practice. Some of the fields where intelligent pattern recognition has obtained the greatest level of endorsement and success in recent times are database marketing, risk management and credit-card fraud detection.

The advent of data warehousing technology has provided companies with the possibility to gather vast amounts of historical data describing the temporal behaviour of a system under study and allows a new type of analysis for improved decision support. Amongst other applications in which objects must be analysed in the process of their motion or temporal development, the following are worth mentioning:

- Monitoring of patients in medicine, e.g. during narcosis, when the development rather than the status of the patient's condition is essential;
- The analysis of data concerning buyers of new cars or other articles in order to determine customer portfolios;
- The analysis of monthly unemployment rates;
- The analysis of the development of share prices and other characteristics to predict stock markets;
- The analysis of the payment behaviour of bank customers to distinguish between good and bad customers and to detect fraud;
- Technical diagnosis and state-dependent machine maintenance.

The common characteristic of all these applications is that in the course of time the objects under study change their states from one to another. The order of state changes, or just the collection of states an object has taken, determines the membership of an object to a certain pattern or class. In other words, the history of temporal development of an object has a strong effect on the result of the recognition process. Such objects, which represent observations of a dynamic system or process and contain a history of their temporal development, are called dynamic. In contrast to static objects, they are represented by a sequence of numerical vectors collected over time. Conventional methods of statistical and intelligent pattern recognition are, however, of limited benefit in problems in which a dynamic viewpoint is desirable, since they consider objects at a fixed moment of time and do not take into account their temporal behaviour. Therefore there is an urgent need for a new generation of computational techniques and tools to assist humans in extracting knowledge from temporal historical data. The development of such techniques and methods constitutes the focus of this thesis.

1.1 Goals and Tasks of the Thesis

Dynamic pattern recognition is concerned with the recognition of clusters of dynamic objects, i.e. recognition of typical states in the dynamic behaviour of a system under consideration.
The goal of this thesis is to investigate this new field of dynamic pattern recognition and to develop new methods for clustering and classification in a dynamic environment. Due to the changing properties of dynamic objects, the partitioning of objects, i.e. the cluster structure, is not necessarily constant over time. The appearance of new observations can lead to gradual or abrupt changes in the cluster structure such as, for instance, the formation of new clusters, or the merging or splitting of existing clusters. In order to follow temporal changes of the cluster structure and to preserve the desired performance, a classifier must possess adaptive capabilities, i.e. a classifier must be automatically adjusted over time according to detected changes in the data structure. Therefore, the development of methods for dynamic pattern recognition consists of two tasks:
- Development of a method for dynamic classifier design enabling the design of an adaptive classifier that can automatically recognise new cluster structures as time passes;
- Development of new similarity measures for trajectories of dynamic objects.

The procedure of dynamic classifier design must incorporate, in addition to the usual static steps, special procedures that allow the application of a classifier in a dynamic environment. These procedures, which represent the dynamic steps, are concerned with detecting temporal changes in the cluster structure and updating the classifier according to these changes. In order to carry out these steps, the design procedure must have at its disposal a monitoring procedure, which supervises the classifier performance, and a mechanism for updating the classifier depending on the results of the monitoring procedure. Different methods suggested in the literature for establishing the monitoring process are based on the observation and analysis of some characteristic values describing the performance of a classifier. If the classifier performance deteriorates, they are able to detect changes but cannot recognise what kind of temporal changes have taken place. Most of the updating procedures proposed for dynamic classifiers are based either on recursive updating of classifier parameters or on re-learning from scratch using new objects, whereas the old objects are discarded as being irrelevant.

In this thesis a new algorithm for dynamic fuzzy classifier design is proposed, which is based partly on the ideas presented in the literature but also uses a number of novel criteria to establish the monitoring procedure. The proposed algorithm is introduced in the framework of unsupervised learning and allows the design of an adaptive classifier capable of automatically recognising gradual and abrupt changes in the cluster structure as time passes and of adjusting its structure to detected changes. The adaptation laws for updating the classifier and the template data set are coupled with the results of the monitoring procedure and characterised by additional features that should guarantee a more reliable and efficient classifier.

Another important problem arising in the context of dynamic pattern recognition is the choice of a relevant similarity measure for dynamic objects, which is used, for instance, for the definition of the clustering criterion. Most pattern recognition methods use the pairwise distance between objects as a dissimilarity measure from which the degree of membership of objects to cluster prototypes is calculated. As was mentioned above, dynamic objects are represented by a temporal sequence of observations and described by multidimensional trajectories in the feature space, or vector-valued functions. Since the distance between vector-valued functions is not defined, classical clustering methods, and pattern recognition methods in general, are not suited for processing dynamic objects.
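As a rough illustration of this representation, a dynamic object can be stored as a list of feature vectors, one per time step. The sketch below is not taken from the thesis; it shows one possible way to obtain a scalar similarity between two such trajectories by comparing them time step by time step and aggregating the results. The function name `pointwise_similarity`, the exponential mapping from distance to similarity, and the `scale` and `aggregate` parameters are all assumptions made for this example.

```python
import math


def pointwise_similarity(x, y, scale=1.0, aggregate="mean"):
    """Similarity in (0, 1] between two equally long trajectories.

    x, y: sequences of feature vectors, one vector per time step.
    scale, aggregate: illustrative parameters (assumptions of this sketch).
    """
    if len(x) != len(y) or not x:
        raise ValueError("trajectories must be non-empty and equally long")
    sims = []
    for xt, yt in zip(x, y):
        d = math.dist(xt, yt)              # Euclidean distance at one time step
        sims.append(math.exp(-d / scale))  # map the distance to a similarity degree
    if aggregate == "mean":
        return sum(sims) / len(sims)       # every time step contributes equally
    if aggregate == "min":
        return min(sims)                   # the worst-matching time step dominates
    raise ValueError("aggregate must be 'mean' or 'min'")


# Three short two-dimensional trajectories (hypothetical data).
a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
c = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]  # same shape as a, shifted in the second feature

print(pointwise_similarity(a, b))  # identical trajectories give 1.0
print(pointwise_similarity(a, c))  # the shifted trajectory scores lower
```

Note that trajectory `c` has exactly the same form as `a` and still receives a lower score, because this sketch only measures how closely the values match at each time step. Whether that behaviour is wanted depends on the meaning of similarity in the application at hand.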

One approach used in most applications for handling dynamic objects in pattern recognition is to transform trajectories into conventional feature vectors during pre-processing. The alternative approach addressed in this thesis requires the definition of a similarity measure for trajectories that takes into account the dynamic behaviour of trajectories. For this approach it is important to determine a specific criterion for similarity. Depending on the application this may require either the best match of trajectories by minimising the pointwise distance or a similar form of trajectories independent of their relative location to each other. The similarity measure for trajectories can be applied instead of the distance measure to modify classical pattern recognition methods.

The combination of a new method for dynamic classifier design, which can be applied and modified for different types of classifiers, with a set of similarity measures for trajectories leads to a new class of methods for dynamic pattern recognition.

1.2 Structure of the Thesis

This thesis is organised as follows: In Chapter 2, a general view of the pattern recognition problem is given. Starting with the main principles of the knowledge discovery process, the role of pattern recognition in this process will be discussed. This will be followed by the formulation of the classical (static) problem of pattern recognition and the classification of methods in this area with respect to different criteria. Particular attention will be given to fuzzy techniques, which will constitute the focus of this thesis. Since dynamic pattern recognition represents a relatively new research area and a standard terminology does not yet exist, the main notions and definitions used in this thesis will be introduced and the main problems and tasks of dynamic pattern recognition will be considered.

Chapter 3 provides an overview of techniques that can be used as components for designing dynamic pattern recognition systems. The advantages and drawbacks of different statistical and fuzzy approaches for the monitoring procedure will be discussed, followed by the analysis of updating strategies for a dynamic classifier. It will be shown that the classifier design cannot be separated temporally from the phase of its application to the classification of new objects and is carried out in a closed learning-and-working cycle. The adaptive capacity of a classifier depends crucially on the chosen updating strategy and on the ability of the monitoring procedure to detect temporal changes.

Based on the results of Chapters 2 and 3, a new method for dynamic fuzzy classifier design will be developed in Chapter 4. The design procedure consists of three main components: the monitoring procedure, the adaptation procedure for the classifier and the adaptation procedure for the training data set used to learn a classifier. New heuristic algorithms proposed for the monitoring procedure in the framework of unsupervised learning facilitate the recognition of gradual and abrupt temporal changes in the cluster structure based on the analysis of membership functions of fuzzy clusters and the density of objects within clusters. The adaptation law of the classifier is a flexible combination of two updating strategies, each depending on the result of the monitoring procedure, and provides a mechanism to adjust parameters of the classifier to detected changes in the course of time. The efficiency of the dynamic classifier is guaranteed by a set of validity measures controlling the adaptation procedure.

The problem of handling dynamic objects in pattern recognition is addressed in Chapter 5. After considering different types of similarity and introducing different similarity models for trajectories, a number of definitions of specific similarity measures will be proposed. They are based on the set of characteristics describing the temporal behaviour of trajectories and representing different context-dependent meanings of similarity.

The efficiency of the proposed method for dynamic classifier design, combined with new similarity measures for trajectories, is examined in Chapter 6 using two application examples based on real data. The first application is concerned with load optimisation in a computer network based on on-line monitoring and dynamic recognition of current load states. The second application, from the credit industry, concerns the problem of segmentation of bank customers based on their behavioural data. The analysis allows one to recognise tendencies and temporal changes in the customer structure and to follow transitions of single customers between segments.

Finally, Chapter 7 summarises the results and their practical implications and outlines new directions for future research.


2 General Framework of Dynamic Pattern Recognition

Most methods of pattern recognition consider objects at a fixed moment in time without taking into account their temporal development. However, there are many applications in which the order of state changes of an object over time determines its membership to a certain pattern, or class. In these cases, for the correct recognition of objects it is very important not only to consider properties of objects at a certain moment in time but also to analyse properties characterising their temporal development. This means that the history of temporal development of an object has a strong effect on the result of the recognition process. Classical methods of pattern recognition are not suitable for processing objects described by temporal sequences of observations. In order to deal with problems in which a dynamic viewpoint is desirable, methods of dynamic pattern recognition must be applied.

The field of pattern recognition is a rapidly growing research area within the broader field of machine intelligence. The increasing scientific interest in this area and the numerous efforts at solving pattern recognition problems are motivated by the challenge of this problem and its potential applications. The primary intention of pattern recognition is to automatically assist humans in analysing the vast amount of available data and extracting useful knowledge from it. In order to understand the mechanism of extracting knowledge from data and the role of pattern recognition in fulfilling this task, the main principles of the knowledge discovery process will be described in Section 2.1. Then the problem of classical (static) pattern recognition will be formulated in Section 2.2, which should provide a general framework for the investigations in this thesis. Since the main topic of this thesis centres on the relatively new area of dynamic pattern recognition, the main notions and definitions used in this area will be presented in Section 2.3. Finally, the goal and basic steps of the dynamic pattern recognition process will be summarised.
2.1 The Knowledge Discovery Process

The task of finding useful patterns in raw data is known in the literature under various names including knowledge discovery in databases, data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. The term knowledge discovery in databases (KDD), which appeared in 1989, refers to the process of finding knowledge in data by applying particular data mining methods at a high level. In the literature KDD is often used as a synonym of data mining since the goal of both processes is to mine for pieces of knowledge in data. In [Fayyad et al., 1996, p. 2] it is, however, argued that KDD is related to the overall process of discovering useful knowledge from data while data mining is concerned with the application of algorithms for extracting patterns from data without the additional steps of the KDD process, such as considering relevant prior knowledge and a proper interpretation of the results.

The burgeoning interest in this research area is due to a rapid growth of many scientific, business and industrial databases in the last decade. Advances in data collection in science and industry, the widespread usage of bar codes for almost all commercial products, and the computerisation of many business and government transactions have produced a flood of data which has been transformed into mountains of stored data using modern data storage technologies. According to [Frawley et al., 1992], the amount of information in the world doubles every 20 months. The desire to discover important knowledge in the vast amount of existing data makes it necessary to look for a new generation of techniques and tools with the ability to intelligently and automatically assist humans in analysing the mountains of data for nuggets of useful knowledge [Fayyad et al., 1996, p. 2]. These techniques and tools are the object of the field of knowledge discovery in databases.

In [Frawley et al., 1991] the following definition of the KDD process is proposed: "Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data." The interpretation of the different terms in this definition is as follows.

Process: KDD represents a multi-step process, which involves data preparation, search for patterns, knowledge evaluation and allows return to previous steps for refinement of results. It is assumed that this process has some degree of search autonomy, i.e. it can investigate by itself complex dependencies in data and present only interesting results to the user. Thus, the process is considered to be non-trivial.

Validity: The discovered patterns should be valid not only on the given data set, but on new data with some degree of certainty. For the evaluation of certainty, a certainty measure function can be applied.
Novelty: The discovered patterns should be new (at least to the system). A degree of novelty can be measured based on changes in data or knowledge by comparing current values or a new finding to previous ones.

Potentially useful: The discovered patterns should be potentially usable and relevant to a concrete application problem and should lead to some useful actions. A degree of usefulness can be measured by some utility function.

Ultimately understandable: The patterns should be easily understandable to humans. For this purpose, they should be formulated in an understandable language or represented graphically. In order to estimate this property, different simplicity measures can be used which take into account either the size of patterns (syntactic measures) or the meaning of patterns (semantic measures).

In order to evaluate an overall measure of pattern value combining validity, novelty, usefulness and simplicity, a function of significance is usually defined. If the value of significance for a discovered pattern exceeds a user-specified threshold, then this pattern can be considered as knowledge by the KDD process.

Figure 2-1: An overview of the steps of the KDD process [Fayyad et al., 1996, p. 10] (Data, Selection, Target data, Preprocessing, Preprocessed data, Transformation, Transformed data, Data Mining, Patterns, Interpretation/Evaluation, Knowledge)

Figure 2-1 provides an overview of the main steps/stages of the KDD process emphasising its iterative and interactive nature [Fayyad et al., 1996, p. 10]. The prerequisite of the KDD process is an understanding of the application domain, the relevant prior knowledge, and the goals of the user. The process starts with the raw data and finishes with the extracted knowledge acquired during the following six stages:

1. Selection: Selecting a target data set (according to some criteria) on which discovery will be performed.
2. Pre-processing: Applying basic operations of data cleaning including the removal of noise, outliers and irrelevant information, deciding on strategies for handling missing data values, and analysing information contained in time series. If the data has been drawn from different sources and has inconsistent formats, the data is reconfigured to a consistent format.

3. Transformation: Selecting relevant features to represent data using dimensionality reduction, projection or transformation techniques. The data are reduced to a smaller set of representative usable data.
4. Data Mining: Extracting patterns from data in a particular representational form relating to the chosen data mining algorithm. At first, the task of data mining such as classification, clustering, regression etc. is defined according to the goal of the KDD process. Then, the appropriate data mining algorithm(s) is (are) selected. Finally, the chosen algorithm(s) is (are) applied to the data to find patterns of interest.
5. Interpretation/evaluation: Translating discovered patterns into knowledge that can be used to support the human decision-making process. If the discovered patterns do not satisfy the goals of the KDD process, it may be necessary to return to any of the previous steps for further iteration.
6. Consolidating discovered knowledge: Incorporating knowledge into the performance system or reporting it to the end-user. This step also includes checking for potential conflicts with previously believed or extracted knowledge.

As can be seen, the KDD process may involve a significant number of iterations and contain loops between any two steps. Although the KDD process must be autonomous, a key role of interactions between a human and the data during the discovery process must be emphasised. A prerequisite of a successful discovery process is that the human user is ultimately involved in many, if not all, steps of the process. In [Brachman, Anand, 1996] the exact nature of these interactions between a human and the data is investigated from a practical point of view. It is important to note that Step 4 of the flow chart, depicted in Figure 2-1, is concerned with the application of data mining algorithms to a concrete problem.
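The chain of stages listed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the record layout, the function names, the toy "pattern" (a mean value per group) and the significance function are hypothetical and are not taken from the cited literature.

```python
from statistics import mean

def select(records, wanted_source):
    """Stage 1: keep only records that satisfy a selection criterion."""
    return [r for r in records if r["source"] == wanted_source]

def preprocess(records, low, high):
    """Stage 2: drop missing values and values outside [low, high]."""
    return [r for r in records
            if r["value"] is not None and low <= r["value"] <= high]

def transform(records):
    """Stage 3: reduce each record to the features used for mining."""
    return [(r["group"], r["value"]) for r in records]

def mine(features):
    """Stage 4: a toy pattern, the mean value of each group."""
    groups = {}
    for group, value in features:
        groups.setdefault(group, []).append(value)
    return {group: mean(values) for group, values in groups.items()}

def evaluate(patterns, significance, threshold):
    """Stage 5: keep patterns whose significance exceeds the threshold."""
    return {g: m for g, m in patterns.items() if significance(g, m) > threshold}

def kdd(records):
    """Run stages 1 to 5; the returned dict stands in for stage 6 (reporting)."""
    target = select(records, wanted_source="sensor")
    clean = preprocess(target, low=0.0, high=100.0)
    features = transform(clean)
    patterns = mine(features)
    # Hypothetical significance: distance of a group mean from a reference level of 50.
    return evaluate(patterns, lambda g, m: abs(m - 50.0), threshold=10.0)

records = [
    {"source": "sensor", "group": "a", "value": 20.0},
    {"source": "sensor", "group": "a", "value": 30.0},
    {"source": "sensor", "group": "b", "value": 48.0},
    {"source": "sensor", "group": "b", "value": 52.0},
    {"source": "sensor", "group": "b", "value": None},   # missing value
    {"source": "sensor", "group": "a", "value": 900.0},  # outlier
    {"source": "manual", "group": "a", "value": 99.0},   # not selected
]
knowledge = kdd(records)  # {'a': 25.0}
```

In a real KDD process the stages would not run once in a straight line: an unsatisfactory result in the evaluation stage would send the analyst back to an earlier stage with changed criteria, which this sketch leaves out.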
When data mining or data analysis is viewed as a process, however, it usually involves the same steps described above, emphasising the need for pre-processing and transformation procedures in order to obtain successful results from the analysis. This understanding explains the fact that data mining, data analysis and knowledge discovery processes are often used as equivalent notions in the literature ([Angstenberger, 1997], [Dilly, 1995], [Petrak, 1997]).

KDD overlaps with various research areas including machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, and data visualisation and uses methods and techniques from these diverse fields [Fayyad et al., 1996, p. 4-5]. For instance, the common goals of machine learning, pattern recognition and KDD lie in the development of algorithms for extracting patterns and models from data (data mining methods). But KDD is additionally concerned with the extension of these algorithms to problems with very large real-world databases while machine learning typically works with smaller data sets. Thus, KDD can be viewed as part of the broader fields of machine learning and pattern recognition, which include not only learning from examples but also reinforcement learning, learning with teacher, etc. [Dilly, 1995]. KDD often makes use of statistics, particularly exploratory data analysis, for modelling data and handling noisy data. In this thesis the attention will be focussed on pattern recognition which represents one of the largest fields in KDD and provides the large majority of methods and techniques for data mining.

2.2 The Problem of Pattern Recognition

The concept of patterns has a universal importance in intelligence and discovery. Most instances of the world are represented as patterns containing knowledge, if only one could discover it. Pattern recognition theory involves learning similarities and differences of patterns that are abstractions of objects in a population of non-identical objects. The associations and relationships between objects make it possible to discover patterns in data and to build up knowledge. The most frequently observed types of patterns are as follows: (rule-based) relationships between objects, temporal sequences, spatial patterns, groups of similar objects, mathematical laws, deviations from statistical distributions, exceptions and striking objects [Petrak, 1997].

A human's perceptive power seems to be well adapted to the pattern-processing task. Humans are able to recognise printed characters and words as well as handwritten characters, speech utterances, favourite melodies, the faces of friends in a crowd, different types of weave, scenes in images, contextual meanings of word phrases, and so forth. Humans are also capable of retrieving information on the basis of associated clues including only a part of the pattern. Humans learn from experience by accumulating rules in various forms such as associations, tables, relationships, inequalities, equations, data structures, logical implications, and so on. A desire to understand the basis for these powers in humans is the reason for the growing interest in investigating the pattern recognition process.
The subject area of pattern recognition belongs to the broader field of machine learning, whose primary task is the study of how to make machines learn and reason as humans do in order to make decisions [Looney, 1997, p. 4]. In this context learning refers to the construction of rules based on observations of environmental states and transitions. Machine learning algorithms examine the input data set with its accompanying information and the results of the learning process given in form of statements, and learn to reproduce these and to make generalisations about new observations. Ever since computers were first designed the intention of researchers has been that of making them intelligent and giving them the same information-processing capabilities that humans possess. This would make computers more efficient in handling real world tasks and would make them more compatible with the way in which humans behave.

In the following section the pattern recognition process will be considered and its main steps examined. This will be followed by a classification of pattern recognition methods regarding different criteria. Finally, the characteristics of a special class of fuzzy pattern recognition methods will be discussed along with the advantages that they afford dynamic pattern recognition.

2.2.1 The process of pattern recognition

Pattern recognition is one of the research areas that tries to explore mathematical and technical aspects of perception, that is, a human's ability to receive, evaluate, and interpret information regarding his/her environment, and to support humans in carrying out this task automatically. The goal of pattern recognition is to classify objects of interest into one of a number of categories or classes [Therrien, 1989, p. 1]. This process can be viewed as a mapping of an object from the observation space into the class-membership space [Zadeh, 1977] or a search for structure in data [Bezdek, 1981, p. 1]. Objects of interest may be any physical process or phenomenon. The basic scheme of the pattern recognition process is illustrated in Figure 2-2.

Figure 2-2: Basic scheme of the pattern recognition process [adapted from Therrien, 1989, p. 2] (Pattern recognition system: observation vector y in the observation space, feature extraction, feature vector x in the feature space, classifier, output decision for one of the classes in the decision space)

Information about the object coming from different measurement devices is summarised in the observation vector. The observation space is usually of a high dimension and is transformed into a feature space. The purpose of this transformation is to extract the smallest possible set of distinguishing features that lead to the best possible classification results. In other words it is advantageous to select features in such a way that feature vectors, or patterns, belonging to different classes occupy different regions of the feature space [Jain, 1986, p. 2]. The resulting feature space is of a much lower dimension than the observation space.
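The scheme of Figure 2-2 can be sketched in a few lines of code. This is only an illustration under stated assumptions: feature extraction is reduced to picking components of the observation vector, the classifier is a nearest-prototype rule, and the names SELECTED and PROTOTYPES as well as all numbers are hypothetical.

```python
import math

def extract_features(observation, selected):
    """Feature extraction: map observation vector y to feature vector x
    by keeping only the selected components."""
    return [observation[i] for i in selected]

def classify(x, prototypes):
    """Classifier: assign x to the class whose prototype is nearest
    in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))

# Hypothetical 4-dimensional observation space reduced to 2 features.
SELECTED = [0, 2]
# Hypothetical class prototypes in the 2-dimensional feature space.
PROTOTYPES = {"low": [1.0, 1.0], "high": [5.0, 5.0]}

def recognise(observation):
    """Observation space -> feature space -> decision space."""
    x = extract_features(observation, SELECTED)
    return classify(x, PROTOTYPES)

decision = recognise([4.6, 0.3, 5.2, 7.7])  # 'high'
```

The two prototypes lie far apart in the feature space, which is the situation the text calls advantageous: feature vectors of different classes occupy different regions, so even a very simple decision rule separates them.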
A procedure of selecting a set of sufficient features from a set of available features is called feature extraction. It may be based on intuition or knowledge of the physical characteristics of the problem or it may be a mathematical technique that reduces the dimensionality of the observation space in a prescribed way.

The next step is a transformation of the feature space into a decision space, which is defined by a (finite) set of classes. A classifier, which is a device or algorithm, generates a partitioning of the feature space into a number of decision regions. The classifier is designed either using some set of objects, the individual classes of which are already known, or by learning classes based on similarities between objects. Once the classifier is designed and a desired level of performance is achieved, it can be used to classify new objects. This means that the classifier assigns every feature vector in the feature space to a class in the decision space.

Depending on the information available for classifier design, one can distinguish between supervised and unsupervised pattern recognition ([Therrien, 1989, p. 2], [Jain, 1986, p. 6-7]). In the first case there exists a set of labelled objects with a known class membership. A part of this set is extracted and used to derive a classifier. These objects build the training set. The remaining objects, whose correct class assignments are also known, are referred to as the test set and used to validate the classifier's performance. Based on the test results, suitable modifications of the classifier's parameters can be carried out. Thus, the goal of supervised learning, also called classification, is to find the underlying structure in the training set and to learn a set of rules that allows the classification of new objects into one of the existing classes.

The problem of unsupervised pattern recognition, also called clustering, arises if cluster memberships of available objects, and perhaps even the number of clusters, are unknown.
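As a contrast to the unsupervised problem, the supervised procedure described above (split the labelled objects into a training set and a test set, derive the classifier, validate it) can be sketched as follows. The sketch rests on assumptions: the classifier is a simple nearest-class-mean rule, and the data, the split and all names are hypothetical.

```python
import math

def train(training_set):
    """Derive a classifier from labelled objects: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(x, class_means):
    """Assign x to the class with the nearest mean (Euclidean distance)."""
    return min(class_means, key=lambda label: math.dist(x, class_means[label]))

def accuracy(test_set, class_means):
    """Validate the classifier: share of test objects assigned to their known class."""
    hits = sum(1 for x, label in test_set if classify(x, class_means) == label)
    return hits / len(test_set)

# Hypothetical labelled objects (feature vector, known class).
labelled = [
    ([1.0, 1.2], "A"), ([0.8, 1.0], "A"), ([1.1, 0.9], "A"), ([0.9, 1.1], "A"),
    ([4.0, 4.2], "B"), ([4.1, 3.9], "B"), ([3.9, 4.0], "B"), ([4.2, 4.1], "B"),
]
training_set = labelled[0:3] + labelled[4:7]  # used to derive the classifier
test_set = [labelled[3], labelled[7]]         # held back for validation

class_means = train(training_set)
test_accuracy = accuracy(test_set, class_means)  # 1.0 on this toy data
```

If the test accuracy were unsatisfactory, the classifier's parameters (here only the class means, in practice also the choice of features or the decision rule) would be modified and the validation repeated.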
In the unsupervised case, a classifier is designed based on similar properties of objects: objects belonging to the same cluster should be as similar as possible (homogeneity within clusters) and objects belonging to different clusters should be clearly distinguishable (heterogeneity between clusters). The notion of similarity is either prescribed by a classification algorithm, or has to be defined depending on the application. If objects are real-valued feature vectors, then the Euclidean distance between feature vectors is usually used as a measure of dissimilarity of objects. Hence, the goal of clustering is to partition a given set of objects into clusters, or groups, which possess properties of homogeneity and heterogeneity. Unsupervised learning of the classifier is much more difficult than supervised learning; nevertheless, efficient algorithms in this area do exist. Classification and clustering represent the primary tasks of pattern recognition.

2.2.2 Classification of pattern recognition methods

Over the past two decades many methods have been developed to solve pattern recognition problems. These methods can be grouped into two approaches ([Fu, 1982b, p. 2], [Bunke, 1986, p. 367]): the decision-theoretic and the syntactic approach. The decision-theoretic