Temporal Data Mining in Hospital Information Systems: Analysis of Clinical Courses of Chronic Hepatitis
Vol. 1, No. 1, Issue 1, Page 11 of 19
Copyright 2007, TSI Press. Printed in the USA. All rights reserved.

Shoji Hirano and Shusaku Tsumoto
Department of Medical Informatics, Shimane University, School of Medicine
89-1 Enya-cho, Izumo, Shimane, Japan
{hirano,

Received 1 January 2007; revised 2 February 2007; accepted 3 March 2007

Abstract

This paper presents a new approach to finding interesting knowledge from temporal data on chronic diseases based on the combination of advanced sequence comparison techniques and cluster analysis procedure. First we briefly introduce the cluster analysis system for temporal data that we have developed. Second, we apply it to the analysis of platelet (PLT) count data on chronic viral hepatitis patients. Third, we show the results of PLT value-based temporal analysis, conducted based on the results of cluster analysis, aiming at finding years for reaching F4 (liver fibrosis stage four), years elapsed between stages, and their relationships with virus types and fibrotic stages. The results conveyed some interesting findings: (1) the temporal courses of PLT could be grouped into several patterns exhibiting similar average PLT level and increase/decrease trends, and (2) liver fibrosis might proceed faster in some exacerbating cases.

Keywords

Temporal Data Mining, Multiscale Matching, Clustering, Chronic Hepatitis, KDD Process

1. INTRODUCTION

Steady operations of hospital information systems over the past two decades gave them a new role of archiving temporal data about long-term condition of patients, in addition to their basic function of providing information necessary for daily clinical services. Such archives of longitudinal, time-series data can be used as a new source for retrospective study on chronic diseases, which may lead to the discovery of novel knowledge useful for diagnosis or treatment.
However, large-scale, cross-patient analysis of time-series medical data is a challenging task. The data are multidimensional and temporally irregular because of the variety of laboratory tests and the change of patient conditions over time, and it is difficult to determine observation scales appropriate for capturing both short-term and long-term events. Therefore, practical application of data mining methods to longitudinal medical time-series data is still limited.

In this paper, we present a new approach to finding interesting knowledge from temporal data on chronic diseases based on the combination of advanced sequence comparison techniques and a cluster analysis procedure. First, we briefly introduce the cluster analysis system for temporal data that we have developed.

Second, we apply it to the analysis of platelet (PLT) count data on chronic viral hepatitis patients. Platelet count has been receiving considerable interest as an index of liver dysfunction, because a hematopoietic factor called thrombopoietin [1], which facilitates the production of platelets, is produced in the liver. Matsumura et al. reported that the PLT count correlated with fibrotic stage [2]. PLT counts were significantly different among patients of different fibrotic stages, with the PLT count becoming smaller as liver fibrosis proceeds [2]. However, few studies have investigated the temporal relationships between the decrease patterns of PLT and the progress of liver fibrosis using time-series data of individual patients. Our results of cluster analysis indicate that the temporal courses of PLT can be grouped into several patterns, each of which presents similarity in average PLT level and increase/decrease trends.

Third, we show the results of PLT value-based temporal analysis aiming at finding years for reaching F4 (fibrosis stage 4), years elapsed between stages, and their relationships with virus types and fibrotic stages.
This value-based analysis was conducted based on the observation of quickly decreasing patterns revealed through the cluster analysis. The results of value-based analysis suggest that liver fibrosis may proceed faster in exacerbating cases.

2. CLUSTER ANALYSIS SYSTEM

The cluster analysis system we have developed consists of two components, sequence comparison and clustering, so that it can utilize advanced sequence comparison methods that handle the temporal irregularity of medical data. In the sequence comparison part, two methods were implemented: dynamic time warping (DTW) [5,6] and modified multiscale structure matching (MMSM) [8]. The clustering part employs two methods: conventional hierarchical clustering (HC) [4] and rough set-based clustering (RC) [7]. The sequence comparison part performs pairwise comparison for all possible pairs of time series and produces a dissimilarity matrix. The clustering part groups the time series according to the given dissimilarity matrix.

Figure 1 provides a screenshot of the system. The left window shows the dendrogram that is generated when HC is used as the clustering method. The right window shows the constitution of the clusters, as well as the number of cases in each cluster. When a user selects a cluster, the sequences that belong to it are visualized. Both windows are linked internally: when a user specifies a cutting point on the dendrogram, the corresponding cluster constitution and sequences are displayed interactively.

3. CLUSTER ANALYSIS OF TIME-SERIES PLT COUNT DATA

Data Sets

We employed the chronic hepatitis dataset [3], which was provided as a common dataset for ECML/PKDD Discovery Challenge 2002 and [...]. The dataset contained time-series data on laboratory examinations collected at a university hospital in Japan. The subjects were 771 patients with Type B and Type C chronic viral hepatitis who received hospital laboratory examinations during the period from 1982 to [...]. A total of 720 patients received at least one examination on platelet count.
Out of these 720 cases, 222 were removed from analysis because their biopsy information was not available, and an additional 10 were removed because of their short examination periods (less than 2 weeks). Consequently, a total of 488 series were used for analysis.

Figure 1. Cluster analysis system for time-series.

Experimental Procedure

Below we show the procedure of cluster analysis.

1. Sequence rebuild: Rearrange the PLT data of each patient into one-week intervals by linear interpolation.

2. Dataset split by virus types and administration of interferon (IFN) therapy: Split the dataset into Type B and Type C cases, and further split the Type C cases into Type C with IFN therapy and Type C without IFN therapy. We call these subsets the Type B subset, the Type C with IFN subset and the Type C without IFN subset. The number of cases in each subset was as follows: Type B = 193, Type C with IFN = 196, Type C without IFN = 99. The following procedures were applied independently to each subset.

3. Creation of a dissimilarity matrix: Compare two PLT sequences by the modified multiscale matching. Apply this process to every possible pair of sequences in the subset to fill in the dissimilarity matrix. In order to perform comprehensive comparison, we set the parameters for multiscale matching as follows: the number of scales = 150, starting scale = 0.1, scale interval = 0.5. The weight for replacement cost was set to 0.2 according to a preparatory experiment.

4. Cluster analysis: Generate dendrograms by agglomerative hierarchical clustering and perform cluster analysis. We employed group average as the cluster merge criterion.

Figure 2 shows the three dendrograms obtained from the Type B, Type C with IFN and Type C without IFN subsets, respectively. We manually determined cutting points on the dendrograms so that the clusters represent the global structure of the data while retaining the meaningful features of the sequences. Consequently, we obtained 16, 23 and 6 clusters, respectively, for the three subsets. A horizontal line on each dendrogram represents the cutting point.

Table 1 provides the constitution of clusters stratified by the fibrotic stage. The three sub-tables correspond, from left to right, to the Type B, Type C with IFN and Type C without IFN subsets. Each row in a table represents one cluster. The leftmost column contains the cluster number. The subsequent five columns contain the number of cases in the cluster stratified by fibrotic stage (F0-F4). The rightmost column contains the total number of cases in the cluster. The tables imply that clusters can be roughly classified into two categories: (1) clusters containing high-stage (progressed) cases, and (2) clusters containing low-stage (early) cases.

Figure 2. Dendrograms for PLT sequences. Left: Type B, Middle: Type C with IFN, Right: Type C without IFN.

Table 1. Cluster constitutions w.r.t. fibrotic stages. Small clusters (less than 3 cases) were omitted. Left: Type B, Center: Type C with IFN, Right: Type C without IFN. Columns of each sub-table: Cls (cluster number), # of Cases / Fibrosis Stage (F0, F1, F2, F3, F4), Total.
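Steps 1, 3 and 4 of the procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' system: it substitutes dynamic time warping (the other comparison method the system implements) for modified multiscale matching, whose details are not given here, and the function names, the toy PLT values and the `cut` threshold are all hypothetical.

```python
from itertools import combinations


def resample_weekly(days, values, step=7):
    """Step 1: linearly interpolate irregular (day, value) samples onto a weekly grid.
    Assumes at least two samples with strictly increasing day numbers."""
    out, j, t = [], 0, days[0]
    while t <= days[-1]:
        # Advance to the segment [days[j], days[j + 1]] that contains t.
        while j < len(days) - 2 and days[j + 1] < t:
            j += 1
        frac = (t - days[j]) / (days[j + 1] - days[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
        t += step
    return out


def dtw_distance(a, b):
    """Classic dynamic time warping cost with absolute difference as local cost.
    Stand-in for the paper's modified multiscale matching (an assumption)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for k, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[k], prev[k - 1], cur[k - 1]))
        prev = cur
    return prev[-1]


def dissimilarity_matrix(series):
    """Step 3: compare every possible pair of sequences."""
    n = len(series)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dtw_distance(series[i], series[j])
    return d


def group_average_clusters(d, cut):
    """Step 4: agglomerative clustering with the group-average criterion,
    stopped when the closest pair of clusters is farther apart than `cut`
    (equivalent to cutting the dendrogram at that height)."""
    clusters = [[i] for i in range(len(d))]

    def linkage(p, q):
        return sum(d[i][j] for i in p for j in q) / (len(p) * len(q))

    while len(clusters) > 1:
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        if linkage(clusters[p], clusters[q]) > cut:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # p < q, so index p is unaffected
    return clusters


# Hypothetical toy data: (examination days, PLT counts) for four patients.
patients = [
    ([0, 10, 24, 40], [21.0, 20.5, 20.8, 21.2]),  # flat course
    ([0, 14, 30, 42], [20.0, 20.6, 21.0, 20.4]),  # flat course
    ([0, 12, 28, 42], [14.0, 12.5, 11.0, 9.5]),   # decreasing course
    ([0, 9, 21, 35], [13.5, 12.0, 10.8, 9.9]),    # decreasing course
]
weekly = [resample_weekly(days, plt) for days, plt in patients]
clusters = group_average_clusters(dissimilarity_matrix(weekly), cut=15.0)
print(clusters)
```

In the study the cutting point was chosen manually on the dendrogram; the fixed `cut` value here only stands in for that interactive choice.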
Due to space limitations, we mainly describe the results for the Type C with IFN subset. According to the middle table in Table 1, there were two remarkable clusters containing many progressed cases (F4 or F3): cluster 5 (8/11) and cluster 8 (25/40). Additionally, there were three other remarkable clusters containing many early-stage (F0-F2) cases: clusters 11 (34/46), 12 (33/42) and 23 (18/19).

Figure 3 provides examples of sequences grouped into clusters 5 and 8, respectively. Each figure is composed of 16 sub-windows and each sub-window contains one sequence. The two horizontal lines in each sub-window represent the normal high ( /µl) and normal low ( /µl) ranges, respectively. In cluster 5, most of the sequences followed decreasing or flat courses below the normal low range, reflecting the severe states of the patients. Sequences in cluster 8 exhibited similar courses, but with slightly higher values than those in cluster 5.

Figure 4 provides sequences grouped into clusters 11, 12 and 23. In contrast to clusters 5 and 8, sequences in these clusters followed flat courses that stayed within the normal range. Clusters 11, 12 and 23 appeared to correspond to different global PLT levels: low, middle and high, respectively.

Other interesting courses were found in clusters 4, 6 and 10, which demonstrated obviously decreasing or increasing patterns as shown in Figure 5. The left of Figure 5 provides sequences in cluster 4 (F1=3, F2=1). Although 3 of these 4 cases were at stage F1, PLT counts continued decreasing and finally fell below the normal low level in a relatively short period. The middle of Figure 5 shows sequences in cluster 6 (F1=1, F3=1, F4=1). The global levels were lower than those in cluster 4, which might be caused by the F3 and F4 cases. The right of Figure 5 provides sequences in cluster 10 (F1=1, F3=1, F4=3), which represent recovery courses after IFN therapy.

Figure 3. Clusters containing many cases of progressed-stage (F4 or F3) (Type C with IFN). Left: cluster 5. Center and Right: cluster 8 (32 cases selected by MID order).

Figure 4. Clusters containing many cases of early-stage (F0, F1 or F2) (Type C with IFN). Left: cluster 11. Center: cluster 12. Right: cluster 23. (16 cases selected by MID order).

Figure 5. Clusters containing remarkably increase/decrease cases (Type C with IFN). Left: cluster 4. Center: cluster 6. Right: cluster 10.

We observed similarly interesting patterns in the other two subsets. Below we summarize the findings.

1. In both type B and C, some clusters contained relatively large numbers of progressed cases. PLT counts in these cases commonly followed decreasing or flat courses below the normal low level. Some F1 and F2 cases showed a similarly low level as F4 cases. (Type B cluster 7, Type C with IFN cluster 5).

2. In both type B and C, some clusters contained relatively large numbers of early-stage cases. PLT counts in these cases commonly followed flat courses within the normal range (Type B clusters 5, 15, 16, Type C with IFN clusters 11, 12, 23). F4 cases might retain the normal range; however, the number of such cases in a cluster decreased according to the global PLT level of the cluster. (Type B cluster: 16>15>5, Type C with IFN cluster: 11=12>23).

3. In type C, there were remarkable cases, including F1 and F2 cases, in which the PLT count continuously decreased and finally fell below the normal range. (Type C with IFN clusters 4 and 6, Type C without IFN cluster 1). In type C without IFN, the decreasing trend was observed rather frequently. (Type C without IFN clusters 1 and 3).

4. In type C with IFN, there were F4 cases in which PLT levels increased toward the normal range after IFN administration.

4. ANALYSIS OF YEARS FOR REACHING F4 AND ELAPSED YEARS BETWEEN STAGES BASED ON THE PLT COUNTS

The stage of liver fibrosis is usually determined by liver biopsy, which is an invasive examination. In recent years, platelet count has been receiving considerable attention as a non-invasive index reflecting liver dysfunction, which may be associated with the fibrotic stage in chronic hepatitis. Several researchers have reported relationships between platelet counts and fibrotic stages [2,9]. For example, Matsumura et al. [2] reported the following values: F1: 20.3±5.2 (×10^4/µl), F2: 16.0±4.9, F3: 13.0±4.0, F4: 11.8±4.1 and in LC 11.8±4.1. Our results of cluster analysis corresponded to these differences.
Additionally, through visual inspection of the clustered sequences, we observed that there might be several types of temporal courses of PLT values.

Matsumura et al. [2] also reported the progress speed of liver fibrosis in patients with Type C chronic hepatitis in Japan. To calculate the progress speed, they used the date of blood transfusion, which could be associated with F0, together with the date and results of liver biopsy. The result was about 0.12±0.15 stage/year.

In order to investigate the temporal characteristics of the PLT count, we tried to utilize the time-series data. We set the goal of this study as analyzing the progress speed of liver fibrosis without information about blood transfusion. As a preliminary stage, we attempted to calculate (1) years required for reaching stage F4, and (2) years elapsed between stages, by combining the fibrotic stages predicted from the PLT level with those observed by liver biopsy. Here we made an assumption: if the PLT level of a patient is continuously lower than the normal range for at least 6 months, and after that never stays in the normal range for more than 6 months, then the patient is F4. Based on this assumption, we first examined whether and when a patient reached F4. Then, by subtracting the dates and stages obtained by biopsy, we calculated elapsed years.

As a pre-process, we selected the cases for analysis according to the following procedure.

1. Exclude cases that met any of the following three conditions from analysis: (1) No biopsy: biopsy information was not available. (2) Short sequence: the number of examinations was less than 2 or the duration of examination was shorter than 2 years. (3) Inhomogeneous sequence: the deviation of examination intervals was larger than 1 year.

2. Rearrange the sampling intervals of each sequence into one week. The starting date of re-sampling was selected independently for each case, based on two criteria: (1) it was the day of the week on which the patient most frequently received examinations, and (2) it was the closest such date to the first examination. If examination data were missing, we inserted a predicted value by linearly interpolating the nearest examination results. In the following procedures we used these rearranged sequences.
3. Smooth each sequence in order to remove short-term changes. We performed convolution with a discrete Gaussian kernel with a support width of 6 months (26 weeks; σ=2.8).

4. From the head of a sequence, search for the first point that satisfies both of the following two conditions: (a) the PLT level became continuously lower than the normal range for the next 6 months. The duration of IFN therapy was not included therein, as it might induce a short-term decrease of PLT. (b) The recovered PLT level could not continuously maintain the normal range for 6 months.

5. If found, let the detected point be the date of declination from the normal range. Otherwise, the case was considered to keep the normal PLT range and was removed from analysis.

Table 2 shows the result of sequence classification by the above procedure. A total of 97 cases classified as 'declinated' were the subject of analysis.

Table 2. Result of sequence classification. Judging criteria for declination are: (1) PLT becomes continuously lower than the normal range over 6 months, (2) Recovered PLT level cannot continuously maintain the normal range for 6 months. Both criteria should be satisfied. Column labels: No biopsy, Short, Inhomogeneous, Available (Declinated, Normal), Total.

Table 3 summarizes the calculated years for reaching F4 (first examination date basis) for the 97 declinated cases in Table 2. The cases were stratified by virus type and fibrotic stage. Note that years=0 if the date of declination was earlier than the date of first examination.

Table 3. Years for reaching F4 (First-exam basis) stratified by virus types and fibrotic stages. Summary for 97 declination cases in Table 2*. Column labels: Type (B, C IFN, C w/o IFN, each with subtotal; Total), Fibrotic Stage, Cases, Years for reaching F4 [First-exam basis] (years): Mean, Median, SD.
*Fibrotic stages in the second column are based on biopsy. Years for reaching F4 was years from first exam to the date of declination under the assumption that the fibrotic stage at the date of declination was F4. If the date of declination was the same as or before the first exam, years were treated as 0.

For each of the type B, C with IFN and C without IFN groups, we performed statistical tests (ANOVA) aiming at detecting differences in mean years for reaching F4 with respect to the biopsy-based fibrotic stages. The result for Type C with IFN was p=0.012 (< 0.05), indicating that significant differences in years exist among fibrotic stages. However, this was primarily due to one exceptionally long case in F0; tests after removing this case yielded p=0.291, indicating that there was no significant difference in the years for reaching F4 among fibrotic stages. Results for Type B and Type C w/o IFN were p=0.357 and p=0.613 respectively, indicating no significant differences. Kruskal-Wallis tests yielded the same conclusion. Between-group comparison of the Type B, Type C with IFN and Type C w/o IFN groups resulted in p= [...].

Years for reaching F4 in Table 3 were calculated as years between the first date of PLT examination and the date of PLT declination. Therein we assumed that the fibrotic stage at first examination was the same as that at first biopsy. However, the date of first biopsy and the date of first PLT examination were generally different; in some cases they were several years apart. This implies that the stages might also be different. Therefore, we calculated years for reaching F4 on a biopsy basis, that is, years from the date of first biopsy to the date of PLT declination. Additionally, based on the assumption that the stage at PLT declination should be F4, we calculated elapsed years between stages by the following formula: (date of declination - date of first biopsy) / (4 - fibrotic stage at biopsy). If declination occurred before the first biopsy, years were treated as 0.

Table 4 summarizes the results. As with the first-exam basis results, for each of the type B, C with IFN and C without IFN groups, we performed statistical tests with ANOVA aiming at detecting differences in mean years for reaching F4 w.r.t. the fibrotic stages. The results were p=0.421, (<0.05), for each group respectively.
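The smoothing step, the declination search and the elapsed-years formula described above can be sketched as follows. This is a simplified reading of the stated criteria, not the authors' code: it ignores the exclusion of IFN therapy periods, treats any value at or above a single lower threshold as "within the normal range", renormalizes the kernel at the sequence edges, and takes a 27-tap kernel as the "26-week" support. The `normal_low` value used in the example is hypothetical, since the threshold itself is not recoverable from this text.

```python
import math

WEEKS_6_MONTHS = 26


def gaussian_smooth(seq, sigma=2.8, half_width=13):
    """Convolve a weekly sequence with a discrete Gaussian kernel.
    Near the edges only the overlapping part of the kernel is used
    and renormalized (an assumption; the paper does not say)."""
    offsets = range(-half_width, half_width + 1)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    out = []
    for i in range(len(seq)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= i + k < len(seq):
                num += w * seq[i + k]
                den += w
        out.append(num / den)
    return out


def longest_run(flags):
    """Length of the longest consecutive run of True values."""
    best = run = 0
    for f in flags:
        run = run + 1 if f else 0
        best = max(best, run)
    return best


def declination_week(seq, normal_low, window=WEEKS_6_MONTHS):
    """Return the first week index t such that
    (a) the level stays below normal_low for the next `window` weeks, and
    (b) afterwards it never stays at or above normal_low for `window` weeks.
    Returns None if the sequence is judged to keep the normal range."""
    below = [v < normal_low for v in seq]
    for t in range(len(seq) - window + 1):
        stays_low = all(below[t:t + window])
        no_recovery = longest_run(not b for b in below[t + window:]) < window
        if stays_low and no_recovery:
            return t
    return None


def years_between_stages(weeks_biopsy_to_declination, stage_at_biopsy):
    """(date of declination - date of first biopsy) / (4 - stage at biopsy),
    with declination before the biopsy treated as 0 years."""
    if not 0 <= stage_at_biopsy < 4:
        raise ValueError("F4 cases are excluded: no stages remain to elapse")
    years = max(weeks_biopsy_to_declination, 0) / 52.0
    return years / (4 - stage_at_biopsy)


# Hypothetical course: 30 weeks around 20, then 40 weeks around 12.
course = [20.0] * 30 + [12.0] * 40
week = declination_week(gaussian_smooth(course), normal_low=15.0)
print(week, years_between_stages(104, 2))
```

Because smoothing blurs the step in the toy course, the detected week can shift slightly relative to the raw data; that is the intended effect of removing short-term changes before the search.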
In Type C with IFN there appeared to be a significant difference among stages; however, this was primarily due to one exceptionally long case in F0. Tests after removing this case yielded p=0.970, indicating that there was no significant difference in the years for reaching F4 even in the biopsy-date basis measurement. Kruskal-Wallis tests resulted in the same conclusion.

Similarly, for each of the type B, C with IFN and C without IFN groups, we performed statistical tests with ANOVA aiming at detecting differences in mean elapsed years between stages w.r.t. the fibrotic stages. In this test we removed F4 cases, as we could not measure the elapsed years for them. For the same reason, we excluded F4 cases when calculating values such as mean and SD in Table 4. The results of ANOVA were p=0.836, 0.425, 0.340, indicating that there were no significant differences among stages, including F0, for all three groups.

Table 4. Years for reaching F4 (biopsy basis) and years between stages stratified by virus type and fibrotic stages. Summary for 97 declination cases in Table 2*. Column labels: Type (B, C IFN, C w/o IFN, each with subtotal; Total), Fibrotic Stage, Cases, Years for reaching F4 [biopsy basis] (years): Mean, Median, SD; Years between stages (years/stage): Mean, Median, SD.
*Fibrotic stages in the second column are based on biopsy. Years for reaching F4 were years from first biopsy to the date of declination under the assumption that the fibrotic stage at the date of declination was F4. If the date of declination was the same as or before the first biopsy, years were treated as 0. Years between stages were calculated by (years for reaching F4)/(4 - stage at biopsy).

In summary, with this limited analysis, no significant difference was observed in years for reaching F4 or in years elapsed between stages with respect to fibrotic stages, virus types and administration of IFN. However, it is interesting that the elapsed years between stages were 1-2 years/stage in almost all groups. If we simply invert this into a progress speed for comparison with other sources, the result would be, for example, about 1/1.32=0.76 stage/year for the Type C w/o IFN cases. This is faster than the value in [2] (0.12±0.15 stage/year), implying that liver fibrosis might proceed faster in these cases.

It should be noted that the results of this analysis should not be generalized, because (1) we assumed that a patient had reached F4 when the PLT level continuously stayed below the normal range over a long time, (2) we selected only exacerbating cases in which PLT continuously decreased, and (3) we did not take into account patient background information such as history of drinking. However, we consider that our approach of measuring elapsed years between stages, by combining fibrotic stages obtained from biopsy with those inferred from the PLT level, can lead to interesting findings.

5. CONCLUSIONS

In this paper we have introduced a cluster analysis system for time-series medical data and reported the results of temporal analysis of PLT data in chronic hepatitis patients. The results revealed that temporal courses of PLT might be classified into some patterns according to their levels and trends, which might be further related to fibrotic stages. The results also suggest that, in some exacerbating cases, liver fibrosis may proceed a few times faster than the natural courses.
In the future, we will validate the clinical reasonability of the results and the usefulness of the system on other datasets.

ACKNOWLEDGEMENTS

This work was supported in part by the Grant-in-Aid for Scientific Research on Priority Area (# ), Development of the Active Mining System in Medicine Based on Rough Sets, by the Ministry of Education, Culture, Science and Technology of Japan.

REFERENCES

[1] H. Miyazaki, Future Prospect of Thrombopoietin. Jpn J. Transfusion Medicine, Vol. 46, No. 3.
[2] H. Matsumura, M. Moriyama, I. Goto, N. Tanaka, H. Okubo, and Y. Arakawa, Natural course of progression of liver fibrosis in patients with chronic liver disease type C in Japan - a study of 527 patients at one establishment in Japan. J. Viral Hepat, Vol. 7.
[3] URL:
[4] B. S. Everitt, S. Landau, and M. Leese, Cluster Analysis, Fourth Edition. Arnold Publishers.
[5] D. Sankoff and J. Kruskal, Time Warps, String Edits, and Macromolecules. CSLI Publications.
[6] S. Chu, E. J. Keogh, D. Hart, and M. J. Pazzani, Iterative Deepening Dynamic Time Warping for Time Series. In Proc. the Second SIAM Int'l Conf. Data Mining.
[7] S. Hirano and S. Tsumoto, An Indiscernibility-Based Clustering Method with Iterative Refinement of Equivalence Relations - Rough Clustering -. Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 7, No. 2, 2003.
[8] S. Tsumoto, S. Hirano, and K. Takabayashi, Development of the Active Mining System in Medicine Based on Rough Sets. Journal of Japan Society for Artificial Intelligence, Vol. 20, No. 2.

AUTHOR INFORMATION

Shoji Hirano received the Ph.D. degree in electronics in 2001 from Himeji Institute of Technology, Japan. He joined the Department of Medical Informatics, Shimane Medical University as a research associate in April 2001, and has served as an associate professor since July [...]. His research interests include data mining, rough sets, image processing, and medical informatics. He received the Best Paper Award at the Fourth Biannual World Automation Congress in 2000, and the Annual Conference Award at the 19th Annual Conference of Japanese Society for Artificial Intelligence (JSAI) in [...]. He is a member of the IEEE, JSAI and Japan Society for Fuzzy Theory and Intelligent Informatics.

Shusaku Tsumoto graduated from Osaka University, School of Medicine in [...]. He received his Ph.D. (Computer Science) on application of rough sets to medical data mining from Tokyo Institute of Technology in 1997 and became a Professor at the Department of Medical Informatics, Shimane University in [...]. His interests include approximate reasoning, data mining, fuzzy sets, granular computing, knowledge acquisition, mathematical theory of data mining, medical informatics and rough sets (alphabetical order). He served as President of the International Rough Set Society from 2000 to 2005 and served as a PC chair of RSCTC2000, IEEE ICDM2002, RSCTC2004 and ISMIS
Volume 2, Issue 12, December 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationData Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent
More informationEFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
More informationGrid Density Clustering Algorithm
Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2
More informationLow-resolution Character Recognition by Video-based Super-resolution
2009 10th International Conference on Document Analysis and Recognition Low-resolution Character Recognition by Video-based Super-resolution Ataru Ohkura 1, Daisuke Deguchi 1, Tomokazu Takahashi 2, Ichiro
More informationThe Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
More informationAssociation Technique on Prediction of Chronic Diseases Using Apriori Algorithm
Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm R.Karthiyayini 1, J.Jayaprakash 2 Assistant Professor, Department of Computer Applications, Anna University (BIT Campus),
More informationA Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan
, pp.217-222 http://dx.doi.org/10.14257/ijbsbt.2015.7.3.23 A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan Muhammad Arif 1,2, Asad Khatak
More informationVisual Data Mining with Pixel-oriented Visualization Techniques
Visual Data Mining with Pixel-oriented Visualization Techniques Mihael Ankerst The Boeing Company P.O. Box 3707 MC 7L-70, Seattle, WA 98124 mihael.ankerst@boeing.com Abstract Pixel-oriented visualization
More informationStatistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
More informationISSUES IN MINING SURVEY DATA
ISSUES IN MINING SURVEY DATA A Project Report Submitted to the Department of Computer Science In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science University