Big Data Analytics in Healthcare Temporal Clinical Events Patterns Mining

Size: px
Start display at page:

Download "Big Data Analytics in Healthcare Temporal Clinical Events Patterns Mining"

Transcription

1 Big Data Analytics in Healthcare Temporal Clinical Events Patterns Mining Svetla Boytcheva 1, Galia Angelova 1, Zhivko Angelov 2, Dimitar Tcharaktchiev 3 1 Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Bulgaria 2 Adiss Lab Ltd., Sofia, Bulgaria 3 Medical University Sofia, University Specialised Hospital for Active Treatment of Endocrinology, Bulgaria s:svetla.boytcheva@gmail.com, galia@lml.bas.bg, angelov@adiss bg.com dimitardt@gmail.com Keywords: Medical Informatics, Big Data, Text mining, Temporal Information, Data mining, MLCS. 1. Motivation Temporal events relations analysis of outpatient records has higher importance for proving different hypothesis: for risk factors analysis, treatment effect assessment, comparative analysis of treatment with different medications and dosage; monitoring of disease complications. Currently such analysis is also used in epidemiology for identifying complex relations between different unrelated diseases so called comorbidity and for research of rare diseases. A lot of efforts were reported in the area of electronic health records visualization and analysis of periodical data for single patient or searching patterns for a cohort of patients [4, 5, 14, 17, 19, 27]. In [19] is proposed method for temporal event matrix representation and learning framework that discovers complex latent event patterns or diabetes mellitus complications. Patnaik et al [4,5] report one of the first attempts for mining patients history data in Big data scope processing over 1.6 million of patient histories. They demonstrate a system called EMRView for mining the precedence relationships to identify and visualize partial order information. In the later researched are addressed three tasks: mining parallel episodes, tracking serial extensions, and learning partial orders. The task of searching for patterns in sequences of events is one of the fundamental tasks in computer science and dynamic programming, as well as its version the problem for searching of the longest common subsequence (LCS) of two strings [20, 21, 25]. This task is important for medical informatics, because in bioinformatics is used for multi omics data analysis [24, 28]. It has also a lot of other applications like data compression, file comparison, syntactic pattern recognition, and etc. Unfortunately the classical algorithms are mainly developed for two strings only and some generalizations are considered for 3 or more strings. The problem for searching of the LCS of two strings has a solution with time complexity [25], where and are lengths of both strings. Hunt and Szymanski [21] improve the solution with time complexity where the length of both strings is and is the number of possible common pairs of symbols in them. Masek and Peterson [30] present solution inspired by four Russians method for speeding up algorithms and report complexity. Recently several slight improvements of LCS were reported for some specific applications in bioinformatics [23]. The multiple longest common subsequence (MLCS) problem is NP complete task. Wang et al [15] propose a parallel algorithm based on dominant points approach for solving MLSC problem, which limits the search for small subset of dominant points rather that searching the whole set. Chen et al [23] describe a method based on successor tables, in 2010 Wang et 1

2 al [16] explore different heuristics in A* search model for solving MLSC problem. Yang et al [12] use beam search model and propose anytime algorithm which can solve the problem in reasonable time for massive data, but also addressing such issues like memory constraints for data tabulation. Han et al [29] propose one of the fundamental approaches for searching frequent patterns in time series based on statistical methods and construction of extended prefix tree frequent patterns tree (FP tree). Karthik et al [22] study periodicity in sequences using methods based on FP tree. In temporal events frequent pattern mining is important to filter events with similar importance and features, this relationship can be specified by temporal constraints There are different mining approaches, for instance we can interest in sequences leading to certain target event [6]. Temporal events frequent pattern mining is a major task in temporal data mining with several applications in telecommunications, social networks, climate changes prediction, earthquake prediction and etc. Gyet and Quiniou propose and recursive depthfirst algorithm QTIPrefixSpan that explores the extensions of a temporal patterns. They continue their research in the area of for extracting temporal sequences with quantitative temporal intervals with different models using hyper cube presentation and develop a version of EM algorithm for candidates generation [8]. Patnaik et al present the streaming algorithm for mining frequent episodes over a window of recent events in the stream [5]. Monroe et al [18] presents a systems that allows the user by using visual tools to narrow iterative the process for mining patterns to the desired target with application in electronic health records (EHR). Yang et al [10] describes another application of temporal events sequence mining in medical informatics for mining patient histories. They develop a modelbased method for discovering common progression stages in general event sequences. 2. Project Setup The main goal of our research is to examine comorbidity of diseases and their relationship/causality with different treatment, i.e. how the treatment of a disease can affect the co existence of other diseases. This is quite challenging task, because the number of diagnosis (more than 10,000) and of medications (approx. 6,500) is huge. Thus the possible variations of diagnosis and the corresponding treatment are above for one patient for one year period. That is why we will examine separately chronic vs acute diseases [9] and afterword the patterns will be combined. Chronic diseases constitute a major cause of mortality according to the World Health Organization reports and their study is with higher importance for medicine. In order to solve this task we split it down into several subtasks: to find the most frequent patterns in chronic diseases; for each frequent pattern to find patterns of treatment and explore their relationship with the periodic occurrence of various acute conditions. Since the task is to apply the method to big data some scarifications is necessary to be made. We propose method based on FP trees and dominant points, mining separately candidate frequent patterns for prefixes and suffixes of dominant points. We explore different approaches for assigning scores for dominant points and candidate generation and compare the results for time and space in all versions of the algorithm. We apply the proposed algorithm for time sequence mining with qualitative and quantitative time intervals. In the first case we use only the order of the events, and in the second version we take into account also the distance between different consecutive events. 2

3 3. Materials We deal with a repository of anonymous outpatient records (AOR) provided by the Bulgarian National Health Insurance Fund (NHIF) in XML format in Bulgarian language. The majority of data necessary for health management are structured in XML tags, but still there are some fields that contain in free text formats important explanations about the patient Anamnesis, Status, Clinical examinations and Therapy. From XML tags we extract (Error! Reference source not found.) Patient ID, code of doctors medical specialty from 00 to 56 (SimpCode), region of practice (RZOK). Date/Time and ID of the outpatient record (NoAL). XML tags also contain information for the main Figure 1 Structured Event Data diagnose and additional diagnoses with their codes according to the International Classification of Diseases, 10th Revision (ICD 10) [3]. For treatment information extraction we use Text mining tool, because drugs, dosage, frequency and route, mainly included in the Therapy section but sometimes the Anamnesis contains sentences that discuss the current or previous treatment. We developed a drug extractor, based on text mining algorithms using regular expressions to describe linguistic patterns [2]. There are more than 80 different patterns for matching text units to ATC drug names/codes [1] and NHIF drug codes, medication name, dosage and frequency. Currently, the extractor is elaborated and handles 2,239 drug names included in the NHIF nomenclatures. Error! Reference source not found. presents the three collections of clinical texts used for training and tests in our experiments, containing data for patients suffering from Schizophrenia (ICD 10 code F20), Hyperprolactinaemia (ICD 10 code E22.1), and Diabetes Mellitus (ICD 10 codes E10 E15). Table 1 Clinical Text Collections Set Outpatient Records Patients Avg AOR per Patient per year Diagnose ICD 10 Period Size (GB) S1 1,682, , F20 3 years 4 S2 288,977 9, E years 1 S3 6,327, , E10 E15 1 year Problem definition Each collection is processed independently from the other two collections. For each collection all extracted events are stored in comma separated format. For collection the set of all different patients is extracted. For each patient events sequence of tuples is generated:. where. The empty sequence of events will be denoted as the empty set 3

4 We define the set of all possible event sequences in each collection Then from diagnosis sequence of tuples is extracted, and A sequence is said to be event subsequence of the event sequence if there exists sequence such that the times and events, match. It will be denote by. Let is event subsequence and The cardinality is called the support of event subsequence in, i.e.. Length of event sequence is the number of tuples in it. It will be denote by, i.e.. Frequent pattern (FP) in the set is such subsequence that has support above the given threshold, i.e.. Multiple longest common subsequence (MLCS) of the set is such subsequence that satisfies all of the following conditions: (i) is a frequent pattern of ; (ii), is a frequent pattern of,. Sequence split by event of the sequence generates two subsequences: prefix subsequence and sufix subsequence for,and. We define set of all events in the set :, respectively we define the set of all diagnoses in the : References 1) Anatomical Therapeutic Chemical (ATC) Classification System, 2) Svetla Boytcheva. Shallow Medication Extraction from Hospital Patient Records. In: Koutkias, V., J. Nies, S. Jensen, N. Maglaveras, and R. Beuscart (Eds.), Studies in Health Technology and Informatics, Vol. 166, IOS Press, pp ) International Classification of Diseases and Related Health Problems 10th Revision. 4) Debprakash Patnaik, Laxmi Parida, Patrick Butler, Benjamin J. Keller, Naren Ramakrishnan, David A. Hanauer. Experiences with Mining Temporal Event Sequences from Electronic Medical 4

5 Records: Initial Successes and Some Challenges. in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11), San Diego, CA, pages , Aug ) D. Patnaik, S. Laxman, and B. Chandramouli. Efficient Episode Mining of Dynamic Event Streams, in Proceedings of the IEEE International Conference on Data Mining (ICDM'12), Brussels, Belgium, pages , Dec ) X. Sun, M. Orlowska and X. Zhou, "Finding Event Oriented Patterns in Long Temporal Sequences", in Proceedings of 7th Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2003), Springer Lecture Notes in Computer Science (LNCS 2637), pages 15 26, Seoul, Korea, April ) Guyet Thomas et René Quiniou, Mining temporal patterns with quantitative intervals, In Zighed D., Ras Z., Tsumoto S. editors : The 4th International Workshop on Mining Complex Data, IEEE ICDM Workshop, 10 pages, ) Guyet Thomas et René Quiniou, Extracting temporal patterns from interval based sequences, International Join Conference on Artificial Intelligence, ) Chronic diseases, World Health Organization, retrieved , 10) Jaewon Yang, Julian McAuley, Jure Leskovec, Paea LePendu, and Nigam Shah Finding progression stages in time evolving event sequences. In Proceedings of the 23rd international conference on World wide web (WWW '14). ACM, New York, NY, USA, DOI= 11) Yang, Jiaoyun, Yun Xu, and Yi Shang. "An efficient parallel algorithm for longest common subsequence problem on gpus." Proceedings of the World Congress on Engineering. Vol ) Jiaoyun Yang,Yun Xu, Yi Shang, Guoliang Chen, A Space Bounded Anytime Algorithm for the Multiple Longest Common Subsequence Problem. Published by the IEEE Computer Society Issue No.11 Nov. (2014 vol.26) pp: DOI: 13) François Nicolas, Eric Rivals, Longest common subsequence problem for unoriented and cyclic strings, Theoretical Computer Science, Volume 370, Issues 1 3, 12 February 2007, Pages 1 18, ISSN , ( 14) David Gotz, Fei Wang, Adam Perer. A methodology for interactive mining and visual analysis of clinical event patterns using electronic health record data. Original Research Article Journal of Biomedical Informatics, Volume 48, April 2014, Pages ) Qingguo Wang and Dmitry Korkin and Yi Shang. Efficient Dominant Point Algorithms for the Multiple Longest Common Subsequence (MLCS) Problem, Proceedings of the Twenty First International Joint Conference on Artificial Intelligence, Pasadena, America, pp , 7/09 16) Q. Wang, M. Pan, Y. Shang, and D. Korkin, "A Fast Heuristic Search Algorithm for Finding the Longest Common Subsequence of Multiple Strings," Proc. 24th AAAI Conf. Artificial Intelligence, pp , ) Taowei David Wang, Catherine Plaisant, Alexander J. Quinn, Roman Stanchak, Shawn Murphy, and Ben Shneiderman Aligning temporal data by sentinel events: discovering patterns in electronic health records. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '08). ACM, New York, NY, USA, DOI= 18) Monroe, M.; Rongjian Lan; Hanseung Lee; Plaisant, C.; Shneiderman, B., "Temporal Event Sequence Simplification," in Visualization and Computer Graphics, IEEE Transactions on, vol.19, no.12, pp , Dec. 2013, doi: /TVCG , 5

6 19) Lee, N.,Laine, A.F. ; Jianying Hu ; Fei Wang ; Jimeng Sun ; Ebadollahi, S. Mining electronic medical records to explore the linkage between healthcare resource utilization and disease severity in diabetic patients. Published in: Healthcare Informatics, Imaging and Systems Biology (HISB), 2011 First IEEE International Conference on Date of Conference: July Page(s): ) J. W. Hunt and M. D. McIlroy, "An Algorithm for Differential File Comparison", Computing Science Technical Report 41 (Bell Telephone Laboratories). 21) James W. Hunt and Thomas G. Szymanski A fast algorithm for computing longest common subsequences. Commun. ACM 20, 5 (May 1977), DOI= 22) G. M. Karthik, Ramachandra V. Pujeri, Constraint Based Periodic Pattern Mining in Multiple Longest Common Subsequences, Indian Journal of Science and Technology, DOI: /ijst/2013/v6i8/ ) Chen Y, Wan A, Liu W. A fast parallel algorithm for finding the longest common sequence of multiple biosequences. BMC Bioinformatics. 2006;7(Suppl 4):S4. doi: / S4 S4. 24) Korkin D1, Goldfarb L. Multiple genome rearrangement: a general approach via the evolutionary genome graph. Bioinformatics. 2002;18 Suppl 1:S ) D. S. Hirschberg A linear space algorithm for computing maximal common subsequences. Commun. ACM 18, 6 (June 1975), DOI= 26) Goeman, Heiko ; Clausen, Michael. A new practical linear space algorithm for the longest common subsequence problem. Kybernetika, vol. 38 (2002), issue 1, pp. [45] 66 27) Alexander Rind, Taowei David Wang, Wolfgang Aigner, Silvia Miksch, Krist Wongsuphasawat, Catherine Plaisant and Ben Shneiderman (2013), "Interactive Information Visualization to Explore and Query Electronic Health Records", Foundations and Trends in Human Computer Interaction: Vol. 5: No. 3, pp ) David Sankoff, Joseph B. Kruskal. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Publisher: Addison Wesley (August 1983), ISBN 13: ) Jiawei Han, Jian Pei, and Yiwen Yin. Mining frequent patterns without candidate generation In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (SIGMOD '00). ACM, New York, NY, USA, DOI= 30) William J. Masek, Michael S. Paterson, A faster algorithm computing string edit distances, Journal of Computer and System Sciences, Volume 20, Issue 1, February 1980, Pages 18 31, ISSN , (80) ( 6

Exploration and Visualization of Post-Market Data

Exploration and Visualization of Post-Market Data Exploration and Visualization of Post-Market Data Jianying Hu, PhD Joint work with David Gotz, Shahram Ebadollahi, Jimeng Sun, Fei Wang, Marianthi Markatou Healthcare Analytics Research IBM T.J. Watson

More information

Investigating Clinical Care Pathways Correlated with Outcomes

Investigating Clinical Care Pathways Correlated with Outcomes Investigating Clinical Care Pathways Correlated with Outcomes Geetika T. Lakshmanan, Szabolcs Rozsnyai, Fei Wang IBM T. J. Watson Research Center, NY, USA August 2013 Outline Care Pathways Typical Challenges

More information

Selection of Optimal Discount of Retail Assortments with Data Mining Approach

Selection of Optimal Discount of Retail Assortments with Data Mining Approach Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments

A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments incent S. Tseng 1, Eric Hsueh-Chan Lu 1, Chia-Ming Tsai 1, and Chun-Hung Wang 1 Department of Computer Science and Information

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

How People Read Books Online: Mining and Visualizing Web Logs for Use Information

How People Read Books Online: Mining and Visualizing Web Logs for Use Information How People Read Books Online: Mining and Visualizing Web Logs for Use Information Rong Chen 1, Anne Rose 2, Benjamin B. Bederson 2 1 Department of Computer Science and Technique College of Computer Science,

More information

IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD

IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Journal homepage: www.mjret.in ISSN:2348-6953 IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Deepak Ramchandara Lad 1, Soumitra S. Das 2 Computer Dept. 12 Dr. D. Y. Patil School of Engineering,(Affiliated

More information

Big Data Analytics for Healthcare

Big Data Analytics for Healthcare Big Data Analytics for Healthcare Jimeng Sun Chandan K. Reddy Healthcare Analytics Department IBM TJ Watson Research Center Department of Computer Science Wayne State University 1 Healthcare Analytics

More information

Big Data Analytics and Healthcare

Big Data Analytics and Healthcare Big Data Analytics and Healthcare Anup Kumar, Professor and Director of MINDS Lab Computer Engineering and Computer Science Department University of Louisville Road Map Introduction Data Sources Structured

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

Data Mining in Web Search Engine Optimization and User Assisted Rank Results

Data Mining in Web Search Engine Optimization and User Assisted Rank Results Data Mining in Web Search Engine Optimization and User Assisted Rank Results Minky Jindal Institute of Technology and Management Gurgaon 122017, Haryana, India Nisha kharb Institute of Technology and Management

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

More information

Sharpening Analytic Focus to Cope with Big Data Volume and Variety: Ten strategies for data focusing with temporal event sequences

Sharpening Analytic Focus to Cope with Big Data Volume and Variety: Ten strategies for data focusing with temporal event sequences Sharpening Analytic Focus to Cope with Big Data Volume and Variety: Ten strategies for data focusing with temporal event sequences Ben Shneiderman (ben@cs.umd.edu) & Catherine Plaisant (plaisant@cs.umd.edu)

More information

PEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Dingcheng Li Mayo Clinic, USA 20-Dec-2015

PEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Dingcheng Li Mayo Clinic, USA 20-Dec-2015 PEER REVIEW HISTORY BMJ Open publishes all reviews undertaken for accepted manuscripts. Reviewers are asked to complete a checklist review form (http://bmjopen.bmj.com/site/about/resources/checklist.pdf)

More information

Electronic Medical Record Mining. Prafulla Dawadi School of Electrical Engineering and Computer Science

Electronic Medical Record Mining. Prafulla Dawadi School of Electrical Engineering and Computer Science Electronic Medical Record Mining Prafulla Dawadi School of Electrical Engineering and Computer Science Introduction An electronic health record is a systematic collection of electronic health information

More information

High-Volume Hypothesis Testing for Large-Scale Web Log Analysis

High-Volume Hypothesis Testing for Large-Scale Web Log Analysis High-Volume Hypothesis Testing for Large-Scale Web Log Analysis Sana Malik Human-Computer Interaction Lab University of Maryland College Park, MD 20742, USA maliks@cs.umd.edu Eunyee Koh Adobe Research

More information

Mining various patterns in sequential data in an SQL-like manner *

Mining various patterns in sequential data in an SQL-like manner * Mining various patterns in sequential data in an SQL-like manner * Marek Wojciechowski Poznan University of Technology, Institute of Computing Science, ul. Piotrowo 3a, 60-965 Poznan, Poland Marek.Wojciechowski@cs.put.poznan.pl

More information

Syllabus. HMI 7437: Data Warehousing and Data/Text Mining for Healthcare

Syllabus. HMI 7437: Data Warehousing and Data/Text Mining for Healthcare Syllabus HMI 7437: Data Warehousing and Data/Text Mining for Healthcare 1. Instructor Illhoi Yoo, Ph.D Office: 404 Clark Hall Email: muteaching@gmail.com Office hours: TBA Classroom: TBA Class hours: TBA

More information

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining

More information

Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis

Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661, ISBN: 2278-8727 Volume 6, Issue 5 (Nov. - Dec. 2012), PP 36-41 Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis

More information

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška A. Imbalanced Data Classification

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Interactive Information Visualization of Trend Information

Interactive Information Visualization of Trend Information Interactive Information Visualization of Trend Information Yasufumi Takama Takashi Yamada Tokyo Metropolitan University 6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan ytakama@sd.tmu.ac.jp Abstract This paper

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY G.Evangelin Jenifer #1, Mrs.J.Jaya Sherin *2 # PG Scholar, Department of Electronics and Communication Engineering(Communication and Networking), CSI Institute

More information

Domain Classification of Technical Terms Using the Web

Domain Classification of Technical Terms Using the Web Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

More information

Mining Association Rules: A Database Perspective

Mining Association Rules: A Database Perspective IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology

More information

Collecting Polish German Parallel Corpora in the Internet

Collecting Polish German Parallel Corpora in the Internet Proceedings of the International Multiconference on ISSN 1896 7094 Computer Science and Information Technology, pp. 285 292 2007 PIPS Collecting Polish German Parallel Corpora in the Internet Monika Rosińska

More information

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD 72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD Paulo Gottgtroy Auckland University of Technology Paulo.gottgtroy@aut.ac.nz Abstract This paper is

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

Modeling and Design of Intelligent Agent System

Modeling and Design of Intelligent Agent System International Journal of Control, Automation, and Systems Vol. 1, No. 2, June 2003 257 Modeling and Design of Intelligent Agent System Dae Su Kim, Chang Suk Kim, and Kee Wook Rim Abstract: In this study,

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Temporal Data Mining in Hospital Information Systems: Analysis of Clinical Courses of Chronic Hepatitis

Temporal Data Mining in Hospital Information Systems: Analysis of Clinical Courses of Chronic Hepatitis Vol. 1, No. 1, Issue 1, Page 11 of 19 Copyright 2007, TSI Press Printed in the USA. All rights reserved Temporal Data Mining in Hospital Information Systems: Analysis of Clinical Courses of Chronic Hepatitis

More information

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT

More information

Web Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113

Web Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113 CSE 450 Web Mining Seminar Spring 2008 MWF 11:10 12:00pm Maginnes 113 Instructor: Dr. Brian D. Davison Dept. of Computer Science & Engineering Lehigh University davison@cse.lehigh.edu http://www.cse.lehigh.edu/~brian/course/webmining/

More information

MEMBERSHIP LOCALIZATION WITHIN A WEB BASED JOIN FRAMEWORK

MEMBERSHIP LOCALIZATION WITHIN A WEB BASED JOIN FRAMEWORK MEMBERSHIP LOCALIZATION WITHIN A WEB BASED JOIN FRAMEWORK 1 K. LALITHA, 2 M. KEERTHANA, 3 G. KALPANA, 4 S.T. SHWETHA, 5 M. GEETHA 1 Assistant Professor, Information Technology, Panimalar Engineering College,

More information

Visibility optimization for data visualization: A Survey of Issues and Techniques

Visibility optimization for data visualization: A Survey of Issues and Techniques Visibility optimization for data visualization: A Survey of Issues and Techniques Ch Harika, Dr.Supreethi K.P Student, M.Tech, Assistant Professor College of Engineering, Jawaharlal Nehru Technological

More information

An Analysis on Density Based Clustering of Multi Dimensional Spatial Data

An Analysis on Density Based Clustering of Multi Dimensional Spatial Data An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,

More information

Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India

Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone

More information

A Fraud Detection Approach in Telecommunication using Cluster GA

A Fraud Detection Approach in Telecommunication using Cluster GA A Fraud Detection Approach in Telecommunication using Cluster GA V.Umayaparvathi Dept of Computer Science, DDE, MKU Dr.K.Iyakutti CSIR Emeritus Scientist, School of Physics, MKU Abstract: In trend mobile

More information

KEYWORD SEARCH IN RELATIONAL DATABASES

KEYWORD SEARCH IN RELATIONAL DATABASES KEYWORD SEARCH IN RELATIONAL DATABASES N.Divya Bharathi 1 1 PG Scholar, Department of Computer Science and Engineering, ABSTRACT Adhiyamaan College of Engineering, Hosur, (India). Data mining refers to

More information

Semantic Concept Based Retrieval of Software Bug Report with Feedback

Semantic Concept Based Retrieval of Software Bug Report with Feedback Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

TOWARD BIG DATA ANALYSIS WORKSHOP

TOWARD BIG DATA ANALYSIS WORKSHOP TOWARD BIG DATA ANALYSIS WORKSHOP 邁 向 巨 量 資 料 分 析 研 討 會 摘 要 集 2015.06.05-06 巨 量 資 料 之 矩 陣 視 覺 化 陳 君 厚 中 央 研 究 院 統 計 科 學 研 究 所 摘 要 視 覺 化 (Visualization) 與 探 索 式 資 料 分 析 (Exploratory Data Analysis, EDA)

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results

Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results , pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department

More information

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, rajalakshmi@cit.edu.in

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

MED 2400 MEDICAL INFORMATICS FUNDAMENTALS

MED 2400 MEDICAL INFORMATICS FUNDAMENTALS MED 2400 MEDICAL INFORMATICS FUNDAMENTALS NEW YORK CITY COLLEGE OF TECHNOLOGY The City University Of New York School of Arts and Sciences Biological Sciences Department Course title: Course code: MED 2400

More information

New Generation Advanced Analytics Tools in Medical Systems

New Generation Advanced Analytics Tools in Medical Systems New Generation Advanced Analytics Tools in Medical Systems Vesselin Evgueniev Gueorguiev Ivan Evgeniev Ivanov Technical University Sofia, TUS Sofia, Bulgaria e-mail: veg@tu-sofia.bg, iei@tu-sofia.bg Desislava

More information

Towards Event Sequence Representation, Reasoning and Visualization for EHR Data

Towards Event Sequence Representation, Reasoning and Visualization for EHR Data Towards Event Sequence Representation, Reasoning and Visualization for EHR Data Cui Tao Dept. of Health Science Research Mayo Clinic Rochester, MN Catherine Plaisant Human-Computer Interaction Lab ABSTRACT

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment

A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment Edmond H. Wu,MichaelK.Ng, Andy M. Yip,andTonyF.Chan Department of Mathematics, The University of Hong Kong Pokfulam Road,

More information

Course Requirements for the Ph.D., M.S. and Certificate Programs

Course Requirements for the Ph.D., M.S. and Certificate Programs Health Informatics Course Requirements for the Ph.D., M.S. and Certificate Programs Health Informatics Core (6 s.h.) All students must take the following two courses. 173:120 Principles of Public Health

More information

Comparative Study in Building of Associations Rules from Commercial Transactions through Data Mining Techniques

Comparative Study in Building of Associations Rules from Commercial Transactions through Data Mining Techniques Third International Conference Modelling and Development of Intelligent Systems October 10-12, 2013 Lucian Blaga University Sibiu - Romania Comparative Study in Building of Associations Rules from Commercial

More information

Practical Implementation of a Bridge between Legacy EHR System and a Clinical Research Environment

Practical Implementation of a Bridge between Legacy EHR System and a Clinical Research Environment Cross-Border Challenges in Informatics with a Focus on Disease Surveillance and Utilising Big-Data L. Stoicu-Tivadar et al. (Eds.) 2014 The authors. This article is published online with Open Access by

More information

THE INTELLIGENT INTERFACE FOR ON-LINE ELECTRONIC MEDICAL RECORDS USING TEMPORAL DATA MINING

THE INTELLIGENT INTERFACE FOR ON-LINE ELECTRONIC MEDICAL RECORDS USING TEMPORAL DATA MINING International Journal of Hybrid Computational Intelligence Volume 4 Numbers 1-2 January-December 2011 pp. 1-5 THE INTELLIGENT INTERFACE FOR ON-LINE ELECTRONIC MEDICAL RECORDS USING TEMPORAL DATA MINING

More information

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil

More information

Application of Data Mining Methods in Health Care Databases

Application of Data Mining Methods in Health Care Databases 6 th International Conference on Applied Informatics Eger, Hungary, January 27 31, 2004. Application of Data Mining Methods in Health Care Databases Ágnes Vathy-Fogarassy Department of Mathematics and

More information

A Case Study of Question Answering in Automatic Tourism Service Packaging

A Case Study of Question Answering in Automatic Tourism Service Packaging BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, Special Issue Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0045 A Case Study of Question

More information

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm R.Karthiyayini 1, J.Jayaprakash 2 Assistant Professor, Department of Computer Applications, Anna University (BIT Campus),

More information

Online Credit Card Application and Identity Crime Detection

Online Credit Card Application and Identity Crime Detection Online Credit Card Application and Identity Crime Detection Ramkumar.E & Mrs Kavitha.P School of Computing Science, Hindustan University, Chennai ABSTRACT The credit cards have found widespread usage due

More information

COMBINED METHODOLOGY of the CLASSIFICATION RULES for MEDICAL DATA-SETS

COMBINED METHODOLOGY of the CLASSIFICATION RULES for MEDICAL DATA-SETS COMBINED METHODOLOGY of the CLASSIFICATION RULES for MEDICAL DATA-SETS V.Sneha Latha#, P.Y.L.Swetha#, M.Bhavya#, G. Geetha#, D. K.Suhasini# # Dept. of Computer Science& Engineering K.L.C.E, GreenFields-522502,

More information

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm R. Sridevi,*

More information

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network General Article International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347-5161 2014 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Impelling

More information

Doctor of Philosophy in Computer Science

Doctor of Philosophy in Computer Science Doctor of Philosophy in Computer Science Background/Rationale The program aims to develop computer scientists who are armed with methods, tools and techniques from both theoretical and systems aspects

More information

A View Integration Approach to Dynamic Composition of Web Services

A View Integration Approach to Dynamic Composition of Web Services A View Integration Approach to Dynamic Composition of Web Services Snehal Thakkar, Craig A. Knoblock, and José Luis Ambite University of Southern California/ Information Sciences Institute 4676 Admiralty

More information

Temporal Data Mining for Small and Big Data. Theophano Mitsa, Ph.D. Independent Data Mining/Analytics Consultant

Temporal Data Mining for Small and Big Data. Theophano Mitsa, Ph.D. Independent Data Mining/Analytics Consultant Temporal Data Mining for Small and Big Data Theophano Mitsa, Ph.D. Independent Data Mining/Analytics Consultant What is Temporal Data Mining? Knowledge discovery in data that contain temporal information.

More information

Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov

Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov Search and Data Mining: Techniques Applications Anya Yarygina Boris Novikov Introduction Data mining applications Data mining system products and research prototypes Additional themes on data mining Social

More information

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

Understanding Web personalization with Web Usage Mining and its Application: Recommender System Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,

More information

The Use of Data Mining Classification Techniques to Predict and Diagnose of Diseases

The Use of Data Mining Classification Techniques to Predict and Diagnose of Diseases 205, TextRoad Publication ISSN: 2090-4274 Journal of Applied Environmental and Biological Sciences www.textroad.com The Use of Data Mining ification Techniques to Predict and Diagnose of Diseases Sajjad

More information

International Journal of Advance Research in Computer Science and Management Studies

International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 12, December 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

How To Write A Summary Of A Review

How To Write A Summary Of A Review PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

More information

A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan

A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan , pp.217-222 http://dx.doi.org/10.14257/ijbsbt.2015.7.3.23 A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan Muhammad Arif 1,2, Asad Khatak

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

2.1. Data Mining for Biomedical and DNA data analysis

2.1. Data Mining for Biomedical and DNA data analysis Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: simmibagga12@gmail.com) Dr. G.N. Singh Department of Physics and

More information

CONCEPTUAL MODEL OF MULTI-AGENT BUSINESS COLLABORATION BASED ON CLOUD WORKFLOW

CONCEPTUAL MODEL OF MULTI-AGENT BUSINESS COLLABORATION BASED ON CLOUD WORKFLOW CONCEPTUAL MODEL OF MULTI-AGENT BUSINESS COLLABORATION BASED ON CLOUD WORKFLOW 1 XINQIN GAO, 2 MINGSHUN YANG, 3 YONG LIU, 4 XIAOLI HOU School of Mechanical and Precision Instrument Engineering, Xi'an University

More information

Electronic health records to study population health: opportunities and challenges

Electronic health records to study population health: opportunities and challenges Electronic health records to study population health: opportunities and challenges Caroline A. Thompson, PhD, MPH Assistant Professor of Epidemiology San Diego State University Caroline.Thompson@mail.sdsu.edu

More information

Big Data Analytics in Mobile Environments

Big Data Analytics in Mobile Environments 1 Big Data Analytics in Mobile Environments 熊 辉 教 授 罗 格 斯 - 新 泽 西 州 立 大 学 2012-10-2 Rutgers, the State University of New Jersey Why big data: historical view? Productivity versus Complexity (interrelatedness,

More information

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network , pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

More information

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor

More information

Personalized Predictive Modeling and Risk Factor Identification using Patient Similarity

Personalized Predictive Modeling and Risk Factor Identification using Patient Similarity Personalized Predictive Modeling and Risk Factor Identification using Patient Similarity Kenney Ng, PhD 1, Jimeng Sun, PhD 2, Jianying Hu, PhD 1, Fei Wang, PhD 1,3 1 IBM T. J. Watson Research Center, Yorktown

More information

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

More information

Healthcare Big Data Exploration in Real-Time

Healthcare Big Data Exploration in Real-Time Healthcare Big Data Exploration in Real-Time Muaz A Mian A Project Submitted in partial fulfillment of the requirements for degree of Masters of Science in Computer Science and Systems University of Washington

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search

ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Project for Michael Pitts Course TCSS 702A University of Washington Tacoma Institute of Technology ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Under supervision of : Dr. Senjuti

More information

What is your level of coding experience?

What is your level of coding experience? The TrustHCS Academy is dedicated to growing new coders to enter the field of medical records coding while also increasing the ICD-10 skills of seasoned coding professionals. A unique combination of on-line

More information

Text Classification Using Symbolic Data Analysis

Text Classification Using Symbolic Data Analysis Text Classification Using Symbolic Data Analysis Sangeetha N 1 Lecturer, Dept. of Computer Science and Applications, St Aloysius College (Autonomous), Mangalore, Karnataka, India. 1 ABSTRACT: In the real

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

Comparison of Data Mining Techniques for Money Laundering Detection System

Comparison of Data Mining Techniques for Money Laundering Detection System Comparison of Data Mining Techniques for Money Laundering Detection System Rafał Dreżewski, Grzegorz Dziuban, Łukasz Hernik, Michał Pączek AGH University of Science and Technology, Department of Computer

More information

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia

More information

Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior

Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior N.Jagatheshwaran 1 R.Menaka 2 1 Final B.Tech (IT), jagatheshwaran.n@gmail.com, Velalar College of Engineering and Technology,

More information