A Proposed Data Mining Model for the Associated Factors of Alzheimer's Disease
Dr. Nevine Makram Labib and Mohamed Sayed Badawy
Department of Computer and Information Systems, Faculty of Management Sciences, Sadat Academy for Management Sciences, Corniche El Nil, Maadi, Cairo, Egypt
[email protected]

Abstract
Data mining (DM) may be viewed as the process of detecting and finding knowledge within data warehouses using a set of analytical and intelligent tools. In this study, we focus on the use of DM techniques for the discovery of the associated factors of Alzheimer's disease (AD), a progressive brain disorder that causes a gradual and irreversible loss of some brain functions, including memory and language skills, as well as the loss of the ability to care for oneself. To do so, we use two classification techniques, namely Naïve Bayes and Decision Trees. The most accurate classification was obtained with the Decision Tree technique, followed by Naïve Bayes. The Association Rules technique was then used to identify the links between different features and to determine the strength of each relationship. Some of the most important associated factors discovered are gender, age group, attention level, education level, and occupation. Future opportunities for interested researchers include adding other data mining techniques, such as Genetic Algorithms, to predict the causes of Alzheimer's disease; adding patient data extracted from MRI and/or CT scans in order to get more accurate results; and using the output of the model in the development of an expert system for the diagnosis of the disease.

Keywords: Data Mining, Naïve Bayes, Decision Tree, Association Rules, Alzheimer's disease.

I. INTRODUCTION
Data mining is a process that aims at detecting and finding knowledge within data warehouses through the use of a set of analytical and intelligent tools.
It has been used in different areas such as medicine, with the purpose of improving medical diagnosis, detecting the causes of diseases, and predicting the patient's future health condition. Examples of such applications are the early diagnosis of cancer and the automated assessment of impaired heart function. Other applications address mental illness such as dementia. This study focuses on the use of data mining techniques for the discovery of the associated factors of Alzheimer's disease.

1.1 Background
Alzheimer's disease (AD) is a progressive brain disorder that causes a gradual and irreversible loss of higher brain functions such as memory, language skills, and perception of time. This eventually leads to the loss of the ability to care for oneself. It is one of the most common causes of the loss of mental functions in people over the age of 65. About 5% to 10% of people at this age have Alzheimer's; the proportion rises to about 10% to 15% among those in their 70s and to 30% to 40% among people 85 years of age or older [1]. It is a devastating disease because those who suffer from it experience frustration, anger, and fear as the disorder begins to take away their abilities and memories. Hence, it affects not only the patients but also those who love and care for them, who suffer immeasurable pain and stress watching the disease slowly take their loved ones from them [2].

1.2 Problem of the Research
The research problem lies in:
1. The difficulty of identifying the real causes of AD.
2. The inability to predict the health status of the patient and calculate the extent to which the patient is suffering from this disease.
3. The difficulty of identifying the relationship between Alzheimer's disease and other diseases.

1.3 Research Objectives
The research aims at:
1. Discovering the associated factors of Alzheimer's disease.
2. Predicting the likelihood of Alzheimer's disease for a particular patient.
3. Comparing different data mining techniques in the diagnosis of the disease.

1.4 Importance of the Research
The diagnosis of Alzheimer's reflects the doctor's best judgment about the causes of a patient's symptoms, based on the performed tests. An early diagnosis may help individuals receive treatment for symptoms and gain access to programs and support services. It also enables them to take part in decisions concerning care, living arrangements, money, and legal matters. A timely diagnosis often allows the patient to participate in this planning and to decide who will make medical and financial decisions on his or her behalf in later stages of the disease.
II. RECENT STUDIES OF DATA MINING DEALING WITH ALZHEIMER'S DISEASE
This section sheds light on selected recent studies addressing the problem domain.

2.1 Recent Studies
One research paper proposed a novel sparse inverse covariance estimation algorithm that discovers the connectivity among different brain regions for the study of Alzheimer's disease [3]. The proposed algorithm can incorporate user feedback into the estimation process, while the connectivity patterns are discovered automatically. Experimental results on a collection of FDG-PET images demonstrate the effectiveness of the algorithm for analyzing brain region connectivity in Alzheimer's disease.

Another study presented a novel technique, based on association rules, for finding relations among activated brain areas in single photon emission computed tomography (SPECT) imaging [4]. The aim of this work was to discover associations among attributes that characterize the perfusion patterns of normal subjects and to use them for the early diagnosis of Alzheimer's disease. The proposed methods were validated by means of the leave-one-out cross-validation strategy, yielding up to 94.87% classification accuracy and thus outperforming recently developed methods for computer-aided diagnosis of Alzheimer's disease.

A third study proposed various models for the classification of different stages of Alzheimer's disease by considering cognitive tests, physical examinations, age, neuropsychiatric assessments, mental status examination, and laboratory investigations [5]. The methods included Neural Networks, Multilayer Perceptron, Bagging, Decision Tree, CANFIS, and Genetic Algorithms. The classification accuracy for CANFIS was found to be 99.55%, which was better than that of the other classification methods.
2.2 Results of the Review
Based on the previous review of studies related to the problem domain, it is concluded that several data mining techniques have proved successful in the early diagnosis of the disease. The most important of these techniques are Decision Trees, Naïve Bayes, Association Rules, and the Neural Network Classifier.

III. DESCRIPTION OF THE PROPOSED DATA MINING MODEL FOR THE ASSOCIATED FACTORS OF ALZHEIMER'S DISEASE
This section describes the proposed model, which uses data mining techniques to discover the associated factors of Alzheimer's disease. First, two models are developed, each based on a specific data mining technique, namely Naïve Bayes and Decision Trees, in order to recognize the most influential attributes of the disease. Second, the attributes extracted from these models are used as inputs to another model, based on the Association Rules technique, to determine the relationships between the attributes and their strength with regard to the state of the disease. The proposed data mining model consists of the following stages:

3.1 Data Collection
Data were compiled from more than one source, as follows:
Sources:
A- Textbooks: medical books and references specialized in Alzheimer's disease (AD).
B- Patient Records: extracted from Dar Ome we Abe, which provides care for elderly people with Alzheimer's disease, and from the educational hospital of Alexandria University.
Methods:
A- Literature Review of AD to find out the relevant factors and the relationships between them.
B- Structured Interviews with geriatricians who work in governmental hospitals and private medical centers.

3.2 Data Purifying
In this step, all missing values were replaced by the arithmetic mean or the mode of the corresponding attribute, and all incorrect or unclear data were excluded.
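The replacement of missing values described in Section 3.2 can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `impute_column`, the use of `None` to mark a missing value, and the sample columns are assumptions introduced here. The mean is used for numeric attributes and the mode for categorical ones.

```python
from statistics import mean, mode


def impute_column(values, numeric):
    """Fill None entries with the column mean (numeric) or mode (categorical).

    Hypothetical helper: None is assumed to mark a missing value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to estimate the fill value from; return the column unchanged.
        return list(values)
    fill = mean(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in values]


# Illustrative columns only, not taken from the study's dataset.
ages = [70, None, 82, 76]
diabetes = ["Y", "N", None, "Y"]

print(impute_column(ages, numeric=True))
print(impute_column(diabetes, numeric=False))
```

Excluding incorrect or unclear records, the other half of this step, depends on domain rules that the paper does not spell out, so it is not sketched here.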
3.3 Data Selection
Only 45 attributes were selected from the patients' files, based on the recommendation of the geriatricians. The mining techniques were then applied to these specific data items in order to identify the ones that are of interest for the domain.

3.4 Data Integration
The data were integrated into one structure, as the sample was collected from various sources and formats including text, Excel, and Microsoft Access database format.

3.5 Data Mining Tool
The database was built using SQL Server Management Studio 2008. This software was selected specifically because of its compatibility with SQL Server Business Intelligence Development Studio. The database was then tested and validated after undergoing 11 stages that resulted in the successful transfer of 868 rows. As for the data mining tool, Microsoft Visual Studio 2008 was used, since it provides a full set of easy-to-use graphical administration tools for creating, configuring, and maintaining databases, data warehouses, and data marts.

3.6 Data Mining Techniques
The selected techniques include Decision Trees, Association Rules, and Naïve Bayes.

IV. EXPLORING THE DATA MINING MODELS
4.1 Decision Trees Model
Figure 1. Decision Tree Model
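As a rough illustration of how a decision tree chooses the attribute to split on at each level, the sketch below computes information gain from entropy. This is an assumption made for illustration: the paper does not state which split score the Microsoft Decision Trees algorithm used in the study, and the rows and attribute values here are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy records; attribute names mirror those in the paper,
# but the values are invented for illustration.
rows = [
    {"Agent": "Y", "Diabetes": "Y", "Diagnosis": "Y"},
    {"Agent": "Y", "Diabetes": "N", "Diagnosis": "Y"},
    {"Agent": "N", "Diabetes": "Y", "Diagnosis": "N"},
    {"Agent": "N", "Diabetes": "N", "Diagnosis": "N"},
]

for attribute in ("Agent", "Diabetes"):
    print(attribute, information_gain(rows, attribute, "Diagnosis"))
```

In this toy data the first attribute separates the two diagnosis classes perfectly, so it has the higher gain and would be chosen as the root split.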
Using the decision tree model, it was found that:
1- The decision tree for the diagnosis of Alzheimer's disease consists of three levels; each level is a tipping point that splits the data into two parts.
2- The effective attributes in the diagnosis of Alzheimer's disease are Agent, Diabetes, and Cardiovascular Disease.
3- There is a very strong relation between the incidence of the disease and the presence of Agent = Y and Diabetes = Y together.
4- There is a very strong relation between the incidence of the disease and the presence of Agent not = Y and Cardiovascular Disease = N together.

4.2 Naïve Bayes Model
Dependency Network
Following is the dependency network that shows the attributes that have an impact on the diagnosis.
Figure 2. Dependency Network for Naive Bayes Model

Attribute Characteristics
The following figure shows the attributes together with the probability of each attribute value. The attributes are arranged in descending order according to the probability of patients suffering from the disease. A set of features occupies the first rank in the probability of the disease. It includes Vitamins Supplement = N, Lipid Lowering Agent = Y, Diabetes = Y, Inflammation = N, Geneto-Urinary = N, Stroke = Y, Respiratory Disease = Y, Cardiovascular Disease = Y, Parkinson = Y, Hemoglobin = Y, Antidepressant = Y, Neurological = Y.
Figure 3. Attribute Characteristics for Diagnosis = Y

V. VALIDATING MODEL EFFECTIVENESS
The effectiveness of both models was tested using two methods: Lift Chart and Classification Matrix. The purpose was to determine which model gave the highest percentage of correct predictions for diagnosing patients with Alzheimer's disease.

5.1 Lift Chart with Predictable Value
To determine whether there was sufficient information to learn patterns in response to the predictable attribute, columns in the trained model were mapped to those in the test dataset. The model, the predictable column to chart against, and the state of the column to predict patients with AD were also selected. The following Lift Chart shows the comparison between the different models.
Figure 4. Data Mining Lift Chart for Mining Structure
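The curve plotted in a lift chart of the kind described in Section 5.1 can be computed from a model's predicted probabilities and the true diagnoses in a test set. The following is a minimal sketch under stated assumptions: the function name `cumulative_gains`, the sample scores, and the sample labels are hypothetical and are not taken from the study. Cases are ranked by predicted probability, and each point gives the share of the test set examined against the share of positive cases captured so far.

```python
def cumulative_gains(scores, labels, positive="Y"):
    """Return (fraction of cases examined, fraction of positives captured) points.

    scores: predicted probability of the positive state for each test case.
    labels: the true state of each test case.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(1 for _, y in ranked if y == positive)
    if total_pos == 0:
        raise ValueError("test set contains no positive cases")
    points = [(0.0, 0.0)]
    captured = 0
    for i, (_, y) in enumerate(ranked, start=1):
        if y == positive:
            captured += 1
        points.append((i / len(ranked), captured / total_pos))
    return points


# Hypothetical predictions for five test cases.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = ["Y", "Y", "N", "Y", "N"]

for examined, captured in cumulative_gains(scores, labels):
    print(f"{examined:.0%} of cases -> {captured:.0%} of positives")
```

A random-guess model follows the diagonal, where the share captured equals the share examined, while an ideal model ranks every positive case first and so reaches 100% as soon as the share examined equals the proportion of positives in the test set.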
This chart includes the two models using the same data. The x-axis represents the percentage of the test dataset that is used to compare the predictions, while the y-axis represents the percentage of predicted values. The top red line shows the ideal model; it captured 100% of the target population for patients with Alzheimer's disease using 60% of the test dataset. The bottom blue line shows the random line, which is always a 45-degree line across the chart. It shows that if we randomly guess the result for each case, 50% of the target population would be captured using 46% of the test dataset. The lines for the two models (green for the Decision Trees model and purple for the Naïve Bayes model) fall between the random-guess and ideal-model lines, showing that both have sufficient information to learn patterns in response to the predictable state.

5.2 Statistics for the Comparison of Models
Following is a figure that shows a set of statistics for a comparison between the different models.
Figure 5. Comparison between the Different Models

The Decision Trees model received 1.00 in the Score column and 99.60% in the Predict Probability column, followed by the Naïve Bayes model, which received 1.00 in the Score column and 99.54% in the Predict Probability column. Both models obtained 59.52% in the Target Population column. The Decision Trees model is therefore closer to the ideal model. Once the testing phase of the models was complete and their validity ensured, the next stage was to extract the factors that affect the diagnosis of Alzheimer's disease, as in the following table:

Table 1. Factors Affecting the Diagnosis of AD
Attributes: Vitamins Supplement, Agent, Diabetes, Geneto-Urinary, Stroke, Cardiovascular Disease, Parkinson, Hypertension, Antidepressant, Neurological
Attributes: Attention, Level Of Education, Respiratory Disease, Inflammation, Activities Of Daily Living, Occupation Manual Executive None, IADL, Hemoglobin

To finalize the knowledge discovery process, another model was developed. It aims at identifying the extent of correlation of these features with a certain diagnosis. It takes as input the features extracted from the previous models.

VI. ASSOCIATION RULES MODEL
6.1 Rules
The rule grid displays all qualified association rules, their probabilities, and their importance scores. The importance score is designed to measure the usefulness of a rule: the higher the importance score, the more credence the rule deserves.
Figure 6. Rules of the Association Rules Model

The following table shows the set of rules produced by the model, which explains the strength of the relationship between different attributes and Diagnosis = Y.
Table 2. Relationship between Different Attributes
Columns: Rule, Importance, Probability
Agent = Y ->
Agent = Y, Vitamins Supplement = N ->
Diabetes = Y, Vitamins Supplement = N ->
Diabetes = Y, Agent = Y ->
Cardiovascular Disease = Y, Cardiovascular Disease = Y ->
Stroke = Y, Agent = Y, Inflammation = N ->
Antidepressant = Y ->
Stroke = Y ->
Hypertension = Y ->
Hypertension = Y, Parkinson = Y, Respiratory Disease = Y, Respiratory Disease = Y ->
Parkinson = Y ->
Antidepressant = Y, Cardiovascular Disease = Y, Vitamins Supplement = N ->

The following figure shows the link of a group of attributes with diagnosis value = Y.
Figure 7. Link of a Set of Attributes to the Disease

6.2 Mining Model for Prediction
Prediction Using the Data Used in the Sample
First, the model is determined based on the Decision Tree technique. The following table shows the outcome of prediction.
Table 3. Prediction Results

The previous table shows the probability of occurrence or non-occurrence of the disease in 50% of the patients, using Decision Trees.

Prediction Using the Extracted Data
In this step, the diagnosis of the condition is predicted through the use of the three models, as shown in the Singleton Query Input dialog box, whose columns are mapped to the columns in the mining model.
Figure 8. Singleton Query Input Dialog Box

The previous figure shows the prediction of having the disease based on some input data about a certain patient, with the rest of the data extracted using the three models, based on the trained system. The prediction probabilities of each of the three models on the other 50% are illustrated in the following tables.

Table 4. Prediction Result of Decision Tree Model
Columns: Predict Probability, Diagnosis (Y)

Table 5. Prediction Result of Naïve Bayes Model
Columns: Predict Probability, Diagnosis (Y)

Table 6. Prediction Result of Association Rules Model
Columns: Predict Probability, Diagnosis (Y)

6.3 Evaluation of Data Mining Objectives
Three data mining objectives were defined based on both the exploration of the Alzheimer's disease dataset and the objectives of this research. They were evaluated against the trained models. Results showed that all three models achieved the stated objectives, suggesting that they could be used to provide decision support to medical doctors for diagnosing patients and discovering medical factors associated with Alzheimer's disease. The objectives were as follows:

The first objective was to discover the significant influences and relationships in the medical inputs associated with the predictable state, Alzheimer's disease. The Dependency viewer in the Association Rules, Decision Trees, and Naïve Bayes models showed the results from the most significant to the least significant medical predictors.
Medical doctors can use this information to further analyze the strengths and weaknesses of the medical attributes associated with Alzheimer's disease.

The second objective was to predict those who are likely to be diagnosed with Alzheimer's disease, given patients' medical profiles. All models were able to perform this task using singleton queries, with single and multiple input cases, and also to show the rate of the disease.

The third objective was to compare the different data mining techniques. The Decision Tree model was found to be the best in the diagnosis process, and the Naïve Bayes model was the best at identifying the characteristics of patients with Alzheimer's disease and showing the probability of each input attribute for the predictable state.
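The singleton-query prediction described under the second objective can be illustrated with a small Naïve Bayes sketch that estimates the probability of each diagnosis state for one patient profile. This is not the Microsoft Naïve Bayes implementation used in the study: the function names, the Laplace (add-one) smoothing, and the toy records are assumptions introduced here for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> observed values
    for r in rows:
        for attr, value in r.items():
            if attr != target:
                value_counts[(attr, r[target])][value] += 1
                domains[attr].add(value)
    return class_counts, value_counts, domains


def predict_proba(model, case):
    """Posterior probability of each class for a single input case."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        p = n_cls / total  # prior probability of the class
        for attr, value in case.items():
            count = value_counts[(attr, cls)][value]
            # Add-one smoothing (an assumption) avoids zero probabilities.
            p *= (count + 1) / (n_cls + len(domains[attr]))
        scores[cls] = p
    norm = sum(scores.values())
    return {cls: p / norm for cls, p in scores.items()}


# Hypothetical training records, not taken from the study's dataset.
rows = [
    {"Diabetes": "Y", "Stroke": "Y", "Diagnosis": "Y"},
    {"Diabetes": "Y", "Stroke": "N", "Diagnosis": "Y"},
    {"Diabetes": "Y", "Stroke": "Y", "Diagnosis": "Y"},
    {"Diabetes": "N", "Stroke": "N", "Diagnosis": "N"},
    {"Diabetes": "N", "Stroke": "N", "Diagnosis": "N"},
    {"Diabetes": "Y", "Stroke": "N", "Diagnosis": "N"},
]

model = train_naive_bayes(rows, "Diagnosis")
print(predict_proba(model, {"Diabetes": "Y", "Stroke": "Y"}))
```

The per-class value frequencies held in the model play a role similar to the attribute characteristics view discussed in Section 4.2, which lists the probability of each input attribute value for a given diagnosis state.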
VII. CONCLUSION AND FUTURE WORK
7.1 Conclusion
The main purpose of this study was to build a model that can discover the associated factors of Alzheimer's disease in order to provide a better diagnosis and prognosis. The most important findings were the following:
1. The Decision Tree technique provided more accurate results than Naïve Bayes.
2. The Association Rules technique was very useful in identifying the links between different features and determining the strength of each relationship.

7.2 Future Work
Based on the previous conclusions and a number of issues that arose during the study, some topics may be considered as future opportunities to be explored by interested researchers:
1. Using additional data mining techniques, such as Genetic Algorithms, in the prediction phase of Alzheimer's disease.
2. Adding patient data related to MRI and/or CT scans in order to get more accurate results.
3. Using the output of the model in the development of an expert system for the diagnosis and prognosis of the disease.

ACKNOWLEDGEMENT
The researchers would like to thank the medical doctors and administrative staff who provided them with the required data and knowledge that helped in conducting the study.

REFERENCES
[1] E. Floyd, Bloom, B. M. Flint, D. J. Kupfer; The Dana Guide to Brain Health: A Practical Family Reference from Medical Experts, Simon and Schuster.
[2] C. L. Linda, H. Juergen, and M. D. Bludau; Alzheimer's Disease, ABC-CLIO, September.
[3] S. Liang, P. Rinkal, L. Jun, C. Kewei, W. Teresa, and L. Jing; Mining Brain Region Connectivity for Alzheimer's Disease Study via Sparse Inverse Covariance Estimation, KDD '09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
[4] R. Chaves, J. M. Górriz, J. Ramírez, I. Aillán, D. Salas-Gonzalez, and M. Gómez-Río; Efficient Mining of Association Rules for the Early Diagnosis of Alzheimer's Disease, Physics in Medicine and Biology, 56(18). Epub Aug 26, 2011.
[5] L. S. Joshi, V. Simha, D. Shenoy, K. R. Venugopal, and L. M. Patnaik; Classification and Treatment of Different Stages of Alzheimer's Disease Using Various Machine Learning Methods, International Journal of Bioinformatics Research; 2010, Vol. 2, Issue 1.
A Proposed Data Mining Model to Enhance Counter- Criminal Systems with Application on National Security Crimes
A Proposed Data Mining Model to Enhance Counter- Criminal Systems with Application on National Security Crimes Dr. Nevine Makram Labib Department of Computer and Information Systems Faculty of Management
Steps to getting a diagnosis: Finding out if it s Alzheimer s Disease.
Steps to getting a diagnosis: Finding out if it s Alzheimer s Disease. Memory loss and changes in mood and behavior are some signs that you or a family member may have Alzheimer s disease. If you have
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days
or 2008 Five Days Prerequisites Students should have experience with any relational database management system as well as experience with data warehouses and star schemas. It would be helpful if students
Data Mining On Diabetics
Data Mining On Diabetics Janani Sankari.M 1,Saravana priya.m 2 Assistant Professor 1,2 Department of Information Technology 1,Computer Engineering 2 Jeppiaar Engineering College,Chennai 1, D.Y.Patil College
Web-Based Heart Disease Decision Support System using Data Mining Classification Modeling Techniques
Web-Based Heart Disease Decision Support System using Data Mining Classification Modeling Techniques Sellappan Palaniappan 1 ), Rafiah Awang 2 ) Abstract The healthcare industry collects huge amounts of
In this presentation, you will be introduced to data mining and the relationship with meaningful use.
In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine
Depression in Older Persons
Depression in Older Persons How common is depression in later life? Depression affects more than 6.5 million of the 35 million Americans aged 65 or older. Most people in this stage of life with depression
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. [email protected]
IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.
DATA MINING AND REPORTING IN HEALTHCARE
DATA MINING AND REPORTING IN HEALTHCARE Divya Gandhi 1, Pooja Asher 2, Harshada Chaudhari 3 1,2,3 Department of Information Technology, Sardar Patel Institute of Technology, Mumbai,(India) ABSTRACT The
Data Mining with SQL Server Data Tools
Data Mining with SQL Server Data Tools Data mining tasks include classification (directed/supervised) models as well as (undirected/unsupervised) models of association analysis and clustering. 1 Data Mining
REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES
REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES R. Chitra 1 and V. Seenivasagam 2 1 Department of Computer Science and Engineering, Noorul Islam Centre for
Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network
General Article International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347-5161 2014 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Impelling
PARTNERING WITH YOUR DOCTOR:
PARTNERING WITH YOUR DOCTOR: A Guide for Persons with Memory Problems and Their Care Partners Alzheimer s Association Table of Contents PARTNERING WITH YOUR DOCTOR: When is Memory Loss a Problem? 2 What
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Effective Analysis and Predictive Model of Stroke Disease using Classification Methods
Effective Analysis and Predictive Model of Stroke Disease using Classification Methods A.Sudha Student, M.Tech (CSE) VIT University Vellore, India P.Gayathri Assistant Professor VIT University Vellore,
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
How To Write Long Term Care Insurance
By Lori Boyce, AVP Risk Management and R&D Underwriting long term care insurance: a primer Every day Canadians die, are diagnosed with cancer, have heart attacks and become disabled and our insurance solutions
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three
Server Load Prediction
Server Load Prediction Suthee Chaidaroon ([email protected]) Joon Yeong Kim ([email protected]) Jonghan Seo ([email protected]) Abstract Estimating server load average is one of the methods that
Table of Contents. Preface...xv. Part I: Introduction to Mental Health Disorders and Depression
Table of Contents Visit www.healthreferenceseries.com to view A Contents Guide to the Health Reference Series, a listing of more than 16,000 topics and the volumes in which they are covered. Preface...xv
Prediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
A Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
Intelligent Heart Disease Prediction System Using Data Mining Techniques *Ms. Ishtake S.H, ** Prof. Sanap S.A.
Intelligent Heart Disease Prediction System Using Data Mining Techniques *Ms. Ishtake S.H, ** Prof. Sanap S.A. *Department of Copmputer science, MIT, Aurangabad, Maharashtra, India. ** Department of Computer
Memory Loss: It s Not Always Alzheimers. Andrew Massey, M.D. Department of Internal Medicine University of Kansas School of Medicine--Wichita
Memory Loss: It s Not Always Alzheimers Andrew Massey, M.D. Department of Internal Medicine University of Kansas School of Medicine--Wichita Hendrikjje van Andel Schipperr Age 115 Don t smoke and don t
Keywords data mining, prediction techniques, decision making.
Volume 5, Issue 4, April 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of Datamining
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product
Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago [email protected] Keywords:
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm
Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm R.Karthiyayini 1, J.Jayaprakash 2 Assistant Professor, Department of Computer Applications, Anna University (BIT Campus),
How To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
Mental health issues in the elderly. January 28th 2008 Presented by Éric R. Thériault [email protected]
Mental health issues in the elderly January 28th 2008 Presented by Éric R. Thériault [email protected] Cognitive Disorders Outline Dementia (294.xx) Dementia of the Alzheimer's Type (early and late
Introducing diversity among the models of multi-label classification ensemble
Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and
New Matrix Approach to Improve Apriori Algorithm
New Matrix Approach to Improve Apriori Algorithm A. Rehab H. Alwa, B. Anasuya V Patil Associate Prof., IT Faculty, Majan College-University College Muscat, Oman, [email protected] Associate
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
A Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH
330 SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH T. M. D.Saumya 1, T. Rupasinghe 2 and P. Abeysinghe 3 1 Department of Industrial Management, University of Kelaniya,
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Healthcare Big Data Exploration in Real-Time
Healthcare Big Data Exploration in Real-Time Muaz A Mian A Project Submitted in partial fulfillment of the requirements for degree of Masters of Science in Computer Science and Systems University of Washington
Maximizing Return and Minimizing Cost with the Decision Management Systems
KDD 2012: Beijing 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Rich Holada, Vice President, IBM SPSS Predictive Analytics Maximizing Return and Minimizing Cost with the Decision Management
Ensembles and PMML in KNIME
Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany [email protected]
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Depression. What Causes Depression?
National Institute on Aging AgePage Depression Everyone feels blue now and then. It's part of life. But, if you no longer enjoy activities that you usually like, you may have a more serious problem. Feeling
Microsoft Azure Machine learning Algorithms
Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql [email protected] http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation
DATA MINING AND WAREHOUSING CONCEPTS
CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation
Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India
Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone
A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks
A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks Text Analytics World, Boston, 2013 Lars Hard, CTO Agenda Difficult text analytics tasks Feature extraction Bio-inspired
Seniors and. Depression. What You Need to Know. Behavioral Healthcare Options, Inc.
Seniors and Depression What You Need to Know Behavioral Healthcare Options, Inc. Depression More Than Just The Blues You may not know exactly what is wrong with you, but you do know that you just don't
Business Analytics using Data Mining Project Report. Optimizing Operation Room Utilization by Predicting Surgery Duration
Business Analytics using Data Mining Project Report Optimizing Operation Room Utilization by Predicting Surgery Duration Project Team 4 102034606 WU, CHOU-CHUN 103078508 CHEN, LI-CHAN 102077503 LI, DAI-SIN
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Data Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing'oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
Pentaho Data Mining Last Modified on January 22, 2007
Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org
A Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, [email protected] Abstract Most text data from diverse document databases are unsuitable for analytical
Visualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
Cleaned Data. Recommendations
Call Center Data Analysis Megaputer Case Study in Text Mining Merete Hvalshagen www.megaputer.com Megaputer Intelligence, Inc. 120 West Seventh Street, Suite 10 Bloomington, IN 47404, USA +1 812-0-0110
Application of Data Mining in Medical Decision Support System
Application of Data Mining in Medical Decision Support System Habib Shariff Mahmud School of Engineering & Computing Sciences University of East London - FTMS College Technology Park Malaysia Bukit Jalil,
INVESTIGATIONS INTO EFFECTIVENESS OF GAUSSIAN AND NEAREST MEAN CLASSIFIERS FOR SPAM DETECTION
INVESTIGATIONS INTO EFFECTIVENESS OF GAUSSIAN AND NEAREST MEAN CLASSIFIERS FOR SPAM DETECTION Upasna Attri C.S.E. Department, DAV Institute of Engineering and Technology, Jalandhar (India) [email protected] Harpreet Kaur
Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
Chapter 3: Data Mining Driven Learning Apprentice System for Medical Billing Compliance
Chapter 3: Data Mining Driven Learning Apprentice System for Medical Billing Compliance 3.1 Introduction This research has been conducted at back office of a medical billing company situated in a custom
Analysis of Population Cancer Risk Factors in National Information System SVOD
Analysis of Population Cancer Risk Factors in National Information System SVOD Mužík J. 1, Dušek L. 1,2, Pavliš P. 1, Koptíková J. 1, Žaloudík J. 3, Vyzula R. 3 Abstract Human risk assessment requires
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati [email protected], [email protected]
Statistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Substance Addiction. A Chronic Brain Disease
Substance Addiction A Chronic Brain Disease What you will Learn Addiction is a Brain Disease Understand the Structure and Pathways Associated with changes in the brain. Addiction is a Chronic Condition
A Novel Approach for Heart Disease Diagnosis using Data Mining and Fuzzy Logic
A Novel Approach for Heart Disease Diagnosis using Data Mining and Fuzzy Logic Nidhi Bhatla GNDEC, Ludhiana, India Kiran Jyoti GNDEC, Ludhiana, India ABSTRACT Cardiovascular disease is a term used to describe
Diagnosis and Initial Management of Cognitive Disorders
Diagnosis and Initial Management of Cognitive Disorders January 29, 2016 Kelly Garrett, PhD Cathleen Obray, MD, MHS Neurosciences Clinical Program Cognitive Care Team None Disclosures Neurosciences Clinical
Healthcare Data Mining: Prediction Inpatient Length of Stay
3rd International IEEE Conference Intelligent Systems, September 2006 Healthcare Data Mining: Prediction Inpatient Length of Stay Peng Liu, Lei Lei, Junjie Yin, Wei Zhang, Wu Naijun, Elia El-Darzi 1 Abstract
Prerequisites. Course Outline
MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,
Random forest algorithm in big data environment
Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest
In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999
In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka
Advanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction
Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot
www.etidaho.com (208) 327-0768 Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot 3 Days About this Course This course is designed for the end users and analysts that
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data
Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream
A General Approach to Incorporate Data Quality Matrices into Data Mining Algorithms
A General Approach to Incorporate Data Quality Matrices into Data Mining Algorithms Ian Davidson 1st author's affiliation 1st line of address 2nd line of address Telephone number, incl country code 1st
An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset
An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
Categorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report G. Banos 1, P.A. Mitkas 2, Z. Abas 3, A.L. Symeonidis 2, G. Milis 2 and U. Emanuelson 4 1 Faculty
Decision Support System In Heart Disease Diagnosis By Case Based Recommendation
Decision Support System In Heart Disease Diagnosis By Case Based Recommendation Prinsha Prakash Abstract: Heart disease is the main leading killer as well as a major cause of disability. Its timely detection
Implementing Data Models and Reports with Microsoft SQL Server
Course 20466C: Implementing Data Models and Reports with Microsoft SQL Server Course Details Course Outline Module 1: Introduction to Business Intelligence and Data Modeling As a SQL Server database professional,
Please contact Cyber and Technology Training at (410)777-1333/[email protected] for registration and pricing information.
Course Name Start Date End Date Start Time End Time Active Directory Services with Windows Server 8/31/2015 9/4/2015 9:00 AM 5:00 PM Active Directory Services with Windows Server 9/28/2015 10/2/2015 9:00
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www.bioinfo.in/contents.php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
Artificial Neural Network Approach for Classification of Heart Disease Dataset
Artificial Neural Network Approach for Classification of Heart Disease Dataset Manjusha B. Wadhonkar 1, Prof. P.A. Tijare 2 and Prof. S.N.Sawalkar 3 1 M.E Computer Engineering (Second Year)., Computer
Using Predictions to Power the Business. Wayne Eckerson Director of Research and Services, TDWI February 18, 2009
Using Predictions to Power the Business Wayne Eckerson Director of Research and Services, TDWI February 18, 2009 Sponsor 2 Speakers Wayne Eckerson Director, TDWI Research Caryn A. Bloom Data Mining Specialist,
Comparison of Six Classification Techniques for Post Operative Patient data in the Medicable discipline
Comparison of Six Classification Techniques for Post Operative Patient data in the Medicable discipline Chinky Gera 1, Kirti Joshi 2 Research Scholar 1, Assistant Professor 2 Department of Computer Science
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
